A new approach is proposed to predict the fractal behavior of a distributed network traffic, in which a random scaling fractal model is used to simulate the self-affine characteristics ofa network traffic.A study of the network traffic is done by sniffing a portion of it using Wireshark. The sniffed traffic is inspected and dissected using filter option, for each differentprotocols. The fractal behavior of the traffic are sniffed and examined by an open-source network analyzer. Later, the packet records that were sniffed are exported to NeuroSolutions builder,SPSS andthen examined. Further, the exported and dissected traffic data is fed as input to train the neural network to let it predict the resultant fractal behavior of the distributed network traffic and an equation is proposed to derive the ultimate close network traffic prediction in SPSS.
A new approach is proposed to predict the fractal behavior of a distributed network traffic, in which a random scaling fractal model is used to simulate the self-affine characteristics ofa network traffic.A study of the network traffic is done by sniffing a portion of it using Wireshark. The sniffed traffic is inspected and dissected using filter option, for each differentprotocols. The fractal behavior of the traffic are sniffed and examined by an open-source network analyzer. Later, the packet records that were sniffed are exported to NeuroSolutions builder,SPSS andthen examined. Further, the exported and dissected traffic data is fed as input to train the neural network to let it predict the resultant fractal behavior of the distributed network traffic and an equation is proposed to derive the ultimate close network traffic prediction in SPSS.
A new approach is proposed to predict the fractal behavior of a distributed network traffic, in which a random scaling fractal model is used to simulate the self-affine characteristics ofa network traffic.A study of the network traffic is done by sniffing a portion of it using Wireshark. The sniffed traffic is inspected and dissected using filter option, for each differentprotocols. The fractal behavior of the traffic are sniffed and examined by an open-source network analyzer. Later, the packet records that were sniffed are exported to NeuroSolutions builder,SPSS andthen examined. Further, the exported and dissected traffic data is fed as input to train the neural network to let it predict the resultant fractal behavior of the distributed network traffic and an equation is proposed to derive the ultimate close network traffic prediction in SPSS.
A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi .S 1 Dr.M.Punithavalli Research Scholar, Director, MS University,Thirunelvelli SREC,Coimbatore.
Abstract: A new approach is proposed to predict the fractal behavior of a distributed network traffic, in which a random scaling fractal model is used to simulate the self-affine characteristics ofa network traffic.A study of the network traffic is done by sniffing a portion of it using Wireshark. The sniffed traffic is inspected and dissected using filter option, for each differentprotocols. The fractal behavior of the traffic are sniffed and examined by an open-source network analyzer. Later, the packet records that were sniffed are exported to NeuroSolutions builder,SPSS andthen examined. Further, the exported and dissected traffic data is fed as input to train the neural network to let it predict the resultant fractal behavior of the distributed network traffic and an equation is proposed to derive the ultimate close network traffic prediction in SPSS. Keywords: fractal behavior, sniffing, predict, SPSS, NeuroSolution builder, NeuroXL predictor. I INTRODUCTION For the examination of local problems in a small network, monitoring at a single observation point is sufficient to train the network builder. For such cases, a network analyzer may be used which can be a machine running Wireshark and is directly connected to a network segment or the monitoring port of a switch or a router. In larger networks, it is often necessary to perform simultaneous monitoring at multiple observation points to train the constructed neural network in a more efficient manner. In this research a Neural Network(Multi- layer Perceptron)is proposed to be used to predict the dependent variable values over different independent variable value distributions using two specific modeling tools, viz., SPSS and NeuroSolutions. One objective of this is to find the effect of the dependent variable values distributions in the dataset using different modeling tools on the Neural Network prediction performance. A second objective is to compare the performance of the two modeling tools in the predictionof the dependent variable values. Analyzing packet records with wireshark Wireshark [1], formerly known as Ethereal is probably the most popular open-source network analyzer tool. For the experiments, we configured Wireshark on our machine to capture network packets. The data collected is exported in Comma Separated Value (.csv) format. Wireshark can be divided into four main modules: Capture Core, WireTap, Protocol Interpreter and Dissector. Capture Core uses the common library WinPcap to capture data from different network (Ethernet, Ring, etc.); once the data is obtained, WireTap is used to save it as a binary file; since the data is in binary, without the Protocol Interpreter and Dissector, user cannot understand the data. Dissector can be available in a built-in or a plug-in mode. The proposed approach allows profiting from Wireshark's extensive packet inspection facility and protocol dissection capabilities for distributed network analysis. Neuro solutions The NeuralBuilder helps to construct the neural network by selecting parameters. The four currently available problem types in the NeuralExpert are Classification, Prediction, Function Approximation, and Clustering. Later, a parameter list is selected to train the neural network and the desired traffic is output to train the network. International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
ISSN: 2231-2803 http://www.ijcttjournal.org Page 2453 Figure 1. Flow diagram to deploy traffic prediction using ANN.
An ANN is a computational method motivated by biological models. ANNs attempt to mimic the fundamental operation of the human brain and can be used to solve a broad variety of problems [10]. One of the most important features of ANNs is that it can discover hidden patterns from data sets [11], and solve complex problems especially when a mathematical model does not exist (or when the model is not suitable for the case at hand). Furthermore, ANNs are commonly immune to noise and irregularities present in the data [12, 13]. ANN learning is typically based on two data sets: the training set and the validation set. The training set is used on a new artificial neural network, as its name indicates, for training. The validation set is used after the neural network has been trained to assess its performance. The validation set in most case is similar to the training set but not same [14, 15 ]. Data mapping In artificial intelligence, a desired output is commonly known as the target. For the specific case of ANNs, the target is used for network training [9]. ANNs can map a given input to a desired output; when an ANN is used for this purpose, the ANN is typically called a mapping ANN. The network is trained by applying the desired input to the ANN, and then monitoring the actual ANN output. The difference between the actual ANN output and the desired output is normally used to manage the learning process. During the process of training, the learning algorithm attempts to reduce the error measured between the actual network output and the targetin the training set [9, 11]. The training process may be time consuming, but when the process has been successfully completed, an ANN canquickly calculate its output once the input data has International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
ISSN: 2231-2803 http://www.ijcttjournal.org Page 2454 Start Translate the network traffic data parameters. Train the NNs architecture for N number of epochs. Evaluate the performance Criteria Satisfy Extract a new traffic dataset dissected Perform Prediction- Original expected traffic Dissect the network traffic dataset and enlist the Step: 1 Step : 2 Step :3 Step : 4 Step : 5 Stop Y N
Figure 2: Flow diagram of ANN using NeuroSolutions:
been applied to the network input. Data classification Data classification or just classification is the process of identifying an object from a set of possible outcomes [9, 12]. An ANN can be trained to identify and classify any kind of objects. These objects can be numbers, images, sounds, signals, etc. An ANN used for this purpose is also known as a classifier.
Figure 3. Training fractal-dataset graph International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
ISSN: 2231-2803 http://www.ijcttjournal.org Page 2455 The traffic data is trained initially with a network traffic-dataset that had been downloaded from wireshark sample captures as a pcap file and the data is exported to network builder for prediction. The predicted fractal behavior on the traffic data set is shown in table 1.
II INVESTIGATION OF CORRELATION COEFFICIENT VALUE On investigating the effect of dependent variable values and the distribution on the prediction accuracy rate. The results of the analyses lets us to find the effect of the dependent variable values distribution on prediction accuracy that exploits and leads us generating an equation that would predict the expected traffic based on the independent variable-values distribution using the modeling tool SPSS. Correlation Coefficient, R, is a measure of the strength of the association between the independent (explanatory) variables and the dependent (prediction) variable.R is never a negative value. This can be seen from the formula below, since the square root of this value indicates the positive root[2,3]. Formula for R,Formula for two independent variables, X1 and X2
The coefficient of multiple correlation estimates the combined influence of two or more variables on the observed (dependent) variable. To analyse the traffic data using multiple regression, part of the process involves the following assumptions to be verified[8]. The dependent variable is measured on a continuous scale. Two or more independent variables, are continuous or categorical. Observatios should be recorded. Linear relationship exists between the dependent variable and each of the independent variables. Traffic data shows homoscedasticity, which is where the variances along the line of best-fit remain similar as one move along the line. The data does not show multicollinearity, which occurs when two or more independent variables are highly correlated. There are no significant outliers, high leverage points or highly influential points. Residuals (errors) are approximately normally distributed. The above listed assumptions are not violated and henceforth the Multiple Correlation Coefficient, R, is computed to measure the strength of the association between the independent (explanatory) variables and a single dependent (prediction) variable. Multiple Regression-booster prediction phases: In MR-Booster, by using each feature of the association existing between the actual traffic and the dissected traffic explicitly helped to generate the prediction equation and the standard error factor when probed in further boosts a better way to refine the regression equation that predicts the network traffic. The correlation structure of traffic is finally generated in a much easier way. Phase 1: a. The sniffed traffic data is plotted as a scatter plot graph to visualize if there is a possible linear relationship. b. Calculate and interpret the linear correlation coefficient, using the data sets. International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
ISSN: 2231-2803 http://www.ijcttjournal.org Page 2456 Phase 2: c. Determine all possible regression equation for the data by refining it further by adjusting the constant standard error from it. d. Select and apply the best generated regression equation and forecast. Phase 3: e. Identify outliers and note the observations. f. Process and interpret the performance of, R-booster prediction.
Table 2.Correlation Coefficientsa (a-dependent actual traffic-graph) Model Unstandardized Coefficients Standardized Coefficients T Sig. R Std. Error Beta 1 (Constant) .013 .005 2.711 .007 Network1(n1) .880 .007 .830 133.561 .000 Network2(n2) 1.047 .032 .083 32.668 .000 Network3(n3) .229 .008 .170 27.395 .000
The equation generated to predict the actual traffic that could be generated for the following dissected protocol-traffic. Predicted traffic(w.r.t time slice)=n1 *(R( n1) standard Error-n1) + n2 *(R(n2) standard Error) +n3 * (R(n3) standard Error) +(R-constant standard Error)
Predicted-traffic=Traffic-n1*0.873+Traffic- n2*1.015+Traffic-n3*0.221+0.008 R value of traffic from n1 and n2 have a strong association with the actual traffic, where as traffic from n3 has a weak association is shown in table 3.
Table 3.R value strength. R value Interpretation 0.9 strong association 0.5 moderate association 0.25 weak association
International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
Figure 4. Actual-traffic vs Predicted traffic(NeuroSolutions) and Computed-traffic(SPSS) The figure 4, shows that the traffic computed using the generated equation is very close to the actual-target-traffic.
III PERFORMANCE EVALUATION The overall performance of the analyzed prediction methods are stated here to estimate the prediction accuracy. Coefficient Efficiency(E) is one such estimation method that measures the performance and reveals the efficiency rate. The efficiency coefficient can take values in the domain (, 1]. If E = 1, we have a perfect fit between the observed and the forecasted data. A value of E =0 occurs when the prediction corresponds to estimating the mean of the actual values. An efficiency less than zero, i.e. < E < 0, indicates that the average of the actual values is a better predictor than the analyzed forecasting method. The closer E is to 1, the more accurate the prediction is as the coefficient efficiency stays at 0.9 for the forecasted traffic
IV CONCLUSION The experimental results demonstrate that 1) the regression model is more effective for traffic prediction; and 2) both the proposed prediction equation and standard error based R(correlation coefficient) update scheme are effective to predict the traffic in a easier way.The goal of the experiments is to evaluate and to compare the performance of the ANN prediction approaches presented earlier in this paper. Hence, the linear regression model offers is a powerful tool for analyzing the association between one or more independent variables and a single dependent variable. Some novice researchers wish to move quickly beyond this model and learn to use more sophisticated models because they get discouraged about its limitations and believe that other regression models are more appropriate for their analysis needs. References [1]Wireshark Homepage, http:// www. wireshark.org , 2008. [2] ClearSight Networks, Inc. Homepage, http://www.clearsightnet.com,2008. [3]https://statistics.laerd.com/spss-tutorials /multiple- regression-using-spss-statistics. php [4]http://en.wikipedia.org/wiki/Multiple _ correlation [5]http://www.yeatts.us/6200-Multivariate %20Stats/Lectures-tests/Test%202/Week-12- assumptions.pdf [6] WildPackets, Inc. Homepage, http: //www. wildpackets.com, 2008. [7] S. Waldbusser, Remote Network Monitoring Management InformationBase, RFC 2819 (Standard), May 2000. [8] T. Masters, Practical Neural Network Recipes in C++. Preparing Input Data (C-16), Academic Press, Inc., pp.253- 270, (1993). [9] S. J. Russel and P. Norvig, Artificial Intelligence: A Modern Approach.Prentice-Hall of India, Second Edition.Statistical Learning Methods (C-20), pp. 712-762, (2006). [10] T. Masters, Neural, Novel & Hybrid Algorithms for Time Series Prediction. Neural Network Tools (C-10), J ohn Wiley & Sons Inc., pp. 367-374, (1995). [11] T. Masters, Signal and Image Processing With Neural Networks. Data Preparation for Neural Networks (C-3), J ohn Wiley & Sons Inc., pp. 61-80, (1994). [12] T. Masters, Advanced Algorithms for Neural Networks. Assessing Generalization Ability (C-9), John Wiley & Sons Inc., pp. 335-380, (1995). [13] R. D. Reed and R. J. Marks II, Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks. Factors Influencing Generalization (C-14), The MIT Press, pp. 239-256, (1999). [14]http://en.wikipedia.org/wiki/Neural _Lab
ChatGPT Side Hustles 2024 - Unlock the Digital Goldmine and Get AI Working for You Fast with More Than 85 Side Hustle Ideas to Boost Passive Income, Create New Cash Flow, and Get Ahead of the Curve