Forecasting Solid Waste Generation in Juba Town, South Sudan Using Artificial Neural Networks (ANNs) and Autoregressive Moving Averages (ARMA)

Journal of Environment and Waste Management
JEWM
Vol. 4(2), pp. 211-223, August, 2017. www.premierpublishers.org. ISSN: XXXX-XXXX
Research Article
Forecasting solid waste generation in Juba Town, South

Sudan using Artificial Neural Networks (ANNs) and
Autoregressive Moving Averages (ARMA)
1David Lomeling* and 2Santino Wani Kenyi
1,2Departmentof Agricultural Sciences, College of Natural Resources and Environmental Studies (CNRES), University of
Juba, P.O. Box 82, Juba, South Sudan
Prediction of solid waste generation is critical for any long term sustainable waste management,
especially of a fast-growing municipality. Lack of, or inaccurate solid waste generation records
poses unparalleled challenges in developing cohesive and workable waste management
strategies for any concerned authorities, as this is influenced by several interlinked demo-
graphic, economic, and socio-cultural factors. The objective of this study was to compare two
models in forecasting of MSW generation and how this would be built into an effective MSW
management program. Two models, the Autoregressive Moving Average (ARMA 1,1) and the
Artificial Neural Networks (ANNs) were tested for their ability to predict weekly waste generation
of 14 households in Juba Town, Central Equatoria State (CES), South Sudan. Results showed that
both the artificial intelligence model ANNs and the traditional ARMA model had good prediction
performances; where for ANNs the RMSE, MAPE and r were 0.080, 10.64%, 0.238 respectively,
whereas for ARMA the RMSE, MAPE and r were 0.102, 6.98% and 0.274 respectively. Both models
showed no significant differences and could be therefore be used for Solid Waste (SW)
forecasting. Based on the results, the weekly SW generated 52 weeks later (end of year) had
reached 0.365 kg/capita indicating a 18.2% rise from 0.3 kg/capita at the start of the study. Under
the current consumption rate, the weekly SW per capita in Juba Town is expected to reach 0.596
kg by 2020.
Keywords: Artificial Neural Networks, Autoregressive Moving Averages, Continuous Wavelet Transform, Waste
Generation Forecasting,
INTRODUCTION
South Sudan witnessed rapid economic growth rates with inadvertently also increased. This meant that more land
Gross Domestic Product (GDP) estimates of over and resources would have to be required for planned
$16billion (World Bank Report 2008) immediately after the waste disposal site in the short to long term, if serious
signing of the Comprehensive Peace Agreement, CPA in environmental pollution through indiscriminate waste
2005. Huge oil revenues from crude oil sales attracted disposal is to be abated.
investments and rapid population growth especially in
urban centers like Juba. Conversely, this economic and
population growth put enormous strain on the local
environment and on the availability of natural resources *Corresponding author: David Lomeling, Department of
(Lomeling et al., 2016) in terms of increased demand for Agricultural Sciences, College of Natural Resources and
areas for settlement as well as depleting the forest Environmental Studies (CNRES), University of Juba, P.O.
resources as cheap energy source. With the rapid Box 82, Juba, South Sudan. Email:
population increase in Juba, waste generation dr.david_lomeling@gmx.net
Forecasting weekly SW generation using ANNs and ARMA models in Juba Town, South Sudan
Lomeling and Kenyi 212
Prediction or forecasting of solid waste generation in Juba unit or perceptron is linked with many others and can either
Town has become indispensable and crucial, if available be enforcing once the summation function has surpassed
resources are to be effectively deployed in the sustainable some threshold value to be propagated or inhibitory in their
management of SW. Time series offers an important area effect once below the summation function value. The
of stochastic forecasting in which past observations of a single neuron is illustrated by the McCulloch-Pitts Model
specific variable are analyzed to develop a model that can (1943).
be used to make future projections. Over the past
decades, much effort has been undertaken in the X1 w1 b
development and improvement of time series models that
can be applied for forecasting like the Artificial Neural
Networks (ANNs) and the Auto-Regressive Moving
X2

w2
S y
Averages (ARMA) as compared to classical and traditional
Xn wn
methods like linear and multiple regressions or polynomial
functions. During the last two decades, several stochastic
models have been used to predict SW waste like the (input e.g. SW (b, bias that increases (output or
support vector machine, (Abbasi et al., 2013); artificial data) or lowers the net input of forecasted
activation/sigmoid function) value)
neural networks (Abdoli et al., 2011); (Antanasijevic et al.,
2013); hybrid procedure (Xu et al., 2013); time series Figure 1: A model of a three-layers perceptron.
analysis, (Mwenda et al., 2014); multi-step chaotic model
(Song and He, 2014); principal component analysis and Mathematically, the output variable, y is the sum of the
gamma testing (Noori et al., 2010); grey fuzzy dynamic individual weighted variables and bias that influences the
modeling (Chen and Chang, 2010); Fourier series (Darko activation function S:
et al., 2016); simulated annealing based hybrid forecast (x1 w1 + x2 w2 + xn wn )+b=y= nj=1(xj wj + b)
(Song et al., 2014). In general, most ANN prediction
Equation (1)
models have clearly outlined architecture with specific
The underlying concept is to arrive at a function that
number of input variables at the input layer and
minimizes the error (E) between the input (actual) and
corresponding number of expected outputs at the output
output (forecasted) variables thereby enhancing the
layer. Depending on the problem to be modeled, several
accuracy in the forecasting or prediction.
input variables may be chosen for a given number of
anticipated outputs. The choice of any one single, all-
purpose model under any prevailing conditions is therefore Emin(x,b)=[n Equation (2)
unrealistic. Structurally, such a multi-purpose model would j=1 x;w,by]
require more complex algorithms capable of handling
several calculations simultaneously for some desired Usually the sums of each input signal (x1, x2, xn) and
number of input variables and then come up with optimal intensity or weighted values (w1, w2,wn) are passed on
predictions. through a non-linear function known as an activation or
transfer function that usually has a sigmoid shape, that is
The objective of this study was to compare the bounded, and differentiable as:
effectiveness of machine learning method (Artificial Neural
1
Networks ANNs) and the stochastic linear model (x) = (1+ex ) Equation (3)
(Autoregressive Moving Average ARMA) for medium to
long-term weekly forecasting of solid waste generation in
some households of Juba Town, South Sudan. The Many theoretical and experimental works have shown that
integration of Continuous Wavelet Transform (CWT) with a single hidden layer (with one or more several hidden
both ANNs and ARMA models was primarily used for easy nodes) is sufficient for ANN to approximate any complex
visualization and interpretation of input signals as well as nonlinear function (Dreiseitl and Ohno-Machado, 2002;
frequencies in the time series. Using both models, a good Chattopadhyay and Bandyopadhyay 2007; Matias et al.,
estimate of the weekly SW per capita was to be made 2013; Vishwakarma and Gupta, 2011; Aggarwal and
during the 2012-2020 forecasting period. Kumar 2015). A more plausible argument is that, the low
number of nodes in the hidden layer directly linked to the
input neuron have a low bias (b) and would tend to
Forecasting Models
increase the input of the activation function. This in turn
Artificial Neural Networks (ANN) would enhance large changes in their weights and learn
very quickly and so incur less errors as manifested by the
ANNs is basically a computational approach whose high correlation coefficients for both training and test sets.
architecture is mostly composed of three layers: an input, In this study, a model based on a feedforward neural
hidden and output layers mimicking the way biological network with a single hidden layer was used. Hereby, the
neurons receive, transfer and output signals. Each neural learning process in understanding hidden and strongly
non-linear dependencies in the time series of the observed
J. Environ. Waste Manag. 213
0.7
y = 0,0011x + 0,3077
0.6 r = 0,04
Weekly SW/household
0.5
0.4
0.3
0.2
0.1
0
1 8 15 22 29 36 43 50
Time. Weeks
Figure 2a. Run plot sequence of weekly disposed solid waste showing seasonality in the time series
0.7
0.6
Kg of SW per household
0.5 y = 0,0011x + 0,3077

r = 0,04
0.4
0.3
0.2
0.1
0
1 8 15 22 29 36 43
Time.: Weeks
Obs. a=0,2 a=0,5 a=0,8
Figure 2b. Effects of alpha term () on exponential smoothing or the exponentially weighted moving average (EWMA)
and modeled data in the training and test were faster and On the other hand, a time series that shows seasonality as
the forecasting made easier. However, such forecasts can in Figure 2a can be exponentially smoothened by an
only be made for shorter prediction times and not for exponentially weighted non-parametric value () to
extensively longer future times as the error would tend to smoothen the value X_t to a new value X recursively
increase. (Figure 2b):
Auto-Regressive Moving Average, ARMA model = Xt + (1 )Xt1

X Equation (4)
where 0 1
The first step in developing the ARMA model was
determining the stationarity of the time series in which case Our data showed that the best seasonal exponential
the mean and variance are time invariable. The smoothing was when =0,2 with r=0,26; =0,5 with
autocorrelation function (ACF) may signify stationarity of r=0,08 and =0,8 with r=0,04. The value =0,2 best
the time series, if it cuts off or decomposes quite rapidly approximated the mean value and regression constant of
towards zero. Conversely, if the ACF decomposes very the time series and therefore gave a better trend analysis.
slowly and gradually towards zero, this would indicate non-
stationarity and would need to be transformed or Owing to its simplicity, the exponential smoothing only de-
differenced to obtain stationarity by stabilizing the variance seasonalizes a time series thereby assuming the nature
of a time series. Differencing helps stabilize the mean of a of a linear regression equation between two variables.
time series by removing changes in the level of a time However, its predictive ability is inadequate in more
series, and so eliminating trend and seasonality. complex stochastic and non-linear processes with more
than 3 or more input variables. Although some authors data preprocessing of the neural networks. Additionally,
(Mwenda et al., 2014; Petridis et al., 2016); Rimaityte et the Continuous Wavelet Transfer (CWT) using the PAST3
al., 2011; Karpuenkait et al., 2016) have mentioned the software was used to illustrate through the spectral power
theory behind single exponential smoothing, however, not the peaks or spikes of weekly solid waste disposal in the
much in terms of its practical applicability have yet been time series.
reported.
Model identification
The ARMA (p,q) model used herein is made up of an
Autoregressive AR(p) and Moving Average MA(q) The effect of model choice on both correlation functions is
components. This means that, the forecast value of (X) at shown in Figure 3 (a) and (b). The spikes presented in the
time (t) in a time series is a function of both linear ACF and PACF showed a correlation in the data every 2
combination of past X-innovations and a moving average lag units. The model identification revealed that with the
of series (t ), known as white noise process characterized cut-off at lag 1, the autoregressive of order p and the
by zero mean ( ) and variance (). moving average of order q was also 1 as the ACF was got
below zero after first lag. For illustrative purposes, the
ARMA (1,2) as opposed to ARMA (1,1) was also used to
Xt = c + t + i=1 i Xt1 + i=1 i ti compare the parameter values and how these influenced
Equation (5) the model choice. The ARMA (1,2) in Figure 4 (a) and (b)
as compared to ARMA (1,1) showed dissimilar AC and
With the values p and q identified from the ACF and PACF, PAC functions at lag 1 and was outside the 95%
the model parameters (i ) and (i ) can then be estimated. confidence limits. The ARMA (1,1) was then chosen and
Once we had confirmed the stationarity of the time series, there was therefore no need for any differencing.
the autocorrelation (ACF) and partial autocorrelation
functions (PACF) were used to determine the correlation Training Set: From this, 32 data entries of the weekly solid
and model structure of the data. waste disposed per household were trained through a
process of finding values for the weights (w) and biases (b)
whereby the error between the measured and predicted
MATERIALS AND METHOD values was minimized. The back-propagation algorithm
was used here. The accuracy of the resulting training
This study focused on solid waste generated by single process was then applied for making projections in neural
persons in 14 households in Kator residential area of Kator network model. Figure 5 shows the accuracy of the
Payam, Juba County of Central Equatoria State in South training set between the observed and forecasted values
Sudan. Collected waste was placed into a container whose with a high r=0.99.
tare weight was initially determined using the hanging
Testing Set: This data set consisted of entries of the last
scale. The net weight of the solid waste was then
weekly solid waste disposed per household and was
determined. The data were collected weekly over a period
used to test whether, or not the accuracy derived from
of 31 weeks as from June 2010 till January 2011. Each
the training process would provide the most appropriate
household had on average 6 persons with monthly income
solution and hence confirm the predictive power of the
of about 650 SDG (Sudanese Guinee equivalent to
network model. Similarly, the test set showed high
145$/month as of June 2010). About 40-60% of the
correlation coefficient of r=0.99 as shown in the training
collected waste was made up of predominantly degradable
set (Table 2).
organic component consisting mainly of food residues and
partly cartons and newspapers. The rest was made up of Table 2: Comparison of the learning ability (MSE) between the
PET plastic water bottles. The waste was collected at the training and test set of a ANN.
end of each week, weighed and the daily amounts per
capita generated (kg/capita/week) was then calculated Training set Test set
Nr. of rows 32 12
This work attempted to predict the weekly waste generated
Nr. of Good Forecast 30(94%) 12(100%)
in the remaining 21 weeks till July 2011 based on the Average MSE 8,91E-05 1,25E-05
previous data. For the training set, data from the first 20 r 0,99 0,99
weeks were used representing 90% of the actual data. The
rest 10 weeks representing 10% of the actual data were Validation Set: Generally, this data set is used to minimize
then used as test set. overfitting in the model. As in our case, given the high
correlation coefficients of both the training and test sets
Data description and analysis and the high predictive power of the neural network model,
there was therefore no need for a validation set. Figure 6
From the weekly solid waste data reports, the Excel-based shows simulation runs of both the training and test sets
Alyuda Forecaster XL software was used to make future and confirmed the accuracy of the ANN in predicting future
projections in the time series. Its algorithm allows an easy solid waste data.
ACF residuals ARMA (1,1)

1.20
Autocorrelation Function, ACF

1.00
0.80
0.60
0.40
0.20
0.00
0 1 2 3 4 5 6 7 8 9 10 11 12 13
-0.20
-0.40
-0.60
Lag (a)
PACF residuals ARMA (1,1)

1.2
1
Partial Autocorrelation
0.8
Function, PACF
0.6
0.4
0.2
0
0 1 2 3 4 5 6 7 8 9 10 11 12 13
-0.2
-0.4
-0.6
Lag (b)
Figure 3. Autocorrelation plot of weekly per capita solid waste disposed in Juba (red dotted lines are upper and lower 95% confidence
limits). ACF plot of residuals of ARMA (1,1) with 1 = -0,999 and 1 = -0,800 (a) and PACF plot of residuals of ARMA (1,1) with 1 = -
0,999 and 1 = -0,800 (b).
acf residuals of ARMA (1,2)

1.2
1
Autocorrelation Function,
0.8
0.6
0.4
ACF
0.2
0
-0.2 0 1 2 3 4 5 6 7 8 9 10 11 12 13
-0.4
-0.6
Lag (a)
pacf residuals of ARMA (1,2)

1.2
1
Autocorrelation Function,
0.8
0.6
0.4
ACF
0.2
0
-0.2 0 1 2 3 4 5 6 7 8 9 10 11 12 13
-0.4
-0.6
Lag (b)
Figure 4: ACF and PACF plot of residuals of ARMA (1,2) with AR (1) at 1 = -0,999 and AM (2) at 1 = -0,526 and 2 = -0,800
0.6
Training set Test set
0.51
0.5
Forecasted
Forecasted
0.4 0.49
0.3 y = 0,9798x + 0,0069 0.47

y = 0,9855x + 0,0069
r = 0,99
r = 0,99
0.2 0.45
0.1 0.43
0.1 0.2 0.3 0.4 0.5 0.6 0.43 0.48 0.53
Actual Actual
Figure 5. Scatter plot of a ANN for both training and test set of a weekly per capita solid waste in Juba Town
Actual Forecasted 0.55 0.55

0.7 0.8 Actual Forecasted
0.6 0.7
Forescasted
0.6 0.5 0.5
Forecasted
0.5
Actual
0.5
Actual
0.4
0.4
0.3
0.3 0.45 0.45
0.2 0.2
0.1 0.1
0 0 0.4 0.4
1 4 7 10 13 16 19 22 25 28 31 34 37 1 4 7 10 13
Rows
Rows
Figure 6. Simulation of the training and test data sets
Error by training set n=40 Error by test set n=12

0.03
0.006
0.02
0.004
0.01 0.002
Error
Error
0 0
1 7 13 19 25 31 37 1 2 3 4 5 6 7 8 9 10 11 12 13 14
-0.01 -0.002
-0.004
-0.02
-0.006
-0.03 Rows
Rows
Figure 7. Error distribution of training and test sets
Figure 7 shows the error distribution of both the training RESULTS

and test sets. Whereas the error margin by the training set
varied between -0,03 and 0,03, for the test set, this was ANN Model
between -0,006 and 0,006. The error magnitude in the
training set was ten-fold more than that in the test set. Our simulation was based on a simple network
Presumably, the larger the data set, the larger the variance architecture that returned the smallest MSE and therefore
between the observed and forecasted and hence the the best prediction accuracy. The MSE and (r) recorded
larger is the error margin and vice versa. for both the training and test sets have already been
Table 3. Parameters of ARMA (1,1) and (1,2) models of weekly SW generated in Juba Town, South Sudan
Parameters Standard Error, SE t-statistics p-Value AIC LLF

ARMA
(1,1) 0,073 7,643 7,66E-10 - 50,28
AR (1) -0,999 (1) 96,56
AM (1) -0,888 (1)
ARMA
(1,2) 0,000 12,738 1,08E-16 - 51,17
AR (1) -0,999 (1) 96,34
AM (1) -0,526 (1)
AM (2) -0,278 (2)
ARMA
(1,3)
AR (1) -0,999 ( 1)
AM (1) -0,632 (1) 0,000 13,464 3,304E-17 - 53,53
AM (2) -0,420 (2) 99,05
AM (3) 0,289 (3)
presented in Table 2, from where we observed 1-1-1 (1 (AIC) value and highest likelihood estimation here denoted
input layer, 1 hidden layer, and 1 output layer) gives an as Logarithmic Likelihood Function (LLF) Table 3.
accurate prediction of the weekly solid waste output.
Applying the rule-of-thumb method for estimating the Judging by the values of AIC and LLF, it is evident that
number of neurons in the hidden layer reported by experimented values at ARMA (1,2) and (1,3) are close to
Karsoliya (2012), we can assume that the number of the actual ARMA (1,1) values and suggested the
neurons in the hidden layer is approximately 1. (or 70- adequacy of the ARMA (1,1) model.
90%) of the input layer. Noticeably, the number of hidden
neurons is equal to the number of input nodes whereby the The scatter plot in Figure 8 shows a positive trend with
larger number of neurons would ostensibly lead to several points around the trend line. The relatively low (r)
overfitting whereas with a relatively smaller number of values suggested that there was neither an under- or
neurons would lead to underfitting. The good fit (r) for overestimation for both ANN and ARMA models with most
both training and test sets shows that the ANN modeled points between the 0,3-0,4 kg/week for both the observed
the observed data quite accurately. There is generally no and ANN-ARMA models. Whereas the ANN model had a
golden rule for the number of hidden layers that is r-value of 0,238 this was 0,274 for ARMA model, these r-
applicable for all non-linear time series. Whereas during values for both models were not significantly different from
the training process other problems are best predicted with each other. The comparatively lower r-value of the ANN
two hidden layers (Srinivasan et al., 1994; Zhang, 1994; would suggest the inability of a linear function in describing
Baron, 1994). Other studies (Wanas, et al., 1998) showed an entirely non-linear time series data.
that the best performance of a neural network occurred
when the number of hidden neurons was equal to log (N), Performance evaluation
where N is the number of training samples. Another study
conducted by Mishra and Desai (2006) showed that the The performance of either of the models was determined
optimal number of hidden neurons is (2n+1), where n is the by measuring the difference between the observed and
number of input neurons. predicted values in the time series. Best estimates were
those error values that were closer to zero, indicating less
ARMA Model differences between the measured and observed values.
Although model identification as per Box-Jenkins Three accuracy measures were used as follows:
methodology clearly showed an ARMA (1,1)
autoregressive (p) and moving average (q), we (1) Mean Absolute Percentage Error (MAPE):
experimented with different q values to see to what extent
this influenced not only model estimation but also the MAPE
diagnostic checking and consequently the forecast. Three
1 Pobs Ppred
different model MA values were varied while the AR was = | | 100 Equation (6)
kept constant. i.e. ARMA (1,1); (1,2) and (1,3) respectively. Pobs
=1
The best model parameters were selected based on the
model that gave the least Akaike Information Criterion (2) Root Mean Square Error (RMSE):
0.6 0.6
y = 0,3355x + 0,2367
r = 0,238
0.5 0.5
0.4 0.4
ARMA
ANN
0.3 y = 0,3139x + 0,2314 0.3

r = 0,274
0.2 0.2
0.1 0.1
0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
Actual
ARMA Forecast ANN Forecast
Figure 8. Comparing the ARMA (p, q) and ANN for the observed and forecasted weekly SW per capita
1 fitting with minimal errors between the forecasted and

RMSE = (Pobs Ppred ) Equation (7) observed data.
n
Weekly SW generation for the first 32 weeks which Diagnostic checking

included both the training and test sets were later
combined with the lead time 20 weeks and the errors Based on the autocorrelation plot, the Ljung-Box (1978)
between observed and predicted assessed as in Table 3. test attempts to establish the overall data randomness
Based on the MAPE, RMSE performance comparison within a time series at some chosen or predetermined lags.
between both models, there was however no significant Basically, the null hypothesis (H0) assumes that the data
difference between both models. The slightly higher MAPE are random or independently distributed whereas this is
value for the ANN model could be due to inherent the contrary for the alternative hypothesis (Ha) that
drawback of the MAPE in overestimating error value assumes the non-randomness nature of the data. We then
especially when the difference between observed and performed the Ljung-Box statistic test as:
predicted values is zero. (Pobs. = 0).
2

Q (LB)= Equation (8)
Table 4. Comparison of the model performances in terms of MAPE and =1
RMSE of both ANNS and ARMA models.
ANN ARMA
Where n=sample size, 2 =sample correlation at lag k,
MAPE (%) 10,60 6,98 =number of lags being tested. Choosing the 2 at lag 7
RMSE 0,080 0,102 and testing at the p=0,05 confidence level, the Q(LB) value
at 0,645 was less than the (chi-square) at 2.013 thereby
Ideally, the ARMA model would have shown poor reaffirming the H0 and adequacy of the ARMA (1,1) model.
performances in both MAPE and RMSE due to its inability As aforementioned the Q(LB) for randomness is reinforced
to model non-linear variables as the ANN model. In by the autocorrelation functions as in Figures 3 and 4
complex and non-linear data in the time series, the ARMA Hereby, all the autocorrelation at the subsequent lags fall
model would inevitably lose predictive accuracy as it is within the 95% confidence limits, other than that at lag 0.
unable to adequately capture the errors between the
observed and forecasted values in the time series and so Model verification
the prediction errors would increase (see Figure HH).
Conversely, the ANN model is capable of identifying Model verification dealt with ascertaining whether the
nonlinearity in the time series by having an additional residuals of the ARMA model as expressed by the ACF
computation or hidden layer that allows for better curve- and PACF had any discernible and systematic patterns
0.7 0.7
0.6 y = 0,0011x + 0,3077 0.6

r = 0,04
ARMA (1, 1) 0.5 0.5
0.4 0.4
ANN
0.3 0.3
0.2 0.2
0.1 0.1
0 0
1 8 15 22 29 36 43 50
Time. Weeks
Obs. ARMA (1, 1) ANN
Figure 9. Forecasting SW using the ANN and ARMA (1, 1) models
SW forecast using ARMA (1, 1) model

0.7 0.70
0.6 Actual Forecast 0.60

y = 0,0011x + 0,3033
r = 0,06
0.5 0.50
Forecast
Actual
0.4 0.40
0.3 0.30
0.2 0.20
0.1 0.10
0 -
1 6 11 16 21 26 31 36 41 46 51
Time: Weeks
Figure 10. Weekly SW forecast of households in Juba Town using the ARMA (1,1) model
with respect to the lags. Our study showed that, none of resources and personnel needed for effective SW
these correlations was significantly different from zero at management.
the 95% confidence limit indicating the goodness of fit and
appropriateness of the ARMA model. 2.2 Continuous Wavelet Transform (CWT)
One of the main goals of a CWT is to enable the easy
Forecasting visualization and interpretation of input signals and
frequencies as a function of time. The CWT decomposes
The two forecasting models ARMA and ANNs presented a continuous time function of a time series into the
in this paper (Figures 9 and 10) allowed us to predict the components called wavelets each with a different localized
weekly generated SW with mean value of about 0.35 frequency. Although CWT are usually applied for non-
kg/household and upper and lower limits of about 0.63 and stationary signals, we tried to apply both the Morlet wavelet
0.16 kg/household respectively. Towards the 51st week, transform and the Derivative of Gaussian (DOG) to
the generated SW was well below the 0.4 kg level and account for the high instantaneous amplitude or signal
would under constant economic and political conditions outbursts in the time series as well as in the recognition of
remain below the 0.5 kg level till 2020. This forecasting is inherent frequency patterns. The Morlet wavelet transform
useful in mobilizing and optimizing available financial (Goupillaud, et al., 1984) is given as:
2
0 () = 0 2 Equation (9) For the null hypothesis (H0), we assumed that the time
series had a mean power spectrum. Spikes or peaks in the
Where 0 is the non-dimensional frequency and the
wavelet power spectrum above this background spectrum
vertical scale corresponds to the length of wavelet, i.e., the
were shown as black contoured spectrum significant at
number of time steps used for the CWT. The cone of
the 5% level or equivalent to the 95% confidence level.
influence (COI) is the region where the wavelet power
Therefore, if the peak in the power spectrum of the Morlet
spectra are limited due to the influence of the end points
and DOG wavelets are significantly larger than the
of finite length signals also known as the e-folding time.
background spectrum, it is then assumed to be a true
Here, the signal discontinuity drops by a factor of (e-2) and
feature. There was better interpretation of the spectral
ensures that edge effects are negligible beyond this point.
decomposition for the observed data as well as for both
The DOG with m=derivative was set at 6 and expressed
models when the DOG m=6 as opposed to Morlet.basis
as:
2
function was applied.
(1)+1
DOG = ( ) Equation (10)
2
1
(+ )
2
Values of m=2 or 4 using DOG basis functions did not SUMMARY AND CONCLUSION
describe the spectral decomposition in the time series
adequately. The choice of the wavelet used for time- In this study on time series models, we analyzed and
frequency decomposition in a time series is critical. The compared the Artificial Neural Networks (ANNs) and the
use of the Morlet wavelet for example, showed that the Autoregressive Moving Averages (ARMA) in forecasting
frequency resolution was lumped together and was the weekly amounts of solid waste generated by single
localized within a 95% confidence limit. On the contrary, persons in fourteen households of Kator residential area of
using Derivative of Gaussian (DOG) wavelet with m=6, Juba town. For the ANNs model, the input training data
(Figure 11 a and b), the result was good time localization used were the average weekly amounts of solid waste
with strong frequencies. The Morlet and Derivative of collected from during the month of June 2011. The test
Gaussian (DOG) wavelets are plotted below. data of July 2011 were then used to validate the ANNs
model. The result showed that ARMA (1,1) slightly
The choice for the type of wavelets in interpreting the outperformed the ANNs model in terms of MAPE but not
frequency and intensity of data entries (xn) in a time series in terms of the RMSE. However, considering both
are shown in Figures 11 and 12. We used the forecasted performance indicators, there was no significant
data from both models as input data to generate the differences between both models and so either model
respective wavelets. For both models, (Figure 11a and b), could be used to forecast the amount of the solid waste
the dominant power spectrum was characterized by generation for the next weeks and years. Using both
smaller spikes between log scale 1.2 and 3.2 from week15 models is, however, for comparative reasons imperative in
to 30. This time coincided with the highest weekly waste order qualify and quantify the extent of deviation of the
generation around Christmas season where expectedly, estimated values from the observed mean. The projected
more households had much disposable incomes enabling values showed that by 2020, the weekly generation
an increased consumption and so increased solid waste according to both models will have reached about 0.596
generation. However, the ARMA as opposed to the ANNs kg/capita with a 95% confidence interval lying between
model showed certain areas outside the COI, clearly an 0.2-0.6 kg/capita. Such projections may be used by Juba
indication of model overestimation by the former. In both municipal or town council in the proper planning and
cases, using the Morlet basis function (Figure 11c and d) management of solid waste.
showed poor spectral decomposition as a function of time
than the DOG m=6.
ACKNOWLEDGEMENTS
As aforementioned with the Ljung-Box randomness test in
a time series with the ARMA model, the resulting power My special gratitude goes to Mr. Santino Wani Kenyi for
spectrum using the CWT can as well exhibit coherent and data collection as part of his BSc dissertation. We the
significant structures. A test of significance can be used to authors also wish to extend our thanks to the households
distinguish between significant and random structures in Kator Payam for allowing the conduction of the
(Mohr, 2003) in which case values in a power spectrum interviews.
may be considered as statistically significant at the 95%
level and therefore not random.
(a) (b)
(c) (d)
Figure 11. (a) Weekly solid waste output signals for the ARMA model when using the DOG m=6 (b) for the ANNs models. Whereas when using the
Morlet basis function for ARMA (d) for ANNs model. Wavelet power spectrum showing cone of influence (COI) at the 95% confidence interval with areas
of intense peaks and signals in red contoured in (black) while those with poor signals in (blue).
(a) (b)
Figure 12. (a) Weekly solid waste output signals using the Morlet basis function for the observed data and (b) using the DOG m=6 basis function.
Wavelet power spectrum showing cone of influence (COI) at the 95% confidence interval with areas of intense peaks and signals in red contoured in
(black) while those with poor signals in (blue).
REFERENCES using neural networks and sustainability indicators.

Sustain Science. 8: 37-46.
Abbasi M, Abduli MA, Omidvar B, Baghvand A (2013). Asante-Darko D, Adabor ES, Amponsah SK (2012). A
Results uncertainty of support vector machine and Fourier series model for forecasting solid waste
hybrid of wavelet transform-support vector machine generation in the Kumasi metropolis of Ghana.
models for solid waste generation forecasting. Transactions on Ecology and The Environment. 202:
Environmental Progress and Sustainable Energy. 173-185.
33:220. Barron, AR (1994). A comment on Neural networks: A
Abdoli, MA, Nezhad MF, Sede RS, Sadegh B review from a statistical perspective. Statistical
(2011). Environmental Progress and Sustainable Science. 9 (1): 3335.
Energy. 31 (4): 628636. Chattopadhyay, S and Bandyopadhyay, G (2007). Single
Aggarwal R, Kumar K (2015). Effect of Training Functions hidden layer artificial neural network models versus
of Artificial Neural Networks (ANN) on Time Series multiple linear regression model in forecasting the time
Forecasting. International Journal of Computer series of total ozone. Int. J. Environ. Sci. Tech. 4 (1):
Applications. (0975 8887), 109(3): 14-17. 141-149.
Antanasijevic D, Pocajt, V, Popovic I, Redzic N, Ristic M Chen HW and Chang NB (2000). Prediction analysis of
(2013). The forecasting of municipal waste generation solid waste generation based on grey fuzzy dynamic
modeling. Resour. Conserv. Recycling. 29, 1-18.
Dreiseitl S, Ohno-Machado L. (2002). Logistic regression Song J, He J, Zhu M, Tan D, Zhang Y, Ye S, Shen D, Zou
and artificial neural network classification models: a P (2014). Simulated Annealing Based Hybrid Forecast
methodology review. J Biomed Inform. 35: 352359. for Improving Daily Municipal Solid Waste Generation
Foster G (1996). Wavelets for period analysis of unevenly Prediction. The Scientific World Journal. 1-7.
sampled time series. Astron J. 112: 1709-1729. Song J and He J (2014). A Multistep Chaotic Model for
Frick P, Baliunas SL, Galyagin, D (1997). Wavelet analysis Municipal Solid Waste Generation Prediction.
of stellar chromospheric activity variations. Environmental Engineering Science. 31(8): 461-468.
Astrophysics J. 483: 426-434. Srinivasan D, Liew AC, Chang, CS. (1994). A neural
Goupillaud, P., Grossman, A., Morlet, J (1984). Cycle- network short-term load forecaster. Electric Power
octave and related transforms in seismic signal Systems Research. 28: 227234.
analysis. Geo-exploration. 23: 85.102. Sweldens W. (1998). The lifting scheme, a construction of
Karpuenkait A, Denafas G, Ruzgas T. (2016). second generation wavelets. SIAM J Math. Anal.
Forecasting Hazardous Waste Generation using Short 8(29): 511-546.
Data Sets: Case Study of Lithuania. Environmental Vishwakarma VP and Gupta MN (2011). A New Learning
Protection Engineering. 8(4): 357364. Algorithm for Single Hidden Layer Feedforward Neural
Karsoliya S. (2012). Approximating Number of Hidden Networks. International Journal of Computer
layer neurons in Multiple Hidden Layer BPNN Applications. (0975 8887), 28(6):26-33.
Architecture. International Journal of Engineering Wanas N, Auda G, Kamel MS, Karray F. (1998). On the
Trends and Technology. 3(6): 714-717. optimal number of hidden nodes in a neural network.
Ljung, GM, Box, GEP (1978). On a Measure of a Lack of Proceedings of the IEEE Canadian Conference on
Fit in Time Series Models. Biometrika. 65(2), 297 Electrical and Computer Engineering., 918921.
303. Xu L, Gao P, Cui S, Liu C (2013). A hybrid procedure for
Lomeling D, Modi, A. L., Kenyi, M. S., Kenyi, M. C., MSW generation forecasting at multiple time scales in
Silvestro, G. M., Yieb, J. L. L. (2016): Comparing the Xiamen City, China. Waste Management, 33(6):1324-
Macroaggregate Stability of Two Tropical Soils: Clay 31.
Soil (Eutric Vertisol) and Sandy Loam Soil (Eutric Zhang X. (1994). Time series analysis and prediction by
Leptosol). International Journal of Agriculture and networks. Optimization Methods and Software. 4:
Forestry. 6(4): 142-151. 151170.
Matias T, Souza F. Arajo R, Carlos AH (2013). Learning
of a single-hidden layer feedforward neural network
using an optimized extreme learning machine. Accepted 23 August 2017
Neurocomputing. 129: 428436.
McCulloch W and Pitts W (1943). A logical calculus of the
Citation: Lomeling D and Kenyi SW (2017) Forecasting
ideas immanent in nervous activity. Bulletin of
solid waste generation in Juba Town, South Sudan using
Mathematical Biophysics. 5:115133.
Artificial Neural Networks (ANNs) and Autoregressive
Mishra AK, Desai VR. (2006). Drought forecasting using
feed-forward recursive neural network. Ecological Moving Averages (ARMA). Journal of Environment and
Modelling. 198 (12): 127138. Waste Management 4(2): 211-223.
Mohr, LB (2003). Understanding significance testing. Sage
Pub., Inc., 2003.
Mwenda A, Kuznetsov D, Mirau S (2014). Time series
forecasting of solid waste generation in Arusha city-
Copyright: 2017 Lomeling and Kenyi. This is an open-
Tanzania. Mathematical Theory and Modelling. 4 (8):
2939. access article distributed under the terms of the Creative
Noori R, Karbassi A, Salman SM. (2010). Evaluation of Commons Attribution License, which permits unrestricted
PCA and Gamma test techniques on ANN operation use, distribution, and reproduction in any medium,
for weekly solid waste. J Environ Management. provided the original author and source are cited.
91(3):767-71. doi: 10.1016/j.jenvman.2009.10.007.
Petridis NE, Stiakakis E, Petridis K, Dey P (2016).
Estimation of computer waste quantities using
forecasting techniques. Journal of Cleaner Production.
112: 3072-3085.
Rimaityte I, Ruzgas T, Denafas G, Racys V, Martuzevicius
D (2011). Application and evaluation of forecasting
methods for municipal solid waste generation in an
eastern-European city. Waste Management &
Research. 30(1) 8998.

Forecasting Solid Waste Generation in Juba Town, South Sudan Using Artificial Neural Networks (ANNs) and Autoregressive Moving Averages (ARMA)

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Forecasting Solid Waste Generation in Juba Town, South Sudan Using Artificial Neural Networks (ANNs) and Autoregressive Moving Averages (ARMA)

Uploaded by

Copyright:

Available Formats

Journal of Environment and Waste Management

Forecasting solid waste generation in Juba Town, South

0.5 y = 0,0011x + 0,3077

Auto-Regressive Moving Average, ARMA model = Xt + (1 )Xt1

ACF residuals ARMA (1,1)

Autocorrelation Function, ACF

PACF residuals ARMA (1,1)

acf residuals of ARMA (1,2)

pacf residuals of ARMA (1,2)

0.3 y = 0,9798x + 0,0069 0.47

Actual Forecasted 0.55 0.55

Error by training set n=40 Error by test set n=12

Figure 7. Error distribution of training and test sets

Figure 7 shows the error distribution of both the training RESULTS

Parameters Standard Error, SE t-statistics p-Value AIC LLF

0.3 y = 0,3139x + 0,2314 0.3

1 fitting with minimal errors between the forecasted and

Weekly SW generation for the first 32 weeks which Diagnostic checking

0.6 y = 0,0011x + 0,3077 0.6

ARMA (1, 1) 0.5 0.5

Obs. ARMA (1, 1) ANN

Figure 9. Forecasting SW using the ANN and ARMA (1, 1) models

SW forecast using ARMA (1, 1) model

0.6 Actual Forecast 0.60

REFERENCES using neural networks and sustainability indicators.

You might also like