
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/321947685

An Improved Fuzzy Time Series Forecasting Model

Chapter  in  Studies in Computational Intelligence · January 2018


DOI: 10.1007/978-3-319-73150-6_38


All content following this page was uploaded by Tai Vo-Van on 19 March 2018.



An Improved Fuzzy Time Series
Forecasting Model

Ha Che-Ngoc1, Tai Vo-Van2, Quoc-Chanh Huynh-Le1, Vu Ho3,
Thao Nguyen-Trang1,4(B), and Minh-Tuyet Chu-Thi1,2,3,4

1 Faculty of Mathematics and Statistics, Ton Duc Thang University,
Ho Chi Minh City, Vietnam
nguyentrangthao@tdt.edu.vn
2 Department of Mathematics, Can Tho University, Can Tho, Vietnam
3 Faculty of Mathematical Economics, Banking University of Ho Chi Minh City,
Ho Chi Minh City, Vietnam
4 Division of Computational Mathematics and Engineering, Institute
for Computational Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam

Abstract. This model is developed from the model of Abbasov and
Mamedova (2003), in which the parameters are investigated by methods
and an algorithm to obtain the most suitable values for each data set.
The experiments on Azerbaijan’s population, Vietnam’s population and
Vietnam’s rice production demonstrate the feasibility and applicability
of the proposed methods.

Keywords: Fuzzy time series · Abbasov-Mamedova · Population · GDP · Vietnam

1 Introduction

Forecasting is the process of making predictions about the future based on
summarizing experience, assembling knowledge and analyzing related problems.
It is considered a basic process, the first step for organizations as well as
governments in building their policies and objectives. Because of its important
role in many fields, forecasting has received much attention from scientists.
Despite several discussions in the literature, the problems of forecasting have
not yet been completely solved. Looking for principles and rules in historical
data to establish a suitable forecasting model is the major method of statistics.
Time series and regression models have important roles in forecasting using
statistical methods, but they have many disadvantages in practice. A regression
model (Galton (1888); Pearson (1896)) requires a number of assumptions that are
often not satisfied, whereas a time series model, like ARIMA (Box and Jenkins
(1976)), performs poorly when there are abnormal changes or the time series is
nonstationary. To overcome the disadvantages of these two models, various models
have been recommended by many researchers, such as (Zecchin et al. (2011);
Wang and Fu (2006); Wang et al. (2001); Ren et al. (2016); Gupta and Wang (2010);
Zhu and Wang (2010); Park (2010);
© Springer International Publishing AG 2018
L. H. Anh et al. (eds.), Econometrics for Financial Applications, Studies in Computational
Intelligence 760, https://doi.org/10.1007/978-3-319-73150-6_38
Teo et al. (2001); Ghazali et al. (2009)). These proposals are important
contributions to the forecasting problem because they have given good results
on the considered data sets. However, none of them is optimal in all cases.
Some other models, such as the artificial neural network, support vector
regression (Cortes and Vapnik (1995)), multivariate adaptive regression splines
(Friedman (1991)), adaptive spline threshold autoregression (Lewis and Stevens
(1991)), autoregressive conditional heteroscedasticity (Engle (1982)) and hybrid
models (Zhang (2003); de Oliveira and Ludermir (2014)), were also proposed;
however, most of them still have many disadvantages in real forecasting
applications.
Based on the fuzzy set theory of Zadeh (Zadeh (1965)), fuzzy time series (FTS),
introduced by Song and Chissom (Song and Chissom (1993)), can address the gap
mentioned above. FTS has since attracted research interest and has been shown to
be more efficient than traditional statistical techniques (Song and Chissom
(1993); Tseng and Tzeng (2002)). Among these studies, Abbasov and Mamedova (AM)
proposed a model in which the variations of data are represented by linguistic
levels to forecast the population of Azerbaijan (Abbasov and Mamedova (2003)).
Because of its better performance for some kinds of forecasting problems, the AM
model has been applied in many applications; for instance, Sasu utilized the AM
model to forecast the population of Romania (Sasu (2010)). Some other important
studies of FTS are the models in (Chen (2004); Huarng (2001); Singh (2008)).
Nonetheless, all of the above methods use only fuzzified historical data
without forecasting. Moreover, the parameters in the models are not properly
investigated to find the optimal values for each data set. One model is rated
as better than the others only in some specific cases. As a result, there is no
model that is considered optimal in all situations.
To overcome the gap mentioned above, this article proposes methods to
identify suitable parameters in the AM model. Specifically, w, the number of
elements in the data set used as prior information to forecast the data, is
chosen through the value of the partial autocorrelation function (PACF), and the
number of fuzzy sets n is selected through an index that evaluates the
compactness of the divided intervals. After determining suitable w and n, the
optimal choice of C is searched via an efficient algorithm so that the
forecasting error is the smallest. The numerical examples illustrate the
proposed theories in detail and show that this method can improve the
performance in terms of forecasting accuracy.
The remainder of this paper is organized as follows. Section 2 reviews the AM
model and some related definitions. Section 3 proposes the modifications for the
AM model, in which the suitable parameters are determined by new methods and an
algorithm. The numerical examples are presented in Sect. 4, and Sect. 5 is the
conclusion.

2 Related Definitions and Abbasov-Mamedova Model


2.1 Related Definitions
Definition 1. Let U be a universe (domain), with a generic element of U
denoted by u. A fuzzy set A on universe U is a set defined by the membership
function μA (u) which is a mapping from the universe U into the unit interval:
μA (u) : U → [0, 1]
The value of the membership function at a specific u, μA (u), is called the
membership degree or grade of membership. If the grade of membership equals
one, u belongs completely to the fuzzy set. If the grade of membership equals
zero, u does not belong to the set. If the grade of membership is between 0 and
1, u is a partial member of the fuzzy set:

\[
\mu_A(u)
\begin{cases}
= 1 & u \text{ is a full member of } A, \\
\in (0, 1) & u \text{ is a partial member of } A, \\
= 0 & u \text{ is not a member of } A.
\end{cases}
\]

In the above definition, a fuzzy set A in U is characterized by a membership
function μA (u). There are several ways to define the membership function μA (u).
Some forms that are often used are the trapezoidal membership function, the
triangular membership function and the Gaussian membership function. An example
of some membership functions with different shapes is presented in Fig. 1.
We next examine a special case of Definition 1 where the universe is a time
series and introduce the definition proposed by (Song and Chissom (1993)) as
follows.
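The three shapes named above can be written down directly. The following is a minimal sketch, assuming the usual textbook parameterizations (break points a, b, c, d for the piecewise-linear shapes, and a centre and width sigma for the Gaussian); the function and parameter names are illustrative and do not come from the paper.

```python
import math


def triangular(u, a, b, c):
    """Triangular membership: 0 outside (a, c), rising linearly to 1 at the peak b."""
    if u <= a or u >= c:
        return 0.0
    if u <= b:
        return (u - a) / (b - a)
    return (c - u) / (c - b)


def trapezoidal(u, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), equal to 1 on the plateau [b, c]."""
    if u <= a or u >= d:
        return 0.0
    if u < b:
        return (u - a) / (b - a)
    if u <= c:
        return 1.0
    return (d - u) / (d - c)


def gaussian(u, center, sigma):
    """Gaussian membership: 1 at the centre, decaying smoothly on both sides."""
    return math.exp(-0.5 * ((u - center) / sigma) ** 2)
```

Each function maps an element u of the universe to a grade in [0, 1], as required by Definition 1.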

Fig. 1. Examples of membership functions with different shapes.

Definition 2. Let Y (t) ∈ R, t = 0, 1, 2, . . . be a time series, with a generic
element denoted by yt . If μA (yt ) is the membership function which is a mapping
from the universe Y (t) into [0, 1] and F (t) = {μA (y0 ), μA (y1 ), μA (y2 ), . . .} is a
collection of μA (yt ) then F (t) is called a fuzzy time series.
Definition 3. Given the actual data {X_i} and the predicted values {X̂_i},
i = 1, 2, . . . , m, the following popular indexes are used to evaluate an
established model.

Mean absolute error:
\[
\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| \hat{X}_i - X_i \right|. \tag{1}
\]
Mean squared error:
\[
\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{X}_i - X_i \right)^2. \tag{2}
\]
Mean absolute percentage error:
\[
\mathrm{MAPE} = \frac{1}{m} \sum_{i=1}^{m} \left( \frac{\left| \hat{X}_i - X_i \right|}{X_i} \cdot 100 \right). \tag{3}
\]
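The three indexes of Definition 3 translate into a few lines of code. This is a minimal sketch, assuming the actual values are strictly positive (so that the MAPE denominator is well defined) and that both sequences have the same length; the function names are illustrative.

```python
def mae(actual, predicted):
    """Mean absolute error, Formula (1)."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error, Formula (2)."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent, Formula (3); assumes actual > 0."""
    return sum(abs(p - a) / a * 100 for a, p in zip(actual, predicted)) / len(actual)
```

For example, with actual values [100, 200] and predictions [110, 190], the absolute errors are 10 and 10, the squared errors are 100 and 100, and the percentage errors are 10% and 5%.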

2.2 Abbasov-Mamedova Model


Given the historical data Xt corresponding to year t = 1, 2, ..., m, the AM model
consists of the following six steps.
– Step 1: Compute the variation Vt between every next and previous year by
Formula (4). Then define the universal set U by Formula (5).

Vt = Xt − Xt−1 (4)

U = [Vmin − D1 , Vmax + D2 ] (5)


where Vmin is the smallest variation, Vmax is the greatest variation, D1 and
D2 are positive numbers.
– Step 2: Divide the universal set U into n equal-length intervals ui , i =
1, 2, . . . , n, such that each interval ui contains at least one variation value.
Then find the middle points uim of each interval.
– Step 3: Define the fuzzy set Ai , i = 1, 2, . . . , n, on the universal set U by the
following formula:
\[
\mu_{A_i}(u) = \frac{1}{1 + \left[ C \times \left( u - u_m^i \right) \right]^2}, \tag{6}
\]
where u is a generic element of the universal set U , $u_m^i$ is the middle point of
the corresponding interval ui (i = 1, 2, . . . , n) and C is a constant.
– Step 4: Convert the input data, the time-point variations, into fuzzy values by
Formula (6).
– Step 5: Select an integer w, 1 < w < l, where l is the number of years prior
to the current year included in the experimental evaluation. Based on the chosen
w and the Mamdani inference system, we establish an operation matrix Ow (t) of
size i × j (here i is the number of rows, which corresponds to the sequence of
years t − 2, t − 3, . . . , t − w, and j is the number of columns, which
corresponds to the number of variation intervals) and a criteria matrix K(t) of
size 1 × j (a row matrix corresponding to the fuzzy variation in total
population for the year t − 1). After that, the relationship matrix R(t) is
calculated as follows.

\[
R(t)[i, j] = O^w[i, j] \cap K(t)[j],
\]
or
\[
R(t) = O^w(t) \otimes K(t) =
\begin{bmatrix}
R_{11} & R_{12} & \cdots & R_{1j} \\
R_{21} & R_{22} & \cdots & R_{2j} \\
\vdots & \vdots & \ddots & \vdots \\
R_{i1} & R_{i2} & \cdots & R_{ij}
\end{bmatrix},
\]
where Ow (t) is the operation matrix, K(t) is the criteria matrix, and ⊗ is the
min operator (∩).
Define F (t), the fuzzy forecasting of variations for the year t, in a fuzzy form
as follows.
\[
F(t) = [\max(R_{11}, \ldots, R_{i1}), \ldots, \max(R_{1j}, \ldots, R_{ij})]
     = [\mu_{A_1}(V_t), \mu_{A_2}(V_t), \ldots, \mu_{A_m}(V_t)].
\]

– Step 6: Defuzzify the results obtained in Step 5 according to Formula (7):
\[
V(t) = \frac{\sum_{i=1}^{m} \mu_{A_i}(V_t) \times u_m^i}{\sum_{i=1}^{m} \mu_{A_i}(V_t)}, \tag{7}
\]
where μAi (Vt ) is the value of the membership function of the forecast variation
in interval i, and V (t) is the defuzzified forecast variation.
In order to estimate the forecast value X(t) for year t, the following formula
is utilized:
\[
X(t) = X(t - 1) + V(t), \tag{8}
\]
where X(t − 1) is the forecast value for year t − 1 and V (t) is the variation
for year t.
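Steps 3 to 6 can be sketched compactly as follows. This is a minimal illustration rather than the authors' implementation: it assumes the interval middle points from Step 2 are already available in the list `mids`, that `variations` holds the known variations up to year t − 1 in chronological order, and that the rows of the operation matrix may be stored in any order (the column-wise maximum does not depend on it). All function names are hypothetical.

```python
def membership(u, mid, c):
    """Formula (6): grade of u in the fuzzy set centred at the middle point mid."""
    return 1.0 / (1.0 + (c * (u - mid)) ** 2)


def fuzzify(v, mids, c):
    """Step 4: convert one variation into its vector of membership grades."""
    return [membership(v, mid, c) for mid in mids]


def forecast_variation(variations, mids, c, w=2):
    """Steps 5 and 6: min-max composition followed by centroid defuzzification."""
    if w < 2 or len(variations) < w:
        raise ValueError("need w >= 2 and at least w known variations")
    criteria = fuzzify(variations[-1], mids, c)                    # K(t): year t-1
    operation = [fuzzify(v, mids, c) for v in variations[-w:-1]]   # O^w(t): years t-w .. t-2
    relation = [[min(o, k) for o, k in zip(row, criteria)] for row in operation]
    fuzzy_forecast = [max(column) for column in zip(*relation)]    # F(t)
    weighted = sum(f * mid for f, mid in zip(fuzzy_forecast, mids))
    return weighted / sum(fuzzy_forecast)                          # Formula (7)


def forecast_value(previous_value, variations, mids, c, w=2):
    """Formula (8): add the defuzzified variation to the value of year t-1."""
    return previous_value + forecast_variation(variations, mids, c, w)
```

As a usage check, with the values of Example 1 expressed in thousands (middle points 65.75, 71.65, ..., 112.95, C = 0.3197, w = 2, and the variations 105.3 and 93.2 for 1988 and 1989), `forecast_variation` gives approximately 97.18, in line with the value V(1990) = 97181 reported in Sect. 4.1.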

3 The Proposed Method


In the AM model, three parameters affect the forecasting result: the number of
equal-length intervals n, the positive integer w and the constant C. However,
in the studies of (Abbasov and Mamedova (2003); Sasu (2010)), these parameters
were identified only from experience. Hence, this approach is not suitable when
dealing with various types of time series. For w, Song and Chissom conducted a
survey on specific data and pointed out that w = 2 is the best. They also
concluded that the forecasting result is better if a less complex model, with
smaller w, is utilized (Song and Chissom (1993)). However, this conclusion is
drawn from only a few specific surveys, so it lacks generality. In fact, w is
the number of previous time points that have a strong influence on the current
value of the time series (it is similar to the autoregressive order p in the
autoregressive integrated moving average model, ARIMA). Therefore, it is not
reasonable to utilize a model with a fixed value of w for all types of time
series. For instance, when dealing with monthly or quarterly data, w = 7 is
considered an unreasonable parameter. According to the above remarks, the
forecasting performance of the AM model can be expected to improve significantly
if its parameters are determined in reasonable ways. To overcome the limitations
mentioned above, this section proposes a method called MAM (Modified
Abbasov-Mamedova model), which can identify the parameters n, w and C in a
reasonable way. Details of the proposed method are presented as follows.

3.1 Determine the Number of Intervals n

In the fuzzification step, the middle point ui0 is used as the representative element
of the ith interval. Therefore, if the data in each interval are well represented by
ui0 , the forecasting performance can be improved. In general, we can evaluate
whether data are well represented by ui0 according to a compactness measure
between this middle point and the elements in the interval i. Figure 2 illustrates a
few cases of representative elements. The universal set U is defined as the
interval (0, 3) and divided into three equal-length intervals. The distances
between the middle points (red points) and the elements belonging to the
intervals (0, 1) and (1, 2) are large; therefore, using u10 and u20 as the
representative elements can lead to a low forecasting performance. Conversely,
u30 is close to the elements in the 3rd interval, which leads to a good measure of
compactness as well as a high forecasting performance. Therefore, it is important
to choose the number of intervals n so that the measure of compactness is
optimized. Following this idea, this paper proposes a measure denoted
as MMSE to evaluate the compactness of the partition. MMSE is computed by (9):
\[
\mathrm{MMSE} = \frac{1}{n} \sum_{i=1}^{n} \sum_{v_t \in u_i} \frac{\left( v_t - u_0^i \right)^2}{n_i}, \tag{9}
\]
where n is the number of equal-length intervals, vt is the variation, and ui0 and ni
are the middle point of and the number of elements belonging to the interval i,
respectively.
Clearly, the smaller the distances between the middle point ui0 and the variations
vt are, the smaller the numerator and hence the MMSE criterion. Therefore, MMSE
can be used to evaluate the compactness of the time series model with different
numbers of intervals n. In addition, when the number of intervals n is increased
beyond a specific number, empty intervals containing no variation values are
created. In that case, the denominator ni in (9) is equal to 0 for an empty
interval, and MMSE goes to infinity. Hence, such a choice of n is not suitable.
This is also entirely consistent with the constraint mentioned in (Abbasov and
Mamedova (2003)).
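The MMSE criterion in (9) can be sketched as below. This is a minimal illustration under stated assumptions: the universal set is taken as [Vmin, Vmax] (that is, D1 = D2 = 0, as in Example 1), a variation equal to the upper bound is placed in the last interval, and a partition with an empty interval returns infinity to mirror the division by zero described above. The function name `mmse` is illustrative.

```python
import math


def mmse(variations, n):
    """Compactness of n equal-length intervals around their middle points, Formula (9)."""
    v_min, v_max = min(variations), max(variations)   # assumes D1 = D2 = 0
    if v_max == v_min:
        raise ValueError("all variations are equal; the universal set has zero length")
    width = (v_max - v_min) / n
    groups = [[] for _ in range(n)]
    for v in variations:
        index = min(int((v - v_min) / width), n - 1)  # the upper bound joins the last interval
        groups[index].append(v)
    if any(not group for group in groups):
        return math.inf                               # empty interval: n_i = 0 in (9)
    total = 0.0
    for i, group in enumerate(groups):
        mid = v_min + (i + 0.5) * width
        total += sum((v - mid) ** 2 for v in group) / len(group)
    return total / n
```

Applied to the 21 variations listed in Table 1 (in thousands), this sketch returns a finite value for n = 9 and infinity for n = 10, which agrees with the choice n = 9 made in Example 1.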
Fig. 2. Illustration of representative elements.

3.2 Determine w

In the literature, the parameter w is chosen according to experience. Although
Song and Chissom conducted a survey on specific datasets and pointed out that
w = 2 is the best, this conclusion is drawn from only a few specific surveys and
lacks generality. Here, an effective method that can determine w is proposed,
based on the partial autocorrelation function (PACF) of the time series.
Let φkk be the PACF at lag k (k = 1, 2, . . .). φkk can be obtained according to
the recursive formula (Durbin (1960)) as follows:
\[
\phi_{p+1,j} = \phi_{p,j} - \phi_{p+1,p+1}\, \phi_{p,p-j+1}, \tag{10}
\]
\[
\phi_{p+1,p+1} = \frac{r_{p+1} - \sum_{j=1}^{p} \phi_{p,j}\, r_{p+1-j}}{1 - \sum_{j=1}^{p} \phi_{p,j}\, r_j}, \tag{11}
\]
where rk is the autocorrelation function (ACF) at lag k and φ1,1 = r1 .


The PACF at lag k considers only the direct correlation between vt and vt−k ,
with the linear dependence on the intermediate variables removed.
Therefore, it can accurately reflect the number of previous years on which the
current year depends. Based on this result, we can determine the appropriate w.
Note that when the PACF presents the largest value at lag 1, w is taken as
2 so that the AM model conditions are satisfied. In practice, statistical
programs such as R and Matlab can be used to calculate the PACF and
determine a reasonable w for the fuzzy time series model.
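The recursion (10)-(11) and the resulting choice of w can be sketched as follows. The selection rule is an assumption made for illustration: w is taken as the lag with the largest absolute PACF value, raised to 2 when that lag is 1, which matches the remark above but is not spelled out as a formula in the text. The sample ACF uses the standard biased estimator, and all function names are hypothetical.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (r_0 = 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf_from_acf(r):
    """Durbin recursion, Formulas (10)-(11); returns [phi_11, ..., phi_KK]."""
    max_lag = len(r) - 1
    phi = {(1, 1): r[1]}
    for p in range(1, max_lag):
        num = r[p + 1] - sum(phi[p, j] * r[p + 1 - j] for j in range(1, p + 1))
        den = 1.0 - sum(phi[p, j] * r[j] for j in range(1, p + 1))
        phi[p + 1, p + 1] = num / den                                         # Formula (11)
        for j in range(1, p + 1):
            phi[p + 1, j] = phi[p, j] - phi[p + 1, p + 1] * phi[p, p - j + 1]  # Formula (10)
    return [phi[k, k] for k in range(1, max_lag + 1)]


def choose_w(series, max_lag):
    """Assumed rule: lag with the largest |PACF|, but never smaller than 2."""
    pacf = pacf_from_acf(acf(series, max_lag))
    best_lag = max(range(1, max_lag + 1), key=lambda k: abs(pacf[k - 1]))
    return max(best_lag, 2)
```

A quick sanity check: for autocorrelations of the form r_k = 0.5^k, which correspond to a first-order autoregressive process, the recursion gives a PACF of 0.5 at lag 1 and 0 at all higher lags.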

3.3 Determine the Constant C


Given the AM model with specific parameters w and n, the value of μAi (ui ) as well
as the forecasting result is strongly affected by the constant C. However, the
previous studies of (Abbasov and Mamedova (2003); Sasu (2010)) did not offer
guidance on how to determine a reasonable C for each specific dataset. This
subsection proposes an algorithm to determine the optimum C (DOC), using the
following five steps:
Step 1. Initialize an integer k (k > 499) and a very small positive number ε, where
k is the number of points into which the search interval is divided in each
iteration and ε is the error of C.
Step 2. When t = 0, assign the values $a^{(0)} = 0$, $b^{(0)} = 1$, $\Delta C^{(0)} = \frac{1}{2}$, $n^{(0)} = 1$.
Step 3. When t ≥ 1, calculate the values
\[
a^{(t)} = a^{(t-1)} + \left( n^{(t-1)} - 1 \right) \Delta C^{(t-1)},
\]
\[
b^{(t)} = a^{(t-1)} + \left( n^{(t-1)} + 1 \right) \Delta C^{(t-1)}, \qquad
\Delta C^{(t)} = \frac{b^{(t)} - a^{(t)}}{k}.
\]
If a = 0 and b = 1, then $C_i^{(t)} = a^{(t)} + i\,\Delta C^{(t)}$, i = 1, 2, . . . , k − 1.
If a = 0 and b = 1, then $C_i^{(t)} = a^{(t)} + i\,\Delta C^{(t)}$, i = 1, 2, . . . , k.
If a = 0 and b = 1, then $C_i^{(t)} = a^{(t)} + i\,\Delta C^{(t)}$, i = 1, 2, . . . , k − 1.
If a = 0 and b = 1, then $C_i^{(t)} = a^{(t)} + i\,\Delta C^{(t)}$, i = 1, 2, . . . , k.
Step 4. Run the Abbasov-Mamedova model with all the values $C_i^{(t)}$ from Step 3.
Find $C_n^{(t)}$ at which the criterion CEF is the current best.
Step 5. With the new n, repeat Step 3 and Step 4 until $b^{(m)} - a^{(m)} < \varepsilon$; the result is
$C = C_n^{(m)} = a^{(m)} + n\,\Delta C^{(m)}$.
Note that,

(i) In each iteration, (k + 1) values of C are considered. In the numerical
examples in this paper, k = 1000 is chosen.
(ii) ε is a very small number and is chosen arbitrarily. The smaller ε is, the more
iterations and computer time are required. In fact, the optimum value of
C can be determined with an acceptable error depending on the value of ε.
In the numerical examples, ε = 10−6 is chosen.
(iii) There are many criteria for evaluating the forecasting model
(CEF). In this article, we use MAE, MAPE and MSE presented in
Subsect. 2.1 to compare established models.
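The idea behind DOC, a grid search on [0, 1] that is repeatedly narrowed to the two cells around the current best grid point, can be sketched as follows. This is an illustrative simplification, not the exact procedure: the boundary cases of Step 3 are replaced by two assumptions (C = 0 is skipped because Formula (6) then gives membership 1 everywhere, and the narrowed interval is clipped to [0, 1]), and `loss` stands for any CEF such as the MAE of the AM model run with a given C. The names `doc_search` and `loss` are hypothetical.

```python
def doc_search(loss, k=1000, eps=1e-6):
    """Iteratively refined grid search for the constant C on (0, 1]."""
    a, b = 0.0, 1.0
    while True:
        step = (b - a) / k
        # Candidate grid a, a + step, ..., b; C = 0 is excluded (assumption).
        candidates = [a + i * step for i in range(k + 1) if a + i * step > 0.0]
        best = min(candidates, key=loss)
        if b - a < eps:                  # stopping rule of Step 5
            return best
        # Narrow the search to the neighbours of the current best point.
        a, b = max(best - step, 0.0), min(best + step, 1.0)
```

With k = 1000 the interval length shrinks by a factor of 500 per iteration, so the stopping rule ε = 10−6 is met after a handful of passes, each costing about k + 1 runs of the forecasting model. Like any grid refinement, the sketch can settle on a local optimum if the criterion has several minima finer than the first grid.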
The DOC algorithm is illustrated by Fig. 3.

Fig. 3. Algorithm to determine C.

4 Numerical Examples
Section 3 proposes the methods of determining w, n and C in order to improve
the forecasting performance of the AM model. In Sect. 4, this paper presents two
examples to illustrate and test the forecasting performance of the proposed method.
Specifically, Example 1 (Sect. 4.1) presents the proposed method in detail when
dealing with well-known data, Azerbaijan’s population. This example, in
addition to clarifying the proposed algorithm, tests its forecasting performance.
In Example 2 (Sect. 4.2), the new method is applied to forecast the GDP per capita
in Vietnam. In each example, we compare the forecasting results of the proposed
method with those of the AM model (Abbasov and Mamedova (2003)), the
Chen model (Chen (1996)) and the Huarng model (Huarng (2001)). Furthermore,
to show that the proposed method is more efficient in predicting time
series than the traditional statistical methods, MAM is also compared with the
auto-regressive model AR(p), where p is chosen based on the AIC criterion.
4.1 Example 1

This example forecasts the annual population (thousand persons) of Azerbaijan
from 1980 to 2001 to clarify the proposed method and test its performance.
This is a well-known dataset presented in (Abbasov and Mamedova (2003)).
The detailed procedure is presented in the following six steps.
Step 1. Table 1 presents the annual populations over 1980–2001 and the variations
in all given years. The variation for the current year is the difference between the
population values in the current year and the previous year. For example, the
variation for 1990 is equal to 7131900 − 7021200 = 110700. To define the universal
set U , first of all, the smallest and greatest variation values must be found over
the interval [1980, 2001]; then, to ensure the smoothness of the boundaries of the
interval, adequate non-negative numbers D1 , D2 are selected. After that, the
universal set U can be defined as U = [Vmin − D1 , Vmax + D2 ], where Vmin = 62800
is the smallest variation, Vmax = 115900 is the greatest variation, D1 = 0 and
D2 = 0. Thus, the universal set U is defined as U = [62800, 115900]. Based on
the variations in Table 1, MMSE and PACF are computed and presented in
Fig. 4, and are used as the criteria to find suitable n and w. From Fig. 4,
it can be seen that MMSE decreases as the number of equal intervals n increases.
This shows that the more intervals we have, the better the middle points
represent the data. However, the splitting of intervals must stop at a specific
level. Specifically, in Fig. 4, empty intervals are created when n is greater than
or equal to 10. As a result, MMSE is undefined and the partition is considered
unreasonable in those cases, which means that the suitable number of equal
intervals is 9. For w, according to Fig. 4, it can be observed that the PACF
reaches its maximum value at lag 1. As mentioned in Sect. 3, w is chosen as 2 in
this case. This is also consistent with the survey of (Song and Chissom (1993)).
In summary, based on MMSE and PACF, n = 9 and w = 2 are utilized in the AM
model. With n = 9 and w = 2, performing the DOC algorithm, the optimum value of
C is reached at 0.3197.

[Fig. 4 plots: MMSE against the number of intervals n, and the partial ACF of the variation series against the lag.]

Fig. 4. MMSE according to n and the PACF.


Step 2. The universal set U must be divided into 9 equal intervals:


u1 = [62800, 68700], u2 = [68700, 74600]
u3 = [74600, 80500], u4 = [80500, 86400]
u5 = [86400, 92300], u6 = [92300, 98200]
u7 = [98200, 104100], u8 = [104100, 110000]
u9 = [110000, 115900]
The middle points of the intervals are determined as follows: u1m = 65750,
u2m = 71650, u3m = 77550, u4m = 83450, u5m = 89350, u6m = 95250, u7m = 101150,
u8m = 107050, u9m = 112950.
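Steps 1 and 2 of this example are simple enough to reproduce numerically. The sketch below, with hypothetical helper names, computes year-to-year variations as in Formula (4) and splits a universal set into n equal-length intervals with their middle points; it assumes the bounds of U are given directly (here D1 = D2 = 0).

```python
def variations(values):
    """Formula (4): difference between each value and the previous one."""
    return [current - previous for previous, current in zip(values, values[1:])]


def equal_intervals(lower, upper, n):
    """Step 2: n equal-length intervals of [lower, upper] and their middle points."""
    width = (upper - lower) / n
    bounds = [lower + i * width for i in range(n + 1)]
    intervals = list(zip(bounds[:-1], bounds[1:]))
    mids = [(lo + hi) / 2 for lo, hi in intervals]
    return intervals, mids
```

Calling `equal_intervals(62800, 115900, 9)` yields intervals of length 5900 and the middle points 65750, 71650, ..., 112950 listed above.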
Steps 3 and 4. Define the fuzzy sets A1 , A2 , . . . , A9 on the universal set U and
convert the input data into fuzzy values by Formula (6). An example of the
continuous membership functions of the fuzzy sets Ai is shown in Fig. 5.
For the sake of brevity, the results of fuzzification for all the given years are
shown in Table 1, rounded to two decimal places.

Table 1. Population of Azerbaijan (thousand persons)

T Nt Vt Fuzzy time series Ft (membership degrees in A1 , . . . , A9 )
1980 6114.3 0
1981 6206.7 92.4 0.00 0.02 0.04 0.11 0.51 0.55 0.11 0.04 0.02
1982 6308.8 102.1 0.01 0.01 0.02 0.03 0.06 0.17 0.92 0.29 0.08
1983 6406.3 97.5 0.01 0.01 0.02 0.05 0.13 0.66 0.42 0.10 0.04
1984 6513.3 107.0 0.01 0.01 0.01 0.02 0.03 0.07 0.22 1.00 0.22
1985 6622.4 109.1 0.01 0.01 0.01 0.01 0.02 0.05 0.13 0.70 0.40
1986 6717.9 95.5 0.01 0.02 0.03 0.06 0.21 0.99 0.23 0.07 0.03
1987 6822.7 104.8 0.01 0.01 0.01 0.02 0.04 0.10 0.42 0.66 0.13
1988 6928.0 105.3 0.01 0.01 0.01 0.02 0.04 0.09 0.36 0.76 0.14
1989 7021.2 93.2 0.01 0.02 0.04 0.09 0.40 0.70 0.13 0.05 0.02
1990 7131.9 110.7 0.00 0.01 0.01 0.01 0.02 0.04 0.10 0.42 0.66
1991 7218.5 86.6 0.02 0.04 0.11 0.50 0.56 0.12 0.04 0.02 0.01
1992 7324.1 105.6 0.01 0.01 0.01 0.02 0.04 0.08 0.33 0.82 0.15
1993 7440.0 115.9 0.00 0.00 0.01 0.01 0.01 0.02 0.04 0.11 0.53
1994 7549.6 109.6 0.01 0.01 0.01 0.01 0.02 0.05 0.12 0.60 0.47
1995 7643.5 93.9 0.01 0.02 0.04 0.08 0.32 0.84 0.16 0.05 0.03
1996 7726.2 82.7 0.03 0.07 0.27 0.95 0.18 0.06 0.03 0.02 0.01
1997 7799.8 73.6 0.14 0.72 0.39 0.09 0.04 0.02 0.01 0.01 0.01
1998 7879.7 79.9 0.05 0.13 0.64 0.44 0.10 0.04 0.02 0.01 0.01
1999 7953.4 73.7 0.13 0.70 0.40 0.09 0.04 0.02 0.01 0.01 0.01
2000 8016.2 62.8 0.53 0.11 0.04 0.02 0.01 0.01 0.01 0.00 0.00
2001 8081.0 64.8 0.92 0.17 0.06 0.03 0.02 0.01 0.01 0.01 0.00
Fig. 5. Membership functions of 9 fuzzy sets.

Step 5. Applying the min−max operator to forecast the population in 1990, we
have the results:

O2 (1990) = [0.01 0.01 0.01 0.02 0.04 0.09 0.36 0.76 0.14]
K(1990) = [0.01 0.02 0.04 0.09 0.40 0.70 0.13 0.05 0.02]
R(1990) = [0.01 0.01 0.01 0.02 0.04 0.09 0.13 0.05 0.02]

Hence, the fuzzy forecasting of the variation for the year 1990, F (1990), is

[0.01 0.01 0.01 0.02 0.04 0.09 0.13 0.05 0.02]

Step 6. Finally, compute the variation in 1990 by Formula (7):
\[
V(1990) = \frac{0.01 \times 65750 + \cdots + 0.02 \times 112950}{0.01 + \cdots + 0.02} = 97181.
\]
Hence, the forecast population in 1990 is:
\[
X(1990) = X(1989) + V(1990) = 7021200 + 97181 = 7118381.
\]

Proceeding in a similar way for the remaining years, the forecasting results are
presented in Table 2 and Fig. 6. As shown in Table 3, in addition to the
proposed method, the performance of the models presented in (Abbasov and
Mamedova (2003); Chen (1996); Huarng (2001)) is examined for comparison purposes.
It can be observed that the MAM model outperforms the others in terms of accuracy
for all criteria. The result gives initial evidence that the proposed method is
suitable; it is retested in an actual application as follows.
Table 2. Actual and forecast population (thousand persons)

Years Actual Forecasted
Total Variation Total Variation
1988 6928.0 105.3 6921.096 98.396
1989 7021.2 93.2 7031.659 103.659
1990 7131.9 110.7 7118.381 97.181
1991 7218.5 86.6 7230.382 98.482
1992 7324.1 105.6 7314.037 95.537
1993 7440.0 115.9 7418.289 94.189
1994 7549.6 109.6 7545.407 105.407
1995 7643.5 93.9 7658.468 108.868
1996 7726.2 82.7 7742.128 98.628
1997 7799.8 73.6 7814.826 88.626
1998 7879.7 79.9 7879.700 79.900
1999 7953.4 73.7 7958.389 78.689
2000 8016.2 62.8 8032.098 78.698
2001 8081.0 64.8 8089.909 73.709

[Fig. 6 plot: actual series vs forecasted series by the Abbasov−Mamedova model of 9 fuzzy sets with w = 2 and C = 0.319363409408; series shown: Actual, Forecasted.]

Fig. 6. The forecasting result of proposed model.

Table 3. The performance of comparative methods

MAM AM Chen (1996) Huarng (2001)


MAE 9.989 15.007 77.756 77.756
MAPE 0.136 0.197 1.099 1.099
MSE 127.751 290.459 8835.054 8835.054
4.2 Example 2

GDP is gross domestic product converted to international dollars using
purchasing power parity rates. It is one of the major measures of a nation’s
economic health. Therefore, GDP forecasting has an essential role for countries
all over the world. In this example, the new method is applied to forecast the
GDP per capita in Vietnam. In particular, the GDP per capita (USD) values from
1990 to 2015 are collected (http://data.worldbank.org). Similar to Example 1, the
performance of the proposed method is compared with those of (Abbasov and
Mamedova (2003); Chen (1996); Huarng (2001)) (Fig. 7 and Table 4). In addition,
the forecasts for the 5-year period 2016–2020 are presented (Table 5). Table 4
and Fig. 7 show that the proposed method has good forecasting results. The
established model fits almost all the actual data well, with a mean absolute
error of less than 26 USD per year and a mean absolute percentage error of less
than 1% in comparison with the actual data. In addition, Table 4 and Fig. 7
demonstrate

Table 4. Forecasting results of comparative methods

GDP per capita (USD)
MAM model AM model Chen model Huarng model
MAE 25.669 42.942 201.552 201.552
MAPE 0.744 1.224 7.86 7.86
MSE 1117.06 2834.763 62332.37 62332.37

[Fig. 7 plot: actual GDP vs forecasted GDP by models (USD against year, 1990–2015); series shown: Actual GDP, Abbasov−Mamedova, New model, Chen, Huarng.]

Fig. 7. Vietnam GDP per capita forecasting results of comparative models


Table 5. Out-of-sample forecasting results

GDP
Year Forecast
2016 5908.166
2017 6148.920
2018 6372.591
2019 6588.328
2020 6796.071

the superiority of the proposed method over the comparative models, as it always
shows the best results. For out-of-sample forecasting, it can be seen from Table 5
that Vietnam’s GDP per capita continues to increase steadily, with an average
increase of over 200 USD each year. GDP will reach over 6500 USD per capita by
2019.

4.3 On the Comparison Between MAM and the Traditional Statistical Method
As mentioned earlier, we revisit the two experiments above, in which MAM is
compared with the auto-regressive model AR(p) (the auto-regressive order p is
chosen based on the AIC criterion). The brief summary of results in Table 6 shows
that the MAM model outperforms the AR model. Based on the above examples, we can
see that MAM is a good and competitive model in comparison with the traditional
statistical method as well as other fuzzy time series models. It is feasible and
applicable to practical problems, particularly population and GDP forecasting.

Table 6. The results of MAM and AR models for Example 1 and Example 2

Example 1 Example 2
MAM AR MAM AR
MAE 9.989 93.908 25.669 165.612
MSE 127.751 8995.331 1117.06 30881.76
MAPE 0.136 1.329 0.744 5.323

5 Conclusion
This study proposes an improved fuzzy time series forecasting model based on
methods to determine the suitable parameters for each data set in the AM
model. The numerical examples show that the proposed method is feasible
and applicable to practical problems. In the future, a program will be written in
the R statistical software to apply the proposed model to many different practical
problems.
References
Abbasov, A.M., Mamedova, M.H.: Application of fuzzy time series to population fore-
casting. Vienna Univ. Technol. 12, 545–552 (2003)
Box, G.E.P., Jenkins, G.M.: Time series analysis: forecasting and control. Holden-Day
Series in Time Series Analysis, Revised edn. Holden-Day, San Francisco (1976)
Chen, S.M.: Forecasting enrollments based on fuzzy time series. Fuzzy Sets Syst. 81(3),
311–319 (1996)
Chen, S.M., Hsu, C.C.: A new method to forecast enrollments using fuzzy time series.
Int. J. Appl. Sci. Eng. 2(3), 234–244 (2004)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
Durbin, J.: The fitting of time-series models. Revue de l’Institut International de Statis-
tique 28(3), 233–244 (1960)
Engle, R.F.: Autoregressive conditional heteroscedasticity with estimates of the vari-
ance of United Kingdom inflation. Econometrica 50, 987–1007 (1982). Journal of
the Econometric Society
Friedman, J.H.: Multivariate adaptive regression splines. Ann. Stat. 19, 1–67 (1991)
Galton, F.: Co-relations and their measurement, chiefly from anthropometric data.
Proc. Roy. Soc. Lond. 45(273–279), 135–145 (1888)
Ghazali, R., Hussain, A.J., Al-Jumeily, D., Lisboa, P.: Time series prediction using
dynamic ridge polynomial neural networks. In: 2009 Second International Conference
on Developments in eSystems Engineering (DESE), pp. 354–363. IEEE (2009)
Gupta, S., Wang, L.P.: Stock forecasting with feedforward neural networks and gradual
data sub-sampling. Aust. J. Intell. Inf. Process. Syst. 11(4), 14–17 (2010)
Huarng, K.: Heuristic models of fuzzy time series for forecasting. Fuzzy Sets Syst.
123(3), 369–386 (2001)
Lewis, P.A.W., Stevens, J.G.: Nonlinear modeling of time series using multivariate
adaptive regression splines (MARS). J. Am. Stat. Assoc. 86(416), 864–877 (1991)
de Oliveira, J.F.L., Ludermir, T.B.: A distributed PSO-ARIMA-SVR hybrid system
for time series forecasting. In: 2014 IEEE International Conference on Systems, Man
and Cybernetics (SMC), pp. 3867–3872. IEEE (2014)
Park, D.C.: A time series data prediction scheme using bilinear recurrent neural net-
work. In: 2010 International Conference on Information Science and Applications
(ICISA), pp 1–7. IEEE (2010)
Pearson, K.: Mathematical contributions to the theory of evolution. III. Regression,
heredity, and panmixia. Philos. Trans. Roy. Soc. Lond. Ser. A Contain. Pap. Math.
Phys. Character 187, 253–318 (1896)
Ren, Y., Suganthan, P.N., Srikanth, N., Amaratunga, G.: Random vector functional
link network for short-term electricity load demand forecasting. Inf. Sci. 367, 1078–
1093 (2016)
Sasu, A.: An application of fuzzy time series to the Romanian population. Bulletin
Transilv. Univ. Brasov 3, 52 (2010)
Singh, S.R.: A computational method of forecasting based on fuzzy time series. Math.
Comput. Simul. 79(3), 539–554 (2008). https://doi.org/10.1016/j.matcom.2008.02.
026
Song, Q., Chissom, B.S.: Forecasting enrollments with fuzzy time series part I. Fuzzy
Sets Syst. 54(1), 1–9 (1993)
Teo, K., Wang, L., Lin, Z.: Wavelet packet multi-layer perceptron for chaotic time series
prediction: effects of weight initialization. In: Computational Science-ICCS 2001, pp
310–317 (2001)
Tseng, F.M., Tzeng, G.H.: A fuzzy seasonal ARIMA model for forecasting. Fuzzy
Sets Syst. 126(3), 367–376 (2002). https://doi.org/10.1016/S0165-0114(01)00047-1.
http://www.sciencedirect.com/science/article/pii/S0165011401000471
Wang, L., Fu, X.: Data Mining with Computational Intelligence. Springer Science &
Business Media, New York (2006)
Wang, L., Teo, K.K., Lin, Z.: Predicting time series with wavelet packet neural net-
works. In: Proceedings of the International Joint Conference on Neural Networks,
IJCNN 2001. vol 3, pp 1593–1597. IEEE (2001)
Zadeh, L.A.: Fuzzy sets. Inf. Control 8(3), 338–353 (1965)
Zecchin, C., Facchinetti, A., Sparacino, G., De Nicolao, G., Cobelli, C.: A new neural
network approach for short-term glucose prediction using continuous glucose moni-
toring time-series and meal information. In: 2011 Annual International Conference
of the IEEE Engineering in Medicine and Biology Society, EMBC, pp 5653–5656.
IEEE (2011)
Zhang, G.P.: Time series forecasting using a hybrid ARIMA and neural network model.
Neurocomputing 50, 159–175 (2003)
Zhu, M., Wang, L.: Intelligent trading using support vector regression and multilayer
perceptrons optimized with genetic algorithms. In: The 2010 International Joint
Conference on Neural Networks (IJCNN), pp. 1–5. IEEE (2010)
