
KIT, University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association


KNOWLEDGE MANAGEMENT GROUP
INSTITUTE OF APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS, FACULTY OF ECONOMICS AND BUSINESS ENGINEERING
www.kit.edu
Complex Time Series Analysis
Binh Luong
Presentation of Diploma Thesis

Supervisors: Prof. Dr. Rudi Studer (KIT)
Dr. Christoph Lingenfelder and Dr. Boris Charpiot (IBM Deutschland)
Dr. Achim Rettinger and Dipl.-Inform. Benedikt Kämpgen (KIT)
20-07-2012
Overview
Introduction
Time Series
Seasonality
Time Series Modeling Techniques
ARIMA
Exponential Smoothing
Problems Analysis and Approaches
Unstable Seasonal Pattern
Multiple Seasonal Patterns
Non Integer Periodicity
Evaluation Results
Conclusions and Future Works
INTRODUCTION
Time Series - Definition
Motivation: How to plan for the future?
A tool is needed to analyze past data and predict future data
A time series (TS) is an ordered sequence of
numeric values, observed at successive points of
time.
Time series are everywhere:
Stock price
Exchange rate, interest rate, inflation rate, national GDP
Retail sales
Electric power consumption
Temperatures at a weather station
Unemployment figures for a region
Time Series - Components
A TS is a combination of 4 components: trend, seasonal, cycle, error
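
As a minimal sketch of how the four components could combine, the following builds a synthetic series. The additive combination, the component shapes and all constants are assumptions made for illustration; the slide does not say whether the components are added or multiplied.

```python
import math
import random


def synthetic_series(length, seed=0):
    """Build a series as trend + seasonal + cycle + error (additive, assumed)."""
    rng = random.Random(seed)
    series = []
    for t in range(length):
        trend = 100 + 0.5 * t                           # slow linear growth
        seasonal = 10 * math.sin(2 * math.pi * t / 12)  # short, fixed period
        cycle = 5 * math.sin(2 * math.pi * t / 60)      # longer swing
        error = rng.gauss(0, 1)                         # random noise
        series.append(trend + seasonal + cycle + error)
    return series


values = synthetic_series(120)
```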


Seasonality in Time Series
IBM Netezza Analytics (INZA) determines the seasonality
period as follows:
Run a Fast Fourier Transform (FFT)
Find peaks in the frequency spectrum
Calculate the weight of each peak
The period of the peak with the highest weight, rounded to the nearest
integer, is set as the periodicity of the time series



Kernel-Run Analysis: Detected seasons
Season: 10.1 Weight: 0.67
Season: 4.9 Weight: 0.47
Detected periodicity = 10
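
The detection steps can be sketched roughly as follows. This is not the INZA implementation: a plain discrete Fourier transform stands in for the FFT, only the strongest peak is reported, and the weight is taken as the peak's share of total spectral power, which is an assumption because the slides do not define the weight.

```python
import cmath
import math


def detect_period(values):
    """Return (period, weight) of the strongest spectral peak."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # drop the constant component
    powers = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        powers[k] = abs(coeff) ** 2
    total = sum(powers.values())
    if total == 0:
        raise ValueError("constant series has no seasonal peak")
    best_k = max(powers, key=powers.get)
    # Frequency index k corresponds to a period of n / k samples.
    return n / best_k, powers[best_k] / total


# Hypothetical test signal with a true period of 10.1 samples.
series = [500 + 30 * math.sin(2 * math.pi * t / 10.1) for t in range(303)]
period, weight = detect_period(series)
periodicity = round(period)  # nearest integer, as in the last step above
```

The quadratic-time transform keeps the sketch dependency-free; a real FFT routine would be the natural replacement for longer series.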
Time Series Modeling Techniques
The two most common TS modeling techniques are:
ARIMA
Exponential Smoothing
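ARIMA [1]:
The forecast for a period is calculated as a weighted linear
combination of its own past values and past errors:
$\hat{y}_t = \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=0}^{q} \theta_j \, e_{t-j}$
where $y_{t-i}$ are past values, $e_{t-j}$ are past errors, and $\phi_i$, $\theta_j$ are the weights.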

Exponential Smoothing [2,3]:
Each component of a time series (trend, seasonal, error) is
represented as a weighted moving average of all past values
with the weights decreasing exponentially:
$\hat{y}_t = \alpha \, y_{t-1} + (1 - \alpha) \, \hat{y}_{t-1}$
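
A minimal sketch of simple (level-only) exponential smoothing shows where the exponentially decreasing weights come from. The function names and the choice to initialise the first forecast with the first observation are assumptions for illustration; the trend and seasonal components are left out.

```python
def simple_exponential_smoothing(values, alpha):
    """Return one-step-ahead forecasts; forecasts[t] predicts values[t]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecasts = [values[0]]  # assumed initialisation
    for observed in values[:-1]:
        forecasts.append(alpha * observed + (1 - alpha) * forecasts[-1])
    return forecasts


def smoothing_weights(alpha, count):
    """Weight given to the observation k steps back, for k = 0..count-1."""
    return [alpha * (1 - alpha) ** k for k in range(count)]


forecasts = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
weights = smoothing_weights(0.5, 3)
```

Unrolling the recursion gives exactly the weights returned by `smoothing_weights`: each step further into the past is scaled by another factor of `1 - alpha`.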

PROBLEMS ANALYSIS AND APPROACHES
Overview of the problems
In this work, approaches are designed for three separate problems:
Time series with unstable seasonal pattern
Time series with non-integer periodicity
Time series with multiple seasonal patterns
Our approaches work as pre-processing and post-processing
steps that solve those issues. ARIMA and Exponential Smoothing
are still applied to model and forecast the time series.
Formal description for each problem:
Input:
- A time series with an above-mentioned issue
- Forecast horizon: the point in time in the future up to which forecasts
should be made
Output:
- Forecasting results: a list of pairs of <time, value>
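
The input and output description, together with the pre-processing and post-processing idea, could be sketched as the wrapper below. All names are hypothetical, integer time steps are assumed, and a trivial last-value model stands in for ARIMA or Exponential Smoothing.

```python
from typing import Callable, List, Tuple

# (time, value) pairs; integer time steps are an assumption for illustration.
Series = List[Tuple[int, float]]


def forecast_with_wrapping(
    series: Series,
    horizon: int,
    preprocess: Callable[[Series], Series],
    model: Callable[[Series, int], Series],
    postprocess: Callable[[Series], Series],
) -> Series:
    """Pre-process, forecast up to `horizon`, then post-process the result."""
    transformed = preprocess(series)
    raw_forecast = model(transformed, horizon)
    return postprocess(raw_forecast)


def last_value_model(series: Series, horizon: int) -> Series:
    """Placeholder model: repeat the last observed value up to `horizon`."""
    last_time, last_value = series[-1]
    return [(t, last_value) for t in range(last_time + 1, horizon + 1)]


def identity(series: Series) -> Series:
    """No-op transformation used where no pre- or post-processing is needed."""
    return series


history = [(1, 1.0), (2, 2.0), (3, 3.0)]
result = forecast_with_wrapping(history, 5, identity, last_value_model, identity)
```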




Issue 1: Unstable Seasonal Pattern
In some cases, the seasonal pattern in a TS is not stable, i.e. the length of
the period varies over time
Example: monthly seasonal pattern in daily time series
[Figure: daily time series from 01.01.04 to 01.10.04 (y-axis 0 to 7000) with the ARIMA forecast; marked dates 1/8, 31/8, 30/9]
Kernel-Run Analysis: Detected seasons
Season: 30.428 Weight: 0.377501
Season: 10.1519 Weight: 0.184886
Season: 15.2192 Weight: 0.164569
Detected periodicity = 30
Approach for Unstable Seasonal Pattern (1)
Problem: the length of each period varies over time (e.g.
monthly seasonal pattern with 29, 30 or 31 days / month)

Approach:
1. Transform each period of the original TS into a new
one with a common mean period length.
2. Apply ARIMA or Exponential Smoothing for
forecasting.
3. Finally, transform the forecasting results back to
their real period lengths.
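
Steps 1 and 3 can be sketched with a linear-interpolation resampler. This assumes that the first and last value of each period are kept and the values in between are interpolated; the exact mapping used in the thesis is not given on the slides, and `resample_linear` is a hypothetical helper.

```python
def resample_linear(values, new_length):
    """Linearly interpolate `values` onto `new_length` evenly spaced points."""
    if len(values) < 2 or new_length < 2:
        raise ValueError("need at least two points on both sides")
    last = len(values) - 1
    result = []
    for i in range(new_length):
        pos = i * last / (new_length - 1)  # position on the old index axis
        left = min(int(pos), last - 1)     # left neighbour, kept in range
        frac = pos - left
        result.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return result


january = [float(day) for day in range(1, 32)]  # 31 daily values
normalized = resample_linear(january, 30)       # step 1: common length 30
restored = resample_linear(normalized, 31)      # step 3: back to 31 days
```

In a full pipeline, step 3 would be applied to the forecast values of each future month, using that month's real number of days.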

Approach for Unstable Seasonal Pattern (2)
Illustration:
Transformation of a month from 31 days into 30 days







All periods now have the same length
[Figure: a month with 31 days (x-axis 1 to 31) and the transformed month with 30 days (x-axis 1 to 30), obtained by linear spline interpolation; y-axis 0 to 45]
Issue 2: Non Integer Periodicity
When the spectral analysis finds a seasonal pattern whose length is
not an integer, the length is rounded to the nearest integer.
This rounding causes inaccurate forecast values.
To illustrate the problem we can use a trigonometric function
with amplitude $a$ and offset $b$:
$y_t = \sin\left(\frac{2\pi t}{p}\right) \cdot a + b$
For p=7.5 with Exponential Smoothing:

[Figure: Exponential Smoothing forecast for the series with p=7.5; x-axis 1 to 136, y-axis 460 to 540]
Kernel-Run Analysis: Detected seasons
Season: 7.58903 Weight: 0.99993
Detected periodicity = 8
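
The effect of the rounded periodicity can be illustrated with a small experiment. A seasonal-naive predictor (repeat the value from one period ago) is swapped in here for Exponential Smoothing to keep the sketch short, and the amplitude of 30 and offset of 500 are assumed values that roughly match the plotted range.

```python
import math


def sine_series(length, period, amplitude=30.0, offset=500.0):
    """Sample a sine wave with the given period at integer time steps."""
    return [math.sin(2 * math.pi * t / period) * amplitude + offset
            for t in range(length)]


def seasonal_naive_rmse(values, lag):
    """RMSE of predicting each value by the value `lag` steps earlier."""
    errors = [values[t] - values[t - lag] for t in range(lag, len(values))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


series = sine_series(150, 7.5)
rmse_rounded = seasonal_naive_rmse(series, 8)      # rounded periodicity
rmse_two_cycles = seasonal_naive_rmse(series, 15)  # exactly two true cycles
```

With a lag of 8 every prediction is half a sample out of phase, so the error stays large; with a lag of 15, which spans exactly two true cycles, it vanishes up to floating-point noise.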
Approach for Non-Integer Periodicity (1)
Problem:
- Although the TS has a non-integer periodicity, Exponential
Smoothing cannot take this into account and simply uses the rounded
periodicity found by FFT for further analysis.
- ARIMA is not affected by this problem.

Approach:
1. Transform the original TS into a new one that has an
integer periodicity.
2. Apply Exponential Smoothing to the new TS.
3. Transform the forecasting results back using the original
non-integer periodicity.
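
Steps 1 and 3 might look like the following sketch, which stretches the time axis so that one cycle of the old period spans the new period and fills in values by linear interpolation. `stretch_period` is a hypothetical helper and the sample series (amplitude 30, offset 500) is assumed; the thesis implementation may differ in detail.

```python
import math


def stretch_period(values, old_period, new_period):
    """Resample so that a cycle of `old_period` samples spans `new_period`."""
    scale = old_period / new_period  # old-axis distance per new sample
    last = len(values) - 1
    # The small epsilon guards against floating-point results such as 29.999...
    new_length = int(last / scale + 1e-9) + 1
    result = []
    for i in range(new_length):
        pos = min(i * scale, last)      # position on the old index axis
        left = min(int(pos), last - 1)  # left neighbour, kept in range
        frac = pos - left
        result.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return result


original = [500 + 30 * math.sin(2 * math.pi * t / 7.5) for t in range(31)]
stretched = stretch_period(original, 7.5, 8)  # step 1: p=7.5 becomes p=8
restored = stretch_period(stretched, 8, 7.5)  # step 3: back to p=7.5
```

Here 31 original samples become 33 stretched samples, and every eighth stretched sample falls on a cycle boundary of the original series.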

Approach for Non-Integer Periodicity (2)
Illustration:
Transformation of a TS with p=7.5 into one with p=8







The new TS now has an integer periodicity p=8
[Figure: original TS (x-axis 1 to 31) and transformed TS (x-axis 1 to 33), obtained by linear spline interpolation; y-axis 460 to 540]
Issue 3: Multiple Seasonal Patterns
Some TS contain multiple seasonal patterns of different lengths
To illustrate the problem we can use a trigonometric function
with amplitude $a$ and offset $b$:
$y_t = \left( \sin\frac{2\pi t}{p_1} + \sin\frac{2\pi t}{p_2} \right) \cdot a + b$
For $p_1 = 9$ and $p_2 = 15$ with Exponential Smoothing:

[Figure: Exponential Smoothing forecast for the series with p_1=9 and p_2=15; x-axis 1 to 211, y-axis 430 to 570]
Kernel-Run Analysis: Detected seasons
Season: 8.91999 Weight: 0.599654
Season: 15.0681 Weight: 0.400238
Detected periodicity = 9
Approach for Multiple Seasonal Patterns (1)
Problem:
Exponential Smoothing can only handle one seasonal
pattern. ARIMA provides fairly good forecasts, which can still
be improved.

Approach:
1. Remove all seasonal patterns iteratively one after another
until there are no seasonal patterns in the TS.
2. Apply ARIMA or Exponential Smoothing for the
deseasonalized TS.
3. Add all removed seasonal patterns back to the forecasting
results.
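
The three steps can be sketched as below under simplifying assumptions: the periods are already known integers (the approach itself finds them by spectral analysis), each seasonal pattern is estimated as the mean deviation per phase, and a flat mean forecast stands in for ARIMA or Exponential Smoothing. All function names are hypothetical.

```python
import math


def seasonal_profile(values, period):
    """Mean deviation from the overall mean for each phase of the period."""
    mean = sum(values) / len(values)
    sums = [0.0] * period
    counts = [0] * period
    for t, v in enumerate(values):
        sums[t % period] += v - mean
        counts[t % period] += 1
    return [s / c for s, c in zip(sums, counts)]


def remove_seasons(values, periods):
    """Step 1: strip one seasonal pattern after another."""
    residual = list(values)
    profiles = []
    for period in periods:
        profile = seasonal_profile(residual, period)
        residual = [v - profile[t % period] for t, v in enumerate(residual)]
        profiles.append((period, profile))
    return residual, profiles


def add_seasons(forecast, start, profiles):
    """Step 3: add every removed pattern back, aligned to time index `start`."""
    return [v + sum(profile[(start + h) % period] for period, profile in profiles)
            for h, v in enumerate(forecast)]


series = [500 + 20 * (math.sin(2 * math.pi * t / 9) + math.sin(2 * math.pi * t / 15))
          for t in range(90)]
residual, profiles = remove_seasons(series, [9, 15])
flat_forecast = [sum(residual) / len(residual)] * 5  # stand-in for step 2
forecast = add_seasons(flat_forecast, len(series), profiles)
```

The sample series spans a whole number of cycles of both periods, which is why the per-phase means separate the two patterns cleanly; on real data the separation is only approximate.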

Approach for Multiple Seasonal Patterns (2)
Removing seasonal patterns iteratively:


[Figure, four panels (x-axis 1 to 91): original TS with 2 seasonal patterns p_1=9 and p_2=15; adjusted TS with 1 seasonal pattern p=9; adjusted TS with 1 seasonal pattern p=15; adjusted TS with no seasonal pattern]
EVALUATION RESULTS
Evaluation Metrics
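Root Mean Square Error (RMSE) [4,5]
$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}$

Percentage Improvement
$\frac{RMSE_{before} - RMSE_{after}}{RMSE_{before}} \cdot 100\%$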
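
Both metrics are short enough to write out. In this sketch the improvement is computed as the relative reduction of the RMSE, which reproduces the ARIMA figures reported for the unstable seasonal pattern in this deck (RMSE 827.13 before, 48.83 after); the function names are chosen for illustration.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def percentage_improvement(rmse_before, rmse_after):
    """Relative reduction of the RMSE, in percent."""
    return (rmse_before - rmse_after) / rmse_before * 100


example_error = rmse([1, 2, 3], [1, 2, 5])
arima_improvement = round(percentage_improvement(827.13, 48.83), 2)  # 94.1
```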
Unstable Seasonal Pattern
Existing implementation vs. our approach
[Figure: two panels, ARIMA and Exponential Smoothing; y-axis 0 to 7000]

               ARIMA     Exponential Smoothing
RMSE before    827.13    909.09
RMSE after     48.83     36.53
Improvement    94.10%    95.98%
Non-Integer Periodicity
Existing implementation vs. our approach
[Figure: two panels, "Before (Exponential Smoothing)" and "After (Exponential Smoothing)"; x-axis 1 to 136, y-axis 460 to 540]

Exponential Smoothing
               p=3.5     p=7.5     p=18.5    p=30.5
RMSE before    34.36     23.74     15.13     7.88
RMSE after     5.2       1.21      0.21      0.08
Improvement    41.69%    94.89%    98.58%    98.98%
Multiple Seasonal Patterns
Real data: hourly utility demand from a company in the USA
[Figure: two panels, "Existing implementation with ARIMA" and "Our approach with ARIMA"; x-axis 1 to 3001, y-axis 0 to 16000]

               ARIMA
RMSE before    6535.4
RMSE after     1515.96
Improvement    76.80%
Multiple Seasonal Patterns
Real data: hourly utility demand from a company in the USA
[Figure: two panels, "Existing implementation with Exponential Smoothing" and "Our approach with Exponential Smoothing"; x-axis 1 to 3001, y-axis 0 to 16000]

               Exponential Smoothing
RMSE before    1398.67
RMSE after     853.05
Improvement    39.01%
CONCLUSIONS AND FUTURE WORKS
Conclusions
Designed approaches that can be combined with the
existing modelling techniques (ARIMA or Exponential
Smoothing) to analyse time series with:
Unstable seasonal pattern
Non-Integer Periodicity
Multiple Seasonal Patterns

Prototype implementation.

Evaluated our approaches and obtained a significant improvement in
forecast accuracy compared to the existing implementation.




Future Works
Unstable Seasonal Pattern
Extend the algorithm to handle time series with numerical
time column

Non-Integer Periodicity
Distinguish between real non-integer periodicities and those
caused by rounding errors in the spectral analysis

Multiple Seasonal Patterns
Specify a reasonable threshold to filter out seasonal
patterns that are found by the spectral analysis but do not
represent real seasonal variation in the time series

[Diagrams: time column types (real date, numeric); real non-integer periodicity vs. rounding error (found periodicity, weight, real periodicity)]
References
[1] G. E. P. Box and G. M. Jenkins, Time series analysis: forecasting
and control, Holden-Day, San Francisco, 1970
[2] E. S. Gardner, Jr., Exponential smoothing: the state of the art, Journal of
Forecasting 4 (1985)
[3] E. S. Gardner, Jr., Exponential smoothing: the state of the art, part II,
International Journal of Forecasting 22 (2006)
[4] B. Abraham and J. Ledolter, Statistical methods for forecasting, John
Wiley & Sons, New York, 1983
[5] W. Reinmuth, W. Mendenhall, and R. J. Beaver, Statistics for
management and economics, Duxbury Press, Belmont, California,
1993



Thank you
for your attention!