Applies to:
Basic statistical measures and statistical forecasting of time series. A forecast sample was taken from SAP
Forecasting and Replenishment 5.1, but apart from that sample, the article is not linked to SAP applications.
For more information, visit the Retail homepage.
Summary
Statistics and forecasts are part of our daily business and private life, so a basic knowledge of the
statistical key figures and the most commonly used forecast methods is useful. This paper illustrates the
statistical key figures mean value, variance and standard deviation, as well as the Normal and Poisson
distributions. It explains basic forecast methods such as moving average, exponential smoothing and linear
regression.
Author Bio
Barbara Wessela from SAP AG works in Solution Management “Supply Chain” in the
Industry Sector Trading Industries, Industry Business Unit Retail.
Barbara joined SAP in 1999 and has specialized for the past 5 years in SAP Forecasting and
Replenishment (releases 4.1, 5.0 and 5.1). She gained a lot of practical experience with the
application by testing and building up a demo system. She has developed various training
and documentation materials for SAP F&R and has taught numerous customer and
partner workshops in that area.
Table of Contents
1 Running into Statistics and Forecasts in your daily and business life
2 Refresh your knowledge: Basics statistics
2.1 Qualitative and quantitative characteristics – How to describe objects?
2.2 What are histograms for?
2.3 Mean values: one for all
2.4 How can we measure the variance?
2.5 The Normal Distribution
2.6 The Poisson Distribution for rare events
3 Basics Forecasting
3.1 What is Forecasting?
3.1.1 Applications of forecasting
3.1.2 Forecast Approaches
3.2 What are Time Series?
3.3 Basic Forecasting Methods
3.3.1 Moving Average
3.3.2 Weighted Moving Average
3.3.3 First Order Exponential Smoothing
3.3.4 Seasonal adjustment of time series as a general statistical method
3.3.5 Exponential Smoothing with Trend and Seasonality
3.3.6 Linear Regression
3.3.7 More sophisticated Regression Methods
3.4 Causal based forecasting
3.5 Forecasting Performance Measures
References
Copyright
1 Running into Statistics and Forecasts in your daily and business life
Flip the newspaper open and you will find reams of statistics and graphics about sociopolitical, economic
or scientific topics (such as employment rates, economic indices, inflation, diseases, the age pyramid and
many others). You will also run into numerous predictions of the future, such as economic growth, your
horoscope, the world’s population and the weather forecast.
Wherever you live or work, you will be confronted with statistics and forecasts, whether you are aware of
it or not. Sharp tongues might quote Benjamin Disraeli, who said: “There are three kinds of lies: lies, damned
lies, and statistics”, to argue that statistics are often used to support one's own opinion. Nevertheless,
the better you understand the basics of statistics and forecasting, the better you will be able to judge
the quality of the statistic or forecast you are facing.
Of course, there is a lot of scientific and popular science literature on the market about understanding
statistics and statistical reporting (see for example [1], [2]). Most of you had to pass exams in statistics
during your education. This paper aims to remind you of the very basics of statistics and to explain basic
methods for time series forecasts. It is not a scientific paper covering all kinds of forecast
approaches. Rather, it illustrates statistical and forecasting principles at a high level in order to lay the
foundations and to give you a better feeling for statistical forecasting.
You run into an old schoolmate and talk to him or her about another person whose name you have forgotten.
You will certainly describe the person: hair style and color, skin color, height, voice, special characteristics.
This works perfectly for one person or a few people. However, for a larger number of people or objects, you
have to organize the data better in order to keep track of the essential information.
Figure 2: Histogram
First, we create intervals within the value range of body heights. Then we count the number of children per
interval, that is, how many children have a body height between 0.80 m and 1.00 m, between 1.01 m and
1.20 m, and so on (see Fig. 2).
The histogram already helps to reduce the amount of information and to get a better overview of the data. It
is an abstraction from the real world, so some detailed information gets lost. The challenge is therefore
to find meaningful intervals, not too many and not too few, depending on the number of objects.
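The counting step can be sketched in a few lines of Python. The body heights and the interval width of 20 cm below are made-up illustration values (given in centimetres to keep the arithmetic exact); they are not data from this paper.

```python
from collections import Counter

# Hypothetical body heights in centimetres (illustration only).
heights_cm = [85, 95, 105, 110, 115, 118, 125, 130, 135, 145]
WIDTH = 20  # assumed interval width in cm

def histogram(values, start, width):
    """Count values per interval [lower, lower + width), keyed by the lower bound."""
    counts = Counter(start + ((v - start) // width) * width for v in values)
    return dict(sorted(counts.items()))

bins = histogram(heights_cm, start=80, width=WIDTH)
for lower, n in bins.items():
    # One '#' per child gives a simple text histogram.
    print(f"{lower}-{lower + WIDTH} cm: {'#' * n}")
```

Changing `WIDTH` shows the trade-off described above: very narrow intervals leave most bars with one or zero children, very wide ones hide the shape of the data.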
Figure 3: Histogram for Large Groups → Normal Distribution or Normal Probability Curve.
The bigger the group becomes, the closer the histogram will probably come to the shape of a bell, with
bars that are symmetrical around a mean value. Such a distribution is called a Normal
distribution.
Normal distributions can be found in many examples in nature, such as the mass of chicken eggs or
elephants, the body height of mice or giraffes etc. You can also find them in economics, for example in the
daily price changes of the shares in a stock index.
When values are influenced by many random factors, you can expect a normal distribution of these values,
because a normal distribution is characterized by random deviations of actual values from an expected
value.
We will come back to the normal distribution later using another approach.
The arithmetic mean is the sum of all data values divided by the number of values. In our example, it is the
sum of all body heights divided by the number of children (see Fig. 4).
It is a big advantage that the arithmetic mean can also be calculated if the single values are unknown; it is
sufficient to know the sum and the number of values. Example: the average amount of beer a German drinks
in a year is simply the total beer consumption divided by the number of German citizens.
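As a small sketch, both routes to the arithmetic mean look like this in Python. The body heights are hypothetical illustration values, and the function names are chosen here for readability only.

```python
def arithmetic_mean(values):
    """Sum of all values divided by their number."""
    return sum(values) / len(values)

def mean_from_total(total, count):
    """Mean when only the sum and the number of values are known."""
    return total / count

heights_cm = [85, 95, 105, 110, 115, 118, 125, 130, 135, 145]  # hypothetical

mean_direct = arithmetic_mean(heights_cm)
mean_total = mean_from_total(total=1163, count=10)  # 1163 is the sum of the list above
print(mean_direct, mean_total)
```

The second function is the beer example in miniature: no individual value is needed, only the total and the head count.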
The variance of this group of children is the sum of squared deviations divided by the number of
children.
(Sometimes you will also find a formula in which the sum of squared deviations is divided by the number of
values minus 1. This correction compensates for the fact that the mean is itself calculated from the same
values, which makes the uncorrected formula slightly underestimate the variance of the underlying population.
For large numbers, however, the difference between the two formulas becomes negligible.)
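The two variants of the variance formula can be compared directly. The following sketch uses made-up illustration values and cross-checks the hand-written formulas against Python's standard `statistics` module (`pvariance` divides by n, `variance` by n - 1).

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustration values, not from this paper

mean = sum(values) / len(values)
squared_deviations = [(v - mean) ** 2 for v in values]

variance_n = sum(squared_deviations) / len(values)                 # divide by n
variance_n_minus_1 = sum(squared_deviations) / (len(values) - 1)   # divide by n - 1
std_dev = variance_n ** 0.5  # standard deviation: square root of the variance

print(variance_n, round(variance_n_minus_1, 3), std_dev)
print(statistics.pvariance(values), round(statistics.variance(values), 3))
```

With only eight values the two results differ noticeably; with thousands of values the factor n / (n - 1) is so close to 1 that the difference no longer matters.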
The standard deviation, the square root of the variance, has the same unit of measure as the original values.
You can therefore draw it into the data chart by plotting one line at the arithmetic mean plus the standard
deviation and one line at the mean minus the standard deviation.
These two lines define a range. If the number of values is large enough and the values are approximately
normally distributed, about 68% of the values will lie within this range.
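You can check the 68% rule of thumb empirically. The sketch below draws a hypothetical sample of normally distributed body heights (mean 120 cm, standard deviation 10 cm, both assumed for illustration) and counts how many values fall between the two lines.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible
# Hypothetical sample: 10,000 normally distributed body heights in cm.
sample = [random.gauss(120.0, 10.0) for _ in range(10_000)]

mean = statistics.fmean(sample)
std_dev = statistics.pstdev(sample)
lower, upper = mean - std_dev, mean + std_dev

# Share of values between (mean - std_dev) and (mean + std_dev).
share_inside = sum(lower <= x <= upper for x in sample) / len(sample)
print(f"range {lower:.1f} to {upper:.1f} cm, share inside: {share_inside:.1%}")
```

For data that is clearly not bell-shaped (for example, strongly skewed sales figures) the share inside the range can differ considerably from 68%.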
The standard normal distribution is the normal distribution with a mean of zero and a variance of
one.
68.27% of all values deviate by not more than σ from the mean value
95.45% of all values deviate by not more than 2σ from the mean value
99.73% of all values deviate by not more than 3σ from the mean value
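These percentages follow from the cumulative distribution function of the standard normal distribution. A minimal sketch using `NormalDist` from Python's standard library (available from Python 3.8) reproduces them:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)  # mean 0, variance 1

def share_within(k):
    """Probability that a value deviates by at most k standard deviations from the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, p in shares.items():
    print(f"within {k} sigma: {p:.2%}")
```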
3 Basics Forecasting
3.1 What is Forecasting?
• Time series methods: Time series methods use historical data as the basis of estimating future
outcomes.
o Moving average
o Exponential smoothing
o Extrapolation
o Linear prediction
o Trend estimation
o Growth curve
o Topics
• Causal / econometric methods: Some forecasting methods use the assumption that it is possible to
identify the underlying factors that might influence the variable that is being forecast. For example,
sales of umbrellas might be associated with weather conditions. If the causes are understood,
projections of the influencing variables can be made and used in the forecast.
o Regression analysis using linear regression or non-linear regression
o Autoregressive moving average
o Autoregressive integrated moving average
o Econometrics
• Judgmental methods: Judgmental forecasting methods incorporate intuitive judgments, opinions and
probability estimates.
o Surveys
o Delphi method
o Scenario building
o Technology forecasting
• Other methods:
o Simulation
o Prediction market
o Probabilistic forecasting and Ensemble forecasting
o Reference class forecasting
A model in science is a physical, mathematical, or logical representation of a system of entities, phenomena,
or processes. Basically, a model is a simplified abstract view of a complex reality. It may focus on particular
views, applying the "divide and conquer" principle to a compound problem. Formally, a model is a
formalized representation that deals with empirical entities, phenomena, and physical processes in a
mathematical or logical way.
A simulation is the implementation of a model over time. A simulation brings a model to life and shows how
a particular object or phenomenon will behave. It is useful for testing, analysis or training where real-world
systems or concepts can be represented by a model.
For more information regarding the above mentioned, see [3].
Forecast Approaches addressed in this paper:
Time series methods: use historical data as the basis for estimating future outcomes. Examples: Moving
Average or Exponential Smoothing.
Causal methods: Like time series methods, but underlying factors may influence the forecast. Example:
Regression analysis.
These forecast methods perform an extrapolation of time series. Time series are based on historical data.
Example: Demand Forecast based on historic sales.
Time Series [3]: In statistics, signal processing, and many other fields, a time series is a sequence of data
points, measured typically at successive times, spaced at (often uniform) time intervals. Time series
analysis comprises methods that attempt to understand such time series, often either to understand the
underlying context of the data points (where did they come from? what generated them?), or to make
forecasts (predictions). Time series forecasting is the use of a model to forecast future events based on
known past events: to forecast future data points before they are measured. A standard example in
econometrics is the opening price of a share of stock based on its past performance.
[Figure: Sample time series and moving average with N = 3]
Time t               1     2     3     4     5     6     7     8     9
Data Dt              7    10     8    10    12    10     8    11     9
Moving average Mt                      8.33  9.33  …
Starting point: M4 = (7 + 10 + 8) / 3 = 8.33
Moving forward: M5 = (10 + 8 + 10) / 3 = 9.33
Fig. 18 shows a sample time series consisting of data Dt at subsequent points in time t. Suppose the number
of values to be considered for the calculation of the mean value is N=3. In order to calculate the first moving
average value M4, you calculate the arithmetic mean value of the first three values. You move forward by
always calculating the mean value of the three preceding values.
The Moving Average moves from one data point to the next and thereby performs a smoothing of the values.
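The calculation described above can be sketched as a short Python function. It uses the sample series D1 to D9 and N = 3; the function name and the dictionary layout are choices made for this illustration.

```python
def moving_average(data, n):
    """Return {t: M_t}, where M_t is the mean of the n values preceding period t (t is 1-based)."""
    return {
        t: sum(data[t - 1 - n:t - 1]) / n
        for t in range(n + 1, len(data) + 2)
    }

D = [7, 10, 8, 10, 12, 10, 8, 11, 9]  # D_1 .. D_9 of the sample series
M = moving_average(D, n=3)
print(round(M[4], 2), round(M[5], 2))  # first two moving average values
```

A larger N smooths more strongly but reacts more slowly to real changes in the level of the data.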
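[Figure: Forecasting with the moving average (N = 3). The chart plots the data Dt, the moving average and
the forecast over time t = 1 to 12.]
Time t     1    2    3    4    5    6    7    8    9
Data Dt    7   10    8   10   12   10    8   11    9
Beyond the last data value D9, the forecast continues the moving average; forecasts that have already been
calculated stand in for data that is not yet known:
F10 = (8 + 11 + 9) / 3 = 9.33
F11 = (11 + 9 + 9.33) / 3 = 9.78
F12 = (9 + 9.33 + 9.78) / 3 = 9.37
In general, the forecast for the next period is the mean of the last N values:
$$F_{t+1} = \frac{1}{N}\left(D_t + D_{t-1} + \dots + D_{t+1-N}\right) = \frac{1}{N}\sum_{i=t+1-N}^{t} D_i$$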
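The forecast rule, the mean of the last N values, can be sketched as follows. To forecast more than one period ahead, this sketch feeds each forecast back into the series in place of the missing data value, which is how the figure continues the series beyond t = 9; the function name and the `horizon` parameter are assumptions made for this illustration.

```python
def moving_average_forecast(data, n, horizon):
    """Forecast `horizon` periods ahead.

    Each forecast is the mean of the last n values of the series, where
    earlier forecasts stand in for data that is not yet known.
    """
    series = list(data)
    forecasts = []
    for _ in range(horizon):
        f = sum(series[-n:]) / n
        forecasts.append(f)
        series.append(f)  # reuse the forecast as a substitute for the next data value
    return forecasts

D = [7, 10, 8, 10, 12, 10, 8, 11, 9]  # D_1 .. D_9 of the sample series
F = moving_average_forecast(D, n=3, horizon=3)  # forecasts for t = 10, 11, 12
print([round(f, 2) for f in F])
```

Because forecasts are averaged again and again, the forecast curve quickly flattens out towards a constant level.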
[Figure: First-order exponential smoothing with α = 0.5]
Always take e.g. 50% of what you calculated so far plus 50% of the next data value…
Time t     1    2    3    4    5    6    7    8    9    10
Data Dt    7   10    8   10   12   10    8   11    9
Forecast Ft                                             9.53
Last step of the calculation: F10 = 10.07·0.5 + 9·0.5 ≈ 9.53
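The rule "take 50% of what you calculated so far plus 50% of the next data value" can be sketched as a loop. The sketch assumes that the first data value serves as the starting level; with that assumption it reproduces the intermediate value 10.07 and the forecast of about 9.53 for the sample series.

```python
def exponential_smoothing(data, alpha):
    """First-order exponential smoothing.

    Assumption: the first data value is used as the starting level.
    Returns the smoothed level after each period; the last one is the
    forecast for the next period.
    """
    level = data[0]
    levels = [level]
    for d in data[1:]:
        # New level: alpha share of the new value, (1 - alpha) share of the old level.
        level = alpha * d + (1 - alpha) * level
        levels.append(level)
    return levels

D = [7, 10, 8, 10, 12, 10, 8, 11, 9]  # D_1 .. D_9 of the sample series
levels = exponential_smoothing(D, alpha=0.5)
forecast_t10 = levels[-1]
print(round(levels[-2], 2), round(forecast_t10, 3))
```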
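[Figure: Weighting factors of past periods for different smoothing factors α]
α = 0.5: the weights fall off quickly (0.5, 0.25, 0.125, …). About 98.44% of the input comes from the most
recent periods and only about 1.56% from older periods. The forecast reacts quickly to a level shift, but
also strongly to a short pulse.
α = 0.1: the weights decrease slowly (0.1, …, 0.07, 0.06, …), so older periods still contribute noticeably.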
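The weighting factors follow from unrolling the smoothing rule: the value k periods back enters with weight α(1 − α)^k. The sketch below lists the first weights for both smoothing factors; for α = 0.5 the six most recent values together carry 1 − 0.5^6 ≈ 98.44% of the weight. The function name is chosen for this illustration.

```python
def smoothing_weights(alpha, periods):
    """Weight of the value k periods back: alpha * (1 - alpha) ** k, for k = 0 .. periods - 1."""
    return [alpha * (1 - alpha) ** k for k in range(periods)]

for alpha in (0.5, 0.1):
    w = smoothing_weights(alpha, periods=6)
    share_recent = sum(w)  # combined weight of the six most recent values
    print(f"alpha = {alpha}: {[round(x, 3) for x in w]}, sum = {share_recent:.2%}")
```

With α = 0.1 the same six periods carry less than half of the total weight, which is why a small α gives a calmer but slower forecast.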
[Figure: Sample quarterly time series (year/quarter), value axis from 90 to 120]
Quarter  1/I  1/II  1/III  1/IV  2/I  2/II  2/III  2/IV  3/I  3/II  3/III  3/IV  4/I  4/II
Value    116  100   92     100   108  100   92     100   116  108   100    100   112  108
Base:
$$B_t = \alpha \frac{D_t}{S_{t-m}} + (1 - \alpha)(B_{t-1} + T_{t-1})$$
Trend:
$$T_t = \beta (B_t - B_{t-1}) + (1 - \beta) T_{t-1}$$
Season factors:
$$S_t = \gamma \frac{D_t}{B_t} + (1 - \gamma) S_{t-m}$$
Forecast:
$$F_{t+k} = (B_{t-1} + k T_{t-1}) S_{t+k-m}$$
Note that in these formulas the seasonality is assumed to be multiplicative, that is, the seasonal amplitude
scales with the level of the series and grows as the base value grows. Other methods treat the seasonal
pattern as additive. You can find more information in forecasting literature such as [4].
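One update step of the base, trend and season formulas can be sketched as below. All numbers in the example call are made up for illustration, and the function and parameter names are not taken from any SAP application. The forecast helper takes the base and trend to extrapolate from as arguments, so you can pass in the values of period t − 1 as in the forecast formula above.

```python
def update(d_t, base_prev, trend_prev, season_prev_cycle, alpha, beta, gamma):
    """One update of base, trend and season factor (multiplicative seasonality).

    season_prev_cycle is S_{t-m}: the factor of the same season one cycle earlier.
    """
    base = alpha * d_t / season_prev_cycle + (1 - alpha) * (base_prev + trend_prev)
    trend = beta * (base - base_prev) + (1 - beta) * trend_prev
    season = gamma * d_t / base + (1 - gamma) * season_prev_cycle
    return base, trend, season

def forecast(base, trend, season_factor, k):
    """Extrapolate base and trend k periods ahead and apply the season factor S_{t+k-m}."""
    return (base + k * trend) * season_factor

# Hypothetical state and observation (illustration only).
base, trend, season = update(d_t=121, base_prev=100, trend_prev=2,
                             season_prev_cycle=1.1, alpha=0.5, beta=0.5, gamma=0.5)
print(round(base, 2), round(trend, 2), round(season, 3))
print(round(forecast(base=100, trend=2, season_factor=1.1, k=1), 2))
```

In a complete implementation you would also need starting values for the base, the trend and one season factor per season, which this sketch leaves out.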
[Figure: Causal factors that influence demand: calendar events, local events, sales price changes, sales
promotions and deterministic demands, each with past and future occurrences.]
Figure 37: Mean Forecast Error
Fig. 37 shows the mean forecast error: it is the sum of all deviations divided by the number of values.
Positive and negative deviations can cancel each other out. Therefore, the mean forecast error can only
detect a systematic under- or overshooting (bias) of the forecast, not the size of the individual errors.
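A short sketch shows the cancelling effect. It uses the six forecast and actual values from the sample used for the performance measures in this section and takes the deviation as actual minus forecast.

```python
forecast = [3, 5, 7, 9, 11, 13]
actuals = [2, 6, 5, 10, 13, 11]

# Signed deviations: positive means the forecast was too low.
deviations = [d - f for d, f in zip(actuals, forecast)]
mean_error = sum(deviations) / len(deviations)
print(deviations, round(mean_error, 3))
```

Although every single forecast is off by 1 or 2, the mean forecast error is close to zero, because over- and undershooting almost balance.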
[Figure: Mean absolute deviation (MAD); the chart plots actuals and forecast over time]
$$\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n} \left| D_t - F_t \right|$$
Time t               1    2    3    4    5    6
Forecast             3    5    7    9   11   13
Actuals              2    6    5   10   13   11
Absolute deviation   1    1    2    1    2    2
MAD = 1.5
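The mean absolute deviation for this sample can be recalculated with a few lines of Python; the function name is chosen for this illustration.

```python
def mean_absolute_deviation(actuals, forecast):
    """Average of the absolute deviations |D_t - F_t|."""
    return sum(abs(d - f) for d, f in zip(actuals, forecast)) / len(actuals)

forecast = [3, 5, 7, 9, 11, 13]
actuals = [2, 6, 5, 10, 13, 11]

mad = mean_absolute_deviation(actuals, forecast)
print(mad)
```

Because the absolute value removes the sign, deviations can no longer cancel each other out.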
[Figure: Mean squared error (MSE); the chart plots actuals and forecast over time]
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} (D_t - F_t)^2$$
Time t               1    2    3    4    5    6
Forecast             3    5    7    9   11   13
Actuals              2    6    5   10   13   11
Squared deviation    1    1    4    1    4    4
MSE = 2.5
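The mean squared error for the same sample, again as a small sketch with a function name chosen for this illustration:

```python
def mean_squared_error(actuals, forecast):
    """Average of the squared deviations (D_t - F_t) ** 2."""
    return sum((d - f) ** 2 for d, f in zip(actuals, forecast)) / len(actuals)

forecast = [3, 5, 7, 9, 11, 13]
actuals = [2, 6, 5, 10, 13, 11]

mse = mean_squared_error(actuals, forecast)
print(mse)
```

Squaring weights large deviations more heavily than small ones: the three deviations of size 2 contribute 12 of the 15 squared units in this sample.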
References
[1] Walter Krämer: Statistik verstehen, Piper Verlag GmbH, München, 6th edition, 2007
[2] Walter Krämer: So lügt man mit Statistik, Piper Verlag GmbH, München, 9th edition, 2007
[3] Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Main_Page, search for the keywords ‘normal
distribution’, ‘Poisson distribution’, ‘Forecasting’, ‘Time Series’, ‘Linear regression’
[4] Peter Mertens, Susanne Rässler (eds.): Prognoserechnung, Physica-Verlag, Heidelberg, 6th edition, 2005
[5] BNET Business Dictionary, http://dictionary.bnet.com, search for the keyword ‘Forecasting’
[6] Stephan R. Lawrence: Demand Forecasting: Time Series Models (talk), College of Business and
Administration, University of Colorado, Boulder
[7] Wikipedia, die freie Enzyklopädie, http://de.wikipedia.org/wiki/Regressionsanalyse
Copyright
© 2008 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG.
The information contained herein may be changed without prior notice.
Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft, Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.
IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries,
zSeries, System i, System i5, System p, System p5, System x, System z, System z9, z/OS, AFP, Intelligent Miner, WebSphere,
Netfinity, Tivoli, Informix, i5/OS, POWER, POWER5, POWER5+, OpenPower and PowerPC are trademarks or registered trademarks of
IBM Corporation.
Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems
Incorporated in the United States and/or other countries.
Oracle is a registered trademark of Oracle Corporation.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of
Citrix Systems, Inc.
HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts
Institute of Technology.
Java is a registered trademark of Sun Microsystems, Inc.
JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by
Netscape.
MaxDB is a trademark of MySQL AB, Sweden.
SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver, and other SAP products and services mentioned herein as well as their
respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All
other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves
informational purposes only. National product specifications may vary.
These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP
Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or
omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the
express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an
additional warranty.
These materials are provided “as is” without a warranty of any kind, either express or implied, including but not limited to, the implied
warranties of merchantability, fitness for a particular purpose, or non-infringement.
SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages that may
result from the use of these materials.
SAP does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within these
materials. SAP has no control over the information that you may access through the use of hot links contained in these materials and
does not endorse your use of third party web pages nor provide any warranty whatsoever relating to third party web pages.
Any software coding and/or code lines/strings (“Code”) included in this documentation are only examples and are not intended to be
used in a productive system environment. The Code is only intended to better explain and visualize the syntax and phrasing rules of
certain coding. SAP does not warrant the correctness and completeness of the Code given herein, and SAP shall not be liable for errors
or damages caused by the usage of the Code, except if such damages were caused by SAP intentionally or grossly negligent.