
Basic Principles of Statistics and Forecasts in your Daily and Business Life

Applies to:
Basic statistic measures and statistical forecasting of time series. A forecast sample was taken from SAP
Forecasting and Replenishment 5.1, but apart from that sample, the article is not linked to SAP applications.
For more information, visit the Retail homepage.

Summary
Statistics and forecasts are a matter of our daily business and private life. Therefore, a basic knowledge of
the statistical key-figures and the forecast methods often used is required. The paper illustrates the statistical
key figures of mean values, variance and standard deviation, Normal and Poisson distribution. It explains
basic forecast methods such as moving average, exponential smoothing and linear regression.

Author: Dr. Barbara Wessela


Company: SAP AG
Created on: 09 February 2009

Author Bio
Barbara Wessela from SAP AG works in the Solution Management “Supply Chain” in the
Industry Sector Trading Industries, Industry Business Unit Retail.
Barbara joined SAP in 1999 and has specialized for the past 5 years in SAP Forecasting and
Replenishment (releases 4.1, 5.0 and 5.1). She gained a lot of practical experience with the
application by testing and building up a demo system. She has developed various training
and documentation materials for SAP F&R and has taught numerous customer and
partner workshops in that area.

SAP COMMUNITY NETWORK SDN - sdn.sap.com | BPX - bpx.sap.com | BOC - boc.sap.com


© 2009 SAP AG 1

Table of Contents
1 Running into Statistics and Forecasts in your daily and business life
2 Refresh your knowledge: Basic statistics
2.1 Qualitative and quantitative characteristics – How to describe objects?
2.2 What are histograms for?
2.3 Mean values: one for all
2.4 How can we measure the variance?
2.5 The Normal Distribution
2.6 The Poisson Distribution for rare events
3 Basic Forecasting
3.1 What is Forecasting?
3.1.1 Applications of forecasting
3.1.2 Forecast Approaches
3.2 What are Time Series?
3.3 Basic Forecasting Methods
3.3.1 Moving Average
3.3.2 Weighted Moving Average
3.3.3 First Order Exponential Smoothing
3.3.4 Seasonal adjustment of time series as a general statistical method
3.3.5 Exponential Smoothing with Trend and Seasonality
3.3.6 Linear Regression
3.3.7 More sophisticated Regression Methods
3.4 Causal based forecasting
3.5 Forecasting Performance Measures
References
Copyright


1 Running into Statistics and Forecasts in your daily and business life
Flip the newspaper open and you will find reams of statistics and graphics about various
sociopolitical, economic or scientific topics (such as employment rates, economic indices, inflation,
diseases, the age pyramid and many others). You will also run into numerous predictions of the future, such as
forecasts of economic growth, your horoscope, projections of the world’s population and the weather forecast.
Wherever you live or work, you will be confronted with statistics and forecasts, whether you are aware of
it or not. Sharp tongues might quote the saying attributed to Benjamin Disraeli: “There are three kinds of lies: lies, damned
lies, and statistics” to argue that statistics are often used to support one’s own opinion. Nevertheless,
the better you understand the basics of statistics and forecasting, the better positioned you will be to judge
the quality of the statistic or forecast you are facing.
Of course, there is a lot of scientific and popular science literature on the market about understanding
statistics and statistical reporting (see for example [1], [2]). Most of you had to pass exams in statistics
during your education. This paper aims to remind you of the very basics of statistics and to explain basic
methods for time series forecasts. It is not a scientific paper covering all kinds of forecast
approaches. Rather, it illustrates statistical and forecasting principles at a high level in order to lay the
foundations and give you a better feeling for statistical forecasting.


2 Refresh your knowledge: Basic statistics

2.1 Qualitative and quantitative characteristics – How to describe objects?

You run into an old schoolmate and talk to him or her about another person whose name you have forgotten.
You will certainly describe the person: hair style and color, skin color, height, voice, special characteristics.
This works perfectly for one person or a few people. However, for a larger number of people or objects, you
have to organize the data better in order to keep track of the essential information.

Figure 1: Qualitative and Quantitative Characteristics


In Figure 1, there are some possible characteristics for people such as this group of children. Such
characteristics can be divided into:
• Qualitative characteristics that describe properties such as sex, hair color or religion. Values of these
characteristics can be: male or female; blond, black or brown hair; Christian, Jewish or Muslim.
• Quantitative characteristics that have metric values which can be added, subtracted etc. Examples:
body height, body mass, age.
For larger numbers of children or people, it becomes impractical to describe the individuals. We have to
sort the information somehow. The histogram is a long-established method for preprocessing metric data.
Let’s use body height as an example; it is a quantitative characteristic with metric values.


2.2 What are histograms for?

Figure 2: Histogram
First, we create intervals within the value range of body heights. Then we count the number of children per
interval. That is: how many children have a body height between 0.80 m and 1.00 m, between 1.01 m and
1.20 m and so on (see Fig. 2).
The histogram already helps to reduce the amount of information and to get a better overview of the data. It is an
abstraction from the real world, so some detailed information gets lost. The challenge
is therefore to find meaningful intervals: not too many and not too few, depending on the number of objects.
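The counting step described above can be sketched in a few lines of Python. This is only an illustration: the body heights are invented values (given in centimetres so that the interval boundaries stay exact), and the function name `histogram` and its parameters are assumptions made for this sketch.

```python
from collections import Counter


def histogram(values, start, width, bins):
    """Count how many values fall into each of `bins` equal-width intervals.

    Interval i covers start + i*width (inclusive) up to start + (i+1)*width (exclusive).
    Values outside the covered range are ignored.
    """
    counts = Counter()
    for v in values:
        index = (v - start) // width
        if 0 <= index < bins:
            counts[index] += 1
    return [counts[i] for i in range(bins)]


# Hypothetical body heights in centimetres (illustrative values only).
heights_cm = [85, 95, 105, 110, 115, 118, 125, 130, 145]

counts = histogram(heights_cm, start=80, width=20, bins=4)
for i, n in enumerate(counts):
    low = 80 + i * 20
    print(f"{low:3d}-{low + 19:3d} cm: {'#' * n}")
```

Changing `width` shows the trade-off mentioned above: very wide intervals hide the shape of the data, while very narrow ones leave most intervals with zero or one child.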


Figure 3: Histogram for Large Groups → Normal Distribution or Normal Probability Curve.
The bigger the group becomes, the closer the histogram will probably come to the shape of a bell.
The bars of the histogram will be symmetrical around a mean value. Such a distribution is called a Normal
distribution.
Normal distributions can be found in many examples in nature, such as the mass of chicken eggs or
elephants, the body height of mice or giraffes etc. You can also find approximately normal distributions in economics, for example for the daily
price changes of the shares in a stock index.
When values are influenced by many independent random factors, you can expect a normal distribution of these values,
because a normal distribution is characterized by random deviations of actual values from an expected
value.
We will come back to the normal distribution later using another approach.


2.3 Mean values: one for all


If you don’t want to create a histogram, or if your data set is too small for one, you can use simple formulas to
calculate mean values. There are different mean values. The most common is the arithmetic mean.

Figure 4: Mean values: Arithmetic Mean

The arithmetic mean is the sum of all data values divided by the number of values. In our example, it is the sum of all
body heights divided by the number of children (see Fig. 4).
It is a big advantage that the arithmetic mean can also be calculated if the single values are unknown; it is
sufficient to know the sum and the number of values. Example: the average amount of beer a German drinks
in a year is simply the total beer consumption divided by the number of German citizens.


Figure 5: Mean Values: Median


The median is the value of the object in the middle when you sort the objects in ascending (or descending) order.
In our example, it is the body height of the child standing in the middle of the sorted queue (see Fig. 5).
It is easy to find, because no calculation is necessary. It has the advantages that:
• it is not sensitive to extremely high or low values
• it doesn’t lead to unnatural values like the average of 1.75 children per family.
In most statistics, the arithmetic mean is used instead of the median, because it allows conclusions to be drawn
from a random sample about the whole population. The median doesn’t contain this information, but in return it can
also be used for non-metric characteristics whose values can be ranked, for example the average educational certification of
people in a company: you simply sort the people by certification and take the certification of the person in the middle as the median.
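A short Python sketch may help to compare the two mean values. It uses the standard `statistics` module; the body heights, the certification levels and the staff list are invented for illustration and are not taken from the figures.

```python
from statistics import mean, median, median_low

# Hypothetical body heights in centimetres (illustrative values only).
heights_cm = [98, 104, 110, 112, 121]
print("mean:", mean(heights_cm), "median:", median(heights_cm))

# One extremely tall person pulls the arithmetic mean up noticeably,
# while the median barely moves.
with_outlier = heights_cm + [190]
print("mean:", mean(with_outlier), "median:", median(with_outlier))

# The median also works for characteristics that can only be ranked.
# The ranking of certifications below is an assumption for this sketch.
levels = ["secondary school", "bachelor", "master", "doctorate"]
staff = ["bachelor", "secondary school", "master", "bachelor", "doctorate"]
ranks = sorted(levels.index(s) for s in staff)
median_level = levels[median_low(ranks)]
print("median certification:", median_level)
```

`median_low` is used for the ranked data so that the result is always one of the existing levels, even for an even number of people.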
Further mean values are:
• geometric mean value
• harmonic mean value
• weighted arithmetic mean.


Figure 6: Different Groups but Same Arithmetic Mean


In our example of a children’s group, the mean value doesn’t tell everything about the body height of this
group. There could be another group of children with the same average body height but still it could look
different (see Fig. 6). For instance, the children of the second group could be all about the same height. That
means that the variance would be much bigger in the first group.


2.4 How can we measure the variance?

Figure 7: Variance of the first Group of Children (1)


Let’s first calculate the deviation of each child’s height from the average. You can easily see that the
deviation can be positive or negative. If we just added the deviations up, they would cancel out to zero. Therefore, we
calculate the squared deviations, which are never negative. Squaring also means that values with
a bigger deviation have much more impact than values with smaller deviations.


Figure 8: Variance of the first Group of Children (2)


In order to get a normalized value, we divide the sum of the squared deviations by the number of children.
This gives the variance.

The variance of this group of children is the sum of squared deviations divided by the number of
children.

(Sometimes you will also find a formula in which the sum of squared deviations is divided by the number of values
minus 1. This correction is used when the values are only a sample of a larger population: the mean is then itself
estimated from the sample, and dividing by the number minus 1 compensates for that. For large numbers, the difference between the two formulas becomes negligible.)
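The calculation can be traced step by step in a small Python sketch. The two groups below are invented values with the same arithmetic mean (they are not the values from the figures), and the function name `variance` and its `sample` switch are assumptions of this sketch.

```python
def variance(values, sample=False):
    """Sum of squared deviations from the mean, divided by n (or by n - 1)."""
    n = len(values)
    avg = sum(values) / n
    squared_deviations = [(v - avg) ** 2 for v in values]
    return sum(squared_deviations) / (n - 1 if sample else n)


# Hypothetical body heights in centimetres; both groups have the mean 110.
group_1 = [90, 100, 110, 120, 130]    # widely spread
group_2 = [108, 109, 110, 111, 112]   # all about the same height

# The plain deviations cancel out, which is why they are squared.
avg_1 = sum(group_1) / len(group_1)
print("sum of deviations:", sum(v - avg_1 for v in group_1))

print("variance group 1:", variance(group_1))
print("variance group 2:", variance(group_2))
print("group 1, divided by n - 1:", variance(group_1, sample=True))
```

With only five values the two divisors give clearly different results; with thousands of values the difference would hardly be visible.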


Figure 9: Standard Deviation of this Group of Children


Because of the squaring in the variance formula, the variance doesn’t have the same unit of measure
as the original values; it cannot be plotted in the same graphic and is not very intuitive. Therefore, one
takes the square root of the variance to get the standard deviation.

The standard deviation is the square root of the variance.

The standard deviation has the same unit of measure as the original values.
Therefore, you can plot the standard deviation into the data graphics by plotting a line for the arithmetic
mean value plus the standard deviation and a line for the mean value minus the standard deviation.
These two lines define a range. If the number of values is big enough and the values are approximately normally
distributed, then about 68% of the values will lie in this range.
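The 68% statement can be checked with a small simulation. The sketch below draws random values from a normal distribution; the mean of 110 cm, the standard deviation of 12 cm and the sample size are arbitrary assumptions chosen for illustration.

```python
import math
import random

random.seed(42)  # fixed seed so that the run is reproducible

# 10,000 simulated body heights in centimetres (assumed parameters).
values = [random.gauss(110, 12) for _ in range(10_000)]

n = len(values)
avg = sum(values) / n
std = math.sqrt(sum((v - avg) ** 2 for v in values) / n)

# Share of values between (mean - std) and (mean + std).
inside = sum(1 for v in values if avg - std <= v <= avg + std)
share = inside / n

print(f"mean = {avg:.1f} cm, standard deviation = {std:.1f} cm")
print(f"share within one standard deviation: {share:.1%}")
```

For a small group, or for values that are not roughly bell-shaped, the share can differ considerably from 68%.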


Figure 10: Same Arithmetic Mean but Different Standard Deviations


If you calculate the arithmetic mean value, the variance and the standard deviation for these two groups of
children, you can see that, although they have the same mean value, the variance and thus the standard
deviation are much bigger for the first group than for the second (see Fig. 10). That means that the variance
and standard deviation help to describe the body height distributions of the children’s groups much better
than the mean value alone.


2.5 The Normal Distribution

Figure 11: Normal Distribution


Taking many values into account that vary because of random factors, you can often find a normal
distribution for these values when you plot the number of values with a certain deviation around the mean
value (see Fig. 11 and compare also to the histogram in Fig. 3).
Normal distribution, definition from [3]: The normal distribution, also called the Gaussian distribution, is
an important family of continuous probability distributions, applicable in many fields. Each member of the
family may be defined by two parameters, the mean ("average", μ) and variance (standard deviation
squared, σ²) respectively. The standard normal distribution is the normal distribution with a mean of zero
and a variance of one. Carl Friedrich Gauss became associated with this set of distributions when he
analyzed astronomical data using them and defined the equation of its probability density function. It is often
called the bell curve because the graph of its probability density resembles a bell.


Figure 12: Standard Normal Distribution

The standard normal distribution is the normal distribution with a mean of zero and a variance of
one.

• 68.27% of all values deviate not more than σ from the mean value
• 95.45% of all values deviate not more than 2σ from the mean value
• 99.73% of all values deviate not more than 3σ from the mean value
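These percentages can be reproduced with the error function from Python's standard `math` module: for a normal distribution, the share of values within k standard deviations of the mean is erf(k / √2). The helper name `share_within` is an assumption of this sketch.

```python
import math


def share_within(k):
    """Share of normally distributed values within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {share_within(k):.2%}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to every normal distribution.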


Figure 13: Normal Distribution with Different Parameters


If the mean value µ deviates from zero, the function is shifted horizontally.
If the variance σ² is bigger than one, the function becomes broader and flatter than the standard normal
distribution. If the variance is smaller than one, the function becomes narrower and taller.
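The effect of the two parameters can be seen by evaluating the probability density function directly. The sketch below uses the standard formula of the normal density; the function name `normal_pdf` and the chosen parameter values are assumptions for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# The peak always sits at the mean; its height depends only on sigma.
print("standard normal, peak at 0:", round(normal_pdf(0), 4))
print("mu = 3, peak at 3:         ", round(normal_pdf(3, mu=3), 4))
print("sigma = 2, peak at 0:      ", round(normal_pdf(0, sigma=2), 4))
print("sigma = 0.5, peak at 0:    ", round(normal_pdf(0, sigma=0.5), 4))
```

Because the area under the curve always equals one, a broader curve has to be flatter and a narrower curve has to be taller.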


2.6 The Poisson Distribution for rare events

Figure 14: Poisson distribution


Although many natural and business events are distributed normally, there is another very important
distribution: the Poisson distribution. It is especially important for events that happen rarely but have many
opportunities to happen. Examples from nature are the nuclear decay of atoms or chromosome mutations in DNA:
each event has a low probability of happening for a single atom or chromosome, but the overall number of events can still be
high. A business example is the intermittent demand for slow-moving products: the more products
and product variants there are in the assortment, the smaller the individual sales become. A product might sell
only once every two weeks, but it is hard to predict when the next sales transaction will happen and how
many units will be sold in this transaction.

Poisson distribution, definition from [3]:


The Poisson distribution is a discrete probability distribution that expresses the probability of a number of
events occurring in a fixed period of time if these events occur with a known average rate and independently
of the time since the last event. The Poisson distribution can also be used for the number of events in other
specified intervals such as distance, area or volume.
The distribution was discovered by Siméon-Denis Poisson (1781–1840) and published 1838. The work
focused on certain random variables N that count, among other things, a number of discrete occurrences
that take place during a time-interval of given length. If the expected number of occurrences in this interval is
λ, then the probability that there are exactly k occurrences (k being a non-negative integer, k = 0, 1, 2, ...) is
equal to the formula shown in Figure 14.
The Poisson distribution can be applied to systems with a large number of possible events, each of which is
rare. A classic example is the nuclear decay of atoms.
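Since the figure is not reproduced in this text, the sketch below uses the standard Poisson formula, P(N = k) = λ^k · e^(−λ) / k!, which is presumably the formula the figure shows. The rate λ = 0.5 is an assumption that translates "sells once every two weeks" into an expected 0.5 sales transactions per week.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when lam occurrences are expected."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 0.5  # assumed: on average one sales transaction every two weeks

for k in range(4):
    print(f"P({k} transactions in a week) = {poisson_pmf(k, lam):.4f}")
```

With this rate, a week without any sale is the most likely outcome, yet weeks with two or more transactions still occur from time to time, which is what makes such demand hard to forecast.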


3 Basic Forecasting
3.1 What is Forecasting?

“Forecasting is a mixture of science, art and luck.” [4]

Forecasting – Two Definitions from the Internet:


Forecasting is the process of estimation in unknown situations. Prediction is a similar, but more general
term. […] Usage can differ between areas of application: for example in hydrology, the terms "forecast" and
"forecasting" are sometimes reserved for estimates of values at certain specific future times, while the term
"prediction" is used for more general estimates, such as the number of times floods will occur over a long
period of time. Risk and uncertainty are central to forecasting and prediction. Forecasting is used in the
practice of Customer Demand Planning in every day business forecasting for manufacturing companies. The
discipline of demand planning, also sometimes referred to as supply chain forecasting, embraces both
statistical forecasting and a consensus process. Forecasting is commonly used in discussion of time-series
data. [3]
Forecasting is the prediction of outcomes, trends, or expected future behavior of a business, industry
sector, or the economy through the use of statistics. Forecasting is an operational research technique used
as a basis for management planning and decision making. Common types of forecasting include trend
analysis, regression analysis, Delphi technique, time series analysis, correlation, exponential smoothing, and
input-output analysis. [5]

Figure 15: Every day forecasts


The following list is taken from [5]


3.1.1 Applications of forecasting


Forecasting has applications in many situations:
• Supply chain management
• Weather forecasting, Flood forecasting and Meteorology
• Transport planning and Transportation forecasting
• Economic forecasting
• Technology forecasting
• Earthquake prediction
• Land use forecasting
• Product forecasting
• Player and team performance in sports
• Telecommunications forecasting
• Political Forecasting

Figure 16: Forecast Approaches

3.1.2 Forecast Approaches


The following classification is taken from [5], see also Fig. 16.


• Time series methods: Time series methods use historical data as the basis of estimating future
outcomes.
o Moving average
o Exponential smoothing
o Extrapolation
o Linear prediction
o Trend estimation
o Growth curve
• Causal / econometric methods: Some forecasting methods use the assumption that it is possible to
identify the underlying factors that might influence the variable that is being forecast. For example,
sales of umbrellas might be associated with weather conditions. If the causes are understood,
projections of the influencing variables can be made and used in the forecast.
o Regression analysis using linear regression or non-linear regression
o Autoregressive moving average
o Autoregressive integrated moving average
o Econometrics
• Judgmental methods: Judgmental forecasting methods incorporate intuitive judgments, opinions and
probability estimates.
o Surveys
o Delphi method
o Scenario building
o Technology forecasting
• Other methods:
o Simulation
o Prediction market
o Probabilistic forecasting and Ensemble forecasting
o Reference class forecasting
A model in science is a physical, mathematical, or logical representation of a system of entities, phenomena,
or processes. Basically a model is a simplified abstract view of the complex reality. It may focus on particular
views, enforcing the "divide and conquer" principle for a compound problem. Formally a model is a
formalized interpretation which deals with empirical entities, phenomena, and physical processes in a mathematical or
logical way.
A simulation is the implementation of a model over time. A simulation brings a model to life and shows how
a particular object or phenomenon will behave. It is useful for testing, analysis or training where real-world
systems or concepts can be represented by a model.
For more information regarding the above mentioned, see [3].
Forecast approaches addressed in this paper:
Time series methods: use historical data as the basis for estimating future outcomes. Examples: Moving
Average or Exponential Smoothing.
Causal methods: like time series methods, they use historical data, but they also take into account underlying
factors that influence the forecast. Example: Regression analysis.


These forecast methods perform an extrapolation of time series. Time series are based on historical data.
Example: Demand Forecast based on historic sales.

3.2 What are Time Series?

Figure 17: Time Series Definition and Examples


Definitions from the Internet
Time Series [5]: Values taken by a variable over time (such as daily sales revenue, weekly orders, monthly
overheads, yearly income) and tabulated or plotted as chronologically ordered numbers or data points. To
yield valid statistical inferences, these values must be repeatedly measured, often over a four to five year
period. Time series consist of four components:
(1) Seasonal variations that repeat over a specific period such as a day, week, month, season, etc.
(2) Trend variations that move up or down in a reasonably predictable pattern
(3) Cyclical variations that correspond with business or economic 'boom-bust' cycles or follow their own
peculiar cycles, and
(4) Random variations that do not fall under any of the above three classifications.

Time Series [3]: In statistics, signal processing, and many other fields, a time series is a sequence of data
points, measured typically at successive times, spaced at (often uniform) time intervals. Time series
analysis comprises methods that attempt to understand such time series, often either to understand the
underlying context of the data points (where did they come from? what generated them?), or to make
forecasts (predictions). Time series forecasting is the use of a model to forecast future events based on
known past events: to forecast future data points before they are measured. A standard example in
econometrics is the opening price of a share of stock based on its past performance.

3.3 Basic Forecasting Methods

3.3.1 Moving Average

Moving Average for Forecasting - Principle

Starting point (number of values N = 3):

t      1      2      3      4      5      6      7      8      9
Dt     7      10     8      10     12     10     8      11     9
Mt                          8.33

M4 = (7 + 10 + 8) / 3 = 8.33

Moving forward:

t      1      2      3      4      5      6      7      8      9
Dt     7      10     8      10     12     10     8      11     9
Mt                          8.33   9.33

M5 = (10 + 8 + 10) / 3 = 9.33

Figure 18: Moving Average – Principle

Fig. 18 shows a sample time series consisting of data Dt at subsequent points in time t. Suppose the number
of values to be considered for the calculation of the mean value is N=3. In order to calculate the first moving
average value M4, you calculate the arithmetic mean value of the first three values. You move forward by
always calculating the mean value of the three preceding values.
The Moving Average moves from one data point to the next and thereby performs a smoothing of the values.


Moving Average for Forecasting

At the end of the original data values, you start forecasting (N = 3):

t      1      2      3      4      5      6      7      8      9      10     11     12
Dt     7      10     8      10     12     10     8      11     9
Mt                          8.33   9.33   10     10.67  10     9.67   9.33   9.78   9.37

Forecast for t = 10: (8 + 11 + 9) / 3 = 9.33
Forecast for t = 11: (11 + 9 + 9.33) / 3 = 9.78
Forecast for t = 12: (9 + 9.33 + 9.78) / 3 = 9.37

Figure 19: Moving Average for Forecasting


At the end of the original data values, the last average value serves as the first forecast value (see Fig. 19).
From then on, you also consider forecast values for the calculation of the moving average. That means in
this example, that after three periods the forecast is purely based on previous forecast values.
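The procedure can be sketched in Python as follows. The data values are the ones from the sample time series; the function name `moving_average_forecast` and its parameters are assumptions made for this sketch, not part of any SAP application.

```python
def moving_average_forecast(data, n, horizon):
    """Forecast `horizon` periods ahead with a moving average over n values.

    Each forecast is the mean of the n most recent values; forecasts are
    appended to the history and therefore feed into later forecasts.
    """
    history = list(data)
    forecasts = []
    for _ in range(horizon):
        value = sum(history[-n:]) / n
        forecasts.append(value)
        history.append(value)
    return forecasts


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]
forecasts = moving_average_forecast(data, n=3, horizon=4)

for t, f in enumerate(forecasts, start=len(data) + 1):
    print(f"forecast for t = {t}: {f:.2f}")
```

From the fourth forecast on, no original data value is left in the averaging window, so the forecast is based purely on earlier forecasts and settles quickly around a constant level.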


Moving Average for Forecasting, Example

[Chart: original data Dt, moving average (N=3) and forecast plotted against time t = 1 to 12]

$$F_{t+1} = \frac{1}{N}\left(D_t + D_{t-1} + \dots + D_{t+1-N}\right) = \frac{1}{N}\sum_{i=t+1-N}^{t} D_i$$

Issues with moving average for forecasting:

• If the constant level changes, it takes N periods until the forecasted value
adapts
• All N values used to calculate the moving average have the same impact,
although recent values better represent the recent development of values

Figure 20: Moving Average for Forecasting, Example


Plotting the original data values, the moving average and the forecast against time shows how the
moving average smooths the original time series and lags behind it, taking up to N periods to adapt fully (see
Fig. 20). When a new original data point arrives at the next point in time (as the original time
series is collected sequentially), the forecast adapts, with a delay, to peaks, constant level changes or trends in the
original time series.
The moving average is a simple method, but it considers all N values with the same weight, although recent
values might better represent the recent development. It is therefore only suitable for short-term
forecasts.


3.3.2 Weighted Moving Average

Figure 21: Weighted Moving Average – Principle


The Weighted Moving Average method overcomes the issue of the Moving Average method that all N values have
the same impact. The principle of how to start, to
move forward and to calculate the forecast is the same, but the values are multiplied by weighting factors
that need to be specified. In this example, the weighting factors for the N=3 values were chosen as 0.167,
0.333 and 0.5, which add up to a total weighting of 1 (see Fig. 21).
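A minimal Python sketch of one weighted moving average step is shown here. It uses the data values of the sample time series and the weighting factors named in the text; the assignment of the largest weight to the most recent value is an assumption (the figure is not reproduced in this text), as are the function name and the check on the sum of the weights.

```python
def weighted_moving_average(values, weights):
    """Weighted mean of the values; weights are listed from oldest to newest."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per value")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    return sum(v * w for v, w in zip(values, weights))


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]
weights = [0.167, 0.333, 0.5]  # assumed order: oldest -> newest

# First forecast value: weighted mean of the last three data values.
forecast = weighted_moving_average(data[-3:], weights)
simple = sum(data[-3:]) / 3

print(f"weighted moving average: {forecast:.2f}")
print(f"simple moving average:   {simple:.2f}")
```

Because the most recent values 11 and 9 get the larger weights and the older value 8 the smallest, the weighted result lies above the simple average of the same three values.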


Figure 22: Weighted Moving Average – Example


Because the values that enter the weighted mean value are weighted, this method reacts better to constant level changes, trends or other fluctuations of the original time series: it gives recent values more impact than distant ones (see Fig. 22).
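A weighted moving average forecast can be sketched as follows. The weighting factors 0.167, 0.333 and 0.5 are those named in the text; assigning the largest factor to the most recent value, the function name and the sample values are assumptions for illustration.

```python
def weighted_moving_average_forecast(values, weights):
    """Forecast the next period as the weighted mean of the last len(weights) values.

    The weights are ordered from the oldest to the most recent value.
    """
    n = len(weights)
    if len(values) < n:
        raise ValueError("need at least as many values as weights")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("weights must add up to 1")
    return sum(w * v for w, v in zip(weights, values[-n:]))


WEIGHTS = (0.167, 0.333, 0.5)   # oldest first; the order is an assumption
rising = [10, 20, 30]           # hypothetical rising series

wma = weighted_moving_average_forecast(rising, WEIGHTS)
plain = sum(rising) / len(rising)   # unweighted moving average for comparison
print(round(wma, 2), plain)
```

For a rising series, the weighted forecast lies closer to the latest value than the unweighted average does.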


3.3.3 First Order Exponential Smoothing

First Order Exponential Smoothing, Principle

Always take, e.g., 50% of what you have calculated so far plus 50% of the next data value.

t                          1    2     3     4     5      6      7     8      9     10
Data Dt / Forecast Ft      7    10    8     10    12     10     8     11     9     9.53
Smoothed value (α = 0.5)        8.5   8.25  9.13  10.56  10.28  9.14  10.07  9.53

The last smoothed value is 10.07·0.5 + 9·0.5 = 9.53; it becomes the forecast for t = 10.

Weighting factors of the data values for α = 0.5, from the most recent value backwards: 0.5, 0.25, 0.125, 0.0625, 0.0313, 0.0156, 0.0078. The last 6 periods contribute 98.44% of the input, older periods 1.56%.

Figure 23: First Order Exponential Smoothing, Principle


First order exponential smoothing goes one step further in weighting the values that enter the mean value. In addition, each mean value can easily be calculated from the previous mean value and the next data value (or forecast value, respectively).
Fig. 23 shows an example: start with the first two data values, each weighted with 0.5, to get the first average value. Then combine this average and the next data value D3, again with a weighting of 0.5 each. Proceed in this way up to the first forecast value shown in the figure. (Forecast values are taken into account for further extrapolation.)
If you work out the weighting factor that each past data value has received, you will see that the factors follow an exponential curve.
This means that all past values have an impact on the forecast, although this impact decreases exponentially with age.
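The procedure can be reproduced with a short Python sketch that uses the data values D1 to D9 of Fig. 23. Starting the recursion from the first data value is an assumption; for α = 0.5 it gives the same result as averaging the first two values.

```python
def exponential_smoothing(values, alpha):
    """Return the smoothed series, starting from the first data value."""
    smoothed = [values[0]]
    for d in values[1:]:
        # new value = alpha * next data value + (1 - alpha) * previous smoothed value
        smoothed.append(alpha * d + (1 - alpha) * smoothed[-1])
    return smoothed


data = [7, 10, 8, 10, 12, 10, 8, 11, 9]    # D1..D9 from Fig. 23
smoothed = exponential_smoothing(data, alpha=0.5)
forecast = smoothed[-1]                     # constant forecast for t = 10
print(smoothed[1:], forecast)
```

The last smoothed value is 9.53515625, which corresponds to the forecast of about 9.53 shown in the figure.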


Smoothing Factor α

α = 0.5: reacts quickly to a level shift and strongly to a short pulse. Weighting factors 0.5, 0.25, 0.125, ...; 98.4% of the input comes from the last 6 periods, 1.6% from older periods.
α = 0.1: reacts slowly to a level shift and only a little to a short pulse. Weighting factors 0.1, 0.09, 0.08, 0.07, 0.07, 0.06, ...; 65.1% of the input comes from the last 10 periods, 34.9% from older periods.

The smoothing factor α determines both the responsiveness and the stability of the forecast. Common values are α = 0.2 or α = 0.3.

Figure 24: Smoothing Factor α


The most recent data value is weighted with the smoothing factor α, whereas the last average calculated with exponential smoothing is weighted with 1-α. α determines two characteristics of the exponential smoothing at the same time:
• Responsiveness, that is, how quickly the exponentially smoothed values (and also the forecast) react to level shifts
• Stability, that is, how little the smoothed values and the forecast react to short pulses
These two characteristics conflict: a larger α increases responsiveness but reduces stability. Reasonable results can be found for α = 0.2 or 0.3 (see Fig. 24).
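The percentages quoted for the smoothing factor follow from the exponential weights: a data value that lies k periods back receives the weight α(1-α)^k, so the most recent n values together carry 1-(1-α)^n of the total weight. The following Python sketch evaluates this; the function names are chosen for illustration.

```python
def weight_of_value(alpha, age):
    """Weight of the data value `age` periods back (age 0 = most recent value)."""
    return alpha * (1 - alpha) ** age


def share_of_last_periods(alpha, n):
    """Combined weight of the n most recent data values."""
    return 1 - (1 - alpha) ** n


for alpha, n in [(0.5, 6), (0.1, 10), (0.2, 10), (0.3, 10)]:
    share = share_of_last_periods(alpha, n)
    print(f"alpha={alpha}: last {n} periods carry {share:.1%}")
```

With α = 0.1 the last 10 periods carry about 65.1% of the weight; with α = 0.5 the six most recent values already carry about 98.4%.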


Figure 25: First Order Exponential Smoothing, Example and Formula


Comparing the plot of the exponential smoothing with the moving average of Fig. 20 or the weighted moving average of Fig. 22, you can see that exponential smoothing follows the fluctuations of the original time series more closely (see Fig. 25). A further advantage is that you can calculate the exponentially smoothed value from only two values: the latest smoothed value and the next data value.
However, like the moving average methods, first order exponential smoothing cannot predict trends or seasonal patterns. All these methods can follow such fluctuations while smoothing the original data values, but they cannot project them into the future. Therefore, further enhancements are needed.


3.3.4 Seasonal adjustment of time series as a general statistical method

Seasonal Adjustment - Example

Example: Hypothetical unemployment numbers for 3.5 years (year/quarter)

Quarter   1/I  1/II  1/III  1/IV  2/I  2/II  2/III  2/IV  3/I  3/II  3/III  3/IV  4/I  4/II
Value     116  100   92     100   108  100   92     100   116  108   100    100   112  108

Question: is there a positive trend in the last quarter if seasonal effects are neglected?

Figure 26: Seasonal Adjustment – Example


The following example, taken from [1], shows a method for adjusting a time series for seasonal patterns.
Fig. 26 shows hypothetical unemployment numbers per quarter over 3.5 years. In the last quarter, you can observe a drop from 112 to 108 (relative numbers). The question is whether this drop is real or only due to the season, which usually leads to a decrease in unemployment at this time of year.
You can easily see that there is indeed a seasonal pattern: every year, unemployment reaches a maximum in the first quarter and a minimum in the third. But how big is this seasonal effect?


Figure 27: Seasonal Adjustment – Example


The first step in answering this question is to calculate the moving average of the original time series with N=4 (see Fig. 27). The moving average cannot start before quarter III of the first year. To center the average on the current quarter, the following formula is used:
moving average = (½ of the quarter before last + last quarter + current quarter + next quarter + ½ of the quarter after next) / 4
Because the formula needs the two following quarters, the moving average time series has to end at quarter IV of the third year.
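The centered moving average can be computed with a short Python sketch that uses the quarterly values of Fig. 26. The function and variable names are chosen for illustration.

```python
QUARTERS = ["1/I", "1/II", "1/III", "1/IV", "2/I", "2/II", "2/III", "2/IV",
            "3/I", "3/II", "3/III", "3/IV", "4/I", "4/II"]
VALUES = [116, 100, 92, 100, 108, 100, 92, 100, 116, 108, 100, 100, 112, 108]


def centered_moving_average(values):
    """Map each index t to the 4-quarter moving average centered on t."""
    averages = {}
    # Two quarters are needed before and after t, so the ends are skipped.
    for t in range(2, len(values) - 2):
        total = (0.5 * values[t - 2] + values[t - 1] + values[t]
                 + values[t + 1] + 0.5 * values[t + 2])
        averages[t] = total / 4
    return averages


averages = centered_moving_average(VALUES)
for t, avg in averages.items():
    print(QUARTERS[t], avg, VALUES[t] - avg)   # quarter, average, deviation
```

The first average (quarter 1/III) is 101, so the original value 92 deviates from it by -9.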


Figure 28: Seasonal Adjustment – Example


The next step is to calculate a “seasonal figure” for each quarter, which is the mean deviation of the original data from the moving average (see Fig. 28). The seasonal figures are:
Quarter I: (8+11)/2 = 9.5
Quarter II: (0+2)/2 = 1
Quarter III: (-9-9-5.5)/3 = -7.83
Quarter IV: (0-3-5)/3 = -2.67


Figure 29: Seasonal Adjustment – Example


Subtracting the seasonal figure of its quarter from each original value yields a further approximation of the season-independent trend component (see Fig. 29). Unlike the moving average, this seasonally adjusted time series also contains values at the beginning and the end of the time series.
As a result, you find that the seasonally adjusted unemployment (unlike the non-adjusted one) increased from 102.5 to 107 in the last quarter.
Statistical seasonal adjustments usually work in similar ways. Once the seasonal figures have been isolated from the original data, they can also be used for forecasting future seasons.
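The complete seasonal adjustment of the unemployment example can be sketched in Python as follows. The sketch assumes quarterly data that start in quarter I and subtracts the seasonal figure of each quarter from the original value; the function names are chosen for illustration.

```python
VALUES = [116, 100, 92, 100, 108, 100, 92, 100, 116, 108, 100, 100, 112, 108]


def centered_moving_average(values):
    """4-quarter moving average centered on each index t (ends are skipped)."""
    return {t: (0.5 * values[t - 2] + values[t - 1] + values[t]
                + values[t + 1] + 0.5 * values[t + 2]) / 4
            for t in range(2, len(values) - 2)}


def seasonal_figures(values):
    """Mean deviation from the moving average for quarters I to IV."""
    deviations = {q: [] for q in range(4)}
    for t, avg in centered_moving_average(values).items():
        deviations[t % 4].append(values[t] - avg)   # t % 4 == 0 is quarter I
    return [sum(deviations[q]) / len(deviations[q]) for q in range(4)]


def seasonally_adjust(values):
    """Subtract the seasonal figure of its quarter from every value."""
    figures = seasonal_figures(values)
    return [v - figures[t % 4] for t, v in enumerate(values)]


figures = seasonal_figures(VALUES)
adjusted = seasonally_adjust(VALUES)
print([round(f, 2) for f in figures])
print(adjusted[-2], adjusted[-1])
```

The last two adjusted values are 102.5 and 107, which is the increase described in the text.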


3.3.5 Exponential Smoothing with Trend and Seasonality

Figure 30: Exponential Smoothing with Trend and Seasonality


Remember that simple exponential smoothing can follow a time series, but it can only extrapolate constant values.
Seasonal adjustment separates the seasonal effect from the base. In addition, you can determine a trend component, e.g. by performing a second order exponential smoothing.
In the example of Fig. 30, which was adapted from [6], the trend and seasonal components were isolated with the help of the following formulas:

Base: $B_t = \alpha \frac{D_t}{S_{t-m}} + (1 - \alpha)(B_{t-1} + T_{t-1})$

Trend: $T_t = \beta (B_t - B_{t-1}) + (1 - \beta) T_{t-1}$

Season factors: $S_t = \gamma \frac{D_t}{B_t} + (1 - \gamma) S_{t-m}$

Forecast: $F_{t+k} = (B_{t-1} + k T_{t-1}) S_{t+k-m}$


Note that these formulas assume multiplicative seasonality, that is, the seasonal factor multiplies the base, so the seasonal amplitude grows or shrinks with the base level. Other methods treat the seasonal pattern as additive. More information can be found in the forecasting literature, such as [4].
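The update formulas for base, trend and season factors can be sketched in Python as follows. The initial values, the smoothing factors and the sample series are assumptions made for illustration, and the forecast helper applies the forecast formula to the most recently updated base and trend for k ≥ 1 periods ahead.

```python
def holt_winters_update(d, base, trend, season, alpha, beta, gamma):
    """Apply the base, trend and season formulas to one new data value d.

    base and trend are the previous values B(t-1) and T(t-1); season is
    S(t-m), the factor of the same season one cycle earlier.
    """
    new_base = alpha * d / season + (1 - alpha) * (base + trend)
    new_trend = beta * (new_base - base) + (1 - beta) * trend
    new_season = gamma * d / new_base + (1 - gamma) * season
    return new_base, new_trend, new_season


def smooth(data, m, base, trend, seasons, alpha, beta, gamma):
    """Run the update over all data; seasons holds the m initial season factors."""
    seasons = list(seasons)
    for t, d in enumerate(data):
        base, trend, seasons[t % m] = holt_winters_update(
            d, base, trend, seasons[t % m], alpha, beta, gamma)
    return base, trend, seasons


def forecast(base, trend, seasons, n_observed, k, m):
    """Forecast k periods ahead with the matching season factor."""
    return (base + k * trend) * seasons[(n_observed + k - 1) % m]


# Hypothetical quarterly series: constant level 100 with a fixed seasonal pattern.
pattern = [1.2, 1.0, 0.8, 1.0]
data = [120, 100, 80, 100] * 3
base, trend, seasons = smooth(data, 4, base=100.0, trend=0.0, seasons=pattern,
                              alpha=0.3, beta=0.2, gamma=0.1)
next_quarter = forecast(base, trend, seasons, len(data), k=1, m=4)
print(round(base, 2), round(trend, 2), round(next_quarter, 2))
```

Because the sample series has no trend and a perfectly repeating pattern, the base stays at 100 and the forecast for the next first quarter reproduces the seasonal peak of 120.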

3.3.6 Linear Regression


The following example for linear regression is based on [6].

Figure 31: Regression Example


Fig. 31 shows an example taken from [7]: A champagne producer wants to launch a new champagne product and is looking for a suitable retail price. Before making any decision, the producer wants to find out how sales depend on the price. Therefore, a sales test is performed in 6 stores with prices between 10 and 20 Euros. The sales per day are plotted against the retail prices. The dependency appears to be linear, which can be analyzed with linear regression.
Linear regression, definition from [3]: In statistics, linear regression is a form of regression analysis in which the relationship between one or more independent variables and another variable, called the dependent variable, is modeled by a least squares function, called the linear regression equation. This function is a linear combination of one or more model parameters, called regression coefficients. A linear regression equation with one independent variable represents a straight line. The results are subject to statistical analysis.


Figure 32: Regression Example


Least squares analysis can be used to find the regression line that best fits the data set of the two variables (for formulas see Fig. 32).
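A least squares fit of a straight line can be sketched in Python as follows. The sketch assumes the standard formulas for simple linear regression (slope = S_xy / S_xx, intercept = mean of y minus slope times mean of x). The price and sales values are illustrative numbers chosen to match the description of the test (six stores, prices between 10 and 20 Euros); they are not read from Fig. 31.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data (assumed): retail price in Euros and bottles sold per day.
prices = [20, 16, 15, 16, 13, 10]
sales = [0, 3, 7, 4, 6, 10]

intercept, slope = fit_line(prices, sales)
print(round(intercept, 2), round(slope, 2))
```

For these assumed values the slope is close to -1, that is, roughly one bottle less per day for each additional Euro.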


Figure 33: Regression Example


As a result, you obtain the regression line shown in Fig. 33. You can interpret the line as follows: for each Euro added to the price, the store sells about one bottle less per day.
Interpolation and extrapolation:
A prediction within the range of x values used to construct the model is known as interpolation. In the champagne example, this would mean: at a price of 12 Euro, the store could sell 8 bottles a day. A prediction outside the range of the data used to construct the model is known as extrapolation, and it is more risky. In the champagne example, this could mean: at a price of 8 Euro (which was not tested), the store could sell 12 bottles a day.
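The distinction can be made explicit in a small Python sketch. The rounded line sales = 20 - price is inferred from the two statements in the text (one bottle less per Euro, 8 bottles at 12 Euro) and is an assumption, as are the names used.

```python
TESTED_PRICE_RANGE = (10, 20)    # prices used in the sales test, from the text
INTERCEPT, SLOPE = 20.0, -1.0    # rounded regression line (assumed from the text)


def predict_sales(price):
    """Return the predicted bottles per day and the kind of prediction."""
    low, high = TESTED_PRICE_RANGE
    kind = "interpolation" if low <= price <= high else "extrapolation"
    return INTERCEPT + SLOPE * price, kind


print(predict_sales(12))
print(predict_sales(8))
```

Flagging extrapolated predictions in this way makes it visible when a forecast leaves the tested price range.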

3.3.7 More sophisticated Regression Methods:


Non-linear Regression:
• The response Y depends on a non-linear function of the variable x, such as an exponential function or a logarithm
• Solution approach: the variable x is plotted on a suitable scale (e.g. a logarithmic scale) so that the relationship becomes linear
Multiple Regression:
• The response Y depends linearly on several independent variables x1, x2, etc.
• Solution approach: linear least squares method, which leads to a set of normal equations that can be written in matrix form
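The solution approach for non-linear regression can be sketched in Python for the assumed model y = a + b·ln(x): the variable x is transformed to a logarithmic scale and an ordinary least squares line is fitted to the transformed values. The model, the function names and the noise-free synthetic data are assumptions for illustration.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def fit_log_model(xs, ys):
    """Fit y = a + b * ln(x) by a linear fit on the transformed variable ln(x)."""
    return fit_line([math.log(x) for x in xs], ys)


# Synthetic data generated from y = 2 + 3 * ln(x), without noise.
xs = [1, 2, 4, 8, 16]
ys = [2 + 3 * math.log(x) for x in xs]

a, b = fit_log_model(xs, ys)
print(round(a, 6), round(b, 6))
```

On the logarithmic scale the synthetic points lie on a straight line, so the fit recovers a = 2 and b = 3.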


3.4 Causal based forecasting

Causal Factors in Demand Forecasting, Examples and Principle

Examples of causal factors: calendar events, local events, sales price changes, sales promotions, deterministic demands.
Diagram labels: sales/demand of a product in a location; sales data with past occurrences; forecast data with future occurrences.

Figure 34: Causal Factors in Demand Forecasting, Examples and Principle


A causal factor is an external factor with a significant influence on the sales or demand of a product.
When concrete occurrences of causal factors are assigned to either locations or location products, the forecast can use the information about the effects of such occurrences in the past to predict their influence on future sales or demand.
Fig. 34 shows examples of causal factors together with hypothetical sales and forecast curves that illustrate the following principle: the correlation of the sales peaks with the causal factor occurrences in the past is applied to future occurrences.


Figure 35: Impact of Seasonality and Causal Factors on Forecast, Example


Forecast methods such as exponential smoothing and regression, together with causal factor analysis, are used, for example, in automatic replenishment software in the retail industry.
Fig. 35 shows a forecast calculated and displayed in SAP Forecasting and Replenishment 5.1. The consumption time series represents a hypothetical sales curve that is characterized by a yearly seasonal pattern, increases around Christmas and additional peaks during promotions. Promotions and Christmas seasons were maintained in the system as causal factors (“Demand Influencing Factors” in SAP F&R). The forecast method was a regression method that takes into account both the seasonality and the effect of causal factors. The forecast was able to reproduce all of these effects.


3.5 Forecasting Performance Measures

Figure 36: Forecasting Performance Measures


Fig. 36 shows an example of a linear curve representing a supposed forecast together with some supposed actual values recorded after the forecast had been calculated. The question now is: how good is the forecast?
The following measures are commonly used to assess forecast quality:
• Mean Forecast Error (MFE or Bias)
• Mean Absolute Deviation (MAD)
• Mean Absolute Percentage Error (MAPE)
• Mean Squared Error (MSE)



Figure 37: Mean Forecast Error
Fig. 37 shows the mean forecast error: it is the sum of all deviations divided by the number of values. Positive and negative deviations can cancel each other out. Therefore, the mean forecast error can only detect a systematic under- or overshooting of the forecast.
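The mean forecast error can be sketched in Python with the six forecast and actual values that this paper lists for its MAD and MSE examples. Defining the deviation as actual minus forecast (Dt - Ft) is an assumption taken from those formulas.

```python
FORECAST = [3, 5, 7, 9, 11, 13]
ACTUALS = [2, 6, 5, 10, 13, 11]


def mean_forecast_error(actuals, forecasts):
    """Mean of the signed deviations (actual minus forecast)."""
    return sum(d - f for d, f in zip(actuals, forecasts)) / len(actuals)


mfe = mean_forecast_error(ACTUALS, FORECAST)
print(round(mfe, 3))
```

The signed deviations -1, 1, -2, 1, 2, -2 largely cancel, so the result is close to zero although no single forecast value is exact.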


Mean Absolute Deviation (MAD)

Mean Absolute Deviation (MAD): Measures absolute error
• Positive and negative errors thus do not cancel out (as with MFE)
• Want MAD to be as small as possible
• No way to know if MAD error is large or small in relation to the actual data

$\mathrm{MAD} = \frac{1}{n} \sum_{t=1}^{n} |D_t - F_t|$

Time                 1   2   3   4    5    6
Forecast             3   5   7   9    11   13
Actuals              2   6   5   10   13   11
Absolute deviation   1   1   2   1    2    2     MAD = 1.5

Figure 38: Mean Absolute Deviation (MAD)


Fig. 38 shows the mean absolute deviation, which uses the absolute values of the deviations instead of the signed deviations. As a result, positive and negative deviations do not cancel out. However, the key figure is hard to interpret because it depends on the magnitude and units of the values.
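The MAD value of the example can be checked with a few lines of Python; the function name is chosen for illustration.

```python
FORECAST = [3, 5, 7, 9, 11, 13]
ACTUALS = [2, 6, 5, 10, 13, 11]


def mean_absolute_deviation(actuals, forecasts):
    """Mean of the absolute deviations |D_t - F_t|."""
    return sum(abs(d - f) for d, f in zip(actuals, forecasts)) / len(actuals)


mad = mean_absolute_deviation(ACTUALS, FORECAST)
print(mad)
```

The result is 1.5, as stated for the example.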


Figure 39: Mean Absolute Percentage Error


Fig. 39 shows the mean absolute percentage error, which expresses each absolute deviation as a percentage of the actual value and averages these percentages. This is a very common key figure.
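A minimal Python sketch of the MAPE follows. It assumes the common definition (the mean of |Dt - Ft| / Dt) and reuses the forecast and actual values that this paper lists for its MAD and MSE examples; Fig. 39 itself may be based on different numbers.

```python
FORECAST = [3, 5, 7, 9, 11, 13]
ACTUALS = [2, 6, 5, 10, 13, 11]


def mean_absolute_percentage_error(actuals, forecasts):
    """Mean of |D_t - F_t| / D_t; requires non-zero actual values."""
    if any(d == 0 for d in actuals):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(d - f) / abs(d) for d, f in zip(actuals, forecasts)) / len(actuals)


mape = mean_absolute_percentage_error(ACTUALS, FORECAST)
print(f"{mape:.1%}")
```

Because every deviation is divided by the actual value, the result no longer depends on the units of the data, but it is undefined for periods with zero actual values.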


Mean Squared Error (MSE)

Mean Squared Error (MSE): Measures variance of forecast error
• Measures squared forecast error - error variance
• Recognizes that large errors are disproportionately more “expensive” than small errors
• But is not as easily interpreted as MAD, MAPE - not as intuitive

$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} (D_t - F_t)^2$

Time                1   2   3   4    5    6
Forecast            3   5   7   9    11   13
Actuals             2   6   5   10   13   11
Squared deviation   1   1   4   1    4    4     MSE = 2.5

Figure 40: Mean Squared Error


Fig. 40 shows the mean squared error, which is analogous to the statistical variance explained earlier.
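The MSE value of the example can be checked in Python as follows; the function name is chosen for illustration.

```python
FORECAST = [3, 5, 7, 9, 11, 13]
ACTUALS = [2, 6, 5, 10, 13, 11]


def mean_squared_error(actuals, forecasts):
    """Mean of the squared deviations (D_t - F_t)^2."""
    return sum((d - f) ** 2 for d, f in zip(actuals, forecasts)) / len(actuals)


mse = mean_squared_error(ACTUALS, FORECAST)
print(mse)
```

The result is 2.5. Squaring gives the deviations of size 2 four times the weight of the deviations of size 1.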


References
[1] Walter Krämer: Statistik verstehen, Piper Verlag GmbH, München, 6th edition, 2007
[2] Walter Krämer: So lügt man mit Statistik, Piper Verlag GmbH, München, 9th edition, 2007
[3] Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Main_Page, search for the keywords ‘normal distribution’, ‘Poisson distribution’, ‘Forecasting’, ‘Time Series’, ‘Linear regression’
[4] Peter Mertens, Susanne Rässler (eds.): Prognoserechnung, Physica-Verlag Heidelberg, 6th edition, 2005
[5] BNET Business Dictionary, http://dictionary.bnet.com, search for the keyword ‘Forecasting’
[6] Stephan R. Lawrence: Demand Forecasting: Time Series Models (talk), College of Business and Administration, University of Colorado, Boulder
[7] Wikipedia, die freie Enzyklopädie, http://de.wikipedia.org/wiki/Regressionsanalyse

Copyright
© 2008 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG.
