Clifford Lam
Department of Statistics London School of Economics and Political Science
Princetonshield
1 / 58
Lecture 8 rundown
Recap from last lecture
What is a time series?
Why are time series so important?
How do we model them?
We're back to Excel
Time series
Time series (TS) data is any sequence of measurements taken on a response that varies over time.
Examples:
Weather (pressure, temperature, rainfall; daily, weekly, annual)
Health (HIV: white cell count; cancer: tumor growth)
Finance (shares, interest rates, exchange rates)
TS - why do we care
In the business world, TS are the main object of statistical analysis.
Shares, interest rates, real estate prices, the price of gold, petrol prices, inflation, etc.
In your future jobs you might need to know something about TS.
TS are also important for weather forecasting and, more recently, for identifying global warming.
TS - why do we care
The aim of TS modelling is to understand seasonal (cyclical) and directional trends.
This lets us FORECAST (i.e. predict) the values of the variable of interest at a future date.
Forecasts allow people in the financial world to make profits by buying or selling shares, options, etc.
TS - forecasting
Forecasting is extremely uncertain.
Remember when we talked about out-of-sample predictions in MLR? That is, predicting the outcome for ranges of the explanatory variables that we have not seen.
We are always less sure about out-of-sample predictions than about in-sample predictions.
TS - forecasting
In TS forecasting we only care about out-of-sample prediction.
This is difficult because TS are very variable and often unpredictable:
Markets have crashes and recessions
Weather is highly variable
TS and regression
Consider the data on petrol sales for cars per quarter over 4 years.
From the scatterplot of quarter (time) versus sales we can see that a linear downward trend would be suitable.
TS and regression
However, we shouldn't really fit a linear regression, as the error terms are dependent.
This is because there is a seasonal trend, which shows clearly in the residual plot.
TS and regression
It makes sense that there should be a seasonal trend in petrol sales.
People travel more in the summer and therefore buy more petrol then.
Of the NICEL assumptions, I is being violated: the errors are not Independent.
This is quite typical for time series models.
But there is a linear trend too.
We need to consider both elements.
TS Components
TS have 4 elements:
1. Trend: the long-term direction of the data. This can be linear, exponential, etc., and can be described by a regression.
2. Seasonal effects: cycles related to seasons, months, weeks, or days of the week.
3. Cycles: long-term cycles that are not necessarily related to the season. No need to worry about these in this course.
4. Irregular fluctuations: random error plus blips and market crashes.
TS Components
We assume a multiplicative model for how the TS components mix:
$Y_t = \text{Trend} \times \text{Seasonality} \times \text{Cyclicality} \times \text{Irregularities} = T \times S \times C \times I$
Typically we focus on T and S, as these are easier to predict than long-term cycles and irregular fluctuations.
TS Components
We use a multiplicative model because it makes more sense than an additive model.
Think about the residual plot from before: it has a seasonal pattern but is also heteroscedastic (i.e. funnel shaped).
So if we just add seasonality to the linear trend, we don't take into account that at later times there is often extra variability.
A multiplicative model does!
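As a rough illustration of this point, the sketch below compares the size of the seasonal swing under the two models. The trend levels, the seasonal indices 1.2 and 0.8, and the additive offset of 20 are made-up numbers chosen for the example; they are not taken from the petrol data.

```python
# Hypothetical numbers for illustration only (not the petrol data).
trend = [100.0, 200.0, 300.0]   # trend level at three points in time
high, low = 1.2, 0.8            # multiplicative seasonal indices (peak, trough)
offset = 20.0                   # additive seasonal effect (+/- around the trend)

# Peak-to-trough swing under each model.
mult_swing = [t * high - t * low for t in trend]
add_swing = [(t + offset) - (t - offset) for t in trend]

print([round(s, 6) for s in mult_swing])  # swing grows with the trend level
print([round(s, 6) for s in add_swing])   # swing stays the same size
```

Under the multiplicative model the swing scales with the level of the series, which is the funnel shape seen in the residuals; under the additive model it cannot change.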
Trend example
Below are monthly data on petrol sales. As you can see from this longer time series, there is an upward, probably linear, trend.
As the data are monthly, there is also a monthly (seasonal) pattern, but it is harder to see.
Season example
Below are seasonal data on ice-cream sales. As you can see from this time series, there is a definite seasonal trend in ice-cream sales.
Other examples
Employment will have cycles: recessions have a cyclical nature.
Irregularities: market crashes.
Seasonality: more work in the summer.
Trend: as more people are born, more are employed.
TS components
In this course we focus on retrieving the main components of a time series: Trend and Seasonality.
This can get pretty intense before you understand it, so please pay attention.
The way this works is to find the underlying trend of the time series, and then divide the time series by this trend to get the seasonal component.
Stationarity
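In this course, time series without cycles or seasonality are called stationary.
That is, if a time series has only a trend that can be explained by a linear regression, we treat it as stationary.
Strictly speaking, such a series is trend-stationary: it is stationary only once the trend has been removed.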
TS components
Let Y represent a time series.
$Y = T \times S \times C \times I$ is the most general form.
I is unpredictable.
C is hard to estimate unless we know the data cycle.
T and S can be found, so we assume $Y = T \times S$.
TS how to
1. S: First we find a preliminary trend by Smoothing the data. Think carefully about the type of moving average you might choose.
2. M: We find a preliminary seasonal component by dividing the time series Y by the moving-average trend: $S = Y / T$.
3. S: We then get the true seasonal effects by estimating first the seasonal averages and then the seasonal indices.
TS how to
4. T: Now that we have a good estimate of seasonality, divide the data by the seasonal indices to get the real trend: $T = Y / S$.
5. R: Use a linear regression on the Trend to estimate the trend parameters.
6. F: Multiply the Trend forecast by the seasonal estimates, and then do the usual residual analyses.
Smoothing
Smoothing is the idea of getting rid of the seasonal, cyclical and irregular components of the time series, thus extracting the trend.
The idea is to summarise what a time series is doing by averaging the data points over a number of time points.
Some people don't like this and prefer autocorrelation models. We'll see these later.
Smoothing
There are three main ways of smoothing:
1. Moving averages
2. Weighted moving averages
3. Exponential smoothing (later)
We'll do Moving Averages first.
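A simple moving average replaces each point by the plain average of the points in a window around it. The sketch below is one minimal way to write a centred moving average for an odd window length; the function name `moving_average` and the data are hypothetical and only serve as an illustration.

```python
def moving_average(series, window):
    """Centred simple moving average for an odd window length.

    Returns a list as long as `series`; positions without a full window
    (the first and last window // 2 points) are set to None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("this sketch only handles odd window lengths")
    half = window // 2
    smoothed = [None] * len(series)
    for t in range(half, len(series) - half):
        smoothed[t] = sum(series[t - half:t + half + 1]) / window
    return smoothed


# Hypothetical data, made up for illustration.
values = [3, 5, 4, 6, 8, 7, 9]
ma3 = moving_average(values, 3)
print(ma3)  # [None, 4.0, 5.0, 6.0, 7.0, 8.0, None]
```

The None entries at the two ends show why a moving average always loses a few points at the start and end of the series.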
Instead of just using simple averages, we can try using weighted moving averages.
These give more weight to values close to the time point we are estimating.
E.g. in our minks example we use a 5-point MA in which the furthest points are worth less:
$$x^{5}_{t} = \frac{x_{t-2} + 2x_{t-1} + 4x_{t} + 2x_{t+1} + x_{t+2}}{10}$$
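The 5-point weighted average with weights 1, 2, 4, 2, 1 (divided by their sum, 10) can be sketched as follows. The function name and the data are hypothetical; the minks data are not reproduced here.

```python
WEIGHTS = (1, 2, 4, 2, 1)  # the 5-point weights from the slide; they sum to 10


def weighted_moving_average(series, weights=WEIGHTS):
    """Centred weighted moving average; ends without a full window are None."""
    half = len(weights) // 2
    total = sum(weights)
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        out[t] = sum(w * x for w, x in zip(weights, window)) / total
    return out


# Hypothetical data, made up for illustration.
values = [10, 20, 30, 40, 50, 60]
wma = weighted_moving_average(values)
print(wma)  # [None, None, 30.0, 40.0, None, None]
```

Because the weights are symmetric, a perfectly linear series is returned unchanged at the interior points, which is a quick sanity check on the weights.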
Seasonal components
Once we've got the hang of the trend and smoothed out the season, if there is one, we can try to isolate the seasonal element by dividing the time series by the trend.
Remember that $Y = T \times S$ if there are no cycles or irregularities, so $Y / T = S$.
What we estimate is the ratio-to-moving-average (R2MA).
In our petrol example there was a seasonal component, so let's find the R2MA.
TS example: R2MA
$$\text{R2MA}_t = \frac{Y_t}{\text{4ptCMA}_t}$$
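A minimal sketch of the R2MA calculation for quarterly data follows. It assumes the usual 4-point centred moving average, i.e. the average of two neighbouring 4-point averages, which gives half weight to the two outermost points. The function names and the data are hypothetical; the made-up series deliberately has no trend so that the ratios are easy to check by hand.

```python
def centred_ma4(series):
    """4-point centred moving average; the first and last two points are None."""
    out = [None] * len(series)
    for t in range(2, len(series) - 2):
        out[t] = (0.5 * series[t - 2] + series[t - 1] + series[t]
                  + series[t + 1] + 0.5 * series[t + 2]) / 4
    return out


def ratio_to_moving_average(series):
    """R2MA_t = Y_t / 4ptCMA_t wherever the centred average exists."""
    cma = centred_ma4(series)
    return [None if m is None else y / m for y, m in zip(series, cma)]


# Hypothetical quarterly data: 4 years, flat level 100, seasonal pattern
# 0.8, 1.0, 1.3, 0.9 (made up for illustration).
sales = [80, 100, 130, 90] * 4
r2ma = ratio_to_moving_average(sales)
print(r2ma)
```

With 16 quarters the ratios exist only for quarters 3 to 14, which gives 12 values, 3 per quarter of the year.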
Seasonal components
We now have an idea of the seasonal trends.
However, if you think about it, we want one estimate of the seasonal component for each quarter.
Currently we have 3 for each quarter (we're looking over 4 years).
Remember that, because we have to leave out the first couple and the last couple of values to get the 4-point centered MA, we only have the R2MA from the 3rd quarter to the 14th.
The best thing is to list them.
TS example: R2MA
Seasonal components
Once we've listed them, we get an average value of the seasonal component for each season.
For each season we have 3 values, so we average them.
If some seasons have a different number of values, we average whatever values they have. E.g. if for Autumn we have only 2 values, then we take the average of these 2 values for Autumn.
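The averaging step can be sketched as below. The ratios are made-up values (not the petrol R2MA), chosen so that the seasons have different numbers of observations; the sketch assumes the list starts in the first season of a year and uses None where no ratio is available.

```python
def seasonal_averages(ratios, period=4):
    """Average the available ratios for each season, skipping None entries."""
    sums = [0.0] * period
    counts = [0] * period
    for t, r in enumerate(ratios):
        if r is not None:
            sums[t % period] += r
            counts[t % period] += 1
    if 0 in counts:
        raise ValueError("at least one season has no ratio to average")
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical ratios for 10 quarters, starting in quarter 1 of a year.
ratios = [None, None, 1.30, 0.90, 0.80, 1.00, 1.34, 0.94, None, None]
averages = seasonal_averages(ratios)
print([round(a, 4) for a in averages])
```

Here quarters 1 and 2 have one ratio each and quarters 3 and 4 have two, and each season is simply averaged over whatever it has.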
Seasonal Average
Next we look at the sum of the seasonal averages.
In this case it is 4.0013, very close to 4.
The idea of seasonality is to imagine it as going up and down around the trend line.
We want it to be 1 on average each year, so we want the sum of the four seasonal averages to be exactly 4.
Remember that we are using a multiplicative model, so 1 here plays the role that 0 plays in additive models.
Seasonal components
We have to estimate the seasonal indices.
The way we do this is by normalising the seasonal averages so that they sum to 4.
E.g. for summer we do
$$\text{Summer.Index} = \text{Summer.Average} \times \frac{4}{\text{Sum.of.Avgs}}$$
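The normalisation is a one-line rescaling. In the sketch below the seasonal averages are made-up numbers whose sum is a little above 4; the function name is hypothetical.

```python
def seasonal_indices(averages):
    """Rescale the seasonal averages so they sum to the number of seasons."""
    period = len(averages)  # 4 for quarterly data
    total = sum(averages)
    return [a * period / total for a in averages]


# Hypothetical seasonal averages (sum is 4.04, not exactly 4).
averages = [0.80, 1.00, 1.32, 0.92]
indices = seasonal_indices(averages)
print([round(i, 4) for i in indices])
print(round(sum(indices), 10))  # the indices now sum to 4
```

Since the averages summed to slightly more than 4, every index comes out slightly smaller than its average.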
Seasonal Indices
De-seasonalising
Now we have a grip on the seasonal component of the data.
We want to get a better grip on the trend.
So we use the same trick we used before: $Y = T \times S$, so $Y / S = T$.
To get a de-seasonalised trend for the petrol data, we divide Petrol by the Seasonal Index.
From the plot we can see that the de-seasonalised data now just look regularly linear.
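De-seasonalising is a division of each observation by the index of its season. The sketch below uses made-up sales and indices (not the petrol data) and assumes the series starts in quarter 1 of a year.

```python
def deseasonalise(series, indices):
    """Divide each observation by the seasonal index of its season."""
    period = len(indices)
    return [y / indices[t % period] for t, y in enumerate(series)]


# Hypothetical data: a trend falling by 2 per quarter from 100, multiplied by
# the hypothetical seasonal indices 0.8, 1.0, 1.3, 0.9.
indices = [0.8, 1.0, 1.3, 0.9]
sales = [80.0, 98.0, 124.8, 84.6, 73.6, 90.0, 114.4, 77.4]
trend = deseasonalise(sales, indices)
print([round(v, 6) for v in trend])
```

Once the seasonal pattern has been divided out, what remains here is the straight line 100, 98, 96, and so on.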
De-seasonalising
De-seasonalising
Trend
What we do now with the trend is fit a linear regression to it.
You do this in the usual way, so I won't go over it.
I just keep the intercept and the coefficient of Quarter.
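For reference, the least-squares fit of the de-seasonalised values on Quarter can be sketched in a few lines without any statistics package. The data are made up and lie exactly on a line so the result is easy to verify; `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical de-seasonalised values for quarters 1 to 8.
quarter = [1, 2, 3, 4, 5, 6, 7, 8]
deseasonalised = [100.0, 98.0, 96.0, 94.0, 92.0, 90.0, 88.0, 86.0]
intercept, slope = fit_line(quarter, deseasonalised)
print(round(intercept, 6), round(slope, 6))
```

The two numbers returned are the intercept and the coefficient of Quarter, which are all that the forecasting step needs.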
Trend
Forecasting
We have the seasonal indices (seasonal component).
We have the trend line (linear regression).
The aim now is to forecast the next few points:
1. Use the linear regression to forecast (predict) the trend values for quarters 17-20.
2. Multiply the regression predictions by the appropriate seasonal index.
These are the forecasts.
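The two forecasting steps can be sketched as below. The intercept, slope and seasonal indices are made-up values, not the estimates from the petrol data, and the sketch assumes quarter 1 is the first quarter of a year so that quarter 17 uses the first index.

```python
def forecast(intercept, slope, indices, quarters):
    """Trend prediction times the seasonal index for each requested quarter."""
    period = len(indices)
    predictions = []
    for q in quarters:
        trend = intercept + slope * q        # step 1: regression prediction
        index = indices[(q - 1) % period]    # season of quarter q (1-based)
        predictions.append(trend * index)    # step 2: multiply by the index
    return predictions


# Hypothetical trend line and seasonal indices.
intercept, slope = 200.0, -2.0
indices = [0.8, 1.0, 1.3, 0.9]
predicted = forecast(intercept, slope, indices, [17, 18, 19, 20])
print([round(p, 4) for p in predicted])
```

The trend keeps falling by the slope each quarter while the seasonal pattern repeats, so the forecasts reproduce the same up-and-down shape around a lower level.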
Forecast
Forecast
We now look at the plot of the predicted time series.
As you can see, the downward trend continues and the seasonal pattern repeats itself.
Real data
We actually have the data for those next 4 quarters.
As you can see, there was a crash in petrol prices in 1985 (where quarter 16 ends).
We could not have predicted that without further information.
Real data
At the end of 1985 there was a pretty big crash in the price of crude oil that lasted all the way through to 1986.
This was due to a maneuver by OPEC countries to secure their future in the market in the face of competition from other countries, e.g. the USA.
Crude oil production has gone down pretty much since then.