3600 Note 01

Econ3600 Econometric Modeling and Analysis Lecture Note # 1 Dr. Bill Hung
Overview of Economic Forecasting Methods:

a. Causal (Multivariate) Forecasting Methods: Regression methods
Make projections of the future by modeling the causal relationship between a
series and other series. Either time-series or cross-sectional data can be used to forecast.
$Y_t = f(X_t)$
or
$Y_t = f(X_{1t}, \ldots, X_{kt}, Y_{2t}, \ldots, Y_{jt})$
Methods: Simple or multiple regression models
System-of-equations models
Seemingly unrelated regression (SUR) models
b. Univariate forecasting methods or Time series methods

Use the past, internal patterns in the data to forecast the future.
Yt+1 = f (Yt, Yt-1, Yt-2, Yt-3)

Methods: Moving average
Smoothing and Exponential smoothing
Decomposition: Seasonal and trend decomposition
ARIMA (Box-Jenkins)
c. Multivariate Time series methods
$Y_{1t} = f(Y_{1,t-1}, \ldots, Y_{2,t}, \ldots, Y_{j,t}, \ldots, X_{1,t}, \ldots, X_{j,t-k})$
Methods: Multivariate ARIMA models
Autoregressive and distributed-lag models
Vector Autoregression (VAR)
Vector Error Correction (VEC) models
ARCH and GARCH models
d. Qualitative Forecasting Methods:

Dummy variables are used to encode information when choices are involved. Such
data are referred to as qualitative data. Models with a yes-no type of dependent
variable are called dichotomous, or dummy dependent variable, regression
models. There are three approaches to estimating and forecasting such models:
(1) the linear probability model (LPM), (2) the Logit model, and (3) the Probit
model, along with the related Tobit model.
Other qualitative forecasting methods are based on the judgement and opinions
of others concerning, for example, future trends, tastes and technological changes.
Qualitative methods include Delphi, market research, panel consensus, scenario
analysis, and historical analogy methods of predicting the future. Qualitative
methods are useful when there is little data to support quantitative methods.
Understanding the classifications of economic data:

1. Primary versus secondary data:
First-hand data are called primary data. In the natural sciences, the bulk of data are
generated by running experiments in a laboratory under a controlled environment in
which factors are determined by the researchers according to their theoretical design.
Therefore, the quality of the data is very often controlled by the researchers.
In the social sciences, particularly in economics, researchers cannot create their own data
by running experiments in a laboratory. They must rely on data collected by
official bodies or agencies. Data collected by another party are called secondary data. If the
data do not meet the desired standard of quality, social scientists cannot afford to
throw them away. Instead, they must patiently and carefully treat or adjust these data
by means of different econometric and statistical methods.
2. Time-Series versus cross-sectional data

A time series is a collection of observations generated sequentially through time. Time
series data can be observed at many "frequencies". The common frequencies of
economic data are annual, quarterly, monthly, weekly, and daily. Minute-by-minute
and even second-by-second data are also very commonly used in
financial research.
Cross-sectional data are observations generated from different economic units in a
given time period. These units might be firms, factors, groups, people, areas,
provinces, regions, or countries.
Panel (or pooled) data combine the time-series and cross-sectional approaches.
3. Macroeconomic versus microeconomic data

Highly aggregated data that measure activity in the economy as a whole, such as
GDP, the implicit price deflator, the interest rate, the unemployment rate, and the money supply, are
called macroeconomic data.
Disaggregated data that measure the economic activity of an individual household,
factor, firm, or industry are called microeconomic data.
4. High-frequency versus low-frequency data

Most economic data are collected over a discrete interval of time. The length of the
discrete interval classifies measurements into annual, semiannual, quarterly, monthly,
weekly, or daily data. Data collected over longer intervals are released less frequently;
this type of data is called low-frequency data. Hourly, daily or weekly data are called
high-frequency data. Real-time data reflect events at the moment they occur; all
other data are called historical data.
5. Quantitative versus qualitative data

Economic data that assign a number to each economic unit, as in the
data sets described above, are referred to as quantitative data. Sometimes we want to record a
characteristic such as "yes" or "no", "good" or "bad", or "male" or "female".
This kind of information is referred to as qualitative data. Such data arise often in
economics when choices are involved. When we convert these qualitative
characteristics into numeric data, we might assign 1 = "yes" and 0 = "no". When a
variable takes on only the numeric values 1 and 0, it is referred to as a dummy (or
binary) variable.
Measurement and conversion of economic data
1. Economic variable measured as levels
Flow variable: a measurement taken over a period of time, as a stream.
Stock variable: a measurement at a particular point in time; it is the
accumulation of all past flows. For example, investment in a given year is a flow
variable, and the capital stock is a stock variable.
2. Economic variables measured as indexes

An index is a series expressed relative to a particular base period (denoted with
subscript 0), or reference period. The levels of the variable in other time periods
(denoted with subscript t) are compared with the base-period level. An economic
indicator measured as an index must have a known base
period to show relative magnitude. Without a base period, the index itself is not
meaningful. Two examples of price indexes are:
Laspeyres index of period t = $100 \times \sum P_t Q_0 / \sum P_0 Q_0$ (base-weighted index)
Paasche index of period t = $100 \times \sum P_t Q_t / \sum P_0 Q_t$ (current-weighted index)
where the sums run over the goods in the basket. Other examples of indexes are:
Effective exchange rate index = $\sum_i w_i E_i$ (where $w_i$ is the weight)
Real exchange rate index = E(HK$/US$) x (US price index / HK price index)
Other variables, such as production and wages, are also measured as indexes.
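The two price index formulas can be illustrated with a minimal Python sketch. The two-good basket below is made up for illustration, and the function names `laspeyres` and `paasche` are hypothetical helpers, not part of the lecture.

```python
def laspeyres(p0, pt, q0):
    """Base-weighted price index: 100 * sum(Pt*Q0) / sum(P0*Q0)."""
    return 100 * sum(p * q for p, q in zip(pt, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, pt, qt):
    """Current-weighted price index: 100 * sum(Pt*Qt) / sum(P0*Qt)."""
    return 100 * sum(p * q for p, q in zip(pt, qt)) / sum(p * q for p, q in zip(p0, qt))


# Hypothetical two-good basket (made-up prices and quantities).
p0, q0 = [10.0, 20.0], [5.0, 2.0]   # base period 0
pt, qt = [12.0, 22.0], [4.0, 3.0]   # period t

print("Laspeyres:", laspeyres(p0, pt, q0))  # 100 * 104 / 90
print("Paasche:  ", paasche(p0, pt, qt))    # 100 * 114 / 100
```

The two indexes differ only in which period's quantities are used as weights, which is why they generally give different readings for the same price change.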
3. Economic variables measures as growth rates and ratios

Economic variables measured in terms of percentage changes or ratios are derived
from level or index variables. For example:
Growth rate = $100 \times (GDP_t - GDP_{t-1}) / GDP_{t-1}$
Inflation rate = $100 \times (CPI_t - CPI_{t-1}) / CPI_{t-1}$
Unemployment rate = $100 \times (\text{Labor force} - \text{Employment}) / \text{Labor force}$
Exchange rate = Domestic currency (units of value) / foreign currency (unit of value)
In many instances, management finds the forecasted growth rate of an economic variable
to be of more interest than a forecast of its absolute level. Growth rate conversions tend to
focus attention on marginal changes rather than absolute levels. This type of
transformation is especially useful when we are developing forecasts with multiple
economic variables measured in different units.
For example, if we want to generate a forecast for interest rates (measured in
percentage terms), one of the factors that we might consider is the money
supply (recorded in billions of dollars). It is therefore more useful to convert the money
supply data to a percentage growth format comparable with the interest rates.
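As a small illustration of the conversion described above, the following Python sketch turns a level series into period-to-period percentage growth rates. The money supply figures are invented for the example, and `growth_rates` is a hypothetical helper name.

```python
def growth_rates(levels):
    """Period-to-period percentage change: 100 * (X_t - X_{t-1}) / X_{t-1}."""
    return [100 * (cur - prev) / prev for prev, cur in zip(levels, levels[1:])]


# Hypothetical money supply in billions of dollars (made-up numbers).
money_supply = [500.0, 510.0, 525.3]

rates = growth_rates(money_supply)
print(rates)  # roughly 2% and then 3% growth
```

Note that the growth series is one observation shorter than the level series, because the first period has no predecessor.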
4. Linear Transformation
Linear transformations are especially useful when applying regression analysis
to forecasting problems, because regression analysis begins by assuming that the
relationship between variables is linear. The advantage of this type of conversion is
that we can take a relationship that does not satisfy the requirement of
linearity and transform the data so that it does. The two common methods are:
- logarithmic transformation
- square root transformation
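A minimal Python sketch of the logarithmic transformation follows, assuming a hypothetical series that grows by a constant 5% per period. In levels the period-to-period steps keep getting larger (a nonlinear pattern), while in logs every step is the same size, so the transformed series is linear in time.

```python
import math

# Hypothetical series growing 5% per period (made-up numbers).
y = [100 * 1.05 ** t for t in range(6)]

# Steps in levels: these keep growing, so the series is not linear in time.
level_steps = [b - a for a, b in zip(y, y[1:])]

# Steps in logs: all equal to log(1.05), so log(y) is linear in time.
log_y = [math.log(v) for v in y]
log_steps = [b - a for a, b in zip(log_y, log_y[1:])]

print(level_steps)
print(log_steps)
```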
Understand the common patterns of Time series:

When a time series is plotted, common patterns are frequently found. These patterns
might be explained by many possible cause-and-effect relationships.
Random patterns:
Random time series are the result of many influences that act independently to yield
nonsystematic and non-repeating patterns about some average value. Purely random
series have a constant mean and no systematic patterns. Simple averaging models are
often the best and most accurate way to forecast them.
[Figure: a random series plotted against time]
Trend patterns:
A trend is a general increase or decrease in a time series that lasts for approximately
seven or more periods. A trend can be a period-to-period increase or decrease that
follows a straight line; this kind of pattern is called a linear trend. Trends are not
necessarily linear, because there are a large number of nonlinear causal influences that
yield nonlinear series. Trends may be caused by long-term factors such as population changes,
growth during product and technology introductions, changes in economic
conditions, and so on.
[Figure: two panels of Y against time showing trend patterns]
Seasonal Patterns: Seasonal series result from events that are periodic and
recurrent (e.g., monthly or quarterly changes recurring each year). Common seasonal
influences are climate, human habits, holidays, repeating promotions, and so on.
[Figure: seasonal series Y against time, with quarters Q1 and Q3 marked in each year]
Cyclical Patterns:
Economic and business expansions and contractions are the most frequent cause of
cyclical influences on time series. These influences most often last for two to five
years and recur, but with no known period. In the search for explanations of cyclical
movements, many theories have been proposed, including sunspots, positions of
planets and stars, long-wave movements in weather conditions, population life cycles, the
growth and decay cycles of new products and technologies, product life cycles, and
economic cycles. Cyclical influences are difficult to forecast because they are
recurrent but not periodic.
[Figure: a cyclical series plotted against time]
Autocorrelated Patterns: Correlation measures the degree of dependence or
association between two variables. The term autocorrelation means that the value of a
series in one time period is related to the value of the same series in previous periods. With
autocorrelation, there is an automatic correlation between observations in a series.
[Figure: two panels of Y against time showing autocorrelated patterns]
Outliers: The analysis of past data can be made very complex when the included
values are not typical of the past or future. These non-typical values are called
outliers: very large or very small observations that are not indicative of repeating
past or future patterns. Outliers occur because of unusual events.
[Figure: Y against time with an outlier]
The importance of Pattern:

One of the fundamental assumptions of most univariate forecasting methods is that an
actual value consists of one (or more) patterns plus error. That is,
Actual Value = Pattern(s) + Error
When the pattern in the actual values repeats, this behavior can be used
to predict future values. We describe past patterns statistically. Therefore, the first
step in modeling a series is to plot it against time and examine its statistics.
Basic Tools in forecasting:

In the forecasting context, statistical analysis is used to extract information from the
past. More accurate forecasts make more effective decisions possible, thus
Raw data -> information (i.e., pattern) -> forecasts -> decisions
1. Graphs and bar charts

Time plot graphs
Histograms
X-Y scatter plots
Others
2. Probability estimations:
a. Probability estimates using past percentage frequencies
i.e., probability distribution
b. Probability estimates using theoretical percentage frequencies
i.e., theoretical properties
c. Probability estimates using subjective judgment
i.e., the decision maker's subjective estimation
3. Univariate summary statistics:

Comparisons of measures and the properties of central values:
Mean: The statistical term for average
Standard deviation (SD): Interpreted in a comparative sense.
Median and Mode: Measures of the center location of a distribution.
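4. Measuring Errors:
Error = Actual - Forecast, i.e., $e_t = Y_t - \hat{Y}_t$

Absolute error measures:
ME = $\sum e_t / n$ (mean error)
MAD = $\sum |e_t| / n$ (mean absolute deviation)
SSE = $\sum e_t^2$ (sum of squared errors)
MSE = $\sum e_t^2 / n$ (mean squared error)
RSE = $\sqrt{\sum e_t^2 / (n - 1)}$ (residual standard error)

Relative error measures:
$PE_t = 100 \times (Y_t - \hat{Y}_t) / Y_t$ (percentage error)
MPE = $\sum PE_t / n$ (mean percentage error)
MAPE = $\sum |PE_t| / n$ (mean absolute percentage error)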
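The error measures can be computed directly from their definitions. The Python sketch below is only an illustration: the actual and forecast values are made up, `error_measures` is a hypothetical helper, and RSE is taken to be the square root of SSE / (n - 1).

```python
import math


def error_measures(actual, forecast):
    """Absolute and relative forecast error measures from paired series."""
    n = len(actual)
    e = [a - f for a, f in zip(actual, forecast)]        # e_t = Y_t - Yhat_t
    pe = [100 * err / a for err, a in zip(e, actual)]    # percentage errors
    sse = sum(err ** 2 for err in e)
    return {
        "ME": sum(e) / n,
        "MAD": sum(abs(err) for err in e) / n,
        "SSE": sse,
        "MSE": sse / n,
        "RSE": math.sqrt(sse / (n - 1)),
        "MPE": sum(pe) / n,
        "MAPE": sum(abs(p) for p in pe) / n,
    }


# Hypothetical actual values and forecasts (made-up numbers).
actual = [100.0, 110.0, 120.0, 125.0]
forecast = [98.0, 112.0, 117.0, 125.0]

measures = error_measures(actual, forecast)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

ME and MPE let positive and negative errors cancel, so they indicate bias; MAD, MSE and MAPE do not cancel, so they indicate the typical size of the errors.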
Statistical Significance Test for Bias

H0: $\bar{e} = 0$
H1: $\bar{e} \neq 0$
A simple t-test can be used to determine how many standard errors the mean
error lies away from the hypothesized mean of zero:
$t^*_{calculated} = (\bar{e} - 0) / (S_e / \sqrt{n})$
where $S_e$ is the standard deviation of the errors about their mean $\bar{e}$:
$S_e = \sqrt{\sum (e_t - \bar{e})^2 / (n - 1)}$
Decision rule:
If $|t^*| \le t_c$, no statistically significant bias has been found.
If $|t^*| > t_c$, there is statistically significant bias.
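The bias test can be sketched in a few lines of Python. The eight forecast errors are made up, `bias_t_stat` is a hypothetical helper, and the critical value 2.365 is the standard two-tailed 5% value of the t distribution with n - 1 = 7 degrees of freedom.

```python
import math


def bias_t_stat(errors):
    """t* = (mean error - 0) / (S_e / sqrt(n)), with S_e using n - 1."""
    n = len(errors)
    mean_e = sum(errors) / n
    s_e = math.sqrt(sum((e - mean_e) ** 2 for e in errors) / (n - 1))
    return (mean_e - 0) / (s_e / math.sqrt(n))


# Hypothetical forecast errors (made-up numbers).
errors = [2.0, -2.0, 3.0, 0.0, 1.0, -1.0, 2.0, 1.0]

t_star = bias_t_stat(errors)
t_crit = 2.365  # two-tailed 5% critical value, 7 degrees of freedom

print("t* =", round(t_star, 3))
if abs(t_star) <= t_crit:
    print("No statistically significant bias has been found.")
else:
    print("There is statistically significant bias.")
```

In this example the mean error is positive (0.75), but it is small relative to its standard error, so the test does not reject the hypothesis of zero bias.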
Understanding correlation
Predictive ability is achieved when it can be shown that the values of one variable
move together with the values of another variable. Positive and negative are
attributes of a relationship, but they do not denote the strength or degree of
association between the variables. Therefore, other measures such as covariance,
correlation, auto-covariance and auto-correlation coefficients are used for
identifying and diagnosing forecasting relationships.
Correlation Measures: (Pearson correlation coefficient)

Let X and Y be two variables and suppose the data have n observations, i = 1 to
n. The correlation between X and Y is denoted by $r_{xy}$ and is calculated
by the following formula:
$r_{xy} = \sum (Y_i - \bar{Y})(X_i - \bar{X}) / \sqrt{\sum (Y_i - \bar{Y})^2 \sum (X_i - \bar{X})^2}$
Covariance: $Cov(X, Y) = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)$
The covariance is the mean of the products of the deviations of the two variables from their
respective means. The correlation coefficient is the ratio of the
covariance of X and Y to the product of their standard deviations.
Therefore, $r_{xy} = Cov(X, Y) / (S_x S_y)$
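The relationship between covariance and correlation can be checked numerically. The Python sketch below uses made-up data and hypothetical helper names (`covariance`, `std_dev`, `correlation`); it computes r as Cov(X, Y) / (S_x S_y) with the n - 1 divisor used above.

```python
import math


def covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def std_dev(x):
    """Sample standard deviation, the square root of Cov(X, X)."""
    return math.sqrt(covariance(x, x))


def correlation(x, y):
    """Pearson correlation: Cov(X, Y) / (S_x * S_y)."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))


# Hypothetical data (made-up numbers).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

print("Cov(X, Y) =", covariance(x, y))
print("r_xy      =", round(correlation(x, y), 4))
```

Because the n - 1 divisors cancel in the ratio, this gives the same value as the sum-of-deviations formula for $r_{xy}$.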
The properties of correlation:

1. r always lies between -1 and 1, which may be written as $-1 \le r \le 1$.
2. If r = -1, there is a perfect negative linear relationship; if r = 0, there is no linear relationship;
if r = 1, there is a perfect positive linear relationship.
3. The correlation between X and Y is the same as the correlation between Y and X.
4. The correlation between any variable and itself is always equal to 1.
Correlation does not necessarily imply causality:

It is widely accepted that cigarette smoking causes lung cancer. Suppose we have
collected data from a group of people on (a) the number of cigarettes each person
smokes per week (X) and (b) whether they have ever had or now have lung cancer
(Y). Since there is little doubt that smoking causes cancer, we expect to find $r_{xy} > 0$. The
positive correlation means that people who smoke tend to have higher rates of lung
cancer than non-smokers. Here the positive correlation between X and Y reflects
direct causality.
Now suppose that we also collect data on the same group of people to measure the
amount of alcohol they drink in a typical week. Call this new variable
Z. In practice, heavy drinkers also tend to smoke, and hence $r_{xz} > 0$.
This correlation does not mean that cigarette smoking causes people to drink.
Rather, it probably reflects some underlying social attitudes. It may reflect the fact, in
other words, that people who smoke do not worry about their nutrition, or that their
social lives revolve around the pub, where drinking and smoking often go hand in
hand. In either case, the positive correlation between smoking and drinking probably
reflects some underlying cause (e.g., social attitude) which in turn causes both. Thus,
a correlation between two variables does not necessarily mean that one causes the
other. It may be the case that an underlying third variable is responsible.
Now consider the correlation between lung cancer and heavy drinking. Since people
who smoke tend to get lung cancer more, and people who smoke also tend to drink
more, it is not unreasonable to expect that lung cancer rates will be higher among
heavy drinkers (i.e., $r_{yz} > 0$). Note that this positive correlation does not imply that
alcohol consumption causes lung cancer. Rather, it is cigarette smoking that causes
cancer, but smoking and drinking are related to some underlying social attitude. This
example serves to indicate the kind of complicated pattern of causality which occurs
in practice, and how care must be taken when trying to relate the concepts of
correlation and causality. The general message is that correlation can be very suggestive,
but it cannot on its own establish causality.
Time Series properties: Stationarity

White noise residuals are the residuals from a final model, confirmed in one of the
last steps in the model-building process. In contrast, stationarity should be achieved in
one of the first steps in the modeling process. White noise series are stationary;
however, stationary series are not typically white noise.
Stationary stochastic process (stationary time series):
Given that $Y_t$ (t = 1, 2, 3, ..., T) is a time series, $Y_t$ is said to be stationary if
(a) $E(Y_t)$ = constant for all t
(b) $Var(Y_t)$ = constant for all t
(c) $Cov(Y_t, Y_{t-k})$ = constant for all t and all $k \neq 0$
That is, none of these moments depends on time; the covariance in (c) depends only on the lag k.
Level stationarity is achieved when the mean of the series is the same over time; thus
no differencing is necessary. While level stationarity might be achieved in the ARIMA
model-building process, another form of nonstationarity, namely variance
nonstationarity, must also be modeled. Some time series may have either level
nonstationarity or variance nonstationarity, or both.
Non-stationary: a time series is non-stationary if it fails to satisfy any part of the above conditions.
For example:
(1) A series that trends consistently upward (or downward) does not satisfy
condition (a), because its mean value changes upward (or downward) over time.
[Figure: two panels of $Y_t$ against t showing trending series]
(2) An irregular series such as the one below is also unlikely to be stationary: its mean may be
constant, but its variance appears to be increasing (or decreasing) over time.
[Figure: $Y_t$ against t with changing variance]
Stochastic process:
Consider a process in which the $Y_t$ are all identically and independently distributed with zero mean and
constant variance:
$Y_t = e_t$ where $e_t \sim iid(0, \sigma^2)$, i.e., $E(e_t) = 0$ and $E(e_t^2) = \sigma^2$
This kind of process is often referred to as white noise.

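Properties of stochastic time series:
Another example of a stationary time series is an AR(1) process,
$Y_t = \phi Y_{t-1} + e_t$, with $-1 < \phi < 1$ and $e_t$ white noise.
Stationarity can be demonstrated by noting that the same equation also holds for
period t-1:
$Y_{t-1} = \phi Y_{t-2} + e_{t-1}$
Substituting for $Y_{t-1}$, one has
$Y_t = \phi Y_{t-1} + e_t = \phi(\phi Y_{t-2} + e_{t-1}) + e_t$
$Y_t = \phi^2 Y_{t-2} + \phi e_{t-1} + e_t$
Continuing the process of lagging and substituting back to a fixed starting value $Y_0$, we get
$Y_t = \phi^t Y_0 + \phi^{t-1} e_1 + \phi^{t-2} e_2 + \cdots + \phi e_{t-1} + e_t$
The expected value of $Y_t$ is then given by
$E(Y_t) = \phi^t Y_0 + \phi^{t-1} E(e_1) + \phi^{t-2} E(e_2) + \cdots + \phi E(e_{t-1}) + E(e_t)$
When t is large enough, $\phi^t$ approaches 0, and the expected value of each error is
also 0; thus the expected value of $Y_t$ is 0 and is independent of t.
The population variance of $Y_t$ is given by
$E(Y_t^2) = \phi^{2t} Y_0^2 + \phi^{2(t-1)} E(e_1^2) + \phi^{2(t-2)} E(e_2^2) + \cdots + \phi^2 E(e_{t-1}^2) + E(e_t^2)$
(the cross-product terms have zero expectation because the errors are uncorrelated with zero mean).
For large t the first term vanishes, so
$E(Y_t^2) \approx 0 + \sigma^2 \phi^{2(t-1)} + \sigma^2 \phi^{2(t-2)} + \cdots + \sigma^2 \phi^2 + \sigma^2$
$= \sigma^2 (1 - \phi^{2t}) / (1 - \phi^2) \to \sigma^2 / (1 - \phi^2)$
Thus, the variance is asymptotically independent of t.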
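The variance result can be checked numerically without simulation. The Python sketch below is an illustration under stated assumptions: $Y_0$ is treated as fixed (so its variance is zero), the values $\phi = 0.8$ and $\sigma^2 = 1$ are arbitrary, and `ar1_variance_path` is a hypothetical helper. It iterates the one-step relation Var(Y_t) = phi^2 Var(Y_{t-1}) + sigma^2 and compares it with the closed form.

```python
def ar1_variance_path(phi, sigma2, periods):
    """Var(Y_t) for t = 1..periods, starting from a fixed (non-random) Y_0."""
    variances, var = [], 0.0
    for _ in range(periods):
        var = phi ** 2 * var + sigma2   # Var(Y_t) = phi^2 Var(Y_{t-1}) + sigma^2
        variances.append(var)
    return variances


phi, sigma2 = 0.8, 1.0  # hypothetical parameter values

path = ar1_variance_path(phi, sigma2, 50)
closed_form_10 = sigma2 * (1 - phi ** 20) / (1 - phi ** 2)  # t = 10
limit = sigma2 / (1 - phi ** 2)

print("Var(Y_10): recursion", path[9], "closed form", closed_form_10)
print("Var(Y_50):", path[-1], "limit:", limit)

# With phi = 1 the same recursion gives t * sigma^2, which never settles down.
print("phi = 1:", ar1_variance_path(1.0, sigma2, 5))
```

The last line previews the unit-root case: when $\phi = 1$ the variance grows without bound instead of converging.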
But if $\phi = 1$, so that
$Y_t = Y_{t-1} + e_t$
we have another simple type of stochastic process, called a random walk. The
random walk is a non-stationary process.
Random walk process:
$Y_t = Y_{t-1} + e_t$ with $E(e_t) = 0$ for all t
$E(e_t e_s) = 0$ for all t, s with $t \neq s$
$E(e_t^2) = \sigma_e^2$
Substituting and continuing the process of lagging gives
$Y_t = Y_0 + e_t + e_{t-1} + \cdots + e_2 + e_1$
Taking $Y_0 = 0$, the population variance becomes $E(Y_t^2) = t\sigma_e^2$
Thus, the variance is not constant; it increases with time.
To make a one-step-ahead forecast for a random-walk process:
$\hat{Y}_{t+1} = E(Y_{t+1} | Y_t, Y_{t-1}, \ldots, Y_1, Y_0)$
$= Y_t + E(e_{t+1})$
$= Y_t$
The forecast two periods ahead is
$\hat{Y}_{t+2} = E(Y_{t+2} | Y_t, Y_{t-1}, \ldots, Y_1, Y_0)$
$= E(Y_{t+1} + e_{t+2})$
$= E(Y_t + e_{t+1} + e_{t+2})$
$= Y_t$
The forecast k periods ahead is also $Y_t$. Although the forecasts are the same at every
horizon ($\hat{Y}_{t+1} = \hat{Y}_{t+k} = Y_t$), the
variance of the forecast error grows as k becomes larger.
Simply look at the forecast error and its variance:
$Error_1 = Y_{t+1} - \hat{Y}_{t+1}$
$= (Y_t + e_{t+1}) - Y_t$
$= e_{t+1}$
Its variance is $E(e_{t+1}^2) = \sigma_e^2$
For the two-period-ahead forecast:
$Error_2 = Y_{t+2} - \hat{Y}_{t+2}$
$= (Y_t + e_{t+1} + e_{t+2}) - Y_t$
$= e_{t+1} + e_{t+2}$
Its variance is $E[(e_{t+1} + e_{t+2})^2] = E(e_{t+1}^2) + E(e_{t+2}^2) + 2E(e_{t+1} e_{t+2})$
$= \sigma_e^2 + \sigma_e^2 + 0 = 2\sigma_e^2$
For the k-period-ahead forecast:
$Error_k = Y_{t+k} - \hat{Y}_{t+k}$
$= (Y_t + e_{t+1} + e_{t+2} + \cdots + e_{t+k}) - Y_t$
$= e_{t+1} + e_{t+2} + \cdots + e_{t+k}$
So its variance is $E[(e_{t+1} + e_{t+2} + \cdots + e_{t+k})^2] = k\sigma_e^2$
[Figure: $Y_t$ against time, with the confidence interval of the forecast widening from t+1 to t+k]
The S.D. of the forecast error increases with the square root of k. Therefore
the random walk series is a non-stationary series.
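The widening forecast interval of a random walk can be sketched as follows. This is a minimal Python illustration: the last observed value (100) and error S.D. (2) are made up, `random_walk_forecast` is a hypothetical helper, and the 95% interval additionally assumes normally distributed errors, which the derivation of the variance itself does not require.

```python
import math


def random_walk_forecast(last_value, sigma_e, k, z=1.96):
    """k-step-ahead forecast of a driftless random walk with an approximate
    95% interval. The interval assumes normal errors (an added assumption)."""
    sd = sigma_e * math.sqrt(k)   # S.D. of the k-step forecast error
    return last_value, last_value - z * sd, last_value + z * sd


# Hypothetical last observation Y_t = 100 and error S.D. sigma_e = 2.
for k in (1, 4, 9):
    forecast, lower, upper = random_walk_forecast(100.0, 2.0, k)
    print(f"k={k}: forecast={forecast}, interval=({lower:.2f}, {upper:.2f})")
```

The point forecast stays at the last observed value for every horizon, while the interval width grows in proportion to the square root of k.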
A random walk with a constant (drift)

The process accounts for a time trend (upward or downward):
$Y_t = Y_{t-1} + \delta + e_t$
Continuing the process of lagging and substituting, the series starting at time 0 is
$Y_t = Y_0 + \delta t + e_t + e_{t-1} + \cdots + e_1$
Now the expectation of $Y_t$ is a function of t. On average the process tends to move
upward ($\delta > 0$) or downward ($\delta < 0$), depending on the sign of the drift $\delta$.
The one-period-ahead forecast is
$\hat{Y}_{t+1} = E(Y_{t+1} | Y_t, \ldots, Y_1) = Y_t + \delta$
and the k-period-ahead forecast is
$\hat{Y}_{t+k} = E(Y_{t+k} | Y_t, \ldots, Y_1) = Y_t + k\delta$
The standard error of the forecast is the same as for the random walk process:
$Error_1 = Y_{t+1} - \hat{Y}_{t+1}$
$= (Y_t + \delta + e_{t+1}) - (Y_t + \delta)$
$= e_{t+1}$
Then its variance is $E(e_{t+1}^2) = \sigma_e^2$
The k-period forecast variance is also the same as for the random walk process:
$Error_k = Y_{t+k} - \hat{Y}_{t+k}$
$= (Y_t + k\delta + e_{t+1} + e_{t+2} + \cdots + e_{t+k}) - (Y_t + k\delta)$
$= e_{t+1} + e_{t+2} + \cdots + e_{t+k}$
So its variance is $E[(e_{t+1} + e_{t+2} + \cdots + e_{t+k})^2] = k\sigma_e^2$
The S.D. of the forecast error again increases with the square root of k.
[Figure: $Y_t$ against time, with the confidence interval of the forecast widening from t+1 to t+k]
The consequences of non-stationarity

If the time series are nonstationary, then the least squares estimators can be
inconsistent and the usual test statistics spurious: the t- and F-statistics will not have
their standard limiting distributions. As a consequence, with non-stationary time
series, the regression coefficient of an explanatory variable may appear to be
significantly different from 0 when in fact the variable is not a determinant of the dependent
variable.
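The spurious-regression problem can be illustrated with a small simulation. The Python sketch below is only a rough demonstration: the sample size, number of trials, seed and helper names are all arbitrary choices. It regresses one independent random walk on another many times and counts how often the usual slope t-test (|t| > 1.96) rejects, then repeats the exercise with independent white noise for comparison.

```python
import math
import random


def random_walk(n, rng):
    """Driftless random walk with standard normal steps."""
    y, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        y.append(level)
    return y


def white_noise(n, rng):
    """Independent standard normal draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    a0 = my - b * mx
    sse = sum((yi - a0 - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return b / se_b


def rejection_rate(make_series, n=100, trials=200, seed=0):
    """Share of trials in which two unrelated series look 'significantly' related."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x, y = make_series(n, rng), make_series(n, rng)
        if abs(slope_t_stat(x, y)) > 1.96:
            rejections += 1
    return rejections / trials


print("independent random walks:", rejection_rate(random_walk))
print("independent white noise: ", rejection_rate(white_noise))
```

For the stationary white-noise pairs the rejection rate should be close to the nominal 5%, while for the random-walk pairs it is typically far higher, even though the two series are unrelated by construction.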
Is the trend deterministic or stochastic?

If the time trend is stable and follows a straight line, then the time trend is deterministic.
If the time trend varies over time, then the trend is stochastic: the trend itself
shifts over time. In this case, simply adding a trend variable to the regression is
misleading.
[Figure: two panels of $Y_t$ against time; left panel labeled deterministic, right panel labeled stochastic]
In the left-hand graph, the time effect is linear, so the time trend is called
deterministic. In the right-hand graph, the time series has a unit root
problem: it exhibits a stochastic trend (the trend changes over time, and
the series is non-stationary). In order to remove the time effect from the series, the
trend-stationary process (TSP) method or the difference-stationary process (DSP) method
can be used.
Trend-stationary process (TSP):

If a nonstationary time series can be transformed into a stationary process by
extracting a time trend, it is described as a trend-stationary process. The simple
method is to fit a time trend in the equation:
$Y_t = \alpha + \beta T + e_t$ and $e_t \sim iid(0, \sigma_e^2)$
$E(e_t) = 0$
$E(e_t^2) = \sigma_e^2$
$E(e_t e_s) = 0$ for $t \neq s$
Difference-stationary process (DSP):

If a non-stationary time series can be transformed into a stationary process by
differencing once, it is described as integrated of order 1, i.e., I(1):
$\Delta Y_t = Y_t - Y_{t-1} = \delta + e_t$ and $e_t \sim iid(0, \sigma_e^2)$
$E(e_t) = 0$
$E(e_t^2) = \sigma_e^2$
$E(e_t e_s) = 0$ for $t \neq s$
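The two ways of removing a trend can be sketched in Python. This is an illustration under assumptions: the series is a made-up linear trend plus small disturbances, and `detrend` and `difference` are hypothetical helpers. The first fits a time trend by least squares and keeps the residuals (the TSP approach); the second takes first differences (the DSP approach).

```python
def detrend(y):
    """Residuals from an OLS fit of y on a constant and the time index 1..n (TSP)."""
    n = len(y)
    t = list(range(1, n + 1))
    mt, my = sum(t) / n, sum(y) / n
    beta = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / sum((ti - mt) ** 2 for ti in t)
    alpha = my - beta * mt
    return [yi - alpha - beta * ti for ti, yi in zip(t, y)]


def difference(y):
    """First difference, Y_t - Y_{t-1} (DSP)."""
    return [b - a for a, b in zip(y, y[1:])]


# Hypothetical series: Y_t = 5 + 2t plus a small made-up disturbance.
noise = [0.3, -0.1, 0.2, -0.4, 0.1, -0.2, 0.0, 0.1]
y = [5 + 2 * t + e for t, e in zip(range(1, 9), noise)]

print("detrended:  ", [round(r, 3) for r in detrend(y)])
print("differenced:", [round(d, 3) for d in difference(y)])
```

Which method is appropriate depends on the kind of trend: detrending suits a deterministic trend, while differencing suits a stochastic trend (a unit root).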
Detection of non-stationarity
Autocorrelations and ACF(k)
$ACF(k) = \sum_{t=1+k}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y}) / \sum_{t=1}^{n} (Y_t - \bar{Y})^2$
An autocorrelation measures the association between two sets of observations of a
series separated by some lag k. In practice, accurate estimates of ACF(k) require a
minimum of about n = 50 observations, and k should not be larger than approximately
n/4.
For a series with no autocorrelation (i.e., the population autocorrelation is zero), the
ACF can be expected to vary about zero with a standard error approximately equal to
the constant $1/\sqrt{n}$. The t-test of each ACF can be used as a guide in
assessing the statistical significance of the ACFs. The t-test formula is
$t = ACF(k) / Se_{ACF}$
If the absolute t-value is much greater than 2, we can infer that there is autocorrelation between
$Y_t$ and $Y_{t-k}$.
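The ACF and its approximate t-test can be computed directly from the definitions. The Python sketch below uses a made-up trending series of n = 50 observations; `acf` and `acf_t_stat` are hypothetical helper names, and the standard error is the 1/sqrt(n) approximation.

```python
import math


def acf(y, k):
    """Sample autocorrelation at lag k, using the overall mean of the series."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


def acf_t_stat(y, k):
    """t = ACF(k) / Se_ACF, with Se_ACF approximated by 1 / sqrt(n)."""
    return acf(y, k) / (1 / math.sqrt(len(y)))


# Hypothetical trending series: 1, 2, ..., 50.
y = [float(t) for t in range(1, 51)]

for k in (1, 2, 3):
    print(f"lag {k}: ACF = {acf(y, k):.3f}, t = {acf_t_stat(y, k):.2f}")
```

For this trending series the autocorrelations are high and decline slowly, and the t-values are far above 2.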
Pattern recognition with ACFs:
Random: No discernible patterns
Random and white noise: No systematic patterns
Random-walk: The ACFs are very high and decline slowly
Trending: The ACFs are very high and decline slowly
Seasonal: The ACFs follow a sinusoidal shape
ACFs are so important in forecasting because they assist in the identification, estimation
and diagnosis of forecasting models.
The difference between ACFs and Pearson correlation coefficients:

When the number of observations is low (e.g., < 60) and the number of lags is large
(e.g., > 10), there may be considerable differences between the two coefficients. In such situations,
it is important to calculate the Pearson coefficient at the relevant long lags; for example,
with monthly data and few observations (e.g., n = 48), the Pearson coefficient will be
more accurate at a lag of 12 than the ACF. Similarly, when the ACFs for daily data
at a lag of 364 are being analyzed with only two years of daily data, the ACFs are very
misleading. (Note: 364 days is often the lag of greatest interest because it separates
a Monday of one year from the same Monday of an adjacent year; weeks are
repeating patterns of 7 days, and there are 52 weeks per year, 7*52 = 364.)
White noise Residuals:

In statistical terms, white noise (WN) is normally and independently distributed (NID)
with zero mean and a constant variance, zero autocovariances and zero
autocorrelations.
$\bar{e} = \sum e_t / n = E(e_t) = 0$
$Var(e_t) = Var(e_{t-k}) = E(e_t^2)$
$Cov(e_t, e_{t-k}) = E(e_t e_{t-k}) = 0$, also denoted cov(k)
$ACF(k) = cov(k) / (S_{e_t} S_{e_{t-k}}) = Cov(e_t, e_{t-k}) / Var(e_t) = 0$
Testing the white noise: ACF(k)s

The Q-statistic measures whether, as a group, the autocorrelations are statistically
significantly different from those expected from white noise. It is commonly
called the Box-Pierce statistic or the Ljung-Box statistic; the form given here is the Ljung-Box version:
$Q = n(n + 2) \sum_{k=1}^{m} ACF(k)^2 / (n - k)$
where m is the number of lags tested.
H0: The ACF(k)s are not significantly different from white noise ACF(k)s (i.e.,
ACF(k)s = 0).
H1: The ACF(k)s are statistically different from white noise ACF(k)s (i.e., ACF(k)s $\neq$ 0).
Decision rule:
If Q $\le$ chi-square table value, the ACF patterns are white noise.
If Q > chi-square table value, the ACF patterns are not white noise.
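The Q-statistic can be computed from the ACFs in a few lines. The Python sketch below is an illustration on a made-up trending series; `acf` and `ljung_box_q` are hypothetical helpers, m = 5 lags is an arbitrary choice, and 11.07 is the standard 5% critical value of the chi-square distribution with 5 degrees of freedom.

```python
def acf(y, k):
    """Sample autocorrelation at lag k, using the overall mean of the series."""
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


def ljung_box_q(y, m):
    """Q = n(n + 2) * sum over k = 1..m of ACF(k)^2 / (n - k)."""
    n = len(y)
    return n * (n + 2) * sum(acf(y, k) ** 2 / (n - k) for k in range(1, m + 1))


# Hypothetical trending series: 1, 2, ..., 50 (clearly not white noise).
y = [float(t) for t in range(1, 51)]

q = ljung_box_q(y, 5)
chi2_crit_5df = 11.07  # 5% critical value, chi-square with 5 degrees of freedom

print("Q =", round(q, 2))
if q <= chi2_crit_5df:
    print("The ACF patterns are white noise.")
else:
    print("The ACF patterns are not white noise.")
```

When the test is applied to the residuals of a fitted ARMA model rather than to a raw series, the degrees of freedom are usually reduced by the number of estimated parameters.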
