Bill Hung
Yt = f (Xt)
or
Yt = f (X1t, ..., Xkt, Y2t, ..., Ykt)
Methods: Simple or Multiple models
System equation models
Seemingly unrelated models
Other qualitative forecasting methods are based on the judgement and opinions
of others concerning, for example, future trends, tastes and technological changes.
Qualitative methods include Delphi, market research, panel consensus, scenario
analysis, and historical analogy methods of predicting the future. Qualitative
methods are useful when there is little data to support quantitative methods.
Econ3600 Econometric Modeling and Analysis Lecture Note # 1 Dr. Bill Hung
In the social sciences, particularly in economics, researchers cannot create their own
data by running experiments in a laboratory. They must rely on data collected by
official bodies or agencies. Data collected by another party are called secondary data.
If the data do not meet the desired standard of quality, social scientists cannot afford
to throw them away. Instead, they must patiently and carefully treat or adjust these
data by means of different econometric and statistical methods.
Panel (or pooled) data combines the time-series and cross-sectional approaches.
variables take on only the numeric values 1 and 0, they are referred to as dummy (or
binary) variables.
Real exchange rate index = E(HK$/US$) × (US Price index / HK Price index)
Other measures, such as the production index and the wage index, are also expressed as index numbers.
Exchange rate = Domestic currency (units value)/ foreign currency (unit value)
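As a small illustration of the real exchange rate index formula, the sketch below multiplies a nominal rate by the ratio of foreign to domestic price indices. The function name and the input values are hypothetical and chosen only for illustration; they are not actual data.

```python
def real_exchange_rate_index(nominal_rate, foreign_price_index, domestic_price_index):
    """Nominal rate (domestic currency per foreign unit) scaled by relative prices."""
    return nominal_rate * foreign_price_index / domestic_price_index


# Hypothetical inputs: E(HK$/US$), a US price index and a HK price index
rer = real_exchange_rate_index(7.8, 110.0, 105.0)
print(round(rer, 4))
```

A higher foreign price index relative to the domestic one raises the index for a given nominal rate.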
supply data to a percentage growth format similar to the interest rates.
4. Linear Transformation
Linear transformations are especially useful when applying regression analysis
to forecasting problems, because regression analysis begins by assuming that the
relationship between variables is linear. The advantage of this type of conversion is
that it lets us take a relationship that does not satisfy the requirement of
linearity and transform the data so that it does. The two common methods are:
- logarithmic transformation
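To see why a logarithmic transformation helps, the sketch below builds a hypothetical series that grows at a constant rate and shows that its logarithms change by a constant amount each period, i.e. they lie on a straight line. The series and the 5% growth rate are assumptions made for illustration.

```python
import math

# Hypothetical series growing 5% per period (continuously compounded)
series = [100.0 * math.exp(0.05 * t) for t in range(6)]

# The levels curve upward, but the logs are linear in t
logs = [math.log(v) for v in series]

# First differences of the logs are constant and equal the growth rate
log_growth = [logs[i + 1] - logs[i] for i in range(len(logs) - 1)]

# Period-to-period percentage growth of the original levels
pct_growth = [100.0 * (series[i + 1] / series[i] - 1.0) for i in range(len(series) - 1)]

print(log_growth)
print(pct_growth)
```

The same idea underlies converting a level series to percentage growth before comparing it with a rate.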
Random patterns:
Random time series are the result of many influences that act independently to yield
nonsystematic and non-repeating patterns about some average value. Purely random
series have a constant mean and no systematic patterns. Simple averaging models are
often the best and most accurate way to forecast them.
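A minimal sketch of a simple averaging forecast for a random series: the forecast for the next period is just the mean of the observed history. The data and the function name are hypothetical.

```python
from statistics import mean


def simple_average_forecast(history):
    """Forecast the next value of a patternless series by its historical mean."""
    return mean(history)


# Hypothetical observations fluctuating around a constant level
demand = [52, 48, 50, 51, 49, 50]
forecast = simple_average_forecast(demand)
print(forecast)
```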
Trend patterns:
A trend is a general increase or decrease in a time series that lasts for approximately
seven or more periods. A trend can be a period-to-period increase or decrease that
follows a straight line; this kind of pattern is called a linear trend. Trends are not
necessarily linear, because a large number of nonlinear causal influences
yield nonlinear series. Trends may be caused by long-term population changes,
growth during production and technology introductions, changes in economic
conditions, and so on.
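A linear trend can be estimated by a least-squares fit of the series on time. The sketch below is one way to do this in plain Python; the data are hypothetical and constructed to lie exactly on the line 2 + 3t, so the fit is easy to check by hand.

```python
def fit_linear_trend(series):
    """Least-squares intercept and slope of the series against t = 1, 2, ..., n."""
    n = len(series)
    times = range(1, n + 1)
    t_bar = sum(times) / n
    y_bar = sum(series) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, series))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


# Hypothetical series lying exactly on Y = 2 + 3t
sales = [5.0, 8.0, 11.0, 14.0, 17.0]
intercept, slope = fit_linear_trend(sales)

# Extrapolate the fitted line one period ahead
next_forecast = intercept + slope * (len(sales) + 1)
print(intercept, slope, next_forecast)
```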
[Figure: two panels of Y against Time illustrating trend patterns]
Seasonal Patterns: Seasonal series result from events that are periodic and
recurrent (e.g., monthly, quarterly, changes recurring each year). Common seasonal
influences are climate, human habits, holidays, repeating promotions, and so on.
[Figure: seasonal series, Y against Time, with quarters Q1 and Q3 marked]
Cyclical Patterns:
Economic and business expansions and contractions are the most frequent causes of
cyclical influences on time series. These influences most often last for two to five
years and recur, but with no known period. In the search for explanations of cyclical
movements, many theories have been proposed, including sunspots, positions of
planets, stars, long wave movements in weather conditions, population life cycles, the
growth and decay cycle of new products and technologies, product life cycles, and the
economic cycles. Cyclical influences are difficult to forecast because they are
recurrent but not periodic.
[Figure: panels of Y against Time illustrating cyclical patterns]
Outliers: The analysis of past data can be made very complex when the included
values are not typical of the past or future. These non-typical values are called
outliers, which are very large or small observations that are not indicative of repeating
past or future patterns. Outliers occur because of unusual events.
[Figure: series with an outlier, Y against Time]
When the actual values follow a repeating pattern, this behavior can be used
to predict future values. We describe past patterns statistically. Therefore, the first
step in modeling a series is to plot it against time and examine its statistics.
2. Probability estimations:
a. Probability estimates using past percentage frequencies
i.e. probability distribution
b. Probability estimates using theoretical percentage frequencies
4. Measuring Errors:
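Error = Actual − Forecast
$$\mathrm{MAD} = \frac{\sum |e_t|}{n} \qquad \mathrm{MPE} = \frac{\sum PE_t}{n}$$
$$\mathrm{SSE} = \sum e_t^2 \qquad \mathrm{MAPE} = \frac{\sum |PE_t|}{n}$$
$$\mathrm{MSE} = \frac{\sum e_t^2}{n} \qquad \mathrm{RSE} = \sqrt{\frac{\sum e_t^2}{n - 1}}$$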
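The error measures can be computed directly from paired actual and forecast values. The sketch below assumes the usual definition of the percentage error, PE_t = 100 × e_t / Actual_t, which is not spelled out in these notes; the function name and the data are hypothetical.

```python
import math


def forecast_error_measures(actual, forecast):
    """Return MAD, MPE, MAPE, SSE, MSE and RSE for paired actuals and forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]            # e_t = Actual - Forecast
    pct_errors = [100.0 * e / a for e, a in zip(errors, actual)]  # assumed PE_t definition
    n = len(errors)
    sse = sum(e * e for e in errors)
    return {
        "MAD": sum(abs(e) for e in errors) / n,
        "MPE": sum(pct_errors) / n,
        "MAPE": sum(abs(p) for p in pct_errors) / n,
        "SSE": sse,
        "MSE": sse / n,
        "RSE": math.sqrt(sse / (n - 1)),
    }


# Hypothetical actual values and forecasts
measures = forecast_error_measures([100, 110, 120, 130], [98, 112, 119, 133])
for name, value in measures.items():
    print(name, round(value, 4))
```

MPE keeps the sign of the errors, so it indicates bias, whereas MAD and MAPE measure the typical size of the errors.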
A simple t-test can be used to determine how many standard errors the mean
error lies away from the hypothesized mean of zero.
$$t_{\text{calculated}} = \frac{\bar{e}_t - 0}{S_e / \sqrt{n}}, \qquad S_e = \sqrt{\frac{\sum (e_t - \bar{e}_t)^2}{n - 1}}$$
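The t-statistic for the mean error can be sketched as follows. The errors are hypothetical; `statistics.stdev` uses the n − 1 divisor, matching the definition of S_e.

```python
import math
from statistics import mean, stdev


def mean_error_t_statistic(errors):
    """t = (mean error - 0) / (S_e / sqrt(n)), with S_e the sample standard deviation."""
    n = len(errors)
    e_bar = mean(errors)
    s_e = stdev(errors)  # divisor n - 1
    return (e_bar - 0.0) / (s_e / math.sqrt(n))


# Hypothetical forecast errors
errors = [2.0, -2.0, 1.0, -3.0]
t_calculated = mean_error_t_statistic(errors)
print(round(t_calculated, 4))
```

A value of t close to zero suggests the forecasts are not systematically biased.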
Decision rule:
Understanding correlation
Predictive ability is achieved when it can be shown that the values of one variable
move together with values of another variable. Although the sign (positive or negative)
is an attribute of a relationship, it does not denote the strength or degree of
association between the variables. Therefore, other measures such as covariance,
correlation, auto-covariance and auto-correlation coefficients can be used for
identifying and diagnosing forecasting relationships.
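$$r_{xy} = \frac{\sum (Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\sum (Y_i - \bar{Y})^2 \sum (X_i - \bar{X})^2}}$$
Covariance:
$$\mathrm{Cov}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$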
The covariance is the mean of the product of the deviations of two variables from their
respective means. The correlation coefficient is the ratio of the
covariance of X and Y to the product of their standard deviations.
Therefore,
$$r_{xy} = \frac{\mathrm{Cov}(X, Y)}{S_x S_Y}$$
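The relation between covariance and correlation can be checked numerically. The sketch below computes the sample covariance with the n − 1 divisor and then divides by the product of the sample standard deviations; the data and function names are hypothetical.

```python
import math


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def correlation(x, y):
    """r_xy = Cov(X, Y) / (S_x * S_y)."""
    s_x = math.sqrt(covariance(x, x))  # the variance is the covariance of a variable with itself
    s_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (s_x * s_y)


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = covariance(x, y)
r_xy = correlation(x, y)
print(cov_xy, round(r_xy, 4))
```

Unlike the covariance, the correlation coefficient does not depend on the units of X and Y and always lies between −1 and 1.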
Now suppose that we also collect data on the same group of people to measure the
amount of alcohol they drink in a typical week. Suppose we assign this new variable
as Z. In practice, heavy drinkers also tend to smoke and hence rxz >
0. This correlation does not mean that cigarette smoking also causes people to drink.
Rather it probably reflects some underlying social attitudes. It may reflect the fact, in
other words, that people who smoke do not worry about their nutrition, or that their
social lives revolve around the pub, where drinking and smoking often go hand in
hand. In either case, the positive correlation between smoking and drinking probably
reflects some underlying cause (e.g. social attitude) which in turn causes both. Thus,
a correlation between two variables does not necessarily mean that one causes the
other. It may be the case that an underlying third variable is responsible.
Now consider the correlation between lung cancer and heavy drinking. Since people
who smoke tend to get lung cancer more, and people who smoke also tend to drink
more, it is not unreasonable to expect that lung cancer rates will be higher among
heavy drinkers (i.e. ryz > 0). Note that this positive correlation does not imply that
alcohol consumption causes lung cancer. Rather, it is cigarette smoking that causes
cancer, but smoking and drinking are related to some underlying social attitude. This
example serves to indicate the kind of complicated pattern of causality which occurs
in practice, and how care must be taken when trying to relate the concepts of
correlation and causality. The general message is that correlation can be very suggestive,
but it cannot on its own establish causality.
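The smoking and drinking argument can be illustrated with simulated data. In the sketch below a hidden "attitude" variable drives both smoking (taken here as X) and drinking (Z), while lung cancer risk (Y) depends on smoking only. All numbers are simulated under these assumptions and are not real data.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    num = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    den = math.sqrt(sum((u - a_bar) ** 2 for u in a) * sum((v - b_bar) ** 2 for v in b))
    return num / den


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

attitude = [rng.gauss(0, 1) for _ in range(n)]       # unobserved common cause
smoking = [a + rng.gauss(0, 1) for a in attitude]    # X: driven by attitude
drinking = [a + rng.gauss(0, 1) for a in attitude]   # Z: driven by attitude
cancer = [s + rng.gauss(0, 1) for s in smoking]      # Y: depends on smoking only

r_xz = correlation(smoking, drinking)
r_yz = correlation(cancer, drinking)
print(round(r_xz, 3), round(r_yz, 3))
```

Both correlations come out positive even though, by construction, drinking has no effect on the simulated cancer variable.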
Level stationarity is achieved when the mean of the series is the same over time; thus
no differencing is necessary. While level stationarity might be achieved in the ARIMA
model-building process, another form of nonstationarity, namely variance
nonstationarity, must also be modeled. Some time series may have level
nonstationarity, variance nonstationarity, or both.
Non-stationary: a time series is non-stationary if it fails to satisfy any part of the above conditions.
For example:
(1) A series that trends consistently upward (or downward) certainly does not satisfy
condition (a), because its mean value changes upward (or downward) over time.
[Figure: two panels of Yt against t illustrating trending series]
(2) An irregular series such as the one below is also unlikely to be stationary: its mean may be
constant, but its variance appears to increase (or decrease) over time.
[Figure: Yt against t for a series whose variance changes over time]
Stochastic process:
Yt = φYt-1 + et
where the et's are all identically and independently distributed with zero mean and
constant variance, and |φ| < 1.
The stationarity of the series can be demonstrated as follows. The process is also valid for time
period t-1:
Yt-1 = φYt-2 + et-1
Substituting for Yt-1, one has
Yt = φYt-1 + et = φ(φYt-2 + et-1) + et
Yt = φ²Yt-2 + φet-1 + et
Continuing the process of lagging and substituting, we get
Yt = φ^t Y0 + et + φet-1 + φ²et-2 + ... + φ^(t-1) e1
When t is large enough, φ^t approaches 0, and the expected value of each error is
also 0; thus the expected value of Yt is 0 and is independent of t.
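The decay of the expected value can be traced numerically. Writing the autoregressive coefficient as phi (a symbol assumed here), taking expectations of Yt = phi × Yt-1 + et with zero-mean errors gives E[Yt] = phi × E[Yt-1]. The sketch below iterates this recursion from a hypothetical starting value, with a coefficient of 1 shown for contrast.

```python
def expected_path(phi, y0, periods):
    """E[Y_t] for Y_t = phi * Y_{t-1} + e_t with zero-mean errors, starting at y0."""
    path = [y0]
    for _ in range(periods):
        path.append(phi * path[-1])  # E[Y_t] = phi * E[Y_{t-1}]
    return path


# Hypothetical starting value and coefficients
stationary = expected_path(0.5, 10.0, 20)
unit_coefficient = expected_path(1.0, 10.0, 20)

print(stationary[-1])        # shrinks toward zero
print(unit_coefficient[-1])  # stays at the starting value
```

With a coefficient below one in absolute value, the influence of the starting value dies out; with a coefficient of one it never does.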
But if the coefficient on Yt-1 equals 1, the process becomes
Yt = Yt-1 + et
This is another type of simple stochastic process, called a random walk, and the
random walk is a non-stationary process. Starting from Y0 and substituting repeatedly,
Yt = Y0 + et + et-1 + ... + e2 + e1
The forecast k periods ahead is also Yt. Although the forecasts of Yt+1 and Yt+k are the
same, the variance of the forecast error grows as k becomes larger.
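For example, the variance of the two-period-ahead forecast error is $\mathrm{Var}(e_{t+1} + e_{t+2}) = \sigma_e^2 + \sigma_e^2 = 2\sigma_e^2$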
[Figure: forecast of Yt against time (t, t+1, ..., t+k) with its confidence interval]
The standard deviation of the forecast error increases with the square root of k. Therefore
the random walk series is a non-stationary series.
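The growth of the forecast-error variance can be checked by simulation. For a random walk, the k-period-ahead forecast error is the sum of the next k shocks, so its variance should be roughly k times the one-period variance. The sketch below uses simulated normal shocks; the function name, seed and sample size are arbitrary choices.

```python
import random
from statistics import variance


def forecast_error_variance(k, sigma=1.0, n_paths=4000, seed=0):
    """Sample variance of the k-step-ahead forecast error of a random walk."""
    rng = random.Random(seed)
    # The forecast is Y_t, so the error is e_{t+1} + ... + e_{t+k}
    errors = [sum(rng.gauss(0.0, sigma) for _ in range(k)) for _ in range(n_paths)]
    return variance(errors)


var_1 = forecast_error_variance(1)
var_4 = forecast_error_variance(4)
ratio = var_4 / var_1
print(round(var_1, 3), round(var_4, 3), round(ratio, 3))
```

A variance ratio near 4 for k = 4 corresponds to a standard deviation ratio near 2, the square root of k.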
Yt = Yt-1 + δ + et
Continuing the process of lagging and substituting, the series starting at time 0 is
Yt = Y0 + δt + et + et-1 + ... + e1
Now the expectation of Yt is a function of t. On average the process tends to move
upward (δ > 0) or downward (δ < 0), depending on the sign of the drift (δ).
The k-period forecast variance is the same as for the random walk process.
[Figure: forecast of Yt against time (t, t+1, ..., t+k) with its confidence interval]
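A random walk with drift can be simulated to see that its average level moves with time. In the sketch below the drift term is called `drift` (the name is an assumption), and the average of many simulated paths after 50 periods should be close to the starting value plus 50 times the drift. All parameter values are hypothetical.

```python
import random


def simulate_drift_walk(y0, drift, sigma, periods, rng):
    """Final value of Y_t = Y_{t-1} + drift + e_t after the given number of periods."""
    y = y0
    for _ in range(periods):
        y = y + drift + rng.gauss(0.0, sigma)
    return y


rng = random.Random(1)  # fixed seed so the run is repeatable
finals = [simulate_drift_walk(0.0, 0.5, 1.0, 50, rng) for _ in range(2000)]
average_final = sum(finals) / len(finals)
print(round(average_final, 2))  # compare with 0.0 + 0.5 * 50
```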
[Figure: two panels of Yt against time; left: deterministic trend, right: stochastic trend]
In the left-hand graph the time effect is linear, so the time trend is called
deterministic. In the right-hand graph the time series has a unit root, which
means it exhibits a stochastic trend (the time trend changes, and the series is
non-stationary). To remove the time effect from the series, the
trend-stationary process (TSP) method or the difference-stationary process (DSP) method
can be used.
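The two ways of removing a time effect can be sketched as follows, under the usual reading that the TSP approach takes residuals from a regression on time and the DSP approach takes first differences. The function names and the test series are hypothetical.

```python
def detrend(series):
    """TSP approach: residuals from a least-squares fit of Y_t on t."""
    n = len(series)
    times = list(range(n))
    t_bar = sum(times) / n
    y_bar = sum(series) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, series))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return [y - (intercept + slope * t) for t, y in zip(times, series)]


def difference(series):
    """DSP approach: first differences Y_t - Y_{t-1}."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]


# Hypothetical series with a purely deterministic linear trend
linear = [1.0 + 0.5 * t for t in range(10)]
residuals = detrend(linear)
changes = difference(linear)
print(residuals)
print(changes)
```

For this purely linear series both methods remove the trend: the residuals are zero and the differences are constant. Differencing shortens the series by one observation.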
Detection of non-stationarity
Autocorrelations and ACF(k)
$$\mathrm{ACF}(k) = \frac{\sum_{t=1}^{n-k} (Y_t - \bar{Y})(Y_{t+k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$$
For a series with no autocorrelation (i.e., the population autocorrelation is zero), the
ACF can be expected to vary about zero with a standard error approximately equal to
the constant 1/√n. The t-test of each ACF can be used as a guide in
assessing the statistical significance of the ACFs. The t-test formula is
t = ACF(k) / SeACF
If the absolute t-value is much greater than 2, we can infer that there is autocorrelation between
Yt and Yt-k.
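A minimal sketch of the sample autocorrelation at lag k and its approximate t-value, using the standard error 1/√n. The function names and the short test series are hypothetical and serve only to make the arithmetic easy to follow.

```python
import math


def acf(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    y_bar = sum(series) / n
    num = sum((series[t] - y_bar) * (series[t + k] - y_bar) for t in range(n - k))
    den = sum((y - y_bar) ** 2 for y in series)
    return num / den


def acf_t_statistic(series, k):
    """t = ACF(k) / Se, with Se approximated by 1 / sqrt(n)."""
    return acf(series, k) / (1.0 / math.sqrt(len(series)))


# Hypothetical short series
y = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = acf(y, 1)
t1 = acf_t_statistic(y, 1)
print(round(r1, 4), round(t1, 4))
```

With only five observations the standard error is large, so even a sizeable ACF(1) gives a small t-value.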
Q statistic:
$$Q = n(n + 2) \sum_{k=1}^{m} \frac{\mathrm{ACF}(k)^2}{n - k}$$
where m is the number of lags included.
Ho: ACF(k)s are not significantly different from white noise ACF(k)s (i.e.,
ACF(k)s = 0).
H1: ACF(k)s are statistically different from white noise ACF(k)s (i.e., ACF(k)s ≠ 0).
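The Q statistic combines the squared autocorrelations of several lags into one number. The sketch below computes it as n(n + 2) times the sum of ACF(k)² / (n − k); the argument `max_lag` (the number of lags included) and the test series are hypothetical choices.

```python
def acf(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    y_bar = sum(series) / n
    num = sum((series[t] - y_bar) * (series[t + k] - y_bar) for t in range(n - k))
    den = sum((y - y_bar) ** 2 for y in series)
    return num / den


def q_statistic(series, max_lag):
    """Q = n(n + 2) * sum over k = 1..max_lag of ACF(k)^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(acf(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1))


# Hypothetical short series
y = [1.0, 2.0, 3.0, 4.0, 5.0]
q = q_statistic(y, 2)
print(round(q, 4))
```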
Decision rule: