() .
Expected Value:
E(x) = Σ x·P(x) (discrete)
E(x) = ∫ x·f(x) dx (continuous)
Properties of the mean:
E(E(x)) = E(x)
E(x + c) = E(x) + c
E((cx)²) = c² E(x²)
E(x + y) = E(x) + E(y)
E(x²) ≥ E²(x)
E(xy) = E(x) E(y) for independent x and y
( )2
() =
2 () =
()
( )2
1 =
2 () =
det()
1
(( ) ( )) ()
(, ) =
= (())
2)
( + ) = ()
() = (
Independent variables have cov(x, y) = 0, hence:
var(x + y) = var(x) + var(y) &
(, ) = ( ) (like variance)
(( 2 ) ( 2 ))
( )2
:
( 2 ) = (
)=
2
2
( ) =
1
1
Plim means
(): (()) = ()
2 ) = ( ) + 2
probability
(
() = ( 2 ) 2
limit. P(x)
2
2
2
( ) = () +
= ( ) + =
as n
2
()
()
+ 2 =
+ 2 = 2 + 2 = 2 / + 2
2
2
(, )
( 2 ) = () + 2 = 2 + 2
= (, )
(( 2 ) ( 2 ))
> 0
2
( ) =
=
1
(,
)
= (, )
(
) ( 2 + 2 ( 2 / + 2 )) = 2
1
< 0
Proof: var(x) = E(x²) - μ²
var(x) = E((x - μ)²)
= E(x² - 2xμ + μ²) = E(x²) - 2μE(x) + μ²
= E(x²) - 2μ² + μ² = E(x²) - μ²
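The identity var(x) = E(x²) - μ² can be checked numerically. The sketch below is only an illustration: the data values are made up, and the population (divide by n) variance is assumed.

```python
from statistics import fmean, pvariance

def variance_via_identity(xs):
    """Population variance computed as E(x^2) - (E(x))^2."""
    mu = fmean(xs)
    mean_of_squares = fmean(x * x for x in xs)
    return mean_of_squares - mu ** 2

# Hypothetical data, chosen only so the arithmetic is easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

direct = pvariance(data)                 # E((x - mu)^2)
identity = variance_via_identity(data)   # E(x^2) - mu^2
print(direct, identity)
```

Both routes give the same number because the cross term -2μE(x) collapses to -2μ².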
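corr(x, y) = cov(x, y) / √(var(x) var(y))
Note: cov(x, x) = var(x)
cov(x, y) = E((x - μx)(y - μy))
= E(xy - x μy - μx y + μx μy)
= E(xy) - μy E(x) - μx E(y) + μx μy
= E(xy) - μx μy
Physical meaning of α (α = 0.05, confidence level is 0.95): the probability of a type 1 error in HT is 5%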
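As a quick numerical check of the covariance and correlation definitions, the sketch below computes cov(x, y) both from its definition and from the shortcut E(xy) - E(x)E(y). The data are hypothetical and population (divide by n) moments are assumed.

```python
from math import sqrt
from statistics import fmean

def covariance(xs, ys):
    """cov(x, y) = E((x - mu_x)(y - mu_y)), population version."""
    mx, my = fmean(xs), fmean(ys)
    return fmean((x - mx) * (y - my) for x, y in zip(xs, ys))

def covariance_shortcut(xs, ys):
    """cov(x, y) = E(xy) - E(x)E(y)."""
    return fmean(x * y for x, y in zip(xs, ys)) - fmean(xs) * fmean(ys)

def correlation(xs, ys):
    """corr(x, y) = cov(x, y) / sqrt(var(x) var(y)); cov(x, x) is var(x)."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

print(covariance(x, y), covariance_shortcut(x, y), correlation(x, y))
# A variable that is an exact linear function of x has correlation 1.
print(correlation(x, [3.0 + 2.0 * v for v in x]))
```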
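Proof that |ρ| ≤ 1: for any constant λ,
E(((x - μx) - λ(y - μy))²) ≥ 0
E((x - μx)² - 2λ(x - μx)(y - μy) + λ²(y - μy)²) ≥ 0
var(x) - 2λ cov(x, y) + λ² var(y) ≥ 0
Choosing λ = cov(x, y)/var(y):
var(x) - 2 cov²(x, y)/var(y) + cov²(x, y)/var(y) ≥ 0
cov²(x, y)/(var(x) var(y)) ≤ 1, so |ρ| ≤ 1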
If y = a + bx, then ρ = ±1:
ρ = E((x - μx)(a + bx - (a + bμx))) / √((E(x²) - μx²)(E((a + bx)²) - (E(a + bx))²))
= b E((x - μx)²) / √((E(x²) - μx²)(b²E(x²) - b²μx²))
= b E((x - μx)²) / √(b² (E((x - μx)²))²) = b/|b| = ±1
Law of large numbers: (x̄ - μ) → 0 as n → ∞
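A small simulation can illustrate the law of large numbers. This is a sketch under stated assumptions: the draws are Uniform(0, 1) (true mean 0.5), the seed is fixed for reproducibility, and the sample sizes are arbitrary.

```python
import random
from statistics import fmean

def sample_mean_error(n, mu=0.5, seed=0):
    """Absolute gap between the sample mean of n Uniform(0,1) draws and mu."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(n)]
    return abs(fmean(draws) - mu)

# The gap tends to shrink as n grows (not necessarily monotonically).
errors = {n: sample_mean_error(n) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(n, err)
```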
Confidence interval for the variance (95%):
P[ (n - 1)s² / χ²(df = n - 1, α = 0.025) ≤ σ² ≤ (n - 1)s² / χ²(df = n - 1, α = 0.975) ] = 0.95
F = (χ²(1)/df1) / (χ²(2)/df2).
(1 , 2 ) (2 , 1 )
(
(
1)
+
1)
=
, = + 2
2 =
1/2 + 1/2
++2
2 = ( )2 /( 1) ( ) = 2 ( 1) The tests discussed
above are for
2
2 = ( ) /( 1) ( ) = 2 ( 1)
independent groups
x-y
2
2nm The tests discussed
unpooled test: t=
, df=
=
next (paired test
1 1 n+m
+
pooled test) are for
sx2 /n2 +sy2 /m2
related groups
(example of related groups: same group of people before and after drug trial)
1) you must have the values of all 2) calculate the mean of
=
measurements. calculate the difference
the group () and
C: # column
Proof: = 2 / + 2 /
2 ) = ( 2 ),
(
. . = ( ) =
.
, =
(. ) <
Type 1 error: rejected H0 when H0 is true
Type 2 error: did NOT reject H0 when H0 is false
If (n) is fixed: decreasing one type of error increases the other. Hence: increase n
Level of Significance (α): lower bound of the probability at which you cannot reject the possibility of getting the result you get, given that the null hypothesis is true.
ANOVA for HT on >2 groups
χ² = Σ(Fobserved - Fexpected)²/Fexpected
This is ANOVA (>2 groups)
This is the ordinary case (2 groups)
=
/
= 1
Compare means of 2 samples, t-test: with df = n1+n2-2
=
=
(1 2 ) (1 2 ) 2 (1 1)12 + (2 1)22
=
; =
(1 )2 + 2 (2 )2 + + ( )2
1
1 + 2 2
)
(
1
2 /1 + 2 /2
=
=
( 1)12 + (2 1)22 + + ( 1)2
2
(1) = ( )2 / 2 = 2 = 1
N is population size (N = Σn = n1 + n2 + ...); K is how many groups
2
To compare two samples: (2)
= 12 + 22 , = 2
MSbetween groups is due to effect of drug.
df1 for F (numerator, MSbetween) = K-1
df2 for F (denominator, MSwithin) = N-K
Remember to convert (S) given in the question to ()=(S) (n-1)/n
MSwithin group is due to chance.
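The between/within decomposition can be sketched for a one-way ANOVA as below. It assumes F = MSbetween / MSwithin, so the numerator carries K - 1 and the denominator N - K degrees of freedom; the three groups are hypothetical.

```python
from statistics import fmean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = fmean(all_values)
    # Between-group variation: attributed to the treatment (e.g. the drug).
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: attributed to chance.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical measurements for three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova(groups)
print(f_stat, df_between, df_within)
```

The resulting F would then be compared with the critical value of the F distribution at (df_between, df_within).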
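χ²(n - 1) = (n - 1)s²/σ²; the difference from the definition above is the df
You can prove it using: E(Σ(x - x̄)²) = (n - 1)σ²
In hypothesis testing on variance, σ² is the hypothesized variance
kurtosis = E((x - μ)⁴)/σ⁴ = 3 for a normal distribution
skewness = E((x - μ)³)/σ³ = 0 for a normal distribution
JB = n[(skewness)²/6 + (kurtosis - 3)²/24]
Large n>800: JB has a χ² distribution. Otherwise use the Bowman-Shelton table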
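The Jarque-Bera statistic built from skewness and kurtosis can be sketched as follows. Population (divide by n) moments are assumed, and the sample is hypothetical and far too small for the χ² approximation; it only shows the arithmetic.

```python
from statistics import fmean

def jarque_bera(xs):
    """Return (skewness, kurtosis, JB) using population central moments."""
    n = len(xs)
    mu = fmean(xs)
    m2 = fmean((x - mu) ** 2 for x in xs)
    m3 = fmean((x - mu) ** 3 for x in xs)
    m4 = fmean((x - mu) ** 4 for x in xs)
    skewness = m3 / m2 ** 1.5      # 0 for a normal distribution
    kurtosis = m4 / m2 ** 2        # 3 for a normal distribution
    jb = n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)
    return skewness, kurtosis, jb

# Hypothetical symmetric sample.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt, jb = jarque_bera(sample)
print(skew, kurt, jb)
```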
Estimator: A math. function to estimate a population parameter from sample
For a column vector A [n×1], AAᵀ is symmetric and diagonal(AAᵀ) is A.^2
2
2
lim ( ) = , and lim ( ) = 0 MSE=variance+bias2 =E(
Unbiasedness: ( ) = Consistency:
is a diagonal matrix with 2 , is BLUE
estimate -) E(any other -)
GM Assumptions: () = 0, () = 2 , () = 0, = ( )1 [ + ] = [ + ( )1 ( )] =
Y: regressand, explained, dependent variable
x: regressor, independent, carrier variable
K: # of regressors (including 1·X0)
Remember to include a vector of 1's in your (X) matrix for (x0)
Restricted estimator: β* = β̂ + (X'X)⁻¹Rλ
λ = (R'(X'X)⁻¹R)⁻¹(r - R'β̂)
Substitute λ: β* = β̂ + (X'X)⁻¹R(R'(X'X)⁻¹R)⁻¹(r - R'β̂)
β* is unbiased, E(β*) = β, if the restriction holds for the true β (r - R'β = 0)
~(, 2 ( )1 )
() = 0, , ( 2 ) = 2 See the rule above on vector A RMSE: Root mean square Error. MLE: Max Likelihood Estimation
( ) is a diagonal matrix. all elements of diagonal are 2
Unbiased :
2
2
() = ( ) = (( ) ) /( ) RMSE==se
E((2 ) = 2 (), ( ) = 0 ()
For biased estimator, or inefficient unbiased estimator:
MSE(θ̂) = var(θ̂) + bias(θ̂)²; for the sample mean, MSE(x̄) = σ²/n + 0
var(x̄) = var(Σx)/n² = Σvar(x)/n² = nσ²/n² = σ²/n
RMSE, example in slides: when X for predicted & real values match, divide by n; when X for predicted & real values differ (a model is used), divide by n-K
Derive the OLS Estimator: our objective function is: min(Σe²)
min(e'e) = (y - Xβ)'(y - Xβ) = (y' - β'X')(y - Xβ)
= y'y - β'X'y - y'Xβ + β'X'Xβ = y'y - 2β'X'y + β'X'Xβ
To minimize, find the value of β where the derivative = 0
∂(e'e)/∂β = -2X'y + 2X'Xβ = 0 ⇒ β̂ = (X'X)⁻¹X'y
∂²(e'e)/∂β² = 2X'X ≥ 0. Function is convex. Optimum is a min.
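The normal equations (X'X)β = X'y can be solved by hand when X has only an intercept column and one regressor. The sketch below does that and computes the RMSE with the n - K divisor. The data are hypothetical, and the function names `ols_simple` and `rmse` are illustrative, not from the source.

```python
def ols_simple(xs, ys):
    """Solve (X'X) b = X'y for X = [1, x], i.e. intercept plus one regressor."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X'X = [[n, sx], [sx, sxx]] and X'y = [sy, sxy]; invert the 2x2 directly.
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

def rmse(xs, ys, b0, b1, k=2):
    """Root mean square error using the n - K divisor (K = 2 coefficients here)."""
    ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return (ssr / (len(xs) - k)) ** 0.5

# Hypothetical data scattered around y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
b0, b1 = ols_simple(x, y)
error = rmse(x, y, b0, b1)
print(b0, b1, error)
```

With more regressors the same formula applies, but the inverse of X'X is better left to a linear algebra library.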
Proof: Unbiased
E(β̂) = E((X'X)⁻¹X'y) = E((X'X)⁻¹X'[Xβ + ε]) = β + (X'X)⁻¹E(X'ε)
(X'X) ≠ 0, so if E(X'ε) = 0, we get E(β̂) = β
s² = e'e/(N - K)
VCV (Variance-Covariance): VCV(β̂) = σ²[X'X]⁻¹, so β̂ ~ N(β, σ²(X'X)⁻¹)
Proof: VCV(β̂) = E[(β̂ - β)(β̂ - β)']
With y = Xβ + ε, β̂ - β = (X'X)⁻¹X'ε, so
VCV(β̂) = E([(X'X)⁻¹X'ε][(X'X)⁻¹X'ε]')
= E[(X'X)⁻¹X'εε'X(X'X)⁻¹]
= (X'X)⁻¹X'E(εε')X(X'X)⁻¹ = σ²(X'X)⁻¹
About 68% of the points on a scatter diagram will be within 1 RMSE of the regression line; about 95% of them will be within 2 RMSE
MLE: maximizing the likelihood that the extracted sample represents the population: distribution of y conditional on parameters θ.
Pdf: f(y|θ)
Likelihood function: L(θ|y) (opposite) = f(y|θ)
Example: L(p|y) = the binomial distribution: (n!/(k!(n - k)!)) p^k (1 - p)^(n - k)
MLE(θ) is the value that maximizes the joint product of densities pdf:
θ̂ = argmax(L(θ)), or say: find (θ) for max L = L(θ); L = Π f(yᵢ|θ)
Normal pdf: f(y) = (2πσ²)^(-1/2) exp(-(y - μ)²/(2σ²))
For regression: f(yᵢ|θ) = f(yᵢ|{xᵢ, β, σ²}) = (2πσ²)^(-1/2) exp(-(yᵢ - xᵢβ)²/(2σ²))
n: # observations
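The argmax idea behind MLE can be illustrated with the binomial likelihood. The sketch below uses a plain grid search rather than calculus; the counts (7 successes in 10 trials) and the grid resolution are assumptions made for the example.

```python
from math import comb

def binomial_likelihood(p, n, k):
    """L(p | k successes in n trials) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def mle_grid(n, k, steps=1000):
    """Return the p on an evenly spaced grid over [0, 1] that maximizes L."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, n, k))

# Hypothetical sample: 7 successes out of 10 trials.
p_hat = mle_grid(n=10, k=7)
print(p_hat)
```

The grid maximum coincides with the closed-form result k/n, which is what setting the derivative of the log-likelihood to zero gives.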
5) () = 2 is constant. 6) N 7) c( , ) 1
([1] [1]
) ([1] [1]
)
(
(
Restricted OLS: 1 coefficient t-test, joint test F-test
) )
( ) ( )
=
+
= 0 2 =
4
Unrestricted model: the model that we calculated. q: # of restrictions
df = {q, N-K}
Restricted model: from the hypothesis.
2 = / = ( ) 2 / ( 2 ) = ( /) = ( ) 2 /
The unrestricted model always fits at least as well: SSR_UR ≤ SSR_R
Joint restrictions: F(df1 = q, df2 = N-K) = ((SSR_R - SSR_UR)/q) / (SSR_UR/(N - K))
In a joint hypothesis, restrictions can be written as a linear system.
When you enforce (β) to certain values, you can't use the other (β) from the original model. You have to re-derive the whole model: derive new β under restrictions.
Logistic regression: when y is discrete, binary or categorical.
If we use OLS with discrete Y, we will suffer: 1) ε are not normally distributed, 2) probabilities can be >1 or <0, 3) coefficient interpretation is meaningless. Hence, we need logistic regression.
Odds = p/(1-p)
0 < p < 1
logit = ln(p/(1 - p)) = α + βx
ln(odds) = logit
p = 1/(1 + exp(-logit)) = e^logit/(1 + e^logit)
odds = exp(logit)
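The conversions between probability, odds and logit used in logistic regression can be checked numerically. This is a minimal sketch; the probability 0.8 is an arbitrary example value.

```python
from math import exp, log

def odds(p):
    """Odds = p / (1 - p)."""
    return p / (1 - p)

def logit(p):
    """logit = ln(odds) = ln(p / (1 - p)); ranges over the whole real line."""
    return log(odds(p))

def inverse_logit(z):
    """p = 1 / (1 + exp(-z)); always strictly between 0 and 1."""
    return 1 / (1 + exp(-z))

p = 0.8
print(odds(p), logit(p), inverse_logit(logit(p)))
# A logit of 0 corresponds to odds of 1 and a probability of 0.5.
print(inverse_logit(0.0))
```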
MLE for logit: solve Σ{Y - p(Y=1)}X = 0. Wald statistic for β = [β/(S.E.)]²
. : = 0
min( ) = ( ) ( )
2
increase odds of y=1, R=(wald-2)/(-2LL()); (=1)
: = ( ) ( ) 2( )
'
'
'
KRestrct= KUR when xn only increases by 1
Cutoff is: = 1
L = (Y'Y) - 2β*'X'Y + β*'X'Xβ* - 2λ'(R'β* - r)
∂L/∂β* = -2X'Y + 2X'Xβ* - 2Rλ = 0
∂L/∂λ = -2(R'β* - r) = 0
Likelihood ratio: (LR) = 2[LL(α, β) - LL(α)] = 2[LL(end) - LL(beginning)] ~ χ²(df = number of coefficients tested)
2
2
White: regress: e² = f(γ0 + γ1x1 + γ2x2 + γ3x1² + γ4x2² + γ5x1x2);
Breusch-Pagan: regress: e² = f(γ0 + γ1x1 + ...); H0: γ = 0
Test statistic (both): N·R² ~ χ²(df = K)
K: # of test regressors (#x) = #γ - 1
E(Fεε'F') = σ²FΩF'
GLS: FΩF' = I; Ω = F⁻¹(F')⁻¹; F'F = Ω⁻¹; Fy = FXβ + Fε, i.e. y* = X*β + ε*
β_GLS = (X*'X*)⁻¹X*'Y* = (X'Ω⁻¹X)⁻¹X'Ω⁻¹Y
VCV(β_GLS) = σ²(X*'X*)⁻¹ = σ²(X'Ω⁻¹X)⁻¹
E(σ̂²_GLS) = σ²
N=obsrv.
. = (
= + =1 GLS & ARMA
) +
=1
Test: = (=1 ) + + Test-stat: () = 2 ~2
Autocorrelation: cov(ε_t, ε_{t-s}) = ρ^s σ_u² (1 - ρ²)⁻¹
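Ω = (1 - ρ²)⁻¹ ×
[ 1        ρ        ρ²       ...  ρ^(T-1) ]
[ ρ        1        ρ        ...  ρ^(T-2) ]
[ ρ²       ρ        1        ...  ρ^(T-3) ]
[ ...                                     ]
[ ρ^(T-1)  ρ^(T-2)  ρ^(T-3)  ...  1       ]

Ω⁻¹ =
[ 1    -ρ     0     ...  0 ]
[ -ρ   1+ρ²   -ρ    ...  0 ]
[ 0    -ρ     1+ρ²  ...  0 ]
[ ...                      ]
[ 0    0      ...   -ρ   1 ]

F =
[ √(1-ρ²)  0    0   ...  0 ]
[ -ρ       1    0   ...  0 ]
[ 0        -ρ   1   ...  0 ]
[ ...                      ]
[ 0        0    ... -ρ   1 ]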
Violations of GM: 1) var(εᵢ) = σᵢ²: heteroskedasticity; 2) cov(εᵢ, εⱼ) ≠ 0, εᵢ and εⱼ are correlated: autocorrelation; 3) E(X'ε) ≠ 0: stochastic regressor; 4) rank(X) = K and K ≤ N fails: multicollinearity
GLS: Heteroskedasticity: guess Ω → FGLS
Autocorrelation: calculate Ω → FGLS
2-stage LS: model is: y = Xβ + ε
We have 2 IVs, to be combined into 1:
x̂ = γ0 + γ1z1 + γ2z2
y = β0 + β1x̂ + ε. Use the regressed x̂, not the true x. Add a vector of 1s to x̂; use the information available to get β̂ and evaluate the effect of x on y.
Multicollinearity: when explanatory variables are correlated
No multicollinearity: estimates of β stay the same if the model was y(x1) or y(x1,x2). The marginal (individual) contribution of one regressor to reducing SSR is independent of the other regressor: SSR(x1) = SSR(x1|x2)
Multicollinearity exists → some info is repeated, redundant, useless. (X'X) is NOT full rank under perfect MC. Estimates of βi change when you add a new regressor to the model (higher correlation → higher change in βi)
SSR(X1) ≠ SSR(X1|X2). HT: βi = 0 may yield different results ((t) changes)
SE(x1|model with only x1) < SE(x1|model with multiple x)
Symptoms of MC: 1) small changes in observations or regressors cause big changes in coefficients 2) high SE and small t-statistic of a coefficient while R² is relatively high (a high SE of the intercept is normal)
Stochastic regressor:
Q1: are the explanatory variables correlated with the error terms?
no → then there is no problem in the stochastic regressor.
yes → then Q2: are the explanatory variables correlated with error terms in the same observation? yes → GSRM // no → PISRM
Slides example: y_t = βy_{t-1} + ε_t; ε_t = ρε_{t-1} + v_t. y_{t-1} is related to ε_{t-1}, also ε_t is related to ε_{t-1} → y_{t-1} and ε_t must be related (by transitivity) → model is GSRM.
Other example: y_t = α + βy_{t-1} + ε_t, no relation between ε_t and ε_{t-1}. y_{t-1} is independent from ε_t → PISRM model
Effects of SR: if stochastic but uncorrelated to error: unbiased, efficient, consistent. If PISRM: small sample: biased; large sample: consistent. If GSRM: always inconsistent and biased
Multicollinearity diagnosis: for a model with 2 regressors only: correlation coefficient |ρ| between variables. For a model with >2 regressors: 1) variance inflation factor (VIF) 2) R² of auxiliary regression > R² of original model
MC always exists; the question is how much MC is present
Correlation between X and Y is NOT MC
Auxiliary regression: regress xi against ALL OTHER Xs: xi = α0 + α1x1 + α2x2 + α3x3. Get the UNADJUSTED (Ri²)
VIF(βi) = (1 - Ri²)⁻¹
Bigger VIF, bigger MC
Critical VIF: 5 or 10. More regressors (K), bigger critical VIF
VIF problems: no formal decision rule. Necessary but not sufficient test for MC (MC can exist with small VIF)
Another test of MC is: TOR = 1/VIF = (1 - Ri²). TOR = 0 → perfect correlation.
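The VIF and TOR diagnostics can be sketched for the simplest case of two regressors, where the auxiliary regression of x1 on x2 has an unadjusted R² equal to their squared correlation. The regressor values below are hypothetical and deliberately almost collinear.

```python
from statistics import fmean

def r_squared_simple(xs, ys):
    """Unadjusted R^2 from regressing ys on xs with an intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def vif_and_tor(r2_aux):
    """VIF = 1 / (1 - R_i^2); TOR (tolerance) = 1 / VIF = 1 - R_i^2."""
    tor = 1 - r2_aux
    return 1 / tor, tor

# Hypothetical regressors that move almost together.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.0]

r2_aux = r_squared_simple(x2, x1)   # auxiliary regression of x1 on x2
vif, tor = vif_and_tor(r2_aux)
print(r2_aux, vif, tor)
```

With more than two regressors the auxiliary regression needs all the other Xs on the right-hand side, so a matrix-based OLS routine would replace `r_squared_simple`.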
Other diagnosis: new variable causes sign change, decline in R²
Remedy: if the cause is not in the data, change the functional form or sampling strategy. Otherwise: do nothing, or drop the redundant variable, or transform correlated variables to a new 3rd variable that is a ratio or product, or increase the sample size
Instrumental variable, for stochastic regressor:
Check last example: meat is IV for calorie, asset is IV for age. z = [ones, meat, asset], and continue normally using z'y, z'x
Assume a transformation matrix (Z) exists, highly correlated with (x), not correlated with (ε), so that plim(Z'X/N) is finite and nonzero and plim(Z'ε/N) = 0 → model becomes
1
2 =
/
Z'Y = Z'Xβ + Z'ε ⇒ β_IV = (Z'X)⁻¹Z'Y
Proof IV consistent:
plim(β_IV) = plim((Z'X)⁻¹Z'Y) = plim((Z'X)⁻¹Z'(Xβ + ε))
= β + [plim(Z'X/N)]⁻¹ plim(Z'ε/N) = β + [plim(Z'X/N)]⁻¹(0) = β
VCV(β_IV) = σ²(Z'X)⁻¹(Z'Z)(X'Z)⁻¹
Procedure: add a vector of ones to Z, like the 1st column in X. Evaluate (Z'X), (Z'Y) and β_IV = (Z'X)⁻¹Z'Y
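The IV procedure can be sketched for one regressor and one instrument, each with a column of ones, so that Z'X is 2×2 and can be inverted by hand. The numbers are hypothetical (they are not the meat/calorie data mentioned in the notes) and are built so that y = 1 + 2x exactly.

```python
def iv_simple(zs, xs, ys):
    """beta_IV = (Z'X)^-1 Z'Y with Z = [1, z] and X = [1, x]."""
    n = len(xs)
    sz, sx, sy = sum(zs), sum(xs), sum(ys)
    szx = sum(z * x for z, x in zip(zs, xs))
    szy = sum(z * y for z, y in zip(zs, ys))
    # Z'X = [[n, sx], [sz, szx]] and Z'Y = [sy, szy]; invert the 2x2 directly.
    det = n * szx - sx * sz
    b0 = (szx * sy - sx * szy) / det
    b1 = (n * szy - sz * sy) / det
    return b0, b1

# Hypothetical instrument z, regressor x, and outcome y = 1 + 2x.
z = [1.0, 2.0, 3.0, 4.0]
x = [1.5, 1.5, 3.5, 3.5]
y = [4.0, 4.0, 8.0, 8.0]
b0_iv, b1_iv = iv_simple(z, x, y)
print(b0_iv, b1_iv)
```

In this two-column case the slope reduces to cov(z, y)/cov(z, x), which shows why the instrument must be correlated with x.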
Proof GLS validity: 1st term is: (1,1) 1 =(1,1) 1 '+(1,1) 1
E(1 =F(1,1) 1 )=0 For Remaining terms, yi -yt-1 =(xt -xt-1 )'
+ut Errors are homosked & uncorrelated GLS is BLUE
Heterosked. Example:
Build OLS, check whether diagonal(εε') is constant
Assume Ω, find F. Create: FX, FY. REPLACE (FX(:,1)) BY ONES
Find GLS Find =
Autocorrelation Example:
Build OLS, find ε. Split to sub-vectors: ε_{i-1} = [ε1, ε2], ε_i = [ε2, ε3]
Evaluate ρ in: ε_t = ρε_{t-1} + u_t
ρ̂ = Σ_{t=2}(ε_t ε_{t-1}) / Σ_{t=2}(ε_{t-1})²
Derive F matrix from ugly matrix
Use F to transform x & y
Did not replace FX(:,1) by ones
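The autocorrelation steps can be sketched as below: estimate ρ by regressing each residual on its lag (no intercept), then transform a series with F. The form of F used here (first observation scaled by √(1 - ρ²), later ones quasi-differenced) is an assumption of this sketch, and the residuals are hypothetical.

```python
def estimate_rho(residuals):
    """rho_hat = sum(e_t * e_{t-1}) / sum(e_{t-1}^2), summing from t = 2."""
    lagged, current = residuals[:-1], residuals[1:]
    num = sum(e_t * e_lag for e_t, e_lag in zip(current, lagged))
    den = sum(e_lag ** 2 for e_lag in lagged)
    return num / den

def ar1_transform(values, rho):
    """Apply F to one column: scale the first entry, quasi-difference the rest."""
    first = (1 - rho ** 2) ** 0.5 * values[0]
    rest = [v - rho * prev for prev, v in zip(values, values[1:])]
    return [first] + rest

# Hypothetical OLS residuals that halve each period.
residuals = [1.0, 0.5, 0.25, 0.125]
rho_hat = estimate_rho(residuals)
transformed = ar1_transform(residuals, rho_hat)
print(rho_hat, transformed)
```

The same `ar1_transform` would be applied to y and to every column of X (including the column of ones) before rerunning OLS on the transformed data.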
Stochastic Regressor: E(β̂) = β + E((X'X)⁻¹X'ε)
If X is no longer fixed in repeated sampling: E((X'X)⁻¹X'ε) ≠ (X'X)⁻¹X'E(ε)
If X and ε are not independent of each other (covariance exists): E((X'X)⁻¹X'ε) ≠ E((X'X)⁻¹X')E(ε)
GSRM: General Stochastic Regressor Model
PISRM: Partially Independent Stochastic Regressor Model
Violation: Heteroskedasticity, var(εᵢ) = σᵢ²
  Definition: heterogeneous variance of error terms
  Effect: OLS remains unbiased and consistent; estimated standard error biased
  Diagnosis: Park test (known form of heteroskedasticity); Breusch-Pagan test and White test (unknown form)
  Remedy: GLS; FGLS; robust standard error
Violation: Autocorrelation, cov(εᵢ, εⱼ) ≠ 0
  Diagnosis: Durbin-Watson test
Violation: Stochastic regressor, E(X'ε) ≠ 0
  Remedy: IV; 2SLS
Violation: Multicollinearity, rank(X) = K ≤ N fails