
Introduction to Econometrics

The Statistical Analysis of Economic (and related) Data
What do economists study?

How do we answer these
questions?

Review of Probability and Statistics
(SW Chapters 2, 3)

The California Test Score Data Set

Initial look at the data:
(You should already know how to interpret this table)

This table doesn't tell us anything about the relationship between test scores and the student-teacher ratio (STR).

Do districts with smaller classes have
higher test scores?
Scatterplot of test score v. student-teacher ratio

What does this figure show?

How do we answer this question
with data?

Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes

Class size   Average score (Ȳ)   Standard deviation (s_Y)   n
Small        657.4               19.4                       238
Large        650.0               17.9                       182

1. Estimation of Δ = difference between group means
2. Test the hypothesis that Δ = 0
3. Construct a confidence interval for Δ
1. Estimation

2. Hypothesis testing

Compute the difference-of-means
t-statistic:
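As a rough sketch of this step, the snippet below plugs the group summary statistics from the class-size table into the usual difference-of-means t-statistic. It assumes the unpooled standard error, sqrt(s_small^2/n_small + s_large^2/n_large); the function name diff_of_means_t is made up for illustration.

```python
import math

def diff_of_means_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, standard error, t-statistic) for two independent groups."""
    diff = mean_a - mean_b
    # Unpooled standard error of the difference in sample means (assumed here)
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return diff, se, diff / se

# Small classes (STR < 20) versus large classes (STR >= 20), from the table
diff, se, t = diff_of_means_t(657.4, 19.4, 238, 650.0, 17.9, 182)
print(f"difference = {diff:.1f}, SE = {se:.2f}, t = {t:.2f}")
```

With these inputs the difference is about 7.4 points and the t-statistic is about 4.05, well above the two-sided 5% critical value of 1.96.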

3. Confidence interval
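One way to sketch the interval is estimate ± 1.96 × standard error, using the same group summary statistics as the class-size table. The large-sample normal critical value 1.96 and the helper name diff_of_means_ci are assumptions made for this illustration.

```python
import math

def diff_of_means_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Large-sample confidence interval for the difference in two group means."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return diff - z * se, diff + z * se

lo, hi = diff_of_means_ci(657.4, 19.4, 238, 650.0, 17.9, 182)
print(f"95% CI for the difference: ({lo:.1f}, {hi:.1f})")
```

The interval is roughly (3.8, 11.0). It excludes zero, which agrees with rejecting the hypothesis of no difference at the 5% level.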

Review of Statistical Theory

(a) Population, random variable, and
distribution

Population distribution of Y

(b) Characteristics (a.k.a. moments) of a
population distribution

Flip coin to see how many heads
result from 2 flips
E(Y) = 0*(0.25) + 1*(0.50) + 2*(0.25)
= 0 + 0.50 + 0.50 = 1

var(Y) = (0.25)*(0 - 1)^2 + (0.50)*(1 - 1)^2 + (0.25)*(2 - 1)^2
= 0.25 + 0 + 0.25 = 0.50

stdev(Y) = sqrt(0.50) = 0.7071
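The same arithmetic can be checked with a few lines of code. This is a minimal sketch that stores the distribution of Y (number of heads in 2 fair flips) as a dictionary; the helper names are illustrative.

```python
import math

# Distribution of Y = number of heads in 2 flips of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def mean(pmf):
    """E(Y): probability-weighted average of the outcomes."""
    return sum(y * p for y, p in pmf.items())

def variance(pmf):
    """var(Y): probability-weighted average squared deviation from E(Y)."""
    mu = mean(pmf)
    return sum(p * (y - mu) ** 2 for y, p in pmf.items())

mu = mean(pmf)
var = variance(pmf)
sd = math.sqrt(var)
print(mu, var, round(sd, 4))
```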


2 random variables: joint
distributions and covariance

Joint Probability
Example: The relationship between commute time
and rain

Pr(X=x, Y=y) is the joint probability, where
X = 0 if raining
  = 1 otherwise
Y = 1 if commute time is short (< 20 minutes)
  = 0 if commute time is long (>= 20 minutes)
Positive or negative relationship?
Conditional Probability
Conditional probability is used to determine the
probability of one event given the occurrence of
another related event.
Conditional probabilities are written as Pr(X | Y), read as "the probability of X given Y", and are calculated as:

Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)
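To make the formula concrete, here is a small sketch that computes a conditional probability from a joint distribution stored as a dictionary keyed by (x, y). The probabilities are illustrative assumptions, since the slide's own joint-probability table does not appear in the text; the function names are also made up.

```python
# Hypothetical joint distribution Pr(X = x, Y = y); keys are (x, y).
# X = 0 if raining, 1 otherwise; Y = 1 if the commute is short, 0 if long.
joint = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.07, (1, 1): 0.63}

def marginal_y(joint, y):
    """Pr(Y = y): sum the joint probabilities over all x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def cond_x_given_y(joint, x, y):
    """Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)."""
    return joint[(x, y)] / marginal_y(joint, y)

p_rain_given_long = cond_x_given_y(joint, 0, 0)
print(round(p_rain_given_long, 3))
```

For any fixed y, the conditional probabilities over all x sum to one, which is a quick sanity check on a table like this.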
Joint Independence
Two random variables, X and Y, are independently
distributed if, for all values x and y,

Pr(X = x,Y = y) = Pr(X = x)*Pr(Y = y)


or
Pr(Y = y | X = x) = Pr(Y = y)

1. Do these hold in the rain and commute example?
2. Pr (X = 1, Y=1) = ?
3. E (X | Y=1) = ?
4. Pr (X=0 | Y=0) = ?
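A quick way to check the independence condition is to compare every joint probability with the product of its marginals. The sketch below does this for two hypothetical tables: one with assumed rain/commute probabilities (not the slide's table, which is not reproduced in the text) and one built to be independent. All names are illustrative.

```python
import math
from itertools import product

def marginals(joint):
    """Return the marginal distributions (px, py) of a joint pmf keyed by (x, y)."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def is_independent(joint, tol=1e-9):
    """True if Pr(X = x, Y = y) = Pr(X = x) * Pr(Y = y) for every (x, y)."""
    px, py = marginals(joint)
    return all(
        math.isclose(joint.get((x, y), 0.0), px[x] * py[y], abs_tol=tol)
        for x, y in product(px, py)
    )

# Hypothetical rain/commute table (assumed values)
rain_joint = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.07, (1, 1): 0.63}

# A table constructed as a product of marginals, so it is independent by design
px_demo = {0: 0.30, 1: 0.70}
py_demo = {0: 0.22, 1: 0.78}
indep_joint = {(x, y): px_demo[x] * py_demo[y] for x in px_demo for y in py_demo}

print(is_independent(rain_joint), is_independent(indep_joint))
```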
The correlation coefficient is
defined in terms of the covariance:

The correlation coefficient measures linear association
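A small example may help show what "linear" means here. Using the standard definition corr(X, Y) = cov(X, Y) / (stdev(X) * stdev(Y)), the sketch below compares an exactly linear relationship with a quadratic one in which Y is completely determined by X yet the correlation is zero. The data points are made up for illustration.

```python
import math

def corr(pairs):
    """Correlation of (x, y) pairs, treating each pair as equally likely."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs) / n)
    return cov / (sx * sy)

xs = (-2, -1, 0, 1, 2)
linear = [(x, 2 * x + 1) for x in xs]   # Y is an exact linear function of X
quadratic = [(x, x * x) for x in xs]    # Y depends on X, but not linearly

print(corr(linear), corr(quadratic))
```

The quadratic case illustrates that zero correlation does not imply independence.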

(c) Conditional distributions and
conditional means
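As an illustration, a conditional mean is just the mean of the conditional distribution: E(Y | X = x) = sum over y of y * Pr(Y = y | X = x). The sketch below computes it from a hypothetical joint table (assumed values, not taken from the slides) and checks that averaging the conditional means over the distribution of X recovers E(Y).

```python
import math

# Hypothetical joint distribution Pr(X = x, Y = y); keys are (x, y)
joint = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.07, (1, 1): 0.63}

def prob_x(joint, x):
    """Marginal probability Pr(X = x)."""
    return sum(p for (xx, _), p in joint.items() if xx == x)

def cond_mean_y(joint, x):
    """E(Y | X = x): mean of Y under the conditional distribution given X = x."""
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / prob_x(joint, x)

def mean_y(joint):
    """Unconditional mean E(Y)."""
    return sum(y * p for (_, y), p in joint.items())

# Weighted average of the conditional means, with weights Pr(X = x)
iterated = sum(prob_x(joint, x) * cond_mean_y(joint, x) for x in (0, 1))
print(cond_mean_y(joint, 0), cond_mean_y(joint, 1), mean_y(joint), iterated)
```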

Conditional mean, ctd.

(d) Distribution of a sample of data drawn
randomly from a population: Y1, ..., Yn

Distribution of Y1, ..., Yn under
simple random sampling

(a) The sampling distribution of Ȳ

The sampling distribution of Ȳ, ctd.

The sampling distribution of Ȳ when Y is Bernoulli
(p = .78):
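For a Bernoulli variable, the sampling distribution of the sample mean can be written down exactly, because the number of ones in n draws is binomial. The sketch below tabulates it for n = 2 and p = 0.78; the function name is made up for illustration.

```python
from math import comb

def sampling_dist_ybar(n, p):
    """Exact distribution of the sample mean of n i.i.d. Bernoulli(p) draws.

    Returns a dict mapping each possible value k/n to its probability.
    """
    return {k / n: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

dist = sampling_dist_ybar(2, 0.78)
for value, prob in dist.items():
    print(f"Pr(Ybar = {value}) = {prob:.4f}")
```

With n = 2 the sample mean can only take the values 0, 0.5, and 1, so its distribution looks nothing like a bell curve yet.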

Things we want to know about the
sampling distribution:

The mean and variance of the
sampling distribution of Ȳ
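A short derivation sketch, assuming Y_1, ..., Y_n are i.i.d. with mean \mu_Y and variance \sigma_Y^2 (this notation is an assumption, chosen to match standard usage). Independence is what makes the covariance terms vanish in the variance calculation.

```latex
\begin{align*}
E(\bar{Y}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \mu_Y \\
\operatorname{var}(\bar{Y}) &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{var}(Y_i)
            = \frac{\sigma_Y^2}{n}
\end{align*}
```

So the sample mean is centered on the population mean, and its spread shrinks as n grows.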

Mean and variance of sampling
distribution of Ȳ, ctd.

The sampling distribution of Ȳ when
n is large

The Law of Large Numbers:
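A simulation can give a feel for the law of large numbers: as n grows, the sample mean of i.i.d. draws settles near the population mean. The sketch below uses Bernoulli draws with p = 0.78; the seed and sample sizes are arbitrary choices for illustration.

```python
import random

def sample_mean_bernoulli(n, p, rng):
    """Sample mean of n simulated i.i.d. Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    print(n, sample_mean_bernoulli(n, 0.78, rng))
```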

The Central Limit Theorem (CLT):
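One way to see the CLT at work without simulation is to compute, exactly, how often the standardized sample mean of Bernoulli(p) draws falls within ±1.96. If the normal approximation is good, this probability should be close to 0.95 for large n. The function name and the choice of sample sizes are assumptions made for this sketch.

```python
from math import comb, sqrt

def prob_standardized_within(n, p, z=1.96):
    """Exact Pr(|(Ybar - p) / sqrt(p(1-p)/n)| <= z) for n i.i.d. Bernoulli(p) draws."""
    sd = sqrt(p * (1 - p) / n)  # standard deviation of the sample mean
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k / n - p) / sd <= z
    )

for n in (5, 25, 100, 400):
    print(n, round(prob_standardized_within(n, 0.78), 4))
```

Because the sample mean is discrete, the probability does not converge smoothly, but it should stay near 0.95 once n is in the hundreds.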

Sampling distribution of Ȳ when Y
is Bernoulli, p = 0.78:

Same example: sampling distribution of (Ȳ - E(Ȳ)) / sqrt(var(Ȳ)):

Summary: The Sampling
Distribution of Ȳ

(b) Why Use Ȳ To Estimate μ_Y?

Language of Hypothesis Testing
Test statistic = t-statistic:

Significance level: Specified probability of Type I error
Significance level = α
Critical Value: Value of test statistic for which the test just
rejects the null at a given significance level

Language of Hypothesis Testing,
ctd.
p-value
Probability of drawing a statistic (e.g. Ȳ) at least as
adverse to the null hypothesis as the value computed
with your data, assuming the null hypothesis is true
The smallest significance level at which you can reject
the null hypothesis

|Test statistic| > |critical value| ⇒ reject null hypothesis
|Test statistic| < |critical value| ⇒ fail to reject null hypothesis

Calculating the p-value with σ_Y known:
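A minimal sketch of the calculation, assuming a two-sided test of H0: E(Y) = mu0 and the large-sample normal approximation for the standardized sample mean. The function name and the input numbers are hypothetical, chosen only so that the standardized statistic equals 2.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(ybar, mu0, sigma, n):
    """Large-sample two-sided p-value for H0: E(Y) = mu0 when sigma is known."""
    z = (ybar - mu0) / (sigma / sqrt(n))  # standardized sample mean
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical inputs: z = (0.5 - 0) / (1 / sqrt(16)) = 2
p = two_sided_p_value(ybar=0.5, mu0=0.0, sigma=1.0, n=16)
alpha = 0.05
print(f"p-value = {p:.4f}, reject at the 5% level: {p < alpha}")
```

The last line reflects the usual decision rule: the null is rejected at significance level alpha exactly when the p-value is below alpha.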

Estimator of the variance of Y:
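For illustration, the usual estimator is the sample variance with an n - 1 divisor, s_Y^2 = (1 / (n - 1)) * sum of (Y_i - Ȳ)^2, and the standard error of Ȳ is then s_Y / sqrt(n). The data values below are made up, and the helper names are assumptions.

```python
from math import sqrt

def sample_variance(ys):
    """Sample variance with the n - 1 divisor."""
    n = len(ys)
    ybar = sum(ys) / n
    return sum((y - ybar) ** 2 for y in ys) / (n - 1)

def standard_error(ys):
    """Estimated standard deviation of the sample mean: s_Y / sqrt(n)."""
    return sqrt(sample_variance(ys) / len(ys))

ys = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
print(sample_variance(ys), standard_error(ys))
```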

What is the link between the p-value
and the significance level?

Common Critical Values
One-Tail Test                        Two-Tail Test
1 - α    α       Critical Value      1 - α    α/2      Critical Value
0.90     0.10    1.282               0.90     0.05     1.645
0.95     0.05    1.645               0.95     0.025    1.960
0.99     0.01    2.326               0.99     0.005    2.576
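These values are quantiles of the standard normal distribution, so they can be reproduced with the inverse CDF. The sketch below uses statistics.NormalDist from the Python standard library; the function name critical_values is made up for illustration.

```python
from statistics import NormalDist

def critical_values(confidence):
    """Return (one-tail, two-tail) standard normal critical values for 1 - alpha."""
    alpha = 1 - confidence
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.inv_cdf(1 - alpha), z.inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    one_tail, two_tail = critical_values(confidence)
    print(f"{confidence:.2f}  one-tail {one_tail:.3f}  two-tail {two_tail:.3f}")
```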
