How do we answer these questions?
Review of Probability and Statistics
(SW Chapters 2, 3)
The California Test Score Data Set
Initial look at the data:
(You should already know how to interpret this table)
Do districts with smaller classes have higher test scores?
Scatterplot of test score v. student-teacher ratio
How do we answer this question with data?
Compare districts with small (STR < 20) and large (STR ≥ 20) class sizes
2. Hypothesis testing
Compute the difference-of-means t-statistic:
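A minimal sketch of the difference-of-means t-statistic, assuming the usual form t = (Ȳ_small − Ȳ_large) / SE with SE = sqrt(s_small²/n_small + s_large²/n_large). The function name and the summary statistics below are hypothetical, chosen only to make the arithmetic easy to follow; they are not the California values.

```python
import math

def diff_of_means_t(mean_s, sd_s, n_s, mean_l, sd_l, n_l):
    """Return (t, se) for the difference between two group means.

    se = sqrt(sd_s^2 / n_s + sd_l^2 / n_l), t = (mean_s - mean_l) / se.
    """
    se = math.sqrt(sd_s ** 2 / n_s + sd_l ** 2 / n_l)
    return (mean_s - mean_l) / se, se

# Hypothetical summary statistics, for illustration only.
t_stat, se = diff_of_means_t(mean_s=10.0, sd_s=3.0, n_s=100,
                             mean_l=9.0, sd_l=4.0, n_l=100)
print(t_stat, se)
```

A value of |t| above 1.96 would reject the hypothesis of equal means at the 5% level in a two-sided test.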
3. Confidence interval
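A 95% confidence interval for the difference in means can be sketched as the estimate plus or minus 1.96 standard errors. The helper name `diff_ci` and the inputs are hypothetical illustration values, not results from the data set.

```python
def diff_ci(diff, se, z=1.96):
    """Return (lower, upper) for diff +/- z * se; z = 1.96 gives a 95% interval."""
    return diff - z * se, diff + z * se

# Hypothetical estimate and standard error, for illustration only.
lower, upper = diff_ci(diff=1.0, se=0.5)

# The interval excludes 0 exactly when the two-sided 5% test rejects a zero difference.
excludes_zero = lower > 0 or upper < 0
print(lower, upper, excludes_zero)
```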
Review of Statistical Theory
(a) Population, random variable, and distribution
Population distribution of Y
(b) Characteristics (a.k.a. moments) of a population distribution
Flip a coin twice and let Y be the number of heads
E(Y) = 0*(0.25) + 1*(0.50) + 2*(0.25)
= 0 + 0.50 + 0.50 = 1
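The same expectation can be computed directly from the probability table. This sketch uses the distribution from the slide (0, 1, or 2 heads with probabilities 0.25, 0.50, 0.25) and also computes the variance, the second moment about the mean; the variable names are illustrative.

```python
# Distribution of Y = number of heads in two flips of a fair coin.
dist = {0: 0.25, 1: 0.50, 2: 0.25}

# E(Y): probability-weighted average of the outcomes.
mean = sum(y * p for y, p in dist.items())

# var(Y) = E[(Y - E(Y))^2]: probability-weighted squared deviations.
variance = sum((y - mean) ** 2 * p for y, p in dist.items())

print(mean, variance)
```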
Joint Probability
Example: The relationship between commute time and rain
Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)
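A conditional probability is the joint probability divided by the marginal probability of the conditioning event. The sketch below applies that to a rain and commute example; the joint probabilities and the function names are hypothetical, made up for illustration.

```python
# Hypothetical joint distribution Pr(X = x, Y = y); the numbers are illustrative only.
# X = 1 if it rains, 0 otherwise; Y = 1 if the commute is long, 0 otherwise.
joint = {(1, 1): 0.15, (0, 1): 0.07, (1, 0): 0.15, (0, 0): 0.63}

def marginal_y(joint, y):
    """Pr(Y = y): sum the joint probabilities over all values of X."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def conditional_x_given_y(joint, x, y):
    """Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)."""
    return joint[(x, y)] / marginal_y(joint, y)

p_rain_given_long = conditional_x_given_y(joint, x=1, y=1)
print(p_rain_given_long)
```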
Joint Independence
Two random variables, X and Y, are independently distributed if, for all values x and y, Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y)
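Independence can be checked numerically by comparing every joint probability with the product of the two marginals, Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y). This is a sketch for small discrete tables; `is_independent` and both example tables are hypothetical.

```python
import math
from itertools import product

def is_independent(joint, tol=1e-12):
    """True if every joint probability equals the product of its marginals."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    px = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        math.isclose(joint.get((x, y), 0.0), px[x] * py[y], abs_tol=tol)
        for x, y in product(xs, ys)
    )

# Two separate fair coin flips: every cell is 0.5 * 0.5.
two_coin_flips = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}

# Y always equals X, so knowing one reveals the other.
perfectly_dependent = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

print(is_independent(two_coin_flips), is_independent(perfectly_dependent))
```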
The correlation coefficient measures linear association
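To see why the word "linear" matters, this sketch computes the correlation for an exactly linear relationship and for an exactly quadratic one on symmetric x values. The `correlation` helper is written out by hand for illustration and the data are made up.

```python
import math

def correlation(xs, ys):
    """corr(X, Y) = cov(X, Y) / (sd(X) * sd(Y)); the 1/n factors cancel."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

xs = [-2, -1, 0, 1, 2]
corr_linear = correlation(xs, [2 * x + 1 for x in xs])   # Y is a linear function of X
corr_quadratic = correlation(xs, [x ** 2 for x in xs])   # Y is determined by X, but not linearly
print(corr_linear, corr_quadratic)
```

The quadratic case has zero correlation even though Y is completely determined by X, so a zero correlation does not imply independence.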
(c) Conditional distributions and conditional means
Conditional mean, ctd.
(d) Distribution of a sample of data drawn randomly from a population: Y1, …, Yn
Distribution of Y1, …, Yn under simple random sampling
(a) The sampling distribution of Ȳ
The sampling distribution of Ȳ, ctd.
The sampling distribution of Ȳ when Y is Bernoulli (p = 0.78):
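For Bernoulli data the sampling distribution of the sample mean can be written down exactly, because the number of ones in n draws is Binomial(n, p). The sketch below tabulates it for n = 2 and p = 0.78; the function name is hypothetical and i.i.d. sampling is assumed.

```python
from math import comb

def sampling_dist_of_mean(n, p):
    """Exact distribution of the mean of n i.i.d. Bernoulli(p) draws.

    The sum of the draws is Binomial(n, p), so
    Pr(mean = k/n) = C(n, k) * p^k * (1 - p)^(n - k).
    """
    return {k / n: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

dist_n2 = sampling_dist_of_mean(n=2, p=0.78)
for value, prob in dist_n2.items():
    print(value, round(prob, 4))
```

With n = 2 the sample mean can only equal 0, 0.5, or 1; larger n gives a finer grid of values concentrated around p.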
Things we want to know about the sampling distribution:
The mean and variance of the sampling distribution of Ȳ
Mean and variance of sampling distribution of Ȳ, ctd.
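A sketch of the standard derivation, assuming Y_1, …, Y_n are i.i.d. with mean \mu_Y and variance \sigma_Y^2 (the notation is assumed here, since the slide body is not reproduced). Independence makes every covariance term zero, which is what reduces the variance to \sigma_Y^2 / n.

```latex
\begin{align*}
E(\bar{Y}) &= E\Big(\frac{1}{n}\sum_{i=1}^{n} Y_i\Big)
            = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \mu_Y, \\
\operatorname{var}(\bar{Y}) &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{var}(Y_i)
            + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j \ne i}\operatorname{cov}(Y_i, Y_j)
            = \frac{1}{n^2}\, n\,\sigma_Y^2 = \frac{\sigma_Y^2}{n}.
\end{align*}
```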
The sampling distribution of Ȳ when n is large
The Law of Large Numbers:
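The law of large numbers says the sample mean gets close to the population mean as n grows. A small simulation sketch, assuming Bernoulli draws with p = 0.78 as in the running example; the seed, the sample sizes, and the function name are arbitrary choices.

```python
import random

def sample_mean_bernoulli(n, p, rng):
    """Mean of n independent Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is reproducible
p = 0.78
means = {n: sample_mean_bernoulli(n, p, rng) for n in (10, 100, 10_000, 100_000)}

for n, m in means.items():
    print(n, m, abs(m - p))  # the gap to p tends to shrink as n grows
```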
The Central Limit Theorem (CLT):
Sampling distribution of Ȳ when Y is Bernoulli, p = 0.78:
Same example: sampling distribution of (Ȳ − E(Ȳ)) / sqrt(var(Ȳ)):
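The central limit theorem can be illustrated by simulating the standardized sample mean, (Ȳ − E(Ȳ)) / sqrt(var(Ȳ)), and checking that about 95% of draws fall within ±1.96, as they would for a standard normal. This is a sketch under assumed settings (n = 400, 2000 replications, fixed seed); the helper name is hypothetical.

```python
import math
import random

def standardized_mean(n, p, rng):
    """Draw one sample mean of n Bernoulli(p) values and standardize it."""
    ybar = sum(rng.random() < p for _ in range(n)) / n
    # E(Ybar) = p and var(Ybar) = p * (1 - p) / n for Bernoulli data.
    return (ybar - p) / math.sqrt(p * (1 - p) / n)

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [standardized_mean(n=400, p=0.78, rng=rng) for _ in range(2000)]

share_within_196 = sum(abs(z) <= 1.96 for z in draws) / len(draws)
print(share_within_196)
```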
Summary: The Sampling Distribution of Ȳ
(b) Why Use Ȳ To Estimate μ_Y?
Language of Hypothesis Testing
Test statistic = t-statistic:
Language of Hypothesis Testing, ctd.
p-value
Probability of drawing a statistic (e.g. Ȳ) at least as adverse to the null hypothesis as the value computed with your data, assuming the null hypothesis is true
The smallest significance level at which you can reject the null hypothesis
Calculating the p-value with σ_Y known:
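A minimal sketch of the calculation, assuming a two-sided alternative and a sample large enough for the normal approximation: standardize the sample mean under the null and take the two-tail normal probability. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def p_value_sigma_known(ybar, mu0, sigma, n):
    """Two-sided p-value for H0: mean = mu0 when the population sd is known.

    t = (ybar - mu0) / (sigma / sqrt(n)); p = 2 * Phi(-|t|).
    """
    t = (ybar - mu0) / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(-abs(t))

# Hypothetical inputs: the standardized statistic is about 1.96.
p_val = p_value_sigma_known(ybar=10.98, mu0=10.0, sigma=5.0, n=100)
print(p_val)
```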
Estimator of the variance of Y:
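A sketch of the sample variance, assuming the usual estimator with the n − 1 divisor, s² = (1/(n − 1)) Σ (Y_i − Ȳ)², together with the standard error of the sample mean it implies. The data values and names are made up for illustration.

```python
import math

def sample_variance(ys):
    """s^2 = sum((y - ybar)^2) / (n - 1)."""
    n = len(ys)
    ybar = sum(ys) / n
    return sum((y - ybar) ** 2 for y in ys) / (n - 1)

ys = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
s2 = sample_variance(ys)

# Standard error of the sample mean: s / sqrt(n).
se_ybar = math.sqrt(s2 / len(ys))
print(s2, se_ybar)
```

Replacing the unknown population standard deviation with s in the t-statistic is what makes the p-value computable from the sample alone.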
What is the link between the p-value and the significance level?
Common Critical Values
One-Tail Test                        Two-Tail Test
1-α     α       Critical Value       1-α     α/2     Critical Value
0.90    0.10    1.282                0.90    0.05    1.645
0.95    0.05    1.645                0.95    0.025   1.960
0.99    0.01    2.326                0.99    0.005   2.576
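The critical values in the table are quantiles of the standard normal distribution, which can be reproduced with the inverse CDF. This is a sketch using Python's standard library; `critical_value` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_value(alpha, two_tailed):
    """Standard normal critical value for significance level alpha.

    A two-tailed test puts alpha / 2 in each tail; a one-tailed test puts alpha in one.
    """
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

alphas = (0.10, 0.05, 0.01)
one_tail = {a: round(critical_value(a, two_tailed=False), 3) for a in alphas}
two_tail = {a: round(critical_value(a, two_tailed=True), 3) for a in alphas}
print(one_tail)
print(two_tail)
```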