006699414
Data Driven Decision Making
[Figure: histogram of weights. x-axis: Weights, 80 to 420; y-axis ticks: 0 to 6]
From this exercise, I understand that we can use Excel functions to randomly select sample groups,
find the mean of each group, and build a sampling distribution of those means. Even though the population
distribution of the 5000 sumo wrestlers and jockeys is bimodal, the distribution of the sample means is
approximately normal. The CLT states that as n gets larger, the sampling distribution of the mean approaches
a normal distribution; at the same time, the standard error of the mean becomes smaller.
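The sampling exercise described above can also be sketched in Python instead of Excel. This is only an illustration: the group means and spreads used to build the 5000-person population are hypothetical values chosen to give a two-peaked distribution, not the data from the assignment, and `sample_means` is a helper name introduced here.

```python
import random
import statistics


def sample_means(population, n, num_samples, rng):
    """Draw num_samples random samples of size n (with replacement)
    and return the mean of each sample."""
    return [statistics.fmean(rng.choices(population, k=n))
            for _ in range(num_samples)]


rng = random.Random(0)

# Hypothetical two-group population of 5000 weights (assumed values).
population = ([rng.gauss(115, 10) for _ in range(2500)]     # "jockeys"
              + [rng.gauss(330, 40) for _ in range(2500)])  # "sumo wrestlers"
sigma = statistics.pstdev(population)

for n in (5, 50):
    means = sample_means(population, n, 1000, rng)
    # The spread of the sample means should be close to sigma / sqrt(n).
    print(n, round(statistics.stdev(means), 2), round(sigma / n ** 0.5, 2))
```

Plotting a histogram of `means` for each n would show the two peaks of the population fading into a single bell shape as n grows.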
To illustrate the CLT, I put the numbers into the equation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and compare n = 50 and n = 1000:
If n = 50, $\sigma_{\bar{x}} = 1.62/\sqrt{50} = 0.23$
If n = 1000, $\sigma_{\bar{x}} = 1.62/\sqrt{1000} = 0.05$
$0.05 < 0.23$
Therefore, when n gets larger, the standard error of the mean becomes smaller, and by the CLT the sampling
distribution of the means becomes closer to normal. The calculation is consistent with the CLT.
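The same comparison can be checked with a few lines of Python. This is a minimal sketch that uses the value 1.62 from the calculation above; the function name `standard_error` is introduced here only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


sigma = 1.62  # standard deviation used in the calculation above
for n in (50, 1000):
    print(f"n = {n}: standard error = {standard_error(sigma, n):.2f}")
# n = 50: standard error = 0.23
# n = 1000: standard error = 0.05
```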
A confidence interval is a range of values that we are fairly sure contains the true value. For the values the
professor gave me, I have $\bar{x} = 0$, $\alpha = 0.05$ (which means z = 1.96), $\sigma = 1$, $n_1 = 1$, and $n_2 = 100$.
To find the confidence interval, I put the numbers into the equation $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$. For test #1 this gives
$0 \pm 1.96 \cdot \frac{1}{\sqrt{1}} = \pm 1.96$, and for test #2 it gives $0 \pm 1.96 \cdot \frac{1}{\sqrt{100}} = \pm 0.196$.
So for test #1 we are 95% confident that the true mean lies between -1.96 and 1.96, while for test #2 the same
confidence applies to the much narrower range from -0.196 to 0.196.
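The two intervals can be reproduced with Python's standard library. This is a sketch under the same assumptions as the text (known $\sigma$, normal critical value); the helper name `z_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, alpha=0.05):
    """Two-sided z interval: mean +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin


for n in (1, 100):
    low, high = z_confidence_interval(0, 1, n)
    print(f"n = {n}: ({low:.3f}, {high:.3f})")
```

Increasing n by a factor of 100 shrinks the interval width by a factor of 10, because the margin depends on the square root of n.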
In my simulator, with the confidence level fixed at 95%, the sample mean of test #1 fell inside the CI 100% of
the time, and that of test #2 fell inside 96% of the time. In conclusion, we should use a larger sample to
estimate the true value more precisely.
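A coverage check of this kind can be sketched as follows. This is not the simulator used in the assignment; it is an assumed setup that draws samples from a normal population with mean 0 and standard deviation 1, builds a 95% interval around each sample mean, and counts how often the interval contains the true mean. The function name `coverage` and the trial counts are illustrative choices.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(n, trials, rng, mu=0.0, sigma=1.0, alpha=0.05):
    """Fraction of simulated z intervals that contain the true mean mu."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rng = random.Random(42)
for n in (1, 100):
    print(f"n = {n}: coverage = {coverage(n, 2000, rng):.3f}")
```

With many trials the coverage should settle near 95% for any n; a small number of trials, as in a classroom simulator, can easily show values such as 100% or 96%.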
[Figure panels: Distribution of Sample Data; Distribution of Test Statistics]
A hypothesis test evaluates two mutually exclusive statements about a population to determine which one is
better supported by the sample data. In this demo, the null hypothesis is that the population mean is 0
($\mu = 0$) and the alternative hypothesis is that it is not equal to 0 ($\mu \neq 0$).
When $\alpha = 0.05$, z-critical = 1.96, and z-calc = $\frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{-0.2 - 0}{2.064/\sqrt{25}} = -0.48$.
The decision rule is: if |z-calc| > |z-critical|, reject the null hypothesis; if |z-calc| ≤ |z-critical|, fail
to reject it. Here |-0.48| < |1.96|, so I fail to reject the null hypothesis that the mean is 0. More samples
give a more accurate result.
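The test above can be worked through in Python as a check on the arithmetic. This is a minimal sketch using the numbers from the text ($\bar{x} = -0.2$, $\mu = 0$, s = 2.064, n = 25, z-critical = 1.96); the function name `z_statistic` is introduced here for illustration.

```python
import math


def z_statistic(xbar, mu0, s, n):
    """Test statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


z_critical = 1.96                       # two-sided critical value for alpha = 0.05
z_calc = z_statistic(-0.2, 0, 2.064, 25)

if abs(z_calc) > z_critical:
    decision = "reject H0"
else:
    decision = "fail to reject H0"

print(f"z-calc = {z_calc:.2f}, decision: {decision}")
# z-calc = -0.48, decision: fail to reject H0
```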