05/02/16
Math 311
The following data sets were generated using the random data generator in Minitab. Part 1
of this lab demonstrates the behavior of these randomly generated ‘nested’ raw data sets and their
tendency toward normality as the sample size increases. We begin by examining samples of size 10,
50, 100, and 1000. We then compare and contrast these data sets based on the type of
distribution generated. The first distribution we analyze is the standard normal distribution, with mean
µ=0 and standard deviation σ=1. The second is another normal distribution, this time with
mean µ=2 and standard deviation σ=10. Finally, we examine the behavior of
a binomial distribution with 10 trials and probability of success p=.5. In Part 2, we determine the
probabilities of hypothetical events from three types of probability distributions: the normal,
binomial, and uniform distributions.
Part 1)
A) Using a standard normal distribution (n_1), with mean µ=0 and standard deviation σ=1, we
generate random samples of size 10, 50, 100, and 1000. Each sample is denoted by
n_1=k, where k is the sample size. Each larger sample contains all the observations of the
previous, smaller sample; that is, each sample is nested in the larger ones. Using the
random data generator, we obtain the following histograms and descriptive
statistics:
[Figure: histograms of the four nested samples, panels n_1=10, n_1=50, n_1=100, and n_1=1000; vertical axis: Frequency]
LAB 3 Joey Martinez
Descriptive Statistics: n_1 =10, n_1 =50, n_1 =100, n_1 =1000
The most obvious result of our random data generation is the change in the sample mean
as the sample size increases. The samples have the following means: x̄10=-.622, x̄50=-.004,
x̄100=-.003, and x̄1000=-.0063, where x̄n represents the mean of the sample of size n. Now consider the
absolute differences between the sample means and the population mean: |µ-x̄10|=.622,
|µ-x̄50|=.004, |µ-x̄100|=.003, and |µ-x̄1000|=.0063. The difference drops sharply once we move past the
smallest sample and then stays close to 0, although not strictly monotonically: the value for n=1000 is
slightly larger than the values for n=50 and n=100. The sample of size 10 is by far the most disparate of
the four; as we generate larger samples, the distributions (as seen in the histograms above) become more
normal and the sample means stay close to the true population value µ=0. We also note that the
standard deviation begins to shrink and the median of the samples tends toward the value x=0 as
the sample size increases.
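The nesting procedure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the lab itself used Minitab, the function name `nested_normal_samples` and the seed are hypothetical choices, and the numbers it prints will not match the Minitab output reported here.

```python
import random
import statistics

def nested_normal_samples(mu, sigma, sizes, seed=311):
    """Draw max(sizes) values once, then return the first k values for each k.

    Taking prefixes of a single draw makes every smaller sample a subset of
    every larger one, which mirrors the 'nested' samples in this lab.
    """
    rng = random.Random(seed)  # arbitrary seed, used only for repeatability
    draws = [rng.gauss(mu, sigma) for _ in range(max(sizes))]
    return {k: draws[:k] for k in sizes}

samples = nested_normal_samples(mu=0.0, sigma=1.0, sizes=[10, 50, 100, 1000])
for k, data in samples.items():
    mean = statistics.mean(data)
    print(
        f"n={k:5d}  mean={mean:8.4f}  "
        f"median={statistics.median(data):8.4f}  "
        f"stdev={statistics.stdev(data):7.4f}  "
        f"|mu - mean|={abs(0.0 - mean):.4f}"
    )
```

Changing the seed gives a different set of nested samples, which is a quick way to see how much the sample of size 10 varies from run to run compared with the sample of size 1000.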
Next, we perform the same random data generation and ‘nesting’ procedure, except that we
now sample from a normal distribution with mean µ=2 and
standard deviation σ=10. Our new samples will be denoted by n_2=k’, where k’ is the
sample size. The results are as follows:
[Figure: histograms of the four nested samples, panels n_2=10, n_2=50, n_2=100, and n_2=1000; vertical axis: Frequency]
Descriptive Statistics: n_2 =10, n_2 =50, n_2 =100, n_2= 1000
As with the previous set of data, we make note of the sample means. The generated samples of
increasing size have the following sample means: x̄10=3.86, x̄50=.60, x̄100=1.181, and x̄1000=1.610, where
x̄n denotes the mean of the sample of size n. We also consider the absolute differences between the
population mean and the sample means: |µ-x̄10|=1.86, |µ-x̄50|=1.40, |µ-x̄100|=.819, and |µ-x̄1000|=.390.
As the samples increase in size, the absolute differences between the sample means and the true mean
approach 0. Hence, as before, the increasing sample sizes correspond with our sample
distributions tending toward the normal distribution with the assigned parameters µ=2 and σ=10.
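One way to judge whether differences of this size are reasonable is to compare them with the standard error of the mean, σ/√n, which is the standard deviation of x̄ for n independent draws. The sketch below is an added illustration, not part of the Minitab work; the helper name `standard_error` is hypothetical, and the observed differences are copied from the paragraph above.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean of n independent draws."""
    return sigma / math.sqrt(n)

sigma = 10.0
# Absolute differences |mu - mean| reported above for the n_2 samples.
observed = {10: 1.86, 50: 1.40, 100: 0.819, 1000: 0.390}

for n, diff in observed.items():
    se = standard_error(sigma, n)
    print(
        f"n={n:5d}  sigma/sqrt(n)={se:6.3f}  "
        f"observed |mu - mean|={diff:6.3f}  ratio={diff / se:5.2f}"
    )
```

Each reported difference is of the same order as its standard error, so the shrinking differences are consistent with the 1/√n rate one would expect for sample means.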
The law of large numbers states that, as n→∞, the sample mean x̄ approaches the population
mean. The data above demonstrate this law. Given an initial population with
parameters µ and σ, as we increased our sample sizes, our sample statistics and sample distributions
tended toward the population values and distributions. Consider the following comparisons between
the population distributions and our sample distributions of sizes 10 and 1000, where the sample size N
increases from left to right:
[Figure: two rows of three panels, with the sample size N increasing from left to right toward ∞. Top row: histogram of n_1=10, histogram of n_1=1000, and the density of the standard normal population (axis X). Bottom row: histogram of n_2=10 (Mean 3.864, StDev 6.699, N 10), histogram of n_2=1000 (Mean 1.610, StDev 9.960, N 1000), and the density of the normal population with mean 2 and standard deviation 10 (axis X).]
B) Now we randomly generate samples of size 10, 50, 100, and 1000 from a binomial
distribution with k=10 trials and probability of success p=.5. The observations of each sample
are nested in the samples of larger size, as with the other distributions. We denote the
samples by n_bi=m, where m is the size of the sample. Thus we have the following
histograms and descriptive statistics:
[Figure: histograms of the four nested binomial samples, panels n_bi=10, n_bi=50, n_bi=100, and n_bi=1000; vertical axes: Density or Frequency]
By the above, the differences between the sample distributions are most apparent
between n_bi=10 and n_bi=1000. The distribution of the sample of size 10 has no obvious
characteristics, aside from being roughly symmetric; however, as the sample becomes larger, we notice
that the distributions for sample sizes 50 and 100 approach a normal distribution. In fact, when we
reach the final sample of size 1000, the shape of the distribution is approximately
normal, with mean x̄1000=4.9250≈µ=kp=5 and standard deviation s1000=1.5735≈σ=sqrt(kp(1-p))=sqrt(2.5)≈1.581,
where k=10 is the number of trials and p=.5 is the probability of success, as defined above.
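The population mean kp and standard deviation sqrt(kp(1-p)) can be checked directly, and a simulated sample of size 1000 can be compared against them. The sketch below is an illustration only: the function names and the seed are hypothetical, each binomial value is built as a count of successes in k Bernoulli trials, and the simulated statistics will differ from the Minitab sample.

```python
import math
import random
import statistics

def binomial_moments(k, p):
    """Return the mean k*p and the standard deviation sqrt(k*p*(1-p))."""
    return k * p, math.sqrt(k * p * (1 - p))

def binomial_draw(rng, k, p):
    """One binomial outcome: the number of successes in k Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(k))

k, p = 10, 0.5
mu, sigma = binomial_moments(k, p)
print(f"theoretical mean={mu:.3f}  sd={sigma:.4f}")

rng = random.Random(311)  # arbitrary seed, used only for repeatability
sample = [binomial_draw(rng, k, p) for _ in range(1000)]
print(f"simulated   mean={statistics.mean(sample):.3f}  sd={statistics.stdev(sample):.4f}")
```

For k=10 and p=.5 the variance is kp(1-p)=2.5, so the standard deviation is sqrt(2.5), roughly 1.581, which is close to the sample value s1000=1.5735 reported above.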
Part 2)
We now consider scenarios involving the normal, binomial and uniform distributions. We
subdivide this section of the lab into parts a, b and c for the normal, binomial, and uniform distributions,
respectively.
a) Suppose we are in the land of Springfield. Our friend Homer Simpson is at the local tavern
Moe’s enjoying a beer. Homer measures the amount of beer in his 16oz mug and finds it to be
filled with 14oz. Let the amount of beer in a mug at Moe’s be normally distributed, with mean
µ=15oz and standard deviation σ=.75oz. We determine whether Homer is being undercut by
Moe or experiencing a coincidence by finding the probability of being served a beer of 14 oz or
less. Consider the following graph demonstrating the cumulative probability of receiving 14oz or
less:
[Figure: Distribution Plot of Beer Amounts at Moe's; Normal, Mean=15, StDev=0.75; shaded area 0.09121 to the left of x=14; horizontal axis: X=Ounces of Beer, vertical axis: Density]
We find that P(x ≤ 14oz)=.09121 and hence are forced to tell Homer that only about 9% of the beers that
Moe serves are filled with 14oz or less. This suggests that Moe may be singling Homer out and purposefully
under-filling his mug, although a single 14oz pour is not conclusive on its own; perhaps Moe has finally found out
that Homer’s son Bart has been the one crank calling his tavern for all
these years…doh!
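The probability read from the Minitab plot can be reproduced with the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); using Python here is an assumption of this illustration, since the lab itself relied on Minitab.

```python
from statistics import NormalDist

# Beer amounts at Moe's: normal with the parameters stated in the lab.
beer = NormalDist(mu=15.0, sigma=0.75)

# Standardize 14 oz, then evaluate the cumulative probability P(x <= 14).
z = (14.0 - beer.mean) / beer.stdev
p_at_most_14 = beer.cdf(14.0)

print(f"z = {z:.4f}")
print(f"P(x <= 14) = {p_at_most_14:.5f}")
```

The z-score of 14oz is (14-15)/.75, about -1.33, so a 14oz pour sits about one and a third standard deviations below the mean.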
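Unfortunately, this is not the news that Homer had hoped for, and he demands that no more than 5% of mugs be
under-filled. Since the mean is 15oz, we consider an under-filled mug to be any amount under this value. Taking the
5% to lie just below the mean, we obtain a cut-off of 14.91oz: P(x ≤ 14.91oz)=.45, so that P(14.91oz ≤ x ≤ 15oz)=.05,
as demonstrated by the graph below. (If the 5% is instead measured from the bottom of the distribution, the cut-off
is the 5th percentile, 15-1.645(.75)≈13.77oz.)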
[Figure: distribution plot, Normal, Mean=15, StDev=0.75; shaded left-tail area 0.45 below x=14.91; horizontal axis: X=Ounces of Beer, vertical axis: Density]
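Cut-off values like the one in this plot come from the inverse of the normal cumulative distribution function. The sketch below, again an illustration using `statistics.NormalDist` rather than Minitab, computes the value with 45% of pours below it and, for comparison, the value with 5% of pours below it; which of the two counts as "the" 5% cut-off depends on how an under-filled mug is defined.

```python
from statistics import NormalDist

beer = NormalDist(mu=15.0, sigma=0.75)

cutoff_45 = beer.inv_cdf(0.45)  # value with 45% of pours below it
cutoff_05 = beer.inv_cdf(0.05)  # value with 5% of pours below it

# Probability of a pour landing between the 45th percentile and the mean.
between = beer.cdf(15.0) - beer.cdf(cutoff_45)

print(f"45th percentile: {cutoff_45:.2f} oz")
print(f"P(45th percentile <= x <= 15) = {between:.2f}")
print(f" 5th percentile: {cutoff_05:.2f} oz")
```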
b) Suppose we are examining the weather in Ellensburg, WA. We find that during the month of
April the chance of experiencing high winds in Ellensburg on a given day is 36%, on average. To make the
following predictions, we choose 10 random days in the month of April. Since each day’s wind is
independent of the next, the chance of wind is fixed, each day is measured in the same way,
and each day either has high wind or does not, we can apply the binomial distribution to our
situation. We let the binomial distribution have k=10 trials and probability of success p=.36.
1. We begin by finding the probability that exactly 3 days have high wind. This is
determined by the following graph:
[Figure: binomial probability plot with the bar at X=3 marked, probability 0.2462; vertical axis: Probability, horizontal axis: X]
2. Now we are interested in the chance that at least 3 days will have high winds:
[Figure: Distribution Plot; Binomial, n=10, p=0.36; shaded probability 0.7595 for X ≥ 3; vertical axis: Probability, horizontal axis: X]
3. Finally, we wish to find the chance of experiencing high winds for no more than 3
days:
[Figure: Distribution Plot; Binomial, n=10, p=0.36; shaded probability 0.4868 for X ≤ 3; vertical axis: Probability, horizontal axis: X]
We find the chance of experiencing high winds on no more than 3 days to be
P(x≤3)=.4868, or 48.68%, as desired.
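All three wind probabilities follow from the binomial probability mass function P(x)=C(n,x)p^x(1-p)^(n-x). The sketch below evaluates them with the Python standard library; the helper name `binom_pmf` is hypothetical, and n denotes the number of trials, as in the Minitab plot titles.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.36  # 10 April days, 36% chance of high wind on each

p_exactly_3 = binom_pmf(3, n, p)
p_at_most_3 = sum(binom_pmf(x, n, p) for x in range(0, 4))       # x = 0, 1, 2, 3
p_at_least_3 = 1 - sum(binom_pmf(x, n, p) for x in range(0, 3))  # 1 - P(x <= 2)

print(f"P(x = 3)  = {p_exactly_3:.4f}")
print(f"P(x >= 3) = {p_at_least_3:.4f}")
print(f"P(x <= 3) = {p_at_most_3:.4f}")
```

Note that P(x≥3) and P(x≤3) both include the outcome x=3, so the two probabilities sum to 1 plus P(x=3) rather than to 1.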
c) Consider a package delivery service that divides its packages into weight classes. Let the weights of
packages in the 14lbs to 20lbs weight class be uniformly distributed. Suppose we wish to find
the following probabilities.
1. If customers are charged an extra fee for packages weighing between 18 and 20lbs, what is the probability
a customer will have to pay this fee? In other words, we must find the chance of a package
falling between these values. We consider the following distribution:
[Figure: Distribution Plot; Uniform, Lower=14, Upper=20; shaded area 0.3333 between X=18 and X=20; vertical axis: Density, horizontal axis: X]
Thus, by the above distribution plot, we find that the chance is
P(18≤x≤20)=.3333, or about 33.33%.
2. Now we wish to find the probability of a randomly selected package weighing 15lbs or less:
[Figure: Distribution Plot; Uniform, Lower=14, Upper=20; shaded area 0.1667 between X=14 and X=15; vertical axis: Density, horizontal axis: X]
We find the probability of a package weighing 15lbs or less to be P(x≤15)=.1667;
that is, the chance of a package being 15lbs or less is about 16.67%.
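For a uniform distribution on [a, b], the probability of an interval is simply its length divided by b-a, so both package probabilities can be verified by hand or with a short function. The sketch below is an added illustration; the helper name `uniform_prob` is hypothetical.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= x <= hi) for x uniform on [a, b], clipping to the support."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

a, b = 14.0, 20.0  # weight class limits in pounds

p_extra_fee = uniform_prob(18, 20, a, b)  # (20 - 18) / 6
p_at_most_15 = uniform_prob(a, 15, a, b)  # (15 - 14) / 6

print(f"P(18 <= x <= 20) = {p_extra_fee:.4f}")
print(f"P(x <= 15)       = {p_at_most_15:.4f}")
```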