Bau Math381 Slides Part14

Introduction to Statistics
Statistics versus Probability

Probability and statistics are related areas of mathematics which concern themselves
with analyzing the relative frequency of events.
Probability deals with predicting the likelihood of future events, while statistics
involves the analysis of the frequency of past events.
Probability is primarily a theoretical branch of mathematics, which studies the
consequences of mathematical definitions. Statistics is primarily an applied
branch of mathematics, which tries to make sense of observations in the real
world.
What is Statistics?
Statistics is a collection of methods for planning experiments, obtaining
data, and then organizing, summarizing, presenting, analyzing, interpreting, and
drawing conclusions based on the data.
Keep in mind:
Statistical inferences are no more accurate than the data they are based on
(weakest link).
Statistical results should be interpreted by one who understands the methods
used as well as the subject matter.
Individuals and Variables

Individuals are the people or objects included in the study.
A variable is a characteristic of the individual to be measured or observed.
Example
If we want to study the people who have climbed Mt. Everest, then the
individuals in the study are the people who made it to the top. The variables to
measure or observe might be the height, weight, race, gender, income, etc. of the
individuals who made it to the top of Mt. Everest.
Variables: Quantitative vs. Qualitative

A quantitative variable has a value or numerical measurement for which operations
such as addition or averaging make sense.
A qualitative variable is described by placing the individual into a category or group
such as male or female.
Random Samples: Simple Random Sampling

The outcome of a statistical experiment may be recorded either
as a numerical value or as a descriptive representation.
When a pair of dice is tossed and the total is the outcome
of interest, we record a numerical value.
When the students of a certain school are given blood
tests and the type of blood is of interest, a
descriptive representation might be the most useful
(a person's blood can be classified in 8 ways: AB, A, B, or O,
each with a plus or minus sign).
Definitions
1. Population consists of the totality of the observations with
which we are concerned
2. Sample is a subset of a population
3. If X1, X2, ..., Xn represent a random sample of size n, then the
sample mean is defined by the statistic:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
4. If X1, X2, ..., Xn represent a random sample of size n, then the
sample variance is defined by the statistic:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n(n-1)}\left[ n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 \right]$$
Definitions
5. The sample standard deviation, denoted by S, is the
positive square root of the sample variance
6. The mode of a data set is the value that occurs most
frequently
7. The median is the central value of an ordered distribution
Order the data from smallest to largest

For an odd number of data,
Median = middle data value
For an even number of data,
Median = (sum of two middle values)/2
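The definitions above translate directly into code. The following is a minimal Python sketch, not part of the original slides; the function names and the five-value sample are illustrative assumptions. It also checks that the two forms of the sample variance agree.

```python
from collections import Counter
from math import sqrt


def sample_mean(xs):
    """Definition 3: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Definition 4: squared deviations from the mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_variance_shortcut(xs):
    """Equivalent form: [n * sum(x^2) - (sum x)^2] / [n (n - 1)]."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))


def sample_std(xs):
    """Definition 5: positive square root of the sample variance."""
    return sqrt(sample_variance(xs))


def modes(xs):
    """Definition 6: every value that attains the highest frequency."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def median(xs):
    """Definition 7: middle value of the ordered data (average of two if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


data = [2, 4, 4, 5, 10]  # hypothetical sample, chosen only for illustration
print(sample_mean(data), sample_variance(data), sample_std(data))
print(modes(data), median(data))
```

Returning a list from `modes` is a design choice for this sketch: a data set can have more than one mode, so a single return value would hide ties.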
Example 01:
The lengths of time, in minutes, that 10 patients waited in a
doctor's office before receiving treatment were recorded as
follows: 5, 11, 9, 5, 10, 15, 6, 10, 5, and 10.
Treating the data as a random sample, find:
a) The mean
b) The median
c) The mode
d) The variance
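a) Sample mean: $\bar{x} = \frac{1}{10}(5 + 11 + 9 + 5 + 10 + 15 + 6 + 10 + 5 + 10) = \frac{86}{10} = 8.6$
b) Ordered sample: 5 5 5 6 9 10 10 10 11 15
Median $= \frac{9 + 10}{2} = 9.5$
c) The modes are 5 and 10.
d) $s^2 = \frac{1}{10 - 1}\sum_{i=1}^{10} (x_i - \bar{x})^2 = 10.93$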
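The answers to Example 01 can be cross-checked with Python's standard `statistics` module. This is a sketch added for verification, not part of the original slides; the variable names are assumptions.

```python
import statistics

# Waiting times in minutes, copied from Example 01.
waits = [5, 11, 9, 5, 10, 15, 6, 10, 5, 10]

mean = statistics.mean(waits)
median = statistics.median(waits)
modes = statistics.multimode(waits)    # all values tied for the highest count
variance = statistics.variance(waits)  # sample variance (divides by n - 1)

print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("variance:", round(variance, 2))
```

Note that `statistics.variance` uses the n - 1 divisor, matching the sample variance defined earlier, whereas `statistics.pvariance` divides by n.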
Example 02:
The following measurements were recorded for the drying time,
in hours, of a certain brand of Latex paint.
3.4 2.5 4.8 2.9 3.6
2.8 3.3 5.6 3.7 2.8
4.4 4.0 5.2 3.0 4.8
a) Calculate the sample mean, sample median, mode, and
sample variance.
Mean: $\bar{x} = \frac{1}{15}(3.4 + 2.5 + \cdots + 4.8) = 3.787$
Median = 3.6
Modes = 2.8 and 4.8
$s^2 = \frac{1}{15 - 1}\sum_{i=1}^{15} (x_i - \bar{x})^2 = 0.9427$
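As a worked check of the variance in Example 02, the computational form of the sample variance can be applied to the 15 drying times. The intermediate sums below were computed from the listed data and are not shown on the original slide: $\sum x_i = 56.8$ and $\sum x_i^2 = 228.28$.

$$s^2 = \frac{n\sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)} = \frac{(15)(228.28) - (56.8)^2}{(15)(14)} = \frac{3424.2 - 3226.24}{210} = \frac{197.96}{210} \approx 0.943$$

The corresponding sample standard deviation is $s \approx 0.971$ hours.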
Sampling Distribution
The probability distribution of a statistic is called a
sampling distribution
The sampling distribution of a statistic depends on the size
of the population, the size of the samples, and the method of
choosing the samples.
1. Sampling Distribution of Means
The sampling distribution of $\bar{X}$ with sample size n is the
distribution that results when an experiment is conducted
over and over and the many values of $\bar{X}$ are recorded.
This sampling distribution describes the variability of
sample averages around the population mean.
Sampling Distribution of Means

Suppose that a random sample of n observations is taken
from a normal population with mean $\mu$ and variance $\sigma^2$.
Each observation $X_i$ of the random sample will then have the
same normal distribution as the population being sampled, so
we conclude that the sample mean
$$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
has a normal distribution with mean
$$\mu_{\bar{X}} = \frac{1}{n}(\mu + \mu + \cdots + \mu) = \mu$$
and variance
$$\sigma^2_{\bar{X}} = \frac{1}{n^2}(\sigma^2 + \sigma^2 + \cdots + \sigma^2) = \frac{\sigma^2}{n}$$
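The result that the sample mean has mean $\mu$ and variance $\sigma^2/n$ can be illustrated by simulation. The sketch below is not from the original slides; the population parameters, sample size, repetition count, and seed are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # hypothetical normal population
N = 25                  # size of each sample
REPS = 20_000           # number of repeated samples

# Draw REPS samples of size N and record the mean of each one.
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

print(f"mean of sample means:     {mean_of_means:.3f} (theory: {MU})")
print(f"variance of sample means: {var_of_means:.3f} (theory: {SIGMA**2 / N})")
```

With these assumed values the theoretical variance of the sample mean is 100/25 = 4, so the simulated variance should land close to 4 rather than close to the population variance of 100.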
Central Limit Theorem

If $\bar{X}$ is the mean of a random sample of size n taken from a
population with mean $\mu$ and finite variance $\sigma^2$, then the
limiting form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
as $n \to \infty$ is the standard normal distribution.
If $n \ge 30$, the normal approximation for $\bar{X}$ will generally be good.
If n < 30, the approximation is good only if the population is
not too different from a normal distribution.
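A small simulation can make the theorem concrete for a population that is clearly not normal. The sketch below is an illustration added here, not part of the slides; it assumes an exponential population with rate 1 (mean 1, standard deviation 1), n = 30, and an arbitrary seed. It standardizes each sample mean and checks how often Z falls within ±1.96, which happens with probability 0.95 for a standard normal variable.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, so mean = 1 and std dev = 1.
MU = 1.0
SIGMA = 1.0
N = 30          # sample size at the rule-of-thumb threshold
REPS = 20_000   # number of repeated samples


def standardized_mean(n):
    """Draw one sample of size n and return Z = (xbar - mu) / (sigma / sqrt(n))."""
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    return (xbar - MU) / (SIGMA / sqrt(n))


zs = [standardized_mean(N) for _ in range(REPS)]
share_within = sum(abs(z) < 1.96 for z in zs) / REPS

print(f"share of |Z| < 1.96: {share_within:.3f} (standard normal: 0.950)")
```

The match is approximate rather than exact at n = 30 because the exponential distribution is strongly skewed; increasing `N` should move the share closer to 0.95.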
Example 03:
An electrical firm manufactures light bulbs that have a length of
life that is approximately normally distributed, with mean equal
to 800 hours and a standard deviation of 40 hours. Find the
probability that a random sample of 16 bulbs will have an
average life of less than 775 hours.
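The sampling distribution of $\bar{X}$ will be approximately normal
with
$$\mu_{\bar{X}} = 800 \quad\text{and}\quad \sigma_{\bar{X}} = \frac{40}{\sqrt{16}} = 10$$
The value $\bar{x} = 775$ corresponds to
$$z = \frac{775 - 800}{10} = -2.5$$
Therefore $P(\bar{X} < 775) = P(Z < -2.5) = 0.0062$.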
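The table lookup in Example 03 can be reproduced with `statistics.NormalDist` from the Python standard library. This is a verification sketch added here, with variable names chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800.0, 40.0, 16  # values given in Example 03

std_error = sigma / sqrt(n)     # standard deviation of the sample mean
z = (775 - mu) / std_error      # standardized value of 775 hours
p = NormalDist().cdf(z)         # P(Z < z) for a standard normal variable

print(f"standard error = {std_error}, z = {z}, P = {p:.4f}")
```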
Sampling Distribution of the difference between two averages

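If independent samples of size $n_1$ and $n_2$ are drawn at random
from two populations, discrete or continuous, with means $\mu_1$
and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the
sampling distribution of the difference of means, $\bar{X}_1 - \bar{X}_2$, is
approximately normally distributed with mean and variance
given by
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \quad\text{and}\quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
Hence
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
is approximately a standard normal variable.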
Example 04:
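Two independent experiments are being run in which two
different types of paint are compared. Eighteen specimens are
painted using type A, and the drying time, in hours, is recorded
for each. The same is done with type B. The population standard
deviations are both known to be 1.0.
Assuming that the mean drying time is equal for the two types
of paint, find $P(\bar{X}_A - \bar{X}_B > 1.0)$, where $\bar{X}_A$ and $\bar{X}_B$ are the average
drying times for samples of size $n_A = n_B = 18$.
The sampling distribution of $\bar{X}_A - \bar{X}_B$ is approximately normal with
$$\mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 0 \quad\text{and}\quad \sigma^2_{\bar{X}_A - \bar{X}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9}$$
$$z = \frac{1 - (\mu_A - \mu_B)}{\sqrt{1/9}} = 3.0$$
so $P(Z > 3.0) = 1 - P(Z < 3.0) = 1 - 0.9987 = 0.0013$.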
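The calculation in Example 04 follows a pattern that can be wrapped in a small helper. The sketch below is an addition for illustration; `prob_diff_exceeds` is a hypothetical function name, and the equal means are passed as 0.0 only because the example assumes their difference is zero.

```python
from math import sqrt
from statistics import NormalDist


def prob_diff_exceeds(d, mu1, mu2, sigma1, sigma2, n1, n2):
    """Return (z, P(Xbar1 - Xbar2 > d)) using the normal approximation."""
    mean_diff = mu1 - mu2
    sd_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (d - mean_diff) / sd_diff
    return z, 1.0 - NormalDist().cdf(z)


# Example 04: equal means, both standard deviations 1.0, samples of size 18.
z, p = prob_diff_exceeds(1.0, 0.0, 0.0, 1.0, 1.0, 18, 18)
print(f"z = {z:.1f}, P = {p:.4f}")
```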
2. Sampling Distribution of $S^2$
If $S^2$ is the variance of a random sample of size n taken from
a normal population having the variance $\sigma^2$, then the
statistic
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$
has a chi-squared distribution with $\nu = n - 1$ degrees of freedom.
Exactly 95% of a chi-squared distribution lies between $\chi^2_{0.975}$ and
$\chi^2_{0.025}$. A $\chi^2$ value falling to the right of $\chi^2_{0.025}$ is not likely to occur
unless our assumed value of $\sigma^2$ is too small. Similarly, a $\chi^2$ value falling
to the left of $\chi^2_{0.975}$ is unlikely unless our assumed value of $\sigma^2$ is too
large.
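The chi-squared behavior of the statistic (n - 1)S²/σ² can be illustrated by simulation. The sketch below is not from the slides; the population parameters, sample size, and seed are arbitrary assumptions. It relies on the standard fact that a chi-squared distribution with ν degrees of freedom has mean ν and variance 2ν.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

MU, SIGMA = 0.0, 2.0  # hypothetical normal population
N = 5                 # sample size, so n - 1 = 4 degrees of freedom
REPS = 20_000         # number of repeated samples


def chi_sq_statistic(n):
    """Draw one normal sample of size n and return (n - 1) * S^2 / sigma^2."""
    xs = [random.gauss(MU, SIGMA) for _ in range(n)]
    return (n - 1) * statistics.variance(xs) / SIGMA**2


values = [chi_sq_statistic(N) for _ in range(REPS)]
sim_mean = statistics.fmean(values)
sim_var = statistics.variance(values)

print(f"simulated mean:     {sim_mean:.2f} (theory: {N - 1})")
print(f"simulated variance: {sim_var:.2f} (theory: {2 * (N - 1)})")
```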
Example 05:
A manufacturer of car batteries guarantees that his batteries
will last, on average, 3 years with a standard deviation of 1
year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5,
and 4.2 years, should the manufacturer still be convinced that his
batteries have a standard deviation of 1 year? Assume that the
battery lifetime follows a normal distribution.
The sample variance is
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n(n-1)}\left[ n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 \right] = \frac{(5)(48.26) - (15)^2}{(5)(4)} = 0.815$$
Then
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{(4)(0.815)}{1} = 3.26$$
is a value from a chi-squared distribution
with 4 degrees of freedom.
Since 95% of the $\chi^2$ values with 4 degrees of freedom fall
between 0.484 and 11.143, the computed value with $\sigma^2 = 1$ is
reasonable, and therefore the manufacturer has no reason to
suspect that the standard deviation is other than 1 year.
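The steps of Example 05 can be retraced in a few lines of Python. This is a sketch for checking the arithmetic, not part of the original slides; the variable names are assumptions, and the bounds 0.484 and 11.143 are simply the table values quoted in the example rather than values computed here.

```python
import statistics

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]  # battery lifetimes in years
sigma0_sq = 1.0                        # claimed variance (std dev of 1 year)
lower, upper = 0.484, 11.143           # 95% bounds for 4 df, as quoted in the example

n = len(lifetimes)
s_sq = statistics.variance(lifetimes)  # sample variance (divides by n - 1)
chi_sq = (n - 1) * s_sq / sigma0_sq    # test statistic with n - 1 = 4 df
plausible = lower <= chi_sq <= upper   # True if the claim is not contradicted

print(f"s^2 = {s_sq:.3f}, chi^2 = {chi_sq:.2f}, claim plausible: {plausible}")
```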
