Contents
Chapter 1 What is Statistics?
Chapter 6 Probability
22 Random experiment
23 Sample space
24 Requirements of probabilities
25 Approaches to assigning probabilities
26 Events
27 Joint, marginal and conditional probability
28 Probability rules
Random variables
37 Student t distribution
38 Chi-squared distribution
39 F distribution
49 Sample size
Population - All items of interest to a statistics practitioner (e.g. the shoe sizes of Australians)
Parameter - A descriptive measure of a population (e.g. the mean shoe size of Australians)
Sample - A subset of a population (e.g. the shoe sizes of UNSW students)
Statistic - A descriptive measure of a sample (e.g. the mean shoe size of UNSW students)
Statistical inference
Types of data
Hierarchy of data
Higher-level data can be treated as lower-level data, but not vice versa.
Frequency
1. Frequency distribution - A table that shows the frequency of each outcome
2. Bar chart - A chart that shows the frequency of each outcome
Relative frequency
3. Relative frequency distribution - A table that shows the relative frequency of each outcome
4. Pie chart - A chart that shows the relative frequency of each outcome
Class width = $\frac{\text{Largest observation} - \text{Smallest observation}}{\text{Number of classes}}$
11 Graphical excellence
1. Concise data
2. Clear ideas
3. Multivariate
4. Substance over form
5. No distortion
12 Graphical deception
1.
2.
3.
4.
1.
xi
2.
3.
xi
Median = Middle observation = $x_{(n+1)/2}$
4.
5.
Geometric mean = $\sqrt[n]{(1+r_1)(1+r_2)\cdots(1+r_n)} - 1$
14 Variability
1.
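2. Variance
a. Population variance = $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$
b. Sample variance = $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
c. Shortcut sample variance = $s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$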
3. Standard deviation
a.
b.
4.
|x ix|
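5. Empirical rule (for bell-shaped distributions)
a. Within one standard deviation of the mean: approximately 68% of observations
b. Within two standard deviations of the mean: approximately 95% of observations
c. Within three standard deviations of the mean: approximately 99.7% of observations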
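6. Chebyshev's theorem - The proportion of observations within $k$ standard deviations of the mean is at least $1 - \frac{1}{k^2}$ [for $k > 1$]
7. Coefficient of variation = $\frac{s}{\bar{x}}$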
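The variance formulas above can be checked numerically. The following is a minimal sketch, assuming a small made-up data set; the function names are illustrative and not part of the notes. It computes the sample variance by the definitional form and by the shortcut form, and evaluates the Chebyshev lower bound.

```python
from math import sqrt

def sample_variance(xs):
    """Definitional form: squared deviations from the sample mean over n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

data = [8, 4, 9, 11, 3]                 # hypothetical observations
s2 = sample_variance(data)              # sample variance
s = sqrt(s2)                            # sample standard deviation
cv = s / (sum(data) / len(data))        # coefficient of variation s / x-bar
```

Both variance functions should agree up to floating-point rounding; the shortcut form only needs the running totals of x and x squared.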
2. Interquartile range = $Q_3 - Q_1$
P
100
3. Box plots - A graph with a box and whiskers that shows the maximum, minimum, range, median, interquartile range and outliers
4. Outliers - Unusually large or small observations
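a. Population covariance = $\sigma_{xy} = \frac{\sum_{i=1}^{N}(x_i - \mu_x)(y_i - \mu_y)}{N}$
b. Sample covariance = $s_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$
c. Shortcut sample covariance = $s_{xy} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}\right]$
2. Coefficient of correlation
a. Population coefficient of correlation = $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
b. Sample coefficient of correlation = $r = \frac{s_{xy}}{s_x s_y}$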
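3. Least squares line: $\hat{y} = b_0 + b_1 x$
a. Slope = $b_1 = \frac{s_{xy}}{s_x^2}$
b. y-intercept = $b_0 = \bar{y} - b_1 \bar{x}$
4. Coefficient of determination = $R^2$ - The proportion of Y's variation that is explained by X's variation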
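As a worked illustration of the covariance, correlation and least squares formulas, here is a minimal sketch on made-up data that lie exactly on a line. It takes $b_1 = s_{xy} / s_x^2$ as the slope and $b_0 = \bar{y} - b_1\bar{x}$ as the intercept; the function names are illustrative only.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance s_xy with divisor n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Sample coefficient of correlation r = s_xy / (s_x * s_y)."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

def least_squares(xs, ys):
    """Return (b0, b1): intercept and slope of the least squares line."""
    n = len(xs)
    b1 = covariance(xs, ys) / covariance(xs, xs)   # slope = s_xy / s_x^2
    b0 = sum(ys) / n - b1 * sum(xs) / n            # intercept = y-bar - b1 * x-bar
    return b0, b1

xs = [1, 2, 3, 4, 5]       # hypothetical data
ys = [3, 5, 7, 9, 11]      # constructed so that y = 1 + 2x exactly
b0, b1 = least_squares(xs, ys)
r = correlation(xs, ys)
r_squared = r ** 2         # share of Y's variation explained by X's variation
```

Because the points are perfectly linear, the fitted line reproduces the generating intercept and slope, and the correlation equals 1.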
18 Sampling
19 Sampling plans
1. Simple random sample - Samples with the same number of observations are equally likely to be chosen
2. Stratified random sample - Dividing the population into mutually exclusive strata and then drawing simple random samples from each stratum
3. Cluster sample - Dividing the population into mutually exclusive clusters and then drawing simple random samples only from selected clusters
20 Sampling error
21 Nonsampling error
Nonsampling error - Differences between the sample and the population because of mistakes in data acquisition or improper selection of sample observations; it cannot be reduced by increasing the sample size
Chapter 6 Probability
22 Random experiment
23 Sample space
24 Requirements of probabilities
1. The probability of any outcome must lie between 0 and 1: $0 \le P(O_i) \le 1$ [for each $i$]
2. The sum of the probabilities of all outcomes must equal 1: $\sum_{i=1}^{k} P(O_i) = 1$
P=20 ).
3. Subjective approach - Probabilities are the degree of belief in the occurrence of an event (e.g. the probability that the price of a share will increase)
26 Events
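Joint probability - The probability that $A$ and $B$ both occur: $P(A \cap B)$
Conditional probability - The probability of $A$ given $B$: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
4. Independent events: $P(A|B) = P(A)$
Union - The probability that $A$ or $B$ or both occur: $P(A \cup B)$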
28 Probability rules
1. Complement rule: The probability that $A$ does not occur: $P(A^C) = 1 - P(A)$
2. Multiplication rule: The joint probability of $A$ and $B$: $P(A \cap B) = P(A|B)\,P(B)$
3. Multiplication rule for independent events: $P(A \cap B) = P(A)\,P(B)$
4. Addition rule: The probability of the union of $A$ and $B$: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
5. Addition rule for mutually exclusive events: $P(A \cup B) = P(A) + P(B)$
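The probability rules can be verified by enumerating a small sample space. The sketch below assumes one roll of a fair six-sided die with equally likely outcomes; the events A and B are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}      # one roll of a fair die (assumed)

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}                          # hypothetical event: roll is even
B = {4, 5, 6}                          # hypothetical event: roll exceeds 3

p_joint = prob(A & B)                  # P(A and B)
p_union = prob(A | B)                  # P(A or B or both)
p_a_given_b = p_joint / prob(B)        # conditional probability P(A|B)
p_not_a = prob(sample_space - A)       # complement P(A^C)

# A and B are independent only if P(A|B) equals P(A).
independent = p_a_given_b == prob(A)
```

Using `Fraction` keeps every probability exact, so the rules hold as equalities rather than approximately.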
{ 0,1, 2, } )
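$0 \le P(x) \le 1$
$\sum_{\text{all } x} P(x) = 1$
Population mean = $E(X) = \mu = \sum_{\text{all } x} x\,P(x)$
Population variance = $V(X) = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2 P(x)$
6. Laws of expected value
a. $E(c) = c$
b. $E(X + c) = E(X) + c$
c. $E(cX) = cE(X)$
7. Laws of variance
a. $V(c) = 0$
b. $V(X + c) = V(X)$
c. $V(cX) = c^2 V(X)$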
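The laws of expected value and variance can be confirmed on any small discrete distribution. This is a minimal sketch, assuming a made-up probability distribution and constant; `transform` is an illustrative helper, not something from the notes.

```python
from fractions import Fraction

# Hypothetical discrete distribution: value -> probability
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expected(pmf):
    """E(X) = sum of x * P(x) over all x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = sum of (x - mu)^2 * P(x) over all x."""
    mu = expected(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def transform(pmf, f):
    """Distribution of f(X), merging values that map to the same result."""
    out = {}
    for x, p in pmf.items():
        out[f(x)] = out.get(f(x), 0) + p
    return out

c = 3                                          # arbitrary constant
shifted = transform(pmf, lambda x: x + c)      # distribution of X + c
scaled = transform(pmf, lambda x: c * x)       # distribution of cX
```

Adding a constant shifts the mean but leaves the variance unchanged, while multiplying by a constant scales the variance by the square of that constant.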
31 Bivariate distributions
1. Requirements for discrete bivariate distributions
a. $0 \le P(x, y) \le 1$
b. $\sum_{\text{all } x} \sum_{\text{all } y} P(x, y) = 1$
Covariance = $COV(X, Y) = \sigma_{xy} = \sum_{\text{all } x} \sum_{\text{all } y} (x - \mu_X)(y - \mu_Y) P(x, y)$
Coefficient of correlation = $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
$E(X + Y) = E(X) + E(Y)$
a. $V(X + Y) = V(X) + V(Y) + 2\,COV(X, Y)$
b. If $X$ and $Y$ are independent, $COV(X, Y) = 0$ and $V(X + Y) = V(X) + V(Y)$
7.
8.
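The bivariate formulas for covariance and for the mean and variance of a sum can be checked on a small joint distribution. The sketch below assumes a made-up joint probability table for two 0/1 variables; `expect` is an illustrative helper.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> P(x, y)
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}

def expect(joint, g):
    """Expected value of g(X, Y): sum of g(x, y) * P(x, y) over all x and y."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(joint, lambda x, y: x)
mu_y = expect(joint, lambda x, y: y)
var_x = expect(joint, lambda x, y: (x - mu_x) ** 2)
var_y = expect(joint, lambda x, y: (y - mu_y) ** 2)
cov = expect(joint, lambda x, y: (x - mu_x) * (y - mu_y))

# Mean and variance of X + Y computed directly from the joint distribution
mu_sum = expect(joint, lambda x, y: x + y)
var_sum = expect(joint, lambda x, y: (x + y - mu_sum) ** 2)
```

Here the covariance is nonzero, so the variance of the sum differs from the sum of the variances by twice the covariance.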
32 Binomial distributions
Requirements of binomial experiments:
1. Fixed number of trials ($n$)
2. Two outcomes: $P(\text{success}) = p$ and $P(\text{failure}) = 1 - p$
3. Independent trials - the outcome of one trial does not affect the outcomes of other trials
Binomial probability distribution:
$X \sim \mathrm{Bin}(n, p)$: $P(x) = \frac{n!}{x!(n-x)!}\,p^x (1-p)^{n-x} = C_x^n\,p^x (1-p)^{n-x}$
Cumulative probability = $P(X \le x)$
Probability that $X$ is at least $x$ = $P(X \ge x) = 1 - P(X \le [x-1])$
Probability that $X$ equals $x$ = $P(x) = P(X \le x) - P(X \le [x-1])$
1. Mean = $\mu = np$
2. Variance = $\sigma^2 = np(1-p)$
3. Standard deviation = $\sigma = \sqrt{np(1-p)}$
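The binomial relationships above translate directly into code. This is a minimal sketch using only the standard library; the values of n and p are made up for illustration.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """Cumulative probability P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

n, p = 10, 0.2                                         # hypothetical experiment
at_least_3 = 1 - binom_cdf(2, n, p)                    # P(X >= 3) = 1 - P(X <= 2)
exactly_3 = binom_cdf(3, n, p) - binom_cdf(2, n, p)    # P(X = 3) from cumulative values

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)
```

The cumulative form is the one printed in binomial tables, which is why "at least" and "exactly" probabilities are expressed as differences of cumulative values.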
$\int_{\text{all } x} f(x)\,dx = 1$
34 Uniform distributions
$f(x) = \frac{1}{b-a}$ [where $a \le x \le b$]
$P(x_1 < X < x_2) = \text{Base} \times \text{Height} = (x_2 - x_1) \times \frac{1}{b-a}$
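The base-times-height calculation for a uniform distribution can be sketched as follows. The clipping of the interval to [a, b] is an added convenience, and the example numbers are made up.

```python
def uniform_prob(x1, x2, a, b):
    """P(x1 < X < x2) for X uniform on [a, b], computed as base * height."""
    low, high = max(x1, a), min(x2, b)   # clip the interval to the support
    if high <= low:
        return 0.0
    height = 1 / (b - a)                 # value of the density f(x)
    return (high - low) * height

# Hypothetical example: X uniform on [0, 10]
p = uniform_prob(2, 4, 0, 10)
```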
35 Normal distributions
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
Normal distributions are symmetric about their mean $\mu$.
36 Exponential distribution
$f(x) = \lambda e^{-\lambda x}$ [where $x \ge 0$ and $\lambda > 0$]
$P(X > x) = e^{-\lambda x}$
$P(X < x) = 1 - e^{-\lambda x}$
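The two exponential probabilities are complements of each other, which a short sketch makes concrete. The rate value used here is made up, and the function names are illustrative.

```python
from math import exp

def exp_prob_greater(x, lam):
    """P(X > x) = e^(-lambda * x) for x >= 0."""
    return exp(-lam * x)

def exp_prob_less(x, lam):
    """P(X < x) = 1 - e^(-lambda * x) for x >= 0."""
    return 1 - exp(-lam * x)

lam = 0.5                                  # hypothetical rate parameter
p_between = exp_prob_less(4, lam) - exp_prob_less(1, lam)   # P(1 < X < 4)
```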
37 Student t distribution
$f(t) = \frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left[1 + \frac{t^2}{\nu}\right]^{-(\nu+1)/2}$
It is symmetrical about 0:
Mean = $E(t) = 0$
Variance = $V(t) = \frac{\nu}{\nu - 2}$ [for $\nu > 2$]
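38 Chi-squared distribution
$f(\chi^2) = \frac{1}{\Gamma(\nu/2)}\,\frac{1}{2^{\nu/2}}\,(\chi^2)^{(\nu/2)-1}\,e^{-\chi^2/2}$ [where $\chi^2 > 0$]
Probabilities
$P(\chi^2 > \chi^2_A) = P(\chi^2 < \chi^2_{1-A}) = A$
39 F distribution
$f(F) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}\frac{F^{(\nu_1-2)/2}}{\left(1+\frac{\nu_1 F}{\nu_2}\right)^{(\nu_1+\nu_2)/2}}$ [where $F > 0$]
Mean = $E(F) = \frac{\nu_2}{\nu_2 - 2}$ [$\nu_2 > 2$]
Variance = $V(F) = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}$ [$\nu_2 > 4$]
$P(F < F_{1-A,\nu_1,\nu_2}) = A$
$F_{1-A,\nu_1,\nu_2} = \frac{1}{F_{A,\nu_2,\nu_1}}$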
( )
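1. Mean = $\mu_{\bar{x}} = \mu$
2. Variance = $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$
3. Standard deviation = $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$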
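The mean and variance of the sample mean can be verified exactly by listing every possible sample. This is a minimal sketch, assuming a tiny made-up population and sampling with replacement (so that draws are independent); it is an illustration rather than a general proof.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]      # hypothetical population
n = 2                                # sample size

N = len(population)
mu = Fraction(sum(population), N)                        # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N      # population variance

# Every equally likely ordered sample of size n, drawn with replacement
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

mu_xbar = sum(sample_means) / len(sample_means)
sigma2_xbar = sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)
```

The enumeration shows the sample mean centred on the population mean with its variance reduced by a factor of n.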
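a. $np \ge 5$; and
b. $n(1-p) \ge 5$
2. Mean = $\mu = np$
3. Variance = $\sigma^2 = np(1-p)$
4. Standard deviation = $\sigma = \sqrt{np(1-p)}$
Continuity correction factor (0.5), where $X$ is the binomial random variable and $Y$ is the approximating normal random variable:
$P(X = x) \approx P(x - 0.5 < Y < x + 0.5)$
$P(X \le x) \approx P(Y \le x + 0.5)$
$P(X \ge x) \approx P(Y \ge x - 0.5)$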
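The effect of the 0.5 continuity correction can be seen by comparing the exact binomial probability with the normal approximation. The sketch below uses `statistics.NormalDist` from the standard library; the values of n, p and x are made up and chosen to satisfy both conditions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5        # hypothetical: np = 10 >= 5 and n(1 - p) = 10 >= 5
x = 8

# Exact binomial probability P(X <= x)
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

# Approximating normal variable Y with the binomial mean and standard deviation
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

approx_corrected = Y.cdf(x + 0.5)     # P(Y <= x + 0.5), with the correction
approx_uncorrected = Y.cdf(x)         # P(Y <= x), without the correction

# P(X = x) approximated by the area between x - 0.5 and x + 0.5
approx_point = Y.cdf(x + 0.5) - Y.cdf(x - 0.5)
```

In this example the corrected value lies much closer to the exact probability than the uncorrected one.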
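Sample proportion ($\hat{P}$)
a. $np \ge 5$; and
b. $n(1-p) \ge 5$
2. Expected value = $E(\hat{P}) = p$
3. Variance = $V(\hat{P}) = \sigma^2_{\hat{p}} = \frac{p(1-p)}{n}$
4. Standard deviation = $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$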
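1. Mean = $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$
2. Variance = $\sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
3. Standard deviation = $\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$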
46 Properties of estimators
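1. Unbiased - The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$
2. Consistent - As the sample size grows, the difference between the estimator and the parameter shrinks: $\lim_{n \to \infty} E(\hat{\theta}) = \theta$
3. Relatively efficient - Of two unbiased estimators, $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$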
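Unbiasedness can be illustrated by averaging an estimator over every possible sample. The sketch below assumes a tiny made-up population and sampling with replacement, and compares the sample variance with divisor n - 1 against the version with divisor n; `var_estimate` is an illustrative helper.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]      # hypothetical population
n = 3                                # sample size

N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N      # parameter being estimated

def var_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean over the given divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

samples = list(product(population, repeat=n))            # all equally likely samples

# Expected value of each estimator = its average over all samples
expected_s2 = sum(var_estimate(s, n - 1) for s in samples) / len(samples)
expected_biased = sum(var_estimate(s, n) for s in samples) / len(samples)
```

The estimator with divisor n - 1 averages exactly to the population variance, while the divisor-n version falls short by the factor (n - 1)/n.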
( )
from standard
( )
1.
2.
3.
1.2533
n
49 Sample size
( )
from median
1.
2.
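Sample size to estimate a mean = $n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2$, where $B$ is the bound on the error of estimation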
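The sample size calculation can be sketched as below, assuming the standard form n = (z σ / B)^2 with z taken at the upper α/2 point of the standard normal distribution and the result rounded up. The function name, default confidence level and example numbers are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, bound, confidence=0.95):
    """Smallest n for which the error bound is met: ceil((z * sigma / bound)^2)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z value with area alpha/2 to its right
    return ceil((z * sigma / bound) ** 2)

# Hypothetical example: sigma = 10, estimate the mean to within 2 units
n_required = sample_size_for_mean(sigma=10, bound=2)
```

Because the bound appears squared in the denominator, halving the bound roughly quadruples the required sample size.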
24
25
26
27
28
29
30
31