Summary

ECON1203 Statistics
Contents
Chapter 1 What is Statistics?
1 Descriptive statistics vs. inferential statistics
2 Population vs. sample
3 Statistical inference
Chapter 2 Graphical Descriptive Techniques I
4 Variables, values, data
5 Types of data
6 Describing univariate nominal data
7 Comparing multivariate nominal data
Chapter 3 Graphical Descriptive Techniques II
8 Describing univariate interval data
9 Describing time-series data
10 Describing bivariate interval data
11 Graphical excellence
12 Graphical deception
Chapter 4 Numerical Descriptive Techniques
13 Measures of central location
14 Variability
15 Measures of relative standing
16 Measures of linear relationship
Chapter 5 Data Collection and Sampling
17 Methods of collecting data
18 Sampling
19 Sampling plans
20 Sampling error
21 Nonsampling error
Chapter 6 Probability
22 Random experiment
23 Sample space
24 Requirements of probabilities
25 Approaches to assigning probabilities
26 Events
27 Joint, marginal and conditional probability
28 Probability rules
Chapter 7 Discrete Probability Distributions
29 Random variables
30 Discrete probability distributions
31 Bivariate distributions
32 Binomial distributions
Chapter 8 Continuous Probability Distributions
33 Requirements of probability density functions
34 Uniform distributions
35 Normal distributions
36 Exponential distribution
37 Student t distribution
38 Chi-squared distribution
39 F distribution
Chapter 9 Sampling Distributions
40 Central Limit Theorem (CLT)
41 Sampling distribution of the sample mean
42 Normal approximation of binomial distributions
43 Approximating sampling distribution of a sample proportion
44 Sampling distribution of the difference between two means
Chapter 10 Introduction to Estimation
45 Point vs. interval estimators
46 Properties of estimators
47 Estimating population mean ($\mu$) from standard deviation ($\sigma$)
48 Estimating population mean ($\mu$) from median
49 Sample size
Chapter 11 Introduction to Hypothesis Testing
Chapter 1 What is Statistics?

1 Descriptive statistics vs. inferential statistics
Descriptive statistics - Organising, summarising and presenting data
Inferential statistics - Drawing conclusions about populations based on sample data
2 Population vs. sample
Population - All items of interest to a statistics practitioner (e.g. the shoe sizes of Australians)
Parameter - A descriptive measure of a population (e.g. the mean shoe size of Australians)
Sample - A subset of a population (e.g. the shoe sizes of UNSW students)
Statistic - A descriptive measure of a sample (e.g. the mean shoe size of UNSW students)
3 Statistical inference
Statistical inference - Drawing conclusions about populations based on sample data
Confidence level - The proportion of times an estimation procedure will be correct
Significance level - The proportion of times a conclusion will be wrong
Chapter 2 Graphical Descriptive Techniques I

4 Variables, values, data
Variable (denoted by uppercase letters) - A characteristic of a population or sample (e.g. shoe size)
Values - The possible observations of a variable (e.g. shoe sizes between 1 and 16)
Data (denoted by lowercase letters) - The observed values of a variable
5 Types of data
Hierarchy of data
Moving down the hierarchy of data reduces the number of permissible calculations.
Higher-level data can be treated as lower-level data, but not vice versa.
1. Interval/quantitative/numerical data - Real numbers (all calculations are valid)
2. Ordinal data - Data in a ranked order (calculations based on order are valid)
3. Nominal/qualitative/categorical data - Arbitrary numbers (calculations based on frequencies and percentages are valid)
6 Describing univariate nominal data
Frequency
1. Frequency distribution - A table that shows the frequency of each outcome (Excel: to count the frequency of a particular value, use =COUNTIF([Input range], [Criteria]))
2. Bar chart - A chart that shows the frequency of each outcome
Relative frequency
3. Relative frequency distribution - A table that shows the relative frequency of each outcome
4. Pie chart - A chart that shows the relative frequency of each outcome
7 Comparing multivariate nominal data
1. Cross-classification table/cross-tabulation table - A table that shows the frequency of combinations of two variables
2. Relative cross-classification table/cross-tabulation table - A table that shows the relative frequency of combinations of two variables
3. Separate bar charts
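The frequency, relative frequency and cross-classification tables described above can be sketched in a few lines of Python. The data and the function names below are hypothetical, chosen only for illustration.

```python
from collections import Counter

def frequency_distribution(data):
    """Return {outcome: count}, the frequency of each outcome."""
    return dict(Counter(data))

def relative_frequency_distribution(data):
    """Return {outcome: count / n}, the relative frequency of each outcome."""
    n = len(data)
    return {value: count / n for value, count in Counter(data).items()}

def cross_classification(xs, ys):
    """Return {(x, y): count} for paired observations of two nominal variables."""
    return dict(Counter(zip(xs, ys)))

# Hypothetical nominal data: degree and gender of eight students.
degrees = ["Commerce", "Economics", "Commerce", "Arts",
           "Commerce", "Economics", "Arts", "Commerce"]
genders = ["F", "M", "F", "F", "M", "M", "F", "M"]

freq = frequency_distribution(degrees)
rel = relative_frequency_distribution(degrees)
table = cross_classification(degrees, genders)
```

Here `freq` plays the role of the frequency distribution, `rel` sums to 1 across outcomes, and `table` counts each (degree, gender) combination.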
Chapter 3 Graphical Descriptive Techniques II

8 Describing univariate interval data
1. Histogram - A chart with rectangles whose bases are the intervals and whose heights are the frequencies
o $\text{Number of class intervals} = 1 + 3.3 \log(n)$
o $\text{Class width} = \frac{\text{Largest observation} - \text{Smallest observation}}{\text{Number of classes}}$
o Symmetric - Mirrored on either side of the middle
o Positively skewed - With a tail to the right
o Negatively skewed - With a tail to the left
o Unimodal - With one peak
o Bimodal - With two peaks
o Bell-shaped - Symmetric and unimodal
2. Stem-and-leaf display - A table that separates place values
3. Relative frequency distribution - A table that shows the relative frequency of values
4. Cumulative relative frequency distribution - A table that cumulatively adds relative frequencies
5. Ogive - A chart that shows cumulative relative frequency
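As a rough illustration of the two histogram formulas above, the sketch below computes the number of class intervals and the class width. It assumes the logarithm is base 10 (the usual reading of the 3.3 constant) and rounds to the nearest whole number, which is a convention chosen here rather than one stated in these notes; the data are hypothetical.

```python
import math

def number_of_classes(n):
    """1 + 3.3 log10(n), rounded to the nearest whole number (assumed convention)."""
    return round(1 + 3.3 * math.log10(n))

def class_width(data, classes):
    """(Largest observation - smallest observation) / number of classes."""
    return (max(data) - min(data)) / classes

# Hypothetical interval data with n = 10 observations.
observations = [12, 15, 22, 8, 30, 18, 25, 40, 11, 28]

classes = number_of_classes(len(observations))   # 1 + 3.3 * 1 = 4.3, rounds to 4
width = class_width(observations, classes)       # (40 - 8) / 4
```

In practice the width is usually rounded to a convenient number so that the class limits are easy to read.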
9 Describing time-series data
Line chart - A chart that plots a variable over time
10 Describing bivariate interval data
Scatter diagram - A chart that plots the observed combinations of two variables
Linearity - linear/nonlinear/no relationship
Direction - positive/negative
Strength - strong/medium-strength/weak
11 Graphical excellence
1. Concise data
2. Clear ideas
3. Multivariate
4. Substance over form
5. No distortion
12 Graphical deception
1. Graphs without scale
2. Graphs with different captions
3. Stretching and shrinking graphs
4. Bar charts with changing widths
Chapter 4 Numerical Descriptive Techniques

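13 Measures of central location
1. $\text{Population mean} = \mu = \frac{\sum_{i=1}^{N} x_i}{N}$
2. $\text{Sample mean} = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
3. $\text{Median} = \text{Middle observation} = x_{\frac{n+1}{2}}$ (observations in ascending order)
4. $\text{Mode} = \text{Most frequent observation}$
5. $\text{Geometric mean} = r_g = \sqrt[n]{\prod_{i=1}^{n} (1 + r_i)} - 1$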
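The measures of central location above can be checked numerically. This is a minimal sketch on hypothetical data; it assumes the geometric mean is the n-th root of the product of the growth factors (1 + r_i), minus 1, and the helper names are illustrative.

```python
import math
import statistics

def median_by_position(data):
    """Middle observation x_((n+1)/2) of the sorted data (odd n only)."""
    ordered = sorted(data)
    n = len(ordered)
    if n % 2 == 0:
        raise ValueError("the positional formula assumes an odd number of observations")
    return ordered[(n + 1) // 2 - 1]  # convert the 1-based position to a 0-based index

def geometric_mean_return(returns):
    """n-th root of the product of (1 + r_i), minus 1 (assumed form of the formula)."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical sample of seven observations.
data = [2, 4, 4, 5, 7, 9, 11]

mean = sum(data) / len(data)        # 42 / 7
median = median_by_position(data)   # 4th of 7 sorted observations
mode = statistics.mode(data)        # most frequent observation

# A 100% gain followed by a 50% loss leaves wealth unchanged,
# so the geometric mean return is 0 even though the arithmetic mean is 25%.
g = geometric_mean_return([1.0, -0.5])
```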
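14 Variability
1. $\text{Range} = \text{Largest observation} - \text{Smallest observation}$
2. Variance
a. $\text{Population variance} = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
b. $\text{Sample variance} = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
c. $\text{Shortcut sample variance} = s^2 = \frac{1}{n - 1} \left[ \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n} \right]$
3. Standard deviation
a. $\text{Population standard deviation} = \sigma = \sqrt{\sigma^2}$
b. $\text{Sample standard deviation} = s = \sqrt{s^2}$
4. $\text{Mean absolute deviation (MAD)} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
5. Empirical rule (for bell-shaped distributions)
a. Within one standard deviation of the mean: $P(\mu - \sigma < x < \mu + \sigma) \approx 68\%$
b. Within two standard deviations of the mean: $P(\mu - 2\sigma < x < \mu + 2\sigma) \approx 95\%$
c. Within three standard deviations of the mean: $P(\mu - 3\sigma < x < \mu + 3\sigma) \approx 99.7\%$
6. Chebysheff's Theorem: $P(\mu - k\sigma < x < \mu + k\sigma) \geq 1 - \frac{1}{k^2}$ [for $k > 1$]
7. $\text{Population coefficient of variation} = CV = \frac{\sigma}{\mu}$
8. $\text{Sample coefficient of variation} = cv = \frac{s}{\bar{x}}$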
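To see that the definitional and shortcut forms of the sample variance agree, and what Chebysheff's bound gives for a particular k, a small sketch on hypothetical data follows. The function names are illustrative.

```python
import math

def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def shortcut_sample_variance(data):
    """[sum of x^2 - (sum of x)^2 / n] / (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the theorem applies for k > 1")
    return 1 - 1 / k ** 2

# Hypothetical sample: mean 6, squared deviations sum to 60, n - 1 = 6.
data = [2, 4, 4, 5, 7, 9, 11]

s2 = sample_variance(data)
s2_shortcut = shortcut_sample_variance(data)
s = math.sqrt(s2)                       # sample standard deviation
cv = s / (sum(data) / len(data))        # sample coefficient of variation
bound = chebysheff_lower_bound(2)       # at least 75% within two standard deviations
```

Chebysheff's bound of 75% for k = 2 holds for any distribution, which is why it is weaker than the empirical rule's 95% for bell-shaped data.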
15 Measures of relative standing
1. $\text{Location of a percentile} = L_P = (n + 1) \frac{P}{100}$
2. $\text{Interquartile range} = Q_3 - Q_1$
3. Box plots - A graph with a box and whiskers that shows the maximum, minimum, range, median, interquartile range and outliers.
4. Outliers - Unusually large or small observations
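The percentile location formula above usually lands between two observations. The sketch below assumes linear interpolation between the neighbouring sorted observations, which is one common convention and is not stated in these notes; the data are hypothetical.

```python
def percentile_location(n, p):
    """L_P = (n + 1) * P / 100, a 1-based position in the sorted data."""
    return (n + 1) * p / 100

def percentile(data, p):
    """P-th percentile, interpolating linearly between neighbours (assumed convention)."""
    ordered = sorted(data)
    location = percentile_location(len(ordered), p)
    lower = int(location)          # whole part of the position
    fraction = location - lower    # how far to move towards the next observation
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

# Hypothetical sample of ten observations.
data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

q1 = percentile(data, 25)   # location 2.75: three quarters of the way from 0 to 5
q3 = percentile(data, 75)   # location 8.25: one quarter of the way from 14 to 22
iqr = q3 - q1
```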
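16 Measures of linear relationship
1. Covariance
a. $\text{Population covariance} = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
b. $\text{Sample covariance} = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
c. $\text{Shortcut sample covariance} = s_{xy} = \frac{1}{n - 1} \left[ \sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n} \right]$
2. Coefficient of correlation
a. $\text{Population coefficient of correlation} = \rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
b. $\text{Sample coefficient of correlation} = r = \frac{s_{xy}}{s_x s_y}$
3. Least squares method
a. $\text{Equation of the line}: \hat{y} = b_0 + b_1 x$
b. $\text{Slope} = b_1 = \frac{s_{xy}}{s_x^2}$
c. $y\text{-intercept} = b_0 = \bar{y} - b_1 \bar{x}$
4. Coefficient of determination - how much of Y's variation is explained by X's variation
a. $\text{Population coefficient of determination} = \rho^2$
b. $\text{Sample coefficient of determination} = r^2$
5. Correlation is not causation!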
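The covariance, correlation and least squares formulas above fit together in a short calculation. This is a sketch on hypothetical paired data, taking the slope as b1 = s_xy / s_x^2 and the intercept as b0 = y-bar minus b1 times x-bar; the function names are illustrative.

```python
import math

def sample_covariance(xs, ys):
    """Sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_variance(xs):
    """Sample variance, which equals the sample covariance of a variable with itself."""
    return sample_covariance(xs, xs)

def correlation(xs, ys):
    """r = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / math.sqrt(sample_variance(xs) * sample_variance(ys))

def least_squares(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x."""
    b1 = sample_covariance(xs, ys) / sample_variance(xs)
    b0 = sum(ys) / len(ys) - b1 * sum(xs) / len(xs)
    return b0, b1

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_xy = sample_covariance(xs, ys)   # 6 / 4
b0, b1 = least_squares(xs, ys)     # slope 1.5 / 2.5, intercept 4 - 0.6 * 3
r = correlation(xs, ys)
r_squared = r ** 2                 # share of the variation in y explained by x
```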
Chapter 5 Data Collection and Sampling

17 Methods of collecting data
1. Primary data - Collected by the statistics practitioner for the current problem
2. Secondary data - Collected by someone else for another problem
3. Observation - Measuring actual behaviour
4. Experiments - Imposing treatments and measuring resultant behaviour
5. Surveys - Asking questions
18 Sampling
Target population - The population about which we want to draw inferences
Sampled population - The actual population from which the sample has been taken
Self-selected samples - When participants choose to participate and thus are more keenly interested in the issue than other members of the population
19 Sampling plans
1. Simple random sample - Samples with the same number of observations are equally likely to be chosen
2. Stratified random sample - Dividing the population into mutually exclusive strata and then drawing simple random samples from each stratum
3. Cluster sample - Dividing the population into mutually exclusive clusters and then only drawing simple random samples from selected clusters
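The three sampling plans above differ only in where the random draws happen. A minimal sketch using Python's standard `random` module follows; the populations, strata, clusters and function names are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Every sample of size n is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly select some clusters, then sample only within those clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 student IDs.
population = list(range(100))
strata = {"first_year": population[:60], "later_years": population[60:]}
clusters = {f"tutorial_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
stratified = stratified_random_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, 4, rng)
```

Stratified sampling guarantees that every stratum is represented, whereas cluster sampling leaves most clusters out entirely, which is cheaper but can increase sampling error.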
20 Sampling error
Sampling error - Differences between the sample and the population because of observations that happened to be selected for the sample; it can be reduced by increasing the sample size
21 Nonsampling error
Nonsampling error - Differences between the sample and the population because of mistakes in data acquisition or improper selection of sample observations; it cannot be reduced by increasing the sample size
1. Errors in data acquisition (e.g. faulty equipment, inaccurate responses to sensitive questions)
2. Nonresponse error - When responses are not obtained from some members of the sample
3. Selection bias - When members of the target population cannot possibly be selected for inclusion in the sample
Chapter 6 Probability
22 Random experiment
Random experiment - An action or process that leads to one of several possible outcomes (e.g. Experiment: Flipping a coin. Outcomes: Heads or tails.)
23 Sample space
Sample space - All possible outcomes of an experiment. They must be mutually exclusive.
24 Requirements of probabilities
1. The probability of any outcome must lie between 0 and 1: $0 \leq P(O_i) \leq 1$ [for each $i$]
2. The sum of the probabilities of all outcomes is 1: $\sum_{i=1}^{k} P(O_i) = 1$
25 Approaches to assigning probabilities
1. Classical approach - Probabilities in games of chance (e.g. flipping a coin, rolling dice)
2. Relative frequency approach - Probabilities are long-run relative frequencies (e.g. if the relative frequency of getting a distinction is 200/1000 students, $P = 20\%$)
3. Subjective approach - Probabilities are the degree of belief in the occurrence of an event (e.g. the probability that the price of a share will increase)
26 Events
Simple event - An individual outcome of a sample space (e.g. getting a mark of 80)
Event - A collection or set of one or more simple events in a sample space (e.g. the event of getting a distinction requires a mark of at least 80, $\text{Distinction} = \{80, 81, 82, \ldots, 99, 100\}$)
Probability of an event - The sum of the probabilities of the simple events that make up an event
27 Joint, marginal and conditional probability
1. Joint probability (intersection) - The probability that both $A$ and $B$ occur: $P(A \cap B)$
2. Marginal probability - Probabilities computed by adding across rows or down columns
3. Conditional probability - The probability of $A$ given $B$: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
4. Independent events: $P(A|B) = P(A)$
5. Union - The probability that either $A$ or $B$ or both occur: $P(A \cup B)$
28 Probability rules
1. Complement rule: The probability that $A$ does not occur: $P(A^C) = 1 - P(A)$
2. Multiplication rule: The joint probability of $A$ and $B$: $P(A \cap B) = P(A) P(B|A)$
3. Multiplication rule for independent events: $P(A \cap B) = P(A) P(B)$
4. Addition rule: The union of $A$ and $B$: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
5. Addition rule for mutually exclusive events: $P(A \cup B) = P(A) + P(B)$
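The joint, marginal and conditional probabilities and the rules above can be traced through a small joint probability table. The probabilities below are hypothetical and chosen only so that they sum to 1.

```python
import math

# Hypothetical joint probabilities of two events A and B.
joint = {
    ("A", "B"): 0.2,
    ("A", "not B"): 0.3,
    ("not A", "B"): 0.1,
    ("not A", "not B"): 0.4,
}

# Marginal probabilities: add across a row or down a column of the table.
p_a = joint[("A", "B")] + joint[("A", "not B")]
p_b = joint[("A", "B")] + joint[("not A", "B")]

p_a_and_b = joint[("A", "B")]        # joint probability (intersection)
p_a_given_b = p_a_and_b / p_b        # conditional probability of A given B
p_b_given_a = p_a_and_b / p_a        # conditional probability of B given A

p_not_a = 1 - p_a                    # complement rule
p_a_or_b = p_a + p_b - p_a_and_b     # addition rule (union)

# A and B are independent only if conditioning on B leaves P(A) unchanged.
independent = math.isclose(p_a_given_b, p_a)
```

In this table P(A|B) is about 0.67 while P(A) is 0.5, so the events are not independent and the general multiplication rule must be used.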
Chapter 7 Discrete Probability Distributions

29 Random variables
Random variable - A function or rule that assigns a number to each outcome of an experiment (e.g. when flipping a coin repeatedly, the number of heads $\{0, 1, 2, \ldots\}$)
Discrete random variable - Can assume only a countable number of values (whether finite or infinite)
Continuous random variable - Can assume any value within a specified range (e.g. time)
Probability distribution - A table, formula or graph that shows the probabilities of values of a random variable
30 Discrete probability distributions
1. Requirements of discrete probability distributions
a. $0 \leq P(x) \leq 1$
b. $\sum_{\text{all } x} P(x) = 1$
2. $\text{Population mean} = E(X) = \mu = \sum_{\text{all } x} x P(x)$
3. $\text{Population variance} = V(X) = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2 P(x)$
4. $\text{Shortcut population variance} = V(X) = \sigma^2 = \sum_{\text{all } x} x^2 P(x) - \mu^2$
5. $\text{Population standard deviation} = \sigma = \sqrt{\sigma^2}$
6. Laws of expected value
a. $E(c) = c$
b. $E(X + c) = E(X) + c$
c. $E(cX) = cE(X)$
7. Laws of variance
a. $V(c) = 0$
b. $V(X + c) = V(X)$
c. $V(cX) = c^2 V(X)$
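The mean, variance and the laws of expected value and variance above can be verified on a small discrete distribution. The sketch below uses the number of heads in two fair coin flips; the helper names are illustrative.

```python
def expected_value(dist):
    """E(X): sum of x * P(x) over all x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X): sum of (x - mu)^2 * P(x) over all x."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def shortcut_variance(dist):
    """V(X): sum of x^2 * P(x) over all x, minus mu^2."""
    mu = expected_value(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

def linear_transform(dist, c, d):
    """Distribution of cX + d (assumes c != 0 so the values stay distinct)."""
    return {c * x + d: p for x, p in dist.items()}

# Number of heads in two fair coin flips.
heads = {0: 0.25, 1: 0.5, 2: 0.25}

mu = expected_value(heads)
var = variance(heads)

# Laws of expected value and variance: E(3X + 10) = 3E(X) + 10, V(3X + 10) = 9V(X).
scaled = linear_transform(heads, 3, 10)
mu_scaled = expected_value(scaled)
var_scaled = variance(scaled)
```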
31 Bivariate distributions
1. Requirements for discrete bivariate distributions
a. $0 \leq P(x, y) \leq 1$ [for all pairs of values $(x, y)$]
b. $\sum_{\text{all } x} \sum_{\text{all } y} P(x, y) = 1$
2. $\text{Covariance} = COV(X, Y) = \sigma_{xy} = \sum_{\text{all } x} \sum_{\text{all } y} (x - \mu_X)(y - \mu_Y) P(x, y)$
3. $\text{Shortcut covariance} = COV(X, Y) = \sigma_{xy} = \sum_{\text{all } x} \sum_{\text{all } y} xy P(x, y) - \mu_X \mu_Y$
4. $\text{Coefficient of correlation} = \rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
5. Laws of expected value of the sum of two variables
a. $E(X + Y) = E(X) + E(Y)$
6. Laws of variance of the sum of two variables
a. $V(X + Y) = V(X) + V(Y) + 2 COV(X, Y)$
b. If $X$ and $Y$ are independent, $COV(X, Y) = 0$ and $V(X + Y) = V(X) + V(Y)$
7. $\text{Mean of a portfolio of two stocks} = E(R_p) = w_1 E(R_1) + w_2 E(R_2)$
8. $\text{Variance of a portfolio of two stocks} = V(R_p) = w_1^2 V(R_1) + w_2^2 V(R_2) + 2 w_1 w_2 \rho \sigma_1 \sigma_2$
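As a worked instance of the two-stock portfolio formulas above, the sketch below assumes the final term of the variance is 2 w1 w2 rho sigma1 sigma2, where rho sigma1 sigma2 is the covariance of the two returns. All input figures are hypothetical.

```python
import math

def portfolio_mean(w1, w2, mean1, mean2):
    """E(Rp) = w1 * E(R1) + w2 * E(R2)."""
    return w1 * mean1 + w2 * mean2

def portfolio_variance(w1, w2, sigma1, sigma2, rho):
    """V(Rp) = w1^2 * V(R1) + w2^2 * V(R2) + 2 * w1 * w2 * rho * sigma1 * sigma2."""
    return (w1 ** 2 * sigma1 ** 2
            + w2 ** 2 * sigma2 ** 2
            + 2 * w1 * w2 * rho * sigma1 * sigma2)

# Hypothetical portfolio: 40% in stock 1 and 60% in stock 2.
w1, w2 = 0.4, 0.6
mean = portfolio_mean(w1, w2, 0.10, 0.20)
var = portfolio_variance(w1, w2, 0.2, 0.3, rho=0.5)
sd = math.sqrt(var)

# With a lower correlation the same weights give a less variable portfolio.
var_uncorrelated = portfolio_variance(w1, w2, 0.2, 0.3, rho=0.0)
```

The comparison with `rho=0.0` shows the diversification effect: the expected return is unchanged while the variance falls.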
32 Binomial distributions
Requirements of binomial experiments:
1. Fixed number of trials ($n$)
2. Two outcomes: $P(\text{success}) = p$ and $P(\text{failure}) = 1 - p$
3. Independent trials - the outcome of one trial does not affect the outcomes of other trials
Binomial probability distribution: $X \sim Bin(n, p)$, with $P(x) = \frac{n!}{x!(n - x)!} p^x (1 - p)^{n - x} = C_x^n p^x (1 - p)^{n - x}$
$\text{Cumulative probability} = P(X \leq x)$
$\text{Probability that } X \text{ is at least } x = P(X \geq x) = 1 - P(X \leq [x - 1])$
$\text{Probability that } X \text{ equals } x = P(x) = P(X \leq x) - P(X \leq [x - 1])$
Mean, variance and standard deviation:
1. $\text{Mean} = \mu = np$
2. $\text{Variance} = \sigma^2 = np(1 - p)$
3. $\text{Standard deviation} = \sigma = \sqrt{np(1 - p)}$
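The binomial probability, cumulative probability and "at least" formulas above translate directly into code. This is a minimal sketch using `math.comb` from the Python standard library; the function names and the example of ten fair coin flips are illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Cumulative probability P(X <= x)."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))

def binomial_at_least(x, n, p):
    """P(X >= x) = 1 - P(X <= x - 1)."""
    return 1 - binomial_cdf(x - 1, n, p)

# Ten flips of a fair coin, counting heads.
n, p = 10, 0.5

p_exactly_5 = binomial_pmf(5, n, p)                          # 252 / 1024
p_exactly_5_from_cdf = binomial_cdf(5, n, p) - binomial_cdf(4, n, p)
p_at_least_3 = binomial_at_least(3, n, p)                    # 1 - 56 / 1024

mean = n * p
variance = n * p * (1 - p)
standard_deviation = math.sqrt(variance)
```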
Chapter 8 Continuous Probability Distributions

33 Requirements of probability density functions
1. The function is non-negative: $f(x) \geq 0$ [for $a \leq x \leq b$]
2. The area under the function is 1: $\int_a^b f(x)\,dx = 1$
34 Uniform distributions
$f(x) = \frac{1}{b - a}$ [where $a \leq x \leq b$]
$P(x_1 < X < x_2) = \text{Base} \times \text{Height} = (x_2 - x_1) \times \frac{1}{b - a}$
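35 Normal distributions
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$ [for $-\infty < x < \infty$]
It is symmetric about the mean ($\mu$).
Increasing the standard deviation ($\sigma$) widens the curve.
$\text{Standard normal random variable} = Z = \frac{X - \mu}{\sigma}$
Standardised normal distributions are symmetric about 0: $P(Z > Z_A) = P(Z < -Z_A) = A$
36 Exponential distribution
$f(x) = \lambda e^{-\lambda x}$ [where $x \geq 0$]
Increasing the parameter of distribution ($\lambda$) steepens the curve.
$\text{Mean } (\mu) = \text{Standard deviation } (\sigma) = \frac{1}{\lambda}$
$P(X > x) = e^{-\lambda x}$
$P(X < x) = 1 - e^{-\lambda x}$
$P(x_1 < X < x_2) = P(X < x_2) - P(X < x_1) = e^{-\lambda x_1} - e^{-\lambda x_2}$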
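The standardisation step and the exponential probabilities above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for the standard normal curve; the parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def exponential_greater_than(x, lam):
    """P(X > x) = e^(-lambda * x)."""
    return math.exp(-lam * x)

def exponential_less_than(x, lam):
    """P(X < x) = 1 - e^(-lambda * x)."""
    return 1 - math.exp(-lam * x)

def exponential_between(x1, x2, lam):
    """P(x1 < X < x2) = e^(-lambda * x1) - e^(-lambda * x2)."""
    return math.exp(-lam * x1) - math.exp(-lam * x2)

# Hypothetical normal variable with mean 100 and standard deviation 15.
z = z_score(130, 100, 15)                 # two standard deviations above the mean
upper_tail = 1 - NormalDist().cdf(z)      # P(Z > 2)
lower_tail = NormalDist().cdf(-z)         # P(Z < -2), equal by symmetry

# Hypothetical exponential variable with lambda = 2, so mean = sd = 0.5.
lam = 2
p_between = exponential_between(0.5, 1, lam)
```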
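37 Student t distribution
$f(t) = \frac{\Gamma\left[\frac{\nu + 1}{2}\right]}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left[1 + \frac{t^2}{\nu}\right]^{-\frac{\nu + 1}{2}}$
It is symmetrical about 0: $P(t > t_{A,\nu}) = P(t < -t_{A,\nu}) = A$
It is flatter than the standard normal distribution.
Increasing the degrees of freedom ($\nu$) narrows the curve.
$\text{Mean} = E(t) = 0$
$\text{Variance} = V(t) = \frac{\nu}{\nu - 2}$ [for $\nu > 2$]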
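38 Chi-squared distribution
$f(\chi^2) = \frac{1}{\Gamma\left(\frac{\nu}{2}\right) 2^{\frac{\nu}{2}}} (\chi^2)^{\frac{\nu}{2} - 1} e^{-\frac{\chi^2}{2}}$ [where $\chi^2 > 0$]
Increasing the degrees of freedom ($\nu$) flattens the curve.
Probabilities: $P(\chi^2 > \chi^2_A) = P(\chi^2 < \chi^2_{1-A}) = A$
39 F distribution
$f(F) = \frac{\Gamma\left(\frac{\nu_1 + \nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\frac{\nu_1}{2}} \frac{F^{\frac{\nu_1 - 2}{2}}}{\left(1 + \frac{\nu_1 F}{\nu_2}\right)^{\frac{\nu_1 + \nu_2}{2}}}$ [where $F > 0$]
$\text{Mean} = E(F) = \frac{\nu_2}{\nu_2 - 2}$ [$\nu_2 > 2$]
$\text{Variance} = V(F) = \frac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2 - 2)^2(\nu_2 - 4)}$ [$\nu_2 > 4$]
Area to the right is $A$: $P(F > F_{A,\nu_1,\nu_2}) = A$
Area to the left is $A$: $P(F < F_{1-A,\nu_1,\nu_2}) = A$
$F_{1-A,\nu_1,\nu_2} = \frac{1}{F_{A,\nu_2,\nu_1}}$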
Chapter 9 Sampling Distributions

40 Central Limit Theorem (CLT)
The sampling distribution of the mean of a random sample drawn from any
population is approximately normal for a sufficiently large sample size.
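41 Sampling distribution of the sample mean
$\text{Normally distributed sampling distribution}: \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
1. $\text{Mean} = \mu_{\bar{x}} = \mu$
2. $\text{Variance} = \sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$
3. $\text{Standard deviation} = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$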
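To make the sampling distribution of the sample mean concrete, the sketch below computes the probability that a sample mean exceeds a given value, using a standard error of sigma divided by the square root of n. The population parameters are hypothetical, and the calculation assumes the sampling distribution is (approximately) normal.

```python
import math
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def probability_sample_mean_exceeds(x, mu, sigma, n):
    """P(X_bar > x) when X_bar ~ N(mu, sigma^2 / n)."""
    return 1 - NormalDist(mu, standard_error(sigma, n)).cdf(x)

# Hypothetical population: mean 50, standard deviation 10; samples of size 25.
mu, sigma, n = 50, 10, 25

se = standard_error(sigma, n)                                # 10 / 5
p_above_53 = probability_sample_mean_exceeds(53, mu, sigma, n)   # z = 1.5

# Quadrupling the sample size halves the standard error.
se_larger_sample = standard_error(sigma, 4 * n)
```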
42 Normal approximation of binomial distributions
$\text{Normally distributed binomial distribution}: Y \sim N(\mu, \sigma^2)$
1. Binomial distributions are approximately normally distributed if:
a. $np \geq 5$; and
b. $n(1 - p) \geq 5$
2. $\text{Mean} = \mu = np$
3. $\text{Variance} = \sigma^2 = np(1 - p)$
4. $\text{Standard deviation} = \sigma = \sqrt{np(1 - p)}$
5. The continuity correction factor (0.5) is needed because binomial random variables are discrete whereas normal random variables are continuous. Each binomial probability (in terms of $X$) is approximated by a normal probability (in terms of $Y$):
$P(X = x) \approx P(x - 0.5 < Y < x + 0.5)$
$P(X \leq x) \approx P(Y \leq x + 0.5)$
$P(X \geq x) \approx P(Y \geq x - 0.5)$
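The effect of the continuity correction above can be seen by comparing an exact binomial probability with its normal approximation, with and without the 0.5 adjustment. The values of n, p and x below are hypothetical and satisfy np >= 5 and n(1 - p) >= 5.

```python
import math
from statistics import NormalDist

def binomial_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Bin(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def normal_approx_cdf(x, n, p, continuity_correction=True):
    """Approximate P(X <= x) by P(Y <= x + 0.5) with Y ~ N(np, np(1 - p))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    cutoff = x + 0.5 if continuity_correction else x
    return NormalDist(mu, sigma).cdf(cutoff)

# Hypothetical binomial experiment: np = 10 and n(1 - p) = 10.
n, p, x = 20, 0.5, 8

exact = binomial_cdf(x, n, p)
corrected = normal_approx_cdf(x, n, p)
uncorrected = normal_approx_cdf(x, n, p, continuity_correction=False)
```

For these values the corrected approximation is within about 0.001 of the exact probability, while the uncorrected one is noticeably too small.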
43 Approximating sampling distribution of a sample proportion
1. $\hat{P}$ is approximately normally distributed if:
a. $np \geq 5$
b. $n(1 - p) \geq 5$
2. $\text{Expected value} = E(\hat{P}) = p$
3. $\text{Variance} = V(\hat{P}) = \sigma^2_{\hat{p}} = \frac{p(1 - p)}{n}$
4. $\text{Standard deviation} = \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
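44 Sampling distribution of the difference between two means
1. $\text{Mean} = \mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$
2. $\text{Variance} = \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
3. $\text{Standard deviation} = \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$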
Chapter 10 Introduction to Estimation

45 Point vs. interval estimators
1. Point estimators - Estimate a parameter using a single value or point
2. Interval estimators - Estimate a parameter using an interval
46 Properties of estimators
1. Unbiased - The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$
2. Consistent - As the sample size grows, the difference between the estimator and the parameter falls: $\lim_{n \to \infty} E(\hat{\theta}) = \theta$ and $\lim_{n \to \infty} Var(\hat{\theta}) = 0$
3. Relatively efficient - An estimator is relatively more efficient if its variance is lower: $\hat{\theta}_1$ is relatively more efficient than $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$
47 Estimating population mean ($\mu$) from standard deviation ($\sigma$)
1. $\text{Confidence interval estimator of } \mu = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
2. $\text{Lower confidence limit (LCL)} = \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
3. $\text{Upper confidence limit (UCL)} = \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
48 Estimating population mean ($\mu$) from median
$\text{Confidence interval estimator of } \mu = m \pm z_{\alpha/2} \frac{1.2533\sigma}{\sqrt{n}}$
49 Sample size
1. $\text{Bound on the error of estimation} = B = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
2. $\text{Sample size to estimate a mean} = n = \left( \frac{z_{\alpha/2}\,\sigma}{B} \right)^2$
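The confidence interval and sample size formulas above can be sketched as follows. The critical value z_{alpha/2} comes from `statistics.NormalDist().inv_cdf`; rounding the sample size up to the next whole number is an assumption made here (a common practice), and all input figures are hypothetical.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """z_{alpha/2} for a given confidence level, e.g. about 1.96 for 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(sample_mean, sigma, n, confidence):
    """Return (LCL, UCL) = x_bar -/+ z_{alpha/2} * sigma / sqrt(n)."""
    half_width = z_critical(confidence) * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def sample_size(sigma, bound, confidence):
    """n = (z_{alpha/2} * sigma / B)^2, rounded up (assumed convention)."""
    return math.ceil((z_critical(confidence) * sigma / bound) ** 2)

# Hypothetical study: sample mean 100, known sigma 15, n = 36, 95% confidence.
lcl, ucl = confidence_interval(100, 15, 36, 0.95)

# Sample size needed to estimate the mean to within 2 units at 95% confidence.
n_required = sample_size(15, 2, 0.95)
```

Halving the bound B roughly quadruples the required sample size, because B enters the formula squared.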
Chapter 11 Introduction to hypothesis testing

Chapter 12 Inference about a population

Chapter 13 Inference about comparing two populations

Chapter 14 Analysis of variance

Chapter 15 Chi-squared tests

Chapter 16 Simple linear regression and correlation

Chapter 17 Multiple regression
