Techniques
Measure Phase
Introduction to Basic Statistics
A. Ramesh, PhD
Department of Management Studies
Indian Institute of Technology Roorkee
ramesh.anbanandam@gmail.com
DATA
Information
Knowledge
ROOF
WIRES
TREE
TREE
FLOWERS
SKY
GLASS
LAMP
POLE
GRASS
TILES
WINDOWS
PROBABLY NOTHING
MUCH !
Can Statistics Be Trusted?
There are three kinds of lies:
lies, damned lies, and statistics.
Mark Twain
It is easy to lie with statistics. But it is easier to
lie without them.
Frederick Mosteller
Figures won't lie, but liars will figure.
Charles Grosvenor
Statistics..
Plays an important role in many facets of
human endeavour
Occurs remarkably frequently in our everyday
lives
It is often incorrectly thought of as just a
collection of data, graphs and diagrams
What is Statistics?
Science of gathering, analyzing, interpreting, and
presenting data
Branch of mathematics
Facts and figures
Statistics is the scientific method that enables us to
make decisions as responsibly as possible.
What customer experiences (Variation)
What we measure (Average)
6.55
7.05
No! We may not be!
It may be affected by factors which:
Affect the time he takes to travel
He cannot control
Vary randomly
E.g. the normal traffic he encounters
in the normal course of travel
6.00
Yes. We can!
It may be because of some specific
circumstances which do not occur in
the normal scheme of actions.
E.g.
His watch was running fast
He got a lift
He had a client call
He had some important work to be
finished before 7.30
[Figure: a sample of n = 3 days drawn from a population of N = 567 days]
What is the average resolution time?*
*Within a certain confidence band or margin of error
Descriptive Statistics
Collect data
ex. Survey
Present data
ex. Tables and graphs
Characterize data
Descriptive statistics..
Encompasses the following:
Graphical or pictorial display
Condensation of large masses of data into a form
such as tables
Preparation of summary measures to give a
concise description of complex information (e.g.
an average figure)
Exhibition of patterns that may be found in sets of
information
Inferential Statistics
Estimation
ex. Estimate the population
mean weight using the
sample mean weight
Hypothesis testing
ex. Test the claim that the
population mean weight is
120 pounds
Drawing conclusions and/or making decisions concerning a
population based on sample results.
Inferential Statistics..
Especially relates to:
Determining whether characteristics of a situation
are unusual or if they have happened by chance
Estimating values of numerical quantities and
determining the reliability of those estimates
Using past occurrences to attempt to predict the
future
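The estimation and hypothesis-testing examples above can be sketched in a few lines of Python. This is only an illustration: the weights are hypothetical values invented for the example, and the helper names are not from the slides. The sketch estimates the population mean weight from the sample mean, then computes a one-sample t statistic for the claim that the population mean is 120 pounds.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (illustrative values only).
sample = [118, 125, 121, 116, 130, 122, 119, 127]


def estimate_mean(data):
    """Return the sample mean and its standard error."""
    n = len(data)
    mean = statistics.mean(data)
    # statistics.stdev uses the n - 1 (sample) denominator.
    std_err = statistics.stdev(data) / math.sqrt(n)
    return mean, std_err


def t_statistic(data, claimed_mean):
    """One-sample t statistic for the claim 'population mean = claimed_mean'."""
    mean, std_err = estimate_mean(data)
    return (mean - claimed_mean) / std_err


x_bar, se = estimate_mean(sample)      # estimation
t = t_statistic(sample, 120)           # hypothesis testing

print(f"Sample mean (estimate of the population mean): {x_bar:.2f}")
print(f"Standard error of the mean: {se:.2f}")
print(f"t statistic for the claim mean = 120: {t:.2f}")
```

A t statistic far from zero, judged against the t distribution with n - 1 degrees of freedom, would count as evidence against the claim; a value near zero would not.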
Population (parameter μ)
Select a random sample
Sample (statistic x̄)
Calculate x̄ to estimate μ
Types of Variables
Categorical (qualitative) variables have values
that can only be placed into categories, such as
yes and no.
Numerical (quantitative) variables have values
that represent quantities.
Types of Variables
Data
  Categorical (defined categories)
    Examples: Marital Status, Political Party, Eye Color
  Numerical
    Discrete (counted items)
      Examples: Number of Children, Defects per hour
    Continuous (measured characteristics)
      Examples: Weight, Voltage
Levels of Measurement
A nominal scale classifies data into distinct
categories in which no ranking is implied.
Categorical Variables           Categories
Personal Computer Ownership     Yes / No
Type of Stocks Owned            Growth, Value, Other
Internet Provider               Microsoft Network / AOL
Levels of Measurement
An ordinal scale classifies data into distinct
categories in which ranking is implied.
Categorical Variable      Ordered Categories
Product satisfaction
Faculty rank
Student Grades            A, B, C, D, F
Levels of Measurement
An interval scale is an ordered scale in which the
difference between measurements is a meaningful
quantity but the measurements do not have a true
zero point.
A ratio scale is an ordered scale in which the
difference between the measurements is a
meaningful quantity and the measurements have a
true zero point.
Level       Meaningful Operations                                  Statistical Methods
Nominal     Classifying and Counting                               Nonparametric
Ordinal     All of the above plus Ranking                          Nonparametric
Interval    All of the above plus Addition, Subtraction            Parametric
Ratio       All of the above plus multiplication and division      Parametric
         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East     20.4      27.4      90        20.4
West     30.6      38.6      34.6      31.6
North    45.9      46.9      45        43.9
[Figures: charts of the quarterly data (1st Qtr to 4th Qtr) for East, West, and North]
Frequency distributions
Frequency tables
Observation Table
Class Interval    Frequency    Cumulative Frequency
< 20              13           13
< 40              18           31
< 60              25           56
< 80              15           71
< 100              9           80
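The cumulative frequency column is a running total of the frequency column. A minimal Python sketch, using the frequencies from the observation table (the variable names are only illustrative):

```python
from itertools import accumulate

# Class labels and frequencies taken from the observation table.
class_limits = ["< 20", "< 40", "< 60", "< 80", "< 100"]
frequencies = [13, 18, 25, 15, 9]

# Each cumulative frequency is the sum of all frequencies up to that class.
cumulative = list(accumulate(frequencies))

print(f"{'Class':>6} {'Frequency':>10} {'Cumulative':>11}")
for limit, freq, cum in zip(class_limits, frequencies, cumulative):
    print(f"{limit:>6} {freq:>10} {cum:>11}")
```

The last cumulative value always equals the total number of observations, which is a quick check on the table.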
Frequency diagrams
[Figures: Frequency and Cumulative Frequency charts for the class intervals < 20, < 40, < 60, < 80, < 100]
Grouped data
have been organized into a frequency
distribution
Ages of a Sample of Managers from XYZ
42  26  32  34  57
30  58  37  50  30
53  40  30  47  49
50  40  32  31  40
52  28  23  35  25
30  36  32  26  50
55  30  58  64  52
49  33  43  46  32
61  31  30  40  60
74  37  29  43  54
Class Interval    Frequency
20-under 30        6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70        3
70-under 80        1
Data Range
Smallest = 23
Largest = 74
Range = Largest - Smallest = 74 - 23 = 51
Approximate Class Width = Range / Number of Classes = 51 / 6 = 8.5
Class Width = 10
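The range, approximate class width, and class frequencies can be checked with a short Python sketch. The ages are the 50 values from the managers' sample; six classes starting at 20 with width 10 follow the slides, while the function name and its parameters are only illustrative.

```python
# Ages of the sample of managers, as listed in the data.
ages = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]


def frequency_distribution(data, start, width, n_classes):
    """Count values falling in [start, start + width), [start + width, ...), etc."""
    counts = [0] * n_classes
    for value in data:
        index = (value - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} falls outside the class intervals")
        counts[index] += 1
    return counts


data_range = max(ages) - min(ages)   # largest minus smallest
approx_width = data_range / 6        # six classes, as on the slide
class_width = 10                     # rounded up to a convenient width

counts = frequency_distribution(ages, start=20, width=class_width, n_classes=6)

print("Range:", data_range)
print("Approximate class width:", approx_width)
for i, count in enumerate(counts):
    low = 20 + i * class_width
    print(f"{low}-under {low + class_width}: {count}")
```

Each class includes its lower endpoint and excludes its upper endpoint, which matches the "20-under 30" labelling.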
Class Midpoint
Class Midpoint = (beginning class endpoint + ending class endpoint) / 2
               = (30 + 40) / 2
               = 35
Class Midpoint = class beginning point + (1/2)(class width)
               = 30 + (1/2)(10)
               = 35
Relative Frequency
Class Interval    Frequency    Relative Frequency
20-under 30        6           .12   (6/50)
30-under 40       18           .36   (18/50)
40-under 50       11           .22   (11/50)
50-under 60       11           .22   (11/50)
60-under 70        3           .06   (3/50)
70-under 80        1           .02   (1/50)
Total             50          1.00
Cumulative Frequency
Class Interval    Frequency    Cumulative Frequency
20-under 30        6            6
30-under 40       18           24   (18 + 6)
40-under 50       11           35   (11 + 24)
50-under 60       11           46
60-under 70        3           49
70-under 80        1           50
Total             50
Class Interval    Frequency    Midpoint    Relative Frequency    Cumulative Frequency
20-under 30        6           25          .12                    6
30-under 40       18           35          .36                   24
40-under 50       11           45          .22                   35
50-under 60       11           55          .22                   46
60-under 70        3           65          .06                   49
70-under 80        1           75          .02                   50
Total             50                      1.00
Class Interval    Frequency    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
20-under 30        6           .12                    6                       .12
30-under 40       18           .36                   24                       .48
40-under 50       11           .22                   35                       .70
50-under 60       11           .22                   46                       .92
60-under 70        3           .06                   49                       .98
70-under 80        1           .02                   50                      1.00
Total             50          1.00
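Every derived column in these tables follows from the class limits and the frequency column. A minimal Python sketch, assuming the class limits and frequencies shown in the tables (the variable names are illustrative):

```python
from itertools import accumulate

# Lower class limits, class width, and frequencies from the tables.
lower_limits = [20, 30, 40, 50, 60, 70]
class_width = 10
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)

# Midpoint = class beginning point + half the class width.
midpoints = [low + class_width / 2 for low in lower_limits]

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in frequencies]

# Cumulative frequency = running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency = cumulative frequency / total frequency.
cumulative_relative = [c / total for c in cumulative]

for low, f, m, r, c, cr in zip(
    lower_limits, frequencies, midpoints, relative, cumulative, cumulative_relative
):
    label = f"{low}-under {low + class_width}"
    print(f"{label:<12} {f:>3} {m:>5.0f} {r:>5.2f} {c:>3} {cr:>5.2f}")
```

Dividing the cumulative counts by the total, rather than adding up rounded relative frequencies, avoids accumulating rounding error in the last column.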
Histogram
[Figure: histogram; x-axis Years (10 to 80), y-axis Frequency (0 to 20)]
Histogram Construction
[Figure: histogram construction; x-axis Years (10 to 80), y-axis Frequency (0 to 20)]
Frequency Polygon
[Figure: frequency polygon; x-axis Years (10 to 80), y-axis Frequency (0 to 20)]
Ogive
Class Interval    Cumulative Frequency
20-under 30        6
30-under 40       24
40-under 50       35
50-under 60       46
60-under 70       49
70-under 80       50
[Figure: ogive; x-axis Years (10 to 80), y-axis cumulative Frequency (0 to 60)]
Class Interval    Cumulative Relative Frequency
20-under 30        .12
30-under 40        .48
40-under 50        .70
50-under 60        .92
60-under 70        .98
70-under 80       1.00
[Figure: ogive of cumulative relative frequency; x-axis Years (0 to 80), y-axis 0.00 to 1.00]
Complaints by Passengers
COMPLAINT            NUMBER    PROPORTION    DEGREES
Stations, etc.       28,000    .40           144.0
Train Performance    14,700    .21            75.6
Equipment            10,500    .15            54.0
Personnel             9,800    .14            50.4
Schedules, etc.       7,000    .10            36.0
Total                70,000   1.00           360.0
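The proportion and degrees columns of a pie chart table follow directly from the counts: proportion = number / total, and degrees = proportion × 360. A short Python sketch using the complaint counts from the table (the function name is only illustrative):

```python
# Complaint counts taken from the table.
complaints = {
    "Stations, etc.": 28_000,
    "Train Performance": 14_700,
    "Equipment": 10_500,
    "Personnel": 9_800,
    "Schedules, etc.": 7_000,
}


def pie_slices(counts):
    """Map each category to (proportion of total, angle of its slice in degrees)."""
    total = sum(counts.values())
    return {
        label: (n / total, n / total * 360)
        for label, n in counts.items()
    }


slices = pie_slices(complaints)

for label, (proportion, degrees) in slices.items():
    print(f"{label:<18} {proportion:>5.2f} {degrees:>6.1f}")
```

The degrees must add up to 360, so summing that column is a simple way to catch an arithmetic slip in a hand-built table.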
Complaints by Passengers
[Pie chart: Stations, Etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, Etc. 10%]
Second Quarter Truck Production
2d Quarter Truck Production, by Company:
357,411
354,936
160,997
34,099
12,747
Totals: 920,190
Second Quarter Truck Production
[Pie chart: slices of 39%, 39%, 17%, 4%, and 1%]
Company    2d Quarter Truck Production    Proportion    Degrees
A          357,411                        .388          140
B          354,936                        .386          139
C          160,997                        .175           63
D           34,099                        .037           13
E           12,747                        .014            5
Totals     920,190                       1.000          360
Proportion for A: 357,411 / 920,190 = .388
Degrees for A: .388 × 360 = 140
Pareto Chart
[Figure: Pareto chart of Poor Wiring, Short in Coil, Defective Plug, and Other; left axis Frequency (0 to 100), right axis cumulative percentage (0% to 100%)]
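A Pareto chart orders categories from most to least frequent and overlays the cumulative percentage. The counts behind the chart are not given in the text, so the sketch below uses hypothetical counts purely for illustration; only the category names come from the slide, and the function name is an assumption.

```python
# Hypothetical defect counts (illustrative only; not taken from the slide).
defects = {
    "Short in Coil": 30,
    "Poor Wiring": 40,
    "Other": 5,
    "Defective Plug": 25,
}


def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, n in ordered:
        running += n
        rows.append((category, n, 100 * running / total))
    return rows


rows = pareto_table(defects)

for category, n, cum_pct in rows:
    print(f"{category:<15} {n:>4} {cum_pct:>6.1f}%")
```

The bars of the chart would be drawn from the count column and the line from the cumulative percentage column.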
Scatter Plot
[Figure: scatter plot of Gasoline Sales (1000's of Gallons, 0 to 200) against Registered Vehicles (1000's, 0 to 20)]
Minimum Wage
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80
Good Presentation
[Figure: Minimum Wage plotted against year for 1960, 1970, 1980, and 1990]
Graphical Errors:
Compressing the Vertical Axis
Bad Presentation
[Figure: Quarterly Sales for Q1 to Q4 on a vertical axis running from 0 to 200]
Good Presentation
[Figure: Quarterly Sales for Q1 to Q4 on a vertical axis running from 0 to 50]
Good Presentations
[Figures: two Monthly Sales charts with vertical axes marked 36, 39, 42, 45]
Thank You
http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html
http://www.ilir.uiuc.edu/courses/lir593/