ORGANIZING DATA
RAW DATA
Definition
Data recorded in the sequence in
which they are collected and before
they are processed or ranked are
called raw data.
Table 2.1 Ages of 50 students
21 19 24 25 29 34 26 27 37 33
18 20 19 22 19 19 25 22 25 23
25 19 31 19 23 18 23 19 23 26
22 28 21 20 22 22 21 20 19 21
25 23 18 37 27 23 21 25 21 24
Table 2.2 Status of 50 Students
J F SO SE J J SE J J J
F F J F F F SE SO SE J
J F SE SO SO F J F SE SE
SO SE J SO SO J J SO F SO
SE SE F SE J SO F J SO SO
ORGANIZING AND
GRAPHING QUALITATIVE
DATA
Frequency Distributions
Relative Frequency and Percentage
Distributions
Graphical Presentation of Qualitative
Data
Bar Graphs
Pie Charts
TABLE 2.3 Type of Employment Students
Intend to Engage In
Frequency Distributions
Definition
A frequency distribution for
qualitative data lists all categories and
the number of elements that belong to
each of the categories.
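As a minimal sketch of this definition, the Python snippet below tallies the student statuses of Table 2.2. The variable names are illustrative only; the codes F, SO, J, and SE are the labels used in that table.

```python
from collections import Counter

# Status of 50 students, copied row by row from Table 2.2.
statuses = """
J F SO SE J J SE J J J
F F J F F F SE SO SE J
J F SE SO SO F J F SE SE
SO SE J SO SO J J SO F SO
SE SE F SE J SO F J SO SO
""".split()

# A frequency distribution lists every category with its count.
frequency = Counter(statuses)

for category in ("F", "SO", "J", "SE"):
    print(f"{category:<3} {frequency[category]}")
print("Sum =", sum(frequency.values()))
```

The sum of the frequencies should equal the number of students, which is a quick check that no observation was dropped.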
Variable
A characteristic that varies
from one individual or object
to another is called a variable.
Frequency Distribution:
A frequency distribution divides observations in the data set into conveniently established,
numerically ordered classes (groups or categories).
For example:
An advertising company kept an account of response letters received each day
over a period of 50 days. The observations were:
0 2 1 1 1 2 0 0 1 0 1
0 0 1 0 1 1 0 2 0 0 2
0 1 0 1 0 1 0 3 1 0 1
0 1 0 2 5 1 2 0 0 0 0
3 0 1 1 2 0
Frequency Distribution:
Any table arranged in such a way
that the data are listed with their frequencies.
CLASS BOUNDARIES:
The true class limits of a class are known as its class boundaries.
In this example, the difference between the upper class boundary
and the lower class boundary of any class is equal to the class interval h = 3:
32.95 − 29.95 = 3, 35.95 − 32.95 = 3, and so on.
Relative Frequency Distribution
Mid Point
Example 2-1
A sample of 30 employees from large
companies was selected, and these
employees were asked how stressful their
jobs were. The responses of these
employees are recorded next, where very
represents very stressful, somewhat
means somewhat stressful, and none
stands for not stressful at all.
Example 2-1
Somewhat None Somewhat Very Very None
Solution 2-1
Table 2.4 Frequency Distribution of Stress on Job
Stress on Job Tally Frequency (f)
Very |||| |||| 10
Somewhat |||| |||| |||| 14
None |||| | 6
Sum = 30
Relative Frequency and
Percentage Distributions
Calculating Relative Frequency of a
Category
Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)
Relative Frequency and
Percentage Distributions
cont.
Calculating Percentage
Percentage = (Relative frequency) × 100
Example 2-2
Determine the relative frequency and
percentage for the data in Table 2.4.
Solution 2-2
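The calculation can be sketched in Python as follows. This assumes the usual definitions (relative frequency = category frequency divided by the sum of all frequencies; percentage = relative frequency times 100); the variable names are illustrative.

```python
# Frequencies from Table 2.4 (stress on job, 30 employees).
frequency = {"Very": 10, "Somewhat": 14, "None": 6}
total = sum(frequency.values())

# Relative frequency = category frequency / sum of all frequencies.
relative = {category: f / total for category, f in frequency.items()}

# Percentage = relative frequency * 100.
percentage = {category: r * 100 for category, r in relative.items()}

for category in frequency:
    print(f"{category:<9} {relative[category]:.3f} {percentage[category]:.1f}")
```

The relative frequencies should sum to 1 and the percentages to 100, apart from rounding.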
Graphical Presentation of
Qualitative Data
Definition
A graph made of bars whose heights
represent the frequencies of respective
categories is called a bar graph.
Figure 2.1 Bar graph for the frequency
distribution of Table 2.4
[Bar graph: horizontal axis, Stress on Job (Very, Somewhat, None); vertical axis, Frequency (0 to 16)]
Graphical Presentation of
Qualitative Data cont.
Definition
A circle divided into portions that
represent the relative frequencies or
percentages of a population or a
sample belonging to different
categories is called a pie chart.
Table 2.6 Calculating Angle Sizes for the Pie
Chart
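Angle sizes for a pie chart are usually obtained by giving each category its share of the full 360 degrees. The following is a minimal Python sketch under that assumption, using the frequencies of Table 2.4; the names are illustrative.

```python
# Frequencies from Table 2.4 (stress on job, 30 employees).
frequency = {"Very": 10, "Somewhat": 14, "None": 6}
total = sum(frequency.values())

# Angle of a slice = (frequency / total) * 360 degrees.
# Multiplying before dividing keeps the arithmetic exact for these values.
angles = {category: f * 360 / total for category, f in frequency.items()}

for category, angle in angles.items():
    print(f"{category:<9} {angle:.1f} degrees")
print("Sum =", sum(angles.values()))
```

The angles should add up to 360 degrees, just as the percentages add up to 100.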
Figure 2.2 Pie chart for the percentage
distribution of Table 2.5.
[Pie chart: Very, 33.3%; Somewhat, 46.7%; None, 20%]
Graphical Presentation of
Data
TYPES OF DATA
Qualitative Quantitative
Frequency Curve
Presentation of
Qualitative Data
Univariate: Frequency Table, Percentages, Pie Chart, Bar Chart
Bivariate: Frequency Table, Component Bar Chart, Multiple Bar Chart
Dividing the cell frequencies by the total frequency and multiplying by 100, we obtain the following:
Medium of Institution    f       %
Urdu                     719     59.9 ≈ 60%
English                  481     40.1 ≈ 40%
Total                    1200
PIE CHART
Medium of Institution    f       Angle
Urdu                     719     215.7°
English                  481     144.3°
Total                    1200
[Pie chart: Urdu, 215.7°; English, 144.3°]
SIMPLE BAR CHART:
Suppose we have available to us information regarding the turnover of a company
for 5 years as given in the table below:
Years 1965 1966 1967 1968 1969
[Simple bar chart: horizontal axis, Years (1965 to 1969); vertical axis, turnover (0 to 50,000)]
Bivariate Data:
Suppose that along with the enquiry about the Medium of Institution,
we are also recording the sex of the student.
Student No. Medium Gender
1 U F
2 U M
3 E M
4 U F
5 E M
6 E F
7 U M
8 E M
: : :
: : :
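A bivariate frequency table can be built by counting each (medium, gender) pair. The sketch below is a Python illustration using only the eight records shown above; the elided records are not invented, so the counts are not those of the full survey. The names are illustrative.

```python
from collections import Counter

# The eight (medium, gender) records listed above: U/E = Urdu/English, M/F = male/female.
records = [
    ("U", "F"), ("U", "M"), ("E", "M"), ("U", "F"),
    ("E", "M"), ("E", "F"), ("U", "M"), ("E", "M"),
]

# Count every combination of the two variables.
cell = Counter(records)

print("Medium  M  F  Total")
for medium in ("U", "E"):
    males = cell[(medium, "M")]
    females = cell[(medium, "F")]
    print(f"{medium:<7} {males}  {females}  {males + females}")
```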
Medium \ Sex    Male    Female    Total
Urdu            202     517       719
MULTIPLE BAR CHART
Multiple Bar Chart Showing Imports & Exports of Pakistan 1970-71 to 1974-75
[Multiple bar chart: horizontal axis, years 1970-71 to 1974-75; vertical axis, 0 to 2500; legend: Imports, Exports]
ORGANIZING AND
GRAPHING QUANTITATIVE
DATA
Frequency Distributions
Constructing Frequency Distribution
Tables
Relative and Percentage Distributions
Graphing Grouped Data
Histograms
Polygons
Frequency Distributions
Table 2.7 Weekly Earnings of 100 Employees of a
Company
Weekly Earnings (dollars)    Number of Employees (f)
401 to 600                   9
601 to 800                   22
801 to 1000                  39
1001 to 1200                 15
1201 to 1400                 9
1401 to 1600                 6
Here weekly earnings is the variable and f is the frequency column.
The third class is 801 to 1000, and 39 is the frequency of the third class.
Frequency Distributions
cont.
Definition
A frequency distribution for
quantitative data lists all the classes
and the number of values that belong
to each class. Data presented in the
form of a frequency distribution are
called grouped data.
Frequency Distributions
cont.
Definition
The class boundary is given by the
midpoint of the upper limit of one
class and the lower limit of the next
class.
Frequency Distributions
cont.
Finding Class Width
Class width = Upper boundary − Lower boundary
Frequency Distributions
cont.
Calculating Class Midpoint or Mark
Class midpoint or mark = (Lower limit + Upper limit) / 2
Constructing Frequency
Distribution Tables
Calculation of Class Width
Approximate class width = (Largest value − Smallest value) / Number of classes
Table 2.8 Class Boundaries, Class Widths, and
Class Midpoints for Table 2.7
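The entries of such a table can be sketched in Python from the class limits of Table 2.7. The sketch takes each boundary as the midpoint between the upper limit of one class and the lower limit of the next, and assumes the same half-gap applies at the two outer ends. It also assumes width = upper boundary minus lower boundary and midpoint = (lower limit + upper limit) / 2. The names are illustrative.

```python
# Class limits from Table 2.7 (weekly earnings in dollars).
limits = [(401, 600), (601, 800), (801, 1000),
          (1001, 1200), (1201, 1400), (1401, 1600)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = limits[1][0] - limits[0][1]  # 601 - 600 = 1

rows = []
for lower, upper in limits:
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    width = upper_boundary - lower_boundary
    midpoint = (lower + upper) / 2
    rows.append((lower_boundary, upper_boundary, width, midpoint))

for (lower, upper), (lb, ub, width, mid) in zip(limits, rows):
    print(f"{lower} to {upper}: {lb} to less than {ub}, width {width}, midpoint {mid}")
```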
Table 2.9 Home Runs Hit by Major League
Baseball Teams During the 2002
Season
Solution 2-3
Approximate width of each class = (230 − 124) / 5 = 21.2
Now we round this approximate width to a
convenient number, say 22.
Solution 2-3
The lower limit of the first class can be
taken as 124 or any number less than 124.
Suppose we take 124 as the lower limit of
the first class. Then our classes will be
124 – 145, 146 – 167, 168 – 189, 190 – 211,
and 212 – 233.
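These steps can be sketched in Python. Rounding the approximate width up with `math.ceil` is an assumption made for the sketch; the solution only says that a convenient number is chosen. The names are illustrative.

```python
import math

smallest, largest = 124, 230   # extreme values used in Solution 2-3
number_of_classes = 5

# Approximate width = (largest value - smallest value) / number of classes.
approx_width = (largest - smallest) / number_of_classes
width = math.ceil(approx_width)   # 21.2 is rounded up to 22

# Build the class limits, starting from the chosen lower limit of 124.
classes = []
lower = smallest
for _ in range(number_of_classes):
    classes.append((lower, lower + width - 1))
    lower += width

print(classes)
```

Rounding up rather than down guarantees that the last class still reaches the largest value, 230.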
Table 2.10 Frequency Distribution for the
Data of Table 2.9
Example 2-4
Calculate the relative frequencies
and percentages for Table 2.10
Solution 2-4
Table 2.11 Relative Frequency and Percentage
Distributions for Table 2.10
Total Home Runs    Class Boundaries    Relative Frequency    Percentage
Graphing Grouped Data
Definition
A histogram is a graph in which classes are
marked on the horizontal axis and the
frequencies, relative frequencies, or
percentages are marked on the vertical axis.
The frequencies, relative frequencies, or
percentages are represented by the heights
of the bars. In a histogram, the bars are
drawn adjacent to each other.
Figure 2.3 Frequency histogram for Table
2.10.
[Histogram: vertical axis, Frequency]
Definition
A graph formed by joining the
midpoints of the tops of successive
bars in a histogram with straight lines
is called a polygon.
Figure 2.5 Frequency polygon for Table 2.10.
[Frequency polygon: vertical axis, Frequency]
Example 2-5
The following data give the average
travel time from home to work (in
minutes) for 50 states. The data are
based on a sample survey of 700,000
households conducted by the Census
Bureau (USA TODAY, August 6, 2001).
Example 2-5
22.4 18.2 23.7 19.8 26.7 23.4 23.5 22.5 24.3 26.7 24.2
19.7 27.0 21.7 17.6 17.7 22.5 23.7 21.2 29.2 26.1 22.7
21.6 21.9 23.2 16.0 16.1 22.3 24.4 28.7 19.9 31.2 22.6
15.4 22.1 19.6 21.4 23.8 21.9 21.9 15.6 22.7 23.6 20.8
21.1 25.4 24.9 25.5 20.1 17.1
Solution 2-5
Approximate width of each class = (31.2 − 15.4) / 6 = 2.63
Solution 2-5
Table 2.12 Frequency, Relative Frequency, and
Percentage Distributions of Average Travel
Time to Work
Class Boundaries      f          Relative Frequency    Percentage
15 to less than 18    7          .14                   14
18 to less than 21    7          .14                   14
21 to less than 24    23         .46                   46
24 to less than 27    9          .18                   18
27 to less than 30    3          .06                   6
30 to less than 33    1          .02                   2
                      Σf = 50    Sum = 1.00            Sum = 100%
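As a check on Table 2.12, the grouping can be sketched in Python using the "less than" method with a class width of 3 starting at 15. The names are illustrative.

```python
# Average travel times (minutes) for 50 states, from Example 2-5.
travel_times = [
    22.4, 18.2, 23.7, 19.8, 26.7, 23.4, 23.5, 22.5, 24.3, 26.7, 24.2,
    19.7, 27.0, 21.7, 17.6, 17.7, 22.5, 23.7, 21.2, 29.2, 26.1, 22.7,
    21.6, 21.9, 23.2, 16.0, 16.1, 22.3, 24.4, 28.7, 19.9, 31.2, 22.6,
    15.4, 22.1, 19.6, 21.4, 23.8, 21.9, 21.9, 15.6, 22.7, 23.6, 20.8,
    21.1, 25.4, 24.9, 25.5, 20.1, 17.1,
]
n = len(travel_times)

table = []
for low in range(15, 33, 3):          # lower boundaries 15, 18, ..., 30
    high = low + 3
    # A value belongs to the class if low <= value < high.
    f = sum(low <= t < high for t in travel_times)
    table.append((low, high, f, f / n, 100 * f / n))

for low, high, f, rel, pct in table:
    print(f"{low} to less than {high}: f={f}, relative={rel:.2f}, percentage={pct:.0f}")
```

Because each class includes its lower boundary and excludes its upper one, a value such as 27.0 falls into the class 27 to less than 30.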
Example 2-6
The administration in a large city wanted to know
the distribution of vehicles owned by households in
that city. A sample of 40 randomly selected
households from this city produced the following
data on the number of vehicles owned:
5 1 1 2 0 1 1 2 1 1
1 3 3 0 2 5 1 2 3 4
2 1 2 2 1 2 2 1 1 1
4 2 1 1 2 1 1 4 1 3
Construct a frequency distribution table for these
data, and draw a bar graph.
Solution 2-6
Table 2.13 Frequency Distribution of Vehicles
Owned
Vehicles Owned    Number of Households (f)
0                 2
1                 18
2                 11
3                 4
4                 3
5                 2
                  Σf = 40
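For a discrete variable with few distinct values, each value can serve as its own class. The Python sketch below reproduces the counts of Table 2.13 and prints a rough text version of a bar graph; the names and the `#` bars are illustrative.

```python
from collections import Counter

# Number of vehicles owned by 40 households, from Example 2-6.
vehicles = [
    5, 1, 1, 2, 0, 1, 1, 2, 1, 1,
    1, 3, 3, 0, 2, 5, 1, 2, 3, 4,
    2, 1, 2, 2, 1, 2, 2, 1, 1, 1,
    4, 2, 1, 1, 2, 1, 1, 4, 1, 3,
]

frequency = Counter(vehicles)

# One row per distinct value; the bar length equals the frequency.
for value in sorted(frequency):
    print(f"{value}  {frequency[value]:>2}  {'#' * frequency[value]}")
print("Sum =", sum(frequency.values()))
```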
Figure 2.7 Bar graph for Table 2.13.
[Bar graph: horizontal axis, Vehicles owned (No Car, 1 Car, 2 Cars, 3 Cars, 4 Cars, 5 Cars); vertical axis, Frequency (0 to 20)]
SHAPES OF HISTOGRAMS
1. Symmetric
2. Skewed
3. Uniform or rectangular
Figure 2.8 Symmetric histograms.
Figure 2.9 (a) A histogram skewed to the
right. (b) A histogram skewed to the left.
Figure 2.10 A histogram with uniform
distribution.
Figure 2.11 (a) and (b) Symmetric frequency
curves. (c) Frequency curve skewed
to the right. (d) Frequency curve skewed
to the left.
CUMULATIVE FREQUENCY
DISTRIBUTIONS
Definition
A cumulative frequency distribution
gives the total number of values that
fall below the upper boundary of each
class.
Example 2-7
Using the frequency distribution of
Table 2.10, reproduced in the next
slide, prepare a cumulative frequency
distribution for the home runs hit by
Major League Baseball teams during
the 2002 season.
Example 2-7
Total Home Runs    f
124 – 145          6
146 – 167          13
168 – 189          4
190 – 211          4
212 – 233          3
Solution 2-7
Table 2.14 Cumulative Frequency Distribution of Home
Runs by Baseball Teams
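Cumulative frequencies are running totals of the class frequencies. A minimal Python sketch, using the frequencies reproduced in Example 2-7; labelling each cumulative class from 124 up to its upper limit is an assumption about the layout, and the names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies from Table 2.10, as reproduced in Example 2-7.
limits = [(124, 145), (146, 167), (168, 189), (190, 211), (212, 233)]
frequency = [6, 13, 4, 4, 3]

# Each cumulative frequency adds the current class to all earlier classes.
cumulative = list(accumulate(frequency))

first_lower = limits[0][0]
for (_, upper), total in zip(limits, cumulative):
    print(f"{first_lower} to {upper}: {total}")
```

The last cumulative frequency always equals the total number of observations, here 30.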
CUMULATIVE FREQUENCY
DISTRIBUTIONS cont.
Calculating Cumulative Relative
Frequency and Cumulative Percentage
Cumulative relative frequency = (Cumulative frequency of a class) / (Total observations in the data set)
Cumulative percentage = (Cumulative relative frequency) × 100
Table 2.15 Cumulative Relative Frequency
and Cumulative Percentage
Distributions for Home Runs Hit by
Baseball Teams
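The two cumulative columns can be sketched in Python from the home-run frequencies. This divides each cumulative frequency by the total number of observations and assumes the cumulative percentage is that ratio times 100. The names are illustrative.

```python
from itertools import accumulate

# Frequencies of the five home-run classes (Table 2.10).
frequency = [6, 13, 4, 4, 3]
total = sum(frequency)

cumulative = list(accumulate(frequency))

# Cumulative relative frequency = cumulative frequency / total observations.
cumulative_relative = [c / total for c in cumulative]

# Cumulative percentage = cumulative relative frequency * 100 (assumed definition).
cumulative_percentage = [r * 100 for r in cumulative_relative]

for c, r, p in zip(cumulative, cumulative_relative, cumulative_percentage):
    print(f"{c:>2}  {r:.3f}  {p:.1f}")
```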
CUMULATIVE FREQUENCY
DISTRIBUTIONS cont.
Definition
An ogive is a curve drawn for the
cumulative frequency distribution by
joining with straight lines the dots
marked above the upper boundaries of
classes at heights equal to the
cumulative frequencies of respective
classes.
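The points that an ogive joins can be sketched in Python for the home-run data. Two assumptions are made here: the class boundaries lie 0.5 beyond the class limits, and the curve starts at the lower boundary of the first class with height 0. The names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies for the home-run data (Table 2.10).
limits = [(124, 145), (146, 167), (168, 189), (190, 211), (212, 233)]
frequency = [6, 13, 4, 4, 3]

# Upper class boundaries, assuming a half-unit gap between adjacent classes.
upper_boundaries = [upper + 0.5 for _, upper in limits]
cumulative = list(accumulate(frequency))

# Dots sit above the upper boundaries at heights equal to the cumulative
# frequencies; the starting point at height 0 is an assumed convention.
start = (limits[0][0] - 0.5, 0)
points = [start] + list(zip(upper_boundaries, cumulative))

for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the ogive.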
Figure 2.12 Ogive for the cumulative
frequency distribution in Table
2.14
[Ogive: vertical axis, Cumulative frequency]
Example 2-8
The following are the scores of 30
college students on a statistics test:
75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98
Solution 2-8
To construct a stem-and-leaf display
for these scores, we split each score
into two parts. The first part contains
the first digit, which is called the stem.
The second part contains the second
digit, which is called the leaf.
Solution 2-8
We observe from the data that the
stems for all scores are 5, 6, 7, 8, and
9 because all the scores lie in the
range 50 to 98.
Figure 2.13 Stem-and-leaf display.
Stems
5 | 2    (leaf for 52)
6 |
7 | 5    (leaf for 75)
8 |
9 |
Solution 2-8
After we have listed the stems, we
read the leaves for all scores and
record them next to the corresponding
stems on the right side of the vertical
line.
Figure 2.14 Stem-and-leaf display of test
scores.
5 | 2 0 7
6 | 5 9 1 8 4
7 | 5 9 1 2 6 9 7 1 2
8 | 0 7 1 6 3 4 7
9 | 6 3 5 2 2 8
Figure 2.15 Ranked stem-and-leaf display of
test scores.
5 | 0 2 7
6 | 1 4 5 8 9
7 | 1 1 2 2 5 6 7 9 9
8 | 0 1 3 4 6 7 7
9 | 2 2 3 5 6 8
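The construction of both displays can be sketched in Python for the two-digit scores of Example 2-8. The function name `stem_and_leaf` and its `ranked` flag are hypothetical, chosen for this illustration.

```python
from collections import defaultdict

# Scores of 30 college students on a statistics test (Example 2-8).
scores = [
    75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
    69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
    83, 84, 77, 64, 71, 87, 72, 92, 57, 98,
]

def stem_and_leaf(values, ranked=False):
    """Map each stem (tens digit) to its leaves (units digits)."""
    display = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # 75 -> stem 7, leaf 5
        display[stem].append(leaf)
    if ranked:
        for leaves in display.values():
            leaves.sort()                # rank the leaves within each stem
    return dict(sorted(display.items()))

for stem, leaves in stem_and_leaf(scores, ranked=True).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

With `ranked=False` the leaves stay in the order in which the scores were read. A stem with no scores is simply absent in this sketch, so an empty stem would have to be added by hand.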
Example 2-9
The following data are monthly rents
paid by a sample of 30 households
selected from a small city.
880 1081 721 1075 1023 775 1235 750 965 960
1210 985 1231 932 850 825 1000 915 1191 1035
1151 630 1175 952 1100 1140 750 1140 1370 1280
Solution 2-9
Figure 2.16 Stem-and-leaf display of rents.
 6 | 30
 7 | 75 50 21 50
 8 | 80 25 50
 9 | 32 52 15 60 85 65
10 | 23 81 35 75 00
11 | 91 51 40 75 40 00
12 | 10 31 35 80
13 | 70
Example 2-10
The following stem-and-leaf display is
prepared for the number of hours that
25 students spent working on
computers during the last month.
Example 2-10
0 | 6
1 | 1 7 9
2 | 2 6
3 | 2 4 7 8
4 | 1 5 6 9 9
5 | 3 6 8
6 | 2 4 4 5 7
7 |
8 | 5 6
Prepare a new stem-and-leaf display
by grouping the stems.
Solution 2-10
0–2 | 6 * 1 7 9 * 2 6
3–5 | 2 4 7 8 * 1 5 6 9 9 * 3 6 8
6–8 | 2 4 4 5 7 * * 5 6