Presentation of Data
Q.1. What are different methods of presentation of Data?
Ans. (i) Classification
(ii) Tabulation
(iii) Diagrams
(iv) Graphs.
Q.2. What is Classification?
Ans. Classification is the process of arranging the data into relatively homogeneous groups
or classes according to their resemblances and affinities.
Q.3. What is Tabulation?
Ans. The systematic arrangement of data in the form of rows and columns for the purpose
of comparison and analysis is known as tabulation.
Q.4. What is an array?
Ans. Arrangement of data in ascending or descending order is called an array.
Q.5. What are the main parts of a table?
Ans. (i) Title
(ii) Box-head
(iii) Stub
(iv) Body
(v) Prefatory Note
(vi) Foot note
(vii) Source Note.
Q.6. What is frequency distribution?
Ans. A frequency distribution is a tabular arrangement of the data that shows the
distribution of observations among different classes.
Q.7. What are class limits?
Ans. The class limits are the values of the variable that define the classes.
Q.8. What do you mean by open-end class?
Ans. A class that has no lower class limit or no upper class limit is called an open-end
class.
Q.9. What are class boundaries?
Ans. The class boundaries are the exact values that separate one class from the next.
Q.10. What do you mean by class marks (or mid points)?
Ans. A class mark is the average of the lower and upper class limits (or class boundaries) of a class.
Q.11. What is class interval?
Ans. The difference between the upper and lower class boundaries is called class interval
or class width.
Q.12. What is class frequency?
Ans. The number of values falling in a specified class is called class frequency or
frequency.
Q.13. What is relative frequency?
Ans. The frequency of a class divided by the total frequency is called relative frequency.
Q.14. What is Histogram?
Ans. A histogram is a set of adjacent rectangles for a frequency distribution such that the
area of each rectangle is proportional to the corresponding class frequency.
Q.15. What is a frequency polygon?
Ans. A frequency polygon is a many-sided closed figure that represents a frequency
distribution.
Q.16. How is a frequency polygon constructed?
Ans. It is constructed by plotting the mid points and corresponding frequencies and then
connecting them by straight line segments.
Q.17. What is ogive?
Ans. The cumulative frequency polygon is called ogive.
Q.18. What is chart?
Ans. A chart is a device for representing simple statistical data in a clear and effective
manner.
Q.19. What is ungrouped data?
Ans. The fresh data that have been collected for the first time are called ungrouped data.
Q.20. What do you mean by grouped data?
Ans. Data that have been arranged into classes or groups, together with their respective
frequencies, are called grouped data.
Q.21. For which distribution is the graph of the frequency distribution bell-shaped?
Ans. For a symmetrical distribution the graph is bell-shaped.
Q.22. Name some graphs of frequency distribution.
Ans. Histogram, polygon, frequency curve, ogive.
Q.23. Name some charts / diagrams.
Ans. Simple bar diagram, sub-divided bar diagram, Multiple bar diagram, Pie chart etc.
Q.24. What is the mid point of class 20-24?
Ans. Mid point is 22.
Q.25. Write the formula of angle of sector used in pie chart.
Ans. Angle of sector: $\theta = \frac{\text{Component part}}{\text{Total}} \times 360^\circ$
************************
Example 1: The following data shows the number of children in different families
of a small locality:
1, 2, 4, 3, 0, 1, 2, 3, 1, 1, 0, 2, 1, 0, 2, 3, 0, 0, 1, 3.
Make a frequency distribution. Also find relative frequencies.
Solution:
Range = Maximum value – Minimum value = 4 – 0 = 4

Number of children   Tally      Number of families (f)   r.f. = f/Σf
0                    /////      5                        5/20 = 0.25
1                    ///// /    6                        6/20 = 0.30
2                    ////       4                        4/20 = 0.20
3                    ////       4                        4/20 = 0.20
4                    /          1                        1/20 = 0.05
Σ                    ----       20                       1.00
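The tally in Example 1 can be checked with a short script. This is a minimal sketch in Python (the notes themselves use no programming language); the variable names are illustrative only.

```python
from collections import Counter

# Number of children in each of the 20 families (data of Example 1).
children = [1, 2, 4, 3, 0, 1, 2, 3, 1, 1, 0, 2, 1, 0, 2, 3, 0, 0, 1, 3]

freq = Counter(children)          # class frequency f for each value
total = sum(freq.values())        # Σf
rel_freq = {value: freq[value] / total for value in sorted(freq)}

for value in sorted(freq):
    print(f"{value}  f = {freq[value]}  r.f. = {rel_freq[value]:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on the counting.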
Example 2: The following data shows the ages of 50 cancer patients admitted in
Shaukat Khanum Memorial Cancer Hospital, Lahore:
48 29 39 32 54 33 44 36 38 31
46 30 20 44 47 39 42 35 33 47
31 35 34 42 41 42 43 35 32 35
43 36 37 45 46 41 25 27 26 40
38 41 44 47 45 45 52 43 44 43
Make a frequency distribution. Also find class boundaries and mid points.
Solution:
The following steps are involved in constructing a frequency distribution.
i) Range = Maximum value – Minimum value = 54 – 20 = 34
ii) Approximate number of classes
No. of classes = 1 + 3.322 log n = 1 + 3.322 log(50)
= 1 + 3.322 (1.6990) = 6.6441 ≅ 7 (approximately)
iii) Width of class interval
$h = \frac{\text{Range}}{\text{No. of classes}} = \frac{34}{7} = 4.8571 \cong 5$ (approximately)
iv) Group the entire data with an interval of 5 each and write down the classes in the first
column under the heading “Ages”. For each value, put a tally mark (/) against the class in which
it falls. Count the tally marks for each class and write the counts in the next column; these
are the frequencies, denoted by f.

Ages (Class limits)   Tally                f    Class boundaries   Mid point (X)
20 – 24               /                    1    19.5 – 24.5        22
25 – 29               ////                 4    24.5 – 29.5        27
30 – 34               ///// ///            8    29.5 – 34.5        32
35 – 39               ///// ///// /        11   34.5 – 39.5        37
40 – 44               ///// ///// /////    15   39.5 – 44.5        42
45 – 49               ///// ////           9    44.5 – 49.5        47
50 – 54               //                   2    49.5 – 54.5        52
Σ                     ----                 50   ----               ----
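The four steps of Example 2 can be sketched in Python as follows. Rounding the number of classes and the class width upwards is an assumption of this sketch (the notes only say "approximately"), and the unit gap of 1 between class limits assumes whole-number ages.

```python
import math

# Ages of the 50 patients (data of Example 2).
ages = [48, 29, 39, 32, 54, 33, 44, 36, 38, 31,
        46, 30, 20, 44, 47, 39, 42, 35, 33, 47,
        31, 35, 34, 42, 41, 42, 43, 35, 32, 35,
        43, 36, 37, 45, 46, 41, 25, 27, 26, 40,
        38, 41, 44, 47, 45, 45, 52, 43, 44, 43]

data_range = max(ages) - min(ages)                     # step i
k = math.ceil(1 + 3.322 * math.log10(len(ages)))       # step ii (rounded up: assumption)
h = math.ceil(data_range / k)                          # step iii (rounded up: assumption)

rows = []                                              # step iv
for i in range(k):
    lower = min(ages) + i * h
    upper = lower + h - 1                              # class limits for whole-number data
    f = sum(lower <= a <= upper for a in ages)
    rows.append((lower, upper, f, lower - 0.5, upper + 0.5, (lower + upper) / 2))

for lower, upper, f, lb, ub, mid in rows:
    print(f"{lower}-{upper}  f={f}  boundaries {lb}-{ub}  mid point {mid}")
```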
Example 3: The following data show the scores made by Pakistani cricketers
against New Zealand in a one-day match. Draw a simple bar chart of the following
data:
Cricketers Inzmam Waseem Shahid Saeed Imran Razzaq
Scores 54 47 26 30 25 23
Solution:
[Figure: simple bar chart with Scores on the y-axis and Cricketers (Inzmam, Waseem, Shahid, Saeed, Imran, Razzaq) on the x-axis.]
[Figure: bar chart with Years (1987, 1988, 1989) on the x-axis and a y-axis scale from 0 to 600.]
ii)
Production in Kg (thousands)
Year           1987   1988   1989
Locality I     500    600    800
Locality II    600    700    700
Locality III   200    400    500
Total          1300   1700   2000
[Figure: bar diagram of production by Years (1987, 1988, 1989), y-axis scale from 0 to 1500.]
iii)
[Figure: bar diagram by Years (1987, 1988, 1989) with a percentage scale from 0% to 70%.]
Example 5: Data are available on the total production of urea fertilizer and
its use on different crops. Total production of urea is 200 (thousand Kg), and its
consumption for the crops wheat, sugarcane, maize and lentils is 75, 80, 30
and 15 (thousand Kg) respectively. Make an appropriate diagram to represent
these data.
Solution:
Angle of sector: $\theta = \frac{\text{Component part}}{\text{Total}} \times 360^\circ$

Crops       Fertilizer (thousand Kg)   Angle of sector
Wheat       75                         (75/200) × 360 = 135°
Sugarcane   80                         (80/200) × 360 = 144°
Maize       30                         (30/200) × 360 = 54°
Lentils     15                         (15/200) × 360 = 27°
Total       200                        360°
[Figure: pie diagram with sectors Wheat 135°, Sugarcane 144°, Maize 54° and Lentils 27°.]
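The sector angles of Example 5 follow directly from the formula, as this small Python sketch shows (the dictionary name is illustrative):

```python
# Urea consumption by crop, in thousand Kg (data of Example 5).
consumption = {"Wheat": 75, "Sugarcane": 80, "Maize": 30, "Lentils": 15}

total = sum(consumption.values())
# Angle of sector = component part / total * 360 degrees.
angles = {crop: kg * 360 / total for crop, kg in consumption.items()}

for crop, angle in angles.items():
    print(f"{crop}: {angle:g} degrees")
```

The angles must add up to 360°, which is a convenient check before drawing the diagram.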
[Figure: histogram with Class Boundaries (85.5 to 115.5) on the x-axis and frequency on the y-axis.]
Ans. Mode: The mode is defined as the most frequent value of the data. It is denoted by $\hat{X}$.
For grouped data,
$\hat{X} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
where
l = lower class boundary of the modal class.
f_m = frequency of the modal class.
f_1 = frequency preceding the modal class.
f_2 = frequency following the modal class.
h = class interval or width of the modal class.
Q.8. Define harmonic mean?
Ans. Harmonic mean is defined as the reciprocal of the mean of the reciprocals of the
observations
Q.9. Define G.M.
Ans. The geometric mean is defined as the nth root of the product of n positive values.
Q.10(a) What do you mean by unimodal, bimodal and multimodal distributions?
(b) When is it not possible to find the mode?
Ans. (a) Unimodal Distribution: A distribution having a single mode is called a unimodal
distribution.
Bimodal Distribution: A distribution having two modes is called a bimodal distribution.
Multimodal Distribution: A distribution having more than two modes is called a multimodal
distribution.
(b) If each value occurs the same number of times, then it is not possible to find the mode.
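Measure: Ungrouped data | Grouped data
Arithmetic Mean (step deviation / coding): $\bar{X} = A + \frac{\Sigma u}{n} \times h$ | $\bar{X} = A + \frac{\Sigma fu}{\Sigma f} \times h$, where $u = \frac{X - A}{h}$
Geometric Mean: $G.M = \text{antilog}\left(\frac{\Sigma \log X}{n}\right)$ | $G.M = \text{antilog}\left(\frac{\Sigma f \log X}{\Sigma f}\right)$
Harmonic Mean: $H.M = \frac{n}{\Sigma \frac{1}{X}}$ | $H.M = \frac{\Sigma f}{\Sigma \frac{f}{X}}$
Median: $\tilde{Y}$ = the value of the $\left(\frac{n+1}{2}\right)$th item | $\tilde{Y} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$
Quartiles (k = 1, 2, 3): $Q_k$ = the value of the $\frac{k(n+1)}{4}$th item | $Q_k = l + \frac{h}{f}\left(\frac{kn}{4} - C\right)$
Note: The formulae of median, quartiles, deciles and percentiles for a discrete frequency
distribution are the same as those for ungrouped data.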
Weighted Arithmetic Mean:
$\bar{Y}_w = \frac{\Sigma WY}{\Sigma W}$
For a symmetrical distribution:
Mean = Median = Mode
For a moderately skewed distribution (empirical relation):
Mode = 3 Median – 2 Mean
Example 1: Find arithmetic mean of the following data:
102, 104, 106, 108, 110.
(i) By direct method (ii) By short-cut method
Solution:
X      D = X – A (A = 100)
102    2
104    4
106    6
108    8
110    10
ΣX = 530    ΣD = 30
Arithmetic mean:
(i) Direct Method: $\bar{X} = \frac{\Sigma X}{n} = \frac{530}{5} = 106$
(ii) Short-cut Method: $\bar{X} = A + \frac{\Sigma D}{n} = 100 + \frac{30}{5} = 100 + 6 = 106$
Example 2: Find average age from the following frequency distribution of ages of
50 patients
Ages     No. of patients
20-24    1
25-29    4
30-34    8
35-39    11
40-44    15
45-49    9
50-54    2
Solution:
Ages f X fX
20-24 1 22 22
25-29 4 27 108
30-34 8 32 256
35-39 11 37 407
40-44 15 42 630
45-49 9 47 423
50-54 2 52 104
Σ 50 ----- 1950
Average Age:
$\bar{X} = \frac{\Sigma fX}{\Sigma f} = \frac{1950}{50} = 39$
Hence the average age of patients is 39 years.
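The grouped mean of Example 2 can be reproduced with a few lines of Python. This is only a sketch; it takes the mid point of each class as the representative value X, exactly as the table does.

```python
# Class limits and frequencies from the table of Example 2.
classes = [(20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49), (50, 54)]
freqs = [1, 4, 8, 11, 15, 9, 2]

mids = [(lower + upper) / 2 for lower, upper in classes]     # X
sum_fx = sum(f * x for f, x in zip(freqs, mids))             # ΣfX
mean_age = sum_fx / sum(freqs)                               # ΣfX / Σf

print(sum_fx, mean_age)
```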
Example 3: Find the arithmetic mean from the given information:
(i) D = X – 39, ΣD = 240 and n = 10
(ii) $u = \frac{X - 57}{5}$, Σu = 23 and n = 20
(iii) X = 10 + 5u, Σfu = –46 and n = 125
Solution:
(i) ΣD = 240, n = 10
D = X – 39; comparing with D = X – A gives A = 39.
Arithmetic mean $= \bar{X} = A + \frac{\Sigma D}{n} = 39 + \frac{240}{10} = 39 + 24 = 63$
(ii) Σu = 23, n = 20
$u = \frac{X - 57}{5}$; comparing with $u = \frac{X - A}{h}$ gives A = 57, h = 5.
Arithmetic mean $= \bar{X} = A + \frac{\Sigma u}{n} \times h = 57 + \frac{23}{20} \times 5 = 57 + 5.75 = 62.75$
(iii) X = 10 + 5u, Σfu = –46, n = 125
X – 10 = 5u, so $u = \frac{X - 10}{5}$; comparing with $u = \frac{X - A}{h}$ gives A = 10, h = 5, n = Σf = 125.
Arithmetic mean $= \bar{X} = A + \frac{\Sigma fu}{\Sigma f} \times h = 10 + \frac{-46}{125} \times 5 = 10 + (-1.84) = 10 - 1.84 = 8.16$
Example 4: Calculate the weighted mean of the following data:
Items Expenditure Weight
Food 290 7.5
Rent 54 2.0
Clothing 98 1.5
Fuel 75 1.0
Miscellaneous 75 0.5
Solution:
Items X W WX
Food 290 7.5 2175
Rent 54 2.0 108
Clothing 98 1.5 147
Fuel 75 1.0 75
Miscellaneous 75 0.5 37.5
Σ ----- 12.5 2542.5
Weighted mean:
$\bar{X}_w = \frac{\Sigma WX}{\Sigma W} = \frac{2542.5}{12.5} = 203.4$
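As a check on Example 4, the weighted mean can be computed directly. A minimal Python sketch, with illustrative variable names:

```python
# Expenditure X and weight W for each item (data of Example 4).
items = ["Food", "Rent", "Clothing", "Fuel", "Miscellaneous"]
expenditure = [290, 54, 98, 75, 75]
weights = [7.5, 2.0, 1.5, 1.0, 0.5]

sum_wx = sum(w * x for w, x in zip(weights, expenditure))   # ΣWX
sum_w = sum(weights)                                        # ΣW
weighted_mean = sum_wx / sum_w

print(sum_wx, sum_w, weighted_mean)
```

Because food carries the largest weight, the weighted mean is pulled towards the food expenditure and is much larger than the simple mean of the five expenditures.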
C.B        f     Y     f/Y
0 – 4      2     2     1.0000
4 – 8      5     6     0.8333
8 – 12     7     10    0.7000
12 – 16    8     14    0.5714
16 – 20    7     18    0.3889
20 – 24    4     22    0.1818
24 – 28    1     26    0.0385
Σ          34    ----  3.7139
$H.M = \frac{\Sigma f}{\Sigma \frac{f}{Y}} = \frac{34}{3.7139} = 9.1548$
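The harmonic mean of the grouped data above can be verified numerically. This Python sketch works with unrounded values of f/Y, so its last decimal places may differ slightly from a hand calculation based on four-decimal table entries.

```python
# Mid points Y and frequencies f from the harmonic mean table above.
mids = [2, 6, 10, 14, 18, 22, 26]
freqs = [2, 5, 7, 8, 7, 4, 1]

sum_f = sum(freqs)                                       # Σf
sum_f_over_y = sum(f / y for f, y in zip(freqs, mids))   # Σ(f/Y)
harmonic_mean = sum_f / sum_f_over_y

print(round(sum_f_over_y, 4), round(harmonic_mean, 4))
```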
Example 8: Find the median from the following data:
(i) c, a, b
(ii) 88.30, 94.50, 95.05, 84.60, 94.90
(iii) 87, 91, 89, 88, 89, 91, 87, 92, 90, 98.
Solution:
(i) The data in an array:
a, b, c
Median = $\left(\frac{n+1}{2}\right)$th value = $\left(\frac{3+1}{2}\right)$th value = 2nd value = b
(ii) The data in an array:
Sr. No.   1       2       3       4       5
Values    84.60   88.30   94.50   94.90   95.05
Here n = 5,
Median = $\left(\frac{n+1}{2}\right)$th value = $\left(\frac{5+1}{2}\right)$th value = 3rd value = 94.50
(iii) The data in an array:
Sr. No.   1    2    3    4    5    6    7    8    9    10
Values    87   87   88   89   89   90   91   91   92   98
Here n = 10,
Median = $\left(\frac{n+1}{2}\right)$th value = $\left(\frac{10+1}{2}\right)$th value = (5.5)th value
= $\frac{\text{5th value} + \text{6th value}}{2} = \frac{89 + 90}{2} = 89.5$
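For numeric ungrouped data the same rule (sort the values, then take the middle value or the average of the two middle values) is what Python's standard `statistics.median` implements. A short sketch using the values shown in the arrays of parts (ii) and (iii):

```python
from statistics import median

# Values as listed in the arrays of Example 8, parts (ii) and (iii).
part_ii = [84.60, 88.30, 94.50, 94.90, 95.05]
part_iii = [87, 91, 89, 88, 89, 91, 87, 92, 90, 98]

median_ii = median(part_ii)      # odd n: the middle value of the sorted data
median_iii = median(part_iii)    # even n: mean of the two middle values

print(sorted(part_iii))
print(median_ii, median_iii)
```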
Example 9: Find the median from the following data of heights of students:
C.I          Frequency
86 – 90      6
91 – 95      4
96 – 100     10
101 – 105    6
106 – 110    3
111 – 115    1
Solution:
C.I          Class boundaries   f     c.f
86 – 90      85.5 – 90.5        6     6
91 – 95      90.5 – 95.5        4     10
96 – 100     95.5 – 100.5       10    20
101 – 105    100.5 – 105.5      6     26
106 – 110    105.5 – 110.5      3     29
111 – 115    110.5 – 115.5      1     30
Σ            ----               30    ----
Median $= l + \frac{h}{f}\left(\frac{n}{2} - C\right)$
$\frac{n}{2}$th value = $\frac{30}{2}$th value = 15th value
∴ Median class is 95.5 – 100.5
Median $= 95.5 + \frac{5}{10}(15 - 10) = 95.5 + 2.5 = 98.0$
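The grouped-median formula used in Example 9 can be written as a small function. This is a sketch in Python; the function name `grouped_median` is hypothetical, and the class boundaries are those of the solution table.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data: l + (h / f) * (n / 2 - C)."""
    half = sum(freqs) / 2          # n / 2
    cumulative = 0                 # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= half:             # this is the median class
            return lower + (upper - lower) / f * (half - cumulative)
        cumulative += f
    raise ValueError("empty frequency table")

boundaries = [(85.5, 90.5), (90.5, 95.5), (95.5, 100.5),
              (100.5, 105.5), (105.5, 110.5), (110.5, 115.5)]
freqs = [6, 4, 10, 6, 3, 1]

median_height = grouped_median(boundaries, freqs)
print(median_height)
```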
Example 10: Find mode for the following data:
91, 89, 88, 87, 89, 91, 87, 92, 90, 98, 95, 97, 96, 100, 101, 96, 98, 99, 98, 100, 102,
99, 101, 105, 103, 107, 105, 106, 107, 112.
Solution: The most frequent value of the data is 98 (it occurs three times). Therefore, Mode = 98.
Example 11: Find the mode for the following frequency distribution of heights of
students:
Heights          Frequency
86 ≤ X ≤ 90      6
91 ≤ X ≤ 95      4
96 ≤ X ≤ 100     10
101 ≤ X ≤ 105    6
106 ≤ X ≤ 110    3
111 ≤ X ≤ 115    1
Solution:
Heights          C.I          Class boundaries   f
86 ≤ X ≤ 90      86 – 90      85.5 – 90.5        6
91 ≤ X ≤ 95      91 – 95      90.5 – 95.5        4
96 ≤ X ≤ 100     96 – 100     95.5 – 100.5       10
101 ≤ X ≤ 105    101 – 105    100.5 – 105.5      6
106 ≤ X ≤ 110    106 – 110    105.5 – 110.5      3
111 ≤ X ≤ 115    111 – 115    110.5 – 115.5      1
Mode $= l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
Since the maximum frequency is 10, 95.5 – 100.5 is the modal class.
$l = 95.5,\ h = 5,\ f_m = 10,\ f_1 = 4,\ f_2 = 6$
Mode $= 95.5 + \frac{10 - 4}{(10 - 4) + (10 - 6)} \times 5 = 95.5 + 3.0 = 98.5$
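The grouped-mode formula of Example 11 can be sketched in Python as below. The function name is hypothetical, and treating a missing neighbouring class as frequency 0 is an assumption of this sketch (it does not arise in the example).

```python
def grouped_mode(boundaries, freqs):
    """Mode of grouped data: l + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])   # index of the modal class
    lower, upper = boundaries[m]
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0                    # frequency preceding the modal class
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0       # frequency following the modal class
    return lower + (fm - f1) / ((fm - f1) + (fm - f2)) * (upper - lower)

boundaries = [(85.5, 90.5), (90.5, 95.5), (95.5, 100.5),
              (100.5, 105.5), (105.5, 110.5), (110.5, 115.5)]
freqs = [6, 4, 10, 6, 3, 1]

mode_height = grouped_mode(boundaries, freqs)
print(mode_height)
```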
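Measures of Dispersion
Topic: Absolute Dispersion | Relative Dispersion
Range: $R = Y_m - Y_0$ | Coefficient of Range $= \frac{Y_m - Y_0}{Y_m + Y_0}$
Quartile Deviation or Semi Inter Quartile Range: $Q.D = \frac{Q_3 - Q_1}{2}$ | Coefficient of Q.D $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$
Mean Deviation from the mean: $M.D_{\bar{Y}} = \frac{\Sigma |Y - \bar{Y}|}{n}$ (ungrouped), $M.D_{\bar{Y}} = \frac{\Sigma f|Y - \bar{Y}|}{\Sigma f}$ (grouped)
Mean Deviation from the median: $M.D_{\tilde{Y}} = \frac{\Sigma |Y - \tilde{Y}|}{n}$ (ungrouped), $M.D_{\tilde{Y}} = \frac{\Sigma f|Y - \tilde{Y}|}{\Sigma f}$ (grouped)
Standard Deviation: $S.D = \sqrt{\frac{\Sigma (Y - \bar{Y})^2}{n}}$ (ungrouped), $S.D = \sqrt{\frac{\Sigma fY^2}{\Sigma f} - \left(\frac{\Sigma fY}{\Sigma f}\right)^2}$ (grouped) | Coefficient of S.D $= \frac{S.D}{\text{Mean}}$
Variance: $S^2 = (S.D)^2$ | Coefficient of Variation $= \frac{S.D}{\text{Mean}} \times 100$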
Example 3: Find the lower quartile, upper quartile and quartile deviation from the given data:
Groups   Frequency
70-74    2
75-79    5
80-84    12
85-89    18
90-94    7
Solution:
Groups   Frequency   C.B           C.f
70-74    2           69.5 – 74.5   2
75-79    5           74.5 – 79.5   7
80-84    12          79.5 – 84.5   19
85-89    18          84.5 – 89.5   37
90-94    7           89.5 – 94.5   44
Σ        44          -----         -----
Lower Quartile:
$\frac{n}{4}$th value = $\frac{44}{4}$th value = 11th value, so the class containing $Q_1$ is 79.5 – 84.5.
$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - C\right) = 79.5 + \frac{5}{12}(11 - 7) = 79.5 + \frac{20}{12} = 79.5 + 1.67 = 81.17$
Upper Quartile:
$\frac{3n}{4}$th value = 33rd value, so the class containing $Q_3$ is 84.5 – 89.5.
$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C\right) = 84.5 + \frac{5}{18}(33 - 19) = 84.5 + \frac{70}{18} = 84.5 + 3.89 = 88.39$
$Q.D = \frac{Q_3 - Q_1}{2} = \frac{88.39 - 81.17}{2} = \frac{7.22}{2} = 3.61$
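The quartile calculation above can be reproduced with one helper that locates the class containing a given position (n/4 or 3n/4) and interpolates inside it. This is a Python sketch; the function name `grouped_quantile` is hypothetical and the class boundaries are those of the solution table.

```python
def grouped_quantile(boundaries, freqs, position):
    """Value at a cumulative position: l + (h / f) * (position - C)."""
    cumulative = 0                 # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= position:
            return lower + (upper - lower) / f * (position - cumulative)
        cumulative += f
    raise ValueError("position exceeds total frequency")

boundaries = [(69.5, 74.5), (74.5, 79.5), (79.5, 84.5), (84.5, 89.5), (89.5, 94.5)]
freqs = [2, 5, 12, 18, 7]
n = sum(freqs)

q1 = grouped_quantile(boundaries, freqs, n / 4)        # lower quartile
q3 = grouped_quantile(boundaries, freqs, 3 * n / 4)    # upper quartile
quartile_deviation = (q3 - q1) / 2

print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```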
Example 4: The ungrouped data are given below:
2, 5, 6, 6, 8, 9, 12, 13, 16, 23
Calculate the average deviation from
i) Mean ii) Median.
Solution:
Y     |Y − Ȳ| = |Y − 10|    |Y − Ỹ| = |Y − 8.5|
2     8                     6.5
5     5                     3.5
6     4                     2.5
6     4                     2.5
8     2                     0.5
9     1                     0.5
12    2                     3.5
13    3                     4.5
16    6                     7.5
23    13                    14.5
ΣY = 100    Σ|Y − Ȳ| = 48    Σ|Y − Ỹ| = 46
i) Mean $= \bar{Y} = \frac{\Sigma Y}{n} = \frac{100}{10} = 10$
Average deviation (Mean) $= \frac{\Sigma |Y - \bar{Y}|}{n} = \frac{48}{10} = 4.8$
ii) $\tilde{Y} = \left(\frac{n+1}{2}\right)$th value $= \left(\frac{10+1}{2}\right)$th value = 5.5th value
= 5th value + 0.5(6th value − 5th value) = 8 + 0.5(9 − 8) = 8.5
Average deviation (Median) $= \frac{\Sigma |Y - \tilde{Y}|}{n} = \frac{46}{10} = 4.6$
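Both average deviations of Example 4 can be checked with Python's standard `statistics` module. A minimal sketch with illustrative variable names:

```python
from statistics import mean, median

data = [2, 5, 6, 6, 8, 9, 12, 13, 16, 23]
n = len(data)

centre_mean = mean(data)        # 10
centre_median = median(data)    # 8.5

# Average absolute deviation about each centre.
md_mean = sum(abs(y - centre_mean) for y in data) / n
md_median = sum(abs(y - centre_median) for y in data) / n

print(centre_mean, centre_median, md_mean, md_median)
```

The deviation about the median (4.6) is smaller than the deviation about the mean (4.8); the sum of absolute deviations is never larger about the median than about any other value.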
Example 5: Calculate median & mean deviation from the following data:
Solution:
f     C.B              C.f    |Y − Ỹ|    f|Y − Ỹ|
2     9.25 – 9.75      2      1.57       3.14
5     9.75 – 10.25     7      1.07       5.35
12    10.25 – 10.75    19     0.57       6.84
17    10.75 – 11.25    36     0.07       1.19
14    11.25 – 11.75    50     0.43       6.02
6     11.75 – 12.25    56     0.93       5.58
3     12.25 – 12.75    59     1.43       4.29
1     12.75 – 13.25    60     1.93       1.93
60    -----            -----  -----      34.34
Here Y is the mid point of each class.
Median $= l + \frac{h}{f}\left(\frac{n}{2} - C\right)$
$\frac{n}{2}$th value = $\frac{60}{2}$th value = 30th value
∴ Median class is 10.75 – 11.25
Median $= 10.75 + \frac{0.5}{17}(30 - 19) = 10.75 + \frac{0.5}{17}(11) = 10.75 + \frac{5.5}{17} = 10.75 + 0.32 = 11.07$
M.D from Median:
$M.D_{\tilde{Y}} = \frac{\Sigma f|Y - \tilde{Y}|}{\Sigma f} = \frac{34.34}{60} = 0.572$
Example 6: Calculate variance & standard deviation from the following data:
102, 104, 106, 108, 110
Solution:
Y      Y − Ȳ = (Y − 106)    (Y − Ȳ)²
102    −4                   16
104    −2                   4
106    0                    0
108    2                    4
110    4                    16
530    0                    40
$\bar{Y} = \frac{\Sigma Y}{n} = \frac{530}{5} = 106$
Variance $= S^2 = \frac{\Sigma (Y - \bar{Y})^2}{n} = \frac{40}{5} = 8$
$S.D = S = \sqrt{\frac{\Sigma (Y - \bar{Y})^2}{n}} = \sqrt{8} = 2.83$
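Example 6 divides by n, which is the population form of the variance. In Python's standard library this corresponds to `statistics.pvariance` and `statistics.pstdev` (the functions `variance` and `stdev` divide by n − 1 instead). A short sketch:

```python
from statistics import pstdev, pvariance

data = [102, 104, 106, 108, 110]

variance = pvariance(data)   # Σ(Y − Ȳ)² / n
std_dev = pstdev(data)       # square root of the variance

print(variance, round(std_dev, 2))
```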
Example 7: Determine the mean, S.D and C.V from the given data:
Ages     Frequency
20-24    1
25-29    4
30-34    8
35-39    11
40-44    15
45-49    9
50-54    2
Solution:
Ages     f     Y     fY      fY²
20-24    1     22    22      484
25-29    4     27    108     2916
30-34    8     32    256     8192
35-39    11    37    407     15059
40-44    15    42    630     26460
45-49    9     47    423     19881
50-54    2     52    104     5408
Σ        50    ----  1950    78400
Mean $= \bar{Y} = \frac{\Sigma fY}{\Sigma f} = \frac{1950}{50} = 39$
$S.D = \sqrt{\frac{\Sigma fY^2}{\Sigma f} - (\bar{Y})^2} = \sqrt{\frac{78400}{50} - (39)^2} = \sqrt{1568 - 1521} = \sqrt{47} = 6.856$
$C.V = \frac{S.D}{\text{Mean}} \times 100 = \frac{6.856}{39} \times 100 = 0.1758 \times 100 = 17.58\%$
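The three quantities of Example 7 can be computed together. This Python sketch keeps full precision for the standard deviation, so the coefficient of variation may differ in the second decimal place from a hand calculation that rounds the S.D first.

```python
import math

# Mid points Y and frequencies f from the table of Example 7.
mids = [22, 27, 32, 37, 42, 47, 52]
freqs = [1, 4, 8, 11, 15, 9, 2]

n = sum(freqs)                                               # Σf
mean = sum(f * y for f, y in zip(freqs, mids)) / n           # ΣfY / Σf
variance = sum(f * y * y for f, y in zip(freqs, mids)) / n - mean ** 2
std_dev = math.sqrt(variance)
cv = std_dev / mean * 100                                    # coefficient of variation, in percent

print(mean, variance, round(std_dev, 3), round(cv, 2))
```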
[Figures: four scatter diagrams of Y against X, illustrating a Direct Linear Relationship, an Inverse Linear Relationship, a Curvilinear Relationship and No Relationship.]
Regression: The dependence of one variable (the dependent variable) on one or more other
variables (the independent variables) is called regression. When we study the dependence of a
variable on a single independent variable, it is called simple regression or two-variable
regression. When the dependence of a variable on two or more variables is studied,
it is called multiple regression.
Regressand: In regression, the dependent variable is called the regressand. It is also
called the response variable, the predictand or the explained variable.
Regressor: In regression, the independent variable is called the regressor. It is
also called the predictor variable, the controlled variable or the explanatory variable.
Least Squares Principle: The principle of least squares states that the sum of squares
of the residuals of observed values from their corresponding estimated values should be
least.
Properties of the Least Squares Line: Following are the important properties of
the least squares regression line:
(i) The sum of the residuals between the observed values and the corresponding estimated
values is always zero, i.e.,
$\Sigma e = \Sigma (y - \hat{y}) = 0$
(ii) The sum of squares of the residuals, $\Sigma e^2$, is minimum.
(iii) The least squares regression line always passes through the point $(\bar{x}, \bar{y})$.
(iv) It is the best line because a and b are the unbiased estimates of the parameters $\alpha$ and
$\beta$.
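Properties (i) to (iii) can be checked numerically. The sketch below is in Python and uses a small made-up data set (the x and y values are hypothetical, chosen only for illustration); the slope is computed from the usual sums-of-products formula.

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
a = sum_y / n - b * sum_x / n                                  # intercept: a = ȳ − b·x̄

def sse(a_, b_):
    """Sum of squared residuals for the line y = a_ + b_ * x."""
    return sum((yi - (a_ + b_ * xi)) ** 2 for xi, yi in zip(x, y))

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(a, b)
print(sum(residuals))                         # property (i): zero up to rounding error
print(sse(a, b) <= sse(a + 0.1, b))           # property (ii): nudging the line increases Σe²
print(abs(a + b * sum_x / n - sum_y / n))     # property (iii): line passes through (x̄, ȳ)
```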
Correlation: The degree or strength of relationship (interdependence) between the
variables is called correlation.
Examples of correlation; heights and weights of children, ages of husbands and ages of
wives at the time of their marriages, marks of students in mathematics and in statistics etc.
Product Moment Coefficient of Correlation: A numerical measure of the strength of
the linear relationship between any two variables is called Pearson’s product moment
correlation coefficient or the coefficient of simple correlation.
The sample linear correlation coefficient for n pairs of observations is defined by
$r = \frac{\Sigma (x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma (x - \bar{x})^2 \, \Sigma (y - \bar{y})^2}}$
(i) Positive Correlation: If both the variables are moving in same direction (increase or
decrease), then it is said to be positive or direct correlation. For example, ages and heights
of children.
(ii) Negative Correlation: If both the variables are moving in opposite direction it is
called negative or inverse correlation. For example, increase in the supply of a commodity
decreases its price.
(iii) No Correlation: If a change in one variable does not affect the other variable,
then there will be no correlation. For example, the head sizes and I.Q’s of persons.
Properties of Coefficient of Correlation: The important properties of the coefficient
of correlation are given as follows:
(i) The coefficient of correlation is symmetrical with respect to x and y, i.e.,
$r_{xy} = r_{yx}$
(ii) The correlation coefficient is a pure number, i.e., it does not depend upon the unit of
measurement.
(iii) The correlation coefficient always lies between –1 and +1.
(iv) The correlation coefficient is the geometric mean of the two regression
coefficients, i.e.,
$r = \pm\sqrt{b_{yx} \times b_{xy}}$
r = +ve, if both $b_{yx}$ and $b_{xy}$ are +ve.
r = –ve, if both $b_{yx}$ and $b_{xy}$ are –ve.
(v) The correlation coefficient is independent of origin and scale, i.e.,
$r_{xy} = r_{uv}$
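Regression coefficient of y on x:
$b_{yx} = \frac{\Sigma (x - \bar{x})(y - \bar{y})}{\Sigma (x - \bar{x})^2} = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{n\Sigma x^2 - (\Sigma x)^2} = \frac{\Sigma xy - n\bar{x}\bar{y}}{\Sigma x^2 - n\bar{x}^2}$
$a = \bar{y} - b\bar{x}$ or $a = \bar{y} - b_{yx}\bar{x}$
Regression coefficient of x on y:
$b_{xy} = \frac{\Sigma (x - \bar{x})(y - \bar{y})}{\Sigma (y - \bar{y})^2} = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{n\Sigma y^2 - (\Sigma y)^2} = \frac{\Sigma xy - n\bar{x}\bar{y}}{\Sigma y^2 - n\bar{y}^2}$
$c = \bar{x} - d\bar{y}$ or $c = \bar{x} - b_{xy}\bar{y}$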
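Coefficient of Correlation
$r = \frac{\Sigma (x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma (x - \bar{x})^2 \, \Sigma (y - \bar{y})^2}} = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}} = \frac{\Sigma xy - n\bar{x}\bar{y}}{\sqrt{[\Sigma x^2 - n\bar{x}^2][\Sigma y^2 - n\bar{y}^2]}}$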
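The link between the correlation coefficient and the two regression coefficients can be verified numerically. This Python sketch again uses a small hypothetical data set; none of the numbers come from the notes.

```python
import math

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))   # Σ(x − x̄)(y − ȳ)
s_xx = sum((xi - mean_x) ** 2 for xi in x)                          # Σ(x − x̄)²
s_yy = sum((yi - mean_y) ** 2 for yi in y)                          # Σ(y − ȳ)²

b_yx = s_xy / s_xx                      # regression coefficient of y on x
b_xy = s_xy / s_yy                      # regression coefficient of x on y
r = s_xy / math.sqrt(s_xx * s_yy)       # product moment correlation coefficient

print(b_yx, b_xy, round(r, 4))
print(math.isclose(abs(r), math.sqrt(b_yx * b_xy)))   # |r| is the geometric mean of the two
```

Since all three quantities share the numerator Σ(x − x̄)(y − ȳ), r always has the same sign as the two regression coefficients.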
Example 1: The following table shows the ages x and systolic blood pressures y of 12
women.
Age (years) x_i        56    42    72    36    63    47    55    49    38    42    68    60
Blood pressure y_i     147   125   160   118   149   128   150   145   115   140   152   155
Fit a regression line of blood pressure on age. Estimate the expected blood pressure of
a woman whose age is 45 years. What is the change in blood pressure for a unit
change in age?
Solution:
x y xy x2
Estimate the linear regression equation taking (i) X as independent variable (ii) Y as
independent variable.
Solution:
n = 2, x = 2 , y = 8, Σx 2 = 180 , Σy 2 = 1424 , Σxy = 404
Time Series: A time series consists of numerical data collected, observed or recorded at
successive time periods.
Examples of time series are; the hourly temperature recorded by weather bureau, the
total monthly sales of pens in a book shop, the annual rainfall at Murree etc.
Analysis of Time Series: Analysis of time series is decomposition of a time series into
its different components for separate study. The basic purpose of analysis of time series is to
use it for forecasting.
Signal: The systematic component of variation in time series is called signal.
Noise: An irregular or random component of variation in time series is called noise.
Historigram: The graph of a time series is called a historigram. It is constructed by taking
time along the x-axis and the values of the time series along the y-axis. Using an appropriate
scale, the points are plotted and then joined by line segments to get the required historigram.
Components of Time Series: Following are the main components of time series:
(i) Secular trend (T)
(ii) Seasonal variations (S)
(iii) Cyclical movements (C)
(iv) Irregular movements (I)
(i) Secular Trend: A secular trend is a long term movement that indicates the general
direction of the variation in a time series. It represents smooth, steady and gradual
movement in a time series in the same direction.
Examples of secular trend are; a decline in death rate due to advances in science, a
continually increasing demand for smaller automobiles etc.
(ii) Seasonal Variations: Seasonal variations are short term movements that recur in a
similar pattern during the corresponding seasons of successive years. The main
causes of these variations are seasons, religious festivals and social customs. Examples of
seasonal variations are: the increased sales of cotton clothes in summer, an after-Eid sale in a
departmental store, an increase in employment during summer etc.
(iii) Cyclical Movements: Cyclical movements are the long term oscillations or
swings about the trend line or curve. Since the movements take the form of upward and
downward swings, they are also called “cycles”. The four phases of a business cycle
(prosperity, recession, depression and revival) provide an important example of cyclical
movements.
(iv) Irregular Movements: Irregular movements are unsystematic in nature. They
occur in a completely unpredictable manner, caused by chance events such as wars, floods,
earthquakes, strikes, fires etc. These variations are also called accidental, residual or
random variations. Examples of irregular movements are: a fire in a factory delaying
production for 3 weeks, a rise in prices due to floods etc.
Methods of measuring secular trend in a time series:
(i) Free hand curve method
(ii) Method of semi averages
(iii) Method of moving averages
(iv) Method of least squares
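* The equation of semi averages is
$\hat{y} = a + bx$
where $b = \frac{y_2 - y_1}{x_2 - x_1}$ and $a = y_1 - bx_1$ or $a = y_2 - bx_2$
* The equation of linear trend is $\hat{y} = a + bx$
Normal Equations are:
$\Sigma y = na + b\Sigma x$
$\Sigma xy = a\Sigma x + b\Sigma x^2$
If $\Sigma x = 0$ ⇒ $a = \frac{\Sigma y}{n}$ and $b = \frac{\Sigma xy}{\Sigma x^2}$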
Examples
Example 1. Make a historigram from the following data:
Year                1962   1963   1964   1965   1966   1967
Production (tons)   20     28     50     15     18     27
Solution:
[Figure: historigram with Production on the y-axis (0 to 60) and Year on the x-axis (1961 to 1968).]
Example 2. The following table shows the property damaged by road accidents in Punjab
for the years 1973–79:
Year               1973   1974   1975   1976   1977   1978   1979
Property damaged   201    238    392    507    484    649    742
Find trend values by free hand curve method.
Solution:
[Figure: graph of Property damaged on the y-axis (0 to 800) against Year on the x-axis (1972 to 1980).]
0 326 1 326
196 = 336 = y1 x1 = =
337 1680 2 336
1 2
340 3 346
196
196
4
196 365 5 366
5 372 6 376
196
196
9
The estimated equation of semi averages is
$\hat{y} = a + bx$
$b = \frac{y_2 - y_1}{x_2 - x_1} = \frac{386 - 336}{7 - 2} = \frac{50}{5} = 10$
$a = y_1 - bx_1 = 336 - 10(2) = 336 - 20 = 316$
Hence $\hat{y} = 316 + 10x$
Example 4. Use the method of semi averages to find trend values for the following data
showing net profit (in lacs of rupees) of SNGPL for the years 1964–72.
Year     1964   1965   1966   1967   1968   1969   1970   1971   1972
Profit   33     86     116    95     101    128    146    110    32
Find the estimated profit in 1964.
Solution:
Year    y      Semi total   Semi average   x    ŷ = 76.05 + 4.3x
1964    33                                 0    76.05
1965    86     330          82.5           1    80.35
1966    116                                2    84.65
1967    95                                 3    88.95
1968    101                                4    93.25
1969    128                                5    97.55
1970    146    416          104.0          6    101.85
1971    110                                7    106.15
1972    32                                 8    110.45
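The trend line ŷ = 76.05 + 4.3x of Example 4 can be reproduced as follows. This Python sketch assumes the usual convention for an odd number of years, namely that the middle year (1968 here) is left out of both halves; that convention is consistent with the equation quoted in the solution.

```python
# Net profit of Example 4, with x = 0 for 1964, x = 1 for 1965, and so on.
years = list(range(1964, 1973))
profit = [33, 86, 116, 95, 101, 128, 146, 110, 32]
x = list(range(len(years)))

half = len(profit) // 2           # 4 values in each half; the middle year is omitted when n is odd
y1 = sum(profit[:half]) / half    # first semi average
y2 = sum(profit[-half:]) / half   # second semi average
x1 = sum(x[:half]) / half         # mid point of the first half
x2 = sum(x[-half:]) / half        # mid point of the second half

b = (y2 - y1) / (x2 - x1)
a = y1 - b * x1
trend = [a + b * xi for xi in x]

for year, value in zip(years, trend):
    print(year, round(value, 2))
```

The trend value for x = 0 is the estimated profit for 1964.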
Solution:
Year    Production   4-year moving total   4-year moving average   4-year moving average (centred)
1948    50
1949    36.5
                     174.0                 43.50
1950    43.0                                                       42.12
                     162.9                 40.73
1951    44.5                                                       40.93
                     164.5                 41.13
1952    38.9                                                       39.83
                     154.1                 38.53
1953    38.1                                                       37.81
                     148.3                 37.08
1954    32.6                                                       37.43
                     151.1                 37.78
1955    38.7                                                       38.16
                     154.1                 38.53
1956    41.7
1957    41.1
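The centred 4-year moving averages in the table can be recomputed with a short Python sketch. The code centres the unrounded averages, so a centred value can differ by 0.01 from a hand calculation that first rounds each 4-year average to two decimals.

```python
# Production figures for 1948 to 1957, as listed in the table above.
years = list(range(1948, 1958))
production = [50, 36.5, 43.0, 44.5, 38.9, 38.1, 32.6, 38.7, 41.7, 41.1]
k = 4   # length of the moving average

totals = [sum(production[i:i + k]) for i in range(len(production) - k + 1)]
averages = [t / k for t in totals]
# An even-length average falls between two years, so adjacent averages are
# averaged again to centre them on a year.
centred = [(averages[i] + averages[i + 1]) / 2 for i in range(len(averages) - 1)]
centred_years = years[k // 2: k // 2 + len(centred)]

for year, value in zip(centred_years, centred):
    print(year, round(value, 2))
```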
Example 7. The following data show the production of steel in a mill for the years
1956–1964.
Year         1956   1957   1958   1959   1960   1961   1962   1963   1964
Production   60     65     80     73     97     105    93     111    117
(i) Fit the linear trend by the method of least squares by taking the origin at the middle.
Also calculate the trend values.
(ii) Predict the production of steel for the year 1965.
Solution:
Year    y      x     xy      x²    Trend value ŷ = 89 + 7.1x
1956    60     –4    –240    16    60.6
1957    65     –3    –195    9     67.7
1958    80     –2    –160    4     74.8
1959    73     –1    –73     1     81.9
1960    97     0     0       0     89.0
1961    105    1     105     1     96.1
1962    93     2     186     4     103.2
1963    111    3     333     9     110.3
1964    117    4     468     16    117.4
Σ       801    0     424     60
The least squares trend line is
$\hat{y} = a + bx$
$a = \frac{\Sigma y}{n} = \frac{801}{9} = 89$, $b = \frac{\Sigma xy}{\Sigma x^2} = \frac{424}{60} = 7.1$
Hence $\hat{y} = 89 + 7.1x$
(ii) The predicted production of steel for the year 1965 is
For x = 5: $\hat{y} = 89 + 7.1(5) = 124.5$
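Because the origin is taken at the middle year, Σx = 0 and the normal equations reduce to a = Σy/n and b = Σxy/Σx². The Python sketch below applies this to the data of Example 7; it keeps b unrounded, so its forecast differs slightly from the value obtained with b rounded to 7.1.

```python
# Steel production of Example 7 (y values from the solution table).
years = list(range(1956, 1965))
y = [60, 65, 80, 73, 97, 105, 93, 111, 117]

origin = years[len(years) // 2]             # middle year, 1960
x = [year - origin for year in years]       # coded time, so that Σx = 0

a = sum(y) / len(y)                                                     # Σy / n
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)     # Σxy / Σx²

trend = [a + b * xi for xi in x]
forecast_1965 = a + b * (1965 - origin)

print(a, round(b, 1))
print(round(forecast_1965, 2))
```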
Example 8. Fit a linear trend to the following data (take origin at the middle and half year
unit).
Year    1991   1992   1993   1994   1995   1996
Value   5      8      12     15     20     24
Also show that the sum of residuals is equal to zero.
Solution:
Year    y    x    xy    x²    ŷ = 14 + 1.91x    e = y – ŷ
INDEX NUMBERS
Q.1. What is index number?
Ans. An index number is a device which measures the changes in a variable or group of
related variables with respect to time or space.
Q.2. What is simple index number?
Ans. An index number is called simple if it measures a relative change in a single variable
with respect to base.
Q.3. Give some examples of simple index number.
Ans. Index number for wages of employees, index number of cotton prices in Sahiwal etc.
Q.4. What is composite index number?
Ans. An index number is called composite index number if it measures a relative change in
a group of related variables with respect to base.
Q.5. What are the types of index number with regard to the base?
Ans. (i) Fixed base index
(ii) Chain base index.
Q.6. Define price relative.
Ans. Price relative is the percentage ratio of the price in current year and the price in a
base year.
Q.7. Define link relative.
Ans. Link relative is the percentage ratio of the price in current year and the price in the
preceding year.
Q.8. What is price index number?
Ans. A price index number measures the changes in the wholesale or retail prices of a
particular commodity or a number of commodities with respect to base.
Q.9. What is quantity index number?
Ans. A quantity index number measures the changes in the quantity or volume of goods
produced or consumed.
Q.10. Define C.P.I.
Ans. A consumer price index number measures the changes in prices of a specified basket
of goods and services consumed in the given period relative to the base period.
Q.12. What do you mean by “basket” of goods?
Ans. The basket of goods and services will contain items like
(i) Food (ii) House rent (iii) Education (iv) Clothing (v) Misc.
Q.13. Write down the formula of C.P.I.
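Ans. (i) $P_{0n} = \frac{\Sigma p_n q_0}{\Sigma p_0 q_0} \times 100$ (Aggregate Expenditure Method)
(ii) $P_{0n} = \frac{\Sigma WI}{\Sigma W}$, where $I = \frac{p_n}{p_0} \times 100$ and $W = p_0 q_0$ (Weighted Average of Relatives)
Q.14. Write the formula of price relative.
Ans. $I = \frac{p_n}{p_0} \times 100$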
Q.15. What are the other names of cost of living index numbers?
Ans. Consumer price index number or retail price index number.
Q.16. What is whole – sale price index?
Ans. An index number based on the price quotations of wholesale markets is called the
wholesale price index.
Q.17. What is un-weighted index number?
Ans. An index number that measures the change in the price (or quantity) of a group of
commodities when the relative importance of commodities is not taken into account
is called un-weighted index number.
Q.18. What is weighted index number?
Ans. An index number that measures the change in the prices (or quantities) of a group of
commodities when the relative importance of commodities has been taken into
account is called weighted index number.
Q.19. Name the ideal index number?
Ans. Fisher’s index number is called ideal index number.
Q.20. What is base year weighted index number?
Ans. Laspeyre’s index number is called base-year weighted index number.
Q.21. What is the other name of Paasche’s index?
Ans. Paasche’s index number is also called current year weighted index number.
Q.22. Give two uses and two limitations of index number.
Ans. Uses of index Numbers:
(i) Index numbers are very helpful in forecasting business conditions.
(ii) Index numbers are useful in education for I.Q. comparison and effectiveness of
teaching systems.
Limitations of Index Numbers:
(i) All index numbers are not suitable for all purposes.
(ii) Different methods of construction yield different results.
Year    Price (Rs)   L.R = (p_n / p_{n−1}) × 100   Chain Indices
1962    0.80
1963    1.00         125                           125
1964    1.20         120                           150
1965    1.25         104.17                        156.26
1966    1.25         100                           156.26
1967    1.42         113.60                        177.51
1968    1.50         105.63                        187.50
1969    1.62         108                           202.5
1970    1.75         108.02                        218.74
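The link relatives and chain indices in the table can be recomputed as follows. This Python sketch takes the first year (1962) as 100 and chains unrounded link relatives, so the last decimals may differ slightly from a table built from link relatives rounded to two places.

```python
# Prices (Rs) by year, as in the table above.
prices = {1962: 0.80, 1963: 1.00, 1964: 1.20, 1965: 1.25, 1966: 1.25,
          1967: 1.42, 1968: 1.50, 1969: 1.62, 1970: 1.75}

years = sorted(prices)
link = {}            # link relative: current price / preceding price * 100
chain = {}           # chain index: previous chain index * link relative / 100
previous_chain = 100.0
for prev_year, year in zip(years, years[1:]):
    link[year] = prices[year] / prices[prev_year] * 100
    previous_chain = previous_chain * link[year] / 100
    chain[year] = previous_chain

for year in years[1:]:
    print(year, round(link[year], 2), round(chain[year], 2))
```

Chaining all the link relatives back to 1962 gives the same result as the fixed base ratio 1.75 / 0.80 × 100 for 1970.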
Year    A: Price   A: Quantity   B: Price   B: Quantity   C: Price   C: Quantity
1987    5          10            8          26            6          13
1992    4          12            7          27            5          14
Solution:
Item    1987: po   1987: qo   1992: p1   1992: q1   p1qo   poqo   p1q1   poq1
A       5          10         4          12         40     50     48     60
B       8          26         7          27         182    208    189    216
C       6          13         5          14         65     78     70     84
Σ                                                   287    336    307    360
(i) Laspeyre’s Index:
$P_{01} = \frac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100 = \frac{287}{336} \times 100 = 85.42$
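The column totals and the Laspeyre's index above can be checked with a short Python sketch. The Paasche and Fisher values are included only because those indices are named earlier in these notes; they are computed from the standard formulas and are not part of the visible solution.

```python
import math

# (price, quantity) for items A, B, C in the base year 1987 and the current year 1992.
base = {"A": (5, 10), "B": (8, 26), "C": (6, 13)}       # (po, qo)
current = {"A": (4, 12), "B": (7, 27), "C": (5, 14)}    # (p1, q1)

sum_p1q0 = sum(current[i][0] * base[i][1] for i in base)
sum_p0q0 = sum(base[i][0] * base[i][1] for i in base)
sum_p1q1 = sum(current[i][0] * current[i][1] for i in base)
sum_p0q1 = sum(base[i][0] * current[i][1] for i in base)

laspeyres = sum_p1q0 / sum_p0q0 * 100      # base year quantities as weights
paasche = sum_p1q1 / sum_p0q1 * 100        # current year quantities as weights
fisher = math.sqrt(laspeyres * paasche)    # geometric mean of the two

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```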