
Presentation On

Analytical Characteristics of Bangladesh


By Division

Submitted to
M. Amir Hossain, Ph.D.
Professor, Applied Statistics
D.U.
East West University, Bangladesh.

Submitted By
Group C
H.M. Faisal Ahmed 2010-2-91-021
INTRODUCTION
Data Presentation
We collected demographic data from the BBS
(Bangladesh Bureau of Statistics) website,
www.bbs.gov.bd/Home.aspx. We decided to collect two
types of data (qualitative and quantitative). For the qualitative
data we considered the land area and the number of males
and females in each division, and for the quantitative
data we considered age.
We then applied different data presentation
techniques to these data.
TECHNIQUES USED
Bar Chart
Histogram
Frequency Polygon
Cumulative Frequency Curve
FACT TABLE
The bar chart and histogram are based on the
following fact table (based on enumerated population in 2011):

DIVISION      AREA (sq km)   MALE          FEMALE
BARISAL       13,645          4,006,000     4,140,000
CHITTAGONG    33,771         13,763,000    14,361,000
DHAKA         30,989         23,814,000    22,915,000
RAJSHAHI      34,495          9,183,000     9,146,000
KHULNA        22,285          7,782,000     7,781,000
SYLHET        12,596          4,882,000     4,925,000
BAR CHART
A bar chart or bar graph is a way of showing information by
the lengths of a set of bars. The bars are drawn horizontally
or vertically. If the bars are drawn vertically, the graph
can be called a column graph or a block graph. A bar chart
displays a set of frequencies using bars of equal
width whose heights are proportional to the frequencies.
In our presentation the height of each bar represents the
number of individuals: the X axis shows the
divisions and the Y axis the number of individuals.
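As a rough illustration of "heights proportional to the values", the sketch below draws a text bar chart of the land areas from the fact table. The function names `bar_lengths` and `render_bars` and the scale of one symbol per thousand square kilometres are assumptions made for this example, not part of the original presentation.

```python
# Land area per division in square kilometres, taken from the fact table.
land_area = {
    "Barisal": 13645,
    "Chittagong": 33771,
    "Dhaka": 30989,
    "Rajshahi": 34495,
    "Khulna": 22285,
    "Sylhet": 12596,
}


def bar_lengths(values, scale=1000):
    """Return one bar length per category, proportional to its value."""
    return {name: round(value / scale) for name, value in values.items()}


def render_bars(values, scale=1000):
    """Build a horizontal text bar chart, one line per category."""
    lengths = bar_lengths(values, scale)
    width = max(len(name) for name in values)
    return [f"{name:<{width}} | {'#' * n} {n}" for name, n in lengths.items()]


lengths = bar_lengths(land_area)
chart_lines = render_bars(land_area)
print("\n".join(chart_lines))
```

Every bar uses the same symbol width, so only the length carries information, which is the defining property of a bar chart.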
BAR CHART (CONTINUED)
Chart 01: Bar Chart of Male and Female per Division
BAR CHART (CONTINUED)
Chart 2: Bar Chart of Land Area per Division
(X axis: Barisal, Chittagong, Dhaka, Rajshahi, Khulna, Sylhet; Y axis: Land Area (Square Kilometer), in thousands, 0 to 40; bar labels: 14, 34, 31, 34, 22, 13)
HISTOGRAM
A histogram is a graphical representation, similar to a bar chart
in structure, that organizes a group of data points into user-specified
ranges. It condenses a data series into an
easily interpreted visual by taking many data points and
grouping them into logical ranges or bins. In statistics, a
histogram is a graphical display of tabulated frequencies,
shown as bars. It shows what proportion of cases fall into
each of several categories; it is a form of data binning. The
categories are usually specified as non-overlapping intervals
of some variable. The categories (bars) must be adjacent,
and the intervals are generally of the same size.
Histograms are used to plot the density of data, and often for
density estimation: estimating the probability density
function of the underlying variable.
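The binning step described above can be sketched as follows. The helper `histogram_counts` and the sample ages are hypothetical and invented only for illustration (they are not BBS data); the intervals are adjacent, non-overlapping and of equal width.

```python
def histogram_counts(values, start, width, n_bins):
    """Count the values falling in each of n_bins adjacent, equal-width
    intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts


# Hypothetical ages, invented for illustration only.
sample_ages = [1, 3, 4, 6, 7, 9, 12, 13, 18, 21, 22, 24]
counts = histogram_counts(sample_ages, start=0, width=5, n_bins=5)

for i, c in enumerate(counts):
    print(f"{i * 5:02d}-{i * 5 + 4:02d} | {'#' * c}")
```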
HISTOGRAM (CONTINUED)
Chart 04: Histogram of Male & Female per Division
HISTOGRAM (CONTINUED)
Chart 05: Histogram of Land Area per Division
(X axis: Chittagong, Dhaka, Khulna, Rajshahi and Sylhet divisions; Y axis: 0 to 40000)
FREQUENCY POLYGON
A frequency polygon is a graphical display of a frequency
table. The intervals are shown on the X axis, and the number
of scores in each interval is represented by the height of a
point located above the middle of the interval (the class mark).
The points are connected so that, together with the X axis,
they form a polygon.
In our presentation the class marks (class midpoints) are
plotted on the X axis and the number of individuals in each
class is plotted on the Y axis.
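A class mark is the midpoint of a class, (lower limit + upper limit) / 2. The short sketch below derives the points of a frequency polygon for the first three age classes of the frequency distribution table; the helper name `class_mark` is an assumption of this example.

```python
def class_mark(label):
    """Midpoint of a class written as 'lower-upper', e.g. '05-09' -> 7.0."""
    lower, upper = (int(part) for part in label.split("-"))
    return (lower + upper) / 2


# First three classes and frequencies from the frequency distribution table.
classes = ["00-04", "05-09", "10-14"]
frequencies = [14465810, 16534124, 15704322]

# Each (class mark, frequency) pair is one vertex of the frequency polygon.
points = [(class_mark(c), f) for c, f in zip(classes, frequencies)]
print(points)
```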
FREQUENCY POLYGON (CONTINUED)
Frequency Distribution Table (with Class Mark)
Class    Class Mark   Frequency
00-04    2            14465810
05-09    7            16534124
10-14    12           15704322
15-19    17           12186950
20-24    22           10688351
25-29    27           9858549
30-34    32           9363144
35-39    37           8198944
40-44    42           7133824
45-49    47           5152206
50-54    52           4322404
55-59    57           2774265
60-64    62           2662799
65-69    67           1758685
70-74    72           1461443
FREQUENCY POLYGON (CONTINUED)
Chart 06: Frequency Polygon of People's Age Information of Bangladesh
(X axis: Age, 0 to 80; Y axis: Population Number in millions, 0 to 18)
CUMULATIVE FREQUENCY CURVE
Also known as an ogive, this is a curve drawn
by plotting the frequency of the first class, then
the sum of the first and second frequencies, then
the sum of the first, second, and third frequencies,
and so on. Each plotted value is a cumulative
frequency: the total of a frequency and all
frequencies below it in a frequency distribution.
In our presentation the cumulative frequency of the age
groups is plotted on the Y axis against the
class marks on the X axis.
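The running totals behind an ogive can be sketched with `itertools.accumulate` from the Python standard library. The frequencies are the integer class frequencies listed in this presentation; totals computed from them may differ by a few units from the presentation's cumulative column, presumably because the source figures were rounded.

```python
from itertools import accumulate

# Class frequencies for the age groups 00-04 through 70-74.
frequencies = [
    14465810, 16534124, 15704322, 12186950, 10688351,
    9858549, 9363144, 8198944, 7133824, 5152206,
    4322404, 2774265, 2662799, 1758685, 1461443,
]

# Each entry is the sum of its own class frequency and all earlier ones.
cumulative = list(accumulate(frequencies))

for class_mark, total in zip(range(2, 77, 5), cumulative):
    print(class_mark, total)
```

The last cumulative value equals the total of all class frequencies, which is a quick consistency check for any ogive.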
CUMULATIVE FREQUENCY CURVE (CONT.)
Class    Class Mark   Frequency   Cumulative Frequency
00-04    2            14465810    14465810
05-09    7            16534124    30999935
10-14    12           15704322    46704257
15-19    17           12186950    58891207
20-24    22           10688351    69579559
25-29    27           9858549     79438108
30-34    32           9363144     88801253
35-39    37           8198944     97000197
40-44    42           7133824     104134021
45-49    47           5152206     109286228
50-54    52           4322404     113608632
55-59    57           2774265     116382897
60-64    62           2662799     119045696
65-69    67           1758685     120804382
70-74    72           1461443     122265825
CUMULATIVE FREQUENCY CURVE (CONT.)
Chart 07: Cumulative Frequency Curve of Age Information of Bangladesh
(X axis: Age, 0 to 80; Y axis: Population Numbers in millions, 0 to 140)
COMMENT
The assignment was done within a short time,
so there might be some errors in our
analysis, but the data should still
give a picture of the actual situation.
MEASURES OF DISPERSION
The descriptive statistics that measure the
amount of scatter in a data set are called
measures of dispersion. Measures of dispersion
give a more complete picture of the data set
because they describe the spread of the data.
A small value of a measure of dispersion indicates
that the data are clustered closely around the center.
A large value indicates that the data are widely
scattered, so the measure of central
tendency is less reliable as a summary.
TYPES OF MEASURES OF
DISPERSION
There are many types of measures of dispersion. Here we discuss the
following:

Absolute Measures of Dispersion:

These measures give us an idea about the amount of dispersion in a set of
observations. They give the answers in the same units as the units of the
original observations. When the observations are in kilograms, the absolute
measure is also in kilograms. If we have two sets of observations, we cannot
always use the absolute measures to compare their dispersion. We shall
explain later as to when the absolute measures can be used for comparison
of dispersion in two or more than two sets of data. The absolute measures
which are commonly used are:

1. Range
2. Mean Deviation
3. Variance
4. Standard Deviation

Relative Measures of Dispersion:

These measures are calculated for the comparison of dispersion in
two or more sets of observations. These measures are free
of the units in which the original data are measured. If the original
data are in dollars or kilometers, we do not use these units with a relative
measure of dispersion. These measures are a sort of ratio and are
called coefficients. Each absolute measure of dispersion can be
converted into its relative measure. Here we only discuss:

1. Coefficient of Variation

RANGE
For ungrouped data: The simplest measure of dispersion is the
range. The range is calculated by simply taking the
difference between the maximum and minimum values in
the data set.
Range = Highest Value - Lowest Value

For grouped data: If the data are grouped, then the range is
calculated by taking the difference between the upper limit
of the highest class and the lower limit of the lowest class.
Range = upper limit of the highest class - lower
limit of the lowest class
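Both versions of the range can be sketched in a few lines. The function names and the ungrouped sample values are assumptions of this example; the grouped case uses the age classes 00-04 through 70-74 from this presentation.

```python
def range_ungrouped(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


def range_grouped(class_limits):
    """Upper limit of the highest class minus lower limit of the lowest class.

    class_limits is a list of (lower, upper) pairs.
    """
    lowest_lower = min(lower for lower, _ in class_limits)
    highest_upper = max(upper for _, upper in class_limits)
    return highest_upper - lowest_lower


# Hypothetical ungrouped observations, for illustration only.
sample_values = [12, 7, 3, 18, 9]
print(range_ungrouped(sample_values))

# Age classes 00-04, 05-09, ..., 70-74.
age_classes = [(lower, lower + 4) for lower in range(0, 75, 5)]
print(range_grouped(age_classes))
```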

MEAN DEVIATION
The mean deviation is the first measure
of dispersion that we will use that actually
uses each data value in its computation. It
is the mean of the distances between each
value and the mean. It gives us an idea of
how spread out from the center the set of
values is.
For ungrouped data:
$$ MD = \frac{\sum |X - \bar{X}|}{n} $$
For grouped data:
$$ MD = \frac{\sum f\,|X - \bar{X}|}{\sum f} $$
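A minimal sketch of the mean deviation for ungrouped and grouped data follows. The function names and the small data sets are hypothetical, chosen so the results are easy to check by hand.

```python
def mean_deviation(values):
    """Mean of the absolute distances between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mean_deviation_grouped(midpoints, freqs):
    """Frequency-weighted mean deviation, using class midpoints as X."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    return sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / total


# Hypothetical data: mean is 5, absolute deviations are 3, 1, 1, 3.
md_ungrouped = mean_deviation([2, 4, 6, 8])

# Hypothetical grouped data: mean is 7, weighted deviations are 5, 0, 5.
md_grouped = mean_deviation_grouped([2, 7, 12], [1, 2, 1])

print(md_ungrouped, md_grouped)
```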
VARIANCE
Variance is the average of the squared
deviations from the mean. In other words,
it is the arithmetic mean of the squares of
the deviations of all values in a set of
numbers from their arithmetic mean.
Population Variance:
$$ \sigma^2 = \frac{\sum (X - \mu)^2}{N} $$
Sample Variance:
$$ S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} $$
VARIANCE
Working formula for population variance is:
$$ \sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2 $$
Working formula for sample variance is:
$$ S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} $$
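The definitional and working formulas give the same result, which the following sketch checks on a small hypothetical data set. All function names here are assumptions of this example.

```python
import math


def population_variance(values):
    """Definition: mean squared deviation from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_working(values):
    """Working formula: sum(X^2)/N - (sum(X)/N)^2."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


def sample_variance(values):
    """Definition: squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_working(values):
    """Working formula: (sum(X^2) - (sum(X))^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


data = [2, 4, 6, 8]  # hypothetical values, for illustration only
print(population_variance(data), population_variance_working(data))
print(sample_variance(data), sample_variance_working(data))
```

The working formulas avoid computing each deviation separately, which is convenient for hand calculation.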
RELATIVE DISPERSION
The usual (absolute) measures of dispersion cannot be
used to compare dispersion if the units
are different, or even if the units are the same but the
means are different.
A relative measure reports variation relative to the mean.
It is useful for comparing distributions with
different units.
Here we only discuss:
1. Coefficient of Variation


COEFFICIENT OF VARIATION
The CV is the ratio of the standard
deviation to the arithmetic mean,
expressed as a percentage. To compare the
variation (dispersion) of two different series,
a relative measure of the standard deviation
must be calculated. This is known as the
coefficient of variation.
The formula for the CV is given below:

$$ CV = \frac{s}{\bar{X}} \times 100 $$
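Because the CV is unit-free, it can compare series measured in different units. The sketch below uses `statistics.mean` and `statistics.stdev` (sample standard deviation) from the Python standard library; the two series are hypothetical and the helper name `coefficient_of_variation` is an assumption of this example.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical series in different units, for illustration only.
heights_cm = [150, 160, 170]
weights_kg = [50, 60, 70]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(round(cv_heights, 2), round(cv_weights, 2))
```

Both series have the same standard deviation (10 in their own units), but the weights vary more relative to their mean, so their CV is larger.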
EXAMPLE
Determination for the year 2011 (figures in millions):

Class      Frequency   Midpoint
Interval   f           X          fX        X-X̄      |X-X̄|    f|X-X̄|    (X-X̄)^2   f(X-X̄)^2
00-04      14.46       2          28.92     -22.18   22.18    320.72    491.95    7113.59
05-09      16.53       7          115.71    -17.18   17.18    283.98    295.15    4878.82
10-14      15.70       12         188.40    -12.18   12.18    191.22    148.35    2329.09
15-19      12.18       17         207.06    -7.18    7.18     87.45     51.55     627.87
20-24      10.68       22         234.96    -2.18    2.18     23.28     4.75      50.73
25-29      9.85        27         265.95    2.82     2.82     27.77     7.95      78.30
30-34      9.36        32         299.52    7.82     7.82     73.19     61.15     572.36
35-39      8.19        37         303.03    12.82    12.82    104.99    164.35    1346.02
40-44      7.13        42         299.46    17.82    17.82    127.05    317.55    2264.13
45-49      5.15        47         242.05    22.82    22.82    117.52    520.75    2681.86
50-54      4.32        52         224.64    27.82    27.82    120.18    773.95    3343.46
55-59      2.77        57         157.89    32.82    32.82    90.91     1077.15   2983.70
60-64      2.66        62         164.92    37.82    37.82    100.60    1430.35   3804.73
65-69      1.75        67         117.25    42.82    42.82    74.93     1833.55   3208.71
70-74      1.46        72         105.12    47.82    47.82    69.81     2286.75   3338.65
Total      122.19                 2954.88                     1813.60             38622.02

Range = 74 - 0 = 74

Mean:
$$ \bar{X} = \frac{\sum fX}{\sum f} = \frac{2954.88}{122.19} = 24.18 $$

Mean Deviation:
$$ MD = \frac{\sum f\,|X - \bar{X}|}{\sum f} = \frac{1813.60}{122.19} = 14.84 $$

Variance:
$$ S^2 = \frac{\sum f\,(X - \bar{X})^2}{\sum f} = \frac{38622.02}{122.19} = 316.08 $$

Standard Deviation:
$$ S = \sqrt{\frac{38622.02}{122.19}} = \sqrt{316.08} = 17.78 $$

Coefficient of Variation (CV):
$$ CV = \frac{S}{\bar{X}} \times 100 = \frac{17.78}{24.18} \times 100 = 73.53\% $$
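The grouped calculations in this example can be re-run with a short script. This is only a sketch: it uses the class midpoints and the frequencies in millions from the example table, divides by the total frequency as the example does, and keeps the mean unrounded, so results may differ slightly from hand-rounded figures. The function name `grouped_summary` is an assumption of this example.

```python
import math

# Class midpoints and frequencies (in millions) from the example table.
midpoints = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72]
freqs = [14.46, 16.53, 15.70, 12.18, 10.68, 9.85, 9.36, 8.19,
         7.13, 5.15, 4.32, 2.77, 2.66, 1.75, 1.46]


def grouped_summary(midpoints, freqs):
    """Mean, mean deviation, variance, standard deviation and CV
    for grouped data, each weighted by the class frequency."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    mean_dev = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / total
    sd = math.sqrt(variance)
    return {
        "total": total,
        "mean": mean,
        "mean_deviation": mean_dev,
        "variance": variance,
        "sd": sd,
        "cv_percent": sd / mean * 100,
    }


summary = grouped_summary(midpoints, freqs)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```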
CORRELATION
Correlation analysis helps in making business and economic
decisions by identifying the nature of the relationship among
business and economic variables.
It applies when one variable depends on another and can be estimated from it.

The Coefficient of Correlation (r) is a measure of the strength of
the linear relationship between two variables.
It requires interval or ratio-scaled data (variables).
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect correlation.
Values close to 0.0 indicate little or no linear correlation.
Negative values indicate an inverse relationship and positive
values indicate a direct relationship.
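The properties listed above can be checked with a small sketch of Pearson's coefficient, computed as the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data are hypothetical, chosen so that the points lie exactly on a line.

```python
import math


def pearson_r(xs, ys):
    """Coefficient of correlation between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4]
r_direct = pearson_r(xs, [2, 4, 6, 8])   # Y rises with X on a straight line
r_inverse = pearson_r(xs, [8, 6, 4, 2])  # Y falls as X rises on a straight line
print(r_direct, r_inverse)
```

A perfect direct relationship gives r = 1 and a perfect inverse relationship gives r = -1; real data usually fall somewhere in between.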

DATA TABLE

X    Y     X-X̄    Y-Ȳ       (X-X̄)(Y-Ȳ)   (X-X̄)^2   (Y-Ȳ)^2
3    41    -1     -13.67    13.67        1         186.78
7    76    3      21.33     64.00        9         455.11
6    56    2      1.33      2.67         4         1.78
5    78    1      23.33     23.33        1         544.44
2    43    -2     -11.67    23.33        4         136.11
1    34    -3     -20.67    62.00        9         427.11

X̄ = 24/6 = 4, Ȳ = 328/6 = 54.67
Σ(X-X̄)(Y-Ȳ) = 189, Σ(X-X̄)^2 = 28, Σ(Y-Ȳ)^2 = 1751.33
VALUE OF R
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{189}{\sqrt{28 \times 1751.33}} = 0.85 $$

Comment: Since the value of r is positive and close to 1,
the variables have a strong direct relationship between
them.
REGRESSION
Regression is a statistical analysis that assesses the association
between two variables. It is used to find the form of the
relationship between them.
General form of the linear regression model:
Y = a + bX + e
Where,
Y : dependent variable
a : intercept term
b : slope of the line
X : independent variable
e : error term
We want to estimate a and b such that the sum of squared
errors, Σe^2, is minimum.

REGRESSION ANALYSIS

X    Y     X-X̄    Y-Ȳ       (X-X̄)(Y-Ȳ)   (X-X̄)^2
3    41    -1     -13.67    13.67        1
7    76    3      21.33     64.00        9
6    56    2      1.33      2.67         4
5    78    1      23.33     23.33        1
2    43    -2     -11.67    23.33        4
1    34    -3     -20.67    62.00        9

X̄ = 4, Ȳ = 54.67
Σ(X-X̄)(Y-Ȳ) = 189
Σ(X-X̄)^2 = 28
REGRESSION ANALYSIS
So,
$$ b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad a = \bar{Y} - b\bar{X} $$
Here, after putting in the values,

b = 189/28 = 6.75

a = 54.67 - 6.75(4) = 27.67

From these, the linear regression model is Y = 27.67 + 6.75X.
Here the regression coefficient is 6.75, which means that if the independent
variable changes by 1 unit, the dependent variable changes by 6.75 units on average.
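The least-squares estimates can be sketched as below: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of X, and the intercept is the mean of Y minus the slope times the mean of X. The function name `fit_line` and the data are hypothetical; the points were chosen to lie exactly on Y = 1 + 2X so the result is easy to verify.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in Y per unit change in X
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X, for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(f"Y = {a:.2f} + {b:.2f}X")
```

Note that the deviations must be taken from the actual means of the data; a quick check is that the deviations in each column sum to zero.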
THANK YOU
