Chapter 2 Data Collection Statistics

CHAPTER-II
DATA COLLECTION AND PRESENTATION
Definition of Data
Data are considered the raw materials of statistics.
Data are numerical measurements that are collected in a scientific,
systematic way and are related to predetermined objectives.
Statistical observations are called data.
Source of Data
Data can be obtained from three important sources:
DATA
  Primary Data
    Questioning
      Personal Interview
      Mail
      Telephone
    Observations
    Focused Group Discussion (FGD)
  Internal Data
  Secondary Data
Primary Data:
Primary data are measurements observed & recorded as part of an
original study. When the data required for a particular study are not
already available, it may become necessary to collect original data
by conducting a first-hand investigation.
Internal Data
Internal data refer to the measurements that are the by-product of
routine business record keeping, such as accounting, finance,
production, personnel, quality control, sales, research and
development etc.
Secondary Data
When an investigator uses data which have already been collected by
others, such data are called secondary data.
Secondary data can be obtained from journals, reports, Govt.
publications, etc.
METHODS OF OBTAINING PRIMARY DATA:
There are three basic methods of collecting primary data:
Questioning.
Observations.
Focused Group Discussion (FGD).
Questioning:
Questioning, as the name suggests, is distinguished by the fact that
data are collected by asking questions of people who are thought to
have the desired information. Questions may be asked in person or in
writing. A formal list of such questions is called a questionnaire.
A distinction is often made between a questionnaire & a schedule. A
questionnaire is a device for securing answers to questions by using
a form which the respondent fills in himself.
A schedule is the name usually applied to a set of questions which
are asked & filled in by an interviewer in a face-to-face situation
with another person.
Observations:
When the data are collected by observation, the investigator asks no
questions. Instead he observes the objects or actions in which he is
interested. Sometimes individuals make the observations; on other
occasions, mechanical devices observe & record the desired
information.
The observation method does not automatically produce accurate data.
Physical difficulties in the observation situation, or on the part of
the observer, may result in errors. Even more important, however, is
the influence on observations of the observer's training, philosophy,
opinions & expectations.
Examples include such projects as the reading of X-ray films & ECGs,
or assessing the state of repair of roads.
FOCUSED GROUP DISCUSSION
When using focused group discussions as a research technique, the
researcher is no longer the center of activity, but rather lets
informants discuss with each other while providing guidance.
Tools: Discussion Guide, Tape Recorder.
QUESTIONNAIRE METHODS
Of the three methods named above, the questionnaire method is the
most widely used for collecting business data. When the questionnaire
method is used, three different techniques of communicating the
questionnaire are available:
Personal interview.
Mail.
Telephone.
1. Personal Interview:
Personal interviews are those in which an interviewer obtains
information from respondents in face-to-face meetings.
The information obtained by this method is likely to be more accurate
because the interviewer can clear up doubts & cross-examine the
informants, & thereby obtain correct information.
2. Mail:
In most mail surveys, questionnaires are mailed to the respondents,
who fill them in & return them by mail. Sometimes mail questionnaires
are placed in respondents' hands by other means, such as attaching
them to consumer products, putting them in magazines or newspapers,
or having field workers leave them with respondents. In each case
respondents complete the questionnaires themselves & send back the
completed forms by mail.
This method has a special advantage in surveys where the field of
investigation is very vast & the informants are spread over a wide
geographical area.
3. Telephone Interviews:
Telephone interviews are similar to personal interviews, except that
communication between interviewer & respondent is by telephone
instead of direct personal contact.
However, this method has several limitations: it cannot be used to
interview people who do not have a telephone, a telephone
conversation cannot be very long, & replies on the phone can be
erratic & unreliable.
DATA COLLECTION TECHNIQUES AND TOOLS
Data Collection Techniques              Data Collection Tools
Using available information             Checklist, data compilation form
Observing                               Eyes and ears, pen and paper, watch, tape or video recorder etc.
Interviewing                            Interview schedule, checklist, questionnaire, tape recorder
Administering written questionnaires    Questionnaire
Organizing Focus Group Discussions      Discussion guide, tape recorder
Data:
Statistical observations are called data.
Attribute:
A phenomenon that is expressed in some qualitative form is termed an
attribute. Examples: occupation, religion, education etc.
Variable:
A variable is a quantitative characteristic of a person, object or
phenomenon that can take more than one value. Examples: income,
price, height, weight, family size. There are mainly four types of
variables:
Discrete Variable.
Continuous Variable.
Dependent Variable.
Independent Variable.
1. Discrete Variable:
Discrete variables are those which can only vary by finite jumps &
cannot manifest every conceivable fractional value. They are obtained
by counting. For instance, the number of persons in a family, the
number of rooms in a house, the number of employees.
2. Continuous Variable:
A continuous variable is capable of manifesting every conceivable
fractional value within the range of possibilities. It is obtained by
measurement. Examples: height, weight of a product, age.
3. Dependent Variable:
The variable that is used to describe or measure the problem under
study is called the dependent variable.
4. Independent Variable:
The variables that are used to describe or measure the factors that
are assumed to cause, or at least influence, the problem are called
independent variables.
Presentation of Data:
Presentation can take place mainly in two forms:
Statistical Table.
Statistical Chart.
A Statistical Table is the presentation of numbers in a logical
arrangement, with some brief explanation to show what they are.
A Statistical Chart is a pictorial device for presenting data.
Classification of Data
Classification is the grouping of related facts into different
classes.
Facts in one class differ from those of another class with respect to
some characteristic, called the basis of classification.
Sorting facts on one basis of classification & then on another basis
is called cross-classification.
Classification of data is a function very similar to that of sorting
letters in a post office (the common characteristic being the
destination).
This process helps in gathering important information while dropping
unnecessary details, enabling statistical treatment.
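The idea of classifying on one basis and then cross-classifying on a second basis can be sketched in a few lines of Python. The records below are hypothetical and invented purely for illustration (the district names and literacy labels are not from the slides); this is a minimal sketch, not a prescribed procedure.

```python
from collections import Counter

# Hypothetical records, invented for illustration: (district, literacy).
records = [
    ("District A", "literate"),
    ("District A", "illiterate"),
    ("District B", "literate"),
    ("District A", "literate"),
    ("District B", "illiterate"),
]

# Classification on one basis only: the district.
by_district = Counter(district for district, _ in records)

# Cross-classification: district first, then literacy within each district.
cross = Counter(records)

print(by_district)
print(cross)
```

Each key of `cross` is one cell of the cross-classified table, and the counts in all cells add up to the number of records.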
Types of classification, by basis:
Geographical: area wise, e.g. cities, districts etc.
Chronological: on the basis of time.
Qualitative: according to some attribute.
Quantitative: in terms of some measurable quantity (magnitude).
1. Geographical Classification:
The basis is geographical or locational difference, e.g. production
of wheat or rice in different states.
Areas are usually listed in alphabetical order for easy reference.
They may also be listed by size to emphasize the important areas, as
in ranking the states by population.
2. Chronological Classification:
The basis is the period of time.
Time series are usually listed in chronological order, normally
starting with the earliest period.
When the major emphasis falls on the most recent events, a reverse
time order may be used.
3. QUALITATIVE CLASSIFICATION:
The basis is some attribute or quality, such as color of hair,
literacy, religion, sex etc.
The point to note in this type of classification is that the
attribute under study cannot be measured: one can only find out
whether it is present or absent in the units of the population under
study.
Example: blindness. We may find out how many persons are blind in a
given population. It is not possible to measure the degree of
blindness in each case. Thus, when only one attribute is studied, two
classes are formed, one possessing the attribute & the other not
possessing the attribute. This classification is known as simple
classification.
Population
  Blind
  Non-blind
This type of classification, where only two classes are formed, is
also called two-fold or dichotomous classification.
If, instead of forming only two classes, we further divide the data
on the basis of some attributes so as to form several classes, the
classification is known as manifold classification.
[Diagram: manifold classification of a population]
4. Quantitative Classification:
The basis is some measurable characteristic such as height, weight,
income, sales.
Workers of a factory may be classified according to wages.
In this type of classification there are two elements:
o The variable, i.e. the wage (income in the tables below).
o The frequency, i.e. the number of workers in each class.
CLASSIFICATION ACCORDING TO CLASS INTERVALS:
1. Class Limit:
The class limits are the lowest & the highest values that can be
included in the class.
In the class 20 - 30, 20 is the lower limit & 30 is the upper limit.
The lower limit of a class is the value below which there can be no
value in that class, & the upper limit of a class is the value above
which no value can belong to that class.
2. Class Interval:
The span of the class, that is, the difference between the upper
limit & the lower limit, is known as the class interval.
For the class 20 - 30, the class interval is 30 - 20 = 10.
The size of the class interval is determined by the number of classes
& the total range of the data.
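The two statements above (the interval is the upper limit minus the lower limit, and its size depends on the range and the number of classes) can be sketched as follows. The rounding-up rule in `suggested_interval` is an assumption made for illustration, and the values 16, 69 and 5 are used only as sample inputs; the slides do not prescribe a single rule.

```python
import math

def class_interval(lower_limit, upper_limit):
    """Span of a class: upper limit minus lower limit."""
    return upper_limit - lower_limit

def suggested_interval(lowest, highest, n_classes):
    """Assumed rule of thumb: total range / number of classes, rounded up."""
    return math.ceil((highest - lowest) / n_classes)

print(class_interval(20, 30))         # 10
print(suggested_interval(16, 69, 5))  # 11
```

In practice the suggested size is often rounded to a convenient figure such as 10.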
3. Class Frequency:
The number of observations corresponding to a particular class is
known as the frequency of that class, or class frequency.
Income         No. of Emp.
800 - 900      50
900 - 1000     100
1000 - 1100    200
1100 - 1200    150
1200 - 1300    40
1300 - 1400    10
Total          550
There are 50 employees having income between 800 & 900.
If we add together the frequencies of all the individual classes, we
obtain the total frequency. The total frequency of the six classes is
550, which means that in all there are 550 employees whose income has
been studied.
4. Class Midpoint:
Mid point of a class = (Upper limit of the class + Lower limit of the class) / 2
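As a small illustration of the midpoint formula, the sketch below applies it to the six income classes of the table above. The function and variable names are chosen here for illustration only.

```python
def class_midpoint(lower_limit, upper_limit):
    """Mid point of a class = (upper limit + lower limit) / 2."""
    return (upper_limit + lower_limit) / 2

# Income classes from the table above, as (lower limit, upper limit).
income_classes = [(800, 900), (900, 1000), (1000, 1100),
                  (1100, 1200), (1200, 1300), (1300, 1400)]

midpoints = [class_midpoint(lo, hi) for lo, hi in income_classes]
print(midpoints)  # [850.0, 950.0, 1050.0, 1150.0, 1250.0, 1350.0]
```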
There are two methods of classifying data according to class
intervals:
Exclusive method.
Inclusive method.
EXCLUSIVE METHOD:
When the class intervals are so fixed that the upper limit of one
class is the lower limit of the next class, it is known as the
exclusive method.
Income         No. of Emp.
800 - 900      50
900 - 1000     100
1000 - 1100    200
1100 - 1200    150
1200 - 1300    40
1300 - 1400    10
Total          550
The exclusive method ensures continuity of data.
In the above example there are 50 persons whose income is between
Tk. 800 & Tk. 899.99. A person who is getting Tk. 900 would be
included in the 900 - 1000 class.
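The exclusive rule (lower limit included, upper limit excluded) can be sketched as a small lookup. The default arguments mirror the income table above; the function name and its parameters are hypothetical.

```python
def exclusive_class(value, first_lower=800, width=100, n_classes=6):
    """Return (lower, upper) of the class containing value, or None.

    Exclusive method: lower limit included, upper limit excluded.
    """
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width
        if lower <= value < upper:
            return (lower, upper)
    return None  # value lies outside all classes

print(exclusive_class(899.99))  # (800, 900)
print(exclusive_class(900))     # (900, 1000)
```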
INCLUSIVE METHOD:
Under this method of classification, the upper limit of one class is
included in that class itself.
Income         No. of Emp.
800 - 899      50
900 - 999      100
1000 - 1099    200
1100 - 1199    150
1200 - 1299    40
1300 - 1399    10
Total          550
In the class 800 - 899 we include persons whose income is between 800
& 899. If the income of a person is exactly 900, he is included in
the next class.
It should be noted that both the inclusive & the exclusive method
give us the same class intervals. For the inclusive method the class
interval is obtained by taking the difference between two successive
upper limits.
Class interval = 999 - 899 = 100
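A minimal sketch of the inclusive method follows, assuming whole-number incomes (a value such as 899.5 would fall between two classes and is reported as unclassified). The class list and function names are illustrative assumptions.

```python
# Inclusive classes: both limits belong to the class.
inclusive_classes = [(800, 899), (900, 999), (1000, 1099),
                     (1100, 1199), (1200, 1299), (1300, 1399)]

def inclusive_class(value, classes=inclusive_classes):
    """Return the class whose limits both include value, or None."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None

def inclusive_interval(classes=inclusive_classes):
    """Class interval = difference between two successive upper limits."""
    return classes[1][1] - classes[0][1]

print(inclusive_class(899))   # (800, 899)
print(inclusive_class(900))   # (900, 999)
print(inclusive_interval())   # 100
```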
STATISTICAL TABLE
One of the simplest & most revealing devices for summarizing data in
a meaningful fashion is the statistical table. The purpose of the
table is to simplify the presentation & to facilitate comparison.
Parts of a Table:
1. Table Number:
The number may be given either in the center at the top, above the
title, or on the left-hand side (L.H.S.) of the table, at the top or
at the bottom.
2. Title of the table:
The title is the description of the content of the table.
A complete title has to answer the questions:
o What precisely are the data in the table?
o Where did the data occur (the precise geographical, political or
physical area covered)?
o When did the data occur?
The title should be clear, brief & self-explanatory.
3. Caption:
Caption refers to the column heading. It explains what the column
represents. It may consist of one or more column headings.
Under a column heading there may be subheads.
The caption should be clearly defined & placed at the middle of the
column.
If the different columns are expressed in different units, the units
should be specified along with the captions.
4. Stub:
Stubs are the designations of the rows, or row headings.
They are placed at the extreme left.
5. Body:
The body of the table contains the numerical information.
This is the most vital part of the table.
Data presented in the body are arranged according to the descriptions
& classifications of the captions & stubs.
6. Head note:
It is a brief explanatory statement applying to all or a major part
of the material in the table, & is placed below the title, centered &
enclosed in brackets.
It is used to explain certain points relating to the whole table that
have not been included in the title, the captions or the stubs.
For example, the unit of measurement is frequently written as the
head note, such as "in thousands", "in million tonnes" or "in crores".
7. Foot note:
Anything in a table which the reader may find difficult to understand
from the title, stubs or captions should be explained in foot notes.
If foot notes are needed, they are placed directly below the body of
the table.
Foot notes are used for four main purposes:
o To point out exceptions, so that no wrong conclusion is drawn from
them.
o To point out any special circumstances affecting the data, such as
a strike.
o To clarify anything in the table.
o To give the source in case of secondary data.
Graphs:
Broadly, the various graphs can be divided under the following two
heads:
Graphs of time series or line graphs.
Graphs of frequency distributions.
Graphs of Frequency Distributions:
Histograms or column diagrams.
Frequency polygon.
Smoothed frequency curve.
Cumulative frequency curve.
FREQUENCY DISTRIBUTION:
A frequency distribution is a statistical table which shows the set
of all distinct values of the variable arranged in order of
magnitude, either individually or in groups, with their corresponding
frequencies side by side. A frequency distribution table is given
below:
FREQUENCY DISTRIBUTION TABLE
Class Interval   Tally       Frequency
20-30            III
30-40            IIII
40-50            IIII I      6
50-60            IIII III
60-70            II
70-80
PROBLEM
The profits (in taka) of 30 companies for the year 2010-2011 are
given below:
20, 65, 49, 29, 22, 35, 42, 37, 42, 48,
53, 49, 39, 48, 67, 18, 16, 23, 37, 35,
63, 65, 55, 45, 58, 57, 69, 25, 58, 65.
1. Classify the above data taking a suitable class interval.
2. Represent the data in a frequency distribution table.
Highest value = 69
Lowest value = 16
Range = 69 - 16 = 53
53 ÷ 5 ≈ 11
53 25 = 2.00
Profits (in taka)   Tally       Frequency
15 - 25             IIII        5
25 - 35             II          2
35 - 45             IIII II     7
45 - 55             IIII I      6
55 - 65             IIII        5
65 - 75             IIII        5
N = 30
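The tallying in this problem can be checked with a short script. This is a minimal sketch assuming the exclusive method with classes 15 - 25, 25 - 35, ..., 65 - 75; the function name `frequency_table` and its parameters are illustrative, not part of the original problem.

```python
# The 30 profit values (in taka) from the problem statement.
profits = [20, 65, 49, 29, 22, 35, 42, 37, 42, 48,
           53, 49, 39, 48, 67, 18, 16, 23, 37, 35,
           63, 65, 55, 45, 58, 57, 69, 25, 58, 65]

def frequency_table(values, first_lower, width, n_classes):
    """Count values per class using the exclusive method [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        k = (v - first_lower) // width  # index of the class containing v
        if 0 <= k < n_classes:
            counts[k] += 1
    return [((first_lower + k * width, first_lower + (k + 1) * width), counts[k])
            for k in range(n_classes)]

table = frequency_table(profits, first_lower=15, width=10, n_classes=6)
for (lower, upper), f in table:
    print(f"{lower} - {upper}: {f}")
print("N =", sum(f for _, f in table))
```

The printed frequencies are 5, 2, 7, 6, 5 and 5, which add up to N = 30.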
Name of graph    X-axis               Y-axis
Histogram        Class lower limit    Frequency
F. Polygon       Class mid values     Frequency
Frequency Histogram:
A histogram is a graphical method of presenting data, where the
observations are located on a horizontal axis (usually grouped into
intervals) and the frequency of those observations is depicted along
the vertical axis.
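A rough text rendering can make the definition concrete. The sketch below uses the class frequencies obtained by counting the 30 profit values of the problem above with classes 15 - 25 to 65 - 75; it draws the bars sideways (classes down the page, frequency across), which is a simplification of a real histogram, and the function name is hypothetical.

```python
# Classes and frequencies counted from the 30 profit values of the problem.
classes = ["15-25", "25-35", "35-45", "45-55", "55-65", "65-75"]
frequencies = [5, 2, 7, 6, 5, 5]

def text_histogram(labels, freqs, mark="#"):
    """One line per class; bar length equals the class frequency."""
    return [f"{label} | {mark * f} ({f})" for label, f in zip(labels, freqs)]

lines = text_histogram(classes, frequencies)
print("\n".join(lines))
```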
FREQUENCY DISTRIBUTION TABLE
Profits (in taka)   Tally       Frequency
20 - 30             III
30 - 40             IIII
40 - 50             IIII I
50 - 60             IIII III
60 - 70             II
70 - 80                         1
N = 30
[Figure: Histogram. X-axis: Class Interval (20-30 to 70-80); Y-axis: Frequency]
Frequency Polygon:
A frequency polygon is a graphical display of a frequency table.
The intervals are shown on the X-axis and the number of scores
in each interval is represented by the height of a point located
above the middle of the interval. The points are connected so
that together with the X-axis they form a polygon.
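The points of a frequency polygon can be computed before any plotting. The sketch below uses the classes and frequencies of the 30-company profit problem; closing the polygon with zero-frequency points one class width beyond each end is a common convention assumed here, and the function name is illustrative.

```python
def polygon_points(classes, freqs):
    """Return (class midpoint, frequency) pairs, closed on the X-axis."""
    width = classes[0][1] - classes[0][0]
    mids = [(lo + hi) / 2 for lo, hi in classes]
    # Assumed convention: add zero-frequency points one class width
    # beyond each end so the line meets the X-axis.
    xs = [mids[0] - width] + mids + [mids[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))

classes = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
freqs = [5, 2, 7, 6, 5, 5]

points = polygon_points(classes, freqs)
print(points)
```

Joining these points in order with straight lines gives the polygon.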
[Figure: Frequency Polygon. X-axis: Class Mid-point (classes 20-30 to 70-80); Y-axis: Frequency]
Bar Diagram:
A bar chart or bar graph is a chart with rectangular bars whose
lengths are proportional to the values that they represent. The bars
can be plotted vertically or horizontally. Bar charts are used for
plotting discrete data, that is, data which have discrete values and
are not continuous.
Year    Production (Ton)
1995    50
1996    60
1997    71
1998    72
1999    74
[Figure: Bar Diagram. X-axis: Years (1995 to 1999); Y-axis: Production (Ton); bar values 50, 60, 71, 72, 74]
Pie Chart:
A pie chart is a circular chart divided into sectors. In a pie chart,
the arc length of each sector (and consequently its central angle &
area) is proportional to the quantity it represents. When angles are
measured with one turn as the unit, a number of percent is identified
with the same number of centiturns. Together, the sectors create a
full disk. It is named for its resemblance to a pie which has been
sliced. The pie chart is the most ubiquitous statistical chart in the
business world and the mass media.
To present the data in a pie chart, we have to calculate the
percentage (%) of personnel by job status & the degree of angle
occupied by each category. The calculation is as follows:
Sl. No.   Personnel by job status   No. of personnel   (%) of personnel   Degree of angle
1         Workers                   436                53                 192.7
2         Computer operator         218                27                 96.3
3         Officers                  109                13                 48
4         Others                    54                 7                  23
Degree of angle = (f / N) × 360
f = frequency (number of personnel in the category)
N = total frequency
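The percentage and angle calculation can be sketched directly from the formula (f / N) × 360. The counts are those listed for the four job categories; the function name `pie_shares` is hypothetical, and the angles it computes from the raw counts may differ slightly from the rounded figures shown in the slide.

```python
# Number of personnel by job status, as listed in the calculation above.
personnel = {"Workers": 436, "Computer operator": 218,
             "Officers": 109, "Others": 54}

def pie_shares(counts):
    """Map each category to (percentage, angle in degrees) using f / N."""
    total = sum(counts.values())  # N = total frequency
    return {name: (100 * f / total, 360 * f / total)
            for name, f in counts.items()}

N = sum(personnel.values())
shares = pie_shares(personnel)
for name, (pct, angle) in shares.items():
    print(f"{name}: {pct:.1f}% -> {angle:.1f} degrees")
```

The percentages add up to 100 and the angles to 360, which is a quick check on the arithmetic.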
[Figure: Pie Chart of personnel by job status. Sector 1 = 53%, sector 2 = 27%, sector 3 = 13%, sector 4 = 7%]
THANK YOU
