
BUSINESS STATISTICS

UNIT I
Data is a collection of related
observations.
SOME DEFINITIONS
Descriptive statistics.
Graphical and numerical procedures
used to summarize and process data
and transform data into information.
Inferential statistics.
Basis for predictions, forecasts and
estimates that are used to transform
information into knowledge.
SOME DEFINITIONS
A population is the complete set of all
items that interest an investigator.
It can be very large or infinite.
A sample is an observed subset of the
population, of size n.
A parameter is a specific characteristic
of a population.
A statistic is a specific characteristic of
a sample.
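The distinction between a parameter and a statistic can be sketched in a few lines. The population below is hypothetical (randomly generated salaries), and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: salaries (in thousands) of 1,000 employees.
random.seed(42)  # fixed seed so the sketch is reproducible
population = [random.randint(20, 100) for _ in range(1000)]

# Parameter: a characteristic of the whole population.
population_mean = statistics.mean(population)

# Sample: an observed subset of the population, of size n.
n = 50
sample = random.sample(population, n)

# Statistic: the same characteristic, computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean, n={n}): {sample_mean:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean.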
TYPES OF DATA
Categorical.
Do you own a car? : Yes/No.
This course is useful : Agree/Disagree.
Numerical.
Discrete : Number of students enrolled in a
course.
Continuous : Average marks of students
in a course.
Qualitative and Quantitative data.
TYPES OF DATA
Measurement Levels
Nominal :
Gender, Names of places.
Ordinal :
Rank ordering of items. Satisfaction rating,
Product quality.
Interval.
Distance from an arbitrary zero.
Ratio.
Distance from an absolute zero.
MEASUREMENT LEVELS : EXERCISES

Temperature, Time of the day, Salary.


How many minutes, on average, do you
spend per week on physical exercise?
Social commitment of businesses : Strongly
committed, Committed, Indifferent.
I tolerate corrupt practices by my
colleagues, superiors and subordinates :
Always, Many times, Some times, Never,
Don't want to answer.
DESCRIPTIVE STATISTICS :
TYPES
Organizing Data
Tables
Graphs

Summarizing Data
Central Tendency
Variation
DATA ARRAY
Highest and lowest values are easily
noticeable.
Easy to divide into sections.
Easy to notice repetitions.
Distances between succeeding
values observable.
DATA ARRAY : EXERCISE
4.1 4.3 4.2 5.5 5.5
4.0 3.8 4.5 3.9 4.0
4.9 4.5 5.5 4.4 4.1
5.5 4.1 4.2 4.9 4.7
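One possible way to work the exercise is sketched below: sorting the 20 observations produces the data array, after which the extremes, the repetitions and the gaps between succeeding values can be read off. The variable names are illustrative only.

```python
from collections import Counter

# Raw observations from the exercise, in the order given.
raw = [4.1, 4.3, 4.2, 5.5, 5.5,
       4.0, 3.8, 4.5, 3.9, 4.0,
       4.9, 4.5, 5.5, 4.4, 4.1,
       5.5, 4.1, 4.2, 4.9, 4.7]

# A data array is simply the observations in ascending order.
data_array = sorted(raw)

# Highest and lowest values are now at the ends.
lowest, highest = data_array[0], data_array[-1]

# Repetitions: values that occur more than once.
repeats = {v: c for v, c in Counter(data_array).items() if c > 1}

# Distances between succeeding values (rounded to hide float noise).
gaps = [round(b - a, 1) for a, b in zip(data_array, data_array[1:])]

print(data_array)
print("lowest:", lowest, "highest:", highest)
print("repeated values:", repeats)
print("largest gap:", max(gaps))
```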
FREQUENCY DISTRIBUTION
A Frequency Distribution organizes data into
classes, or categories, with a count of the
number of observations that fall into each
class.
Classes are all inclusive and mutually exclusive.
Classes should have equal width.
Open ended classes : others, 75 and
above.
Same example.
Relative frequency distribution.
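A minimal sketch of a frequency and relative frequency distribution, using the 20 values of the data array exercise. The six classes of width 0.3 starting at 3.8 are an assumption made for illustration; values are converted to integer tenths so that class limits compare exactly.

```python
raw = [4.1, 4.3, 4.2, 5.5, 5.5,
       4.0, 3.8, 4.5, 3.9, 4.0,
       4.9, 4.5, 5.5, 4.4, 4.1,
       5.5, 4.1, 4.2, 4.9, 4.7]

# Work in integer tenths (38 stands for 3.8) to avoid float comparisons.
tenths = [round(x * 10) for x in raw]

# Assumed classes: 3.8-4.0, 4.1-4.3, ..., 5.3-5.5 (equal width).
classes = [(38 + 3 * i, 40 + 3 * i) for i in range(6)]

# Frequency: number of observations falling into each class.
freq = [sum(lo <= t <= hi for t in tenths) for lo, hi in classes]

# Relative frequency: each count as a fraction of all observations.
rel_freq = [f / len(raw) for f in freq]

for (lo, hi), f, r in zip(classes, freq, rel_freq):
    print(f"{lo / 10:.1f}-{hi / 10:.1f}  {f:2d}  {r:.2f}")
```

Because the classes are all inclusive and mutually exclusive, the frequencies add up to the number of observations and the relative frequencies add up to 1.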
CONSTRUCTING FREQUENCY
DISTRIBUTION TABLE
Number of classes? 6 to 15 is a reasonable
range.
Width of the class :
W = (Next unit value after the largest
data point - Smallest data point)/
(Total number of class intervals)
Create a data array and then count
the number of points in each class.
Cumulative frequency distribution :
Less than, More than.
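The steps above can be sketched as follows. The sketch assumes the class width is the span from the smallest data point to the next unit value after the largest, divided by the number of classes; it also assumes six classes and uses class counts tallied by hand from the data array exercise.

```python
from itertools import accumulate

# Extremes of the data array exercise, in integer tenths (38 = 3.8).
smallest, largest = 38, 55
unit = 1          # data are recorded to one decimal place, i.e. 1 tenth
num_classes = 6   # assumed; inside the suggested range

# Class width in tenths: (next unit value after largest - smallest) / classes.
width = (largest + unit - smallest) / num_classes

# Lower and upper limits of each class, back in original units.
limits = [((smallest + i * width) / 10, (smallest + (i + 1) * width - unit) / 10)
          for i in range(num_classes)]

# Counts per class, tallied by hand from the exercise data.
freq = [4, 6, 3, 3, 0, 4]

# "Less than" cumulative frequency: running total from the first class.
less_than = list(accumulate(freq))

# "More than" cumulative frequency: running total from the last class.
more_than = list(accumulate(reversed(freq)))[::-1]

print("width:", width / 10)
for (lo, hi), f, lt, mt in zip(limits, freq, less_than, more_than):
    print(f"{lo:.1f}-{hi:.1f}  f={f}  less-than={lt}  more-than={mt}")
```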
BAR GRAPHS AND
HISTOGRAMS
COLOR     FREQ
Red       5
Blue      7
Green     3
Yellow    8
[Figure: bar graph of FREQ by COLOR (Red, Blue, Green, Yellow); vertical axis 0 to 10]

Value Range    FREQ
3.8-4.0        5
4.1-4.3        7
4.4-4.6        3
4.7-4.9        3
5.0-5.2        0
5.3-5.5        2
[Figure: histogram of FREQ over the value ranges 3.8-4.0 through 5.3-5.5]
BELL/NORMAL/SYMMETRIC
RIGHT SKEWED
LEFT SKEWED
UNIFORM
BIMODAL
PROBLEMS WITH
HISTOGRAMS
Original data cannot be retrieved.
The sizes of classes can be
manipulated to trick viewers.
Can conceal time differences among
datasets.
FREQUENCY POLYGON

[Figure: two side-by-side frequency plots over the classes 3.8-4.0, 4.1-4.3, 4.4-4.6, 4.7-4.9, 5.0-5.2 and 5.3-5.5; vertical axis 0 to 8]
FREQUENCY POLYGON
Frequency polygons are similar to
histograms.
A polygon is simpler than its histogram
counterpart.
Sketches the outline of the data
pattern more clearly.
Polygon becomes increasingly
smooth as we increase the number
of classes.
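A frequency polygon is usually drawn by plotting each class frequency at the class midpoint and joining the points. The sketch below computes those points from the class frequencies in the histogram table; closing the polygon with a zero-frequency point one class width before and after the data is a common convention, assumed here rather than taken from the slides.

```python
# Class limits and frequencies from the histogram table.
classes = [(3.8, 4.0), (4.1, 4.3), (4.4, 4.6),
           (4.7, 4.9), (5.0, 5.2), (5.3, 5.5)]
freq = [5, 7, 3, 3, 0, 2]

# Midpoint of each class (rounded to hide float noise).
midpoints = [round((lo + hi) / 2, 2) for lo, hi in classes]

# Distance between neighbouring midpoints, i.e. the class width.
step = round(midpoints[1] - midpoints[0], 2)

# Polygon vertices: zero before the first class, one point per class,
# zero after the last class (assumed closing convention).
points = ([(round(midpoints[0] - step, 2), 0)]
          + list(zip(midpoints, freq))
          + [(round(midpoints[-1] + step, 2), 0)])

for x, y in points:
    print(f"{x:.1f}  {y}")
```

Passing the x and y values of `points` to any line-plotting routine draws the polygon.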
TIME SERIES PLOT
Series of data plotted at various time
intervals.
Sales data, student enrolment etc.
Indicates trend.
TIME SERIES PLOT EXAMPLE
[Figure: time series plot of Daily Visitors; vertical axis 5000 to 5700; horizontal axis with 12 time points labeled 38718 to 39052]
TIME SERIES PLOT EXAMPLE
[Figure: time series plot of Daily Visitors; vertical axis 4000 to 7000; horizontal axis with 12 time points labeled 38718 to 39052]
SCATTER PLOTS
Scatter plots provide insight as to the
relationship that may exist between two
variables : dependent and independent.
They provide a picture of :
Range of each variable.
Pattern of values over the range.
A suggestion as to a possible relationship
between the two variables.
An indication of outliers.
SCATTER PLOT EXAMPLE
[Figure: scatter plot; vertical axis 2.5 to 3.9, horizontal axis 400 to 700]
SCATTER PLOT EXAMPLE
[Figure: scatter plot; vertical axis 40 to 70, horizontal axis 2 to 22]
SCATTER PLOT EXERCISE

Speed (km/h)  Fuel Used (Ltrs/100 km)
10 21
20 13
30 10
40 8
50 7
60 5.9
70 6.3
80 6.95
90 7.57
100 8.27
110 9.03
120 9.87
130 10.79
140 11.77
150 12.83
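Before plotting, the pattern in the exercise data can be explored numerically. The sketch below is one possible approach, not a prescribed solution: it finds the speed with the lowest fuel use and lists the speeds from which fuel use falls or rises when moving to the next higher speed.

```python
# Exercise data: speed in km/h and fuel used in litres per 100 km.
speed = list(range(10, 151, 10))
fuel = [21, 13, 10, 8, 7, 5.9, 6.3, 6.95, 7.57,
        8.27, 9.03, 9.87, 10.79, 11.77, 12.83]
pairs = list(zip(speed, fuel))

# Point with the lowest fuel use.
best_speed, best_fuel = min(pairs, key=lambda p: p[1])

# Speeds from which fuel use falls / rises at the next higher speed.
falling = [s for (s, f), (_, f_next) in zip(pairs, pairs[1:]) if f_next < f]
rising = [s for (s, f), (_, f_next) in zip(pairs, pairs[1:]) if f_next > f]

print("most economical speed:", best_speed, "km/h at", best_fuel, "Ltrs/100 km")
print("fuel use falls after:", falling)
print("fuel use rises after:", rising)
```

Fuel use falls and then rises, which suggests the relationship between the two variables is not a straight line; a scatter plot of `pairs` makes the same point visually.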
STEM AND LEAF DIAGRAM
Stem and leaf diagram depicts
summaries while retaining the
original data points.
One of the most useful techniques
for exploratory data analysis.
EXAMPLE
Data Array
48 57 66 71 73 78 84 87 94
50 61 66 72 76 79 84 89 99
51 63 67 72 78 82 85 93 100

Frequency Distribution
40-49   1
50-59   3
60-69   5
70-79   8
80-89   6
90-99   3
>99     1

Stem & Leaf Plot
 4 | 8
 5 | 0 1 7
 6 | 1 3 6 6 7
 7 | 1 2 2 3 6 8 8 9
 8 | 2 4 4 5 7 9
 9 | 3 4 9
10 | 0
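The stem and leaf plot in this example can be reproduced with a short sketch: each value is split into a stem (all digits except the last) and a leaf (the last digit). The function-free layout and variable names are illustrative choices.

```python
from collections import defaultdict

# The 27 values of the data array in this example.
data = [48, 50, 51, 57, 61, 63, 66, 66, 67,
        71, 72, 72, 73, 76, 78, 78, 79,
        82, 84, 84, 85, 87, 89, 93, 94, 99, 100]

stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)  # tens part is the stem, units digit the leaf
    stems[stem].append(leaf)

# One text line per stem, leaves in ascending order.
lines = [f"{stem:>2} | {' '.join(map(str, leaves))}"
         for stem, leaves in sorted(stems.items())]
print("\n".join(lines))

# The number of leaves on each stem is the class frequency.
frequencies = {stem: len(leaves) for stem, leaves in sorted(stems.items())}
print(frequencies)
```

Every original data point can be read back from the plot, which is what separates it from a plain frequency distribution.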
EXAMPLE
         4 |  4 | 8
       7 5 |  5 | 0 1 7
     7 5 3 |  6 | 1 3 6 6 7
 8 7 5 5 2 |  7 | 1 2 2 3 6 8 8 9
 6 6 5 3 2 |  8 | 2 4 4 5 7 9
           |  9 | 3 4 9
       2 1 | 10 | 0
OGIVE
Uses the concept of cumulative
frequency distribution.
Tells us how many observations are below
or above a certain value.
Less Than ogive shows how many items in
the distribution have a value less than the
upper boundary/limit of each class.
More Than ogive shows how many items in
the distribution have a value more than the
lower boundary/limit of each class.
OGIVE - EXAMPLE
Frequency Distribution   Less Than Ogive                    More Than Ogive
CI        Frequency      Values            CF (Less Than)   Values            CF (More Than)
40-49     1              Less Than 49.5    1                More Than 39.5    27
50-59     3              Less Than 59.5    4                More Than 49.5    26
60-69     5              Less Than 69.5    9                More Than 59.5    23
70-79     8              Less Than 79.5    17               More Than 69.5    18
80-89     6              Less Than 89.5    23               More Than 79.5    10
90-99     3              Less Than 99.5    26               More Than 89.5    4
100-109   1              Less Than 109.5   27               More Than 99.5    1
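The two cumulative columns of this example can be recomputed from the class frequencies. The sketch assumes the class boundaries sit half a unit outside the stated class limits, which matches the 49.5, 59.5, ... values used in the example.

```python
from itertools import accumulate

# Class intervals and frequencies of the example.
classes = [(40, 49), (50, 59), (60, 69), (70, 79),
           (80, 89), (90, 99), (100, 109)]
freq = [1, 3, 5, 8, 6, 3, 1]

# Class boundaries: half a unit outside the limits (whole-number data).
upper_bounds = [hi + 0.5 for _, hi in classes]
lower_bounds = [lo - 0.5 for lo, _ in classes]

# Less-than ogive: running total up to each upper boundary.
less_than = list(accumulate(freq))

# More-than ogive: running total from the last class back to each lower boundary.
more_than = list(accumulate(reversed(freq)))[::-1]

for ub, lt, lb, mt in zip(upper_bounds, less_than, lower_bounds, more_than):
    print(f"Less Than {ub:5.1f}: {lt:2d}    More Than {lb:5.1f}: {mt:2d}")
```

Plotting `less_than` against `upper_bounds` gives the less-than ogive, and `more_than` against `lower_bounds` gives the more-than ogive.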
LESS THAN OGIVE
[Figure: less-than ogive; cumulative frequencies 0, 1, 4, 9, 17, 23, 26, 27; vertical axis 0 to 30, horizontal axis 39 to 109]
MORE THAN OGIVE
[Figure: more-than ogive; cumulative frequencies 27, 26, 23, 18, 10, 4, 1, 0; vertical axis (Frequency) 0 to 30, horizontal axis 39 to 109]
OGIVE

[Figure: less-than and more-than ogives on the same axes; vertical axis (Frequency) 0 to 30, horizontal axis 39 to 109]
