
International University IU

STATISTICS FOR BUSINESS [IUBA]


CHAPTER 01

INTRODUCTION AND DESCRIPTIVE STATISTICS

1. SAMPLES AND POPULATIONS

A population consists of the set of all measurements in which the investigator is interested.

A sample is a subset of measurements selected from the population.

Statistics for Business | Chapter 01: Introduction and Descriptive Statistics


A random sample is a sample selected from the population at random, such that every possible sample of n elements has an equal chance of being selected.
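As a minimal sketch of this idea, the snippet below draws a simple random sample with Python's standard `random.sample`, which picks n distinct elements so that every subset of size n is equally likely. The population values, the sample size n = 5, and the helper name `draw_random_sample` are made up for illustration and do not come from the text.

```python
import random

# Hypothetical population: 20 made-up measurements (not from the text).
population = list(range(101, 121))

def draw_random_sample(population, n, seed=None):
    """Return a simple random sample of n elements, drawn without replacement."""
    rng = random.Random(seed)  # seeded generator so a draw can be reproduced
    return rng.sample(population, n)

sample = draw_random_sample(population, n=5, seed=1)
print(sample)
```

Passing a seed is only a convenience for reproducing a draw; leaving it as `None` gives a different sample on each run.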

2. PERCENTILES AND QUARTILES

Percentiles: The Pth percentile of a group of numbers is that value below which lie P% (P percent) of the numbers in the group. The position of the Pth percentile is given by (n + 1)P/100, where n is the number of data points.

Quartiles: The percentage points that break down the data set into quarters: first quarter, second quarter, third quarter, and fourth quarter.

+ The 1st quartile/lower quartile is the 25th percentile.

+ The median is the 50th percentile.

+ The 3rd quartile/upper quartile is the 75th percentile.

Powered by statisticsforbusinessiuba.blogspot.com 1
+ Interquartile Range = 3rd Quartile - 1st Quartile

= Upper Quartile - Lower Quartile

= 75th Percentile - 25th Percentile

+ Range = Largest Observation - Smallest Observation

Example The following data are numbers of passengers on flights of Delta Air Lines between San
Francisco and Seattle over 33 days in April and early May.

128, 121, 134, 136, 136, 118, 123, 109, 120, 116, 125, 128, 121, 129, 130,
131, 127, 119, 114, 134, 110, 136, 134, 125, 128, 123, 128, 133, 132, 136,
134, 129, 132

Find the lower, middle, and upper quartiles of this data set. Also find the 10th, 15th, and
65th percentiles. What is the interquartile range?

(Hint: Use (n + 1)P/100)

Solution First, let's order the data from smallest to largest:

109, 110, 114, 116, 118, 119, 120, 121, 121, 123,
123, 125, 125, 127, 128, 128, 128, 128, 129, 129,
130, 131, 132, 132, 133, 134, 134, 134, 134, 136,
136, 136, 136
n = 33
The lower quartile is the observation in position
(33 + 1)25/100 = 8.5, which is 121 (the 8th and 9th observations are both 121).
The middle quartile (median) is the observation in position
(33 + 1)50/100 = 17, which is 128.
The upper quartile is the observation in position
(33 + 1)75/100 = 25.5, which is 133.5 (midway between the 25th and 26th observations, 133 and 134).
The 10th percentile is the observation in position
(33 + 1)10/100 = 3.4, which is 114 + (116 - 114)(0.4) = 114.8.
The 15th percentile is the observation in position
(33 + 1)15/100 = 5.1, which is 118 + (119 - 118)(0.1) = 118.1.
The 65th percentile is the observation in position
(33 + 1)65/100 = 22.1, which is 131 + (132 - 131)(0.1) = 131.1.
The interquartile range is equal to
Third quartile - First quartile = 133.5 - 121 = 12.5
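The position rule used in this solution can be sketched in a few lines of Python. This is an illustration rather than a standard library routine: the function name `percentile` and the clamping of positions that fall outside 1..n are assumptions made here, and other software often uses a different interpolation convention, so results may differ slightly from, for example, a spreadsheet.

```python
import math

def percentile(data, p):
    """Pth percentile via the position rule (n + 1)P/100 with linear interpolation."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) * p / 100           # 1-based position in the ordered data
    position = min(max(position, 1), n)    # assumption: clamp positions outside 1..n
    lower = math.floor(position)           # whole part of the position
    fraction = position - lower            # fractional part used for interpolation
    if lower >= n:                         # position is exactly the last observation
        return ordered[-1]
    # ordered[lower - 1] is the observation at 1-based position `lower`
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

passengers = [128, 121, 134, 136, 136, 118, 123, 109, 120, 116, 125,
              128, 121, 129, 130, 131, 127, 119, 114, 134, 110, 136,
              134, 125, 128, 123, 128, 133, 132, 136, 134, 129, 132]

q1, median, q3 = (percentile(passengers, p) for p in (25, 50, 75))
print(f"quartiles: {q1:.1f}, {median:.1f}, {q3:.1f}; IQR = {q3 - q1:.1f}")
for p in (10, 15, 65):
    print(f"{p}th percentile: {percentile(passengers, p):.1f}")
```

Run on the passenger data, the sketch gives the same quartiles, percentiles, and interquartile range as the hand calculation.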


3. MODE/MEAN/VARIANCE/STANDARD DEVIATION

Mode
The mode of the data set is the value that occurs most frequently.

Mean
The mean of a set of observations is their average.
Sample: $\bar{x} = \sum x / n$
Population: $\mu = \sum x / N$

Variance
The variance of a set of observations is the average squared deviation of the data points from their mean.
Sample: $s^2 = \sum (x - \bar{x})^2 / (n - 1)$
Population: $\sigma^2 = \sum (x - \mu)^2 / N$

Standard Deviation
The standard deviation of a set of observations is the (positive) square root of the variance of the set.
Sample: $s = \sqrt{s^2} = \sqrt{\sum (x - \bar{x})^2 / (n - 1)}$
Population: $\sigma = \sqrt{\sigma^2} = \sqrt{\sum (x - \mu)^2 / N}$
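To make the difference between the sample and population formulas concrete, here is a short sketch that computes each quantity by hand. The six observations are made up for illustration (they do not come from the text); the only library call is `statistics.mode` from Python's standard library.

```python
import math
import statistics

# Made-up observations for illustration (not from the text).
x = [2, 4, 4, 5, 7, 8]
n = len(x)

mean = sum(x) / n                                   # same arithmetic for x-bar and mu
squared_deviations = sum((xi - mean) ** 2 for xi in x)

sample_variance = squared_deviations / (n - 1)      # s^2: divide by n - 1
population_variance = squared_deviations / n        # sigma^2: divide by N
sample_sd = math.sqrt(sample_variance)              # s
population_sd = math.sqrt(population_variance)      # sigma

print(statistics.mode(x), mean)                     # 4 5.0
print(sample_variance, population_variance)         # 4.8 4.0
print(round(sample_sd, 3), population_sd)
```

The only difference between the two variances is the divisor, so the sample variance is always the larger of the two for the same data.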

CALCULATOR INSTRUCTIONS FOR STATISTICS


Note: This page is only relevant for CASIO scientific calculator fx-570ES and fx-570ES PLUS

Calculating Mean and Standard Deviation of Sample / Population.

(Chapter 01 | Introduction and Descriptive Statistics)

For CASIO scientific calculator fx-570ES:

Step 01: Press MODE + 3: STAT

Step 02: Press 1: 1 VAR

Step 03: Input the data

Step 04: Press SHIFT + 1 [STAT]

Step 05: Press 5: VAR

Step 06:

Press 2: to calculate the sample mean or population mean

Press 3: to calculate the population standard deviation

Press 4: to calculate the sample standard deviation

For CASIO scientific calculator fx-570ES PLUS

Step 01: Press MODE + 3: STAT

Step 02: Press 1: 1 VAR

Step 03: Input the data



Step 04: Press AC

Step 05: Press SHIFT + 1 [STAT]

Step 06: Press 4: VAR

Step 07:

Press 2: to calculate the sample mean or population mean

Press 3: to calculate the population standard deviation

Press 4: to calculate the sample standard deviation

Example

Case of population
The future Euroyen is the price of the Japanese yen as traded in the European
futures market. The following are 30-day Euroyen prices on an index from 0 to 100%:
99.24, 99.37, 98.33, 98.91, 98.51, 99.38, 99.71, 99.21, 98.63, 99.10.
Find the mean, standard deviation, and variance, viewed as a population.
(Hint: Use the calculator)

Case of sample
The daily expenditure on food by a traveler, in dollars in summer 2006, was as follows:
17.5, 17.6, 18.3, 17.9, 17.4, 16.9, 17.1, 17.1, 18.0, 17.2, 18.3, 17.8, 17.1, 18.3,
17.5, 17.4.
Find the mean, standard deviation, and variance.
(Hint: Use the calculator)

Solution It is not necessary to order the data from smallest to largest in either case.

For CASIO scientific calculator fx-570ES

         Case of population                     Case of sample
Step 01  Press MODE + 3: STAT                   Press MODE + 3: STAT
Step 02  Press 1: 1 VAR                         Press 1: 1 VAR
Step 03  Input the data                         Input the data
Step 04  Press SHIFT + 1 [STAT]                 Press SHIFT + 1 [STAT]
Step 05  Press 5: VAR                           Press 5: VAR
Step 06  Press 2: to calculate the              Press 2: to calculate the
         population mean.                       sample mean.
         The result is 99.039.                  The result is 17.588.
         Press 3: to calculate the              Press 4: to calculate the
         population standard deviation.         sample standard deviation.
         The result is 0.414.                   The result is 0.466.
         Finally, the population variance       Finally, the sample variance
         is the square of the standard          is the square of the standard
         deviation:                             deviation:
         $\sigma^2 = (0.414)^2 \approx 0.172$   $s^2 = (0.466)^2 \approx 0.217$

(Square the unrounded calculator value: squaring the rounded 0.414 gives 0.171 rather than 0.172.)

For CASIO scientific calculator fx-570ES PLUS

         Case of population                     Case of sample
Step 01  Press MODE + 3: STAT                   Press MODE + 3: STAT
Step 02  Press 1: 1 VAR                         Press 1: 1 VAR
Step 03  Input the data                         Input the data
Step 04  Press AC                               Press AC
Step 05  Press SHIFT + 1 [STAT]                 Press SHIFT + 1 [STAT]
Step 06  Press 4: VAR                           Press 4: VAR
Step 07  Press 2: to calculate the              Press 2: to calculate the
         population mean.                       sample mean.
         The result is 99.039.                  The result is 17.588.
         Press 3: to calculate the              Press 4: to calculate the
         population standard deviation.         sample standard deviation.
         The result is 0.414.                   The result is 0.466.
         Finally, the population variance       Finally, the sample variance
         is the square of the standard          is the square of the standard
         deviation:                             deviation:
         $\sigma^2 = (0.414)^2 \approx 0.172$   $s^2 = (0.466)^2 \approx 0.217$

(Square the unrounded calculator value: squaring the rounded 0.414 gives 0.171 rather than 0.172.)

Conclusion

Case of population                         Case of sample
Population mean = 99.039                   Sample mean = 17.588
Population standard deviation = 0.414      Sample standard deviation = 0.466
Population variance = 0.172                Sample variance = 0.217
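The calculator results can be cross-checked in software. The sketch below uses Python's standard `statistics` module, where `pstdev` and `pvariance` divide by N (population) and `stdev` and `variance` divide by n - 1 (sample); the variable names are chosen here for illustration.

```python
import statistics

# Case of population: 30-day Euroyen prices from the example.
euroyen = [99.24, 99.37, 98.33, 98.91, 98.51, 99.38, 99.71, 99.21, 98.63, 99.10]

# Case of sample: daily food expenditure from the example.
food = [17.5, 17.6, 18.3, 17.9, 17.4, 16.9, 17.1, 17.1,
        18.0, 17.2, 18.3, 17.8, 17.1, 18.3, 17.5, 17.4]

# Population formulas (divide by N).
pop_mean = statistics.mean(euroyen)
pop_sd = statistics.pstdev(euroyen)
pop_var = statistics.pvariance(euroyen)

# Sample formulas (divide by n - 1).
sample_mean = statistics.mean(food)
sample_sd = statistics.stdev(food)
sample_var = statistics.variance(food)

print(f"population: mean={pop_mean:.4f} sd={pop_sd:.4f} variance={pop_var:.4f}")
print(f"sample:     mean={sample_mean:.4f} sd={sample_sd:.4f} variance={sample_var:.4f}")
```

After rounding to three decimals, the printed values should agree with the calculator results for both cases.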



4. CHEBYSHEV'S THEOREM AND THE EMPIRICAL RULE

Chebyshev's Theorem

No condition: Chebyshev's theorem applies to any data set, whatever the shape of its distribution.

1. At least three-quarters of the observations in a set will lie within 2 standard deviations of the mean.
2. At least eight-ninths of the observations in a set will lie within 3 standard deviations of the mean.
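Both statements are instances of the general form of the theorem, given here for reference. The symbol k (any number greater than 1) is notation introduced for this note: at least the following proportion of the observations lies within k standard deviations of the mean.

```latex
1 - \frac{1}{k^2}, \qquad
k = 2:\; 1 - \frac{1}{4} = \frac{3}{4}, \qquad
k = 3:\; 1 - \frac{1}{9} = \frac{8}{9}.
```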

PROCEDURE OF CHEBYSHEV'S THEOREM

STEP 01: Determine the sample mean ($\bar{x}$) and the sample standard deviation ($s$)
STEP 02: Choose the rule of Chebyshev's theorem and determine the value of $k$
STEP 03: Calculate the interval $[\bar{x} - ks,\ \bar{x} + ks]$
STEP 04: Determine the percentage of observations lying within the specified range
(Divide the number of observations lying within the specified range
by the total number of observations in the data set)
STEP 05: Draw a conclusion
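The five steps can be sketched as one small function. This is an illustration under stated assumptions: the function name `chebyshev_check` is made up, k stands for the number of standard deviations (2 or 3 in the rules above), and the guaranteed minimum is taken as 1 - 1/k^2, which gives three-quarters for k = 2 and eight-ninths for k = 3. The data are the daily food expenditures from the calculator example.

```python
import statistics

def chebyshev_check(data, k):
    """Follow the five steps for one choice of k (k > 1)."""
    mean = statistics.mean(data)                   # STEP 01: sample mean
    s = statistics.stdev(data)                     # STEP 01: sample standard deviation
    bound = 1 - 1 / k**2                           # STEP 02: guaranteed minimum proportion
    low, high = mean - k * s, mean + k * s         # STEP 03: interval
    inside = sum(low <= x <= high for x in data)   # STEP 04: count observations in range
    proportion = inside / len(data)
    return low, high, proportion, proportion >= bound  # STEP 05: conclusion

# Daily food expenditure sample from the calculator example in this chapter.
food = [17.5, 17.6, 18.3, 17.9, 17.4, 16.9, 17.1, 17.1,
        18.0, 17.2, 18.3, 17.8, 17.1, 18.3, 17.5, 17.4]

low, high, proportion, satisfied = chebyshev_check(food, k=2)
print(f"interval = [{low:.2f}, {high:.2f}], proportion inside = {proportion:.2%}")
print("rule satisfied:", satisfied)
```

Because the theorem holds for any data set, `satisfied` should never come out false; the check is useful mainly as a way to see how conservative the bound is.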

Empirical Rule

Condition: The empirical rule can apply if the distribution of the data is mound-shaped, that is, if the histogram of the data is more or less symmetric with a single mode or high point.

1. Approximately 68% of the observations will be within 1 standard deviation of the mean.
2. Approximately 95% of the observations will be within 2 standard deviations of the mean.
3. A vast majority of the observations (all, or almost all) will be within 3 standard
deviations of the mean.

PROCEDURE OF THE EMPIRICAL RULE

STEP 01: Draw the histogram of the data and check the condition that the
distribution of the data is mound-shaped
If the distribution of the data is mound-shaped, follow the next five steps.
If not, do nothing more.
STEP 02: Determine the sample mean ($\bar{x}$) and the sample standard deviation ($s$)
STEP 03: Choose the rule of the Empirical Rule and determine the value of $k$
STEP 04: Calculate the interval $[\bar{x} - ks,\ \bar{x} + ks]$
STEP 05: Determine the percentage of observations lying within the specified range
(Divide the number of observations lying within the specified range
by the total number of observations in the data set)
STEP 06: Draw a conclusion.
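A rough sketch of this procedure in Python follows. It prints a text histogram so the shape can be judged by eye (STEP 01), then compares the observed proportions within 1, 2, and 3 standard deviations with the approximate 68%, 95%, and nearly 100% of the rule. The helper names, the bin width of 5, and the choice of the Delta passenger data from section 2 are assumptions made for illustration; judging whether a histogram is mound-shaped remains a visual decision that the code does not automate.

```python
import statistics

def text_histogram(data, bin_width):
    """Count observations per bin; keys are the left edges of non-empty bins."""
    start = min(data) - (min(data) % bin_width)     # left edge of the first bin
    counts = {}
    for x in data:
        left = start + bin_width * int((x - start) // bin_width)
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

def proportion_within(data, k):
    """Proportion of observations within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(mean - k * s <= x <= mean + k * s for x in data) / len(data)

passengers = [128, 121, 134, 136, 136, 118, 123, 109, 120, 116, 125,
              128, 121, 129, 130, 131, 127, 119, 114, 134, 110, 136,
              134, 125, 128, 123, 128, 133, 132, 136, 134, 129, 132]

bin_width = 5
for left, count in text_histogram(passengers, bin_width).items():
    print(f"{left:>5}-{left + bin_width:<5} {'*' * count}")   # STEP 01: inspect the shape

for k, approx in [(1, 0.68), (2, 0.95), (3, 1.00)]:
    print(f"k={k}: observed {proportion_within(passengers, k):.2f}, rule says about {approx}")
```

If the printed histogram is clearly lopsided or has several peaks, the comparison in the second loop should not be read as a test of the rule, since the condition in STEP 01 has not been met.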

Example Check the applicability of Chebyshev's theorem and the empirical rule for the following
data set:
12.5, 13, 14.8, 11, 16.7, 9, 8.3, 1.2, 3.9, 15.5, 16.2, 18, 11.6, 10, 9.5

Solution Chebyshev's Theorem:

Computing from the data as listed, we find that:
the sample mean $\bar{x} \approx 11.41$
the sample standard deviation $s \approx 4.69$

According to rule 1 of Chebyshev's Theorem, the value of $k = 2$ and the interval is

$\bar{x} \pm 2s \approx 11.41 \pm 2(4.69) \approx [2.03, 20.80]$

From the data set itself, we see that 14 of the 15 observations in the set,
14/15 = 93.33%, are within the specified range (only 1.2 falls outside), so the rule that at least
three-quarters will be within range is satisfied.

The Empirical Rule:

Since the distribution of the data is not mound-shaped, the empirical rule cannot apply.
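The Chebyshev part of this example can be reproduced with a short script. It is a sketch that assumes the data set is exactly as listed above and that rule 1 (within 2 standard deviations, at least three-quarters) is the rule being checked; the variable names are chosen here for illustration.

```python
import statistics

data = [12.5, 13, 14.8, 11, 16.7, 9, 8.3, 1.2, 3.9, 15.5, 16.2, 18, 11.6, 10, 9.5]

mean = statistics.mean(data)          # sample mean
s = statistics.stdev(data)            # sample standard deviation (divides by n - 1)
k = 2                                 # rule 1: within 2 standard deviations
low, high = mean - k * s, mean + k * s

inside = [x for x in data if low <= x <= high]
proportion = len(inside) / len(data)

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"interval = [{low:.2f}, {high:.2f}]")
print(f"{len(inside)} of {len(data)} observations inside = {proportion:.2%}")
print("at least three-quarters inside:", proportion >= 0.75)
```

The one observation left outside the interval is 1.2, which is consistent with the count of 14 of 15 in the solution.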


5. BOX PLOT

Introduction

+ A box plot (also called a box-and-whisker plot) is another way of looking at a data set
in an effort to determine its central tendency, spread, skewness, and the existence of
outliers.

+ A box plot is a set of five summary measures of the distribution of the data:

1. The median of the data


2. The lower quartile
3. The upper quartile
4. The smallest observation
5. The largest observation



The elements of a box plot
- The median is marked as a vertical line across the box.
- The hinges of the box are the upper and lower quartiles (the rightmost and
leftmost sides of the box).
- The interquartile range (IQR) is the distance from the upper quartile to the lower
quartile (the length of the box from hinge to hinge): IQR = Q_U - Q_L
- The upper inner fence is a point at a distance of 1.5(IQR) above the upper quartile, Q_U + 1.5(IQR);
similarly, the lower inner fence is Q_L - 1.5(IQR).
- The outer fences are defined similarly but are at a distance of 3(IQR) above or
below the appropriate hinge.
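The fence definitions translate directly into code. The sketch below is illustrative only: the function names and the quartile values 10 and 20 are made up, and the labels follow the convention that points between the inner and outer fences are suspected outliers while points beyond the outer fences are outliers.

```python
def box_plot_fences(q_lower, q_upper):
    """Return the IQR and the four fences for the given hinges (quartiles)."""
    iqr = q_upper - q_lower
    return {
        "iqr": iqr,
        "lower_inner": q_lower - 1.5 * iqr,
        "upper_inner": q_upper + 1.5 * iqr,
        "lower_outer": q_lower - 3 * iqr,
        "upper_outer": q_upper + 3 * iqr,
    }

def classify(x, fences):
    """Label one observation relative to the fences."""
    if x < fences["lower_outer"] or x > fences["upper_outer"]:
        return "outlier"
    if x < fences["lower_inner"] or x > fences["upper_inner"]:
        return "suspected outlier"
    return "not an outlier"

# Hypothetical hinges, chosen only to keep the arithmetic easy to follow.
fences = box_plot_fences(q_lower=10, q_upper=20)
print(fences)
for value in (15, 40, 60):
    print(value, "->", classify(value, fences))
```

With hinges at 10 and 20 the IQR is 10, so the inner fences sit at -5 and 35 and the outer fences at -20 and 50.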

THE ELEMENTS OF A BOX PLOT

Box plots are very useful for the following purposes.


1. To identify the location of a data set based on the median.
2. To identify the spread of the data based on the length of the box, hinge to hinge (the
interquartile range), and the length of the whiskers (the range of the data without extreme
observations: outliers or suspected outliers).
3. To identify possible skewness of the distribution of the data set. If the portion of the box to the
right of the median is longer than the portion to the left of the median, and/or the right whisker
is longer than the left whisker, the data are right-skewed. Similarly, a longer left side of the box
and/or left whisker implies a left-skewed data set. If the box and whiskers are symmetric, the
data are symmetrically distributed with no skewness.
4. To identify suspected outliers (observations beyond the inner fences but within the outer fences)
and outliers (points beyond the outer fences).
5. To compare two or more data sets. By drawing a box plot for each data set and displaying the
box plots on the same scale, we can compare several data sets.

Example Construct a box plot for the following data set:
5, 8, 6, 9, 17, 24, 10, 5, 6, 13, 5, 3, 6, 12, 11, 10, 9, 10, 14, 15

Solution Let's order the data from smallest to largest:

3, 5, 5, 5, 6, 6, 6, 8, 9, 9, 10, 10, 10, 11, 12, 13, 14, 15, 17, 24

n = 20
The median is the observation in position (20 + 1)50/100 = 10.5, which is 9.5.
The lower quartile is the observation in position (20 + 1)25/100 = 5.25, which is 6.
The upper quartile is the observation in position (20 + 1)75/100 = 15.75, which is 12.75.
The smallest observation is 3.
The largest observation is 24.

Table 1

              Smallest      Lower                  Upper      Largest
              Observation   Quartile    Median     Quartile   Observation
Position                    5.25        10.5       15.75
Observation   3             6           9.5        12.75      24

IQR = Upper Quartile - Lower Quartile = 12.75 - 6 = 6.75

Lower Inner Fence = Q_L - 1.5(IQR) = 6 - 10.125 = -4.125
Upper Inner Fence = Q_U + 1.5(IQR) = 12.75 + 10.125 = 22.875
Lower Outer Fence = Q_L - 3(IQR) = 6 - 20.25 = -14.25
Upper Outer Fence = Q_U + 3(IQR) = 12.75 + 20.25 = 33

Table 2

Lower Outer     Lower Inner                  Upper Inner      Upper Outer
Fence           Fence             Median     Fence            Fence
Q_L - 3(IQR)    Q_L - 1.5(IQR)               Q_U + 1.5(IQR)   Q_U + 3(IQR)
-14.25          -4.125            9.5        22.875           33

Box Plot

Conclusion:
Based on the box plot, we can see that the distribution of the data is relatively symmetric.
There is one suspected outlier, 24, which lies beyond the upper inner fence (22.875) but within the upper outer fence (33).
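The numbers in this example can be recomputed with a short script. It is a sketch, not a plotting routine: the helper `percentile` applies the position rule (n + 1)P/100 with linear interpolation, and the variable names are chosen here for illustration. Plotting libraries often compute quartiles with a different convention, so a drawn box plot may show slightly different hinges.

```python
import math

def percentile(data, p):
    """Pth percentile via the position rule (n + 1)P/100 with linear interpolation."""
    ordered = sorted(data)
    position = (len(ordered) + 1) * p / 100
    lower = math.floor(position)
    fraction = position - lower
    if lower < 1:                       # assumption: clamp to the smallest observation
        return ordered[0]
    if lower >= len(ordered):           # assumption: clamp to the largest observation
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [5, 8, 6, 9, 17, 24, 10, 5, 6, 13, 5, 3, 6, 12, 11, 10, 9, 10, 14, 15]

q1, median, q3 = (percentile(data, p) for p in (25, 50, 75))
iqr = q3 - q1
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)    # lower and upper inner fences
outer = (q1 - 3 * iqr, q3 + 3 * iqr)        # lower and upper outer fences

# Suspected outliers: beyond an inner fence but not beyond the outer fence.
suspected = [x for x in data
             if (outer[0] <= x < inner[0]) or (inner[1] < x <= outer[1])]
# Outliers: beyond an outer fence.
outliers = [x for x in data if x < outer[0] or x > outer[1]]

print("five-number summary:", min(data), q1, median, q3, max(data))
print("IQR:", iqr, "inner fences:", inner, "outer fences:", outer)
print("suspected outliers:", suspected, "outliers:", outliers)
```

The script reports the same quartiles and fences as the hand calculation and flags 24 as the only suspected outlier.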
