

MATH 1050Y
A Non-Calculus Based Introduction to Probability & Statistical Methods
Section A FW 2012-13 Instructor: Jaclyn Semple

Tuesday 12-1pm SC 203
Tuesday 1-2pm SC 203
Tuesday 4-5pm GCS 115

Remember:
Weekly quizzes (iClicker)
Weekly assignments due

MATH 1050Y-A (FW 2012-13)


Chapter 2 Describing, Exploring, and Comparing Data

2-1 Overview
2-2 Summarizing Data with Frequency Tables
2-3 Graphs of Data
2-4 Measures of Central Tendency
2-5 Measures of Variation
2-6 Measures of Position
2-7 Exploratory Data Analysis

Measures of Central Tendency

A measure of central tendency is a value at the centre or the middle of a data set.

We will consider the following four measures of central tendency: mean, median, mode, and midrange.


The Mean
The (arithmetic) mean of a set of values is the number obtained by adding the values and dividing the total by the number of values. For a sample of n observations, the mean is referred to as the sample mean, denoted x̄ (x-bar). If all N values of the population are available, it is referred to as the population mean, denoted μ (mu).

Sample mean: $\bar{x} = \frac{\sum x}{n}$

Population mean: $\mu = \frac{\sum x}{N}$

The Median
The median of a data set is the middle value when the values are arranged in order of increasing magnitude. For a sample of n observations, the median is denoted by x̃ (x-tilde). To find the median, first arrange the values in order. If the number of values is odd, the median is the number located in the exact middle of the list. If the number of values is even, the median is found by computing the mean of the two middle numbers.
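The definitions of the mean and the median above can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the small data set are assumptions made for the sketch, not part of the course material.

```python
def mean(values):
    """Add the values and divide the total by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)          # arrange in increasing order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd n: the exact middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2   # even n


# Illustrative data (assumed for this sketch): one value is much larger than the rest.
data = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10, 0.66]

print(mean(data))    # roughly 1.413
print(median(data))  # 0.73
```

Note how the single large value (5.40) pulls the mean well above the median.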

Examples of The Median

Odd Number of Values (Ordered)
0.42 0.48 0.66 0.73 1.10 1.10 5.40
The median is the middle (fourth) value: x̃ = 0.73

Even Number of Values (Ordered)
0.42 0.48 0.73 1.10 1.10 5.40
The median is the mean of the two middle values: $\tilde{x} = \frac{0.73 + 1.10}{2} = 0.915$

The Mode
The mode of a data set is the value that occurs most frequently. When two values occur with the same greatest frequency, each one is a mode and the data set is said to be bimodal. When more than two values occur with the same greatest frequency, each one is a mode and the data set is said to be multimodal. When no value is repeated, we say that there is no mode. The mode is often denoted by M. It is the only measure of central tendency that can be used with nominal data.

Examples of The Mode

1) 5.40 1.10 0.42 0.73 0.48 1.10      Mode is 1.10
2) 27 27 27 55 55 55 88 88 99         Bimodal: 27 & 55
3) 1 2 3 6 7 8 9 10                   No mode
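The three mode examples can be checked with a short Python sketch. The helper name `modes` is an assumption for illustration; it returns a list so that bimodal and multimodal data sets are covered, and an empty list stands for "no mode".

```python
from collections import Counter


def modes(values):
    """Return every value that shares the greatest frequency; [] means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # no value is repeated
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))      # [1.1]
print(modes([27, 27, 27, 55, 55, 55, 88, 88, 99]))      # [27, 55]
print(modes([1, 2, 3, 6, 7, 8, 9, 10]))                 # []
```

Because the function only counts occurrences, it also works for nominal data such as a list of colour names.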

The Midrange
The midrange is the value midway between the highest and lowest values in the data set. The midrange is found by adding the highest value to the lowest value and then dividing the sum by 2. That is,

$\text{midrange} = \frac{\text{highest value} + \text{lowest value}}{2}$

The midrange is not used very often because it is too sensitive to extreme values.
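A minimal Python sketch of the midrange, using illustrative values chosen for this example, shows how strongly a single extreme value affects it.

```python
def midrange(values):
    """Value midway between the highest and lowest values."""
    return (max(values) + min(values)) / 2


data = [0.42, 0.48, 0.66, 0.73, 1.10, 1.10, 5.40]   # illustrative values

print(midrange(data))        # (5.40 + 0.42) / 2, about 2.91
print(midrange(data[:-1]))   # without 5.40: (1.10 + 0.42) / 2, about 0.76
```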


Rounding Rule
A simple rule for rounding calculations of measures of central tendency is this: Carry one more decimal place than is present in the original data set. Note: Round only the final answer, never in the middle of a calculation.
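As a rough sketch of the rounding rule, the Python snippet below keeps full precision during the calculation and rounds only the final answer to one more decimal place than the data carry. The helper name `round_final_answer` and the data values are assumptions for illustration; `Decimal` with `ROUND_HALF_UP` is used so that halves round up as in hand calculation.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_final_answer(value, data_decimals):
    """Round a final result to one more decimal place than the data carry."""
    places = data_decimals + 1
    quantum = Decimal(10) ** -places          # e.g. places=3 gives 0.001
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


values = [0.42, 0.48, 0.66, 0.73, 1.10, 1.10, 5.40]   # two decimal places
mean = sum(values) / len(values)      # no rounding in the middle of the calculation

print(round_final_answer(mean, 2))    # 1.413
```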

Pause & Practice

A potato chip packaging plant selects 10 bags for a quality control check. The weights in grams are listed below. Find the mean, median, mode, and midrange for this data set.

454.1 454.4 455.0 455.1 454.2 454.6 454.9 454.4 454.7 455.2


Mean from a Frequency Table

When sample data are summarized in a frequency table, we can approximate the mean by replacing class limits with class midpoints and assuming that each class midpoint is repeated a number of times equal to the class frequency, f. We then use the following formula to approximate the sample mean, where x denotes the class midpoint:

$\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$

Example: Mean from a Frequency Table

Approximating the mean from the axial load data.
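A minimal Python sketch of the frequency-table approximation (sum of f times x, divided by the sum of f) follows. The table is made up for illustration; its class limits and frequencies are hypothetical and are not the axial load data.

```python
# Hypothetical frequency table: (lower class limit, upper class limit, frequency f)
table = [
    (0, 9, 2),
    (10, 19, 5),
    (20, 29, 3),
]


def mean_from_frequency_table(rows):
    """Approximate the mean as sum(f * x) / sum(f), with x the class midpoint."""
    total_fx = 0.0
    total_f = 0
    for lower, upper, f in rows:
        midpoint = (lower + upper) / 2    # replaces the class limits
        total_fx += f * midpoint
        total_f += f
    return total_fx / total_f


print(mean_from_frequency_table(table))   # 15.5
```

The result is an approximation because every value in a class is treated as if it sat at the class midpoint.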

Pause & Practice

From our previous example of statistics marks, approximate the mean by completing the table:

Marks (%)    f
30 - 39      2
40 - 49      3
50 - 59      6
60 - 69      12
70 - 79      13
80 - 89      9
90 - 99      5
Total        Σf = 50

A. 34.5
B. 69
C. 3505
D. 70.1

Weighted Mean
In some situations, the values vary in their degree of importance. In these situations, we may wish to compute a weighted mean, which takes frequency/importance into account.
E.g. the mean crop yield from 3 farms of different sizes.
A weighted mean is a mean computed with different values assigned different weights, w. We use the following formula to compute a weighted mean:

$\bar{x} = \frac{\sum (w \cdot x)}{\sum w}$
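The weighted mean formula can be sketched in Python as below. The farm yields and areas are hypothetical numbers invented for this illustration, with farm size used as the weight.

```python
def weighted_mean(values, weights):
    """Compute sum(w * x) / sum(w)."""
    total_wx = sum(w * x for x, w in zip(values, weights))
    return total_wx / sum(weights)


# Hypothetical farms: yield in tonnes per hectare, weighted by farm size in hectares.
yields = [3.0, 4.0, 5.0]
areas = [100, 50, 50]

print(weighted_mean(yields, areas))   # 3.75
print(sum(yields) / len(yields))      # 4.0, the unweighted mean
```

The largest farm has the lowest yield, so the weighted mean falls below the simple mean of the three yields.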

Example: Weighted Mean

Three assessment results (quiz, test, and final exam) in a course for a particular student are 65, 70, and 85. Find the student's average mark in the course if the quiz is worth 25%, the test 45%, and the final exam 30%.

$\bar{x} = \frac{\sum (w \cdot x)}{\sum w} = \frac{(25)(65) + (45)(70) + (30)(85)}{25 + 45 + 30} = \frac{7325}{100} = 73.25$

The Best Measure of Central Tendency

Unfortunately, there is no single best measure of central tendency. This is because the best measure of central tendency largely depends on the data set being analyzed. One disadvantage of the mean is that it is sensitive to every data value, so even one unusually large or small value can affect the mean dramatically. The median largely overcomes this disadvantage. For a more complete comparison of the mean, median, mode, and midrange, refer to Table 2-6 in the textbook (p. 64).

Skewness and Symmetry

A distribution of data is symmetric if the left side of its histogram is roughly a mirror image of its right half. A distribution of data is skewed if it is not symmetric and extends more to one side than the other. Data skewed to the left are said to be negatively skewed; the mean and median are to the left of the mode. Data skewed to the right are said to be positively skewed; the mean and median are to the right of the mode.
NOTE: The mean and median cannot always be used to identify the shape of a distribution of data values.
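As a small illustration of the positively skewed case, the Python sketch below uses a hypothetical sample (invented for this example) and the standard `statistics` module. In this sample the mean and median lie to the right of the mode; in keeping with the note above, that ordering is only a rough indicator and should not replace looking at the histogram.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

print(mode(data))     # 2
print(median(data))   # 3.0
print(mean(data))     # 4.5
```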

Pause & Practice

Returning to the histogram of the ages of faculty members' cars, what is the shape of the distribution?

A. Symmetric
B. Left-skewed
C. Right-skewed
D. Can't decide

Pause & Practice

Look at the following histogram for salaries of baseball players. What shape would you say the data take?

A. Bi-modal
B. Left-skewed
C. Right-skewed
D. Symmetric
E. Uniform

Coming up
Reminders:
Assignment #2 due Tuesday in seminar
Quiz #2 Tuesday in seminar

For next class:
Do practice questions from 2-4
Read Section 2-5
Bring iClicker