
Chapter 1

Descriptive Statistics

1.1 INTRODUCTION

Statistics is the science of collecting, simplifying, and describing data, as well as making
inferences about the characteristics of a population based on the analysis of data. Statistics can be
divided into two branches: descriptive statistics and inferential statistics.
DEFINITION
1. Descriptive statistics consists of organizing, summarizing, and describing data sets so
that they are presented effectively and are easier to understand, using charts, tables, graphs,
etc.
2. Inferential statistics consists of using sample data to make inferences about a population.

There are a few terms that you need to know before computing any statistics:
1. Population
- A population is the collection of all the elements (often data values) we are interested
in.
2. Parameter
- A parameter is a number representing a numerical property of a population. Ex: , ,
3. Sample
- A sample from a population is a collection of some of the elements obtained from that
population.
Example 1.1
Take all the people in UTHM as a population. Give examples of samples that can be drawn
from it.
Solution:
(a) Students

(b) Non-academic staff

(c) Male students

A variable is a characteristic under study that assumes different values for different
elements. A variable normally refers to the observations or data of a statistical investigation. We
use symbols such as x, y, etc. to represent variables. There are two types of variable: qualitative
variables and quantitative variables.
1. Qualitative Variable
- A qualitative variable is a variable that cannot assume a numerical value but can be
classified into two or more categories according to its characteristic, level etc.
2. Quantitative Variable
- A quantitative variable is a variable that can be measured numerically.

Example 1.2
Determine whether each of the following is a qualitative or a quantitative variable:
(a) height, (b) mass, (c) gender, (d) colour.

Data is a collection of observations on one or more variables which is taken from the population
or sample. There are two types of data which are discrete data and continuous data.
1. Discrete Data
- Discrete data can be described by a discrete variable, which is a variable that can only
assume particular numerical values over a certain interval. Such data are usually obtained by
counting.
2. Continuous Data
- Continuous data can be described by a continuous variable, which is a variable that can
assume any numerical value over a certain interval. Such data are obtained by measurement,
and the accuracy depends on the measuring instrument.
Example 1.3
Determine whether each of the following is discrete or continuous data:
(a) height of students,
(b) number of children in a family,
(c) length of the students' thumbs,
(d) number of books in the students' bags.

The types of variable can be illustrated as below:


1.2 PRESENTATION OF DATA

Data can be presented in two ways: in a table or in a graph. This is an
example of presentation of data in a table.
(a) Table

(b) Graph
1) Line graph

Single line

Combination Line

2) Bar Graph

Single data vertical

Single data horizontal

Combination data if the variables are the same

i) Positive & negative data

ii) Combined data

3) Pie chart

4) Histogram

5) Frequency polygon

6) Ogive

1.3 ORGANIZING AND DESCRIBING DATA


A frequency distribution gives us the distinct data values in a collection of data together
with the number of times each value occurs, denoted by f_x (or just f). The following are
definitions of different types of frequency.
1. Relative Frequency
- Relative frequency is the fraction or proportion of observed responses in the category.
2. Grouped Frequency Distribution
- A grouped frequency distribution is obtained by giving classes or intervals together
with the number of data values in each class.
3. Grouped Relative Frequency Distribution
- A grouped relative frequency distribution gives the frequency for each class divided by
the total number of data values for the data set. Instead of listing the frequency f of each
class, we list the relative frequency, f/n.
4. Cumulative Frequency
- Cumulative frequency is the frequency of a class that includes all values in a data set
that fall below the upper boundary of that class.
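
The frequency definitions above can be sketched in Python for a small ungrouped data set. The data values and variable names are hypothetical and chosen only for illustration. For ungrouped data, the cumulative frequency of a value x is taken here to be the number of data values less than or equal to x.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data set: number of children in each of 10 families.
data = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]
n = len(data)

# Frequency distribution: each distinct value x with its frequency f,
# listed in increasing order of x.
freq = dict(sorted(Counter(data).items()))

# Relative frequency: f / n for each distinct value.
rel_freq = {x: f / n for x, f in freq.items()}

# Cumulative frequency: running total of f in increasing order of x.
cum_freq = dict(zip(freq, accumulate(freq.values())))

for x in freq:
    print(x, freq[x], rel_freq[x], cum_freq[x])
```

The relative frequencies always add up to 1, and the last cumulative frequency equals n, which gives a quick check on the table.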

Frequency distributions and grouped frequency distributions are useful because we avoid writing
out all the data values, including repetitions. A frequency distribution is not suitable if the number
of distinct data values is large, while a grouped frequency distribution tells us how many data
values are in each class but not what the data values are.
Example 1.4

Example 1.5

In a grouped frequency distribution, we need to find the class midpoint (or class mark) and the
class width. The definitions are as below:
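
As a minimal sketch, assuming the usual textbook definitions: the class midpoint is the average of the lower and upper class limits, and the class width is the difference between the lower limits of two consecutive classes. The class limits and data values below are hypothetical.

```python
# Hypothetical class limits (lower, upper) and raw data, for illustration only.
classes = [(10, 19), (20, 29), (30, 39)]
data = [12, 25, 31, 18, 22, 27, 35, 21, 14, 29]

# Grouped frequency: count the data values that fall inside each class.
freq = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

# Class midpoint (class mark): average of the lower and upper class limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Class width: difference between the lower limits of consecutive classes.
widths = [classes[i + 1][0] - classes[i][0] for i in range(len(classes) - 1)]

for (lo, hi), f, m in zip(classes, freq, midpoints):
    print(f"{lo} - {hi}: frequency {f}, midpoint {m}")
print("class widths:", widths)
```

With these assumed classes the width is 10 throughout, even though the difference between the limits of a single class is only 9; the width is measured between class boundaries or between consecutive lower limits.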

Example 1.6

EXERCISE 1A

1.4 MEASURE OF CENTRAL TENDENCY


A measure of central tendency is a number that is a typical or representative value for a
collection of data.
1.4.1 Mean
The mean is the average of the data values.
1.4.1.1 Mean for Ungrouped Data
Below are the definitions of mean for ungrouped data (sample and population).
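
In the standard notation, the sample mean is x̄ = Σx / n, where n is the number of values in the sample, and the population mean is μ = Σx / N, where N is the number of values in the population. The calculation is the same in both cases; only the symbols differ. A minimal sketch with a hypothetical sample:

```python
from statistics import mean

# Hypothetical sample of five observations, for illustration only.
sample = [4, 8, 6, 5, 7]

# Sample mean: sum of the values divided by the number of values n.
x_bar = sum(sample) / len(sample)

# The standard library computes the same quantity.
print(x_bar, mean(sample))
```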

Example 1.7

Example 1.8

Example 1.9

1.4.1.2 Mean for Grouped Data

Example 1.10

Example 1.11
A study of sulfur oxide production over 80 days produced the distribution in the following
table. Find the mean.
Sulfur Oxide     Frequency
5.0 - 8.9        3
9.0 - 12.9       10
13.0 - 16.9      14
17.0 - 20.9      25
21.0 - 24.9      17
25.0 - 28.9      9
29.0 - 32.9      2
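
One way to work this example is sketched below, assuming the usual grouped-data formula mean = Σfx / Σf, where x is the class midpoint and f is the class frequency. The variable names are illustrative.

```python
# Class limits and frequencies taken from the sulfur oxide table.
classes = [(5.0, 8.9), (9.0, 12.9), (13.0, 16.9), (17.0, 20.9),
           (21.0, 24.9), (25.0, 28.9), (29.0, 32.9)]
freq = [3, 10, 14, 25, 17, 9, 2]

# Class midpoints: average of the lower and upper class limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Total frequency; this should match the 80 days stated in the example.
n = sum(freq)

# Grouped mean: sum of f * x divided by the total frequency.
grouped_mean = sum(f * x for f, x in zip(freq, midpoints)) / n
print(n, round(grouped_mean, 2))
```

Under these assumptions Σfx = 1508 and Σf = 80, so the mean is about 18.85.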

Example 1.12

1.4.1.3 Assumed Mean Method


If the values of x are too large or too small, it is better to use the assumed mean or
coding method.
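
A minimal sketch of the coding method, assuming its usual form: choose an assumed mean A (typically a class midpoint near the centre) and the class width c, code each midpoint as u = (x - A) / c, and recover the mean as A + c Σfu / Σf. The example reuses the sulfur oxide distribution of Example 1.11; the choice A = 18.95 is an assumption made here for illustration.

```python
# Midpoints and frequencies of the sulfur oxide distribution (Example 1.11).
midpoints = [6.95, 10.95, 14.95, 18.95, 22.95, 26.95, 30.95]
freq = [3, 10, 14, 25, 17, 9, 2]

A = 18.95  # assumed mean: the midpoint of the class with the highest frequency
c = 4.0    # class width

# Coded values u = (x - A) / c; rounding removes floating-point noise
# so that u becomes the small integers -3, -2, ..., 3.
u = [round((x - A) / c) for x in midpoints]

# Mean recovered from the coded values.
coded_mean = A + c * sum(f * ui for f, ui in zip(freq, u)) / sum(freq)

# Direct calculation for comparison.
direct_mean = sum(f * x for f, x in zip(freq, midpoints)) / sum(freq)

print(u, round(coded_mean, 2), round(direct_mean, 2))
```

Both routes give the same mean; the coded version only needs arithmetic on small integers, which is the point of the method when working by hand.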

Example 1.13

Example 1.14

Example 1.15

Example 1.16

Example 1.17
