STATISTICS

INTRODUCTION
Statistics involves the collection of information on one or
more characteristics of interest, called variables. These
refer to information that can be observed for every
individual or entity under study.

Basic Concepts
Population and Sample
In the language of statistics, one of the most basic concepts
is sampling. In most statistical problems, a specified number
of measurements, called a sample, is drawn from a much
larger body of measurements, called the population.
Definition: A population is the set of all measurements of
interest to the investigator.

Definition: Descriptive statistics consists of procedures
used to summarize and describe the important characteristics
of a set of measurements.
If the set of measurements is the entire population, a
conclusion can be drawn based on descriptive statistics.
However, it might be too expensive or too time-consuming to
enumerate the entire population. For these or other reasons,
a sample from the population may suffice. By looking at the
sample, questions about the population can be answered. The
branch of statistics that deals with this problem is called
inferential statistics.
The objective of inferential statistics is to make inferences
(that is, draw conclusions, make predictions, make decisions)
about the characteristics of a population from information
contained in a sample.

Definition: A sample is a subset of measurements selected
from the population of interest.

Definition: Inferential statistics consists of procedures used
to make inferences about population characteristics from
information contained in a sample drawn from this
population.
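
The relationship between a population, a sample, and an inference can be sketched in a few lines of Python. This is only an illustration: the population of heights is simulated, and the names `population`, `sample`, and the seed value are assumptions made for the example, not part of the notes.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 individuals, simulated
# here purely for illustration.
rng = random.Random(42)
population = [rng.gauss(165, 8) for _ in range(1000)]

# A sample is a subset of measurements selected from the population.
sample = rng.sample(population, 50)

# Descriptive statistics summarize whichever set of measurements we hold.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Inferential statistics uses the sample value as an estimate of the
# population value, which is usually unknown in practice.
print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean (estimate): {sample_mean:.2f}")
```

In a real study only the sample mean would be available; the population mean is computed here only to show how close the estimate tends to be.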

Descriptive and Inferential Statistics

Variables and Data

When presented with a set of measurements, whether a sample
or a population, the branch of statistics that organizes and
summarizes the set of measurements is called descriptive
statistics. Techniques for describing these sets of
measurements include, but are not limited to, the following:
bar charts, pie charts, line charts, stem-and-leaf plots,
histograms and scatter plots.

Definition: A variable is a characteristic that changes or
varies over time and/or for different individuals or objects
under consideration.

Some measures that are commonly used to describe a data
set are measures of central tendency and measures of
variability or dispersion. Measures of central tendency
include the mean, median and mode, while measures of
variability include the standard deviation (or variance) and
the minimum and maximum values of the variable. Skewness and
kurtosis are also reported to describe the shape of the
distribution.
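
As a minimal sketch, the measures named above can be computed with Python's standard `statistics` module. The quiz scores below are hypothetical values chosen only for illustration, and the variance and standard deviation are the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical data set: quiz scores of nine students (illustrative only).
scores = [7, 8, 5, 9, 8, 6, 10, 8, 4]

# Measures of central tendency
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Measures of variability (sample versions, dividing by n - 1)
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)
minimum, maximum = min(scores), max(scores)

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, std dev={std_dev:.2f}")
print(f"min={minimum}, max={maximum}")
```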

Definition: An experimental unit is the individual or object
on which a variable is measured. A single measurement or
data value results when a variable is actually measured on
an experimental unit.
When a variable is actually measured on a set of
experimental units, the result is a set of measurements, or
data.
Data can be univariate, when a single variable is measured on
a single experimental unit; bivariate, when two variables are
measured on a single experimental unit; or multivariate, when
more than two variables are measured.

Types of Variables
There are two general classifications of variables: qualitative
and quantitative. Qualitative variables measure a quality
or characteristic on each experimental unit. Their values
represent attributes, traits, or qualities that have no
inherent meaning as numbers but can be categorized, such as
gender, zodiac sign, and religion. Quantitative variables,
on the other hand, measure a numerical quantity or amount on
each experimental unit. They can be further classified as
either discrete or continuous. Variables that can assume only
a finite or countable number of values are called discrete
variables, while continuous variables can assume the
infinitely many values corresponding to the points on a line
interval.
Levels of Measurement
Psychologist Stanley Smith Stevens developed a taxonomy of
levels of measurement, or scales of measure. He proposed
four levels of measurement: nominal, ordinal, interval and
ratio. These are arranged in a hierarchical order such that
each higher level carries the properties of the lower levels
along with some additional properties.
1. Nominal. This is the lowest level of measurement.
Nominal variables take values that give names or
labels to various categories with no particular
ordering. Information that can be obtained from
processing data on these variables is limited to
frequency counts and percentages. Some examples
are gender, nationality, language, and ethnicity.
2. Ordinal. Variables measured on an ordinal scale are
basically nominal, with categories having an inherent
ordering. However, the difference between categories
cannot be measured and has no meaning. Information
that can be obtained from processing data on these
variables is limited to frequency counts, with the
additional insight of the rank or order of the categories
specified. An example is educational attainment
(elementary, high school, college).
3. Interval. The next higher level of measurement is the
interval scale. Interval variables are quantitative
variables in which the difference between two consecutive
quantities is constant. Thus the intervals between
values can be quantified and have meaning.
However, the scale is distinguished by having no true
starting point or zero point. Thus, ratios are not
meaningful, and a value of zero does not necessarily mean
the absence of the attribute being measured. Examples
are temperature measured on the Celsius scale and calendar
or clock time.
4. Ratio. This is the highest level of measurement. It has
all the characteristics of the interval scale and, in
addition, an absolute or true zero point. Some examples of
ratio variables are height, weight and age.
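
A small worked example may help show why ratios are not meaningful on an interval scale. The sketch below compares two Celsius temperatures with the same temperatures in kelvins. The Kelvin scale is not mentioned in these notes; it is brought in here only as an illustration of a scale with a true zero, and the function name is hypothetical.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to kelvins (ratio scale, true zero)."""
    return celsius + 273.15


cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale: the 10-degree gap is
# the same size on both scales.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: 20 degrees Celsius is not "twice as hot" as 10.
ratio_c = warm_c / cool_c  # 2.0, an artifact of where 0 degrees C is placed
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)  # about 1.035

print(f"difference: {diff_c:.1f} C, {diff_k:.1f} K")
print(f"ratio: {ratio_c:.3f} in Celsius, {ratio_k:.3f} in kelvins")
```

The difference agrees across the two scales, but the ratio changes once the zero point moves, which is why ratio statements are reserved for variables with a true zero.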
METHODS OF COLLECTING, ORGANIZING AND PRESENTING DATA

Statistics is a science that deals with the collection,
organization, presentation, analysis and interpretation of
data. This shows that statistics is a series of steps or a
process to handle information or data.
Methods of Collecting Data
There are three methods of collecting data: objective
method, subjective method, and the use of existing records.
1. Objective method. This method requires the use of a
measuring instrument or a counting device in
collecting data. Data obtained through this method
can also be collected by observation.
2. Subjective method. This method relies on the
information given by the respondents. Data are usually
collected by using a questionnaire.
3. Use of existing records. Data from these were
previously collected by another person or institution
for some other purpose. A researcher who makes
use of data from existing records must properly
acknowledge the source.
Data collected can be classified as primary or secondary
data. Primary data are those collected directly from the
source, using either the objective or the subjective method,
while secondary data are those acquired from existing
records.

Exercises

Identify the experimental units on which the following
variables are measured:
a. Gender of a student
b. Number of errors on a midterm exam
c. Age of a cancer patient
d. Number of flowers on an azalea plant
e. Color of a car entering a parking lot

Identify each variable as quantitative or qualitative:
a. Amount of time it takes to assemble a simple puzzle
b. Number of students in a first-grade classroom
c. Rating of a newly elected politician (excellent, good, fair,
poor)
d. Order of birth of a person

Identify the following quantitative variables as
discrete or continuous:
1. Population in a particular area of the Philippines
2. Weight of newspapers recovered for recycling on a single
day
3. Time to complete a sociology exam
4. Number of consumers in a poll of 1000 who consider
nutritional labeling on food products to be important
5. Number of boating accidents along a 50-mile stretch of the
Colorado River
6. Time required to complete a questionnaire
7. Cost of a head of lettuce
8. Number of brothers and sisters you have
9. Yield in kilograms of rice from a 1-hectare plot in a rice
field

Six vehicles are selected from the vehicles that are
issued campus parking permits, and the following
data are recorded:

a. What are the experimental units?
b. What are the variables being measured? What types of
variables are they?
c. Is this univariate, bivariate, or multivariate data?

Methods of Organizing and Presenting Data
Graphs for Categorical Data
After the data have been collected, they can be consolidated
and summarized to show the following information:
- What values of the variable have been measured
- How often each value occurred

For qualitative or categorical data, the most common question
is how often each value occurred. This can be measured
in three different ways:
- The frequency, or number of measurements in each
category.
- The relative frequency, or proportion of
measurements in each category.
- The percentage of measurements in each category.
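
The three measures can be sketched with Python's `collections.Counter`. The list of letter grades below is hypothetical raw data, invented only to illustrate the calculation.

```python
from collections import Counter

# Hypothetical raw data: letter grades of 20 students (illustrative only).
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A",
          "C", "B", "B", "D", "C", "A", "B", "C", "B", "A"]

n = len(grades)

# Frequency: number of measurements in each category.
frequency = Counter(grades)

# Relative frequency: proportion of measurements in each category.
relative_frequency = {g: count / n for g, count in frequency.items()}

# Percentage: relative frequency expressed out of 100.
percentage = {g: 100 * rf for g, rf in relative_frequency.items()}

for grade in sorted(frequency):
    print(f"{grade}: {frequency[grade]:2d}  "
          f"{relative_frequency[grade]:.2f}  {percentage[grade]:.0f}%")
```

The relative frequencies always sum to 1 and the percentages to 100, which is a quick check on a hand-built table.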

Once the data have been categorized and summarized in a
statistical table, a pie chart or bar chart can be used to
display the distribution of the data.
Note: To construct a pie chart, assign one sector of a circle
to each category.
The angle of each sector should be proportional to the
proportion of measurements (relative frequency) in that
category. Since a circle contains 360 degrees, you can use
this equation to find the angle:
Angle = Relative frequency × 360°
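
As a worked example of the angle formula, the sketch below converts a frequency table into pie-chart sector angles. The grade counts are hypothetical and serve only to illustrate the arithmetic.

```python
# Hypothetical frequency table: number of students per letter grade.
frequency = {"A": 5, "B": 8, "C": 5, "D": 2}
total = sum(frequency.values())

# Angle = relative frequency x 360 degrees, computed for each category.
angles = {category: (count / total) * 360
          for category, count in frequency.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")

# The sector angles together make up the full circle.
print(f"Total: {sum(angles.values()):.0f} degrees")
```

With these counts, grade B has a relative frequency of 8/20 = 0.40, so its sector spans 0.40 × 360 = 144 degrees.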

The visual impact of these two graphs is somewhat different.
The pie chart is used to display the relationship of the parts
to the whole; the bar chart is used to emphasize the actual
quantity or frequency for each category. Since the categories
in this example are ordered grades (A, B, C, D), we would
not want to rearrange the bars in the chart to change its
shape. In a pie chart, the order of presentation is irrelevant.

Example: raw data and bar chart (figures not reproduced).
