
DATA GATHERING AND ORGANIZING DATA

DATA COLLECTION
Data collection is the process of gathering and
measuring information about the variables under study
using an established systematic procedure, which then
enables one to answer the relevant questions at hand and
evaluate outcomes.
FOUR TYPES OF DATA

1. Nominal
2. Ordinal
3. Interval
4. Ratio
1. NOMINAL

It is sometimes referred to as a classificatory scale.

This scale is used for classifying and labeling
variables without quantitative value.
Examples:
a. Eye color
b. Gender
c. VSU Dormitories
d. Degree Programs
2. ORDINAL

It possesses the characteristics of the nominal scale
in that it classifies data. However, the classification
has ranks: data are shown in order of magnitude.
Examples:
a. Educational Attainment
b. Grades
c. Emotion
d. Organizational Structure
3. INTERVAL
This scale possesses the characteristics of the nominal and
ordinal scales, where data are classified and ranked. In
addition, the differences between the values assigned to
variables are meaningful and can be quantified. One
limitation of the interval scale is that it does not have a “true zero”.
Examples:
a. IQ
b. Transmutation of grades
c. BMI
d. Temperature (Celsius & Fahrenheit)
4. RATIO

This scale possesses the characteristics of the nominal,
ordinal, and interval scales. However, whereas the interval
scale has no true zero, in the ratio scale zero is absolute:
it is the point where the quantity being measured
does not exist.
Examples:
a. Age
b. Monthly Income
c. Height
d. Allowance
THE TABLE BELOW SHOWS THE GENERALIZATION
OF THE FOUR SCALES OF MEASUREMENT

Information                                     Nominal  Ordinal  Interval  Ratio
The order of values is known                    No       Yes      Yes       Yes
Can quantify the difference between each value  No       No       Yes       Yes
Can add and subtract                            No       No       Yes       Yes
Can multiply and divide values                  No       No       No        Yes
Has “true zero”                                 No       No       No        Yes
MEASURES OF CENTRAL TENDENCY

A measure of central tendency is a central or typical value
for a data set or probability distribution. It may also be called
the center or location of the distribution.
The mean is the average of the numbers.
It is easy to calculate: add up all the numbers,
then divide by how many numbers there are.
It is denoted by $\bar{x}$ and computed using the formula:
$$\bar{x} = \frac{\sum x}{n}$$
Where: $x$ is the value of an observation
$n$ is the total number of observations
Examples:
1. Find the mean of the set of values:
3, 4, 6, 8, 9, and 5.

2. What is the mean of the set of values:
1.2, 3.5, 4.0, and 1.3?
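The mean formula can be checked with a short Python sketch. The function name `mean` and the variable names are illustrative choices, not part of the original material; the code simply adds the values and divides by how many there are, using the two example sets above.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


set_1 = [3, 4, 6, 8, 9, 5]
set_2 = [1.2, 3.5, 4.0, 1.3]

print(round(mean(set_1), 2))  # 35 / 6, about 5.83
print(round(mean(set_2), 2))  # 10.0 / 4 = 2.5
```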
The median is the value separating the higher half of
a data sample, a population, or a probability
distribution, from the lower half.

For a data set, it may be thought of as the “middle”
value. It is denoted as 𝑴𝒅.
Examples:
1. In the data set 1, 3, 3, 6, 7, 8, 9, the median is 6, the fourth
largest, and also the fourth smallest number in the
sample.
2. What is the median in the data set 4, 5, 11, 2, 6, 4, 1, 10?
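A minimal Python sketch of finding the median is given below. It sorts the data first and then picks the middle value. For a data set with an even number of values, it assumes the common convention of averaging the two middle values, which the text above does not state explicitly. The function name `median` is illustrative.

```python
def median(values):
    """Return the middle value of the sorted data.

    Assumption: for an even number of values, the two middle
    values are averaged (a common convention).
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([1, 3, 3, 6, 7, 8, 9]))       # 6
print(median([4, 5, 11, 2, 6, 4, 1, 10]))  # 4.5
```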
The mode of a set of data values is the value that
appears most often. A data set may have no mode,
exactly one mode, or more than one mode. It is denoted
as 𝑴𝒐.
Examples:
1. Given the data set { 3,2,7,3,4,5} the mode is 3 since
3 is repeated the most.

2. What is the mode in the data set: {3,7,5,4,5,5,4,8}?
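The following Python sketch counts how often each value appears and returns every value tied for the highest count. It assumes that a data set in which no value repeats has no mode, so it returns an empty list in that case. The function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that appear most often, in ascending order.

    Assumption: if every value appears only once, the data set
    is treated as having no mode and an empty list is returned.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 2, 7, 3, 4, 5]))        # [3]
print(modes([3, 7, 5, 4, 5, 5, 4, 8]))  # [5]
print(modes([1, 1, 2, 2, 3]))           # [1, 2], more than one mode
print(modes([1, 2, 3]))                 # [], no mode
```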


ACTIVITY (ONE WHOLE YELLOW PAD)
Given: Math grades of high school students

89, 90, 80, 98, 75, 87, 88, 93, 79, 80


79, 83, 91, 89, 84, 82, 90, 95, 84, 70

Find the mean, median, mode, range, variance,
standard deviation, and absolute deviation.
(1-7) Determine whether each of the following is an example of a
nominal, ordinal, interval, or ratio scale.

1. Weight
2. Car Racing Winners
3. Test Score
4. Multiple Intelligences
5. Likert Scale
6. Time
7. Number of Correct Items in a Test

(8-10) Give the 3 measures of central tendency.
MEASURES OF DISPERSION
In statistics, the dispersion of a data set is measured to
find out how spread out or close together the data values are.
A measure of dispersion is usually reported along with a
measure of central tendency.
The measures of dispersion, which include the range, interquartile
range, absolute deviation, variance, and standard deviation,
are also known as measures of spread or variability.
The measures of dispersion discussed here are:

1. Range
2. Variance
3. Standard Deviation
4. Absolute Deviation
RANGE
This is the easiest measure of dispersion to compute. It is the
difference between the highest value and the lowest
value.
It tells how far the lowest value is from the highest
value. It is denoted as $R$.
Example:
Given the data set {5, 8, 7, 12, 12, 13, 18}, where HV is the
highest value and LV is the lowest value:
$$R = HV - LV = 18 - 5 = 13$$
The value 13 tells us that the highest and lowest values are
13 units apart.
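The range computation above can be reproduced in a few lines of Python. The function name `value_range` is an illustrative choice; it subtracts the lowest value from the highest.

```python
def value_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


data = [5, 8, 7, 12, 12, 13, 18]
print(value_range(data))  # 18 - 5 = 13
```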
VARIANCE
The variance is the expectation of the squared deviation of a
random variable from its mean. It measures how far a set
of numbers is spread out from its average value.
For a sample, it is computed using the formula:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Where: $x$ is the value of an observation
$\bar{x}$ is the mean
$n$ is the total number of observations
Example:

Consider the following scores of students in an
achievement exam:
15, 19, 11, 13, 17, 10, 20

Find the variance.
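One way to work this example is sketched below in Python. It follows the variance formula with the n − 1 divisor (the sample variance): find the mean, square each deviation from the mean, add the squares, and divide by n − 1. The function name `sample_variance` is illustrative.

```python
def sample_variance(values):
    """Return the sample variance: sum of squared deviations over n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


scores = [15, 19, 11, 13, 17, 10, 20]
# Mean is 105 / 7 = 15; squared deviations sum to 90; 90 / 6 = 15.
print(sample_variance(scores))  # 15.0
```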


STANDARD DEVIATION
The standard deviation is the square root of the variance. A low standard
deviation indicates that the data values tend to be close to the
mean. A high standard deviation indicates that the data points
are spread out over a wider range.
Example:
Consider the example in variance where:
$$s^2 = 15$$
So, the standard deviation is:
$$s = \sqrt{15} \approx 3.87$$
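As a quick check, the sketch below computes the standard deviation in Python by taking the square root of the sample variance (n − 1 divisor) of the scores from the variance example. The function name `sample_std` is illustrative.

```python
import math


def sample_std(values):
    """Return the sample standard deviation: square root of the sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


scores = [15, 19, 11, 13, 17, 10, 20]
print(round(sample_std(scores), 2))  # square root of 15, about 3.87
```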
ABSOLUTE DEVIATION

This is the average distance of all of the elements
in a data set from the mean of the same data set.
Compute using the formula:
$$AD = \frac{\sum |x - \bar{x}|}{n}$$
Where: $x$ is the value of an observation
$\bar{x}$ is the mean
$n$ is the total number of observations
Example:
Erica enjoys posting pictures of her cat online. Here's
how many "likes" the past 6 pictures each received:
10, 15, 15, 17, 18, 21

Find the mean absolute deviation.
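A possible way to work this example in Python is sketched below. It takes the distance of each value from the mean, ignoring sign, and averages those distances over n. The function name `mean_absolute_deviation` is illustrative.

```python
def mean_absolute_deviation(values):
    """Return the average absolute distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


likes = [10, 15, 15, 17, 18, 21]
# Mean is 96 / 6 = 16; absolute deviations are 6, 1, 1, 1, 2, 5 (sum 16).
print(round(mean_absolute_deviation(likes), 2))  # 16 / 6, about 2.67
```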


EXERCISE:

Given: Scores of 20 students in Third Long exam of Math 11n

30 25 38 36 36 40 44 43 42 26
27 44 41 39 42 27 37 40 43 40

Find the range, variance, standard deviation, and absolute deviation


MEASURES OF RELATIVE POSITION

These are sometimes referred to as measures of location. They
are considered an extension of the median. They describe
the position or location of a value relative to the
other values in the data set.
QUARTILES
Quartiles divide the observations into four equal parts.
$Q_1$ is the middle value between the smallest value and
the median. $Q_2$ is the median itself. $Q_3$ is the
middle value between the median and
the highest value of the data set.
When the set of observations is arranged in ascending order,
the position of the lower quartile is given as:
$$Q_1 = \left(\frac{n + 1}{4}\right)^{\text{th}} \text{ term}$$

If the result is a decimal number, the position of the lower quartile $Q_1$
is found by rounding it to the nearest whole number.
The position of the second quartile, which is the median of the set of
observations, is given as:

$$Q_2 = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ term}$$
The position of the upper quartile is given as:

$$Q_3 = \left(\frac{3(n + 1)}{4}\right)^{\text{th}} \text{ term}$$

If the result is a decimal number, the position of the upper quartile $Q_3$
is found by rounding it to the nearest whole number.
The lower and upper quartile values help us find a measure
of dispersion in the set of observations called the
interquartile range. It is denoted as IQR and is the difference
between the upper and lower quartiles.

$$IQR = Q_3 - Q_1$$
Example:

Find the median, lower quartile, upper quartile, and interquartile
range of the following data set of scores: 19, 22, 24, 20, 24, 27, 25,
24, 30.
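The quartile position formulas can be sketched in Python as follows. Two points are assumptions rather than rules stated in the text: a position ending in .5 is rounded up to the next whole number, and for an even number of observations the median averages the two middle values. The helper names `nearest_position` and `quartiles` are illustrative.

```python
import math


def nearest_position(position, n):
    """Round a position to the nearest whole rank, kept within 1..n.

    Assumption: positions ending in .5 are rounded up.
    """
    rank = math.floor(position + 0.5)
    return min(max(rank, 1), n)


def quartiles(values):
    """Return (Q1, Q2, Q3, IQR) using the (n + 1) position formulas."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = ordered[nearest_position((n + 1) / 4, n) - 1]
    q3 = ordered[nearest_position(3 * (n + 1) / 4, n) - 1]
    # Q2 is the median: the middle value, or the mean of the two middle values.
    middle = n // 2
    if n % 2 == 1:
        q2 = ordered[middle]
    else:
        q2 = (ordered[middle - 1] + ordered[middle]) / 2
    return q1, q2, q3, q3 - q1


scores = [19, 22, 24, 20, 24, 27, 25, 24, 30]
# Sorted: 19, 20, 22, 24, 24, 24, 25, 27, 30 (n = 9)
# Q1 position 2.5 -> 3rd value, Q2 position 5, Q3 position 7.5 -> 8th value
print(quartiles(scores))  # (22, 24, 27, 5)
```

A different rounding or interpolation convention would give slightly different quartiles, so results from statistical software may not match these exactly.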
Exercise: Round off your final answer to the nearest whole
number.

Of the following data set of scores:

34, 19, 22, 24, 20, 45, 24, 27, 25, 24, 30, 45, 18, 42
Find:
1. First Quartile
2. Second Quartile
3. Third Quartile
4. IQR
5. 75th percentile
6. 90th percentile
7. z-score of 25, with $\bar{x} = 29$ and $s = 9$
Note: Interpret the results.
PERCENTILE
Percentiles divide the observations into 100 equal parts. A percentile
indicates how much of the observations may be found
below a given value. For instance, if a value is the 30th percentile, this
means that 30% of the observations fall below it.
To compute the position $L$ of a percentile, we have:
$$L = \frac{K}{100} N$$

Where: $K$ is the percentile being sought (the $K$th percentile)
$N$ is the total number of observations
Example:
The following are the scores of 12 students in the midterm exam
in Math 11n.
85, 34, 42, 51, 84, 86, 78, 85, 87, 69, 74, 65
Find the 80th percentile.
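The position formula for this example can be sketched in Python as shown below. The formula itself comes from the text. Turning a decimal position into a score is an assumption here: the sketch rounds the position to the nearest whole number, as the text does for quartiles. The function name `percentile_position` is illustrative.

```python
import math


def percentile_position(k, n):
    """Return the position L = (K / 100) * N of the K-th percentile."""
    return k * n / 100


scores = [85, 34, 42, 51, 84, 86, 78, 85, 87, 69, 74, 65]
ordered = sorted(scores)  # 34, 42, 51, 65, 69, 74, 78, 84, 85, 85, 86, 87

position = percentile_position(80, len(scores))
print(position)  # 9.6

# Assumption: round the position to the nearest whole rank (halves round up),
# mirroring the rounding rule the text gives for quartiles.
rank = math.floor(position + 0.5)
print(ordered[rank - 1])  # 10th value in sorted order: 85
```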
Z-SCORE
The z-score indicates how many standard deviations an element is from
the mean. The positive or negative sign indicates the direction
of the point away from the mean.
It is computed using the formula:

$$z = \frac{x - \bar{x}}{s}$$
Where: $x$ is the value of the element
$\bar{x}$ is the mean
$s$ is the standard deviation
A z-score less than 0 represents an element less than the mean.
A z-score greater than 0 represents an element greater than the
mean.
A z-score equal to 0 represents an element equal to the mean.
A z-score equal to 1 represents an element that is 1 standard
deviation greater than the mean; a z-score equal to 2, 2 standard
deviations greater than the mean; etc.

A z-score equal to -1 represents an element that is 1 standard
deviation less than the mean; a z-score equal to -2, 2 standard
deviations less than the mean; etc.
Example:
Given a mean of 60 and a standard deviation of 5, find
the corresponding z-score of 65.
Solution:
$$z = \frac{x - \bar{x}}{s} = \frac{65 - 60}{5} = 1$$

This means that 65 is one standard deviation higher than the mean.
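The same computation can be written as a small Python sketch. The function name `z_score` and its parameter names are illustrative; the function subtracts the mean from the value and divides by the standard deviation.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


print(z_score(65, 60, 5))  # 1.0, one standard deviation above the mean
print(z_score(55, 60, 5))  # -1.0, one standard deviation below the mean
print(z_score(60, 60, 5))  # 0.0, equal to the mean
```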
