
Introducing Statistical Terms

By

Sachin Chauhan
Pillai’s Institute of Management
Studies and Research
What is it?

Descriptive statistics is a tool or technique that is used to
describe and organize the characteristics of a collection of
information or data. The collection is called a data set or just
data.
Why use it?

Descriptive statistics is used in research to answer five basic
questions based on five key concepts.
Concept: Finding middle scores
Question: What is the middle set of scores for this data set?
Concept: Finding the spread of scores
Question: How spread out are the scores of this data set?
Concept: Finding the rank of scores
Question: How does a particular score compare to the rest of the
set of scores for this data set?
Concept: Finding the shape of scores
Question: What is the shape of the frequency distribution for
this data set?
Concept: Finding relationships between variables
We all love this game! All of our children are playing!
Question: How are different variables related in this data set?
Descriptive statistical strategies can be divided into five
areas that coincide with the five basic questions.

• Central Tendency: Mean, Median, and Mode (Middle)
• Variability: Standard Deviation, Variance (Spread)
• Rank, Relative Position: Percentile, Standard Score (Rank)
• Distributional Shape, Function: Skewness, Kurtosis (Shape)
• Correlation, Association: Pearson r, Contingency Table (Relationships)
MEASURES OF CENTRAL TENDENCY

Please find the middle!!!
• The mean is the arithmetic average of the scores.
• It is the most frequently used measure of central tendency.
• It is calculated by adding up all the scores and dividing by
the total number of scores.
• The mean takes into account every score and is affected by
extreme scores.

Example: 6 + 10 + 20 = 36, and 36/3 = 12.
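The calculation above can be sketched in a few lines of Python. The function name `mean` and the extra score of 200 are illustrative assumptions, added only to show how a single extreme score shifts the result.

```python
def mean(scores):
    """Arithmetic average: the sum of the scores divided by their count."""
    if not scores:
        raise ValueError("mean requires at least one score")
    return sum(scores) / len(scores)


scores = [6, 10, 20]
print(mean(scores))          # 36 / 3

# One extreme score (200 is a made-up value) pulls the mean upward.
print(mean(scores + [200]))  # 236 / 4
```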


• The median is the number that lies at the midpoint of the
distribution of earned scores.
• It divides the distribution into two equally large parts.

Example: 17, 15, 25, 78, 99. In sorted order these are
15, 17, 25, 78, 99, so the median is 25.
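As a minimal sketch, the median of the five scores above can be found by sorting and taking the middle value. The function name `median` is an assumption for illustration. Averaging the two middle scores when the count is even is the usual convention, but it is not stated in the slides.

```python
def median(scores):
    """Middle value of the sorted scores."""
    if not scores:
        raise ValueError("median requires at least one score")
    ordered = sorted(scores)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle scores (a common convention).
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([17, 15, 25, 78, 99]))
```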


• The mode is the score obtained by more subjects than any
other score.
• It is determined by looking at a set of scores or a graph
and seeing which score occurs most frequently.
• A set of scores may have more than one mode. If it has two,
it is bimodal. A set with several modes is labeled multimodal.

Example: 7, 12, 45, 45, 8. The mode is 45.
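One way to find the mode, sketched here in Python, is to count how often each score occurs and keep every score that reaches the highest count. The function name `modes` is an illustrative assumption. It returns a list so that bimodal and multimodal sets are handled the same way.

```python
from collections import Counter


def modes(scores):
    """All scores that occur most frequently, in ascending order."""
    if not scores:
        raise ValueError("modes requires at least one score")
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(s for s, c in counts.items() if c == highest)


print(modes([7, 12, 45, 45, 8]))  # one mode
print(modes([1, 1, 2, 2, 3]))     # two modes: a bimodal set
```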


VARIABILITY: FINDING DIFFERENCES
• Standard deviation is the most frequently used index of
variability. It is conceptually similar to the average deviation, or
distance, of the scores from the mean.
• A small standard deviation, or SD, indicates that the scores are
close together, and a large SD indicates that the scores are more
spread out.
• If the distribution is exactly normal, then the mean plus or minus
1 standard deviation will include about 68% of the scores, the mean
plus or minus 2 standard deviations about 95% of the scores, and
the mean plus or minus 3 standard deviations about 99.7% of the scores.
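A rough way to check these percentages is to simulate normally distributed scores and count how many fall within 1, 2, and 3 standard deviations of the mean. This is only a sketch under stated assumptions: the helper name `share_within`, the simulated scale (mean 100, SD 15), the sample size, and the fixed random seed are all illustrative choices. The population standard deviation (`statistics.pstdev`) is used.

```python
import random
import statistics


def share_within(scores, k):
    """Fraction of scores lying within k standard deviations of the mean."""
    m = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    inside = sum(1 for s in scores if abs(s - m) <= k * sd)
    return inside / len(scores)


# Simulated scores; the mean of 100 and SD of 15 are made-up values.
rng = random.Random(0)
sample = [rng.gauss(100, 15) for _ in range(100_000)]

for k in (1, 2, 3):
    print(k, round(share_within(sample, k), 3))
```

With a sample this large, the three printed shares should land close to 0.68, 0.95, and 0.997.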
RANK, RELATIVE POSITION
• Measures of relative position indicate where a score is in
relation to all other scores in the distribution.
• These measures make it possible to compare the
performance of an individual on two or more different tests.
• The two most frequently used measures of relative
position are percentile ranks and standard scores.
• A percentile rank indicates the percentage of scores that
fall below a given score. If a score of 65 corresponds to a
percentile rank of 80, the 80th percentile, this means that 80%
of the scores in the distribution are lower than 65.
• For example, if Suzy Smart scored at the 95th percentile,
this means that she did better than 95% of the other students
who took the same test. Conversely, if Dudley Dull scored at
the 2nd percentile, this means that Dudley only did better
than, or received a higher score than, 2% of the students.
• The median corresponds to the 50th percentile.
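The percentile rank described above can be sketched as a simple count of the scores below a given score. The function name `percentile_rank` and the ten scores are made up for illustration. They are chosen so that a score of 65 falls at the 80th percentile, as in the example.

```python
def percentile_rank(score, scores):
    """Percentage of scores in the distribution that fall below `score`."""
    if not scores:
        raise ValueError("percentile_rank requires at least one score")
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


# Hypothetical distribution: 8 of the 10 scores are lower than 65.
distribution = [40, 45, 50, 52, 55, 58, 60, 63, 65, 70]
print(percentile_rank(65, distribution))
```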
• A standard score is a derived score that expresses how far a
given raw score is from some reference point, typically the mean,
in terms of standard deviation units.
• Standard scores allow scores from different tests to be
compared on a common scale and, unlike percentiles, permit
mathematical operations such as averaging. For example, standard
scores are used to report achievement scores.
• The most common standard scores are z scores, T scores, and
stanines.
Standard Scores (cont.)
z scores
A z score expresses how far a score is from the mean in
terms of standard deviation units.
• A score which is exactly on the mean corresponds to a z
score of 0.
• A score which is exactly 1 standard deviation above the
mean corresponds to a z score of +1.00.
• A score which is exactly 2 standard deviations below the
mean corresponds to a z score of -2.00.
• The z score allows scores from different tests to be
compared.
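A z score is the raw score minus the mean, divided by the standard deviation. The sketch below uses that definition. The function name `z_score` and the two sets of test statistics are hypothetical values chosen to show how scores from different tests can be compared.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Hypothetical tests on different scales.
reading = z_score(85, mean=70, sd=10)  # 1.5 SD above the reading mean
maths = z_score(55, mean=50, sd=5)     # 1.0 SD above the maths mean
print(reading, maths)
```

Although 85 and 55 are not directly comparable as raw scores, the z scores show that the reading result is further above its group mean.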
Standard Scores (cont.)

T Scores
A T score is just like the z score in that it also indicates how
many standard deviations a particular raw score lies above or
below the group mean.
• A T score is a z score expressed in a different form. To
convert a z to a T, multiply the z by 10 and add 50 (T = 10z + 50).
• Thus, a z score of 0 (the mean) becomes a T score of 50.
• A z score of +1.00 becomes a T score of 60.
• A z score of -1.00 becomes a T score of 40.
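The conversion rule above is a one-line function. The name `t_score` is an illustrative assumption.

```python
def t_score(z):
    """Convert a z score to a T score: multiply by 10 and add 50."""
    return 10 * z + 50


for z in (0, 1.0, -1.0):
    print(z, t_score(z))
```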
Standard Scores (cont.)

Stanines

Stanines are standard scores that divide a distribution into nine
parts.
• Stanine equivalencies are derived using the formula 2z + 5
and rounding the resulting values to the nearest whole number.
• Stanines 2 through 8 each represent one-half of an SD of the
distribution, and stanines 1 and 9 include the remainder.
• Stanines are frequently reported in norms tables and are often
used by school systems for grouping purposes.
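A minimal sketch of the 2z + 5 rule follows. Two details are assumptions that the slides do not spell out: values exactly halfway between two whole numbers are rounded up, and results are clipped to the range 1 to 9 so that the two end stanines absorb the tails. The function name `stanine` is also illustrative.

```python
import math


def stanine(z):
    """Stanine for a z score: 2z + 5, rounded, then limited to 1..9."""
    # floor(x + 0.5) rounds halves upward; Python's built-in round()
    # rounds halves to the nearest even number instead.
    rounded = math.floor(2 * z + 5 + 0.5)
    return max(1, min(9, rounded))


for z in (-3.0, -1.0, 0.0, 0.3, 1.0, 3.0):
    print(z, stanine(z))
```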
DISTRIBUTIONAL SHAPE, FUNCTION
The Normal Curve
In a true normal distribution, the values of the mean, median,
and mode are identical. The normal curve is symmetrical about the
mean, which means that the two halves of the curve are mirror
images of each other. The tails of the curve are asymptotic,
meaning that they come close to the horizontal axis but never
touch it. A true normal distribution rarely occurs in practice.
Distributions that are not symmetrical are labeled as skewed.
These distributions can be positively or negatively skewed.
Positively Skewed Distribution
• Skewed distributions have more extreme cases at one end or the
other.
• In a positively skewed distribution, a few scores are strung out
toward the high end of the score continuum, thus forming a “tail”
that points to the right.
• In this distribution the mean is typically higher than the median
and the mode.
[Figure: positively skewed curve, with mode, median, and mean marked from left to right]
Negatively Skewed Distributions
• In a negatively skewed distribution, the mean is typically
lower than the median and the mode.
• For example, a graph of the results of a 100-item test would
show most of the scores leaning toward the upper end of the
scale, with a few very poor ones.
• The few low scores tend to “pull” the mean in the direction
of the low scores.
[Figure: negatively skewed curve, with mean, median, and mode marked from left to right]
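The pull of a few extreme scores on the mean can be checked numerically. The sketch below compares the mean and median and computes a skewness coefficient, using the moment definition (third central moment divided by the 1.5 power of the second). That particular formula, the function name `skewness`, and both score lists are assumptions for illustration. The first list imitates the 100-item test with a few very poor results.

```python
import statistics


def skewness(scores):
    """Moment coefficient of skewness: negative = left tail, positive = right tail."""
    n = len(scores)
    m = statistics.fmean(scores)
    m2 = sum((s - m) ** 2 for s in scores) / n  # second central moment
    m3 = sum((s - m) ** 3 for s in scores) / n  # third central moment
    if m2 == 0:
        raise ValueError("all scores are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Made-up test results: mostly high scores with two very low ones.
negative = [95, 92, 90, 88, 88, 85, 83, 80, 35, 20]
# The mirror image: mostly low scores with two very high ones.
positive = [100 - s for s in negative]

for name, data in (("negative", negative), ("positive", positive)):
    print(name, statistics.fmean(data), statistics.median(data),
          round(skewness(data), 2))
```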
CORRELATION, ASSOCIATION

How does socioeconomic status (SES) affect learning?
Correlations
Correlations look at the relationship between variables. They
show how the value of one variable changes when the value of
another variable changes. To illustrate the relationship, correlation
coefficients between variables are computed.
A correlation coefficient is a numerical index that reflects the
relationship between two variables. The value ranges from -1 to
+1, with -1 being a perfect negative relationship and +1 being a
perfect positive relationship. The following table summarizes
these relationships.
Types of Correlations and the Corresponding Relationship
Between Variables

What happens   What happens   Type of            Value               Example
to Variable X  to Variable Y  Correlation

X increases    Y increases    Direct/Positive    Positive, ranging   The more you study, the
                                                 from .00 to +1.00   higher your test scores

X decreases    Y decreases    Direct/Positive    Positive, ranging   The less money put in the
                                                 from .00 to +1.00   bank, the less interest earned

X increases    Y decreases    Indirect/Negative  Negative, ranging   The more you exercise, the
                                                 from -1.00 to .00   less you weigh

X decreases    Y increases    Indirect/Negative  Negative, ranging   The less time taken to complete
                                                 from -1.00 to .00   a test, the more you’ll get wrong
Correlations cont.
Pearson r
Pearson r stands for the Pearson product-moment correlation. The Pearson r is
the most widely used correlation coefficient. It examines relationships between
variables that are continuous in nature, such as height, age, test scores, or income.
• Example: r_xy = the correlation between variable X and variable Y.
• In descriptive statistics, a correlation matrix is used to illustrate the
relationships between variables. To interpret the values in a matrix, use the
following table.
.8-1.0 Very strong relationship
.6-.8 Strong relationship
.4-.6 Moderate relationship
.2-.4 Weak relationship
.0-.2 Weak or no relationship
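As a sketch of how a Pearson r might be computed and then read against the table above, the code below uses the standard product-moment formula. Several points are assumptions rather than statements from the slides: the function names `pearson_r` and `strength`, the made-up study data, the use of the absolute value of r (so that negative correlations are graded by size), and the choice to place a boundary value such as .8 in the stronger category.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def strength(r):
    """Label the size of r using the ranges in the table above."""
    size = abs(r)
    if size >= 0.8:
        return "Very strong relationship"
    if size >= 0.6:
        return "Strong relationship"
    if size >= 0.4:
        return "Moderate relationship"
    if size >= 0.2:
        return "Weak relationship"
    return "Weak or no relationship"


# Made-up data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]
r = pearson_r(hours, scores)
print(round(r, 3), strength(r))
```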
Correlations cont.
Correlation Matrix
                               PRE_LIT   MID_LIT   PST_LIT
PRE_LIT  Pearson Correlation   1.000     .521      .511
         Sig. (2-tailed)       .         .000      .000
         N                     90        90        90
MID_LIT  Pearson Correlation   .521      1.000     .987
         Sig. (2-tailed)       .000      .         .000
         N                     90        90        90
PST_LIT  Pearson Correlation   .511      .987      1.000
         Sig. (2-tailed)       .000      .000      .
         N                     90        90        90

The correlation between pre-lit and mid-lit is .521.
The correlation between pre-lit and post-lit is .511.
The correlation between mid-lit and post-lit is .987.
Use the strength-of-relationship table to interpret these values.
Using Different Levels of Measurements for
Correlation Coefficients
There are four levels at which variables can be measured:
• Nominal, ordinal, interval, and ratio
• When computing correlation coefficients, nominal, ordinal, and
interval measures are used.
• Examples and explanations of levels of measurement and types
of correlations follow.
Using Different Levels of Measurements for
Correlation Coefficients
Variable X                     Variable Y                Type of Correlation    Correlation being computed

Nominal (voting preference)    Nominal (gender)          Phi coefficient        Correlation between voting
                                                                                preference and gender

Nominal (social class such     Ordinal (rank in high     Rank biserial          Correlation between social class
as high, middle, low)          school graduating class)  coefficient            and rank in high school

Nominal (family configuration  Interval (grade point     Point biserial         Correlation between family
such as one/two parents in     average)                                         configuration and grade point
home)                                                                           average

Ordinal (height converted      Ordinal (weight           Spearman rank          Correlation between height and
to rank)                       converted to rank)        coefficient            weight

Interval (# of problems        Interval (age in years)   Pearson r              Correlation between number of
solved)                                                                         problems solved and age in years
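For the ordinal row of the table, one common way to obtain a Spearman rank coefficient is to convert each variable to ranks and then compute a Pearson r on the ranks. The sketch below follows that approach. The function names, the use of average ranks for tied values, and the height and weight figures are all illustrative assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman rank coefficient: Pearson r computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up heights (cm) and weights (kg) for six people.
heights = [150, 160, 165, 170, 180, 185]
weights = [52, 58, 70, 66, 80, 90]
print(round(spearman_rho(heights, weights), 3))
```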
Levels of Measurement in Descriptive Statistics
Nominal
The nominal scale represents the lowest level of measurement.
This scale classifies persons or objects into two or more categories.
• Whatever the basis for classification, a person can be in only
one category, and members of a given category have a common
set of characteristics.
• Identifies and classifies.
• Examples are religious preference, political preference, or
team jersey numbers.
Levels of Measurement cont.
Ordinal
An ordinal scale not only classifies subjects, but also ranks
them in terms of the degree to which they possess the
characteristic of interest. In other words, an ordinal scale
puts the subjects in order from highest to lowest, from
greatest to least.
• Determines greater to least.
• Rank orders scores.
• Examples are quality grades of objects such as lumber or gems,
and personal preferences and attitudes.
Levels of Measurement cont.
Interval

An interval scale has all the characteristics of a nominal and an
ordinal scale, but in addition it is based upon predetermined equal
intervals.
• Determines equality of intervals or differences.
• Finds distances or differences.
• Examples: temperature (in Celsius or Fahrenheit), calendar dates
Levels of Measurement cont.
Ratio

• Ratio represents the highest, most precise level of
measurement. A ratio scale has all the advantages of the
other types of scales and in addition it has a meaningful true
zero point.
• Determines equality of ratios.
• Finds ratios, fractions, or multiples.
• Examples: length, weight, loudness, brightness, duration
At last we’re done! What do you think, Mark?
