
Educational Research:

Data analysis and interpretation 1: Descriptive statistics

EDU 8603 Educational Research Richard M. Jacobs, OSA, Ph.D.

Statistics...

A set of mathematical procedures for describing, synthesizing, analyzing, and interpreting quantitative data.
The selection of an appropriate statistical technique is determined by the research design, the hypothesis, and the data collected.

Preparing data for analysis...


Data must be accurately scored and systematically organized to facilitate data analysis:
scoring: assigning a total to each participant's instrument
tabulating: organizing the data in a systematic manner
coding: assigning numerals (e.g., ID) to data

descriptive statistics... permit the researcher to describe many pieces of data with a few indices

statistics... indices calculated by the researcher for a sample drawn from a population

parameters... indices calculated by the researcher for an entire population

Types of descriptive statistics


1. graphs
2. measures of central tendency
3. measures of variability
4. measures of relative position
5. measures of relationship

graphs... representations of data enabling the researcher to see what the distribution of scores looks like

1. Graphs

frequency polygon, pie chart, boxplot, stem-and-leaf chart

measures of central tendency... indices enabling the researcher to determine the typical or average score of a group of scores

2. Measures of central tendency


mode, median, mean

mode... the score attained by more participants than any other score

median... the point in a distribution above and below which are 50% of the scores

mean... the arithmetic average of the scores
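The three definitions above can be checked on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores for nine participants (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

score_mode = mode(scores)      # score attained by more participants than any other
score_median = median(scores)  # point with 50% of the scores above and below it
score_mean = mean(scores)      # arithmetic average: sum of scores / number of scores

print("mode:", score_mode)
print("median:", score_median)
print("mean:", round(score_mean, 2))
```

In this made-up set the mode and median coincide at 85 while the mean is slightly lower, because the lowest scores pull the average down.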

measures of variability... indices enabling the researcher to indicate how spread out a group of scores is

3. Measures of variability

range, quartile deviation, variance, standard deviation

range... the difference between the highest and lowest score in a distribution

quartile deviation... one half of the difference between the upper quartile (the 75%ile) and the lower quartile (the 25%ile) in a distribution

variance... a summary statistic indicating the degree of variability among participants for a given variable

standard deviation... the square root of variance providing an index of variability in the distribution of scores
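As a rough illustration of the four measures of variability, the sketch below computes each one for a hypothetical set of scores. Two choices here are assumptions, not something the slides specify: the population formulas (`pvariance`, `pstdev`) are used rather than the sample formulas, and quartiles are found with the "inclusive" method of `statistics.quantiles`. Other conventions give slightly different values.

```python
from statistics import pstdev, pvariance, quantiles

# Hypothetical scores (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Quartile deviation: half the distance between the 75th and 25th percentiles.
# Assumption: quartiles computed with the "inclusive" interpolation method.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
quartile_deviation = (q3 - q1) / 2

# Variance and standard deviation.
# Assumption: population formulas (divide by N); use variance()/stdev() for N - 1.
variance = pvariance(scores)
std_dev = pstdev(scores)

print("range:", score_range)
print("quartile deviation:", quartile_deviation)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the original scores.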

Normal distributions of data (the normal curve)...


A bell-shaped distribution of scores having four identifiable properties:
1. 50% of the scores fall above the mean and 50% of the scores fall below the mean
2. the mean, median, and mode are the same value
3. most scores are near the mean and, the farther from the mean a score is, the fewer the number of participants who attained that score
4. the same number, or percentage, of scores is between the mean and plus one standard deviation as is between the mean and minus one standard deviation
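The symmetry properties of the normal curve can be illustrated numerically. This sketch uses `statistics.NormalDist` from the Python standard library; the mean of 50 and standard deviation of 10 are hypothetical values picked for convenience.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 50, standard deviation 10.
dist = NormalDist(mu=50, sigma=10)

# Proportion of scores falling below the mean.
below_mean = dist.cdf(50)

# Proportion between the mean and one standard deviation on either side.
mean_to_plus_1sd = dist.cdf(60) - dist.cdf(50)
minus_1sd_to_mean = dist.cdf(50) - dist.cdf(40)

# Proportion within one and within two standard deviations of the mean.
within_1sd = dist.cdf(60) - dist.cdf(40)
within_2sd = dist.cdf(70) - dist.cdf(30)

print("below the mean:", round(below_mean, 4))
print("mean to +1 SD:", round(mean_to_plus_1sd, 4))
print("-1 SD to mean:", round(minus_1sd_to_mean, 4))
print("within 1 SD:", round(within_1sd, 4))
print("within 2 SD:", round(within_2sd, 4))
```

Half of the distribution lies below the mean, the two one-standard-deviation bands on either side of the mean are equal, roughly 68% of scores fall within one standard deviation, and roughly 95% fall within two.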

Non-normal distributions of data (skewed distributions)...


A non-bell-shaped distribution of scores where:
mean < median < mode (a negatively skewed distribution)
mean > median > mode (a positively skewed distribution)
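The ordering of the mean, median, and mode can be turned into a simple check. The function `skew_direction` below is a hypothetical helper written for this illustration, and both data sets are made up. It applies only the ordering rule stated above, so it is a rough heuristic rather than a formal skewness statistic.

```python
from statistics import mean, median, mode

def skew_direction(scores):
    """Classify skew from the ordering of mean, median, and mode (heuristic)."""
    m, md, mo = mean(scores), median(scores), mode(scores)
    if m > md > mo:
        return "positive"
    if m < md < mo:
        return "negative"
    return "undetermined"

# Hypothetical data: a few extreme scores at the upper end.
high_tail = [60, 65, 65, 65, 70, 70, 75, 80, 95, 100]

# Hypothetical data: a few extreme scores at the lower end.
low_tail = [40, 45, 60, 65, 70, 70, 75, 75, 75, 80]

print("high_tail:", skew_direction(high_tail))
print("low_tail:", skew_direction(low_tail))
```

Extreme scores pull the mean toward them: in `high_tail` the mean (74.5) exceeds the median (70) and the mode (65), while in `low_tail` the mean (65.5) falls below the median (70) and the mode (75).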

measures of relative position... indices enabling the researcher to describe a participant's performance compared to the performance of all other participants

4. Measures of relative position


percentile ranks, standard scores

percentile rank... indicates the percentage of scores that fall at or below a given score
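A percentile rank follows directly from this definition. The function `percentile_rank` below is a hypothetical helper and the scores are illustrative. It uses the "at or below" convention stated above; some texts use other conventions, such as counting only half of the tied scores.

```python
def percentile_rank(scores, score):
    """Percentage of scores that fall at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical scores (illustrative data only).
scores = [70, 75, 75, 80, 85, 85, 85, 90, 95]

rank_of_80 = percentile_rank(scores, 80)
rank_of_95 = percentile_rank(scores, 95)

print("percentile rank of 80:", round(rank_of_80, 1))
print("percentile rank of 95:", round(rank_of_95, 1))
```

Four of the nine scores are at or below 80, so its percentile rank is about 44; the highest score has a percentile rank of 100 under this convention.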

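standard score... a measure of relative position expressing how far a score is from a reference point, typically the mean, in terms of standard deviation units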

Types of standard scores... z score, T score, stanines

z score... a statistic expressing how far a score is from the mean in terms of standard deviation units

T score... a transformed z score that avoids negative numbers and decimals by multiplying the z score by 10 and adding 50

stanines... a standard score that divides a distribution into nine parts
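The three standard scores can be sketched as small conversion functions. The z and T formulas follow the definitions above. The stanine conversion is an assumption: it uses the common convention that stanines have a mean of 5 and a standard deviation of 2, with results limited to the range 1 to 9. The function names and the mean and standard deviation are hypothetical.

```python
from math import floor

def z_score(raw, mean, std_dev):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / std_dev

def t_score(z):
    """Transformed z score: multiply by 10 and add 50."""
    return 10 * z + 50

def stanine(z):
    """Assumed convention: mean 5, SD 2, rounded to a whole number from 1 to 9."""
    return max(1, min(9, floor(2 * z + 5.5)))

# Hypothetical test with a mean of 80 and a standard deviation of 8.
z = z_score(92, mean=80, std_dev=8)

print("z score:", z)
print("T score:", t_score(z))
print("stanine:", stanine(z))
```

A raw score of 92 lies 1.5 standard deviations above the mean, which corresponds to a T score of 65 and, under the assumed convention, a stanine of 8.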

measures of relationship... indices enabling the researcher to indicate the degree to which two sets of scores are related

5. Measures of relationship

Spearman Rho, Pearson r

correlation... determines whether and to what degree a relationship exists between two or more quantifiable variables; the degree of the relationship is expressed as a coefficient of correlation

the presence of a correlation does not indicate a cause-effect relationship primarily because of the possibility of multiple confounding factors

Correlation coefficient

-1.00 strong negative
0.00 no relationship
+1.00 strong positive

Spearman Rho... a measure of correlation used for rank and ordinal data

Pearson r... a measure of correlation used for data of interval or ratio scales; assumes that the relationship between the variables being correlated is linear
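The difference between the two coefficients shows up when a relationship is consistent in direction but not linear. The sketch below implements both by hand: `pearson_r` uses the usual product-moment formula, and `spearman_rho` is computed as Pearson r applied to ranks, with tied values given their average rank. The function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_dev = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_dev / sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank each value (1 = lowest); tied values share their average rank."""
    ordered = sorted(values)
    return [
        (ordered.index(v) + 1 + ordered.index(v) + ordered.count(v)) / 2
        for v in values
    ]

def spearman_rho(x, y):
    """Rank correlation: Pearson r computed on the ranks of the scores."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r = pearson_r(x, y)
rho = spearman_rho(x, y)

print("Pearson r:", round(r, 3))
print("Spearman rho:", round(rho, 3))
```

Because the rank order of the two variables agrees perfectly, Spearman rho reaches its maximum, while Pearson r stays below 1 because the relationship departs from a straight line.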

Mini-Quiz

True and false the analysis of the data is as important as any other component of the research process
True

True and false descriptive statistics are normally computed separately for each group in a research study
True

True and false every instrument administered must always be scored accurately and consistently, using the same procedures and criteria
True

True and false tentative scoring procedures must always be tried out beforehand by administering the instrument to the study participants
False

True and false a computer should not be used to perform an analysis that a researcher has never completed by hand or, at least, studied extensively
True

True and false the first step in data analysis is to describe, or summarize, the data using descriptive statistics
True

True and false the number resulting from the computation of a measure of central tendency represents the typical score attained by a group of participants
True

True and false the mean is the most precise, stable index of typical performance that is especially useful in situations in which there are extreme scores
False

True and false unless a correlation coefficient is used to compute the reliability of an instrument in a causal-comparative or experimental study, a correlation coefficient is only computed in a correlation study
True

True and false plus and/or minus two standard deviations includes more than 99% of the scores
False

True and false standard scores are rarely used in research studies
True

True and false to test a hypothesis adequately, more than descriptive statistics are normally needed
True

True and false if the extreme scores are at the upper, or higher, end of the distribution, it is said to be positively skewed
True

True and false the median of a set of scores corresponds to the 50th percentile
True

True and false a standard score is a measure of relative position that is appropriate when the data represent a nominal scale
False

True and false a z score expresses how far a score is from the mean in terms of standard deviation units
True

True and false the Spearman Rho is the appropriate measure of correlation when the variables are expressed as ranks instead of scores
True

True and false the assumption associated with the application of Pearson r is that the relationship between the variables being correlated is linear
True

Fill in the blank statistics which permit the researcher to describe many scores with a small number of indices
descriptive statistics

Fill in the blank the values calculated for a sample drawn from a population
statistics

Fill in the blank the values calculated for an entire population


parameters

Fill in the blank a convenient way to describe a set of data with a single number
measures of central tendency

Fill in the blank the index of central tendency appropriate for nominal data
mode

Fill in the blank the index of central tendency appropriate for ordinal data
median

Fill in the blank the index of central tendency appropriate for interval or ratio data
mean

Fill in the blank the score attained by more participants than any other score
mode

Fill in the blank the point in a distribution above and below which are 50% of the scores
median

Fill in the blank the arithmetic average of the scores


mean

Fill in the blank the difference between the highest and lowest score in a distribution
range

Fill in the blank the measure of variability identifying one half of the difference between the 75th percentile and the 25th percentile
quartile deviation

Fill in the blank the measure of variability used for interval and ratio data
standard deviation

Fill in the blank the only appropriate measure of variability for nominal data
range

Fill in the blank +/- 1.00 standard deviations constitutes ____ % of the sample
68%

Fill in the blank extreme scores at the lower end of the distribution indicate a ______ skewed distribution
negatively

Fill in the blank indices describing where a score is in relation to all other scores
measures of relative position

Fill in the blank indicates the percentage of scores that fall at or below a given score
percentile ranks

Fill in the blank if a set of scores is transformed into a set of z scores, the new distribution has a mean of ____ and a standard deviation of ____
zero; one

Fill in the blank a set of standard scores that divide a distribution into nine parts
stanines

Fill in the blank the most appropriate measure of correlation when the sets of data to be correlated represent either interval or ratio scales
Pearson r

This module has focused on...


descriptive statistics
...the statistical procedures for describing, synthesizing, analyzing, and interpreting quantitative data

The next module will focus on...


inferential statistics
...the statistical procedures for generalizing to a population of individuals based on information obtained from a limited number of research participants