
How to Design and Evaluate Research in Education
By
Jack R. Fraenkel and Norman E. Wallen
Chapter 1
The Nature of Research
Ways of knowing
Sensory experience (incomplete/undependable)
Agreement with others (common knowledge can be wrong)
Experts’ opinion (they can be mistaken)
Logic/reasoning things out (can be based on false premises)
Why research is of value
Scientific research (using scientific method) is more trustworthy than
expert/colleague opinion, intuition, etc.
Chapter 1 - continued
The Nature of Research
Scientific Method (testing ideas in the public arena)
Put guesses (hypotheses) to tests and see how they hold up
All aspects of investigations are public and described in detail so anyone
who questions results can repeat study for themselves
Replication is a key component of scientific method

Chapter 1 - continued
The Nature of Research
Scientific Method (requires freedom of thought and public procedures that
can be replicated)
Identify the problem or question
Clarify the problem
Determine information needed and how to obtain it
Organize the information obtained
Interpret the results
All conclusions are tentative and subject to change as new evidence is
uncovered (don’t PROVE things)

Chapter 1 - continued
The Nature of Research
Types of Research
Experimental (most conclusive of methods)
Researcher tries different treatments (independent variable) to see their effects
(dependent variable)
In simple experiments compare 2 methods and try to control all extraneous
variables that might affect outcome
Need control over assignment to treatment and control groups (to make sure they
are equivalent)
Sometimes use single subject research (intensive study of single individual or
group over time)

Chapter 1 - continued
The Nature of Research
(Types of Research continued)
Correlational Research
Looks at existing relationships between 2 or more variables to make better
predictions
Causal Comparative Research
Intended to establish cause and effect but cannot assign subjects to treatment/control
Limited interpretations (there could be a common cause for both the apparent cause and the
effect, e.g., stress causes both smoking and cancer)
Used for identifying possible causes; similar to correlation
Chapter 1 - continued
The Nature of Research
(Types of Research continued)
Survey Research
Determine/describe characteristics of a group
Descriptive survey in writing or by interview
Provides lots of information from large samples
Three main problems: clarity of questions, honesty of respondents, return rates
Ethnographic research (qualitative)
In depth research to answer WHY questions
Some is historical (biography, phenomenology, case study, grounded theory)
Chapter 1 - continued
The Nature of Research
(Types of Research continued)
Historical Research
Study past, often using existing documents, to reconstruct what happened
Establishing truth of documents is essential
Action Research (differs from above types)
Not concerned with generalizations to other settings
Focus on information to change conditions in a particular situation (may use all the
above methods)
Each of these methods is valuable for a different purpose
Chapter 1 - continued
The Nature of Research
General Research Types
Descriptive (describe state of affairs using surveys, ethnography, etc.)
Associational (goes beyond description to see how things are related, so
phenomena can be better understood, using correlational/causal-comparative research)
Intervention (try intervening to see effects using experiments)
Chapter 1 - continued
The Nature of Research
Quantitative v. Qualitative
Quantitative (numbers)
Facts/feelings separate
World is single reality
Researcher removed
Established research design
Experiment prototype
Generalization emphasized
Chapter 1 - continued
The Nature of Research
Meta-Analysis
Locate all the studies on a topic and synthesize results using statistical techniques
(average the results)
Critical Analysis of Research (some say all research is flawed)
Question of reality (there are only individual perceptions of it)
Question of communication (words are subjective)
Question of values (no objectivity only social constructs)
Question of unstated assumptions (researchers don’t clarify assumptions that guide
them)
Question of societal consequences (research serves political purposes that are
conservative or oppressive; preserve status quo)
Chapter 1 - continued
The Nature of Research
Overview of the Research Process (Fig. 1.4)
Introduction chapter
Problem statement that includes some background info and justification
for study
Exploratory question or hypothesis (relationship among variables clearly
defined); goes last in Ch.
Definitions (in operational terms)
Review of related literature (other studies of the topic read and
summarized to shed light on what is already known)
Chapter 1 - continued
The Nature of Research
Overview of the Research Process (Fig. 1.4)
Methods chapter
Subjects (sample, population, method to select sample)
Instruments (tests/measures described in detail and with rationale for
their use)
Procedures (what, when, where, how, and with whom);
Give schedule/dates, describe materials used, design of study, and possible
biases/threats to validity
Data analysis (how data will be analyzed to answer research questions
or test hypothesis)

Chapter 2
The Research Problem
Statement of the Problem (identify a problem/area of concern to
investigate)
Must be feasible, clear, significant, ethical
Research Questions (serve as focus of investigation, see p. 28 list)
Some info must be collected that answers them (must be researchable)
Cannot research “should” questions
See diagram, p. 29
Chapter 2 - Continued
The Research Problem
RQ should be feasible (can be investigated with available resources)
RQ should be clear (specifically define terms used…operational needed, but
give both)
Constitutive definitions (dictionary meaning)
Operational definitions (specific actions/steps to measure term; IQ=time to solve
puzzle, where <20 sec. is high; 20-40 is med.; 40+ is low)
RQ should be significant (worth investigating; how does it contribute to field
and who can use info)
RQs often investigate relationships (two characteristics/qualities tied
together)

Chapter 3
Variables and Hypotheses
Important to study relationships
Sometimes just want to describe (use RQ)
Usually want to look for patterns/connections
Hypothesis predicts the existence of a relationship
Variables (anything that can vary in measure; opposite of
constant)
Variables must be clearly defined
Often investigate relationship between variables

Chapter 3 - Continued
Variables and Hypotheses
Variable Classifications (Fig. 3.4, p. 42)
Quantitative (variables measured as a matter of degree, using real numbers; e.g., age,
number of children)
Categorical (no variation in degree; a subject is either in a category or not; e.g., gender, hair color)
Independent: the cause (aka the manipulated, treatment or experimental variable)
Dependent: the effect (aka outcome variable)
Extraneous: uncontrolled IVs (see Fig. 3.2, p. 46)
All extraneous variables must be accounted for in an experiment
Chapter 3 - Continued
Variables and Hypotheses
Hypotheses – predictions about possible outcome of a study; sometimes several
hypotheses from one RQ (Fig 3.3)
RQ: Will athletes have a higher GPA than nonathletes?
H: Athletes will have higher GPAs than nonathletes
Advantages to stating a hypothesis as well as RQ
Clarifies/focuses research to make prediction based on previous research/theory
Multiple supporting tests to confirm hypothesis strengthens it
Disadvantages
Can lead to bias in methods (conscious or unconscious) to try to support hypothesis
Sometimes miss other important info due to focus on hypothesis (peer review/replication is a check on
this)
Chapter 3 - Continued
Variables and Hypotheses
Some hypotheses are more important than others
Directional v. nondirectional
Directional says which group will score higher/do better
Nondirectional just indicates there will be a difference, but not who will
score higher/do better
Directional more risky, so be careful/tentative in using directional ones
Chapter 4
Ethics and Research
Examples of unethical practices
Requiring participation from powerless (students)
Using minors without parental permission
Deleting data that don’t agree w/ hypothesis
Invading privacy of subjects
Physically or psychologically harming subjects
APA statement of ethical principles in research
Each student must sign one and have it signed by workplace supervisor
Chapter 4 - Continued
Ethics and Research
Protecting participants from harm requires informed consent
Subjects must know the purpose of the study, possible benefits/harm; participation is voluntary and
they can w/draw without penalty any time (Fig. 4.3, p. 59)
Researchers should ask: Could subjects be harmed? Is there another way to get the
info? Is the info valuable enough to justify study?
Researchers must ensure confidentiality of data (limit access; no names if possible; tell
subjects confidential or anonymous)
Deceiving subjects is sometimes necessary (Milgram study), ask if results justify ethical
lapse
When deception is used, subjects should be okay with it afterward (and they can refuse use of their data)
Chapter 4 - Continued
Ethics and Research
Research with children
Parental consent required (signed permission from parents)
APA Ethics in Research Form addresses this also
Regulation of Research (National Research Act of 1974)
If federal funding received must have an IRB to check: risks to subjects, informed
consent guidelines met, debriefing plans for subjects
HHS made changes in 1981 so that educational research is exempt under certain
conditions

Chapter 5
Review of the Literature
Value of the Literature Review
Glean ideas from others interested in topic
See results of related studies (must be able to evaluate those objectively)
Types of sources
General References – indexes of primary sources and abstracts (ERIC, Psych
Abstracts)
Primary Sources – publications where researchers report their results (peer
reviewed/refereed journals)
Secondary Sources – publications where authors describe works of others
(encyclopedias, tradebooks, textbooks)

Chapter 5 - Continued
Review of the Literature
Steps in the Literature Review (manual or electronic) See
examples p. 74
Define problem precisely as possible
Review some secondary sources*
Review some general reference works*
Formulate search terms (keywords/descriptors)
Search general references for primary sources
Obtain and read primary sources (make notes/summarize)
*May be based on existing knowledge or previous reading
Chapter 5 - Continued
Review of the Literature
Making notes
Include problem/purpose; hypotheses/RQ; procedures w/ subjects/methods; findings/conclusions;
citation!
Searching strategies…use Boolean operators (AND, OR, NOT)
Searching www…be careful of reliability
Writing up the Literature Review
Introduction - describes problem and justification for study;
Body – discuss related studies together (#2, p.88)
Summary – ties literature together/give conclusions arising from literature
Reference list
Don’t replace a review of primary sources with meta-analysis (a combined review of all
available research on a topic w/ results averaged)

Chapter 6
Sampling
Sample – any group on which info is obtained
Population – group that researcher is trying to represent
Population must be defined first; more closely defined, easier to do, but
less generalizable
Study a subset of the population because it is cheaper, faster, easier, and
if done right, get same results as a census (study of whole pop)
Accessible population – the group you are able to realistically generalize
to…may differ from target population
Chapter 6 - Continued
Sampling
(Random v. Nonrandom Sampling)
Random – every population element has an equal and
independent chance to participate
Uses names in a hat or table of random numbers
Elimination of bias in selecting the sample is most important (meaning
the researcher does not influence who gets selected)
Ensuring sufficient sample size is second most important
Nonrandom/purposive - troubles with
representativeness/generalizing
Chapter 6 - Continued
Sampling
(Random Sampling Methods)
Simple random sampling
Names in a hat or table of random numbers--p.99
Larger samples more likely to represent pop.
Any difference between population and sample is random and small (called random
sampling error)
Stratified random sampling
Ensures small subgroups (strata) are represented
Normally proportional to their part of pop.
Break pop into strata, then randomly select w/in strata
Multistage sampling (see p. 94)
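
The two methods above can be sketched in a few lines. This is only an illustration, assuming a made-up population of 80 teachers and 20 administrators; the function names and the "at least one per stratum" rule are choices made for the example, not something taken from the text.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement; every element has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

def stratified_random_sample(population, stratum_of, n, seed=None):
    """Break the population into strata, then randomly select within each stratum.

    Each stratum is sampled in proportion to its share of the population.
    Because of rounding, the total can differ slightly from n.
    """
    rng = random.Random(seed)
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    total = len(population)
    sample = []
    for members in strata.values():
        # Proportional allocation (assumption: keep at least one per stratum)
        k = max(1, round(n * len(members) / total))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample

# Hypothetical population: 80 teachers and 20 administrators
population = [("teacher", i) for i in range(80)] + [("admin", i) for i in range(20)]
srs = simple_random_sample(population, 10, seed=1)
strat = stratified_random_sample(population, lambda p: p[0], 10, seed=1)
```

With a simple random sample the number of administrators drawn varies by chance; the stratified version always draws them in proportion to their share of the population.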
Chapter 6 - Continued
Sampling
(Random Sampling Methods, cont.)
Cluster random sampling
Select groups as sample units rather than individuals
REQUIRES a large number of groups/clusters
Multistage sampling (see p. 94)
Systematic (Nth) sampling
Considered random if the list is randomly ordered (w/ random starting point);
nonrandom if the list itself is systematically ordered
Divide pop size by sample size to get N (ps/ss=N)
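
A minimal sketch of Nth sampling, assuming a hypothetical roster of 500 names and a desired sample of 50 (so N = 500/50 = 10). The function name and the integer rounding of N are assumptions made for the example.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every Nth element after a random start, where N = pop size / sample size."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = random.Random(seed)
    interval = len(population) // sample_size   # N = ps / ss (rounded down)
    start = rng.randrange(interval)             # random starting point within the first N
    return [population[start + i * interval] for i in range(sample_size)]

# Hypothetical roster of 500 ID numbers
roster = list(range(1, 501))
sample = systematic_sample(roster, 50, seed=3)
```

If the roster is sorted in some meaningful way (for example by grade or by score), every Nth pick can line up with that ordering, which is why an ordered list makes the method nonrandom.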
Chapter 6 - Continued
Sampling
(Non-Random Sampling Methods)
Systematic can be nonrandom if list is ordered
Convenience sampling
Using group that is handy/available (or volunteers)
Avoid, if possible, since tend not to be representative due to homogeneity of groups
Report a large number of demographic factors so readers can judge the likelihood of representativeness
Purposive sampling
Using personal judgment to select sample that should be representative (i.e., this
faculty seems to represent all teachers) OR selecting those who are known to
have needed info (interested in talking only to those in power)
Snowball is a type (used with hard to identify groups such as addicts)
Chapter 6 - Continued
Sampling
Sample size affects accuracy of representation
Larger sample means less chance of error
Minimum is 30; upper limit is 1,000 (see table)
External validity – how well sample generalizes to the population
Representative sample is required (not the same thing as variety in a sample)
High participation rate is needed
Multiple replications enhance generalization when nonrandom sampling is used
Ecological generalization (gen to other settings/conditions, such as using a method
tested in math for English class)

Chapter 7
Instrumentation
(Measurement)
Data – information researchers obtain about subjects
Demographic data are characteristics of subjects such as age, gender, education
level, etc.
Assessment data are scores on tests, observations, etc. (the device used to measure
these is called the measurement instrument)
Key questions in data measurement/ instrumentation
Where and when will data be collected
How often will data be collected
Who will collect the data
Chapter 7 - Continued
Instrumentation
Validity – measures what it is supposed to (accurate)
Reliability – a measure that consistently gives same readings
(repeatable)
Objectivity – absence of subjective judgments (need to eliminate
subjectivity in measuring)
Usability of instruments
Consider ease of administration; time to administer; clarity of directions;
ease of scoring; cost; reliability/validity data availability
Chapter 7 - Continued
Instrumentation
(Classifying Data Collection Instruments)
By the group providing the data
Researcher instruments (researcher observes student performance and records)
Subject instruments (subjects record data about themselves, such as taking test)
Others/Informants (3rd party reports about subjects such as teacher rates students)
By where instrument came from
Preference is for existing ones (www.ericae.net, MMY)
Can develop your own (requires time, effort, skill, testing; see p. 125)
By response type
Written response – preferred – objective tests, rating checklist
Performance instruments – measure procedure, product

Chapter 7 - Continued
Instrumentation
(Examples of Data Collection Instruments)
Researcher Completed Instruments
Rating scales (mark a place on a continuum for example numeric rating 1=poor to 5=
excellent)
Interview schedules (complete scales as interview takes place; use precoding; beware
of dishonesty)
Tally sheets (for counting/recording frequency of behavior, remarks, activities, etc.)
Flow charts (to record interactions in a room)
Anecdotal records (need to be specific and factual)
Time/Motion logs (record what took place and when)
Chapter 7 - Continued
Instrumentation
(Examples of Data Collection Instruments)
Subject Completed Instruments
Questionnaires (question clarity to reader essential)
Self checklists
Attitude scales (Likert is one type, how much subject agrees/disagrees with
descriptive statements about a topic indicates a positive/negative attitude toward
topic)
Semantic differential (good/bad; poor/excellent ratings)
Personality profiles
Achievement/Aptitude tests
Performance tests
Projective devices (Rorschach Ink Blot Test)
Sociometric devices (peer ratings)
Chapter 7 - Continued
Instrumentation
Item Formats
Selection items or closed response (T/F; Yes/No; Right/Wrong; Multiple choice)
Supply items or open ended (short answer; essay)
Unobtrusive measures (no intrusion into event… usually direct observation and
recording)
Types of Scores
Raw scores (initial score or count obtained…w/out context)
Derived scores (raw scores translated to meaningful usage with standardized process)
Age/Grade equivalence; Percentile ranks; Standard scores (how far a score is from a given
reference point, e.g., z and T scores);
Which to use depends on the purpose; usually standard scores used
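
As an illustration of derived scores, the sketch below converts raw scores to z scores (distance from the mean in standard deviation units) and T scores (z rescaled to a mean of 50 and a standard deviation of 10). The raw scores are made up, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption made for the example.

```python
import statistics

def z_scores(raw_scores):
    """z = (raw score - mean) / standard deviation."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)   # assumption: population SD
    if sd == 0:
        raise ValueError("all scores are identical; z scores are undefined")
    return [(x - mean) / sd for x in raw_scores]

def t_scores(raw_scores):
    """T = 50 + 10z, so the mean is 50 and the SD is 10."""
    return [50 + 10 * z for z in z_scores(raw_scores)]

raw = [55, 60, 65, 70, 75]   # hypothetical raw test scores
z = z_scores(raw)
t = t_scores(raw)
```

A raw score equal to the group mean (65 here) gets z = 0 and T = 50, which is what gives the derived score meaning without further context.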
Chapter 7 - Continued
Instrumentation
Norm Referenced v. Criterion Referenced Tests
Norm referenced scores give a score relative to a reference group (the
norm group)
Criterion referenced scores determine if a criterion has been mastered
These are used to improve instruction since they indicate what students can
or cannot do or do or do not know
Chapter 7 - Continued
Instrumentation
(Measurement Scales)
Nominal (in name only)
Numbers are only name tags; they have no mathematical value (gender: 1=male and 2=female OR
race: 1=Black, 2=White, 3=other)
Ordinal (in name, plus relative order)
Numbers show relative position, but not quantity (grade level, finishing place in a race)
Interval (in name w/ order AND equal distance)
Numbers show quantity in equal intervals, but an arbitrary zero (can have negative numbers; degrees
C or F)
Ratio (in name, w/ order, eq. distance AND absolute zero)
Numbers show quantity with base of zero where zero means the construct is absent
Higher levels more precise…collect data at highest level possible; some statistics only
work with higher level data
Chapter 7 - Continued
Instrumentation
(Preparing for Data Analysis)
Scoring data – use exact same format for each test and describe
scoring method in text
Tabulating and Coding – carefully transfer data from source
documents to computer
Give each test an ID number
Any words must be coded with numerical values
Report codes in text of research report

Chapter 8
Validity and Reliability
(Quality of instruments is important)
Validity is most important aspect of measures
Means accuracy, correctness, usefulness of instrument
Validation is the process of collecting and analyzing evidence to support
inferences based on an instrument
Test publishers usually give a statement of intended use as well as
evidence to support validity
Reliability (consistency in scoring) is part of validity
Chapter 8 - Continued
Validity and Reliability
(Three ways to establish validity)
Content validity – is entire content of construct covered by test, are
important parts emphasized?
Established by expert judgment
Face validity is part of this
Criterion validity – is there consistency between the instrument and some
predicted or concurrent criterion?
Established by empirical evidence using validity coefficient (-1 to +1 scores)
Correlate scores of the test with the criterion (SAT and GPA in college)
Chapter 8 - Continued
Validity and Reliability
(Three ways to establish validity)
Construct validity – Does the measure correctly identify those with
different levels of the construct
Established with empirical evidence
Correlate scores on test with known indicator of the construct (prisoners
score low on test of ethics)
Validity problems come from systematic error (also known as
bias…something the researcher did wrong)

Chapter 8 - Continued
Validity and Reliability

Reliability means that scores are consistent from one time
measuring to the next
Can have a reliable measure that may not be valid
Must be reliable to be valid
See p. 166, target shooting
Errors of measurement – there is always some variation from
measure to measure
Look at reliability coefficient to determine reliability
Chapter 8 - Continued
Validity and Reliability
(Three ways to establish reliability)
Test/Retest – give the same test (of enduring trait) to the same
people at two times and correlate the scores
Equivalent forms – give two parallel forms of a test to the same
people and correlate scores
Internal consistency – several methods
Split halves (score two halves of test and correlate scores)
KR-21 and Cronbach Alpha – Correlate each item to overall score
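
A rough sketch of one internal-consistency estimate, Cronbach's alpha, using the usual variance form of the formula: alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. This formula is supplied here for illustration and is not spelled out in these notes; the response data are hypothetical.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one row per subject, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_columns = list(zip(*item_scores))
    item_variances = [statistics.pvariance(col) for col in item_columns]
    # Variance of each subject's total score across all items
    total_variance = statistics.pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 5 subjects x 3 items on a 1-5 scale
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items rank subjects in much the same way as the total score does.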

Chapter 8 - Continued
Validity and Reliability

Standard Error of Measurement – variations in measurement result
in some error which is reported
Scoring Agreement – for subjective tests or direct observations
(check of internal reliability)
Validity and Reliability should be addressed in all research
(including qualitative)
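
For the standard error of measurement mentioned on this slide, a common classical formula is SEM = SD × sqrt(1 - reliability). The notes do not give the formula, so treat the sketch below as one standard way to compute it; the scores and the reliability coefficient of .91 are made up.

```python
import math
import statistics

def standard_error_of_measurement(scores, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    sd = statistics.pstdev(scores)   # assumption: population SD
    return sd * math.sqrt(1.0 - reliability)

scores = [55, 60, 65, 70, 75]   # hypothetical test scores
sem = standard_error_of_measurement(scores, reliability=0.91)

# A rough band around one observed score of 70: observed score plus or minus one SEM
low, high = 70 - sem, 70 + sem
```

The more reliable the instrument, the smaller the SEM and the narrower the band around any one observed score.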

Chapter 9
Internal Validity
(The IV really caused a change in the DV)
Threats
Subject characteristics/selection bias – when subjects in study or in
treatment/control groups differ from each other (on age, gender, ability, etc.)
Loss of subj/Mortality – must address question of whether those dropping
out are different than those not
Location/Experiment variables – characteristics of the school, classroom,
etc. may interfere with the cause/effect relationship (keep constant
for both groups)

Chapter 9 - Continued
Internal Validity
(The IV really caused a change in the DV)
Threats (continued)
Instrumentation – need constant application and scoring of instruments
Instrument decay – when scoring varies due to fatigue
Data collector characteristics (age, gender, etc.) influence results … use same
collector or randomly assign
Data collector bias – unconscious or conscious distortion of data (use single or
double blind technique)
Testing – pretest sensitization can occur or subjects can figure out
acceptable answers
Chapter 9 - Continued
Internal Validity
(The IV really caused a change in the DV)
Threats (continued)
History – an external occurrence that interferes with relationship between
IV and DV
Maturation – changes in relationship between IV and DV due to passage
of time/growth of subj
Attitudes of Subjects – Hawthorne or guinea pig effects, novelty effects
and demoralization may occur
Regression (toward the mean) – Low scorers do better in subsequent
tests; high scorers do worse
Implementation – experiment differs for groups

Chapter 9 - Continued
Internal Validity
(The IV really caused a change in the DV)
How to minimize threats:
Standardized conditions
Collect and report demogr characteristics of subj
Identify/report details of study
Select a design to minimize effects (true randomized experimental
designs are best)
See page 189, Fig. 9.10 for threats summary

Chapter 13
Experimental Research
Most powerful design
Used to establish cause and effect by manipulating (influencing)
an IV (independent variable, aka treatment or experimental
variable) to see its effect on a DV (dependent variable, aka
criterion or outcome variable)
Goes beyond description and prediction

Chapter 13 - Continued
Experimental Research
(Characteristics of Experimental Research)
Comparison of groups (at least two groups of subjects, called treatment and control
groups)
Manipulation of the IV (experimenter changes something for the treatment group that’s
different than the control group)
Randomization (true experiments require random assignment into treatment/control
conditions…after random selection of subjects to participate in study)
Assignment takes place at start of experiment
Do not use already formed groups
Groups should be equivalent (any differences due to chance)
Randomization eliminates threats from extraneous variables
Groups must be sufficiently large to be equivalent
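
Random assignment can be sketched as a shuffle followed by a split. This is a minimal illustration with 20 hypothetical subject IDs; the function name and the even split into two groups are assumptions made for the example.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects, then split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs S01..S20
subjects = [f"S{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(subjects, seed=42)
```

Because chance alone decides who lands in which group, any remaining differences between the groups are due to chance, and they tend to shrink as the groups get larger.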

Chapter 13 - Continued
Experimental Research
(Control of Extraneous Variables)
All extraneous variables must be controlled to eliminate threats to
validity/rival hypotheses
Ensure groups are equivalent to begin using randomization
Hold certain variables constant (e.g., age, IQ) or build them into the design
Use matching when necessary
Use subjects as their own controls (treat same group first in control condition then in
treatment OR use pre-test/posttest on same group)
Use analysis of covariance to statistically equate nonequivalent groups
Chapter 13 - Continued
Experimental Research
(Group Designs)
Weak Designs
One Shot Case Study (X O)
One group exposed to treatment then DV is measured
No controls
Example: Try new teaching method then see how students do on post test
One Group Pretest-Posttest Design (O X O)
Adds a pretest but no control group
Static-Group Comparison Design
X1 O
X2 O
Need control for different subject characteristics
Static Group Pretest/Posttest Design (adds a pretest)
Chapter 13 - Continued
Experimental Research
(Group Designs)
True Experimental Designs
Randomized Posttest Only Design (random assignment to treatment/control, then posttest)
R X1 O
R O
Randomized Pretest/Posttest Control Group Design (controls history, maturation, etc.)
R O X1 O
R O X2 O

Randomized Solomon 4-Group Design combines the above two (eliminates testing
threat; problem is number of subjects needed)
Random Assignment w/ Matching
Match pairs on factors that influence DV then randomly assign to treatment or control (subjects
limited by no match elimination)
Statistical matching can be done using predicted scores

Chapter 13 - Continued
Experimental Research
(Group Designs)
Quasi Experimental Designs
Matching only – different from random assignment w/ matching (uses existing groups)
Match subjects in treatment and control groups on known extraneous variables
If possible, use multiple groups, and randomly assign them
Counterbalanced – Each group exposed to all the same treatments but in different
order
Time series – Repeated treatments and observations over a period of time (both
before and after treatment)
Factorial designs – Multiple IVs or DVs investigated simultaneously (e.g., look for
interactions between 2 IVs)

Chapter 13 - Continued
Experimental Research
(Controlling Threats to Internal Validity)
See Table 13.1, p. 284 for advantage/disadv. of each design
To evaluate the likelihood of a threat to internal validity in experiments ask:
What are the known extraneous factors?
Do the groups differ on them?
How were they controlled?
Researchers need tight control for experiments to be successful
See pp. 288-289 questions to evaluate published article
See evaluation of selected article on pp. 290-299

Chapter 15
Correlation Research
(Predicting Outcomes Through Association)
Correlational research involves study of existing relationships
between two variables
Descriptive in nature
Often a precursor to experimental research
Positive correlation is Hi/Hi and Lo/Lo (coeff. +r)
Negative correlation is Hi/Lo and Lo/Hi (-r)
Purpose is to explain relationships or to predict outcomes

Chapter 15 - continued
Correlation Research
(Predicting Outcomes Through Association)
Explanatory studies examine relationship to identify possible cause/effect
Relationship might or MIGHT NOT mean causation
For causation: 1) A before B; 2) A and B related; 3) Rule out other causes of B (need
experiment)
Prediction studies identify predictors of criteria (e.g., HS GPA and college
GPA)
Scatterplots with a regression line/equation predict scores numerically
The stronger the correlation the better the prediction
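
A small sketch of prediction with a regression line, fitted by ordinary least squares (slope = sum of cross-products / sum of squared x deviations; intercept = mean of y minus slope × mean of x). The GPA figures are invented for illustration, and the function names are not from the text.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares regression line."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor does not vary")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(slope, intercept, x_new):
    """Predicted criterion score for a new predictor score."""
    return intercept + slope * x_new

# Hypothetical data: high school GPA (predictor) and college GPA (criterion)
hs_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
college_gpa = [1.8, 2.4, 2.7, 3.3, 3.6]
slope, intercept = fit_line(hs_gpa, college_gpa)
predicted = predict(slope, intercept, 3.2)
```

The prediction is only as good as the correlation behind it: with a weak relationship, actual scores scatter widely around the line.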

Chapter 15 – continued
Correlation Research
(Predicting Outcomes Through Association)
Complex Correlation Techniques, such as multiple regression allow use of
several predictors for one criterion
Coefficient of multiple correlation (R) gives strength of correlation between predictors
and criterion
Coefficient of determination (r²) is the amount x and y vary together (proportion of shared variance)
Discriminant function analysis is for non-quantitative criterion (predict which group
someone will be in)
Other techniques also used (factor analysis, path analysis, structural modeling)

Chapter 15 - continued
Correlation Research
(Steps in the process)
Problem selection – usually asks whether x and y are related, or how well p
predicts c
Sample – random selection of at least 30
Measurement – need quantitative data
Design/Procedures – need two measures on each subject
Data collection – usually both measures close in time
Data analysis – correlation coefficient, r, and plot (r is -1 to +1, and the
closer to plus or minus 1, the stronger the relationship)
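
The data analysis step can be sketched with the Pearson product-moment formula, r = sum of cross-products / sqrt(sum of squared x deviations × sum of squared y deviations). The two measures below are made up (hours studied and a test score for each of six subjects), and the function name is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two measures on the same subjects."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 subjects")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: two measures on each of six subjects
hours_studied = [1, 2, 3, 4, 5, 6]
test_score = [52, 55, 61, 60, 68, 71]
r = pearson_r(hours_studied, test_score)
```

A real study would use at least 30 subjects, as noted in the sampling step; six are used here only to keep the example short.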

Chapter 15 - continued
Correlation Research
(Interpreting Correlation Coefficients)
General guidelines:
+.75 to +1.0 Very strong relationship
+.50 to +.75 Moderately strong relationship
+.25 to +.50 Weak relationship
+.00 to +.25 Low to no relationship
Need .5 or better for prediction of any use, and .65 for accurate
predictions
Reliability coefficients should be .7 up
Validity coefficients should be .5 up
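
The guidelines above can be written as a simple lookup. Two assumptions are made for this sketch: the labels are applied to the absolute value of r (so -.80 is treated as being as strong as +.80), and a value on a boundary such as .75 goes in the higher band, since the listed ranges overlap at their edges.

```python
def describe_correlation(r):
    """Label the strength of a correlation coefficient using the slide's bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)   # assumption: strength depends on magnitude, not sign
    if size >= 0.75:
        return "very strong"
    if size >= 0.50:
        return "moderately strong"
    if size >= 0.25:
        return "weak"
    return "low to no"

def useful_for_prediction(r):
    """Per the guideline above, .5 or better is needed for prediction of any use."""
    return abs(r) >= 0.5

labels = [describe_correlation(value) for value in (0.8, -0.6, 0.3, 0.1)]
```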
Chapter 15 - continued
Correlation Research
(Threats to Internal Validity in Correlation Research)
Remember correlation is not causation (lurking variables)
Subject characteristics – may get different correl w/ different ability levels, gender, etc.
(can control with partial correlation)
Location – testing conditions can impact results
Instrumentation problems – helps to standardize instrument and data collection for both
groups
Testing – pretest interference and sensitization possible
Mortality – be careful if have large loss from one group being tested
Chapter 15 - continued
Correlation Research
(Questions to ask to avoid threats to internal validity)
What factors could affect the variables being studied?
Does any factor affect BOTH variables? (this is where threats occur)
Figure a way to control any lurking variables

Chapter 16
Causal Comparative Research
(Ex Post Facto)
Determines cause (or effect) that has occurred and looks for effect (or
cause) from it
Start w/ differences in groups and examine them
Example: difference in math abilities of male/female students
No random assignment to treatment (it already occurred)
Associational like correlation but primarily interested in cause/effect
IV either cannot (ethnicity) or should not (smoking) be manipulated
Chapter 16 - continued
Causal Comparative Research
(Ex Post Facto)
Often an alternative to experimental (faster and cheaper)
Serious limitation is lack of control over threats to internal validity
Need to remember the cause may be the effect; they may only be
related and there is some other variable that is the cause
(lurker)
Remember three canons of causation
Chapter 16 - continued
Causal Comparative (CC) Research
(CC versus Correlational Research)
Both are associational (looking for relationship)
Both are often prelude to experiments
Neither involves manipulation of variables
CC works with different groups; correl examines one group on
different variables
Correlation is measured w/ coefficient while CC compares
means/medians/percents of group members
Chapter 16 - continued
Causal Comparative (CC) Research
(CC versus Experimental Research)
Both compare group scores of some type
In experimental the IV is manipulated, but not in CC (already took
place)
CC does not provide as strong evidence as experimental for cause
and effect
Chapter 16 - continued
Causal Comparative (CC) Research
(Steps in CC Research)
Problem formation – identify phenomena and look for causes or consequences of it
Sometimes several alternate hypotheses investigated
Sample – define (operationally) characteristics of study carefully, then select individuals
who possess them
Groups should be homogeneous in regard to several important variables (to control for them as
causes) then match control/exp groups on one or more variables (smoking study matched on 19
variables)
Instruments – use any type to compare the groups
Design – basic CC involves 2 or more grps that differ on variable of interest (basic design
is: one group possesses trait (athlete), other doesn’t; compare DV (GPA))
Chapter 16 - continued
Causal Comparative (CC) Research
(Threats to Internal Validity in CC Research)
Subject characteristics – since don’t select subjects and form groups, there
may be unidentified lurking variables
Can use matching to control for any identified differences, but limits sample size
Can find or create homogeneous groups (for example compare only high GPA
students to other high GPA students) on attitudes toward x
Statistical matching – adjusts posttest scores based on some initial difference
Other threats – location, instrument, history, maturation, loss of subjects
can be concerns
Need to control as many as possible to eliminate alternate hypotheses
Chapter 16 - continued
Causal Comparative (CC) Research
(Evaluating threats to Internal Validity in CC Research)
Questions to ask
What factors are known to affect the variable being studied?
What is the likelihood the comparison groups differ on these factors?
How well did the design identify and control for these?
For example consider subject characteristics such as socioeconomic status, gender, ethnicity, job
skills; mortality rates in groups; location (schools differ); instrument (different data collectors
and/or biases)
Data Analysis in CC – often compare means of groups; with 2 categorical
variables, use crosstabs (crossbreak tables) to compare percents by groups
Text gives example study

Chapter 17
Survey Research
(Used to describe what people think/do/believe)
Types
Cross sectional provide a snapshot in time
Longitudinal collect data at different points in time to study changes over
time
Trend study - random sample each year on same topic
Cohort study - sample from same cohort members year after year
Panel study - same individuals surveyed year after year (mortality a problem over
long time periods)
Often surveys are the data collection instrument in correlation (or
cc/exp’l) studies

Chapter 17 - Continued
Survey Research
(Steps to conduct survey research)
Define the problem
Needs to be important enough respondents will invest their time to
complete it
Must be based on clear objectives
Identify the target population
Defined by sample unit or unit of analysis
Unit can be a person, school, classroom, district, etc.
Survey a sample or do a census of the population

Chapter 17 - Continued
Survey Research
(Steps to conduct survey research)
Methods of data collection
Direct administration to a group (such as at a meeting) - good response rate, limited
generaliz.
Mail survey (inexpensive way to get large amount of data from widespread pop) -
lower response rates, not in-depth info, illiterate missed
Telephone survey (cheap/fast) - response rates higher due to encouragement (“I’m
not selling…”); miss some pop members, interviewer bias possible
Personal interviews (face-to-face has good response rate but time and cost high) -
lack anonymity, interviewer bias

Chapter 17 - Continued
Survey Research
(Steps to conduct survey research)
Select the sample (randomly, but check to see respondents are
qualified to answer)
Pilot test can indicate likely response rate and problems with data
collection or sample
Prepare instrument (questionnaire and interview schedule)
Appearance important - look short and easy
Clarity in questions is essential
Chapter 17 - Continued
Survey Research
(Steps to conduct survey research)
Question types (same questions need to be asked of all
respondents)
Closed ended (multiple choice) - easier to complete, score, analyze
Categories must be all inclusive, mutually exclusive
Open ended - easy to write, hard to analyze and hard on respondents
See examples p. 403

Chapter 10
Descriptive Statistics
(Tools to summarize data)
Descriptive statistics describe many scores with just one or two indices
(such as mean or median)
Sample of a pop is described w/ indices called statistics
Entire pop is described w/ indices called parameters
Types of data (words or numbers)
Quantitative data – scales measure how much (test scores, amount of money spent,
etc.)
Interval, Ratio, and sometimes Ordinal, variables
Categorical data – total number of objects in a category (ethnicity, gender, etc.)
Nominal and sometimes Ordinal, variables

Chapter 10 - Continued
Descriptive Statistics
(Summarizing Quantitative Data)
Frequency distributions or tables show the layout of the data (see
text example p. 201)
Frequency polygons – shows where most scores are and how spread out
data are
Pay attention to shape (positive, negative skews)
Normal curves – smoothed polygons – most scores in the center, fewer in the tails
– many variables follow a normal shape (height, weight, age, etc.)
Normal curves are the foundation for inferential statistics
Chapter 10 - Continued
Descriptive Statistics
(Summarizing Quantitative Data)
Averages – measures of central tendency
Three indices tell what is a typical score
Mode – most frequent score
Median – middle score (50th percentile)
Mean – takes into account all scores
Which to use depends on what you are trying to show
See example pp. 205/206
Spreads – measures of variation or dispersion
Three indices tell how closely scores cluster together
Range (highest – lowest); a crude indicator of spread
Standard deviation (roughly the typical distance of each point from the mean)
Smaller SD means less spread out, larger one means more spread out
Quartiles, percents, IQR, boxplots
SD and normal curves…68/95/99.7 rule
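
A minimal Python sketch of the center and spread indices listed above, using only the standard library. The scores are invented for illustration and do not come from the text.

```python
import statistics

# Hypothetical test scores, made up for illustration only.
scores = [70, 75, 80, 80, 85, 90, 95]

mode = statistics.mode(scores)            # most frequent score
median = statistics.median(scores)        # middle score
mean = statistics.mean(scores)            # takes every score into account
score_range = max(scores) - min(scores)   # highest minus lowest
sd = statistics.pstdev(scores)            # SD, treating these scores as the whole group

print(mode, median, round(mean, 2), score_range, round(sd, 2))
```

Here the mode and median agree, while the mean is pulled slightly higher by the top scores, which is why the choice of index depends on what you are trying to show.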

Chapter 10 - Continued
Descriptive Statistics
(Summarizing Quantitative Data)
Standard scores and the normal curve
Standard scores use a common scale for all scores
z scores are simplest – tell how far from the mean in SD units
Score on mean then z=0; score 1 SD above then z=1.0; 1SD below then z=-1.0,
etc.
Use mean and SD to calculate z scores so you can compare apples/oranges (p.
210)
z = (any score – mean) / standard deviation
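
The z score calculation can be sketched in a few lines of Python. The function name and the two test scales below are hypothetical, chosen only to show how scores from different scales become comparable.

```python
def z_score(score, mean, sd):
    """Return how many SDs a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Hypothetical example: one student, two tests with different scales.
z_math = z_score(80, mean=70, sd=10)     # 1 SD above the math mean
z_reading = z_score(60, mean=50, sd=5)   # 2 SDs above the reading mean

print(z_math, z_reading)
```

Although the raw math score is higher, the reading score is further above its own mean, which is the kind of apples/oranges comparison z scores allow.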

Chapter 10 - Continued
Descriptive Statistics
(Summarizing Quantitative Data)
Probability based on z scores
The whole area under the normal curve equals 100% of the scores
A z-table gives percent of scores from any score to the mean (Appendix, pp. A-4/5)
The probability for getting higher or lower than any given score can then be
calculated
T-scores are often used because negative z scores awkward (all T-scores are
positive)
Multiply z times 10, then add 50 (p. 212 Table 10.15)
Standard test scores often given with T-scores and percents above/below the given
score
Note…use z and T scores only with NORMAL distributions!
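
As a rough sketch, the T-score conversion and the percent of scores below a given z can be computed with Python's standard library instead of a printed z-table. The function names are illustrative, and the percent calculation assumes a normal distribution.

```python
from statistics import NormalDist

def t_score(z):
    """Convert a z score to a T-score: multiply by 10, then add 50."""
    return 10 * z + 50

def percent_below(z):
    """Percent of scores below z, assuming a normal distribution."""
    return 100 * NormalDist().cdf(z)

print(t_score(-1.5))                  # a negative z becomes a positive T-score
print(round(percent_below(1.0), 2))   # percent of scores below z = 1.0
```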

Chapter 10 - Continued
Descriptive Statistics
(Summarizing Quantitative Data)
Correlation examines relationships between two quantitative
variables (interval/ratio data)
Scatterplot shows the relationship visually
Use it to check for pattern in data (hi/hi or hi/lo?)
If linear pattern, can use Pearson’s r coefficient
Use it to look for strength (scatteredness)
Pay attention to outliers (p. 215/216 examples)
Correlation coefficient is a numerical indicator of the strength of the
relationship
Pearson’s product-moment coefficient (r) is for linear data (-1 to +1)
Eta is for curved data
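
A minimal sketch of Pearson's r computed by hand in Python, so each step of the calculation is visible. The hours/scores data are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)

print(round(r, 2))
```

A value this close to +1 indicates a strong hi/hi pattern, but the scatterplot should still be checked for curvature and outliers before relying on r.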

Chapter 10 - Continued
Descriptive Statistics
(Summarizing Categorical Data)
Frequency tables
Give percents for ease in interpreting
Crossbreak or crosstabulations for relationships (IV goes on the
side, then give row percents)
Bar charts and pie charts used
Bars for ordered categories
Pies for unordered categories
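
A crossbreak table with row percents can be sketched in Python as follows. The records and category names are hypothetical; the grouping variable (the IV) forms the rows.

```python
from collections import Counter

# Hypothetical records of (group, outcome), invented for illustration.
records = [
    ("athlete", "pass"), ("athlete", "pass"), ("athlete", "fail"),
    ("non-athlete", "pass"), ("non-athlete", "fail"),
    ("non-athlete", "fail"), ("non-athlete", "fail"),
]

def crossbreak_row_percents(records):
    """Return {group: {outcome: percent of that group's row}}."""
    cell_counts = Counter(records)
    row_totals = Counter(group for group, _ in records)
    table = {}
    for (group, outcome), count in cell_counts.items():
        table.setdefault(group, {})[outcome] = 100 * count / row_totals[group]
    return table

table = crossbreak_row_percents(records)
for group, row in table.items():
    print(group, {outcome: round(pct, 1) for outcome, pct in row.items()})
```

Row percents make groups of different sizes directly comparable, which raw counts do not.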
Chapter 11
Inferential Statistics
Inferences about a population based on data from a sample
Answers questions about how likely a sample is to represent some
parameter about a population
Inferential test used depends on the level of data (quantitative or
categorical)
Chapter 11 - Continued
Inferential Statistics
(The logic of inferential statistics)
Sampling error
Samples differ from their parent populations (no two samples are the same)
Difference is called sampling error
Distribution of sampling means (the sampling distribution)
The means of a large collection of random samples (each of at least 30) follow a normal curve pattern
Its mean (mean of means) is the mean of the population
Its SD (SD of means) is the standard error of the mean (SEM)
Chapter 11 - Continued
Inferential Statistics
(The logic of inferential statistics)
Standard error of the mean (SEM)
It’s the SD of the sampling distribution
Since distribution is normal, then ±1 SEM has 68% of cases; ±2 SEM has 95%; ±3 SEM
has 99.7%
Once we can estimate the mean and SD of the sampling distribution can determine how likely it is
that a particular sample mean came from that population
e.g., if the sampling distribution has mean=100 and SEM=10 and we draw a sample with a mean of 110, yes, it could be from that pop…
but if we draw a sample with a mean of 140, it is most likely NOT from that pop…since it is +4 SEM from
the mean (almost zero probability)
Express means as z scores; a z score more than 2 SEM from the mean is going to occur less than 5%
of the time (2.5% each side)
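
The reasoning above can be sketched in Python: express a sample mean as a distance from the population mean in SEM units, then ask how often a mean that extreme would occur. The function names are illustrative, and the example assumes a sampling distribution with mean 100 and SEM 10.

```python
from statistics import NormalDist

def sem_units(sample_mean, pop_mean, sem):
    """Distance of a sample mean from the population mean, in SEM units."""
    return (sample_mean - pop_mean) / sem

def two_tailed_probability(z):
    """Chance of a sample mean at least this far from center, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z_110 = sem_units(110, 100, 10)   # 1 SEM above the mean: quite plausible
z_140 = sem_units(140, 100, 10)   # 4 SEM above the mean: very unlikely

print(z_110, round(two_tailed_probability(z_110), 3))
print(z_140, two_tailed_probability(z_140))
```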
Chapter 11 - Continued
Inferential Statistics
(The logic of inferential statistics)
Estimating the SEM
It is estimated from the SD of the sample, adjusted for sample size: SEM = SD / √(n – 1)
Confidence Intervals (CI)
Use the SEM to indicate boundaries
95% of the time a pop mean will be within ±2 SEM of the sample mean (actually
±1.96 SEM)
If sample mean IQ=85 (& SEM=2) then 95% of the time the pop mean IQ will be
85 ± 1.96(2) or 85 ± 3.92, which is 81.08 to 88.92; 99% CI=79.84 to 90.16
Can be 95% confident that true pop mean is 81.08-88.92
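
A short Python sketch of the SEM estimate and the confidence interval arithmetic from the IQ example. The function names are illustrative; the 2.58 multiplier for the 99% interval is the standard normal value, which the slide's 99% bounds imply but do not state.

```python
import math

def estimate_sem(sample_sd, n):
    """Estimate the SEM from a sample SD: SD / sqrt(n - 1)."""
    return sample_sd / math.sqrt(n - 1)

def confidence_interval(sample_mean, sem, z=1.96):
    """Return (low, high) bounds: sample mean plus or minus z * SEM."""
    margin = z * sem
    return sample_mean - margin, sample_mean + margin

low95, high95 = confidence_interval(85, 2)           # 95% CI
low99, high99 = confidence_interval(85, 2, z=2.58)   # 99% CI

print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

The 99% interval is wider than the 95% interval: more confidence costs precision.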

Chapter 11 - Continued
Inferential Statistics
(The logic of inferential statistics)
Probability is a predicted occurrence such as 5 in 100 times (5% or .05)
In previous example, the probability of the population mean being outside the 95% CI
(of 81.08 to 88.92) is 5%
Usually comparing more than one mean
Examine difference in 2 sample means to see how likely the difference in the
sample is to represent a true difference in the population…is it due to a true
difference in the pop or only due to sampling error
The SEM of the difference between sample means, called the SED or standard error of
the difference, is used, and w/in ±1 SED is 68%; ±2 SED is 95%; ±3 SED is 99.7%

Chapter 11 - Continued
Inferential Statistics
(Hypothesis Testing)
A hypothesis is a predicted relationship
Usually comparing means, proportions, or looking for correlations
between groups
The heart of infer. stats…is the relationship found in the sample most
likely due to a relationship in the pop, or just due to random sampling
error?
The null hypothesis is stated and tested
THE NULL ALWAYS SAYS THERE IS NO RELATIONSHIP OR
DIFFERENCE!!!
Chapter 11 - Continued
Inferential Statistics
(Hypothesis Testing)
Research hypothesis is what you really think is going on; opposite of the null
Example of hypothesis test
H0 (null) is that mean1=mean2, meaning the mean scores are equal OR the difference
between the mean scores is 0
The distribution for a difference of zero between the means is a normal curve
centered on zero
As diff between means gets larger, meaning further from the center (in SED units),
the more likely it is to represent a true diff in the pop means
If the prob is .05 or less, reject null…called a statistically significant difference (some
fields use .01 or .001)
Chapter 11 - Continued
Inferential Statistics
(Hypothesis Testing Process)
State the research hypothesis (Ha or Hr)
State the null (H0) (Remember NO)
Obtain the sample statistics (means, proportions, correlations)
Determine the probability of getting the sample results just by chance if the null is
true
Small probability (p<.05) means reject null; there is a significant difference (or
correlation) in pop.
Large probability (p>.05) means do not reject; there is no significant difference (or
correl) in pop.
Note: Just because finding is statistically significant does not mean it is a practical
difference (given a large enough sample most are significant)
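
The testing process above can be sketched in Python for the difference between two sample means. This is a simplified large-sample version, not the text's own procedure: it assumes independent samples and uses the standard formula SED = √(SEM1² + SEM2²), which the slides do not state. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def standard_error_of_difference(sem1, sem2):
    """SED for two independent samples (assumed formula, see lead-in)."""
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

def compare_means(mean1, mean2, sem1, sem2, alpha=0.05):
    """Return (z, p, reject_null) for H0: the population means are equal."""
    sed = standard_error_of_difference(sem1, sem2)
    z = (mean1 - mean2) / sed                 # difference in SED units
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed probability under H0
    return z, p, p < alpha

# Hypothetical group results, invented for illustration.
z, p, reject = compare_means(85, 80, sem1=1.5, sem2=2.0)
print(round(z, 2), round(p, 3), reject)
```

Rejecting the null here says only that the difference is unlikely to be sampling error; whether a 5-point difference matters in practice is a separate judgment.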
Chapter 11 - Continued
Inferential Statistics
(Hypothesis Testing)
One tailed versus two tailed tests
When literature strongly indicates the need for directional hypothesis
then do a one-tail
In a one tail all 5% is on one side (2-tailed cutoff is 1.96SD while 1 tailed
cutoff is 1.65)
Type I (alpha) versus Type II error
See Figure 11.16, p. 240
Type I – reject a true null; Type II – fail to reject (accept) a false null
Inversely related errors
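
The one-tailed and two-tailed cutoffs can be recovered from the normal curve with a few lines of Python, as a quick check on the 1.96 and 1.65 values. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, SD 1

# Two-tailed: alpha is split, 2.5% in each tail.
two_tailed_cutoff = standard_normal.inv_cdf(1 - alpha / 2)
# One-tailed: all 5% sits in one tail, so the cutoff is closer to the center.
one_tailed_cutoff = standard_normal.inv_cdf(1 - alpha)

print(round(two_tailed_cutoff, 2), round(one_tailed_cutoff, 2))
```

The lower one-tailed cutoff makes it easier to reject the null in the predicted direction, which is why a directional hypothesis needs strong support from the literature.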
Chapter 11 - Continued
Inferential Statistics
(Inference Techniques)
Parametric tests (for quantitative I/R data from normal distributions of sample size 30+)
t-tests compare means of two groups (can be independent or correlated/paired samples)
ANOVA tests compare means of two or more groups (use post hoc)
Correlations t-test (with computers just use significance of r)
Nonparametric tests (for categorical data and I/R from non-normal pops or small
samples)
Mann Whitney U compares ranks of two groups
Kruskal Wallis Oneway ANOVA compares ranks of two plus groups
Chi-square test (compares proportions)
Power of tests – use parametrics and increase sample size
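
As an illustration of the chi-square test for proportions, the statistic can be computed by hand in Python from a table of observed counts. The counts are invented, and the critical value 3.84 (df=1, .05 level) is the standard table value, not one taken from these slides.

```python
def chi_square(observed):
    """Chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: rows are groups, columns are yes/no responses.
observed = [[30, 20], [20, 30]]
stat = chi_square(observed)

print(stat, stat > 3.84)  # compare with the df=1 critical value at .05
```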

Chapter 12
Statistics in Perspective
Approaches to research
Either 2 or more groups compared OR variables in 1 group studied AND data are
either categorical or quantitative
Comparing groups on quantitative data
Can compare freq distributions (histograms), measures of center, and measures of spread OR all
three
Interpretation – improves with experience…need to know when something statistically
significant is not practically significant
Calculate effect size - look at size of difference or delta Δ…if it is greater than .5,
practically significant
Use infer. stats judiciously, paying attention to size of diff. and sample size and method
it is based on
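
A minimal sketch of the effect size calculation in Python. It assumes delta is the difference between group means divided by the comparison group's SD; the slides give only the .5 rule of thumb, so the formula, function name, and numbers here are assumptions for illustration.

```python
def effect_size_delta(mean_experimental, mean_comparison, sd_comparison):
    """Difference between group means in comparison-group SD units (assumed formula)."""
    return (mean_experimental - mean_comparison) / sd_comparison

# Hypothetical group results, invented for illustration.
delta = effect_size_delta(78, 72, sd_comparison=10)

print(delta, delta > 0.5)  # above .5 suggests a practically significant difference
```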
Chapter 12 - continued
Statistics in Perspective
Relating variables within group w/ quant data
Scatterplot and correl coeff – examine plot carefully
Beyond significance pay attn to size of r and especially to r-squared
Examine how sample data collected
Comparing groups w/ categorical data
Use freq and percent in crossbreak tables
Look at summary stats carefully and pay attn to sample size
Relating variables within a group with categorical data – use one sample
chi-square

Chapter 12 - continued
Statistics in Perspective
Recap
Use graphics and numbers
Pay attention to outliers
Pay attention to magnitude of differences
Use inference tests for generalizing purposes and examine sampling
Use multiple techniques and CIs
