
Scatterplots and Graphs
Visualizing Relationships
Plots and graphs are good for displaying the DIRECTION of relationships.
o POSITIVE RELATIONSHIP (upward-sloping): an increase in the IV is associated with an increase in the DV.
o NEGATIVE RELATIONSHIP (downward-sloping): an increase in the IV is associated with a decrease in the DV.
Plots and graphs are also good for displaying the LINEARITY of relationships.
o Linear relationships
o Curvilinear or nonlinear relationships

Scatterplots
o Use when both variables are interval
o Show the individual observations as points on the graph

Using Graphs
You can also use bar and line graphs to make comparisons.
o Bar graphs = nominal or ordinal DV
o Line graphs = interval IV
o X (horizontal) axis = IV
o Y (vertical) axis = DV

Be consistent with axes.

Basics of Controlled Comparisons
Asks the question: How else?
Terminology
o X = independent variable
o Y = dependent variable
o Z = additional independent variable or control variable

X explains Y controlling for Z

Controlled Comparison Terms
Zero-order effect: the overall relationship between a causal variable and a dependent variable, before any control variable is taken into account.
Controlled comparison table (control table): a cross-tabulation of an IV and a DV for each value of a third (control) variable.
Controlled effect: the relationship between a causal variable and a dependent variable within one value of a third (control) variable.
Partial effect: the relationship between a control variable and a dependent variable within one value of the causal (X) variable.

What happens when we ask, "How else?"
Spurious relationship: the relationship between X and Y is caused entirely by a third (control) variable.
Additive relationship: a third (control) variable adds to the relationship between X and Y.
Interaction relationship: the relationship between X and Y depends on the value of the third (control) variable.

How to set up a control table
o Place DV into rows
o Decide the base category of the control
o Make columns for control
o Make columns for IV within columns of control
o Add data
o Calculate down
o Interpret across

Interpretation
Controlled effect:
o The relationship between party ID and gun control opinion, separately for men and women.
o Democratic women are 30.5 percentage points more likely to support more gun control than GOP women.
o Democratic men are 34.7 percentage points more likely to support more gun control than GOP men.
o Overall, Democrats are roughly 30 percentage points more likely to favor gun control than Republicans, controlling for gender.
Partial effect:
o The change in the DV due to the control variable at each value of the IV.
o Democratic women are 10.2 percentage points more likely to support more gun control than Democratic men.
o GOP women are 14.4 percentage points more likely to support more gun control than GOP men.
o Overall, women are about 10 percentage points more likely than men to favor gun control, regardless of party.
WHICH IS IT?
o Spurious
o Additive
o Interaction

IDENTIFYING THE PATTERN
1. After holding the control variable constant, does a relationship exist between the IV and DV within at least one value of the control variable? (i.e., is either controlled effect consistent with the bivariate cross-tabulation finding?)
o IF NO, THEN IT'S A SPURIOUS RELATIONSHIP. IF YES, GO TO (2).
2. Is the tendency (direction) of the relationship between the IV and DV the same at all values of the control variable? (i.e., are both controlled effects consistent with the bivariate cross-tabulation finding?)
o IF NO, THEN IT'S AN INTERACTION. IF YES, PROCEED TO (3).
3. Is the strength of the relationship between the IV and the DV the same or very similar at all values of the control variable? (i.e., are both controlled effects equally strong?)
o IF YES, THEN IT'S ADDITIVE. IF NO, INTERACTION.
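
The three questions above can be sketched as a small decision function. This is only an illustration: the function name and the two cutoffs (`min_effect` for "a relationship exists" and `max_gap` for "very similar strength") are assumptions, not values given in the notes. The controlled effects used in the example, 30.5 and 34.7, are the party ID differences for women and men from the gun control example.

```python
def classify_pattern(controlled_effects, min_effect=5.0, max_gap=5.0):
    """Classify a controlled comparison as spurious, additive, or interaction.

    controlled_effects: one IV-DV difference (in percentage points) per value
    of the control variable. min_effect and max_gap are assumed cutoffs.
    """
    # Step 1: does a relationship exist within at least one control value?
    if all(abs(effect) < min_effect for effect in controlled_effects):
        return "spurious"

    # Step 2: is the direction the same at all values of the control?
    all_positive = all(effect > 0 for effect in controlled_effects)
    all_negative = all(effect < 0 for effect in controlled_effects)
    if not (all_positive or all_negative):
        return "interaction"

    # Step 3: is the strength the same or very similar at all values?
    if max(controlled_effects) - min(controlled_effects) <= max_gap:
        return "additive"
    return "interaction"


# Controlled effects of party ID on gun control opinion: women, then men.
print(classify_pattern([30.5, 34.7]))
```

Under these assumed cutoffs the gun control example comes out as additive. A stricter `max_gap` would push the same numbers toward interaction, so the cutoff is a judgment call rather than a fixed rule.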

Another example
DV: vote choice (D or R)
IV: abortion opinion (permit or not)
Hypothesis:
Control: saliency of the issue to the voter

Means Comparison
Use when the DV is interval and the IV and control are categorical.
DV: feeling thermometer towards homosexuals
IV: egalitarianism scale (belief in social equality)
Hypothesis?
Control: age group

How to Construct a Mean Comparison Control Table
o IV makes the rows.
o CV makes the columns.
o DV reported as means inside the table.

Findings
What does the chart tell us?
o Egalitarianism on homosexual attitudes?
o Age on homosexual attitudes?
o Egalitarianism on homosexual attitudes, controlling for age?

A Final Example
DV: % of women in legislatures
IV: electoral system (PR, non-PR)
CV: cultural acceptance of women (low, high)
Hypothesis:

Findings
What does the chart tell us?
o Electoral system (Non-PR, PR) on women legislatures?
o Cultural acceptance (low, high) on women legislatures?
o Conditional pattern?
o A conditional change of 5 percentage points (from Non-PR to PR) for countries with low cultural acceptance of women.
o A conditional change of 11.4 percentage points (from Non-PR to PR) for countries with high cultural acceptance of women.

Foundations of Statistical Inference
CHAPTER 6
Key Terms
Significance (informally):
o A statistical probability indicating whether an observed relationship could have occurred by chance
Inferential statistics:
o A set of statistical procedures for assessing the relationship between an observed sample value and an unobserved population parameter
Population:
Sample:
Sample statistic: an estimate of the population parameter.
Population parameter: an unknown population characteristic.

Ole Miss Student Survey
STEP ONE: DEFINE THE POPULATION.
o What is the population? All students, right?
But what about . . .
o Graduate v. undergraduate?
o Part-time v. full-time students?
o Faculty and staff?
o Oxford campus v. De Soto/Tupelo v. Jackson Medical campus?

STEP TWO: DECIDE ON THE SAMPLING FRAME. We draw the sample and sample statistic from our population list (aka, a sampling frame). The sampling frame is the method for defining all population members. What is our sampling frame for our Ole Miss survey?

STEP THREE: DRAW A SAMPLE.
o How will you select your cases from the population to reduce selection bias?
Selection bias: the difference between the population parameter and the parameter estimate that results from some population members being more likely than others to be included in the sample.

STEP FOUR: DESIGN A SURVEY INSTRUMENT.
o How will you write questions to avoid measurement error?
o How will you avoid response bias (the bias occurring when some subjects in the sample participate at a higher rate than other subjects)?
STEP FIVE: IMPLEMENT THE SURVEY WITH A HIGH RESPONSE RATE.

How could we get a random sample from our sampling frame?
o Generate a random process in which all students have an equal chance of being selected
o Setting up a booth in front of the union? Selection bias? Response bias?
o Email? Selection bias? Response bias?
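
One way to picture a random process that gives every student an equal chance of selection is a simple random draw from the sampling frame. The sketch below assumes the frame is just a list of IDs. The 1,000 made-up student IDs and the function name are hypothetical, not part of the Ole Miss example.

```python
import random

# Hypothetical sampling frame: a list of 1,000 made-up student IDs.
sampling_frame = [f"student_{i:04d}" for i in range(1, 1001)]


def simple_random_sample(frame, k, seed=None):
    """Draw k distinct members from the frame, each with equal probability."""
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(frame, k)


sample = simple_random_sample(sampling_frame, k=100, seed=42)
print(len(sample), "students selected; first five:", sample[:5])
```

A booth in front of the union or an open email invitation does not work like this, because some students are more likely than others to walk by or to reply.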

We want to know the student body's average binge drinking score (number of drinks over 2 hours).
Plug into our equation:
o Population parameter = sample statistic + random sampling error
o Ole Miss (population) BDS average = sample BDS average + standard error

Assume a random sample of n = 100 drawn from the population has a mean of 3 and a standard deviation of 1.5.
o What's the standard error of the sample mean? (Hint: $s / \sqrt{n}$)

x bar = 3, n = 100, s = 1.5
Standard error = $1.5 / \sqrt{100}$ = 1.5 / 10 = .15
What is our 95% confidence interval for the population mean? X-bar +/- Z (SE)
o 3 - 2(.15) = 2.7
o 3 + 2(.15) = 3.3
o Interpretation: 95% of all possible random samples of n = 100 will produce sample means between 2.7 and 3.3.
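
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the multiplier of 2 is the rule-of-thumb value of Z used in these notes rather than the exact 1.96.

```python
import math


def standard_error(s, n):
    """Standard error of the sample mean: s divided by the square root of n."""
    return s / math.sqrt(n)


def confidence_interval(mean, se, z=2.0):
    """Return (lower, upper) bounds: mean plus or minus z standard errors."""
    return mean - z * se, mean + z * se


# Binge drinking example: x bar = 3, s = 1.5, n = 100.
se = standard_error(s=1.5, n=100)
lower, upper = confidence_interval(mean=3.0, se=se)
print(f"SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

Passing `z=1.96` gives the slightly narrower interval that the exact normal-curve value implies.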

Types of Samples
Nonrandom samples: samples that do NOT use an equal probability of selection method (EPSEM).
o Convenience sample: a nonrandom sample in which subjects are selected because they are easy (convenient) for the researcher.
o Snowball sample: respondents are asked to identify other likely participants.
o Quota sample: a nonrandom sample in which subjects are selected in proportion to their representation in the population.
Random sample: a family of samples that use an equal probability of selection method (EPSEM).
o Simple random samples: random selection from a list (ex. random digit dialing, random number tables).
o Stratified samples: a probability sample by strata.
o Cluster samples: a probability sample by clusters.

Sampling Review: Terms Types of samples EPSEM Sampling frame Selection bias Response bias

Random Sampling Error
Random sampling error (standard error): the extent to which a sample statistic differs, by chance, from a population parameter.
o Eliminating bias does not eliminate error
o Reduced by increasing our n
o Increases with variation in the population

Calculating Random Sampling Error
Sample size
o As sample size increases, error decreases
o This is an inverse and nonlinear relationship
Population parameter = sample statistic + random sampling error
o Random sampling error = variation component / sample size component
o Variation = direct (positive) relationship
o Sample size = inverse (negative) relationship
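
A short sketch of why the relationship is nonlinear, assuming the sample size component is the square root of n (as in the standard error formula $s / \sqrt{n}$). The variation value of 1.5 is borrowed from the binge drinking example purely for illustration, and the function name is an assumption.

```python
import math


def random_sampling_error(variation, n):
    """Variation component divided by the sample size component (sqrt of n)."""
    return variation / math.sqrt(n)


variation = 1.5  # assumed for illustration (s from the binge drinking example)
for n in (100, 400, 1600, 2500):
    error = random_sampling_error(variation, n)
    print(f"n = {n:>5}: sqrt(n) = {math.sqrt(n):>4.0f}, error = {error:.4f}")
```

Quadrupling the sample from 100 to 400 only halves the error, which is why each additional respondent buys less precision than the one before.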

The Nonlinear Effect of Sample Size
Sample size = 100
o So sample size component is $\sqrt{100} = 10$
o Becomes variation / 10
Increase sample to 400
o So sample size component is $\sqrt{400} = 20$
o Becomes variation / 20
Increase sample to 1,600
o So sample size component is $\sqrt{1600} = 40$
o Becomes variation / 40
Increase sample to 2,500
o So sample size component is $\sqrt{2500} = 50$
o Becomes variation / 50

Standard Deviation
o Most common measure of dispersion
o Average distance from the mean
o First calculate the mean: $\bar{X} = \sum X / n$

Sample Standard Deviation

$s_X = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$

Example data: 5, 8, 5, 4, 6, 7, 8, 8, 3, 6
Sum of (X - mean) squared:
o Find the mean: (5+8+5+4+6+7+8+8+3+6)/10 = 6
o Add up the squared deviations for each observation: $(5-6)^2 + (8-6)^2 + (5-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (3-6)^2 + (6-6)^2 = 28$

Divide the sum by n-1
o n = sample size (or number of observations)
o 28 / (10-1) = 3.11

Take the square root
o $\sqrt{3.11} = 1.76$, so standard deviation = 1.76
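
The same steps can be followed in code as a check on the hand calculation. This is a minimal sketch with an illustrative function name. It uses the ten example values and the n - 1 divisor from the sample standard deviation formula.

```python
import math

data = [5, 8, 5, 4, 6, 7, 8, 8, 3, 6]


def sample_standard_deviation(values):
    """Sample SD: square root of the summed squared deviations over n - 1."""
    n = len(values)
    mean = sum(values) / n                                   # find the mean
    sum_of_squares = sum((x - mean) ** 2 for x in values)    # squared deviations
    variance = sum_of_squares / (n - 1)                      # divide by n - 1
    return math.sqrt(variance)                               # take the square root


print(round(sample_standard_deviation(data), 2))
```

Python's built-in `statistics.stdev` uses the same n - 1 divisor, so it can serve as a cross-check on the result.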

Random Sampling Error (Standard Error): What's the difference between SD and SE?
Standard deviation: a measure of dispersion around a single mean from a known population.
o Ex. POL 251 exam scores
Standard error: a measure of how closely sample means estimate an unknown population mean in repeated sampling.
o Ex. Binge Drinking Score at Ole Miss
Notation
o $\sigma$ = standard deviation of the population
o s = standard deviation of the sample
o See table 6-5
Random sampling error = standard error of the mean
o Standard error = $s / \sqrt{n}$, or you may see $\sigma / \sqrt{n}$

Central Limit Theorem
An established statistical rule that tells us that, if we were to take an infinite number of samples of size n from a population of N members, the means of these samples would be normally distributed.
The distribution of sample means would have a mean equal to the true population mean and a random sampling error (standard error) equal to $\sigma$, the population standard deviation, divided by the square root of n.
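
A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions: the population is an exponential distribution with mean 1 and standard deviation 1 (chosen because it is clearly skewed), and 2,000 samples of size 50 stand in for the "infinite number of samples" in the definition. The function and constant names are made up for this example.

```python
import random
import statistics

# Assumed population for illustration: exponential with rate 1,
# which is skewed and has mean 1 and standard deviation 1.
POP_MEAN = 1.0
POP_SD = 1.0


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(num_samples)
    ]


n = 50
means = simulate_sample_means(n=n, num_samples=2000)
print("mean of sample means:", round(statistics.mean(means), 3))
print("SD of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(POP_SD / n ** 0.5, 3))
```

The mean of the sample means should land close to the population mean, and their standard deviation close to $\sigma / \sqrt{n}$, even though the population itself is far from bell-shaped.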

Normal Distribution
The mean, median, and mode are all the same value. If this weren't the case, the distribution would not be normal.
More importantly, a fixed percentage of observations will lie between the mean and any number of standard deviations from the mean.
From our standard deviation example:
o Assuming the data are normally distributed, 68.2% of the cases lie within +/- 1 standard deviation of the mean, and 95% fall within +/- about 2 (1.96) standard deviations.

Inference from the Normal Distribution
95% confidence interval:
o Interval within which 95% of all possible sample estimates will fall by chance
o Calculate by: x bar +/- 1.96 (standard error)
o Rule of thumb: x bar +/- 2 (standard error)

Standard(ized) Normal Distribution: Used For Probability Bell-shaped Curve

Z-Scores
Z-score: converts a raw deviation from the mean into a standardized deviation from the mean.
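
The notes do not show the formula here, so the sketch below assumes the usual definition: the raw deviation from the mean divided by the standard deviation. The mean of 6 and standard deviation of 1.76 are reused from the ten-value example data, and the function name is illustrative.

```python
def z_score(x, mean, sd):
    """Standardized deviation: (x - mean) expressed in standard deviation units."""
    return (x - mean) / sd


# Using the example data's mean (6) and standard deviation (1.76).
for value in (3, 6, 8):
    print(value, "->", round(z_score(value, mean=6, sd=1.76), 2))
```

A value equal to the mean gets a z-score of 0, and values below the mean get negative z-scores.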

Extending Statistical Inference: Sample Proportions
Use when the variable is nominal or ordinal
o Allows us to compare a sample category outcome to the population
Sample proportion: the number of cases falling into one category divided by the number of all cases in the sample.

Sample Proportions
To apply statistical inference, we need to calculate our standard error (so we can compute a confidence interval or margin of error).
Problem?
o We can't use the standard deviation with nominal or ordinal data
Solution:

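Standard error of a sample proportion

$SE_p = \sqrt{pq} / \sqrt{n}$, where p is the sample proportion, q = 1 - p, and n is the sample size.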

Sample Proportions: An Example
Survey question:
o Should all alcohol be banned in the Grove on football Saturdays? Yes or no
Level of measurement?
Outcome: Yes = 15, no = 85, n = 100

Sample proportion of abolitionists (p): .15
Sample proportion of non-abolitionists (q): .85
What's the standard error of the sample proportion?
o p = .15
o q = .85
o n = 100

Sample Proportion: A CI for a sample proportion
o Standard error in our question: .357/10 = .0357
o Confidence interval for our finding: .15 give or take .0357
o Use the rule of thumb for a 95% CI: p - 2(.0357) = .0786; p + 2(.0357) = .2214
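
The Grove example can be reproduced in a few lines. This sketch assumes the standard error of a proportion is $\sqrt{pq} / \sqrt{n}$, which matches the .357/10 step above, and it uses the rule-of-thumb multiplier of 2. The function names are illustrative.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * q) / sqrt(n), q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q) / math.sqrt(n)


def proportion_confidence_interval(p, n, z=2.0):
    """Return (lower, upper) bounds: p plus or minus z standard errors."""
    se = proportion_standard_error(p, n)
    return p - z * se, p + z * se


# Grove example: 15 of 100 respondents said yes.
p, n = 15 / 100, 100
se = proportion_standard_error(p, n)
lower, upper = proportion_confidence_interval(p, n)
print(f"SE = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

With the exact multiplier `z=1.96` the interval is marginally narrower than the rule-of-thumb version.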
