Explanatory variable X
(independent variable)
[Figure: scatterplot; x-axis: Mass (kg)]
[Figure: scatterplot of heart death rate (hrt_death rate) against alcohol from wine consumption]
• Patterns:
• Form (clusters, scatter, linear, ...)
• Direction (positive, negative)
• Strength (how closely the points follow the form)
• Deviations:
• Outliers
Interpret the last two scatter plots.
[Figure: scatterplot of y against x]
A linear correlation coefficient computed from observational data does not
imply causation between the variables, even when it indicates a strong
positive or negative association.
Correlation = r
• Quantitative variables
• Linear relationships
• r has no units
• r can be between –1 and 1
• Positive r = positive association
• Negative r = negative association
• r = 0 = no linear association
• r is influenced by outliers
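
The properties listed above can be checked numerically. The following is a minimal sketch, assuming a hand-written helper `pearson_r` (a hypothetical name) and made-up illustrative numbers that are not the data behind the slides. It computes r, shows that changing the units of x leaves r unchanged, and shows how a single outlier can pull r down.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data (an assumption, not taken from the slides)
mass_kg = [30, 35, 40, 45, 50, 55, 60]
rate_cal = [1000, 1100, 1150, 1300, 1350, 1500, 1550]

r = pearson_r(mass_kg, rate_cal)

# r has no units: converting kg to g leaves r unchanged
mass_g = [m * 1000 for m in mass_kg]
r_grams = pearson_r(mass_g, rate_cal)

# r is influenced by outliers: add one point far from the linear pattern
r_outlier = pearson_r(mass_kg + [60], rate_cal + [900])

print(round(r, 3), round(r_grams, 3), round(r_outlier, 3))
```

The first two printed values agree, while the third is noticeably smaller, because one point far from the pattern weakens the measured linear association.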
Do heavier people burn more energy?
Lean body mass vs. metabolic rate
[Figure: scatterplots of metabolic rate (cal) against lean body mass (kg); males plotted as +, females as o]
[Figures: scatterplots of heart death rate against alcohol from wine consumption, shown over the full range (0 to 9) and zoomed in to the range 1 to 4]
We can summarise an overall linear form with a line. The best-fitting line is
called the regression line.
[Figure: scatterplot with fitted line; x-axis: wine consumption (glasses?)]
$b = r \dfrac{s_y}{s_x}$, where b is the slope (the rate of change in y when x increases)
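
The slope formula b = r · s_y / s_x can be checked against the usual least-squares slope. The following is a minimal sketch, assuming hypothetical helper names (`correlation`, `slope_from_r`, `slope_least_squares`) and made-up illustrative numbers that are not the data behind the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r as the average product of standardised x and y values."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - mx) / sx) * ((y - my) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

def slope_from_r(xs, ys):
    """Slope b = r * s_y / s_x."""
    return correlation(xs, ys) * stdev(ys) / stdev(xs)

def slope_least_squares(xs, ys):
    """Least-squares slope computed directly from deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Made-up illustrative data (an assumption, not taken from the slides)
wine = [1, 2, 3, 4, 5, 6, 7, 8]
death_rate = [260, 240, 210, 200, 170, 150, 130, 100]

b_from_r = slope_from_r(wine, death_rate)
b_direct = slope_least_squares(wine, death_rate)

print(b_from_r, b_direct)
```

Both routes give the same slope up to rounding. The sign of b matches the sign of r, since s_y and s_x are always positive.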
[Figure: scatterplot of death rate against wine consumption]
$r^2 = \dfrac{\text{variation in } y \text{ due to } x}{\text{total variation in observed } y}$ = coefficient of determination
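
The ratio for r² can be verified directly. The following is a minimal sketch, assuming a least-squares line ŷ = a + b·x with the standard intercept a = ȳ − b·x̄ (the intercept does not appear on the slides), a hypothetical helper `fit_line`, and made-up illustrative numbers. It computes the variation in the fitted values and the total variation in the observed y, then compares their ratio with the square of r.

```python
from math import sqrt
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Made-up illustrative data (an assumption, not taken from the slides)
x_vals = [1, 2, 3, 4, 5, 6, 7, 8]
y_vals = [255, 245, 205, 210, 165, 160, 120, 110]

a, b = fit_line(x_vals, y_vals)
y_bar = mean(y_vals)
fitted = [a + b * x for x in x_vals]

# Variation in y due to x: spread of the fitted values around the mean of y
explained = sum((f - y_bar) ** 2 for f in fitted)
# Total variation in observed y
total = sum((y - y_bar) ** 2 for y in y_vals)
ratio = explained / total

# Correlation coefficient r, computed separately for comparison
x_bar = mean(x_vals)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_vals, y_vals))
sxx = sum((x - x_bar) ** 2 for x in x_vals)
r_value = sxy / sqrt(sxx * total)

print(ratio, r_value ** 2)
```

The two printed values agree up to rounding, which is why r² is read as the fraction of the variation in y accounted for by the linear relationship with x.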