Height
68, 70, 67.2, 62.5, 75, 67, 65, 62.5, 62, 74, 68, 71.5, 71, 73, 62, 64, 70, 69, 67, 72, 66, 69, 68, 69, 64, 66, 63.6, 69, 62, 66, 67, 59, 67, 67, 73, 64, 65, 76, 69.5, 73.5, 73, 65, 66, 71

[Histogram of height; x-axis Bin (62, 65, 68, 71, 74, 77, More), y-axis Frequency]
Bin Frequency
68 14
71 9
74 8
77 2
More 0

Variable Height
Minimum 59
Q1 64
Median 67
Q3 70
Maximum 76

Variable Height
Median - Min 8
Q1 64
Median - Q1 3
Q3 - Median 3
Max - Median 9

[Box Plot - Height; axis labels Height, Units]

Part A Using the boxplot, are there any outliers present? Explain.
1.5 IQR 9
Q3+1.5*IQR 79
Q1-1.5*IQR 55
No, there are no outliers.

Part B Looking at the histogram, comment on the distribution shape of height: (left skewe
The shape of the histogram is symmetric
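The outlier check in Part A can be reproduced with a short sketch. It assumes the usual 1.5 * IQR fence rule and takes the quartiles, minimum, and maximum from the summary table; the function name `tukey_fences` is illustrative and not part of the original worksheet.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Values taken from the height summary table in this worksheet.
minimum, q1, q3, maximum = 59, 64, 70, 76

lower, upper = tukey_fences(q1, q3)

# A point is flagged only if it falls outside the fences.
has_outliers = minimum < lower or maximum > upper

print(lower, upper, has_outliers)
```

Because the minimum (59) and maximum (76) both lie inside the fences (55 and 79), no observation is flagged, which agrees with the answer given in Part A.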
hand_length hand_width height
7 8.5 72
5 6 67.5
5.8 6 62
7 8.5 68
6.5 7 70
6.5 7.5 67.2
7 5 62.5
8.25 9.5 75
7 7.75 67
6.5 8 65
7.5 8 62.5
6.5 7 62
8.2 9.3 74
6.5 7 68
7.5 9.5 71.5
7.63 8.75 71
8.25 8.5 73
6.5 7.5 62
7.4 8.5 64
7.5 8.5 70
7.75 9 69
7.5 8 67
7.75 9.5 72
7 8 66
6.5 7.5 69
7.5 8 68
7 7.5 69
7.4 8.5 64
7 8.5 66
6.5 7.75 63.6
7.5 8.5 69
7 8.25 62
6.5 7 66
7.4 7 67
6.7 7.4 59
6.7 7.5 67
8 8.5 67
8 9 73
7 8 64
6.75 7.5 65
8 9.5 76
7 8.125 69.5
6.5 8 73.5
8 4 73
6.8 7.4 65
6.75 7.5 66
7.5 10 71

[Scatter plot: Height vs Hand Length; x-axis Hand Length, y-axis Height]
[Scatter plot: Height vs Hand Width; x-axis Hand Width, y-axis Height]
Correlation 0.395 Medium Positive
Part 4A
Y = 55.66017 + 1.47485*X
Part 4B
X 8.5
Y = 55.66017 + 1.47485*8.5
68.1964
Part 4C
Although we can predict an individual's height given that the hand width is 20". Ho
not possible in a realistic scenario and hence, the value generated will be absurd.
Part 4D
X 9
Y = 55.66017 + 1.47485*9
68.93382
Actual Value 72
Residual 3.066179
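The arithmetic in Parts 4B and 4D can be checked with a minimal sketch. It uses the intercept and slope reported in Part 4A; the function names are illustrative, and the residual is taken as actual minus predicted, which matches the sign used in this worksheet.

```python
# Coefficients of the fitted line from Part 4A (height on hand width).
INTERCEPT = 55.66017
SLOPE = 1.47485


def predict_height(hand_width):
    """Predicted height for a given hand width under the Part 4A line."""
    return INTERCEPT + SLOPE * hand_width


def residual(actual_height, hand_width):
    """Residual = actual value minus predicted value."""
    return actual_height - predict_height(hand_width)


pred_4b = predict_height(8.5)   # Part 4B
pred_4d = predict_height(9)     # Part 4D
res_4d = residual(72, 9)        # Part 4D, actual height 72

print(round(pred_4b, 4), round(pred_4d, 5), round(res_4d, 5))
```

Small differences in the last digit against the worksheet are expected, since the spreadsheet carries more decimal places in the coefficients than are printed in Part 4A.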
Part 4E ANOVA
            df   SS         MS         F          Significance F
Regression  1    131.3323   131.3323   8.849051   0.004579
Residual    48   712.3869   14.84139
Total       49   843.7192
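As a cross-check of the ANOVA table, the mean squares and the F statistic follow directly from the sums of squares and degrees of freedom. The sketch below is an illustration with a hypothetical helper name; it does not compute Significance F, which would need the F distribution.

```python
def anova_from_ss(ss_regression, ss_residual, df_regression, df_residual):
    """Return (MS regression, MS residual, F) from sums of squares and df."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    f_stat = ms_regression / ms_residual
    return ms_regression, ms_residual, f_stat


# Values taken from the Part 4E ANOVA table.
ss_reg, ss_res = 131.3323, 712.3869
ms_reg, ms_res, f_stat = anova_from_ss(ss_reg, ss_res, 1, 48)

# The regression and residual sums of squares should add up to the total.
ss_total = ss_reg + ss_res

print(round(ms_res, 5), round(f_stat, 3), round(ss_total, 4))
```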
Part 5
X 4
Y = 55.66017 + 1.47485*4
61.55957
Actual Value 73
Residual 11.44043
[Plot: hand_width Residual Plot; x-axis hand_width, y-axis Residuals]
[Plot: hand_width Line Fit Plot; x-axis hand_width, y-axis height]
[Plot: Q-Q Plot; x-axis Sample Percentile, y-axis height]

Lower 95.0% Upper 95.0%
Intercept 47.71955 63.60079
hand_width 0.477993 2.471706
Bin Frequency
-3 12
1 19
5 15
9 3
13 1
More 0
[Plot: Histogram - Residuals; x-axis Bin, y-axis Frequency]
[Plot: Residuals against Predicted Height]
hand_width height hand_length
7.5 62 6.5
8 60 6.5
7.25 62 7
8.5 72 7
6 67.5 5
6 62 5.8
8.5 68 7
7 70 6.5
7.5 67.2 6.5
9.5 75 8.25
7.75 67 7
8 65 6.5
8 62.5 7.5
7 62 6.5
9.3 74 8.2
7 68 6.5
9.5 71.5 7.5
8.75 71 7.63
8.5 73 8.25
7.5 62 6.5
8.5 64 7.4
8.5 70 7.5
9 69 7.75
8 67 7.5
9.5 72 7.75
8 66 7
7.5 69 6.5
8 68 7.5
7.5 69 7
8.5 64 7.4
8.5 66 7
7.75 63.6 6.5
8.5 69 7.5
8.25 62 7
7 66 6.5
7.4 59 6.7
7.5 67 6.7
8.5 67 8
9 73 8
8 64 7
7.5 65 6.75
9.5 76 8
8.125 69.5 7
8 73.5 6.5
7.4 65 6.8
7.5 66 6.75
10 71 7.5

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.58652104
R Square 0.344006931
Adjusted R Square 0.329429307
Standard Error 3.389458184
Observations 47

ANOVA
            df   SS         MS         F          Significance F
Regression  1    271.1072   271.1072   23.59829   1.47E-05
Residual    45   516.9792   11.48843
Total       46   788.0864

            Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept   44.82555634    4.649317         9.641321   1.61E-12   35.46135
hand_width  2.788820534    0.57409          4.857807   1.47E-05   1.632543

RESIDUAL OUTPUT PROBABILITY OUTPUT
Observation Predicted height Residuals Standard Residuals Percentile
1 65.74171035 -3.74171 -1.11612 1.06383
2 67.13612062 -7.13612 -2.12865 3.191489
3 65.04450522 -3.04451 -0.90815 5.319149
4 68.53053089 3.469469 1.034917 7.446809
5 61.55847955 5.94152 1.772311 9.574468
6 61.55847955 0.44152 0.131702 11.70213
7 68.53053089 -0.53053 -0.15825 13.82979
8 64.34730009 5.6527 1.686158 15.95745
9 65.74171035 1.45829 0.434997 18.08511
10 71.31935142 3.680649 1.09791 20.21277
11 66.43891549 0.561085 0.167367 22.34043
12 67.13612062 -2.13612 -0.63719 24.46809
13 67.13612062 -4.63612 -1.38292 26.59574
14 64.34730009 -2.3473 -0.70018 28.7234
15 70.76158732 3.238413 0.965994 30.85106
16 64.34730009 3.6527 1.089573 32.97872
17 71.31935142 0.180649 0.053886 35.10638
18 69.22773602 1.772264 0.528653 37.23404
19 68.53053089 4.469469 1.333209 39.3617
20 65.74171035 -3.74171 -1.11612 41.48936
21 68.53053089 -4.53053 -1.35142 43.61702
22 68.53053089 1.469469 0.438332 45.74468
23 69.92494116 -0.92494 -0.2759 47.87234
24 67.13612062 -0.13612 -0.0406 50
25 71.31935142 0.680649 0.203032 52.12766
26 67.13612062 -1.13612 -0.3389 54.25532
27 65.74171035 3.25829 0.971923 56.38298
28 67.13612062 0.863879 0.257689 58.51064
29 65.74171035 3.25829 0.971923 60.6383
30 68.53053089 -4.53053 -1.35142 62.76596
31 68.53053089 -2.53053 -0.75484 64.89362
32 66.43891549 -2.83892 -0.84683 67.02128
33 68.53053089 0.469469 0.140039 69.14894
34 67.83332575 -5.83333 -1.74004 71.2766
35 64.34730009 1.6527 0.492988 73.40426
36 65.4628283 -6.46283 -1.92781 75.53191
37 65.74171035 1.25829 0.375338 77.65957
38 68.53053089 -1.53053 -0.45655 79.78723
39 69.92494116 3.075059 0.917267 81.91489
40 67.13612062 -3.13612 -0.93548 84.04255
41 65.74171035 -0.74171 -0.22125 86.17021
42 71.31935142 4.680649 1.396203 88.29787
43 67.48472319 2.015277 0.601142 90.42553
44 67.13612062 6.363879 1.898298 92.55319
45 65.4628283 -0.46283 -0.13806 94.68085
46 65.74171035 0.25829 0.077046 96.80851
47 72.71376169 -1.71376 -0.5112 98.93617
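The Standard Residuals and Percentile columns can be reproduced with a short sketch. It assumes two conventions that appear consistent with the numbers printed here but are not stated in the worksheet: each residual is divided by sqrt(SS residual / (n - 1)), and the percentile for rank i is 100 * (i - 0.5) / n. The function names are illustrative.

```python
import math


def sample_percentile(rank, n):
    """Percentile assigned to the rank-th smallest value out of n."""
    return 100.0 * (rank - 0.5) / n


def standard_residual(residual, ss_residual, n):
    """Residual scaled by the (n - 1)-based standard deviation of residuals."""
    return residual / math.sqrt(ss_residual / (n - 1))


N = 47                   # Observations, from the regression statistics
SS_RESIDUAL = 516.9792   # Residual sum of squares, from the ANOVA table

first_percentile = sample_percentile(1, N)
last_percentile = sample_percentile(N, N)
first_std_resid = standard_residual(-3.74171, SS_RESIDUAL, N)

print(round(first_percentile, 5), round(last_percentile, 5),
      round(first_std_resid, 5))
```

Standard residuals beyond roughly 2 in absolute value (observation 2 in this output) are the ones usually inspected as potential outliers.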
Part A
Y = 44.8255 + 2.7888*X
Part B
X 8.5
Y = 44.8255 + 2.7888*8.5
68.53053
Part C
X 9
Y = 44.8255 + 2.7888*9
69.92494
Actual Value 72
Residual 2.075059
Part D ANOVA
            df   SS         MS         F          Significance F
Regression  1    271.1072   271.1072   23.59829   1.47E-05
Residual    45   516.9792   11.48843
Total       46   788.0864
Part F
R^2 Old 0.155659
Ratio 2.210007
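The comparison in Part F can be sketched from the two ANOVA tables, assuming R^2 is computed as regression SS divided by total SS for each model. The helper name is illustrative. The square root of each R^2 also recovers the reported correlation (about 0.395 for the first model) and Multiple R (0.5865 for the second).

```python
import math


def r_squared(ss_regression, ss_total):
    """Coefficient of determination from the ANOVA sums of squares."""
    return ss_regression / ss_total


# First model (Part 4E ANOVA) and second model (Part D ANOVA).
r2_old = r_squared(131.3323, 843.7192)
r2_new = r_squared(271.1072, 788.0864)

ratio = r2_new / r2_old

# With one predictor, |correlation| equals the square root of R^2.
r_old = math.sqrt(r2_old)
r_new = math.sqrt(r2_new)

print(round(r2_old, 6), round(r2_new, 6), round(ratio, 3))
```

A ratio near 2.21 means the second model explains a little over twice the share of variance in height, although about 34% is still a modest fit.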
Part G
Yes, the plot of residuals vs. predicted Y shows a random pattern
Yes, the new model is better than the previous model. However, it can be
since the coefficient of determination is still 0.344, showing that only 34
variance in the dependent variable can be explained by variation in indepen
[Plot: hand_width Residual Plot; x-axis hand_width, y-axis Residuals]
[Plot: hand_width Line Fit Plot; x-axis hand_width, y-axis height]
[Plot: x-axis Sample Percentile]
[Plot: x-axis Hand Width]

Upper 95% Lower 95.0% Upper 95.0%
Intercept 54.18976 35.46135 54.18976
hand_width 3.945098 1.632543 3.945098

height
62, 62, 62, 62, 62, 62.5, 63.6, 64, 64, 64, 65, 65, 65, 66, 66, 66, 66, 67, 67, 67, 67, 67.2, 67.5, 68, 68, 68, 69, 69, 69, 69, 69.5, 70, 70, 71, 71, 71.5, 72, 72, 73, 73, 73.5, 74, 75, 76

Bin Frequency
-4 6
-1 11
2 17
5 10
8 3
More 0
[Plot: Histogram - Residuals; x-axis Bin, y-axis Frequency]
[Plot: Predicted vs Residual Plot; x-axis Predicted Height, y-axis Residuals]
Outliers: http://emp.byui.edu/BrownD/Stats-intro/dscrptv/graphs/qq-plot_egs.htm