R Input:
model1=glm(cbind(Carrier, Non.carrier)~Size, family=binomial, data=Tonsils)
summary(model1)
R Output:
Call:
glm(formula = cbind(Carrier, Non.carrier) ~ Size, family = binomial,
    data = Tonsils)

Deviance Residuals:
[1]  0  0  0

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     -2.9606     0.1904 -15.546   <2e-16 ***
SizeNormal      -0.3035     0.3015  -1.007   0.3141
SizeVery Large   0.5440     0.2857   1.904   0.0569 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance:  7.3209e+00  on 2  degrees of freedom
Residual deviance: -1.9984e-15  on 0  degrees of freedom
AIC: 20.851

Number of Fisher Scoring iterations: 3
Part 2:
R Input:
model1=glm(cbind(Carrier, Non.carrier)~Size, family=binomial, data=Tonsils)
model1.5=glm(cbind(Carrier, Non.carrier)~1, family=binomial, data=Tonsils)
anova(model1.5, model1, test="LRT")
R Output:
Analysis of Deviance Table

Model 1: cbind(Carrier, Non.carrier) ~ 1
Model 2: cbind(Carrier, Non.carrier) ~ Size
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1         2     7.3209
2         0     0.0000  2   7.3209  0.02572 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
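The p-value in the table can be checked by hand from the drop in deviance. The lines below are a minimal sketch, assuming the usual chi-square reference distribution for the likelihood ratio test; the variable names are illustrative and not part of the original analysis.

```r
# Drop in deviance and its degrees of freedom, copied from the table above.
dev_diff <- 7.3209
df_diff <- 2

# Upper-tail chi-square probability, i.e. the likelihood ratio test p-value.
p_value <- pchisq(dev_diff, df = df_diff, lower.tail = FALSE)
print(round(p_value, 5))
```

The result should agree with the Pr(>Chi) column up to rounding.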
Aaron Vincent 17 April 2014 Statistics 516
Part 3: Answer:
LOG-ODDS:
Normal: -3.264151 CI: (-3.722312, -2.805990)
Large: -2.960641 CI: (-3.333902, -2.587380)
Very Large: -2.416658 CI: (-2.834200, -1.999116)
ODDS:
Normal: 0.03822938 CI: (0.02417801, 0.06044688)
Large: 0.05178571 CI: (0.03565371, 0.07521686)
Very Large: 0.08921933 CI: (0.05876555, 0.13545503)
PROBABILITY:
Normal: 0.03682171 CI: (0.02360723, 0.05700134)
Large: 0.04923599 CI: (0.03442629, 0.06995505)
R Output:
> contrastfix(cont, wald=TRUE)
glm model parameter contrast

            Contrast      S.E.     Lower     Upper      t df Pr(>|t|)
Large      -2.960641 0.1904428 -3.333902 -2.587380 -15.55 NA        0
Normal     -3.264151 0.2337598 -3.722312 -2.805990 -13.96 NA        0
Very Large -2.416658 0.2130355 -2.834200 -1.999116 -11.34 NA        0
> contrastfix(cont, wald=TRUE, exp=TRUE)
glm model parameter contrast

             Contrast      S.E.      Lower      Upper      t df Pr(>|t|)
Large      0.05178571 0.1904428 0.03565371 0.07521686 -15.55 NA        0
Normal     0.03822938 0.2337598 0.02417801 0.06044688 -13.96 NA        0
Very Large 0.08921933 0.2130355 0.05876555 0.13545503 -11.34 NA        0
> Normal1
[1] 0.03682171 0.02360723 0.05700134
> Large1
[1] 0.04923599 0.03442629 0.06995505
> Very.Large1
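The three scales reported above are linked by simple transformations: a Wald interval is built on the log-odds scale, and the estimate and both endpoints are then exponentiated for odds or passed through the inverse logit for probabilities. The following is a rough sketch in base R that uses only the estimates and standard errors printed in the output; it does not call `contrastfix`, and the object names are illustrative. It also yields a Very Large probability row, whose printed output is cut off above.

```r
# Log-odds estimates and standard errors copied from the contrast output.
log_odds <- c(Large = -2.960641, Normal = -3.264151, "Very Large" = -2.416658)
se <- c(Large = 0.1904428, Normal = 0.2337598, "Very Large" = 0.2130355)

# 95% Wald interval on the log-odds scale.
z <- qnorm(0.975)
lower <- log_odds - z * se
upper <- log_odds + z * se
logit_scale <- cbind(estimate = log_odds, lower = lower, upper = upper)

# Odds: exponentiate the estimate and both endpoints.
odds <- exp(logit_scale)

# Probability: inverse logit, plogis(x) = exp(x) / (1 + exp(x)).
prob <- plogis(logit_scale)

print(odds)
print(prob)
```

Transforming the endpoints, rather than building an interval directly on the odds or probability scale, keeps the intervals inside the valid range for each scale.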
R Input:
cont2=contrast(model1, list(Size="Large"), list(Size="Normal"))
contrastfix(cont2, wald=TRUE, exp=TRUE)
cont3=contrast(model1, list(Size="Very Large"), list(Size="Large"))
contrastfix(cont3, wald=TRUE, exp=TRUE)
cont4=contrast(model1, list(Size="Very Large"), list(Size="Normal"))
contrastfix(cont4, wald=TRUE, exp=TRUE)
R Output:
> contrastfix(cont2, wald=TRUE, exp=TRUE)
glm model parameter contrast

  Contrast      S.E.     Lower    Upper    t df Pr(>|t|)
1 1.354605 0.3015164 0.7501732 2.446042 1.01 NA   0.1571
> contrastfix(cont3, wald=TRUE, exp=TRUE)
glm model parameter contrast

  Contrast      S.E.     Lower    Upper   t df Pr(>|t|)
1 1.722856 0.2857492 0.9840538 3.016332 1.9 NA   0.0285
> contrastfix(cont4, wald=TRUE, exp=TRUE)
glm model parameter contrast

  Contrast      S.E.    Lower    Upper    t df Pr(>|t|)
1  2.33379 0.3162717 1.255599 4.337832 2.68 NA   0.0037
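These odds ratios can also be recovered from the coefficient table of `model1`, since each one is the exponential of a difference in linear predictors. The snippet below is a sketch under the assumption that "Large" is the reference level (its coefficient is absorbed into the intercept); the coefficients are the rounded values from `summary(model1)`, so agreement is only to about four significant digits.

```r
# Coefficients from summary(model1); "Large" is the reference level (effect 0).
b_large <- 0
b_normal <- -0.3035
b_very_large <- 0.5440

# An odds ratio is exp(difference in log-odds between the two groups).
or_large_vs_normal <- exp(b_large - b_normal)
or_very_large_vs_large <- exp(b_very_large - b_large)
or_very_large_vs_normal <- exp(b_very_large - b_normal)

print(c(or_large_vs_normal, or_very_large_vs_large, or_very_large_vs_normal))
```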
R Input:
library(ggplot2)
summary(model)
modela=glm(cbind(Removed, Placed-Removed)~Distance+Morph+Distance:Morph, data=case2102, family=binomial)
summary(modela)
newdata1 <- expand.grid(Distance = seq(0, 52, length = 100), Morph = c("light","dark"))
newdata1$yhat <- predict(modela, newdata1, type = "response")
plot1 <- ggplot(case2102, aes(x = Distance, y = Removed/Placed, color = Morph))
plot1 <- plot1 + geom_point() + geom_line(aes(y = yhat), data = newdata1)
plot1 <- plot1 + ylab("Observed/Predicted Proportion")
plot(plot1)
R Output:
> summary(modela)

Call:
glm(formula = cbind(Removed, Placed - Removed) ~ Distance + Morph +
    Distance:Morph, family = binomial, data = case2102)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.21183  -0.39883   0.01155   0.68292   1.31242

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.128987   0.197906  -5.705 1.17e-08 ***
Part 2: Answer:
glm model parameter contrast

       Contrast        S.E.     Lower    Upper     t df Pr(>|t|)
light 0.9907562 0.005788365 0.9795796 1.002060 -1.60 NA   0.0543
dark  1.0186745 0.005645336 1.0074653 1.030008  3.28 NA   0.0005
This means that for every 1-unit increase in distance, the odds of a light moth being removed are multiplied by about 0.991 (a decrease of about 0.9%, with an interval that includes 1), and the odds of a dark moth being removed are multiplied by about 1.019 (an increase of about 1.9%).
R Input:
source("http://dl.dropboxusercontent.com/u/10884844/Rcode/contrastfix.R")
install.packages("contrast")
library(contrast)
conta=contrast(modela, list(Morph=c("light", "dark"), Distance=2), list(Morph=c("light", "dark"), Distance=1), cnames=c("light", "dark"))
contrastfix(conta, wald=TRUE, exp=TRUE)
R Output:
> conta=contrast(modela, list(Morph=c("light", "dark"), Distance=2), list(Morph=c("light", "dark"), Distance=1), cnames=c("light", "dark"))
> contrastfix(conta, wald=TRUE, exp=TRUE) #so for every 1 unit increase in distance the odds of light being removed are multiplied by about 0.991 and the odds of dark by about 1.019
glm model parameter contrast

       Contrast        S.E.     Lower    Upper     t df Pr(>|t|)
light 0.9907562 0.005788365 0.9795796 1.002060 -1.60 NA   0.0543
dark  1.0186745 0.005645336 1.0074653 1.030008  3.28 NA   0.0005
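A per-unit odds ratio is easier to read as a percent change in the odds, and it compounds multiplicatively over longer distances. The lines below are a small illustration using the two contrasts printed above; the 10-unit step is an arbitrary choice for the example, not something from the original analysis.

```r
# Odds ratios for a 1-unit increase in Distance, copied from the output above.
or_per_unit <- c(light = 0.9907562, dark = 1.0186745)

# Percent change in the odds per unit of distance.
pct_change <- 100 * (or_per_unit - 1)

# Odds ratio over a 10-unit increase: multiplicative, so raise to the 10th power.
or_per_10 <- or_per_unit^10

print(round(pct_change, 2))
print(or_per_10)
```

Note that these are changes in odds, not in probability; the change in probability depends on the starting distance.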
Part 3: Answer:
This means that at a distance of 0 units the odds of being removed for a light morph are about 1.51 times the odds for a dark morph, although the interval (0.88, 2.58) includes 1.
 Contrast      S.E.    Lower    Upper    t df Pr(>|t|)
 2.659651 0.2195914 1.729451 4.090169 4.45 NA        0
This means that at a distance of 50 units the odds of being removed for a dark morph are about 2.66 times the odds for a light morph.
R Input:
contb=contrast(modela, list(Morph="light", Distance=0), list(Morph="dark", Distance=0))
contrastfix(contb, wald=TRUE, exp=TRUE)
contc=contrast(modela, list(Morph="dark", Distance=50), list(Morph="light", Distance=50))
contrastfix(contc, wald=TRUE, exp=TRUE)
R Output:
> contrastfix(contb, wald=TRUE, exp=TRUE)
glm model parameter contrast

 Contrast      S.E.     Lower    Upper   t df Pr(>|t|)
 1.508713 0.2744898 0.8809688 2.583764 1.5 NA    0.067
> contrastfix(contc, wald=TRUE, exp=TRUE)
glm model parameter contrast

 Contrast      S.E.    Lower    Upper    t df Pr(>|t|)
 2.659651 0.2195914 1.729451 4.090169 4.45 NA        0
Part 4: Answer: There is no evidence of overdispersion: only one point exceeds 2 on the residual vs. predicted plot and, in the model summary, the residual deviance is not more than twice the null deviance.
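A common numerical companion to the residual plot is to compare the Pearson chi-square statistic (or the residual deviance) with the residual degrees of freedom; a ratio well above 1 suggests overdispersion. The sketch below uses simulated binomial data, because `case2102` is not reproduced here; the data frame `dat`, its columns, and the simulation settings are all hypothetical.

```r
# Simulated stand-in for grouped binomial data (hypothetical, not case2102).
set.seed(1)
n <- 14
dat <- data.frame(x = seq(0, 52, length.out = n), m = rep(20, n))
dat$y <- rbinom(n, size = dat$m, prob = plogis(-1 + 0.01 * dat$x))

fit <- glm(cbind(y, m - y) ~ x, family = binomial, data = dat)

# Pearson chi-square divided by residual df estimates the dispersion.
pearson_disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# The deviance-based version is a similar rule-of-thumb check.
dev_disp <- deviance(fit) / df.residual(fit)

print(c(pearson = pearson_disp, deviance = dev_disp))
```

With a real fitted model, the same two lines applied to the model object give the ratios to inspect.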
R Input:
plot(predict(modela, type = "response"), rstudent(modela))
R Output:
Standard Errors:
(Intercept)         3.83921
observerobs2        5.64216
photo               0.06317
observerobs2:photo  0.09580
Fitting the model with the quasipoisson family gives the same parameter estimates but larger standard errors. This makes inference from the model more trustworthy, because the confidence intervals are wider and the test statistics and p-values account for the extra variability. Additionally, the quasi-Poisson model appears to have dealt adequately with the overdispersion, because only a few points exceed 2 on the residual vs. predicted plot and, in the model summary, the residual deviance is not more than twice the null deviance.
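The relationship between the two fits can be seen directly: the quasi-Poisson standard errors are the Poisson standard errors multiplied by the square root of the estimated dispersion, while the coefficients are identical. The following sketch demonstrates this on simulated overdispersed counts; the data, the variable names, and the default log link are assumptions made for the illustration (the analysis above uses `snowgeese.long` with an identity link).

```r
# Simulated overdispersed counts (hypothetical stand-in for the real data).
set.seed(42)
x <- runif(60, 10, 100)
y <- rnbinom(60, mu = exp(1 + 0.02 * x), size = 5)

pois_fit <- glm(y ~ x, family = poisson)
quasi_fit <- glm(y ~ x, family = quasipoisson)

# Estimated dispersion from the quasi-Poisson fit (Pearson chi-square / df).
phi <- summary(quasi_fit)$dispersion

se_pois <- sqrt(diag(vcov(pois_fit)))
se_quasi <- sqrt(diag(vcov(quasi_fit)))

# Same coefficients; standard errors inflated by sqrt(phi).
print(all.equal(coef(pois_fit), coef(quasi_fit)))
print(all.equal(se_quasi, se_pois * sqrt(phi)))
```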
R Input:
model3.1 <- glm(count ~ observer + photo + observer:photo, data = snowgeese.long, family = quasipoisson(link = identity))
summary(model3.1)
summary(model)
plot(predict(model3.1, type = "response"), rstudent(model3.1))
plot(predict(model, type = "response"), rstudent(model))
R Output:
> summary(model3.1) Call:
Standard Error:
(Intercept)         2.25435
observerobs2        3.36972
photo               0.05827
observerobs2:photo  0.09396
Using a negative binomial model gives different parameter estimates and higher standard error estimates than the original model, though smaller than the quasi-Poisson ones (for example, 2.25435 vs. 3.83921 for the intercept). This makes inference from the model more trustworthy, because the confidence intervals are wider and the test statistics and p-values account for the extra variability. Additionally, the negative binomial model appears to have dealt adequately with the overdispersion, because only a few points exceed 2 on the residual vs. predicted plot and, in the model summary, the residual deviance (84.989 on 86 degrees of freedom) is far below the null deviance (793.411) and close to its degrees of freedom.
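The numbers in the negative binomial summary can be turned into two quick checks: the ratio of residual deviance to residual degrees of freedom, and the variance implied by the estimated theta. This is a sketch that assumes the parameterization used by `MASS::glm.nb`, where the variance is mu + mu^2 / theta; the mean count of 50 is a hypothetical value chosen only to illustrate the formula.

```r
# Values copied from the summary(model3.2) output.
resid_dev <- 84.989
resid_df <- 86
theta <- 11.0702

# A ratio near 1 is consistent with the variance being modelled adequately.
dev_ratio <- resid_dev / resid_df

# Negative binomial variance function in the glm.nb parameterization.
nb_variance <- function(mu, theta) mu + mu^2 / theta

# Hypothetical mean count, used only to show the size of the extra variance.
mu_example <- 50
variance_ratio <- nb_variance(mu_example, theta) / mu_example

print(round(dev_ratio, 3))
print(variance_ratio)
```

Here `variance_ratio` is the factor by which the negative binomial variance exceeds the Poisson variance at that mean, and it grows with the mean.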
R Input:
library(MASS)
model3.2 <- glm.nb(count ~ observer + photo + observer:photo, data = snowgeese.long, link = identity)
summary(model3.2)
plot(predict(model3.2, type = "response"), rstudent(model3.2))
plot(predict(model, type = "response"), rstudent(model))
R Output:
> summary(model3.2)

Call:
glm.nb(formula = count ~ observer + photo + observer:photo, data = snowgeese.long,
    link = identity, init.theta = 11.0702114)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0805  -0.6863  -0.2228   0.3886   2.9868

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)         1.42894    2.25435   0.634  0.52617
observerobs2       -2.94485    3.36972  -0.874  0.38217
photo               0.75447    0.05827  12.947  < 2e-16 ***
observerobs2:photo  0.29679    0.09396   3.159  0.00158 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(11.0702) family taken to be 1)

    Null deviance: 793.411  on 89  degrees of freedom
Residual deviance:  84.989  on 86  degrees of freedom
AIC: 779.36

Number of Fisher Scoring iterations: 1