
Steven Abel

Fred Kaiser
Loan Robinson
STAT 700 Homework 5
1. (a) Make boxplots of the data.
> data <- data.frame(chickwts)
> boxplot(weight~feed, data=data, ylab = "Weight", xlab = "Feed", main="Weights of Chickens by Type of Feed")

[Figure: boxplots of chick weights by feed type]

The boxplots suggest that the type of feed has a significant effect on the weight of newly hatched chicks after six weeks. We see that every feed except sunflower appears significantly different from casein. The type of feed also appears to affect the variance of the weights of the chickens fed with it.
(b) Determine if there are differences in the weights of chicken according to their feed.
The model we are using is a linear model:
Y(ij) = mu + alpha(i) + epsilon(ij).
We assume that the epsilon terms are independent random variables distributed N(0, sigma^2), and we constrain the alpha terms to sum to 0. The alpha terms represent treatment effects, in this case the effects of the type of feed given to the chickens.
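
As a small illustrative sketch (not part of the original solution), the sum-to-zero constraint can be imposed explicitly in R with sum contrasts, so that the fitted coefficients after the intercept estimate the alpha terms directly:

# Sketch: refit the one-way model with sum-to-zero contrasts ("contr.sum"),
# so the coefficients after the intercept estimate alpha(1), ..., alpha(5).
fit.sum <- lm(weight ~ feed, data = data, contrasts = list(feed = "contr.sum"))
coef(fit.sum)            # intercept = mu, then alpha(1), ..., alpha(5)
-sum(coef(fit.sum)[-1])  # alpha(6), recovered from the sum-to-zero constraint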

We will test for whether there are differences in weight caused by feed by doing a one-way ANOVA test.
Ho: alpha(i) = 0 for all i = 1, ..., 6.
Ha: the alpha terms are not all 0, or in other words, at least one alpha term is different from 0.
> anova(fit.lm <- lm(weight~feed, data=data))
Analysis of Variance Table

Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
feed 5 231129 46226 15.365 5.936e-10 ***
Residuals 65 195556 3009
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

The P-value from the one-way ANOVA test is 5.936e-10, showing that feed has a strong effect on the response, weight, and explains a great deal of the variation in weight. We reject the null hypothesis and conclude that at least one alpha term is different from 0.
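
As a quick arithmetic check (a sketch built from the table entries above, not additional output from the homework), the F statistic and P-value can be reproduced by hand in R:

# Sketch: reproduce the F statistic and P-value from the ANOVA table.
MS.feed  <- 231129 / 5     # treatment mean square on 5 df
MS.resid <- 195556 / 65    # residual mean square on 65 df
F.stat   <- MS.feed / MS.resid                     # roughly 15.36
pf(F.stat, df1 = 5, df2 = 65, lower.tail = FALSE)  # roughly 5.9e-10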
> par(mfrow=c(2,2))
> plot(fit.lm)

[Figure: residual diagnostic plots for fit.lm]

(c) Now test all possible two-group comparisons.
> attach(data)
> pairwise.t.test(weight,feed,p.adjust.method="bonf")

Pairwise comparisons using t tests with pooled SD

data: weight and feed

casein horsebean linseed meatmeal soybean
horsebean 3.1e-08 - - - -
linseed 0.00022 0.22833 - - -
meatmeal 0.68350 0.00011 0.20218 - -
soybean 0.00998 0.00487 1.00000 1.00000 -
sunflower 1.00000 1.2e-08 9.3e-05 0.39653 0.00447

P value adjustment method: bonferroni

We conclude that, at the alpha = .05 level, casein is significantly different in effect from horsebean, linseed, and soybean; horsebean is significantly different from meatmeal, soybean, and sunflower; linseed is significantly different from sunflower; and soybean is significantly different from sunflower. None of the other pairwise comparisons show strong enough evidence to conclude a difference.
All in all, there are many differences between the types of feed when compared pairwise, which is a
stronger conclusion than we made in part (b), when we said only that at least one feed had a significant
effect.
15 pairwise comparisons were made for this problem.
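That count is simply the number of unordered pairs that can be formed from the six feed levels, which can be checked directly:

choose(6, 2)  # number of pairwise comparisons among 6 feeds: 15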
(d) The Bonferroni adjustment is a way to control the type I error rate when making multiple comparisons, as we did in this case with our 15 pairwise comparisons. One way to apply it is to fix alpha and compare each raw P-value to alpha/n, where n is the number of comparisons. In this case, R instead multiplies each P-value by n (capping the result at 1) and compares the adjusted P-value to the original fixed alpha. The two approaches lead to equivalent conclusions.
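As an illustrative sketch (the P-values below are made up, not taken from the chick data), the two formulations agree: comparing each raw P-value to alpha/n is equivalent to comparing R's Bonferroni-adjusted P-value, min(1, n*p), to alpha.

# Sketch: two equivalent forms of the Bonferroni correction on made-up P-values.
p.raw <- c(0.001, 0.012, 0.030)   # illustrative raw P-values
n     <- length(p.raw)
alpha <- 0.05
p.raw < alpha / n                               # compare raw P-values to alpha/n
p.adjust(p.raw, method = "bonferroni") < alpha  # compare adjusted P-values to alpha
pmin(1, n * p.raw)                              # what the Bonferroni adjustment computes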
(e) Other methods exist to adjust the P-values when making multiple comparisons. Holm's method, like Bonferroni's, controls the family-wise error rate; it is less conservative but still valid under arbitrary assumptions. Hochberg's and Hommel's methods also control the family-wise error rate, but they are valid only under certain conditions.
The alternative approach is to control the false discovery rate, the expected proportion of false discoveries among the rejected hypotheses. This is done in the methods of Benjamini and Hochberg and of Benjamini and Yekutieli, and methods of this kind are more powerful and less conservative than the family-wise methods.
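
As a sketch of how these alternatives would be applied to the same comparisons (the method names are those accepted by p.adjust.method; these particular calls were not part of the original solution):

# Sketch: the same pairwise comparisons under other P-value adjustments.
pairwise.t.test(weight, feed, p.adjust.method = "holm")  # family-wise error rate
pairwise.t.test(weight, feed, p.adjust.method = "BH")    # false discovery rate (Benjamini-Hochberg)
pairwise.t.test(weight, feed, p.adjust.method = "BY")    # FDR, Benjamini-Yekutieli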





2. Plot the data using strip charts and interaction plots.
> data <- read.table("http://www-rohan.sdsu.edu/~babailey/stat700/poison.dat", header = TRUE)
> data$Poison <- factor(data$Poison)
> data$Treatment <- factor(data$Treatment)
> attach(data)
> par(mfrow=c(1,1))
> stripchart(Survival~Poison, ylab = "Poison", main = "Survival by Poison")

[Figure: stripchart of Survival by Poison]

This chart shows that the type of poison seems to have a significant effect on survival time. Poison 3's effect is the most pronounced, and it is the most potent poison, as no observation was over 4 hours, while the other two poisons showed much more variation and clearly higher mean survival times.

> stripchart(Survival~Treatment, ylab = "Treatment", main = "Survival by Treatment")

[Figure: stripchart of Survival by Treatment]

This plot shows that the type of treatment can also have a significant effect on survival. Treatment 1
stands out as the least effective treatment, as its observations have a low mean and comparatively low
variation. The other three treatments had rather high variation, but treatment 3 seemed to be less
effective than 2 and 4.

> stripchart(Survival~Treatment+Poison, ylab = "Treatment/Poison Combination", main = "Survival by Treatment/Poison")

[Figure: stripchart of Survival by Treatment/Poison combination]

This plot shows that certain treatment/poison combinations could have a strong effect on survival
compared to the others. The most striking feature is that poison 3 is very potent, as observations given
poison 3 all had low survival, although treatment did seem to have an effect. Other effects appear, such
as the strength of treatments 2 and 4, but there is more variation for any observation not given poison
3.

> interaction.plot(Treatment, Poison, Survival, main="Interaction of Poison and Treatment")

> interaction.plot(Poison, Treatment, Survival, main="Interaction of Poison and Treatment")

[Figure: interaction plots of Survival by Treatment and by Poison]

These plots suggest that interaction is not a strong feature of the data. In both plots the profiles are nearly parallel: survival follows almost the same pattern across the poisons (or treatments) regardless of the level of the other factor.
(b) Conduct a two-way ANOVA to test the effects of the two main factors and their interactions.
> anova(fit.lm <- lm(Survival ~ Poison + Treatment + Poison*Treatment))
Analysis of Variance Table

Response: Survival
Df Sum Sq Mean Sq F value Pr(>F)
Poison 2 103.043 51.521 23.5699 2.863e-07 ***
Treatment 3 91.904 30.635 14.0146 3.277e-06 ***
Poison:Treatment 6 24.745 4.124 1.8867 0.11
Residuals 36 78.692 2.186
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Our model is another linear model:
Y(ijk) = mu + alpha(i) + beta(j) + gamma(ij) + epsilon(ijk).
We assume the alpha terms sum to 0 over i, the beta terms sum to 0 over j, the gamma terms sum to 0 over each of their indices, and the epsilons are independent random variables distributed N(0, sigma^2).
First we test for interaction:
Ho: All gamma terms equal 0. Interaction is not significant.
Ha: At least one gamma term does not equal 0. There is significant interaction between factors Poison
and Treatment.
Because the P-value is 0.11, we fail to reject the null hypothesis at the 0.05 level. This is consistent with what the interaction plots suggested: there is no significant interaction between Poison and Treatment.
Testing the main effects in the same way as in the one-way ANOVA, we see P-values of 2.863e-07 for the Poison effect and 3.277e-06 for the Treatment effect. Thus, we reject the null hypothesis in each case and conclude that both Poison and Treatment have significant effects on survival time. In other words, at least one alpha term is not equal to 0, and at least one beta term is not equal to 0.
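
As a quick sketch (not part of the original output), the interaction F statistic can be reproduced from the table, and the estimated effect terms can be inspected from an equivalent aov() fit:

# Sketch: reproduce the interaction test from the table and inspect the effects.
F.int <- (24.745 / 6) / (78.692 / 36)             # roughly 1.89
pf(F.int, df1 = 6, df2 = 36, lower.tail = FALSE)  # roughly 0.11
fit.aov <- aov(Survival ~ Poison * Treatment, data = data)
model.tables(fit.aov, type = "effects")  # estimated alpha, beta, and gamma terms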

> par(mfrow=c(2,2))
> plot(fit.lm)

[Figure: diagnostic plots for the two-way ANOVA fit]

We do all pairwise comparisons for both the levels of Poison and the levels of Treatment.
> pairwise.t.test(Survival,Poison,p.adjust.method="bonf")

Pairwise comparisons using t tests with pooled SD

data: Survival and Poison

1 2
2 0.9542 -
3 9.3e-05 0.0022

P value adjustment method: bonferroni

From this we conclude that poison 3 is different from poisons 1 and 2, while 1 and 2 are not significantly
different from each other in effect.


> pairwise.t.test(Survival,Treatment,p.adjust.method="bonf")

Pairwise comparisons using t tests with pooled SD

data: Survival and Treatment

1 2 3
2 0.0011 - -
3 1.0000 0.0147 -
4 0.1051 0.6613 0.7235

P value adjustment method: bonferroni

From this we conclude that treatment 2 is significantly different from treatments 1 and 3, while all other
pairs are not significantly different from each other in effect on survival time.

(c) Conduct the two-way ANOVA using rate of death instead of survival time.
> Deathrate <- 1/Survival
> anova(fit.lm <- lm(Deathrate ~ Poison + Treatment + Poison*Treatment))
Analysis of Variance Table

Response: Deathrate
Df Sum Sq Mean Sq F value Pr(>F)
Poison 2 0.34863 0.174316 72.8419 2.217e-13 ***
Treatment 3 0.20396 0.067987 28.4100 1.336e-09 ***
Poison:Treatment 6 0.01567 0.002611 1.0911 0.3864
Residuals 36 0.08615 0.002393
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

> par(mfrow=c(2,2))
> plot(fit.lm)

[Figure: diagnostic plots for the death-rate fit]

We reach the same conclusions using rate of death instead of survival time: there is no significant interaction between Poison and Treatment, while both main effects are significant. However, the diagnostic plots look better using rate of death. In particular, the residuals vs. fitted values plot shows no pattern, while the same plot using survival time had a bit of a curve, and the normal Q-Q plot also looks much better using rate of death.
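
As a rough numerical companion to the diagnostic plots (a sketch only; the visual comparison above is what the conclusion rests on), the residuals of the two fits could also be compared with a Shapiro-Wilk test of normality:

# Sketch: a numerical check on residual normality for the two responses.
fit.surv <- lm(Survival ~ Poison * Treatment)
fit.rate <- lm(Deathrate ~ Poison * Treatment)
shapiro.test(residuals(fit.surv))  # residuals from the survival-time model
shapiro.test(residuals(fit.rate))  # residuals from the death-rate model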
