
A coin is tossed 100 times. What is the expected number of heads?

Answer is: 50 (n x p = 100 x 0.5; the value 0.5 or 1/2 is the probability of a head on a single toss)


Given H

Answer Z=1
Two cards

Answer : 64
When referring to

Answer : None of these


When a null

Answer : <42
One-tailed two-tailed

Answer : z test one tailed


Which of the following in the first step in calculating Median of data set

Answer : Array the data


The accuracy of the prediction of variable can be improved

Answer : adding more independent variables to the regression model


The hypothesis

Answer : Simple Hypothesis


Given p

Answer : No, Because P(A and B)

(1st option)

When referring to a curve that tails off to the left end

Answer : none of these


The test statistics is equal to

Answer : None of these


The correct formula is (Sample mean - Population mean) / Standard error
A pair of dice are thrown

Answer : 1/6 Good answer


If p(a or b)

Answer : P(A) + P(B) is the joint probability of A and B (I am not sure about
this)
When n dice are rolled the possible outcomes are:

Answer : 6^n
The number of road accidents is a

Answer : discrete
The grouped data are called

Answer : secondary data


A two tailed test of a difference between two proportions led to

Answer : 0.05 1st option


The range of test statistic-z is

Answer:

3rd option

The frequency divided by the total number of observations is called

Answer : relative frequency


Two cards are drawn from a well-shuffled

Answer : 1/221 (calculation required so I cant ans)


For two tailed test of hypothesis at

Answer : between the two critical values


If a sample of size m is drawn from one population and size

Answer : n-1 , m-1


For an upper tailed test of the difference of two means based on dependent
samples

Answer : 1.645
Given Ho

Answer : z=1 accept Ho


Two cards are switched

Answer : 2 but not sure (I also not sure)

Two cards are drawn

Given 130

Answer : Z
The range of test

Answer : c (-infinity to +infinity)

1st Quiz
1. If four coins are tossed, how many elements will the sample space contains. A=2, B= 4. C=16.
2. A Bag contains 10 red balls and 7 blue balls, A ball is drawn at random. The probability that ball
drawn is red. A= 7. B= 7/17. 10/17. 3/17.
3. A fair coin is tossed three times. What is the probability that at least one head appears. A=7/8.
B= 6/8. C= 5/8. D= 4/8.
4. If a dice is thrown twice, the number of elements in the sample space. A= 2. B= 4. C=16. D=36.
5. Mean, Median and mode always coincide in the case of ..Distribution. A= Poisson. B=
Binomial. C= Normal. D= Hypergeometric.
6. Two events. A and B are mutually exclusive and each have a nonzero probability. If event A is
known to occur, the probability of the occurrence of event B is. A= one. B= any positive value.
C=0, d= any value between 0 to 1.
7. The probability of getting a head in tossing of a coin is. A= 0.5, B= 1, C= 1.5, d= -0.5.
8. The probability of an event cannot be. A= 1, B=0.1, C= 0.5, D=-0.5
9. Find X compliment, x= 2,8,4,4,6,8,10. A= 49, B= 42, C= 9, D= 6.
10. P (A intersection B) =.., A= P(A) P(A/B), B= P (B) P(B/B). C= P(A) P(B/A), D= none of these.
11. For a random sample 9 women the average resting pulse rate x= 76 beats per minute and the
sample standard deviation is s= 5. The standards error of the sample mean is. A= 0.557, B=
0.745, C= 1.667, D= 2.778.
12. Null and alternative hypothesis are statements about. A= population parameters, B= Sample
Parameters, C= Sample statistics, D= it depends- Sometimes population parameters and
sometimes sample parameters.
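Several answers in this quiz can be checked by brute-force enumeration. The sketch below is illustrative only: the helper names are assumptions introduced here, not part of the quiz. It lists coin-toss sample spaces with `itertools.product` (questions 1 and 3) and computes the standard error s / sqrt(n) from question 11.

```python
import math
from fractions import Fraction
from itertools import product


def coin_sample_space(tosses):
    """All equally likely outcomes of tossing a fair coin `tosses` times."""
    return list(product("HT", repeat=tosses))


def prob_at_least_one_head(tosses):
    """P(at least one head), counted directly from the sample space."""
    space = coin_sample_space(tosses)
    favourable = [outcome for outcome in space if "H" in outcome]
    return Fraction(len(favourable), len(space))


def standard_error(sample_sd, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return sample_sd / math.sqrt(n)


four_coin_outcomes = len(coin_sample_space(4))   # question 1
three_toss_prob = prob_at_least_one_head(3)      # question 3
pulse_se = standard_error(5, 9)                  # question 11
print(four_coin_outcomes, three_toss_prob, round(pulse_se, 3))
```

Counting outcomes directly avoids memorising formulas: four coins give 2^4 = 16 outcomes, and only one of the eight three-toss outcomes has no head.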
2nd Quiz
Q.1. In order to carry out a chi square test on data in contingency table, the observed values in the table
should be. A= close to the expected values, b= all greater than or equal to 5. C = frequencies, d=
quantitative.
Q.2. If two attributes A and B have perfect association the value of coefficient of association is equal to.
A= +1, B= 0, C= -1, D= (r-1 x c-1)
Q.3. The degrees of freedom for chi square are (r-1)(c-1) for a contingency table with r rows and c columns, so for a 2*2 contingency table there are. A= one degree of freedom, b= two degrees of freedom,
c= three degrees of freedom, d= four degrees of freedom.
Q.4. For an r*c contingency table the number of degrees of freedom equal. A= rc, b= r+c, c= (r-1) + (c-1),
d= (r-1)(c-1)
Q.5. For a 3 * 3 contingency table the number of cells in the table are. A= 3, b= 6, c= 9, d= 4.
Q.6. The total area under the curve of chi-square distribution is, A= 1, B= 0.5, C= 0 to infinity, D= - infinity
to + infinity
Q.7. Chi-square curve ranges from. A= - infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.
Q.8.The value of chi- square statistics is always. A= negative, B= 0, C= non negative, D= one.
Q.9.The slope of the simple linear regression equation (x is the independent variable and y is the
dependent variable) represents the. A= mean value of y when x=0, b= change in mean value of y per
unit change in x, c= True value of y for a fixed value of x, d= Variance of the value of x.

Q.10. The range of values of the correlation coefficient is. A= 0 to 1, b= -1 to 0, C= -1 to 1. D= none of the
above.
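The chi-square questions above rest on two facts: the statistic is a sum of squared differences divided by expected counts (so it is never negative), and an r x c table has (r-1)(c-1) degrees of freedom. A minimal sketch, assuming a made-up 2 x 2 table of observed frequencies that does not come from the quiz:

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical observed frequencies, chosen only for illustration.
observed = [[10, 20], [30, 40]]
statistic, df = chi_square_independence(observed)
print(round(statistic, 4), df)
```

Because every term is a square divided by a positive expected count, the statistic cannot drop below zero, which matches the "non negative" answer to Q.8.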
Mid Term
1. With a lower significance level the probability of rejecting a null hypothesis that is actually true:
Decrease
2. Specify which probability distribution to use in a hypothesis test and whether it will be one tailed or
two tailed if the following information is given
t-test two tailed
3. A pair of dice are thrown. Find the probability of getting a total of either 5 or 11:
1/6
4. If we want to test whether the proportions of more than two populations are equal. We use a.
analysis of variance. b. estimation. C. the variance. e. interval estimates. f. none of the these.
5. Which of the following is based on the relationship or association between two or more
variables?
Regression and Correlation
6. Which of the following test could be based on the normal distribution? a. Difference between
independent means, b. difference between dependent means, c. difference between
proportions. d. (a, c but not b) e. All of the above
7. The measure of central tendency listed below is:
1.The mean
8. Quantitative variable are variables measured on a __ scale.
numeric
9. a/2 is called:
Two tailed significance level
10. the normal distribution is the appropriate distribution to use in testing hypotheses about all of
the above
11. scores that differ greatly from the measures of central tendency are called:
extreme scores
12. when a null hypothesis is accepted, it is possible that. a. b. Zcal < Ztab
13. A fair coin is tossed three times. What is the probability that at least one head appears? 7/8
14. When one card is selected at random from a pack of 52 cards playing cards, the possible
selections are: 52
15. Given H find Z and make the statistical decision
Z=2.4 reject H0
16. The number of road accidents is a __ random variable. continuous / discrete
18. When data are classified according to a single characteristics, it is called Qualitative
classification / Simple Classification
19. A relative frequency distribution presents frequencies in terms of
only A and C (Fractions
and percentages)
20. Which of the following is true about the number of variables in regression ?
There can be
only be one dependent variable but multiple independent variables
21. Decision makers make decisions on the appropriate significance level by examining the cost of
a). performing the test. b). A type I error. c). A type II error. d). a and b. e. b and c.
22. In the case that two events A and B are mutually exclusive, P(A union B)= ? P(A)+ P(B)
23. Find x
x=1,7,3,4,6,8,10
N is replaced by n-1 or 39
24. Find x compliment, x= 1,7,3,4,6,8,10. A= 49, B= 23, C= 39, D= 59
25. When referring to a curve that tails off to the left end, you would call it none of these
26. A binomial distribution may be approximated by a Poisson distribution if: 1= n is large, p is large,
2= n is small, p is large, 3= none of these, 4= a and b but not c.
27. Specify which probability distribution to use in a hypothesis test and whether it will be one-tailed or two-tailed given the following information
28. The square of variance of a distribution is the : None of these
29. Which of the following is true regarding the acceptance and rejection region? All of the
above
30. For a normal curve with mean 55 and standard deviation 10, what will be the area under the curve
to the right of the value 55?
a) 1.0
b) 0.68
c) 0.32
d) 0.5
e) none of the above
31. What does regression mean? The general process of predicting one variable from another
variable
32. What is the probability that a randomly selected value of a population is greater than the median of
that population?
(0.5)
33. If P(A or B)=P(A), then
1. A and B are mutually exclusive 2. The Venn diagram areas of A
and B overlap 3. P(A) + P(B) is the joint probability of A and B. 4. None of these. 5. All of these.
34. If the null hypothesis is rejected, then we may be making
a. Correct decision
b. type I error
c. type II error
d. either A or B
e. either A or C

35. A bag contains one rupee, 50-paisa and 25-paisa coins in the ratio
2 : 3 : 5. Their total value is Rs.144. The value of 50-paisa coins is
Rs.24
Rs.36
Rs.48
Rs.72
Rs.80
36. Which of the following normal curves is most likely the curve for u=10, sigma =5?
Curve
for u=20, sigma=10
37. A number between 0 and 1, that is used to measure uncertainty is called Probability
38. Histogram is a graph of frequency distribution
39. Given a =80, n=625,u0=350 and X= 356 Find Z?
1.88
40. The grouped data are called : Difficult to tell
41. u and sigma are parameters
z distribution
42. The values that separate the acceptance region from the rejection region is called Critical values
43. The test statistics is equal to: None of these
44. A fair coin is tossed three times, what is the probability that at least one head appears? 7/8
45. Economists use regression analysis and base their predictions of the annual gross domestic
product (GDP) on the final consumption spending within the economy. What are the dependent
and independent variables for the analysis.
Dependent: GDP; Independent: final;
consumption spending
46. Square root of variance have only values:
non Negative
47. A frequency distribution that contains a class with limits of 10 and under 20 would have a
midpoints:
15
48. Z= _____
\(z = (\bar{x} - \mu)/(\sigma/\sqrt{n})\)
49. Which of the following represents the probability of mutually exclusive events A and B?
P(A)+P(B)
50. In testing hypothesis: alpha + beta is always equal to difficult to tell
51. The standard deviation of a binomial distribution depends upon: 1=success, 2= failure, 2= trial,
3= b and c but not a, 4= a,b and c

52. Suppose we want to test whether the population mean is significantly larger or smaller than 10.
What should our alternative hypothesis be?
u not equal to 10
53. An arrangement in which the order of the objects selected from a specific pool of objects is
important is called
permutation
54. For an upper tailed test of the difference of two means based on dependent samples of size 6
and alpha =.05, the critical value for the test statistic is :
2.015
55. What is the probability that a value chosen at random from a particular population is larger than
the median of the population?
0.5
56. The accuracy of prediction of a variable can be improved by adding more independent variables
to the regression model
57. The power of test is equal to :
1-beta
58. Six white balls and four black balls, Which are indistinguishable apart from colour, are placed in
a bag. If six balls are taken from the bag, Find the probability of their being three white and
three black balls :
8/21
59. The probability of an event occurring given that another event had occurred is called:
Conditional probability
60. The Largest and the smallest values of any given class of a frequency distribution are called:
Class limits
61. If a coin is tossed thrice the sample space consist of 8 elements
62. If two dices are rolled, the possible outcomes are :
36
63. Which of the following is an example of a parameter? n or u
64. For a two tailed test of hypothesis at alpha=0.10, the acceptance region is the entire region:
Between the two critical values
65. In Lower tail
alpha=0.05 then z tabulated is 1.65
66. A two tailed test of a difference between two proportions led to z=1.85, for its standardized
difference of sample proportions. For which of the following significance level would you reject
H0?
Alpha=0.10 (|z| = 1.85 exceeds the two-tailed critical value 1.645 at alpha=0.10 but not 1.96 at alpha=0.05)
67. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? 1= t, 2 = z, 3= x2,
4=f
68. The average of lower and upper class limits is called
Class mark (midpoint)
69. If total number of data points are 120, then we can make a total of __ number of classes. 8
70. Fisher test nm
71. Alpha + beta = 1
72. Upper tailed test
73. Which test will be used if the population is normal and the standard deviation is known: Z test
74. Which one of the following is discrete variable: Number of rooms
75. If the total number of data points are 120, then we can make a total of --------number of classes:
6.
76. If the dependent variable decreases as the independent variable increase: Negative linear
relationship.
77. Numerical quality that describe a population is called: Parameter
78. Degree of freedom of t distribution is. a). N+1, b). n-1, c). n. d). n-1/2
79. Given x =100, sigma_x = 16, and u0= 90, find Z : 0.625 (about 0.63)
80. When n dice are rolled, the possible outcomes are: 6^n
81. The simple probability of an occurrence of an event is called: none of these.
82. When a null hypothesis is H0: u=42, then the alternative hypothesis can be : H1: u less than 42
83. If A is any event in S and A' its complement, then P(A') is equal to : 1-P(A)
84. In regression analysis the variable we would like to predict or explain is called : dependent
variable
85. Histogram is a graph of : frequency distribution.

86. Given H0: u=12, H1: u greater than 12, n=64, x=15, sigma=10, alpha=0.05, find Z and make the
statistical decision: Z=2.4, reject H0
87. Given x =120, u0=100,s=34.75 and n=25, find t : 2.88
88. An arrangement in which the order of the objects selected from a specific pool of objects is
important called : permutation
89. When referring to a curve that tails off to the left end, you would call it : none of these
90. What does the term regression mean: the general process of predicting one variable from
another variable.
91. Which of the following is the first step in calculating the median of data set. Array the data.
92. which of the is not a measure of central tendency? Geometric mean
93. f(x) represents the -------Variable : dependent.
94. The frequency divided by the total number of observations is called. Relative frequency
95. How does the computation of a sample variance differ from the computation of a population
variance? a). u is replaced by x, b). n is replaced by n-1, c) n is replaced by n, d) a and c but
not b, e) a and b but not c
96. If a sample of size m is drawn from one population and a sample of size n from another population, what are the respective degrees of
freedom if one has to apply Fisher's test. a= n-1,m-1, b= n,m. c= n-1,m+1. d= n-1,m. e= m-1,n
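The z and t answers in this paper all use the same pattern: (sample mean - hypothesised mean) divided by the standard error. The sketch below recomputes questions 86 and 87 under that reading; the function names are assumptions for illustration, and the upper-tail critical value comes from the standard normal quantile in Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)), population sigma known."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))


def one_sample_t(sample_mean, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), sigma estimated by the sample s."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


def reject_upper_tail(z, alpha):
    """Upper-tailed z test: reject H0 when z exceeds the (1 - alpha) quantile."""
    critical = NormalDist().inv_cdf(1 - alpha)
    return z > critical


z_q86 = one_sample_z(15, 12, 10, 64)       # question 86
t_q87 = one_sample_t(120, 100, 34.75, 25)  # question 87
decision_q86 = reject_upper_tail(z_q86, 0.05)
print(round(z_q86, 2), round(t_q87, 2), decision_q86)
```

The only difference between the two statistics is which standard deviation goes in the denominator; the choice of reference distribution (normal or t) follows from that.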
Previous quiz.
1. Which of the following is a criteria for selecting a regression line which best represents the data.
A= the mean of the data must agree with the line. B= the sum of squared differences between
the dependent variable must be minimized. C= the sum of the squared horizontal differences in
the independent variable must be minimized. D= the line must agree with at least half of the
data points
2. Which is the probability that a value at random from a particular population is larger than the
median of the population. A= 0.25, B= 0.5, C= 1.0, D= 0.67, E= none of these
3. P(A)=? A= number of favorable outcomes/total number of possible outcomes. B= total
number of possible outcomes/number of favorable outcomes, C= both a and b. D= none of
these
4. The weight in grams of 10 male and 10 female ring-necked pheasants are obtained. The
variance for each are different. In order to test the hypothesis that the variance of the different
genders favors males over females, which of the following test may be used? T- test one tailed.
5. In an un paired sample t-test with sample size n1=11 & n2= 11, the value of tabulated should be
obtain from. A= 10 degree of freedom, B= 21 degree of freedom, C= 22 degree of freedom, D=
20 degree of freedom.
6. Given \(\sum (x-\bar{x})(y-\bar{y}) = 0\), \(\sum (x-\bar{x})^2 = 10\) and n=5, find the coefficient of
correlation. A= 1, B= 2, C= 0, D= 0.5
7. A time series has. A= two components, B= three components, C= four components, D= five
components.
8. If the regression lines of y on x and x on y are respectively given by 2x-3y=0 and 4y-5x=8 find out
values of two regression coefficients of y on x and x on y. A= 3/2 and 5/4, B= and 1/5, C= 2/3
and 4/5, D= 2/5 and .
9. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,
D= (r+1)(c-1)
10. Given \(\chi^2 = 20.178\), d.f. = 4 and alpha = 0.01, find the table value of \(\chi^2\) and make the statistical
decision. A= \(\chi^2_{0.01}(4) = 13.277\), reject H0. B= \(\chi^2_{0.01}(4) = 14.277\), reject H1.
C= \(\chi^2_{0.01}(4) = 13.277\), reject H1. D= \(\chi^2_{0.01}(4) = 14.277\), reject H0.
11. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=
smooth-out the time series. D= none of the above.

12. Suppose that y= 1, when x= 0, then y=2 where x= 2. Find the least square estimate b. A= 2.0, B=
1.0, C= 1.5, D= 2.5.
13. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.
14. Suppose that y= 1, when x= 0, then y=0 when x= 1 and that y=3 where is x=2. In this case find
the sample correlation, A= 1, B=2, C= 3, D= 4.
15. In semi averages method, if the number of values is odd then we drop: A= first value, B= third
value, C= last value, D= middle value, E=middle two value.
16. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate a. A= 2, B= 3, C= 4, D= 1.
17. Which of the following normal curves looks most like the curve for u=10, sigma=5?
A= Curve for u=10, sigma=10. B= Curve for u=20, sigma=10. C= Curve for u=20, sigma=5. D= Curve for u=13, sigma=3. E= None of these.
18. Degree of freedom of t-distribution is. A=n+1. B=n-1. C=n. D=n-1/2
19. Which of the following tests could be based on the normal distribution. A=Difference
between independent means. B=Difference between dependent means. C=Difference between
proportions. D=All of the above. E=a and c but not b
20. If the null hypothesis is rejected then we may be making. A=a correct decision. B=a Type I error.
C=a Type II error. D=either A or B. E=either A or C
21. In a lower-tailed test with \(\alpha\)=0.05, the tabulated z is. A=.96. B=1.45. C=2.03. D=1.65.
22. The normal distribution is the appropriate distribution to use in testing hypotheses about. A=A
proportion, when npH0>5 and nqH0>5. B=A mean, when sigma is known and the population is
normal. C=A mean, when sigma is unknown but n is large. D=All of the above. E=None of these
23. Which of the following is true regarding the acceptance and rejection region. A=The acceptance
region is the range of values of the sample statistic within which, if the value of the sample
statistic falls, the null hypothesis is accepted. B=The rejection region is the range of values
of the sample statistic within which, if the value of the sample statistic falls, the null
hypothesis is rejected. C=The rejection region is the range of values of the sample statistic
within which, if the value of the sample statistic falls, the alternative hypothesis is accepted.
D=All of the above. E=only A and B.
24. If the total number of data points are 120 then we can make a total of-------- number of classes.
A=8. B=7. C=6. D=5
25. r=+1. A= no correlation. B=Negative correlation. C=Perfect correlation. D=None of above.
26. \(\mu\) and \(\sigma\) are the parameters of the------------. A=f distribution. B=z distribution. C= t
distribution. D= None of above.
27. Which of the following represents the probability of mutually exclusive events A and B.
A=P(A)+P(B). B=P(A)+P(B)+P(A^B). C=P(A)+P(B)-P(A^B). D=P(A)-P(B).
28. Specify which probability distribution to use in a hypothesis test and whether it will be one
tailed or two tailed if the following information is given. Ho:u=15, H1:u not equal to 15, xbar=14.8, n=20. A= z-test; one tailed. B= z-test; two tailed. C= t-test; one tailed. D= t-test; two
tailed. E= a and b but not c.
29. The probability of an event occurring given that another event has occurred is called. A= Joint
probability. B=Conditional probability. C=Binomial probability. D=Discrete probability.
30. Z=.
A= \(z = (\bar{x} - \mu)/(\sigma/\sqrt{n})\), B= \(z = (\bar{x} - \mu)/(\sigma^2/\sqrt{n})\), C= \(z = (x - \mu)/(\sigma/\sqrt{n})\),
31. The simple probability of an occurrence of an event is called the. A=Bayesian probability. B=Joint
probability. C=Marginal probability. D=Conditional probability. E=None of these.
32. When referring to a curve that tails off to the left end, you would call it. A=Symmetrical.
B=Skewed right. C=Positively skewed. D=All of these. E=None of these.
33. How does the computation of a sample variance differ from the computation of a population
variance. A=u is replaced by x. B=N is replaced by n-1. C= N is replaced by n. D= a and c but not
b. E= a and b but not c.
34. Find \(\bar{x}\). x=1,7,3,4,6,8,10. A= 49. B=23. C=39. D=59.

35. A pair of dice thrown,find the probability of getting a total of 5 or 11. A=2/6. B=6/6. C=3/6.
D=1/6.
36. If the sample of size m is drawn from one population and size of n from another
population,what are the respective degrees of freedom if one has to apply fisher's test. A=n-1,
m-1 . B=n,m. C= n-1,m+1. D=n-1,m. E=m-1,n.
37. Which of the following is an example of a parameter. A=x. B=n. C=u. D= All of these. E= b and c
but not a.
38. When a null hypothesis is Ho,u=42 then the alternative hypothesis can be. A=H1,u>42.
B=H1;u<42. C=H1;u=40. D=H1;u=40.
39. Quantitative variable are variables of measured on a---------scale. A=theoretical. B= numeric. C=
ordinary. D= ratio
40. For an upper tailed test of the difference of two means based on dependent samples of size 6
and alpha=0.05 the critical value for the test statistic is. A=2.015. B=1.645. C=1.812. D=1.782. E=None
of these.
41. F(x) represent the-------- variable. A=Independent. B=Dependent. C=a and b. D= none of these.
42. If we want to test whether the proportions of more than two populations are equal, we
use. A=Analysis of variance. B= Estimation. C=The variance. D=Interval estimates. E=none of
these.
43. In the case that two events a and b are mutually exclusive P(AUB). A=P(A)+P(B). B=P(A)+P(B)-P(A
intersection B). C=P(A)xP(B). D=P(A intersection B)/P(B).
44. For two tailed test of hypothesis at a=0.10 the acceptance region is the entire region.A=To the
right of the negative critical values.B=Between the two critical values.C=Outside of the two
critical values. D=To the left of the positive critical value. E=None of these.
45. Which of the following is true for any regression model. A=The y- intercept of the model must
agree with the y-intercept of the data. B=There will always be a linear relationship between a
regression model and data.C=The choice of regression model to the best represent the data is
based on observing the trend in data. D=The standard deviation of the regression model is
always exactly the same as the standard deviation of the data.
46. Specify which probability distribution to use in a hypothesis test and whether it will be one
tailed or two tailed given the following information. Ho,u< and equal to 27. H1:u>27. X- bar = 33,
standard deviation=4, n=50.A= z-test:one tailed. B=z- test: two tailed. C= t-test one tailed. D= ttest; two tailed. E=None of these.
47. The square of the variance of a distribution is the. A= Standard deviation. B=Mean. C=Range.
D=Absolute deviation. E= None of these.
48. With a lower significance level, the probability of rejecting a null hypothesis that is actually true.
A=Decreases. B=Remains the same. C=Increases. D=Increases as the mean changes. E=None of
these.
49. The number of road accidents is a--------- random variable. A= Discrete. B= Continuous. C=Both.
D=None of above.
50. Six white balls and four black balls, which are indistinguishable apart from color, are placed in a
bag. If six balls are taken from the bag, find the probability of their being three white and three black.
A=8/10. B=8/21. C=10/21. D=21/8.
51. Square root of variance have only values. A=Less than 10. B= Greater than 10. C= Less than 0.
D=Greater than 0. E=Non negative.
52. Suppose we want to test whether a population mean is significantly larger or smaller than
10. What should our alternative hypothesis be. A=u<10. B=u>10. C=u=10. D=u not equal to 10.
E=None of these.
53. A two tailed test of a difference between two proportions led to z= 1.85 for its standardized
difference of sample proportions. for which of the following significance level would you reject
H0? A= alpha=0.05, B= alpha=0.10, C= alpha=0.02, D= (a) and (b), but not (c). E= none of these.

54. For a normal curve with u=55 and sigma=10, how much area will be found under the curve to the
right of the value 55? A= 1.0, B= 0.68, C= 0.5, D= 0.32, E= none of these
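Two of the counting answers in this quiz (question 35, a dice total of 5 or 11, and question 50, three white and three black balls) can be verified exactly with fractions. This is a sketch under the usual assumptions of fair dice and draws without replacement; the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob_dice_total_in(totals):
    """P(sum of two fair dice is one of `totals`), by listing all 36 rolls."""
    rolls = list(product(range(1, 7), repeat=2))
    hits = [roll for roll in rolls if sum(roll) in totals]
    return Fraction(len(hits), len(rolls))


def prob_colour_split(white, black, draw_white, draw_black):
    """Hypergeometric probability of drawing the given colour split without replacement."""
    ways = comb(white, draw_white) * comb(black, draw_black)
    return Fraction(ways, comb(white + black, draw_white + draw_black))


total_5_or_11 = prob_dice_total_in({5, 11})
three_white_three_black = prob_colour_split(6, 4, 3, 3)
print(total_5_or_11, three_white_three_black)
```

Using `Fraction` keeps the results in the same form as the answer options, so they can be compared without rounding.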
Final paper
1. If the null hypothesis is rejected, then we may be making
a. Correct decision
b. type I error
c. type II error
d. either A or B
e. either A or C
2. Given \(r_{xy} = -0.75\), \(S_y = 5\), \(\sum (x-\bar{x})(y-\bar{y}) = -15n\), find \(S_x\). A= 5, b=3, c=2, d= 4
3. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=
smooth-out the time series. D= none of the above.
4. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.
5. Given x =120, u0=100,s=34.75 and n=25, find t : a= 3, b= 4, c= 2.88, d= 2
6. Given x=1, y=8 and b=2 find the value of the intercept a. a= 7, b= 6, c= 8, d=10
7. Specify which probability distribution to use in a hypothesis test and whether it will be one
tailed or two tailed if the following information is given. Ho:u=15,H1:u not equal to 15, xbar=14.8, n=20. A= z-test ;one- tailed. B= z- test; two tailed. C=t-test;one tailed. D= t-test;two
tailed. E= a and b but not c
8. Given a =80, n=625,u0=350 and X= 356 Find Z? a=3, b=1, c= 2, d=1.88
9. Given \(\chi^2 = 20.178\), d.f. = 4 and alpha = 0.01, find the table value of \(\chi^2\) and make the statistical
decision. A= \(\chi^2_{0.01}(4) = 13.277\), reject H0. B= \(\chi^2_{0.01}(4) = 14.277\), reject H1.
C= \(\chi^2_{0.01}(4) = 13.277\), reject H1. D= \(\chi^2_{0.01}(4) = 14.277\), reject H0.
10. Any statement whose validity is tested on the basis of a sample is called. A= null hypothesis, b=
alternative hypothesis, c= statistical hypothesis, d= simple hypothesis
11. X2 curve ranges from. A= infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.
12. Semi-average method is used for measurement of trend values when. A=trend is linear, b=
observed data contains yearly values, c= the given time series contains odd number of values,
d= none of the above
13. Given a=80, n=625,u0=350 and X= 356 Find Z? a=1.88, b=1.99, c= 1.77, d=1.66
14. Given the equation of the straight line Y=a+bx and the values a=45, b=-10 and x=3, find the
value of y. a=15, b=16, c=17, d=18
15. r=+1. A= no correlation. B=Negative correlation. C=Perfect correlation. D=None of above.
16. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,
D= (r+1)(c-1)
17. The hypothesis u less than 10 is a. a= simple hypothesis, b= composite hypothesis, c=
alternative hypothesis, d= difficult to tell
18. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate a. A= 2, B= 1, C= 1.5, D= 2.5.
19. If the respective values of f0= 21,38,32,29,36,25,41,23 and
fe=31.31,27.69,32.37,28.63,32.37,28.63,33.96,30.64,then find x2, a=12, b=13, c=11, d=11.22
20. P(type II error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.
21. Given H0: u=12, H1: u greater than 12, n=64, x=15, sigma=10, alpha=0.05, find Z and make the
statistical decision: a= Z=2.4, reject H0, b= z= 2.4 accept H0, c= z= 3.4 reject H1, d= z=3.4 reject
H0
22. Given f0=30, 75, 45, 30, 75, 45 fe=52.5, 52.5, 37.5, 60.0, 60.0, df=2 and alpha = 0.05 find x 2. A=
29.786, b= 30.0, c=26.99, d=23.
23. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? a= t, b = z, c= f,
d=X2
24. Given X= 100, ox= 16, and u0=90, find Z= A= 0.6, B= 0.63, C= 0.62, 0.5.

25. If X2=13.95, df=4, X20.05(4)=13.227, we make the following statistical decision. A= We accept H0 at
alpha = 0.01 and a=0.05, b= we reject H0 at alpha = 0.05 but not at a= 0.01, C= We reject H0 at
a=0.01 but not at a=0.05, d= we reject H0 at a=0.01 and a=0.05.
26. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the sample
correlation coefficient r? . A= 2, B= 1, C= 1.5, D= 2.5.
27. The alternative hypothesis is called, a= null hypothesis, b= statistical hypothesis, c= research
hypothesis, d= single hypothesis
28. A time series has. A= two components, B= three components, C= four components, D= five
components.

29. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate b. A= 2, B= 1, C= 1.5, D= 2.5.
30. P(type I error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.
31. In the semi averages method, if the number of values is odd then we drop: A= first value, B=
third value, C= last value, D= middle value, E=middle two value.
32. Analysis is the statistical tool we can use to describe the degree to which one variable is linearly
related to another, a= regression, b= correlation, c= variances, d= none of the above.
33. A statement which is tested for the purpose of rejection under the assumption that it is true is
called. A= null hypothesis, B= alternative hypothesis, c= simple hypothesis, d= composite
hypothesis.
34. Which of the following is a criterion for selecting a regression line which best represents the data.
A= the mean of the data must agree with the line. B= the sum of squared differences between
the dependent variable must be minimized. C= the sum of the squared horizontal differences in the
independent variable must be minimized. D= the line must agree with at least half of the data points
36. The choice of one-tailed test and two tailed test depends upon. A= null hypothesis, b=
alternative hypothesis, c= none of these, d= composite hypothesis.
37. Given \(\chi^2 = 20.178\), d.f. = 4 and alpha = 0.01, find the table value of \(\chi^2\) and make the statistical
decision. A= \(\chi^2_{0.01}(4) = 13.277\), reject H0. B= \(\chi^2_{0.01}(4) = 14.277\), reject H1.
C= \(\chi^2_{0.01}(4) = 13.277\), reject H1. D= \(\chi^2_{0.01}(4) = 14.277\), reject H0.
38. In regression analysis the variable we would like to predict or explain is called : A. independent
variable b. dependent variable, c= regression coefficient, d= residual error
39. The degree of confidence is equal to . a= alpha, b= beta, c= 1-alpha, d= 1-beta.
40. Given u0 = 100, X= 120, n = 25, s = 35.5, find t: 2.82
41. suppose that the null hypothesis is true and it is rejected, is known as. A= type I error, and its
probability is Beta. B= type I error, and its probability is alpha. C= type II error, and its
probability is alpha. D= type II error, and its probability is Beta.
42. Degree of freedom of t distribution is. a). N+1, b). n-1, c). n. d). n-1/2
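The regression and chi-square items in this paper can be recomputed from their definitions. The sketch below is a plain-Python illustration with assumed function names. It fits y = a + bx to the three points used in questions 18, 26 and 29, and evaluates the goodness-of-fit statistic for the observed and expected frequencies listed in question 19.

```python
import math


def least_squares(xs, ys):
    """Return (a, b, r): intercept, slope and sample correlation for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    a = y_bar - b * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Points from questions 18, 26 and 29: (0, 1), (1, 2), (2, 3).
a, b, r = least_squares([0, 1, 2], [1, 2, 3])

# Frequencies as printed in question 19.
fo = [21, 38, 32, 29, 36, 25, 41, 23]
fe = [31.31, 27.69, 32.37, 28.63, 32.37, 28.63, 33.96, 30.64]
chi2_q19 = chi_square_statistic(fo, fe)
print(a, b, r, round(chi2_q19, 2))
```

The three points lie exactly on a straight line, so the fitted slope and intercept reproduce them and the correlation is perfect.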

Basic Probability

Library, Teaching and Learning


2014

Basic rules for calculating simple probability


Basic definition

Formula, symbols

Probability of an event, A, occurring:
\(P(A) = \dfrac{\text{number of successful outcomes}}{\text{total number of possible outcomes}}\)

Complementary Events
Events in the whole sample space but not one of the
outcomes included in A are complementary.
\(P(\tilde{A}) = P(\text{not } A) = 1 - P(A)\)

Limits of P
P(any event occurring) lies between 0 and 1: \(0 \le P(A) \le 1\)
If event A is certain not to happen: \(P(A) = 0\)
If event A is certain to happen: \(P(A) = 1\)

Union or General Addition Rule

Probability that either one or other event occurs:

$P(A \text{ or } B) = P(A \cup B)$

Mutually Exclusive Events

Events that cannot both occur at the same time
have no intersection.
For events that are mutually exclusive:

$P(A \text{ and } B) = 0$
$P(A \text{ or } B) = P(A \cup B) = P(A) + P(B)$

Non-Mutually Exclusive Events

Events that can both occur at the same time have
some intersection.

$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

Note that this rule applies in both cases: if there is no intersection, zero is subtracted.
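
The addition rule can be checked by counting outcomes directly. The sketch below is illustrative only: the sample space (one roll of a fair die) and the events A, B and C are assumptions chosen for the example, not part of the handout.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Classical probability: favourable outcomes / possible outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

die = {1, 2, 3, 4, 5, 6}   # assumed sample space: one roll of a fair die
A = {2, 4, 6}              # event A: an even number
B = {4, 5, 6}              # event B: a number greater than 3
C = {1, 3}                 # event C: mutually exclusive with A

# General addition rule: subtract the intersection so it is not counted twice.
p_a_or_b = prob(A, die) + prob(B, die) - prob(A & B, die)
print(p_a_or_b, p_a_or_b == prob(A | B, die))   # 2/3 True

# Mutually exclusive events: the intersection is empty, so zero is subtracted.
p_a_or_c = prob(A, die) + prob(C, die) - prob(A & C, die)
print(p_a_or_c, prob(A & C, die))               # 5/6 0
```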

Statistically Independent Events


Events where the occurrence of one event does not
influence the likelihood of the other occurring

If A and B are independent, then

$P(A \text{ and } B) = P(A) \times P(B)$

Note the reverse of this is also true:

If $P(A \text{ and } B) = P(A) \times P(B)$, then A and B are independent.
This is a specifically mathematical definition. Do not rely on gut feeling or instinct
to tell you whether two events are statistically independent or not.

Conditional Probability
This arises when we are calculating the probabilities of a
particular event, A, given that we know the condition of
another event, B. It is the probability that an event
occurs given that another event has occurred.

$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{n(A \text{ and } B)}{n(B)}$

P(A|B) means: the probability that A will occur given that B has already occurred.

Also note: if the events are independent, then

$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{P(A) \times P(B)}{P(B)} = P(A)$

i.e., if events A and B are independent then the conditional probability that A occurs, given that
event B has occurred, is simply the probability that event A occurs.
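
The definition of conditional probability and the independence test can both be tried on a small example. This is a sketch under assumed data: a fair die and three hypothetical events A, B and C, none of which come from the handout.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}   # assumed sample space: one roll of a fair die

def prob(event):
    return Fraction(len(event & die), len(die))

def conditional(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """The mathematical test: P(a and b) == P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A = {2, 4, 6}       # an even number
B = {1, 2, 3, 4}    # a number of 4 or less
C = {4, 5, 6}       # a number greater than 3

print(conditional(A, B), prob(A), independent(A, B))   # 1/2 1/2 True
print(conditional(A, C), prob(A), independent(A, C))   # 2/3 1/2 False
```

Knowing B leaves the chance of an even number unchanged, so A and B are independent; knowing C raises it, so A and C are not. This is the kind of result that is hard to judge by instinct.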

Expected Value
The expected value of a random variable is the
mean of the random variable

$E(X) = x_1 p(x_1) + x_2 p(x_2) + x_3 p(x_3) + \dots + x_n p(x_n)$

That is, to work out the expected value of a random variable, multiply each possible value of X by
its probability and add these products.
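
As a quick illustration of the "multiply and add" recipe, here is a minimal sketch that computes the expected value for the score on one roll of a fair die. The die distribution is an assumed example, and `expected_value` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of value * probability over a {value: probability} mapping."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in distribution.items())

# Assumed example: each face of a fair die has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(fair_die))          # 7/2
print(float(expected_value(fair_die)))   # 3.5
```

The result, 3.5, is not a score the die can actually show; the expected value is a long-run average, not a prediction for a single roll.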

Presentation of information
As well as just being written out, information can be presented in a table or as a diagram.
Examine the following information.
The set of digits, D, contains the numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The set of even numbers, E, is {2, 4, 6, 8}
The set of odd numbers, O, is {1, 3, 5, 7, 9}
The set of prime numbers, P, is {2, 3, 5, 7}
This Venn diagram shows the relationships between the sets. Note there are some numbers in
more than one grouping and zero is all on its own.

[Venn diagram of the sets: D = digits, E = even numbers, O = odd numbers, P = prime numbers]

A summary of this information could have been written in table form, showing the number of
digits in each category:
            Odd numbers   Even numbers   Neither   Total
Prime            3             1            0         4
Not Prime        2             3            1         6
Total            5             4            1        10

That is, there are 3 digits that are both odd and prime, 2 digits that are both odd and not
prime and 5 odd digits in total etc.
Study how the following probabilities are calculated, using the rules given above.

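$P(\text{even}) = \frac{4}{10} = 0.4$

$P(\text{odd}) = \frac{5}{10} = 0.5$

$P(\text{prime and even}) = \frac{1}{10} = 0.1$

$P(\text{prime and odd}) = \frac{3}{10}$

$P(\text{neither even nor odd}) = \frac{1}{10} = 0.1$ (from Venn Diagram)

$P(\text{even, odd or neither}) = 1$

$P(\text{odd and even}) = 0$

Using the formula $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$:

$P(\text{prime or even}) = \frac{4}{10} + \frac{4}{10} - \frac{1}{10} = \frac{7}{10}$

$P(\text{prime or odd}) = \frac{5}{10} + \frac{4}{10} - \frac{3}{10} = \frac{6}{10}$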

Probabilities are very easy to calculate if data is given in table form. If you are given
information not in table form, try to tabulate it before you start your calculations.
The table is sometimes referred to as a contingency table.
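
The digit example can be reproduced with Python sets, which makes the link between "and"/"or" and intersection/union explicit. This is a minimal sketch using the sets D, E, O and P defined earlier; the helper name `prob` is made up for illustration.

```python
from fractions import Fraction

D = set(range(10))        # digits 0-9
E = {2, 4, 6, 8}          # even numbers
O = {1, 3, 5, 7, 9}       # odd numbers
P = {2, 3, 5, 7}          # prime numbers

def prob(event):
    """Probability of drawing a digit from D that lies in the event."""
    return Fraction(len(event & D), len(D))

print(prob(P & E))                       # 1/10  prime AND even
print(prob(P & O))                       # 3/10  prime AND odd
print(prob(P | E))                       # 7/10  prime OR even
print(prob(P | O))                       # 3/5   prime OR odd (6/10)
print(prob(D - (E | O)))                 # 1/10  neither even nor odd
print(prob(O & E))                       # 0     mutually exclusive

# The addition rule gives the same answer as counting the union directly.
assert prob(P) + prob(E) - prob(P & E) == prob(P | E)
```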

In some texts you will see:


∩ used for intersection and defined by the word "and".
∪ used for union and defined by the word "or".
PROBABILITY
Addition law (events not mutually exclusive):

P(A or B) = P(A) + P(B) - P(A and B)


For mutually exclusive events: P(A or B) = P(A) + P(B)

P(A and B) = 0
Multiplication law: P(A and B) = P(A)P(B|A) = P(B)P(A|B)
If statistically independent:

P(A|B) = P(A) and P(B|A) = P(B)


P(A and B) = P(A)P(B)
DO NOT ABANDON YOUR OWN LOGIC: think about the questions and the
likely answer.

PROBABILITY - PRACTICE QUESTIONS


1. If two events are mutually exclusive, the probability that they both occur is:
A 0.00

B 0.50

C 1.00

D Cannot be determined from the information given

2. When using the general multiplication rule, P(A and B) is equal to:
A P(A|B).P(B)

B P(A).P(B)

C P(B)/P(A)

D P(A)/P(B)

3. A recent survey of banks revealed the following distribution for the interest rate being
charged on a home loan (based on a 15-year mortgage with a 20% deposit):
Interest rate   7.0%   7.5%   8.0%   8.5%   > 8.5%
Probability     0.12   0.23   0.24   0.35   0.06
If a bank is selected at random from this distribution, what is the chance that the interest
rate charged on a home loan will exceed 8.0%?
A 0.06

B 0.41

C 0.59

D 1.00

Use the following information for the next two questions.


Mothers Against Drunk Driving is a very visible group whose main focus is to educate the
public about the harm caused by drunk drivers. A study was recently done that emphasised
the problem we all face with drinking and driving. Four hundred accidents that occurred on
a Saturday night were analysed. Two items noted were the number of vehicles involved and
whether alcohol played a role in the accident. The numbers are shown below:
                           Number of Vehicles Involved
Did alcohol play a role?      1      2      3    Totals
Yes                          50    100     20      170
No                           25    175     30      230
Totals                       75    275     50      400
4. What proportion of accidents involved alcohol and a single vehicle?
A 25/400

B 50/400

C 195/400

D 245/400

5. Given that alcohol was not involved, what proportion of the accidents were multiple vehicle?
A 50/170

B 120/170

C 205/230

D 25/230

6. The connotation "expected value" or "expected gain" from playing Roulette at a casino
means:
A the amount you expect to gain on a single play

B the amount you expect to gain in the long run over many plays

C the amount you need to break even over many plays

D the amount you should expect to gain if you are lucky

7. If two events are collectively exhaustive, the probability that one or the other occurs is:
A 0

B 0.50

C 1.00

D Cannot be determined from the information given

8. There are 100 female students and 230 male students in a class. The probability that a
randomly picked student is a female is:
A 0

B 0.50

C 0.30

D Cannot be determined from the information given

9. According to a survey of American households, the probability that the residents own two
cars IF annual household income is over $25,000 is 80%. Of the households surveyed,
60% had incomes over $25,000 and 70% had two cars. The probability that the residents
of a household own two cars AND have an income less than or equal to $25,000 a year is:
A 0.12

B 0.18

C 0.22

D 0.48

10. A company has two machines that produce widgets. An older machine produces 23%
defective widgets, while the new machine produces only 8% defective widgets. In addition,
the new machine produces three times as many widgets as the older machine does.
Given a randomly chosen widget was tested and found to be defective, what is the
probability it was produced by the new machine?
A 0.08

B 0.15

C 0.489

D 0.511

Use the following information for the next two questions.


A certain sales company has both male and female employees. These employees either
worked overtime (extra hours) or did not. The probability that an employee chosen at random
was male was 0.60. The probability that a randomly chosen employee worked overtime was
0.45.
11. What is the probability that an employee chosen at random will be female?
12. The probability that an employee chosen at random is both male AND works overtime is
0.25. What is the probability that a randomly chosen employee is male OR works
overtime? Hint: to answer this question it could help to construct a 2x2 contingency table.
Use the following information for the next three questions.
The marks (pass or fail) of 100 QMET103 students were summarised according to
student gender:
          Passed   Failed
Male        20       20
Female      45       15

13. If a student is selected at random, what is the probability that the student passed QMET103?
14. If a student is selected at random, what is the probability that the student failed QMET103
AND is male?
15. Given that the selected student had passed, what is the probability that the student was
male?

16. A local retail store surveyed 1000 people and asked whether they intended to purchase a
large television over the next 12 months. Twelve months later, the same respondents
were contacted and asked whether they actually purchased the television.
Their responses are summarized in the following table:
                         Actually Purchased
Planned to purchase        Yes       No
Yes                        200       50
No                         100      650

a) What is the probability that a randomly selected person planned to purchase a large
television?
b) What is the probability that a randomly selected person planned to purchase a
television AND actually purchased a television?
c) What is the probability that a randomly selected person planned to purchase a
television OR actually purchased a television?
d) Given that a randomly selected person planned to purchase a television, what is the
probability that he/she actually purchased a television?
e) Are the two events, planning to purchase a television and actually purchasing a
television, statistically independent? (Show working).
17. 300 students were sampled to determine attitudes to internal assessment workloads.
Students from both Commerce and Science Divisions were sampled and the following table
produced:

            Workload     Workload      Workload
            too light    about right   too much
Science        20           30            50
Commerce      100           20            80

a) What is the probability that a randomly selected person in the sample considers the
workload too light?
b) What is the probability that a randomly selected person in the sample considers the
workload about right AND too light?
c) What is the probability that a randomly selected person in the sample is a commerce
student OR considers the workload too much?
d) Given that a randomly selected student is from the Commerce Division, what is the
probability that the student considers the workload about right?
e) What is the probability that a randomly selected student is not a science student AND
they think the workload is too light?

18. There are 50 students in the Lincoln University Rugby Club and 20 of them take vitamin C
daily. 30% of Rugby Club students catch a cold each year. 20% of students who take
Vitamin C every day caught a cold last year.
a) Prepare a contingency table for the above information.
b) What is the probability that a randomly selected student who does not take Vitamin C
every day caught a cold last year?
c) Given that the randomly selected student caught a cold last year, what is the
probability that he takes Vitamin C?
d) Are taking Vitamin C and catching a cold independent events? Support your answer
with appropriate mathematical calculations.

19. A soft drink company is interested in introducing a new Cola brand to the market. Initially
they developed three different flavours and want to select the flavour which would be the
most popular one. Their research department randomly selected 100 males and 100
females and asked them to choose the best flavour between the three flavours (say A, B
and C). The results are summarised in the following table:
Flavour   Male   Female
A          25      30
B          35      50
C          40      20

a) What is the probability that a person likes flavour A?


b) What is the probability that a randomly selected person is a female and likes flavour
C?
c) Given that the randomly selected person is male, what is the probability that he likes
flavour C?
d) What is the probability that a randomly selected person is female or likes the flavour
A?
e) If two persons are randomly selected without replacement, what is the probability that
both persons selected will like flavour C?
SOLUTIONS
Questions 1 - 6
1 A
2 A
5 C
6 B
Questions 7-10
1 C

Questions 11-15
11 0.4
12 0.8
13 0.65
14 0.2
15 0.31

Question 16
a 0.25
b 0.2
c 0.35
d 0.8
e No

Question 17
a 0.4
b 0 (a student cannot rate the workload both about right and too light)
c 0.83
d 0.1
e 0.33

Question 18
a
              Took Vit C   No Vit C   Total
Caught cold        4          11        15
No cold           16          19        35
Total             20          30        50

b 0.3667
c 0.2667
d P(Vit C) × P(cold) = 0.4 × 0.3 = 0.12; P(cold and Vit C) = 0.08 ≠ 0.12

Hence not statistically independent.
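
The Question 18 answers follow directly from the cell counts, as this short sketch shows. The counts are derived from the question (50 students, 20 taking Vitamin C, 30% catching a cold, 20% of Vitamin C takers catching a cold); the variable names are made up for illustration.

```python
from fractions import Fraction

total = 50
vit_c = 20
cold = Fraction(30, 100) * total            # 15 students caught a cold
vit_c_and_cold = Fraction(20, 100) * vit_c  # 4 Vitamin C takers caught a cold
no_vit_c_and_cold = cold - vit_c_and_cold   # 11

# b) P(cold | no Vitamin C)
p_cold_given_no_vit_c = no_vit_c_and_cold / (total - vit_c)
# c) P(Vitamin C | cold)
p_vit_c_given_cold = vit_c_and_cold / cold
# d) independence test: does P(cold and Vit C) equal P(cold) * P(Vit C)?
p_joint = vit_c_and_cold / total
p_product = (cold / total) * Fraction(vit_c, total)

print(round(float(p_cold_given_no_vit_c), 4))   # 0.3667
print(round(float(p_vit_c_given_cold), 4))      # 0.2667
print(float(p_joint), float(p_product), p_joint == p_product)   # 0.08 0.12 False
```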

Question 19
a 0.275
b 0.1
c 0.4
d 0.625
e 0.0889
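
Part e of Question 19 is the one answer here that needs sampling without replacement: after the first person is chosen, both the number who like flavour C and the number of people remaining drop by one. A minimal sketch, using the counts from the Question 19 table (variable names are illustrative):

```python
from fractions import Fraction

likes_c = 40 + 20     # males plus females who chose flavour C
total = 100 + 100     # everyone surveyed

# Multiplication law: P(first likes C) * P(second likes C | first likes C)
p_both_like_c = Fraction(likes_c, total) * Fraction(likes_c - 1, total - 1)

print(p_both_like_c)                    # 177/1990
print(round(float(p_both_like_c), 4))   # 0.0889
```

With replacement the answer would be 0.3 × 0.3 = 0.09, so the difference is small here but not zero.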

MCQ TESTING OF HYPOTHESIS


MCQ 13.1
A statement about a population developed for the purpose of testing is called:
(a) Hypothesis
(b) Hypothesis testing
(c) Level of significance (d) Test-statistic
MCQ 13.2
Any hypothesis which is tested for the purpose of rejection under the assumption that it is true is
called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Composite hypothesis
MCQ 13.3
A statement about the value of a population parameter is called:
(a) Null hypothesis
(b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis
MCQ 13.4
Any statement whose validity is tested on the basis of a sample is called:
(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Simple hypothesis

MCQ 13.5
A quantitative statement about a population is called:
(a) Research hypothesis (b) Composite hypothesis (c) Simple hypothesis (d) Statistical hypothesis
MCQ 13.6
A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false is
called:
(a) Simple hypothesis
(b) Composite hypothesis (c) Statistical hypothesis
(d) Alternative hypothesis
MCQ 13.7
The alternative hypothesis is also called:
(a) Null hypothesis (b) Statistical hypothesis

(c) Research hypothesis

(d) Simple hypothesis

MCQ 13.8
A hypothesis that specifies all the values of parameter is called:
(a) Simple hypothesis
(b) Composite hypothesis
(c) Statistical hypothesis

(d) None of the above

MCQ 13.9
The hypothesis 10 is a:
(a) Simple hypothesis
(b) Composite hypothesis
(c) Alternative hypothesis
(d) Difficult to tell.

MCQ 13.10
A hypothesis that specifies the population distribution is called:
(a) Simple hypothesis
(b) Composite hypothesis
(c) Alternative hypothesis
(d) None of the above
MCQ 13.11
A hypothesis may be classified as:
(a) Simple
(b) Composite
(c) Null
(d) All of the above

MCQ 13.12
The probability of rejecting the null hypothesis when it is true is called:
(a) Level of confidence
(b) Level of significance
(c) Power of the test

(d) Difficult to tell

MCQ 13.13
The dividing point between the region where the null hypothesis is rejected and the region where it is not
rejected is said to be:
(a) Critical region
(b) Critical value
(c) Acceptance region
(d) Significant region
MCQ 13.14
If the critical region is located equally in both sides of the sampling distribution of test-statistic, the test is
called:
(a) One tailed
(b) Two tailed
(c) Right tailed
(d) Left tailed
MCQ 13.15
The choice of one-tailed test and two-tailed test depends upon:
(a) Null hypothesis
(b) Alternative hypothesis
(c) None of these
(d) Composite hypotheses
MCQ 13.16
Test of hypothesis Ho: μ = 50 against H1: μ > 50 leads to:
(a) Left-tailed test
(b) Right-tailed test
(c) Two-tailed test
(d) Difficult to tell

MCQ 13.17
Test of hypothesis Ho: μ = 20 against H1: μ < 20 leads to:
(a) Right one-sided test
(b) Left one-sided test
(c) Two-sided test

(d) All of the above

MCQ 13.18
Testing Ho: = 25 against H1: 20 leads to:
(a) Two-tailed test (b) Left-tailed test (c) Right-tailed test (d) Neither (a), (b) and (c)
MCQ 13.19
A rule or formula that provides a basis for testing a null hypothesis is called:
(a) Test-statistic
(b) Population statistic
(c) Both of these
(d) None of the above
MCQ 13.20
The range of test statistic-Z is:
(a) 0 to 1
(b) -1 to +1
(c) 0 to ∞
(d) -∞ to +∞

MCQ 13.21
The range of test statistic-t is:
(a) 0 to ∞
(b) 0 to 1
(c) -∞ to +∞
(d) -1 to +1

MCQ 13.22
If Ho is true and we reject it, this is called:
(a) Type-I error
(b) Type-II error
(c) Standard error
(d) Sampling error

MCQ 13.23
The probability associated with committing type-I error is:
(a)
(b)
(c) 1
(d) 1

MCQ 13.24
A failing student is passed by an examiner, it is an example of:
(a) Type-I error
(b) Type-II error (c) Unbiased decision

(d) Difficult to tell

MCQ 13.25
A passing student is failed by an examiner, it is an example of:
(a) Type-I error (b) Type-II error
(c) Best decision
(d) All of the above
MCQ 13.26
1 is also called:
(a) Confidence coefficient
(b) Power of the test
(c) Size of the test
(d) Level of significance

MCQ 13.27
1 is the probability associated with:
(a) Type-I error
(b) Type-II error
(c) Level of confidence
(d) Level of significance

MCQ 13.28
Area of the rejection region depends on:
(a) Size of
(b) Size of
(c) Test-statistic

(d) Number of values

MCQ 13.29
Size of critical region is known as:
(a)
(b) 1 -
(c) Critical value
(d) Size of the test

MCQ 13.30
A null hypothesis is rejected if the value of a test statistic lies in the:
(a) Rejection region
(b) Acceptance region (c) Both (a) and (b)

(d) Neither (a) nor (b)

MCQ 13.31
The test statistic is equal to:

MCQ 13.32
Level of significance is also called:
(a) Power of the test
(b) Size of the test
(c) Level of confidence
(d) Confidence coefficient
MCQ 13.33
Level of significance lies between:
(a) -1 and +1
(b) 0 and 1
(c) 0 and n
(d) -∞ to +∞

MCQ 13.34
Critical region is also called:
(a) Acceptance region
(b) Rejection region
(c) Confidence region
(d) Statistical region

MCQ 13.35
The probability of rejecting Ho when it is false is called:
(a) Power of the test
(b) Size of the test
(c) Level of confidence
(d) Confidence coefficient
MCQ 13.36
Power of a test is related to:
(a) Type-I error
(b) Type-II error
(c) Both (a) and (b)
(d) Neither (a) nor (b)

MCQ 13.37
In testing hypothesis α + β is always equal to:
(a) One
(b) Zero
(c) Two
(d) Difficult to tell
MCQ 13.38
The significance level is the risk of:
(a) Rejecting Ho when Ho is correct
(b) Rejecting Ho when H1 is correct
(c) Rejecting H1 when H1 is correct
(d) Accepting Ho when Ho is correct.

MCQ 13.39
An example in a two-sided alternative hypothesis is:
(a) H1: < 0
(b) H1: > 0
(c) H1: 0

(d) H1: 0

MCQ 13.40
If the magnitude of calculated value of t is less than the tabulated value of t and H1 is two-sided, we
should:
(a) Reject Ho
(b) Accept H1
(c) Not reject Ho
(d) Difficult to tell
MCQ 13.41
Accepting a null hypothesis Ho:
(a) Proves that Ho is true
(b) Proves that Ho is false
(c) Implies that Ho is likely to be true
(d) Proves that 0

MCQ 13.42
The chance of rejecting a true hypothesis decreases when sample size is:
(a) Decreased
(b) Increased
(c) Constant

(d) Both (a) and (b)

MCQ 13.43
The equality condition always appears in:
(a) Null hypothesis
(b) Simple hypothesis (c) Alternative hypothesis

(d) Both (a) and (b)

MCQ 13.44
Which hypothesis is always in an inequality form?
(a) Null hypothesis
(b) Alternative hypothesis
(c) Simple hypothesis
(d) Composite hypothesis

MCQ 13.45
Which of the following is composite hypothesis?
(a) o
(b) o
(c) = o

(d) o

MCQ 13.46
P (Type I error) is equal to:
(a) 1
(b) 1

(c)

MCQ 13.47
P (Type II error) is equal to:
(a)
(b)

(c) 1

(d) 1

MCQ 13.48
The power of the test is equal to:
(a)
(b)

(c) 1

(d) 1

(d)

MCQ 13.49
The degree of confidence is equal to:
(a) α
(b) β
(c) 1 - α
(d) 1 - β

MCQ 13.50
α/2 is called:
(a) One tailed significance level
(b) Two tailed significance level
(c) Left tailed significance level
(d) Right tailed significance level

MCQ 13.51
Student's t-test is applicable only when:
(a) n30 and is known (b) n>30 and is unknown (c) n=30 and is known (d) All of the above
MCQ 13.52
Student's t-statistic is applicable in case of:
(a) Equal number of samples (b) Unequal number of samples (c) Small samples (d) All of the above
MCQ 13.53
Paired t-test is applicable when the observations in the two samples are:
(a) Equal in number (b) Paired
(c) Correlation
(d) All of the above
MCQ 13.54
The degree of freedom for paired t-test based on n pairs of observations is:
(a) 2n - 1
(b) n - 2
(c) 2(n - 1) (d) n - 1
MCQ 13.55
The test-statistic

has d.f = ________:
(a) n
(b) n - 1
(c) n - 2
(d) n1 + n2 - 2

MCQ 13.56
In an unpaired samples t-test with sample sizes n1= 11 and n2= 11, the value of tabulated t should be
obtained for:
(a) 10 degrees of freedom
(b) 21 degrees of freedom
(c) 22 degrees of freedom
(d) 20 degrees of freedom
MCQ 13.57
In analyzing the results of an experiment involving seven paired samples, tabulated t should be
obtained for:
(a) 13 degrees of freedom
(b) 6 degrees of freedom
(c) 12 degrees of freedom
(d) 14 degrees of freedom
MCQ 13.58
The mean difference between 16 paired observations is 25 and the standard deviation of differences is
10. The value of statistic-t is:
(a) 4
(b) 10
(c) 16
(d) 25
MCQ 13.59
Statistic-t is defined as deviation of sample mean from population mean expressed in terms of:
(a) Standard deviation
(b) Standard error
(c) Coefficient of standard deviation
(d) Coefficient of variation

MCQ 13.60
Student's t-distribution has (n-1) d.f. when all the n observations in the sample are:
(a) Dependent
(b) Independent (c) Maximum
(d) Minimum
MCQ 13.61
The number of independent values in a set of values is called:
(a) Test-statistic
(b) Degree of freedom
(c) Level of significance (d) Level of confidence
MCQ 13.62
The purpose of statistical inference is:
(a) To collect sample data and use them to formulate hypotheses about a population
(b) To draw conclusion about populations and then collect sample data to support the conclusions
(c) To draw conclusions about populations from sample data
(d) To draw conclusions about the known value of population parameter
MCQ 13.63
Suppose that the null hypothesis is true and it is rejected; this is known as:
(a) A type-I error, and its probability is β
(b) A type-I error, and its probability is α
(c) A type-II error, and its probability is α
(d) A type-II error, and its probability is β
MCQ 13.64
An advertising agency wants to test the hypothesis that the proportion of adults in Pakistan who read a Sunday
Magazine is 25 percent. The null hypothesis is that the proportion reading the Sunday Magazine is:
(a) Different from 25%
(b) Equal to 25%
(c) Less than 25 %
(d) More than 25 %
MCQ 13.65
If the mean of a particular population is o,

is distributed:

(a) As a standard normal variable, if the population is non-normal


(b) As a standard normal variable, if the sample is large
(c) As a standard normal variable, if the population is normal
(d) As the t-distribution with v = n - 1 degrees of freedom
MCQ 13.66
If 1 and 2 are means of two populations,

is distributed:

(a) As a standard normal variable, if both samples are independent and less than 30
(b) As a standard normal variable, if both populations are normal
(c) As both (a) and (b) state
(d) As the t-distribution with n1 + n2 - 2 degrees of freedom
MCQ 13.67
If the population proportion equals po, then

is distributed:

(a) As a standard normal variable, if n > 30
(b) As a Poisson variable
(c) As the t-distribution with v = n - 1 degrees of freedom
(d) As a distribution with v degrees of freedom

MCQ 13.68
When σ is known, the hypothesis about population mean is tested by:
(a) t-test
(b) Z-test
(c) χ²-test
(d) F-test
MCQ 13.69
Given μo = 130, x̄ = 150, σ = 25 and n = 4; what test statistic is appropriate?
(a) t
(b) Z
(c) χ²
(d) F

MCQ 13.70
Given Ho: μ = μo, H1: μ ≠ μo, α = 0.05 and we reject Ho; the absolute value of the Z-statistic must have equalled
or been beyond what value?
(a) 1.96
(b) 1.65
(c) 2.58
(d) 2.33
MCQ 13.71
If p1 and p2 are not identical, then standard error of the difference of proportions (p1 - p2) is:

MCQ 13.72
Under the hypothesis Ho: p1 = p2, the formula for the standard error of the difference between
proportions (p1 - p2) is:
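
Several of the questions above turn on critical Z values and the paired t-statistic, and both are easy to compute. The sketch below uses only the Python standard library (`statistics.NormalDist`, available from Python 3.8); the helper names `critical_z` and `paired_t` are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical value of Z: alpha is split across both tails for a two-tailed test."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def paired_t(mean_diff, sd_diff, n_pairs):
    """Paired t: mean difference divided by its standard error; d.f. = n_pairs - 1."""
    return mean_diff / (sd_diff / math.sqrt(n_pairs))

print(round(critical_z(0.05), 2))                    # 1.96  (two-tailed, alpha = 0.05)
print(round(critical_z(0.05, two_tailed=False), 3))  # 1.645 (one-tailed, alpha = 0.05)
print(round(critical_z(0.01), 2))                    # 2.58  (two-tailed, alpha = 0.01)

# Data as in MCQ 13.58: 16 pairs, mean difference 25, standard deviation 10.
print(paired_t(25, 10, 16), "on", 16 - 1, "degrees of freedom")   # 10.0 on 15 ...
```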

CORRELATION & REGRESSION

MULTIPLE CHOICE QUESTIONS

In the following multiple-choice questions, select the best answer.


1.

The correlation coefficient is used to determine:


a. A specific value of the y-variable given a specific value of the x-variable
b. A specific value of the x-variable given a specific value of the y-variable
c. The strength of the relationship between the x and y variables
d. None of these

2.

If there is a very strong correlation between two variables then the correlation coefficient must be
a. any value larger than 1
b. much smaller than 0, if the correlation is negative
c. much larger than 0, regardless of whether the correlation is negative or positive
d. None of these alternatives is correct.

3.

In regression, the equation that describes how the response variable (y) is related to the
explanatory variable (x) is:
a. the correlation model
b. the regression model
c. used to compute the correlation coefficient
d. None of these alternatives is correct.

4.

The relationship between number of beers consumed (x) and blood alcohol content (y) was studied
in 16 male college students by using least squares regression. The following regression equation
was obtained from this study:
ŷ = -0.0127 + 0.0180x
The above equation implies that:
a. each beer consumed increases blood alcohol by 1.27%
b. on average it takes 1.8 beers to increase blood alcohol content by 1%
c. each beer consumed increases blood alcohol by an average of amount of 1.8%
d. each beer consumed increases blood alcohol by exactly 0.018

5.

SSE can never be


a. larger than SST
b. smaller than SST
c. equal to 1
d. equal to zero

6.

Regression modeling is a statistical framework for developing a mathematical equation that


describes how
a. one explanatory and one or more response variables are related
b. several explanatory and several response variables are related
c. one response and one or more explanatory variables are related
d. All of these are correct.

7.

In regression analysis, the variable that is being predicted is the


a. response, or dependent, variable
b. independent variable
c. intervening variable
d. is usually x

8.

Regression analysis was applied to return rates of sparrowhawk colonies. Regression analysis was
used to study the relationship between return rate (x: % of birds that return to the colony in a given
year) and immigration rate (y: % of new adults that join the colony per year). The following
regression equation was obtained.
ŷ = 31.9 - 0.34x
Based on the above estimated regression equation, if the return rate were to decrease by 10% the
rate of immigration to the colony would:
a. increase by 34%
b. increase by 3.4%
c. decrease by 0.34%
d. decrease by 3.4%

9.

In least squares regression, which of the following is not a required assumption about the error
term ε?
a. The expected value of the error term is one.
b. The variance of the error term is the same for all values of x.
c. The values of the error term are independent.
d. The error term is normally distributed.

10.

Larger values of r² (R²) imply that the observations are more closely grouped about the
a. average value of the independent variables
b. average value of the dependent variable
c. least squares line
d. origin

11.

In a regression analysis if r² = 1, then


a. SSE must also be equal to one
b. SSE must be equal to zero
c. SSE can be any positive value
d. SSE must be negative

12.

The coefficient of correlation


a. is the square of the coefficient of determination
b. is the square root of the coefficient of determination
c. is the same as r-square
d. can never be negative

13.

In regression analysis, the variable that is used to explain the change in the outcome of an
experiment, or some natural process, is called
a. the x-variable
b. the independent variable
c. the predictor variable
d. the explanatory variable
e. all of the above (a-d) are correct
f. none are correct

14.

In the case of an algebraic model for a straight line, if a value for the x variable is specified, then
a. the exact value of the response variable can be computed
b. the computed response to the independent value will always give a minimal residual
c. the computed value of y will always be the best estimate of the mean response
d. none of these alternatives is correct.

15.

A regression analysis between sales (in $1000) and price (in dollars) resulted in the following
equation:
ŷ = 50,000 - 8X
The above equation implies that an
a. increase of $1 in price is associated with a decrease of $8 in sales
b. increase of $8 in price is associated with an increase of $8,000 in sales
c. increase of $1 in price is associated with a decrease of $42,000 in sales
d. increase of $1 in price is associated with a decrease of $8000 in sales

16.

In a regression and correlation analysis if r² = 1, then


a. SSE = SST
b. SSE = 1
c. SSR = SSE
d. SSR = SST

17.

If the coefficient of determination is a positive value, then the regression equation


a. must have a positive slope
b. must have a negative slope
c. could have either a positive or a negative slope
d. must have a positive y intercept

18.

If two variables, x and y, have a very strong linear relationship, then


a. there is evidence that x causes a change in y
b. there is evidence that y causes a change in x
c. there might not be any causal relationship between x and y
d. None of these alternatives is correct.

19.

If the coefficient of determination is equal to 1, then the correlation coefficient


a. must also be equal to 1
b. can be either -1 or +1
c. can be any value between -1 to +1
d. must be -1

20.

In regression analysis, if the independent variable is measured in kilograms, the dependent


variable
a. must also be in kilograms
b. must be in some unit of weight
c. cannot be in kilograms
d. can be any units

21.

The data are the same as for question 4 above. The relationship between number of beers
consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least
squares regression. The following regression equation was obtained from this study:
ŷ = -0.0127 + 0.0180x
Suppose that the legal limit to drive is a blood alcohol content of 0.08. If Ricky consumed 5 beers
the model would predict that he would be:
a. 0.09 above the legal limit
b. 0.0027 below the legal limit
c. 0.0027 above the legal limit
d. 0.0733 above the legal limit

22.

In a regression analysis if SSE = 200 and SSR = 300, then the coefficient of determination is
a. 0.6667
b. 0.6000
c. 0.4000
d. 1.5000

23.

If the correlation coefficient is 0.8, the percentage of variation in the response variable explained
by the variation in the explanatory variable is
a. 0.80%
b. 80%
c. 0.64%
d. 64%

24.

If the correlation coefficient is a positive value, then the slope of the regression line
a. must also be positive
b. can be either negative or positive
c. can be zero
d. can not be zero

25.

If the coefficient of determination is 0.81, the correlation coefficient


a. is 0.6561
b. could be either + 0.9 or - 0.9
c. must be positive
d. must be negative

26.

A fitted least squares regression line


a. may be used to predict a value of y if the corresponding x value is given
b. is evidence for a cause-effect relationship between x and y
c. can only be computed if a strong linear relationship exists between x and y
d. None of these alternatives is correct.

27.

Regression analysis was applied between $ sales (y) and $ advertising (x) across all the branches
of a major international corporation. The following regression function was obtained.
ŷ = 5000 + 7.25x
If the advertising budgets of two branches of the corporation differ by $30,000, then what will be
the predicted difference in their sales?
a. $217,500
b. $222,500
c. $5000
d. $7.25

28.

Suppose the correlation coefficient between height (as measured in feet) versus weight (as
measured in pounds) is 0.40. What is the correlation coefficient of height measured in inches
versus weight measured in ounces? [12 inches = one foot; 16 ounces = one pound]
a. 0.40
b. 0.30
c. 0.533
d. cannot be determined from information given
e. none of these

29.

Assume the same variables as in question 28 above; height is measured in feet and weight is
measured in pounds. Now, suppose that the units of both variables are converted to metric (meters
and kilograms). The impact on the slope is:
a. the sign of the slope will change
b. the magnitude of the slope will change
c. both a and b are correct
d. neither a nor b is correct

30.

Suppose that you have carried out a regression analysis where the total variance in the response is
133452 and the correlation coefficient was 0.85. The residual sums of squares is:
a. 37032.92
b. 20017.8
c. 113434.2
d. 96419.07
e. 15%
f. 0.15
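
Worked check for Question 30, as a minimal Python sketch. It assumes the "total variance" quoted in the question is the total sum of squares, so the residual sum of squares is the unexplained share (1 - r^2) of that total. Variable names are illustrative.

```python
# Values given in Question 30.
total_ss = 133452.0   # treated here as the total sum of squares (assumption)
r = 0.85              # correlation coefficient

r_squared = r ** 2                       # share of variation explained
residual_ss = (1.0 - r_squared) * total_ss

print(round(r_squared, 4))    # 0.7225
print(round(residual_ss, 2))  # about 37032.93; option a lists 37032.92
```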

31.

This question is related to questions 4 and 21 above. The relationship between number of beers
consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least
squares regression. The following regression equation was obtained from this study:
ŷ = -0.0127 + 0.0180x
Another guy, named Dudley, has the regression equation written on a scrap of paper in his
pocket. Dudley goes out drinking and has 4 beers. He calculates that he is under the legal limit
(0.08) so he decides to drive to another bar. Unfortunately Dudley gets pulled over and
confidently submits to a road-side blood alcohol test. He scores a blood alcohol of 0.085 and gets
himself arrested. Obviously, Dudley skipped the lecture about residual variation. Dudley's
residual is:
a. +0.005
b. -0.005
c. +0.0257
d. -0.0257
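
Worked check for Question 31, as a minimal Python sketch. A residual is the observed value minus the value predicted by the fitted line. The function and variable names are illustrative.

```python
# Fitted line from Question 31: predicted BAC = -0.0127 + 0.0180 * beers
def predicted_bac(beers: float) -> float:
    """Predicted blood alcohol content for a given number of beers."""
    return -0.0127 + 0.0180 * beers


beers = 4
observed_bac = 0.085

prediction = predicted_bac(beers)        # what Dudley calculated
residual = observed_bac - prediction     # observed minus predicted

print(round(prediction, 4))  # 0.0593
print(round(residual, 4))    # 0.0257
```

The prediction is below the 0.08 limit, but the positive residual shows that Dudley's actual reading sat above the line.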

32.

You have carried out a regression analysis; but, after thinking about the relationship between
variables, you have decided you must swap the explanatory and the response variables. After
refitting the regression model to the data you expect that:
a. the value of the correlation coefficient will change
b. the value of SSE will change
c. the value of the coefficient of determination will change
d. the sign of the slope will change
e. nothing changes

33.

Suppose you use regression to predict the height of a woman's current boyfriend by using her own
height as the explanatory variable. Height was measured in feet from a sample of 100 women
undergraduates, and their boyfriends, at Dalhousie University. Now, suppose that the height of
both the women and the men are converted to centimeters. The impact of this conversion on the
slope is:
a. the sign of the slope will change
b. the magnitude of the slope will change
c. both a and b are correct
d. neither a nor b are correct

34.

A residual plot:
a. displays residuals of the explanatory variable versus residuals of the response variable.
b. displays residuals of the explanatory variable versus the response variable.
c. displays explanatory variable versus residuals of the response variable.
d. displays the explanatory variable versus the response variable.
e. displays the explanatory variable on the x axis versus the response variable on the y axis.

35.

When the error terms have a constant variance, a plot of the residuals versus the independent
variable x has a pattern that
a. fans out
b. funnels in
c. fans out, but then funnels in
d. forms a horizontal band pattern
e. forms a linear pattern that can be positive or negative

36.

You studied the impact of the dose of a new drug treatment for high blood pressure. You think
that the drug might be more effective in people with very high blood pressure. Because you
expect a bigger change in those patients who start the treatment with high blood pressure, you use
regression to analyze the relationship between the initial blood pressure of a patient (x) and the
change in blood pressure after treatment with the new drug (y). If you find a very strong positive
association between these variables, then:
a. there is evidence that the higher the patient's initial blood pressure, the bigger the impact
of the new drug.
b. there is evidence that the higher the patient's initial blood pressure, the smaller the impact
of the new drug.
c. there is evidence for an association of some kind between the patient's initial blood
pressure and the impact of the new drug on the patient's blood pressure
d. none of these are correct, this is a case of regression fallacy

Question 37:
A variety of summary statistics were collected for a small sample (10) of bivariate data, where the
dependent variable was y and an independent variable was x.
ΣX = 90
ΣY = 170
n = 10
Σ(Y - Ȳ)(X - X̄) = 466
Σ(X - X̄)² = 234
Σ(Y - Ȳ)² = 1434
SSE = 505.98
37.1

Use the formula to the right to compute the sample correlation coefficient:
a. 0.8045
b. -0.8045
c. 0
d. 1

37.2

The least squares estimate of b1 equals


a. 0.923
b. 1.991
c. -1.991
d. -0.923

37.3

The least squares estimate of b0 equals


a. 0.923
b. 1.991
c. -1.991
d. -0.923

37.4

The sum of squares due to regression (SSR) is


a. 1434
b. 505.98
c. 50.598
d. 928.02

37.5

The coefficient of determination equals


a. 0.6471
b. -0.6471
c. 0
d. 1

37.6

The point estimate of y when x = 0.55 is


a. 0.17205
b. 2.018
c. 1.0905
d. -2.018
e. -0.17205
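
Worked check for Questions 37.1 to 37.6, as a minimal Python sketch. It assumes the summary values 90 and 170 are the sums of X and Y (so the means are 9 and 17), and that 466, 234 and 1434 are the sums of cross-products and squared deviations. Variable names are illustrative.

```python
from math import sqrt

# Summary statistics from Question 37 (interpretation as sums is an assumption).
n = 10
sum_x, sum_y = 90.0, 170.0
sxy = 466.0    # sum of (Y - Ybar)(X - Xbar)
sxx = 234.0    # sum of (X - Xbar)^2
syy = 1434.0   # sum of (Y - Ybar)^2, i.e. the total sum of squares
sse = 505.98   # error (residual) sum of squares

r = sxy / sqrt(sxx * syy)                  # 37.1 correlation coefficient
b1 = sxy / sxx                             # 37.2 slope
b0 = sum_y / n - b1 * (sum_x / n)          # 37.3 intercept
ssr = syy - sse                            # 37.4 regression sum of squares
r_squared = ssr / syy                      # 37.5 coefficient of determination
y_hat = b0 + b1 * 0.55                     # 37.6 point estimate at x = 0.55

for name, value in [("r", r), ("b1", b1), ("b0", b0),
                    ("SSR", ssr), ("R^2", r_squared), ("y_hat", y_hat)]:
    print(f"{name} = {value:.4f}")
```

The unrounded point estimate comes out near 0.1722; the listed option 0.17205 follows from using the rounded coefficients -0.923 and 1.991.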

MULTIPLE CHOICE ANSWERS

1. c
2. b
3. b
4. c
5. a
6. c
7. a
8. b
9. a
10. c
11. b
12. b
13. e
14. a
15. d
16. d
17. c
18. c
19. b
20. d
21. b
22. b
23. d
24. a
25. b
26. a
27. a
28. a
29. b
30. a
31. c
32. b
33. d
34. c
35. d
36. d
37.1 a
37.2 b
37.3 d
37.4 d
37.5 a
37.6 a

Chapter 15
Multiple Choice Questions
(The answers are provided after the last question.)
1. What is the median of the following set of scores: 18, 6, 12, 10, 14?
a. 10
b. 14
c. 18
d. 12
2. Approximately what percentage of scores fall within one standard deviation of the mean in a
normal distribution?
a. 34%
b. 95%
c. 99%
d. 68%
3. The denominator (bottom) of the z-score formula is
a. The standard deviation
b. The difference between a score and the mean
c. The range
d. The mean
4. Let's suppose we are predicting score on a training posttest from number of years
of education and the score on an aptitude test given before training. Here is the regression
equation
Y = 25 + .5X1 +10X2,
where X1 = years of education and X2 = aptitude test score.
What is the predicted score for someone with 10 years of education and an aptitude test score of 5?
a. 25
b. 50
c. 35
d. 80
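
Worked check for Question 4, as a minimal Python sketch: substitute the two predictor values into the regression equation. The function name is illustrative.

```python
def predicted_posttest(years_education: float, aptitude_score: float) -> float:
    """Y = 25 + .5*X1 + 10*X2 from Question 4."""
    return 25 + 0.5 * years_education + 10 * aptitude_score


score = predicted_posttest(years_education=10, aptitude_score=5)
print(score)  # 80.0
```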
5. The standard deviation is:
a. The square root of the variance
b. A measure of variability
c. An approximate indicator of how numbers vary from the mean
d. All of the above
6. Hypothesis testing and estimation are both types of descriptive statistics.
a. True
b. False

7. A set of data organized in a participants(rows)-by-variables(columns) format is known as a
data set.
a. True
b. False
8. A graph that uses vertical bars to represent data is called a ____.
a. Line graph
b. Bar graph
c. Scatterplot
d. Vertical graph
9. The goal of ___________ is to focus on summarizing and explaining a specific set of data.
a. Inferential statistics
b. Descriptive statistics
c. None of the above
d. All of the above
10. The most frequently occurring number in a set of values is called the ____.
a. Mean
b. Median
c. Mode
d. Range
11. As a general rule, the _______ is the best measure of central tendency because it is more
precise.
a. Mean
b. Median
c. Mode
d. Range
12. Focusing on describing or explaining data versus going beyond immediate data and making
inferences is the difference between _______.
a. Central tendency and common tendency
b. Mutually exclusive and mutually exhaustive properties
c. Descriptive and inferential
d. Positive skew and negative skew
13. Why are variance and standard deviation the most popular measures of variability?
a. They are the most stable and are foundations for more advanced statistical analysis
b. They are the most simple to calculate with large data sets
c. They provide nominally scaled data
d. None of the above
14. ____________ is the set of procedures used to explain or predict the values of a dependent
variable based on the values of one or more independent variables.
a. Regression analysis

b. Regression coefficient
c. Regression equation
d. Regression line
15. The ______ is the value you calculate when you want the arithmetic average.
a. Mean
b. Median
c. Mode
d. All of the above
16. ___________ are used when you want to visually examine the relationship between two
quantitative variables.
a. Bar graphs
b. Pie graphs
c. Line graphs
d. Scatterplots
17. The _______ is often the preferred measure of central tendency if the data are severely
skewed.
a. Mean
b. Median
c. Mode
d. Range
18. Which of the following is the formula for range?
a. H + L
b. L x H
c. L - H
d. H - L
19. Which is a raw score that has been transformed into standard deviation units?
a. z score
b. SDU score
c. t score
d. e score
20. Which of the following is NOT a measure of variability?
a. Median
b. Variance
c. Standard deviation
d. Range
21. Which of the following is NOT a common measure of central tendency?
a. Mode
b. Range
c. Median

d. Mean
22. What is the median of this set of numbers: 4, 6, 7, 9, 2000000?
a. 7.5
b. 6
c. 7
d. 4
23. What is the mean of this set of numbers: 4, 6, 7, 9, 2000000?
a. 7.5
b. 400,005.2
c. 7
d. 4
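
Worked check for Questions 22 and 23, as a minimal Python sketch using the standard-library `statistics` module. It shows how a single extreme value pulls the mean far from the bulk of the data while the median stays put.

```python
import statistics

scores = [4, 6, 7, 9, 2000000]

median_value = statistics.median(scores)  # middle value of the sorted list
mean_value = statistics.mean(scores)      # arithmetic average

print(median_value)  # 7
print(mean_value)    # 400005.2
```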
24. Which of the following is interpreted as the percentage of scores in a reference group that
falls below a particular raw score?
a. Standard scores
b. Percentile rank
c. Reference group
d. None of the above
25. The median is ______.
a. The middle point
b. The highest number
c. The average
d. Affected by extreme scores
26. Which measure of central tendency takes into account the magnitude of scores?
a. Mean
b. Median
c. Mode
d. Range
27. If a test was generally very easy, except for a few students who had very
low scores, then the distribution of scores would be _____.
a. Positively skewed
b. Negatively skewed
c. Not skewed at all
d. Normal
28. How many dependent variables are used in multiple regression?
a. One
b. One or more
c. Two or more
d. Two

29. Which of the following represents the fiftieth percentile, or the middle point in a set of
numbers arranged in order of magnitude?
a. Mode
b. Median
c. Mean
d. Variance
30. If a distribution is skewed to the left, then it is __________.
a. Negatively skewed
b. Positively skewed
c. Symmetrically skewed
d. Symmetrical
31. In a grouped frequency distribution, the intervals should be what?
a. Mutually exclusive
b. Exhaustive
c. Both A and B
d. Neither A nor B
32. When a set of numbers is heterogeneous, you can place more trust in the measure of central
tendency as representing the typical person or unit.
a. True
b. False
33. Non-overlapping categories or intervals are known as ______.
a. Inclusive
b. Exhaustive
c. Mutually exclusive
d. Mutually exclusive and exhaustive
34. To interpret the relationship between two categorical variables, a contingency table should be
constructed with either column or row percentages, and ----.
a. If the percentages are calculated down the columns, then comparisons should be made across
the rows
b. If the percentages are calculated across the rows, comparisons should be made down the
columns
c. Both a and b are correct
d. Neither a nor b is correct
Answers:
1. d
2. d
3. a
4. d
5. d

6. b
7. a
8. b
9. b
10. c
11. a
12. c
13. a
14. a
15. a
16. d
17. b
18. d
19. a
20. a
21. b
22. c
23. b
24. b
25. a
26. a
27. b
28. a
29. b
30. a
31. c
32. b
33. c
34. c

STATISTICS 8
CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS
Correct answers are in bold italics.
This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of
coal miners to the lung capacity of farm workers. The researcher studied 200 workers of each
type. Other factors that might affect lung capacity are smoking habits and exercise habits. The
smoking habits of the two worker types are similar, but the coal miners generally exercise less
than the farm workers.
1. Which of the following is the explanatory variable in this study?
a. Exercise
b. Lung capacity
c. Smoking or not
d. Occupation
2. Which of the following is a confounding variable in this study?
a. Exercise
b. Lung capacity
c. Smoking or not
d. Occupation
This scenario applies to Questions 3 to 5: A randomized experiment was done by randomly
assigning each participant either to walk for half an hour three times a week or to sit quietly
reading a book for half an hour three times a week. At the end of a year the change in
participants' blood pressure over the year was measured, and the change was compared for the
two groups.
3. This is a randomized experiment rather than an observational study because:
a. Blood pressure was measured at the beginning and end of the study.
b. The two groups were compared at the end of the study.
c. The participants were randomly assigned to either walk or read, rather than choosing
their own activity.
d. A random sample of participants was used.
4. The two treatments in this study were:
a. Walking for half an hour three times a week and reading a book for half an hour three
times a week.
b. Having blood pressure measured at the beginning of the study and having blood pressure
measured at the end of the study.
c. Walking or reading a book for half an hour three times a week and having blood pressure
measured.
d. Walking or reading a book for half an hour three times a week and doing nothing.

Scenario for Questions 3 to 5, continued


5. If a statistically significant difference in blood pressure change at the end of a year for the
two activities was found, then:
a. It cannot be concluded that the difference in activity caused a difference in the change in
blood pressure because in the course of a year there are lots of possible confounding
variables.
b. Whether or not the difference was caused by the difference in activity depends on what
else the participants did during the year.
c. It cannot be concluded that the difference in activity caused a difference in the change in
blood pressure because it might be the opposite, that people with high blood pressure
were more likely to read a book than to walk.
d. It can be concluded that the difference in activity caused a difference in the change in
blood pressure because of the way the study was done.
6. What is one of the distinctions between a population parameter and a sample statistic?
a. A population parameter is only based on conceptual measurements, but a sample statistic
is based on a combination of real and conceptual measurements.
b. A sample statistic changes each time you try to measure it, but a population parameter
remains fixed.
c. A population parameter changes each time you try to measure it, but a sample statistic
remains fixed across samples.
d. The true value of a sample statistic can never be known but the true value of a population
parameter can be known.
7. A magazine printed a survey in its monthly issue and asked readers to fill it out and send it
in. Over 1000 readers did so. This type of sample is called
a. a cluster sample.
b. a self-selected sample.
c. a stratified sample.
d. a simple random sample.
8. Which of the following would be most likely to produce selection bias in a survey?
a. Using questions with biased wording.
b. Only receiving responses from half of the people in the sample.
c. Conducting interviews by telephone instead of in person.
d. Using a random sample of students at a university to estimate the proportion of people
who think the legal drinking age should be lowered.
9. Which one of the following variables is not categorical?
a. Age of a person.
b. Gender of a person: male or female.
c. Choice on a test item: true or false.
d. Marital status of a person (single, married, divorced, other)

10. A polling agency conducted a survey of 100 doctors on the question "Are you willing to treat
women patients with the recently approved pill RU-486?" The conservative margin of error
associated with the 95% confidence interval for the percent who say 'yes' is
a. 50%
b. 10%
c. 5%
d. 2%
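
Worked check for Question 10, as a minimal Python sketch. It assumes the "conservative" margin of error means the common rule of thumb 1/sqrt(n) for a 95% interval on a proportion. The function name is illustrative.

```python
from math import sqrt


def conservative_margin_of_error(n: int) -> float:
    """Rule-of-thumb 95% margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1.0 / sqrt(n)


margin = conservative_margin_of_error(100)
print(f"{margin:.0%}")  # 10%
```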
11. Which one of these statistics is unaffected by outliers?
a. Mean
b. Interquartile range
c. Standard deviation
d. Range
12. A list of 5 pulse rates is: 70, 64, 80, 74, 92. What is the median for this list?
a. 74
b. 76
c. 77
d. 80
13. Which of the following would indicate that a dataset is not bell-shaped?
a. The range is equal to 5 standard deviations.
b. The range is larger than the interquartile range.
c. The mean is much smaller than the median.
d. There are no outliers.
14. A scatter plot of number of teachers and number of people with college degrees for cities in
California reveals a positive association. The most likely explanation for this positive
association is:
a. Teachers encourage people to get college degrees, so an increase in the number of
teachers is causing an increase in the number of people with college degrees.
b. Larger cities tend to have both more teachers and more people with college degrees, so
the association is explained by a third variable, the size of the city.
c. Teaching is a common profession for people with college degrees, so an increase in the
number of people with college degrees causes an increase in the number of teachers.
d. Cities with higher incomes tend to have more teachers and more people going to college,
so income is a confounding variable, making causation between number of teachers and
number of people with college degrees difficult to prove.
15. The value of a correlation is reported by a researcher to be r = 0.5. Which of the following
statements is correct?
a. The x-variable explains 25% of the variability in the y-variable.
b. The x-variable explains 25% of the variability in the y-variable.
c. The x-variable explains 50% of the variability in the y-variable.
d. The x-variable explains 50% of the variability in the y-variable.
16. What is the effect of an outlier on the value of a correlation coefficient?
a. An outlier will always decrease a correlation coefficient.
b. An outlier will always increase a correlation coefficient.
c. An outlier might either decrease or increase a correlation coefficient, depending on
where it is in relation to the other points.
d. An outlier will have no effect on a correlation coefficient.

17. One use of a regression line is


a. to determine if any x-values are outliers.
b. to determine if any y-values are outliers.
c. to determine if a change in x causes a change in y.
d. to estimate the change in y for a one-unit change in x.
18. Past data has shown that the regression line relating the final exam score and the midterm
exam score for students who take statistics from a certain professor is:
final exam = 50 + 0.5 midterm
One interpretation of the slope is
a. a student who scored 0 on the midterm would be predicted to score 50 on the final exam.
b. a student who scored 0 on the final exam would be predicted to score 50 on the midterm
exam.
c. a student who scored 10 points higher than another student on the midterm would be
predicted to score 5 points higher than the other student on the final exam.
d. students only receive half as much credit (.5) for a correct answer on the final exam
compared to a correct answer on the midterm exam.
Questions 19 to 21: A survey asked people how often they exceed speed limits. The data are
then categorized into the following contingency table of counts showing the relationship between
age group and response.
                 Exceed Limit if Possible?
Age          Always    Not Always    Total
Under 30        100           100      200
Over 30          40           160      200
Total           140           260      400
19. Among people with age over 30, what's the "risk" of always exceeding the speed limit?
a. 0.20
b. 0.40
c. 0.33
d. 0.50
20. Among people with age under 30 what are the odds that they always exceed the speed limit?
a. 1 to 2
b. 2 to 1
c. 1 to 1
d. 50%
21. What is the relative risk of always exceeding the speed limit for people under 30 compared to
people over 30?
a. 2.5
b. 0.4
c. 0.5
d. 30%
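
Worked check for Questions 19 to 21, as a minimal Python sketch. Risk is the count of "Always" divided by the group total, odds compare "Always" to "Not Always", and relative risk is the ratio of the two risks. The dictionary layout and function names are illustrative.

```python
# Counts from the contingency table for Questions 19 to 21.
counts = {
    "Under 30": {"always": 100, "not_always": 100},
    "Over 30": {"always": 40, "not_always": 160},
}


def risk(group: str) -> float:
    """Proportion of the group that always exceeds the limit."""
    row = counts[group]
    return row["always"] / (row["always"] + row["not_always"])


def odds(group: str) -> float:
    """Count of 'always' relative to count of 'not always'."""
    row = counts[group]
    return row["always"] / row["not_always"]


risk_over_30 = risk("Over 30")                        # Question 19
odds_under_30 = odds("Under 30")                      # Question 20
relative_risk = risk("Under 30") / risk("Over 30")    # Question 21

print(round(risk_over_30, 2))   # 0.2
print(round(odds_under_30, 2))  # 1.0, i.e. odds of 1 to 1
print(round(relative_risk, 2))  # 2.5
```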

Questions 22 and 23: A newspaper article reported that "Children who routinely compete in
vigorous after-school sports on smoggy days are three times more likely to get asthma than their
non-athletic peers." (Sacramento Bee, Feb 1, 2002, p. A1)
22. Of the following, which is the most important additional information that would be useful
before making a decision about participation in school sports?
a. Where was the study conducted?
b. How many students in the study participated in after-school sports?
c. What is the baseline risk for getting asthma?
d. Who funded the study?
23. The newspaper also reported that "The number of children in the study who contracted
asthma was relatively small, 265 of 3,535." Which of the following is represented by
265/3535 = .075?
a. The overall risk of getting asthma for the children in this study.
b. The baseline risk of getting asthma for the non-athletic peers in the study.
c. The risk of getting asthma for children in the study who participated in sports.
d. The relative risk of getting asthma for children who routinely participate in vigorous
after-school sports on smoggy days and their non-athletic peers.
Questions 24 to 26: The following histogram shows the distribution of the difference between
the actual and ideal weights for 119 female students. Notice that percent is given on the
vertical axis. Ideal weights are responses to the question "What is your ideal weight?" The
difference = actual - ideal. (Source: idealwtwomen dataset on CD.)

24. What is the approximate shape of the distribution?


a. Nearly symmetric.
b. Skewed to the left.
c. Skewed to the right.
d. Bimodal (has more than one peak).
25. The median of the distribution is approximately
a. 10 pounds.
b. 10 pounds.
c. 30 pounds.
d. 50 pounds.

Scenario for Questions 24 to 26, continued


26. Most of the women in this sample felt that their actual weight was
a. about the same as their ideal weight.
b. less than their ideal weight.
c. greater than their ideal weight.
d. no more than 2 pounds different from their ideal weight.
27. A chi-square test of the relationship between personal perception of emotional health and
marital status led to rejection of the null hypothesis, indicating that there is a relationship
between these two variables. One conclusion that can be drawn is:
a. Marriage leads to better emotional health.
b. Better emotional health leads to marriage.
c. The more emotionally healthy someone is, the more likely they are to be married.
d. There are likely to be confounding variables related to both emotional health and
marital status.
28. A chi-square test involves a set of counts called expected counts. What are the expected
counts?
a. Hypothetical counts that would occur if the alternative hypothesis were true.
b. Hypothetical counts that would occur if the null hypothesis were true.
c. The actual counts that did occur in the observed data.
d. The long-run counts that would be expected if the observed counts are representative.
29. Pick the choice that best completes the following sentence. If a relationship between two
variables is called statistically significant, it means the investigators think the variables are
a. related in the population represented by the sample.
b. not related in the population represented by the sample.
c. related in the sample due to chance alone.
d. very important.
30. Simpson's Paradox occurs when
a. No baseline risk is given, so it is not known whether or not a high relative risk has
practical importance.
b. A confounding variable rather than the explanatory variable is responsible for a change in
the response variable.
c. The direction of the relationship between two variables changes when the categories of
a confounding variable are taken into account.
d. The results of a test are statistically significant but are really due to chance.

Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from
Midterms 1 and 2 are also representative of questions that may appear on the final exam.
1. A randomly selected sample of 1,000 college students was asked whether they had ever used the drug
Ecstasy. Sixteen percent (16% or 0.16) of the 1,000 students surveyed said they had. Which one of
the following statements about the number 0.16 is correct?
A. It is a sample proportion.
B. It is a population proportion.
C. It is a margin of error.
D. It is a randomly chosen number.
2. In a random sample of 1000 students, p̂ = 0.80 (or 80%) were in favor of longer hours at the school
library. The standard error of p̂ (the sample proportion) is
A. .013
B. .160
C. .640
D. .800

3. For a random sample of 9 women, the average resting pulse rate is x̄ = 76 beats per minute, and the
sample standard deviation is s = 5. The standard error of the sample mean is
A. 0.557
B. 0.745
C. 1.667
D. 2.778
4. Assume the cholesterol levels in a certain population have mean μ = 200 and standard deviation σ =
24. The cholesterol levels for a random sample of n = 9 individuals are measured and the sample
mean x̄ is determined. What is the z-score for a sample mean x̄ = 180?
A. 3.75
B. 2.50
C. 0.83
D. 2.50
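
Worked check for Questions 2 to 4, as a minimal Python sketch. It uses the usual formulas: sqrt(p(1 - p)/n) for the standard error of a sample proportion, s/sqrt(n) for the standard error of a sample mean, and (sample mean - population mean) / (sigma/sqrt(n)) for the z-score of a sample mean. Function names are illustrative.

```python
from math import sqrt


def se_proportion(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion."""
    return sqrt(p_hat * (1.0 - p_hat) / n)


def se_mean(s: float, n: int) -> float:
    """Standard error of a sample mean."""
    return s / sqrt(n)


def z_for_sample_mean(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """z-score of a sample mean under the sampling distribution of the mean."""
    return (x_bar - mu) / (sigma / sqrt(n))


se_q2 = se_proportion(0.80, 1000)            # Question 2
se_q3 = se_mean(5.0, 9)                      # Question 3
z_q4 = z_for_sample_mean(180, 200, 24, 9)    # Question 4

print(round(se_q2, 3))  # 0.013
print(round(se_q3, 3))  # 1.667
print(round(z_q4, 2))   # -2.5
```

The z-score is negative because the sample mean of 180 lies below the population mean of 200.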
5. In a past General Social Survey, a random sample of men and women answered the question "Are you
a member of any sports clubs?" Based on the sample data, 95% confidence intervals for the
population proportion who would answer yes are .13 to .19 for women and .247 to .33 for men.
Based on these results, you can reasonably conclude that
A. At least 25% of American men and American women belong to sports clubs.
B. At least 16% of American women belong to sports clubs.
C. There is a difference between the proportions of American men and American women who
belong to sports clubs.
D. There is no conclusive evidence of a gender difference in the proportion belonging to sports
clubs.
6. Suppose a 95% confidence interval for the proportion of Americans who exercise regularly is 0.29 to
0.37. Which one of the following statements is FALSE?
A. It is reasonable to say that more than 25% of Americans exercise regularly.
B. It is reasonable to say that more than 40% of Americans exercise regularly.
C. The hypothesis that 33% of Americans exercise regularly cannot be rejected.
D. It is reasonable to say that fewer than 40% of Americans exercise regularly.

7. In hypothesis testing, a Type 2 error occurs when


A. The null hypothesis is not rejected when the null hypothesis is true.
B. The null hypothesis is rejected when the null hypothesis is true.
C. The null hypothesis is not rejected when the alternative hypothesis is true.
D. The null hypothesis is rejected when the alternative hypothesis is true.
8. Null and alternative hypotheses are statements about:
A. population parameters.
B. sample parameters.
C. sample statistics.
D. it depends - sometimes population parameters and sometimes sample statistics.
9. A hypothesis test is done in which the alternative hypothesis is that more than 10% of a population is
left-handed. The p-value for the test is calculated to be 0.25. Which statement is correct?
A. We can conclude that more than 10% of the population is left-handed.
B. We can conclude that more than 25% of the population is left-handed.
C. We can conclude that exactly 25% of the population is left-handed.
D. We cannot conclude that more than 10% of the population is left-handed.
10. Which of the following is NOT true about the standard error of a statistic?
A. The standard error measures, roughly, the average difference between the statistic and the
population parameter.
B. The standard error is the estimated standard deviation of the sampling distribution for the statistic.
C. The standard error can never be a negative number.
D. The standard error increases as the sample size(s) increases.
11. A prospective observational study on the relationship between sleep deprivation and heart disease was
done by Ayas et al. (Arch Intern Med 2003). Women who slept at most 5 hours a night were
compared to women who slept for 8 hours a night (reference group). After adjusting for potential
confounding variables like smoking, a 95% confidence interval for the relative risk of heart disease
was (1.10, 1.92). Based on this confidence interval, a consistent conclusion would be
A. Sleep deprivation is associated with a modestly increased risk of heart disease.
B. Sleep deprivation is associated with a modestly decreased risk of heart disease.
C. There was no evidence of an association between sleep deprivation and heart disease.
D. Lack of sleep causes the risk of heart disease to increase by 10% to 92%.
12. Consider a random sample of 100 females and 100 males. Suppose 15 of the females are left-handed
and 12 of the males are left-handed. What is the estimated difference between population proportions
of females and males who are left-handed (females - males)? Select the choice with the correct
notation and numerical value.
A. p1 - p2 = 3
B. p1 - p2 = 0.03
C. p̂1 - p̂2 = 3
D. p̂1 - p̂2 = 0.03
13. A result is called statistically significant whenever
A. The null hypothesis is true.
B. The alternative hypothesis is true.
C. The p-value is less than or equal to the significance level.
D. The p-value is larger than the significance level.

14. The confidence level for a confidence interval for a mean is


A. the probability the procedure provides an interval that covers the sample mean.
B. the probability of making a Type 1 error if the interval is used to test a null hypothesis about the
population mean.
C. the probability that individuals in the population have values that fall into the interval.
D. the probability the procedure provides an interval that covers the population mean.
For the next two questions: It is known that for right-handed people, the dominant (right) hand tends to
be stronger. For left-handed people who live in a world designed for right-handed people, the same may
not be true. To test this, muscle strength was measured on the right and left hands of a random sample of
15 left-handed men and the difference (left - right) was found. The alternative hypothesis is one-sided
(left hand stronger). The resulting t-statistic was 1.80.

15. This is an example of:


A. A two-sample t-test.
B. A paired t-test.
C. A pooled t-test.
D. An unpooled t-test.
16. Assuming the conditions are met, based on the t-statistic of 1.80 the appropriate conclusion for this
test using α = .05 is: (Table would be provided with exam.)
A. Df = 14, so p-value < .05 and the null hypothesis can be rejected.
B. Df = 14, so p-value > .05 and the null hypothesis cannot be rejected.
C. Df = 28, so p-value < .05 and the null hypothesis can be rejected.
D. Df = 28, so p-value > .05 and the null hypothesis cannot be rejected.
17. A test of H0: μ = 0 versus Ha: μ > 0 is conducted on the same population independently by two
different researchers. They both use the same sample size and the same value of α = 0.05. Which of
the following will be the same for both researchers?
A. The p-value of the test.
B. The power of the test if the true μ = 6.
C. The value of the test statistic.
D. The decision about whether or not to reject the null hypothesis.
18. Which of the following is not a correct way to state a null hypothesis?
A. H0: p̂1 - p̂2 = 0 (Sample statistics do not go into hypotheses)
B. H0: μd = 10
C. H0: μ1 - μ2 = 0
D. H0: p = .5
19. A test to screen for a serious but curable disease is similar to hypothesis testing, with a null hypothesis
of no disease, and an alternative hypothesis of disease. If the null hypothesis is rejected treatment will
be given. Otherwise, it will not. Assuming the treatment does not have serious side effects, in this
scenario it is better to increase the probability of:
A. making a Type 1 error, providing treatment when it is not needed.
B. making a Type 1 error, not providing treatment when it is needed.
C. making a Type 2 error, providing treatment when it is not needed.
D. making a Type 2 error, not providing treatment when it is needed.

20. A random sample of 25 college males was obtained and each was asked to report their actual height
and what they wished as their ideal height. A 95% confidence interval for μd = average difference
between their ideal and actual heights was 0.8" to 2.2". Based on this interval, which one of the null
hypotheses below (versus a two-sided alternative) can be rejected?
A. H0: μd = 0.5
B. H0: μd = 1.0
C. H0: μd = 1.5
D. H0: μd = 2.0
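
Worked check for Question 20, as a minimal Python sketch. It relies on the usual correspondence between a 95% confidence interval and a two-sided test at the 5% level: a null value is rejected when it falls outside the interval. Variable names are illustrative.

```python
# 95% confidence interval for the mean difference (ideal minus actual), in inches.
ci_low, ci_high = 0.8, 2.2

# Null values offered in options A to D.
null_values = {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0}

# A null value outside the interval is rejected by the two-sided test.
rejected = [label for label, value in null_values.items()
            if not (ci_low <= value <= ci_high)]

print(rejected)  # ['A']
```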
21. The average time in years to get an undergraduate degree in computer science was compared for men
and women. Random samples of 100 male computer science majors and 100 female computer science
majors were taken. Choose the appropriate parameter(s) for this situation.
A. One population proportion p.
B. Difference between two population proportions p1 - p2.
C. One population mean μ1
D. Difference between two population means μ1 - μ2
22. If the word "significant" is used to describe a result in a news article reporting on a study,
A. the p-value for the test must have been very large.
B. the effect size must have been very large.
C. the sample size must have been very small.
D. it may be significant in the statistical sense, but not in the everyday sense.
23. A random sample of 5000 students was asked whether they prefer a 10-week quarter system or a
15-week semester system. Of the 5000 students asked, 500 students responded. The results of this
survey ________
A. can be generalized to the entire student body because the sampling was random.
B. can be generalized to the entire student body because the margin of error was 4.5%.
C. should not be generalized to the entire student body because the non-response rate was 90%.
D. should not be generalized to the entire student body because the margin of error was 4.5%.
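The two figures quoted in the options of question 23 can be reproduced from the counts in the question. The sketch below assumes the 4.5% comes from the conservative margin of error 1/sqrt(n) applied to the 500 respondents; the question itself does not say how it was computed.

```python
import math

sampled = 5000    # students asked
responded = 500   # students who answered

# Share of the sample that did not respond
nonresponse_rate = 1 - responded / sampled

# Conservative 95% margin of error, 1/sqrt(n), based on respondents only (assumed formula)
conservative_moe = 1 / math.sqrt(responded)

print(f"non-response rate: {nonresponse_rate:.0%}")
print(f"margin of error:   {conservative_moe:.1%}")
```

The margin of error describes sampling variability only; it says nothing about bias from the students who did not respond.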
24. In a report by ABC News, the headlines read "City Living Increases Men's Death Risk." The headlines
were based on a study of 3,617 adults who lived in the United States and were more than 25 years old.
One researcher said, "Elevated levels of tumor deaths suggest the influence of physical, chemical and
biological exposures in urban areas. Living in cities also involves potentially stressful levels of
noise, sensory stimulation and overload, interpersonal relations and conflict, and vigilance against
hazards ranging from crime to accidents." Is a conclusion that living in an urban environment causes
an increased risk of death justified?
A. Yes, because the study was a randomized study.
B. Yes, because many of the men in the study were under stress.
C. No, because the study was a retrospective study.
D. No, because the study was an observational study.
25. A significance test based on a small sample may not produce a statistically significant result even if
the true value differs substantially from the null value. This type of result is known as
A. the significance level of the test.
B. the power of the study.
C. a Type 1 error.
D. a Type 2 error.

For the next two questions: An observational study found a statistically significant relationship between
regular consumption of tomato products (yes, no) and development of prostate cancer (yes, no), with
lower risk for those consuming tomato products.

26. Which of the following is not a possible explanation for this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. There is a confounding variable that causes lower risk of prostate cancer, such as eating vegetables
in general, that is also related to eating tomato products.
C. A large number of food products were measured to test for a relationship, and tomato products
happened to show a relationship just by chance.
D. A large sample size was used, so even if there were no relationship, one would almost certainly
be detected.
27. Which of the following is a valid conclusion from this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. Based on this study, the relative risk of prostate cancer, for those who do not consume tomato
products regularly compared with those who do, is greater than one.
C. If a new observational study were to be done using the same sample size and measuring the same
variables, it would find the same relationship.
D. Prostate cancer can be prevented by eating the right diet.
28. The best way to determine whether a statistically significant difference in two means is of practical
importance is to
A. find a 95% confidence interval and notice the magnitude of the difference.
B. repeat the study with the same sample size and see if the difference is statistically significant
again.
C. see if the p-value is extremely small.
D. see if the p-value is extremely large.
29. A large company examines the annual salaries for all of the men and women performing a certain job
and finds that the means and standard deviations are $32,120 and $3,240, respectively, for the men
and $34,093 and $3,521, respectively, for the women. The best way to determine if there is a
difference in mean salaries for the population of men and women performing this job in this company
is
A. to compute a 95% confidence interval for the difference.
B. to subtract the two sample means.
C. to test the hypothesis that the population means are the same versus that they are different.
D. to test the hypothesis that the population means are the same versus that the mean for men is
higher.
30. One problem with hypothesis testing is that a real effect may not be detected. This problem is most
likely to occur when
A. the effect is small and the sample size is small.
B. the effect is large and the sample size is small.
C. the effect is small and the sample size is large.
D. the effect is large and the sample size is large.
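Questions 25 and 30 both turn on statistical power, the probability that a test detects a real effect. As a rough illustration, the sketch below computes the power of a one-sided z-test for a mean with known standard deviation. The choice of test, the function name, and the example effect sizes and sample sizes are assumptions made for illustration and do not come from the quiz.

```python
from statistics import NormalDist

def power_one_sided_z(effect_size, n, alpha=0.05):
    """Power of an upper-tailed z-test (hypothetical helper).

    effect_size is the true shift from the null value in standard-deviation
    units; n is the sample size; alpha is the significance level.
    """
    z = NormalDist()                   # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)      # critical value of the test
    return z.cdf(effect_size * n ** 0.5 - z_crit)

# Illustrative combinations of small/large effect and small/large sample
for d in (0.2, 0.8):
    for n in (10, 100):
        print(f"effect={d}, n={n}, power={power_one_sided_z(d, n):.3f}")
```

Power is lowest when both the effect and the sample are small, which is when a Type 2 error is most likely.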
