Answer Z=1
Two cards
Answer : 64
When referring to
Answer : <42
One-tailed two-tailed
(1st option)
Answer : P(A) + P(B) is the joint probability of A and B (I am not sure about
this)
When n dice are rolled, the possible outcomes are:
Answer : 6^n
The number of road accidents is a
Answer : discrete
The grouped data are called
Answer:
3rd option
Answer : 1.645
Given Ho
Given 130
Answer : Z
The range of test
1st Quiz
1. If four coins are tossed, how many elements will the sample space contains. A=2, B= 4. C=16.
2. A bag contains 10 red balls and 7 blue balls. A ball is drawn at random. The probability that the ball
drawn is red is. A= 7. B= 7/17. C= 10/17. D= 3/17.
3. A fair coin is tossed three times. What is the probability that at least one head appears. A=7/8.
B= 6/8. C= 5/8. D= 4/8.
4. If a dice is thrown twice, the number of elements in the sample space. A= 2. B= 4. C=16. D=36.
5. Mean, Median and mode always coincide in the case of ..Distribution. A= Poisson. B=
Binomial. C= Normal. D= Hypergeometric.
6. Two events. A and B are mutually exclusive and each have a nonzero probability. If event A is
known to occur, the probability of the occurrence of event B is. A= one. B= any positive value.
C=0, d= any value between 0 to 1.
7. The probability of getting a head in tossing of a coin is. A= 0.5, B= 1, C= 1.5, d= -0.5.
8. The probability of an event cannot be. A= 1, B= 0.1, C= 0.5, D= -0.5
9. Find x-bar (the mean) for x= 2,8,4,4,6,8,10. A= 49, B= 42, C= 9, D= 6.
10. P (A intersection B) =.., A= P(A) P(A/B), B= P (B) P(B/B). C= P(A) P(B/A), D= none of these.
11. For a random sample 9 women the average resting pulse rate x= 76 beats per minute and the
sample standard deviation is s= 5. The standards error of the sample mean is. A= 0.557, B=
0.745, C= 1.667, D= 2.778.
12. Null and alternative hypothesis are statements about. A= population parameters, B= Sample
Parameters, C= Sample statistics, D= it depends- Sometimes population parameters and
sometimes sample parameters.
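Several answers in this quiz can be checked by direct enumeration. The following is a minimal Python sketch, not part of the original quiz; the variable names are illustrative. It counts the sample spaces in questions 1 and 4, the probability in question 3, and the standard error in question 11 (s divided by the square root of n).

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Question 1 and question 4: sizes of the sample spaces
four_coins = len(list(product("HT", repeat=4)))      # 2^4 outcomes
two_dice = len(list(product(range(1, 7), repeat=2)))  # 6^2 outcomes

# Question 3: probability of at least one head in three tosses
tosses = list(product("HT", repeat=3))
with_head = [t for t in tosses if "H" in t]
p_at_least_one_head = Fraction(len(with_head), len(tosses))

# Question 11: standard error of the mean, s / sqrt(n)
standard_error = 5 / sqrt(9)

print(four_coins, two_dice, p_at_least_one_head, round(standard_error, 3))
```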
2nd Quiz
Q.1. In order to carry out a chi square test on data in contingency table, the observed values in the table
should be. A= close to the expected values, b= all greater than or equal to 5. C = frequencies, d=
quantitative.
Q.2. If two attributes A and B have perfect association the value of coefficient of association is equal to.
A= +1, B= 0, C= -1, D= (r-1 x c-1)
Q.3. The degrees of freedom for chi square are (r-1)(c-1) for a contingency table with r rows and c columns, so for a 2*2 contingency table there are. A= one degree of freedom, b= two degrees of freedom,
c= three degrees of freedom, d= four degrees of freedom.
Q.4. For an r*c contingency table the number of degrees of freedom equal. A= rc, b= r+c, c= (r-1) + (c-1),
d= (r-1)(c-1)
Q.5. For a 3 * 3 contingency table the number of cells in the table are. A= 3, b= 6, c= 9, d= 4.
Q.6. The total area under the curve of chi-square distribution is, A= 1, B= 0.5, C= 0 to infinity, D= - infinity
to + infinity
Q.7. Chi-square curve ranges from. A= - infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.
Q.8.The value of chi- square statistics is always. A= negative, B= 0, C= non negative, D= one.
Q.9.The slope of the simple linear regression equation (x is the independent variable and y is the
dependent variable) represents the. A= mean value of y when x=0, b= change in mean value of y per
unit change in x, c= True value of y for a fixed value of x, d= Variance of the value of x.
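The contingency-table counts asked for in Q.3 to Q.5 follow from two small formulas. This is a rough sketch with hypothetical helper names (`chi_square_df`, `cell_count`) introduced only for illustration.

```python
def chi_square_df(rows: int, cols: int) -> int:
    """Degrees of freedom for a chi-square test on an r x c table: (r-1)(c-1)."""
    return (rows - 1) * (cols - 1)


def cell_count(rows: int, cols: int) -> int:
    """Number of cells in an r x c contingency table: r times c."""
    return rows * cols


print(chi_square_df(2, 2))  # Q.3: a 2*2 table
print(cell_count(3, 3))     # Q.5: a 3*3 table
```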
a)1.0
b)0.68
c)0.32
d)0.5
e)non of above
What does regression mean? The general process of predicting one variable from another
variable
what is the probability that a randomly selected value of a population is greater than median of
that population?
(0.5)
if P(A or B)=P(A), then
1. A and B are mutually exclusive 2. The venn diagram area of A
and B overlap 3. P (A) + P(B) is the joint probability of A and B. 4. None of these. 5. All of these.
If the null hypothesis is rejected, then we may be making
a. Correct decision
b. type I error
c. type II error
d. either A or B
e. either A or C
35. A bag contains one rupee, 50-paisa and 25-paisa coins in the ratio
2 : 3 : 5. Their total value is Rs.144. The value of 50-paisa coins is
Rs.24
Rs.36
Rs.48
Rs.72
Rs.80
36. Which of the following normal curves is most likely the curve for u=10, sigma =5?
Curve
for u=20, sigma=10
37. A number between 0 and 1, that is used to measure uncertainty is called Probability
38. Histogram is a graph of frequency distribution
39. Given sigma=80, n=625, u0=350 and x-bar=356, find Z?
1.88
40. The grouped data are called : Difficult to tell
41. u and sigma are parameters
z distribution
42. The values that separate the acceptance region from the rejection region is called Critical values
43. The test statistics is equal to: None of these
44. If a fair coin is tossed three times, what is the probability that at least one head appears? 7/8
45. Economists use regression analysis and base their predictions of the annual gross domestic
product (GDP) on the final consumption spending within the economy. What are the dependent
and independent variables for the analysis.
Dependent: GDP; Independent: final;
consumption spending
46. Square root of variance have only values:
non Negative
47. A frequency distribution that contains a class with limits of 10 and under 20 would have a
midpoints:
15
48. Z= _____
z = (x-bar - u) / (sigma / square root of n)
49. Which of the following represents the probability of mutually exclusive events A and B?
P(A)+P(B)
50. In testing hypothesis: alpha + beta is always equal to difficult to tell
51. The standard deviation of a binomial distribution depends upon: a= success, b= failure, c= trial,
d= b and c but not a, e= a, b and c
52. Suppose we want to test whether the population mean is significantly larger or smaller than 10.
What should our alternative hypothesis be?
u not equal to 10
53. The argument in which the order of the objects selected from a specific pool of objects is
important called
permutation
54. For an upper tailed test of the difference of two means based on dependent samples of size 6
and alpha =.05, the critical value for the test statistic is :
2.015
55. What is the probability that a value chosen at random from a particular population is larger than
the median of the population?
0.5
56. The accuracy of prediction of variable can be improved by adding more independent variables
to regression model
57. The power of test is equal to :
1-beta
58. Six white balls and four black balls, Which are indistinguishable apart from colour, are placed in
a bag. If six balls are taken from the bag, Find the probability of their being three white and
three black balls :
8/21
59. The probability of an event occurring given that another event had occurred is called:
Conditional probability
60. The Largest and the smallest values of any given class of a frequency distribution are called:
Class limits
61. If a coin is tossed thrice the sample space consist of 8 elements
62. If two dice are rolled, the possible outcomes are :
36
63. Which of the following is an example of a parameter? n or u
64. For two tailed test of hypothesis at alpha=0.10, the acceptance region is the entire region:
Between the two critical values
65. In a lower tailed test with alpha=0.05, the tabulated z is -1.645
66. A two tailed test of a difference between two proportions led to z=1.85, for its standardized
difference of sample proportions. For which of the following significance level would you reject
H0?
Alpha=0.10
67. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? 1= t, 2 = z, 3= x2,
4=f
68. The average of lower and upper class limits is called
Class mark (midpoint)
69. If total number of data points are 120, then we can make a total of __ number of classes. 8
70. Fisher test nm
71. Alpha + beta = 1
72. Upper tailed test
73. Which test will be used if the population is normal and the standard deviation is known: Z test
74. Which one of the following is discrete variable: Number of rooms
75. If the total number of data points are 120, then we can make a total of --------number of classes:
6.
76. If the dependent variable decreases as the independent variable increase: Negative linear
relationship.
77. Numerical quality that describe a population is called: Parameter
78. Degree of freedom of t distribution is. a). N+1, b). n-1, c). n. d). n-1/2
79. Given x-bar=100, sigma x-bar=16, and u0=90, find Z : 0.625 (about 0.63)
80. When n dice are rolled, the possible outcomes are: 6^n
81. The simple probability of an occurrence of an event is called: none of these.
82. When a null hypothesis is H0: u=42, then the alternative hypothesis can be : H1: u less than 42
83. If A is any event in S and A' its complement, then P(A') is equal to : 1-P(A)
84. In regression analysis the variable we would like to predict or explain is called : dependent
variable
85. Histogram is a graph of : frequency distribution.
86. Given H0: u=12, H1: u greater than 12, n=64, x-bar=15, sigma=10, alpha=0.05, find Z and make the
statistical decision: Z=2.4, reject H0
87. Given x =120, u0=100,s=34.75 and n=25, find t : 2.88
88. An arrangement in which the order of the objects selected from a specific pool of objects is
important called : permutation
89. With referring to a curve that tails off to the left end, you would call it : none of these
90. What does the term regression means: the general process of predicting one variable from
another variable.
91. Which of the following is the first step in calculating the median of data set. Array the data.
92. Which of the following is not a measure of central tendency? Geometric mean
93. f(x) represents the -------Variable : dependent.
94. The frequency divided by the total number of observations is called. Relative frequency
95. How does the computation of a sample variance differ from the computation of a population
variance? a). u is replaced by x, b). n is replaced by n-1, c) n is replaced by n, d) a and c but
not b, e) a and b but not c
96. If a sample of size m is drawn from one population and a sample of size n from another population, what are the respective degrees of
freedom if one has to apply Fisher's test. a= n-1,m-1, b= n,m. c= n-1,m+1. d= n-1,m. e= m-1,n
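The numeric answers to items 39, 58, 86 and 87 can be reproduced with a few lines of Python. This is only a sketch: the helper name `one_sample_statistic` is hypothetical, and item 86 is read here on the assumption that its "Phi=10" denotes a population standard deviation of 10.

```python
from fractions import Fraction
from math import comb, sqrt


def one_sample_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """(x-bar - u0) / (std_dev / sqrt(n)); a z or a t depending on which
    standard deviation (population sigma or sample s) is supplied."""
    return (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))


z_39 = one_sample_statistic(356, 350, 80, 625)     # item 39, about 1.88
z_86 = one_sample_statistic(15, 12, 10, 64)        # item 86, assumes sigma = 10
t_87 = one_sample_statistic(120, 100, 34.75, 25)   # item 87, uses sample s

# Item 58: 3 white and 3 black when 6 balls are drawn from 6 white + 4 black
p_58 = Fraction(comb(6, 3) * comb(4, 3), comb(10, 6))

print(z_39, z_86, round(t_87, 2), p_58)
```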
Previous quiz.
1. Which of the following is a criteria for selecting a regression line which best represents the data.
A= the mean of the data must agree with the line. B= the sum of squared differences between
the dependent variable must be minimized. C= the sum of the squared horizontal differences in
the independent variable must be minimized. D= the line must agree with at least half of the
data points
2. Which is the probability that a value at random from a particular population is larger than the
median of the population. A= 0.25, B= 0.5, C= 1.0, D= 0.67, E= none of these
3. P(A)=? A= number of favorable outcomes/total number of possible outcomes. B= total
number of possible outcomes/number of favorable outcomes, C= both a and b. D= none of
these
4. The weights in grams of 10 male and 10 female ring-necked pheasants are obtained. The
variances for the two groups are different. In order to test the hypothesis that the variance of the different
genders favors males over females, which of the following tests may be used? F-test, one tailed.
5. In an unpaired sample t-test with sample sizes n1=11 and n2=11, the tabulated value should be
obtained from. A= 10 degrees of freedom, B= 21 degrees of freedom, C= 22 degrees of freedom, D=
20 degrees of freedom.
6. Sum of (x - x-bar)(y - y-bar) = 0, sum of (x - x-bar)^2 = 10 and n=5; find the coefficient of
correlation. A= 1, B= 2, C= 0, D= 0.5
7. A time series has. A= two components, B= three components, C= four components, D= five
components.
8. If the regression lines of y on x and x on y are respectively given by 2x-3y=0 and 4y-5x=8, find the
values of the two regression coefficients of y on x and x on y. A= 3/2 and 5/4, B= and 1/5, C= 2/3
and 4/5, D= 2/5 and .
9. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,
D= (r+1)(c-1)
10. Given chi-square = 20.178, d.f. = 4 and alpha = 0.01, find the table value of chi-square and make the statistical
decision. A= chi-square 0.01(4) = 13.277, reject H0. B= chi-square 0.01(4) = 14.277, reject H1. C= chi-square
0.01(4) = 13.277, reject H1. D= chi-square 0.01(4) = 14.277, reject H0.
11. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=
smooth-out the time series. D= none of the above.
12. Suppose that y= 1, when x= 0, then y=2 where x= 2. Find the least square estimate b. A= 2.0, B=
1.0, C= 1.5, D= 2.5.
13. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.
14. Suppose that y= 1, when x= 0, then y=0 when x= 1 and that y=3 where is x=2. In this case find
the sample correlation, A= 1, B=2, C= 3, D= 4.
15. In semi averages method, if the number of values is odd then we drop: A= first value, B= third
value, C= last value, D= middle value, E=middle two value.
16. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate a. A= 2, B= 3, C= 4, D= 1.
17. Which of the following normal curves looks most like the curve for u=10, sigma=5
A= Curve for u=10, sigma=10. B= Curve for u=20, sigma=10. C= Curve for u=20, sigma=5. D= Curve for u=13, sigma=3. E= None of these.
35. A pair of dice thrown,find the probability of getting a total of 5 or 11. A=2/6. B=6/6. C=3/6.
D=1/6.
36. If the sample of size m is drawn from one population and size of n from another
population,what are the respective degrees of freedom if one has to apply fisher's test. A=n-1,
m-1 . B=n,m. C= n-1,m+1. D=n-1,m. E=m-1,n.
37. Which of the following is an example of a parameter. A=x. B=n. C=u. D= All of these. E= b and c
but not a.
38. When a null hypothesis is Ho,u=42 then the alternative hypothesis can be. A=H1,u>42.
B=H1;u<42. C=H1;u=40. D=H1;u=40.
39. Quantitative variables are variables measured on a---------scale. A=theoretical. B= numeric. C=
ordinary. D= ratio
40. For an upper tailed test of the difference of two means based on dependent samples of size 6
and a=0.05 the critical value for the test statistic is.A=2.015. B=1.645. C=1.812. D=1.782. E=None
of these.
41. F(x) represent the-------- variable. A=Independent. B=Dependent. C=a and b. D= none of these.
42. If we want to test whether the proportions of more than two populations are equal,we
use.A=Analysis of variance. B= Estimation. C=The variance.D=Internal estimates.E=none of
these.
43. In the case that two events a and b are mutually exclusive P(AUB). A=P(A)+P(B). B=P(A)+P(B)-P(A
intersection B). C=P(A)xP(B). D=P(A intersection B)/P(B).
44. For two tailed test of hypothesis at a=0.10 the acceptance region is the entire region.A=To the
right of the negative critical values.B=Between the two critical values.C=Outside of the two
critical values. D=To the left of the positive critical value. E=None of these.
45. Which of the following is true for any regression model. A=The y- intercept of the model must
agree with the y-intercept of the data. B=There will always be a linear relationship between a
regression model and data.C=The choice of regression model to the best represent the data is
based on observing the trend in data. D=The standard deviation of the regression model is
always exactly the same as the standard deviation of the data.
46. Specify which probability distribution to use in a hypothesis test and whether it will be one
tailed or two tailed given the following information. H0: u less than or equal to 27. H1: u>27. X-bar = 33,
standard deviation=4, n=50. A= z-test; one tailed. B= z-test; two tailed. C= t-test; one tailed. D= t-test; two tailed. E= None of these.
47. The square of the variance of a distribution is the. A= Standard deviation. B=Mean. C=Range.
D=Absolute deviation. E= None of these.
48. With a lower significance level, the probability of rejecting a null hypothesis that is actually true.
A=Decreases. B=Remains the same. C=Increases. D=Increases as the mean changes. E=None of
these.
49. The number of road accidents is a--------- random variable. A= Discrete. B= Continuous. C=Both.
D=None of above.
50. Six white balls and four black balls,which are indistinguishable apart from color,are placed in a
bag. If six balls are taken from the bag, find the probability of their being three white and three black.
A=8/10. B=8/21. C=10/21. D=21/8.
51. Square root of variance have only values. A=Less than 10. B= Greater than 10. C= Less than 0.
D=Greater than 0. E=Non negative.
52. Suppose we want to test whether a population mean is significantly larger or smaller than
10. What should our alternative hypothesis be? A=u<10. B=u>10. C=u=10. D=u not equal to 10.
E=None of these.
53. A two tailed test of a difference between two proportions led to z= 1.85 for its standardized
difference of sample proportions. for which of the following significance level would you reject
H0? A= alpha=0.05, B= alpha=0.10, C= alpha=0.02, D= (a) and (b), but not (c). E= none of these.
54. For a normal curve with u=55 and sigma=10, how much area will be found under the curve to the
right of the value 55? A= 1.0, B= 0.68, C= 0.5, D= 0.32, E= none of these
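Questions 35, 53 and 54 above can be checked numerically. The sketch below is illustrative and uses only the Python standard library; `standard_normal_cdf` is a helper defined here from `math.erf`, not something taken from the quiz.

```python
from fractions import Fraction
from itertools import product
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Question 35: total of 5 or 11 with a pair of dice
rolls = list(product(range(1, 7), repeat=2))
favourable = [r for r in rolls if sum(r) in (5, 11)]
p_5_or_11 = Fraction(len(favourable), len(rolls))

# Question 53: two-tailed p-value for z = 1.85
p_two_tailed = 2 * (1 - standard_normal_cdf(1.85))

# Question 54: area to the right of the mean of any normal curve
area_right_of_mean = 1 - standard_normal_cdf(0.0)

print(p_5_or_11, round(p_two_tailed, 4), area_right_of_mean)
```

Because the two-tailed p-value lies between 0.05 and 0.10, H0 would be rejected at alpha=0.10 but not at alpha=0.05 or alpha=0.02.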
Final paper
1. If the null hypothesis is rejected, then we may be making
a. Correct decision
b. type I error
c. type II error
d. either A or B
e. either A or C
2. Given rxy = -0.75, Sy = 5, sum of (x - x-bar)(y - y-bar) = -15n, find Sx. A= 5, b= 3, c= 2, d= 4
3. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=
smooth-out the time series. D= none of the above.
4. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.
5. Given x =120, u0=100,s=34.75 and n=25, find t : a= 3, b= 4, c= 2.88, d= 2
6. Given x-bar=1, y-bar=8 and b=2, find the value of the intercept a. a= 7, b= 6, c= 8, d= 10
7. Specify which probability distribution to use in a hypothesis test and whether it will be one
tailed or two tailed if the following information is given. H0: u=15, H1: u not equal to 15, x-bar=14.8, n=20. A= z-test; one tailed. B= z-test; two tailed. C= t-test; one tailed. D= t-test; two
tailed. E= a and b but not c
8. Given sigma=80, n=625, u0=350 and x-bar=356, find Z? a=3, b=1, c=2, d=1.88
9. Given chi-square = 20.178, d.f. = 4 and alpha = 0.01, find the table value of chi-square and make the statistical
decision. A= chi-square 0.01(4) = 13.277, reject H0. B= chi-square 0.01(4) = 14.277, reject H1. C= chi-square
0.01(4) = 13.277, reject H1. D= chi-square 0.01(4) = 14.277, reject H0.
10. Any statement whose validity is tested on the basis of a sample is called. A= null hypothesis, b=
alternative hypothesis, c= statistical hypothesis, d= simple hypothesis
11. Chi-square curve ranges from. A= - infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.
12. Semi-average method is used for measurement of trend values when. A=trend is linear, b=
observed data contains yearly values, c= the given time series contains odd number of values,
d= none of the above
13. Given sigma=80, n=625, u0=350 and x-bar=356, find Z? a=1.88, b=1.99, c=1.77, d=1.66
14. Given the equation of the straight line Y=a+bx, and the values a=45, b=-10 and x=3, find the
value of y. a=15, b=16, c=17, d=18
15. r=+1. A= no correlation. B=Negative correlation. C=Perfect correlation. D=None of above.
16. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,
D= (r+1)(c-1)
17. The hypothesis u less than 10 is a. a= simple hypothesis, b= composite hypothesis, c=
alternative hypothesis, d= difficult to tell
18. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate a. A= 2, B= 1, C= 1.5, D= 2.5.
19. If the respective values of f0= 21,38,32,29,36,25,41,23 and
fe=31.31,27.69,32.37,28.63,32.37,28.63,33.96,30.64,then find x2, a=12, b=13, c=11, d=11.22
20. P(type II error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.
21. Given H0: u=12, H1: u greater than 12, n=64, x-bar=15, sigma=10, alpha=0.05, find Z and make the
statistical decision: a= Z=2.4, reject H0, b= Z=2.4, accept H0, c= Z=3.4, reject H1, d= Z=3.4, reject
H0
22. Given f0=30, 75, 45, 30, 75, 45 fe=52.5, 52.5, 37.5, 60.0, 60.0, df=2 and alpha = 0.05 find x 2. A=
29.786, b= 30.0, c=26.99, d=23.
23. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? a= t, b = z, c= f,
d=X2
24. Given x-bar=100, sigma x-bar=16, and u0=90, find Z. A= 0.6, B= 0.63, C= 0.62, D= 0.5.
25. If X2=13.95, df=4, X20.05(4)=13.227, we make the following statistical decision. A= We accept H0 at
alpha = 0.01 and a=0.05, b= we reject H0 at alpha = 0.05 but not at a= 0.01, C= We reject H0 at
a=0.01 but not at a=0.05, d= we reject H0 at a=0.01 and a=0.05.
26. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the sample
correlation coefficient r? . A= 2, B= 1, C= 1.5, D= 2.5.
27. The alternative hypothesis is called, a= null hypothesis, b= statistical hypothesis, c= research
hypothesis, d= single hypothesis
28. A time series has. A= two components, B= three components, C= four components, D= five
components.
29. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least
square estimate b. A= 2, B= 1, C= 1.5, D= 2.5.
30. P(type I error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.
31. In the semi averages method, if the number of values is odd then we drop: A= first value, B=
third value, C= last value, D= middle value, E=middle two value.
32. _____ analysis is the statistical tool we can use to describe the degree to which one variable is linearly
related to another, a= regression, b= correlation, c= variances, d= none of the above.
33. A statement which is tested for the purpose of rejection under the assumption that it is true is
called. A= null hypothesis, B= alternative hypothesis, c= simple hypothesis, d= composite
hypothesis.
34. Which of the following is a criterion for selecting a regression line which best represents the data.
A= the mean of the data must agree with the line. B= the sum of squared differences between
the dependent variable must be minimized. C= the sum of the squared horizontal differences in
the independent variable must be minimized. D= the line must agree with at least half of the
data points
36. The choice of one-tailed test and two tailed test depends upon. A= null hypothesis, b=
alternative hypothesis, c= none of these, d= composite hypothesis.
37. Given chi-square = 20.178, d.f. = 4 and alpha = 0.01, find the table value of chi-square and make the statistical
decision. A= chi-square 0.01(4) = 13.277, reject H0. B= chi-square 0.01(4) = 14.277, reject H1. C= chi-square
0.01(4) = 13.277, reject H1. D= chi-square 0.01(4) = 14.277, reject H0.
38. In regression analysis the variable we would like to predict or explain is called : A. independent
variable b. dependent variable, c= regression coefficient, d= residual error
39. The degree of confidence is equal to . a= alpha, b= beta, c= 1-alpha, d= 1-beta.
40. Given u0 = 100, x-bar = 120, n = 25, s = 35.5, find t, which is 2.82
41. suppose that the null hypothesis is true and it is rejected, is known as. A= type I error, and its
probability is Beta. B= type I error, and its probability is alpha. C= type II error, and its
probability is alpha. D= type II error, and its probability is Beta.
42. Degree of freedom of t distribution is. a). N+1, b). n-1, c). n. d). n-1/2
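The least-squares questions in this paper (18, 26 and 29) all use the three points (0, 1), (1, 2) and (2, 3). A minimal sketch of the usual formulas, with a hypothetical helper name `least_squares`, gives the intercept a, the slope b and the sample correlation r; the same arithmetic covers questions 6 and 14.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (a, b, r) for the fitted line y = a + b*x and the sample correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    return a, b, r


a, b, r = least_squares([0, 1, 2], [1, 2, 3])

# Question 6: intercept from the means, a = y-bar - b * x-bar
intercept_q6 = 8 - 2 * 1
# Question 14: y = a + b*x with a=45, b=-10, x=3
y_q14 = 45 + (-10) * 3

print(a, b, r, intercept_q6, y_q14)
```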
Basic Probability
Formula, symbols
Complementary Events
Events in the whole sample space but not one of the
outcomes included in A are complementary.
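$P(\bar{A}) = P(\text{not } A) = 1 - P(A)$
Limits of P
$0 \le P(A) \le 1$
$P(A) = 0$ for an impossible event
$P(A) = 1$ for a certain event
$P(A \text{ or } B) = P(A \cup B)$
If A and B are mutually exclusive, $P(A \text{ and } B) = 0$ and $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B)$
In general, $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$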
Note that this rule applies regardless, as if there is no intersection, zero will be subtracted.
Conditional Probability
This arises when we are calculating the probabilities of a
particular event, A, given that we know the condition of
another event, B. It is the probability that an event
occurs given that another event has occurred.
$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{n(A \text{ and } B)}{n(B)}$
P(A|B) means : The probability that A will occur given that B has already occurred.
$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{P(A)\,P(B)}{P(B)} = P(A)$
i.e., if events A and B are independent then the conditional probability that A occurs, given that
event B has occurred, is simply the probability that event A occurs.
Expected Value
The expected value of a random variable is the
mean of the random variable
That is, to work out the expected value of a random variable, multiply each possible value of X by
its probability and add these products.
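As a rough illustration of this rule, the sketch below computes the expected value of a fair six-sided die. The die is a hypothetical example chosen here, not one from the notes, and `expected_value` is an illustrative helper name.

```python
from fractions import Fraction


def expected_value(distribution):
    """Sum of value * probability over a {value: probability} mapping."""
    return sum(value * prob for value, prob in distribution.items())


# Hypothetical example: each face of a fair die has probability 1/6
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_roll = expected_value(fair_die)

print(mean_roll)
```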
Presentation of information
As well as just being written out, information can be presented in a table or as a diagram.
Examine the following information.
The set of digits, D, contains the numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The set of even numbers, E, is {2, 4, 6, 8}
The set of odd numbers, O, is {1, 3, 5, 7, 9}
The set of prime numbers, P, is {2, 3, 5, 7}
This Venn diagram shows the relationships between the sets. Note there are some numbers in
more than one grouping and zero is all on its own.
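[Venn diagram: inside the set of digits D, circle O (odd numbers) holds 1 and 9 on its own and shares 3, 5 and 7 with circle P (prime numbers); P shares 2 with circle E (even numbers); E holds 4, 6 and 8 on its own; 0 lies outside all three circles.]
D = digits
E = even numbers
O = odd numbers
P = prime numbers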
A summary of this information could have been written in table form, showing the number of
digits in each category:
               Prime   Not Prime   Total
Odd numbers      3         2         5
Even numbers     1         3         4
Neither          0         1         1
Total            4         6        10
That is, there are 3 digits that are both odd and prime, 2 digits that are both odd and not
prime and 5 odd digits in total etc.
Study how the following probabilities are calculated, using the rules given above.
P(even) = 4/10 = 0.4
P(odd) = 5/10 = 0.5
P(prime and even) = 1/10 = 0.1 (from Venn Diagram)
P(prime and odd) = 3/10
or using formula: P(A or B) = P(A) + P(B) - P(A and B)
P(prime or even) = 4/10 + 4/10 - 1/10 = 7/10
P(prime or odd) = 5/10 + 4/10 - 3/10 = 6/10
Probabilities are very easy to calculate if data is given in table form. If you are given
information not in table form, try to tabulate it before you start your calculations.
The table is sometimes referred to as a contingency table.
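The digit example can be reproduced with Python sets, which makes the addition rule easy to verify. This is only a sketch; the helper `prob` is a name introduced here, and it assumes every digit in D is equally likely.

```python
from fractions import Fraction

D = set(range(10))        # digits
E = {2, 4, 6, 8}          # even numbers, as defined in the notes
O = {1, 3, 5, 7, 9}       # odd numbers
P = {2, 3, 5, 7}          # prime numbers


def prob(event):
    """Probability of an event when every digit in D is equally likely."""
    return Fraction(len(event & D), len(D))


# Direct count of the union versus the addition rule
p_prime_or_even = prob(P | E)
p_prime_or_even_rule = prob(P) + prob(E) - prob(P & E)

p_prime_or_odd = prob(P | O)
p_prime_or_odd_rule = prob(P) + prob(O) - prob(P & O)

print(p_prime_or_even, p_prime_or_odd)
```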
If mutually exclusive: P(A and B) = 0
Multiplication law: P(A and B) = P(A)P(B|A) = P(B)P(A|B)
If statistically independent: P(A and B) = P(A)P(B)
0.50
1.00.
2. When using the general multiplication rule, P(A and B) is equal to:
A P(A|B).P(B)
B P(A).P(B)
C P(B)/P(A)
D P(A)/P(B)
3. A recent survey of banks revealed the following distribution for the interest rate being
charged on a home loan (based on a 15-year mortgage with a 20% deposit):
Interest rate   7.0%   7.5%   8.0%   8.5%   > 8.5%
Probability     0.12   0.23   0.24   0.35   0.06
If a bank is selected at random from this distribution, what is the chance that the interest
rate charged on a home loan will exceed 8.0%?
A 0.06
B 0.41
C 0.59
D 1.00
B 50/400
C 195/400
D 245/400
5. Given that alcohol was not involved, what proportion of the accidents were multiple vehicle?
A 50/170
B 120/170
C 205/230
D 25/230
6. The connotation expected value or expected gain from playing Roulette at a casino
means:
A the amount you expect to gain on a
single play
C the amount you need to break even
over many plays
7. If two events are collectively exhaustive, the probability that one or the other occurs is
A 0
0.50
1.00
8. There are 100 female students and 230 male students in a class. The probability that a
randomly picked student is a female is:
A 0
0.50
0.30
9. According to a survey of American households, the probability that the residents own two
cars IF annual household income is over $25,000 is 80%. Of the households surveyed,
60% had incomes over $25,000 and 70% had two cars. The probability that the residents
of a household own two cars AND have an income less than or equal to $25,000 a year is:
A 0.12
B 0.18
C 0.22
D 0.48
10. A company has two machines that produce widgets. An older machine produces 23%
defective widgets, while the new machine produces only 8% defective widgets. In addition,
the new machine produces three times as many widgets as the older machine does.
Given a randomly chosen widget was tested and found to be defective, what is the
probability it was produced by the new machine?
A 0.08
B 0.15
C 0.489
D 0.511
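Questions 9 and 10 both rest on the multiplication law and the law of total probability. The following sketch works them through with exact fractions; the variable names are illustrative and the 1:3 production split is read from "three times as many widgets".

```python
from fractions import Fraction

# Question 9: P(two cars and income <= $25,000)
p_high_income = Fraction(60, 100)
p_two_cars = Fraction(70, 100)
p_two_cars_given_high = Fraction(80, 100)
p_two_cars_and_high = p_two_cars_given_high * p_high_income
p_two_cars_and_low = p_two_cars - p_two_cars_and_high

# Question 10: Bayes' rule for P(new machine | defective)
share_old, share_new = Fraction(1, 4), Fraction(3, 4)
defect_old, defect_new = Fraction(23, 100), Fraction(8, 100)
p_defective = share_old * defect_old + share_new * defect_new
p_new_given_defective = share_new * defect_new / p_defective

print(float(p_two_cars_and_low), round(float(p_new_given_defective), 3))
```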
          Male   Female
Passed     20      45
Failed     20      15
13. If a student is selected at random, what is the probability that the student passed QMET103?
14. If a student is selected at random, what is the probability that the student failed QMET103
AND is male?
15. Given that the selected student had passed, what is the probability that the student was
male?
16. A local retail store surveyed 1000 people and asked whether they intended to purchase a
large television over the next 12 months. Twelve months later, the same respondents
were contacted and asked whether they actually purchased the television.
Their responses are summarized in the following table:
                      Actually purchased
Planned to purchase   Yes    No
Yes                   200    50
No                    100    650
a) What is the probability that a randomly selected person planned to purchase a large
television?
b) What is the probability that a randomly selected person planned to purchase a
television AND actually purchased a television?
c) What is the probability that a randomly selected person planned to purchase a
television OR actually purchased a television?
d) Given that a randomly selected person planned to purchase a television, what is the
probability that he/she actually purchased a television?
e) Are the two events, planning to purchase a television and actually purchasing a
television, statistically independent? (Show working).
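Parts a) to e) can be worked from the four cell counts. The sketch below assumes the cell layout implied by the survey total of 1000: 200 planned and purchased, 50 planned but did not purchase, 100 purchased without planning, and 650 did neither. The dictionary keys are illustrative names.

```python
from fractions import Fraction

# Assumed cell layout: (planned?, purchased?) -> count
counts = {
    ("planned", "purchased"): 200,
    ("planned", "not purchased"): 50,
    ("not planned", "purchased"): 100,
    ("not planned", "not purchased"): 650,
}
total = sum(counts.values())

planned = sum(c for (plan, _), c in counts.items() if plan == "planned")
purchased = sum(c for (_, buy), c in counts.items() if buy == "purchased")
both = counts[("planned", "purchased")]

p_planned = Fraction(planned, total)                   # a)
p_both = Fraction(both, total)                         # b)
p_either = Fraction(planned + purchased - both, total)  # c)
p_purchased_given_planned = Fraction(both, planned)    # d)
# e) independence would require P(A and B) = P(A) * P(B)
independent = p_both == p_planned * Fraction(purchased, total)

print(p_planned, p_both, p_either, p_purchased_given_planned, independent)
```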
17. 300 students were sampled to determine attitudes to internal assessment workloads.
Students from both Commerce and Science Divisions were sampled and the following table
produced:
            Workload    Workload      Workload
            too light   about right   too much
Science     20          30            50
Commerce    100         20            80
a) What is the probability that a randomly selected person in the sample considers the
workload too light?
b) What is the probability that a randomly selected person in the sample considers the
workload about right AND too light?
c) What is the probability that a randomly selected person in the sample is a commerce
student OR considers the workload too much?
d) Given that a randomly selected student is from the Commerce Division, what is the
probability that the student considers the workload about right?
e) What is the probability that a randomly selected student is not a science student AND
they think the workload is too light?
18. There are 50 students in the Lincoln University Rugby Club and 20 of them take vitamin C
daily. 30% Rugby Club students catch a cold each year. 20% of students who take
Vitamin C every day caught a cold last year.
a) Prepare a contingency table for the above information.
b) What is the probability that a randomly selected student who does not take Vitamin C
every day caught a cold last year?
c) Given that the randomly selected student caught a cold last year, what is the
probability that he takes Vitamin C?
d) Are taking Vitamin C and catching a cold independent events? Support your answer
with appropriate mathematical calculations.
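One way to build the contingency table for question 18 is to convert the percentages to counts and fill the remaining cells by subtraction. This is a sketch under the reading that 30% of all 50 club members caught a cold; the variable names are illustrative.

```python
from fractions import Fraction

total = 50
vit_c = 20
no_vit_c = total - vit_c

colds = int(Fraction(30, 100) * total)        # 30% of all club members
colds_vit_c = int(Fraction(20, 100) * vit_c)  # 20% of Vitamin C takers
colds_no_vit_c = colds - colds_vit_c

# b) P(cold | no Vitamin C) and c) P(Vitamin C | cold)
p_cold_given_no_vit_c = Fraction(colds_no_vit_c, no_vit_c)
p_vit_c_given_cold = Fraction(colds_vit_c, colds)

# d) independence check: compare P(cold and Vit C) with P(cold) * P(Vit C)
p_joint = Fraction(colds_vit_c, total)
p_product = Fraction(colds, total) * Fraction(vit_c, total)
independent = p_joint == p_product

print(colds_vit_c, colds_no_vit_c, p_cold_given_no_vit_c, p_vit_c_given_cold, independent)
```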
19. A soft drink company is interested in introducing a new Cola brand to the market. Initially
they developed three different flavours and want to select the flavour which would be the
most popular one. Their research department randomly selected 100 males and 100
females and asked them to choose the best flavour between the three flavours (say A, B
and C). The results are summarised in the following table:
Flavour   A    B    C
Male      25   35   40
Female    30   50   20
2
6
A
B
Questions 7-10
1 C
Questions 11-15
1
0.4
0.8
0.65
0.2
0.31
Question 16
A
0.25
0.2
0.35
0.8
NO
Question 17
a
0.4
0.83
0.1
0.33
Question 18
a
              Took Vit C   NO Vit C   Total
Caught cold        4           11        15
NO Cold           16           19        35
Total             20           30        50
b 0.3667
c 0.2667
d P(vit C) x P(cold) = 0.4 x 0.3 = 0.12; P(cold and vit C) = 0.08, which is not equal to 0.12, so the events are not independent
Question 19
a
0.275
0.1
0.4
0.625
0.0889
MCQ 13.5
A quantitative statement about a population is called:
(a) Research hypothesis (b) Composite hypothesis (c) Simple hypothesis (d) Statistical hypothesis
MCQ 13.6
A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false is
called:
(a) Simple hypothesis
(b) Composite hypothesis (c) Statistical hypothesis
(d) Alternative hypothesis
MCQ 13.7
The alternative hypothesis is also called:
(a) Null hypothesis (b) Statistical hypothesis
MCQ 13.8
A hypothesis that specifies all the values of parameter is called:
(a) Simple hypothesis
(b) Composite hypothesis
(c) Statistical hypothesis
MCQ 13.9
The hypothesis 10 is a:
(a) Simple hypothesis
MCQ 13.10
If a hypothesis specifies the population distribution is called:
(a) Simple hypothesis
(b) Composite hypothesis
(c) Alternative hypothesis
MCQ 13.11
A hypothesis may be classified as:
(a) Simple
(b) Composite
(c) Null
MCQ 13.12
The probability of rejecting the null hypothesis when it is true is called:
(a) Level of confidence
(b) Level of significance
(c) Power of the test
MCQ 13.13
The dividing point between the region where the null hypothesis is rejected and the region where it is not
rejected is said to be:
(a) Critical region
(b) Critical value
(c) Acceptance region
(d) Significant region
MCQ 13.14
If the critical region is located equally in both sides of the sampling distribution of test-statistic, the test is
called:
(a) One tailed
(b) Two tailed
(c) Right tailed
(d) Left tailed
MCQ 13.15
The choice of one-tailed test and two-tailed test depends upon:
(a) Null hypothesis
(b) Alternative hypothesis
(c) None of these
MCQ 13.16
Test of hypothesis Ho: μ = 50 against H1: μ > 50 leads to:
(a) Left-tailed test
(b) Right-tailed test
(c) Two-tailed test
MCQ 13.17
Test of hypothesis Ho: μ = 20 against H1: μ < 20 leads to:
(a) Right one-sided test
(b) Left one-sided test
(c) Two-sided test
MCQ 13.18
Testing Ho: = 25 against H1: 20 leads to:
(a) Two-tailed test (b) Left-tailed test (c) Right-tailed test (d) Neither (a), (b) and (c)
MCQ 13.19
A rule or formula that provides a basis for testing a null hypothesis is called:
(a) Test-statistic
(b) Population statistic
(c) Both of these
MCQ 13.20
The range of test statistic-Z is:
(a) 0 to 1
(b) -1 to +1
(c) 0 to ∞
(d) -∞ to +∞
MCQ 13.21
The range of test statistic-t is:
(a) 0 to ∞
(b) 0 to 1
(c) -∞ to +∞
(d) -1 to +1
MCQ 13.22
If Ho is true and we reject it is called:
(a) Type-I error
(b) Type-II error
MCQ 13.23
The probability associated with committing type-I error is:
(a)
(b)
(c) 1
(d) 1
MCQ 13.24
A failing student is passed by an examiner, it is an example of:
(a) Type-I error
(b) Type-II error (c) Unbiased decision
MCQ 13.25
A passing student is failed by an examiner, it is an example of:
(a) Type-I error (b) Type-II error
(c) Best decision
(d) All of the above
MCQ 13.26
1 is also called:
(a) Confidence coefficient
MCQ 13.27
1 is the probability associated with:
(a) Type-I error
(b) Type-II error
MCQ 13.28
Area of the rejection region depends on:
(a) Size of
(b) Size of
(c) Test-statistic
MCQ 13.29
Size of critical region is known as:
(a)
(b) 1 -
MCQ 13.30
A null hypothesis is rejected if the value of a test statistic lies in the:
(a) Rejection region
(b) Acceptance region (c) Both (a) and (b)
MCQ 13.31
The test statistic is equal to:
MCQ 13.32
Level of significance is also called:
(a) Power of the test
(b) Size of the test (c) Level of confidence
MCQ 13.33
Level of significance lies between:
(a) -1 and +1
(b) 0 and 1
(c) 0 and n
(d) -∞ to +∞
MCQ 13.34
Critical region is also called:
(a) Acceptance region
(b) Rejection region
MCQ 13.35
The probability of rejecting Ho when it is false is called:
(a) Power of the test
(b) Size of the test
(c) Level of confidence
MCQ 13.36
Power of a test is related to:
(a) Type-I error
(b) Type-II error
MCQ 13.37
In testing hypothesis α + β is always equal to:
(a) One
(b) Zero
(c) Two
MCQ 13.38
The significance level is the risk of:
(a) Rejecting Ho when Ho is correct
(c) Rejecting H1 when H1 is correct
MCQ 13.39
An example in a two-sided alternative hypothesis is:
(a) H1: < 0
(b) H1: > 0
(c) H1: 0
(d) H1: 0
MCQ 13.40
If the magnitude of calculated value of t is less than the tabulated value of t and H1 is two-sided, we
should:
(a) Reject Ho
(b) Accept H1
(c) Not reject Ho
(d) Difficult to tell
MCQ 13.41
Accepting a null hypothesis Ho:
(a) Proves that Ho is true
(c) Implies that Ho is likely to be true
MCQ 13.42
The chance of rejecting a true hypothesis decreases when sample size is:
(a) Decreased
(b) Increased
(c) Constant
MCQ 13.43
The equality condition always appears in:
(a) Null hypothesis
(b) Simple hypothesis (c) Alternative hypothesis
MCQ 13.44
Which hypothesis is always in an inequality form?
(a) Null hypothesis
(b) Alternative hypothesis
MCQ 13.45
Which of the following is composite hypothesis?
(a) o
(b) o
(c) = o
(d) o
MCQ 13.46
P (Type I error) is equal to:
(a) 1
(b) 1
(c)
MCQ 13.47
P (Type II error) is equal to:
(a)
(b)
(c) 1
(d) 1
MCQ 13.48
The power of the test is equal to:
(a)
(b)
(c) 1
(d) 1
(d)
MCQ 13.49
The degree of confidence is equal to:
(a)
(b)
(c) 1
MCQ 13.50
α / 2 is called:
(a) One tailed significance level
(c) Left tailed significance level
(d) 1
MCQ 13.51
Student's t-test is applicable only when:
(a) n30 and σ is known (b) n>30 and σ is unknown (c) n=30 and σ is known (d) All of the above
MCQ 13.52
Student's t-statistic is applicable in case of:
(a) Equal number of samples (b) Unequal number of samples (c) Small samples (d) All of the above
MCQ 13.53
Paired t-test is applicable when the observations in the two samples are:
(a) Equal in number (b) Paired
(c) Correlation
(d) All of the above
MCQ 13.54
The degree of freedom for paired t-test based on n pairs of observations is:
(a) 2n - 1
(b) n - 2
(c) 2(n - 1) (d) n - 1
MCQ 13.55
The test-statistic
(a) n
(b) n - 1
(d) n1 + n2 - 2
MCQ 13.56
In an unpaired samples t-test with sample sizes n1= 11 and n2= 11, the value of tabulated t should be
obtained for:
(a) 10 degrees of freedom
(b) 21 degrees of freedom
(c) 22 degrees of freedom
(d) 20 degrees of freedom
MCQ 13.57
In analyzing the results of an experiment involving seven paired samples, tabulated t should be
obtained for:
(a) 13 degrees of freedom
(b) 6 degrees of freedom
(c) 12 degrees of freedom
(d) 14 degrees of freedom
MCQ 13.58
The mean difference between 16 paired observations is 25 and the standard deviation of differences is
10. The value of statistic-t is:
(a) 4
(b) 10
(c) 16
(d) 25
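The degrees-of-freedom and paired-t questions above (MCQ 13.54 to 13.58) all rest on two small formulas. The sketch below illustrates them; the function names are hypothetical, and it assumes the null hypothesis sets the mean difference to zero, which the question does not state explicitly.

```python
import math


def paired_t(mean_diff, sd_diff, n_pairs, mu0=0.0):
    """Paired t statistic: (mean difference - hypothesised difference) / standard error."""
    standard_error = sd_diff / math.sqrt(n_pairs)
    return (mean_diff - mu0) / standard_error


def df_paired(n_pairs):
    """Degrees of freedom for a paired t-test on n pairs."""
    return n_pairs - 1


def df_pooled(n1, n2):
    """Degrees of freedom for an unpaired (pooled-variance) two-sample t-test."""
    return n1 + n2 - 2


# MCQ 13.58: 16 pairs, mean difference 25, standard deviation of differences 10
# (assumption: the hypothesised mean difference is 0).
t_13_58 = paired_t(mean_diff=25, sd_diff=10, n_pairs=16)

# MCQ 13.56: unpaired samples with n1 = n2 = 11.
df_13_56 = df_pooled(11, 11)

# MCQ 13.57: seven paired samples.
df_13_57 = df_paired(7)

print(t_13_58, df_13_56, df_13_57)
```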
MCQ 13.59
Statistic-t is defined as deviation of sample mean from population mean expressed in terms of:
(a) Standard deviation
(b) Standard error
(c) Coefficient of standard deviation
(d) Coefficient of variation
MCQ 13.60
Student's t-distribution has (n-1) d.f. when all the n observations in the sample are:
(a) Dependent
(b) Independent (c) Maximum
(d) Minimum
MCQ 13.61
The number of independent values in a set of values is called:
(a) Test-statistic
(b) Degree of freedom
(c) Level of significance (d) Level of confidence
MCQ 13.62
The purpose of statistical inference is:
(a) To collect sample data and use them to formulate hypotheses about a population
(b) To draw conclusion about populations and then collect sample data to support the conclusions
(c) To draw conclusions about populations from sample data
(d) To draw conclusions about the known value of population parameter
MCQ 13.63
Suppose that the null hypothesis is true and it is rejected, is known as:
(a) A type-I error, and its probability is
(b) A type-I error, and its probability is
(c) A type-II error, and its probability is
(d) A type-II error, and its probability is
MCQ 13.64
An advertising agency wants to test the hypothesis that the proportion of adults in Pakistan who read a Sunday
Magazine is 25 percent. The null hypothesis is that the proportion reading the Sunday Magazine is:
(a) Different from 25%
(b) Equal to 25%
(c) Less than 25 %
(d) More than 25 %
MCQ 13.65
If the mean of a particular population is o,
is distributed:
is distributed:
(a) As a standard normal variable, if both samples are independent and less than 30
(b) As a standard normal variable, if both populations are normal
(c) As both (a) and (b) state
(d) As the t-distribution with n1 + n2 - 2 degrees of freedom
MCQ 13.67
If the population proportion equals po, then
(a) As a standard normal variable, if n > 30
(b) As a Poisson variable
(c) As the t-distribution with v = n - 1 degrees of freedom
(d) As a distribution with v degrees of freedom
is distributed:
MCQ 13.68
When σ is known, the hypothesis about population mean is tested by:
(a) t-test
(b) Z-test
(c) χ²-test
(d) F-test
MCQ 13.69
Given o = 130,
(a) t
MCQ 13.70
Given Ho: μ = μo, H1: μ ≠ μo, α = 0.05 and we reject Ho; the absolute value of the Z-statistic must have equalled
or been beyond what value?
(a) 1.96
(b) 1.65
(c) 2.58
(d) 2.33
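The four candidate values in MCQ 13.70 are the usual standard normal critical values for one- and two-sided tests at the 5% and 1% levels. A minimal sketch using Python's standard library shows where they come from; the helper name `z_critical` is illustrative, not from the source.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided=True):
    """Standard normal critical value for significance level alpha.

    A two-sided test puts alpha / 2 in each tail; a one-sided test puts
    all of alpha in a single tail.
    """
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01):
    print(
        alpha,
        round(z_critical(alpha, two_sided=True), 3),
        round(z_critical(alpha, two_sided=False), 3),
    )
```

For a two-sided alternative at the 5% level, the null hypothesis is rejected when the absolute value of Z reaches the two-sided value returned for `alpha=0.05`.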
MCQ 13.71
If p1 and p2 are not identical, then standard error of the difference of proportions (p1 - p2) is:
MCQ 13.72
Under the hypothesis Ho: p1 = p2, the formula for the standard error of the difference between
proportions (p1 - p2) is:
2.
If there is a very strong correlation between two variables then the correlation coefficient must be
a. any value larger than 1
b. much smaller than 0, if the correlation is negative
c. much larger than 0, regardless of whether the correlation is negative or positive
d. None of these alternatives is correct.
3.
In regression, the equation that describes how the response variable (y) is related to the
explanatory variable (x) is:
a. the correlation model
b. the regression model
c. used to compute the correlation coefficient
d. None of these alternatives is correct.
4.
The relationship between number of beers consumed (x) and blood alcohol content (y) was studied
in 16 male college students by using least squares regression. The following regression equation
was obtained from this study:
$\hat{y} = -0.0127 + 0.0180x$
The above equation implies that:
a. each beer consumed increases blood alcohol by 1.27%
b. on average it takes 1.8 beers to increase blood alcohol content by 1%
c. each beer consumed increases blood alcohol by an average of amount of 1.8%
d. each beer consumed increases blood alcohol by exactly 0.018
5.
6.
7.
8.
Regression analysis was applied to return rates of sparrowhawk colonies. Regression analysis was
used to study the relationship between return rate (x: % of birds that return to the colony in a given
year) and immigration rate (y: % of new adults that join the colony per year). The following
regression equation was obtained.
$\hat{y} = 31.9 - 0.34x$
Based on the above estimated regression equation, if the return rate were to decrease by 10% the
rate of immigration to the colony would:
a. increase by 34%
b. increase by 3.4%
c. decrease by 0.34%
d. decrease by 3.4%
9.
In least squares regression, which of the following is not a required assumption about the error
term ε?
a. The expected value of the error term is one.
b. The variance of the error term is the same for all values of x.
c. The values of the error term are independent.
d. The error term is normally distributed.
10.
Larger values of r² (R²) imply that the observations are more closely grouped about the
a. average value of the independent variables
b. average value of the dependent variable
c. least squares line
d. origin
11.
12.
13.
In regression analysis, the variable that is used to explain the change in the outcome of an
experiment, or some natural process, is called
a. the x-variable
b. the independent variable
c. the predictor variable
d. the explanatory variable
e. all of the above (a-d) are correct
f. none are correct
14.
In the case of an algebraic model for a straight line, if a value for the x variable is specified, then
a. the exact value of the response variable can be computed
b. the computed response to the independent value will always give a minimal residual
c. the computed value of y will always be the best estimate of the mean response
d. none of these alternatives is correct.
15.
A regression analysis between sales (in $1000) and price (in dollars) resulted in the following
equation:
$\hat{y} = 50{,}000 - 8X$
The above equation implies that an
a. increase of $1 in price is associated with a decrease of $8 in sales
b. increase of $8 in price is associated with an increase of $8,000 in sales
c. increase of $1 in price is associated with a decrease of $42,000 in sales
d. increase of $1 in price is associated with a decrease of $8000 in sales
16.
17.
18.
19.
20.
21.
The data are the same as for question 4 above. The relationship between number of beers
consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least
squares regression. The following regression equation was obtained from this study:
$\hat{y} = -0.0127 + 0.0180x$
Suppose that the legal limit to drive is a blood alcohol content of 0.08. If Ricky consumed 5 beers
the model would predict that he would be:
a. 0.09 above the legal limit
b. 0.0027 below the legal limit
c. 0.0027 above the legal limit
d. 0.0733 above the legal limit
22.
In a regression analysis if SSE = 200 and SSR = 300, then the coefficient of determination is
a. 0.6667
b. 0.6000
c. 0.4000
d. 1.5000
23.
If the correlation coefficient is 0.8, the percentage of variation in the response variable explained
by the variation in the explanatory variable is
a. 0.80%
b. 80%
c. 0.64%
d. 64%
24.
If the correlation coefficient is a positive value, then the slope of the regression line
a. must also be positive
b. can be either negative or positive
c. can be zero
d. can not be zero
25.
26.
27.
Regression analysis was applied between $ sales (y) and $ advertising (x) across all the branches
of a major international corporation. The following regression function was obtained.
$\hat{y} = 5000 + 7.25x$
If the advertising budgets of two branches of the corporation differ by $30,000, then what will be
the predicted difference in their sales?
a. $217,500
b. $222,500
c. $5000
d. $7.25
28.
Suppose the correlation coefficient between height (as measured in feet) versus weight (as
measured in pounds) is 0.40. What is the correlation coefficient of height measured in inches
versus weight measured in ounces? [12 inches = one foot; 16 ounces = one pound]
a. 0.40
b. 0.30
c. 0.533
d. cannot be determined from information given
e. none of these
29.
Assume the same variables as in question 28 above; height is measured in feet and weight is
measured in pounds. Now, suppose that the units of both variables are converted to metric (meters
and kilograms). The impact on the slope is:
a. the sign of the slope will change
b. the magnitude of the slope will change
c. both a and b are correct
d. neither a nor b are correct
30.
Suppose that you have carried out a regression analysis where the total variance in the response is
133452 and the correlation coefficient was 0.85. The residual sums of squares is:
a. 37032.92
b. 20017.8
c. 113434.2
d. 96419.07
e. 15%
f. 0.15
31.
This question is related to questions 4 and 21 above. The relationship between number of beers
consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least
squares regression. The following regression equation was obtained from this study:
$\hat{y} = -0.0127 + 0.0180x$
Another guy, his name Dudley, has the regression equation written on a scrap of paper in his
pocket. Dudley goes out drinking and has 4 beers. He calculates that he is under the legal limit
(0.08) so he decides to drive to another bar. Unfortunately Dudley gets pulled over and
confidently submits to a road-side blood alcohol test. He scores a blood alcohol of 0.085 and gets
himself arrested. Obviously, Dudley skipped the lecture about residual variation. Dudley's
residual is:
a. +0.005
b. -0.005
c. +0.0257
d. -0.0257
32.
You have carried out a regression analysis; but, after thinking about the relationship between
variables, you have decided you must swap the explanatory and the response variables. After
refitting the regression model to the data you expect that:
a. the value of the correlation coefficient will change
b. the value of SSE will change
c. the value of the coefficient of determination will change
d. the sign of the slope will change
e. nothing changes
33.
Suppose you use regression to predict the height of a woman's current boyfriend by using her own
height as the explanatory variable. Height was measured in feet from a sample of 100 women
undergraduates, and their boyfriends, at Dalhousie University. Now, suppose that the height of
both the women and the men are converted to centimeters. The impact of this conversion on the
slope is:
a. the sign of the slope will change
b. the magnitude of the slope will change
c. both a and b are correct
d. neither a nor b are correct
34.
A residual plot:
a. displays residuals of the explanatory variable versus residuals of the response variable.
b. displays residuals of the explanatory variable versus the response variable.
c. displays explanatory variable versus residuals of the response variable.
d. displays the explanatory variable versus the response variable.
e. displays the explanatory variable on the x axis versus the response variable on the y axis.
35.
When the error terms have a constant variance, a plot of the residuals versus the independent
variable x has a pattern that
a. fans out
b. funnels in
c. fans out, but then funnels in
d. forms a horizontal band pattern
e. forms a linear pattern that can be positive or negative
36.
You studied the impact of the dose of a new drug treatment for high blood pressure. You think
that the drug might be more effective in people with very high blood pressure. Because you
expect a bigger change in those patients who start the treatment with high blood pressure, you use
regression to analyze the relationship between the initial blood pressure of a patient (x) and the
change in blood pressure after treatment with the new drug (y). If you find a very strong positive
association between these variables, then:
a. there is evidence that the higher the patient's initial blood pressure, the bigger the impact
of the new drug.
b. there is evidence that the higher the patient's initial blood pressure, the smaller the impact
of the new drug.
c. there is evidence for an association of some kind between the patient's initial blood
pressure and the impact of the new drug on the patient's blood pressure
d. none of these are correct, this is a case of regression fallacy
Question 37:
A variety of summary statistics were collected for a small sample (10) of bivariate data, where the
dependent variable was y and an independent variable was x.
X = 90
Y = 170
n = 10
$\sum (Y - \bar{Y})(X - \bar{X}) = 466$
$\sum (X - \bar{X})^2 = 234$
$\sum (Y - \bar{Y})^2 = 1434$
SSE = 505.98
37.1
Use the formula to the right to compute the sample correlation coefficient:
a. 0.8045
b. -0.8045
c. 0
d. 1
37.2
37.3
37.4
37.5
37.6
Answers:
1. c
2. b
3. b
4. c
5. a
6. c
7. a
8. b
9. a
10. c
11. b
12. b
13. e
14. a
15. d
16. d
17. c
18. c
19. b
20. d
21. b
22. b
23. d
24. a
25. b
26. a
27. a
28. a
29. b
30. a
31. c
32. b
33. d
34. c
35. d
36. d
37.1 a
37.2 b
37.3 d
37.4 d
37.5 a
37.6 a
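Several of the regression questions above are purely numerical and can be re-derived in a few lines. The sketch below recomputes the quantities behind questions 21, 22, 23, 27, 30, 31 and 37.1 from the figures given in the questions. The variable names are illustrative, and for question 30 it assumes that the "total variance" quoted in the question is the total sum of squares.

```python
import math


def predict(intercept, slope, x):
    """Fitted value from a simple linear regression line."""
    return intercept + slope * x


# Questions 21 and 31: blood alcohol model y-hat = -0.0127 + 0.0180x.
bac_five_beers = predict(-0.0127, 0.0180, 5)
below_limit_q21 = 0.08 - bac_five_beers  # positive means below the legal limit
residual_q31 = 0.085 - predict(-0.0127, 0.0180, 4)  # observed minus predicted

# Question 22: coefficient of determination = SSR / (SSR + SSE).
r2_q22 = 300 / (300 + 200)

# Question 23: share of variation explained is the squared correlation.
r2_q23 = 0.8 ** 2

# Question 27: the intercept cancels, so the difference is slope * difference in x.
sales_diff_q27 = 7.25 * 30_000

# Question 30: residual sum of squares = SST * (1 - r^2),
# assuming the quoted total variance is the total sum of squares.
sse_q30 = 133452 * (1 - 0.85 ** 2)

# Question 37.1: r = Sxy / sqrt(Sxx * Syy); SSE = Syy - Sxy^2 / Sxx.
s_xy, s_xx, s_yy = 466, 234, 1434
r_q37 = s_xy / math.sqrt(s_xx * s_yy)
sse_q37 = s_yy - s_xy ** 2 / s_xx

print(round(below_limit_q21, 4), round(residual_q31, 4))
print(r2_q22, round(r2_q23, 2), sales_diff_q27)
print(round(sse_q30, 2), round(r_q37, 4), round(sse_q37, 2))
```

The last line also recovers the SSE value quoted in Question 37, which confirms that the three sums in that question are a cross-product sum and two sums of squares.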
Chapter 15
Multiple Choice Questions
(The answers are provided after the last question.)
1. What is the median of the following set of scores?
18, 6, 12, 10, 14 ?
a. 10
b. 14
c. 18
d. 12
2. Approximately what percentage of scores fall within one standard deviation of the mean in a
normal distribution?
a. 34%
b. 95%
c. 99%
d. 68%
3. The denominator (bottom) of the z-score formula is
a. The standard deviation
b. The difference between a score and the mean
c. The range
d. The mean
4. Let's suppose we are predicting score on a training posttest from number of years
of education and the score on an aptitude test given before training. Here is the regression
equation
Y = 25 + .5X1 +10X2,
where X1 = years of education and X2 = aptitude test score.
What is the
predicted score for someone with 10 years of education and an aptitude test score of 5?
a. 25
b. 50
c. 35
d. 80
5. The standard deviation is:
a. The square root of the variance
b. A measure of variability
c. An approximate indicator of how numbers vary from the mean
d. All of the above
6. Hypothesis testing and estimation are both types of descriptive statistics.
a. True
b. False
b. Regression coefficient
c. Regression equation
d. Regression line
15. The ______ is the value you calculate when you want the arithmetic average.
a. Mean
b. Median
c. Mode
d. All of the above
16. ___________ are used when you want to visually examine the relationship between two
quantitative variables.
a. Bar graphs
b. Pie graphs
c. Line graphs
d. Scatterplots
17. The _______ is often the preferred measure of central tendency if the data are severely
skewed.
a. Mean
b. Median
c. Mode
d. Range
18. Which of the following is the formula for range?
a. H + L
b. L x H
c. L - H
d. H - L
19. Which is a raw score that has been transformed into standard deviation units?
a. z score
b. SDU score
c. t score
d. e score
20. Which of the following is NOT a measure of variability?
a. Median
b. Variance
c. Standard deviation
d. Range
21. Which of the following is NOT a common measure of central tendency?
a. Mode
b. Range
c. Median
d. Mean
22. What is the median of this set of numbers: 4, 6, 7, 9, 2000000?
a. 7.5
b. 6
c. 7
d. 4
23. What is the mean of this set of numbers: 4, 6, 7, 9, 2000000?
a. 7.5
b. 400,005.2
c. 7
d. 4
24. Which of the following is interpreted as the percentage of scores in a reference group that
falls below a particular raw score?
a. Standard scores
b. Percentile rank
c. Reference group
d. None of the above
25. The median is ______.
a. The middle point
b. The highest number
c. The average
d. Affected by extreme scores
26. Which measure of central tendency takes into account the magnitude of scores?
a. Mean
b. Median
c. Mode
d. Range
27. If a test was generally very easy, except for a few students who had very
low scores, then the distribution of scores would be _____.
a. Positively skewed
b. Negatively skewed
c. Not skewed at all
d. Normal
28. How many dependent variables are used in multiple regression?
a. One
b. One or more
c. Two or more
d. Two
29. Which of the following represents the fiftieth percentile, or the middle point in a set of
numbers arranged in order of magnitude?
a. Mode
b. Median
c. Mean
d. Variance
30. If a distribution is skewed to the left, then it is __________.
a. Negatively skewed
b. Positively skewed
c. Symmetrically skewed
d. Symmetrical
31. In a grouped frequency distribution, the intervals should be what?
a. Mutually exclusive
b. Exhaustive
c. Both A and B
d. Neither A nor B
32. When a set of numbers is heterogeneous, you can place more trust in the measure of central
tendency as representing the typical person or unit.
a. True
b. False
33. Non-overlapping categories or intervals are known as ______.
a. Inclusive
b. Exhaustive
c. Mutually exclusive
d. Mutually exclusive and exhaustive
34. To interpret the relationship between two categorical variables, a contingency table should be
constructed with either column or row percentages, and ----.
a. If the percentages are calculated down the columns, then comparisons should be made across
the rows
b. If the percentages are calculated across the rows, comparisons should be made down the
columns
c. Both a and b are correct
d. Neither a nor b is correct
Answers:
1. d
2. d
3. a
4. d
5. d
6. b
7. a
8. b
9. b
10. c
11. a
12. c
13. a
14. a
15. a
16. d
17. b
18. d
19. a
20. a
21. b
22. c
23. b
24. b
25. a
26. a
27. b
28. a
29. b
30. a
31. c
32. b
33. c
34. c
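The computational answers in this key (questions 1, 4, 22 and 23) can be reproduced with Python's standard `statistics` module. This is a small illustrative check, with variable names chosen here rather than taken from the source.

```python
from statistics import mean, median

# Question 1: median of five scores (sorted: 6, 10, 12, 14, 18).
median_q1 = median([18, 6, 12, 10, 14])

# Question 4: Y = 25 + .5*X1 + 10*X2 with X1 = 10 years of education, X2 = 5.
predicted_q4 = 25 + 0.5 * 10 + 10 * 5

# Questions 22 and 23: one extreme score barely moves the median
# but drags the mean far away from the typical values.
scores = [4, 6, 7, 9, 2_000_000]
median_q22 = median(scores)
mean_q23 = mean(scores)

print(median_q1, predicted_q4, median_q22, mean_q23)
```

Comparing `median_q22` with `mean_q23` also illustrates question 17: the median is usually preferred when the data are severely skewed.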
STATISTICS 8
CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS
Correct answers are in bold italics.
This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of
coal miners to the lung capacity of farm workers. The researcher studied 200 workers of each
type. Other factors that might affect lung capacity are smoking habits and exercise habits. The
smoking habits of the two worker types are similar, but the coal miners generally exercise less
than the farm workers.
1. Which of the following is the explanatory variable in this study?
a. Exercise
b. Lung capacity
c. Smoking or not
d. Occupation
2. Which of the following is a confounding variable in this study?
a. Exercise
b. Lung capacity
c. Smoking or not
d. Occupation
This scenario applies to Questions 3 to 5: A randomized experiment was done by randomly
assigning each participant either to walk for half an hour three times a week or to sit quietly
reading a book for half an hour three times a week. At the end of a year the change in
participants' blood pressure over the year was measured, and the change was compared for the
two groups.
3. This is a randomized experiment rather than an observational study because:
a. Blood pressure was measured at the beginning and end of the study.
b. The two groups were compared at the end of the study.
c. The participants were randomly assigned to either walk or read, rather than choosing
their own activity.
d. A random sample of participants was used.
4. The two treatments in this study were:
a. Walking for half an hour three times a week and reading a book for half an hour three
times a week.
b. Having blood pressure measured at the beginning of the study and having blood pressure
measured at the end of the study.
c. Walking or reading a book for half an hour three times a week and having blood pressure
measured.
d. Walking or reading a book for half an hour three times a week and doing nothing.
10. A polling agency conducted a survey of 100 doctors on the question "Are you willing to treat
women patients with the recently approved pill RU-486?" The conservative margin of error
associated with the 95% confidence interval for the percent who say 'yes' is
a. 50%
b. 10%
c. 5%
d. 2%
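Question 10 relies on the common rule of thumb that the conservative 95% margin of error for a sample proportion is about 1/sqrt(n). The sketch below compares that shortcut with the worst-case normal-approximation formula at p = 0.5; both function names are hypothetical, and the source does not say which formula the course intends.

```python
import math


def conservative_margin_of_error(n):
    """Rule-of-thumb 95% margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


def worst_case_margin(n, z=1.96):
    """Normal-approximation margin z * sqrt(p(1 - p) / n), evaluated at p = 0.5."""
    return z * math.sqrt(0.25 / n)


moe_q10 = conservative_margin_of_error(100)
print(moe_q10, round(worst_case_margin(100), 3))
```

The two values are close because 1.96 * 0.5 is nearly 1, which is why the shortcut is called conservative.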
11. Which one of these statistics is unaffected by outliers?
a. Mean
b. Interquartile range
c. Standard deviation
d. Range
12. A list of 5 pulse rates is: 70, 64, 80, 74, 92. What is the median for this list?
a. 74
b. 76
c. 77
d. 80
13. Which of the following would indicate that a dataset is not bell-shaped?
a. The range is equal to 5 standard deviations.
b. The range is larger than the interquartile range.
c. The mean is much smaller than the median.
d. There are no outliers.
14. A scatter plot of number of teachers and number of people with college degrees for cities in
California reveals a positive association. The most likely explanation for this positive
association is:
a. Teachers encourage people to get college degrees, so an increase in the number of
teachers is causing an increase in the number of people with college degrees.
b. Larger cities tend to have both more teachers and more people with college degrees, so
the association is explained by a third variable, the size of the city.
c. Teaching is a common profession for people with college degrees, so an increase in the
number of people with college degrees causes an increase in the number of teachers.
d. Cities with higher incomes tend to have more teachers and more people going to college,
so income is a confounding variable, making causation between number of teachers and
number of people with college degrees difficult to prove.
15. The value of a correlation is reported by a researcher to be r = 0.5. Which of the following
statements is correct?
a. The x-variable explains 25% of the variability in the y-variable.
b. The x-variable explains -25% of the variability in the y-variable.
c. The x-variable explains 50% of the variability in the y-variable.
d. The x-variable explains -50% of the variability in the y-variable.
16. What is the effect of an outlier on the value of a correlation coefficient?
a. An outlier will always decrease a correlation coefficient.
b. An outlier will always increase a correlation coefficient.
c. An outlier might either decrease or increase a correlation coefficient, depending on
where it is in relation to the other points.
d. An outlier will have no effect on a correlation coefficient.
Questions 22 and 23: A newspaper article reported that "Children who routinely compete in
vigorous after-school sports on smoggy days are three times more likely to get asthma than their
non-athletic peers." (Sacramento Bee, Feb 1, 2002, p. A1)
22. Of the following, which is the most important additional information that would be useful
before making a decision about participation in school sports?
a. Where was the study conducted?
b. How many students in the study participated in after-school sports?
c. What is the baseline risk for getting asthma?
d. Who funded the study?
23. The newspaper also reported that "The number of children in the study who contracted
asthma was relatively small, 265 of 3,535." Which of the following is represented by
265/3535 = .075?
a. The overall risk of getting asthma for the children in this study.
b. The baseline risk of getting asthma for the non-athletic peers in the study.
c. The risk of getting asthma for children in the study who participated in sports.
d. The relative risk of getting asthma for children who routinely participate in vigorous
after-school sports on smoggy days and their non-athletic peers.
Questions 24 to 26: The following histogram shows the distribution of the difference between
the actual and ideal weights for 119 female students. Notice that percent is given on the
vertical axis. Ideal weights are responses to the question "What is your ideal weight?" The
difference = actual - ideal. (Source: idealwtwomen dataset on CD.)
Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from
Midterms 1 and 2 are also representative of questions that may appear on the final exam.
1. A randomly selected sample of 1,000 college students was asked whether they had ever used the drug
Ecstasy. Sixteen percent (16% or 0.16) of the 1,000 students surveyed said they had. Which one of
the following statements about the number 0.16 is correct?
A. It is a sample proportion.
B. It is a population proportion.
C. It is a margin of error.
D. It is a randomly chosen number.
2. In a random sample of 1000 students, p = 0.80 (or 80%) were in favor of longer hours at the school
library. The standard error of p (the sample proportion) is
A. .013
B. .160
C. .640
D. .800
3. For a random sample of 9 women, the average resting pulse rate is x̄ = 76 beats per minute, and the
sample standard deviation is s = 5. The standard error of the sample mean is
A. 0.557
B. 0.745
C. 1.667
D. 2.778
4. Assume the cholesterol levels in a certain population have mean μ = 200 and standard deviation
σ = 24. The cholesterol levels for a random sample of n = 9 individuals are measured and the sample
mean x̄ is determined. What is the z-score for a sample mean x̄ = 180?
A. -3.75
B. -2.50
C. -0.83
D. 2.50
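Questions 2 to 4 each apply a standard error formula. The following sketch works through all three with the numbers given in the questions; the function names are illustrative and are not taken from the course material.

```python
import math


def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def se_mean(s, n):
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def z_for_sample_mean(x_bar, mu, sigma, n):
    """z-score of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


se_q2 = se_proportion(0.80, 1000)         # Question 2: p-hat = 0.80, n = 1000
se_q3 = se_mean(5, 9)                     # Question 3: s = 5, n = 9
z_q4 = z_for_sample_mean(180, 200, 24, 9)  # Question 4: mean 200, sd 24, n = 9

print(round(se_q2, 3), round(se_q3, 3), z_q4)
```

Note that the z-score in Question 4 uses the standard deviation of the sample mean, 24 / sqrt(9) = 8, rather than the population standard deviation itself.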
5. In a past General Social Survey, a random sample of men and women answered the question "Are you
a member of any sports clubs?" Based on the sample data, 95% confidence intervals for the
population proportion who would answer "yes" are .13 to .19 for women and .247 to .33 for men.
Based on these results, you can reasonably conclude that
A. At least 25% of American men and American women belong to sports clubs.
B. At least 16% of American women belong to sports clubs.
C. There is a difference between the proportions of American men and American women who
belong to sports clubs.
D. There is no conclusive evidence of a gender difference in the proportion belonging to sports
clubs.
6. Suppose a 95% confidence interval for the proportion of Americans who exercise regularly is 0.29 to
0.37. Which one of the following statements is FALSE?
A. It is reasonable to say that more than 25% of Americans exercise regularly.
B. It is reasonable to say that more than 40% of Americans exercise regularly.
C. The hypothesis that 33% of Americans exercise regularly cannot be rejected.
D. It is reasonable to say that fewer than 40% of Americans exercise regularly.
20. A random sample of 25 college males was obtained and each was asked to report their actual height
and what they wished as their ideal height. A 95% confidence interval for μd = average difference
between their ideal and actual heights was 0.8" to 2.2". Based on this interval, which one of the null
hypotheses below (versus a two-sided alternative) can be rejected?
A. H0: μd = 0.5
B. H0: μd = 1.0
C. H0: μd = 1.5
D. H0: μd = 2.0
21. The average time in years to get an undergraduate degree in computer science was compared for men
and women. Random samples of 100 male computer science majors and 100 female computer science
majors were taken. Choose the appropriate parameter(s) for this situation.
A. One population proportion p.
B. Difference between two population proportions p1 - p2.
C. One population mean μ1.
D. Difference between two population means μ1 - μ2.
22. If the word "significant" is used to describe a result in a news article reporting on a study,
A. the p-value for the test must have been very large.
B. the effect size must have been very large.
C. the sample size must have been very small.
D. it may be significant in the statistical sense, but not in the everyday sense.
23. A random sample of 5000 students was asked whether they prefer a 10-week quarter system or a
15-week semester system. Of the 5000 students asked, 500 students responded. The results of this
survey ________
A. can be generalized to the entire student body because the sampling was random.
B. can be generalized to the entire student body because the margin of error was 4.5%.
C. should not be generalized to the entire student body because the non-response rate was 90%.
D. should not be generalized to the entire student body because the margin of error was 4.5%.
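The 4.5% figure in the options is consistent with the conservative margin-of-error approximation 1/sqrt(n) applied to the 500 responses; that reading is an assumption, since the question does not say how the figure was obtained. The sketch below computes it alongside the non-response rate. The margin of error only describes sampling variability, so it says nothing about bias from the students who did not respond.

```python
import math

def conservative_margin_of_error(n):
    """Conservative 95% margin of error for a proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

sampled = 5000    # students asked
responses = 500   # students who responded

moe = conservative_margin_of_error(responses)
nonresponse_rate = 1 - responses / sampled

print(round(moe, 3))               # 0.045
print(round(nonresponse_rate, 2))  # 0.9
```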
24. In a report by ABC News, the headline read "City Living Increases Men's Death Risk." The headline
was based on a study of 3,617 adults who lived in the United States and were more than 25 years old.
One researcher said, "Elevated levels of tumor deaths suggest the influence of physical, chemical and
biological exposures in urban areas. Living in cities also involves potentially stressful levels of
noise, sensory stimulation and overload, interpersonal relations and conflict, and vigilance against
hazards ranging from crime to accidents." Is a conclusion that living in an urban environment causes
an increased risk of death justified?
A. Yes, because the study was a randomized study.
B. Yes, because many of the men in the study were under stress.
C. No, because the study was a retrospective study.
D. No, because the study was an observational study.
25. A significance test based on a small sample may not produce a statistically significant result even if
the true value differs substantially from the null value. This type of result is known as
A. the significance level of the test.
B. the power of the study.
C. a Type 1 error.
D. a Type 2 error.
For the next two questions: An observational study found a statistically significant relationship between
regular consumption of tomato products (yes, no) and development of prostate cancer (yes, no), with
lower risk for those consuming tomato products.
26. Which of the following is not a possible explanation for this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. There is a confounding variable that causes lower risk of prostate cancer, such as eating vegetables
in general, that is also related to eating tomato products.
C. A large number of food products were measured to test for a relationship, and tomato products
happened to show a relationship just by chance.
D. A large sample size was used, so even if there were no relationship, one would almost certainly
be detected.
27. Which of the following is a valid conclusion from this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. Based on this study, the relative risk of prostate cancer, for those who do not consume tomato
products regularly compared with those who do, is greater than one.
C. If a new observational study were to be done using the same sample size and measuring the same
variables, it would find the same relationship.
D. Prostate cancer can be prevented by eating the right diet.
28. The best way to determine whether a statistically significant difference in two means is of practical
importance is to
A. find a 95% confidence interval and notice the magnitude of the difference.
B. repeat the study with the same sample size and see if the difference is statistically significant
again.
C. see if the p-value is extremely small.
D. see if the p-value is extremely large.
29. A large company examines the annual salaries for all of the men and women performing a certain job
and finds that the means and standard deviations are $32,120 and $3,240, respectively, for the men
and $34,093 and $3,521, respectively, for the women. The best way to determine if there is a
difference in mean salaries for the population of men and women performing this job in this company
is
A. to compute a 95% confidence interval for the difference.
B. to subtract the two sample means.
C. to test the hypothesis that the population means are the same versus that they are different.
D. to test the hypothesis that the population means are the same versus that the mean for men is
higher.
30. One problem with hypothesis testing is that a real effect may not be detected. This problem is most
likely to occur when
A. the effect is small and the sample size is small.
B. the effect is large and the sample size is small.
C. the effect is small and the sample size is large.
D. the effect is large and the sample size is large.
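How effect size and sample size together drive the chance of detecting a real effect can be sketched with a power calculation. The example below assumes a two-sided one-sample z-test with known standard deviation, which the question does not specify; the function name z_test_power and the effect sizes and sample sizes are illustrative choices, not values from the quiz.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (assumed setting).

    effect: true difference between the population mean and the null value
    sigma:  known population standard deviation
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma
    # Probability that the test statistic lands in either rejection region
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

# Compare small vs. large effects and small vs. large samples
for effect in (0.2, 1.0):
    for n in (10, 1000):
        print(effect, n, round(z_test_power(effect, 1.0, n), 3))
```

Power is the probability of detecting a real effect, so 1 minus power is the probability of a Type 2 error. Among the four combinations, power is lowest when both the effect and the sample are small.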