Question 2 (1)
Given the sample data below, compute the mean, median, mode and standard deviation of the Age variable:
17 34 23 28 20 25 37 21 11 19 39 37 37 32 32
I.
II.
III.
IV.
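The four statistics asked for in Question 2 can be checked with a minimal Python sketch using only the standard library. The question does not say whether the sample or the population standard deviation is intended, so both are computed; the variable names are illustrative.

```python
import statistics

# Age values exactly as listed in Question 2.
ages = [17, 34, 23, 28, 20, 25, 37, 21, 11, 19, 39, 37, 37, 32, 32]

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# The question does not state which convention applies, so compute both.
sample_sd = statistics.stdev(ages)       # divides by n - 1
population_sd = statistics.pstdev(ages)  # divides by n

print(mean_age, median_age, mode_age, sample_sd, population_sd)
```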
Question 3 (2)
Temperature (in degrees Fahrenheit) as a variable in any study with sample observations like 35°F, 75°F,
98.3°F, etc., would be
I.
II.
III.
IV.
Question 4 (3)
The lower quartile of a box plot created from a dataset with observations {2,4,40,44,46,66,56,33,45}
is 33, while the upper quartile is 46. Where should the lower whisker be drawn?
I. 2
II. 4
III. 40
IV. None of the Above
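For Question 4, a minimal Python sketch of one common whisker convention may help. It assumes the Tukey rule, where whiskers extend to the most extreme observations within 1.5 × IQR of the quartiles; the question does not name a convention, and a plot that draws whiskers to the minimum and maximum would give a different result. The quartiles are taken as stated in the question rather than recomputed.

```python
# Data and quartiles exactly as stated in Question 4.
data = [2, 4, 40, 44, 46, 66, 56, 33, 45]
q1, q3 = 33, 46

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # assumption: Tukey 1.5 * IQR rule
upper_fence = q3 + 1.5 * iqr

# Whiskers reach the most extreme observations that still lie inside the fences.
lower_whisker = min(x for x in data if x >= lower_fence)
upper_whisker = max(x for x in data if x <= upper_fence)

# Observations outside the fences would be plotted as individual outliers.
outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)

print(lower_fence, lower_whisker, upper_whisker, outliers)
```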
Question 5 (1)
If you receive data on the efficacy of a fertilizer from field tests in which the quantity
administered and the size of the field under study differ across samples, which of the following
would be an ideal step before you go ahead with data analysis?
I.
II.
III.
IV.
Question 6 (1)
The probability of event A is 0.4, and the probability of event B is 0.3. Assuming the two events are
independent of each other, what is the conditional probability of A given B, denoted by P(A|B)?
I. 0.4
II. 0.12
III. 0.1
IV. None of the above
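The definition behind Question 6 is P(A|B) = P(A and B) / P(B), and for independent events P(A and B) = P(A) × P(B). A minimal Python sketch of that arithmetic follows; exact fractions are used only to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities as given in Question 6.
p_a = Fraction(4, 10)
p_b = Fraction(3, 10)

# Independence means the joint probability is the product of the marginals.
p_a_and_b = p_a * p_b

# Definition of conditional probability.
p_a_given_b = p_a_and_b / p_b

print(float(p_a_and_b), float(p_a_given_b))
```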
Question 7 (1)
A sampling distribution is the probability distribution for which one of the following?
I. A sample
II. A population
III. A sample statistic
IV. A population parameter
Question 8 (2)
Given that you have specified the confidence level at 95%, if the p-value is less than α, then specify α and
the maximum probability of Type I error, respectively
I.
II.
III.
IV.
Question 9(3)
Select the hypothesis formulation and the corresponding best value for α in a judiciary scenario, so as to
avoid punishing an innocent person, in exchange for which it is acceptable to pronounce a truly guilty person not guilty
I. H0: Defendant is Guilty, H1: Defendant is not Guilty, α = 10%
Coefficient of determination
Coefficient of correlation
Standard error of estimate
All of above
Question 11(1)
Percent total variation of the dependent variable Y explained by the set of independent variables
X1,X2,...,Xn is measured by
I. Coefficient of correlation
II. Coefficient of skewness
III. Coefficient of determination
IV. Standard deviation
Question 12(1)
Coefficient of correlation between age and mortality rate is 0.9 indicating
I.
II.
III.
IV.
Question 13 (2)
Given the ANOVA output, compute the missing values
Source of
Variation
Regression
Error
Total
I.
II.
III.
IV.
Sum of
Squares
Degrees of Freedom
4.351945854
Question 14(3)
From the following ANOVA output compute the total number of observations and number of variables
respectively
Source of Variation    Sum of Squares    Degrees of Freedom
Regression             400               12
Error                  100               8
Total                  500               20
I. n = 20 and k = 8
II. n = 21 and k = 12
III. n = 19 and k = 8
IV. n = 18 and k = 12
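The degrees of freedom in Question 14 can be turned into n and k with a short Python sketch. It assumes the usual regression ANOVA layout for a model with an intercept: regression df = k, error df = n - k - 1, total df = n - 1. The function name `anova_dimensions` is hypothetical.

```python
def anova_dimensions(df_regression, df_error):
    """Return (n, k) assuming df_regression = k and df_error = n - k - 1."""
    k = df_regression
    n = df_regression + df_error + 1
    return n, k


# Degrees of freedom as given in the Question 14 table.
df_regression, df_error, df_total = 12, 8, 20

# Sanity check: the regression and error rows should add up to the total row.
assert df_regression + df_error == df_total

n, k = anova_dimensions(df_regression, df_error)
print(n, k)
```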
Classification
Question 15(1)
Naive Bayes algorithm is a
I.
II.
III.
IV.
Question 16(1)
Decision tree algorithm is a
I.
II.
III.
IV.
Question 17(1)
Naive Bayes algorithm is a
I. Prediction model
II. Classification model
III. Both of the Above
IV. None of the Above
2.666666667
Question 18(2)
Which of these target variable types are used by CHAID for decision making?
I. Numeric
II. Integer
III. Interval
IV. Class
Up selling Only
Cross selling Only
Up selling and cross selling
None of the above
Question 22 (2)
Would you expect a good number of rules from a transaction set of 100 records as compared to one of 100,000
records?
I. Yes
II. No
III. Can't Say
IV. None of the Above
Question 23 (3)
At any point in time for a specific customer, is it possible to see more than one consequent as a
recommendation?
I. Yes
II. No
III. Can't Say
IV. None of the Above
Time
Sales
Both of the Above
None of the Above
Question 25(1)
What would we call an ordered set of data arranged in accordance with their time of occurrence?
I. Arithmetic series
II. Time Series
III. Both of the Above
IV. None of the Above
Question 26(1)
What would a time series indicate?
I.
II.
III.
IV.
Question 27(2)
What would be the systematic components of a time series, which follow a regular pattern of variation?
I. Noise
II. Signal
III. Correlation
IV. None of the Above
Question 28(3)
Which of the following describe a time series as a weakly stationary process?
a. Constant mean
b. Constant variance
c. Constant autocovariance for given lags
d. Constant probability distributions
I. a
II. a & c
III. a, b & c
IV. a, b, c & d
Clustering
Question 29(1)
What is clustering?
I. Prediction of data
II. Classification of data
III. Partition of data
IV. None of the Above
Question 30(1)
Do we identify a set of independent variables and a dependent variable when we do clustering?
I. Yes
II. No
III. Can't Say
IV. None of the Above
Question 31(1)
Will we call cluster analysis a variable reduction technique?
I. Yes
II. No
III. Can't Say
Question 32 (2)
Which of these would be a dependent variable in a clustering algorithm?
I. Numerical
II. Categorical
III. Both of the Above
IV.
Logistic Regression
Question 33(1)
In a Logistic regression model, the level of significance for a variable in the model indicates
I.
II.
III.
IV.
Question 34(1)
What is the relation between the level of confidence and the significance level?
I. Level of confidence =
II. Level of significance = 1 -
III. Level of confidence = 1 -
IV. Level of confidence = Level of significance
Question 35(2)
The likelihood term in logistic regression statistically
1.
2.
3.
4.