
Indian Institute of Management Bangalore
FPM 2009
Statistics for Management
Mid Term

Name:___________________ Roll No.:_________________
Time: 90 minutes Max Points: 30

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. Prof. AK Rao, who teaches the Quantitative Analysis course at the International Institute of Management, Bilekahally (IIMB), had given an assignment to his students. The students were asked to select a simple random sample of 64 Small Scale units from a database containing data on 13 million small scale units. Each student was asked to report the average number of years that the unit has been in operation (A) and the proportion of female owned units (B). They were also asked to calculate the confidence intervals for A and B. (A is the age of the unit and B is the proportion of Female Owned Units.)

                                  No. of Female   Confidence Interval of A   Confidence Interval of B
Sl. No.   Average   Variance      Owned units     Lower limit  Upper limit   Lower limit  Upper limit
          of A      of A
1         5.5       4             16              5.0100       5.9900        0.1439       0.3561
2         4.8       4.4           24              C1           C2            0.2564       0.4936
.         .         .             .               .            .             .            .
.         .         .             .               .            .             .            .
250       6.9       6.4           32              6.2802       7.5198        D1           D2

1. What is the value of C1?

2. What is the value of C2?

3. What is the value of D1?

4. What is the value of D2?

5. What is the confidence level used for calculating the Confidence interval for B?
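The completed first row of the table can be used to see how the limits were built. The following is a minimal sketch, not a model answer: it assumes the intervals are symmetric normal-approximation intervals with no finite-population correction, and the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n = 64  # sample size drawn by each student

# Row 1 of the table: mean of A, variance of A, female owned units, both intervals.
mean_a, var_a, female = 5.5, 4.0, 16
ci_a = (5.0100, 5.9900)
ci_b = (0.1439, 0.3561)

# Standard errors (assumption: normal approximation, no finite-population correction).
se_a = sqrt(var_a / n)
p_hat = female / n
se_b = sqrt(p_hat * (1 - p_hat) / n)

# Implied critical value = half-width of the interval / standard error.
z_a = (ci_a[1] - ci_a[0]) / 2 / se_a
z_b = (ci_b[1] - ci_b[0]) / 2 / se_b


def level(z):
    """Two-sided confidence level implied by a critical value z."""
    return 2 * NormalDist().cdf(z) - 1


print(round(z_a, 2), round(level(z_a), 3))
print(round(z_b, 2), round(level(z_b), 3))
```

The same half-width relation, applied with the recovered critical value, is one way to approach the missing limits in rows 2 and 250.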

II. Tangavelu, Deputy Director of the State Milk Marketing Federation, is responsible for quality control. He had been receiving many complaints about the quantity of milk that is packed into the sachets. He decided to test the hypothesis that μ = 500 ml. He took a sample of 12 half-litre packets of milk and got a sample variance of 218 ml². (You can approximate some of the values in your answers by extrapolation.)

1. For what range of values of the sample mean should he accept the Null hypothesis if α = 0.05?

2. If the real value of μ is actually 510 ml, what is the probability of committing Type II error?

3. Tangavelu felt that it is appropriate that he should carry out a one sided hypothesis test and that he should make sure that the customers should not get less than 500 ml. He wanted the decision rule to be such that the probability of the customers getting less than 500 ml should be retained at 0.05. For what range of values of the sample mean should he accept the Null hypothesis?

4. If the real value of μ is actually 510 ml, what is the probability of committing Type II error in part 3 above?

5. Between the two hypothesis tests above (1 and 3), which one is more appropriate and why?

Indian Institute of Management Bangalore
FPM 2009
Statistics for Management
Quiz 2

Name:___________________ Roll No.:_________________
Time: 60 minutes Max Points: 15

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.
I. Gurumurthy, who is the Senior Asst Manager of Accreditation and Certification of Engineers (ACE), was given the responsibility of evaluating the effectiveness of different training methods followed by the ITIs for providing in-service training to technicians. He has selected 85 trainees and assigned each of them to a different ITI for training. After completion of the training, the skills acquired are measured through a test. The summary of the results is given below: (10 marks)

Training Programme (sample)    A      B      C
Number of trainees (n)         25     40     20
Mean                           250    265    305
Standard deviation             42     35     40

1. Is there a significant difference between the variances corresponding to samples A and B (i.e., between σ²A and σ²B)? (use an appropriate α value) (2 marks)

2. Is there a significant difference between the means of samples A and B (i.e., between μA and μB)? Use α = 0.05 (3 marks)

3. Carry out an ANOVA to test if there are significant differences between the means of the three training programmes. (Fill the Table given below): (3 marks)
State the null hypothesis and the alternate hypothesis:

Sl. No.    Source    df    Sum of Squares    Mean Squares

Conclusion:

4. What is the conclusion based on the answers for 2 and 3 above? (2 marks)

II. The traffic police in Bangalore are trying to fine-tune the traffic lights on Bannerghatta Road. They would like to recalibrate the duration of green light on the main road. Currently, the duration is 30 seconds. While the light is green on the main road, the side road will obviously have a red light. The police had collected data on the number of cars (they decided to ignore the two-wheelers) approaching the traffic light on the side road during the red-light period. The data collected by them is summarized below:

No. of cars              0    1    2    3    4    5    6    7    8    9    10
No. of 30 sec periods    1    2    10   29   34   47   27   15   21   10   4

The above table is to be interpreted as follows. There were 2 such 30 second periods (red-light) where only one car had approached the intersection on the side road. Similarly there were 47 periods where 5 cars approached. The maximum number of cars approaching in any given period was 10. The traffic police felt that the number of cars approaching the traffic light should follow a Poisson distribution with an expected value of 5 cars per period.

(a) Test at 5% level if the contention by the police is justified using a parametric test. (5 points)

Indian Institute of Management Bangalore
FPM 2009
Statistics for Management
Final

Name:___________________ Roll No.:_________________
Time: 120 minutes Max Points: 30

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.
Question I. (12 marks) Bilekahally Baking Products Limited has two manufacturing facilities, one in Bilekahally and the other in Anekal. Both the facilities manufacture electric ovens. Kumar, who is the quality control inspector, feels that the Bilekahally facility is better in productivity as compared to the Anekal facility. In order to test his presumption, he collected data on the time required to assemble each oven. (Actually, the company buys the components and assembles the ovens.) The times (in minutes) taken for 12 ovens from Anekal and 11 ovens from Bilekahally are given below:

S. No.    Anekal    Bilekahally
1         54        52
2         56        56
3         50        49
4         68        57
5         48        51
6         68        54
7         72        59
8         59        55
9         56        53
10        70        57
11        72        54
12        58

A. Test whether there is a significant difference (at 5% level) between the two facilities, using an appropriate non-parametric test. [4]

B. Test whether there is a significant difference (at 5% level) between the two facilities, using a parametric test. [4]

C. State clearly, the assumptions required to be made in parts A and B. [2]

D. Did you expect the conclusions from the above two tests (in parts A and B) to be the same? Briefly explain or justify your findings. [2]

Question II (10 Marks) Prof. Nayana Tara is trying to determine if the learning abilities of primary school students are influenced by the location of the residence of the families. She divided the 200 students based on their residential location as Rural (R), Semi Urban (S) and Urban (U). Their learning abilities are quantified by conducting a specially designed test and these are grouped into High (A - category), Medium (B Category) and Low (C Category). When she studied the distribution of the students based on learning abilities and location of residence, she noticed certain interesting patterns. She also calculated the relative frequencies, which she used as a proxy for probabilities. Fill the table based on the patterns given below the table.

                     Location of background
Learning Category    Rural (R)    Semi Urban (S)    Urban (U)    Total
A
B
C
Total

1. B & U are mutually exclusive
2. Rural constituted 50%
3. P(A∩R) = P(U)
4. P(S|A) = P(R)
5. One-fourth of the Rural were in A - Category
6. 35% of the students were in A Category
7. P(A) = P(C) = P(B|R)

Test whether the learning abilities are independent of the location of residence.

Question III (8 marks) The price of gold in the Mumbai bullion market either goes down or up every day compared to the previous day (only the days on which the market is open). The following pattern was observed over 20 consecutive days where a + indicates an increase, a - indicates a decrease, and 0 indicates that the index was the same as the previous day's. Days 1-4, 6-8, 12, 15-19, 24-25 had a +. Other days up to day 25 had a -.

1. Does this indicate a random sequence of pluses and minuses? (4 marks)

2. Does the sample data support the hypothesis that the probability of an increase equals 0.5? Choose a 5% significance level. (4 marks)

Question IV (10 marks)

A study was done to investigate the times put in by the executives in commercial firms. One hundred executives from different size firms were investigated and the information regarding their working hours per week is summarised in the following contingency table:

Executives from      Less than 40    Between 40 and 45    Between 45 and 50    More than 50
firms that are:      hours/week      hours/week           hours/week           hours/week
Large                2               8                    20                   20
Medium               4               6                    10                   10
Small                3               4                    7                    6

a) What proportion of executives work more than 45 hours a week? Calculate a 95 percent confidence interval for the population proportion of executives who work more than 45 hours a week. (3 marks)

b) Test the null hypothesis that the proportion of executives who work for more than 45 hours a week is 0.80. Use α = 0.05 (3 marks)

c) Test the null hypothesis that the proportion of executives who work anywhere between 40 hours and 50 hours per week is the same between large and medium firms. (4 marks)

Indian Institute of Management Bangalore
FPM 2010
Statistics for Management
Quiz 2

Name:___________________ Roll No.:_________________
Time: 1 hour Max Points: 20

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. (10 points) Naveen Bhatia is performing a one-sided hypothesis test for the market share of his company. He has set up the following hypothesis:
H0: π ≥ 0.5
HA: π < 0.5
He has taken a sample of 100 and calculated the sample proportion. Based on the sample proportion, he calculated a p-value of 0.1020.

1. What was the sample proportion?

2. Based on the above sample proportion, calculate a 95 percent two-sided confidence interval for π.

3. If the confidence level is to be increased to 99 percent and if the width of the interval should be within 0.02, what would be the maximum possible sample size required?

4. Naveen has redefined his null hypothesis as a two sided test as follows:
H0: π = 0.4
HA: π ≠ 0.4
What is the range of the sample proportion, p, for which the above null hypothesis will be rejected? (consider α = 0.05)

5. Given the above decision rule, calculate β (give a possible alternate value of π)

II. (10 points) Dr. Amar has calculated a one-sided confidence interval for the average age of the students in the adult education class. The upper limit of the one-sided 97.5 percent confidence interval was 61.8 years. When he calculated a two-sided confidence interval with the same confidence level, the lower limit of the interval turned out to be 40.8 years.

1. What is the standard error used by Amar in his calculations?

2. Given that Amar has used a sample size of 25 in calculating the sample mean, what is the variance of the population that he has assumed?

3. What is the value of the sample mean that he used in his calculations?

4. Assume that Amar does not know the population variance. He has calculated the sample variance s² using the sample data. This variance turned out to be 36 years². Based on this data, calculate a 90 percent two-sided confidence interval for μ.

5. Amar has set up a hypothesis test as follows:
H0: μ = 50
HA: μ ≠ 50
What is the conclusion if the sample mean turns out to be 40.8? (Consider α = 0.10)

6. Calculate the probability of Type II error (β) for the above conclusion. You may consider any possible value as the alternate value for μ.

Indian Institute of Management Bangalore
FPM 2010
Statistics for Management
Final Exam

Name:___________________ Roll No.:_________________
Time: 3 hours Max Points: 40

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. (6 points) Our friend, Chubby Chunky, is not happy with the data collected and analyzed by Bubly. He decided to collect his own sample, which is randomly selected. He collected the data with respect to weight loss from 20 persons who had gone through the weight reduction programme. For obvious reasons, they decided to do a one sided hypothesis test. The sample standard deviation of the weight loss obtained from these 20 observations was 12.5 kgs. Chubby informed Bubly that he is willing to go ahead with the weight reduction programme if the sample average is greater than or equal to 4.8327 kgs.

1. Based on this decision, calculate the Type I error that Chubby is willing to tolerate.

2. Based on the above, formulate the null and alternate hypothesis.

3. At this stage, Chubby decided to change the α value to 0.025 and suitably modified the decision rule. The average weight loss of the 20 persons turned out to be 5.3. Should Chubby go ahead with the weight reduction programme?

II. (10 points) Dr. Rammohan is analyzing the number of patients treated for Dengue fever in three government hospitals namely Vani Vilas, Kempe Gowda and Bowring. He has taken a random sample of dates and collected the data on the number of dengue patients treated on the selected date from the records of each of the hospitals. The data is summarized below: Hospital Serial number of the selected day 1 2 3 4 Number of patients treated 44 47 61 22 27 21 60 35 44 23 42 30

5 38 22 40

6 43 19 28

Vani Vilas Kempe Gowda Bowring

15

31

(a) Test whether there is any significant difference between the average number of patients treated at the three hospitals using an appropriate parametric test. State clearly the assumptions required for the test. (Use α = 0.10)

(b) Repeat the test using an appropriate non-parametric test.

(c) Should the conclusions of the above two tests be consistent? Why or why not? If the results are not consistent, which one would you agree with and why?

III. (5 points) WaterGate toothpaste hired 200 students from the International Institute of Management Ltd. (IIML) to carry out a survey on the dentists. Each student is required to cover 10 dentists selected randomly (hence, their responses are independent of each other). One of the questions in the survey questionnaire is whether the dentist recommends WaterGate toothpaste to his patients. The response to this particular question is summarized as follows: Dentists 0 who said Yes Number of 1 Students 1 2 3 4 5 6 7 8 9 10

20

40

50

42

25

The above table is to be interpreted as follows: one of the 10 dentists surveyed by 3 students responded Yes to the above mentioned question (bold italics). Similarly, all the 10 dentists surveyed by 2 students responded Yes to the question (bold).

1. What is the average number of dentists who responded Yes?

2. WaterGate has defined success as a response of Yes from a dentist. What is the probability of success?

3. WaterGate feels that the number of doctors who responded Yes follows binomial distribution. State assumptions required for this.

4. Test the hypothesis to see if the above conjecture (about the binomial distribution) of WaterGate is tenable using a parametric test. State the null and alternate hypotheses clearly. Use only a parametric test.

IV. (5 points) Prema Kumari is trying to test the innovativeness in different industries. She has developed an Industrial Innovative Index (popularly known as 3I), which is based on a weighted average of 12 different components obtained through a survey instrument. She obtained the data from 12 units from the software industry and 10 units in the automobile industry. She calculated 3I for each of the units and then summarized the same for the respective industry. The summarized data is presented below:

              Sample mean    Sample Standard Deviation
Software      248.22         52
Automobile    317.41         76

1. Test the null hypothesis that there is no significant difference in innovativeness between the two industries using a t-test. Use α = 0.05

2. If you want to test the above null hypothesis using ANOVA, what will be the calculated F value?

3. What will be the p-value corresponding to the F value calculated above?

4. Prof. A K Rao who is the mentor for Prema Kumari, suggested that she should use a proportion test because, the 3I is not exactly measurable or quantifiable. He also suggested that she should increase the sample sizes considerably. Prema Kumari obtained data from 28 more units from software industry and 40 more units from automobile industry. She calculated the overall average 3I for all the units (combining both the industries). Those units whose 3I is more than this average are termed as High Innovators and others as Low Innovators. The Software industry had 15 High Innovators and Automobile industry had 30 High Innovators. Find a 95 percent confidence interval for the difference between the two population proportions.

5. Test the null hypothesis that there is no significant difference between the two population proportions. (use α = 0.05)

V. (14 points) The Ministry of Home Security has estimated that the average number of tunnels under the border fence between India and Pakistan is 0.6 per kilometer. The Border Security Force is given the responsibility to fill the tunnels in the 50 kilometer stretch in the Poonch sector. The BSF uses a special machine to fill the tunnels from the Indian side. It is estimated that each machine can fill 5 tunnels in one night shift. (The tunnels are to be filled during the 8-hour night shift between 8:00 pm and 4:00 am.) The assignment requires that BSF has to complete a 10 kilometer stretch in each night shift. You can assume that each night shift is synonymous with a 10-kilometer stretch.

1. What is the probability that a 10-kilometer length selected randomly in the Poonch sector has 10 or more tunnels?

2. The distance between Defense Post 121-A and Goregoan village is 5 kilometers. What is the probability that there are exactly 4 tunnels on this stretch?

3. If BSF encounters a tunnel, what is the distance, on average, it has to go before it encounters the next tunnel?

4. What is the probability that BSF can manage with one machine in a one night shift?

5. What is the probability that BSF has to deploy 4 such machines in a one night shift?

6. What is the expected number of machines deployed in a one night shift?

7. Given that BSF has encountered exactly 5 tunnels in a particular night shift, what is the probability that they will encounter exactly 6 tunnels on the next night shift?
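The tunnel questions rest on treating the number of tunnels in a stretch as a Poisson count whose mean scales with the length of the stretch. The sketch below illustrates that scaling for the first four parts. It is an illustration under stated assumptions rather than a model answer: the Poisson model itself is assumed, "manage with one machine" is read as "at most 5 tunnels in the shift", and the helper names are hypothetical.

```python
from math import exp, factorial

RATE_PER_KM = 0.6  # average tunnels per kilometre, as given in the question


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson count with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson count with mean lam."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


lam_shift = RATE_PER_KM * 10  # mean tunnels in a 10 km night shift
lam_5km = RATE_PER_KM * 5     # mean tunnels in a 5 km stretch

p_ten_or_more = 1 - poisson_cdf(9, lam_shift)
p_exactly_four = poisson_pmf(4, lam_5km)
mean_gap_km = 1 / RATE_PER_KM  # mean of the exponential gap between tunnels

# Assumption: one machine is enough when the shift has at most 5 tunnels.
p_one_machine = poisson_cdf(5, lam_shift)

print(p_ten_or_more, p_exactly_four, mean_gap_km, p_one_machine)
```

Because counts in non-overlapping stretches are independent under this model, the last part of the question reduces to an unconditional Poisson probability for the next shift.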

Indian Institute of Management Bangalore
FPM 2011
Statistics for Management
Midterm

Name:___________________ Roll No.:_________________
Time: 1 hour 30 min Max Points: 30

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. Prof. AK Rao was analyzing the General Management Admission Test (GMAT) scores. The test is taken by a large number of students and the final score is the sum total of the individual scores of 4 different components. He has selected a random sample of 25 students and calculated the sample average. He then calculated a 95% two sided confidence interval for the population mean using the population standard deviation σ, which is known to him. He was not happy with the width of this interval and hence calculated a 90% confidence interval and found that the width of the confidence interval got reduced by 3.78 marks.

1. What is the standard deviation of the sample mean (σ_x̄) used by Prof. Rao in calculating the confidence intervals?

2. What was the standard deviation of the population (σ)?

3. If Prof. Rao wanted to retain the same confidence level of 95%, but wanted to reduce the width of the interval by 3.78 marks, what should be the increase in the sample size (as compared to the original 25)?

4. The lower limit of the 95% confidence interval with the sample size of 25 was 200.24. What was the lower limit of the 90% confidence interval?

5. If he wanted to reduce the width of the 95% confidence interval by 50%, what should be the percentage increase in the sample size?

II. The probability distribution of annual rainfall (measured in inches) in the NaaVuru Desert is characterized by the following density function:

f(X) = 0.125 X                  for 0 ≤ X ≤ 2
     = 0.25                     for 2 ≤ X ≤ 4
     = 0.25 − 0.125 (X − 4)     for 4 ≤ X ≤ 6
     = 0                        otherwise

If the rainfall is less than 2 inches in a year, it is considered as a Bad year. If the rainfall is more than 3 inches, it is considered as a Good year and the year is considered Average if the rainfall is between 2 and 3 inches.

1. What is the probability that a particular year is Good?

2. What is the probability of having 3 consecutive Good years?

3. The date farmers of NaaVuru desert found that whenever the rainfall is more than 4 inches, the date yields are High. Otherwise the yields are Low. What is the joint probability that the yields are High and the year is Good in terms of the rainfall?

4. What is the probability of getting two consecutive years where the yields are Low and the Years are Average?
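Probabilities for the rainfall categories are areas under the density. The sketch below checks them numerically with a simple midpoint rule. It assumes the density is read as 0.125X on [0, 2], 0.25 on [2, 4] and 0.25 − 0.125(X − 4) on [4, 6], and that rainfall in different years is independent; the function names are illustrative.

```python
def density(x):
    """Rainfall density, as read from the question (an assumed reading)."""
    if 0 <= x <= 2:
        return 0.125 * x
    if 2 < x <= 4:
        return 0.25
    if 4 < x <= 6:
        return 0.25 - 0.125 * (x - 4)
    return 0.0


def prob(a, b, steps=60_000):
    """P(a < X < b) by the midpoint rule."""
    h = (b - a) / steps
    return sum(density(a + (i + 0.5) * h) for i in range(steps)) * h


total = prob(0, 6)             # should be 1 for a valid density
p_good = prob(3, 6)            # rainfall above 3 inches
p_high_and_good = prob(4, 6)   # above 4 inches implies above 3 inches
p_low_and_average = prob(2, 3) # Average years always have Low yields

# Assumption: years are independent, so consecutive-year probabilities multiply.
p_three_good = p_good**3
p_two_low_average = p_low_and_average**2

print(total, p_good, p_high_and_good, p_low_and_average)
```

Since every piece of the density is linear, the same areas can be obtained by hand from triangles and rectangles, which is a useful cross-check on the numerical values.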

III. Shanthi Bhushan travels from Shanthi Nagar to Whitefield every day by bus. He has collected the details of the bus trips from BMTC. BMTC informed him about two possible routes from Shanthi Nagar to Whitefield. One is the direct bus (Route no. 248K) and the other is via KBS (Kempagowda Bus Station) using Route 29 to KBS and then 406 to Whitefield. The details of mean time and standard deviation are as follows:

Route                   Travel between                 Mean time (Min)    Standard deviation (Min)
Via KBS (29 and 406)    Shanthi Nagar and KBS          16.0               1.0
                        KBS to Whitefield              44.0               2.0
248 K                   Shanthi Nagar to Whitefield    68.0               4.0

However, Shanthi Bhushan later found out that the route via KBS requires a 10 min wait at KBS, which was not included in the above table. He is also convinced that the actual time in each leg (for either company) is normally distributed and not dependent on one another. Shanthi Bhushan gets into trouble with his supervisor if he takes more than 72 minutes for the trip from Shanthi Nagar to Whitefield.

1. If Shanthi Bhushan decides to use the route via KBS, what is the expected time taken for a one way trip to reach Whitefield? [2]

2. What is the distribution, including the values of the parameters, of time taken by the route via KBS? [3]

3. Assuming that Shanthi Bhushan decided to use Route 248 K for all the days during this year, what is the expected value of the number of times he gets into trouble with his supervisor? [5]

4. If Shanthi Bhushan uses Route 248 K for next 60 days, what is the probability that he will get into trouble with his supervisor for no more than 10 days? [5]

5. What is the probability that Route via KBS is quicker than Route 248 K? [Hint: X > Y ⇔ X − Y > 0]
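The route questions use the fact that sums and differences of independent normal variables are again normal, with means adding (or subtracting) and variances adding. A minimal sketch using Python's `statistics.NormalDist`, assuming the 10 minute wait at KBS is a fixed constant and that the table pairs the means 16.0, 44.0, 68.0 with the standard deviations 1.0, 2.0, 4.0 in that order:

```python
from statistics import NormalDist

# Legs of the route via KBS and the direct route (minutes).
leg_to_kbs = NormalDist(16.0, 1.0)
leg_to_whitefield = NormalDist(44.0, 2.0)
WAIT_AT_KBS = 10.0  # assumption: a fixed, non-random wait
direct = NormalDist(68.0, 4.0)

# NormalDist supports adding independent normals and adding constants.
via_kbs = leg_to_kbs + leg_to_whitefield + WAIT_AT_KBS

LIMIT = 72.0  # supervisor's threshold in minutes
p_late_direct = 1 - direct.cdf(LIMIT)
p_late_via_kbs = 1 - via_kbs.cdf(LIMIT)

# Hint from the question: X > Y is the same event as X - Y > 0.
difference = direct - via_kbs  # direct time minus KBS time
p_kbs_quicker = 1 - difference.cdf(0.0)

print(via_kbs.mean, via_kbs.variance, p_late_direct, p_kbs_quicker)
```

The per-trip probability `p_late_direct` is the success probability of a binomial count of "trouble" days, which is what the questions about a year or 60 days of travel build on; the number of travel days is left to the reader's reading of the question.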

IV. Ahalya is studying the performance of BBAs and BComs in the GMAT. She knows that the total population is 3 lakh students in BBA and 5 lakh students in B Com. She decided to take a stratified sample from the two strata, BBM and B Com. She knows that the Standard Deviation (σ) for BBM is 40 and that of B Com is 24. The cost of collecting data is Rs. 25 for each student of BBM and Rs. 16 for each student of B Com. She desires to have a sample size of 400.

1. What is the standard deviation of the overall estimate (which is the weighted average of the two stratum sample means) if the sample is equally distributed between the two strata?

2. What is the standard deviation of the overall estimate (which is the weighted average of the two stratum sample means) if the sample is proportionately distributed between the two strata?

3. What is the standard deviation of the overall estimate (which is the weighted average of the two stratum sample means) if the sample is OPTIMALLY distributed between the two strata in order to minimize the standard deviation?

4. Ahalya has only Rs. 6000 at her disposal. What is the sample size that can be managed with this budget such that the standard deviation of the overall estimate (which is the weighted average of the two stratum sample means) is minimized, taking advantage of the differential costs?

5. What would have been the sample size if she does not take advantage of the differential costs?

Indian Institute of Management Bangalore
FPM 2011
Statistics for Management
Quiz 2

Name:___________________ Roll No.:_________________
Time: 1 hour 30 min Max Points: 20

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. Infosys, as a part of their CSR activities, decided to provide a training program for the bus drivers on the behavioral aspects. It is hoped that their behavior towards the commuters would improve (for the better) after they go through the training program. Mohandas Gandhi, who is responsible for the program, has decided to use one of the OB instruments to measure the score of behavior before and after the training program in order to measure the effectiveness of the program. Based on the difference in the scores, he created an index called Behavioral Aptitude Improvement Index (BAII). A positive BAII will indicate an improvement in the behavior of the drivers. Mohandas has selected 16 drivers at random, put them through the training program and calculated the BAII for each of them. The average BAII of these 16 drivers was 25. Dr. Ravi Nath, the author of the OB instrument that Mohandas used to measure the BAII, informed him that the standard deviation of the population (σ) is known to be 60. Mohandas was told by the company head of CSR that they are willing to go ahead with the training programs if the BAII is large enough.

1. For obvious reasons, Mohandas decided to carry out a one-sided hypothesis test with α = 0.05. Test the hypothesis based on the data obtained from the 16 sample drivers. (2 points)

2. What will happen to the above conclusion if Mohandas changes his hypothesis to a two sided test, keeping α = 0.05? (2 points)

3. Calculate the probability of Type II error (β) for the decision rule for question 2 above. Use 10 as the alternate value for μ. (2 points)

4. Mohandas felt that using the standard deviation given by Dr. Ravi is not appropriate and decided to use the variance that he calculated from the 16 sample observations in testing the null hypothesis. The variance based on the 16 observations was 2000. Test the null hypothesis based on this variance. (2 points)

5. Calculate the probability of Type II error (β) for the decision rule for question 4 above. Use -10 as the alternate value for μ. (2 points)

6. Test the null hypothesis that the Population Variance (σ²) of BAII is actually 3600 based on the sample variance calculated by Mohandas. (2 points)

II. Ram Mohan is standing for elections in the local house building cooperative society. In order to ascertain his chances of winning, he selected a certain number of members of the society and polled them. Based on the responses of these members, he calculated the proportion (p) of those in favor of his candidature as 0.45. Using this sample proportion, he decided to do a two sided hypothesis test with the null hypothesis π = 0.5 and the alternate hypothesis π ≠ 0.5. He calculated the p-value to test the null hypothesis as 0.0455.

1. What would be his conclusion based on this p-value? (2 points)

2. What was the sample size that he used to test the null hypothesis? (2 points)

3. Calculate a one-sided 95% confidence interval for the real value of π. (2 points)

4. If he wanted to do a one-sided hypothesis test (instead of the two-sided test above), how would you frame the null hypothesis and the alternate hypothesis? What is the corresponding p-value? Would you advise Ram Mohan to go ahead with the elections? Explain why or why not. (2 points)

Indian Institute of Management Bangalore
FPM 2011
Statistics for Management
Final

Name:___________________ Roll No.:_________________
Time: 120 min Max Points: 40

This is an open-book and open-note test. However, sharing of material is NOT permitted.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. NS Steel Works acquired a new plant in Delhi. Kamini, who is the Senior Vice President, is trying to compare the productivity of the workers in the Delhi plant with their existing plant at Bangalore. She selected 5 workers from Bangalore and six workers from Delhi and calculated the time (in minutes) taken by them to manufacture a particular Steel item. The data along with the mean and variance is given in the table below:

Plant        Worker no.                          Mean    Variance
             1     2     3     4     5     6
Bangalore    1     13    5     6     2           5.4     22.3
Delhi        9     9     7     11    10    20    11      21.2

1. Test the null hypothesis that there is no difference in the average time taken in Bangalore (μB) and Delhi (μD) using the t test. Use α = 0.10 (6 marks)

2. If you were to use ANOVA for the above test, what will be the value of the calculated F? (1 mark)

3. Test the null hypothesis in question 1 above using an appropriate non-parametric test. (3 marks)
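For two groups, a pooled-variance t test and a one-way ANOVA are two views of the same comparison, and the ANOVA F equals the square of the pooled t. The sketch below shows this on the plant data; it assumes equal population variances (the pooled form of the t test), uses worker times as read from the table, and does not look up critical values.

```python
from math import sqrt
from statistics import mean, variance

# Times in minutes, as read from the table.
bangalore = [1, 13, 5, 6, 2]
delhi = [9, 9, 7, 11, 10, 20]

n1, n2 = len(bangalore), len(delhi)
m1, m2 = mean(bangalore), mean(delhi)
v1, v2 = variance(bangalore), variance(delhi)  # sample variances (n - 1 divisor)

# Pooled variance and t statistic (assumption: equal population variances).
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_stat = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))

# One-way ANOVA with two groups: between-group df = 1, within-group MS = pooled.
grand = mean(bangalore + delhi)
ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
f_stat = (ss_between / 1) / pooled

print(m1, v1, m2, v2)
print(t_stat, f_stat, t_stat**2)
```

The test statistic would still need to be compared with the t critical value for n1 + n2 − 2 degrees of freedom at the stated significance level.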

II. Ranjita Mohan is doing her FPM in the International Institute of Management and Commerce (IIMC). Her guide wanted her to do 4 case studies before she can actually prepare her questionnaire for collecting data. He had given her the contact numbers of 10 CEOs of Biotech firms. She decided to call one CEO at a time and request an appointment. The guide told her that the probability of success (getting an appointment) is 0.3 for each of the firms and that the CEOs decide on the appointment independent of each other. Ranjita decided that once she gets 4 appointments, she need not bother to call any more CEOs. Define the random variable X = number of calls that Ranjita has to make in order to get 4 appointments. (Obviously, the minimum possible value of X is 4.)

1. Give the probability distribution of X. (5 marks)

2. What is the probability that 4 ≤ X ≤ 10? Should this be equal to 1? If not, explain why not.

3. Ranjita got 4 appointments, but one of the CEOs who had agreed to give an appointment had cancelled because she is indisposed. What is the probability that she has no more CEOs left to call?

4. What is the probability that Ranjita was not able to fulfill the requirement of 4 appointments? (treat this question independent of question 3 above)
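The number of calls needed to reach a fixed number of successes is a negative binomial count, and with only 10 CEOs available its probabilities over 4 to 10 need not add up to 1. A minimal sketch, assuming independent calls with success probability 0.3 as stated; the function name `pmf_calls` is illustrative only.

```python
from math import comb

P_SUCCESS = 0.3   # probability that a call yields an appointment
NEEDED = 4        # appointments required
MAX_CALLS = 10    # CEOs available to call


def pmf_calls(k, r=NEEDED, p=P_SUCCESS):
    """P(the r-th success occurs exactly on call k): negative binomial form."""
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)


# Distribution of X over the calls that are actually possible.
distribution = {k: pmf_calls(k) for k in range(NEEDED, MAX_CALLS + 1)}
p_done_within_list = sum(distribution.values())

# Cross-check: falling short means fewer than 4 successes in all 10 calls.
p_short = sum(
    comb(MAX_CALLS, j) * P_SUCCESS**j * (1 - P_SUCCESS) ** (MAX_CALLS - j)
    for j in range(NEEDED)
)

for k, prob in distribution.items():
    print(k, round(prob, 4))
print(p_done_within_list, p_short)
```

The two totals are complementary, which is one way to see why the probabilities for X between 4 and 10 leave some mass unaccounted for.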

III.

Prof. Swamy, who has been the CAT coordinator, always felt that the students from the South perform better in CAT (not necessarily because he is from the South). In order to test his conjecture, he created a two-way table with the regions on one side and the performance in CAT on the other side. For this purpose, he grouped the students into 3 categories, based on their performance in CAT, as High, Medium and Low. He selected 200 students randomly and filled in the frequencies in the two-way table. He also calculated the relative frequencies and used them as the probabilities. When he looked at the frequencies and relative frequencies, he found some interesting patterns, which are listed below the table.

Category       Region
               South (S)    Central (C)    North (N)    Total
High (H)
Medium (M)
Low (L)
Total

i. M and N are mutually exclusive
ii. South constituted 50%
iii. P(H∩S) = P(N)
iv. P(C|H) = P(S)
v. One-fourth of the S are in the High category
vi. 35% of the students are in the High category
vii. P(H) = P(L) = P(M|S)

Test whether the Performance Category is independent of the Region (6 marks)

IV.

When Shylaza attended the Executive MBA classes in the International School of Business (ISB), her statistics professor explained to her that the number of seats occupied in a bus follows a binomial distribution if the people who can travel in the bus are earmarked and all of them decide independently whether or not to travel. Since she commutes every day to work by the company minibus, she decided to test this hypothesis. She collected data from the personnel section on the number of people who travelled on this particular bus for 100 days. The minibus is earmarked for 10 persons (including Shylaza). She summarized the data for the 100 days in the table below:

No. travelled   0    1    2    3    4    5    6    7    8    9    10
No. of days     0    0    0    1    2    5    14   25   28   19   6

The above table is to be interpreted as follows: there were 6 days (out of the 100) on which all the 10 persons travelled and, similarly, there were 25 days on which 7 persons travelled.

1. In order to help Shylaza test her hypothesis, calculate the probability of success based on the above table. (2 marks)

2. Test the hypothesis (using α = 0.10) using a non-parametric test. (4 marks)
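One way to obtain the probability of success from the grouped data is to divide the average number of travellers per day by the 10 earmarked seats. The sketch below assumes that estimator and also lists the expected binomial frequencies that a goodness-of-fit test would compare with the observed ones; the variable names are illustrative only.

```python
from math import comb

seats = 10
# Observed number of days for each count of travellers (Section IV table).
days = {0: 0, 1: 0, 2: 0, 3: 1, 4: 2, 5: 5, 6: 14, 7: 25, 8: 28, 9: 19, 10: 6}

n_days = sum(days.values())
mean_travelled = sum(k * f for k, f in days.items()) / n_days
# Estimated probability that any one earmarked person travels on a given day.
p_hat = mean_travelled / seats

# Expected number of days under Binomial(seats, p_hat).
expected = {
    k: n_days * comb(seats, k) * p_hat**k * (1 - p_hat) ** (seats - k)
    for k in days
}

print("p_hat =", p_hat)
for k in days:
    print(k, days[k], round(expected[k], 2))
```

Before computing a chi-square statistic, cells with small expected frequencies are usually pooled, and one degree of freedom is given up for the estimated p.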

V.

NS Steel Forgings has two heavy-duty machines, one of which needs to be scrapped because the company is about to receive a brand new machine. The production manager knows that each of the two machines produces a certain number of defectives. He decided to scrap the machine which produces more defectives. He selected 150 parts produced by Machine 1 and 200 produced by Machine 2. He found that there were 9 defectives from Machine 1 and 22 defectives from Machine 2.

1. Decide which machine needs to be scrapped by testing the null hypothesis that π1 = π2, where πi is the population proportion of defectives produced by Machine i. Use α = 0.10. (3 marks)

2. Do a one-sided hypothesis test (H0: 1 2). Keep α = 0.10. Recommend which machine needs to be scrapped. (3 marks)

3. What is the result of the test in question 2 above, if α is changed to 0.05? (1 mark)

4. What is your final recommendation? Explain why. (1 mark)

Indian Institute of Management Bangalore FPM 2012 Statistics for Management Research

Quiz 2
Name:___________________ Roll No.:_________________
Time: 60 minutes Max Points: 20

This is an open-book and open-note test. However, sharing of material is NOT permitted. Laptops are NOT allowed.
Instructions: Do not seek any clarifications. Provide appropriate arguments and show calculation in support of your final answer. Answer all questions in the space provided. Do not attach any additional sheets, use the back pages, if necessary.

I. Swetha, who is working as a business analyst for the Honest Abe Loyal Finance Co. (HALF), was asked by her boss, Dileep, to estimate the average investment in fixed deposits by middle class families. She collected the data from a sample of 100 families and calculated the sample mean.

1. Swetha has set up the following hypotheses for testing:
H0: μ ≤ 200,000
H1: μ > 200,000
She found that the null hypothesis was rejected when she used α = 0.05. When she reduced α to 0.025, she failed to reject the null hypothesis. Assume that σ = 10,000 and n = 100. What is the possible range within which the sample mean lies? (3 points)

2. Calculate the probability of committing a Type II error if the alternate value for μ is 201,000 and α = 0.025. (2 points)

3. Calculate the probability of committing a Type II error if the alternate value for μ is 201,000 and α = 0.05. (2 points)

4. Swetha found that a few errors had been made in the data entry. She made the required corrections, and the sample mean then turned out to be 201,200. If Swetha decides to reject the null hypothesis based on this value of the sample mean, what is the maximum possible value of α that she is willing to tolerate? (3 points)

Dileep, Swetha's boss, wanted her to get an idea about the variation in investment in fixed deposits. Swetha selected a sub-sample of 25 families and calculated the sample standard deviation (s) based on these 25 observations. The value of s turned out to be Rs. 14400.

5. Calculate a 95 percent confidence interval for the population variance, σ². (3 points)

6. Test Swetha's assumption about the population variance in question 1 above. Use α = 0.025 for this purpose. (2 points)

7. Swetha decided to test the null hypothesis (same as question 1 above) using the data from the sub-sample of 25 observations above. She also decided to use the sample standard deviation calculated from these observations. What is the range of values of the sample mean for which she will reject the null hypothesis, if α = 0.025? (3 points)

8. The sample average turned out to be Rs. 202,972. Calculate the p-value associated with this sample average. (You may round off the results to 3 digits after decimal). (2 points)

Indian Institute of Management Bangalore FPM 2012 Statistics for Management Research
Final
Name:___________________ Roll No.:_________________
Time: 120 minutes Max Points: 50

World Analytical Research Technologies (WART) is one of the major players in the outsourcing of analytical services and deals with all the major insurance companies in the world. They have sent some of their employees for training to three different management institutes, namely Indian Institute of Management Acumen (IIMA), International Institute of Management Bilekahally (IIMB), and Indian Institute of Management and Commerce (IIMC). After completing the training programme, these employees were given a special test to measure their analytical skills, and the scores are given below:

Institute    Participant number (scores)
             1     2     3     4     5     6
IIMA         22    38    43    44    47    61
IIMB         30    40    28    44    23    42
IIMC         22    19    15    31    27    21

60

35

(a) Test whether there is any significant difference between the average scores of the three training programmes using an appropriate parametric test. (Use α = 0.10)

(b) Repeat the test using an appropriate non-parametric test.

(c) Should the conclusions of the above two tests be consistent? Why or why not? If the results are not consistent, which one would you agree with and why?

NS Software Systems (N3S) has received a contract to develop software for a B2C site to be hosted at Houston, Texas. The company has formed a team of 3 members for the project. The company, as usual, is plagued by team members leaving the company in the middle of the project, leading to delays in project completion. The company officials felt that, as far as this particular project is concerned, there will not be any delay in the project if not more than one person leaves the team. On the other hand, it was felt that if 2 members leave the project, there will be a delay of 25%, and if all three leave the team, the delay will be 40%. The company officials made two assumptions: (1) the team members do not influence each other and (2) once a member leaves the team, the replacement can be made to continue until the project is completed. The company recently appointed two outside consultants, Merrily Lynch and Price Watered-Down (PWD), to estimate the attrition rates. Merrily Lynch estimated an attrition rate of 20%, whereas PWD estimated it to be 30%. It was known that the estimates of Merrily Lynch are 3 times more reliable than those of PWD.

1. Assuming that the estimates of Merrily Lynch are correct, what is the probability that the project will not be delayed?

What is the probability that the delay in the project will be not more than 25%?

What is the probability that the delay in the project will be 40%?

2. What is the probability that no member of the team leaves before the project is completed?

3. What is the probability that exactly one member of the team leaves before the project is completed? Two members? All three members? What is the expected delay in the project?

4.

V. The Sensex index either goes down or up every day compared to the previous day. The following pattern was observed over 20 days, where a + indicates an increase, a - indicates a decrease, and 0 indicates that the index was the same as the previous day's. Days 1-4, 6-8, 12, 15-19, 24-25 had a +. Other days up to day 25 had a -.

3. Does this indicate a random sequence of pluses and minuses?

4. Does the sample data support the hypothesis that the probability of an increase equals 0.5? Choose a 5% significance level.

5. A financial analyst ranked the increases. Odd numbered days on which there was an increase got odd ranks. Even numbered days got even ranks. Do a ranks test to test the hypothesis that the average rank on odd and even days is the same (among days with an increase in the sensex value).

III. Dream Cake Works (DCW) specializes in high value cakes. Given their specialized nature, they make exactly 10 cakes every day. But the demand for the cakes is random and can be anywhere between 0 and 10. All the unsold cakes at the end of the day are given to the IIMB hostel at a steep discount (where there is always some party going on). DCW collected the data on the cakes sold for the past 200 days, and the same is presented below: No. of 0 Cakes sold No. of 1 days 1 2 3 4 5 6 7 8 9 10

20

40

50

42

25

The above table is to be interpreted as follows: there are 3 days on which exactly one cake is sold. Similarly there are 2 days on which exactly 10 cakes are sold. 5. What is the average number of cakes sold per day?

6. Consider that if a cake is sold on any given day it is a success. What is the probability of success?

7. DCW feels that the number of cakes sold on any given day follows binomial distribution. State assumptions required for this.

8. Test the hypothesis to see if the above conjecture (about the binomial distribution) of DCW is tenable using a parametric test. State the null and alternate hypotheses clearly. Use only a parametric test.

Indian Institute of Management Bangalore


Introduction to Statistical Methods Mid Term Time: 90 min. Max Points: 30 Name:___________________ Roll No.:_________________

Do not seek any clarifications. State all your assumptions, if any, very clearly. Answer all questions in the space provided. Do not attach any additional sheets.

I. (12 points) Bangalore Dairy receives the raw milk collected from various villages through a number of trucks. There is a feeling that the present unloading dock is not adequate. Currently, the unloading process is that the truck is backed up to the platform and the milk cans are unloaded manually. There is a suggestion to replace the present dock, which is of fixed height, with a hydraulically adjustable platform. Thus the height of the platform can be aligned with the truck and unloading can be done with machines. As unloading progresses, the bed of the truck rises in height and the platform can be adjusted accordingly. The finance director of Bangalore Dairy is not convinced that such an investment is warranted. On the other hand, Kumara Swamy, who is the Dy. Manager, Logistics, feels that there are too many trucks waiting to be unloaded at the dock. He collected data on the number of trucks arriving at the unloading dock on an hourly basis. The data are presented in the table below.

No. of trucks   0    1    2    3    4    5    6    7    8    9    10
No. of hours    2    6    11   29   35   46   27   15   16   10   3

The above table is to be interpreted as follows. There were 6 hourly periods where only one truck arrived at the dock. Similarly, there were 46 hourly periods where 5 trucks arrived. The maximum number of trucks arriving in any given hour was 10.

1. Calculate the average number of trucks arriving at the dock during one hour.

2. Assume that the number of trucks that arrive follow a Poisson distribution. What is the most likely value of the random variable, number of trucks arriving to be unloaded?

3. The trucks are not refrigerated and hence it is important to unload the truck as soon as possible (otherwise, the milk gets spoiled and becomes a waste). The dock is operated with 3 dedicated employees. Whenever the number of trucks to be unloaded goes beyond 7 per hour, Kumara Swamy draws three more employees from production. What is the probability that he will have to draw additional employees from production?

4. In a 30 minute period, what is the probability that the number of trucks arriving is exactly 4?

5. The dedicated labor costs Rs. 1800 per hour. The logistics department is charged Rs. 3000 per hour when they draw the employees from production. Consider the cost per hour as the random variable, C. What is the expected value of this random variable C? What is its standard deviation?

6. What is the probability that the time between two consecutive arrivals of the trucks is more than 30 minutes?
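The first steps of Section I can be sketched numerically. The snippet assumes the hourly rate is taken as the weighted mean of the frequency table and that "beyond 7 per hour" means 8 or more arrivals; `poisson_pmf` is a helper written for this sketch, not a library call.

```python
from math import exp, factorial

# Observed number of hours for each truck count (table in Section I).
hours = {0: 2, 1: 6, 2: 11, 3: 29, 4: 35, 5: 46, 6: 27, 7: 15, 8: 16, 9: 10, 10: 3}

total_hours = sum(hours.values())
# Weighted mean of the frequency table: average arrivals per hour.
rate = sum(k * f for k, f in hours.items()) / total_hours


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


# Probability that more than 7 trucks arrive in an hour.
p_more_than_7 = 1 - sum(poisson_pmf(k, rate) for k in range(8))

# Over a 30 minute period the Poisson mean is half the hourly rate.
p_four_in_half_hour = poisson_pmf(4, rate / 2)

print("rate per hour =", rate)
print("P(X > 7) =", round(p_more_than_7, 4))
print("P(4 arrivals in 30 min) =", round(p_four_in_half_hour, 4))
```

The same `rate` also drives the exponential inter-arrival time in question 6, since the probability of no arrival in a period of t hours is exp(-rate * t).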

II. (10 points) Dr. Calderon-Madrid used a semi-Markov process and duration models as a base for analyzing the labor market in urban Mexico. Based on a sample of 256 employees, she calculated the sample average of the employees who remained in the same job status and built a two-sided 95% confidence interval for the population mean μ, based on the sample mean. The upper limit of the interval was found to be 3036.6. Dr. Calderon-Madrid calculated the coefficient of variation using the sample mean and σ.

1. Considering that the coefficient of variation was 5.3 (530%), what is the value of σ assumed by Dr. Calderon-Madrid in calculating the confidence interval?

2. What is the value of the sample mean?

3. If Dr. Calderon-Madrid wanted to get a 99% confidence interval with the same precision (width) as that of the 95% confidence interval, what should be the percentage change in her sample size?

4. Presume that Dr. Calderon-Madrid made an error with respect to the value of σ that she assumed. The actual value of σ was 50% of what she had assumed. What will be the percentage change in the sample size, if she wants a 95% confidence interval with the same upper and lower limits as earlier (upper limit of 3036.6)?

5. If the population mean is actually 3000 (unknown to Dr. Calderon-Madrid), what is the probability of getting a sample whose mean is greater than 3036.6?

III. (8 points) There are two possible routes available for the truck that brings milk from Kolar to Bangalore every day. The time taken on both the routes is known to follow a normal distribution. Route I has a mean (μ) of 110 minutes and a standard deviation (σ) of 40 minutes. On the other hand, Route II has a mean of 100 minutes and a standard deviation of 50 minutes.

1. The milk pick-up was delayed on a particular day and the driver was told that he should reach Bangalore in less than 80 minutes. Which route do you suggest for the truck? Explain why.

2. If the driver was told that he can take up to 150 minutes to reach Bangalore, which route should he prefer? Why?

3. The driver has to pay a penalty of Rs. 200, if the truck takes more than 164 minutes. What is the expected value of the penalty that the driver has to pay in each of the routes?

4. The truck has taken Route II on 5 consecutive days. What is the probability that the driver has paid the penalty on 2 out of the 3 days?
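The route comparisons in this section reduce to normal tail probabilities. The sketch below uses Python's `statistics.NormalDist` with the means and standard deviations stated in the problem; the dictionary keys and variable names are illustrative only.

```python
from statistics import NormalDist

# Travel-time distributions (minutes) as stated in Section III.
routes = {
    "Route I": NormalDist(mu=110, sigma=40),
    "Route II": NormalDist(mu=100, sigma=50),
}

# Probability of arriving within each deadline, per route.
on_time = {
    name: {deadline: dist.cdf(deadline) for deadline in (80, 150)}
    for name, dist in routes.items()
}

# Expected penalty: Rs. 200 times the probability of exceeding 164 minutes.
expected_penalty = {name: 200 * (1 - dist.cdf(164)) for name, dist in routes.items()}

for name in routes:
    print(name, {d: round(p, 4) for d, p in on_time[name].items()},
          round(expected_penalty[name], 2))
```

For each deadline, the route with the larger probability of arriving in time is the natural choice; the per-day penalty probability for Route II would also serve as the success probability in the binomial calculation of question 4.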

Indian Institute of Management Bangalore


Introduction to Statistical Methods End Term
Time: 120 min. Name:___________________

Max Points: 50 Roll No.:_________________ Do not seek any clarifications. State all your assumptions, if any, very clearly. Answer all questions in the space provided. Do not attach any additional sheets.

Use α = 0.05 in all the questions dealing with hypothesis testing, unless otherwise stated.

I. (18 points) Swetha has developed a new index for measuring operational competencies called OCI. She tested this instrument on the officers and staff of the Southern Railway. She selected 16 people from the staff and 16 from the officers. The data with respect to the final OCI are given below:
Sl. No.    Staff OCI    Officers OCI
1          79.4         67.8
2          87.8         65
3          63           90.2
4          71.4         62.6
5          69           55.8
6          68.2         65.4
7          83           64.6
8          77.8         57.8
9          87           50.2
10         95.4         71.4
11         80.6         64.2
12         85.8         65
13         79.4         71.4
14         67.8         91.8
15         65           55.8
16         71.4         78.3

The summary of the above data is given in the table below:


            Mean     Std. Dev.
Staff       77.00    9.092
Officers    67.33    11.49

1. Test the null hypothesis that there is no difference between the two population means, (officers and Staff) using a t test.

2. Test the null hypothesis that there is no difference between the two population means (officers and staff) using ANOVA. Fill in the ANOVA table below:

Source    df    Sum of Squares    Mean Squares    F value

3. Test the null hypothesis that there is no difference between the two population means, (officers and Staff) using a non-parametric test.

4. From among the three different tests done above, which one do you prefer? Explain why.

III.

(12 points) Durandhar Bhadwadekar is planning to take a stratified sample for measuring the effectiveness of the adult education scheme in Mandya district. He decided to segregate all the taluks into three categories based on the literacy rate. Based on the data available from earlier studies, he prepared the following table:

Literacy category   No. of adult beneficiaries         Standard     Cost of sampling
of taluk            (neo-literates) in thousands       deviation    (Rs.)/household
High                100                                60           16
Medium              300                                40           9
Low                 600                                30           4

He decided on a total sample of 300 households.

1. Determine how many neo literates to be selected from each of the three literacy categories based on proportional allocation. Calculate the variance of the overall sample mean under this scenario.

2. Determine how many neo literates to be selected from each of the three literacy categories based on optimal allocation. Calculate the variance of the overall sample mean under this scenario.

3. Considering that the total budget allocated for the sampling is only Rs. 1,200, determine the sample size and the allocation across the three literacy categories based on optimal allocation. Calculate the variance of the overall sample mean under this scenario.

4. Considering that the total budget allocated for the sampling is only Rs. 1200, determine the sample size and the allocation across the three literacy categories using the cost of sampling so as to minimize the variance of the overall sample mean. Calculate the variance of the overall sample mean under this scenario.

IV.

(10 points) Prof. AK Rao is looking at the distribution of the student scores in the QM course across all IIMs. He selected a random sample of 600 students and summarized their scores as given in the table below:
Class interval
Lower limit    Upper limit    Frequency
10             20             4
20             30             15
30             40             48
40             50             110
50             60             198
60             70             142
70             80             64
80             90             19

1. Calculate the mean and standard deviation of the marks based on the above data.

2. Test the null hypothesis that the above data follows normal distribution with a mean of 60 and standard deviation of 15. Use the Chi-square test.

V.

(10 points) Roshan is a doctoral student who is studying the team performance across different types of teams. He divided the teams into three categories namely operational teams, project teams and ad-hoc teams based on the purpose for which they are created. He has identified 200 employees belonging to different categories of teams and evaluated their individual performance. These employees are rated on a four point scale in terms of their performance. He summarized his findings in the following table. Team Category Operational Teams Project teams Ad-hoc Teams Performance Rating Low Average 30 35 25 15 5 20

High 20 30 0

Excellent 5 10 5

Test the hypothesis that the job satisfaction level is independent of the department. Use α = 0.10
