Concordia University
INSE 6220 Winter 2017
Advanced Statistical Approaches to Quality
Assignment #1: Solutions

Purpose: This assignment will give you the opportunity to learn more about probability distributions, statistical inference,
hypothesis tests, confidence intervals, and statistical process control charts.
Instructions: Submit your assignment electronically through Moodle before midnight of the due date. Hard copy
submissions will not be accepted. Write your name and ID number on the first page of your assignment submission
(otherwise, 5 marks will be deducted).

An entrepreneur faces many bureaucratic and legal hurdles when starting a new business. The following data shows the
time, in days, to complete all of the procedures required to start a business in 24 developed countries:
23 4 29 44 47 24 40 23 23 44 33 27 60 46 61 11 23 62 31 44 77 14 65 42
It is also reported that the overall average time to start a business in all developed countries is 29 days.
(a) Construct the dot plot.
(b) Construct the stem-and-leaf diagram.
(c) Calculate the sample range, mean, and median of the time-to-start data.
(d) Calculate the first and third quartiles.
(e) Calculate the sample variance and standard deviation.
(f) Check the assumption of normality for the time-to-start data using normplot and boxplot.
(g) Test whether the average time is different from the reported overall average time at the 5% significance level.
(h) Find the 90% two-sided confidence interval to estimate the average time. Comment on your result.
Solution:
(a) The dot plot is shown in Figure 1.

Fig. 1. Dot plot for the time data.

(b) The stem-and-leaf diagram is shown below:


0 | 4
1 | 1 4
2 | 3 3 3 3 4 7 9
3 | 1 3
4 | 0 2 4 4 4 6 7
5 |
6 | 0 1 2 5
7 | 7
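The diagram above can be reproduced with a short script. This is a minimal sketch, assuming the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative and not part of the original solution.

```python
from collections import defaultdict

times = [23, 4, 29, 44, 47, 24, 40, 23, 23, 44, 33, 27,
         60, 46, 61, 11, 23, 62, 31, 44, 77, 14, 65, 42]

def stem_and_leaf(values):
    """Return one text line per stem (tens digit) with sorted leaves (units digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Walk every stem between the smallest and largest so empty stems stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves[stem])
        lines.append(f"{stem} | {row}".rstrip())
    return lines

diagram = stem_and_leaf(times)
print("\n".join(diagram))
```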
(c) The sample range, mean, and median of the time-to-start data are $R = 73$, $\bar{x} = 37.38$, and $Q_2 = 36.5$.
(d) The first and third quartiles are $Q_1 = 23$ and $Q_3 = 46.5$.
(e) The sample variance and standard deviation are $s^2 = 345.03$ and $s = 18.57$.
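The values in parts (c) to (e) can be cross-checked with Python's standard library. This is a sketch under one assumption: the solution does not state which quartile convention it uses, so the quartiles are taken here as the medians of the lower and upper halves of the sorted data, which reproduces the reported values.

```python
import statistics

times = [23, 4, 29, 44, 47, 24, 40, 23, 23, 44, 33, 27,
         60, 46, 61, 11, 23, 62, 31, 44, 77, 14, 65, 42]
ordered = sorted(times)
n = len(ordered)

sample_range = ordered[-1] - ordered[0]
mean = statistics.mean(times)
median = statistics.median(times)

# Assumed convention: quartiles are the medians of the lower and upper halves.
q1 = statistics.median(ordered[: n // 2])
q3 = statistics.median(ordered[(n + 1) // 2:])

variance = statistics.variance(times)  # sample variance (divisor n - 1)
std_dev = statistics.stdev(times)      # sample standard deviation

print(sample_range, mean, median, q1, q3, round(variance, 2), round(std_dev, 2))
```

Other quartile conventions (for example, the interpolation methods in `statistics.quantiles`) can give slightly different values for $Q_3$.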
(f) As shown in Figure 2, the normal probability plot and box plot indicate that the time data constitute a sample
from a normal population. The normal probability plot appears to be reasonably straight. Although the box plot is not
perfectly symmetric, it is not too skewed and there are no outliers.
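The "no outliers" statement can be checked numerically. The sketch below assumes the common 1.5 × IQR whisker rule for box plots and uses the quartiles reported in part (d); the variable names are illustrative.

```python
times = [23, 4, 29, 44, 47, 24, 40, 23, 23, 44, 33, 27,
         60, 46, 61, 11, 23, 62, 31, 44, 77, 14, 65, 42]

q1, q3 = 23.0, 46.5            # quartiles reported in part (d)
iqr = q3 - q1                  # interquartile range

# Assumed whisker rule: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [t for t in times if t < lower_fence or t > upper_fence]

print(lower_fence, upper_fence, outliers)
```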
MATLAB code
>> X = [23 4 29 44 47 24 40 23 23 44 33 27 60 46 61 11 23 62 31 44 77 14 65 42];
>> subplot(1,2,1); normplot(X); subplot(1,2,2); boxplot(X);

(g) Step 1: From the given information, we have $n = 24$, $\bar{x} = 37.38$, $s = 18.57$, $\mu_0 = 29$, and $\alpha = 0.05$. The null and alternative hypotheses are
$$H_0: \mu = 29$$
$$H_1: \mu \neq 29$$
Fig. 2. (a) Normal plot (probability versus data) and (b) box plot for the time data.

Step 2: Since $\sigma$ is unknown, the appropriate test is the t-test, and the value of its test statistic is
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{37.38 - 29}{18.57/\sqrt{24}} = 2.21$$

Step 3: The rejection region is $|t_0| > t_{\alpha/2,n-1}$, where $t_{\alpha/2,n-1} = t_{0.025,23} = 2.07$.
Step 4: Since $|t_0| = 2.21 > t_{\alpha/2,n-1} = 2.07$, we reject the null hypothesis $H_0$.
The p-value is equal to $2[1 - F(|t_0|)] = 2[1 - F(2.21)] = 0.04$, where $F$ is the cdf of the t distribution with $n - 1 = 23$ degrees of freedom. This is smaller than the significance level $\alpha = 0.05$;
therefore, we reject the null hypothesis.
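Steps 1 to 4 can be sketched in Python without any statistics toolbox. The critical value 2.07 is the table value quoted in Step 3. The helpers `t_pdf` and `t_cdf` are illustrative: they approximate the t distribution's cdf by Simpson integration of its density, a stand-in for a library routine rather than the method used in the original solution.

```python
import math
import statistics

times = [23, 4, 29, 44, 47, 24, 40, 23, 23, 44, 33, 27,
         60, 46, 61, 11, 23, 62, 31, 44, 77, 14, 65, 42]
mu0 = 29
t_crit = 2.07  # t_{0.025,23}, table value quoted in Step 3

n = len(times)
x_bar = statistics.mean(times)
s = statistics.stdev(times)
t0 = (x_bar - mu0) / (s / math.sqrt(n))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """CDF for t >= 0: 0.5 plus a composite Simpson integral of the density on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3

p_value = 2 * (1 - t_cdf(abs(t0), n - 1))  # two-sided p-value
reject_h0 = abs(t0) > t_crit

print(round(t0, 2), round(p_value, 3), reject_h0)
```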
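(h) The 90% two-sided confidence interval on $\mu$, with $\alpha = 0.10$ and $t_{\alpha/2,n-1} = t_{0.05,23} = 1.71$, is given by
$$\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}$$
$$37.38 - 1.71\,\frac{18.57}{\sqrt{24}} \le \mu \le 37.38 + 1.71\,\frac{18.57}{\sqrt{24}}$$
$$30.88 \le \mu \le 43.87$$
We can be 90% confident that the mean time lies between 30.88 and 43.87 days. Since $\mu_0 = 29$ does not lie in this interval, we
reject the null hypothesis at the 10% significance level.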
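The interval can also be verified directly. In this sketch the critical value 1.714 is the tabulated $t_{0.05,23}$ (rounded to 1.71 in the text); it is hard-coded as an assumption because the Python standard library has no t-quantile function.

```python
import math
import statistics

times = [23, 4, 29, 44, 47, 24, 40, 23, 23, 44, 33, 27,
         60, 46, 61, 11, 23, 62, 31, 44, 77, 14, 65, 42]
mu0 = 29
t_crit = 1.714  # assumed table value of t_{0.05,23}

n = len(times)
x_bar = statistics.mean(times)
half_width = t_crit * statistics.stdev(times) / math.sqrt(n)

ci_lower = x_bar - half_width
ci_upper = x_bar + half_width
contains_mu0 = ci_lower <= mu0 <= ci_upper  # False means H0 is rejected at the 10% level

print(round(ci_lower, 2), round(ci_upper, 2), contains_mu0)
```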
MATLAB code
>> X = [23 4 29 44 47 24 40 23 23 44 33 27 60 46 61 11 23 62 31 44 77 14 65 42];
>> mu0 = 29; alpha = 0.1;
>> [h,p,ci,stats]=ttest(X,mu0,alpha,'both')

The management of a bank has embarked on a program of statistical process control and has decided to use variable
control charts to study the waiting time of customers during the peak noon to 1 p.m. lunch hour to detect special
causes of variation. Four customers are selected during the one-hour period; the first customer to enter the bank every
15 minutes. Each set of four measurements makes up a subgroup (sample). Table I lists the waiting time (operationally
defined as the time from when the customer enters the line until he or she begins to be served by the teller) for 20 days.

(a) Construct a table that shows the waiting time data along with the sample means and sample ranges.
(b) Estimate the process mean and standard deviation.
(c) Construct the R- and the $\bar{X}$-charts. Identify the out-of-control points using all Western Electric rules. If necessary,
revise your control limits, assuming that any samples that violate Western Electric rules can be discarded.
(d) Assuming that the waiting times are normally distributed and that specifications are 6 ± 5 minutes, calculate
the process capability index and the proportion of the process that will not meet specifications.
Solution:


TABLE I
Waiting Time for Customers at a Bank.

Sample Number   X1   X2   X3   X4   (waiting times, in minutes)
1 7.2 8.4 7.9 4.9
2 5.6 8.7 3.3 4.2
3 5.5 7.3 3.2 6.0
4 4.4 8.0 5.4 7.4
5 9.7 4.6 4.8 5.8
6 8.3 8.9 9.1 6.2
7 4.7 6.6 5.3 5.8
8 8.8 5.5 8.4 6.9
9 5.7 4.7 4.1 4.6
10 4.9 6.2 7.8 8.7
11 7.1 6.3 8.2 5.5
12 7.1 5.8 6.9 7.0
13 6.7 6.9 7.0 9.4
14 5.5 6.3 3.2 4.9
15 4.9 5.1 3.2 7.6
16 3.7 4.0 3.0 5.2
17 2.6 3.9 5.2 4.8
18 4.6 2.7 6.3 3.4
19 7.2 8.0 4.1 5.9
20 6.1 3.4 7.2 5.9

TABLE II
Waiting Time Data with Sample Means and Ranges.

Sample Number   X1   X2   X3   X4   $\bar{X}_i$   $R_i$
1 7.2 8.4 7.9 4.9 7.100 3.5
2 5.6 8.7 3.3 4.2 5.450 5.4
3 5.5 7.3 3.2 6.0 5.500 4.1
4 4.4 8.0 5.4 7.4 6.300 3.6
5 9.7 4.6 4.8 5.8 6.225 5.1
6 8.3 8.9 9.1 6.2 8.125 2.9
7 4.7 6.6 5.3 5.8 5.600 1.9
8 8.8 5.5 8.4 6.9 7.400 3.3
9 5.7 4.7 4.1 4.6 4.775 1.6
10 4.9 6.2 7.8 8.7 6.900 3.8
11 7.1 6.3 8.2 5.5 6.775 2.7
12 7.1 5.8 6.9 7.0 6.700 1.3
13 6.7 6.9 7.0 9.4 7.500 2.7
14 5.5 6.3 3.2 4.9 4.975 3.1
15 4.9 5.1 3.2 7.6 5.200 4.4
16 3.7 4.0 3.0 5.2 3.975 2.2
17 2.6 3.9 5.2 4.8 4.125 2.6
18 4.6 2.7 6.3 3.4 4.250 3.6
19 7.2 8.0 4.1 5.9 6.300 3.9
20 6.1 3.4 7.2 5.9 5.650 3.8
Grand mean $\bar{\bar{X}} = 5.941$; average range $\bar{R} = 3.275$

(a) Table II shows the waiting time data with extra columns displaying the sample means and sample ranges. The
grand mean and average range are also listed at the bottom of the table. For a sample size of n = 4, we have d2 = 2.059,
D3 = 0, D4 = 2.282, and A2 = 0.729.
(b) Using Table II, the process mean and standard deviation can be estimated as follows:
$$\bar{\bar{X}} = 5.941 \quad \text{and} \quad \hat{\sigma} = \frac{\bar{R}}{d_2} = \frac{3.275}{2.059} = 1.5906$$
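The subgroup statistics in Table II and the estimates in part (b) can be recomputed from the raw data in Table I. This is a minimal sketch; the variable names are illustrative, and $d_2 = 2.059$ is the constant for subgroups of size 4 quoted in part (a).

```python
# Waiting times from Table I: one tuple (X1, X2, X3, X4) per sample.
waiting = [
    (7.2, 8.4, 7.9, 4.9), (5.6, 8.7, 3.3, 4.2), (5.5, 7.3, 3.2, 6.0),
    (4.4, 8.0, 5.4, 7.4), (9.7, 4.6, 4.8, 5.8), (8.3, 8.9, 9.1, 6.2),
    (4.7, 6.6, 5.3, 5.8), (8.8, 5.5, 8.4, 6.9), (5.7, 4.7, 4.1, 4.6),
    (4.9, 6.2, 7.8, 8.7), (7.1, 6.3, 8.2, 5.5), (7.1, 5.8, 6.9, 7.0),
    (6.7, 6.9, 7.0, 9.4), (5.5, 6.3, 3.2, 4.9), (4.9, 5.1, 3.2, 7.6),
    (3.7, 4.0, 3.0, 5.2), (2.6, 3.9, 5.2, 4.8), (4.6, 2.7, 6.3, 3.4),
    (7.2, 8.0, 4.1, 5.9), (6.1, 3.4, 7.2, 5.9),
]
d2 = 2.059  # control chart constant for n = 4

means = [sum(row) / len(row) for row in waiting]    # subgroup means
ranges = [max(row) - min(row) for row in waiting]   # subgroup ranges

grand_mean = sum(means) / len(means)
mean_range = sum(ranges) / len(ranges)
sigma_hat = mean_range / d2                         # estimated process std. deviation

print(round(grand_mean, 3), round(mean_range, 3), round(sigma_hat, 4))
```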


(c) The upper control limit, center line, and lower control limit for the R-chart are

$$UCL = D_4 \bar{R} = 7.474$$
$$CL = \bar{R} = 3.275$$
$$LCL = D_3 \bar{R} = 0$$

The R-chart is analyzed first to determine if it is stable. Figure 3(a) shows that none of the points is plotted outside
the control limits, and there are no other signals indicating a lack of control. Thus, there are no indications of special
sources of variation on the R-chart.
The $\bar{X}$-chart can now be analyzed. The control limits for the $\bar{X}$-chart are

$$UCL = \bar{\bar{X}} + A_2 \bar{R} = 8.327$$
$$CL = \bar{\bar{X}} = 5.941$$
$$LCL = \bar{\bar{X}} - A_2 \bar{R} = 3.555$$

Figure 3(b) shows that samples 17 and 18 are below the $-2\sigma$ limit (Rule 2: two out of three consecutive points fall below the $-2\sigma$ limit).
Assuming assignable causes, we can discard these samples from the data.
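The control limits and the Rule 2 signal can be checked from the subgroup means and ranges in Table II. This sketch makes two assumptions: the $\bar{X}$-chart limits are computed as $\bar{\bar{X}} \pm 3\hat{\sigma}/\sqrt{n}$ with $\hat{\sigma} = \bar{R}/d_2$ (equivalent to using $A_2 = 3/(d_2\sqrt{n}) \approx 0.729$), and a point is flagged under Rule 2 when it and at least one of the two preceding points fall below the lower $2\sigma$ limit. Only the lower side is checked here.

```python
import math

# Subgroup means and ranges from Table II.
means = [7.100, 5.450, 5.500, 6.300, 6.225, 8.125, 5.600, 7.400, 4.775, 6.900,
         6.775, 6.700, 7.500, 4.975, 5.200, 3.975, 4.125, 4.250, 6.300, 5.650]
ranges = [3.5, 5.4, 4.1, 3.6, 5.1, 2.9, 1.9, 3.3, 1.6, 3.8,
          2.7, 1.3, 2.7, 3.1, 4.4, 2.2, 2.6, 3.6, 3.9, 3.8]
n, d2, D3, D4 = 4, 2.059, 0.0, 2.282

grand_mean = sum(means) / len(means)
mean_range = sum(ranges) / len(ranges)
sigma_xbar = mean_range / d2 / math.sqrt(n)  # standard error of a subgroup mean

r_lcl, r_ucl = D3 * mean_range, D4 * mean_range
x_lcl, x_ucl = grand_mean - 3 * sigma_xbar, grand_mean + 3 * sigma_xbar

# Rule 1: any subgroup mean beyond the 3-sigma limits (sample numbers are 1-based).
beyond_limits = [i + 1 for i, m in enumerate(means) if m < x_lcl or m > x_ucl]

# Rule 2, lower side: the current point and at least one of the two before it
# lie below the lower 2-sigma limit.
low_2sigma = grand_mean - 2 * sigma_xbar
below = [m < low_2sigma for m in means]
rule2_low = [i + 1 for i in range(2, len(means))
             if below[i] and sum(below[i - 2:i + 1]) >= 2]

print(round(x_lcl, 3), round(x_ucl, 3), beyond_limits, rule2_low)
```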
Fig. 3. R- and $\bar{X}$-charts for the waiting time data: (a) R-chart (sample range versus sample number) and (b) $\bar{X}$-chart (sample mean versus sample number), each with UCL, CL, and LCL marked; samples 17 and 18 are flagged on the $\bar{X}$-chart.

MATLAB code
>> load waitingtime.mat; %load waiting time data
>> [stats,plotdata]=controlchart(X,'chart',{'r','xbar'},'sigma','range','rules','we6');

Thus, if the out-of-control points at samples 17 and 18 are discarded, the new control limits are calculated using
the remaining 18 samples. The revised grand mean and mean range are then given by
$$\bar{\bar{X}} = 6.136 \quad \text{and} \quad \bar{R} = 3.294$$

The revised control limits for the new R-chart are

$$UCL = D_4 \bar{R} = 7.518$$
$$CL = \bar{R} = 3.294$$
$$LCL = D_3 \bar{R} = 0$$

and the revised control limits for the new $\bar{X}$-chart are
$$UCL = \bar{\bar{X}} + A_2 \bar{R} = 8.536$$
$$CL = \bar{\bar{X}} = 6.136$$
$$LCL = \bar{\bar{X}} - A_2 \bar{R} = 3.736$$

The new $\bar{X}$- and R-charts are shown in Figure 4. Notice now that all the points fall within the limits, indicating that
the process may be stable.
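The revised limits can be reproduced by dropping samples 17 and 18 and recomputing. As before, this is a sketch that assumes the $\bar{X}$-chart limits are $\bar{\bar{X}} \pm 3\bar{R}/(d_2\sqrt{n})$; the variable names are illustrative.

```python
import math

# Subgroup means and ranges from Table II.
means = [7.100, 5.450, 5.500, 6.300, 6.225, 8.125, 5.600, 7.400, 4.775, 6.900,
         6.775, 6.700, 7.500, 4.975, 5.200, 3.975, 4.125, 4.250, 6.300, 5.650]
ranges = [3.5, 5.4, 4.1, 3.6, 5.1, 2.9, 1.9, 3.3, 1.6, 3.8,
          2.7, 1.3, 2.7, 3.1, 4.4, 2.2, 2.6, 3.6, 3.9, 3.8]
n, d2, D4 = 4, 2.059, 2.282
discard = {17, 18}  # 1-based sample numbers flagged on the original chart

kept = [(m, r) for i, (m, r) in enumerate(zip(means, ranges), start=1)
        if i not in discard]

grand_mean = sum(m for m, _ in kept) / len(kept)
mean_range = sum(r for _, r in kept) / len(kept)
half_width = 3 * mean_range / (d2 * math.sqrt(n))

r_ucl = D4 * mean_range
x_lcl, x_ucl = grand_mean - half_width, grand_mean + half_width

# Check that every remaining subgroup lies inside the revised 3-sigma limits.
all_inside = all(x_lcl <= m <= x_ucl and r <= r_ucl for m, r in kept)

print(round(grand_mean, 3), round(mean_range, 3),
      round(x_lcl, 3), round(x_ucl, 3), round(r_ucl, 3), all_inside)
```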

Fig. 4. Revised R- and $\bar{X}$-charts for the waiting time data: (a) R-chart (sample range versus sample number) and (b) $\bar{X}$-chart (sample mean versus sample number), each with UCL, CL, and LCL marked.

(d) From the given information, we have $LSL = 1$ and $USL = 11$. From the revised R- and $\bar{X}$-charts, we have
$\bar{\bar{X}} = 6.136$ and $\bar{R} = 3.294$. Thus, the estimated process standard deviation is
$\hat{\sigma} = \bar{R}/d_2 = 3.294/2.059 = 1.6$. The process capability indices are given by
$$C_p = \frac{USL - LSL}{6\hat{\sigma}} = 1.0417$$
$$C_{pL} = \frac{\bar{\bar{X}} - LSL}{3\hat{\sigma}} = 1.07$$
$$C_{pU} = \frac{USL - \bar{\bar{X}}}{3\hat{\sigma}} = 1.0133$$
$$C_{pk} = \min(C_{pL}, C_{pU}) = 1.0133$$

The $C_{pk}$ value is between 1 and 1.33, which indicates that the process is barely capable and produces more than 63 but
fewer than 2,700 non-conforming units per million. This process has a spread just about equal to the specification width. It
should be noted that if the process mean moves to the left or the right, a significant portion of product will start falling
outside one of the specification limits. This process must be closely monitored.
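The capability indices follow directly from the revised estimates. A minimal sketch, using the rounded values $\bar{\bar{X}} = 6.136$ and $\hat{\sigma} = 1.6$ from the text:

```python
lsl, usl = 1.0, 11.0             # specification limits: 6 +/- 5 minutes
x_bar, sigma_hat = 6.136, 1.6    # revised grand mean and estimated std. deviation

cp = (usl - lsl) / (6 * sigma_hat)     # potential capability
cpl = (x_bar - lsl) / (3 * sigma_hat)  # capability relative to the lower limit
cpu = (usl - x_bar) / (3 * sigma_hat)  # capability relative to the upper limit
cpk = min(cpl, cpu)                    # actual capability (worst side)

print(round(cp, 4), round(cpl, 2), round(cpu, 4), round(cpk, 4))
```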
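The total percentage of nonconforming units produced by the process is

$$\text{Percentage of nonconforming} = 1 - P(LSL < X < USL)$$
$$= 1 - P\left(\frac{LSL - \bar{\bar{X}}}{\hat{\sigma}} < \frac{X - \bar{\bar{X}}}{\hat{\sigma}} < \frac{USL - \bar{\bar{X}}}{\hat{\sigma}}\right)$$
$$= 1 - \left[\Phi\left(\frac{USL - \bar{\bar{X}}}{\hat{\sigma}}\right) - \Phi\left(\frac{LSL - \bar{\bar{X}}}{\hat{\sigma}}\right)\right]$$
$$= 0.0018$$

where $\Phi(\cdot)$ is the cdf of the standard normal distribution $N(0, 1)$. Thus, the proportion of the process not meeting specifications is 0.18%, which is quite small. Although we found that the process is in control, it is barely capable of
meeting the stated specifications. This process must be closely monitored.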
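The nonconforming fraction can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). A minimal sketch, again using the rounded values $\bar{\bar{X}} = 6.136$ and $\hat{\sigma} = 1.6$:

```python
from statistics import NormalDist

lsl, usl = 1.0, 11.0
x_bar, sigma_hat = 6.136, 1.6

# Model the waiting time as normal with the estimated mean and std. deviation.
process = NormalDist(mu=x_bar, sigma=sigma_hat)

# Fraction outside the specification limits.
p_nonconforming = 1 - (process.cdf(usl) - process.cdf(lsl))

print(round(p_nonconforming, 4))
```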
