
CONTENTS

INDEX

MEET MTB

UGUIDE 1

UGUIDE 2

SC QREF

HOW TO USE

Chapter 17

Probit Analysis

- Probit Analysis Overview
- Probit Analysis

MINITAB User's Guide 2


Probit Analysis Overview


A probit study consists of imposing a stress (or stimulus) on a number of units, then recording whether each unit failed. Probit analysis differs from accelerated life testing (page 16-6) in that the response data are binary (success or failure) rather than actual failure times.

In the engineering sciences, a common experiment is destructive inspection. Suppose you are testing how well submarine hull materials hold up when exposed to underwater explosions. You subject the materials to explosions of various magnitudes, then record whether or not the hull cracked. In the life sciences, a common experiment is the bioassay, where you subject organisms to various levels of a stress and record whether or not they survive.

Probit analysis can answer these kinds of questions: For each hull material, what shock level cracks 10% of the hulls? What concentration of a pollutant kills 50% of the fish? Or, at a given pesticide application, what is the probability that an insect dies?

Probit Analysis
Use probit analysis when you want to estimate percentiles, survival probabilities, and cumulative probabilities for the distribution of a stress, and draw probability plots. When you enter a factor and choose a Weibull, lognormal, or loglogistic distribution, you can also compare the potency of the stress under different conditions. MINITAB calculates the model coefficients using a modified Newton-Raphson algorithm.

Data
Enter the following columns in the worksheet:
- two columns containing the response variable, set up in success/trial or response/frequency format
- one column containing a stress variable (treated as a covariate in MINITAB)
- (optional) one column containing a factor


Response variable
The response data are binomial, so there are two possible outcomes: success or failure. You can enter the data in either success/trial or response/frequency format. Here is the same data arranged both ways:

Success/trial format

Temp  Success  Trials
  80        2      10
 120        4      10
 140        7      10
 160        9      10

The Success column contains the number of successes; the Trials column contains the number of trials.

Response/frequency format

Response  Frequency  Temp
       1          2    80
       0          8    80
       1          4   120
       0          6   120
       1          7   140
       0          3   140
       1          9   160
       0          1   160

The Response column contains values that indicate whether the unit succeeded or failed. The higher value corresponds to a success. The Frequency column indicates how many times that observation occurred.
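The relationship between the two layouts can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not something MINITAB requires you to run; the function name `to_response_frequency` is hypothetical, and the coding 1 = success, 0 = failure follows the rule that the higher value corresponds to a success.

```python
# Hypothetical helper (not a MINITAB function): expand success/trial rows
# into response/frequency rows, coding 1 = success and 0 = failure.
def to_response_frequency(rows):
    """rows: iterable of (stress, successes, trials) tuples."""
    out = []
    for stress, successes, trials in rows:
        if not 0 <= successes <= trials:
            raise ValueError(f"successes must lie between 0 and trials at stress {stress}")
        out.append((1, successes, stress))           # units that succeeded
        out.append((0, trials - successes, stress))  # units that failed
    return out


# Temperature data from the success/trial layout in this section.
success_trial = [(80, 2, 10), (120, 4, 10), (140, 7, 10), (160, 9, 10)]
response_frequency = to_response_frequency(success_trial)

print("Response Frequency Temp")
for response, frequency, temp in response_frequency:
    print(response, frequency, temp)
```

Each success/trial row becomes two response/frequency rows, so four temperature levels yield the eight rows of the second layout.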

Factors
Text categories (factor levels) are processed in alphabetical order by default. If you wish, you can define your own order; see Ordering Text Categories in the Manipulating Data chapter of MINITAB User's Guide 1 for details.

To perform a probit analysis
How you run the analysis depends on whether your worksheet is in success/trial or response/frequency format.
1 Choose Stat > Reliability/Survival > Probit Analysis.


2 Do one of the following:
- Choose Responses in success/trial format.
  1 In Number of successes, enter one column of successes.
  2 In Number of trials, enter one column of trials.
- Choose Responses in response/frequency format.
  1 In Response, enter one column of response values.
  2 If you have a frequency column, enter the column in with frequency.
3 In Stress (stimulus), enter one column of stress or stimulus levels.
4 If you like, use any of the options described below, then click OK.

Options
Probit Analysis dialog box
- include a factor in the model; see Probit Analysis on page 17-2.
- choose one of seven common lifetime distributions: normal (default), lognormal base e, lognormal base 10, logistic, loglogistic, Weibull, and extreme value.

Estimate subdialog box
- estimate percentiles for the percents you specify; see Percentiles on page 17-8. These percentiles are added to the default table of percentiles.
- estimate survival probabilities for the stress values you specify; see Survival and cumulative probabilities on page 17-9.
- specify fiducial (default) or normal approximation confidence intervals.
- specify a confidence level for all of the confidence intervals. The default is 95%.

Graphs subdialog box
- suppress the display of the probability plot.
- draw a survival plot; see Survival plots on page 17-10.
- do not include confidence intervals on the above plots.
- plot the Pearson or deviance residuals versus the event probability. Use these plots to identify poorly fit observations.


Options subdialog box
- enter starting values for model parameters; see Estimating the model parameters on page 17-11.
- change the maximum number of iterations for reaching convergence (the default is 20). MINITAB obtains maximum likelihood estimates through an iterative process. If the maximum number of iterations is reached before convergence, the command terminates; see Estimating the model parameters on page 17-11.
- use historical estimates for the model parameters. In this case, no estimation is done; all results, such as the percentiles, are based on these historical estimates. See Estimating the model parameters on page 17-11.
- estimate the natural response rate from the data or specify a value; see Natural response rate on page 17-12.
- if you have response/frequency data, define the value used to signify the occurrence of a success. Otherwise, the highest value in the column is used.
- enter a reference level for the factor; see Factor variables and reference levels on page 17-11. Otherwise, the lowest value in the column is used.
- perform a Hosmer-Lemeshow test to assess how well your model fits the data. This test bins the data into 10 groups by default; if you like, you can specify a different number.

Results subdialog box
- display the following in the Session window:
  - no output
  - the basic output, which includes the response information, regression table, test for equal slopes, the log-likelihood, multiple degrees of freedom test, and two goodness-of-fit tests
  - the basic output, plus distribution parameter estimates and the table of percentiles and/or survival probabilities (default)
  - the above output, plus characteristics of the distribution and the Hosmer-Lemeshow goodness-of-fit test
- show the log-likelihood for each iteration of the algorithm.

Note: When you select fiducial confidence intervals, MINITAB displays fiducial confidence intervals for the median, Q1, and Q3, and normal confidence intervals for the mean, standard deviation, and IQR in the characteristics of distribution table.


Storage subdialog box
- store the Pearson and deviance residuals
- store the characteristics of the fitted distribution, including
  - percentiles and their percents, standard errors, and confidence limits
  - survival probabilities and their stress level and confidence limits
- store information on the estimated equation, including
  - event probability
  - estimated coefficients and standard error of the estimates
  - variance/covariance matrix
  - natural response rate and standard error of the natural response
  - log-likelihood for the last iteration

Output
The default output consists of:
- the response information
- the factor information
- the regression table, which includes the estimated coefficients and their standard errors, Z-values, and p-values. The Z-test tests whether the coefficient is significantly different from 0; in other words, is it a significant predictor?
- the natural response rate: the probability that a unit fails without being exposed to any of the stress.
- the test for equal slopes, which tests whether the slopes associated with the factor levels are significantly different.
- the log-likelihood from the last iteration of the algorithm.
- two goodness-of-fit tests, which evaluate how well the model fits the data. The null hypothesis is that the model fits the data adequately. Therefore, the higher the p-value, the less evidence there is of lack of fit.
- the parameter estimates for the distribution and their standard errors and 95% confidence intervals. The parameter estimates are transformations of the estimated coefficients in the regression table.
- the table of percentiles, which includes the estimated percentiles, standard errors, and 95% fiducial confidence intervals.
- the probability plot, which helps you to assess whether the chosen distribution fits your data; see Probability plots on page 17-10.


- the relative potency, which compares the potency of a stress for two levels of a factor. To get this output, you must have a factor and choose a Weibull, lognormal, or loglogistic distribution. Suppose you want to compare how the amount of voltage affects two types of light bulbs, and the relative potency is 0.98. This means that light bulb 1 running at 117 volts would fail at approximately the same time as light bulb 2 running at 114.66 volts (117 × 0.98).

Probit model and distribution function


MINITAB provides three main distributions (normal, logistic, and extreme value), allowing you to fit a broad class of binary response models. You can take the log of the stress to get the lognormal, loglogistic, and Weibull distributions, respectively. This class of models (for the situation with no factor) is defined by:

$$\pi_j = c + (1 - c)\, g(\beta_0 + x_j \beta)$$

where
$\pi_j$ = the probability of a response for the jth stress level
$g(y_j)$ = the distribution function (described below)
$\beta_0$ = the constant term
$x_j$ = the jth level of the stress variable
$\beta$ = unknown coefficient associated with the stress variable
$c$ = natural response rate

The distribution functions are outlined below:

Distribution    Distribution function            Mean                        Variance
logistic        $g(y_j) = 1 / (1 + e^{-y_j})$    0                           $\pi^2 / 3$
normal          $g(y_j) = \Phi(y_j)$             0                           1
extreme value   $g(y_j) = 1 - e^{-e^{y_j}}$      $-\gamma$ (Euler constant)  $\pi^2 / 6$

Here, $\pi$ in the Variance column of the table is 3.14159, not the response probability $\pi_j$.

The distribution function you choose should depend on your data. You want to choose a distribution function that results in a good fit to your data. Goodness-of-fit statistics can be used to compare fits using different distributions. Certain distributions may be used for historical reasons or because they have a special meaning in a discipline.
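As a rough illustration of the model equation in this section, the Python sketch below evaluates the response probability for each of the three distribution functions. The function names and the `log_stress` switch are hypothetical conveniences, not MINITAB commands. The final lines plug in the Constant and Volts coefficients reported in this chapter's light bulb example (Weibull, type A, natural response rate 0) as a plausibility check.

```python
import math


def logistic_cdf(y):
    return 1.0 / (1.0 + math.exp(-y))


def normal_cdf(y):
    # Standard normal distribution function, written with the error function.
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def extreme_value_cdf(y):
    # Smallest extreme value distribution function.
    return 1.0 - math.exp(-math.exp(y))


DISTRIBUTIONS = {
    "logistic": logistic_cdf,
    "normal": normal_cdf,
    "extreme value": extreme_value_cdf,
}


def response_probability(x, beta0, beta, c=0.0, distribution="normal", log_stress=False):
    """pi_j = c + (1 - c) * g(beta0 + beta * x_j), for the no-factor model."""
    if log_stress:
        # Taking the log of the stress turns normal, logistic, and extreme value
        # into lognormal (base e), loglogistic, and Weibull, respectively.
        x = math.log(x)
    g = DISTRIBUTIONS[distribution]
    return c + (1.0 - c) * g(beta0 + beta * x)


# Plausibility check with the type A coefficients from the worked example:
# Constant = -97.019, Volts = 20.019, Weibull distribution, stress = 117 volts.
p_fail_117 = response_probability(
    117.0, -97.019, 20.019, distribution="extreme value", log_stress=True
)
print(f"Probability of failing before 800 hours at 117 volts: {p_fail_117:.3f}")
```

The result is roughly 0.17, in line with the type A survival probability of 0.8306 reported in the example output; small differences come from rounding in the printed coefficients.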

Percentiles
At what stress level do half of the units fail? How much pesticide do you need to apply to kill 90% of the insects? You are looking for percentiles. Commonly used percentiles are the 10th, 50th, and 90th, also known in the life sciences as the ED 10, ED 50, and ED 90 (ED = effective dose).

The probit analysis automatically displays a table of percentiles in the Session window, along with 95% fiducial confidence intervals. You can also request:
- additional percentiles to be added to the table
- normal approximation rather than fiducial confidence intervals
- a confidence level other than 95%

The Percentile column contains the stress level required for the corresponding percent of the events to occur. In this example, you exposed light bulbs to various voltages and recorded whether or not the bulb burned out before 800 hours.

Table of Percentiles
                       Standard    95.0% Fiducial CI
Percent  Percentile       Error       Lower       Upper
      1    104.9931      1.3715    101.9273    107.3982
      2    106.9313      1.2661    104.1104    109.1598
      3    108.1795      1.1997    105.5144    110.2980
      4    109.1281      1.1504    106.5795    111.1656
etc.

At 104.9931 volts, 1% of the bulbs burn out before 800 hours.

To modify the table of percentiles
1 In the Probit Analysis main dialog box, click Estimate.
2 Do any of the following:
- In Estimate percentiles for these additional percents, enter the percents or a column of percents.

- Choose Normal approximation to request normal approximation rather than fiducial confidence intervals.
- Change the confidence level for the percentiles (default is 95%): In Confidence level, enter a value. This changes the confidence level for all confidence intervals.
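To make the idea of a percentile concrete, here is a minimal Python sketch that inverts a Weibull distribution function, assuming a natural response rate of 0. The function name `weibull_percentile` is hypothetical. The shape and scale values are the type A estimates from the worked example later in this chapter, not the fit behind the table shown in this section. The sketch gives point estimates only and does not attempt the fiducial confidence intervals.

```python
import math


def weibull_percentile(percent, shape, scale):
    """Stress at which `percent` % of units respond (natural response rate 0)."""
    if not 0.0 < percent < 100.0:
        raise ValueError("percent must be strictly between 0 and 100")
    p = percent / 100.0
    # Invert F(x) = 1 - exp(-(x / scale) ** shape) for x.
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Type A estimates from the worked example in this chapter.
shape_a, scale_a = 20.019, 127.269

for percent in (10, 50, 90):
    stress = weibull_percentile(percent, shape_a, scale_a)
    print(f"ED {percent}: {stress:.2f} volts")
```

For the 1st, 10th, and 50th percentiles this reproduces the type A values in the example's table of percentiles to within rounding.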

Survival and cumulative probabilities


What is the probability that a submarine hull will survive a given strength of shock? At a given pesticide application, what is the probability that an insect survives? You are looking for survival probabilities: estimates of the proportion of units that survive at a certain stress level.

When you request survival probabilities, they are displayed in a table in the Session window. In this example, we exposed light bulbs to various voltages and recorded whether or not the bulb burned out before 800 hours. Then we requested a survival probability for light bulbs subjected to 117 volts:

Table of Survival Probabilities
                         95.0% Fiducial CI
  Stress  Probability     Lower     Upper
117.0000       0.7692    0.6224    0.8825

The probability of a bulb lasting past 800 hours is 0.7692 at 117 volts.

To calculate cumulative probabilities (the likelihood of failing rather than surviving), subtract the survival probability from 1. In this case, the probability of failing before 800 hours at 117 volts is 1 - 0.7692 = 0.2308.
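The calculation can be sketched in Python for a Weibull fit, assuming a natural response rate of 0. The function name `weibull_survival` is hypothetical. The shape and scale values are the type A and type B estimates from the worked example later in this chapter, so the results correspond to that example's output rather than to the 0.7692 shown in this section.

```python
import math


def weibull_survival(stress, shape, scale):
    """Probability that a unit does not respond at `stress` (natural response rate 0)."""
    return math.exp(-((stress / scale) ** shape))


# Estimates from the worked example: common shape, one scale per bulb type.
shape = 20.019
scales = {"A": 127.269, "B": 126.134}

survival_at_117 = {}
for bulb_type, scale in scales.items():
    survive = weibull_survival(117.0, shape, scale)
    survival_at_117[bulb_type] = survive
    # The cumulative (failure) probability is the complement of survival.
    print(f"Type {bulb_type}: survive {survive:.4f}, fail {1.0 - survive:.4f}")
```

The survival values agree with the 0.8306 (type A) and 0.8009 (type B) reported in the example's tables of survival probabilities, up to rounding of the printed estimates.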
To request survival probabilities
1 In the Probit Analysis main dialog box, click Estimate.
2 In Estimate survival probabilities for these stress values, enter one or more stress values or columns of stress values.


Probability plots
A probability plot displays the percentiles. You can use the probability plot to assess whether a particular distribution fits your data. In general, the closer the points fall to the fitted line, the better the fit. For a discussion of probability plots, see Probability plots on page 15-37.

When you have more than one factor level, lines and confidence intervals are drawn for each level. If the plot looks cluttered, you can turn off the confidence intervals in the Graphs subdialog box. You can also change the confidence level from the default of 95% by entering a new value in the Estimate subdialog box.

Survival plots
Survival plots display the survival probabilities versus stress. Each point on the plot represents the proportion of units surviving at a stress level. The survival curve is surrounded by two outer lines, the 95% confidence interval for the curve, which provide reasonable values for the true survival function. For an illustration of a survival plot, see Survival plots on page 15-40.

To draw a survival plot
1 In the Probit Analysis dialog box, click Graphs.
2 Check Survival plot.
3 If you like, turn off the 95% confidence interval: uncheck Display confidence intervals on above plots. Click OK.
4 If you like, change the confidence level for the 95% confidence interval: click Estimate. In Confidence level, enter a value. Click OK.


Factor variables and reference levels


You can enter numeric, text, or date/time factor levels. MINITAB needs to assign one factor level to be the reference level, meaning that the estimated coefficients are interpreted relative to this level. Probit analysis creates a set of design variables for the factor in the model. If there are k levels, there will be k-1 design variables and the reference level will be coded with all 0s. Here are two examples of the default coding scheme:
Factor A with 4 levels (1 2 3 4)

     A1  A2  A3
1     0   0   0   (reference level)
2     1   0   0
3     0   1   0
4     0   0   1

Factor B with 3 levels (High Low Medium)

         B1  B2
High      0   0   (reference level)
Low       1   0
Medium    0   1

By default, MINITAB designates the lowest numeric, date/time, or text value as the reference factor level. If you like, you can change this reference value in the Options subdialog box.
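The default coding scheme can be imitated with a short Python sketch. The function `design_variables` is hypothetical and only mirrors the rule described above: k levels give k-1 indicator variables, and the reference level (the lowest value unless you name another) is coded with all zeros.

```python
def design_variables(levels, reference=None):
    """Map each factor level to a tuple of k-1 indicator values.

    The reference level is coded as all zeros; the remaining levels,
    taken in sorted order, each get a single 1.
    """
    ordered = sorted(set(levels))
    if reference is None:
        reference = ordered[0]  # default: lowest numeric or text value
    elif reference not in ordered:
        raise ValueError(f"reference level {reference!r} is not a level of the factor")
    others = [level for level in ordered if level != reference]
    coding = {reference: (0,) * len(others)}
    for i, level in enumerate(others):
        coding[level] = tuple(1 if i == j else 0 for j in range(len(others)))
    return coding


factor_a = design_variables([1, 2, 3, 4])
factor_b = design_variables(["High", "Low", "Medium"])
factor_b_low_ref = design_variables(["High", "Low", "Medium"], reference="Low")

for name, coding in (("A", factor_a), ("B", factor_b), ("B, reference Low", factor_b_low_ref)):
    print(f"Factor {name}:")
    for level, row in coding.items():
        print(f"  {level}: {row}")
```

Choosing a different reference level only changes which level receives the all-zero row; the coefficients are then interpreted relative to that level.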

Estimating the model parameters


MINITAB uses a modified Newton-Raphson algorithm to estimate the model parameters. If you like, you can enter historical estimates for these parameters. In this case, no estimation is done; all results, such as the percentiles, are based on these historical estimates.

When you let MINITAB estimate the parameters from the data, you can optionally:
- enter starting values for the algorithm.
- change the maximum number of iterations for reaching convergence (the default is 20). MINITAB obtains maximum likelihood estimates through an iterative process. If the maximum number of iterations is reached before convergence, the command terminates.

Why enter starting values for the algorithm? The maximum likelihood solution may not converge if the starting estimates are not in the neighborhood of the true solution, so you may want to specify what you think are good starting values for parameter estimates.
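To show how starting values and an iteration limit enter an iterative maximum likelihood fit, here is a minimal sketch of a plain Newton-Raphson iteration. It is not MINITAB's modified algorithm: it assumes the logistic distribution, no factor, and a natural response rate of 0, and the function name, the convergence tolerance, and the starting values of (0, 0) are all illustrative choices. The data are the temperature values from the success/trial layout earlier in this chapter.

```python
import math


def fit_logistic_newton(stress, successes, trials, start=(0.0, 0.0), max_iter=20, tol=1e-8):
    """Plain Newton-Raphson for p_j = 1 / (1 + exp(-(b0 + b1 * x_j)))."""
    b0, b1 = start
    for iteration in range(1, max_iter + 1):
        # Accumulate the score vector (g0, g1) and the 2x2 information
        # matrix [[a, b], [b, d]] of the binomial log-likelihood.
        g0 = g1 = a = b = d = 0.0
        for x, s, n in zip(stress, successes, trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = s - n * p
            weight = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            a += weight
            b += weight * x
            d += weight * x * x
        # Solve the 2x2 system for the Newton step.
        det = a * d - b * b
        step0 = (d * g0 - b * g1) / det
        step1 = (a * g1 - b * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            return b0, b1, iteration
    raise RuntimeError(f"no convergence within {max_iter} iterations")


temps = [80, 120, 140, 160]
succ = [2, 4, 7, 9]
n_trials = [10, 10, 10, 10]

b0_hat, b1_hat, n_iter = fit_logistic_newton(temps, succ, n_trials)
print(f"constant = {b0_hat:.4f}, slope = {b1_hat:.5f}, iterations = {n_iter}")
```

Passing better `start` values typically reduces the number of iterations, and a poor choice can make the iteration fail, which is the situation the `max_iter` limit guards against.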


To control estimation of the parameters
1 In the Probit Analysis main dialog box, click Options.
2 Do one of the following:
- To estimate the model parameters from the data (the default), choose Estimate model parameters.
  - To enter starting estimates for the parameters: In Use starting estimates, enter one starting value for each coefficient in the regression table. Enter the values in the order that they appear in the regression table.
    Note: Do not enter a starting value for the natural response rate here.
  - To specify the maximum number of iterations, enter a positive integer.
- To enter your own estimates for the model parameters, choose Use historical estimates and enter one estimate for each coefficient in the regression table. Enter the values in the order that they appear in the regression table.

Natural response rate


The regression table includes the natural response rate: the probability that a unit fails without being exposed to any of the stress. This statistic is used in situations with high mortality or high failure rates. For example, you might want to know the probability that a young fish dies without being exposed to a certain pollutant. If the natural response rate is greater than 0, the stress does not cause all of the deaths, and the analysis should take this into account.

You can choose to estimate the natural response rate from the data, or set the value. You would set the value when you have a historical estimate, or to use it as a starting value for the algorithm.
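One way to see what the natural response rate does is to rearrange the model equation given under Probit model and distribution function. The numbers in the second line are purely illustrative and do not come from any data set in this chapter.

```latex
% Solving the model equation for the part of the response attributable to the stress:
\pi_j = c + (1 - c)\, g(\beta_0 + x_j \beta)
\quad\Longrightarrow\quad
g(\beta_0 + x_j \beta) = \frac{\pi_j - c}{1 - c}

% Illustrative values (assumed): natural response rate c = 0.05, overall response \pi_j = 0.40
\frac{0.40 - 0.05}{1 - 0.05} = \frac{0.35}{0.95} \approx 0.368
```

Under these assumed values, about 37% of the units that would not have responded naturally respond because of the stress, rather than the 40% that the raw response suggests.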


Example of a probit analysis

Suppose you work for a lightbulb manufacturer and have been asked to determine bulb life for two types of bulbs at typical household voltages. The typical line voltage entering a house is 117 volts ± 10% (or 105 to 129 volts). You subject the two bulbs to five stress levels around that range (108, 114, 120, 126, and 132 volts) and define a success as: The bulb fails before 800 hours.

1 Open the worksheet LIGHTBUL.MTW.
2 Choose Stat > Reliability/Survival > Probit Analysis.
3 Choose Responses in success/trial format.
4 In Number of successes, enter Blows. In Number of trials, enter Trials.
5 In Stress (stimulus), enter Volts.
6 In Factor (optional), enter Type. In Enter number of levels, enter 2.
7 From Assumed distribution, choose Weibull.
8 Click Estimate. In Estimate survival probabilities for these stress values, enter 117. Click OK.
9 Click Graphs. Uncheck Display confidence intervals on above plots. Click OK in each dialog box.

Session window output

Probit Analysis: Blows, Trials versus Volts, Type

Distribution: Weibull

Response Information

Variable  Value     Count
Blows     Success     192
          Failure     308
Trials    Total       500

Factor Information

Factor  Levels  Values
Type         2  A B

Estimation Method: Maximum Likelihood


Regression Table

                       Standard
Variable        Coef      Error        Z      P
Constant     -97.019      7.673   -12.64  0.000
Volts         20.019      1.587    12.61  0.000
Type
 B            0.1794     0.1598     1.12  0.262
Natural
Response       0.000

Test for equal slopes: Chi-Square = 0.2585, DF = 1, P-Value = 0.611

Log-Likelihood = -214.213

Goodness-of-Fit Tests

Method    Chi-Square  DF      P
Pearson        2.516   7  0.926
Deviance       2.492   7  0.928

Type = A

Tolerance Distribution

Parameter Estimates

                      Standard    95.0% Normal CI
Parameter  Estimate      Error     Lower     Upper
Shape        20.019      1.587    17.138    23.384
Scale       127.269      0.737   125.832   128.722

Table of Percentiles

                      Standard    95.0% Fiducial CI
Percent  Percentile      Error      Lower      Upper
      1    101.1409     1.8424    96.9868   104.3407
      2    104.7307     1.6355   101.0429   107.5731
      3    106.9008     1.5090   103.5009   109.5267
      4    108.4760     1.4171   105.2866   110.9457
      5    109.7203     1.3449   106.6975   112.0680
      6    110.7531     1.2854   107.8683   113.0007
      7    111.6387     1.2348   108.8717   113.8017
      8    112.4158     1.1909   109.7516   114.5057
      9    113.1096     1.1523   110.5364   115.1354
     10    113.7373     1.1177   111.2458   115.7062
     20    118.0817     0.8986   116.1208   119.7003
     30    120.8808     0.7901   119.2012   122.3424
     40    123.0693     0.7358   121.5505   124.4720
     50    124.9600     0.7179   123.5231   126.3718

-----the rest of this table omitted for space-----


Table of Survival Probabilities

                         95.0% Fiducial CI
  Stress  Probability     Lower     Upper
117.0000       0.8306    0.7807    0.8785

Type = B

Tolerance Distribution

Parameter Estimates

                      Standard    95.0% Normal CI
Parameter  Estimate      Error     Lower     Upper
Shape        20.019      1.587    17.138    23.384
Scale       126.134      0.704   124.761   127.522

Table of Percentiles

                      Standard    95.0% Fiducial CI
Percent  Percentile      Error      Lower      Upper
      1    100.2388     1.8617    96.0399   103.4706
      2    103.7965     1.6562   100.0595   106.6728
      3    105.9472     1.5303   102.4960   108.6073
      4    107.5084     1.4386   104.2667   110.0121
      5    108.7416     1.3663   105.6661   111.1226
      6    109.7652     1.3065   106.8277   112.0453
      7    110.6429     1.2556   107.8234   112.8374
      8    111.4131     1.2113   108.6967   113.5335
      9    112.1007     1.1722   109.4760   114.1558
     10    112.7228     1.1371   110.1805   114.7197
     20    117.0285     0.9108   115.0289   118.6590
     30    119.8026     0.7929   118.1018   121.2561
     40    121.9716     0.7280   120.4520   123.3436
     50    123.8454     0.6989   122.4294   125.2031

-----the rest of this table omitted for space-----

Table of Survival Probabilities

                         95.0% Fiducial CI
  Stress  Probability     Lower     Upper
117.0000       0.8009    0.7460    0.8546

Table of Relative Potency

Factor: Type

             Relative    95.0% Fiducial CI
Comparison    Potency     Lower     Upper
A VS B         0.9911    0.9754    1.0068


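Graph window output
[Probability plot for bulb types A and B, displayed without confidence intervals; figure not reproduced here.]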

Interpreting the results
The goodness-of-fit tests (p-values = 0.926, 0.928) and the probability plot suggest that the Weibull distribution fits the data adequately. Since the test for equal slopes is not significant (p-value = 0.611), the comparison of lightbulbs will be similar regardless of the voltage level. In this case, lightbulbs A and B are not significantly different, because the coefficient associated with type B is not significantly different from 0 (p-value = 0.262).

At 117 volts, what percentage of the bulbs last beyond 800 hours? Eighty-three percent of the type A bulbs and 80% of the type B bulbs last beyond 800 hours.

At what voltage do 50% of the bulbs fail before 800 hours? The table of percentiles shows that 50% of type A bulbs fail before 800 hours at 124.96 volts; 50% of type B bulbs fail before 800 hours at 123.85 volts.
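The relative potency in the output can be related to the other reported estimates with a short Python sketch. Treating the relative potency as the ratio of the type B scale to the type A scale is an assumption made here for illustration; it reproduces the reported value of 0.9911, but the source does not state how MINITAB computes it. The variable names are hypothetical.

```python
import math

# Weibull estimates from the Session window output (common shape, one scale per type).
shape = 20.019
scale_a, scale_b = 127.269, 126.134

# Assumed definition: ratio of the equally effective stresses, type B over type A.
relative_potency = scale_b / scale_a
print(f"Relative potency (A vs B): {relative_potency:.4f}")

# A type A bulb at 117 volts behaves like a type B bulb at this lower voltage.
volts_a = 117.0
volts_b_equivalent = volts_a * relative_potency
print(f"Type A at {volts_a:.0f} volts ~ type B at {volts_b_equivalent:.2f} volts")


def weibull_survival(stress, shape, scale):
    """Survival probability under a Weibull model with natural response rate 0."""
    return math.exp(-((stress / scale) ** shape))


survive_a = weibull_survival(volts_a, shape, scale_a)
survive_b = weibull_survival(volts_b_equivalent, shape, scale_b)
print(f"Survival: type A {survive_a:.4f}, type B at equivalent voltage {survive_b:.4f}")
```

Because both types share one shape estimate, the same ratio links every pair of percentiles: multiplying the type A median of 124.96 volts by the relative potency gives approximately the type B median of 123.85 volts.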

References
[1] D.J. Finney (1971). Probit Analysis, Cambridge University Press.
[2] D.W. Hosmer and S. Lemeshow (1989). Applied Logistic Regression, John Wiley & Sons, Inc.
[3] P. McCullagh and J.A. Nelder (1992). Generalized Linear Models, Chapman & Hall.
[4] W. Murray, Ed. (1972). Numerical Methods for Unconstrained Optimization, Academic Press.
[5] W. Nelson (1982). Applied Life Data Analysis, John Wiley & Sons.
