
Determination of Sample Size in Service Inspection

S. Qin

G. E. O. Widera
Fellow ASME

Department of Mechanical and Industrial Engineering,
Marquette University,
1515 W. Wisconsin Avenue,
Milwaukee, WI 53233

When performing inservice inspection on a large volume of identical components, it becomes an almost impossible task to inspect all those in which defects may exist, even if their failure probabilities are known. As a result, an appropriate sample size needs to be determined when setting up an inspection program. In this paper, a probabilistic analysis method is employed to solve this problem. It is assumed that the characteristic data of components has a certain distribution which can be taken as known when the mean and standard deviations of serviceable and defective sets of components are estimated. The sample size can then be determined within an acceptable assigned error range. In this way, both false rejection and acceptance can be avoided with a high degree of confidence.

I Introduction

The objective of inservice inspection (ISI) is to prevent undesired failure of components or subsystems during their normal operating life so as to avoid casualties, environmental damage, and economic loss. After a risk assessment of components is completed, an inspection program can be developed. There are three primary factors which must be taken into account; these are the time between inspection, the size of the sample containing the degraded components (must be determined with a sufficiently high level of confidence), and the probability that the inspection method detects the degradation in the process (ASME, 1991). When the number of components is large, an appropriate sample size must be determined in view of the fact that the cost for inspection of all components is, in general, unacceptably high. In the determination of the sample size, the variability in the environmental conditions and stress values should, of course, be considered first. However, when a large volume of components is subjected to nearly similar conditions, then obviously some method is needed to determine an economical sample size so as to decrease the inspection cost. To some extent, though, practical experience may be available to help engineers make that decision.

There are two main factors which will affect the determination of the sample size. One is the probability that the sample contains degraded components and the other is the cost of inspection. We intuitively know that the larger the sample size, the larger the probability of the detection of degraded components as well as the higher the cost of inspection. So, the problem here is that given a sufficiently large probability for detecting degradation in a process, how large a sample size is economical, or with a view to the cost of inspection and the consequence of failure, what is the optimal sample size.

In the following sections, a probabilistic analysis method is presented to estimate the minimum sample size which has an acceptable probability of detection of degradation. The premise of the method is that the distribution function of the characteristic data is known. We depend on the characteristic data to identify the serviceable and defective sets of components. Fortunately, in practice, the characteristic data of components can be considered to approximately have a normal distribution. When both the mean and standard deviation are then estimated through statistical approaches and the probability that degraded components are detected has been assigned, the minimum sample size can be calculated.

Contributed by the Pressure Vessels and Piping Division and presented at the Pressure Vessels and Piping Conference, Minneapolis, Minnesota, June 19-23, 1994, of THE AMERICAN SOCIETY OF MECHANICAL ENGINEERS. Manuscript received by the PVP Division, May 18, 1994; revised manuscript received March 26, 1996. Associate Technical Editor: R. Streit.

II Concept of the Theorem of Neyman and E. Pearson

Let $\xi$ be a random variable with a known distribution, and the parameters $\theta_1, \theta_2, \ldots, \theta_m$, upon which the distribution depends be unknown. Further, let $x_1, x_2, \ldots, x_n$ be the observed values of $\xi$. Each set of $x_1, x_2, \ldots, x_n$ belongs to one of two categories, $R_{n1}$ or $R_{n2}$. For example, if the value of $p_0$ is assigned ($0 < p_0 < 1$), let

$$a = \frac{1}{n} \sum_{k=1}^{n} x_k \tag{1}$$

When $a < p_0$, set $\{x_k\}$ belongs to $R_{n1}$; otherwise set $\{x_k\}$ belongs to $R_{n2}$.

There are two alternative hypotheses: $H_1$ and $H_2$. We must make a choice between them, depending upon whether the observation values $\{x_k\}$ belong to $R_{n1}$ or $R_{n2}$. With reference to the foregoing example, $H_1$ is the hypothesis that the expectation of $\xi$ is less than $p_0$, while $H_2$ is that the expectation of $\xi$ is greater than $p_0$, i.e.,

$$H_1: a < p_0; \qquad H_2: a > p_0 \tag{2}$$

When we make the choice of $H_1$ or $H_2$, two types of error may occur. The first type of error is that $H_1$ is rejected while it actually is true. In other words, $H_1$ is actually true, but $\{x_k\}$ falls in region $R_{n2}$. The second type of error is that $H_1$ is accepted, but it is actually false. In this context, $R_{n2}$ is referred to as the critical region. Let $\alpha_1$ and $\alpha_2$ be the probabilities that an error of Type I and of Type II occurs, respectively. For a given number of trials $n$, the choice of critical region can affect the values of $\alpha_1$ and $\alpha_2$, but it cannot make $\alpha_1$ and $\alpha_2$ arbitrarily small simultaneously. Thus, if $n$ possesses a certain value, critical region $R_{n2}$ can only be chosen so as to make $\alpha_2$ take on a minimum value under the condition that $\alpha_1$ attains a certain value. It can, thus, be shown that the following theorem of Neyman and Pearson holds (Gnedenko, 1992).

Theorem 1: Of all the possible critical regions for which the probability of an error of Type I is $\alpha_1$, the probability of an error of Type II assumes its least value for that critical region $R^*_{n2}$ which consists of all points $(x_1, x_2, \ldots, x_n)$ for which

$$\prod_{k=1}^{n} f(x_k / H_2) \ge c \prod_{k=1}^{n} f(x_k / H_1) \tag{3}$$

Journal of Pressure Vessel Technology, FEBRUARY 1997, Vol. 119 / 57

Copyright © 1997 by ASME


Downloaded From: http://pressurevesseltech.asmedigitalcollection.asme.org/ on 09/12/2017 Terms of Use: http://www.asme.org/about-asme/terms-of-use
The number $c$ is determined from the condition

$$P\{(x_1, x_2, \ldots, x_n) \in R^*_{n2} / H_1\} = \varphi(c) = \alpha_1 \tag{4}$$

Here, $f(x_k / H_1)$ and $f(x_k / H_2)$ are the probability density functions of $x_k$ under the conditions that hypotheses $H_1$ and $H_2$ hold, respectively, and $P\{(x_1, x_2, \ldots, x_n) \in R^*_{n2} / H_1\}$ the probability that $(x_1, x_2, \ldots, x_n)$ falls into region $R^*_{n2}$ under the condition that $H_1$ is true. Further, $R^*_{n2}$ is the optimum critical region.

The meaning of this theorem is that when the number of trials $n$ is known and the probability $\alpha_1$ is assigned, inequality (3) and Eq. (4) can be employed to determine the critical region $R^*_{n2}$ which makes the probability of an error of Type II reach its minimum value. On the other hand, if the probabilities $\alpha_1$ and $\alpha_2$ are assigned, the number of trials $n$ can be determined by means of Eqs. (3) and (4). For this purpose, let us examine (3) and (4) again. As a matter of fact, inequality (3) limits the region into which $\{x_k\}$ will fall when the probability of an error of Type I attains value $\alpha_1$. Then, through Eq. (4) we are able to determine the value of $c$. If inequality (3) is reversed, it can be used to determine the critical region $R^*_{n1}$ under the condition that $n$ and $\alpha_2$ are known. Therefore, if $\alpha_1$ and $\alpha_2$ are assigned, there will be two inequalities and two equations. Making use of them, we can then determine the unknown numbers $c$ and $n$.

In the next section, an example is given to further explain how to use this theorem to solve the problem of the determination of sample size when the probabilities $\alpha_1$ and $\alpha_2$ have been assigned.

III Example

Let us suppose that there exists a batch of components which needs to be inspected, and that we know that the test data (such as flaw size) upon which our judgment depends has a normal distribution. When the mean of the test data is equal to or greater than $\mu_2$, this batch of components is unacceptable; otherwise, it is acceptable. According to the design data or from the results of a previous inspection, the mean of the test data can be assumed as $\mu_1$ when it is acceptable. So, now we have two hypotheses $H_1$ and $H_2$; $H_1$ represents the fact that the mean of this batch of components is $\mu_1$ and it is acceptable, and $H_2$ the reverse. We expect the probability $\alpha_2$, i.e., the probability of false acceptance, to be very small, say 0.01, and $\alpha_1$ not to be greater than 0.05. In accordance with the theorem given in the foregoing, we then have

$$\exp\left\{-\frac{1}{2\sigma^2} \sum_{k=1}^{n} \left[(x_k - \mu_2)^2 - (x_k - \mu_1)^2\right]\right\} \ge c \tag{5}$$

or

$$\frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{n} (x_k - \mu_1) \ge \frac{\sigma \ln c}{\sqrt{n}\,(\mu_2 - \mu_1)} + \frac{\sqrt{n}}{2\sigma}(\mu_2 - \mu_1) = k_1 \tag{6}$$

This inequality defines the critical region $R^*_{n2}$ which makes $\alpha_2$ be the minimum. It is noted that the expectation of the variable $\frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{n} (x_k - \mu_1)$ is 0, while its variance is 1. According to Eq. (4), we have

$$\frac{1}{\sqrt{2\pi}} \int_{k_1}^{\infty} e^{-z^2/2}\, dz = \alpha_1$$

or

$$\Phi(k_1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{k_1} e^{-z^2/2}\, dz = 1 - \alpha_1 \tag{7}$$

On the other hand, critical region $R^*_{n1}$, in which $\alpha_1$ is the minimum under the condition that $\alpha_2$ is known, can be defined by the following inequality:

$$\exp\left\{-\frac{1}{2\sigma^2} \sum_{k=1}^{n} \left[(x_k - \mu_2)^2 - (x_k - \mu_1)^2\right]\right\} \le c \tag{8}$$

or

$$\frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{n} (x_k - \mu_2) \le \frac{\sigma \ln c}{\sqrt{n}\,(\mu_2 - \mu_1)} - \frac{\sqrt{n}}{2\sigma}(\mu_2 - \mu_1) = k_1 - \frac{\sqrt{n}}{\sigma}(\mu_2 - \mu_1) \tag{9}$$

Making use of Eq. (4) again, we have

$$\Phi\left(k_1 - \frac{\sqrt{n}}{\sigma}(\mu_2 - \mu_1)\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{k_1 - \frac{\sqrt{n}}{\sigma}(\mu_2 - \mu_1)} e^{-z^2/2}\, dz = \alpha_2 \tag{10}$$

Equations (7) and (10) can be transformed into the following forms:

$$k_1 = \psi(1 - \alpha_1) \tag{7'}$$

and

$$k_1 - \frac{\sqrt{n}}{\sigma}(\mu_2 - \mu_1) = \psi(\alpha_2) \tag{10'}$$

where $\psi$ is the reverse (inverse) function of $\Phi$. Equations (7') and (10') are then solved simultaneously so that the number of trials $n$ is determined as follows:

$$n = \frac{\sigma^2}{(\mu_2 - \mu_1)^2} \left[\psi(1 - \alpha_1) - \psi(\alpha_2)\right]^2 \tag{11}$$
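As a rough numerical check of Eq. (11), the sketch below evaluates $n = \sigma^2 [\psi(1 - \alpha_1) - \psi(\alpha_2)]^2 / (\mu_2 - \mu_1)^2$ with the values used in the Section III example ($\sigma = 2.5$, $\mu_1 = 16.5$, $\mu_2 = 20$, $\alpha_1 = 0.05$, $\alpha_2 = 0.01$). The function name `min_sample_size` is a hypothetical helper, not something defined in the paper, and $\psi$ is taken here to be the inverse standard normal distribution function from Python's standard library rather than a printed table, so the quantiles differ slightly from the tabulated 1.648 and 2.33.

```python
from statistics import NormalDist


def min_sample_size(sigma, mu1, mu2, alpha1, alpha2):
    """Evaluate Eq. (11): n = sigma^2 / (mu2 - mu1)^2 * [psi(1 - alpha1) - psi(alpha2)]^2.

    psi is assumed to be the inverse of the standard normal distribution
    function Phi (the paper's "reverse function").
    """
    psi = NormalDist().inv_cdf
    z = psi(1.0 - alpha1) - psi(alpha2)
    return (sigma / (mu2 - mu1)) ** 2 * z ** 2


# Values from the Section III example.
n = min_sample_size(sigma=2.5, mu1=16.5, mu2=20.0, alpha1=0.05, alpha2=0.01)
print(f"n from Eq. (11): {n:.2f}")
```

The result is a real number close to 8; in practice it would be rounded to an integer number of components, and whether to round up is a choice the paper does not spell out.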

Nomenclature

$a_1$ = mean of characteristic data which indicates components that can be used continuously
$a_2$ = mean of characteristic data which indicates components that must be repaired or replaced
$a'$ = critical value of characteristic data which indicates components that fail
$H_1$ = null hypothesis
$H_2$ = alternative hypothesis corresponding to $H_1$
$n$ = sample size
$n'$ = critical no. of defective components for which all such components should be replaced
$n_p$ = sample size of previous inspection
$R_{n1}$ = region corresponding to $H_1$
$R_{n2}$ = region corresponding to $H_2$
$s$ = standard deviation of characteristic data
$\alpha_1$ = probability that error of Type I occurs
$\alpha_2$ = probability that error of Type II occurs
$\gamma, \gamma_1$ = confidence levels
$\mu$ = expectation of population
$\sigma$ = standard deviation of population ($\sigma^2$ is the variance)
$\Phi(\cdot)$ = standard normal distribution function
$\psi(\cdot)$ = reverse (inverse) function of $\Phi(\cdot)$



Because we know the values of $\sigma$, $\mu_1$, $\mu_2$, $\alpha_1$, and $\alpha_2$, through use of the normal distribution table, we can readily obtain the value of $n$. Let $\sigma = 2.5$, $\mu_1 = 16.5$, $\mu_2 = 20$. Substituting these values as well as the values of $\alpha_1$ and $\alpha_2$ we have assigned into Eq. (11), we obtain

$$n = \frac{2.5^2}{(20 - 16.5)^2} (1.648 + 2.33)^2 \approx 8$$

This is the minimum number of trials needed to establish sample characteristics under the condition that the probabilities $\alpha_1$ and $\alpha_2$ are 0.05 and 0.01, respectively.

IV Determination of the Distribution and Its Numerical Characteristics

From the foregoing example, we can see that it is possible to employ Theorem 1, which was presented in Section II, to determine the sample size of an inspection. To do so, one condition should be satisfied. This is that regardless of our ignorance of the distribution of the inspection data, as long as its mean $\mu$ and standard deviation $\sigma$ are finite, the sum of inspection data from $n$ observations tends toward a normal distribution. The mean and deviation can then be determined from the inspection data with acceptable accuracy. As a matter of fact, the central limit theorem gives us solid support on this point (Gnedenko, 1992; Martz and Waller, 1982).

Theorem 2: If $x_1, x_2, \ldots, x_n$ are independent and identically distributed random variables such that $E(x_k) = \mu$ and $D(x_k) = \sigma^2 < \infty$, $k = 1, 2, \ldots, n$, then for every fixed $y$, as $n$ tends to $\infty$

$$P\left\{\frac{\sum_{k=1}^{n} x_k - n\mu}{\sigma\sqrt{n}} < y\right\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-z^2/2}\, dz \tag{12}$$

This theorem implies that for any distribution having $\mu < \infty$ and $\sigma < \infty$, the distribution of the sum of the observation values $x_1, x_2, \ldots, x_n$, which are independent and identically distributed, tends to a normal distribution. Thus in engineering practice, many characteristic data, such as strength, stress, and safety factor, can be considered to have a normal distribution (ASME, 1993). Of course, it would be better to know the exact distribution of the observation values. Thus, for most problems in engineering, it is possible to use Theorem 1 to determine the sample size. But, it should be noted that besides the distribution form, the numerical characteristics $\mu$ and $\sigma$ must also be estimated. Fortunately, there is another theorem, Khintchine's theorem (Gnedenko, 1992), which can supply us with a method for the estimation of expectation $\mu$ and standard deviation $\sigma$.

Theorem 3: If $\xi_1, \xi_2, \ldots$ are identically distributed independent random variables which have finite expectations ($E(\xi_n) = \mu$), then as $n \to \infty$, for any $\varepsilon > 0$

$$P\left\{\left|\frac{1}{n} \sum_{k=1}^{n} \xi_k - \mu\right| < \varepsilon\right\} \to 1 \tag{13}$$

By virtue of this theorem, the mean $a_0$ of a set of observation values $x_1, x_2, \ldots, x_n$ can be calculated as

$$a_0 = \frac{1}{n} \sum_{k=1}^{n} x_k \tag{14}$$

Because $\sum (x_k - a_0) = 0$, $\sum (x_k - a_0)^2$ may be expressed as a sum of the squares of $(n - 1)$ stochastically independent normally distributed variables (Hald, 1952). According to the definition of standard deviation and by use of Theorem 3, again, it can be calculated that

$$s^2 = \frac{\sum_{k=1}^{n} (x_k - a_0)^2}{n - 1} \tag{15}$$

According to Theorem 3, the mean $a_0$ and deviation $s$ can be used as substitutions for the expectation $\mu$ and standard deviation $\sigma$. In practice, when a particular kind of component is inspected for the first time, the mean $a_1$, which indicates the components can be used continuously, the mean $a_2$, which indicates the components must be repaired or replaced, and the deviation $s$ can be estimated by the use of design data. In other cases, the values of $a_1$, $a_2$, and $s$ can be estimated by use of the data from a previous inspection. Here, provided that $a_1 < a_2$, in view of the degradation of components, the value of $a_1$ will increase progressively unless one or more of the degraded components are repaired or replaced. When $a_1$ is being calculated, the inspection data of repaired or replaced components should be excluded and replaced by the data of the new components. Then $a_2$ can be calculated as follows:

$$a_2 = \frac{n'}{n_p}\, a' + \frac{n_p - n'}{n_p}\, a_1 \tag{16}$$

Here, $a'$ is the critical observation value of the components when failure occurs, $n_p$ the previous sample size, and $n'$ the critical number of defective components at which all components of this kind should be replaced because the cost of maintenance and inspection will be very high. If we take the value of $a_1 = \mu_1$, $n_p = 15$, and $n' = 10$ in the example in Section III, then $a'$ should be 21.75.

Substituting Eq. (16) into (11), we have

$$n = \left(\frac{n_p}{n'}\right)^2 \frac{s^2}{(a' - a_1)^2} \left[\psi(1 - \alpha_1) - \psi(\alpha_2)\right]^2 \tag{17}$$

It is noted that in (17) deviation $s$ is used as a substitution for standard deviation $\sigma$ and mean $a_1$ for expectation $\mu_1$. Therefore, some errors are introduced in the equation. Because of the existence of these errors, the result obtained from the example in Section III must be modified.

First, let us analyze the error inherent in item $(a' - a_1)^2$. The mean $a_1$, once standardized, has a t-distribution. If $\mu_1$ is the expectation, the random variable $t$ is defined by

$$t = \frac{a_1 - \mu_1}{s / \sqrt{n_p}} \tag{18}$$

The probability of $|t| < t_\gamma(n_p)$ is expressed as

$$P\left\{\frac{|a_1 - \mu_1|}{s / \sqrt{n_p}} < t_\gamma(n_p)\right\} = \gamma \tag{19}$$

If the value of the confidence level $\gamma$ has been assigned and $n_p$ is known, the value of $t_\gamma(n_p)$ can then be found using a t-distribution table. Thus

$$|a_1 - \mu_1| < t_\gamma(n_p) \frac{s}{\sqrt{n_p}} = b$$

i.e.,

$$a_1 - b < \mu_1 < a_1 + b$$

To avoid $n$ being infinite and to guarantee that $(a' - \mu_1)$ is positive, according to the assumption that $a' > \mu_1$, $a_1$ must be less than $(a' - b)$. The minimum value of $(a' - \mu_1)$ would



be $(a' - a_1 - b)$. If we take the values in the foregoing example and assign $\gamma = 0.95$, it can be found that $t_\gamma(n_p) = 1.753$. Then

$$b = 1.753 \times \frac{2.5}{\sqrt{15}} = 1.13$$

and

$$(a' - a_1 - b) = (21.75 - 16.5 - 1.13) = 4.12$$

Next, let us estimate the error resulting from the substitution of the deviation $s$ for the standard deviation $\sigma$. As stated in Zacks (1992), the scaled sample variance has a chi-square distribution. Namely

$$\frac{s^2}{\sigma^2} \sim \frac{\chi^2(n - 1)}{n - 1}$$

This yields

$$P\left\{\frac{s^2}{\sigma^2} \ge \frac{\chi^2_{\gamma_1}(n - 1)}{n - 1}\right\} = \gamma_1 \tag{20}$$

or, we can calculate

$$P\left\{\frac{s^2}{\sigma^2} < \frac{\chi^2_{\gamma_1}(n_p - 1)}{n_p - 1}\right\} = 1 - \gamma_1$$

When the upper bound of $\sigma^2$ is used, Eq. (17) should be further modified as follows:

$$n \le \left(\frac{n_p}{n'}\right)^2 \frac{s^2 (n_p - 1)}{\chi^2_{\gamma_1}(n_p - 1)\,(a' - a_1 - b)^2} \left[\psi(1 - \alpha_1) - \psi(\alpha_2)\right]^2 \tag{21}$$

where

$$b = t_\gamma(n_p) \frac{s}{\sqrt{n_p}}$$

In order that the final result for the sample size $n$ is sufficiently precise, $(1 - \gamma_1)$ should be less than 0.1. If we take $(1 - \gamma_1) = 0.05$ and $n_p = 15$, we can find, from the chi-square distribution table, $\chi^2 = 6.57$. Then, the value of the variance in the example given in Section III can be estimated as

$$\sigma^2 \le \frac{s^2 (n_p - 1)}{\chi^2_{\gamma_1}(n_p - 1)} = \frac{2.5^2 \times 14}{6.57} = 13.3$$

Taking into account the aforementioned errors, the final result of the example in Section III should be modified to be as follows:

$$n = \left(\frac{15}{10}\right)^2 \frac{13.3}{4.12^2} (1.648 + 2.33)^2 = 28$$

This value is much greater than that obtained in Section III. It is thus seen that the errors resulting from the substitution of the mean and the deviation for the expectation and the standard deviation, respectively, cannot be neglected.

V Summary

As shown in this paper, the theorem of Neyman and E. Pearson can supply us with a method to estimate the sample size for an inspection. The method, as demonstrated in Sections III and IV, can be conveniently applied to situations in which the characteristic data of components have either normal or logarithmic normal distributions. In engineering practice, most of the characteristic data of components can be considered as being independent and identically distributed. Therefore, in accordance with the central limit theorem, these conditions will allow us to use the normal or the logarithmic normal distributions as asymptotic expressions of the distributions of these data, even though in some situations, other distributions may represent a better way to express the data. This fact makes it easy to estimate the sample size by use of the theorem presented in Section II.

To summarize, the estimation of the sample size can be carried out by use of the following procedures:

1 Determine the mean $a_1$ by use of the data resulting from the previous inspection or from the design stage (when preparing for the first inspection).
2 Determine the mean $a_2$ by use of Eq. (16).
3 Determine the deviation by use of the data resulting from the previous inspection or from the design stage (when preparing for the first inspection).
4 Choose $\alpha_1$ and $\alpha_2$, the probabilities of false rejects and defect escape, respectively.
5 Making use of the method presented in Sections III and IV, determine the necessary minimum sample size.

It is noted from Eq. (11) that the minimum sample size $n$ is inversely proportional to the square of the difference between $a_2$ and $a_1$. If the degradation of components develops on the average, the value of $(a_2 - a_1)$ will, in general, be decreased progressively. The cost of inspection will then become more expensive because of the increase in sample size $n$. In this case, a maximum value $n^*$, which is acceptable as far as cost is concerned, may be determined by virtue of the decision analysis method (ASME, 1991, 1993) or the method of using the utility function (Keeney and Raiffa, 1976). When the value of $n$ resulting from Eq. (11) is greater than $n^*$, it is not necessary to carry out any inspection for this kind of component and the replacement of all components may be more economical. On this point, the approach proposed in this paper may be used not only to estimate the necessary minimum sample size $n$, but also to decide when it is more appropriate to replace all of the components rather than perform an inspection.

References

ASME, 1991, Risk-Based Inspection - Development of Guideline, New York, NY.
ASME, 1993, The Use of Decision Analytic Reliability Methods in Codes and Standards Work, New York, NY.
Gnedenko, B. V., 1992, The Theory of Probability, Chelsea Publishing Company.
Hald, A., 1952, Statistical Theory With Engineering Applications, John Wiley & Sons Inc., New York, NY.
Keeney, R. L., and Raiffa, H., 1976, Decisions With Multiple Objectives: Preferences and Value Tradeoffs, John Wiley & Sons Inc., New York, NY.
Martz, H. F., and Waller, R. A., 1982, Bayesian Reliability Analysis, John Wiley & Sons Inc., New York, NY.
Zacks, S., 1992, Introduction to Reliability Analysis - Probability Models and Statistical Methods, Springer-Verlag.
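The five-step procedure of Section V, including the corrections of Section IV, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: Eq. (16) is read as $a_2 = (n'/n_p)\,a' + (1 - n'/n_p)\,a_1$ and Eq. (21) as $n = (n_p/n')^2\, s^2 (n_p - 1) / [\chi^2 (a' - a_1 - b)^2]\, [\psi(1 - \alpha_1) - \psi(\alpha_2)]^2$, a reading chosen because it reproduces the paper's $a_2 = 20$ and $n = 28$. The function name `corrected_sample_size` and its parameter names are hypothetical, and the t and chi-square table values are passed in as plain numbers (1.753 and 6.57, as quoted in the text) because the Python standard library has no quantile functions for those distributions.

```python
from math import sqrt
from statistics import NormalDist


def corrected_sample_size(a1, a_crit, s, n_p, n_crit,
                          alpha1, alpha2, t_gamma, chi2_gamma1):
    """Sketch of Eqs. (16) and (21) as read above (an assumption, see lead-in).

    a1, a_crit       : mean of serviceable data and critical value a'
    s                : sample standard deviation
    n_p, n_crit      : previous sample size n_p and critical count n'
    t_gamma          : t-table value t_gamma(n_p), supplied by the caller
    chi2_gamma1      : chi-square table value for n_p - 1 degrees of freedom
    """
    psi = NormalDist().inv_cdf  # inverse standard normal distribution function
    z = psi(1.0 - alpha1) - psi(alpha2)

    # Eq. (16): mean of data indicating components to repair or replace.
    a2 = (n_crit / n_p) * a_crit + (1.0 - n_crit / n_p) * a1

    # Half-width b of the confidence interval for mu_1.
    b = t_gamma * s / sqrt(n_p)

    # Upper bound of the variance from the chi-square relation.
    var_upper = s ** 2 * (n_p - 1) / chi2_gamma1

    gap = a_crit - a1 - b
    if gap <= 0:
        raise ValueError("a1 must be less than a' - b for a finite sample size")

    # Eq. (21): sample size with both corrections applied.
    n = (n_p / n_crit) ** 2 * var_upper / gap ** 2 * z ** 2
    return a2, b, var_upper, n


# Values quoted in Sections III and IV.
a2, b, var_upper, n = corrected_sample_size(
    a1=16.5, a_crit=21.75, s=2.5, n_p=15, n_crit=10,
    alpha1=0.05, alpha2=0.01, t_gamma=1.753, chi2_gamma1=6.57)
print(f"a2 = {a2:.2f}, b = {b:.2f}, variance bound = {var_upper:.1f}, n = {n:.1f}")
```

With exact normal quantiles the sample size comes out slightly below the paper's 28, which was obtained with rounded table values; comparing the result against a cost-based ceiling $n^*$ is left to the caller.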
