
Designation: E 456 – 96

Standard Terminology Relating to Quality and Statistics¹
This standard is issued under the fixed designation E 456; the number immediately following the designation indicates the year of
original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A
superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope
1.1 This terminology includes those quality and statistical terms in wide use in ASTM for which standard definitions appear desirable.

2. Referenced Documents
2.1 ASTM Standards:
E 177 Practice for the Use of the Terms Precision and Bias in ASTM Test Methods²
E 1325 Terminology Relating to Design of Experiments²
E 1402 Terminology Relating to Sampling²

3. Significance and Use
3.1 This terminology is the general terminology standard for terms defined by Committee E-11.
3.2 Citation is made to other E-11 standards which contain more extensive information regarding the particular term and its usage. These references may be to other practices and guides or to more specific terminology standards, such as Terminology E 1325.

4. Terminology

acceptance (control chart or acceptance control chart usage), n—a decision that the process is operating in a satisfactory manner with respect to the statistical measures being plotted: action limits: control limits.

accepted reference value, n—a value that serves as an agreed-upon reference for comparison, and which is derived as: (1) a theoretical or established value, based on scientific principles, (2) an assigned or certified value, based on experimental work of some national or international organization, or (3) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group.

accuracy, n—the closeness of agreement between a test result and an accepted reference value.
NOTE 1—The term accuracy, when applied to a set of test results, involves a combination of a random component and of a common systematic error or bias component.

aliases, n—in a fractional factorial design, two or more effects which are estimated by the same contrast and which, therefore, cannot be estimated separately. E 1325

assignable cause, n—a factor that contributes to variation, and which is feasible to detect and identify.
NOTE 2—Many factors will contribute to variation but it may not be feasible (economically or otherwise) to identify some of them.

attribute data, n—observed values or determinations which indicate the presence or absence of specific characteristics.
DISCUSSION—Items or units of material may be evaluated by counting or measurement. Attributes are counted whereas variables are measured. Attribute distributions are discrete. See variables data.

attributes, method of, n—measurement of quality by the method of attributes consists of noting the presence (or absence) of some characteristic or attribute in each of the units in the group under consideration, and counting how many units do (or do not) possess the quality attribute, or how many such events occur in the unit, group, or area.

average run length (ARL)—(1) sample sense, n—the average number of times that a process will have been sampled and evaluated before a shift in process level is signaled, and (2) unit sense, n—the average number of units that will have been produced before a shift in level is signaled.
DISCUSSION—A long ARL is desirable for a process located at its specified level (so as to minimize calling for unneeded investigation or corrective action) and a short ARL is desirable for a process shifted to some undesirable level (so that corrective action will be called for promptly). ARL curves are used to describe the relative quickness in detecting level shifts of various control chart systems.

balanced incomplete block design (BIB), n—an incomplete block design in which each block contains the same number k of different versions from the t versions of a single principal factor arranged so that every pair of versions occurs together in the same number, $\lambda$, of blocks from the b blocks. E 1325

batch, n—a definite quantity of some product or material produced under conditions that are considered uniform.
NOTE 3—A batch is usually smaller than a lot.

bias, n—the difference between the expectation of the test results and an accepted reference value.

¹ This terminology is under the jurisdiction of ASTM Committee E-11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.60 on Terminology.
Current edition approved June 10, 1996. Published September 1996. Originally published as E 456 – 72. Last previous edition E 456 – 92.
² Annual Book of ASTM Standards, Vol 14.02.
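The definition of bias above can be illustrated with a short sketch. The expectation of the test results is a population quantity, so this example assumes it is estimated by the sample mean of a set of test results; the function name and the numbers are hypothetical and are not part of the standard.

```python
from statistics import mean


def estimate_bias(test_results, accepted_reference_value):
    """Estimate bias as (mean of test results) - (accepted reference value).

    Assumption: the sample mean stands in for the expectation of the
    test results, which the definition refers to.
    """
    return mean(test_results) - accepted_reference_value


# Hypothetical test results on a material whose accepted reference value is 10.0
results = [10.2, 10.4, 10.3, 10.5]
print(estimate_bias(results, 10.0))
```

A positive value means the results run high relative to the accepted reference value; the spread of the individual results around their mean is a separate matter (precision).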

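The two senses of average run length (ARL) defined in this section can be sketched numerically. The sketch assumes that samples are independent and that each sample has the same probability of producing a signal, in which case the number of samples until the first signal is geometric with mean 1/p. That assumption, the fixed number of units produced per sampling interval, and both function names are illustrative and do not come from the standard.

```python
def average_run_length(p_signal):
    """ARL in the sample sense, assuming independent samples that each
    signal with the same probability p_signal (geometric waiting time)."""
    if not 0.0 < p_signal <= 1.0:
        raise ValueError("p_signal must be in (0, 1]")
    return 1.0 / p_signal


def arl_unit_sense(p_signal, units_per_sample):
    """ARL in the unit sense, assuming a fixed number of units is
    produced between successive samples (hypothetical parameter)."""
    return average_run_length(p_signal) * units_per_sample


# Roughly 0.0027 is the chance that a normally distributed point falls
# outside 3-sigma limits when the process is at its specified level.
print(average_run_length(0.0027))
print(arl_unit_sense(0.0027, units_per_sample=50))
```

Under these assumptions a small signal probability gives the long ARL that is desirable for a process at its specified level, and a large one gives the short ARL that is desirable after a shift.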
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

E 456
NOTE 4—Bias is the total systematic error as contrasted to random error. There may be one or more systematic error components contributing to the bias. A larger systematic difference from the accepted reference value is reflected by a larger bias value.

characteristic, n—a property of items in a sample or population which, when measured, counted or otherwise observed, helps to distinguish between the items.

cluster sampling, n—when the primary sampling unit comprises a bundle of elementary units or a group of subunits, the term cluster sampling may be applied.
DISCUSSION—Examples of cluster sampling are: selection of city blocks as primary sampling units; selection of a household as a cluster of people (of which only one may be interviewed); selection of bundles of rods or pipe from a shipment; and selection, from a shipment, of cartons that contain boxes or packages within them.

completely randomized design, n—a design in which the treatments are assigned at random to the full set of experimental units. E 1325

completely randomized factorial design, n—a factorial experiment (including all replications) run in a completely randomized design. E 1325

component of variance, n—a part of a total variance identified with a specified source of variability.

composite design, n—a design developed specifically for fitting second order response surfaces to study curvature, constructed by adding further selected treatments to those obtained from a $2^n$ factorial (or its fraction). E 1325

confounded factorial design, n—a factorial experiment in which only a fraction of the treatment combinations are run in each block and where the selection of the treatment combinations assigned to each block is arranged so that one or more prescribed effects is(are) confounded with the block effect(s), while the other effects remain free from confounding.
NOTE 5—All factor level combinations are included in the experiment. E 1325

confounding, n—combining indistinguishably the main effect of a factor or a differential effect between factors (interactions) with the effect of other factor(s), block factor(s) or interaction(s).
NOTE 6—Confounding is a useful technique that permits the effective use of specified blocks in some experiment designs. This is accomplished by deliberately preselecting certain effects or differential effects as being of little interest, and arranging the design so that they are confounded with block effects or other preselected principal factor or differential effects, while keeping the other more important effects free from such complications. Sometimes, however, confounding results from inadvertent changes to a design during the running of an experiment or from incomplete planning of the design, and it serves to diminish, or even to invalidate, the effectiveness of an experiment. E 1325

contrast, n—a linear function of the observations for which the sum of the coefficients is zero.
NOTE 7—With observations $Y_1, Y_2, \ldots, Y_n$, the linear function $a_1 Y_1 + a_2 Y_2 + \cdots + a_n Y_n$ is a contrast if, and only if, $\sum a_i = 0$, where the $a_i$ values are called the contrast coefficients. E 1325

contrast analysis, n—a technique for estimating the parameters of a model and making hypothesis tests on preselected linear combinations of the treatments (contrasts).
NOTE 8—Contrast analysis involves a systematic tabulation and analysis format usable for both simple and complex designs. When any set of orthogonal contrasts is used, the procedure, as in the example, is straightforward. When terms are not orthogonal, the orthogonalization process to adjust for the common element in nonorthogonal contrast is also systematic and can be programmed. E 1325

control—(evaluation), n—an evaluation to check, test, or verify; (authority): the act of guiding, directing, or managing; (stability): a state of process in which the variability is attributable to a constant system of chance causes.

control chart factor, n—a factor, usually varying with sample size, to convert specified statistics or parameters into a central line value or control limit appropriate to the control chart.

control chart method, n—the method of using control charts to determine whether or not processes are in a stable state.

control limits, n—limits on a control chart which are used as criteria for signaling the need for action, or for judging whether a set of data does or does not indicate a state of statistical control.
DISCUSSION—When warning limits are used, the control limits are often called “action limits.” Action may be in the form of investigation of the source(s) of an “assignable cause”, making a process adjustment, or terminating a process. Criteria other than control limits are also used frequently.

conventional true value of a quantity, n—value attributed to a particular quantity and accepted, sometimes by convention, as having an uncertainty appropriate for a given purpose.
NOTE 9—“Conventional true value” is sometimes called “assigned value”, “best value”, “conventional value”, or “reference value”. “Reference value”, in this sense, should not be confused with “reference value” in the sense of an influence quantity affecting a measuring instrument.
NOTE 10—Frequently, a number of results of measurements of a quantity is used to establish a conventional true value.

dependent variable, n—See response variable.

design of experiments, n—the arrangement in which an experimental program is to be conducted, and the selection of the levels (versions) of one or more factors or factor combinations to be included in the experiment. Synonyms include experiment design and experimental design. E 1325

deviation, n—the difference between a measurement or quasi-measurement and its stated value or intended level.
DISCUSSION—Deviation should be stated as a difference in terms of the appropriate data units. Sometimes these units will be original measurement units; sometimes they will be quasi-measurements; that is, a scaled rating of subjective judgments; sometimes they will be designated values representing all continuous or discrete measurements falling in defined cells or classes.

error of result, n—the test result minus the accepted reference value (of the characteristic).
NOTE 11—It is not possible to correct for random error.

experimental design, n—see design of experiments. E 1325

experiment space, n—the materials, equipment, environmental conditions and so forth that are available for conducting

an experiment. E 1325

experimental unit, n—a portion of the experiment space to which a treatment is applied or assigned in the experiment.
NOTE 12—The unit may be a patient in a hospital, a group of animals, a production batch, a section of a compartmented tray, etc. E 1325

evolutionary operation (EVOP), n—a sequential form of experimentation conducted in production facilities during regular production.
NOTE 13—The principal theses of EVOP are that knowledge to improve the process should be obtained along with a product, and that designed experiments using relatively small shifts in factor levels (within production tolerances) can yield this knowledge at minimum cost. The range of variation of the factors for any one EVOP experiment is usually quite small in order to avoid making out of tolerance products, which may require considerable replication, in order to be able to clearly detect the effect of small changes. E 1325

factorial experiment (general), n—in general, an experiment in which all possible treatments formed from two or more factors, each being studied at two or more levels (versions) are examined so that interactions (differential effects) as well as main effects can be estimated. E 1325

$2^n$ factorial experiment, n—a factorial experiment in which n factors are studied, each of them in two levels (versions). E 1325

fractional factorial design, n—a factorial experiment in which only an adequately chosen fraction of the treatments required for the complete factorial experiment is selected to be run.
NOTE 14—This procedure is sometimes called fractional replication.

frame, n—a list, compiled for sampling purposes, which designates the items (units) of a population or universe to be considered in a study.
DISCUSSION—When a frame is available, sampling schemes can be devised for selection of the units directly (one-stage), or in two or more stages. In multi-stage sampling, a frame is needed for each stage. As an example, the cartons of a lot could be the first-stage units, packages within the carton could be second-stage units, and items within the packages could be the third-stage units.

fully nested experiment, n—a nested experiment in which the second factor is nested within levels (versions) of the first factor and each succeeding factor is nested within versions of the previous factor. E 1325

hierarchical experiment, n—see nested experiment.

incomplete block design, n—a design in which the experiment space is subdivided into blocks in which there are insufficient experimental units available to run a complete set of treatments or replicate of the experiment. E 1325

intermediate precisions, n—the closeness of agreement between test results obtained under specified intermediate precision conditions.
NOTE 15—The specific measure and the specific conditions must be specified for each intermediate measure of precision; thus, “standard deviation of test results among operators in a laboratory,” or “day-to-day standard deviation within a laboratory for the same operator.”
NOTE 16—Because the training of operators, the agreement of different pieces of equipment in the same laboratory and the variation of environmental conditions with longer time intervals all depend on the degree of within-laboratory control, the intermediate measures of precision are likely to vary appreciably from laboratory to laboratory. Thus, intermediate precisions may be more characteristic of individual laboratories than of the test method.

intermediate precision conditions, n—conditions under which test results are obtained with the same test method using test units or test specimens (see Practice E 691,² 10.3) taken at random from a single quantity of material that is as nearly homogeneous as possible, and with changing conditions such as operator, measuring equipment, location within the laboratory, and time.

item, n—(1) an object or quantity of material on which a set of observations can be made: (2) an observed value or test result obtained from an object or quantity of material.
DISCUSSION—The second usage in the definition is generally limited to generic descriptions such as in the definition of “population.” Terms such as “observation,” “measurement,” “test result,” “unit,” “value” or “yield” are more common in specific applications. A set as used here may be one or more variables.

level (of a factor), n—a given value, a specification of procedure or a specific setting of a factor.
NOTE 17—“Version” is a general term applied both to quantitative and qualitative factors. The more restrictive term “level” is frequently used to express more precisely the quantitative characteristic. For example, two versions of a catalyst may be presence and absence. Four levels of a heat treatment may be 100°C, 120°C, 140°C, and 160°C. E 1325

lot—a definite quantity of a product or material accumulated under conditions that are considered uniform for sampling purposes.

lower control limit (LCL), n—control limit for points below the central line.

lower tolerance limit (LTL) (lower specification limit), n—a tolerance limit that defines the lower conformance boundary for an individual unit of a manufacturing or service operation.

main effect, average effect, n—a term describing a measure for the comparison of the responses at each level (version) of a factor averaged over all levels (versions) of other factors in the experiment.
NOTE 18—The term “main effect” may describe the parameter in an assumed model or the estimate of this parameter. E 1325

mixture design, n—a design in which two or more ingredients or components shall be mixed and the response is a property of the resulting mixture that does not depend upon the amount of the mixture.
NOTE 19—The proportions of each of the q components ($X_i$) in the mixture shall satisfy the conditions $0 \le X_i \le 1$ and $\sum_{i=1}^{q} X_i = 1$; and each experimental point is defined in terms of these proportions.
NOTE 20—In some fields of application the experimental mixtures are described by the terms “formulation” or “blend.” The use of mixture designs is appropriate for experimenting with the formulations of manufactured products, such as paints, gasoline, foods, rubber, and textiles.
NOTE 21—In some applications, the proportions of the components of the mixture may vary between 0 and 100 % of the mixture (“complete domain”). In others, there may be operative restraints, so that at least one component cannot attain 0 or 100 % (“reduced domain”). E 1325
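The conditions on mixture proportions in NOTE 19 (each proportion between 0 and 1, and all proportions summing to 1) can be checked mechanically. The following is a minimal sketch; the function name and the floating-point tolerance are assumptions made for illustration only.

```python
import math


def is_valid_mixture(proportions, tol=1e-9):
    """Return True if every proportion lies in [0, 1] and they sum to 1.

    `tol` is an assumed tolerance for floating-point rounding; it is not
    part of the definition.
    """
    in_range = all(-tol <= x <= 1.0 + tol for x in proportions)
    sums_to_one = math.isclose(sum(proportions), 1.0, abs_tol=tol)
    return in_range and sums_to_one


print(is_valid_mixture([0.2, 0.3, 0.5]))    # a three-component blend
print(is_valid_mixture([0.6, 0.6, -0.2]))   # sums to 1 but has a negative proportion
```

Because the proportions must sum to 1, only q − 1 of them can be chosen freely, which is why an experimental point is fully described by its proportions rather than by absolute amounts.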

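The definitions of factorial experiment and 2^n factorial experiment given earlier in this section describe running every combination of factor levels. As a rough illustration, the sketch below enumerates those treatments; the factor names and the −1/+1 level coding are hypothetical choices, not something the standard prescribes.

```python
from itertools import product


def full_factorial(factors):
    """List every treatment (one level per factor) of a full factorial.

    `factors` maps a factor name to its list of levels (versions).
    """
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# A 2^3 factorial: three hypothetical factors, each at two coded levels
design = full_factorial({"A": [-1, 1], "B": [-1, 1], "C": [-1, 1]})
print(len(design))
for treatment in design:
    print(treatment)
```

A fractional factorial design would run only a chosen subset of this list, which is what gives rise to aliases.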
method of least squares, n—a technique of estimation of a parameter which minimizes $\sum e^2$, where e is the difference between the observed value and the predicted value derived from the assumed model. E 1325

natural process limits (NPL), n—limits which include a stated fraction of the individuals in a population.
NOTE 22—Natural process limits will not ordinarily be the dimensional limits shown on an engineering drawing. They are mostly used to compare the natural capability of the process to tolerance limits.
DISCUSSION—For populations with a normal (Gaussian) distribution, the natural process limits ordinarily will be at $\pm 3\sigma$. If placed around the standard level, these limits identify the boundaries which will include approximately 99.7 % of the individuals in a process that is properly centered and in a state of statistical control. In many circumstances (several machines making the same product that serially feed into the process) it is recognized that in addition to the variability around a single level, an acceptable zone of “standard” levels (for the different machines) is required. Then the NPL may be placed around the Acceptable Process Levels (APL) that define this zone so that the NPL identify the boundaries within which at least 99.7 % of the individuals will be included in a process located at the APL, or inside the zone. It should be noted that there is no assumption made that the process levels within the zone are random variables.

nested experiment, n—an experiment to examine the effect of two or more factors in which the same level (version) of a factor cannot be used with all levels (versions) of other factors. Synonym: hierarchical experiment. E 1325

observation, n—(1) the process of obtaining information regarding the presence or absence of an attribute of a test specimen, or of making a reading on a characteristic or dimension of a test specimen, or (2) the attribute or measurement information obtained from the process. (The term “observed value” is preferred for this second usage.)
NOTE 23—See Annex A1.

observed value, n—the value obtained by carrying out the complete protocol of the test method once, being either a single test determination or an average or other specified combination of a specified number of test determinations.
NOTE 24—See Annex A1.

orthogonal array, n—a table of coefficients identifying the levels, or some weight associated with the levels, for each factor to be used in the analysis of specified effects, which are arranged in such a manner that each effect will be independent of the other effects. E 1325

orthogonal contrasts, n—two contrasts are orthogonal if the contrast coefficients of the two sets satisfy the condition that, when multiplied in corresponding pairs, the sum of the products is equal to zero. See contrast and contrast analysis. E 1325

partially balanced incomplete block design (PBIB), n—an incomplete block design in which each block contains the same number, k, of different versions from the t versions of the principal factor.
NOTE 25—The arrangement is such that not all pairs of versions occur together in the same number of the blocks; some versions can therefore be compared with greater precision than others. E 1325

partially nested experiment, n—a nested experiment in which several factors may be crossed as in factorial experiments and other factors nested within the crossed combinations.
NOTE 26—It is not unusual to find that experiments consist of both factorial and nested segments. See nested experiment. E 1325

Plackett-Burman designs, n—a set of screening designs using orthogonal arrays that permit evaluation of the linear effects of up to n = t − 1 factors in a study of t treatment combinations. E 1325

population, n—the totality of items or units of material under consideration.
DISCUSSION—The word “items” may be interpreted in the sense of measurements, or possible measurements, for a single characteristic, or occasionally for multiple characteristics, on all items or units of material being considered. The word “totality” may refer to items not available for inclusion in samples as well as those which are available.

precision, n—the closeness of agreement between independent test results obtained under stipulated conditions.
NOTE 27—Precision depends on random errors and does not relate to the true value or the specified value.
NOTE 28—The measure of precision usually is expressed in terms of imprecision and computed as a standard deviation of the test results. Less precision is reflected by a larger standard deviation.
NOTE 29—“Independent test results” means results obtained in a manner not influenced by any previous result on the same or similar test object. Quantitative measures of precision depend critically on the stipulated conditions. Repeatability and reproducibility conditions are particular sets of extreme stipulated conditions.

probability sample, n—a sample of which the sampling units have been selected by a chance process such that, at each step of selection, a specified probability of selection can be attached to each sampling unit available for selection.
NOTE 30—These probabilities of selection need not be equal. If equal, see simple random sample. See the general term—sample. Also, see Practice E 1052 in this volume.

random error of result, n—a component of the error which, in the course of a number of test results for the same characteristic, varies in an unpredictable way.

randomization, n—the procedure used to allot treatments at random to the experimental units so as to provide a high degree of independence in the contributions of experimental error to estimates of treatment effects.
NOTE 31—An essential element in the design of experiments is to provide estimates of effects free from biases due to undetected assignable causes within the experimental space. Randomization is a process to minimize this risk. The operational procedure for assignment “at random” involves the use of random numbers or some similar method for assuring that each unit has an equal chance of being selected for each treatment. E 1325

randomized block design, n—a design in which the experiment space is subdivided into blocks of experimental units, the units within each block being more homogeneous than units in different blocks.
NOTE 32—In each block the treatments are allocated randomly to the experimental units within each block. Replication is obtained by the use of two or more blocks, depending on the precision desired, and a separate randomization is made in each block. E 1325

randomized block factorial design, n—a factorial experiment

run in a randomized block design in which each block includes a complete set of factorial combinations. E 1325

repeatability, n—precision under repeatability conditions.
NOTE 33—Repeatability is one of the concepts or categories of the precision of a test method.
NOTE 34—Measures of repeatability defined in this compilation are repeatability standard deviation and repeatability limit.

repeatability conditions, n—conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time.
NOTE 35—See precision Note 3.
DISCUSSION—The “same operator, same equipment” requirement means that for a particular step in the measurement process, the same combination of operator and equipment is used for every test result. Thus, one operator may prepare the test specimens, a second measure the dimensions and a third measure the mass in a test method for determining density.
DISCUSSION—By “in the shortest practical period of time” is meant that the test results, at least for one material, are obtained in a time period not less than in normal testing and not so long as to permit significant change in test material, equipment or environment.

repeatability limit (r), n—the value below which the absolute difference between two individual test results obtained under repeatability conditions may be expected to occur with a probability of approximately 0.95 (95 %).
NOTE 36—The repeatability limit is 2.8 ($\approx 1.96\sqrt{2}$) times the repeatability standard deviation. This multiplier is independent of the size of the interlaboratory study, as explained in Practice E 177.²
NOTE 37—The approximation to 0.95 is reasonably good (say 0.90 to 0.98) when many laboratories (30 or more) are involved, but is likely to be poor when fewer than eight laboratories are studied.

repeatability standard deviation, n—the standard deviation of test results obtained under repeatability conditions.
NOTE 38—It is a measure of the dispersion of the distribution of test results under repeatability conditions.
NOTE 39—Similarly, “repeatability variance” and “repeatability coefficient of variation” could be defined and used as measures of the dispersion of test results under repeatability conditions.
DISCUSSION—In an interlaboratory study, this is the pooled standard deviation of test results obtained under repeatability conditions. See Practice E 691.
DISCUSSION—The repeatability standard deviation, usually considered a property of the test method, will generally be smaller than the within-laboratory standard deviation. (See within-laboratory standard deviation.)

reproducibility, n—precision under reproducibility conditions.

reproducibility conditions, n—conditions where test results are obtained with the same method on identical test items in different laboratories with different operators using different equipment.
DISCUSSION—Identical material means either the same test units or test specimens are tested by all the laboratories as for a nondestructive test or test units or test specimens are taken at random from a single quantity of material that is as nearly homogeneous as possible. (See Practice E 691.)
DISCUSSION—A different laboratory of necessity means a different operator, different equipment, and different location and under different supervisory control.

reproducibility limit (R), n—the value below which the absolute difference between two test results obtained under reproducibility conditions may be expected to occur with a probability of approximately 0.95 (95 %).
NOTE 40—The reproducibility limit is 2.8 ($\approx 1.96\sqrt{2}$) times the reproducibility standard deviation. The multiplier is independent of the size of the interlaboratory study (that is, of the number of laboratories participating), as explained in Practice E 177.²
NOTE 41—The approximation to 0.95 is reasonably good (say 0.90 to 0.98) when many laboratories (30 or more) are involved but is likely to be poor when fewer than eight laboratories are studied.

reproducibility standard deviation ($s_R$), n—the standard deviation of test results obtained under reproducibility conditions.
NOTE 42—Other measures of the dispersion of test results obtained under reproducibility conditions are the “reproducibility variance” and the “reproducibility coefficient of variation.”
NOTE 43—The reproducibility standard deviation includes, in addition to between-laboratory variability, the repeatability standard deviation and a contribution from the interaction of laboratory factors (that is, differences between operators, equipment and environments) with material factors (that is, the differences between properties of the materials other than that property of interest).

residual error, n—the difference between the observed result and the predicted value (estimated treatment response); Observed Result minus Predicted Value. E 1325

response surface, n—the pattern of predicted responses based on the empirical model derived from the experiment observations. E 1325

response variable, n—the variable that shows the observed results of an experimental treatment. Synonym dependent variable. E 1325

robustness, n—insensitivity of a statistical test to departures from underlying assumptions.
DISCUSSION—Many statistical test procedures depend on the form of the assumed distribution of the population sampled to obtain exact values for the probability statements. If departures from the assumed distribution do not materially affect the decisions which would be based on the statistical tests involved, the test is considered “robust.” For example, tests based on an assumption of normality that compare averages generally are robust even though the underlying distribution of individual items in the population is not normal. On the other hand, the F-statistic for comparing variances may be an indicator of lack of normality rather than a simple variance comparison.

ruggedness, n—insensitivity of a test method to departures from specified test or environmental conditions.
DISCUSSION—An evaluation of the “ruggedness” of a test method or an empirical model derived from an experiment is useful in determining whether the results or decisions will be relatively invariant over some range of environmental variability under which the test method or the model is likely to be applied.

ruggedness test, n—a planned experiment in which environmental factors or test conditions are deliberately varied in order to evaluate the effects of such variation.
DISCUSSION—Since there usually are many environmental factors that might be considered in a ruggedness test, it is customary to use a “screening” type of experiment design (see screening design) which concentrates on examining many first order effects and generally assumes that second order effects such as interactions and curvature are

relatively negligible. Often in evaluating the ruggedness of a test specification limits, n—see tolerance limits.
method, if there is an indication that the results of a test method are staggered nested experiment, n—a nested experiment in
highly dependent on the levels of the environmental factors, there is a which the nested factors are run within only a subset of the
sufficient indication that certain levels of environmental factors must be
included in the specifications for the test method, or even that the test
versions of the first or succeeding factors. E 1325
method itself will need further revision. standard deviation, n—the most usual measure of the disper-
sion of observed values or results expressed as the positive
run, n—(1) an uninterrupted sequence of occurrences of the square root of the variance.
same attribute or event in a series of observations, and (2) a statistic, n—a quantity calculated from a sample of observa-
consecutive set of successively increasing run-up or succes- tions, most often to form an estimate of some population
sively decreasing run-down values in a series of variable parameter.
measurements. statistical measure, n—statistic or mathematical function of a
DISCUSSION—In control chart applications, some variable measure- statistic.
ments are treated as attributes in determining runs. For example, a run
DISCUSSION—The word statistical emphasizes that measures are
might be considered a series of a specified number of consecutive
subject to inherent errors and that, in estimating a population parameter,
points above or below the central line.3
they represent a sample, with inherent sampling variability.
sample, n—a group of items, observations, test results, or subgroup, n—(1) object sense, n—a set of units or quantity of
portions of material, taken from a large collection of items, material obtained by subdividing a larger group of units or
observations, test results, or quantities of material, which quantity of material, and (2) measurement sense, n—a set
serves to provide information that may be used as a basis for of groups of observations obtained by subdividing a larger
making a decision concerning the larger collection. group of observations. See rational subgroup.
DISCUSSION—The sample may be the units of material themselves or systematic error of result, n—a component of the error,
the set of the observations collected from them. The decision may or which in the course of a number of test results for the same
may not involve taking action on the units of material, or on the characteristic, remains constant or varies in a predictable
process. It is necessary to describe whether the sample is to be selected way.
on a simple random, a stratified random, or other specified basis.
Probability samples, that is, samples selected by chance using appro- NOTE 46—Systematic errors and their causes may be known or un-
priate randomization, are required to make confidence interval state- known.
ments and similar statistical inferences about the parameters of the
sampled population. systematic sampling, n—sample selection procedure in which
every kth element is selected from the universe or popula-
sample size, n—the number of units in a sample or the number tion; for example, u, u + k, u + 2k, u + 3k, etc., where u is in
of observations in a sample. the interval 1 to k.
sampling fraction, f, n—the ratio f of the number of sampling
units selected for the sample to the number of sampling units DISCUSSION—If k = 20 and u = 7 is the initial unit selected, then
sampling units 7, 27, 47, 67, ..., would comprise the sample. When N/k
available.
is not an integer, there is a small bias due to the end effect. When u is
NOTE 44—For the simple random sample case, f = n/N where n is the selected by a chance process and N/k is an integer, the systematic
sample size and N is the number of sampling units available. When f > sample will provide unbiased estimates of the population average or
0.10 estimation of the precision of an estimator should take account of this total. Situations for which N/k is not an integer usually ignore the small
magnitude of f. See finite population correction. or negligible bias in estimating the mean or total. Schemes have been
developed for non-integer N/k to overcome sampling bias. See Jessen.4
sampling with replacement, n—a procedure used with some Estimation of the precision of an average computed from a
probability sampling plans in which a selected unit is systematic sample is a difficult problem that has no generally satisfac-
replaced after any step in selection so that this sampling unit tory solution. Independent replicate systematic samples provide an
is available for selection again at the next step of selection, approach to variance estimation, but have been rejected by some
or at any other succeeding step of the sample selection writers. In some ASTM situations where replicate samples may be
obtained on a routine basis, the technique may be useful. See Cochran5
procedure. for an extended discussion of variance estimation for systematic
screening design, n—a balanced design, requiring relatively sampling.
minimal amount of experimentation, to evaluate the lower
order effects of a relatively large number of factors in terms test determination, n—(1) the process of deriving from one or
of contributions to variability or in terms of estimates of more test observations (observed values) the presence or
parameters for a model. absence of an attribute or the value of a characteristic or
dimension of a single test specimen, or (2) the attribute
NOTE 45—In screening designs, the term lower order effects is some- (presence or absence) or value derived from the process (see
times limited to first order terms such as linear components of main
test specimen).
effects, but often includes both first order terms and second order terms
such as two factor interactions and quadratic curvature components of NOTE 47—See Annex A1.
main effects. E 1325
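The systematic sampling and sampling fraction definitions can be illustrated with a minimal Python sketch. The function names and the population size N = 100 are assumptions made for this example only; they are not part of this terminology. The values k = 20 and u = 7 follow the systematic sampling discussion.

```python
def systematic_sample(N, k, u):
    """Return the unit numbers u, u + k, u + 2k, ... that do not exceed N.

    Units are assumed to be numbered 1..N, and u must lie in the interval 1 to k.
    """
    if not 1 <= u <= k:
        raise ValueError("u must lie in the interval 1 to k")
    return list(range(u, N + 1, k))


def sampling_fraction(n, N):
    """Sampling fraction f = n/N for n selected units out of N available units."""
    return n / N


# Hypothetical population of N = 100 sampling units, with k = 20 and u = 7.
sample = systematic_sample(100, 20, 7)
f = sampling_fraction(len(sample), 100)
print(sample)  # [7, 27, 47, 67, 87]
print(f)       # 0.05
```

In practice u would be chosen by a chance process, for example with `random.randint(1, k)`, rather than fixed in advance as it is here.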
3 Other examples may be found in references such as Nelson, L. S., “Interpreting Shewhart X̄ Control Charts,” Journal of Quality Technology, Vol 17, No. 2, April 1985.
4 Jessen, R. J., “Statistical Survey Techniques,” John Wiley & Sons, Inc., New York, 1978, Sec. 12.2.
5 Cochran, W. G., “Sampling Techniques,” John Wiley & Sons, Inc., New York, 1977, Chapter 8.
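The control chart example given under the term run (consecutive points above or below the central line) might be sketched as follows. This is only an illustration: the function name, the data, and the choice to let a point exactly on the central line break a run are assumptions, not rules taken from this terminology.

```python
from itertools import groupby


def longest_run(values, central_line):
    """Length of the longest run of consecutive points on one side of the central line.

    Assumption of this sketch: a point exactly on the central line breaks a run.
    """
    # Classify each point as +1 (above), -1 (below), or 0 (on the line).
    sides = [(v > central_line) - (v < central_line) for v in values]
    longest = 0
    for side, group in groupby(sides):
        if side != 0:
            longest = max(longest, len(list(group)))
    return longest


# Hypothetical measurements plotted around a central line of 10.0.
points = [10.2, 10.5, 10.1, 9.8, 9.9, 10.3, 10.4, 10.6, 10.7, 9.7]
print(longest_run(points, 10.0))  # 4
```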

test observation, n—see observation.

test result, n—the value of a characteristic obtained by carrying out a specified test method.
NOTE 48—The test method should specify that one or a number of individual observations be made and their average or another appropriate function, such as the median or the standard deviation, be reported as the test result. It also may require standard corrections to be applied, such as correction of gas volumes to standard temperature and pressure. A test result, therefore, can be a result calculated from several observed values. In the simple case, the test result is the observed value itself.

test specimen, n—the portion of a test unit needed to obtain a single test determination.
NOTE 49—When used for a physical test, this is sometimes called “test piece.” For a chemical test, it is sometimes called test portion or test sample. For optical and other tests, it is also sometimes called test sample. In interlaboratory evaluation of test methods and other statistical procedures, it is best to reserve the word sample for the whole amount of material involved and not the individual test specimens, pieces or portions being tested.
NOTE 50—See Annex A1.

test unit, n—the total quantity of material (containing one or more test specimens) needed to obtain a test result as specified in the test method. See test result.

tolerance limits (specification limits), n—limits that define the conformance boundaries for an individual unit of a manufacturing or service operation.
DISCUSSION—Limits may be established either with or without the use of probability considerations. Tolerance limits may be in the form of a single (unilateral) limit (upper or lower) or double (bilateral) limits (upper and lower). Double, or two-sided, limits occur more frequently. Double limits are often stated as a symmetrical deviation from a stated value, but they need not be symmetrical. Frequently the term specification limits is used instead of tolerance limits. While tolerance limits is generally preferred in terms of evaluating the manufacturing or service requirements, specification limits may be more appropriate for categorizing material, product, or service in terms of their stated requirements.

tolerance specification, n—the total allowable variation around a level or state (upper limit minus lower limit), or the maximum acceptable excursion of a characteristic.
DISCUSSION—The determination of the amount of variation to be allowed involves the product or service requirements and consideration of process capability (see natural process limits), measurement variability, and other appropriate elements or some compromise among these.

treatment, n—a combination of the levels (versions) of each of the factors assigned to an experimental unit, synonym treatment combination.

treatment combination, n—see treatment.

trueness, n—the closeness of agreement between the population mean of the measurements or test results and the accepted reference value.
NOTE 51—The measure of trueness usually is expressed in terms of bias. Greater bias means less favorable trueness.
NOTE 52—“Population mean” is, conceptually, the average value of an indefinitely large number of test results.
NOTE 53—Trueness is the systematic component of accuracy.

uncertainty, n—an indication of the variability associated with a measured value that takes into account two major components of error: (1) bias, and (2) the random error attributed to the imprecision of the measurement process.
DISCUSSION—Quantitative measures of uncertainty generally require descriptive statements of explanation because of differing traditions of usage and because of differing circumstances. For example: (1) the bias and imprecision may both be negligible; (2) the bias may not be negligible while the imprecision is negligible; (3) neither the bias nor the imprecision may be negligible; (4) the bias may be negligible while the imprecision is not negligible.

unit, n—an object on which a measurement or observation may be made.
DISCUSSION—The word “unit” is commonly used in the sense of a unit of product (service, etc.)—the entity of product inspected in order to determine its classification or its measurements. This entity may be a single article, a set of like articles treated collectively, a subassembly, a stated quantity of material, etc. The unit of product or service need not be the same as the unit of purchase, supply, production, or shipment.

universe (population), n—the totality of the set of items, units, or measurements, etc., real or conceptual, that is under consideration.
NOTE 54—This definition of universe is being revised to incorporate the concept of including one or more populations. Use with caution.

upper control limit (UCL), n—control limit for points above the central line.

upper tolerance limit (UTL) (upper specification limit), n—a tolerance limit applicable to the upper conformance boundary for an individual unit of a manufacturing or service operation.

variables, method of, n—measurement of quality by the method of variables consists of measuring and recording the numerical magnitude of a quality characteristic for each of the units in the group under consideration.
NOTE 55—This involves reference to a continuous scale of some kind.

variables data, n—measurements which vary and may take any of a specified set of numerical values.
DISCUSSION—The term “random variable” or “variate” is often used to indicate that each of the specified set of values is associated with a specified relative frequency or probability, and that each is a random sample from a continuous or a discrete, or discontinuous, population encompassing the specified values.

variance, n—a measure of the squared dispersion of observed values or measurements expressed as a function of the sum of the squared deviations from the population mean or sample average.
NOTE 56—The sample variance, or variance of a sample of n observed values, is computed as $s^2 = [1/(n - 1)] \sum_{i=1}^{n} (y_i - \bar{y})^2$. The sample standard deviation s is the positive square root of the sample variance. The population variance is $\sigma^2 = \int_R (y - \mu)^2 f(y)\,dy$, where R is the region over which the random variable y is defined, and where f(y) is the probability density function and µ is the population mean of y. The population standard deviation ($\sigma$) is the positive square root of the population variance.
DISCUSSION—A listing of the sample variance $s^2$ should always be accompanied by the degrees of freedom on which it is based. The degrees of freedom for the sample variance described above are (n − 1).

within-laboratory standard deviation, n—the standard deviation of test results obtained within a laboratory for a single material under conditions that may include such elements as different operators, equipment, and longer time intervals.
NOTE 57—Because the training of operators, the agreement of different pieces of equipment in the same laboratory and the variation of environmental conditions with longer time intervals depend on the degree of within-laboratory control, the within-laboratory standard deviation is likely to vary appreciably from laboratory to laboratory.

Youden square, n—a type of block design derived from certain Latin squares by deleting, or adding, rows (or columns) so that one block factor remains complete blocks and the second block factor constitutes balanced incomplete blocks. E 1325
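As a worked illustration of the sample variance and sample standard deviation described under variance, the following Python sketch computes both and reports the degrees of freedom alongside the variance. The function name and the data are hypothetical and are used only for this example.

```python
import math
import statistics


def sample_variance(values):
    """Return (s2, degrees_of_freedom) for a sample of observed values.

    s2 is the sum of squared deviations from the sample average divided by n - 1.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observed values are needed")
    average = sum(values) / n
    s2 = sum((y - average) ** 2 for y in values) / (n - 1)
    return s2, n - 1


# Hypothetical observed values.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2, dof = sample_variance(y)
s = math.sqrt(s2)  # sample standard deviation: positive square root of s2
print(s2, dof, s)

# Cross-check against the Python standard library, which also divides by n - 1.
assert math.isclose(s2, statistics.variance(y))
```

Returning the degrees of freedom together with the variance is a design choice of this sketch, so that the two are always reported together.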

ANNEX

(Mandatory Information)

A1. MEASUREMENT TERMINOLOGY

A1.1 A test method often has three distinct stages: (1) the direct observation of dimensions or characteristics, (2) the combining of the observed values to obtain a single test determination, and (3) the combining of a number of test determinations to obtain the test result of the test method. The term measurement may be applied to any one or more of these stages of the measurement process.

A1.2 In the simplest of test methods a single direct observation is also the test determination and the test result. For example, a test observation required by a test method may be the mass of a test specimen prepared and weighed in a specified way. The observation would also be the test determination of the mass of the test specimen, and if only one specimen is to be weighed, the observed weight would also be the test result of the test method. Another test method may require the measurement of the area of the test specimen as well as the mass, and then direct that the mass be divided by the area to obtain the mass per unit area of the test specimen. The whole process of measuring the mass and the area and calculating the mass per unit area is a test determination. If the test method specifies that only one test determination is to be made, then the test determination value is the test result of the test method. Some test methods require that several determinations be made and the values obtained be averaged or otherwise combined to obtain the test result of the test method. Averaging of several determinations is often used to reduce the effect of local variations of the property within the material.

A1.3 Precision statements for ASTM test methods are usually based on test results, not test determinations or observations. If for some compelling reason an ASTM committee wished to address the issue of variation between test determinations (in addition to the variation among test results), the committee can do so with a clear declaration (of what is being done) to avoid confusion. Sampling plans and product specifications should specify the sample size in terms of the number of replicate test results. A test method should specify the required observations to obtain a test determination and the number of test determinations to be averaged or otherwise combined to obtain a single test result.
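The three stages of A1.1, applied to the mass per unit area example of A1.2, might be sketched in Python as follows. The function names, the units, and the observed values are assumptions made for illustration; a real test method would specify its own observations and how they are combined.

```python
from statistics import mean


def mass_per_unit_area(mass_g, area_m2):
    """Stage 2: combine two observed values into one test determination (g/m^2)."""
    return mass_g / area_m2


def combine_determinations(determinations):
    """Stage 3: average several test determinations into a single test result."""
    return mean(determinations)


# Stage 1: hypothetical direct observations (mass in g, area in m^2) on three specimens.
observations = [(30.0, 0.25), (31.0, 0.25), (29.0, 0.25)]

determinations = [mass_per_unit_area(m, a) for m, a in observations]
result = combine_determinations(determinations)
print(determinations)  # [120.0, 124.0, 116.0]
print(result)          # 120.0
```

If the test method called for only one determination, the single value returned by `mass_per_unit_area` would itself be the test result.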

ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.

This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.

This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or service@astm.org (e-mail); or through the ASTM website
(www.astm.org).
