
The fundamentals of Statistical Process Control (though that was not what it was called at the time)

and the associated tool of the Control Chart were developed by Dr Walter A Shewhart in the mid-1920s. His reasoning and approach were practical, sensible and positive. To keep them so, he deliberately avoided overdoing mathematical detail. In later years, significant mathematical attributes were assigned to Shewhart's thinking, with the result that that later work became better known than the pioneering application that Shewhart himself had worked up.

The crucial difference between Shewhart's work and the inappropriately perceived purpose of SPC that later emerged, which typically involved mathematical distortion and tampering, is that his developments were made in the context, and with the purpose, of process improvement, as opposed to mere process monitoring. That is, they could be described as helping to get the process into that "satisfactory state" which one might then be content to monitor. Note, however, that a true adherent to Deming's principles would probably never reach that situation, following instead the philosophy and aim of continuous improvement.

Explanation and Illustration:

What do “in control” and “out of control” mean?

Suppose that we are recording, regularly over time, some measurements from a process. The
measurements might be lengths of steel rods after a cutting operation, or the lengths of time to
service some machine, or your weight as measured on the bathroom scales each morning, or the
percentage of defective (or non-conforming) items in batches from a supplier, or measurements of
Intelligence Quotient, or times between sending out invoices and receiving the payment, and so on.

A series of line graphs or histograms can be drawn to represent the data as a statistical distribution. It is a picture of the behaviour of the variation in the measurement that is being recorded. If a process is deemed "stable", it is said to be in statistical control. The point is that, if an outside influence impacts upon the process (e.g. a machine setting is altered, or you go on a diet), then the data are in effect no longer all coming from the same source, and it follows that no single distribution could possibly serve to represent them. If the distribution changes unpredictably over time, the process is said to be out of control. As a scientist, Shewhart knew that there is always variation in anything that can be measured. The variation may be large, or it may be imperceptibly small, or it may be between these two extremes; but it is always there.

What inspired Shewhart’s development of the statistical control of processes was his observation that
the variability which he saw in manufacturing processes often differed in behaviour from that which he
saw in so-called “natural” processes – by which he seems to have meant such phenomena as
molecular motions.
Wheeler and Chambers combine and summarise these two important aspects as follows:

• "While every process displays variation, some processes


display controlled variation, while others display uncontrolled
variation."

In particular, Shewhart often found controlled (stable) variation in natural processes and uncontrolled (unstable) variation in manufacturing processes. The difference is clear. In the former case, we know what to expect in terms of variability; in the latter we do not. We may predict the future, with some chance of success, in the former case; we cannot do so in the latter.

Why is "in control" and "out of control" important?

Shewhart gave us a technical tool to help identify the two types of variation: the control chart (see Control Charts, the annex to this topic).
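As a rough illustration, not taken from the BDA text, the short Python sketch below computes the conventional limits for an individuals control chart from a run of invented measurements and flags any point falling outside them. The data, and the use of the mean moving range with the standard 2.66 factor, are illustrative assumptions only.

```python
# Minimal sketch of an individuals (X) control chart calculation.
# The sample data are invented; 2.66 is the standard factor for
# limits based on moving ranges of two consecutive points.

measurements = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 12.6]

mean = sum(measurements) / len(measurements)

# The average moving range between consecutive points estimates short-term variation.
moving_ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
avg_moving_range = sum(moving_ranges) / len(moving_ranges)

# Conventional individuals-chart limits: mean +/- 2.66 * average moving range.
upper_limit = mean + 2.66 * avg_moving_range
lower_limit = mean - 2.66 * avg_moving_range

print(f"centre line = {mean:.2f}, limits = ({lower_limit:.2f}, {upper_limit:.2f})")
for i, x in enumerate(measurements, start=1):
    if x > upper_limit or x < lower_limit:
        print(f"point {i} ({x}) is outside the limits: look for a special cause")
```

A point flagged in this way is a signal to look for a special cause; points inside the limits are treated as common-cause variation and are not reacted to individually.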

What is important is the understanding of why correct identification of the two types of variation is so
vital. There are at least three prime reasons.

First, when there are irregular large deviations in output because of unexplained special causes, it is
impossible to evaluate the effects of changes in design, training, purchasing policy etc. which might be
made to the system by management. The capability of a process is unknown, whilst the process is out
of statistical control.

Second, when special causes have been eliminated, so that only common causes remain,
improvement then has to depend upon management action. For such variation is due to the way that
the processes and systems have been designed and built – and only management has authority and
responsibility to work on systems and processes. As Myron Tribus, Director of the American Quality
and Productivity Institute, has often said:

• "The people work in a system.

• The job of the manager is

○ to work on the system,

○ to improve it, continuously,

• with their help."

Finally, something of great importance, but which is bound to be unknown to managers who do not have this understanding of variation, is that by (in effect) misinterpreting either type of cause as the other, and acting accordingly, they not only fail to improve matters – they literally make things worse.

These implications, and consequently the whole concept of the statistical control of processes, had a profound and lasting impact on Dr Deming. Many aspects of his management philosophy emanate from considerations based on just these notions.

So why SPC?

The plain fact is that when a process is within statistical control, its output is indiscernible from
random variation: the kind of variation which one gets from tossing coins, throwing dice, or shuffling
cards. Whether or not the process is in control, the numbers will go up, the numbers will go down;
indeed, occasionally we shall get a number that is the highest or the lowest for some time. Of course
we shall: how could it be otherwise? The question is - do these individual occurrences mean anything
important? When the process is out of control, the answer will sometimes be yes. When the process is
in control, the answer is no.

So the main response to the question "Why SPC?" is this: it guides us to the type of action that is appropriate for trying to improve the functioning of a process. Should we react to individual results from the process (which is only sensible if such a result is signalled by a control chart as being due to a special cause), or should we instead be going for change to the process itself, guided by cumulated evidence from its output (which is only sensible if the process is in control)?

Process improvement needs to be carried out in three chronological phases:

• Phase 1: Stabilisation of the process by the identification and elimination of special causes;

• Phase 2: Active improvement efforts on the process itself, i.e. tackling common causes;

• Phase 3: Monitoring the process to ensure the improvements are maintained, and incorporating additional improvements as the opportunity arises.

Control charts have an important part to play in each of these three Phases. Points beyond control
limits (plus other agreed signals) indicate when special causes should be searched for. The control
chart is therefore the prime diagnostic tool in Phase 1. All sorts of statistical tools can aid Phase 2,
including Pareto Analysis, Ishikawa Diagrams, flow-charts of various kinds, etc., and recalculated
control limits will indicate what kind of success (particularly in terms of reduced variation) has been
achieved. The control chart will also, as always, show when any further special causes should be
attended to. Advocates of the British/European approach will consider themselves familiar with the use
of the control chart in Phase 3. However, it is strongly recommended that they consider the use of a
Japanese Control Chart (q.v.) in order to see how much more can be done even in this Phase than is
normal practice in this part of the world.
References:

• WHY SPC? – BRITISH DEMING ASSOCIATION

Related topics

• Control Charts

• Pareto Analysis

• Ishikawa Diagrams

The "levels of measurement", or scales of measure are expressions that typically refer to the theory of scale types developed
by the psychologist Stanley Smith Stevens. Stevens proposed his theory in a 1946 Science article titled "On the theory of scales
of measurement"[1]. In that article, Stevens claimed that all measurement in science was conducted using four different types of
scales that he called "nominal", "ordinal", "interval" and "ratio".

Contents
• 1 The theory of scale types
○ 1.1 Nominal scale
○ 1.2 Ordinal scale
○ 1.3 Interval scale
○ 1.4 Ratio measurement
• 2 Debate on classification scheme
• 3 Scale types and Stevens' "operational theory of
measurement"
• 4 Notes
• 5 See also
• 6 References
• 7 External links

The theory of scale types

Stevens (1946, 1951) proposed that measurements can be classified into four different types of scales. These are summarised below: nominal, ordinal, interval, and ratio.

• Nominal (also denoted as categorical)
○ Permissible statistics: mode, chi square
○ Admissible scale transformation: one to one (equality (=))
○ Mathematical structure: standard set structure (unordered)

• Ordinal
○ Permissible statistics: median, percentile
○ Admissible scale transformation: monotonic increasing (order (<))
○ Mathematical structure: totally ordered set

• Interval
○ Permissible statistics: mean, standard deviation, correlation, regression, analysis of variance
○ Admissible scale transformation: positive linear (affine)
○ Mathematical structure: affine line

• Ratio
○ Permissible statistics: all statistics permitted for interval scales plus the following: geometric mean, harmonic mean, coefficient of variation, logarithms
○ Admissible scale transformation: positive similarities (multiplication)
○ Mathematical structure: field

Nominal scale

At the nominal scale, i.e., for a nominal category, one uses labels; for example, rocks can be generally categorized as igneous,
sedimentary and metamorphic. For this scale, some valid operations are equivalence and set membership. Nominal measures
offer names or labels for certain characteristics.

Variables assessed on a nominal scale are called categorical variables; see also categorical data.

Stevens (1946, p. 679) must have known that claiming nominal scales to measure obviously non-quantitative things would have attracted criticism, so he invoked his theory of measurement to justify nominal scales as measurement.

The central tendency of a nominal attribute is given by its mode; neither the mean nor the median can be defined.

We can use a simple example of a nominal category: first names. Looking at nearby people, we might find one or more of them named Aamir. Aamir is their label; and the set of all first names is a nominal scale. We can only check whether two people have the same name (equivalence) or whether a given name is on a certain list of names (set membership), but it is impossible to say which name is greater or less than another (comparison) or to measure the difference between two names. Given a set of people, we can describe the set by its most common name (the mode), but cannot provide an "average name" or even the "middle name" among all the names. However, if we decide to sort our names alphabetically (or to sort them by length, or by how many times they appear in the US Census), we will begin to turn this nominal scale into an ordinal scale.
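As an informal sketch (the names are invented, and this is not from Stevens), the following Python fragment shows the only operations that make sense on a nominal variable: equality checks, set membership, and the mode.

```python
from collections import Counter

# Hypothetical first names observed in a group (nominal data: labels only).
names = ["Aamir", "Li", "Aamir", "Sofia", "Aamir", "Li"]

# Valid nominal operations: equality and set membership.
print(names[0] == names[2])     # True: these two people share a name
print("Sofia" in set(names))    # True: the label appears in the group

# The only meaningful measure of central tendency is the mode.
mode_name, count = Counter(names).most_common(1)[0]
print(f"most common name: {mode_name} (appears {count} times)")

# Ordering or averaging the labels would have no meaning on a nominal scale.
```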

Ordinal scale

Rank-ordering data simply puts the data on an ordinal scale. Ordinal measurements describe order, but not relative size or degree of difference between the items measured. In this scale type, the numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd, etc.) of the entities assessed. A scale may also use names with an order such as: "bad", "medium", and "good"; or "very satisfied", "satisfied", "neutral", "unsatisfied", "very unsatisfied". An example of an ordinal scale is the result of a horse race, which says only which horses arrived first, second, or third but includes no information about race times. Another is the Mohs scale of mineral hardness, which characterizes the hardness of various minerals through the ability of a harder material to scratch a softer one, saying nothing about the actual hardness of any of them.

When using an ordinal scale, the central tendency of a group of items can be described by using the group's mode (or most
common item) or its median (the middle-ranked item), but the mean (or average) cannot be defined.
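A small sketch with invented survey responses may help: on an ordinal scale the mode and the median are defined through the agreed ordering of the categories, while an arithmetic mean of the labels is not.

```python
from collections import Counter

# The agreed rank order of the categories, and some hypothetical responses.
order = ["very unsatisfied", "unsatisfied", "neutral", "satisfied", "very satisfied"]
responses = ["satisfied", "neutral", "satisfied", "very satisfied", "unsatisfied",
             "satisfied", "neutral"]

# Mode: the most frequent category.
mode = Counter(responses).most_common(1)[0][0]

# Median: sort the responses by their rank and take the middle one.
ranked = sorted(responses, key=order.index)
median = ranked[len(ranked) // 2]

print(f"mode = {mode}, median = {median}")
# An arithmetic mean of these labels is undefined: the distances between
# adjacent categories are not known to be equal.
```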

In a critique of psychometrics, Stevens argued that many of the measurements used by the field only measure relative order, not
comparative magnitude, of variables such as intelligence:

As a matter of fact, most of the scales used widely and effectively by psychologists are ordinal scales. In the strictest propriety
the ordinary statistics involving means and standard deviations ought not to be used with these scales, for these statistics imply a
knowledge of something more than the relative rank order of data (1946, p.679).

Psychometricians like to theorise that psychometric tests produce interval scale measures of cognitive abilities (e.g. Lord &
Novick, 1968; von Eye, 2005) but there is little prima facie evidence to suggest that such attributes are anything more than
ordinal for most psychological data (Cliff, 1996; Cliff & Keats, 2003; Michell, 2008). In particular, although some psychologists
say otherwise,[2] IQ scores reflect an ordinal scale, in which all scores are only meaningful for comparison, rather than an interval
scale, in which a given number of IQ "points" corresponds to a unit of intelligence. [3][4][5] Thus it is an error to write that an IQ of
160 is just as different from an IQ of 130 as an IQ of 100 is different from an IQ of 70.[6][7]
In mathematical order theory, an ordinal scale defines a total preorder of objects (in essence, a way of sorting all the objects, in
which some may be tied). The scale values themselves (such as labels like "great", "good", and "bad"; 1st, 2nd, and 3rd) have a
total order, where they may be sorted into a single line with no ambiguities. If numbers are used to define the scale, they remain
correct even if they are transformed by any monotonically increasing function. This property is known as the order isomorphism.
A simple example: suppose a judge gives ordinal scores to five people's cooking.

Since x − 8, 3x, and x³ are all monotonically increasing functions, replacing the judge's ordinal scores by any of these alternative scores does not affect the relative ranking of the five people's cooking abilities. Each such column of transformed numbers is an equally legitimate ordinal scale for describing their abilities. However, the numerical (additive) differences between the various ordinal scores have no particular meaning.
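The original table of judges' scores is not reproduced here, so the following sketch uses invented scores for five cooks to show the point: applying any of the monotonically increasing functions x − 8, 3x, or x³ leaves the rank order, and hence the ordinal information, unchanged.

```python
# Hypothetical ordinal scores given by a judge to five cooks.
scores = {"Ann": 9, "Bob": 5, "Cara": 7, "Dev": 3, "Eli": 6}

def ranking(values):
    """Return names sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)

# Three monotonically increasing transformations of the scores.
transforms = {
    "x - 8": lambda x: x - 8,
    "3x":    lambda x: 3 * x,
    "x**3":  lambda x: x ** 3,
}

original_order = ranking(scores)
print("original:", original_order)
for name, f in transforms.items():
    transformed = {person: f(value) for person, value in scores.items()}
    # The ranking is identical under every monotone transformation.
    print(f"{name}:", ranking(transformed), ranking(transformed) == original_order)
```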

See also Strict weak ordering.

Interval scale

Quantitative attributes are all measurable on interval scales, as any difference between the levels of an attribute can be multiplied
by any real number to exceed or equal another difference. A highly familiar example of interval scale measurement is
temperature with the Celsius scale. In this particular scale, the unit of measurement is 1/100 of the difference between the melting
temperature and the boiling temperature of water at atmospheric pressure. The "zero point" on an interval scale is arbitrary; and
negative values can be used. The formal mathematical term is an affine space (in this case an affine line). Variables measured at
the interval level are called "interval variables" or sometimes "scaled variables" as they have units of measurement.

Ratios between numbers on the scale are not meaningful, so operations such as multiplication and division cannot be carried out
directly. But ratios of differences can be expressed; for example, one difference can be twice another.

The central tendency of a variable measured at the interval level can be represented by its mode, its median, or its arithmetic mean. Statistical dispersion can be measured in most of the usual ways, which involve only differences or averaging, such as
range, interquartile range, and standard deviation. Since one cannot divide, one cannot define measures that require a ratio, such
as studentized range or coefficient of variation. More subtly, while one can define moments about the origin, only central
moments are useful, since the choice of origin is arbitrary and not meaningful. One can define standardized moments, since ratios
of differences are meaningful, but one cannot define coefficient of variation, since the mean is a moment about the origin, unlike
the standard deviation, which is (the square root of) a central moment.
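To make this concrete, the fragment below uses invented Celsius readings to show that ratios of differences survive the affine conversion to Fahrenheit, whereas ratios of the raw values do not.

```python
# Hypothetical temperatures in degrees Celsius (an interval scale).
a, b, c = 10.0, 20.0, 40.0

def to_fahrenheit(celsius):
    # Affine (positive linear) transformation between the two interval scales.
    return celsius * 9 / 5 + 32

# Ratio of differences: meaningful, and unchanged by the affine conversion.
diff_ratio_c = (c - b) / (b - a)
diff_ratio_f = (to_fahrenheit(c) - to_fahrenheit(b)) / (to_fahrenheit(b) - to_fahrenheit(a))
print(diff_ratio_c, diff_ratio_f)            # 2.0 and 2.0

# Ratio of the raw values: not meaningful, because it changes with the zero point.
print(c / a)                                 # 4.0 in Celsius
print(to_fahrenheit(c) / to_fahrenheit(a))   # about 2.08 in Fahrenheit
```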

Ratio measurement

Most measurement in the physical sciences and engineering is done on ratio scales. Mass, length, time, plane angle, energy and
electric charge are examples of physical measures that are ratio scales. The scale type takes its name from the fact that
measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind
(Michell, 1997, 1999). Informally, the distinguishing feature of a ratio scale is the possession of a non-arbitrary zero value. For
example, the Kelvin temperature scale has a non-arbitrary zero point of absolute zero, which is denoted 0 K and is equal to −273.15 degrees Celsius. This zero point is non-arbitrary, as the particles that compose matter at this temperature have zero kinetic energy.

Examples of ratio scale measurement in the behavioral sciences are all but non-existent. Luce (2000) argues that an example of
ratio scale measurement in psychology can be found in rank and sign dependent expected utility theory.

All statistical measures can be used for a variable measured at the ratio level, as all necessary mathematical operations are
defined. The central tendency of a variable measured at the ratio level can be represented by, in addition to its mode, its median,
or its arithmetic mean, also its geometric mean or harmonic mean. In addition to the measures of statistical dispersion defined for
interval variables, such as range and standard deviation, for ratio variables one can also define measures that require a ratio, such
as studentized range or coefficient of variation.
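As a brief illustration with invented mass readings (a ratio-scale quantity), the following computes statistics that presuppose a true zero, such as the geometric mean and the coefficient of variation; these would be meaningless for an interval-only variable like Celsius temperature.

```python
import math
import statistics

# Hypothetical masses in kilograms: a ratio-scale measurement (true zero exists).
masses = [2.0, 3.0, 4.5, 5.0, 6.0]

arithmetic_mean = statistics.mean(masses)
geometric_mean = math.exp(sum(math.log(m) for m in masses) / len(masses))
std_dev = statistics.stdev(masses)

# Coefficient of variation: dispersion relative to the mean, defined only
# because the zero point of the scale is not arbitrary.
cv = std_dev / arithmetic_mean

print(f"arithmetic mean = {arithmetic_mean:.2f} kg")
print(f"geometric mean  = {geometric_mean:.2f} kg")
print(f"coefficient of variation = {cv:.2%}")
```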

Debate on classification scheme

There has been, and continues to be, debate about the merits of the classifications, particularly in the cases of the nominal and
ordinal classifications (Michell, 1986). Thus, while Stevens' classification is widely adopted, it is by no means universally
accepted[8].
Duncan (1986) observed that Stevens' classification of nominal measurement is contrary to his own definition of measurement. Stevens (1975) said of his own definition of measurement that "the assignment can be any consistent rule. The only rule not allowed would be random assignment, for randomness amounts in effect to a nonrule". However, so-called nominal measurement involves arbitrary assignment, and the "permissible transformation" allows any number to be substituted for any other. This is one of the points made in Lord's (1953) satirical paper On the Statistical Treatment of Football Numbers.

Among those who accept the classification scheme, there is also some controversy in behavioural sciences over whether the mean
is meaningful for ordinal measurement. In terms of measurement theory, it is not, because the arithmetic operations are not made
on numbers that are measurements in units, and so the results of computations do not give numbers in units. However, many
behavioural scientists use means for ordinal data anyway. This is often justified on the basis that ordinal scales in behavioural
science are really somewhere between true ordinal and interval scales; although the interval difference between two ordinal ranks
is not constant, it is often of the same order of magnitude. For example, applications of measurement models in educational
contexts often indicate that total scores have a fairly linear relationship with measurements across the range of an assessment. Thus some argue that, so long as the unknown interval difference between ordinal scale ranks is not too variable, interval scale statistics such as means can meaningfully be used on ordinal scale variables. Statistical analysis software such as PSPP requires the user to select the appropriate measurement class for each variable, which helps to ensure that subsequent user errors do not inadvertently produce meaningless analyses (for example, correlation analysis with a variable on a nominal level).

Scale types and Stevens' "operational theory of measurement"

The theory of scale types is the intellectual handmaiden to Stevens' "operational theory of measurement", which was to become definitive within psychology and the behavioral sciences, despite being characterised by Michell as quite at odds with measurement as understood in the natural sciences (Michell, 1999). Essentially, the operational theory of measurement was a reaction to the conclusions of a committee established in 1932 by the British Association for the Advancement of Science to investigate the possibility of genuine scientific measurement in the psychological and behavioral sciences. This committee, which became known as the Ferguson committee, published a Final Report (Ferguson, et al., 1940, p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was an object of criticism.

That is, if Stevens' sone scale was genuinely measuring the intensity of auditory sensations, then evidence that such sensations are quantitative attributes had to be produced. The evidence needed was the presence of additive structure, a concept comprehensively treated by the German mathematician Otto Hölder (Hölder, 1901). Given that the physicist and measurement theorist Norman Robert Campbell dominated the Ferguson committee's deliberations, the committee concluded that measurement in the social sciences was impossible due to the lack of concatenation operations. This conclusion was later rendered false by the discovery of the theory of conjoint measurement by Debreu (1960) and independently by Luce & Tukey (1964). However, Stevens' reaction was not to conduct experiments to test for the presence of additive structure in sensations, but instead to render the conclusions of the Ferguson committee null and void by proposing a new theory of measurement.

The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant critic of Stevens' theory of scale types. But it was not until much later, with the work of the mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b) and R. Duncan Luce (1986, 1987, 2001), that the concept of scale types received the mathematical rigour it lacked at its inception, as Luce (1997, p. 395) bluntly observed.

Notes

1. ^ Stevens, S. S. (1946). "On the Theory of Scales of Measurement". Science 103 (2684): 677–
680. doi:10.1126/science.103.2684.677. PMID 17750512.
http://www.sciencemag.org/cgi/rapidpdf/103/2684/677.
2. ^ Sheskin, David J. (2007). Handbook of Parametric and Nonparametric Statistical Procedures
(Fourth ed.). Boca Raton (FL): Chapman & Hall/CRC. p. 3. ISBN 9781584888147. Lay summary
(27 July 2010). "Although in practice IQ and most other human characteristics measured by
psychological tests (such as anxiety, introversion, self esteem, etc.) are treated as interval
scales, many researchers would argue that they are more appropriately categorized as ordinal
scales. Such arguments would be based on the fact that such measures do not really meet the
requirements of an interval scale, because it cannot be demonstrated that equal numerical
differences at different points on the scale are comparable."
3. ^ Mussen, Paul Henry (1973). Psychology: An Introduction. Lexington (MA): Heath. p. 363.
ISBN 0-669-61383-7. "The I.Q. is essentially a rank; there are no true "units" of intellectual
ability."
4. ^ Truch, Steve (1993). The WISC-III Companion: A Guide to Interpretation and Educational
Intervention. Austin (TX): Pro-Ed. p. 35. ISBN 0890795851. "An IQ score is not an equal-interval
score, as is evident in Table A.4 in the WISC-III manual."
5. ^ Bartholomew, David J. (2004). Measuring Intelligence: Facts and Fallacies. Cambridge:
Cambridge University Press. p. 50. ISBN 9780521544788. Lay summary (27 July 2010). "When
we come to quantities like IQ or g, as we are presently able to measure them, we shall see
later that we have an even lower level of measurement—an ordinal level. This means that the
numbers we assign to individuals can only be used to rank them—the number tells us where
the individual comes in the rank order and nothing else."
6. ^ Eysenck, Hans (1998). Intelligence: A New Look. New Brunswick (NJ): Transaction Publishers.
pp. 24–25. ISBN 1-56000-360-X. "Ideally, a scale of measurement should have a true zero-point
and identical intervals. . . . Scales of hardness lack these advantages, and so does IQ. There is
no absolute zero, and a 10-point difference may carry different meanings at different points of
the scale."
7. ^ Mackintosh, N. J. (1998). IQ and Human Intelligence. Oxford: Oxford University Press. pp. 30–
31. ISBN 0-19-852367-X. "In the jargon of psychological measurement theory, IQ is an ordinal
scale, where we are simply rank-ordering people. . . . It is not even appropriate to claim that
the 10-point difference between IQ scores of 110 and 100 is the same as the 10-point
difference between IQs of 160 and 150"
8. ^ Velleman, Paul F.; Wilkinson, Leland (1993). "Nominal, Ordinal, Interval, and Ratio Typologies
Are Misleading". The American Statistician (American Statistical Association) 47 (1): 65–72.
doi:10.2307/2684788. http://www.jstor.org/stable/2684788.
9. ^ Chrisman, Nicholas R. (1998). Rethinking Levels of Measurement for Cartography.
Cartography and Geographic Information Science, vol. 25 (4), pp. 231-242

See also
• Measure (mathematics)
• Inter-rater reliability
• Cohen's kappa
• Category theory
• Quantitative data
• Qualitative data
• Ramsey–Lewis method

References
• Stevens, S.S. (1951). Mathematics, measurement and psychophysics. In S.S. Stevens (Ed.), Handbook of experimental psychology (pp. 1–49). New York: Wiley.
• Stevens, S.S. (1975). Psychophysics. New York: Wiley.
• von Eye, A. (2005). Review of Cliff and Keats, Ordinal measurement in the behavioral sciences.
Applied Psychological Measurement, 29, 401–403.
