control
The Histogram as a Measurement
of Process Consistency
[Figure 1 panels: Ordinary histogram; Cumulative histogram. x-axis: rnorm(1000); y-axis: Frequency]
Figure 1. Both histograms use the same data; the difference is in how the data are presented.
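The relationship between the two presentations in Figure 1 can be sketched in a few lines of Python. This is only an illustration: the seed, the unit-wide cells from -4 to 4, and the helper name histogram_counts are assumptions made for the example, and random.gauss stands in for the rnorm(1000) call named on the figure axis.

```python
import random
from itertools import accumulate

def histogram_counts(data, edges):
    """Count how many values fall in each cell [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break  # a value belongs to at most one cell
    return counts

random.seed(1)  # arbitrary seed, used only so the run is repeatable
data = [random.gauss(0.0, 1.0) for _ in range(1000)]  # 1000 standard normal draws

edges = [-4 + i for i in range(9)]       # assumed cell limits: -4, -3, ..., 4
ordinary = histogram_counts(data, edges)  # frequency per cell
cumulative = list(accumulate(ordinary))   # running total of the same counts

for lo, n, c in zip(edges, ordinary, cumulative):
    print(f"[{lo:+d}, {lo + 1:+d})  count={n:4d}  cumulative={c:4d}")
```

The cumulative column is just the running sum of the ordinary counts, so the two histograms carry the same information. Values outside the assumed range are not counted in this sketch.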
www.metalfinishing.com
SHAPE OR FORM OF A
DISTRIBUTION
The shape of a histogram provides
important information about the
data distribution. The histogram
may be highly or moderately skewed
to the left or right. A symmetrical
shape is also possible, although a
histogram is never perfectly symmetrical. If the histogram is skewed to
the left, or negatively skewed, the tail
extends further to the left.
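One way to put a number on the direction of skew is the moment coefficient of skewness, the third central moment divided by the 1.5 power of the second. The article does not name a particular formula, so the choice of this measure, the function name sample_skewness, and the small data sets below are assumptions made purely for illustration.

```python
def sample_skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical readings: most values are high, with one far-out low value.
left_tailed = [1, 6, 7, 8, 8, 9, 9, 9, 10, 10]
right_tailed = [11 - x for x in left_tailed]  # mirror image of the above
symmetric = [1, 2, 3, 4, 5]

for name, values in [("left-tailed", left_tailed),
                     ("right-tailed", right_tailed),
                     ("symmetric", symmetric)]:
    print(f"{name:12s} skewness = {sample_skewness(values):+.3f}")
```

A negative result corresponds to a tail extending to the left, a positive result to a tail extending to the right, and a value near zero to a roughly symmetrical shape.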
The mode of a distribution is the
value that occurs most frequently
or has the largest probability of occurrence. The sample mode
occurs at the peak of the histogram.
For many phenomena, it is quite
common for the distribution of the
response values to cluster around a
single mode (unimodal) and then
distribute themselves with lesser frequency out into the tails. The normal
distribution is the classic example of
a unimodal distribution.
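Given the cell counts of a histogram, the sample mode and the number of peaks can be read off mechanically. The sketch below is a rough illustration only: the cell counts are invented, the function names are hypothetical, and a peak is taken to mean a cell strictly taller than both neighbors, which will over-count peaks in noisy data and miss flat-topped ones.

```python
def modal_cell(counts):
    """Index of the tallest cell, i.e. where the sample mode occurs."""
    return max(range(len(counts)), key=lambda i: counts[i])

def peak_cells(counts):
    """Indices of cells strictly taller than both neighbors."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1                  # edge cells have no neighbor
        right = counts[i + 1] if i < len(counts) - 1 else -1
        if c > left and c > right:
            peaks.append(i)
    return peaks

# Invented cell counts, for illustration only.
one_peak = [2, 9, 25, 41, 27, 10, 3]
two_peaks = [3, 18, 30, 12, 5, 14, 33, 16, 2]

print(modal_cell(one_peak), peak_cells(one_peak))    # one peak: unimodal
print(modal_cell(two_peaks), peak_cells(two_peaks))  # two peaks
```

A single peak index suggests a unimodal shape; more than one is a prompt to look at the data more closely rather than a firm conclusion.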
The histogram shown in Figure 2
illustrates data from a bimodal (2
peak) distribution. The histogram
serves as a tool for diagnosing problems
such as bimodality.
Questioning the underlying reason
for distributional non-unimodality
frequently leads to greater insight
and improved deterministic modeling.
[Figure panels: Skewed Histogram. Positive Skewed; Negative Skewed]
[Figure 6 panels: Platykurtic; Leptokurtic]
Figure 6. Illustration of Kurtosis.
There is no best number of cells, and different cell sizes can reveal different
features of the data. Some theoreticians have attempted to determine
an optimal number of cells, but
these methods generally make
strong assumptions about the shape
of the distribution. Depending on
the actual data distribution and the
goals of the analysis, different cell
widths may be appropriate, so experimentation is usually needed to
determine an appropriate width.
There are, however, various useful
guidelines and rules of thumb.
Most engineers favor setting the
number of cells somewhere between
11 and 17, but always an odd number. The latter point is important so
that the mid-point of the distribution is not split between two cells. It
is also a good rule, when using measurement data, to set the cell limits at a
point halfway between successive values at the last decimal place of the most precise
data. Consider what happens where one
cell is 4 to 8 and the next cell is 8 to 12.
A reading of 8 could fall in either cell,
hence the rule.
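The two rules of thumb above (an odd number of cells between 11 and 17, and cell limits offset by half the measurement resolution) can be sketched as follows. The function cell_limits, its parameters, and the example range are hypothetical; rounding the cell width up to a whole multiple of the resolution is an added assumption so that every limit stays on a half-step.

```python
from fractions import Fraction
from math import ceil

def cell_limits(low, high, resolution, n_cells=11):
    """Return n_cells + 1 limits, each halfway between two recordable values."""
    if n_cells % 2 == 0 or not 11 <= n_cells <= 17:
        raise ValueError("expected an odd number of cells between 11 and 17")
    # Exact fractions avoid floating-point drift in the limits.
    low, high, res = (Fraction(str(v)) for v in (low, high, resolution))
    span = high - low + res                # range to cover, including both ends
    width = ceil(span / n_cells / res) * res  # width is a whole multiple of res
    start = low - res / 2                  # first limit sits half a step below low
    return [start + i * width for i in range(n_cells + 1)]

# Hypothetical readings recorded to 0.1, ranging from 4.0 to 12.0.
limits = cell_limits(4.0, 12.0, 0.1)
print([float(x) for x in limits])

# A reading of 8.0 now belongs to exactly one cell, even with closed limits.
reading = Fraction("8.0")
owners = [(lo, hi) for lo, hi in zip(limits, limits[1:]) if lo <= reading <= hi]
print([(float(lo), float(hi)) for lo, hi in owners])
```

Because every limit ends in a 5 one decimal place beyond the data, no recorded value can land exactly on a boundary, which is the point of the rule.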
Kurtosis. In probability theory and
statistics, kurtosis (derived from the
Greek word meaning bulging) is any
measure of the peakedness of the
38 | metalfinishing | September 2012
probability distribution of a
real-valued random variable. In
a similar way to
the concept of
skewness, kurtosis is a descriptor of the shape of a
probability distribution and, just as
for skewness, there are different ways
of quantifying it for a theoretical distribution and corresponding ways of
estimating it from a sample from a
population.
One common mathematical measure
of kurtosis, originating with Karl
Pearson, is based on a scaled version
of the fourth moment of the data or
population, but it has been argued
that this measure really measures
heavy tails, and not peakedness. For
this measure, higher kurtosis means
more of the variance is the result of
infrequent extreme deviations, as
opposed to frequent modestly sized
deviations. It is common practice to
use an adjusted version of Pearson's
kurtosis, the excess kurtosis, to provide a comparison of the shape of a
given distribution to that of the normal distribution. Distributions with
negative or positive excess kurtosis
are called platykurtic or leptokurtic
distributions, respectively. When a
curve, or histogram, is compared to a
normal distribution, a platykurtic
data set has a flatter peak around its
mean, which causes thin tails within
the distribution.
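As a rough illustration of the Pearson measure described above, the fourth central moment can be scaled by the square of the second, and excess kurtosis obtained by subtracting 3, the value for a normal distribution. The function names and both data sets below are invented for the example, and the population (divide by n) form is assumed; sample-corrected estimators differ slightly.

```python
def pearson_kurtosis(data):
    """Fourth central moment scaled by the squared second moment: m4 / m2**2."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution, whose kurtosis is 3."""
    return pearson_kurtosis(data) - 3.0

def describe(data):
    """Label by the sign of the excess kurtosis."""
    excess = excess_kurtosis(data)
    if excess < 0:
        return "platykurtic"
    if excess > 0:
        return "leptokurtic"
    return "mesokurtic"

# Invented data: evenly spread values versus mostly identical values
# with two rare extreme deviations.
evenly_spread = list(range(1, 11))
rare_extremes = [0] * 18 + [-10, 10]

for name, values in [("evenly spread", evenly_spread),
                     ("rare extremes", rare_extremes)]:
    print(f"{name:14s} excess = {excess_kurtosis(values):+.3f}  {describe(values)}")
```

In the second data set nearly all of the variance comes from two infrequent extreme values, which is the situation the text associates with higher kurtosis.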
Leptokurtic is a description of the
kurtosis in a distribution in which
BIO
Leslie W. Flott, Ph.B., CQE, ASQ Fellow,
is certified as an IDEM Wastewater
Treatment Operator and Indiana
Wastewater Treatment Operator. He
received his Bachelor of Science degree in
chemistry from Northwestern
University and his Master's degree in
materials engineering from Notre Dame
University. Most recently, Flott served as
the environmental program director and
instructor at Ivy Tech Community
College. Prior to that, he was the health,
environment, and safety manager at
Wayne Metal Protection Company.