
Reliability Distributions

Normal & LogNormal


[Figure: Xbar-S control chart, with Sample Mean and Sample StDev plotted against Subgroup (0 to 25)]

Author
Anil Kumar Ammina
Reliability Engineering
BE Analytic Solutions
Key Terminologies
Descriptive Statistics :
Numerical indices that describe some characteristics of a sample or finite population. Descriptive statistics gather and present data in terms of tables, graphs etc., and their interpretation is subjective in nature.

Inferential / Analytical Statistics :
Quantitative techniques that enable the experimenter to generalize about the population using a limited number of observations. They use probability and distributions and endeavor to reduce subjectivity.

            Population (parameter)    Sample (statistic)
Mean        µ                         x̄
Variance    σ²                        s²

Point Estimator:
The estimation of a population parameter in terms of a single value. It is specified by putting a hat on the parameter: if θ is the parameter, θ̂ is the point estimator.

Interval Estimator:
The estimation of a population parameter in terms of a range, specified as Point Estimator ± Error: if θ is the parameter, θ̂ ± Δ is the interval estimator.
Types of Data

Variable data:
Data which can be quantified, e.g., temperature, number of countries etc.

Continuous Variable: Variable that can take values infinitesimally close to one another and still be meaningful, e.g., pressure, length etc.
Discrete Variable: Variable that can take only distinct, clearly separated values, e.g., no. of students in the class, no. of defects etc.
Binary Variable: Special type of discrete variable that can take only the values 0 and 1; also Yes/No, Pass/Fail or Defective/Non-defective type of data.

Attribute data :
Data which cannot be quantified, e.g., fragrance of a flower, aesthetics of the
product, pleasing appearance, conduct of the students etc. Based on human
feelings e.g. taste of tea. Subjectivity in assessment.
Visualizing Your Data – Box Plot

When is it used?
• You have a set or sets of continuous data.
• You wish to visually check the distribution of the data.
• You wish to look for evidence of differences between data sets.

Box plot elements (top to bottom):
• Outlier (*): any point outside the Lower or Upper Limit.
• Maximum Observation that falls within the Upper Limit = Q3 + 1.5 (Q3 - Q1)
• 75th Percentile (Q3)
• Median / 50th Percentile (Q2)
• 25th Percentile (Q1)
• Minimum Observation that falls within the Lower Limit = Q1 - 1.5 (Q3 - Q1)
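
The limits above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, the function name box_plot_summary is hypothetical, and the quartile interpolation rule used here (the "inclusive" method of the standard library) is an assumption that may differ slightly from the rule Minitab applies.

```python
import statistics

def box_plot_summary(data):
    """Quartiles, outlier limits, whisker ends and outliers for a box plot."""
    # Assumption: "inclusive" quartile interpolation; other tools may differ.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "q1": q1, "median": q2, "q3": q3,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical NaOH weight-% readings, one of them unusually high.
sample = [4.8, 4.9, 5.0, 5.0, 5.1, 5.1, 5.2, 5.3, 7.4]
summary = box_plot_summary(sample)
print(summary)
```

The whiskers end at the most extreme observations that still fall inside the limits, so the single high reading is reported as an outlier rather than stretching the whisker.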
Box Plot Exercise

A 5-weight% NaOH solution is used in large quantities in your chemical process.

You have data for analyses on 100 batches each of this solution that you have purchased from four different vendors over the past year.

You are seeing problems in your process and have started to wonder if maybe they are related to variations in the solutions that you are purchasing.

Let’s take a “visual look” at the data.

[Figure: box plots of NaOH-W% (axis 3.5 to 7.5) by Vendor (1 to 4)]
Visualizing Your Data – Histogram
When is it used?
• You have a set or sets of continuous data.
• You wish to visually check the shape (frequency distribution) and spread of the data.

[Figure: histogram for Vendor 4, Frequency (0 to 15) on the vertical axis, horizontal axis from 3.5 to 6.5]

How does Minitab Construct a Histogram?
• Counts the number of data points
• Determines the Range, R, for the data
• Determines the # of classes, K
• Determines class width = R/K
• Determines how many data points fall into each class
• Plots the number/class (frequency) in bar graph format

Note
Number of classes, K, is user adjustable (under “options”)
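
The construction steps listed above can be sketched as follows. This is a minimal illustration and not Minitab's actual algorithm: the function name histogram_counts and the data are hypothetical, and K is simply passed in by the caller instead of being chosen automatically.

```python
def histogram_counts(data, k):
    """Return (class width, list of counts per class) for k equal-width classes."""
    lo, hi = min(data), max(data)
    data_range = hi - lo            # R, the range of the data
    width = data_range / k          # class width = R / K
    counts = [0] * k
    for x in data:
        # The maximum value would land in class k, so clamp it into the last class.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    return width, counts

# Hypothetical readings; K = 3 classes chosen by the user.
readings = [3.5, 3.9, 4.2, 4.4, 4.6, 4.8, 5.0, 5.1, 5.3, 6.5]
width, counts = histogram_counts(readings, k=3)
print(width, counts)
```

Changing k changes the apparent shape of the histogram, which is why the number of classes is worth adjusting before drawing conclusions about the distribution.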
Visualizing Your Data – Scatter Plot

When is it used?
• You have a set of paired data for two continuous variables.
• You wish to visually check for evidence of correlation (trends) between the variables.
Descriptive Statistics
Are we putting the correct amount of beverage into our cans?

Target Value = 500 ml

Sample Size = n = 24

[Figure: dot plot of fill volume (ml), axis from 485 to 515]

Most Data Sets:
• Show a tendency to cluster about a central point
• Exhibit a variability (dispersion/spread)
Central Tendency
Represents the nominal value of the process
• Mean (x̄)
• Median (“middle” data point)
• Quartile values (Q1, Q3)

Spread
Represents the variation in the process
• Standard Deviation (s)
• Range or Span (P95 - P5)
• Stability Factor (SF) = Q1/Q3
Descriptive Statistics

Minitab provides a summary report of basic statistical information: click on Graphs, then select Graphical Summary.

Minitab Output
• Normality Check
• Histogram
• Box Plot
• Basic Statistical Information
• Confidence Intervals
Using Excel
[Figure: Excel histogram of the data, frequency axis from 0 to 45]

Bin       0    2    4    6    8    10   12   14   16   18
Series1   0    0    0    0    4    8    37   42   7    2

Summary Statistics for Data

Mean                  11.95
Standard Error        0.19
Median                12.04
Mode                  #N/A
Standard Deviation    1.86
Sample Variance       3.45
Kurtosis              1.00
Skewness              0.03
Range                 10.90
Minimum               6.89
Maximum               17.79
Sum                   1195.41
Count                 100
Normal ( Gaussian ) Distribution
This is the distribution that many random processes approximately follow. It is specified by just two parameters, viz., µ (mu) and σ (sigma), as follows:

$$N(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

•Symmetric about its mean µ
•As σ increases, it becomes flatter and the peak reduces.
•Many processes approximately follow the Normal distribution.
•Non-assignable causes of variation are attributed to the Normal distribution.

[Figure: Gaussian Distribution, Probability Density Function versus x (-50 to 170) for sigma = 5, 10, 20 and 30]
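
To see how the peak drops as σ grows, the density can be evaluated directly. The sketch below assumes a mean of 50 purely for illustration (the mean used in the slide's figure is not stated); the σ values are the ones shown in the figure.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu = 50  # assumed mean, for illustration only
# Height of the curve at its peak (x = mu) for each sigma in the figure.
peaks = {sigma: normal_pdf(mu, mu, sigma) for sigma in (5, 10, 20, 30)}
for sigma, peak in peaks.items():
    print(f"sigma = {sigma:2d}  peak density = {peak:.4f}")
```

The peak height is 1/(σ√(2π)), so doubling σ halves the peak while the total area under the curve stays at 1.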
Mean
The expected or mean value of a continuous random variable X with pdf f(x) is

$$\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$   (for continuous data)

$$\mu_X = \sum_{i=1}^{n} \frac{x_i}{n}$$   (for discrete data)

The mean is simply the numerical average of the data.

Variance
The variance of a continuous random variable X with pdf f(x) and mean µ is

$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 \cdot f(x)\,dx = E[(X-\mu)^2]$$   (for continuous data)

$$S^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$   (for discrete data)

The standard deviation is $\sigma_X = \sqrt{V(X)}$.

Short-cut formula for variance: $V(X) = E(X^2) - [E(X)]^2$

The variance is a measure of the average squared deviation from the mean µ.
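
A quick numerical check of the short-cut formula, using a small made-up data set. Note the assumption built into the comparison: the short-cut formula reproduces the population variance (divisor n), while the sample variance S² uses the divisor n − 1 and is therefore slightly larger.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)

mean = sum(data) / n                              # E(X)
mean_of_squares = sum(x * x for x in data) / n    # E(X^2)
shortcut_variance = mean_of_squares - mean ** 2   # E(X^2) - [E(X)]^2

population_variance = statistics.pvariance(data)  # divisor n
sample_variance = statistics.variance(data)       # divisor n - 1

print(mean, shortcut_variance, population_variance, sample_variance)
```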
Z (Standardised Normal) Distribution
Put the following substitution in the normal distribution:

$$z = \frac{x - \mu}{\sigma}, \qquad x = \mu + z\sigma$$

The probability density function is then given as follows:

$$N(0,1) = N(z, 0, 1) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^2}{2}} \quad \text{for } -\infty \le z \le \infty$$

•Symmetric about its mean z=0
•Basis for 6σ and z-score
•Binds the family of normal distributions into a single distribution.

[Figure: z Distribution, Probability Density Function versus z (-4 to 4); annotation: Z = 6 means the Z-score is 6 and the process is 6σ]
Standardize the Normal Distribution

$$Z = \frac{X - \mu}{\sigma}$$

[Figure: Normal Distribution f(X) with mean µ and standard deviation σ, mapped to the Standardized Normal Distribution f(Z) with µZ = 0 and σZ = 1]

One table!
Standardized Normal Distribution

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

[Figure: Normal Distribution (µ = 5, σ = 10, X = 6.2) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = 0.12); shaded area exaggerated]
P(X > 8) = .3821

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

[Figure: Normal Distribution (µ = 5, σ = 10, X = 8) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = .30); shaded area exaggerated]

Cumulative Standardized Normal Distribution Table (Portion), µZ = 0, σZ = 1

Z      .00     .01     .02
0.0    .5000   .5040   .5080
0.1    .5398   .5438   .5478
0.2    .5793   .5832   .5871
0.3    .6179   .6217   .6255

The table gives P(Z ≤ .30) = .6179, so

P(X>8) = 1 – .6179 = .3821

[Figure: standardized normal curve with area .6179 to the left of Z = .30 and area .3821 to the right; shaded area exaggerated]
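
The table lookup can be cross-checked with Python's standard library, whose NormalDist class provides the cumulative distribution function. This is only a verification sketch of the arithmetic above, using the slide's values µ = 5 and σ = 10.

```python
from statistics import NormalDist

mu, sigma = 5, 10
x = 8

z = (x - mu) / sigma                 # standardize: Z = (X - mu) / sigma
p_less_equal = NormalDist().cdf(z)   # cumulative area to the left of Z
p_greater = 1 - p_less_equal         # upper-tail area, P(X > 8)

print(f"Z = {z:.2f}, P(Z <= z) = {p_less_equal:.4f}, P(X > 8) = {p_greater:.4f}")
```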
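Standardized Normal Distribution
(Different Type of Tables)

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

Some tables give the area between 0 and Z instead of the cumulative area. For Z = .12 that area is .0478.

[Figure: Normal Distribution (µ = 5, σ = 10, X = 6.2) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = .12) with area .0478 shaded between the mean and Z; shaded area exaggerated]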
Normal Distribution Thinking Challenge

You work in Quality Control. Light bulb life has a normal distribution with µ = 2000 hours & σ = 200 hours. What’s the probability that a bulb will last
• a. between 2000 & 2400 hours?
• b. less than 1470 hours?
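Solution: P(2000 ≤ X ≤ 2400)

$$Z = \frac{X - \mu}{\sigma} = \frac{2400 - 2000}{200} = 2.0$$

P(2000 ≤ X ≤ 2400) = P(0 ≤ Z ≤ 2.0) = .4772

[Figure: Normal Distribution (µ = 2000, σ = 200, X = 2400) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = 2.0) with area .4772 shaded]

Solution: P(X ≤ 1470)

$$Z = \frac{X - \mu}{\sigma} = \frac{1470 - 2000}{200} = -2.65$$

P(X ≤ 1470) = P(Z ≤ -2.65) = .0040

[Figure: Normal Distribution (µ = 2000, σ = 200, X = 1470) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = -2.65) with area .0040 shaded]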
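
Both answers can be checked without a table. The sketch below uses the standard library's NormalDist with the stated µ = 2000 hours and σ = 200 hours; the variable names are illustrative.

```python
from statistics import NormalDist

bulb_life = NormalDist(mu=2000, sigma=200)

# a. P(2000 <= X <= 2400): difference of two cumulative probabilities
p_between = bulb_life.cdf(2400) - bulb_life.cdf(2000)

# b. P(X <= 1470): a single cumulative probability
p_below = bulb_life.cdf(1470)

print(f"a. {p_between:.4f}")
print(f"b. {p_below:.4f}")
```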
Finding Z Values for Known Probabilities

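What is Z given Probability = 0.6217?

Cumulative Standardized Normal Distribution Table (Portion), µZ = 0, σZ = 1

Z      .00     .01     .02
0.0    .5000   .5040   .5080
0.1    .5398   .5438   .5478
0.2    .5793   .5832   .5871
0.3    .6179   .6217   .6255

The value .6217 sits in row 0.3, column .01, so Z = .31.

[Figure: standardized normal curve with cumulative area .6217 shaded to the left of Z; shaded area exaggerated]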
Recovering X Values for Known Probabilities

[Figure: Normal Distribution (µ = 5, σ = 10, X = ?) and Standardized Normal Distribution (µZ = 0, σZ = 1, Z = 0.31), each with cumulative area .6217]

X = µ + Zσ = 5 + (.31)(10) = 8.1
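
The reverse lookup (probability to Z, then Z to X) corresponds to the inverse cumulative distribution function. A short check with the standard library, using the values from the slide:

```python
from statistics import NormalDist

mu, sigma = 5, 10
probability = 0.6217

z = NormalDist().inv_cdf(probability)   # Z with cumulative area 0.6217
x = mu + z * sigma                      # X = mu + Z * sigma

print(f"Z = {z:.2f}, X = {x:.1f}")
```

The table gives Z only to two decimals, so the computed Z differs from .31 in the later decimal places.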
Normal Distribution Example
A bolt has a mean diameter of 10.5 mm and a standard deviation of 0.8 mm. What is the probability that, in a random sample, the diameter is between 8.8 mm and 12.5 mm?

Solution :
µ = 10.5; σ = 0.8; xL = 8.8; xU = 12.5

$$z_L = \frac{x_L - \mu}{\sigma} = \frac{8.8 - 10.5}{0.8} = -2.125$$

$$z_U = \frac{x_U - \mu}{\sigma} = \frac{12.5 - 10.5}{0.8} = 2.5$$

Pr(xL ≤ x ≤ xU) = Pr(zL ≤ z ≤ zU)
∴ Pr(8.8 ≤ x ≤ 12.5) = Pr(−2.125 ≤ z ≤ 2.5)
Normal Distribution Example (continued)

Pr(−2.125 ≤ z ≤ 2.5) = Pr(−∞ ≤ z ≤ 2.5) − Pr(−∞ ≤ z ≤ −2.125)

The Single-Tail Z Table (values from 0.00 to 7.99) gives the upper-tail area beyond z, so two facts are needed:
• Total area under the z distribution is 1.0000
• The z distribution is symmetric

Pr(−2.125 ≤ z ≤ 2.5) = (1 − Pr(2.5 ≤ z ≤ ∞)) − Pr(2.125 ≤ z ≤ ∞)

∴ Pr(−2.125 ≤ z ≤ 2.5) = (1 − 0.0062) − 0.0168 = 0.9770
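
As a cross-check of the bolt example, the same probability can be computed both from the cumulative distribution and from the two upper tails that a single-tail table provides. A minimal sketch with the standard library:

```python
from statistics import NormalDist

bolt = NormalDist(mu=10.5, sigma=0.8)
std_normal = NormalDist()

# Direct route: difference of cumulative probabilities in the original units.
p_direct = bolt.cdf(12.5) - bolt.cdf(8.8)

# Single-tail route: upper-tail areas beyond z = 2.5 and z = 2.125.
upper_tail_2_5 = 1 - std_normal.cdf(2.5)
upper_tail_2_125 = 1 - std_normal.cdf(2.125)
p_tails = (1 - upper_tail_2_5) - upper_tail_2_125

print(f"{p_direct:.4f} {p_tails:.4f}")
```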
Assessing Normality

• Construct charts
  • For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
  • For large data sets, does the histogram or polygon appear bell-shaped?
• Do the mean, median and mode have similar values?
• Check the p-value of a normality test: a small p-value (commonly below 0.05) is evidence against normality.
Central Limit Theorem

For most distributions, the means of samples drawn from the distribution follow an approximately normal distribution once the sample size is at least about 30.

This is useful when the inference is to be drawn about the sample mean, e.g. Sachin Tendulkar’s average batting performance in a Test series. His batting scores across all Tests may well follow a non-normal distribution. In such a situation, one may consider the series averages and use this data of means to analyze his performance.

It is used as the basis to calculate confidence intervals when the sample size is greater than 30.

For all practical purposes, the Central Limit Theorem can be taken to hold, provided the observations are independent and the underlying distribution has a finite variance.
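
A small simulation illustrates the theorem. The sketch below is an illustration under stated assumptions: it draws samples of size 30 from an exponential distribution (strongly skewed, mean 1, standard deviation 1), and the choice of distribution, seed and number of trials is arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, trials):
    """Means of `trials` independent samples from an exponential(1) distribution."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(trials)
    ]

means = sample_means(sample_size=30, trials=5000)

mean_of_means = statistics.fmean(means)
stdev_of_means = statistics.stdev(means)
expected_stdev = 1 / 30 ** 0.5   # sigma / sqrt(n) for the sample mean

print(f"mean of sample means  = {mean_of_means:.3f} (population mean 1.0)")
print(f"stdev of sample means = {stdev_of_means:.3f} (expected {expected_stdev:.3f})")
```

Although the individual observations are far from normal, the sample means cluster symmetrically around the population mean with a spread close to σ/√n.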


Normal Probability Density Function

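It is the most widely used general-purpose distribution.

The pdf of the normal distribution, for a time-to-failure t, is

$$f(t) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(t-\mu)^2}{2\sigma^2}}$$

Normal Statistical Properties

For the normal distribution, Mean = Median = Mode = µ.

The reliability for a mission of time t is

$$R(t) = \int_{t}^{\infty} f(x)\,dx = 1 - \Phi\left(\frac{t - \mu}{\sigma}\right)$$

where Φ is the cumulative distribution function of the standard normal distribution. There is no closed-form solution for the normal reliability function; it is evaluated numerically or from standard normal tables.

The instantaneous normal failure rate is

$$\lambda(t) = \frac{f(t)}{R(t)}$$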
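
The reliability and failure rate of a normally distributed life can be evaluated numerically. The sketch below uses the standard relations R(t) = 1 − F(t) and λ(t) = f(t)/R(t); the function names are illustrative, and the parameters µ = 2000 hours and σ = 200 hours are borrowed from the light bulb example purely as sample values.

```python
from statistics import NormalDist

def normal_reliability(t, mu, sigma):
    """Probability of surviving beyond time t: R(t) = 1 - F(t)."""
    return 1 - NormalDist(mu, sigma).cdf(t)

def normal_failure_rate(t, mu, sigma):
    """Instantaneous failure rate: lambda(t) = f(t) / R(t)."""
    dist = NormalDist(mu, sigma)
    return dist.pdf(t) / (1 - dist.cdf(t))

mu, sigma = 2000, 200   # sample values (hours)
for t in (1800, 2000, 2200):
    r = normal_reliability(t, mu, sigma)
    lam = normal_failure_rate(t, mu, sigma)
    print(f"t = {t}: R(t) = {r:.4f}, failure rate = {lam:.6f} per hour")
```

The failure rate rises with t, which is why the normal distribution is associated with wear-out behaviour.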
The Lognormal Distribution

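The lognormal distribution is commonly used to model a random variable, such as a time-to-failure, whose logarithm is normally distributed.

The pdf for this distribution is

$$f(T') = \frac{1}{\sigma'\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{T' - \mu'}{\sigma'}\right)^2}$$

where T' = ln(T), the T values are the times-to-failure, µ' is the mean of the natural logarithms of the times-to-failure and σ' is the standard deviation of the natural logarithms of the times-to-failure.

For equal probabilities under the normal and lognormal pdfs, incremental areas should also be equal, f(T) dT = f(T') dT', which gives

$$f(T) = \frac{1}{T\,\sigma'\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{\ln(T) - \mu'}{\sigma'}\right)^2}$$

The mean of the lognormal distribution, µ, is

$$\mu = e^{\mu' + \frac{1}{2}\sigma'^2}$$

The mean of the natural logarithms of the times-to-failure, µ', in terms of µ and σT is

$$\mu' = \ln(\mu) - \frac{1}{2}\ln\left(\frac{\sigma_T^2}{\mu^2} + 1\right)$$

The standard deviation of the lognormal distribution, σT, is

$$\sigma_T = \sqrt{e^{2\mu' + \sigma'^2}\left(e^{\sigma'^2} - 1\right)}$$

The standard deviation of the natural logarithms of the times-to-failure, σ', in terms of µ and σT is

$$\sigma' = \sqrt{\ln\left(\frac{\sigma_T^2}{\mu^2} + 1\right)}$$

The Lognormal Reliability Function

$$R(T) = 1 - \Phi\left(\frac{\ln(T) - \mu'}{\sigma'}\right)$$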
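
The relationships between the log-scale parameters and the mean and standard deviation in original units are easy to mix up, so a round-trip check can help. The sketch below uses the standard lognormal moment formulas; the function names are illustrative, and the log-scale values 5.149 and 0.737 are used only as sample inputs.

```python
import math
from statistics import NormalDist

def lognormal_mean_std(mu_log, sigma_log):
    """Mean and standard deviation of T, given the mean and std of ln(T)."""
    mean = math.exp(mu_log + sigma_log ** 2 / 2)
    std = math.sqrt(math.exp(2 * mu_log + sigma_log ** 2) * (math.exp(sigma_log ** 2) - 1))
    return mean, std

def log_params_from_mean_std(mean, std):
    """Mean and std of ln(T), given the mean and standard deviation of T."""
    sigma_log = math.sqrt(math.log(std ** 2 / mean ** 2 + 1))
    mu_log = math.log(mean) - sigma_log ** 2 / 2
    return mu_log, sigma_log

def lognormal_reliability(t, mu_log, sigma_log):
    """R(T) = 1 - Phi((ln T - mu') / sigma')."""
    return 1 - NormalDist(mu_log, sigma_log).cdf(math.log(t))

mu_log, sigma_log = 5.149, 0.737          # sample log-scale parameters
mean, std = lognormal_mean_std(mu_log, sigma_log)
recovered = log_params_from_mean_std(mean, std)
median = math.exp(mu_log)                 # half of the units survive past the median

print(f"mean = {mean:.1f}, std = {std:.1f}, recovered = {recovered}")
print(f"R(median) = {lognormal_reliability(median, mu_log, sigma_log):.3f}")
```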
Effect of σΤ
•The lognormal distribution is a distribution skewed to the right.
•The pdf starts at zero, increases to its mode, and decreases thereafter.
•The degree of skewness increases as σ' increases, for a given µ'.
•For the same σ', the pdf spreads further to the right as µ' increases (the skewness coefficient itself depends only on σ').
Effect of µ’

For σ' values significantly greater than 1, the pdf rises very sharply in the beginning, i.e. for very small values of T near zero, and essentially follows the ordinate axis, peaks out early, and then decreases sharply like an exponential pdf or a Weibull pdf with 0 < β < 1.

The parameter µ', the mean of the logarithms of the T's, is the scale parameter of the lognormal pdf, and not the location parameter as in the case of the normal pdf.

The parameter σ', the standard deviation of the logarithms of the T's (that is, of the T' values), is the shape parameter and not the scale parameter, as in the normal pdf, and assumes only positive values.
Example 1
Concentrations of pollutants produced by chemical factories are historically known to exhibit a lognormal distribution. Suppose the concentration of a certain pollutant, in parts per million, has a lognormal distribution with parameters µ = 3.2 and σ = 1. What is the probability that the concentration exceeds 8 parts per million?

Let the random variable X be the pollutant concentration.

P[X > 8] = 1 − P[X ≤ 8]

Since ln(X) has a normal distribution with mean µ = 3.2 and standard deviation σ = 1,

$$P[X \le 8] = \Phi\left(\frac{\ln(8) - 3.2}{1}\right) = \Phi(-1.12) = 0.1314$$

Here, we use the Φ notation to denote the cumulative distribution function of the standard normal distribution. As a result, the probability that the pollutant concentration exceeds 8 parts per million is 1 − 0.1314 = 0.8686.
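
The calculation can be verified numerically. The sketch below evaluates Φ at the unrounded z value, so the results differ from the table-based figures in the fourth decimal place: P[X ≤ 8] comes out near 0.1312 and P[X > 8] near 0.8688.

```python
import math
from statistics import NormalDist

mu_log, sigma_log = 3.2, 1.0    # parameters of ln(X)
threshold = 8                   # parts per million

z = (math.log(threshold) - mu_log) / sigma_log
p_at_most = NormalDist().cdf(z)     # P[X <= 8]
p_exceeds = 1 - p_at_most           # P[X > 8]

print(f"z = {z:.4f}, P[X <= 8] = {p_at_most:.4f}, P[X > 8] = {p_exceeds:.4f}")
```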
Example 2
The life, in thousands of miles, of a certain type of electronic control for locomotives has a lognormal distribution with µ = 5.149 and σ = 0.737. Find the 5th percentile of the life of such a control.

We know from the Z-table that P(Z < −1.645) = 0.05.

Let X be the life of the control. Since ln(X) has a normal distribution,

ln(X) = 5.149 + (0.737)(−1.645) = 3.937

Hence, X = e^3.937 = 51.265

This means that only 5% of the controls will have a lifetime of less than 51,265 miles.
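
The percentile can be reproduced with the inverse cumulative distribution function. The sketch below uses the unrounded 5% point of the standard normal (about −1.6449 instead of −1.645), so the result differs slightly from the hand calculation, by roughly 0.01 thousand miles.

```python
import math
from statistics import NormalDist

mu_log, sigma_log = 5.149, 0.737       # parameters of ln(X), X in thousands of miles

z_05 = NormalDist().inv_cdf(0.05)      # 5% point of the standard normal
ln_x = mu_log + sigma_log * z_05       # 5th percentile on the log scale
x_05 = math.exp(ln_x)                  # back to thousands of miles

print(f"z = {z_05:.4f}, ln(X) = {ln_x:.3f}, X = {x_05:.2f} thousand miles")
```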
Questions ?
Thank You
