
Reliability

Definition of Terms
- Failure Modes.
- The Price of Reliability.
Reliability Functions
- Cost Functions
- The Bathtub Curve.
- Useful Lifetime.
Reliability Impact Factors
- Environmental Factors.
- Design & Manufacture.
- Accelerated Lifetime.
Case Study
- Hi-Rel Processing.
Definitions

Reliability - The ability of an item to
perform its required function under defined
conditions for a stated period of time.

Failure - The termination of the ability of
an item to perform its required function.

Degrees of Failure
Failures may be SUDDEN (non-predictable) or
GRADUAL (predictable). They may also be
PARTIAL or COMPLETE.

A Catastrophic failure is both sudden and
complete.

A Degradation failure is both gradual and partial.
Causes of Failure

Misuse - failures attributable to the
application of stresses beyond the
stated capabilities of the item.

Inherent Weakness - failures
attributable to weakness inherent
in the item itself when subjected
to stresses within the stated
capabilities of the item.
Reliability vs Cost
Three separate cost factors relate to the reliability
of an item throughout its life:

Design & Development
Production
Maintenance & Repair
Cost-Reliability Functions
MTBF & MTTF
Mean Time Between Failures (MTBF) - Applies to
repairable items.

Mean Time To Failure (MTTF) - Applies to non-repairable
items.

Both of these terms indicate the average time an
item is expected to function before failure.


Failure Rate vs Time
Early Failures - substandard
components, manufacturing faults.
Random Failures - this is the
useful lifetime of the item.
Reliability is predictable in this
region.
End-of-Life Failures - items
reaching the end of their useful life.
Also called the wear-out period.
Because of the characteristic shape, this
is commonly known as the Bathtub Curve.
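
The shape of the curve can be sketched numerically. The following is only an illustration, not a model taken from these slides: it assumes an early-failure term that decays exponentially, a constant random-failure term, and a wear-out term that grows linearly after an onset time. The function name and every parameter value are hypothetical.

```python
import math

def bathtub_failure_rate(t_hours, early=0.02, early_decay=200.0,
                         random_rate=0.001, wear_onset=8000.0,
                         wear_slope=1e-6):
    """Illustrative failure rate (failures per hour) at age t_hours.

    All default values are made up purely to show the bathtub shape.
    """
    # Early failures: high at first, dying away as weak items are weeded out.
    early_term = early * math.exp(-t_hours / early_decay)
    # Wear-out: zero until the onset time, then rising steadily.
    wear_term = wear_slope * max(0.0, t_hours - wear_onset)
    # Random failures: constant throughout the useful lifetime.
    return early_term + random_rate + wear_term

for t in (0, 100, 4000, 12000):
    print(t, bathtub_failure_rate(t))
```

In this sketch the rate at 4000 hours is essentially the constant random-failure term, with higher rates at both ends of life.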
Useful Lifetime
$$R = e^{-t/m}$$
Reliability is predictable.
R = reliability.
t = time for which equipment is run.
m = MTBF
Note that R has no units. The prediction yields a number <1.
Closer to 1 = greater reliability.
Examples
If an item of equipment has MTBF of 500hrs, then the reliability for 100hrs
operation is :-
$$R = e^{-100/500} = 0.8187$$ (81.87% probability of survival)


and if the equipment is operated for 1000hrs, the reliability will be :-
$$R = e^{-1000/500} = 0.1353$$ (13.53% probability of survival)
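
The two examples can be checked with a few lines of Python. This is a minimal sketch of the useful-lifetime formula, with R equal to e raised to the power of minus t over m. The function names are illustrative, and the second function simply inverts the formula to find the longest run time that still meets a target reliability.

```python
import math

def reliability(t_hours, mtbf_hours):
    """Probability of surviving t_hours when the MTBF is mtbf_hours."""
    return math.exp(-t_hours / mtbf_hours)

def run_time_for(target_reliability, mtbf_hours):
    """Invert R = exp(-t/m): run time at which R falls to the target."""
    return -mtbf_hours * math.log(target_reliability)

print(round(reliability(100, 500), 4))   # 0.8187
print(round(reliability(1000, 500), 4))  # 0.1353
print(round(run_time_for(0.9, 500), 1))  # 52.7 hours for R = 0.9
```

Note that running for a time equal to the MTBF gives a reliability of only about 0.37, not 0.5.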


Factors Affecting Reliability

Design & Manufacture
- Pre-Production Design
- Control of Production
- Working Tolerances
- Material Quality
- Component Quality
- Component Stress

Installation & Environmental
- Temperature
- Humidity
- Vibration
- Chemical Attack
- Interconnections

Factors Affecting Reliability - Installation & Environmental
Temperature
Generally, operation at higher
temperatures degrades reliability
performance. Internally generated heat
must be removed by mechanisms such
as cooling fins or forced-air cooling.

In high ambient temperatures, the
process of removing excess heat
becomes more difficult.

Equipment operating in low ambient
temperatures will need to be
designed using components
which can tolerate this environment.

Humidity
Moisture can cause oxidation and corrosion and
reduce insulation effectiveness. Particularly
vulnerable are solder joints and connectors.

Equipment designed for use in areas of high humidity
will use components and materials which are
selected for their resistance to damage by
moisture.

Vulnerable components, such as circuit boards, can
be protected by encapsulation e.g. in resin.
Individual components may be hermetically sealed.

Vibration
Vehicles (cars, ships, aircraft etc) are
particularly prone to vibration
damage.

Vulnerable equipment can use flexible
mountings.

Components on a PCB can be made less susceptible to vibration by the
use of encapsulation.

Vibration effects on electronic components have been minimised by the
process of miniaturisation.

Chemical Corrosion
Atmospheric pollutants and natural airborne chemicals (such as
salt air atmosphere in coastal regions) can corrode metals
(PCB tracks, solder joints, connector terminals etc) and
even break down some plastics used for insulation.
Selection of appropriate materials is
crucial. Again, encapsulation can help
to protect vulnerable components,
particularly circuit boards.

Interconnections
Interconnections are liable to degradation by vibration, humidity and
chemical factors. They are one of the most vulnerable components in an
electronic system.
Connections internal to electronic
modules, such as inverters, can be
reasonably well protected by
appropriate mounting and by
encapsulation.
However, other interconnections, e.g.
between solar panels, will be subject
to mechanical stress and corrosion
damage.

Factors Affecting Reliability - Design & Manufacture



Component Reliability
Typical Failure Rates of Electronic Components

Component      Type               Failure Rate (%/1000h)
Capacitors     Ceramic            0.025
               Paper              0.05
               Tantalum           0.1
               Electrolytic       0.2
Diodes         Silicon            0.001
Resistors      Carbon             0.005
               Wirewound          0.03
               Film               0.1
Transistors    Discrete Silicon   0.01
Connections    Soldered           0.001
Connectors     Per Pin            0.005

Operating Stresses
Weighting Factors for Electronic Components

Component                         Operating Condition    Weighting Factor
Resistors, Transistors, Diodes    0.1 of max. rating     1.0
                                  0.5 of max. rating     1.5
                                  max. rating            2.0
Capacitors                        0.1 of wkg voltage     1.0
                                  0.5 of wkg voltage     3.0
                                  max wkg voltage        6.0

$$\text{System Failure Rate} = \sum \left[ (\text{Component Failure Rate}) \times (\text{Quantity}) \times (\text{Weighting Factor}) \right]$$

Example
An electronic system uses:
- 20 silicon transistors @ 0.1 x max rating
- 10 silicon transistors @ 0.5 x max rating
- 10 diodes @ 0.1 x max rating
- 100 carbon resistors @ 0.1 x max rating
- 20 carbon resistors @ 0.5 x max rating
- 50 ceramic capacitors @ 0.1 x max rating
- 20 electrolytic capacitors @ 0.5 x max rating
- 500 soldered connections

Overall failure rate is 14.85% per
1000 hours. The MTBF can be
found by dividing this number into
100,000.
$$\text{MTBF} = \frac{100{,}000}{14.85} \approx 6734 \text{ hours}$$
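
The parts-count sum can be reproduced as a short script. This is a sketch under stated assumptions: resistors, transistors and diodes are taken to share the same weighting factor at a given fraction of maximum rating, and soldered connections, which have no entry in the weighting table, are given a weighting of 1.0. With the table values exactly as listed the sum comes to 14.76% per 1000 hours, slightly below the 14.85 quoted above. The difference of 0.09 is what the ten diodes would add if their rate were 0.01 rather than 0.001, but the source does not say which figure was intended.

```python
# Each entry: (description, quantity, failure rate in %/1000h, weighting factor)
parts = [
    ("silicon transistor @ 0.1 max",     20, 0.01,  1.0),
    ("silicon transistor @ 0.5 max",     10, 0.01,  1.5),
    ("silicon diode @ 0.1 max",          10, 0.001, 1.0),
    ("carbon resistor @ 0.1 max",       100, 0.005, 1.0),
    ("carbon resistor @ 0.5 max",        20, 0.005, 1.5),
    ("ceramic capacitor @ 0.1 max",      50, 0.025, 1.0),
    ("electrolytic capacitor @ 0.5 max", 20, 0.2,   3.0),
    ("soldered connection",             500, 0.001, 1.0),  # weighting assumed
]

# System failure rate = sum of (rate x quantity x weighting), in %/1000h.
system_rate = sum(qty * rate * weight for _, qty, rate, weight in parts)

# A rate of x %/1000h is x / 100,000 failures per hour, so MTBF = 100,000 / x.
mtbf_hours = 100_000 / system_rate

print(round(system_rate, 2))  # 14.76
print(round(mtbf_hours))      # 6775
```

The electrolytic capacitors contribute 12.0 of the total, so derating or replacing them would be the most effective way to raise the MTBF of this example system.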
Production Monitoring & Quality
Control
Continuous assessment of key
quality monitors during manufacture
allows early identification of process
variation and prompt action to
optimise processes.
Quality control feedback loops may
also be implemented on incoming
materials and components.
Accelerated Life
The bathtub curve predicts a high
early failure rate.
Elevated temperatures are used to
accelerate component aging and
ensure that products move from the
Early Failure area and into the
Useful Lifetime area.
The technique is used to pre-screen
early failures during manufacturing.
High temperatures accelerate all known chemical
reactions. Almost all failure mechanisms associated
with semiconductor devices are the result of a
chemical reaction.
Arrhenius Equation
$$r = r_0 \, e^{-E_A / (KT)}$$

$r$ = Rate of the chemical reaction.
$r_0$ = A constant.
$E_A$ = Activation energy in electron volts (eV) that is
associated with the chemical reaction.
$K$ = Boltzmann's constant.
$T$ = Absolute temperature.
Acceleration Factor
$$\frac{r_1}{r_2} = e^{\frac{E_A}{K}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)}$$

$T_1$ is the elevated temperature.

$T_2$ is the temperature for which the
new reaction rate is calculated.

$r_1$ is the reaction rate at the elevated
temperature.

$r_2$ is the reaction rate at $T_2$.

The constant, $r_0$, is the same for both
temperatures and has been cancelled
out of the equation.
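
As a rough numerical illustration, the acceleration factor is the ratio of the reaction rate at the elevated temperature to the rate at the reference temperature, both expressed in kelvin. In the sketch below the function name is illustrative, and the activation energy of 0.7 eV and the 55 °C reference temperature are hypothetical values chosen only for the example. The 150 °C and 48 hour figures are the HTGB burn-in conditions given in the case study.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV/K

def acceleration_factor(e_a_ev, t_elevated_c, t_reference_c):
    """Ratio of Arrhenius reaction rates, r1 / r2.

    e_a_ev        activation energy in eV
    t_elevated_c  elevated (stress) temperature T1 in degrees Celsius
    t_reference_c reference (use) temperature T2 in degrees Celsius
    """
    t1 = t_elevated_c + 273.15   # convert to absolute temperature
    t2 = t_reference_c + 273.15
    return math.exp((e_a_ev / BOLTZMANN_EV_PER_K) * (1.0 / t2 - 1.0 / t1))

# Hypothetical: E_A = 0.7 eV, normal operation at 55 C, burn-in at 150 C.
af = acceleration_factor(0.7, 150.0, 55.0)
print(af)        # how much faster the device ages during burn-in
print(48 * af)   # equivalent operating hours for a 48 hour burn-in
```

The result is very sensitive to the activation energy, which differs between failure mechanisms, so a single factor should be treated as an order-of-magnitude estimate.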
Case Study

MOSFET Hi-Rel
Processing
Process Flows
Standard process flow (left)

Hi-Rel process flow (right)

The Hi-Rel process flow includes many
more process monitors during
production, as well as accelerated
life testing and other quality
conformance testing designed to
enhance product reliability.
High Temperature Gate Bias
HTGB
Burn-in temperature 150 °C.

Gate terminal is biased during burn-in.

Typical burn-in time 48 hrs.

Failure criteria - failure to meet data
sheet specifications.
Purpose is to check the integrity of the gate oxide. This test
identifies failures caused by weak, damaged or contaminated
oxide.
High Temperature Reverse Bias
HTRB
Burn-in temperature 150 °C.

Drain terminal is biased during burn-in.

Typical burn-in time 168 hrs.

Failure criteria - failure to meet data
sheet specifications.
Purpose is to check the integrity of the field termination and the
quality of the body-drain junction. This test also identifies failures
caused by surface contamination.
Other Hi-Rel Processing
Salt Atmosphere - subjects the devices to a highly corrosive
atmosphere of salt and moisture at the elevated temperature of 35 °C to
simulate long-term exposure to seacoast atmospheric conditions.
Failure Criteria - excessive corrosion of package, loss of marking
legibility, loss of hermeticity.


Thermal Shock (liquid-to-liquid) - a defined number of temperature
cycles from -55 °C to +150 °C with 5-minute exposure at each temperature,
maximum 5-second transfer time between temperatures. Tests for die
attach integrity and package hermeticity. Any cracks present in the silicon
chip will be propagated by this test, leading to failure.
Failure Criteria - failure to meet datasheet specification, loss of
hermeticity.


Temperature Cycle (air-to-air) - a defined number of temperature
cycles from -55 °C to +150 °C with 10-minute exposure at each
temperature, with a 5-minute dwell at ambient during transfer. Similar to
Thermal Shock but often activates different failure mechanisms due to
longer exposure to temperature extremes and more gradual temperature
change.
Failure Criteria - failure to meet datasheet specification, loss of
hermeticity.


Pressure Pot - the device is subjected to 121 °C at 15 PSIG in an
atmosphere of 100% RH to check its performance in humid
environments. Identifies passivation defects, poor package sealing and
contamination introduced during assembly.
Failure Criteria - failure to meet datasheet specification.
Full test details and comprehensive procedures are given in
the MIL-STD methods, especially:

MIL-STD-750 - Standard Test Methods For Semiconductor
Devices

MIL-STD-883 - Test Method Standard - Microcircuits
