
Federal University of Technology Owerri

Reliability

ECE 510 Reliability and Quality Assurance in Electronics

2nd Semester April. 2013

RISK
Major accidents in recent years have taken a sad toll of lives:
Bhopal
Chernobyl
Piper Alpha
Challenger
So have natural disasters:
Bam
December 26 Tsunami
The immediate reaction is always "It must never happen again"
We need to eliminate hazards as far as possible and reduce
the risks so that the remaining hazards are only a small
addition to the inherent risks of everyday life
RELIABILITY & RISK ANALYSIS TECHNIQUES are the
methods used to assess the safety of modern complex systems

RELIABILITY
Definition

Reliability is the ability of a product
  to perform as intended (that is, without failure and within specified performance limits)
  for a specified mission time
  when used in the manner and for the purposes intended
  under specified application and operational conditions


RELIABILITY
Alternative Definition

Reliability is the probability that a device or system
  properly performs its intended function
  over time
  when operated within the environment for which it is designed


RELIABILITY
The definition stresses 4 elements
  Probability: quantitative
  Adequate performance: must be defined
  Time: the period over which we can expect a certain degree of performance
  Operating conditions: temperature, humidity, shock, vibration


RELIABILITY

Characteristics of a Product
Estimated in Design
Controlled in Manufacturing
Measured during Testing
Sustained in the Field


RELIABILITY
Importance of Reliability
In this modern day of science and technology, complex devices are used for commercial, military, scientific, consumer and pleasure purposes, so a high degree of reliability is an absolute necessity.
There is too much at stake in terms of cost and human life to take any significant risks with devices that might not function properly when needed most.


RELIABILITY

First we will deal with
  Foundation of Reliability
  Probability and Statistics
Then
  In-depth reliability engineering considerations


Objective
To give an overview of
The reliability issues
Techniques
Tasks
Limitations associated with
The design
Manufacture
Operation


Probability & Statistics

Pragmatic approach
Discussion will include
Shape of failure distributions
Estimating parameters


RELIABILITY
Focus on
Preventing failures through
Robust design and manufacturing practices
Based on
Life cycle loads and stresses
Product architecture
Potential defects and failure mechanisms


RELIABILITY
There are 2 strands in Reliability
FAULT AVOIDANCE
Conservative Design
High Quality Components
FAULT TOLERANCE
Assumes that, despite all efforts, components will fail
Uses redundancy, at a price in efficiency



RELIABILITY
[Figure: Performance characteristics of an engineering product. Random input variables lead to output performance characteristics that are continuous {x}, discrete {y} or binary {z}.]


Quality Production
The Quality of a Product
Performance characteristics may be
Continuous, e.g. fuel consumption.
  These are objective: they can be accurately established by independent measurements and do not depend on the opinion of an individual.
Discrete, e.g. visual appeal, body style.
  These are subjective, based on some scale such as (5) excellent, (4) good, ...
Binary.
  Based on some feature that the product does or does not possess, e.g. presence or absence of a sun-roof.


PERFORMANCE CHARACTERISTICS
FOR A FAMILY CAR
Continuous {x}                  Discrete {y}                       Binary {z}
Urban fuel consumption          Visual appeal of body and style    Leaded/unleaded petrol?
Time from 0 to 60 m.p.h.        Visual appeal of interior          Starts first time?
Braking distance at 60 m.p.h.   Comfort of ride                    Central locking?
Engine noise level              Range of exterior colours          Quad stereo?
% CO2 in exhaust                Range of interior colours          Tinted glass?
Maximum speed                                                      Power assisted steering?
                                                                   Sun-roof?


Specification
The manufacturer of an engineering product will need to produce a specification that defines the product for a potential customer.
It consists of a set of target values $x_{1T}, x_{2T}, \ldots$, i.e. a target vector $\{x_T\}$:

Performance characteristic        Target value
Urban fuel consumption            40 miles per gallon
Maximum speed                     100 miles per hour
Time from 0 to 60 m.p.h.          13 seconds
Braking distance at 60 m.p.h.     180 feet
Engine noise level                70 dB
% CO2 in exhaust                  1%

Because of random effects each target carries tolerance limits, e.g. $x_{1T} \pm \Delta x_1$; together these form a tolerance vector $\{\Delta x\}$.


RELIABILITY
Target performance $\{x_T\}$
Tolerance $\{\Delta x\}$
Reject if:
  the actual performance $\{x\}$ lies outside $\{x_T\} \pm \{\Delta x\}$
Both manufacturer and customer need to know how the actual random variations in a given performance characteristic $\{x\}$ for a given product, across different individual units and under different environmental and operating conditions, compare with the target and tolerance values $\{x_T\}$ and $\{\Delta x\}$.


RELIABILITY
A statistical analysis of the variations is required.
This involves calculating:
Mean
Standard deviation
Probability
Probability density function
To do this we require N sample values of the performance characteristic $\{x\}$, specified by $x_i$ where $i = 1, 2, \ldots, N$.


RELIABILITY
Mean
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
Standard deviation
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$
Root mean square
$$x_{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2}$$

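As a rough illustration, the mean, standard deviation and root mean square can be computed directly from N sample values using the 1/N forms on this slide. The function name and the fuel-consumption figures below are made up for the example and are not course data.

```python
import math

def sample_statistics(samples):
    """Return (mean, standard deviation, RMS) using the 1/N definitions."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    return mean, std, rms

# Hypothetical urban fuel consumption (miles per gallon) measured on N = 5 cars.
mpg = [38.0, 40.0, 41.0, 39.0, 42.0]
mean, std, rms = sample_statistics(mpg)
print(f"mean = {mean:.3f}, std = {std:.3f}, rms = {rms:.3f}")
```

A useful cross-check is that, with these definitions, the squared RMS equals the squared mean plus the variance.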

RELIABILITY
System Failure Rate: The Bathtub Curve
[Figure: failure rate plotted against time, showing the infant mortality period, the operating period and the wear-out period.]


THE BATHTUB CURVE


The infant mortality period or debugging stage
Failures typically caused by manufacturing flaws
Damage received in transit
Damage received in handling
The operating period
Smaller failure rate
Failure rate tends to remain constant
Failures typically due only to chance
Failure generally results from severe, unpredictable and
usually unavoidable stresses that arise from environmental
factors such as vibrations, temperature, shock and pressure.


THE BATHTUB CURVE


The wear-out period
Failure rate increases rapidly
Failure as a result of gradual degradation of some
property of the system essential to proper operation
The degradation may occur from causes such as
fatigue, creep, corrosion and abrasion
We are most interested in the period between infant
mortality and wear-out.
In this period we have
Constant failure rate
Exponential failure time density function


RELIABILITY
Failure Rate
Assume that at t = 0 we have $N_0$ articles
At time t we observe that $N_S(t)$ have survived
The number failed is $N_F(t)$
So
$N_0 = N_S(t) + N_F(t)$
And
$R(t) = N_S(t) / N_0$
R(t) is the reliability as a function of time


RELIABILITY
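[Figure: graph of $N_S$ against time, falling from $N_0$ at t = 0, with $N_S(t)$ marked at times $t$ and $t + \delta t$.]
The failure rate is the limit as $\delta t \to 0$ of (the gradient at t) divided by $N_S$, taken as a positive quantity:
$$\lambda(t) = -\lim_{\delta t \to 0} \frac{1}{N_S(t)}\,\frac{N_S(t + \delta t) - N_S(t)}{\delta t}$$
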

RELIABILITY
The reliability R of a product can be defined
as the probability that the product continues to
meet some specification
The unreliability F of a product can be
defined as the probability that the product fails to
meet the specification
Both reliability and unreliability vary with time
R(t) decreases with time
F(t) increases with time
R(t) + F(t) = 1

PRACTICAL RELIABILITY
DEFINITIONS
Non-repairable items
Suppose that N individual items of a given non-repairable product are placed in service and the times at which failures occur are recorded during a test interval T.
Further assume that all N items fail during T and that the ith failure occurs at time $T_i$, i.e. $T_i$ is the survival time or up time for the ith failure.
The total up time for N failures is therefore
$$\sum_{i=1}^{N} T_i$$
and the mean time to failure is given by


PRACTICAL RELIABILITY
DEFINITIONS
Mean Time To Failure = Total up time / Number of failures, i.e.
$$\mathrm{MTTF} = \frac{1}{N}\sum_{i=1}^{N} T_i$$
Mean Failure Rate = Number of failures / Total up time, i.e.
$$\bar{\lambda} = \frac{N}{\sum_{i=1}^{N} T_i}$$
The mean failure rate is the reciprocal of MTTF.

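A minimal sketch of these two definitions, assuming every item on test has failed and its failure time has been recorded. The function name and the failure times are hypothetical.

```python
def mttf_and_mean_failure_rate(failure_times):
    """MTTF = total up time / number of failures; mean failure rate is its reciprocal."""
    total_up_time = sum(failure_times)
    n_failures = len(failure_times)
    mttf = total_up_time / n_failures
    mean_failure_rate = n_failures / total_up_time
    return mttf, mean_failure_rate

# Hypothetical failure times in hours for N = 4 non-repairable items.
hours_to_failure = [1200.0, 1500.0, 900.0, 2400.0]
mttf, rate = mttf_and_mean_failure_rate(hours_to_failure)
print(f"MTTF = {mttf} h, mean failure rate = {rate:.6f} per hour")
```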

REPAIRABLE SYSTEMS
Mean Time To Failure & Mean Time
Between Failures
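[Figure: timeline of a repairable item alternating between LIVE and Under Repair states, marking the time to failure (TTF), the repair time and the time between failures (TBF).]
$$\mathrm{MTTF} = \frac{1}{N}\sum_{i=1}^{N} t_{fi}$$
where $t_{fi}$ is the TTF of the ith item. In the continuous limit
$$\mathrm{MTTF} = \frac{1}{N}\int_0^\infty t\,[N f(t)\,dt] = \int_0^\infty t f(t)\,dt$$
Alternatively, the total life for N devices is $N \times \mathrm{MTTF}$, and between $t$ and $t + \delta t$ the number live is $N R(t)$, so the total life is $\int_0^\infty N R(t)\,dt$ and
$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt$$
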

PRACTICAL RELIABILITY
DEFINITIONS
There are N survivors at time t = 0 and N - i at t = $T_i$, decreasing to zero at time t = T. The figure shows that the probability of survival, i.e. the reliability, $R_i = (N - i)/N$, decreases from $R_i = 1$ at t = 0 to $R_i = 0$ at t = T.
MTTF = total area under the graph
[Figure: step graph of reliability against time t, falling from 1 through 2/N and 1/N to 0 at the failure times $T_1, T_2, \ldots, T_N$.]


Quantification of Reliability
Reliability: the probability that a system/component works
Availability: the probability that a system/component works on demand
Availability at time t: the probability that a system/component works on demand
Availability: the fraction of the total time that a system/component can perform its required function


Quantification of Reliability
Unavailability = 1 - Availability
Unreliability = 1 - Reliability
For the failure process let
F(t) = P[a given component fails in [0,t)]
The corresponding probability density function f(t) is therefore
$$f(t) = \frac{dF(t)}{dt}$$
So
f(t)dt = P[a component fails in time period [t, t + dt)]
and
$$F(t) = \int_0^t f(t')\,dt'$$


Quantification of Reliability
Transition to the failed state can be characterised by the
conditional failure rate h(t).
This function is sometimes referred to as the hazard rate or
hazard function.
This parameter is a measure of the rate at which failures
occur taking into account the size of the population with the
potential to fail, i.e. those that are still functioning at time t:
So
h(t)dt = P[a component fails in time period [t, t + dt) | it has not failed in [0, t)]


Quantification of Reliability
For conditional probabilities we can write:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Since h(t)dt is a conditional probability we can define events A and B as follows by comparing this with the equation above:
A: component fails between t and t + dt
B: component has not failed in [0, t)
With events defined like this, $P(A \cap B) = P(A)$, since if the component fails between t and t + dt it is implicit that it cannot have failed before time t.


Quantification of Reliability
$$h(t)\,dt = \frac{P[\text{component fails between } t \text{ and } t + dt]}{P[\text{component has not failed in } [0, t)]} = \frac{f(t)\,dt}{1 - F(t)}$$
$$\int_0^t h(t')\,dt' = \int_0^t \frac{f(t')}{1 - F(t')}\,dt'$$
Integrating gives
$$\int_0^t h(t')\,dt' = -\ln[1 - F(t)]$$
$$F(t) = 1 - \exp\left[-\int_0^t h(t')\,dt'\right]$$


Quantification of Reliability
$$F(t) = 1 - \exp\left[-\int_0^t h(t')\,dt'\right]$$
If h(t), the failure rate or hazard rate for a general system or component, is plotted against time we get the bathtub curve.
In the useful life period $h(t) = \lambda$ is constant.
So after integration
$$F(t) = 1 - e^{-\lambda t}$$
And the reliability, the probability that the component works continuously over (0, t], is the exponential function
$$R(t) = e^{-\lambda t}$$

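The link between the general hazard-rate formula and the exponential law can be checked numerically. The sketch below integrates a hazard function with the trapezoidal rule and compares the result with the closed form for a constant rate. The function names, the failure rate and the mission time are assumptions chosen for illustration.

```python
import math

def unreliability_from_hazard(hazard, t, steps=10_000):
    """F(t) = 1 - exp(-integral of h from 0 to t), using the trapezoidal rule."""
    dt = t / steps
    integral = 0.0
    for k in range(steps):
        integral += 0.5 * (hazard(k * dt) + hazard((k + 1) * dt)) * dt
    return 1.0 - math.exp(-integral)

def reliability_constant_rate(lam, t):
    """R(t) = exp(-lambda * t) for the constant-failure-rate (useful life) period."""
    return math.exp(-lam * t)

lam = 1e-4          # hypothetical constant failure rate, per hour
mission = 1000.0    # hypothetical mission time, hours

F_numeric = unreliability_from_hazard(lambda _t: lam, mission)
R_closed = reliability_constant_rate(lam, mission)
print(f"F(t) = {F_numeric:.6f}, R(t) = {R_closed:.6f}, sum = {F_numeric + R_closed:.6f}")
```

Passing a time-dependent function as `hazard` would give F(t) for the infant mortality or wear-out periods, where no simple closed form is assumed.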

System Mean Time to Failure


When system failure can be tolerated and repair can be instigated, an important measure of the system performance is the system availability
$$A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$

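For a quick numerical feel, the availability formula can be evaluated as below. The MTBF and MTTR figures are invented for the example.

```python
def availability(mtbf, mttr):
    """Availability A = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

# Hypothetical system: 990 h mean time between failures, 10 h mean time to repair.
A = availability(990.0, 10.0)
print(f"A = {A:.4f}, unavailability = {1.0 - A:.4f}")
```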

QUANTIFIED RISK ASSESSMENT


SYSTEM LIFE CYCLE

SYSTEM DEFINITION PHASE
  Establish reliability requirements
  Set provisional reliability/availability targets
  Prepare reliability specification
CONCEPT DESIGN PHASE
  Perform global safety/availability assessment
  Identify critical areas and components
  Confirm/review targets
DETAIL DESIGN PHASE
  FMEA/FTA of critical systems and components
  Review reliability database
  Carry out detailed system reliability assessment
  Prepare safety case
MANUFACTURING PHASE
  Prepare and implement reliability specifications
  Review reliability demonstrations
OPERATING PHASE
  Audit reliability performance
  Collect and analyse reliability, test and maintenance data
  Assess reliability impact of modifications


RELIABILITY
CRITICAL FAILURES
Where failure causes total loss of function
MAJOR FAILURES
Where failure causes major loss of function but
the product can still be used to some extent
MINOR FAILURES
Where failure leaves the product still able to be
used to perform the major function but with the
loss of some convenience function


RELIABILITY NETWORKS
A reliability network is a representation of the
reliability dependencies between components of a
system
Dependencies are used in such a way as to
represent the means by which the system will
function
Such a network can be used to assess the
probability of failure of a system


TOPOLOGICAL RELIABILITY
The functional behaviour of most systems can be characterised by a network diagram
Nodes denote the subsystems
Branches of the network represent the functional relationship between these subsystems
Example
A high voltage supply system consisting of two transmitters A and B and a power supply.
For the system to work the power supply and at least one of the transmitters must operate.
A path must exist between C and D for the system to work.
[Figure: network diagram in which the Power Supply is in series with Transmitter A and Transmitter B in parallel, between end points C and D.]


RELIABILITY NETWORKS
The question of what constitutes proper operation or
proper function for a particular type of equipment is
usually specific to the equipment
Rather than attempt to suggest a general definition
for proper function we assume that the appropriate
definition for a device of interest has been specified


RELIABILITY NETWORKS

We can represent the functional status of the device as
$$\phi = \begin{cases} 1 & \text{if the device functions properly} \\ 0 & \text{if the device has failed} \end{cases}$$
Note that this representation is intentionally binary.
We assume that the status of the equipment of interest is either satisfactory or failed.
There are many types of equipment where one or more de-rated states are possible and methods have been developed to cope with them.


RELIABILITY NETWORKS
We presume that most equipment is composed of components and that the status of the device is determined by the status of the components.
Let n be the number of components that make up the device and define the component status variables $x_i$ as
$$x_i = \begin{cases} 1 & \text{if component } i \text{ functions properly} \\ 0 & \text{if component } i \text{ has failed} \end{cases}$$


RELIABILITY NETWORKS
The set of n components that make up the device is represented by the component status vector:
$$x = \{x_1, x_2, \ldots, x_n\}$$
The dependence of the device status on the component status is represented by the function
$$\phi = \phi(x)$$
referred to as a system structure function, a system status function, or simply a structure.


RELIABILITY NETWORKS

There are 4 generic types of structural relationships between a device and its components.
1. Series
2. Parallel
3. k out of n
4. All others


RELIABILITY NETWORKS
SERIES SYSTEMS
Definition
A series system is one in which all components
must function properly in order for the system
to function properly.
[Figure: reliability block diagram of a series system.]
Conceptual analogue: series electrical circuit


TOPOLOGICAL RELIABILITY
Example 2
In an aircraft electronics system consisting of
a sensor subsystem,
guidance subsystem,
computer subsystem and
fire control subsystem
the system can only operate successfully if all four subsystems operate
[Figure: series block diagram of Sensor, Guidance, Computer and Fire Control.]
NOTE: The figure only depicts the functional relationship required for system operation and does not necessarily mean that these subsystems are electrically wired together in series.
Examples where components are not physically connected:
The set of legs on a 3-legged stool. The set of tyres on a car.


RELIABILITY NETWORKS
SERIES SYSTEMS
For the series structure the requirement that all components must function implies that an algebraic form for the structure function is:
$$\phi(x) = \prod_{i=1}^{n} x_i$$
Examples
$x_1 = x_2 = 1$, $x_3 = 0$ results in $\phi(x) = 0$
$x_1 = x_2 = x_3 = 0$ results in $\phi(x) = 0$
$x_1 = x_2 = x_3 = 1$ results in $\phi(x) = 1$
Only the functioning of all components results in system function


RELIABILITY NETWORKS
PARALLEL SYSTEMS
Definition
A parallel system is one in which the proper function of any one component is sufficient for the system to function properly.
[Figure: reliability block diagram of a parallel system with components 1, 2 and 3.]
Conceptual analogue: parallel electrical circuit


RELIABILITY NETWORKS
PARALLEL SYSTEMS

[Figure: parallel block diagram with components 1, 2 and 3.]
Similar to the series case, the structure function for the parallel system may be expressed as:
$$\phi(x) = 1 - \prod_{i=1}^{n} (1 - x_i)$$
Examples
$x_1 = x_2 = 1$, $x_3 = 0$ results in $\phi(x) = 1$
$x_1 = 1$, $x_2 = x_3 = 0$ results in $\phi(x) = 1$
$x_1 = x_2 = x_3 = 0$ results in $\phi(x) = 0$

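The series and parallel structure functions translate almost directly into code. The sketch below is illustrative only: the function names are invented, and `phi_supply` models the earlier high voltage supply example (a power supply in series with two parallel transmitters) as an assumed composition of the two structures.

```python
from math import prod

def phi_series(x):
    """Series structure: 1 only if every component status is 1."""
    return prod(x)

def phi_parallel(x):
    """Parallel structure: 1 if at least one component status is 1."""
    return 1 - prod(1 - xi for xi in x)

def phi_supply(power, transmitter_a, transmitter_b):
    """Hypothetical helper: power supply in series with two parallel transmitters."""
    return phi_series([power, phi_parallel([transmitter_a, transmitter_b])])

print(phi_series([1, 1, 0]), phi_series([1, 1, 1]))
print(phi_parallel([1, 1, 0]), phi_parallel([1, 0, 0]), phi_parallel([0, 0, 0]))
print(phi_supply(1, 0, 1), phi_supply(0, 1, 1))
```

Composing the two basic structures in this way is one simple route to the "all others" category, provided the system can be decomposed into nested series and parallel groups.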

RELIABILITY NETWORKS
PARALLEL SYSTEMS
Parallel systems are often referred to as redundancy
Often, but not always, the parallel components are identical
There are actually several ways in which the redundancy may be implemented
This diversity can lead to different reliability under different environmental conditions
A distinction is made between redundancy obtained using a parallel structure in which all components function simultaneously (ACTIVE REDUNDANCY) and that obtained using parallel components of which one functions and the other or others wait as standby units (STANDBY REDUNDANCY).


RELIABILITY NETWORKS
k-out-of-n SYSTEMS
Definition
A k-out-of-n system is one in which any k of the n components that comprise the system must function properly in order for the system to function properly.
[Figure: block diagram of a k-out-of-n system with components 1, 2, 3 and 4.]
$$\phi(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \ge k \\ 0 & \text{otherwise} \end{cases}$$
Example cases for a 3-out-of-4 system
$x_1 = x_2 = x_3 = 1$, $x_4 = 0$ results in $\phi(x) = 1$
$x_1 = x_2 = 1$, $x_3 = x_4 = 0$ results in $\phi(x) = 0$
$x_1 = x_2 = x_3 = 0$, $x_4 = 1$ results in $\phi(x) = 0$

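A minimal sketch of the k-out-of-n structure function, reproducing the three 3-out-of-4 cases from this slide. The function name is an assumption made for the example.

```python
def phi_k_out_of_n(x, k):
    """k-out-of-n structure: 1 if at least k component statuses are 1, else 0."""
    return 1 if sum(x) >= k else 0

# The three 3-out-of-4 example cases.
print(phi_k_out_of_n([1, 1, 1, 0], 3))
print(phi_k_out_of_n([1, 1, 0, 0], 3))
print(phi_k_out_of_n([0, 0, 0, 1], 3))
```

Series and parallel systems are the two limiting cases: a series system is n-out-of-n and a parallel system is 1-out-of-n.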

TOPOLOGICAL RELIABILITY
Example 3
In a computer system with a computer, a controller, and three
memory units suppose that the system can only satisfy its
operational requirements if at least two of the three memory units
are operable and both the computer and the controller are operable.
The 4 branches represent the 4 possible ways we can obtain system operation.
[Figure: network with the Computer and Controller in series with four parallel branches of memory units: Unit 1 and Unit 2; Unit 1 and Unit 3; Unit 2 and Unit 3; Unit 1, Unit 2 and Unit 3.]


Equivalent Computer Network

[Figure: equivalent network diagram.]


TOPOLOGICAL RELIABILITY
Communication System diagram
[Figure: two parallel Antenna-Receiver-Converter chains, a single Pulse Shaping Unit and two parallel Teleprinters.]
At least 1 of the two Antenna-Receiver-Converter chains must work
At least 1 of the two Teleprinters must work
The Pulse Shaping Unit must work


TOPOLOGICAL RELIABILITY
In general suppose the topological or network representation of a system consists of n nodes and define
$R(N_1, \ldots, N_m)$ = probability that nodes number $N_1, \ldots, N_m$ are operating and the other n - m nodes are not operating
Then the probability that exactly m nodes are simultaneously operating is given by
$$R_m = \sum_{N_1} \cdots \sum_{N_m} R(N_1, \ldots, N_m)$$
where the sum is taken over all positive integers $N_1, \ldots, N_m$ such that $n \ge N_1 > N_2 > \cdots > N_m \ge 1$.
Thus the probability that at least k nodes are operating is given by
$$\sum_{m=k}^{n} R_m$$

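The double sum is easier to see in code. The sketch below enumerates every set of m operating nodes and adds up the probabilities. It assumes the nodes fail independently with known operating probabilities `p[i]`, which is an extra assumption not made on the slide, and all names and numbers are illustrative.

```python
from itertools import combinations
from math import prod

def prob_exactly(m, p):
    """R_m: probability that exactly m nodes operate, for independent nodes."""
    n = len(p)
    total = 0.0
    for operating in combinations(range(n), m):
        up = set(operating)
        # R(N1..Nm): the chosen nodes operate and all the others do not.
        total += prod(p[i] if i in up else 1.0 - p[i] for i in range(n))
    return total

def prob_at_least(k, p):
    """Probability that at least k nodes operate: sum of R_m for m = k..n."""
    return sum(prob_exactly(m, p) for m in range(k, len(p) + 1))

# Hypothetical system of three nodes, each operating with probability 0.9.
p = [0.9, 0.9, 0.9]
print(f"P(at least 2 of 3 operate) = {prob_at_least(2, p):.4f}")
```

The enumeration grows quickly with n, so this brute-force form is only practical for small networks.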

Reliability of Systems
1 Series systems
[Figure: m blocks in series with reliabilities $R_1, R_2, R_3, R_4, \ldots, R_i, \ldots, R_m$.]
System of m elements in series with individual reliabilities $R_1, R_2, \ldots, R_i, \ldots, R_m$ respectively.
The system will only survive if every element survives; if one element fails the system fails.
Assume the reliability of each element is independent of the reliability of the other elements.
The probability that the system survives is the probability that element 1 survives and element 2 survives and element 3 survives, etc.
The system reliability is the product of the element reliabilities.
$$R_{syst} = R_1 R_2 R_3 \cdots R_i \cdots R_m$$


Reliability of Systems
1 Series systems
If we assume a constant failure rate $\lambda_i$ for each element then, since
$$R_i = e^{-\lambda_i t}$$
$$R_{syst} = e^{-\lambda_1 t}\, e^{-\lambda_2 t} \cdots e^{-\lambda_i t} \cdots e^{-\lambda_m t}$$
So if $\lambda_{syst}$ is the overall system failure rate
$$R_{syst} = e^{-\lambda_{syst} t} = e^{-(\lambda_1 + \lambda_2 + \cdots + \lambda_i + \cdots + \lambda_m) t}$$
$$\lambda_{syst} = \lambda_1 + \lambda_2 + \cdots + \lambda_i + \cdots + \lambda_m$$
The failure rate of a series network is the sum of the individual element failure rates, so it is important to keep the number of elements to a minimum so that the reliability is as high as possible.

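The equivalence between multiplying element reliabilities and adding failure rates can be checked with a few lines of code. The failure rates and mission time below are hypothetical, and independence of the elements is assumed as on the previous slide.

```python
import math

def series_reliability(failure_rates, t):
    """Product of exp(-lambda_i * t) over all elements in series."""
    return math.prod(math.exp(-lam * t) for lam in failure_rates)

# Hypothetical constant failure rates (per hour) for three elements in series.
rates = [2e-5, 5e-5, 3e-5]
t = 1000.0

lam_syst = sum(rates)                      # system failure rate
R_product = series_reliability(rates, t)   # product of element reliabilities
R_sum = math.exp(-lam_syst * t)            # single exponential with summed rate
print(f"lambda_syst = {lam_syst:.1e} per hour, R_syst = {R_product:.6f}")
```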

Reliability of Systems
1 Series systems
Unreliability of Series System with Small Failure rates
Protective systems have element and system UNRELIABILITIES F
that are very small. The corresponding system reliabilities are
therefore very close to 1, for example 0.9999 may be typical. Then the
calculation of Rsyst = R1R2R3 Ri Rm may be arithmetically unwieldy
and the alternative equation involving unreliabilities may be more
useful since
$R_{syst} = 1 - F_{syst}$ and $R_i = 1 - F_i$
We have
$$1 - F_{syst} = (1 - F_1)(1 - F_2) \cdots (1 - F_i) \cdots (1 - F_m)$$
$$= 1 - (F_1 + F_2 + \cdots + F_i + \cdots + F_m) + \text{terms involving products of } F\text{s}$$
If the individual $F_i$ are small, i.e. $F_i \ll 1$, the terms involving products of Fs can be neglected, giving the approximate equation
$$F_{syst} \approx F_1 + F_2 + \cdots + F_i + \cdots + F_m$$

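To see how good the approximation is, the exact and approximate series unreliabilities can be compared for small element unreliabilities. The values below are invented for illustration.

```python
from math import prod

def series_unreliability_exact(F):
    """F_syst = 1 - product of (1 - F_i)."""
    return 1.0 - prod(1.0 - f for f in F)

def series_unreliability_approx(F):
    """Approximation valid when every F_i << 1: F_syst is about the sum of F_i."""
    return sum(F)

# Hypothetical element unreliabilities for a protective system.
F = [1e-4, 2e-4, 5e-5]
exact = series_unreliability_exact(F)
approx = series_unreliability_approx(F)
print(f"exact = {exact:.6e}, approx = {approx:.6e}, difference = {approx - exact:.2e}")
```

The approximation always errs slightly on the pessimistic side here, since the neglected product terms are dominated by the pairwise products, which reduce the exact unreliability.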

Reliability of Systems
Parallel Systems
An overall system consisting of n individual elements or systems in parallel with individual unreliabilities $F_1, F_2, \ldots, F_j, \ldots, F_n$
Only one individual element is necessary to meet the functional requirements of the overall system
The remaining elements increase the reliability of the system
THIS IS CALLED REDUNDANCY.
The system fails only if ALL the elements fail
The unreliability of a parallel system is
$$F_{syst} = F_1 F_2 \cdots F_j \cdots F_n$$
[Figure: n blocks in parallel with unreliabilities $F_1, F_2, \ldots, F_j, \ldots, F_n$.]

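A short sketch of the parallel formula, assuming the elements fail independently (the product rule relies on this). The names and unreliability values are hypothetical.

```python
from math import prod

def parallel_unreliability(F):
    """F_syst = F_1 * F_2 * ... * F_n: the system fails only if all elements fail."""
    return prod(F)

def parallel_reliability(F):
    """R_syst = 1 - F_syst."""
    return 1.0 - parallel_unreliability(F)

# Hypothetical: three redundant elements, each with unreliability 0.1.
F = [0.1, 0.1, 0.1]
F_syst = parallel_unreliability(F)
print(f"F_syst = {F_syst:.4f}, R_syst = {parallel_reliability(F):.4f}")
```

Each added element multiplies the system unreliability by that element's unreliability, which is why redundancy is effective even with modest individual components.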

Reliability of Systems
Voting Systems
Majority voting systems are used to protect
hazardous plant and processes and have
applications in the chemical, nuclear and aerospace
industries. The diagram shows a typical system with 2 out of 4 voting with initiators A, B, C, D.
[Figure: process parameter inputs and a trip setting feed initiators A, B, C and D, whose outputs go to a 2oo4 voting element that drives the shut down system.]


Reliability of Systems
Voting Systems
Suppose R and F are the reliability and unreliability of
the individual initiators.
The overall initiation system fails to protect the plant if
either all 4 initiators fail
or any 3 initiators fail
If 2 or fewer initiators fail there are still sufficient left to trip the plant
Since the Fs are normally small, the rare events approximation is valid and the overall system unreliability is the sum of the following probabilities


Reliability of Systems
Voting Systems
$F_{INIT}$ = Probability that A and B and C and D fail
  + Probability that A and B and C fail (and D survives)
  + Probability that B and C and D fail (and A survives)
  + Probability that A and C and D fail (and B survives)
  + Probability that A and B and D fail (and C survives)
Each of the terms is the product of individual unreliabilities and reliabilities.


Reliability of Systems
Voting Systems
$$F_{INIT} = F^4 + F^3R + F^3R + F^3R + F^3R = F^4 + 4F^3R = F^3(F + 4R)$$
This result can be obtained from the binomial expansion of $(F + R)^4$
$$(F + R)^4 = F^4 + 4F^3R + 6F^2R^2 + 4FR^3 + R^4$$
The first term $F^4$ represents the probability of all 4 initiators failing and the second the total probability of 3 failing. If $R \approx 1$ and $F \ll 1$ then $F_{INIT} \approx 4F^3$
The unreliability of the complete protective system is
$$F_{SYST} \approx F_{INIT} + F_{VOTING} + F_{SHUT\text{-}DOWN}$$


Reliability of Systems
Majority Voting Systems
In a majority voting system there are n trip channels and the plant is tripped if m ($n \ge m$) of them indicate that the plant should be tripped.
Such a system is referred to as m out of n, or m oo n
The binomial distribution can be used to calculate overall failure probabilities.
Consider the term in $F^j$ in the binomial expansion of $(F + R)^n$, where F and R are the single channel unreliability and reliability and n is the total number of channels.
This is ${}^nC_j F^j R^{n-j}$ and is the probability that j channels fail, i.e. (n - j) channels survive.

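The binomial term can be evaluated directly. The sketch below assumes identical, independent channels with R = 1 - F, and uses it to recover the 2oo4 initiator result $F^4 + 4F^3R$, where the system fails to trip if 3 or more channels fail. The function names and the value of F are illustrative assumptions.

```python
from math import comb

def prob_exactly_j_fail(n, j, F):
    """Binomial term nCj * F^j * R^(n-j), with R = 1 - F."""
    return comb(n, j) * F**j * (1.0 - F) ** (n - j)

def prob_at_least_r_fail(n, r, F):
    """Probability that r or more of the n channels fail."""
    return sum(prob_exactly_j_fail(n, j, F) for j in range(r, n + 1))

F = 0.01                                   # hypothetical single-channel unreliability
F_init = prob_at_least_r_fail(4, 3, F)     # 2oo4: fails if 3 or 4 initiators fail
print(f"F_INIT = {F_init:.4e}, rare-event approximation 4F^3 = {4 * F**3:.4e}")
```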

Reliability of Systems
Fail-Safe & Fail-Danger
Fail-Danger failure
Any system or component failure that prevents, or
tends to prevent, the plant being tripped when a
potentially hazardous fault condition occurs.
Example: a pressure switch failed to open when the pressure exceeded the trip pressure
Fail-Safe failure
Any system or component failure that
produces a plant trip when a plant trip is not
required.
Example: a pressure switch opened when the pressure was below the trip pressure


Reliability of Systems
Fail-Safe & Fail-Danger
A fail-danger failure is a very serious occurrence
Fail-safe failures are less serious but cause loss of
production and confidence in the trip system
Fail-danger and Fail-safe failures will generally
have different failure rates and so different failure
probabilities
Detailed information on the failure rates associated
with all possible modes of failure of trip equipment
is not always available
We may have to assume both rates are equal to the
average failure rate

Reliability of Systems
Fail-Safe & Fail-Danger
Suppose we wish to calculate the overall fail-danger and fail-safe probabilities for a system with two out of three voting, i.e. 2 oo 3, where m = 2 and n = 3. These probabilities can be calculated from the binomial expansion of $(F + R)^3$, where F and R are the single channel unreliability and reliability respectively.
So we have
$$(F + R)^3 = F^3 + 3F^2R + 3FR^2 + R^3$$
where $F^3$ represents the probability that all 3 channels fail, $3F^2R$ the probability that 2 channels fail, $3FR^2$ the probability that 1 channel fails and $R^3$ the probability that no channel fails.


Reliability of Systems
Fail-Safe & Fail-Danger
Looking at fail-danger first
if either 2 or 3 channels fail dangerously then
there are correspondingly only 1 or zero channels
left working.
This is insufficient to trip the plant with 2 oo 3
voting and an overall fail danger situation has
occurred.
If $F_D$ is the single channel fail-danger probability then the overall fail-danger probability is
$$P_D = F_D^3 + 3RF_D^2 = F_D^2(3R + F_D)$$
In a protective system $R \approx 1$ and $F_D \ll 1$, giving
$$P_D \approx 3F_D^2$$


Reliability of Systems
Fail-Safe & Fail-Danger
Looking at fail-safe
A fail-safe failure of no channels or only one
channel will not cause a plant trip with 2 oo 3
voting.
A fail-safe failure of two channels will cause an
unnecessary plant trip.
The failure of a third channel is irrelevant because
the plant is tripped by only 2 channels.
The overall fail-safe probability is therefore
$$P_S = 3RF_S^2 \approx 3F_S^2$$
where $F_S$ is the single channel fail-safe probability

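The 2 oo 3 fail-danger and fail-safe expressions can be evaluated side by side with their approximations. In this sketch R is taken as 1 minus the relevant single-channel failure probability, which is a simplifying assumption made for the example, and the numerical values are hypothetical.

```python
def fail_danger_2oo3(F_D):
    """P_D = F_D^3 + 3 R F_D^2 (2 or 3 channels fail dangerously)."""
    R = 1.0 - F_D  # assumption: R is the complement of the fail-danger probability
    return F_D**3 + 3.0 * R * F_D**2

def fail_safe_2oo3(F_S):
    """P_S = 3 R F_S^2 (two channels fail safe and trip the plant)."""
    R = 1.0 - F_S  # assumption: R is the complement of the fail-safe probability
    return 3.0 * R * F_S**2

F_D = 0.01  # hypothetical single-channel fail-danger probability
F_S = 0.01  # hypothetical single-channel fail-safe probability
P_D = fail_danger_2oo3(F_D)
P_S = fail_safe_2oo3(F_S)
print(f"P_D = {P_D:.4e} (approx {3 * F_D**2:.4e})")
print(f"P_S = {P_S:.4e} (approx {3 * F_S**2:.4e})")
```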

Reliability of Systems
Fail-Safe & Fail-Danger
Overall fail-danger probability
$$P_D = {}^nC_r F_D^r$$
where $r = n - m + 1$ and ${}^nC_r = n! / \{r!(n-r)!\}$
$$F_D^{MAX} = 1 - e^{-\lambda_D T} \approx \lambda_D T \quad (\text{if } \lambda_D T \ll 1)$$
Overall fail-safe probability
$$P_S = {}^nC_m F_S^m$$
where ${}^nC_m = n! / \{m!(n-m)!\}$
$$F_S^{MAX} = 1 - e^{-\lambda_S T} \approx \lambda_S T \quad (\text{if } \lambda_S T \ll 1)$$


Reliability of Systems
Fail-Safe & Fail-Danger
FRACTIONAL DEAD TIME FDT
Is related to fail-danger probability
Is the mean proportion of the testing interval T that
the trip system is incapable of protecting the plant
$FDT = \frac{1}{T}\int_0^T F_D(t)\,dt$
FDT is a similar concept to unavailability
$FDT = \frac{1}{r+1}\,{}^nC_r F_D^r$
Majority voting can be implemented with combinatorial logic.
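The closed-form fractional dead time can be checked against the integral definition numerically. The sketch below assumes the single-channel fail-danger probability grows linearly over the test interval (probability approximately equal to failure rate times elapsed time) and reads $F_D$ in the closed form as its value at the end of the interval; both are assumptions, as are the function names and sample numbers.

```python
from math import comb


def fdt_closed_form(n, m, fd_end):
    """FDT = nCr * FD^r / (r + 1), with FD taken at the end of the test interval."""
    r = n - m + 1
    return comb(n, r) * fd_end**r / (r + 1)


def fdt_numeric(n, m, lam, test_interval, steps=10_000):
    """Midpoint-rule average of nCr * (lam * t)^r over one test interval."""
    r = n - m + 1
    dt = test_interval / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += comb(n, r) * (lam * t) ** r * dt
    return total / test_interval


lam = 0.6          # assumed failures per year
interval = 3 / 52  # assumed three-week test interval, in years
closed = fdt_closed_form(3, 2, lam * interval)
numeric = fdt_numeric(3, 2, lam, interval)
print(closed, numeric)
```

Under the linear assumption the two values agree closely, which shows where the 1/(r+1) factor comes from: it is the average of $t^r$ over the interval.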

FAULT TREE, EVENT TREE and FMECA ANALYSIS

To check for fault propagation one technique is
Failure Modes, Effects and Criticality Analysis (FMECA).
A full FMECA is hard and expensive. Take every
component, wire and connector and think of every possible
fault. Consider the effects of all of these: are they single
point failures? Can they propagate? Document the
results.
An FMECA is a development from an FMEA (Failure
Modes and Effects Analysis).
FMEA and FMECA are bottom-up analyses. The alternative
approach is a fault tree analysis.

FAULT TREE, EVENT TREE and FMECA ANALYSIS
Event trees are
Encountered frequently in the analysis of events including human activities that can lead to disasters or undesirable events
Sometimes called Cause-Consequence analysis
Used more frequently in Safety studies


EVENT TREE
Consider the following example of a fire alarm system.
Ideally if there is a fire then
The alarm goes off.
A sprinkler system extinguishes the fire.
In each case there is a human standby:
if either the alarm or the sprinkler system fails,
a human operator can operate either or both.

This can be represented by the event tree in the figure


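EVENT TREE
[Figure: event tree for the fire alarm system. Branch probabilities and outcomes recovered from the figure are listed here.]

Branch probabilities
Fire starts: YES $5 \times 10^{-4}$, NO 0.9995
Alarm functions: YES 0.999, NO $10^{-3}$
Operator notices alarm malfunction: YES 0.99, NO $10^{-2}$
Sprinkler system functions: YES 0.998, NO $2 \times 10^{-3}$
Operator notices sprinkler malfunction (alarm functioned): YES 0.9, NO 0.1
Operator notices sprinkler malfunction (alarm failed): YES 0.999, NO $10^{-3}$

Outcomes
Fire Suppressed, $4.98 \times 10^{-4}$: alarm functions, sprinkler functions
Fire Suppressed, $8.99 \times 10^{-7}$: alarm functions, sprinkler fails, operator notices
Fire Spreads, $9.99 \times 10^{-8}$: alarm functions, sprinkler fails, operator fails to notice
Fire Suppressed, $4.9 \times 10^{-7}$: alarm fails, operator notices, sprinkler functions
Fire Suppressed, $9.9 \times 10^{-10}$: alarm fails, operator notices, sprinkler fails, operator notices
Fire Spreads, $9.9 \times 10^{-13}$: alarm fails, operator notices, sprinkler fails, operator fails to notice
Fire Spreads, $5 \times 10^{-9}$: alarm fails, operator fails to notice
NO FIRE, 0.9995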

EVENT TREE
Notice that of all the possible outcomes only three are that the fire
spreads.
The possible sequences of events that can lead to this undesirable
event can now be identified from these outcomes:
The alarm fails to function and the operator fails to notice and
take action in time
The alarm functions but the sprinkler fails to function and the
operator fails to notice and take action in time
The alarm fails to function, and the operator notices, but the
sprinkler fails to function, and the operator fails to notice
If sufficient data exist to estimate the branch probabilities, the likelihood of the
various outcomes can be obtained.
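The outcome probabilities are obtained by multiplying the branch probabilities along each path of the event tree. The Python sketch below does this using the branch probabilities read from the fire alarm event tree figure; the variable names and the layout of the `outcomes` list are illustrative assumptions.

```python
# Branch probabilities read from the event tree figure
P_FIRE = 5e-4
P_ALARM_OK = 0.999
P_OPERATOR_SEES_ALARM_FAIL = 0.99
P_SPRINKLER_OK = 0.998
P_OPERATOR_SEES_SPRINKLER_FAIL = 0.9             # branch where the alarm functioned
P_OPERATOR_SEES_SPRINKLER_FAIL_NO_ALARM = 0.999  # branch where the alarm failed

alarm_fails = 1 - P_ALARM_OK
sprinkler_fails = 1 - P_SPRINKLER_OK
manual_path = P_FIRE * alarm_fails * P_OPERATOR_SEES_ALARM_FAIL

# Each entry: (outcome, probability of the path leading to it)
outcomes = [
    ("suppressed", P_FIRE * P_ALARM_OK * P_SPRINKLER_OK),
    ("suppressed", P_FIRE * P_ALARM_OK * sprinkler_fails * P_OPERATOR_SEES_SPRINKLER_FAIL),
    ("spreads", P_FIRE * P_ALARM_OK * sprinkler_fails * (1 - P_OPERATOR_SEES_SPRINKLER_FAIL)),
    ("suppressed", manual_path * P_SPRINKLER_OK),
    ("suppressed", manual_path * sprinkler_fails * P_OPERATOR_SEES_SPRINKLER_FAIL_NO_ALARM),
    ("spreads", manual_path * sprinkler_fails * (1 - P_OPERATOR_SEES_SPRINKLER_FAIL_NO_ALARM)),
    ("spreads", P_FIRE * alarm_fails * (1 - P_OPERATOR_SEES_ALARM_FAIL)),
]

p_spreads = sum(p for outcome, p in outcomes if outcome == "spreads")
p_all_fire_paths = sum(p for _, p in outcomes)

for outcome, p in outcomes:
    print(f"{outcome:10s} {p:.3e}")
print(f"fire spreads (total): {p_spreads:.3e}")
```

A useful sanity check on any event tree is that the path probabilities sum back to the probability of the initiating event, here the probability that a fire starts.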

FAULT TREE
An Example of a Deductive approach.
What can cause this?
Used to identify the causal relationships leading to a
specific system failure mode.
The system failure mode is the TOP event and the
FAULT TREE is developed in branches below this
event showing its causes.


Fault Tree From Logic Expression
$T = (abc + f)[(a + d)f](a + be)$
[Figure: fault tree for the top event T, with intermediate events T1 to T6 and basic events a, b, c, d, e, f]


Fault Tree from Logic Expression


Simplifying the expression:
T = (abc + f)[(a + d)f](a +be)
= (abc +f)(af + df)(a + be)
= abcf + af + abcdf + adf + abcef + abef + abcdef + bdef
Using XX = X
= abcf(1+d+e+de) + af (1+d+be) + bdef
Using (1 + X) = 1
= abcf + af + bdef
= af (bc + 1) + bdef
= af + bdef
= f(a +bde)
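The simplification can be verified by brute force: with six basic events there are only 64 input combinations, so the original and reduced expressions can be compared over the whole truth table. The Python sketch below does this; the function names are illustrative.

```python
from itertools import product


def t_original(a, b, c, d, e, f):
    """T = (abc + f)[(a + d)f](a + be)"""
    return ((a and b and c) or f) and ((a or d) and f) and (a or (b and e))


def t_simplified(a, b, c, d, e, f):
    """T = f(a + bde)"""
    return f and (a or (b and d and e))


# Compare the two expressions for every combination of the six basic events
mismatches = [
    values
    for values in product([False, True], repeat=6)
    if bool(t_original(*values)) != bool(t_simplified(*values))
]
print("equivalent" if not mismatches else f"differs at {mismatches}")
```

The reduced form shows that event f appears in every minimal cut set: the top event cannot occur without it.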

QUALITY DESIGN AND QUALIFICATION TESTING ACTIVITIES
Design testing refers to laboratory tests on computer and/or prototype models
to prove that the design is capable of meeting the quality specification


QUALITY DESIGN AND QUALIFICATION
Qualification testing refers to field testing of pre-production models and production models,
involving all performance characteristics over the full range of relevant environmental variables,
to further verify that the specification can be met.


DESIGN FOR RELIABILITY


Objective
To design a given product or system which
meets the target failure rate $\lambda_T$
under the environmental conditions
specified.
It is assumed that
all components and elements are operating
in the useful life region where failure rate is
constant with time.


DESIGN FOR RELIABILITY


General principles to be observed.
a) Element / component selection
Only elements / components with well
established failure rate data / models should be
used
Some technologies are inherently more reliable than
others.
e.g.
Solid state switching devices are more reliable
than electromechanical reed relays
Inductive displacement transducers are more
reliable than the resistive potentiometer type.


DESIGN FOR RELIABILITY


b) De-rating
Stress (x) was defined as a variable which, when applied
to an element or component, tends to increase the failure
rate.
e.g.
mechanical stress
voltage
Strength (y) was defined as any property of the
element or component which resists the applied
stress
e.g.
elastic limit
rated voltage

DESIGN FOR RELIABILITY


To reduce failure rate
strength should exceed stress by an adequate Safety Margin
$SM = \frac{\bar{y} - \bar{x}}{\sqrt{\sigma_x^2 + \sigma_y^2}}$
In a mechanical element SM > 5.0
In an electronic circuit the voltage
Stress Ratio SR should be kept below 0.7
$SR = x / y$
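The two de-rating checks can be sketched in Python. This assumes the usual definition of safety margin (mean strength minus mean stress, divided by the root sum of squares of the two standard deviations) and takes stress ratio as applied stress over rated strength. The function names and all numerical values are illustrative assumptions.

```python
from math import sqrt


def safety_margin(mean_strength, mean_stress, sd_strength, sd_stress):
    """SM = (mean strength - mean stress) / sqrt(sd_stress^2 + sd_strength^2)."""
    return (mean_strength - mean_stress) / sqrt(sd_stress**2 + sd_strength**2)


def stress_ratio(applied, rated):
    """SR = applied stress / rated strength."""
    return applied / rated


# Mechanical element (assumed values, arbitrary consistent units)
sm = safety_margin(mean_strength=110.0, mean_stress=50.0, sd_strength=8.0, sd_stress=6.0)
mechanical_ok = sm > 5.0

# Capacitor rated at 25 V used on a 16 V rail (assumed values)
sr = stress_ratio(applied=16.0, rated=25.0)
electrical_ok = sr < 0.7

print(f"SM={sm:.2f} ok={mechanical_ok}  SR={sr:.2f} ok={electrical_ok}")
```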


DESIGN FOR RELIABILITY

e) Redundancy
The use of several identical elements /
systems connected in parallel increases the
reliability of the overall system.
Redundancy should be considered in situations
where either the complete system or certain
elements of the system have too high a failure
rate.
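The gain from redundancy can be illustrated with the standard result for identical elements in active parallel: the system fails only if every element fails. The sketch below assumes independent failures (no common mode failure) and uses an illustrative element reliability of 0.9; the function name is hypothetical.

```python
def parallel_reliability(element_reliability, n):
    """Reliability of n identical, independent elements in active parallel."""
    return 1.0 - (1.0 - element_reliability) ** n


r = 0.9  # assumed reliability of a single element
table = {n: parallel_reliability(r, n) for n in (1, 2, 3)}
for n, value in table.items():
    print(f"{n} element(s): R = {value:.4f}")
```

Each additional element removes a further factor of (1 - R) from the system unreliability, so the improvement is largest for the first duplicate and diminishes thereafter.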


DESIGN FOR RELIABILITY


If the probability of common mode failure limits
the reliability of the overall system,
equipment diversity should be considered.
Here a common function is carried out by two
systems in parallel, but each system is made up of
different elements
with different operating principles
e.g.
A temperature measurement device made up of
two subsystems in parallel
one electronic
one pneumatic

High Reliability Design


The system designer may consider component
redundancy
ADVANTAGES of redundancy
The quickest solution if time is of prime importance
The easiest solution, if the component is already
designed
The cheapest solution, if the component is
economical in comparison with the cost of redesign
The only solution, if the reliability requirement is
beyond the state of the art


High Reliability Design


DISADVANTAGES of redundancy
Too expensive, if the components are costly
Exceed the limitations on size and weight,
particularly in satellites
Exceed the power limitations, particularly in active
redundancy
Attenuate the input signal, requiring additional
amplifiers which increase complexity
Require sensing and switching circuitry so complex
as to offset the advantage of redundancy


Exercises
1) Discuss, giving examples, the methods, including procurement and testing procedures,
used by manufacturers to ensure the reliability of a product.
2) Discuss the differences in reliability required in systems such as consumer products,
trains, aeroplanes and satellites. What value would you assign to the overall failure rate of
each of these systems?
3) What do you understand by the reliability of a system? Discuss some practical ways to
assign a quantitative value to the reliability of a system.
4) Draw a fault tree for the lighting system in a car and hence derive a logic equation for
the failure of the headlights.
5) Discuss the concepts of fail-safe and fail-danger. Why is the single unit probability for
fail-safe and fail-danger often assumed to be the same? Explain why, in spite of this
assumption, the overall fail-safe probability of a complex system will be different from the
overall fail-danger probability.


Problems
1) The figure shows a protective system based on temperature measurement. The system is to have a
maximum fail-danger probability not exceeding $8 \times 10^{-3}$ and a maximum fail-safe probability not exceeding
$5 \times 10^{-2}$. The system is tested and proved to be working correctly at three-week intervals. Annual fail-safe
and fail-danger failure rates for each component are:
Thermocouple: $\lambda_S = \lambda_D = 0.5$
Thermocouple input trip amplifier/comparator: $\lambda_S = \lambda_D = 0.1$
m out of n voting element: $\lambda_S = \lambda_D = 0.05$
Logic operated switch: $\lambda_S = \lambda_D = 0.1$
Solenoid valve: $\lambda_S = \lambda_D = 0.1$
Trip valve: $\lambda_S = \lambda_D = 0.1$
Calculate the maximum fail-safe and fail-danger probability $P_S$ and $P_D$ for:
a) The high integrity voting equipment, HIVE.
b) The high integrity trip initiator, HITI.
c) The high integrity shutdown system, HISS.
And hence
d) The total system fail-danger probability
e) The total system fail-safe probability
State whether the system meets the design criteria.


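Problems
[Figure: block diagram of the protective system, annual failure rates in brackets.
HITI: three channels, each a thermocouple (0.5) feeding a trip amp/comp (0.1).
HIVE: 2 oo 3 voting element (0.05).
HISS: two channels, each a logic switch (0.1), solenoid valve (0.1) and trip valve (0.1).]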


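Problems
Solution
i) HIVE
Maximum $F_S = F_D = 1 - e^{-0.05 \times 3/52} = 2.8 \times 10^{-3}$

ii) HITI single channel
Maximum $F_S = F_D = 1 - e^{-0.6 \times 3/52} = 3.4 \times 10^{-2}$
2 oo 3 voting HITI:
$P_D = 3F_D^2 = 3(3.4 \times 10^{-2})^2 \approx 3.4 \times 10^{-3}$
$P_S = 3F_S^2 \approx 3.4 \times 10^{-3}$

iii) HISS single channel
Maximum $F_S = F_D = 1 - e^{-0.3 \times 3/52} = 1.7 \times 10^{-2}$
Two channels in parallel:
$P_D = F_D^2 = (1.7 \times 10^{-2})^2 \approx 0.3 \times 10^{-3}$
$P_S = 2F_S = 2 \times 1.7 \times 10^{-2} \approx 3.44 \times 10^{-2}$

iv) Total system fail-danger probability $= (P_D)_{HITI} + (P_D)_{HIVE} + (P_D)_{HISS}$
$= 3.4 \times 10^{-3} + 2.8 \times 10^{-3} + 0.3 \times 10^{-3}$
$= 6.5 \times 10^{-3}$

Total system fail-safe probability $= (P_S)_{HITI} + (P_S)_{HIVE} + (P_S)_{HISS}$
$= 3.4 \times 10^{-3} + 2.8 \times 10^{-3} + 34.4 \times 10^{-3}$
$= 40.6 \times 10^{-3}$
$= 4.1 \times 10^{-2}$
The system meets the design specification, since $6.5 \times 10^{-3} < 8 \times 10^{-3}$ and $4.1 \times 10^{-2} < 5 \times 10^{-2}$.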
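The solution can be reproduced with a short Python sketch. It uses the failure rates and three-week test interval from the problem statement; the variable and function names are illustrative. Because the script carries unrounded intermediate values, its totals differ slightly from the rounded figures quoted in the solution, without changing the conclusion.

```python
from math import exp

TEST_INTERVAL = 3 / 52  # three weeks, expressed in years


def f_max(rate_per_year):
    """Maximum single-channel failure probability at the end of the test interval."""
    return 1.0 - exp(-rate_per_year * TEST_INTERVAL)


# Single-channel probabilities (fail-safe and fail-danger rates are equal)
f_hive = f_max(0.05)             # voting element
f_hiti = f_max(0.5 + 0.1)        # thermocouple + trip amplifier/comparator
f_hiss = f_max(0.1 + 0.1 + 0.1)  # logic switch + solenoid valve + trip valve

# HITI: 2 oo 3 voting; HIVE: single element; HISS: two channels in parallel
pd_hiti, ps_hiti = 3 * f_hiti**2, 3 * f_hiti**2
pd_hive, ps_hive = f_hive, f_hive
pd_hiss, ps_hiss = f_hiss**2, 2 * f_hiss

pd_total = pd_hiti + pd_hive + pd_hiss
ps_total = ps_hiti + ps_hive + ps_hiss

meets_spec = pd_total <= 8e-3 and ps_total <= 5e-2
print(f"PD total = {pd_total:.2e}, PS total = {ps_total:.2e}, meets spec: {meets_spec}")
```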


Problems

2) A taxi owner has 20 cars. Records show each car on average breaks down once every 2 years and that this is
reasonably constant. How many breakdown calls will he have per year?
What is the probability of 1 breakdown in a 3 month period?
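Solution.
Statistical assumptions hold
(for 1 car) $\lambda_c = 1/2$ per year
$\lambda_F = N\lambda_c = 10$ per year, i.e. 10 breakdown calls per year on average
3 months = 1/4 year
$F(t) = 1 - e^{-\lambda_F t}$
$F(1/4) = 1 - e^{-10 \times 1/4} = 0.92$
OR
92% chance of at least one failure in 3 months.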
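The calculation can be reproduced in Python. The sketch also evaluates the probability of exactly one breakdown in the period, which goes beyond the solution given here and assumes that breakdowns follow a Poisson process with the same constant fleet rate; the variable names are illustrative.

```python
from math import exp

n_cars = 20
rate_per_car = 1 / 2                # breakdowns per car per year
fleet_rate = n_cars * rate_per_car  # expected breakdown calls per year
period = 1 / 4                      # three months, in years

expected_in_period = fleet_rate * period
p_at_least_one = 1.0 - exp(-expected_in_period)

# Assumption: Poisson process, so P(exactly k) = mu^k * exp(-mu) / k!
p_exactly_one = expected_in_period * exp(-expected_in_period)

print(f"calls per year = {fleet_rate:.0f}")
print(f"P(at least one in 3 months) = {p_at_least_one:.3f}")
print(f"P(exactly one in 3 months)  = {p_exactly_one:.3f}")
```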


Problems
3) A basic guidance and navigation system for a proposed space probe consists of an Inertial Set, a
Canopus Sensor, a Sun Sensor, and a Computer. The reliability for each device is $R_{\text{inertial set}} = 0.95$;
$R_{\text{sun sensor}} = 0.90$; $R_{\text{canopus sensor}} = 0.85$; $R_{\text{computer}} = 0.90$. For the system to operate all four
subsystems must be operating. Due to design constraints the space probe can only contain one
Inertial Set and one Computer. To increase the reliability of the system three Canopus Sensors
and two Sun Sensors are used in hot redundancy.
(i). Draw a reliability block diagram for the system without the redundancy
(ii). Draw a reliability block diagram for the system including the redundancy
Calculate the reliability of the system in each case.
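One way to approach the calculation is sketched below in Python: the four subsystems form a series chain, and each redundant sensor group is treated as identical units in active parallel. The sketch assumes independent failures and that a sensor group works if at least one of its units works; the function names are illustrative and this is not an official solution.

```python
from math import prod


def parallel(reliability, n):
    """n identical, independent units in hot (active) redundancy."""
    return 1.0 - (1.0 - reliability) ** n


def series(*reliabilities):
    """All blocks must work, so the reliabilities multiply."""
    return prod(reliabilities)


R_INERTIAL, R_SUN, R_CANOPUS, R_COMPUTER = 0.95, 0.90, 0.85, 0.90

r_without = series(R_INERTIAL, R_CANOPUS, R_SUN, R_COMPUTER)
r_with = series(R_INERTIAL, parallel(R_CANOPUS, 3), parallel(R_SUN, 2), R_COMPUTER)

print(f"without redundancy: {r_without:.4f}")
print(f"with redundancy:    {r_with:.4f}")
```

With the redundant sensors in place the system reliability is bounded by the product of the two non-redundant blocks, the Inertial Set and the Computer.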


Problems
(4) In a distillation column a hazardous situation is created if the flow rate of steam to the reboiler goes high; this
causes a high flow rate of vapour up the column producing a high pressure which could cause the vessel to rupture.
The temperature control loop consists of a platinum resistance thermometer (PRT), a transmitter ( which converts
resistance change to a 4-20mA current signal), a controller, a current-to-pneumatic converter and a control valve.
The plant is protected by a pressure trip system consisting of a pressure switch and three-way solenoid valve
located in the air line between the converter and the control valve.
Failure mode and effect analysis of the system shows that:
1. A fail-danger situation F in which the Steam control valve moves fully open, occurs if either Pressure in valve
bonnet increases (F1) or Control valve fails open (F2).
2. F1 occurs if Pressure signal to control valve increases (F3) and Solenoid does not vent air (F4).
3. F3 occurs if PRT short circuit (F5) or Transmitter O/P fails low (F6) or Controller O/P fails high (F7) or I/P
Converter O/P fails high (F8).
4. F4 occurs if Pressure switch fails to open (F9) or Solenoid fails to vent (F10).
(i). Draw a fault tree diagram for the fail-danger failure
(ii). Write down the logic expression for the fail-danger failure F.
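The failure mode and effect statements translate directly into gates: "or" becomes an OR gate and "and" becomes an AND gate. The Python sketch below encodes them as a Boolean function so that candidate combinations of basic events can be tested; the function and argument names are illustrative assumptions.

```python
def fail_danger(f2=False, f5=False, f6=False, f7=False, f8=False, f9=False, f10=False):
    """Top event F from the basic events F2 and F5 to F10 of Problem 4."""
    f3 = f5 or f6 or f7 or f8  # pressure signal to control valve increases
    f4 = f9 or f10             # solenoid does not vent air
    f1 = f3 and f4             # pressure in valve bonnet increases
    return f1 or f2            # steam control valve moves fully open


# A control loop fault alone is caught by the trip system
prt_short_only = fail_danger(f5=True)
# The same fault together with a trip system fault is dangerous
prt_short_and_switch_stuck = fail_danger(f5=True, f9=True)
# The control valve failing open is a single point failure
valve_fails_open = fail_danger(f2=True)

print(prt_short_only, prt_short_and_switch_stuck, valve_fails_open)
```

Written out, the function corresponds to F = F2 + (F5 + F6 + F7 + F8)(F9 + F10).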
