
Fault Tree Analysis (Reliability)

for Beginners
Introduction to Reliability & its
Applications
• Reliability is the probability that an item can perform a
required function under given conditions for
a given time interval.

• Reliability is a measure of failures over a
period of time and applies to any
mechanical or electrical task as well as to human
tasks.
Bath - tub Curve
• Reliability analysis is applied to system components
which are assumed to have settled down
into the steady-state or useful life phase.

• The reliability characteristics of most
components follow the so-called reliability
‘Bath - tub Curve’.
[Figure: Reliability bathtub curve. Hazard rate (or number of failures) plotted against component life cycle in years (1, 2, 3 ... 30), divided into three phases: I Burn-in, II Useful life, III Wear-out.]


The three phases of the ‘Bath - tub
Curve’

• Phase I - Declining hazard rate as weak
components are eliminated (Burn-in period)
• Phase II - Approximately constant hazard rate due
to chance failures (Useful life)
• Phase III - Increasing hazard rate due to age (Wear-out
period)
COST IMPLICATION OF
RELIABILITY
• Reliability assessment and implementation
will optimise life cycle cost of equipment.
• The life cycle is: -
• Design and prototype development
• Manufacture
• In-service development
• Maintenance
• Obsolescence
• Withdrawal from service
• Eventual scrapping and disposal
Reliability Methodology

[Figure: Reliability methodology flow diagram. Four inputs (Mechanical Components Reliability Database, Electrical Components Reliability Database, Instrument Components Reliability Database, Human Error Reliability Database) feed a Reliability Modelling & Analysis Program. Remaining labels: Reliability Driven Design; System Design Performance Monitoring Criteria and Life Cycle Costs.]
Analysis Framework

• Define the existing system design concept
and the various subsystems
• Identify the components (i.e. within the
subsystems)
• Specify the requirements and performance
criteria
• Apply reliability analysis to the overall
system
Some Useful Definitions
• Failure Rate; The number of failures experienced or expected for a
device divided by the total equipment operating time. For a constant failure rate, it is
the numerical inverse of the mean time between failures (MTBF).

• Mean Time to Repair (MTTR); The total amount of time spent
performing all corrective maintenance repairs divided by the number of those repairs.

• Redundancy; The existence of more than one piece of equipment, any
of which could perform a given function.

• Fault Tree Analysis; A deductive, top-down method of analyzing
system design and performance based on the failure logic of system
components. It is performed graphically using a logical structure of AND, OR
and Voting gates.

• Common Cause Failure; An event or mechanism that can cause two or
more failures simultaneously.
Basics of Simple Reliability
Modelling
• Reliability Modelling is based on Boolean
Algebra and uses Fault Tree Analysis.

• The objective of this presentation is to simplify it
and use Excel spreadsheets for high-level
applications (e.g. high-level SIL
calculations).
Introduction to Fault Tree Logic Gates

[Figure: Logic gate symbols for the ‘OR’ Gate, ‘AND’ Gate and ‘Voting’ Gate. Each gate takes Input 1 to Input n and produces a single Output.]
‘OR’ Gate Modelling

[Figure: ‘OR’ Gate with inputs P1 and P2 and output Ptotal.]

• For an ‘OR’ Gate, add the failure probabilities: $P_{total} = P_1 + P_2$.
This is an approximation that holds when the probabilities are small; for
independent inputs the exact value is $P_{total} = P_1 + P_2 - P_1 P_2$.

• Example 1: An ESD loop fails if either the valve or the sensor fails.

• Example 2: A household power cut occurs if either the power
transmission lines or power generation fails.
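
The difference between the simple sum and the exact result for independent inputs can be checked with a short Python sketch. The valve and sensor probabilities below are hypothetical values chosen for illustration only; they do not come from the slides.

```python
def or_gate_approx(probabilities):
    """Rare-event approximation for an OR gate: sum of input probabilities."""
    return sum(probabilities)


def or_gate_exact(probabilities):
    """Exact OR-gate output for independent inputs: 1 - prod(1 - p_i)."""
    none_fail = 1.0
    for p in probabilities:
        none_fail *= 1.0 - p
    return 1.0 - none_fail


# Hypothetical ESD loop inputs (illustrative values, not from the source)
p_valve = 1e-3
p_sensor = 2e-3

approx = or_gate_approx([p_valve, p_sensor])
exact = or_gate_exact([p_valve, p_sensor])
print(f"approximate: {approx:.6f}   exact: {exact:.6f}")
```

For small probabilities the two values agree closely, and the sum is always the more pessimistic of the two.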
‘AND’ Gate Modelling

[Figure: ‘AND’ Gate with inputs P1 and P2 and output Ptotal.]

• For an ‘AND’ Gate, multiply the failure probabilities (assuming the failures
are independent): $P_{total} = P_1 \times P_2$

• Example 1: For compressor damage caused by hydraulic imbalance to occur, the
surge control valve must fail and the vibration interlock must also fail
to shut down the compressor, i.e. there must be a coincidence of
hazardous events.

• Example 2: For a kitchen fire / explosion, the oven gas must leak and an
ignition source must also be present.
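
A minimal Python sketch of the ‘AND’ gate rule follows. It assumes the input events are independent, and the gas leak and ignition probabilities are hypothetical values chosen for illustration only.

```python
import math


def and_gate(probabilities):
    """AND-gate output for independent inputs: product of input probabilities."""
    return math.prod(probabilities)


# Hypothetical kitchen fire inputs (illustrative values, not from the source)
p_gas_leak = 1e-3
p_ignition_source = 1e-2

p_fire = and_gate([p_gas_leak, p_ignition_source])
print(f"P(fire) = {p_fire:.2e}")
```

Because both events must coincide, the output is much smaller than either input.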
‘VOTING’ Gate Modelling
[Figure: ‘Voting’ Gate with inputs Device 1, Device 2 and Device 3, each with failure probability PD, and output Ptotal.]

• The most common form is “2 out of 3” (2oo3) sensors / switches.
• Example: 2oo3 pressure detection
• The detectors are commonly identical and have similar failure
probabilities ‘PD’
• $P_{total} = 3P_D^2(1 - P_D) + P_D^3$
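
The 2oo3 expression is the probability that at least two of the three devices have failed. A short Python sketch, assuming identical and independent devices, generalises this to any "k failures out of n" count and checks it against the 2oo3 formula. The function names and the value of PD are illustrative assumptions.

```python
from math import comb


def at_least_k_failed(p_device, k, n):
    """Probability that k or more of n identical, independent devices have failed."""
    return sum(
        comb(n, i) * p_device**i * (1.0 - p_device) ** (n - i)
        for i in range(k, n + 1)
    )


def two_out_of_three(p_device):
    """Closed form for the 2oo3 voting gate: 3*PD^2*(1 - PD) + PD^3."""
    return 3 * p_device**2 * (1.0 - p_device) + p_device**3


p_d = 0.01  # hypothetical detector failure probability
print(f"binomial sum: {at_least_k_failed(p_d, 2, 3):.3e}")
print(f"closed form:  {two_out_of_three(p_d):.3e}")
```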
How to Calculate Item Failure
Probability
• Basic Equation; $R = e^{-\lambda t}$
• Reliability = e^(-Failure Rate x Time)
• Failure Probability = 1 - Reliability
• At low values of λt (i.e. λt << 1) Reliability ≈ 1 - λt
• Thus $P_{failure} \approx \lambda t$
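
The quality of the linear approximation can be checked numerically. The Python sketch below compares the exact failure probability with λt for a small and a larger value of λt; the failure rate and times used are illustrative assumptions.

```python
import math


def failure_probability_exact(failure_rate, time):
    """1 - exp(-lambda * t), computed with expm1 for accuracy at small lambda * t."""
    return -math.expm1(-failure_rate * time)


def failure_probability_linear(failure_rate, time):
    """Linear approximation lambda * t, valid only when lambda * t << 1."""
    return failure_rate * time


failure_rate = 40e-6  # per hour (illustrative)
for hours in (5, 8760):
    exact = failure_probability_exact(failure_rate, hours)
    linear = failure_probability_linear(failure_rate, hours)
    print(f"t = {hours:>5} h   exact = {exact:.6f}   linear = {linear:.6f}")
```

At 5 hours the two values are almost identical, while over a full year (λt of about 0.35) the linear form noticeably overestimates the failure probability.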
Revealed Failure Probability

$P_{failure} = \lambda t$

• Pfailure = Failure probability between 0 and 1
• t = Mean time to repair an item (MTTR), in hours
• λ = Failure rate (quoted per million item hours, i.e. as e-6/hr)
Hidden (Dormant) Failure Probability

$P_{failure} = \lambda \left( \frac{\theta}{2} + MTTR \right)$

• θ = Maintenance frequency, expressed as the interval between maintenance checks, in hours
• Pfailure = Failure probability between 0 and 1
• MTTR = Mean time to repair an item, in hours
• λ = Failure rate (quoted per million item hours, i.e. as e-6/hr)
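
The slides do not derive the θ/2 term. One common way to see where it comes from, assuming a constant failure rate λ and λθ << 1, is to average the chance of an undetected failure over the interval between maintenance checks and then add the time spent under repair:

```latex
\begin{align}
P(\text{failed at time } s \text{ after a check}) &= 1 - e^{-\lambda s} \approx \lambda s \\
\bar{P}_{\text{undetected}} &= \frac{1}{\theta} \int_0^{\theta} \lambda s \, ds = \frac{\lambda \theta}{2} \\
P_{failure} &\approx \frac{\lambda \theta}{2} + \lambda \, MTTR
             = \lambda \left( \frac{\theta}{2} + MTTR \right)
\end{align}
```

On average a dormant failure stays hidden for half the maintenance interval, which is why halving the interval roughly halves the failure probability.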
Failure Data
• Estimate the component failure rate for reliability
analysis - use generic historical performance
analysis of components or vendor data.

• Failure rate is the rate at which failure occurs as a
function of time or as a function of demand.

• It is expressed as the expected number of failures
of a given failure mode, per item, as failures per
million item hours.
Typical Generic Failure Data (OREDA -2002)

• Turbine Driven Centrifugal Compressor (10000-20000 kW): Mean
Critical Failure Rate = 79.26e-6/hr and Active Repair Time = 5.6 hours
• Gas Turbine Aeroderivative (10000-20000 kW): Mean Critical Failure
Rate = 591.75e-6/hr and Active Repair Time = 15.9 hours
• Centrifugal Oil Export Pump: Mean Critical Failure Rate = 183.60e-6/hr
and Active Repair Time = 53.3 hours
• Conventional PSV: Mean Critical Failure Rate = 3.84e-6/hr and Active
Repair Time = 6.3 hours
• ESD Valve: Mean Critical Failure Rate = 18.82e-6/hr and Active
Repair Time = 3.5 hours
• Process Control Valve: Mean Critical Failure Rate = 6.91e-6/hr and
Active Repair Time = 14.2 hours
• Level Process Sensor: Mean Critical Failure Rate = 3.99e-6/hr and
Active Repair Time = 5.0 hours
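
As an illustration of how such data might be used, the Python sketch below converts each listed failure rate to an MTBF and to a revealed failure probability (λ x repair time). Treating the Active Repair Time as the MTTR, and treating every failure as revealed, are simplifying assumptions made here for illustration; the slides do not state them.

```python
# (item, mean critical failure rate per hour, active repair time in hours)
GENERIC_FAILURE_DATA = [
    ("Turbine driven centrifugal compressor", 79.26e-6, 5.6),
    ("Gas turbine aeroderivative", 591.75e-6, 15.9),
    ("Centrifugal oil export pump", 183.60e-6, 53.3),
    ("Conventional PSV", 3.84e-6, 6.3),
    ("ESD valve", 18.82e-6, 3.5),
    ("Process control valve", 6.91e-6, 14.2),
    ("Level process sensor", 3.99e-6, 5.0),
]

results = {}
for item, failure_rate, repair_hours in GENERIC_FAILURE_DATA:
    mtbf_hours = 1.0 / failure_rate            # MTBF is the inverse of the failure rate
    p_revealed = failure_rate * repair_hours   # assumes repair time stands in for MTTR
    results[item] = (mtbf_hours, p_revealed)
    print(f"{item:<40} MTBF = {mtbf_hours:>9.0f} h   P = {p_revealed:.2e}")
```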
Revealed Failure Probability Example

If the car gearbox failure rate is λ = 40e-6/hr (or
0.350 failures per year), then using the
equation $P_{failure} = \lambda t$

(It is a revealed failure, as the car will not move if the
gearbox is broken)

t = Mean time to repair the car gearbox = 5 hrs

The failure probability is P = 40e-6 x 5 = 0.0002
(i.e. reliability of 99.98%)
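
The gearbox arithmetic can be reproduced in a few lines of Python. The variable names are illustrative, and 8760 hours per year is assumed for the annual conversion.

```python
HOURS_PER_YEAR = 8760  # assumed conversion factor

failure_rate_per_hour = 40e-6  # gearbox failure rate from the example
mttr_hours = 5.0               # mean time to repair from the example

failures_per_year = failure_rate_per_hour * HOURS_PER_YEAR
p_failure = failure_rate_per_hour * mttr_hours  # revealed failure: lambda * t
reliability = 1.0 - p_failure

print(f"failures per year: {failures_per_year:.3f}")
print(f"failure probability: {p_failure:.4f}")
print(f"reliability: {reliability:.2%}")
```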
Hidden (or Dormant) Failure Probability
Example
If the car brake failure rate is λ = 1.5e-6/hr (or 0.013 failures
per year), then using the equation

$P_{failure} = \lambda \left( \frac{\theta}{2} + MTTR \right)$

(It is a hidden ‘or dormant’ failure, as it is only revealed on
application)

θ = 4320 hrs (6 monthly maintenance frequency)

MTTR = Mean time to repair the car brake = 4 hrs
The failure probability is P = 1.5e-6 x (4320/2 + 4) = 0.0032
(i.e. Reliability of 99.68%)
Effect of Maintenance Frequency
on Dormant Failures

In the previous example, if the maintenance
frequency is changed to
θ = 8640 hrs (12 monthly maintenance frequency)
The failure probability is P = 1.5e-6 x (8640/2 + 4) = 0.0065
(i.e. Reliability of 99.35%)
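
The car brake figures for both maintenance intervals can be checked with a small Python sketch of the dormant failure equation. The function and variable names are illustrative.

```python
def dormant_failure_probability(failure_rate, maintenance_interval, mttr):
    """Hidden (dormant) failure probability: lambda * (theta / 2 + MTTR)."""
    return failure_rate * (maintenance_interval / 2.0 + mttr)


brake_failure_rate = 1.5e-6  # per hour, from the example
brake_mttr_hours = 4.0       # from the example

for label, interval_hours in (("6 monthly", 4320), ("12 monthly", 8640)):
    p = dormant_failure_probability(brake_failure_rate, interval_hours, brake_mttr_hours)
    print(f"{label:<10} theta = {interval_hours} h   P = {p:.4f}   reliability = {1 - p:.2%}")
```

Doubling the maintenance interval roughly doubles the failure probability, because the θ/2 term dominates the short repair time.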
Common Cause Failure
• Common Cause Failure; An event or mechanism that can
cause two or more failures simultaneously is called a
common cause.

• Applies only to dormant (hidden) failures.

• Use β-factor and incorporate in component failure rate;

$P_{failure} = \beta \lambda \left( \frac{\theta}{2} + MTTR \right)$

• Use Concept of Diversity;

Redundant Systems $\beta = 2 \times 10^{-1}$
Partially Diverse Systems $\beta = 2 \times 10^{-2}$
Fully Diverse Systems $\beta = 2 \times 10^{-3}$

• Add it to Fault Tree as an Event
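
One possible reading of these steps is sketched below in Python: the β-factor scales the component failure rate in the dormant failure equation, and the resulting common cause event is combined through an ‘OR’ gate with a 2oo3 voting group. Both that reading and the sensor failure rate, maintenance interval and repair time used are assumptions made for illustration; only the three β values come from the slide.

```python
BETA = {
    "redundant": 2e-1,
    "partially diverse": 2e-2,
    "fully diverse": 2e-3,
}


def dormant_probability(failure_rate, interval, mttr):
    """Dormant failure probability: lambda * (theta / 2 + MTTR)."""
    return failure_rate * (interval / 2.0 + mttr)


def two_out_of_three(p_device):
    """2oo3 voting gate: 3*PD^2*(1 - PD) + PD^3."""
    return 3 * p_device**2 * (1.0 - p_device) + p_device**3


def or_gate(p_a, p_b):
    """Exact OR gate for two independent events."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


# Hypothetical sensor data (not from the source)
sensor_rate, interval_hours, mttr_hours = 4e-6, 4320, 4.0

p_sensor = dormant_probability(sensor_rate, interval_hours, mttr_hours)
p_voting = two_out_of_three(p_sensor)

for system_type, beta in BETA.items():
    # Assumed form: beta applied to the failure rate of a single sensor
    p_ccf = dormant_probability(beta * sensor_rate, interval_hours, mttr_hours)
    p_total = or_gate(p_voting, p_ccf)
    print(f"{system_type:<18} P_ccf = {p_ccf:.2e}   P_total = {p_total:.2e}")
```

With these illustrative numbers the common cause event dominates the redundant case, which is the usual reason for adding it to the tree as its own event.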


Common Cause Failure

• It is not easy to model CCF
• Use it carefully
• Apply it when unsure of the ‘component
quality’
Application of Reliability Modelling

Exothermic Reaction Reliability Analysis


Exothermic Reaction Reliability
• Feeds A and B are reacted to produce C.
• Feeds A and B, and Product C are flammable and, under
certain conditions, explosive.
• If the flow rate of either Feed A or Feed B exceeds a certain
level, the reaction will run away.
• If the reaction temperature is not controlled, the reaction
path can shift, resulting in a runaway reaction.
• The runaway reaction results in vaporisation of the
reactants and overpressure of the vessel.
• The overpressure is developed too quickly to be relieved
using a pressure relief valve.
Exothermic Reaction Fault Tree Model
[Figure: fault tree, shown here as an indented list; the gate symbols are not recoverable from the extracted text.]

Reactor Process Control and Safety System Failure
    Cooling Water System Failure
        Cooling Water Temperature Control Failure
            Temperature Control Valve Failure
            Temperature Sensor Failure
        Cooling Water Pump System Failure
            Cooling Water Pump Failure
            Cooling Water Pump Motor Failure
    Reactants Flow Control System Failure
        Reactant A Flow Control Failure
            Line A Flow Sensor Failure
            Line A Flow Control Valve Failure
        Reactant B Flow Control Failure
            Line B Flow Sensor Failure
            Line B Flow Control Valve Failure

Note: The fault tree does not include the SIL 3 ESD loops on the Feed A & B inlet lines
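
A tree of this shape can be evaluated recursively. The Python sketch below encodes the events named in the model and assumes every gate is an ‘OR’ gate, since the gate types are not stated in the text. The basic event probability is a single hypothetical placeholder; real values would come from failure data and the revealed or dormant failure equations.

```python
import math


def evaluate(node, basic_probabilities):
    """Failure probability of a node; a node is an event name or (gate, children)."""
    if isinstance(node, str):
        return basic_probabilities[node]
    gate, children = node
    child_probabilities = [evaluate(child, basic_probabilities) for child in children]
    if gate == "OR":
        return 1.0 - math.prod(1.0 - p for p in child_probabilities)
    if gate == "AND":
        return math.prod(child_probabilities)
    raise ValueError(f"unknown gate type: {gate}")


def basic_events(node):
    """List the names of all basic events under a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node[1] for name in basic_events(child)]


# Gate types are an assumption: every gate is modelled as OR.
REACTOR_TREE = ("OR", [
    ("OR", [  # Cooling Water System Failure
        ("OR", ["Temperature Control Valve Failure", "Temperature Sensor Failure"]),
        ("OR", ["Cooling Water Pump Failure", "Cooling Water Pump Motor Failure"]),
    ]),
    ("OR", [  # Reactants Flow Control System Failure
        ("OR", ["Line A Flow Sensor Failure", "Line A Flow Control Valve Failure"]),
        ("OR", ["Line B Flow Sensor Failure", "Line B Flow Control Valve Failure"]),
    ]),
])

PLACEHOLDER_PROBABILITY = 1e-3  # hypothetical, identical for every basic event
probabilities = {name: PLACEHOLDER_PROBABILITY for name in basic_events(REACTOR_TREE)}

p_top = evaluate(REACTOR_TREE, probabilities)
print(f"{len(probabilities)} basic events, top event probability = {p_top:.3e}")
```

The same nested structure maps onto a spreadsheet, with one cell per gate that references the cells of its inputs.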
Fault Tree Analysis on Excel (in the
absence of Fault Tree Software)

Example:
A gas/liquid separator operates at high pressure.
If level detection and control fail, high pressure gas
breakthrough to a downstream vessel rated for a lower
pressure must be avoided.
Fault Tree Analysis on Excel

An ESD system needs to be installed on the high pressure
separator liquid outlet, and a Safety
Integrity Level calculation performed for SIL 1, SIL 2
and SIL 3 ESD system configurations.
