
Probabilistic Risk Assessment


References:

The materials for this lecture note have been adapted from the following references:

1. Stamatelatos, M. (2002). Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners. Washington, DC: NASA.
2. American Institute of Chemical Engineers (AIChE) (2000). Guidelines for Chemical Process Quantitative Risk Analysis, 2nd edition. New York: AIChE.
3. Crowl, D.A., Louvar, J.F. (2002). Chemical Process Safety: Fundamentals with Applications, 2nd edition. Upper Saddle River, NJ: Prentice Hall PTR.
4. Skelton, B. (1997). Process Safety Analysis: An Introduction. Institution of Chemical Engineers.
5. Modarres, M. Risk Analysis in Engineering: Techniques, Tools, and Trends.
6. Vose, D. Risk Analysis: A Quantitative Guide.

1. Introduction
 Probabilistic risk assessment (PRA) is a systematic procedure for investigating how complex systems are built and operated.
 A PRA model shows how the human, software, and hardware elements of a system interact with each other. It also identifies the most significant contributors to the risks of the system.
 Risk is expressed as a function of the frequency or probability of an accident and the consequences of the accident.
 PRA for process systems uses logic trees to model the causes and consequences of an accident.

Figure 1: Implementation of the Triplet Definition of Risk in PRA

Importance of PRA

The most important strengths of PRA, as a formal engineering approach to risk assessment, are:
 PRA provides an integrated and systematic examination of a broad set of design
and operational features of a complex system.
 PRA incorporates the influence of system interactions and human-system interfaces.
 PRA provides a model for incorporating operating experience with the complex
system and updating risk estimates.
 PRA provides a process for the explicit consideration of uncertainties.
 PRA permits the analysis of competing risks (e.g., of one system versus another, or of possible modifications to an existing system).
 PRA permits the analysis of assumption and data issues via sensitivity studies.
 PRA provides a measure of the absolute or relative importance of systems and components to the calculated risk value.
 PRA provides a quantitative measure of the overall level of health and safety for the engineered system.

2. Steps in conducting a PRA:


 NASA provided a systematic guideline for PRA

 The components of PRA and their interactions in the PRA process are depicted in Figure 2.

Figure 2: Components of the overall PRA process

2.1 Objectives and methodology:

 Define the objectives before preparing PRA for a system


 The most common objectives include design improvement, risk acceptability, decision support, regulatory and oversight support, and operational life management.
 Develop an inventory of possible techniques for the desired analysis.
 Methods for analyzing the system logic, data, human interactions, software errors, risk ranking, and uncertainty should be studied and selected.
2.2 Familiarization and Information assembly:

 Become familiar with the physical layout (e.g., facility, design, process) of the overall system, its administrative controls, barriers, subsystems, etc. The following steps are recommended:
1. Major critical barriers, structures, emergency and safety systems, and human interventions should be identified.
2. Physical interactions among all major subsystems (or parts of the system) should be identified and explicitly described. The results should be summarized in a dependency matrix.
3. Past major failures and abnormal events should be noted and studied.
4. A good filing system must be created at the outset and maintained throughout the study.
 With the help of the designers, operators, and owners, the information regarding the ground rules for the analysis, the scope of the analysis, and the configuration and phases of operation of the overall system should be assembled.
2.3 Initiating Events (IEs) Identification:

 This step identifies the abnormal events that can perturb the system.


 The complete set of IEs that serve as trigger events in sequences of events
(accident scenarios) leading to end states must be identified and retained in the
analysis.
 The following steps are recommended:
1. Select a method of identifying specific operational and non-operational initiating
events. This can be accomplished with special types of top level hierarchies,
called master logic diagrams (MLDs) or with techniques like FMEA.
2. Using the selected method, identify a set of initiating events.
3. Group the initiating events having the same effect on the system.
2.4 Scenario Modeling:

This step involves the following procedures:

1. Identify the mitigating functions for each initiating event (or group of IEs).
2. Identify the corresponding human actions, systems, or hardware operations associated with each function, along with their necessary conditions of success.
3. Develop a functional event tree for each IE (or group of IEs).
4. Model the event tree for each IE: it proceeds with inductive logic through the scenario, via a series of successes or failures of intermediate events called pivotal events, until an end state is reached.
2.5 Failure or Logic modeling:

 Each failure (or its complement, success) of a pivotal event in an accident scenario
is usually modeled with deductive logic and probabilistic tools called fault trees (FTs)
or master logic diagrams (MLDs).
 The FT is the most common and popular method to calculate the probability of subsystem failure or of the occurrence of an accident scenario.
 The following procedures are recommended as part of developing an FT:
1. Develop an FT for each event in the ET heading for which actual historical failure data do not exist.
2. Explicitly model dependencies of a subsystem on the other subsystems, as well as intercomponent dependencies.
3. Include all potentially reasonable and probabilistically quantifiable causes of failure in the FT.
2.6 Data Collection, Analysis, and Development:

 Various types of data must be collected and processed for use throughout the PRA
process.
 This activity proceeds in parallel, or in conjunction, with some of the steps.
 Data are assembled to quantify the accident scenarios and accident contributors.
 Data include component failure rate data, repair time data, IE probabilities, structural
failure probabilities, human error probabilities (HEPs), process failure probabilities,
and common cause failure (CCF) probabilities.
 Each datum is also represented by uncertainty bounds and an uncertainty distribution.
2.7 Quantification and Integration:

The FTs and ET are integrated and their events are quantified to determine frequencies
of scenarios and associated uncertainties in the calculations of final risk values.
The following steps are recommended for Quantification and Integration in PRA:
 Merge corresponding FTs associated with each failure or success event modeled in
the ET scenarios. Determine the truncated minimal cut sets.
 Calculate the total frequency of each scenario, using the frequency of IEs, the barrier failure probabilities (including contributions of test and maintenance frequency), human error probabilities (HEPs), and common cause failure (CCF) probabilities.
 Group the scenarios according to the end state of the scenario defining the
consequence. All end states are then grouped, i.e., their frequencies are summed up
into the frequency of a representative end state.
 Calculate the total frequency of all scenarios of all event trees.
2.8 Uncertainty Analysis:

 As part of the quantification, uncertainty analyses are performed to evaluate the


degree of knowledge or confidence in the calculated numerical risk results.
 Monte Carlo simulation methods are generally used to perform uncertainty analysis,
although other methods exist.
 The following steps are recommended:

1. Identify the models and parameters that are uncertain and the method of uncertainty estimation to be used for each.
2. Describe the scope of the PRA and the significance and contributions of elements that are not modeled or considered.
3. Estimate and assign probability distributions depicting model and parameter
uncertainties.
4. Propagate uncertainties associated with the barrier models and parameters to
find the uncertainty associated with the risk value.
5. Present the uncertainties associated with risks and contributors to risk.
2.9 Sensitivity Analysis:

 Sensitivity analyses are frequently performed in a PRA to identify the analysis inputs or elements whose changes cause the greatest changes in partial or final risk results.
 They are also performed to identify elements of the analysis to whose data quality the results are or are not sensitive.
2.10 Risk Ranking and Importance Analysis:

 Special techniques are used to identify the lead, or dominant, contributors to risk in
accident sequences or scenarios
 The identification of lead contributors in decreasing order of importance is called
importance ranking
 This process is generally performed first at the FT and then at the ET levels

 Different types of risk importance measures are determined, usually using the integrated PRA software.
2.11 Risk Result Interpretation:

 After calculating the risk values, they must be interpreted to determine whether any
revisions are necessary to refine the results and analysis.
 The basic steps are:
1) Determine the accuracy of the logic models and scenario structures, assumptions, and scope of the PRA.
2) Identify system elements for which better information would be needed to reduce uncertainties in failure probabilities and in the models used to calculate performance.
3) Revise the PRA and reinterpret the results until stable and accurate results are attained.

3. Failure models and event frequencies of a system

Modeling failures and event frequencies requires application of probability theory and the logic of certainty, i.e., logical operations with events.
3.1 Events and Boolean Operations

 An event is a meaningful statement that can be true or false. Thus, “it will rain today”
is an event, while the statement “it may rain today” is not an event, because it can
never be proven to be true or false.
 An indicator variable X, whose value is 1 or 0 depending on whether the event is true or false, is assigned to an event E to perform Boolean operations.

Figure 3: Definition of an Indicator Variable

 Boolean operations:

a. The negation: For the event E, we define its complement Ē such that Ē is false when E is true (and vice versa). The indicator variable expression is X_Ē = 1 − X_E.
Figure 4: The NOT Operation

Figure 4 shows the Venn diagram for the NOT operation, as well as the logic
gate “not.”

b. The intersection: Given two events A and B, we form a third event C such that C is true whenever both A and B are true. The Venn diagram and the AND logic gate are shown in Figure 5. The Boolean expression is C = A ∩ B, and the indicator variable expression is X_C = X_A · X_B.

Figure 5: The Intersection of Events

Two events are said to be mutually exclusive if they cannot be true at the same time.

c. The union: Given two events A and B, we form a third event C such that C is true whenever either A or B is true. The OR logic gate is shown in Figure 6. The Boolean expression is C = A ∪ B, and the indicator variable expression is X_C = 1 − (1 − X_A)(1 − X_B) = X_A + X_B − X_A X_B.

Figure 6: The Union of Events



3.2 Failure model for a series system

A series system is such that all its components are required for system success.
Equivalently, the system fails if any component fails. Figure 7 represents the block and
logic (failure) diagram for a series system.

Figure 7: Block and logic diagram (Failure) for a series system

3.3 Failure model for a parallel system

A parallel system is a redundant system that is successful if at least one of its elements is successful. Equivalently, the system fails if all of its components fail. Figure 8 represents the block and logic (failure) diagram for a parallel system.

Figure 8: Block and logic diagram (Failure) for a parallel system
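The series and parallel failure models above can be sketched numerically. A minimal illustration in Python, assuming statistically independent components; the function names and probability values are illustrative, not from the text:

```python
# Sketch of the series/parallel failure models, assuming independent
# components; the probabilities below are illustrative.

def series_failure(probs):
    """A series system fails if ANY component fails (OR of component failures)."""
    survive_all = 1.0
    for p in probs:
        survive_all *= (1.0 - p)   # every component must survive
    return 1.0 - survive_all

def parallel_failure(probs):
    """A parallel (redundant) system fails only if ALL components fail (AND)."""
    fail_all = 1.0
    for p in probs:
        fail_all *= p              # every component must fail
    return fail_all

probs = [0.1, 0.2]
print(series_failure(probs))    # 1 - (0.9)(0.8), i.e. about 0.28
print(parallel_failure(probs))  # (0.1)(0.2), i.e. about 0.02
```

Note that the series failure probability is always at least as large as that of any single component, while the parallel failure probability is smaller than the smallest component failure probability.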



3.4 Failure model for Structure Functions

The system indicator variable can be expressed in terms of the indicator variables of the components. In general, the indicator variable of the top event is a function of the primary inputs:

X_T = S_F(X) = S_F(X_1, X_2, …, X_n)

where S_F(X) is the structure or switching function; it maps an n-dimensional vector of 0s and 1s into 0 or 1. As an example, shown in Figure 9, consider a two-out-of-three system, in which at least two components are needed for success.

Figure 9: Block Diagram of the Two-out-of-Three System

The system fails if any two or all three components fail (OR gate). Thus, with X_i = 1 indicating failure of component i, the structure function for this system can be written as:

X_T = 1 − (1 − X_1 X_2)(1 − X_1 X_3)(1 − X_2 X_3)

Figure 10: Logic Diagram of the Two-out-of-Three System
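The two-out-of-three failure logic can be checked by enumerating all component states. A small sketch, assuming X_i = 1 means component i has failed; the function name is illustrative:

```python
from itertools import product

def two_out_of_three_fails(x1, x2, x3):
    """Failure indicator: OR of the pairwise AND gates (x_i = 1 means component i failed)."""
    return 1 - (1 - x1 * x2) * (1 - x1 * x3) * (1 - x2 * x3)

# The structure function must agree with "at least two failures" in all 8 states.
for x1, x2, x3 in product([0, 1], repeat=3):
    assert two_out_of_three_fails(x1, x2, x3) == (1 if x1 + x2 + x3 >= 2 else 0)
print("structure function verified for all 8 component states")
```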



3.5 Binomial model: probability of failure on demand

Consider,
Pr(failure to start on demand) ≡ q
Pr(successful start on demand) ≡ p
Clearly, q + p = 1
A distribution that is often used in connection with these probabilities is the binomial distribution. Figure 11 illustrates the binary states of the system.

Figure 11: Binary States of a system


The probability of exactly k failures in n trials is

Pr(k) = C(n, k) q^k p^(n−k)

where

C(n, k) = n! / [k!(n − k)!]

The two commonly used moments are:

E[k] = nq,  Var[k] = nq(1 − q)

The probability that n trials will result in at most m failures is:

Pr(k ≤ m) = Σ_{k=0}^{m} C(n, k) q^k p^(n−k)
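These binomial formulas can be evaluated directly with the Python standard library. A sketch with assumed illustrative values (n = 10 demands, failure-on-demand probability q = 0.05):

```python
from math import comb

def binomial_pmf(k, n, q):
    """Probability of exactly k failures in n demands, failure-on-demand probability q."""
    return comb(n, k) * q**k * (1 - q)**(n - k)

def at_most_m_failures(m, n, q):
    """Probability that n demands produce at most m failures."""
    return sum(binomial_pmf(k, n, q) for k in range(m + 1))

n, q = 10, 0.05          # assumed demand count and failure-on-demand probability
mean = n * q             # E[k] = n*q
var = n * q * (1 - q)    # Var[k] = n*q*(1-q)
print(binomial_pmf(0, n, q))        # probability of zero failures in 10 demands
print(at_most_m_failures(1, n, q))  # probability of at most one failure
```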

3.6 Failure while running

If T, the time to failure of a component, is a continuous random variable (CRV), then its CDF is F(t) = Pr(T ≤ t), its pdf is f(t) = dF(t)/dt, and its hazard (failure) rate is h(t) = f(t)/[1 − F(t)].

Note the distinction between h(t) and f(t):

 f(t)dt is the unconditional probability of failure in (t, t + dt).
 h(t)dt is the conditional probability of failure in (t, t + dt), given that the component has survived up to t.

Bath-tub curve: represents the failure behavior of components.

 A component exhibits its highest failure rate at the infant-mortality (burn-in) stage and at the old-age (wear-out) stage. Between these two, the failure rate is reasonably constant.
 This roughly constant rate during the useful-life period is called the average failure rate and is represented by λ, with units of faults/time.
Figure 12: Typical bath-tub curve (failure rate λ versus time, showing the infant-mortality/burn-in period, the constant-failure-rate period, and the old-age/wear-out period)

4. Probability distributions
4.1 Exponential model

It is used widely in reliability and risk assessment because it is the only distribution with a constant failure rate. Its probability density function (pdf) is

f(t) = λe^(−λt)  for t > 0

where λ is the failure rate.


The CDF is

F(t) = 1 − e^(−λt)

and the reliability is

R(t) = e^(−λt)

The hazard function is

h(t) = λ = constant

and the first two moments are:

E[T] = 1/λ,  Var[T] = 1/λ²
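The exponential relations above can be sketched as follows; the failure rate λ = 10⁻³ per hour is an assumed illustrative value:

```python
from math import exp

lam = 1e-3   # assumed failure rate, faults per hour

def exp_pdf(t):
    """f(t) = lam * e^(-lam*t)"""
    return lam * exp(-lam * t)

def reliability(t):
    """R(t) = e^(-lam*t)"""
    return exp(-lam * t)

mttf = 1.0 / lam                          # E[T] = 1/lam
print(reliability(1000.0))                # survival probability at t = 1/lam, i.e. e^(-1)
print(exp_pdf(0.0) / reliability(0.0))    # hazard h(t) = f(t)/R(t) = lam, constant
```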

4.2 Weibull model:

A flexible distribution that is used widely in reliability is the Weibull distribution. Its pdf is

f(t) = bλ(λt)^(b−1) e^(−(λt)^b)  for t > 0

where b > 0 and λ > 0. It can be shown that the mean is

E[T] = Γ(1 + 1/b) / λ

where Γ is the gamma function.

The CDF is

F(t) = 1 − e^(−(λt)^b)  for t > 0

and the reliability is

R(t) = e^(−(λt)^b)

The hazard function is

h(t) = bλ(λt)^(b−1)

It can be observed that for b < 1, b = 1, and b > 1, this distribution can be used as a life distribution for the infant-mortality, useful-life (constant failure rate), and wear-out periods, respectively.
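The dependence of the Weibull hazard on the shape parameter b can be illustrated numerically; λ = 0.01 and the sample times are assumed values:

```python
from math import exp

def weibull_hazard(t, lam, b):
    """h(t) = b*lam*(lam*t)^(b-1)"""
    return b * lam * (lam * t) ** (b - 1)

def weibull_reliability(t, lam, b):
    """R(t) = e^(-(lam*t)^b)"""
    return exp(-((lam * t) ** b))

lam = 0.01   # assumed scale parameter
# b < 1: decreasing hazard (infant mortality); b = 1: constant (useful life);
# b > 1: increasing hazard (wear-out)
for b in (0.5, 1.0, 2.0):
    print(b, weibull_hazard(10.0, lam, b), weibull_hazard(100.0, lam, b))
```

For b = 1 the hazard reduces to the constant λ, recovering the exponential model.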
4.3 Event frequency: Poisson model

Events occur over a continuum (time, space) at an average constant rate, λ. The
occurrence of an event in a given interval is assumed to be independent of that in any other
non-overlapping interval. This distribution is used to describe the occurrence of initiating
events (IEs) in risk assessment.
The Poisson distribution gives the probability of exactly k events occurring in (0, t):

Pr(k) = (λt)^k e^(−λt) / k!
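The Poisson probability can be computed with the standard library; the IE rate and exposure time below are assumed illustrative values:

```python
from math import exp, factorial

def poisson_pmf(k, lam, t):
    """Probability of exactly k events in (0, t) at constant rate lam."""
    mu = lam * t
    return mu**k * exp(-mu) / factorial(k)

lam, t = 0.1, 10.0   # assumed IE frequency (per year) and exposure time (years)
print(poisson_pmf(0, lam, t))   # probability of no initiating event, here e^(-1)
print(poisson_pmf(1, lam, t))   # probability of exactly one initiating event
```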

5. Scenario and Logic tree modeling in PRA


5.1 Basic terminology:

 Initiating event: Any unwanted, unexpected, or undesired event (e.g., system or equipment failure, human error, a process upset, or a toxic or flammable release) that starts the sequence is referred to as the initiating event for the event tree.
 Events: The events following the initiating event are termed precursor events, or sometimes simply events, for the event tree (e.g., ignition, explosion, release drifting).
 Outcome events: The possible effects, the scenarios or outcomes of an initiating
event, are known as outcome events (e.g., fireball, vapor cloud, explosions).
 Top event: The unwanted event that is placed at the top of a fault tree and further analyzed to find its basic causes is known as the top event.
 Basic event: The basic causes that are not further developed or defined are known as basic events (e.g., equipment item failure, human failure, external event). They represent the basic causes in the fault tree.
 Intermediate event: An event in the fault tree that can be further developed in terms of basic events is known as an intermediate event.
5.2 Master Logic Diagram (MLD)

 Master logic diagrams (MLDs) are graphical representations of system perturbations.


 MLDs are used for the identification of IEs.
 An MLD (Figure 13) is a hierarchical, top-down display of IEs, showing general types
of undesired events at the top, proceeding to increasingly detailed event descriptions
at lower tiers, and displaying initiating events at the bottom.
 An MLD resembles an FT, but it lacks explicit logic gates. An MLD also differs from
an FT in that the initiators defined in the MLD are not necessarily failures or basic
events.

Figure 13: A Typical Structure of a Master Logic Diagram (MLD)

Construction procedure of MLD

 MLDs are a hierarchical depiction of ways in which system perturbations occur.


 The top event in each MLD is an end state
 Events that are necessary but not sufficient to cause the top event are enumerated
in even more detail as the lower levels of the MLD are developed.
 For complex missions it may be necessary to develop phase-specific MLDs since
threats and initiators may change as the mission progresses.
 The pinch point is the key concept in MLD development. It refers to the termination criterion applied to each MLD branch.
 A pinch point occurs when every lower level of the branch has the same
consequence (relative to system response) as the higher levels.
 An important admonition in IE identification and grouping is that the final set of IEs
should be mutually exclusive.

5.3 Event Tree (ET):

 It is an inductive procedure that maps all possible outcomes resulting from an initiating event (any accidental release or occurrence), e.g., gas leakage, equipment failure, or human error.
 Identifies possible accidents or consequences arising from an initiating event.
 Identifies design and procedural weaknesses.
 Determines the probability of various outcomes (final consequences) resulting from the initiating event.

Steps in ET:

 Identification of the initiating event: The event tree begins with an initiating event and works forward to its consequences. Examples: a leak, a gas release, loss of cooling water in a reactor, etc.
 Identification of safety functions or pivotal events: These are designed to mitigate the effects of the initiating event. The safety functions or pivotal events are listed in the order in which they are intended to occur:
   Safety systems that automatically respond to the fault (trips, automatic shutdown, etc.)
   Alarms that alert the operator
   Operator actions in response to alarms
   Barriers or containment systems to limit the effects of the initiating event
 Example:
High reactor output temperature.
Alarm alerts operator at high temp.
Operator reestablishes cooling water flow to the reactor.
Automatic shutdown system stops reaction
 Event Tree Construction:

a. Input the initiating event and safety functions: the initiating event (loss of coolant flow) is placed at the left, followed by the safety functions/pivotal events in the order in which they act: high output temperature alarm, operator actions, and automatic shutdown.

b. Qualitative evaluation of safety functions: a qualitative judgment (success/failure, true/false, or yes/no) is made for each safety function or event consequence in the different branches of the tree; at each branch point, the upper branch denotes success and the lower branch denotes failure. If a safety function has no effect, the accident path proceeds with no branch point to the next safety function.

 Classify accident outcomes: all outcomes are identified according to the type of consequence model. For the loss-of-coolant-flow example, the outcomes range from "safe condition, resume normal operation" (all safety functions succeed), through "safe condition, automatic shutdown", "unsafe condition, operator notices runaway reaction", and "unstable condition, auto shutdown", to "unsafe condition, operator unaware of runaway reaction" (all safety functions fail).

 Estimate and rank outcome probabilities: estimate the frequency of the outcomes and rank the consequence severity of the outcomes. The general equation to estimate the frequency of each outcome is:

λ_i = λ · ∏_{j=1}^{n} P_j

where λ and λ_i respectively denote the initiating event frequency and the outcome event frequency, and the P_j are the probabilities of the pivotal events along the path to outcome i.
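The outcome-frequency equation above can be sketched in code; the initiating-event frequency and branch probabilities below are hypothetical numbers, not data from the text:

```python
def outcome_frequency(ie_frequency, branch_probs):
    """Outcome frequency = IE frequency times the product of branch probabilities."""
    freq = ie_frequency
    for p in branch_probs:
        freq *= p
    return freq

# Hypothetical numbers: loss of coolant flow at 0.1/yr; branch probabilities
# for alarm failure, operator failure, and auto-shutdown failure.
print(outcome_frequency(0.1, [0.01, 0.25, 0.001]))   # about 2.5e-7 per year
```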
5.4 Fault Tree (FT):

 A fault tree graphically represents the relationships between events and an unwanted event using logic gates.
 It starts with the top event and reveals the basic causes of the top event in a deductive way.
 The relationships between the events are expressed using "AND" or "OR" logic gates.
 It quantifies the probability of occurrence of an accident using the fault tree and basic failure data.
 It identifies the system's weakest links through cut sets.
 It evaluates common-mode failures.


Logic gate and event symbols

 AND gate: An output event occurs only when all input events occur.
 OR gate: An output event occurs if any of the input events occur.
 INHIBIT gate: An output event occurs when the input events occur and the inhibit condition also occurs.
 Basic event: A fault event that needs no further definition.
 Undeveloped event: An event that cannot be developed further due to lack of information.
 Intermediate event: An event that can be further described with basic events.
 Transfer gate: Transfers the connection from or to another fault tree.

Important definitions for FTA

 Cut set: A cut set is a combination of basic events; if all of these basic events occur, the top event is guaranteed to occur.

 Minimal cut set: A minimal cut set is a cut set such that if any basic event is removed from the set, the remaining events collectively are no longer a cut set.

 Path Set: A path set is a collection of basic events; if none of the events in the sets
occur, the top event is guaranteed not to occur.

 Minimal Path Set: A minimal path set is a path set such that if any basic event is
removed from the set, the remaining events collectively are no longer a path set.

Steps in Fault Tree Analysis

1. Fault Tree Development


2. Minimum cut set finding
3. Probabilistic analysis using basic failure data
4. Sensitivity Analysis

Figure 14: Standard fault tree (illustrating the top event, OR and AND gates, intermediate, external, undeveloped, and basic events, and a TRANSFER gate)

Fault Tree Development

1. Identify the top event as the accident scenario.
2. Identify the events that may cause the top event to occur.
3. If these events can be broken down further, identify the causes that may lead to them, identify the relationships between the events and their basic causes, and transform these relationships into the fault tree using gates.
4. If they cannot be broken down further, identify the relationship of these events to the top event and transform these relationships into the fault tree using gates.

Rules of Boolean algebra

Commutative rule
P1∙P2= P2∙P1 P1+ P2= P2+P1
Associative rule
P1∙ (P2 ∙P3) = (P1∙ P2 ) ∙P3 P1+ (P2 +P3) = (P1+ P2 )+ P3
Distributive rule
P1∙ (P2 +P3) = (P1∙ P2 )+( P1∙P3) P1+ (P2 ∙P3) = (P1+ P2 ) ∙ ( P1+P3)
Idempotent rule
P1∙ P1= P1 P1+ P1= P1
Rule of absorption
P1∙ (P1 +P2) = P1 P1 + (P1∙P2) = P1

Minimum Cut Set Finding


 Top-down approach
1. Uniquely identify all gates and basic events
2. Place the top gate in the first row of a matrix.
3. Replace all gates by basic events either using a or b.
a. Replace an “OR” gate by vertical arrangement.
b. Replace an “AND” gate by horizontal arrangement.
4. Delete all supersets (sets that contain another set as a subset)
 Bottom-up approach
 Similar, except start with the gates containing only basic events.
 Generate two columns, one for gates and the other for cut sets. Start with gates that have only basic events as inputs.
1. Generate cut sets for each of these gates in the table.
2. For an "OR" gate, use the union rule and represent the basic events separately. Example: A "OR" B = (A), (B).
3. For an "AND" gate, use the intersection rule and put the events into the same bracket. Example: A "AND" B = (A, B).

Top-down approach example:

Consider a tree with top gate G1 = G2 "AND" G3, where G2 = A "OR" G4, G4 = B "OR" C, G3 = C "OR" G5, and G5 = A "AND" B. Starting from the top gate and replacing gates with their input events:

G1
(G2,G3)  ("AND": horizontal arrangement)
(A,G3), (G4,G3)  ("OR" in G2: vertical arrangement)
(A,G3), (B,G3), (C,G3)  ("OR" in G4)
(A,C), (A,G5), (B,C), (B,G5), (C,C), (C,G5)  ("OR" in G3)
(A,C), (A,B), (B,C), (A,B), (C), (C,A,B)  (G5 = A "AND" B; by Boolean algebra A∙A ≡ A and C∙C ≡ C)

The resulting cut sets are {A,C}, {A,B}, {B,C}, {C}, and {A,B,C}. After deleting supersets, the minimal cut sets for the tree are:

{C}, {A, B}


Bottom-up approach example (using the same fault tree as above):

G5: (A,B)
G4: (B), (C)
G2 = A "OR" G4: (A), (B), (C)
G3 = C "OR" G5: (C), (A,B)
G1 = G2 "AND" G3: (A,C), (A,B), (B,C), (A,B), (C), (C,A,B)

The resulting cut sets are {A,C}, {A,B}, {B,C}, {C}, and {A,B,C}. After deleting supersets, the minimal cut sets for the tree are:

{C}, {A, B}
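The top-down expansion with superset deletion can be sketched in code. The gate structure below is the one assumed in the worked example (G1 = G2 AND G3, G2 = A OR G4, G4 = B OR C, G3 = C OR G5, G5 = A AND B); the function names are illustrative:

```python
# Assumed gate structure, inferred from the worked example.
gates = {
    "G1": ("AND", ["G2", "G3"]),
    "G2": ("OR",  ["A", "G4"]),
    "G4": ("OR",  ["B", "C"]),
    "G3": ("OR",  ["C", "G5"]),
    "G5": ("AND", ["A", "B"]),
}

def cut_sets(node):
    """Expand a node into cut sets (frozensets of basic events)."""
    if node not in gates:
        return [frozenset([node])]             # basic event
    op, inputs = gates[node]
    child = [cut_sets(i) for i in inputs]
    if op == "OR":                             # vertical arrangement: concatenate lists
        return [cs for sets in child for cs in sets]
    result = child[0]                          # AND: horizontal arrangement (cross product)
    for sets in child[1:]:
        result = [a | b for a in result for b in sets]
    return result

def minimize(sets):
    """Delete supersets, leaving only the minimal cut sets."""
    unique = set(sets)
    return [s for s in unique if not any(o < s for o in unique)]

print(sorted(minimize(cut_sets("G1")), key=len))   # minimal cut sets: {C} and {A, B}
```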

Probabilistic analysis using basic failure data

1. Quantification of (Assignment of Probabilities or Frequencies to) Basic Events:

 Basic event can be directly quantifiable from data, including, if necessary,


conditioning of its probability on the occurrence of other basic events.
 Usually, basic events are formulated to be statistically independent, so that the
probability of the joint occurrence of two basic events can be quantified simply as the
product of the two basic event probabilities.
 A simple and widely used model is the exponential distribution, which is based on the assumption of a constant failure rate.
 The probabilities of basic events are quantified probabilistically, i.e., using probability
density distributions that reflect our uncertainty—our limited state of knowledge—
regarding the actual probabilities of these events.
Top-event probability estimation

 Gate-by-gate approach: straightforward and simple. Uses the union (OR gate) and intersection (AND gate) rules to calculate the top-event probability.

OR gate: P(A ∪ B) = P(A) + P(B) − P(A)P(B) ≈ P(A) + P(B)  [if P(A) and P(B) are small]
AND gate: P(A ∩ B) = P(A)P(B)

where P(A) and P(B) are the failure probabilities of components A and B, respectively.
 Cut set approach: the quickest method. Applicable when the fault tree is large and the failure rates/failure probabilities of the basic events are small.

P_TOP ≈ Σ_{j=1}^{n} C_j

C_j = ∏_i P_i

where P_TOP is the probability of the top event, C_j is the probability of minimal cut set j, and the P_i (i = 1, 2, 3, …) denote the failure probabilities of the components or basic events in that cut set.
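The cut set (rare-event) approximation can be sketched as follows; the basic-event probabilities are hypothetical, and the minimal cut sets {C} and {A, B} are those of the earlier worked example:

```python
from math import prod

def top_event_probability(min_cut_sets, basic_probs):
    """Rare-event approximation: P_TOP = sum over cut sets of the product of
    the basic-event probabilities in each cut set."""
    return sum(prod(basic_probs[e] for e in cs) for cs in min_cut_sets)

# Hypothetical basic-event failure probabilities
basic_probs = {"A": 1e-3, "B": 2e-3, "C": 5e-4}
mcs = [{"C"}, {"A", "B"}]
print(top_event_probability(mcs, basic_probs))   # 5e-4 + (1e-3)(2e-3), about 5.02e-4
```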

Sensitivity analysis

 Basic-event (component) importance (BI_i): calculated as the sum of the probabilities of occurrence of all cut sets containing the basic event (component), divided by the total probability of occurrence for the system:

BI_i = (Σ_{j: i in j} C_j) / P_TOP = fraction of system unavailability contributed by basic event i

The sum Σ in the equation runs over all cut sets that contain basic event i.

 Cut set importance (CI_j): the ratio of the cut set characteristic to the system characteristic:

CI_j = C_j / P_TOP = fraction of system unavailability contributed by cut set j
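Both importance measures can be computed from the cut set probabilities. A sketch using hypothetical basic-event probabilities and the illustrative minimal cut sets {C} and {A, B}:

```python
from math import prod

# Hypothetical basic-event failure probabilities and minimal cut sets
basic_probs = {"A": 1e-3, "B": 2e-3, "C": 5e-4}
mcs = [{"C"}, {"A", "B"}]

cut_probs = [prod(basic_probs[e] for e in cs) for cs in mcs]
p_top = sum(cut_probs)                 # rare-event approximation of P_TOP

# Cut set importance: CI_j = C_j / P_TOP
ci = [c / p_top for c in cut_probs]

# Basic-event importance: BI_i = (sum of probs of cut sets containing i) / P_TOP
bi = {e: sum(c for cs, c in zip(mcs, cut_probs) if e in cs) / p_top
      for e in basic_probs}

print(ci)   # cut set {C} dominates the system unavailability
print(bi)   # C is the most important basic event
```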

6. Data Modeling
Development of PRA database includes two main phases:
1. Information Collection and Classification
2. Parameter Estimation
Typical quantitative information of interest includes:
 Internal initiating events (IEs) Frequencies
 Component Failure Frequencies
 Component Test and Maintenance Unavailability
 Common Cause Failure (CCF) Probabilities
 Human Error Rates
 Software Failure Probabilities
Data modeling for a PRA database involves the following steps:
 Model-Data Correlation (identification of the data needed to correspond to the level
of detail in the PRA models, determination of component boundaries, failure modes,
and parameters to be estimated, e.g., failure rates, MTTR)

 Data Collection (determination of what is needed, such as failure and success data
to estimate a failure rate, and where to get it, i.e., identification of data sources, and
collection and classification of the data)
 Parameter Estimation (use of statistical methods to develop uncertainty distribution
for the model parameters)
 Documentation (how parameter uncertainty distributions were estimated, data
sources used, and assumptions made)
Example of data requirements

The following table shows the data needed to estimate the various parameters.

Event: Initiating event
Probability model: Poisson model with constant occurrence rate λ
Data required: number of events k in time t

Event: Standby component fails on demand
Probability model: binomial model with constant probability of failure on demand, q
Data required: number of failure events k in total number of demands N

Event: Component in operation fails to run, or component changes state during mission (state of component continuously monitored)
Probability model: constant failure rate; λ_O: operating failure rate; T_m: mission time
Data required: number of events k in total exposure time T (total time the standby component is operating, or time the component is on line)

Event: Basic events
Probability model: probability of failure P_f = 1 − e^(−λt)
Data required: number of failures occurring in time t

Sources of information

Parameters of PRA models of a specific system should be estimated based on operational


data of that system. Often, however, the analysis has to rely on a number of sources and
types of information if the quantity or availability of system-specific data is insufficient. In
such cases surrogate data, generic information, or expert judgment are used directly or in
combination with (limited) system-specific data. According to the nature and degree of
relevance, data sources may be classified by the following types:
 Historical performance of successes and failures of an identical piece of equipment
under identical environmental conditions and stresses that are being analyzed (e.g.,
direct operational experience).

 Historical performance of successes and failures of an identical piece of equipment


under conditions other than those being analyzed (e.g., test data).
 Historical performance of successes and failures of a similar piece of equipment or
similar category of equipment under conditions that may or may not be those under
analysis (e.g., another program’s test data, or data from handbooks or compilations).
 General engineering or scientific knowledge about the design, manufacture and
operation of the equipment, or an expert’s experience with the equipment.
Generic Data Sources: Categories of generic data include:

 Hardware failure rates
 Human and software failure probabilities
Example sources of generic failure data are:
 Electronic Parts Reliability Data (EPRD)
 Non-electronic Parts Reliability Data (NPRD)
 Failure Mode Database
 MIL-HDBK-217, Reliability Prediction of Electronic Equipment
 Reliability Prediction Procedure for Electronic Equipment (Bellcore), TR-332
 Handbook of Reliability Prediction Procedures for Mechanical Equipment, NSWC
Standard 94/L07
 IEEE-500
System-Specific Data Collection and Classification: System-specific data can be collected
from sources such as
 Maintenance Logs
 Test Logs
 Operation Records
A systematic method of classification and detailed failure taxonomy is essential in collecting
the system-specific hardware failure data. Figure 15 shows classifications of failure
taxonomy according to the functional state of components.
Figure 15: Component Functional State Classification

Another aspect of reliability data classification is the identification of the failure cause. A
method of classifying causes of failure events is to progressively unravel the layers of
contributing factors to identify how and why the failure occurred.
Figure 16 shows the event classification process highlighting the part that deals with failure
cause classification.
 Failure Cause: the event or process regarded as being responsible for the observed
physical and functional failure modes (e.g., use of incorrect material).
 Failure Mode: the particular way the function of the component is affected by the
failure event (e.g., fails to start, fails to run).
 Failure Mechanism: the physical change (e.g., oxidation, crack) in the component or
affected item that has resulted in the functional failure mode.

Figure 16: Failure Event Classification Process Flow


Parameter estimation methods

The following methods are widely used:
1. Least squares approach: discussed in the reliability section
2. Maximum likelihood estimation: discussed in the reliability section
3. Bayesian methods

Bayesian estimation

 Incorporates degrees of belief and information beyond that contained in the data
sample; this is the practical difference from classical estimation.
 In the Bayesian framework, the parameters of interest are treated as random
variables whose true values are unknown. A distribution is assigned to represent a
parameter, and the mean (or, in some cases, the median) of that distribution is used
as an estimate of the parameter of interest. Bayesian parameter estimation
comprises two main steps. The first step uses available information to fit a prior
distribution to a parameter, such as a failure rate. The second step incorporates
additional or new data to update the prior distribution. This yields a posterior
distribution, which better represents the parameter of interest. This step is often
referred to as "Bayesian updating."
 Bayes' Theorem for a continuous PDF gives the posterior PDF f(θ/t) through the
following relationship:

    f(θ/t) = h(θ) l(t/θ) / ∫ h(θ) l(t/θ) dθ

Here θ is the parameter of interest, h(θ) is a continuous prior PDF, and l(t/θ) is the
likelihood function based on sample data t.
 For a discrete PMF, Bayes' Theorem can be written as:

    Pr(θi/ε) = Pr(θi) Pr(ε/θi) / Σ(i=1..n) Pr(θi) Pr(ε/θi)

Where Pr(θi/ε) = the conditional probability of θi given ε, or the posterior probability
of θi; Pr(θi) = the prior probability of θi (can be expressed using a PMF); and
Pr(ε/θi) = the probability of obtaining the new information ε given a certain value θi
of the parameter. The following notation for the posterior distribution is also
common:
    Pr̄(θi) = Pr(θi) Pr(ε/θi) / Σ(i=1..n) Pr(θi) Pr(ε/θi)

Using the prior distribution of the parameter θ given by a PMF, the expected value of
the parameter can be computed as

    E(θ) = Σ(i=1..n) θi Pr(θi)

Based on the posterior distribution, the expected value of θ can be computed as:

    E(θ/ε) = Σ(i=1..n) θi Pr̄(θi)

The Bayesian estimate of the parameter can be used to compute Bayesian
probabilities that reflect the information gained about the parameter. For example,
the probability that X is less than some value x0 can be computed using the prior
distribution as:

    Pr(X < x0) = Σ(i=1..n) Pr(θi) Pr(X < x0/θi)

 Hence the entire Bayesian inference includes the following three stages:
1. Constructing the likelihood function based on the distribution of interest and the
type of data available
2. Quantifying the prior information about the parameter of interest in the form of a
prior distribution
3. Estimating the posterior distribution of the parameter of interest
 The Bayesian analog of the classical confidence interval is known as the Bayes
probability interval. To construct a Bayes probability interval, the following relationship
based on the posterior distribution is used:

    Pr(θl ≤ θ ≤ θu) = 1 − α

Prior distributions

Prior distributions can be specified in different forms. Possible forms include:
 Parametric (gamma, lognormal, beta):
o Gamma or lognormal for rates of events (time-based reliability models)
o Beta or truncated lognormal for event probabilities per demand
 Numerical (histogram, DPD, CDF values/percentiles)
o Applicable to both time-based and demand-based reliability parameters

In the parametric forms, a number of probability distributions are extensively used in risk
studies as prior and posterior distributions. A few of them are listed below.
 Lognormal (μ,σ):

    f(x) = [1/(σx√(2π))] exp[−(ln x − μ)² / (2σ²)]

where μ and σ are the parameters of the distribution and 0 < x < ∞. The lognormal
distribution can be truncated (truncated lognormal) so that the random variable is less
than a specified upper bound.
 Gamma (α,β):

    f(x) = [β^α / Γ(α)] x^(α−1) exp(−βx)

where α and β are the parameters of the distribution and 0 ≤ x < ∞.
 Beta (α,β):

    f(x) = x^(α−1) (1−x)^(β−1) / B(α,β)

where α and β are the parameters of the distribution and 0 ≤ x ≤ 1.


The prior distribution is usually based on generic data. Information content of prior
distributions can be based on:
 Previous system-specific estimates
 Generic, based on actual data from other (similar) systems
 Generic estimates from reliability sources
 Expert judgment
 “Non-informative.” This type is used to represent the state of knowledge for the
situations where little a priori information exists or there is indifference about the
range of values the parameter could assume. A prior distribution that is uniformly
distributed over the interval of interest is a common choice for a non-informative
prior. However, other ways of defining non-informative prior distributions also exist.
Selection of the likelihood function

The form of the likelihood function depends on the nature of the assumed model of the
world representing the way the new data/information is generated. A few selected
likelihood functions are discussed:
 Poisson process: The Poisson distribution is the proper likelihood function:

    Pr(k/λ) = (λT)^k exp(−λT) / k!

which gives the probability of observing k events (e.g., number of failures of a
component) in T units of time (e.g., cumulative operating time of the component), given
that the rate of occurrence of the event (failure rate) is λ.
 Bernoulli process: The Binomial distribution is the proper likelihood function:

    Pr(k/q) = [N! / (k!(N−k)!)] q^k (1−q)^(N−k)

which gives the probability of observing k events (e.g., number of failures of a
component) in N trials (e.g., total number of tests of the component), given that the
probability of failure per trial (failure on demand probability) is q.
 For data in the form of expert estimates or values from data sources, the lognormal
distribution is a proper likelihood function.
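Both likelihood functions are easy to evaluate directly. The sketch below also checks the standard fact that, for rare events, the Binomial likelihood approaches the Poisson likelihood with λT = qN (the parameter values are illustrative):

```python
# Evaluating the Poisson and Binomial likelihood functions discussed above.
import math

def poisson_pmf(k, lam, T):
    """Probability of k events in time T given event rate lam."""
    return (lam * T) ** k * math.exp(-lam * T) / math.factorial(k)

def binomial_pmf(k, q, N):
    """Probability of k failures in N demands given demand probability q."""
    return math.comb(N, k) * q ** k * (1 - q) ** (N - k)

p_pois = poisson_pmf(2, lam=1e-3, T=2000)    # 2 failures, rate 1e-3/h, 2000 h
p_bin = binomial_pmf(2, q=2e-5, N=100_000)   # same expected count: qN = 2
print(p_pois, p_bin)
```

The two values agree to several decimal places, which is why the Poisson model is routinely used for rare failures over many demands.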
Development of the posterior distribution

The various combinations of prior and likelihood functions, as well as the form of the
resulting posterior distributions, are listed in Table 1.
Table 1 Typical Prior and Likelihood Functions Used in PRAs

Conjugate prior: Many practical applications of Bayes’ Theorem require numerical
solutions to the integral in the denominator of Bayes’ Theorem. Simple analytical forms for
the posterior distribution are obtained when a set of prior distributions, known as conjugate
prior distributions, are used. A conjugate prior distribution is a distribution that results in a
posterior distribution that is a member of the same family of distributions as the prior.
A few commonly used conjugate distributions are listed in Table 2. The formulas used to
calculate the mean and the variance of the resulting posterior in terms of the parameters
of the prior and likelihood functions are also provided.
Table 2: Commonly used Conjugate Priors in PRA

In Appendix I, some useful derivations using conjugate priors to calculate the parameters
of a few selected continuous PDFs are provided.
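As an illustration of the conjugate-prior shortcut, the Gamma prior / Poisson likelihood pair gives a closed-form posterior with no numerical integration; a minimal sketch, where the prior and data values are illustrative assumptions:

```python
# Gamma prior on failure rate lam + Poisson evidence (k failures in T hours)
# -> Gamma posterior with parameters (alpha + k, beta + T).
# (This assumes the rate parameterization of the Gamma prior.)

def gamma_poisson_update(alpha, beta, k, T):
    """Return (alpha', beta') of the posterior Gamma distribution."""
    return alpha + k, beta + T

alpha0, beta0 = 2.0, 1000.0   # illustrative prior: mean alpha/beta = 2e-3 per hour
k, T = 1, 4000.0              # evidence: 1 failure in 4000 hours
a1, b1 = gamma_poisson_update(alpha0, beta0, k, T)

prior_mean = alpha0 / beta0   # E(lam) under the prior
post_mean = a1 / b1           # E(lam) under the posterior
post_var = a1 / b1 ** 2       # Var(lam) under the posterior
print(a1, b1, post_mean)
```

The posterior mean falls between the prior mean (2e-3) and the observed rate (1/4000 = 2.5e-4), weighted by their relative precisions.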
APPENDIX I
1. Binomial model

Variate: X is a binomial random variable (the number of successes in n Bernoulli trials);
θ is the probability of success at each trial.

Likelihood function l(θ/x): Binomial model:

    l(θ/x) = [n! / (x!(n−x)!)] θ^x (1−θ)^(n−x)

Conjugate prior h(θ): Beta distribution:

    h(θ) = θ^(α−1) (1−θ)^(β−1) / B(α,β)

Posterior PDF f(θ/x):

    f(θ/x) = θ^(x+α−1) (1−θ)^(n−x+β−1) / B(x+α, n−x+β)

Derivation of the posterior distribution: Bayes' theorem states

    f(θ/x) ∝ h(θ) l(θ/x)
           ∝ θ^x (1−θ)^(n−x) · θ^(α−1) (1−θ)^(β−1)
           = θ^(x+α−1) (1−θ)^(n−x+β−1)

The normalizing constant is therefore given by

    1 / ∫₀¹ θ^(x+α−1) (1−θ)^(n−x+β−1) dθ

Using the Beta function identity

    B(m,n) = ∫₀¹ y^(m−1) (1−y)^(n−1) dy

and its relation to the Gamma function

    B(m,n) = Γ(m)Γ(n) / Γ(m+n)

the posterior PDF becomes

    f(θ/x) = θ^(x+α−1) (1−θ)^(n−x+β−1) / B(x+α, n−x+β)
           = [Γ(n+α+β) / (Γ(x+α) Γ(n−x+β))] θ^(x+α−1) (1−θ)^(n−x+β−1)
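The Beta posterior derived above can be cross-checked numerically; a small sketch, where the prior and data values are arbitrary illustrations:

```python
# Beta(a, b) prior + Binomial(n, x) data -> Beta(x + a, n - x + b) posterior.
# Cross-check the closed form against brute-force normalization of
# prior x likelihood on a grid (midpoint rule).
import math

def beta_pdf(th, a, b):
    """Beta density, using the Gamma-function form of B(a, b)."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return th ** (a - 1) * (1 - th) ** (b - 1) / B

a, b = 2.0, 8.0   # arbitrary prior on a failure-on-demand probability
n, x = 20, 3      # data: 3 failures in 20 demands

a_post, b_post = x + a, n - x + b   # closed-form posterior parameters

# Brute-force normalization of theta^x (1-theta)^(n-x) * prior.
N = 20_000
step = 1.0 / N
grid = [(i + 0.5) * step for i in range(N)]
unnorm = [t ** x * (1 - t) ** (n - x) * beta_pdf(t, a, b) for t in grid]
norm = sum(unnorm) * step
numeric_mean = sum(t * u for t, u in zip(grid, unnorm)) * step / norm

closed_mean = a_post / (a_post + b_post)   # mean of Beta(x + a, n - x + b)
print(closed_mean, numeric_mean)
```

The grid-based mean matches the closed-form mean (x+α)/(n+α+β), confirming the normalizing constant computed in the derivation.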
2. Poisson model

Variate: X is a Poisson random variable (the number of events within a specified unit time
interval); λ is the rate of occurrence of events.

Likelihood function l(λ/x): Poisson model:

    l(λ/x) = λ^x exp(−λ) / x!

Conjugate prior h(λ): Gamma distribution:

    h(λ) = [β^α / Γ(α)] λ^(α−1) exp(−βλ)

Posterior PDF f(λ/x): Gamma(nx̄+α, β+n) distribution:

    f(λ/x) = [(β+n)^(α+nx̄) / Γ(α+nx̄)] λ^(α+nx̄−1) exp[−(β+n)λ]

Derivation of the posterior distribution: Suppose x = (x1,...,xn) is a set of n independent
frequencies, each distributed as a Poisson distribution with mean λ. Then, given x, the
likelihood is

    l(λ/x) = Π(i=1..n) λ^(xi) exp(−λ) / xi!
           = λ^(Σxi) exp(−nλ) / (x1! x2! ... xn!)
           ∝ λ^(nx̄) exp(−nλ)        Here, x̄ = Σxi / n

Using Bayes' theorem:

    f(λ/x) ∝ h(λ) l(λ/x)
           ∝ λ^(α−1) exp(−βλ) · λ^(nx̄) exp(−nλ)
           = λ^(α+nx̄−1) exp[−(β+n)λ]

The normalizing constant is therefore given by

    1 / ∫₀^∞ λ^(α+nx̄−1) exp[−(β+n)λ] dλ

By integration it becomes:

    (β+n)^(α+nx̄) / Γ(α+nx̄)

The posterior PDF becomes

    f(λ/x) = [(β+n)^(α+nx̄) / Γ(α+nx̄)] λ^(α+nx̄−1) exp[−(β+n)λ]

which is a Gamma(nx̄+α, β+n) distribution.
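The integration step above relies on ∫₀^∞ λ^(a−1) exp(−bλ) dλ = Γ(a)/b^a, which can be checked numerically; a crude midpoint-rule sketch with arbitrary parameters:

```python
# Numerical check of the integration step: the integral of
# lam**(a-1) * exp(-b*lam) over (0, inf) equals Gamma(a) / b**a.
import math

a, b = 4.5, 6.0           # arbitrary positive parameters
step, upper = 1e-3, 15.0  # midpoint rule on a truncated range (tail is negligible)
numeric = sum(
    ((i + 0.5) * step) ** (a - 1) * math.exp(-b * (i + 0.5) * step)
    for i in range(int(upper / step))
) * step
exact = math.gamma(a) / b ** a
print(numeric, exact)
```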
3. Exponential model

Variate: X is an exponential random variable (the waiting time between events); μ is the
mean waiting time between events.

Likelihood function l(μ/x): Exponential model:

    l(μ/x) = (1/μ) exp(−x/μ)

Conjugate prior h(μ): Inverted Gamma distribution:

    h(μ) = [β^α / Γ(α)] (1/μ)^(α+1) exp(−β/μ)

Posterior PDF f(μ/x): Inverted Gamma(α+n, nx̄+β) distribution:

    f(μ/x) = [(nx̄+β)^(n+α) / Γ(n+α)] (1/μ)^(n+α+1) exp[−(nx̄+β)/μ]

Derivation of the posterior distribution: Suppose x = (x1,...,xn) is a set of independent and
identically distributed observations on the waiting time T between consecutive events in a
Poisson process, where E(T) = μ is the mean waiting time. Then

    l(μ/x) = Π(i=1..n) (1/μ) exp(−xi/μ)
           = (1/μ)^n exp(−Σxi/μ)
           = (1/μ)^n exp(−nx̄/μ)        Here, x̄ = Σxi / n

Using Bayes' theorem:

    f(μ/x) ∝ h(μ) l(μ/x)
           ∝ (1/μ)^(α+1) exp(−β/μ) · (1/μ)^n exp(−nx̄/μ)
           = (1/μ)^(n+α+1) exp[−(nx̄+β)/μ]

The normalizing constant is therefore given by

    1 / ∫₀^∞ (1/μ)^(n+α+1) exp[−(nx̄+β)/μ] dμ

By integration it becomes:

    (nx̄+β)^(n+α) / Γ(n+α)

The posterior PDF becomes

    f(μ/x) = [(nx̄+β)^(n+α) / Γ(n+α)] (1/μ)^(n+α+1) exp[−(nx̄+β)/μ]

which is an Inverted Gamma(α+n, nx̄+β) distribution.
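The exponential / inverted-gamma update likewise reduces to simple parameter arithmetic; a minimal sketch, where the prior values and observed waiting times are illustrative assumptions:

```python
# Inverted Gamma(alpha, beta) prior on the mean time between failures mu,
# updated with n observed waiting times -> Inverted Gamma(alpha + n, beta + sum(x)).

def inv_gamma_update(alpha, beta, waiting_times):
    """Return posterior (alpha', beta') given exponential waiting-time data."""
    n = len(waiting_times)
    return alpha + n, beta + sum(waiting_times)

alpha0, beta0 = 3.0, 400.0    # illustrative prior: E(mu) = beta/(alpha-1) = 200 h
data = [150.0, 220.0, 180.0]  # three observed times between failures (hours)
a1, b1 = inv_gamma_update(alpha0, beta0, data)

post_mean = b1 / (a1 - 1)     # mean of Inverted Gamma(a1, b1), valid for a1 > 1
print(a1, b1, post_mean)
```

Here the posterior mean (190 h) moves from the prior mean (200 h) toward the sample average of the observed waiting times.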
4. Normal model (σ² known, μ unknown)

Variate: X is a normally distributed random variable whose mean μ is unknown but whose
variance σ² is known.

Likelihood function l(μ/x): Normal model:

    l(μ/x) = [1/√(2πσ²)] exp[−(x−μ)² / (2σ²)]

Conjugate prior h(μ): Normal model:

    h(μ) = [1/√(2πσ₀²)] exp[−(μ−μ₀)² / (2σ₀²)]        −∞ < μ₀ < ∞

Posterior PDF f(μ/x): Normal distribution:

    f(μ/x) = [1/(√(2π) σᵣ)] exp[−(μ−μ′)² / (2σᵣ²)]

Where μ′ = (x̄/σ₁² + μ₀/σ₀²) σᵣ² and σᵣ² = (1/σ₁² + 1/σ₀²)^(−1), with σ₁² = σ²/n.

Derivation of the posterior distribution: Suppose x = (x1,...,xn) is a set of n independent
observations of a normal random variate X with unknown mean μ but known variance σ².
Then, given x, the likelihood is

    l(μ/x) = Π(i=1..n) [1/√(2πσ²)] exp[−(xi−μ)² / (2σ²)]
           = [1/√(2πσ²)]^n exp[−(1/(2σ²)) Σ(xi−μ)²]

But

    Σ(xi−μ)² = Σ(xi−x̄)² + n(μ−x̄)²

Given the data x and the variance σ², the factors [1/√(2πσ²)]^n and
exp[−(1/(2σ²)) Σ(xi−x̄)²] are fixed constants independent of μ, such that

    l(μ/x) ∝ exp[−(n/(2σ²))(μ−x̄)²]
           = exp[−(μ−x̄)² / (2σ₁²)]        Here, σ₁² = σ²/n

The conjugate prior probability is N(μ₀, σ₀²), such that, using Bayes' theorem:

    f(μ/x) ∝ h(μ) l(μ/x)

It follows

    f(μ/x) ∝ exp{−(1/2)[(μ−x̄)²/σ₁² + (μ−μ₀)²/σ₀²]}

Using the following identity

    A(z−a)² + B(z−b)² = (A+B)(z−c)² + [AB/(A+B)](a−b)²,  where c = (Aa+Bb)/(A+B),

it can be shown that

    (μ−x̄)²/σ₁² + (μ−μ₀)²/σ₀² = (1/σ₁² + 1/σ₀²)(μ−μ′)² + d

Where

    μ′ = (x̄/σ₁² + μ₀/σ₀²) / (1/σ₁² + 1/σ₀²) = (x̄/σ₁² + μ₀/σ₀²) σᵣ²

and

    d = (x̄−μ₀)² / (σ₁² + σ₀²)

Here μ₀, σ₀, d, and x̄ are constants independent of μ. It follows that

    f(μ/x) ∝ exp[−(1/2)(1/σ₁² + 1/σ₀²)(μ−μ′)²]

The normalizing constant is therefore given by

    1 / ∫ exp[−(1/2)(1/σ₁² + 1/σ₀²)(μ−μ′)²] dμ

By integration it becomes:

    1 / √(2πσᵣ²)

The posterior PDF becomes

    f(μ/x) = [1/√(2πσᵣ²)] exp[−(1/2)(1/σ₁² + 1/σ₀²)(μ−μ′)²]

which is a normal distribution with mean μ′ and variance σᵣ² = (1/σ₁² + 1/σ₀²)^(−1).

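The posterior mean and variance formulas just derived can be sketched directly; the prior, sample mean, and variance below are illustrative numbers:

```python
# Normal-Normal conjugate update (sigma^2 known): posterior mean mu' and
# variance sigma_r^2 from the formulas derived above.

def normal_update(mu0, var0, xbar, sigma2, n):
    """Return (mu_prime, var_r) for prior N(mu0, var0) and n observations
    with sample mean xbar from N(mu, sigma2), sigma2 known."""
    var1 = sigma2 / n                        # sigma_1^2 = sigma^2 / n
    var_r = 1.0 / (1.0 / var1 + 1.0 / var0)  # posterior variance
    mu_prime = (xbar / var1 + mu0 / var0) * var_r
    return mu_prime, var_r

# Illustrative: prior N(10, 4); 25 observations with mean 12, known sigma^2 = 9
mu_p, v_r = normal_update(mu0=10.0, var0=4.0, xbar=12.0, sigma2=9.0, n=25)
print(mu_p, v_r)
```

With 25 observations the data precision dominates, so the posterior mean sits much closer to the sample mean than to the prior mean, and the posterior variance is far smaller than either σ₁² or σ₀².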
The other cases of the normal model are:
a. Normal model (μ known, σ² unknown)
b. Normal model (both μ and σ² unknown)