Incident Investigations:
In Search of the Prime Root
Cause of Major Incidents
By Philippe Guillard and James Tehrani
INCIDENT INVESTIGATIONS: IN SEARCH OF THE PRIME ROOT CAUSE OF MAJOR INCIDENTS
Getting Organized
The fact is that most safety personnel in organizations don’t have experience dealing with major incidents because, thankfully, they don’t happen very often. However, if they do occur, companies need to be prepared by having a solid plan in place. After calling the appropriate authorities, attending to anyone who was hurt, evacuating if necessary and securing the site, it’s time to get organized for the investigation. It’s important to understand from the outset that an investigation will only be successful if it is performed in a completely impartial way. Any bias will hinder the investigation, and, in the process, limit the ability to learn from the incident through lagging indicators.
Additionally, an Incident Investigation needs to be organized, planned and executed in a logical and scientific manner that pays attention to:
Reducing the sense of chaos and confusion when an incident occurs.
Controlling and managing an unexpected and damaging situation.
Maintaining and advancing corporate integrity.
Minimizing losses that can occur with an accident.
Arriving at true causes and identifying significant problem areas.
Preventing repetition of incidents.
Learning from both accidents and near-misses.
When investigating an incident, it is critical that the investigation is conducted in an organized fashion that is planned and executed
in a logical and scientific manner. Also, pointing fingers and playing the “blame game” should not be part of any investigation. For the
lead investigator or investigators, objectivity is key to determining what went wrong. This includes being able to filter out the emotional
response of workers who might take things personally after an incident because a colleague was hurt or a piece of equipment they
often work with failed, etc. It’s hard to look at incidents objectively when emotions come into play, so part of an investigator’s job is to
manage any chaos with an even keel and temperament.
[Figure: investigation steps (Preserve/Document Scene; Collect Information; Determine Root Causes; Implement Corrective Actions)]
OPERATIONAL RISK | ENVIRONMENTAL PERFORMANCE | PRODUCT STEWARDSHIP
Selecting a Team
Similar to a schoolyard draft for teammates where you’d likely select a soccer/football player rather than a tennis player for a game of
kickball, you want to select the best people for the investigative job. That means determining who would be the most helpful to the team
by having a good idea of what needs to be investigated. Is it a liquid process incident? Gas? Mechanical? Ask yourself: “What level of
expertise would come in handy?” Keep in mind that you might have to pull in different experts at different times during the investigation
ranging from process engineers to mechanical engineers to metallurgy specialists, etc.
It’s important to explain to the team not to jump to conclusions. That said, the team’s job is to form a working hypothesis of what caused
the incident and why.
Before conducting any science experiment, it’s important to develop a hypothesis of what you think happened. The same holds true with any Incident Investigation. For example: “My hypothesis is that the valve on the gas compressor failed because it was old and that caused too much gas to enter the separator vessel. This led to an excessive pressure buildup.” (This is obviously a much more simplistic hypothesis than what one would actually produce during a real investigation.) The next step is to test the hypothesis or, in the case of an Incident Investigation, figure out why an incident or near-miss occurred.
It’s not uncommon for there to be multiple hypotheses in the beginning of an investigation, but it’s important to boil these down to one working hypothesis through a screening method. One way of doing that is to assign a score or point total to an alternate hypothesis based on the evidence and research. For instance, on a 10-point scale, how likely is this hypothesis? If it’s not very likely, it gets assigned a low number from the team, but if it’s very likely, it gets a higher number attached to it. Keep in mind that, in many cases, the working hypothesis will evolve as more information comes to light.
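The 10-point screening described above could be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `screen_hypotheses`, the choice of averaging individual team members' scores, and every score shown are assumptions. Only the first hypothesis label echoes the simplified valve example; the other two are hypothetical.

```python
from statistics import mean


def screen_hypotheses(team_scores):
    """Rank candidate hypotheses by mean team score on a 1-10 likelihood scale.

    team_scores maps a hypothesis label to the list of scores given by
    individual team members. Returns (label, mean score) pairs, most
    likely first.
    """
    ranked = []
    for label, scores in team_scores.items():
        if not scores:
            raise ValueError(f"no scores recorded for {label!r}")
        if any(not 1 <= s <= 10 for s in scores):
            raise ValueError(f"scores for {label!r} must be between 1 and 10")
        ranked.append((label, mean(scores)))
    # Highest average first; list.sort is stable, so ties keep input order.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical scores from a three-person team (illustrative only).
example_scores = {
    "Aged compressor valve failed": [8, 7, 9],
    "Operator error during startup": [4, 5, 3],
    "Instrument loop fault": [6, 6, 7],
}

ranking = screen_hypotheses(example_scores)
working_hypothesis = ranking[0][0]
for label, score in ranking:
    print(f"{score:4.1f}  {label}")
```

Because the working hypothesis is expected to evolve, a team could rerun the same ranking with updated scores whenever new evidence comes to light.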
A sample list of some of the items that could be useful during an investigation:
2. Charged batteries (for phones, cameras, equipment, etc.)
5. Leveling rod
6. Clipboard and writing pad
7. Pens, pencils, markers
8. Graph paper
10. Incident investigation forms
13. Personal protective equipment: Gloves, hats, eyewear, ear plugs, face masks, etc.
14. Magnifying glass
15. High-visibility plastic tape to mark off the area
EXPERT TIP: Most companies focus on the incident itself with good reason, but one area that sometimes gets
overlooked is organizing a disassembly team to take apart equipment to be analyzed later. You want to include workers
on the team who can handle the job without contaminating or getting rid of any potential evidence about what might
have caused the incident.
One commonly used playbook for event sequencing is the Sequentially Timed
Events Plotting (STEP) process developed by Kingsley Hendrick and Ludwig
Benner Jr. in the book “Investigating Accidents With STEP.”
Figure 3 below shows an example schematic for part of a gas compression train. Figure 4 shows the STEP process being used to “describe” a rupture
incident of the V-2 vessel in that train. The description is based on the working hypothesis of the investigative team, supported by the factual data and established timeline of
the incident.
[Figure 3: Schematic for gas compression train. Labels recoverable from the drawing: feed (12" line); gas and liquid separator vessel V-1 at 300 psig; separator vessel V-2 at 50 psig; level instruments LAHH, LAH, LIC and LG; level control valves (LCV) on the 4" and 6" lines, with a 3" bypass; liquid effluent; demister; PCV; PSV to flare; flash gas; compressor shutdown interlock; off-gas to compressor (10" line); plant field operator.]
As the book states, “[A]ccidents should be investigated in a way that is compatible with the way a productive process is designed. All
processes involve actors, whether people or things, who act to introduce changes. Those actions are called events. Changes of state
occur as events interact during processes. Processes start with the first event which initiates a change of state and end when a new
state or outcome has been reached.”
[Figure 4: STEP diagram for the incident, with actors listed against a timeline. Actors: V-1, LAH-1, LIC-1, LG-1, PFO, PCV, PSV. Events: V-1 overfills; LAH-1 alarms; LIC-1 loop faulty; LIC-1 reset; LG-1 can't be read by PFO; PFO notified; PFO injured; PCV opens fully; PSV opens fully. Annotations: poor operation of LCV-1 and 3" bypass; LG-1 is unreadable; PSV not sized conservatively enough.]
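For readers who want to capture a STEP-style description in a form that can be sorted and reviewed, here is a minimal sketch. The actors and actions are the ones named in Figure 4, but the `Event` class, the `build_step_worksheet` helper and, importantly, the `order` values are assumptions made for illustration. They do not reproduce the timeline established in the example investigation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    actor: str   # person or thing that acts
    action: str  # what the actor did
    order: int   # position on the timeline (assumed, for illustration)


def build_step_worksheet(events):
    """Group events into one row per actor, each row sorted along the timeline."""
    rows = defaultdict(list)
    for event in sorted(events, key=lambda e: e.order):
        rows[event.actor].append(event)
    return dict(rows)


# Actors and actions come from Figure 4; the ordering is hypothetical.
events = [
    Event("LIC-1", "loop faulty", 1),
    Event("V-1", "overfills", 2),
    Event("LAH-1", "alarms", 3),
    Event("PFO", "notified", 4),
    Event("LG-1", "can't be read by PFO", 5),
    Event("LIC-1", "reset", 6),
    Event("PCV", "opens fully", 7),
    Event("PSV", "opens fully", 8),
    Event("PFO", "injured", 9),
]

worksheet = build_step_worksheet(events)
for actor, row in worksheet.items():
    print(f"{actor:6} | " + " -> ".join(f"[{e.order}] {e.action}" for e in row))
```

Keeping one row per actor mirrors the layout of a STEP worksheet, where each actor's events are read left to right along the timeline.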
Step 7: Finding the Prime Root Cause and Assigning Relevant Corrective Actions
There can be any number of root causes when investigating an incident, but there will be just one prime root cause. And this is what
investigators are after—answering the “how” and the “why” of the incident. There’s a lot of literature out there on the topic going
back decades, including Herbert Heinrich’s “five domino model”—a sequential accident model designed to determine what caused an
accident—and hazard and barrier analysis, which offers strategies to prevent accidents. The goal is, to use OSHA’s terminology, to do
more than “cut the weeds” that caused an incident and, instead, “uproot” it completely by determining the root cause.
Today, we often see companies employ the “5 Why” methodology shown in the “Why x 5” figure below, Causal Tree Analysis, Fishbone Diagrams (also
known as Cause and Effect Diagrams) and Fault Tree Analysis (FTA). The goal of these methodologies is to get to the root cause of the
incident beyond the easy-road answer of saying “It was a human error” or something to that effect. Keep in mind that we’re not saying
human errors don’t play a role in an incident, but it’s important to go beyond the obvious to establish things like “Was the operator
properly trained?” and “Were there changes involved in the process?”
Figure 4: Why x 5
When a child asks “why” over and over again, it might grate on the nerves a bit, but keep in mind that asking lots of questions is part of
a child’s learning process. The same question-and-answer approach can help an investigator learn what the root cause of an incident
or near-miss was, so don’t be afraid to ask lots of questions.
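The repeated-question approach can be written down as a simple chain, as in the sketch below. Everything here is an assumption made for illustration: the `five_whys` helper is hypothetical, and the answers extend the simplified valve example from earlier in this paper with invented causes. In a real investigation, each answer would need to be backed by evidence.

```python
def five_whys(problem, answers, max_depth=5):
    """Walk a chain of "why?" answers and return the question/answer trail.

    answers maps a statement to the team's answer to "Why?" for that
    statement. The walk stops when no further answer is recorded or when
    max_depth questions have been asked.
    """
    trail = []
    current = problem
    for _ in range(max_depth):
        if current not in answers:
            break
        cause = answers[current]
        trail.append((current, cause))
        current = cause
    return trail


# Hypothetical chain; the last two answers are invented for illustration.
answers = {
    "Pressure built up in the separator vessel": "Too much gas entered the vessel",
    "Too much gas entered the vessel": "The compressor valve failed",
    "The compressor valve failed": "The valve was old",
    "The valve was old": "It was not on a replacement schedule",
    "It was not on a replacement schedule": "No maintenance program covered it",
}

trail = five_whys("Pressure built up in the separator vessel", answers)
for depth, (statement, cause) in enumerate(trail, start=1):
    print(f"Why {depth}: {statement}? Because: {cause}")

# The last answer reached is only a candidate root cause, to be tested.
candidate_root_cause = trail[-1][1]
```

The depth limit of five is a convention rather than a rule; the chain may reach a useful answer sooner or need more questions.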
Conclusion:
A serious incident is traumatic for everyone involved, especially for the people who witnessed the event and/or knew someone who was hurt or even killed. Often, an investigator has to be the shoulder to cry on while gathering as much information as possible from the key witnesses. Being compassionate yet thorough is most important. It’s not an easy job, but it’s an important one. It’s incumbent upon investigators to follow and enforce strict protocol to ensure the integrity of the investigation. In the end, the goal is to not only figure out what caused the incident but also to implement the proper procedures, protocols, etc., to ensure it doesn’t happen again. What we know is that if proper attention is not paid to causal factors and contributory causes, it will negatively impact the amount of knowledge gained from the incident.
In our experience, companies often fail to follow through on implementing any prescribed corrective actions, and not doing so can compromise the future safety of the organization. Even so, an investigator cannot force a company to implement corrective actions following an incident, but an investigator can offer the company critical steps it can take to help prevent future incidents and explain the importance of taking those steps. It’s then up to the organization to take the reins on mitigating any potential future risks.
Without taking the proactive steps prescribed based on the Incident Investigation, the case will never be properly closed.
“It is not necessary to be solely aware of how an incident occurred without a detailed understanding of the mechanisms of occurrence and how it might have been prevented.”
Nigel Hyatt, P.Eng. (Ontario), chartered engineer U.K., member of the Institution of Chemical Engineers, retired Process Safety Management expert
Philippe Guillard is Sphera's director of solution consulting. He is a professional engineer with a background in chemical engineering, and has been with Sphera for 15 years. During his career, Guillard has served as the lead investigator for numerous incident investigations. His career has spanned Operational Risk management, Environmental Performance, enterprise and cloud-based software management, industrial automation and control as well as process system integration. Guillard has facilitated numerous HAZOP and LOPA studies for many industries around the world. He holds a Bachelor of Engineering degree in Chemical Engineering from McGill University in Montreal, Canada, and is a registered member of the Professional Engineers of Ontario.
James Tehrani is Sphera's content marketing manager and editor-in-chief of Spark magazine. He is an award-winning writer and editor based in Chicago.
About Sphera
Sphera is the largest global provider of Integrated
Risk Management software and information services
with a focus on Environmental Health & Safety (EHS),
Operational Risk and Product Stewardship. For more
than 30 years, Sphera has advanced Operational
Excellence by serving more than 3,000 customers
and over 1 million individual users across 70-plus
countries to create a safer, more sustainable and
productive world.
www.sphera.com