
Incident Investigations: In Search of the Prime Root Cause of Major Incidents
By Philippe Guillard and James Tehrani

It’s easy to jump to conclusions when it comes to incidents, but what you perceive to be true isn’t always so. That’s why a thorough investigation is critical to figuring out the root cause of an incident or near-miss event, and using that knowledge is necessary to help prevent future unwanted events.

But looks can be deceiving in the world of Incident Investigations.
There’s a well-traveled “ambiguous illusion,”
which has become known as the “Boring Figure.”
It appears to have originated on a 19th century
German postcard, and the illustration has been
making its way around cyberspace for decades
now. (Incidentally, it’s not called “boring”
because observing it is a snore, but because
an experimental psychologist named Edwin
Boring wrote a paper about it in the 1930s.) If
you look at the picture one way, it looks like a
young woman wearing a bonnet staring off into
the distance, but if you look at the picture from a
different angle, the young woman’s ear becomes
an eye, her necklace a mouth and her neck the
chin of an older woman. A British cartoonist even
used the image in the satirical magazine Puck in
1915 with the crass title “My Wife and My
Mother-in-Law” attached to it.

There’s a good lesson here. Things are not always as they seem in art—and that holds true for workplace incidents as well. In other words, it’s wise not to assume anything when it comes to Incident Investigations. Let the facts of the investigation speak for themselves.

Image courtesy of the Library of Congress
OPERATIONAL RISK | ENVIRONMENTAL PERFORMANCE | PRODUCT STEWARDSHIP

Getting Organized

The fact is that most safety personnel in organizations don’t have experience dealing with major incidents because, thankfully, they don’t happen very often. However, if they do occur, companies need to be prepared by having a solid plan in place. After calling the appropriate authorities, attending to anyone who was hurt, evacuating if necessary and securing the site, it’s time to get organized for the investigation. It’s important to understand from the outset that an investigation will only be successful if it is performed in a completely impartial way. Any bias will hinder the investigation, and, in the process, limit the ability to learn from the incident through lagging indicators.

Additionally, an Incident Investigation needs to be organized, planned and executed in a logical and scientific manner that pays attention to:

Reducing the sense of chaos and confusion when an incident occurs.
Controlling and managing an unexpected and damaging situation.
Maintaining and advancing corporate integrity.
Minimizing losses that can occur with an accident.
Arriving at true causes and identifying significant problem areas.
Preventing repetition of incidents.
Learning from both accidents and near-misses.

When investigating an incident, it is critical that the investigation is conducted in an organized fashion that is planned and executed
in a logical and scientific manner. Also, pointing fingers and playing the “blame game” should not be part of any investigation. For the
lead investigator or investigators, objectivity is key to determining what went wrong. This includes being able to filter out the emotional
response of workers who might take things personally after an incident because a colleague was hurt or a piece of equipment they
often work with failed, etc. It’s hard to look at incidents objectively when emotions come into play, so part of an investigator’s job is to
manage any chaos with an even keel and temperament.

Figure 1: OSHA’s “Incident Investigations: A Guide for Employers”

1. Preserve/Document Scene
2. Collect Information
3. Determine Root Causes
4. Implement Corrective Actions


Starting the Investigation

Step 1: Securing the Site

Once the emergency response plan has been executed, the first step is to secure the site. Securing the site is for safety purposes to ensure no one gets hurt or no one else gets hurt. It is just as important to preserve as much of the evidence, or “clues” if you will, as possible. That might mean taking measures to ensure access to the incident site is appropriately restricted and controlled. Remember that an Incident Investigation is much less dramatic than what you would see in, say, an episode of “CSI,” but it is also much more thorough and certainly cannot be wrapped up neatly in an hour like it would on the television show. It can take days, weeks or perhaps months to get a complete picture of what caused an incident. To get to the bottom of what caused an incident, a thorough Incident Investigation should follow a methodical scientific approach.

Step 2: Witness Testimony

The second step, after the location is secured, is to get eyewitness testimony as soon as possible. You can determine the truthfulness or reliability of the information later, but in the early going it is important to talk to witnesses to hear their recounts while it’s fresh in their minds. Just like any good detective, the investigator must be objective about the circumstances and will need to talk to all the witnesses to determine the who, what, where, when, why and how. Later, you can talk to key people who might not have witnessed the incident, but can offer insight into the area where the incident occurred, the technology present, employee and maintenance schedules, etc. In the early going, however, it’s important to hear and document firsthand accounts.

Step 3: Selecting a Lead Investigator

In conjunction with Step 2, if an investigator has not been identified yet, then it is time to choose one. Depending on the significance of the event, it is important to decide whether to use internal or external investigators and determine the investigation team where applicable. In the case of a major event, an external investigator might be preferable. You can also choose a combined internal and external approach to examining the incident. Regardless of what you decide, it’s important to appoint a lead investigator to help the Incident Investigation run smoothly. Often supervisors conduct investigations, but the U.S. Occupational Safety and Health Administration (OSHA) recommends a collaborative approach to an investigation. “To be most effective,” OSHA explains, “investigations should be conducted by a team in which managers and employees work together since each brings different knowledge, understanding and perspectives to an investigation. Working together will also encourage all parties to ‘own’ the conclusions and recommendations and to jointly ensure that corrective actions are implemented in a timely manner.”

Steps 4 & 5: Notifying Authorities and Evaluating Legal & Insurance Issues

We won’t spend too much time on this here, but after you’ve selected your investigative team, it’s important that you contact your legal team, insurance company and, especially, the pertinent regulatory authorities in your country.

In the United States, OSHA, for example, has strict guidelines
for reporting injuries and deaths. If there’s a death, it must be
reported within eight hours, and if there’s an amputation, loss of
an eye or hospitalization, it must be reported within 24 hours. To
help get organized, OSHA also has a handy incident investigation
form to help companies fill out their documentation. The Canadian
Federal Workers’ Compensation Service, on the other hand,
requires companies to report incidents within three days; and
under the U.K. Health and Safety Executive agency’s Reporting
of Injuries, Diseases and Dangerous Occurrences Regulations
(RIDDOR) rules, incidents must be reported within 10 days.
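
The OSHA reporting windows described above can be turned into a simple deadline check. What follows is only a minimal sketch in Python: the outcome labels, the function name and the choice of starting timestamp are assumptions made for illustration, and the text above does not say which moment starts the clock, so confirm the details against the regulation itself.

```python
from datetime import datetime, timedelta

# Reporting windows as stated in the text above (U.S. OSHA only).
# The outcome labels are hypothetical names chosen for this sketch.
OSHA_REPORTING_WINDOWS = {
    "death": timedelta(hours=8),
    "amputation": timedelta(hours=24),
    "loss_of_eye": timedelta(hours=24),
    "hospitalization": timedelta(hours=24),
}


def osha_report_deadline(outcomes, clock_start):
    """Return the earliest reporting deadline for the given outcomes.

    Returns None when none of the outcomes appears in the table above.
    clock_start is an assumed starting timestamp for the reporting window.
    """
    windows = [OSHA_REPORTING_WINDOWS[o] for o in outcomes
               if o in OSHA_REPORTING_WINDOWS]
    if not windows:
        return None
    # When several outcomes apply, the shortest window governs.
    return clock_start + min(windows)


start = datetime(2018, 1, 1, 9, 0)
print(osha_report_deadline(["hospitalization", "death"], start))
```

Taking the shortest window when several outcomes apply is a conservative design choice, not something the text prescribes.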


Selecting a Team

Similar to a schoolyard draft for teammates where you’d likely select a soccer/football player rather than a tennis player for a game of
kickball, you want to select the best people for the investigative job. That means determining who would be the most helpful to the team
by having a good idea of what needs to be investigated. Is it a liquid process incident? Gas? Mechanical? Ask yourself: “What level of
expertise would come in handy?” Keep in mind that you might have to pull in different experts at different times during the investigation
ranging from process engineers to mechanical engineers to metallurgy specialists, etc.

It’s important to explain to the team not to jump to conclusions. That said, the team’s job is to form a working hypothesis of what caused
the incident and why.

Step 6: Develop a Working Hypothesis & Sequence of Events
Now that you’ve tapped a lead investigator, selected a team,
interviewed the key witnesses and contacted the appropriate
people and regulatory organizations about the event, it’s time
to get out your Sherlock Holmes magnifying glass and start the
investigation. You’ll want to take lots of photographs, collect
as many relevant samples as you can and possibly enlist a
disassembly team to take apart equipment that needs to be
examined thoroughly. Make sure your team wears the proper
personal protective equipment when investigating, and don’t
discard anything unless it has been properly documented and
examined by the investigative team.


Forming a Working Hypothesis

Before conducting any science experiment, it’s important to develop a hypothesis of what you think happened. The same holds true with any Incident Investigation. For example: “My hypothesis is that the valve on the gas compressor failed because it was old and that caused too much gas to enter the separator vessel. This led to an excessive pressure buildup.” (This is obviously a much more simplistic hypothesis than what one would actually produce during a real investigation.) The next step is to test the hypothesis or, in the case of an Incident Investigation, figure out why an incident or near-miss occurred.

It’s not uncommon for there to be multiple hypotheses in the beginning of an investigation, but it’s important to boil these down to one working hypothesis through a screening method. One way of doing that is to assign a score or point total to each alternate hypothesis based on the evidence and research. For instance, on a 10-point scale, how likely is this hypothesis? If it’s not very likely, it gets assigned a low number from the team, but if it’s very likely, it gets a higher number attached to it. Keep in mind that, in many cases, the working hypothesis will evolve as more information comes to light.
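
The 10-point screening described above could be tallied along the following lines. This is a sketch under stated assumptions: the hypotheses, the individual scores, the 1-to-10 range and the use of a simple average across team members are all invented for illustration and are not a method the text prescribes.

```python
from statistics import mean


def screen_hypotheses(scores_by_hypothesis):
    """Rank hypotheses by mean team score (assumed 1-10 scale), highest first."""
    for name, scores in scores_by_hypothesis.items():
        if not scores or any(not 1 <= s <= 10 for s in scores):
            raise ValueError(f"scores for {name!r} must be non-empty and within 1-10")
    return sorted(
        ((name, mean(scores)) for name, scores in scores_by_hypothesis.items()),
        key=lambda item: item[1],
        reverse=True,
    )


# Hypothetical scores from three team members for each candidate hypothesis.
team_scores = {
    "compressor valve failed due to age": [8, 7, 9],
    "operator left bypass open": [4, 5, 3],
    "level controller fault": [6, 6, 7],
}

ranking = screen_hypotheses(team_scores)
working_hypothesis = ranking[0][0]
print(working_hypothesis)
```

Because the working hypothesis is expected to evolve, the scores would be revisited and the ranking recomputed as new evidence comes in.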

Figure 2: Equipped for the Job

A sample list of some of the items that could be useful during an investigation:

1. Camera
2. Charged batteries (for phones, cameras, equipment, etc.)
3. Video/audio recorder
4. Measuring devices in various sizes
5. Leveling rod
6. Clipboard and writing pad
7. Pens, pencils, markers
8. Graph paper
9. Straight-edge ruler
10. Incident investigation forms
11. Flashlight
12. Strings, stakes
13. Personal protective equipment: Gloves, hats, eyewear, ear plugs, face masks, etc.
14. Magnifying glass
15. High-visibility plastic tape to mark off the area

Source: OSHA’s “Incident Investigations: A Guide for Employers”



EXPERT TIP: Most companies focus on the incident itself with good reason, but one area that sometimes gets
overlooked is organizing a disassembly team to take apart equipment to be analyzed later. You want to include workers
on the team who can handle the job without contaminating or getting rid of any potential evidence about what might
have caused the incident.

Validating Your Working Hypothesis

As we mentioned in the beginning of this e-book: Things are not always as they seem. That’s why after establishing a working hypothesis, it’s time to go out and see if it holds true.

While conducting interviews and gathering facts, begin the process of event sequencing. This is a critical step in determining what happened first, second, third, etc. This will help you sniff out the prime root cause, which we will discuss shortly. Every incident has a sequence of events that took place to cause it. Think of it this way: If you’re in a car accident, there are multiple things that had to happen for the accident to take place. Maybe you were driving 30 mph, but the car behind you was tailgating. A deer ran out into the street, so you hit the brakes to avoid it. The car behind you was too close, so it couldn’t stop in time. This is a simple example, but remember that an Incident Investigation can often be quite complex.
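
Witness accounts rarely arrive in chronological order, so event sequencing usually starts by sorting what was recorded. The sketch below orders the car-accident example by time; the `Event` type and the time offsets in seconds are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seconds: float  # offset from an arbitrary reference point (invented values)
    actor: str
    action: str


def sequence_events(events):
    """Return events in chronological order; ties keep their recorded order."""
    return sorted(events, key=lambda e: e.seconds)


# Observations as they might be collected from interviews, out of order.
observations = [
    Event(3.0, "rear car", "cannot stop in time and hits lead car"),
    Event(0.0, "lead car", "travels at 30 mph"),
    Event(2.0, "lead car", "brakes hard"),
    Event(0.0, "rear car", "tailgates lead car"),
    Event(1.5, "deer", "runs into the street"),
]

for n, event in enumerate(sequence_events(observations), start=1):
    print(f"{n}. [{event.actor}] {event.action}")
```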

Eliminating the immediate causes is like cutting weeds, while eliminating the root causes is equivalent to pulling out the roots so that the weed cannot grow back.

—OSHA’s “Incident Investigations: A Guide for Employers”


One commonly used playbook for event sequencing is the Sequentially Timed Events Plotting (STEP) process developed by Kingsley Hendrick and Ludwig Benner Jr. in the book “Investigating Accidents With STEP.”

Figure 3 shows an example schematic for part of a gas compression train. For a rupture incident of the V-2 vessel, Figure 4 shows the STEP process being used to “describe” the incident. This is based on the working hypothesis of the investigative team supported by the factual data and established timeline of the incident.

Figure 3: STEP Example: Schematic for Gas Compression Train

[Schematic. Labels shown: feed (12" line); gas & liquid separator vessel V-1 (300 psig) with demister; separator vessel V-2 (50 psig); flash gas; off-gas to compressor (10" line); PCV; PSV to flare; compressor shutdown interlock; LAHH; LAH-1; LG-1; LIC-1; LIC-2; LCV-1; LCV-2; 4" line; 6" line; 3" bypass; liquid effluent; plant field operator.]


As the book states, “[A]ccidents should be investigated in a way that is compatible with the way a productive process is designed. All
processes involve actors, whether people or things, who act to introduce changes. Those actions are called events. Changes of state
occur as events interact during processes. Processes start with the first event which initiates a change of state and end when a new
state or outcome has been reached.”

Figure 4: STEP Diagram for Gas Compression Train Accident

[Diagram. Each actor has its own row of events plotted along a timeline that begins at “Start.”]

Annotations on the timeline: Poor operation of LCV-1 & 3" bypass; PSV not sized conservatively enough; LG-1 is unreadable.

Actors and their events:
CCO: CCO notifies PFO
V-1: V-1 overfills
LAH-1: LAH-1 alarms
LIC-1 loop: LIC-1 faulty; LIC-1 reset
LCV-1: LCV-1 opens fully; LCV-1 gas breakthrough
LG-1: LG-1 can't be read by PFO
3" bypass: 3" bypass opened; 3" bypass gas breakthrough
PFO: PFO notified; PFO injured
V-2: V-2 overpressures above MAWP; V-2 ruptures
PCV: PCV opens fully
PSV: PSV opens fully
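
A STEP worksheet like Figure 4 is essentially a table with one row per actor and events placed along a shared timeline. The sketch below builds such rows in Python. The actor and event labels are taken from Figure 4, but the time-step numbers are assumptions made for illustration and should not be read as the investigators' actual timeline.

```python
from collections import defaultdict

# (time_step, actor, action). Labels come from Figure 4; the time_step
# values are ASSUMED here purely to demonstrate the layout.
events = [
    (1, "LIC-1 loop", "LIC-1 faulty"),
    (2, "LCV-1", "LCV-1 opens fully"),
    (3, "LCV-1", "LCV-1 gas breakthrough"),
    (4, "V-2", "V-2 overpressures above MAWP"),
    (5, "PSV", "PSV opens fully"),
    (6, "V-2", "V-2 ruptures"),
    (6, "PFO", "PFO injured"),
]


def step_worksheet(events):
    """Group events into one time-ordered row per actor."""
    rows = defaultdict(list)
    for time_step, actor, action in sorted(events, key=lambda e: e[0]):
        rows[actor].append((time_step, action))
    return dict(rows)


for actor, row in step_worksheet(events).items():
    print(f"{actor:<11}" + " -> ".join(f"t{t}: {a}" for t, a in row))
```

Keeping one row per actor makes gaps visible: an actor with no event at a step where something clearly changed is a prompt to go back and collect more facts.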


Step 7: Finding the Prime Root Cause and Assigning Relevant Corrective Actions
There can be any number of root causes when investigating an incident, but there will be just one prime root cause. And this is what
investigators are after—answering the “how” and the “why” of the incident. There’s a lot of literature out there on the topic going
back decades, including Herbert Heinrich’s “five domino model”—a sequential accident model designed to determine what caused an
accident—and hazard and barrier analysis, which offers strategies to prevent accidents. The goal is, to use OSHA’s terminology, to do
more than “cut the weeds” that caused an incident and, instead, “uproot” it completely by determining the root cause.

Today, we often see companies employ the “5 Why” methodology illustrated in the “Why x 5” figure below, Causal Tree Analysis, Fishbone Diagrams (also
known as Cause and Effect Diagrams) and Fault Tree Analysis (FTA). The goal of these methodologies is to get to the root cause of the
incident beyond the easy-road answer of saying “It was a human error” or something to that effect. Keep in mind that we’re not saying
human errors don’t play a role in an incident, but it’s important to go beyond the obvious to establish things like “Was the operator
properly trained?” and “Were there changes involved in the process?”
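
To give a flavor of Fault Tree Analysis, the sketch below combines basic events through AND/OR gates to decide whether a top event occurs. The gate structure is an assumption loosely inspired by the labels in the STEP diagram of the V-2 rupture; it is not the authors' analysis, and a real fault tree would be far more detailed.

```python
from itertools import product


def or_gate(*inputs):
    """Output is true if any input event occurs."""
    return any(inputs)


def and_gate(*inputs):
    """Output is true only if all input events occur."""
    return all(inputs)


def v2_ruptures(lcv1_gas_breakthrough, bypass_gas_breakthrough, psv_undersized):
    """Hypothetical top event: an overpressure source AND inadequate relief."""
    overpressure_source = or_gate(lcv1_gas_breakthrough, bypass_gas_breakthrough)
    return and_gate(overpressure_source, psv_undersized)


# Enumerate every combination of basic events that produces the top event.
names = ("LCV-1 gas breakthrough", "3-inch bypass gas breakthrough", "PSV undersized")
failing_combinations = [
    combo for combo in product([False, True], repeat=3) if v2_ruptures(*combo)
]
for combo in failing_combinations:
    print([name for name, occurred in zip(names, combo) if occurred])
```

Listing the combinations that trigger the top event shows why “human error” alone is rarely a complete answer: in this toy tree, no single basic event is enough on its own.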

Figure 5: Why x 5

When a child asks “why” over and over again, it might grate on the nerves a bit, but keep in mind that asking lots of questions is part of
a child’s learning process. The same question-and-answer approach can help an investigator learn what the root cause of an incident
or near-miss was, so don’t be afraid to ask lots of questions.

Here’s an example of what that might look like:

Q: Why was the toxic chemical released from the tank?
A: Because the gasket near the pipe was leaking.

Q: Why was the gasket leaking?
A: Because it was made of an incompatible material.

Q: Why was a gasket with incompatible material used?
A: Because the wrong one was taken from the warehouse during the last maintenance activity.

Q: Why was the wrong gasket taken from the warehouse?
A: Because the job supervisor was too busy to check it before it was installed.

Q: Why was the supervisor too busy?
A: Because the maintenance was taking longer than expected, and there was pressure to get the equipment back into service to support production.
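
The question-and-answer chain above can be recorded as a simple ordered list, which keeps the reasoning auditable. This is a minimal sketch; the function names are hypothetical, and treating the last answer as the deepest cause reached is a simplification, since a real investigation may branch or keep asking.

```python
# Each entry is (question, answer), in the order the questions were asked.
five_whys = [
    ("Why was the toxic chemical released from the tank?",
     "The gasket near the pipe was leaking."),
    ("Why was the gasket leaking?",
     "It was made of an incompatible material."),
    ("Why was a gasket with incompatible material used?",
     "The wrong one was taken from the warehouse during the last maintenance activity."),
    ("Why was the wrong gasket taken from the warehouse?",
     "The job supervisor was too busy to check it before it was installed."),
    ("Why was the supervisor too busy?",
     "The maintenance was taking longer than expected, and there was pressure "
     "to get the equipment back into service to support production."),
]


def deepest_cause(chain):
    """Return the answer to the final 'why', the deepest cause reached so far."""
    if not chain:
        raise ValueError("the chain is empty")
    return chain[-1][1]


def render_chain(chain):
    """Indent each question-and-answer pair one level deeper than the last."""
    lines = []
    for depth, (question, answer) in enumerate(chain):
        indent = "  " * depth
        lines.append(f"{indent}Q: {question}")
        lines.append(f"{indent}A: {answer}")
    return "\n".join(lines)


print(render_chain(five_whys))
print("Deepest cause reached:", deepest_cause(five_whys))
```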


Investigations often don’t go far enough in identifying the root cause. In fact, they often end when it is determined that someone can be blamed for an incident. Sometimes the root cause is listed as “operator made a mistake while following procedure X,” and the action listed is to “provide more training.” But the root cause has not been identified: Why was the mistake made? What other underlying factors were in play?

Establishing a root cause is the “golden goose” in helping to put a meaningful, corrective plan in place to ensure the incident does not occur again. However, don’t underestimate the possibility that there are multiple root causes that led to the incident. There are a number of causation theories to consider. Heinrich’s Domino Theory of Causation says accidents are the result of a chain of events that fall in line much like dominoes. The Multiple Causation Theory is similar but says there could be multiple contributing causes and subcauses that led to the incident. There are also several root-cause theories, including the aforementioned “5 Whys.”

Conclusion:

A serious incident is traumatic for everyone involved, especially for the people who witnessed the event and/or knew someone who was hurt or even killed. Often, an investigator has to be the shoulder to cry on while gathering as much information as possible from the key witnesses. Being compassionate yet thorough is most important. It’s not an easy job, but it’s an important one. It’s incumbent upon investigators to follow and enforce strict protocol to ensure the integrity of the investigation. In the end, the goal is to not only figure out what caused the incident but also to implement the proper procedures, protocols, etc., to ensure it doesn’t happen again. What we know is that if proper attention is not paid to causal factors and contributory causes, it will negatively impact the amount of knowledge gained from the incident.

In our experience, companies often fail to follow through on implementing any prescribed corrective actions, and not doing so can compromise the future safety of the organization. Even so, an investigator cannot force a company to implement corrective actions following an incident, but an investigator can offer the company critical steps it can take to help prevent future incidents and explain the importance of taking those steps. It’s then up to the organization to take the reins on mitigating any potential future risks.

Without taking the proactive steps prescribed based on the Incident Investigation, the case will never be properly closed.

‘It is not necessary to be solely aware of how an incident occurred without a detailed understanding of the mechanisms of occurrence and how it might have been prevented.’
Nigel Hyatt, P.Eng. (Ontario), chartered engineer U.K., member of the Institution of Chemical Engineers, retired Process Safety Management expert

Philippe Guillard is Sphera's director of solution consulting. He is a professional engineer with a background in chemical engineering, and has been with Sphera for 15 years. During his career, Guillard has served as the lead investigator for numerous incident investigations. His career has spanned Operational Risk management, Environmental Performance, enterprise and cloud-based software management, industrial automation and control as well as process system integration. Guillard has facilitated numerous HAZOP and LOPA studies for many industries around the world. He holds a Bachelor of Engineering degree in Chemical Engineering from McGill University in Montreal, Canada, and is a registered member of the Professional Engineers of Ontario.

James Tehrani is Sphera's content marketing manager and editor-in-chief of Spark magazine. He is an award-winning writer and editor based in Chicago.

About Sphera
Sphera is the largest global provider of Integrated
Risk Management software and information services
with a focus on Environmental Health & Safety (EHS),
Operational Risk and Product Stewardship. For more
than 30 years, Sphera has advanced Operational
Excellence by serving more than 3,000 customers
and over 1 million individual users across 70-plus
countries to create a safer, more sustainable and
productive world.

www.sphera.com

For more information contact us at: sphera.com/contact-us/

®2018 Sphera. All Rights Reserved.
