
1.6 Process Alarm Management


M. S. MANNAN, H. H. WEST

(2005)

INTRODUCTION
Recently, major incidents have been caused or worsened by
the excessive and undisciplined growth in the quantity of
alarms. For that reason, effective alarm management
(A.M.) has moved to the top of many plant project lists.
Many currently installed computer control systems provide only limited information regarding abnormal operation.
The operator information is often not enhanced by the computer control system, and the operator is forced to search out
trends and use intuition and experience to evaluate abnormal
plant status.
Alarm growth is a natural outcome of the increased information load provided by modern control systems. However,
if alarms are not managed in a disciplined manner, uncontrolled alarm growth can result, leading to ineffective and
potentially dangerous alarm situations.
A structured approach to alarm management has emerged
to increase alarm effectiveness and thereby overall plant safety.
History of Alarm Systems
In the not-too-distant past, alarm systems consisted of a few
selected process measurements, which were hard-wire connected to panel board mounted annunciators or indicator lights,
which activated when the measurements exceeded some predefined limits. These panels provided alarm annunciation to
the plant operator. The panels were large but limited in capacity
and thereby tended to limit the number of configured alarms.
Modern distributed control systems (DCS) and programmable electronic systems (PES) are capable of defining limits
for each field measurement. Furthermore, calculated parameters, such as rate of change or a combination of field measurements, can also have defined limits. Therefore, computer-based control systems have the capability to vastly
increase the number of configured alarms.
Alarms not only increase the amount of information provided to the operator but can often also be a source of operator
overload and confusion. There have been a number of major
incidents that might have been prevented if the plant operator
had recognized critical alarms among the flood of alarms that
were activated.[1] One notable example of an alarm problem
was the Three Mile Island accident in 1979, where important
alarms were missed because of the flood of alarms received.

Another example of alarm overload was the Texaco refinery
explosion at Milford Haven in 1994, where, in the last 10
minutes prior to the explosion, the operators had to respond
to alarms that were annunciating at a rate of about three per
second.
Definitions
The most common definition of an alarm is a signal designed
to alert, inform, guide or confirm deviations from acceptable
system status. Similarly, an alarm system is defined as a
system for generating and processing alarms to be presented
to users.
Another common definition is that an alarm system is
designed to direct the operator's attention toward significant
aspects of the current plant status. In other words, an alarm
system should serve to help the operator to manage his time.
Because the DCS processes information on a tag
point-by-tag point basis, both alarms and non-safety related events,
such as condition monitoring alerts, equipment fault status, or
journal logging events for engineering or maintenance attention, have often been described as alarms.
Hence, the A.M. system must address events that need
operator attention and various other non-safety related events.
Figure 1.6a provides a graphic description of the relationship
between alerts and safety-related alarms. The distinction that
is emerging between the two can be stated as follows: An
Alarm requires operator action, while an Alert is informational
in its nature and may not require a specific action by the
operator.
Alarm Basics
The purpose of alarm systems is to alert operators to plant
conditions that deviate from normal operating limits and
thereby require timely action or assessment. Hence, the purpose of an alarm is action.
Matters that are not worthy of operator attention should
not be alarmed but rather placed into a logging journal database for the potential use by other plant staff.
An alarm is useless if the operator does not know what
action is required in response to it. Similarly, if the operator
knows the correct action but cannot implement the action,
the alarm is also useless. Useless alarms should not exist.

© 2006 by Béla Lipták


General

Logging may be a suitable alternative to be used to record
non-safety related discrepancy events to prevent the unnecessary
sounding of alarms. A system for assessing the significance of
logged events to ensure timely intervention by maintenance personnel is absolutely required.

FIG. 1.6a
Graphic description of the relationship between alerts and safety-related alarms. (Labels in the figure: normal operating range, condition monitoring, alerts, operator alarm, protective systems, safety instrumented system.)

A.M. and Safety Instrumentation
Although alarm systems may not always have safety implications, they do have a role in enabling operators to reduce
the need for automatic shutdowns initiated by safety systems.
However, in cases where a risk reduction factor of 10^-1 failures on demand is claimed, the safety system includes both
the alarm system and the operator. Hence, the total safety
system requires a suitable safety integrity level.[3,4]
Many Layers of Protection Analysis (LOPA) safety integrity level determination studies have incorporated alarm
management as an existing layer of protection, thereby assigning the alarm system components some designated level of
reliability.
Alarms that are not designated as safety should be carefully designed to ensure that they fulfill their role in reducing
demands on the safety-related systems.
For all alarms, regardless of their safety designation, attention is required to ensure that under abnormal conditions or
under severe emergency situations, the alarm system remains
effective and the limitations of the speed of human response are
recognized in its design.

A.M. COSTS AND DESIGNS

The Honeywell Abnormal Situation Management Consortium
members estimated that a medium-sized petrochemical company consisting of about six facilities (refineries, chemical
plants, etc.) might accumulate $50 million to $100 million in
yearly losses due to plant upsets and other production-related
incidents. They also noted that losses caused by plant upsets
cost about 3 to 5% of the total throughput of the plant. The US
National Institute of Standards and Technology estimated that
plant upsets cost the national economy about $20 billion annually. These estimates may actually underestimate the real situation. Not included in these cost estimates are the rare tragic
incidents such as the Phillips Pasadena disaster or the Piper
Alpha tragedy.

Alarm Set Points
The type of alarm and its set point must be established to
enable the operator to make the necessary assessment and
take the required timely action.
Engineering analysis of the potential timing of events that
generate alarms requires good control system engineering.
Then alarm set points must be documented and controlled in
accordance with the alarm system management control. Timing of alarms and potential operator reaction time are both
critical. Figure 1.6b illustrates how alerts can progress into
alarms and eventually into safety shutdowns.
The recommended best practice is that the engineering
analysis of alarm set points be coordinated with the plant
hazard and operability (HAZOP) studies.

FIG. 1.6b
Alerts can progress into alarms and eventually into safety shutdowns. (Axes: compressor suction drum level versus time. Levels marked: normal operating range, alert, alarm, high level alarm, safety instrumented system shutdown.)

Alarm Presentation
The alarm interface provided for the operator must be effective.
Alarms may be presented on annunciation panels, individual
indicators, CRT screens, or programmable display devices.
Alarm lists should be carefully designed to ensure that high-priority alarms are readily noted and recognized, while low-priority alarms are not overlooked, and that the list remains
readable even during times of high alarm activity or when
alarms occur repeatedly.
Alarms should be prioritized in terms of which alarms
require the most urgent operator attention. Alarms should be
presented within the operator's field of view and use consistent presentation style (color, flash rate, inscription convention). The alarms should be presented in an explicit manner,
should be distinguishable from other alarms, have the highest
priority, and when activated, remain on view all the time.
The presentation of alarms should not exceed that which
the operator is capable of acting upon, or alternatively the
alarms should be prioritized and presented in such a way that
the operator may deal with the most important alarms first
and without distraction from the others.
Each alarm should provide sufficient information to the
operator so that the operator will know the condition that
caused the alarm, the plant affected, the corrective action that
is required, priority of the alarm involved, the time of alarm
initiation, and the alarm status.
The visual display device may be augmented by audible
warnings. Where there are multiple audible warnings, they
should be designed so that they are readily distinguished from
each other and from emergency alarm systems. They should
be designed to avoid distraction of the operator in high workload situations.
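As a minimal sketch of the information items listed above, the record below carries the condition, the plant affected, the corrective action, the priority, the time of initiation, and the status of an alarm, and orders the alarms so that the most important ones are dealt with first. The names AlarmRecord, Priority, and presentation_order, the four priority levels, the status strings, and the demo data are illustrative assumptions and do not come from any particular control system.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical priority levels; a lower value is more urgent."""
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class AlarmRecord:
    """Information shown to the operator for one alarm (illustrative fields)."""
    tag: str                # hypothetical instrument tag
    condition: str          # condition that caused the alarm
    plant_area: str         # plant affected
    corrective_action: str  # action required of the operator
    priority: Priority      # priority of the alarm
    initiated_at: datetime  # time of alarm initiation
    status: str             # "unacknowledged", "acknowledged", or "cleared"


def presentation_order(alarms):
    """Return alarms that are not cleared, most urgent first, oldest first within a priority."""
    active = [a for a in alarms if a.status != "cleared"]
    return sorted(active, key=lambda a: (a.priority, a.initiated_at))


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    demo = [
        AlarmRecord("LI-101", "High level", "Compressor suction drum",
                    "Reduce feed and check the drain valve", Priority.HIGH,
                    datetime(2005, 1, 1, 8, 5), "unacknowledged"),
        AlarmRecord("TI-205", "High temperature", "Reactor",
                    "Increase cooling water flow", Priority.CRITICAL,
                    datetime(2005, 1, 1, 8, 7), "unacknowledged"),
    ]
    for alarm in presentation_order(demo):
        print(alarm.priority.name, alarm.tag, alarm.condition, "->", alarm.corrective_action)
```

Sorting on priority first and initiation time second is only one possible design choice; a plant may prefer a different ordering within a priority level.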

Alarm Processing
The alarms should be processed in such a manner as to avoid
operator overload at all times (avoiding alarm floods). The
alarm processing should ensure that fleeting or repeating alarms
do not result in operator overload even under the most severe
conditions. Alarm processing techniques such as filtering, the
use of dead-band, de-bounce timers, and shelving have been
successfully exploited by industry. For shelving alarms, a rule
of thumb that has been used by some plants is that an alarm is
shelved if it is repeated 10 times in 10 minutes.
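The techniques named in this paragraph can be sketched in a few lines. The example below is a hedged illustration, not a vendor implementation: HighAlarm combines a dead-band with an on-delay (de-bounce) timer, and ShelvingMonitor applies the rule of thumb of 10 repeats in 10 minutes. The class names, parameter names, and the choice of an on-delay timer are assumptions made for the example.

```python
from collections import deque


class HighAlarm:
    """High alarm with a dead-band and an on-delay (de-bounce) timer. Hypothetical sketch."""

    def __init__(self, set_point, dead_band, on_delay_s):
        self.set_point = set_point
        self.dead_band = dead_band
        self.on_delay_s = on_delay_s
        self.active = False
        self._above_since = None  # time at which the value first exceeded the set point

    def update(self, value, t):
        """Feed one sample (value at time t, in seconds) and return the alarm state."""
        if self.active:
            # Dead-band: clear only once the value falls below set_point - dead_band.
            if value < self.set_point - self.dead_band:
                self.active = False
                self._above_since = None
        elif value > self.set_point:
            # De-bounce: the excursion must persist for on_delay_s before alarming.
            if self._above_since is None:
                self._above_since = t
            if t - self._above_since >= self.on_delay_s:
                self.active = True
        else:
            self._above_since = None  # excursion ended before the delay expired
        return self.active


class ShelvingMonitor:
    """Flags an alarm for shelving after max_repeats activations within window_s seconds."""

    def __init__(self, max_repeats=10, window_s=600):
        self.max_repeats = max_repeats
        self.window_s = window_s
        self._activations = deque()

    def record_activation(self, t):
        """Record an activation at time t (seconds); return True if shelving is due."""
        self._activations.append(t)
        # Drop activations that have fallen out of the sliding window.
        while self._activations and t - self._activations[0] > self.window_s:
            self._activations.popleft()
        return len(self._activations) >= self.max_repeats
```

With a set point of 80, a dead-band of 2, and a 5-second delay, a brief spike above 80 raises no alarm, and an active alarm clears only after the value drops below 78.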
Applicable alarm processing techniques include grouping and first-out alarms, suppression of lower-priority alarms
(e.g., suppressing the high alarm when the high-high activates), suppression of out-of-service plant alarms, suppression
of selected alarms during certain operating modes, automatic
alarm load shedding, and shelving.
Care should be taken in the use of shelving or suppression
so that controls exist to ensure that alarms are returned
to an active state when they are relevant to plant operation.
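Two of the suppression rules described above can be illustrated with a short sketch: hiding the high alarm while the high-high alarm on the same tag is active, and hiding alarms from plant that is out of service. The function name visible_alarms, the "HI" and "HIHI" labels, and the tag names in the usage note are hypothetical.

```python
def visible_alarms(active, out_of_service=frozenset()):
    """Apply two suppression rules to the currently active alarms.

    active: set of (tag, kind) pairs, where kind is a label such as "HI" or "HIHI"
    out_of_service: set of tags belonging to plant that is out of service
    Returns the (tag, kind) pairs to present, sorted for a stable display order.
    """
    shown = []
    for tag, kind in sorted(active):
        if tag in out_of_service:
            continue  # suppress alarms from out-of-service plant
        if kind == "HI" and (tag, "HIHI") in active:
            continue  # suppress the high alarm while the high-high is active
        shown.append((tag, kind))
    return shown
```

Because the function is re-evaluated from the current plant state on every call and stores no suppression flags, a suppressed alarm reappears as soon as its suppressing condition ends. For example, visible_alarms({("LI-101", "HI"), ("LI-101", "HIHI")}) presents only the high-high alarm, and the high alarm returns once the high-high clears.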

IMPROVING THE PLANT'S A.M.

In order to improve alarm management (A.M.) in a plant, it
is first necessary to identify the symptoms of the problems
that exist. Next it is desirable to note available elements of
A.M. as the potential tools of correction. Once this has been done,
an A.M. improvement and rationalization project can be initiated. An example of such a project will be described later
in this section.

Symptoms of A.M. Problems
The list below identifies some of the symptoms that might
signal problems with the design of the A.M. system:

• Minor operating upsets produce a significant number of alarm activations.
• Routine operations produce a large number of alarm activations that serve no useful purpose.
• Alarm activations occur without need for operator action.
• Major operating upsets produce an unmanageable number of alarm activations.
• Operators do not know what to do when a particular alarm occurs.
• Occasional high alarm activation rates occur.
• A large number of configured high-priority alarms occur.
• Alarm conditions are not corrected for long periods of time.
• Alarms chatter or are frequently repeated or transient.
• No plant-wide philosophy for the alarm system management has been established.
• There are no guidelines for when to add or delete an alarm.
• Operating procedures are not tied to alarm activations.
• There are active alarms when the plant is within normal operating parameters.
• There is a lack of documentation on alarm set points.
• The system does not distinguish safety, operational, environmental, and informational events.
• The schedule of alarm testing is erratic.
• A large number of defeated alarms occurs.
• Any operator or engineer can add an alarm or change the set points on his or her own authority.

The Tools and Elements of A.M.


Some of the essential elements of alarm management are:

• An alarm management philosophy document
• Written operating procedures for alarm response
• Ownership of the alarm system
• Alarm prioritization
• Alarm configuration records
• Alarm presentation
• Alarm logical processing, such as filtering and suppression
• Informational alert management
• Operator training
• Alarm testing
• Management of changes made to the alarm system

Management systems should be in place to ensure that the alarm
system is operated, maintained, and modified in a controlled
manner. Alarm response procedures should be available, and
alarm parameters should be documented.

The performance of the alarm system should be assessed
and monitored to ensure that it is effective during normal and
abnormal plant conditions. The monitoring should include
evaluation of the alarm presentation rate, operator acceptance
and response times, operator workload, standing alarm count
and duration, repeat or nuisance alarms, and operator views
of operability of the system. Monitoring may be achieved by
regular and systematic auditing.
A.M. Improvement and Rationalization Project
Alarm rationalization is a systematic procedure for the analysis and improvement of the DCS capability used to alert
operators of alarm conditions. This process normally results
in a reduction in the total number of configured alarms. Rationalization may occasionally identify the need for new alarms
or changes in the alarm priority. Alarm rationalization is a
structured procedure that involves a project team, with representatives from operations, maintenance, engineering, and
safety.
The rationalization procedure includes a dozen essential
steps, each of which is listed and described under separate
titles in the paragraphs that follow:
Assessing the Current Alarm System Performance
Typical metrics needed to assess the current and then later the improved
alarm system include:

• Rate of alarm activation
• Pattern of alarm activations
• Priority alarm activations
• Number of standing alarms
• Time before acknowledge
• Spread between trip points
• Time before clearing
• Chattering alarms
• Correlated alarms
• Number of disabled alarms
• Total number of alarms
• Alarms per operator
• Alarm frequency by shift
• Alarm rate during emergencies
• Fraction of unacknowledged alarms
• Average time to return to normal
• Nuisance alarms
• Disabled alarms
• Bypassed alarms
• Shelved alarms

Operational metrics must also be used, such as production rate, off-quality production, number of upsets, and any
other factors that the plant considers important or relevant.
Safety and environmental metrics that must be used are the
number of plant shutdowns, number of incidents/near misses,
releases to the atmosphere, and pressure relief activations or
releases to flare.
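A few of the alarm-system metrics listed above can be computed directly from an alarm journal. The sketch below is a minimal illustration under stated assumptions: the journal is taken to be a list of (time in seconds, tag, event kind) tuples with the hypothetical kinds "ACT", "ACK", and "CLR", and a tag is called chattering if it activates three times within 60 seconds. The function names and thresholds are assumptions. The default rate target mirrors the figure of less than one alarm per 10 minutes that this section quotes for benchmarking.

```python
from collections import defaultdict


def journal_metrics(events, period_s, chatter_count=3, chatter_window_s=60):
    """Summarize an alarm journal.

    events: iterable of (time_s, tag, kind) with kind in {"ACT", "ACK", "CLR"}
    period_s: length of the journal period in seconds
    """
    activations = defaultdict(list)  # tag -> list of activation times
    pending_ack = {}                 # tag -> time of the first unacknowledged activation
    standing = set()                 # tags that are active and not yet cleared
    ack_delays = []                  # seconds between activation and acknowledgment

    for t, tag, kind in sorted(events):
        if kind == "ACT":
            activations[tag].append(t)
            pending_ack.setdefault(tag, t)
            standing.add(tag)
        elif kind == "ACK" and tag in pending_ack:
            ack_delays.append(t - pending_ack.pop(tag))
        elif kind == "CLR":
            standing.discard(tag)

    total = sum(len(times) for times in activations.values())
    # A tag chatters if chatter_count activations fall within chatter_window_s.
    chattering = sorted(
        tag for tag, times in activations.items()
        if any(times[i + chatter_count - 1] - times[i] <= chatter_window_s
               for i in range(len(times) - chatter_count + 1))
    )
    return {
        "total_activations": total,
        "activations_per_10_min": total / (period_s / 600),
        "standing_alarms": len(standing),
        "mean_time_to_acknowledge_s": sum(ack_delays) / len(ack_delays) if ack_delays else None,
        "chattering_tags": chattering,
    }


def meets_rate_target(metrics, max_per_10_min=1.0):
    """Compare the activation rate with a target rate per 10 minutes."""
    return metrics["activations_per_10_min"] < max_per_10_min
```

Running the same function on journals taken before and after a rationalization project gives a like-for-like comparison of the two alarm systems.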


The amount of information needed to develop these metrics is daunting, thereby requiring special software to sort the
alarm journals in an efficient manner. Many DCS and Supervisory Control and Data Acquisition (SCADA) systems and
vendors provide alarm management features or products.
However, some of these are either primitive or do not serve
the necessary purpose by themselves, though they are generally improving.
There also are some add-on alarm software products on
the market that enhance a control system's basic alarm capabilities by providing online A.M., advanced logical processing
of alarms, alarm pattern recognition, and dynamic reconfiguration of the alarm system for varying operating conditions.
Developing an A.M. Philosophy Document
A consistent, comprehensive alarm management procedure/philosophy is necessary before beginning an alarm rationalization project.
This procedure typically covers alarm type (quality, safety,
environmental, maintenance information), method of prioritization of alarms, alarm logical processing methods, and testing
requirements.
Reviewing the Basis for Alarm Set Points
Evaluation of the
alarm configuration file and the accompanying engineering
reports is required to verify the basis for the alarm set points.
If these are not available, then an engineering study must be made to
recreate the basis for the alarm set points.
Identifying the purpose of the alarm and its correlation
to other alarms is particularly important. Analysis of the
alarm purpose includes defining the consequences of inaction
to alarm notification.
Analyzing Alarm Histories
Dynamic analysis of the alarm
journals of several previous months is required to get a statistically valid sample. A disadvantage is that correlating the alarms with operational, process, or equipment events may require information that is not available
in the electronic database.
Prioritizing the Alarms
The significance of the alarm must
be determined through a ranking scheme or identified during
the HAZOP study. The process hazard/risk analysis must
determine the level of importance that is associated with the
operator detecting the alarm and performing the expected
action. Prioritization helps ensure that the operator understands
the importance of the alarm itself as well as the importance of
the alarm in relationship to other alarms.
The prioritization scheme is generally limited to the control
system's capabilities along with third-party alarm management
software on the system. The number of prioritization levels
should be kept to a minimum to minimize operator confusion.
Typical alarm priority categories can be critical, high, medium,
and low.
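One common way to build such a ranking scheme is a small lookup matrix. The sketch below is purely illustrative: the two ranking inputs (severity of the consequence of inaction and how quickly the operator must respond), their category names, and every cell of the matrix are assumptions made for the example, not values taken from this section or from any standard. Only the four output categories come from the text above.

```python
# Hypothetical ranking matrix: (severity of inaction, required response speed) -> priority.
# Every entry is an illustrative assumption; a real plant would derive it from its
# process hazard/risk analysis.
PRIORITY_MATRIX = {
    ("minor", "slow"): "low",
    ("minor", "moderate"): "low",
    ("minor", "fast"): "medium",
    ("major", "slow"): "low",
    ("major", "moderate"): "medium",
    ("major", "fast"): "high",
    ("severe", "slow"): "medium",
    ("severe", "moderate"): "high",
    ("severe", "fast"): "critical",
}


def assign_priority(severity, response_speed):
    """Look up the alarm priority for a consequence severity and response speed."""
    try:
        return PRIORITY_MATRIX[(severity, response_speed)]
    except KeyError:
        raise ValueError(
            f"unknown ranking inputs: {severity!r}, {response_speed!r}"
        ) from None
```

Keeping the whole scheme in one table makes it easy to review during a rationalization meeting and to confirm that only a few alarms can reach the top category.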
Incorporating Operator Actions in Procedures
In addition to
ensuring that each alarm has a defined operator response, operator reliability must be enhanced by lowering the workload,
reducing the number of false alarms, and making the alarm
displays obvious and the operator responses simple. Management should be made responsible to ensure that the operators
are well trained and that their performance is tested.
Considering Advanced Logical Processing Techniques
In addition to grouping alarms, various suppression techniques and
artificial intelligence techniques may be piloted and then incorporated into the alarm system.
Updating the Alarm Presentation Techniques
The latest techniques in DCS graphic design and control system design should
be considered to enhance operator effectiveness.
Implementing the Rationalization Project
Since the rationalization may add or remove alarms and change presentation
techniques or configuration parameters, it is necessary to have
an implementation plan that involves the participation of the
appropriate personnel, such as the operators and operating
staff.

Benchmarking the New Alarm System
Once the alarm rationalization is implemented, the final system should be evaluated
to determine the degree of success of the rationalization effort.
For example, the alarm rate should be less than one per
10 minutes (some plants suggest that one per 5 minutes is
manageable). Similarly, the distribution of priority alarms
should be less than five high alarms per shift and less than
two such alarms per hour. Journal alerts should not exceed
10 per hour.
Auditing
The goal of a good audit process is to keep the alarm
system manageable and in control. Online dynamic alarm monitoring systems can assist in this. If the plant is not following
a comprehensive alarm management procedure, the alarm system may go out of control again in the future.
Managing the Change
Including the alarm system within
the scope of the plant Management of Change program will
assist in retarding uncontrolled alarm growth.

References
1. Alarm Systems, a Guide to Design, Management and Procurement, Engineering Equipment and Materials Users Assn. Publication No. 191.
2. Nimmo, I., Abnormal Situation Management, Chemical Engineering Progress, September 1995.
3. Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems, IEC 61508.
4. Functional Safety: Safety Instrumented Systems for the Process Industry Sector, IEC 61511.
5. Better Alarm Handling, The British Health & Safety Executive, 1999.
6. ISA TR91.00.02, Criticality Classification Guideline Enunciator Sequences and Specifications, ANSI/ISA S18.1 1992.
