
GROUPEMENT BERKINE SONATRACH / ANADARKO HASSI BERKINE SUD

Introduction to Reliability Centered Maintenance (RCM)

Maintenance Has Changed


The world of equipment maintenance changed dramatically during the second half of the 20th century and it continues to do so today. Several major influences have been responsible for driving these changes:
1-An enormous increase in the number of physical assets (such as buildings, factories, public and personal transport) that require maintenance.
2-Equipment has become extremely complex - for example, it is now rare to find anything that does not contain a computer or some electronics.
3-Industries (such as manufacturing and mass transport) now put a much greater emphasis on safety and on operating without damaging the environment.
4-We now have a much better understanding of how equipment behaves, from installation to the point at which it fails.
When engineers were forced to respond to this wave of change, it became clear that traditional maintenance methods were no longer adequate - a new approach to equipment maintenance was required. The commercial aviation industry was the first to realise that change was necessary and committed significant resources to developing a solution in the 1960s and 1970s. The results entered the public domain in 1978 under the name "Reliability Centred Maintenance" or "RCM".

What is RCM?
RCM stands for Reliability Centred Maintenance. RCM may be defined as:

A process used to determine the maintenance requirements of any physical asset in its operating context.
But, if maintenance is defined as ensuring that physical assets continue to do what their users want them to do, then the definition of RCM can be expanded to:

A process used to determine what must be done to ensure that any physical asset continues to do what its users want it to do in its operating context.
RCM, i.e. "Reliability-centred Maintenance", is so called because it recognises that maintenance can do no more than ensure that physical assets continue to achieve their built-in capability or "inherent reliability". RCM also recognises that identical assets will have different maintenance requirements in different operating contexts.

Lamine Cherif

Increased Expectations
Looking back to the 1930s, we can divide the years since then into three generations. We can then examine the expectations placed on the maintenance function in each of the three generations as follows:
First Generation: Prior to the Second World War, equipment was relatively simple and over-designed, so it tended to be reasonably reliable. The failures that did occur didn't matter too much and were quick and easy to repair. There was little need for the planned maintenance systems that are commonplace today.

Second Generation: The Second World War quickly led to increased demand for many types of manufactured goods and severely limited the supply of skilled labour to industry. In response, factory equipment became more mechanised and more complex. Failures (and their downtime) began to matter more, so preventive maintenance systems were developed in an attempt to prevent them - usually these were fixed interval overhauls.

Third Generation: The last 30-40 years have seen an enormous increase in demand for manufactured goods and mass transportation. Industry responded with ever more automation and complexity in order to reduce the manpower needed to meet this demand; this in turn greatly increased the costs of ownership and maintenance.

Advances in Maintenance Techniques
The maintenance techniques available to engineers have grown in number and complexity over the three generations:


First Generation: The only real option was to leave equipment running and fix it if it failed.

Second Generation: The pressure for output fuelled demand for higher equipment availability. This in turn led to the development of the first preventive maintenance systems. Large and cumbersome (by today's standards) computers were introduced into the maintenance function in order to manage these systems.

Third Generation: Today there is a vast, and even bewildering, range of highly advanced maintenance techniques available. The problem for maintenance engineers (besides learning what the available techniques are in the first place) is knowing which techniques are appropriate for which equipment and how often to use them. RCM helps with this enormously.

Reliability Centred Maintenance (RCM)
The developers of RCM took the unusual view (at the time) that the objective of equipment maintenance should be to keep the equipment doing whatever its users want it to do, rather than to prevent failures for the sake of preventing failures. With this emphasis on preserving what the user wants, Moubray defines RCM as:

A process used to determine what must be done to ensure that any physical asset continues to do what its users want it to do in its present operating context.
It is, therefore, no surprise that determining the operating context and what the user wants the equipment to do is the starting point for the RCM process, which is applied by asking and answering the following seven questions:

Abstract
Reliability-Centered Maintenance (RCM) is a phrase coined thirty years ago to describe a cost effective way of maintaining complex systems. The RCM method uses the answers to seven very basic questions to help determine the best maintenance tasks to implement in an Equipment Maintenance Plan (EMP). This paper focuses on those seven questions and how they help determine the EMP.

Introduction
On December 29th, 1978 F. Stanley Nowlan and Howard F. Heap published report number A066-579, "Reliability-Centered Maintenance". The report was the culmination of several years of work aimed at determining a new, more cost effective way of maintaining complex systems. They called it Reliability-Centered Maintenance (RCM) because programs developed through RCM "are centered on achieving the inherent safety and reliability capabilities of equipment at a minimum cost". RCM is a time consuming, resource intensive process. Many practitioners have tried to reduce the amount of time and resources required to accomplish RCM projects, with varying degrees of success. The most successful ones have focused on understanding the basic goals of RCM, and on the seven basic questions that need to be asked about each asset. In this paper we will concentrate on understanding each of the seven questions and how the answers to those questions help determine a Reliability-Centered approach to asset management.


The Definition of Reliability
In the book Maintainability, Availability, and Operational Readiness Engineering, Dimitri Kececioglu defines reliability as: "The probability that a system will perform satisfactorily for given period of time under stated conditions." Nowlan and Heap define Inherent Reliability as: "the level of reliability achieved with an effective maintenance program. This level is established by the design of each item and the manufacturing processes that produced it." In The Fault Tree Analysis Guide a system is defined as: "A composite of equipment, skills, and techniques capable of performing or supporting an operational role, or both. A complete system includes all equipment, related facilities, material, software, services, and personnel required for its operation and support to the degree that it can be considered self-sufficient in its intended operational environment."

When we look at these definitions together, it becomes evident that any asset management program must address system development through all phases of a system's life. No maintenance program can improve the reliability of a poorly designed system. Additionally, whatever maintenance program is developed is determined by the design of the system and the goals of the organization.

The Goal of Reliability-Centered Maintenance (RCM)
The primary goal of Reliability-Centered Maintenance (RCM) should therefore be to ensure that the right maintenance activity is performed at the right time with the right people, and that the equipment is operated in a way that maximizes its opportunity to achieve a reliability level that is consistent with the safety, environmental, operational, and profit goals of the organization. This is achieved by addressing the basic causes of system failures and ensuring that there are organizational activities designed to prevent them, predict them, or mitigate the business impact of the functional failures associated with them.


The Seven Questions of RCM
There are seven basic questions used to help practitioners determine the causes of system failures and develop activities targeted to prevent them. The questions are designed to focus on maintaining the required functions of the system.
1. What are the functions of the asset?
2. In what way can the asset fail to fulfill its functions?
3. What causes each functional failure?
4. What happens when each failure occurs?
5. What are the consequences of each failure?
6. What should be done to prevent or predict the failure?
7. What should be done if a suitable proactive task cannot be found?

What Are The Functions of the Asset?
Every facility is uniquely designed to produce some desired output. Whether it is tires, gold, gasoline, or paper, the equipment is put together into systems that will produce the end product. Each facility may have some unique equipment items, but in many cases common types of equipment are just put together in different ways. Within every RCM analysis we have two types of functions. The first is the Main or Primary function; this function statement describes the reason we have acquired the asset and the performance standard we expect it to maintain. The second type is the Support Functions, which list the function of each component or maintainable item that makes up the system. The Support Functions are provided by the bottom level of equipment in most facilities, such as pumps, electric motors, valves, rollers, etc. Each of those maintainable items has one or more easily identifiable functions that enable the system to produce its required output. It is the loss of these functions that leads to variation in the Main or Primary function of the system and in the safety, environmental, operational, and profit output of the facility.


The key thing to remember when describing equipment functions is that we are interested in what the equipment does in relation to its operating context, not what it is capable of doing. For example, a cooling tower pump may be capable of pumping 100 gpm at 275 ft of head, but may only need to pump 75 ± 10 gpm at that same pressure. It is necessary to focus on the required and secondary functions within the system operating context in order to analyze asset functions. Our main function statement for this system would address the functionality within the operating context: "Be able to pump cooling tower water at a rate of 75 ± 10 gpm at 275 ± 15 ft of head while maintaining all quality, health, safety and environmental standards." The rate, the head requirement, and the quality, health, safety and environmental standards are all performance standards for the pump. Functions need to be well defined. Statements such as "pump water from the pond" don't lend themselves well to understanding what a functional failure would look like. A statement such as "pump 1000 ± 100 gpm at 275 ± 15 ft of head from the pond" makes it easy to understand what a functional failure might look like. If we can only pump 800 gpm, then we obviously have an unacceptable variation in output.
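
A function statement with numeric performance standards can be checked mechanically. The sketch below is only an illustration: it reads the pond pump statement as a flow of 1000 gpm with a tolerance of 100 gpm and a head of 275 ft with a tolerance of 15 ft (an assumption about the intended notation), and the class and function names are hypothetical, not part of any RCM standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceStandard:
    """A required value with an allowed tolerance band (hypothetical helper)."""
    name: str
    nominal: float
    tolerance: float
    unit: str

    def is_met(self, measured: float) -> bool:
        # The standard is met when the measurement lies inside the band.
        return abs(measured - self.nominal) <= self.tolerance


def functional_failures(standards, measurements):
    """Return the names of the standards that the measurements violate."""
    return [s.name for s in standards if not s.is_met(measurements[s.name])]


# Pond pump function statement, read as 1000 gpm +/- 100 at 275 ft +/- 15.
pond_pump = [
    PerformanceStandard("flow", nominal=1000, tolerance=100, unit="gpm"),
    PerformanceStandard("head", nominal=275, tolerance=15, unit="ft"),
]

# Pumping only 800 gpm violates the flow standard: a functional failure.
print(functional_failures(pond_pump, {"flow": 800, "head": 275}))
# A pump inside both bands shows no functional failure.
print(functional_failures(pond_pump, {"flow": 950, "head": 280}))
```

A vague statement such as pumping water from the pond gives this kind of check nothing to compare against, which is the practical reason for writing the tolerance into the function statement.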


In What Way Can the Asset Fail to Fulfill its Functions?
Nowlan and Heap said there are two types of failures: functional failures and potential failures. Functional failures are usually found by operators, and potential failures are usually found by maintenance personnel. In many organizations there are great debates about what constitutes a failure. In their original work Nowlan and Heap used a very good definition: a failure is an unsatisfactory condition. Using this definition allows us to grasp the idea that equipment can continue to operate yet be considered failed. Many condition monitoring programs don't achieve their desired output because those running the program do not recognize that a failure has occurred as soon as an unsatisfactory condition is detected. They often try to run the equipment as long as possible, or until they get closer to the F of the P-F curve. At Allied Reliability we call this "managing to the F". More mature programs "manage to the P", meaning that they take action as soon as the unsatisfactory condition is recognized. Remember, the further we go along the P-F curve, the higher the level of business risk we are accepting. It is equally important to recognize that there is significant value in ensuring that equipment is installed and commissioned properly.


The I-P-F curve shown above is the standard P-F curve with an I-P portion added. Point I is defined as the point of installation of the component. The I-P portion of the I-P-F curve is the failure free period: the time during which the operation is defect free. The I-P interval for machines that were installed improperly may be just a few seconds. The I-P interval for machines installed by well trained crafts people using well designed procedures, precision techniques, and precise measuring equipment, and commissioned by operators using well designed operating procedures, may be years.

The graphic above shows what the I-P-F curve for two differently installed identical machines might look like. The machine with the longer I-P interval was installed by well trained crafts personnel using a properly designed procedure and precision measuring devices, and commissioned by operators using a well designed operating procedure. The machine with the shorter I-P interval was installed by inadequately trained personnel using either no procedure or a poorly designed procedure without precision measuring devices and techniques, and commissioned by operators using either no procedure or a poorly designed procedure.

The difference in lengths of the I-P portions of the curve for the two pieces of equipment may represent large sums of money. The dollars represent the additional cost of parts and labor and also the additional production foregone as a result of the extra maintenance work that had to be performed.

Looking at an organization's shift in focus from F toward I is a more effective way to determine its maturity than looking at the age of its maintenance program. Many organizations reactively maintain equipment for a long time. An organization that is constantly focused on Point F, and on staying clear of it, will undoubtedly have a reactive culture. Typical things heard around this organization might be "How long can we run it before it fails?" and "Just how bad is it?".

An organization's first step toward maturity will be to shift its focus from Point F to Point P. The organization then focuses its efforts on understanding how things fail and on its ability to detect these failures early. Typical things overheard in this organization may be something like: "Is this the best way to detect these defects early?" or "I appreciate you letting me know about this problem, even though it's very early."

Further maturation results in a transition from focusing on Point P to focusing on Point I. Overheard in the hallways of this organization are things like "Take the time to do it right, it will pay big dividends for us not too far down the road" and "Let's update the procedures for that job to reflect what we just learned." This organization is trying to prevent failures from occurring in the first place by applying best practices with fits, tolerances, alignment standards, contamination control and well documented procedures. These organizations will see the step change in performance, and they are the ones we label mature, not the organizations that have been doing it poorly but for a longer period of time.

The functional failure statement describes the loss of the equipment's function, not what is wrong with the equipment. A good functional failure statement will most likely not have the noun name of an equipment part in it.


What Causes Each Functional Failure?
At the end of the day we will be building maintenance tasks designed to prevent functional failures from occurring. In order to do this we must understand what causes each functional failure. The cause may be the failure of some equipment part, but it can just as easily be a failure in some human activity. Improper operation and improper maintenance are likely to be the causes of failures. Remember the definition of a system. Everything and everybody in the facility has some impact on system reliability. It is very important to describe these causes or failure modes in a way that allows us to create a living program for improving asset management. Easy to use codes in the Enterprise Asset Management (EAM) system will allow us to capture data about what types of failures are occurring and to react to that data by reengineering the maintenance plan, training plan, or equipment design associated with the equipment. A well designed Failure Reporting, Analysis, and Corrective Action System (FRACAS) is a must for continuously improving system performance. For part failures we may want to use a simple three part code that consists of the part name, part defect, and defect cause.
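
The three part code can be pictured as a small record that is easy to count and report on. The following is a minimal sketch under stated assumptions: the code values (BEARING, WORN, MISALIGNMENT and so on), the hyphenated text format, and the helper names are hypothetical examples, not codes taken from any particular EAM or FRACAS product.

```python
from collections import Counter
from typing import NamedTuple


class FailureCode(NamedTuple):
    """Three part failure code: part name, part defect, defect cause."""
    part: str
    defect: str
    cause: str

    def as_text(self) -> str:
        # Hypothetical display format for entry in a work order field.
        return f"{self.part}-{self.defect}-{self.cause}"


def most_common_causes(records, top=3):
    """Tally defect causes so recurring causes stand out for corrective action."""
    return Counter(r.cause for r in records).most_common(top)


# Hypothetical failure history for illustration only.
records = [
    FailureCode("BEARING", "WORN", "MISALIGNMENT"),
    FailureCode("SEAL", "LEAKING", "IMPROPER_INSTALLATION"),
    FailureCode("BEARING", "OVERHEATED", "MISALIGNMENT"),
]

print(records[0].as_text())
print(most_common_causes(records, top=1))
```

Counting by cause rather than by part is one possible design choice; it points the review toward the maintenance plan, training plan, or design change that would remove the cause.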

What Happens When Each Failure Occurs?
Known as Failure Effects, these statements clearly describe what happens when a failure occurs and what events are required to bring the process back to normal operating conditions. Different things can happen when a failure occurs. Not all failures are created equal. Failure effect statements should cover the following:

1. Events that led up to the failure: Any immediate notable effects of wear or imminent failure.
2. First Sign of Evidence: Is the failure evident to the operating crew as they perform their normal duties? If so, explain how.
3. Secondary Effects: The effects of failure on the next higher indenture level under consideration.
4. Events Required to Bring the Process Back to Normal Operating Conditions.


What Are the Consequences of Each Failure?
What makes failures matter is their impact on the business. Every business has goals for profitability, safety performance, environmental performance, and operational performance. Each failure has a different impact on business performance, and it is important for the RCM team to understand the consequences of each one. Some failures are of little to no consequence, and some can result in the loss of lives, or in extreme cases total failure of the business. Most organizations use some sort of severity matrix to define the consequences of failures. The tables below represent just some of the ways this can be done.

How would your company handle creating severity rankings for failures? In most cases each failure will be ranked according to what is known as criticality. The criticality is the result of combining probability and consequence rankings to yield a single number. The criticality will be biased towards the business's philosophy of safety, environmental, and operational risk. The tasks in the Equipment Maintenance Plan (EMP) generated from the RCM analysis are designed to lower the criticality of the significant failures in the system. Tasks can be rank ordered for implementation by implementing first those that yield the highest reduction in criticality.
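
The ranking idea can be sketched in a few lines. This is an illustration only: combining the two rankings by multiplication, the 1 to 5 scales, and the example failure modes and values are all assumptions, since each business defines its own matrix and weighting.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    probability: int        # ranking before the task, 1 (rare) to 5 (frequent); hypothetical scale
    consequence: int        # ranking before the task, 1 (negligible) to 5 (severe)
    probability_after: int  # ranking with the proposed EMP task in place
    consequence_after: int

    @property
    def criticality(self) -> int:
        # Assumed combination rule: probability ranking times consequence ranking.
        return self.probability * self.consequence

    @property
    def reduction(self) -> int:
        # How much the proposed task lowers the criticality.
        return self.criticality - self.probability_after * self.consequence_after


def implementation_order(modes):
    """Rank tasks so the highest reduction in criticality is implemented first."""
    return sorted(modes, key=lambda m: m.reduction, reverse=True)


# Hypothetical failure modes and rankings.
modes = [
    FailureMode("seal leak", 4, 2, 2, 2),        # criticality 8, reduced by 4
    FailureMode("bearing seizure", 3, 4, 1, 4),  # criticality 12, reduced by 8
    FailureMode("coupling wear", 2, 3, 1, 3),    # criticality 6, reduced by 3
]

for m in implementation_order(modes):
    print(m.name, m.criticality, m.reduction)
```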


What should be done to predict or prevent the Failure?
Each failure mode must be examined to determine what type of maintenance task, if any, should be used to prevent or predict it. Nowlan and Heap recognized four basic types of PM tasks:
1. Scheduled inspection of an item at regular intervals to find any potential failures
2. Scheduled rework of an item at or before some specified age limit
3. Scheduled discard of an item (or one of its parts) at or before some specified life limit
4. Scheduled inspection of a hidden-function item to find any functional failures

When and how these tasks are performed depends on the failure mechanism that is present. In the original report six failure shapes were investigated. The team determined that only 11% of the failure modes present in their study of aircraft part failures would lend themselves to scheduled rework or replacement. In this instance 89% of the failure modes present would require some sort of inspection. The majority of the failure modes, 68%, could actually be made worse by time-based overhaul or replacement. Clearly, some good non-invasive method of inspecting for potential failures would be very beneficial.

Figure 3: Failure Shapes (John Moubray, Nowlan and Heap)

In some cases it is not possible to detect functional failures during normal operations. Those undetectable failures are called hidden failures. Hidden failures are usually associated with some sort of protective system that is designed to minimize the impact or prevent the high consequences associated with a failure of the protected system. Items such as pressure safety valves (relief valves), circuit breakers, high temperature interlocks, and high level interlocks are just a few examples of devices that could have hidden failures. The bad news is that the consequences of failure can be extremely high. The good news is the probability of the catastrophic event is often quite low. It requires that both the protecting and the protected item fail at the same time. In cases where functional failure is not immediately detectable during normal operations, a failure finding task must be done to prevent the high consequences associated with multiple failures. Table 6, reproduced from the Nowlan and Heap report, presents a comparison of the four types of tasks and their applicability. For non-critical failures the order of preference will generally be inspection, rework, and lastly discard or replacement of the item.
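
For hidden failures, the statement that both the protecting and the protected item must fail at the same time can be made concrete with a small calculation. The sketch below assumes the two failures are independent, which is a simplification introduced here, and the probabilities used are invented for illustration.

```python
def multiple_failure_probability(p_protected_fails: float,
                                 p_protective_failed: float) -> float:
    """Probability that the protected item fails while the protective
    device is already in a (hidden) failed state, assuming independence."""
    for p in (p_protected_fails, p_protective_failed):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_protected_fails * p_protective_failed


# Hypothetical values: a 10% yearly chance of an overpressure event and a
# relief valve that is in a failed state 2% of the time.
p = multiple_failure_probability(0.1, 0.02)
print(f"{p:.4f}")

# A failure finding task that halves the time the valve spends failed
# halves the probability of the multiple failure.
p_with_task = multiple_failure_probability(0.1, 0.01)
print(f"{p_with_task:.4f}")
```

The product is small, but the failure finding task is the only term the maintenance program controls directly, which is why it is required when the failure cannot be seen in normal operation.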

When Nowlan and Heap published their report in 1978, condition monitoring methods such as vibration analysis (VA), ultrasonic inspection (UE), ultra-violet inspection (UV), and other non-invasive technology based inspection methods were in their infancy and were very expensive to deploy. Now, nearly thirty years later, technology based inspection methods are relatively inexpensive and easy to deploy. These methods are really nothing more than inspection methods that can be used on a periodic basis to determine the condition of equipment. We can be almost certain that Nowlan and Heap would have recommended extensive use of these technologies had they been readily available. In any case, the task chosen must either lower safety, environmental, or operational risk to an acceptable level, or, for non-critical failures, be economically effective. Risk is always the top driver in the decision making process. We may have to spend more money to ensure that we meet our risk goals.
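
The selection rule described above can be sketched as a simple filter over candidate tasks. This is a hedged illustration, not the Nowlan and Heap decision diagram: the risk scale, the cost fields, and the function name are hypothetical, and applying the inspection, rework, discard order to critical failures as well is an assumption (the text states that order only for non-critical failures).

```python
from dataclasses import dataclass
from typing import List, Optional

# Order of preference stated in the text for non-critical failures.
PREFERENCE = ["inspection", "rework", "discard"]


@dataclass
class CandidateTask:
    kind: str                # "inspection", "rework" or "discard"
    residual_risk: float     # risk left with the task in place (hypothetical scale)
    annual_cost: float       # cost of doing the task
    annual_cost_avoided: float  # failure cost the task is expected to avoid


def select_task(candidates: List[CandidateTask], critical: bool,
                acceptable_risk: float = 0.0) -> Optional[CandidateTask]:
    """Return the first acceptable task in order of preference, else None."""
    for kind in PREFERENCE:
        for task in candidates:
            if task.kind != kind:
                continue
            if critical:
                # Critical failures: the task must bring risk to an acceptable level.
                acceptable = task.residual_risk <= acceptable_risk
            else:
                # Non-critical failures: the task must be economically effective.
                acceptable = task.annual_cost < task.annual_cost_avoided
            if acceptable:
                return task
    return None  # no suitable proactive task found


# Hypothetical candidates for one failure mode.
candidates = [
    CandidateTask("rework", residual_risk=2.0, annual_cost=5000, annual_cost_avoided=20000),
    CandidateTask("inspection", residual_risk=3.0, annual_cost=30000, annual_cost_avoided=20000),
    CandidateTask("discard", residual_risk=1.0, annual_cost=8000, annual_cost_avoided=20000),
]

print(select_task(candidates, critical=False).kind)
print(select_task(candidates, critical=True, acceptable_risk=1.5).kind)
print(select_task(candidates, critical=True, acceptable_risk=0.5))
```

When the function returns nothing, no suitable proactive task exists, which is the situation addressed by the seventh RCM question.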


What should be done if a Suitable Proactive Task cannot be found?
There may be a couple of reasons why we wouldn't be able to find a suitable proactive task. We are either unable to find a task that will lower business risk to an acceptable level, or we are unable to find a task that is economically feasible. Each case requires a different response. In the first case, the system will have to be redesigned so that an acceptable level of risk is achieved. In the second case, we can choose a run to failure approach for the failure mode. It is important to remember that when a run to failure strategy is employed we should then put in place consequence reduction tasks to mitigate the impact of the failure. The RCM team must ensure that written procedures are in place to deal with the failure mode, and that proper spares levels are maintained.

Conclusion
Answering the seven questions of RCM properly will yield a cost effective EMP that achieves the business goals for safety, environmental, and operational risk. Answering the questions properly requires a cross-functional team of maintenance, operations, and engineering personnel who have an understanding of how the asset

Applying RCM
It is not possible for one person to answer all the questions that RCM asks. The solution is to bring together a group of people (the RCM analysis group) who have technical knowledge about the equipment, knowledge of its operation (within its current operating context) and a basic understanding of RCM itself (through suitable training). A sound understanding of the RCM process is also required in order to guide the RCM analysis group through the RCM process and achieve consensus in answering the questions. This role is fulfilled by an RCM facilitator.


RCM analysis group members are drawn from equipment maintainers, operators, possibly manufacturers/suppliers and occasionally specialists. The most important factor is that they know and understand the equipment being analysed using the RCM process. The aim is to reduce the size of the black hole in knowledge (i.e. the black area in the box representing all there is to know about the equipment in the diagram). Inevitably, there will be some gaps in the group's combined knowledge, but at the end of the RCM analysis each group member will usually have acquired useful knowledge about the equipment from other members of the group. Under the guidance of the RCM facilitator, the group follows the RCM process. The outputs of the analysis are:
1-a list of maintenance tasks to be performed by maintenance personnel at specified intervals.
2-a list of tasks to be performed by operating personnel at specified intervals.
3-a list of redesigns to be considered for implementation.
When the RCM analysis is complete, the output should be audited by whoever has overall responsibility for the equipment or system. This is so they can satisfy themselves that the analysis has been carried out correctly and that it is both sensible and defensible. The final step is to implement the results of the RCM analysis when the audit is complete.

Teamwork
The RCM team is cross-functional, highly proactive and self-motivated. It is made up of Maintenance personnel, Operations personnel, and Specialists (invited for special requirements). These people will have to be highly familiar with the subjects that they are examining. The team will be directed by a facilitator, who may or may not come from one of these departments. The size of the team should be adequate (typically 4 or 5 people) but not too large - "too many cooks spoil the soup."


Facilitator's Role
The facilitator is the team leader. He will have to facilitate the implementation of any philosophies and techniques to be used, making the most of the different skills of the personnel who work in the teams. Facilitators will have to be competent in the following areas:
1. Techniques/tools to use
2. Analysis
3. Managing meetings
4. Time keeping
5. Administration, logistics
6. Communications

The typical functions of the facilitator include:
- Organizing and directing all the activities involved in the project.
- Planning, scheduling and leading meetings.
- Ensuring that every scheduled meeting happens. He must, therefore, identify alternatives to resolve any problems with any team member.
- Selecting the level, defining the borders and the work scope for the analysis, as well as considering the impact, the duration and the resources required for the project.
- Ensuring that all team members understand the process being followed.
- Ensuring that the process is correctly applied in the right order, and avoiding short cuts that affect the process integrity.
- Ensuring that the project is completed according to the plan, within reason.
- Co-ordinating all support material required by the team (drawings, diagrams, etc.), as well as keeping documentation and sharing it with the team.
- Acting as the focal point of communications of the team, centralizing the information related to the work.
- Keeping management aware of the plan and team progress, generating high quality reports.
- Acting as the technical expert who clarifies any doubts about the process or methodology being followed that may be expressed by the team.
- Documenting the data generated if it is needed.
- Researching deeply on the subject of the project and being prepared not to accept incomplete information. In many cases he will verify the information generated in the meetings, so he must have enough judgment to know if specialists are needed.
- Ensuring a consensus style of decision making.
- Managing any problems that may arise: interpersonal conflict, interruptions, etc.


What type of person makes a good Facilitator?
Facilitators are key people in successful projects. Better results are possible when facilitators are involved on a full time rather than part time basis. A good facilitator has broad knowledge of the assets. He must have reasonable knowledge of the process, but need not necessarily be an expert.

About the Meetings:
- The team must have common objectives, methodology, and an action plan/program.
- Special care must be taken with specialists invited to the meetings in order to provide them with enough clear information before and after the meeting.
- A work session should not normally be longer than 90 minutes. If longer sessions are planned, 15 minute breaks should be held during them.
- Remember that the meetings are social events and should be pleasant.
- If it is not possible for all the team to attend, specialist sessions could be held, making sure that operations representatives participate.
- The facilitator should prepare an agenda, including the objectives to be achieved in the meeting. At the end of the meeting, achievement of those objectives should be checked.
- A meeting should never end without fixing the date and time of the next meeting.
- The meeting should never seek to allocate blame. Avoid making disparaging comments about team members' opinions.
- The team should solve its internal problems without external interference.
- The facilitator will have to encourage the participation of all team members in an enthusiastic way.
- The meeting time should be used in an intelligent and effective way.
- The key information should be validated before taking further steps. Work based on facts and not on suppositions.
- Work on solutions for problems instead of problems for solutions.
- Assigned activities which are not completed cause serious problems. The facilitator should find ways to make sure that the responsible team member does the required work.
- Defer complex problems until enough information is known about them.
- Communication is the vital element in this kind of big project. The facilitator could channel communications. Communication should cover the whole organization.
- The facilitator should be a good salesman of the project and its results, so that resources are allocated for it.
- Notice boards with information about the project and results achieved are an invaluable help. A graph of the results obtained (e.g. $ vs. time) could be useful.


What RCM Achieves
RCM has been applied in a wide range of industries in most countries throughout the world. Correctly applied, RCM produces a maintenance schedule that is optimised for the equipment in its operating context; the aim is to achieve inherent levels of equipment reliability and availability. The RCM derived maintenance and the process itself bring about the following benefits:

Safety - Greater safety and environmental protection due to:
- Improved maintenance of existing protective devices.
- The systematic review of safety implications of every failure.
- The application of clear strategies for preventing failure modes which can affect safety or impinge upon environmental regulations.
- Fewer failures caused by unnecessary maintenance.

Performance - Improved operating performance due to:
- An emphasis on the maintenance requirements of critical equipment elements.
- The extension or elimination of overhaul intervals.
- Shorter and more focused maintenance tasks resulting in less extensive and costly shutdowns.
- Fewer "burn in" problems after maintenance (by eliminating unnecessary maintenance actions).
- The identification of unreliable components.

Cost Effectiveness - Greater cost effectiveness due to:
- Less unnecessary routine maintenance.
- The prevention or elimination of expensive failures.
- Clearer operating policies.
- Clearer guidelines for acquiring new maintenance technology.

Quality - Improved quality due to:
- A better understanding of equipment capacity and capability.
- The clarification of equipment set-up specification and requirements.
- The confirmation or redefinition of equipment operating procedures.
- A clearer definition of maintenance tasks and objectives.

Life-Cycle Cost - Reduced life-cycle costs by optimizing the maintenance workloads and providing a clearer view of spares and staffing requirements.

Equipment Life - Longer useful life of expensive items due to an increased use of "on condition" maintenance techniques.

Maintenance Data - A comprehensive maintenance data base which:
- Provides a better understanding of the equipment in its operating context.
- Leads to more accurate drawings and manuals.
- Allows maintenance schedules to be more adaptable to changing circumstances.
- Documents the knowledge held by individuals on each piece of equipment.

Motivation - Greater motivation of individuals, particularly those involved in the review process. This gives improved understanding of the equipment in its operating context and wider "ownership" of the resulting maintenance schedules.

Teamwork - Better teamwork brought about by the highly-structured group approach to analysing and addressing maintenance problems.

Implementation of RCM
Armed with the maintenance strategic plan, the organization is set to do battle against the evils of breakdown. All available maintenance options for plant equipment and machinery should be known before deciding which ones are the most appropriate. One of the most notable techniques is reliability centered maintenance (RCM).

INTRODUCTION
To be competitive, industry must continually improve. Companies are embracing, like never before, efficiency methods such as just-in-time and total quality management. These structured, step-by-step systems can both identify and help implement ways to enhance the business. They are tools to build on and make better use of employees' operating abilities and technological know-how. Maintenance, too, is being changed by the competitive pressures in the marketplace. It also has much to learn from the new techniques that are transforming business practice. Those who use them properly are finding that better maintenance can mean bigger profits.

There are several techniques that apply to maintenance performance. Their common goal is to continually improve that performance by:
Dealing with each type of failure most appropriately, in the most cost-effective way.
Enhancing productivity with a more proactive and planned approach.
Ensuring the active support and cooperation of people from the maintenance, materials, operations, technical, and administrative functions.

One of the most notable techniques is reliability centered maintenance (RCM). By providing a strategic framework and using the knowledge and expertise of people in the organization, it can accomplish two important goals. First, it identifies the maintenance requirements of a physical asset that meet the operational or production goals. Then it optimizes performance using the results.

RCM works in a progression of related steps. First, it examines the functions and associated productivity goals of the asset. Second, it assesses the ways those goals can fall short and the effects of failing. Finally, RCM's detective work deduces the most feasible and effective ways to eliminate or reduce the consequences of failure.

RCM was launched in the U.S. commercial airline industry during the early 1960s. It developed in response to rapidly increasing maintenance costs, poor availability, and concern over the effectiveness of traditional time-based preventive maintenance. The problems were obvious, and so was the need for more reliable maintenance programs. Studies were conducted of existing engineering techniques and preventive maintenance practices. The results brought to light surprising facts about the traditional, time-based preventive maintenance approach:

Scheduled overhaul has little effect on the overall reliability of a complex item unless failure is frequent.

There are many items for which there is no effective form of scheduled maintenance.

The results of these initial studies have extended far beyond the airlines. They were used to develop the basis of a logical preventive maintenance program that can apply throughout industry. This approach has since become known as reliability centered maintenance.

RCM was first applied on a large scale to develop the maintenance program of the Boeing 747. Later, it was used for the L-1011 and DC-10. The results have been impressive. These aircraft achieved a significant reduction in scheduled, time-based maintenance, with no decrease in reliability. For example, only 66,000 labor hours of structural inspections were required before the first heavy inspection at 20,000 flying hours on the Boeing 747, compared with 4,000,000 labor hours over the same period on the smaller DC-8. On the DC-10, only seven items were subject to scheduled overhaul, compared with 339 items on the DC-8.

RCM (or MSG-3, as it is known in the aerospace industry) is now used to develop the maintenance programs for all major types of aircraft. Other applications include the navy, utilities, the offshore oil industry, and manufacturing processes. RCM is particularly suitable where large, complex equipment is used and where equipment failures pose significant economic, safety, or environmental risks.

CREATING VALUES FOR CUSTOMERS


As desirable as it may be to have a comprehensive, logically based maintenance program, it is of little use unless it helps maintenance, and the company as a whole, create value for its customers and shareholders. Typical benefits of RCM are outlined in figure 7-1. The advantages of instituting an RCM program depend on the nature of the business, the risks posed by equipment failures, and the state of the existing maintenance program.

RCM is based on the philosophy that maintenance is a key function of the company. It is crucial if the expected functional performance and productivity goals are to be achieved. Further, maintenance requirements are best developed by multidisciplinary teams from production, materials, maintenance, and technical departments, and should be founded on a logical, structured, and engineered approach. Some of the key precepts of RCM are that equipment redundancy should be eliminated, where appropriate; condition-based or predictive maintenance tactics are favored over traditional time-based methods; and run-to-failure is acceptable, where warranted.

To develop an RCM-based maintenance program for physical resources, we need to answer the following questions:
What assets are owned and operated by the company, and to which of these should RCM be applied?
What are the functions and performance expectations of a selected asset?
In what ways can it fail to perform these functions?
What causes it to fail?
What are the consequences of each failure?
What should be done to prevent each failure, and what steps should be taken if effective preventive measures cannot be found?

These questions are answered through a logical, seven-step review process, illustrated in figure 7-2.

The process begins with an understanding of the business requirements and objectives. This ensures that the maintenance program meets the productivity goals for the physical resource under review. The maintenance agenda is then defined. Once that happens, an ongoing monitoring and review process is established to make the most of the program. The major steps in the RCM review process are described below.

Step 1: Select Plant Areas that Matter

Businesses typically have thousands of pieces of machinery and equipment. These can range from pumps and valves to process systems, rolling mills, fleets of load-haul-dump (LHD) trucks, ships, or buildings. They may be fixed or mobile. Each asset will benefit from RCM to a different degree. Therefore, the first step in the RCM process is to identify and prioritize the physical resources owned or operated by the enterprise. Only then can they be reviewed properly.

This initial stage involves:

Establishing a structured, comprehensive list of all physical assets owned or used by the organization that require some form of maintenance or engineering attention. This list is referred to as the plant register, plant inventory, or equipment family tree.

Assessing the impact of the physical resources on the key business performance areas. These may include availability, process capability, quality, cost, and safety or environmental risks. This ensures that the review focuses on the areas or equipment in the plant that benefit most from RCM. Although several complementary methods can be used in the assessment, the precise method is not critical. Of more importance is selecting a method, documenting it and its results, and then proceeding with the review. Simplicity is the key. Usually, the highest and lowest priority systems are obvious. It is not worth the added effort to figure out the exact order of importance of those in between.

Establishing the boundaries between equipment systems. Boundaries include everything necessary for the physical resource to do its job. This helps define the scope of the review and organizes it into manageable pieces.

One company selected its environmental control and monitoring equipment, including dust collectors and effluent samplers, because it concluded that this category represented the greatest long-term risk.
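The impact assessment in Step 1 can be sketched as a simple scoring exercise. The following is a minimal illustration only: the 0 to 3 scoring scale, the asset names, and the scores are assumptions made for the example, not part of the source method, which leaves the precise method open.

```python
from dataclasses import dataclass

# Key business performance areas named in Step 1.
AREAS = ("availability", "process_capability", "quality",
         "cost", "safety_environment")


@dataclass
class Asset:
    name: str
    impact: dict  # area -> hypothetical score from 0 (none) to 3 (severe)


def priority_score(asset: Asset) -> int:
    """Sum the impact scores over all areas; unscored areas count as 0."""
    return sum(asset.impact.get(area, 0) for area in AREAS)


def rank_assets(assets):
    """Return the assets ordered from highest to lowest total impact."""
    return sorted(assets, key=priority_score, reverse=True)


# Hypothetical plant register entries, for illustration only.
register = [
    Asset("cooling water pump", {"availability": 2, "cost": 1}),
    Asset("dust collector", {"safety_environment": 3, "quality": 1, "cost": 2}),
    Asset("office air conditioner", {"cost": 1}),
]

ranked = [asset.name for asset in rank_assets(register)]
print(ranked)
```

In keeping with the advice above, the point of such a ranking is to separate the obvious high and low priorities, not to settle the exact order of the systems in between.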
Step 2: Determine Key Functions and Productivity Goals

Once the physical resource is selected, its functions and the associated productivity goals are determined. This is a key step. The purpose of a maintenance tactic is to make sure the equipment is working properly and producing on schedule. Every physical asset has a function, usually several. These can be categorized as primary, secondary, or protective.

Primary - This is why the equipment exists at all. It is usually evident from its name, as well as from the interfaces it supports between physical assets. An example of a conveyor's primary function is to transfer rock from hopper to crusher at a minimum rate of 10 tons/hour.

Secondary - In addition to its primary purpose, a physical asset usually has a number of secondary functions. These are sometimes less obvious, but the consequences of failure may be no less severe. Examples of secondary functions include maintaining a pressure boundary, relaying local or control room indications, supplying structural support, or providing isolation.

Protective - As processes and equipment increase in complexity, the ways in which they can fail increase dramatically, as do the consequences of failure. To mitigate these consequences, protective devices are used. The job of these devices must be defined before an adequate maintenance program can be developed. Typical protective functions include warning operators of abnormal conditions, automatically shutting down a piece of equipment, and taking over a function that has failed.

In addition to defining the functions, this process highlights the expected level of performance, or the productivity goals. This can include capacity, reliability, availability, product quality, and safety and environmental standards. While this may sound relatively straightforward, technical and maintenance performance are typically judged differently. Thus, performance can be defined as:
Built-in or inherent: what the asset can do.
Required: what we want it to do.
Actual: what it is doing.

In many instances, the equipment can deliver what is required of it with proper maintenance. Situations can arise, though, where what is required exceeds what the physical resource is capable of. In these cases, maintenance cannot meet the performance expectations. If there is a big gap between the performance needed and the built-in ability or the performance currently being achieved, the asset needs to be modified. Either it should be replaced with a more capable item, or operating changes must be made to reduce expectations.
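The relationship between inherent, required, and actual performance can be sketched as a small check. This is a simplified illustration that assumes a single "higher is better" measure and assumes that maintenance can restore performance up to, but not beyond, the inherent level. The 10 tons/hour requirement is the conveyor example from the text; the other figures are hypothetical.

```python
def assess_performance(inherent: float, required: float, actual: float) -> str:
    """Compare the three performance levels for a 'higher is better' measure."""
    if required > inherent:
        # Maintenance cannot deliver more than the built-in capability.
        return "modify or replace the asset, or reduce expectations"
    if actual < required:
        # The asset is capable of the requirement but is not achieving it.
        return "maintenance can close the gap"
    return "performance is acceptable"


# Conveyor required to move 10 tons/hour (from the text).
# The inherent and actual rates below are hypothetical.
print(assess_performance(inherent=12.0, required=10.0, actual=9.0))
print(assess_performance(inherent=8.0, required=10.0, actual=8.0))
```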

Again, the purpose of the RCM review is to define the maintenance requirements for physical assets that are necessary to meet the business objectives. The level of performance, then, reflects what is required or wanted from the asset.

Step 3: Determine Plausible Functional Failures

The third step is to address all plausible ways in which equipment can perform below expectations. Partial and total shortcomings are considered, as well as inadvertent functions. Usually, we tend to think of an item failing when it stops working - a go/no-go situation. For example, the car does not start, or a compressor does not provide high-pressure air. While some equipment is like this, notably electronic machinery, in other cases what constitutes a failure is less clear. Your car may start and run, but its acceleration is poor and it uses too much gas. The compressor may run, but does it provide enough air pressure or volume?

Obviously, an idea of the boundary between acceptable and unacceptable performance is needed to determine when failure occurs. This boundary is the expected level of performance. A functional failure is defined as the inability of the physical asset to deliver its expected level of performance. This definition suggests that a function could fail in numerous ways, each with its own (usually different) modes and effects. There may be:
A total loss of function, where the item stops working altogether. For example, a pumping system fails to provide any flow.
A partial loss of function, where the item works but fails to achieve the expected level of performance. For example, a pumping system fails to provide an adequate flow.
Multiple levels of performance expected from an individual function.

The expected level of performance defines not only what is considered a failure, but also the amount of maintenance needed to avoid that failure.

This is illustrated in figure 7-3.
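The distinction between total and partial loss of function in Step 3 can be sketched for the pumping system example. The flow values and units are hypothetical, and a real review would set the required level jointly with operations, technical, and maintenance staff.

```python
def classify_flow(actual_flow: float, required_flow: float) -> str:
    """Classify a pumping system against its expected level of performance."""
    if actual_flow <= 0:
        return "total loss of function"    # no flow at all
    if actual_flow < required_flow:
        return "partial loss of function"  # runs, but flow is inadequate
    return "function delivered"


# Hypothetical required flow of 800 (in any consistent unit).
for flow in (0.0, 650.0, 820.0):
    print(flow, "->", classify_flow(flow, required_flow=800.0))
```

Raising the required flow in this sketch turns more operating states into functional failures, which is why the expected level of performance also drives the amount of maintenance needed.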

Setting the expected level of performance frequently creates conflict between departments. It is essential, then, that all concerned (the technical, operations, and maintenance departments) play a part in drafting the performance levels. Their joint seal of approval is essential before proceeding.

Step 4: Determine Likely Failure Modes and Their Effects

The next task is to set forth the likely failure modes and their causes and effects. The failure mode describes what can happen or has happened, as opposed to what caused it to happen. For example, one failure mode of a pump could be a seized bearing that halts any flow. Failure modes are spelled out because the process of anticipating, preventing, detecting, and correcting failures is applied to any number of different failure modes. While many potential failure modes can be listed, only those that are fairly likely need be considered. These include:
Failure modes that have occurred on the same or similar equipment. This is determined through a review of maintenance work order history and experience.

Failure modes that are already the subject of preventive maintenance tasks.
Other failure modes that have not happened but are considered possible because of experience or vendor/manufacturer recommendations. The extent to which these less-than-likely failure modes are included will depend on their consequences. The greater the potential setback, the more these "what if" scenarios count.

Possible causes of the particular failure are also identified, since they have a direct bearing on the maintenance tactics used. In the example of the seized bearing, the cause of the failure could be a lack of lubrication. Other typical causes are wear, erosion, corrosion, fatigue, dirt, incorrect operation, or faulty assembly.

What actually happens when each failure mode occurs is identified next. The effects are described fully, as if nothing were done to prevent the failure. This way, the consequences can be judged fairly. To do so, the following are described:
The evidence of failure to the operating crew under normal conditions.
The hazards the failure may pose to worker safety, public safety, process stability, or the environment.
The effect on production output and maintenance.
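The information gathered in Step 4 can be pictured as a record per failure mode. The sketch below shows one possible data layout, not a prescribed worksheet format: the class and field names are assumptions, and the effect descriptions are hypothetical. The seized bearing and lack of lubrication come from the example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class FailureEffect:
    evidence: str           # what the operating crew sees under normal conditions
    hazards: str            # safety, process stability, or environmental hazards
    production_effect: str  # effect on production output and maintenance


@dataclass
class FailureMode:
    description: str
    cause: str
    effect: FailureEffect


@dataclass
class FunctionalFailure:
    description: str
    modes: list = field(default_factory=list)


pump_failure = FunctionalFailure("Pumping system fails to provide any flow")
pump_failure.modes.append(
    FailureMode(
        description="Bearing seized",
        cause="Lack of lubrication",
        effect=FailureEffect(
            evidence="Low-flow alarm in control room (hypothetical)",
            hazards="None identified (hypothetical)",
            production_effect="Line stopped until bearing is replaced (hypothetical)",
        ),
    )
)

for mode in pump_failure.modes:
    print(f"{pump_failure.description}: {mode.description} ({mode.cause})")
```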
Step 5: Select Feasible and Effective Maintenance Tactics

Failures of the physical resources owned or used by a company can vary enormously. Their results may be potentially catastrophic or trivial. How great the impact is influences the way the company views the failure and the steps deemed necessary to prevent it, such as adding a backup system. In some cases, it may not be worth the effort and expense.

To successfully manage a failure, the preventive maintenance tactic must be:
Technically feasible - dealing effectively with the technical characteristics of the failure.
Cost effective - reducing or avoiding pitfalls in line with dollar and operating constraints.

Tactical options are discussed more fully in chapter four. Whether a particular approach is technically appropriate depends not only on the kind of tactic but also on the nature of the failure. Technically feasible tactics for condition-based and time-based maintenance satisfy the following criteria.

Condition based:
It is possible to detect the physical resource's degraded condition or performance.
The failure is predictable as it progresses from first instance to complete breakdown.
It is practical to monitor the physical resource at intervals shorter than the time it takes for the problem to develop completely.
The time between incipient and functional failure is long enough to be of some use; that is, action can be taken to avoid the failure.

Time based:
There is an identifiable point at which the physical asset shows a rapid increase in failure rate.
Most assets survive to that age. For failures with significant safety or environmental risks, there should be no failures before this point.
The task restores the asset's condition. (This might mean partial restoration if the asset is overhauled, for example, or complete restoration if the item is discarded and replaced.)

To be cost effective, preventive maintenance must reduce the likelihood and/or consequences of failure to acceptable levels, be readily implemented, and stay within budget. Within these limits, the maintenance tactic is considered cost effective if:
For hidden failures, it cuts the chance of a multiple failure to an acceptable level.
For failures with safety and environmental effects, the risks are kept to a comfortable minimum.
For failures with production setbacks, the cost of the tactic is, over time, less than the production losses. Also, it must be cheaper than repairing the problem it is meant to prevent.
For failures with maintenance consequences, the cost of prevention measures is, over time, less than the cost of repairing the failure that would otherwise result.
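The cost-effectiveness test for failures with production consequences can be sketched as a yearly comparison. This is a rough illustration under stated assumptions: costs are compared per year, the tactic is assumed to prevent every failure it targets, and all figures in the example are hypothetical.

```python
def is_cost_effective(task_cost: float, tasks_per_year: float,
                      failures_per_year: float,
                      production_loss_per_failure: float,
                      repair_cost_per_failure: float) -> bool:
    """Check the two conditions given for failures with production setbacks.

    Assumes the tactic prevents all of the failures it targets, which is a
    simplification made for this sketch.
    """
    annual_tactic_cost = task_cost * tasks_per_year
    annual_production_loss = failures_per_year * production_loss_per_failure
    annual_repair_cost = failures_per_year * repair_cost_per_failure
    # The tactic must cost less than the production losses it avoids and
    # less than repairing the failure it is meant to prevent.
    return (annual_tactic_cost < annual_production_loss
            and annual_tactic_cost < annual_repair_cost)


# Hypothetical monthly inspection at 200 per task against a failure that
# occurs once every two years.
print(is_cost_effective(200, 12, 0.5, 80_000, 6_000))  # 2,400 vs 40,000 and 3,000
print(is_cost_effective(500, 12, 0.5, 80_000, 6_000))  # 6,000 vs 40,000 and 3,000
```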
If maintenance measures are neither technically feasible nor cost effective, then, depending on the risk of failure, one of the following default actions is selected:
For hidden failures, a failure-finding tactic to reduce the likelihood of multiple failures. An example is testing the readiness of standby equipment.
For failures with unacceptable safety or environmental risks, redesign or modification.
For failures with production or maintenance consequences, run-to-failure or corrective maintenance.

A logic tree diagram is used to integrate the consequences of failure with technically feasible and cost-effective maintenance tactics.

A simplified version of this diagram is illustrated in figure 7-4.
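Since the figure itself is not reproduced here, the selection logic described in Step 5 can be approximated in code. This is a sketch of the text's rules, not a reconstruction of figure 7-4: it tries the preventive tactics in the order of preference given in Step 5 and otherwise falls back on the default action for the consequence category. The function name, argument names, and category labels are assumptions.

```python
# Default actions from Step 5 when no preventive tactic is both
# technically feasible and cost effective.
DEFAULT_ACTIONS = {
    "hidden": "failure finding",
    "safety_environmental": "redesign or modification",
    "production": "run to failure",
    "maintenance": "run to failure",
}


def select_tactic(consequence: str, cbm_ok: bool,
                  restoration_ok: bool, discard_ok: bool) -> str:
    """Pick a tactic; each *_ok flag means 'feasible and cost effective'."""
    if consequence not in DEFAULT_ACTIONS:
        raise ValueError(f"unknown consequence category: {consequence}")
    if cbm_ok:
        return "condition-based maintenance"
    if restoration_ok:
        return "time-based repair/restoration"
    if discard_ok:
        return "time-based discard"
    return DEFAULT_ACTIONS[consequence]


print(select_tactic("production", cbm_ok=True, restoration_ok=True, discard_ok=False))
print(select_tactic("hidden", cbm_ok=False, restoration_ok=False, discard_ok=False))
```

The sketch returns a single tactic; it does not cover the combinations of tactics that Step 5 allows for safety and environmental risks.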

In general, tactics to prevent failures are preferred in this order:

Condition-based maintenance (CBM) tactics - These generally have the least impact on production, help focus corrective actions, and get the most out of the economic life of the equipment.

Time-based repair/restoration tactics - These may work for failures that present a significant safety, environmental, or economic risk to the organization. However, this approach is less preferable than CBM for a number of reasons: it usually affects production or operations, the age limit can mean premature removals, and the additional shop work required increases the cost of maintenance.

Time-based discard tactics - These are generally the least cost-effective preventive maintenance measures. They tend to be used, though, where repair or restoration is impossible or ineffective, such as for components like filter elements, O-rings, and, in some cases, integrated circuit boards.

Combinations - In some cases a combination of tactics may be necessary to reduce the safety and environmental risks to an acceptable level. In general, this involves a condition-based maintenance method along with some form of time-based maintenance. An example would be the in-place inspection of an aircraft engine by borescope every 50 flying hours, combined with time-based inspection and overhaul in a shop every 200 hours.

Once the maintenance tactics have been chosen, the next decision is how often they are performed initially.

For condition-based tactics, the frequency is linked to the technical characteristics of the failure and the specific monitoring technique. Depending on these factors, the interval can vary from weeks to months. Generally, the more sophisticated (and expensive) the technique, the less frequently it is applied.

Time-based tactics are applied according to the expected useful life of the physical asset. That is determined by the age at which wear-out begins, when the chance of failure greatly increases.

How often a failure-finding tactic is needed depends on how available the system needs to be and how likely a breakdown in the system is.

Figure 7-5 gives an example of how the first five steps might look on a worksheet.

Step 6: Implement Selected Tactics

Putting the results of the RCM review into practice often requires as much effort as the review itself, and more coordination. The recommendations are compared with the tasks already included in the maintenance program. The question is whether to add new tasks, change the existing ones (scope or frequency), and/or delete any.

Next on the agenda are the actions needed to put the maintenance tactics into effect. These may include:
Adjusting maintenance schedules.
Developing or revising task instructions.

Specifying spare parts and adjusting inventory levels.
Acquiring diagnostic or test equipment.
Revising operation and maintenance procedures.
Specifying the need for repair or restoration procedures.
Most significantly, conducting training in the new procedures.

To ensure all this is coordinated smoothly, an integrated plan is developed. This plan sets out the actions required and assigns the responsibilities and target dates for their completion.

Step 7: Optimize Tactics and Program

Once the RCM review is complete and the maintenance work identified, periodic adjustments are made. The process is responsive to changes in plant design, operating conditions, maintenance history, and discovered condition. In particular, the frequency of the tactics is adjusted to reflect the operating and maintenance history of the physical resource. The objectives of this ongoing activity are to reduce equipment failures, improve preventive maintenance effectiveness and the use of resources, identify the need to expand the review, and react to changing industry or economic conditions.

To achieve these goals, two complementary activities are integrated into a living program:

The periodic re-assessment and revision of the RCM review results. The frequency of the re-assessment depends to some degree on the equipment age, but it is usually conducted every two to five years.

A continuous process of monitoring, feedback, and adaptation. This process analyses and assesses the data produced by production and maintenance activities for failure rates, causes, and trends. It includes variances between actual and target performance. Corrective actions can then be taken. These may include changing the task type, scope, or frequency; revising procedures; providing additional training; or changing the design.

Continually reviewing and improving the initial maintenance program is akin to a quality management process that continuously improves product quality.

RCM ELEMENTS: PHILOSOPHY TO PRACTICE


Some of the key success factors in previous RCM programs are listed in Figure 7-6.

To achieve such success and manage change effectively, the RCM program must be phased in and constantly improved. The continuous improvement strategy is long-term, involving people from production, materials, maintenance, and technical functions in the RCM review process. The program uses a part-time review team under the direction of a full-time facilitator. As a result, it can take a few years to review the critical physical resources in a company. This approach complements other improvement initiatives, such as just-in-time (JIT) and total quality management (TQM). It provides:
A high degree of support for RCM from people in production, materials, maintenance, and technical departments, ensuring acceptance of change.
Many part-time review teams under the direction of a full-time facilitator to review important plant areas. Thus, it is easier to obtain the right people to conduct the review.
Flexibility and cost-effectiveness, minimizing the need for full-time staff.

The basic building block of this strategy is the cross-functional RCM review team of company employees. The RCM review process addresses six questions about a physical asset. To answer these questions, input is required not only from maintenance but also from the production, materials, and technical departments.

As a result, the RCM review is best conducted by small teams (five to seven members), with at least one member from each of the above functions who is knowledgeable about the physical resource under consideration. The other key member of the review team is the facilitator, who provides expertise in the RCM methodology and guides the review process.

The RCM review team meets on a part-time basis. Typically, this involves one to two meetings per week of about three hours each. Team members also spend about three to four hours per meeting on individual preparatory and follow-up work. The RCM review process takes about ten to fifteen meetings to complete. The physical resource chosen may be studied in sections, by subgroups, so that the review can be accomplished in this time. The RCM review team also coordinates how the recommendations are carried out. Team meetings during this phase are of similar duration but less frequent.

In addition, a phased-in approach is used to manage change successfully. This approach is employed to:
Establish the need for RCM and build support for its implementation.
Establish a vision of excellence.
Customize RCM methods to meld with existing structures and systems.
Promote technology transfer and commitment to RCM through an initial cadre of people trained and experienced in its methods.
Achieve immediate results to build credibility.

The major phases in this implementation approach and their general tasks are illustrated in figure 7-7.
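The meeting figures quoted above imply a rough range for the effort of one review. The sketch below simply multiplies them out; it counts team members only and leaves out the full-time facilitator, which is an assumption made to keep the estimate simple.

```python
def review_effort_hours(meetings: int, members: int,
                        meeting_hours: float,
                        prep_hours_per_meeting: float) -> float:
    """Person-hours for one RCM review: meeting time plus individual work."""
    return meetings * members * (meeting_hours + prep_hours_per_meeting)


# Ranges from the text: 10 to 15 meetings of about 3 hours, 5 to 7 members,
# and 3 to 4 hours of preparatory and follow-up work per meeting.
low = review_effort_hours(meetings=10, members=5,
                          meeting_hours=3.0, prep_hours_per_meeting=3.0)
high = review_effort_hours(meetings=15, members=7,
                           meeting_hours=3.0, prep_hours_per_meeting=4.0)
print(f"Estimated effort: {low:.0f} to {high:.0f} person-hours")
```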

The following is an example of the use of RCM in practice. One mining company with a fleet of 240-ton trucks in continuous operation wanted to reduce unplanned downtime. They analyzed the data in the truck dispatch system to determine the highest delay causes, and selected an assembly that was both significant and reasonably straightforward. Their choice was the hydraulic box dump assembly. A team of in-pit and shop maintainers, led by a facilitator with RCM expertise, met for about two to three hours every week over thirteen weeks.

The primary function was defined as: provide hydraulic power to smoothly and symmetrically raise and lower a loaded (240 t) dump tray. The maximum overall cycle is 47 seconds for an empty tray at the regulated pressure of 2400 psi ± 50 psi with the prime mover at 1910 rpm.

The function is stated crisply, with several standards of performance that make the definition of a functional failure clear:
Fails to raise the dump tray at all with a regulated pressure of 2400 psi ± 50 psi.
Tray is raised too slowly (overall cycle time > 47 s empty) at a pressure of less than 2350 psi.
Tray is raised too slowly (overall cycle time > 47 s empty) at a pressure of less than 2400 psi but with the engine at < 1910 rpm.
Tray is raised erratically.
Tray cannot be raised to full height.
Tray is lowered too slowly.

About 150 failure modes were determined using cause-and-effect diagrams and then transcribed to worksheets using terse phrases such as "Hoist control valve spool jammed by foreign material or wear and tear". The failure effects were classed by degree of severity using a frequency and severity matrix, with a bias toward frequency, on the assumption that if you take care of the chronic problems, the acute ones will take care of themselves. The effect corresponding to the jammed spool above is "Sufficient pilot pressure not available to move dump control valve spool and so tray cannot be lifted. The pilot valve is changed, which requires two labor hours, and the truck is down for less than four hours."

The cost effectiveness of this RCM example is clear. Downtime costs about 500 tons/hour, which is worth $20,000 in lost production, or $480,000 over a one-day period. The team was able to find the root causes of all critical failures, change both maintenance and operating procedures to reduce the incidence of some causes, and make some simple modifications in hydraulic system design to eliminate others.

Today's challenging maintenance environment demands continuous improvement. RCM provides a strategic framework to do just that. If properly applied, its benefits can be seen in better service and products. RCM is a logical and structured approach to balancing resources with equipment reliability requirements. Although it clearly involves the help of several functions in the organization, it is very much top-down and engineering oriented.
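The downtime figures in the mining example can be checked with a few lines of arithmetic. The sketch assumes the $20,000 figure is a loss per hour of downtime, which is the reading consistent with $480,000 over 24 hours; the value per ton and the upper bound for the jammed-spool failure are derived here and are not stated in the source.

```python
TONS_PER_HOUR = 500      # production lost per hour of downtime (from the text)
LOSS_PER_HOUR = 20_000   # dollars per hour, assumed from the text's figures


def downtime_loss(hours: float) -> float:
    """Lost production value, in dollars, for a given downtime."""
    return hours * LOSS_PER_HOUR


value_per_ton = LOSS_PER_HOUR / TONS_PER_HOUR  # implied dollars per ton
daily_loss = downtime_loss(24)                 # one full day of downtime
spool_loss_upper_bound = downtime_loss(4)      # truck down "less than four hours"

print(f"Implied value per ton: ${value_per_ton:.0f}")
print(f"Loss over one day: ${daily_loss:,.0f}")
print(f"Jammed spool, upper bound: ${spool_loss_upper_bound:,.0f}")
```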

RCM is successful
The commercial airline industry in the 1960s:
60 crashes per million take-offs.
40 of those (66%) were due to equipment failure.
85% of aircraft maintenance was fixed interval.

Today:
2 crashes per million take-offs.
0.3 of those (15%) are due to equipment failure (the rest are human error).
Less than 20% of aircraft maintenance is fixed interval.

Toronto Hydro:
Using RCM since 2002, reducing maintenance costs by 22%.

Canadian Navy:
Used RCM for new ships in the late 1980s.
Reduced crew size, reduced maintenance workload, and extended ship refit intervals by over a year.
Increased ship availability by 17%.
Saved C$200 million per year in operating costs and C$2 billion in capital costs.

GE Plastics (Holland, 2000):
MTBF increased from an average of 8 hours to over 80 days.
Costs dropped by 50% and staffing was reduced by 30%.

Team composition & analysis meetings


Teams are made up of operators and maintainers trained in RCM analysis.
Teams are facilitated by a highly trained RCM facilitator.

Analysis meetings are 3 to 4 hours each.
Each project requires 10 to 15 meetings.

Audit of results
Management makes certain that it is happy with the results obtained by the RCM analysis teams.

Confirm:
Suitability of all maintenance tasks.
Suitability of all operator tasks.
Suitability of any run-to-failure decisions.
Suitability of any redesign suggestions.

Recommended Reading
Moubray, John: Reliability-centred Maintenance, 2nd edition, 1997, Butterworth-Heinemann, UK

Campbell, John & Reyes-Picknell, James: Uptime, Strategies for Excellence in Maintenance Management, 2nd edition, 2006, Productivity Press, NY
