
EGERTON UNIVERSITY

FACULTY OF ENGINEERING AND TECHNOLOGY

DEPARTMENT OF INDUSTRIAL AND ENERGY ENGINEERING

TRAINING MANUAL

FOR

ITEC 236: TEROTECHNOLOGY

BACHELOR OF INDUSTRIAL TECHNOLOGY


FINAL YEAR

BY

Dr. Charles M.M. Ondieki


P.O. BOX 58274 – 00200 NAIROBI
Tel: 0722 705 609/020 2013978
E-mail: charlesondieki@yahoo.co.uk
MARCH 2007, NJORO KENYA

Preface

Various studies have indicated that for large manufacturing systems or pieces of equipment,
maintenance and its support can account for 60 to 75 percent, or more, of their life
cycle costs. The increasing demand for high quality products has brought the maintenance
problem into even sharper focus. This, in turn, has put more emphasis on maintainability
during product design.

Terotechnology is the process of optimising the life cycle costs of an asset or equipment. In
optimising life cycle costs, a thorough understanding of plant reliability and
maintainability is crucial. An attempt, therefore, has been made to present reliability
and maintainability concepts in this book to meet the challenges of modern manufacturing
system design. In doing so every effort has been made to treat the topics discussed in such a
manner that the reader will need minimum previous knowledge to understand the contents.

This introductory book describes how to design for plant reliability and ease of maintenance
from the first stages of product design. In addition the book stresses how to:
• Improve product performance
• Minimize downtime
• Reduce the frequency of maintenance
• Lower the cost of maintenance
• Use life cycle costing in the process of choosing and purchasing equipment
• Decide when and how to replace equipment
• Acquire data for reliability growth

The book is intended for Bachelor of Industrial Technology students and other engineering
students, as well as professional engineers and design and maintenance managers. This book
will enable manufacturing firms to stay ahead of the rest in product quality, efficiency and
profitability.

Dr. Charles M.M. Ondieki


Msc. (Mech. Engg.), PhD (BAdmin.)
Lecturer, Egerton University, Njoro Kenya

Contents

Preface
1. Terotechnology
1.1 Introduction
2. Plant Reliability
2.1 Specifications for Reliability
2.2 Reliability of Parts and Components
2.3 Parts in series
2.4 Reliability and Quality
2.5 The Role of Design in Reliability
2.6 Improving Reliability
2.6.1 Design Methods Used to Improve Reliability
2.6.2 Causes of Unreliability
2.7 Cost of Reliability
2.8 Basic stages in the achievement of reliability
2.9 Reliability and Failure Patterns
(e) Calculating Reliability when Failure Rate is Constant
2.10 Redundancy
3.0 Maintainability
3.1 The importance, Purpose, and Results of maintainability efforts
3.2 Maintainability cost Considerations
3.3 Maintainability Costs
3.4 Maintainability Design Considerations
3.5 Maintainability Tools
3.6 Maintainability and Safety
3.7 Safety and Human Behaviour
3.8 General Maintainability Design Guidelines
3.9 Maintainability of new Equipment
3.10 Designs for Ease of Operation
3.11 Design for ease of maintenance
3.12 Design for Serviceability
4.0 Plant Maintenance
4.1 Preventive Maintenance
4.2 Breakdown Programs
4.3 Replacement and Maintenance
4.4 Maintenance Models
4.5 Maintenance cost estimation models
4.5 Availability
4.5.1 Availability and Scheduled maintenance
4.5.2 Losses caused by non-availability of the system
4.6 Downtime and Maintenance Strategies
4.6.1 Mean Downtime (MDT) and Mean time to repair (MTTR) [or Repair Time]
4.6.2 Active Repair Time
4.6.3 Factors Influencing Downtime
4.7 Comparisons of Maintainability and maintenance costs
4.8 Comparisons of Reliability and maintenance Costs
4.8.1 Factors affecting Reliability and Maintenance Costs
4.9 Reliability-Centred Maintenance (RCM)
4.9.1 Basic steps in RCM Process
4.9.2 Methods of Monitoring Equipment condition
4.9.3 The Benefits of RCM Application
5. Life Cycle Cost and the Cost of an Equipment
5.1 The costs of quality
The Life Cycle Costs
5.2.1 Life Cycle Costing (LCC)
5.2.2 Life cycle costing steps are:
5.2.3 Advantages and Disadvantages of Life-cycle costing
5.2.4 Why Use LCC?
5.2.5 The Conversion or Decommission Phase of Life Cycle Costing
5.2.6 Life Expectancies
5.2.7 Life Cycle Cost models
Example 1
Manufacturer B’s Electric generator
5.3 Cost and Performance of Equipment
5.3.1 Factors related to reliability
5.3.2 The Design Profile
6.0 Maintainability and Reliability terms and definitions
7.0 Depreciation and Equipment Replacement
7.1 Economic Life and Obsolescence
7.2 Depreciation
7.3 Obsolescence
7.4 Causes of Depreciation
Depreciation
7.4.1 Methods of calculating Depreciation
7.5 The Decision Whether to Purchase
7.6 Installation of new Equipment
7.7 Equipment Replacement
7.7.1 Reasons for Replacement of Equipment
7.7.2 Equipment Replacement Policy
7.7.3 Guidelines in Replacement Analysis
7.7.4 Methods used for Replacement
Solution
Machine A
Machine B
(f) MAPI Method
8.0 Acquisition of Failure Data
8.1 Reasons for data collection
8.2 Information and Difficulties
8.3 Best Practice and Recommendations
8.4 Analysis and Presentation of Results
8.5 Sources of Reliability Information

1. Terotechnology
1.1 Introduction

Terotechnology is the process of optimising the life cycle cost of physical assets. Life cycle
cost is the sum of all costs incurred during the lifetime of an asset, that is, the total of
procurement and ownership costs. Life cycle costs are categorised as: cost of acquisition, cost
of use, and cost of administration. The Life Cycle Cost (LCC) of any physical asset is influenced
by the plant reliability and plant maintainability.

Maintainability is the action taken during the design and development of assets to include
features that will increase ease of maintenance and will ensure that, when used in the field, the
asset will have minimum downtime and life-cycle support costs, i.e. its serviceability,
repairability, and cost-effectiveness of maintenance are increased.

Reliability is the probability that an item will carry out its stated function adequately for the
specified time interval when operated according to the designed conditions, i.e. to define
reliability of any equipment:
• We must state the planned working life, e.g. a new car might be very reliable if we
only expect it to last for 5 years; less reliable over a period of 10 years; and
completely unreliable if we are expecting a useful life of, say, 40 years.
• Similarly we shall need to know the intended conditions of use, and the routine
maintenance which is required, e.g. if a car engine seizes because there is no water in
the radiator this is a failure of maintenance rather than a failure of reliability; if a car
is driven carelessly and fails this is a misuse failure.

Mean Time between Failures (MTBF) and Mean Time to Repair (MTTR): The Mean
Time between Failures (MTBF) tells us how long, on average, equipment operates before it
fails, and this we want to be as long as possible. MTBF, therefore, depends on reliability.
The Mean Time to Repair (MTTR) tells us how long, on average, it takes to put the
equipment right after it has failed, and this we want to be as short as possible. MTTR,
therefore, depends on maintainability.
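
As a minimal sketch of these two averages, the Python fragment below assumes a hypothetical log in which each entry records the hours an item ran before failing and the hours then taken to repair it. The figures and the function names are invented for illustration and are not data from any real plant.

```python
# Hypothetical operating log: (hours run before failure, hours taken to repair).
# The numbers are illustrative only.
log = [(400.0, 4.0), (350.0, 6.0), (450.0, 2.0)]

def mean_time_between_failures(records):
    """Average operating time before a failure occurs."""
    return sum(up for up, _ in records) / len(records)

def mean_time_to_repair(records):
    """Average time taken to restore the item after a failure."""
    return sum(down for _, down in records) / len(records)

mtbf = mean_time_between_failures(log)
mttr = mean_time_to_repair(log)
print(f"MTBF = {mtbf:.1f} h, MTTR = {mttr:.1f} h")
```

A longer MTBF reflects better reliability, while a shorter MTTR reflects better maintainability.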

2. Plant Reliability
Reliability is the probability that an item will carry out its stated function adequately for the
specified time interval when operated according to the designed conditions. No two items of
equipment are identical, because of manufacturing differences, however hard the designers and
control engineers try to eliminate defects. Reliability is therefore a probability, usually quoted as a
percentage (for mathematical work it is expressed as a decimal fraction of 1.00). Suppose that out
of every 100 cars of a particular type, 99 prove to be trouble free if used and maintained correctly,
and one fails to work as intended. Then we can say that the reliability of each car is 99 percent,
meaning that the chances are 99 in 100 that it will prove reliable.
The longer we expect anything to last, the more likely it is to fail during that time, i.e.
reliability falls as time increases.

Reliability at time t = R(t) = (No. surviving at instant t) / (No. at start, when t = 0)
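
The ratio above can be sketched in a few lines of Python. The function name and its input checks are assumptions made for illustration; the figures are the 100 cars from the example in the text.

```python
def reliability_at(survivors, initial):
    """R(t) = number surviving at time t / number at start (t = 0)."""
    if initial <= 0:
        raise ValueError("initial population must be positive")
    if not 0 <= survivors <= initial:
        raise ValueError("survivors must lie between 0 and the initial population")
    return survivors / initial

# The 100 cars from the text: 99 remain trouble free.
r = reliability_at(99, 100)
print(f"R = {r:.2f} ({r:.0%})")
```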

2.1 Specifications for Reliability


It is usually best to express a customer or market specification in terms of the service to be
performed, or result to be achieved, rather than of the hardware envisaged. The specification
must contain full information about everything that is required. The required reliability
must be expressed in figures. There are three main ways of expressing reliability in a
specification:

(i) Directly in terms of reliability for a specified useful life. Because reliability is
related to a particular life span, this is not always convenient, and MTBF or failure
rate is usually preferred.
(ii) MTBF or MTTF – This method is common, especially in the electronics industry,
where the failure rate is often approximately constant.
(iii) Failure Rate – Since the failure rate is directly related to the MTBF, it can be used
provided it is reasonably constant.
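
Items (ii) and (iii) can be linked by a short sketch. It assumes a constant failure rate (the exponential model), under which the standard results are MTBF = 1/λ and R(t) = exp(-λt), where λ is the failure rate. The MTBF of 1000 hours and the function names are hypothetical, chosen only for illustration.

```python
import math

def failure_rate_from_mtbf(mtbf_hours):
    """Constant failure rate (per hour) equivalent to a given MTBF."""
    return 1.0 / mtbf_hours

def reliability(t_hours, failure_rate):
    """R(t) = exp(-failure_rate * t), valid only for a constant failure rate."""
    return math.exp(-failure_rate * t_hours)

mtbf = 1000.0  # hypothetical MTBF in hours
lam = failure_rate_from_mtbf(mtbf)
print(f"failure rate = {lam:.4f} per hour")
print(f"R(100 h) = {reliability(100.0, lam):.3f}")
print(f"R(MTBF)  = {reliability(mtbf, lam):.3f}")
```

Under this model the reliability over a period equal to the MTBF is only about 0.37, which is one reason a specification should state the required life as well as the MTBF.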

The reliability of large installations is not necessarily quoted as a single figure. The central
unit, for example, may be assigned a higher reliability than some of the ancillary and support
equipment. Ideally, the minimum acceptable reliability or MTBF or the maximum failure rate
should be quoted, but when the system is of very new design it may be more realistic to set a
target reliability, MTBF or failure rate. In deciding what this should be, we must consider the
conditions of use, the duty cycle, what the maintenance requirements are likely to be and how
long the system can be out of use while maintenance is done.

2.2 Reliability of Parts and Components


A system will be made of parts and components, and since in some cases the failure of one of
these may cause the whole system to fail, it must be ensured that each is as reliable as
possible. Further the greater the number of parts, the greater the risk of including one which
is faulty. Hence there are two basic rules:
(i) Use as few parts as possible.
(ii) Ensure that each part is reliable.

2.3 Parts in series


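Suppose we have a system consisting of a number of parts and:
• we know the reliability of each part;
• every part is vital in the sense that, if one fails, the whole system will fail.
Then, provided the parts fail independently of one another, the reliability of the system is
the product of the reliabilities of its parts.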

Example: Consider a transformer and rectifier set, used to convert mains electricity to a
suitable voltage and frequency. Suppose each part has a reliability of 0.9. If we require only a
transformer and nothing else, then the system will have the same reliability as the one part it
contains. Therefore, for 1 part Reliability = 0.9
If, however, we also require a rectifier, then we have two things that can go wrong. Therefore,
for 2 parts Reliability = 0.9 x 0.9 = 0.81
and for 3 parts Reliability = (0.9)^3 ≈ 0.73
and for 10 parts Reliability = (0.9)^10 ≈ 0.35
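
The series calculation in the example above can be sketched in Python. The function name is an assumption made for illustration, and the parts are assumed to fail independently of one another.

```python
import math

def series_reliability(part_reliabilities):
    """Reliability of a system in which every part must work:
    the product of the individual part reliabilities."""
    return math.prod(part_reliabilities)

# Parts of reliability 0.9 each, as in the example above.
for n in (1, 2, 3, 10):
    print(f"{n:2d} part(s): R = {series_reliability([0.9] * n):.2f}")
```

The rapid fall in reliability as parts are added is the reason for the first basic rule, to use as few parts as possible.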

2.4 Reliability and Quality


Quality is sometimes defined as “fitness for purpose” and can be broken roughly into:
• Physical features, e.g. whether an item has a satisfactory appearance, all its
dimensions within limits etc.
• Performance, i.e. whether it works correctly.

Reliability is the probability that an item will perform as required, under stated conditions, for
a stated period of time. Since performance is an aspect of quality, we might say that
reliability is the probability that an item will retain its quality, under stated conditions, for a
stated period of time. Thus quality and reliability are very closely related, and the quality of a
product from the manufacturer will affect its reliability. The quality of the product is also
affected by:
(i) the method of manufacture
(ii) Production equipment
(iii) Inspection and test equipment
(iv) Supplies and/or selection of raw materials and parts etc.
(All these assume that the design and development of the product has been done
correctly).

2.5 The Role of Design in Reliability


According to the definition of reliability, design is the keystone. The design strategy used to
ensure reliability can fall between two broad extremes.
• The fail-safe approach is to identify the weak spot in the system or component and
provide some way to monitor that weakness. When the weak link fails, it is replaced.
• At the other extreme is an approach where all the product components are designed to
have equal life so the system will fall apart at the end of its useful lifetime.
• The absolute worst-case approach is frequently used, where the worst combination
of parameters is identified and the design is based on the premise that all can go
wrong at the same time. This is a very conservative approach, and it often leads to
over-design.

Two major areas of engineering activity determine the reliability of an engineering system.
First, provision for reliability must be established during the earliest design concept stage,
carried through the detailed design development, and maintained during the many steps in
manufacture. Second, once the system becomes operational, it is imperative that provision be
made for its continued maintenance during its service.

2.6 Improving Reliability


Because overall system reliability is a function of the reliability of individual components,
improvement in their reliability can increase system reliability. System reliability can be
increased by the use of backup components (i.e. redundancy). Failures in actual use can often
be reduced by upgrading user education and refining maintenance recommendations or
procedures. It may be possible to increase the overall reliability of the system by simplifying
the system (thereby reducing the number of components that could cause the system to fail)
or altering component relationships (e.g. increasing reliability of interfaces).

Generally the potential ways to improve reliability are:


• Improve component design
• Improve production and/or assembly techniques
• Improve testing
• Use redundancy
• Improve preventive maintenance procedures
• Improve user education
• Improve system design.

2.6.1 Design Methods Used to Improve Reliability


The following methods are used in engineering design practice to improve reliability (and
therefore minimize failure):

i) Margin of safety
Variability in the strength properties of materials and in loading conditions (stress) means
that the statistical distributions of strength and stress can overlap, and failures occur where
the stress exceeds the strength. Variability in strength therefore has a major impact on the
probability of failure: if the variability of the strength can be reduced, failures can be
reduced with no change in the mean value.
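
The effect of strength variability can be illustrated with a minimal sketch. It assumes, beyond what the text states, that stress and strength are independent and normally distributed, in which case the probability that stress exceeds strength follows from the standard normal distribution. All names and numerical values (in MPa) are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def failure_probability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(stress > strength) for independent, normally distributed
    stress and strength (an assumption of this sketch)."""
    margin = mean_strength - mean_stress
    spread = math.sqrt(sd_strength**2 + sd_stress**2)
    return normal_cdf(-margin / spread)

# Same mean strength (500) and the same stress distribution in both cases;
# only the standard deviation of the strength is changed.
wide = failure_probability(500.0, 50.0, 350.0, 40.0)
narrow = failure_probability(500.0, 25.0, 350.0, 40.0)
print(f"strength sd 50: P(failure) = {wide:.5f}")
print(f"strength sd 25: P(failure) = {narrow:.5f}")
```

Halving the scatter in strength lowers the failure probability even though the mean strength, and hence the nominal margin of safety, is unchanged.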

ii) Derating
The analogy to using a factor of safety in structural design is derating electrical, electronic,
and mechanical equipment. The reliability of such equipment is increased if the maximum
operating conditions (power, temperature, etc.) are derated below their nameplate values. As
the load factor of equipment is reduced, so is the failure rate. Conversely, when equipment is
operated in excess of rated conditions, failure will ensue rapidly.

iii) Redundancy
Redundancy is one of the most effective ways of increasing reliability. In parallel redundant
designs the same system functions are performed at the same time by two or more components
even though the combined outputs are not required. The existence of parallel paths may result in
load sharing, so that each component is derated and its life is extended beyond the normal.

Another method of providing redundancy is to have inoperative or idling standby units that
cut in and take over when an operating unit fails. The standby unit wears out much more
slowly than the operating unit does. Therefore, the operating strategy often is to alternate
units between full-load and standby service. The standby unit must be provided with sensors
to detect the failure and switching gear to place it in service. The sensors and/or switching
units frequently are the weak link in a standby redundant system.
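
A minimal sketch of parallel (active) redundancy follows. It assumes that the units fail independently and that the system works as long as at least one unit works, which gives the standard result R = 1 - (1 - R1)(1 - R2)...(1 - Rn). Standby arrangements, with their sensors and switching gear, are not modelled here. The function name and the unit reliability of 0.9 are chosen only for illustration.

```python
import math

def parallel_reliability(unit_reliabilities):
    """System works if at least one unit works:
    R = 1 - product of (1 - R_i), assuming independent failures."""
    return 1.0 - math.prod(1.0 - r for r in unit_reliabilities)

for n in (1, 2, 3):
    print(f"{n} unit(s) in parallel: R = {parallel_reliability([0.9] * n):.3f}")
```

Each added unit multiplies the probability of system failure by the unreliability of that unit, so the gain from each further unit becomes smaller.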

iv) Durability
The material selection and design details should be performed with the objective of
producing a system that is resistant to degradation from such factors as corrosion, erosion,
foreign object damage, fatigue, and wear. This usually requires the decision to spend more
money on high-performance materials so as to increase service life and reduce maintenance
costs. Life cycle costing is the technique used to justify this type of decision.

v) Damage tolerance
Crack detection and propagation have taken on great importance since the development of the
fracture mechanics approach to design. A damage-tolerant material or structure is one in
which a crack, when it occurs, will be detected soon enough after its occurrence so that the
probability of encountering loads in excess of the residual strength is very remote.

vi) Ease of inspection


The product should be designed so that it is possible to employ visual methods of crack
detection. In critically stressed structures special features to permit reliable NDT by
ultrasonic or eddy current techniques may be required. If the structure is not capable of ready
inspection, then the stress level must be lowered until the initial crack cannot grow to a
critical size during the life of the structure. For that situation the inspection costs will be low
but the structure will carry a weight penalty because of the low stress level.

vii) Simplicity
Simplification of components and assemblies reduces the chance for error and increases the
reliability. The components that can be adjusted by operation or maintenance personnel
should be restricted to the absolute minimum. The simpler the equipment needed to meet the
performance requirements the better the design.

viii) Specificity
The greater the degree of specificity the greater the inherent reliability of design. Whenever
possible, be specific with regard to material characteristics, sources of supply, tolerances and
characteristics of the manufacturing process, tests required for qualification of materials and
components, procedures for installation, maintenance, and use. Specifying standard items
increases reliability. It usually means that the materials and components have a history of use
so that their reliability is known. Also, replacement items will be readily available. When it
is necessary to use a component with a high failure rate, the design should especially provide
for the easy replacement of that component.

2.6.2 Causes of Unreliability

The malfunctions that an engineering system can experience can be classified into five
general categories.

1. Design mistakes: Common design errors include failure to include
all important operating factors, incomplete information on loads and
environmental conditions, erroneous calculations, and poor selection of materials.
2. Manufacturing defects: Although the design may be free from error, defects
introduced at some stage in manufacturing may degrade it. Some common
examples are (1) poor surface finish or sharp edges (burrs) that lead to fatigue
cracks and (2) decarburization or quench cracks in heat-treated steel. Elimination
of defects in manufacturing is a key responsibility of the manufacturing
engineering staff, but a strong relationship with the R&D function may be
required to achieve it. Manufacturing errors produced by the production work
force are due to such factors as lack of proper instructions or specifications,
insufficient supervision, poor working environment, unrealistic production quota,
inadequate training, and poor motivation.
3. Maintenance: Most engineering systems are designed on the assumption they
will receive adequate maintenance at specified periods. When maintenance is
neglected or is improperly performed, service life will suffer. Since many
consumer products do not receive proper maintenance by their owners, a good
design strategy is to make the products maintenance-free.
4. Exceeding design limits: If the operator exceeds the limits of temperature, speed,
etc., for which it was designed, the equipment is likely to fail.
5. Environmental factors: Subjecting equipment to environmental conditions for
which it was not designed, e.g., rain, high humidity, and ice, usually greatly
shortens its service life.

2.7 Cost of Reliability


Reliability costs money, but the cost nearly always is less than the cost of unreliability. The
cost of reliability comes from the extra costs associated with designing and producing more
reliable components, testing for reliability, and training and maintaining a reliability
organization. The figure below shows the cost to a manufacturer of increasing the reliability
of a product. The costs of design and manufacture increase with product reliability.
Moreover, the slope of the curve increases, and each incremental increase in reliability
becomes harder to achieve. The costs of the product after delivery to the customer, chiefly
warranty or replacement costs, reputation of the supplier, etc., decrease with increasing
reliability. The summation of these two curves produces the total cost curve, which has a
minimum at an optimum level of reliability. Other types of analyses establish the optimum
schedule for part replacement to minimize cost.

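Figure: Influence of Reliability on Cost. The vertical axis is Cost and the horizontal axis is
Reliability. Three curves are shown: Cost of Design and Manufacture, Costs after Delivery, and
Total Cost. Rm marks the optimum reliability, where the Total Cost curve is at its minimum.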
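
The idea of an optimum reliability Rm can be illustrated with a rough sketch. The two cost functions below are purely hypothetical, chosen only so that one rises ever more steeply with reliability and the other falls; they are not taken from the text or from any real product. The optimum is found by a simple search over a range of reliability values.

```python
# Hypothetical cost curves, chosen only to reproduce the shapes described in
# the text. The constants A and B are invented for illustration.
A = 1.0       # scales the cost of design and manufacture
B = 10_000.0  # scales the costs after delivery

def design_and_manufacture_cost(r):
    """Rises with reliability, and ever more steeply as r approaches 1."""
    return A / (1.0 - r)

def after_delivery_cost(r):
    """Warranty and replacement costs, falling as reliability rises."""
    return B * (1.0 - r)

def total_cost(r):
    """Sum of the two curves."""
    return design_and_manufacture_cost(r) + after_delivery_cost(r)

# Search reliabilities from 0.900 to 0.999 in steps of 0.001.
candidates = [i / 1000 for i in range(900, 1000)]
r_m = min(candidates, key=total_cost)
print(f"Rm = {r_m:.3f}, total cost = {total_cost(r_m):.1f}")
```

With other cost functions the position of Rm changes, but a minimum of the total cost curve arises whenever one cost rises and the other falls with reliability in this way.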

2.8 Basic stages in the achievement of reliability

Achievement of reliability can be divided into eight stages:

(i) The customer or market specification – we must ascertain as accurately as possible
what our customers require.

(ii) Prepare the design and express it as a manufacturing specification – The designer
must specify what has to be made in order to satisfy the requirements of the customers.

(iii) Prove the design – Wherever the design is a departure from previous practice, we
shall need tests on prototypes to show by demonstration that what is proposed will
achieve the reliability demanded.

(iv) Manufacture to specification – A high standard of quality control will be necessary to
ensure manufacture to specification at minimum cost, in the time required.
(v) Packaging and transport – we must ensure that the equipment is packaged and then
transported to the customer’s site without incurring any significant damage.

(vi) Purchase, storage, installation and commissioning of new equipment – we next look at
reliability from the customer’s point of view, and consider how the customer decides what to
purchase and how it is stored, installed and commissioned for use.
(vii) Operation and maintenance – The customer must use and maintain the equipment as
intended, employing operators with adequate skills and training. If there are any
difficulties, the manufacturer should be anxious to help, not merely in the interests of
good customer relations but also in order to learn for the future.

1
(viii) Reliability management and Prediction – Finally we consider the overall management
of reliability, and how reliability of a proposed design can be predicted.

2.9 Reliability and Failure Patterns

(a) Definition:
When an item no longer works as intended we say it has failed. Failure, therefore, is the
termination of the ability of an item to perform its required function.

(b) Classification of Failures

Failures are classified according to the:

i) Cause –
• A misuse failure is a failure attributable to the application of stresses
beyond the stated capability of the item, i.e. the item has been ill-treated.
• An inherent weakness failure is a failure attributable to a weakness inherent
in the item itself when subjected to stresses within the stated capabilities of
the item, i.e. the failure is probably due to a design or manufacturing fault.
ii) Suddenness –
• A sudden failure is one which could not be anticipated by prior examination.
• A gradual failure is one which could be anticipated by prior examination, i.e.
it is possible to predict that it will occur since it takes place gradually.
iii) Degree –
• A partial failure is one resulting from deviations in characteristics beyond
specified limits, but not such as to cause complete lack of the required
function, i.e. the item does not work as well as it should, but it has not
completely failed.
• A complete failure is one resulting from deviations in characteristics beyond
specified limits, such as to cause complete lack of the required function.
iv) Combination of the above terms –
• A catastrophic failure is one which is both sudden and complete.
• A degradation failure is one which is both gradual and partial.
(c) Failure rate

Failure rate is the probability of failure in unit time of an item which is still working
satisfactorily.
From a sample of 1000 parts, suppose that 100 hours after the start of a test we notice that 221
parts have failed, leaving 779 still working. Then:
Observed reliability over 100 hours = (No. surviving at t = 100)/(No. at start of the test at
t = 0) = 779/1000 = 0.779.
After the test had run for 200 hours the number of failures might have risen to 400, and then:
Observed reliability over 200 hours = 600/1000 = 0.600.

Suppose that at the instant when t = 200 hours, we are able to find out that parts are failing at
exactly 1.5 per hour. Since 600 are still working, we can express the rate at which they are
failing as:
Failure rate λ = (No. failing per hour at instant t)/(No. still surviving at instant t)
λ = 1.5/600 = 0.0025 per hour = 25 x 10^-4 per hour.

(d) Failure Probability Density Function


The failure probability density function gives, for any instant of time t, the probability of
failure in unit time of an item which was working satisfactorily at time t = 0. In the example
above, the parts were failing at exactly 1.5 per hour at the instant when t = 200 hours, so we
can write: Observed value of failure probability density function for instant t =
(No. failing per hour at instant t)/(No. at start when t = 0)
:. f(t) = 1.5/1000 = 15 x 10^-4 per hour. Summed over the whole life of the parts, f(t) totals 1.00.

NB: Reliability is given by:


Reliability at time t = R(t) = (No. surviving at instant t)/(No. at start when t = 0)
Failure rate = λ(t) = (No. failing per hour at instant t)/(No. still surviving at instant t)
:. R(t) x λ(t) = (No. failing per hour at instant t)/(No. at start when t = 0) = f(t)
or f(t) = R(t) x λ(t) (= Reliability x failure rate). For the instant when t = 200 hours:
R(t) = 600/1000 = 0.6; λ(t) = 25 x 10^-4 per hour
:. f(t) = 0.6 x 25 x 10^-4 = 0.0015 failures per hour, or f(t) = 15 x 10^-4 failures per hour.
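
The three observed quantities can be computed directly from the test counts. The short sketch below uses the figures of the 1000-part example and checks that f(t) = R(t) x λ(t); the variable names are chosen here for illustration.

```python
# Observed values from the 1000-part test described above.
n_start = 1000           # parts on test at t = 0
n_surviving = 600        # parts still working at t = 200 hours
failing_per_hour = 1.5   # parts failing per hour at t = 200 hours

reliability = n_surviving / n_start              # R(t)
failure_rate = failing_per_hour / n_surviving    # lambda(t), per hour
density = failing_per_hour / n_start             # f(t), per hour

print(f"R(200)      = {reliability:.3f}")
print(f"lambda(200) = {failure_rate:.4f} per hour")
print(f"f(200)      = {density:.4f} per hour")
print(f"R x lambda  = {reliability * failure_rate:.4f} per hour")
```

The last two lines print the same value, since the number surviving cancels when R(t) and λ(t) are multiplied.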

(e) Calculating Reliability when Failure Rate is Constant


Reliability at time t = (Number surviving at instant t)/(Number at start when t = 0)
If we start with, say, 1000 parts and suppose that exactly 100 hours after the start of the test we
notice that 221 parts have failed, leaving 779 still working, then Observed Reliability over
100 hours = 779/1000 = 0.779.

If after the test had run for 200 hours the number of failures had risen to 400, we
should have: Observed Reliability over 200 hours = 600/1000 = 0.6.

Suppose that at the instant when t = 200 hours, we are able to find out that parts are failing at
exactly 1.5 per hour. Since 600 are still working, we can express the rate at which they are
failing as follows: Observed instantaneous failure rate = (Number failing per hour at instant
t)/(Number still surviving at instant t)
Therefore, Instantaneous failure rate (or Hazard Rate) λ = 1.5/600 = 25 x 10^-4 per hour.

Having counted how many parts failed during each hour of the test, we could have said that
x1 per cent of the 1000 parts failed during the first hour, x2 per cent during the second hour,
and so on up to say, xn per cent during the hour when the 1000th part failed. Since all the parts
have now failed, if we add x1+x2+x3+……..+xn, the total must be 100% or 1.00.

At the instant t = 200 hours, parts were failing at exactly 1.5 per hour:
Observed value of failure probability density function for instant t =
(Number failing per hour at instant t)/(Number at start when t = 0).
Therefore, f(t) = 1.5/1000 = 15 x 10^-4 per hour.
If we add up the fractions which fail at every instant of time, the total must be 1.00. For a
continuous mathematical function we can write this as

$\int_{t=0}^{t=\infty} f(t)\,dt = 1.00$

But f(t) = R(t) x λ, or f(t) = λR(t); e.g. at t = 200, R(t) = 600/1000 = 0.6 and
λ = 1.5/600 = 25 x 10^-4. Therefore, f(t) = λR(t) = 25 x 10^-4 x 0.6 = 15 x 10^-4 (as above).

Reliability is cumulative. When we say that the reliability of 100 parts over the period from time t = 0
to time t = t1 was observed to be 0.70, we mean that if we add up all those which failed
between the start of the test and time t1, there would be a total of 30 failures out of 100.

Suppose we continued to a new time t2, and during this extension a further 3 parts failed, so
that the reliability is now 0.67. Therefore:

Observed change in reliability from time t1 to t2 = New reliability - Previous reliability
= 0.67 - 0.70 = -0.03 (the negative sign means that the reliability is reduced). If the interval
from t1 to t2 is taken as one hour, 0.03 is also the observed value of the failure probability
density function, i.e. (Number failing per hour at instant t)/(Number at start when t = 0) = 3/100 = 0.03.
Therefore, the value of the failure probability density function f(t) = -(the rate of change of
reliability), i.e.

$f(t) = -\dfrac{dR(t)}{dt} = \lambda R(t)$

Therefore, $\dfrac{dR(t)}{R(t)} = -\lambda\,dt$. Integrating from t = 0, where R(t) = 1, to time t:

$\int_{1}^{R(t)} \dfrac{dR(t)}{R(t)} = -\int_{0}^{t} \lambda\,dt$

Therefore, $\ln R(t) = -\lambda t$, or $R(t) = \exp(-\lambda t)$, and $f(t) = \lambda R(t) = \lambda \exp(-\lambda t)$.
Example: A part has a constant failure rate of 0.001 per hour. Calculate its reliability over
500 operating hours.
Solution: R(t) = exp(-λt) = exp(-0.001 x 500) = exp(-0.5) = 0.61.
We can also calculate the fraction of the parts originally put on test which are failing per hour at
the instant when the time is 500 hours, since
f(t) = λR(t) = λexp(-λt) = 0.001 x 0.61 = 0.00061 = 6.1 x 10^-4 failures per hour.
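
The worked example can be checked with a few lines of Python. This is a minimal sketch assuming a constant failure rate, as in the derivation above; the function names are illustrative.

```python
import math

def reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * hours)

def failure_density(failure_rate, hours):
    """f(t) = lambda * R(t)."""
    return failure_rate * reliability(failure_rate, hours)

lam = 0.001   # failures per hour, from the worked example
t = 500       # operating hours

print(f"R({t}) = {reliability(lam, t):.4f}")
print(f"f({t}) = {failure_density(lam, t):.6f} per hour")
```

To four decimal places the reliability is 0.6065, which the example rounds to 0.61.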

(f) Relationship between Failure Rate λ, and Failure Probability Density Function
f(t)
Consider the probability of an item failing in the interval between t and t+dt. This can be
described in two ways:
(a) The probability of failure in the interval t to t+dt given that it has survived until time
t which is λ(t)dt; where λ(t) is the failure rate.
(b) The probability of failure in the interval t to t+dt unconditionally, which is f(t)dt,
where f(t) is the failure probability density function.
The probability of survival to time t is the reliability R (t). The rule of conditional probability
therefore dictates that: λ(t)dt=f(t)dt/R(t); therefore, λ(t)=f(t)/R(t).
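However, since f(t)dt is the probability of failure in the interval dt:

$\int_{0}^{t} f(t)\,dt$ = probability of failure between 0 and t = $1 - R(t)$

Differentiating, $f(t) = -\dfrac{dR(t)}{dt}$, and therefore $\lambda(t) = -\dfrac{1}{R(t)}\,\dfrac{dR(t)}{dt}$.

Integrating both sides:

$\int_{0}^{t} \lambda(t)\,dt = -\int_{1}^{R(t)} \dfrac{dR(t)}{R(t)} = -\ln R(t)$

[NB: when t = 0, R(t) = 1, and at time t the reliability is R(t)]

If the failure rate is assumed to be constant:

$\ln R(t) = -\int_{0}^{t} \lambda\,dt = -\lambda t$; therefore, $R(t) = \exp(-\lambda t)$

Therefore, MTBF, $\theta = \int_{0}^{\infty} R(t)\,dt = \int_{0}^{\infty} \exp(-\lambda t)\,dt = 1/\lambda$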

[NB: Failure Rate, λ, is the probability of failure in unit time of an item which is still
working satisfactorily, i.e. 1.5/600 = 25 x 10^-4 failures per unit time, whilst Failure Probability
Density Function, f(t), is the probability of failure in unit time of an item which was working
satisfactorily at time t = 0, i.e. 1.5/1000 = 15 x 10^-4 failures per unit time].
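
The result that the MTBF equals 1/λ for a constant failure rate can be confirmed numerically by integrating R(t) = exp(-λt). The sketch below uses the trapezoidal rule; the integration horizon and step count are arbitrary choices, and the value 25 x 10^-4 per hour is used here simply as an assumed constant rate.

```python
import math

def mtbf_numeric(failure_rate, horizon_multiple=40.0, steps=200_000):
    """Approximate the integral of exp(-lambda * t) from 0 to a long horizon
    using the trapezoidal rule."""
    horizon = horizon_multiple / failure_rate   # far enough out that R(t) is ~0
    dt = horizon / steps
    total = 0.5 * (1.0 + math.exp(-failure_rate * horizon))
    for i in range(1, steps):
        total += math.exp(-failure_rate * i * dt)
    return total * dt

lam = 25e-4   # per hour, assumed constant for this check
print(f"numerical MTBF = {mtbf_numeric(lam):.1f} hours")
print(f"1 / lambda     = {1 / lam:.1f} hours")
```

Both lines give 400 hours, within the accuracy of the numerical integration.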

(g) The Bathtub Curve


If a large number of a particular item or product is put on life test and the test is run until every
part has failed, the graph of the observed failure rate against time since the test started is
called the bathtub curve (its name comes from its resemblance to the shape of a
bathtub). For the purpose of performing various reliability studies, the bathtub curve is
divided into three regions as follows:

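[Figure: The bathtub curve. Failure rate λ is plotted against time, with the points O, A, B and C
marked along the time axis. The curve is divided into the early failure period, the constant
failure rate period (useful working life), and the wear-out failure period.]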

i) Early failure period: O-A

At the start of the test the failure rate may be relatively high, but this usually falls
progressively, until at A where the failure rate is approximately constant and at its lowest
level. The most common causes of early failures are:

• Manufacturing faults – these are faults which are not detected before dispatch to the
consumer. In each case two faults are implied: the product was wrongly made; and its
fault was not detected before it left the factory. Manufacturing faults often account for
the majority of the early failures.
• Design faults – when the designer completes a new design, it must be thoroughly
tested as a prototype before full-scale manufacture begins. If this is not adequately
done, any shortcomings in the design may then reveal themselves as early failures.
• Misuse – A few failures may be due to accidental misuse by the customer before he/she is
fully competent in operating the product.

Running the product for a longer period prior to dispatch, making improvements in the
manufacturing process, and improving quality control activities can all minimize the
occurrence of early failures in service. Some of the reasons for failures in this period include
substandard workmanship and parts, poor manufacturing methods, human error, inadequate
quality control, and unsatisfactory debugging.

NB: This is the period often covered by guarantee, during which the manufacturer agrees
to make good anything which goes wrong.
ii) Constant failure rate period: A-B
Once the early failures have been removed, the parts usually settle down to what may be a
relatively long period, from A to B, when the failure rate is approximately constant. After
B the failure rate begins to rise again, often quite steeply, as the parts begin to wear out.

Although the failure rate in the constant period is usually low, it can be very troublesome if
high reliability is required. We can avoid early failures by good design and manufacture, and
by running the parts on load for a time at least equal to OA. We can avoid wear out failures
by replacement before time B, but we are still left with the constant failure rate period, right
through the normal working life. Failures in this period are unlikely to be due to any single
cause. It is usual for failures from a wide variety of causes to occur at random, with no
obvious pattern, except that the failure rate is roughly constant.

Some of the reasons for failures in this period are undetectable defects, low safety factors,
higher-than-expected random stresses, abuse, and natural failures.

iii) The wear out failure period

Everything wears out sooner or later, and so after B the failure rate rises again: here the
failure rate increases with time. The failures occurring in this period are no longer random,
and their causes include aging, friction, wrong overhaul practices, poor maintenance, and
corrosion.

(h) System
The term system is used to denote any complete installation or equipment. The failure pattern of a
system, or indeed any assembly of parts, can be regarded as the sum of the failure patterns of
the individual parts, but there may be additional failures as follows:

(i) Failures may occur at what are termed the interfaces between two parts. For
example there may be failures in soldered joints or connectors, as well as in the
parts themselves.
(ii) One part may affect the performance of another. Thus if one part fails, it may
overload other perfectly good parts and cause them to fail as well.

Whenever a system fails we repair it, probably by replacing the faulty part with a new one.
When repair is no longer economical, we buy a new system. Thus each system will be a
different age, and each of the large number of parts it contains will have its own failure
pattern. The addition of so many small failure patterns is likely to produce a roughly constant
failure rate for the whole system.

(i) Failure Rate and Mean Time between Failures (MTBF)


Consider a batch of n items of which, at any time t, a number k have failed. The cumulative
time, T, will be nt if it is assumed that each failed item is replaced when it fails.
(i) Failure Rate: This is the ratio of the total number of failures to the total
cumulative observed time for a stated period in the life of an item. If λ is the
failure rate of the n items then the observed λ is given by λ = k/T.
(ii) Mean Time Between Failures (MTBF): This is the mean value of the length of
time between consecutive failures (computed as the ratio of the total cumulative
observed time to the total number of failures) for a stated period in the life of an
item. If θ is the MTBF of the n items then the observed MTBF is given by θ = T/k,
i.e. θ = 1/λ.
(iii) Mean Time to Fail (MTTF): This is the ratio of cumulative time to the total
number of failures for a stated period in the life of an item. Again this is T/k. The
only difference between MTBF and MTTF is in their usage. MTTF is applied to
items that are not repaired, such as bearings and transistors, and MTBF to items
which are repaired. It must be remembered that the time between failures excludes
the down time.
(iv) Mean Life: This is defined as the mean of the times to failure where each item is
allowed to fail. While MTBF and MTTF can be calculated over any period (for
example, confined to the constant failure rate portion of the bathtub curve), mean
life must include the failure of every item and therefore takes
into account the wear out end of the curve. Only for constant failure rate situations
are they the same.
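
As a short illustration of the observed failure rate and MTBF, the sketch below uses a hypothetical test (the item count, duration and number of failures are assumed figures, not data from this manual) and takes the total cumulative time as the number of items multiplied by the test duration.

```python
def observed_failure_rate(n_items, hours, n_failures):
    """Failures divided by total cumulative time (items x hours)."""
    cumulative_time = n_items * hours
    return n_failures / cumulative_time

def observed_mtbf(n_items, hours, n_failures):
    """Total cumulative time divided by failures: the reciprocal of the failure rate."""
    cumulative_time = n_items * hours
    return cumulative_time / n_failures

# Hypothetical test: 50 items run for 1000 hours with 5 failures,
# each failed item being replaced as soon as it fails.
lam = observed_failure_rate(50, 1000, 5)
theta = observed_mtbf(50, 1000, 5)
print(f"failure rate = {lam:.6f} per hour, MTBF = {theta:.0f} hours")
```

Here the cumulative time is 50,000 hours, giving a failure rate of 0.0001 per hour and an MTBF of 10,000 hours.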

2.10 Redundancy

(a) Types of redundancy:


However much care is taken in the design of the system and its parts, we may find that we have still not
achieved the overall reliability demanded. This may be primarily because some of the units
which make up the system are insufficiently reliable. Hence we may decide to duplicate
them, so that if one unit fails there is another similar unit there to carry on working, and so
avoid failure of the whole system. This technique is called redundancy. Redundancy is the
provision of more than one means of accomplishing a given function. Examples:
• aircraft have three identical altimeters so that if one goes wrong readings can be
taken from the other two (which should give the same readings);
• in hospitals there is always an emergency supply (in case the mains electricity fails),
often from a standby motor generator;
• a bicycle wheel has many spokes, and several can break without a serious drop in
performance or reliability. This is called partial redundancy;
• a spare wheel is provided for vehicles in case one is punctured.

There are two main types of redundancy


(i) Active redundancy: Here all the alternative means of achieving a given function
are energized whenever that section of the system is operating. Thus the set of three
altimeters is an example of active redundancy.
(ii) Standby redundancy: Here only one of the alternatives is energized at a time, and
there is provision so that if one fails another can be switched on.

(b) Active Redundancy:


Suppose that one unit in our system is insufficiently reliable, so we decide to incorporate
three units in active redundancy. They each perform the same function, and are designed so
that the system will continue to operate so long as at least one of them is working. The three
units may be identical, but as this is not necessarily so, we will designate the reliability of
each by R1, R2 and R3.
[Figure: Units 1, 2 and 3, with reliabilities R1, R2 and R3, connected in parallel between points A and B.]

The electrical analogy is that current will flow from A to B so long as there is an unbroken
circuit through at least one unit. The probability that at least one of them is still working will
give the overall reliability of the units (in parallel).

We calculate the probability that each unit will fail:


Probability that unit 1 fails = F1=(1-R1)
Probability that unit 2 fails = F2=(1-R2)
Probability that unit 3 fails = F3=(1-R3)
Probability that all 3 units fail = Fb=F1 x F2 x F3

Probability that the block of 3 units in parallel operates satisfactorily is

Rb = (1 - Fb) = 1 - (F1 x F2 x F3)
Therefore, in general, when there are k units in parallel:
Rb = 1 - (F1 x F2 x ... x Fk)

Example 1: Suppose R1 = R2 = R3 = 0.90

Then probability that unit 1 fails = F1 = (1 - R1) = (1 - 0.90) = 0.10 = F2 = F3
Probability that all 3 units fail = Fb = F1 x F2 x F3 = 0.1 x 0.1 x 0.1 = 0.001
Probability that the block of 3 units operates as intended
= Rb = 1 - Fb = 1 - 0.001 = 0.999

Thus although one unit alone would only be 0.90 or 90% reliable, three units in parallel are
0.999 or 99.9% reliable. For a reasonable gain in reliability there is a limit to how many units you
can put in active redundancy. Thus:

• Two units in parallel will in many cases give a useful improvement in reliability. If
however we insert a third unit, the additional improvement is much smaller, and in
general there is no advantage in putting large numbers of units in parallel.
• We get the biggest increase when the unit reliability is between about 0.35 and 0.50. If the
unit reliability is already 0.85 or above, two units in parallel may be useful, but the
use of three or more gives little further improvement.
• However, when the unit reliability is very low, it is difficult to get acceptable block
reliability from the use of redundancy alone, because if our units are unreliable the use
of more of them only puts in more unreliability and this offsets the gain from
redundancy.
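
The diminishing return from adding units in parallel can be seen by tabulating the block reliability for a few unit reliabilities. The following is a minimal sketch; the function name and the chosen unit reliabilities are illustrative, and the units are assumed to fail independently.

```python
from math import prod

def parallel_reliability(unit_reliabilities):
    """Block reliability of units in active redundancy:
    1 minus the product of the individual failure probabilities."""
    return 1.0 - prod(1.0 - r for r in unit_reliabilities)

# Block reliability for 1 to 4 identical units at several unit reliabilities.
for r in (0.20, 0.50, 0.90):
    row = [parallel_reliability([r] * k) for k in range(1, 5)]
    print(f"unit R = {r:.2f}: " + ", ".join(f"{rb:.4f}" for rb in row))
```

Each row lists the block reliability for one, two, three and four units. For a unit reliability of 0.90 the second unit raises the block from 0.90 to 0.99, while the third adds only a further 0.009.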

(c) Detection of a failed unit


One practical problem with active redundancy is that of detecting when a unit fails. Thus if
two units are in active redundancy and one fails, the other will carry on working and the
operator may not even know that a failure has occurred. This is dangerous, because the
protection provided by redundancy has now gone. Some time later the second will also fail,
and this time the system will fail with it. Therefore it is essential to have a detection device
which will indicate when a unit fails, so that we can ensure that it is restored to working
order.

(d) The MTBF when active Redundancy is used.


If we have two units in active redundancy, the overall MTBF will not be doubled, because
both are energized together from the start and so the MTBF depends upon how long it is before the second
unit fails. If there are k identical units in parallel, each with a constant failure rate λ, then the
mean time between failures (MTBF) for the whole block is given by:
MTBF = θb = 1/λ + 1/(2λ) + 1/(3λ) + ... + 1/(kλ)
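
The block MTBF formula is easy to evaluate for a few values of k. In this sketch the failure rate of 0.001 per hour is an assumed figure, chosen so that a single unit has an MTBF of 1000 hours.

```python
def active_redundancy_mtbf(failure_rate, k):
    """MTBF of k identical units in active redundancy:
    1/lambda + 1/(2*lambda) + ... + 1/(k*lambda)."""
    return sum(1.0 / (i * failure_rate) for i in range(1, k + 1))

lam = 0.001   # assumed failure rate per hour, so a single unit has MTBF 1000 hours
for k in range(1, 5):
    print(f"{k} unit(s): MTBF = {active_redundancy_mtbf(lam, k):.0f} hours")
```

Each extra unit adds less than the one before it: the second unit adds 500 hours, the third about 333 hours, and the fourth 250 hours.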
(e) Combination of series and parallel units
In practice a system may present itself as a combination of series and parallel units. In effect, units
will be in parallel whenever there is an alternative path which will enable the system to work
satisfactorily, and they will be in series whenever it is essential for all the units concerned to
work simultaneously. Take as an example the system shown below, which is reduced step by step
to a single equivalent unit:
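[Figure: A system of nine units between points A and B, made up of two parallel limbs. One limb
has R1 = 0.95 in series with the parallel pair R2 = 0.80 and R3 = 0.80. The other limb has
R4 = 0.90 and R5 = 0.95 in series with a parallel pair of branches: R6 = 0.85 in series with
R7 = 0.85, and R8 = 0.85 in series with R9 = 0.85.]
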
• Look for units in series within a parallel configuration, and calculate the reliability of
a single equivalent unit.
• Reduce each parallel arrangement to a single equivalent unit.
• Repeat the above, always dealing with the smallest recognisable configurations first.

R6 and R7 are in series, within a parallel arrangement.
Equivalent reliability = R67 = R6 x R7 = 0.85 x 0.85 = 0.72; similarly R89 = 0.72
• Next deal with the two parallel sections. Since R2 = 0.80, F2 = (1.0 - 0.80) = 0.20, and
the probability that R2 and R3 will both fail = F23 = F2 x F3 = 0.20 x 0.20 = 0.04,
and therefore the reliability of R2 and R3 together = R23 = (1.0 - 0.04) = 0.96
Similarly for R67 and R89, we have F67 = (1.0 - 0.72) = 0.28 and F89 = (1.0 - 0.72) =
0.28, and therefore the probability that both will fail = 0.28 x 0.28 = 0.08. Therefore, the
reliability of R67 and R89 together = R6789 = (1.0 - 0.08) = 0.92

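[Figure: The reduced system between points A and B. One limb has R1 = 0.95 in series with
R23 = 0.96; the other has R4 = 0.90, R5 = 0.95 and R6789 = 0.92 in series.]
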
• Next take each series limb in turn: R123 = R1 x R23 = 0.95 x 0.96 = 0.91
R456789 = R4 x R5 x R6789 = 0.90 x 0.95 x 0.92 = 0.79
We now have a straightforward parallel configuration: F123 = (1.0 - 0.91) = 0.09
F456789 = (1.0 - 0.79) = 0.21
Therefore, the probability that the whole system will fail = Fs = 0.09 x 0.21 = 0.0189, and
the reliability of the whole system = Rs = (1.0 - 0.0189) = 0.9811
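
The same reduction can be written as two small helper functions, one for series and one for parallel groups, applied from the innermost configuration outwards. This is a sketch with illustrative function names, assuming the units fail independently.

```python
from math import prod

def series(*reliabilities):
    """All units must work: multiply the reliabilities."""
    return prod(reliabilities)

def parallel(*reliabilities):
    """At least one unit must work: 1 minus the product of the unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Unit reliabilities from the worked example.
r1, r2, r3, r4, r5 = 0.95, 0.80, 0.80, 0.90, 0.95
r6 = r7 = r8 = r9 = 0.85

limb_123 = series(r1, parallel(r2, r3))
limb_456789 = series(r4, r5, parallel(series(r6, r7), series(r8, r9)))
system = parallel(limb_123, limb_456789)

print(f"R123 = {limb_123:.4f}, R456789 = {limb_456789:.4f}, Rs = {system:.4f}")
```

Carrying full precision through every step gives a system reliability of about 0.981. The hand calculation differs slightly in the fourth decimal place because each intermediate result was rounded to two figures.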

(f) Interconnections within series/parallel arrangements

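[Figure: Two alternative arrangements of the units RA1, RA2, RB1 and RB2 between points A and B.
1st alternative: a parallel arrangement without connections, in which RA1 is in series with RB1
on one path and RA2 is in series with RB2 on the other.
2nd alternative: a parallel arrangement with connections, in which the two paths are joined
by a cross connection between the RA units and the RB units.]
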
Suppose every unit has a reliability of 0.90.
With the 1st alternative the whole system will work if either
RA1 and RB1 both work, or RA2 and RB2 both work. However, there are two more possible
combinations which are not available in this arrangement, namely:

RA1 and RB2
RA2 and RB1

We could make both these latter combinations acceptable merely by modifying the
configuration to the 2nd alternative, in which there is a cross connection in the middle of the
network.

Reliability of the whole system in the 1st alternative = 1 - (1 - 0.90 x 0.90)^2 = 0.964, and
reliability of the whole system in the 2nd alternative = (1 - 0.10 x 0.10)^2 = 0.9801.
Thus the 2nd alternative is better than the 1st alternative for the same units, in the same
application.
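
The two figures can be cross-checked by listing every combination of working and failed units and adding up the probabilities of those combinations in which the system works. The sketch below does this by brute force; the unit labels and function names are illustrative, and the units are assumed to fail independently.

```python
from itertools import product

UNITS = ("A1", "A2", "B1", "B2")

def works_without_cross_connection(up):
    # Either complete path must be intact: A1-B1 or A2-B2.
    return (up["A1"] and up["B1"]) or (up["A2"] and up["B2"])

def works_with_cross_connection(up):
    # Any working A unit can feed any working B unit.
    return (up["A1"] or up["A2"]) and (up["B1"] or up["B2"])

def system_reliability(works, r=0.90):
    """Sum the probabilities of every working/failed combination in which the system works."""
    total = 0.0
    for states in product((True, False), repeat=len(UNITS)):
        up = dict(zip(UNITS, states))
        probability = 1.0
        for is_up in states:
            probability *= r if is_up else (1.0 - r)
        if works(up):
            total += probability
    return total

print(f"1st alternative: {system_reliability(works_without_cross_connection):.4f}")
print(f"2nd alternative: {system_reliability(works_with_cross_connection):.4f}")
```

Enumerating the sixteen states is practical only for small networks, but it needs no series or parallel reduction and so works for any arrangement.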

(g) Standby Redundancy.


When standby redundancy is used, only one path is energized at a time and the remainder are
idle although ready to be brought into action should a failure of the first demand it. This is the
case with the standby motor generator in a hospital operating theatre. At first sight it is a
more attractive proposition than active redundancy, because the spare units remain new and
unused until required instead of wearing out all the time the equipment is operating. The
supposed advantage can be set out in terms of the MTBF.

For one path only, MTBF = 1/λ = θ
For two paths in active redundancy, MTBF = 1/λ + 1/(2λ) = 3/(2λ) = 1.5θ
For two paths in standby redundancy, MTBF = 1/λ + 1/λ = 2/λ = 2θ
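
These idealized figures can be reproduced by a simple simulation. The sketch below is only an illustration: it assumes exponentially distributed lifetimes (a constant failure rate), perfect detection and switching, and no failures while a unit is on standby. The failure rate, number of trials and random seed are arbitrary choices.

```python
import random

def simulate_mtbf(failure_rate, trials=100_000, seed=1):
    """Estimate the MTBF of two identical units in active and in standby redundancy,
    assuming exponential lifetimes, perfect switching and no failures on standby."""
    rng = random.Random(seed)
    active_total = 0.0
    standby_total = 0.0
    for _ in range(trials):
        life_1 = rng.expovariate(failure_rate)
        life_2 = rng.expovariate(failure_rate)
        active_total += max(life_1, life_2)    # both run from the start
        standby_total += life_1 + life_2       # second starts when the first fails
    return active_total / trials, standby_total / trials

lam = 0.001   # assumed failure rate per hour, so theta = 1000 hours
active, standby = simulate_mtbf(lam)
print(f"active:  about {active:.0f} hours (theory {1.5 / lam:.0f})")
print(f"standby: about {standby:.0f} hours (theory {2 / lam:.0f})")
```

The simulated means settle close to 1.5θ and 2θ, but only under the ideal assumptions listed.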
However, before any conclusions about the merits of standby redundancy are made, the
following facts should be considered:

(i) The standby unit is not carefully packed away in a specially designed storeroom.
Usually it is out on the job, alongside the unit which is operating. Therefore it can
be affected by the environment due to, say, vibration, grease, dust etc. Thus the
assumption that it will remain in perfect working order, ready for immediate use,
may be quite incorrect. It may well have an appreciable failure rate while waiting
to be used, and the longer it remains on standby duty the greater the risk that it
will not work should it be required.
(ii) For many systems, the moment of switch on is particularly hazardous. Stray
surges are liable to wander around the circuits and may well damage any part
which is already a bit doubtful. With active redundancy, however, the second
circuit is switched on with the first and there is time to check that it works before
it is required.
(iii) Some parts deteriorate if left unloaded for long periods. Large electric furnaces
are best kept at least partially energized, since heater failures often occur if for any
reason they have to be completely switched off and allowed to cool.
(iv) It may be possible for two units in active redundancy to share the work, so that
each only has to work at half load. This effectively de-rates both of them, and so
may lead to a longer working life.

(v) In many cases the operator may not notice the failure and start the standby unit.
Usually it will be necessary to install further devices to:
(a) detect that the main unit has failed
(b) switch in the standby automatically.
If either (a) or (b) above does not work correctly, the standby unit will not come into
operation, even though it is in perfect working order.

The detection device will normally be energized whenever the main unit is working, and
there are two obvious ways in which it can fail:
(a) it can fail to detect that the main unit has ceased to function
(b) It can cause the standby unit to switch in when the main unit is still working
correctly.

The switchover device will not normally be energized, but it must function correctly when
required. It can also fail in two ways:
(a) it can fail to switch in the standby unit when required
(b) It can switch in the standby unit when it is unnecessary.

Clearly in choosing between active and standby redundancy, we shall have to consider each
individual case on its merits, but much often hinges on two factors:
(i) The reliability of the detection and switching devices of the standby unit.
We use redundancy because we are not satisfied with the reliability of one
working unit on its own. Unless, therefore, the detection and switching devices
are appreciably more reliable than the main working unit, we may be
introducing as much unreliability into the system as we are removing by
having the standby unit.
(ii) It is a fundamental assumption in using a standby unit that its failure rate when
shut down, but probably exposed to normal operating conditions, is
appreciably lower than if it were energized under the same conditions. If this
is not so, then active redundancy may well be a better proposition.

(h) Partial Redundancy


In active or standby redundancy it is assumed that the system will continue to work provided
that at least one path is still in operation. There are however, many cases where, although
some failures are permitted, more than one path must continue to work if the system as a
whole is to work, and this is called partial redundancy. The following are examples.
(i) The suspension bridge is hung from the chains by a considerable number of
vertical members. It might be possible for a few of these vertical members to
break without endangering the bridge, but clearly this could not go on until only
one was left.
(ii) An aircraft with four engines could almost certainly land safely if three were still
working, but this might be very difficult if only one survived.
(iii) Usually several spokes of a bicycle wheel can break without a serious drop in
performance or reliability. However, a minimum number of spokes (roughly
evenly distributed around the hub) must remain.
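
When the paths are identical and fail independently, the reliability of a partially redundant system can be estimated with the binomial distribution: the system works if at least k of its n paths work. The sketch below applies this to a hypothetical four-engine aircraft; the engine reliability of 0.90 is an assumed figure used only for illustration.

```python
from math import comb

def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical, independent units are working."""
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))

# Hypothetical four-engine aircraft with an assumed engine reliability of 0.90.
for needed in (1, 2, 3, 4):
    print(f"at least {needed} of 4 engines: {k_out_of_n_reliability(needed, 4, 0.90):.4f}")
```

With k = 1 this reduces to the active redundancy formula, and with k = n it reduces to the series formula, so partial redundancy lies between the two.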

(i) Redundancy and Cost


Although they improve reliability and maintainability, redundant units require more space and
weight, capital cost is increased, and the additional units need more spares and generate more
maintenance. System availability is thus improved, but both preventive and corrective
maintenance costs increase with the number of units.

3.0 Maintainability
Maintainability refers to the measures taken during the design, development and installation of a
manufactured product to include features that will increase ease of maintenance; reduce
required man-hours, tools, logistic costs, skill levels and facilities; and ensure that when used
in the field the product will have minimum downtime and life-cycle support costs.

From this definition, the general principles of maintainability include lowering or
eliminating altogether the need for maintenance; reducing life cycle maintenance costs;
lowering the number, frequency, and complexity of required maintenance tasks; establishing
the extent of preventive maintenance to be performed; reducing the mean time to repair
(MTTR); and providing for maximum interchangeability.

On the other hand maintenance refers to the measures taken by the users of a product to keep
it in operable condition or repair it to operable condition.

3.1 The importance, Purpose, and Results of maintainability efforts


The objectives of applying maintainability engineering principles to engineering systems and
equipment include:
• Reducing projected maintenance time and costs through design modifications
directed at maintenance simplifications.
• Determining man-hours and other related resources required to carry out the
projected maintenance.
• Using maintainability data to estimate item availability or unavailability.

When maintainability engineering principles have been applied effectively to any product, the
following results can be expected.
• Reduced downtime for the product and consequently an increase in its operational
readiness or availability.
• Efficient restoration of the product’s operating condition when random failures are
the cause of downtime.
• Maximum operational readiness through the elimination of those failures that are caused by
age or wear-out.

Because maintenance requirements should be considered before a product is designed,
maintainability design requirements can be determined by processes such as maintenance
engineering analysis, the analysis of maintenance tasks and requirements, the development of
maintenance concepts, and the determination of maintenance resource needs.

Because equipment downtime consists of many components and sub-components, numerous
engineering and analytical efforts are required to reduce it. The three main
components of equipment downtime are logistic time, administrative time, and active repair
time.
(a) Logistic time is that portion of equipment downtime during which repair work
is delayed because a replacement part or other component of the equipment is
not immediately available. Logistic time, therefore, is largely a matter of
management, and can be minimized by developing effective procurement policies.
(b) Active repair time is that portion of equipment downtime during which the
repair staff is actively working to effect a repair. Its six elements are fault
location time, preparation time, failure verification time, actual repair time,
part acquisition time, and final test time. Usually, the length of active repair
time reflects factors such as product complexity, diagnostic adequacy, nature
of product design and installation, and the skill and training of the
maintenance staff.
(c) Administrative time is that portion of equipment downtime not taken into
consideration in active repair time and in logistic time. This time (which
normally includes wasted time) is a function of the structure of the operational
organization and is influenced by factors such as work schedules and the
non-technical duties of maintenance people.

3.2 Maintainability cost Considerations

In many cases, the cost of acquiring a product is less than the cost of ownership over the
product life cycle. Cost of ownership includes operation costs (such as the cost of personnel,
facilities, and utilities), maintenance costs, the cost of test and support equipment, retirement
and disposal costs, technical data costs, the cost of training operations and maintenance
personnel and the cost of spares, inventory, and other support materials.

Clearly, reducing the cost of ownership is critical if equipment is to be cost-effective. The
opportunity for creating savings in a product’s life cycle cost decreases dramatically in the
progress from the concept design and advance planning phase to the production and
construction phase. As much as 60% to 70% of the projected life cycle cost can sometimes be
locked in by the completion of the preliminary design phase. This means the greatest impact
on costs comes from decisions made during the early design phases.

3.3 Maintainability Costs


Maintainability is an important factor in the total cost of equipment. An increase in
maintainability can lead to a reduction in operation and support costs. For example, a more
maintainable product lowers maintenance time and operating costs. Furthermore, more
efficient maintenance means a faster return to operation or services, decreasing downtime.

Ways to improve equipment maintainability are:


• Design of built-in test points,
• Use of reduced maintenance parts,
• Increase in automatic test equipment use,
• Increase in self-checking features,
• Easier access for maintenance,
• Improvement in the quality and number of detailed troubleshooting manuals, and
• Discard-at-failure maintenance.

Elements to invest in (elements of investment cost) so as to increase maintainability are:

• Prime equipment,
• System engineering management,
• Repair parts,
• Support equipment,
• Data,
• Training,
• System test and evaluation, and
• New operational facilities.

3.4 Maintainability Design Considerations

A cost-effective and supportable design must take into account the maintainability
considerations that arise at each phase in the life cycle of the system or product. Careful
planning and systematic effort are needed to bring attention to important maintainability
design factors such as maintainability allocation, maintainability evaluation, maintainability
design characteristics, maintainability parameters, and maintainability demonstration.
Each of these factors involves various sub-factors. For example, packaging, standardization,
inter-changeability, human factors, safety, and testing and checkout all play a role in the
final product’s maintainability design characteristics.

The maintainability design characteristics are the features and design characteristics that help
reduce downtime and enhance availability. The goals of maintainability design include
minimizing preventive and corrective maintenance tasks; increasing ease of maintenance;
decreasing support costs; and reducing the logistical burden by decreasing the resources
required for maintenance and support, such as spare parts, repair staff, and support
equipment.

The most frequently addressed maintainability design factors, ranked in descending order,
are: accessibility; test points; controls; labelling and coding; displays; manuals; checklists,
charts and aids; test equipment; tools; connectors; cases; covers and doors; mounting and
fasteners; handles; and safety factors. Other factors are standardization, modular design,
inter-changeability, ease of removal and replacement, indication and location of failures,
illumination, lubrication, test adapters and test hook-ups, servicing equipment, adjustments
and calibrations, installation, functional packaging, fuses and circuit breakers, cabling and
wiring, weight, training requirements, skill requirements, required number of personnel, and
work environments.

(a) Standardization
This important design feature restricts to a minimum the variety of parts and components that
a product or system will need. Standardization should be a central goal of design, because the
use of non-standard parts may lead to lower reliability and increased maintenance.
Some of the primary goals of standardization include: maximizing the use of common parts
in different products; minimizing the number of different types of parts, components,
assemblies and other items; maximizing the use of inter-changeable and standard or
off-the-shelf parts and components; minimizing the number of different models and makes of
equipment in use; controlling and simplifying inventory and maintenance; and reducing
storage problems and the effort spent on part coding and numbering.

The benefits of standardization are:


(i) Reduces manufacturing costs, design time, and maintenance time and cost.
(ii) Reduces the danger of incorrect use of parts.
(iii) Facilitates cannibalising maintenance approaches.
(iv) Reduces procurement, stocking, and training problems.
(v) Leads to greater reliability.
(vi) Reduces errors in wiring and installation caused by variations in characteristics of
similar items.
(vii) Reduces the chance of accidents that stem from wrong and unclear procedures.
(viii) Eliminates the need for special or close-tolerance parts.

(b) Inter-changeability
This is an important maintainability design factor that is made possible through
standardization. Inter-changeability means that, as an intentional aspect of design, any
component, part, or unit can be replaced within a given product or piece of equipment by any
similar component, part, or unit. There are two types of inter-changeability: functional
inter-changeability and physical inter-changeability. In functional inter-changeability, two
specified items serve the same function. In physical inter-changeability, two items can be
mounted, connected and used effectively in the same locations and the same manner.

(c) Modularisation
Modularisation is the division of a system or product into physically and functionally distinct
units to allow removal and replacement. Each system or sub-system, from the highest to the
lowest level, can be designed as a removable entity. Questions of cost, practicality, and
function dictate the degree of modularisation. Modular construction is worthwhile wherever it
will reduce training costs or provide some other concrete benefit. Modularisation allows the
use of disposable modules, which are designed to be discarded rather than repaired once they
fail (because repair is either costly or impractical).

(d) Simplification
Probably the most difficult element of maintainability to achieve, but the most important, is
simplification. Simplification should be the constant goal of design. Even a complex product
or piece of equipment should appear simple and straightforward to the user. A good designer
incorporates important functions of a product into the design itself and uses as few
components as sound design practices will allow.

(e) Accessibility
Accessibility is the relative ease with which a part or piece of equipment can be reached for
service, replacement, or repair. Lack of accessibility is an important maintainability problem
and a frequent cause of ineffective maintenance.

(f) Identification
Adequate labelling or marking of parts, controls, and test points facilitates maintenance
tasks such as replacement and repair. If a repair person is unable to readily identify parts,
test points, or controls, maintenance tasks become more difficult, take longer to perform, and
are more likely to be performed incorrectly.

3.5 Maintainability Tools


Two methods developed to analyze both reliability and maintainability are Failure Mode and
Effects Analysis (FMEA) and Fault Tree Analysis (FTA).
FMEA is a structured qualitative analysis of a system, subsystem, component or function that
highlights potential failure modes, their causes, and the effects of a failure on system
operation. When FMEA also evaluates the criticality of the failure, that is, the severity of the
effect of the failure and the probability of its occurrence, the analysis is referred to as
Failure Mode, Effects, and Criticality Analysis (FMECA) and the failure modes are assigned
priorities. Criticality assessment ranks the potential failures identified during the system
analysis according to the severity of their effects and the likelihood of their occurrence.
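
The ranking idea behind criticality assessment can be sketched in Python. The manual does not
prescribe a scoring formula, so the rule used here (severity rating multiplied by occurrence
rating, each on an assumed 1 to 10 scale) is only one simple possibility, and the failure modes
listed are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    mode: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (remote) to 10 (very likely)

    @property
    def criticality(self) -> int:
        # Assumed scoring rule: severity multiplied by likelihood.
        return self.severity * self.occurrence


def rank_by_criticality(modes):
    """Return failure modes sorted from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# Hypothetical failure modes for a small pump set.
modes = [
    FailureMode("pump seal", "leak", severity=4, occurrence=7),
    FailureMode("motor winding", "short circuit", severity=8, occurrence=3),
    FailureMode("pressure sensor", "loss of output", severity=6, occurrence=5),
]

for priority, m in enumerate(rank_by_criticality(modes), start=1):
    print(priority, m.item, m.mode, m.criticality)
```

With these made-up ratings the pressure sensor ranks first even though the motor fault is the
most severe, because the ranking weighs severity and likelihood together.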

There are three distinct types of FMECA: System level, design level, and process level
FMECA. Of these three levels, the highest level of analysis is the system level FMECA,
which usually consists of a collection of subsystem FMECAs.

Performed in the initial design concept phase, the system level FMECA highlights potential
system or subsystem failures so that they can be prevented. The design level FMECA helps
identify and prevent failures stemming from the product design. It analyzes the design that
has been developed and examines how failures of individual items would affect the system
functioning or operation. The purpose of the process level FMECA is to analyze the process
by which the product or system is to be built and assess how potential failures in the
manufacturing or service process would affect the product/system functioning or operation.
All three types of FMECA consist of the following basic steps:
• Understanding system parts, operation, and mission.
• Identifying the hierarchical, or indenture, level at which the analysis is to be
performed.
• Defining each item expected to be analysed – for example, component, module, or
subsystem.
• Establishing associated ground rules and assumptions – for example, system mission
and operational phases.
• Identifying possible failure modes for each item.
• Determining the effect of each item’s failure for every possible failure mode.
• Determining the effect of group failures – failures of more than one item – on system
operation and mission.
• Identifying methods, procedures, or approaches for detecting potential failures.
• Determining any provisions or design changes that would prevent failures or mitigate
their effects.

1. Failure Mode and Effects Analysis (FMEA)


This is a reliability analysis technique that also applies to maintainability analysis. The
technique systematically determines the basic causes of failure and defines measures to reduce
their effects. Furthermore it can be applied to any system level. The failure mode is the
specific way in which the item fails to carry out its intended mission. The failure cause is the
reason the failure took place. The failure effect is the result of the failure for each failure
mode.
• Failure Mode: Examples are open or short circuits, reduced output, loss of function,
and loss of output.
• Failure Cause: Examples are wear, vibration, contamination, and voltage surge.
• Failure Effect: Examples are loss of communication, mission abort, reduced control,
and injury or damage to personnel or equipment.

FMEA is basically a qualitative approach to determining the reliability, maintainability, and
safety of a given design by taking into consideration potential failures and their resulting
effects. The seven major steps in performing FMEA are:
• Define the boundaries and detailed requirements for the system or piece of equipment
under consideration.
• List all of its components.
• List all possible failure modes, describe each, and identify the component that would
be involved.
• Assign a failure rate to each component failure mode.
• List the effects of each failure mode.
• Enter remarks for each failure mode.
• Review each critical failure mode and take appropriate action.
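
The seven steps above map naturally onto a worksheet with one row per failure mode. The
following Python sketch shows one possible layout. The field names, the failure-rate unit, the
"critical" flag set by the analyst, and the conveyor-drive example are all assumptions made for
illustration.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    component: str          # step 2: component list
    failure_mode: str       # step 3: how the component fails
    failure_rate: float     # step 4: assumed unit, failures per 1000 hours
    effect: str             # step 5: effect of the failure mode
    remarks: str = ""       # step 6: free-text remarks
    critical: bool = False  # set by the analyst; reviewed in step 7


def modes_for_review(rows):
    """Step 7: pick out the critical failure modes, highest rate first."""
    critical = [r for r in rows if r.critical]
    return sorted(critical, key=lambda r: r.failure_rate, reverse=True)


# Step 1 (system boundary) is assumed here: a hypothetical conveyor drive.
worksheet = [
    WorksheetRow("drive belt", "reduced output", 0.20, "slower line speed",
                 remarks="inspect monthly"),
    WorksheetRow("motor", "short circuit", 0.05, "damage to equipment",
                 remarks="fit overload protection", critical=True),
    WorksheetRow("speed sensor", "loss of output", 0.08, "reduced control",
                 remarks="add a self-check", critical=True),
]

for row in modes_for_review(worksheet):
    print(row.component, row.failure_mode, row.failure_rate, row.remarks)
```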

In using this analysis, the effects identified can be quite different depending on the
objective of the analysis. For example:
• In Reliability Analysis. The effect considered is the effect on the system’s or
equipment’s performance or ability to function.
• In Maintainability Analysis. The effects considered include the symptoms through
which the failure can be pinpointed and the components that will require replacement
as the result of the failure.
• In Safety Analysis. The effects to be considered are damage to other systems and
equipment and possible danger to people.

Some of the advantages of the FMEA method are that it employs a systematic procedure to
categorize hardware failures and identifies all possible failure modes and their effects on
performance, personnel, and equipment. It is useful for comparing designs, simple to
understand, and helps identify methods of detecting the various possible failures.

3.6 Maintainability and Safety


Safety means either freedom from hazards or protection against hazards. It is one of the most
important factors in designing for maintainability. As individuals perform maintenance tasks,
they are exposed to hazards or accidents. Many of these hazards and accidents are due to
careless design or design that does not give adequate attention to human factors and safety
features. Other factors include hazardous environmental conditions and the creation of
hazards by maintenance and operating personnel themselves when they perform their
assigned tasks carelessly. The key to overcoming many of these difficulties is to “design in”
safety features that will protect operators, maintenance personnel, and the equipment itself.
(It should be remembered that equipment that is dangerous to people is by definition not
maintainable).

During the equipment design phase, professionals have many methods at their disposal for
eradicating or minimizing hazards to people and equipment. The basic objective of all these
approaches is hazard identification and control. Some of the safety analysis techniques are
hazard analysis, failure mode and effects analysis, and fault tree analysis.

Hazard Analysis Method – This safety analysis tool determines the safety requirements for
people, procedures, and equipment used in testing, operations, maintenance, and logistic
support. This method also determines the compliance of system and equipment with specified
safety requirements and criteria. (For FMEA and FTA see elsewhere)

Comparison of FMEA and FTA


No.  FMEA                                     FTA
1.   It is a hardware-oriented method.        It is an event-oriented approach.
2.   It has a broader scope with restricted   It has a restricted scope with
     depth of analysis.                       in-depth analysis.
3.   It is an optimum approach for single     It is an optimum approach for
     failures.                                multiple failures.
4.   It provides documentation to ensure      It does not require analysis of
     that each and every potential single     failures that have no effect on the
     failure has been investigated.           operation under investigation.
5.   It does not require investigation of     It highlights all external influences
     all external influences.                 contributing to loss, for example,
                                              environment, test procedures, and
                                              human errors.

3.7 Safety and Human Behaviour


The safety of the people who operate and maintain equipment is of utmost importance.
During the design phase, appropriate information on human behaviour can ultimately lead to
safer and more maintainable design. The following are important measures for reducing
accidents due to human error:
• Designing error-free mating parts
• Providing correct tools and making regular adjustments to safety equipment
• Developing effective support procedures
• Properly inspecting all tasks
• Making each individual conscious of the hazards involved in his/her assignments
• Making workers safety conscious
• Providing proper training for performing tasks.

3.8 General Maintainability Design Guidelines


Some of the important general maintainability design guidelines are:
(i) Design to minimise requirements for tools, maintenance skills, adjustments, and
other aspects of maintenance.
(ii) Group subsystems for easy location and identification
(iii) Provide troubleshooting techniques, test points, etc.
(iv) Use standard parts to the extent possible
(v) Provide for visual inspection
(vi) Avoid the use of large cable connectors
(vii) Use plug-in modules
(viii) Design for safety.

3.9 Maintainability of New Equipment


What to consider for the maintainability of a new system:
(a) Speed of repairs:
• Ensure that maintenance and repairs can be carried out easily.
• Ensure that spares are easily available. It may be costly to have the whole
system out of service merely because a vital spare cannot be obtained. On the
other hand it is expensive to hold a running stock of essential spares.
• Ensure that maintenance personnel with the necessary skills are easily
available.
(b) Performance after repairs: Some delicate or complex pieces of equipment do not
appreciate the disturbance caused by a major repair and never really work as well
afterwards – hence it is important to purchase a reliable system in the first place.

3.10 Design for Ease of Operation

(a) The Competence of the Operator:


In designing equipment, we must consider how skilled or otherwise the operators are likely
to be. The system should be designed to be as easy as possible to operate correctly, and as
difficult as possible to operate incorrectly. If two controls should never be operated
simultaneously then it helps to interlock them, so that the operation of either puts the other
out of action. Alternatively the controls may be spaced so far apart that one operator cannot
reach both at the same time. Above all, it is essential to devise the equipment so that no
injury or damage is caused by any mistakes that are made.

(b) The effect of fatigue and working conditions:


The more tiring equipment is to operate the greater the risk of a mistake and the shorter the
period for which an operator can work without a rest. Thus:

• Constant bending, either to pick up materials or to operate very low controls, is
unnecessarily tiring.
• Heavy weights should be moved mechanically rather than by the operator’s own
exertion.
• If the equipment generates excessive heat, it is in principle better to design so that
it is not dissipated in the direction of the operator, rather than to leave the
customer to provide protective clothing. The latter is usually uncomfortable, and
so may not always be worn.

(c) The layout of controls and tools:


Where tools must be used in a particular order it is a great help if they are presented to the
operator in that order. Controls and tools should be arranged logically. It helps if all the
controls in a particular group are different, so that the operator knows by feel which one he
or she is holding. Controls should be labelled as simply as possible, but with enough
information so that the operator understands at once what they do, without consulting the
instruction manual to find out.

(d) Indicators:
Indicators include dials, gauges, lamps, bells, etc., which show the working state of the
equipment. The majority are either visual or aural. Thus a light may come on or a bell may
ring to warn the operator that a particular cycle of manufacture is complete. A light to show
that a particular part of the system is on or off is cheaper and easier to observe than a
meter. However, a meter conveys more information and should therefore be specified whenever
the operator needs that information.

(e) The arrangement of meter dials on the display panel:


Meter dials are easiest to read if they are approximately at eye level. When they are very
high or low they are not only tiring to read, but there will inevitably be errors due to
parallax. Avoid highly reflective surfaces, such as chromed meter rims or shiny instrument
panels, since these may dazzle the operator.

Ensure that meters are calibrated directly in the units to be observed. A dial should be
calibrated so that it can be easily read to the required accuracy. Digital-type meters, which
display actual numbers, are preferable as they are quicker to read and less prone to operator
error.

(f) Electrical Controls and Responses:


Frequently a control is related to a particular meter, so that if, say, the voltage shown is low,
the operator turns up the knob, which controls it. In such cases:
• The knob should turn the same way as the needle, so that turning it clockwise
causes the needle to go clockwise.
• The knob and the dial should be placed as close together as possible, and there
should be no doubt which knob controls which dial, e.g. the control should not be
round the back of the equipment when the dial is in front.
• There should not be too much delay between the alteration of the control and the
response of the meter.

(g) Mechanical controls and Responses:


As with electrical systems, the same principles apply to mechanical equipment:
• Wheels and levers should move logically with respect to the function they control.
• Controls should be placed where the operator can see at once the effect of
operating them. He should not have to walk round to see whether something which he
has started up is in fact running.
• Response to the operation of controls should be quick enough to avoid hunting
(i.e. overshooting or undershooting caused by delay).
• Try to avoid controls which require considerable physical effort to operate.

NB: Most controls tend to use a mixture of mechanical and electrical/electronic
principles.

(h) Alarm signals:


In both electrical and mechanical equipment we should include alarms to warn the operator
when all is not well. In some cases (where delay in shutdown can be serious) an alarm as
well as an automatic shutdown will be required, because once the plant has stopped,
adjustments and repairs must be put in hand immediately to get it working again.

(i) Ergonomics or human engineering:


Fit the job to the operator’s natural abilities rather than assuming that the operator will
somehow cope with badly designed equipment.

(j) Imaginary Malfunctions:


Sometimes when a system does not appear to work properly, it is merely because it has not
been correctly operated. The operator may not realise this. This is because the operator’s
skill is not quite equal to the demands of the system. Designing the system so that it is as
easy as possible to operate can help to prevent this.

3.11 Design for Ease of Maintenance

(a) The Importance of Availability


When a breakdown occurs, the customer’s chief interest will be to get the system back into
service with a minimum of delay and expense; he is more likely to be concerned with its
availability than with its reliability. Hence we must consider how to make the system quick
and easy to service.

(b) Fault Location Routine


A set routine for fault location is essential; otherwise a lot of time can be wasted. A
fault-finding routine should therefore be incorporated in the product design. It should be as
simple as possible, and set out on a step-by-step basis. The approach will probably be first to
determine which block in the system contains the fault, then which unit within that block is
faulty, and so on, so that in as few steps as possible one is able to locate the actual part
which has failed.
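
The block, then unit, then part approach can be sketched as a walk down a hierarchy, testing
one level at a time. In the Python fragment below the nested dictionary, the component names,
and the is_faulty test (which stands in for a real measurement at a check point) are all
hypothetical.

```python
def locate_fault(node, is_faulty, path=None):
    """Walk down a system hierarchy, one level per step, to the failed part.

    node      -- nested dict {name: sub-dict}; an empty dict marks a part
    is_faulty -- callable(name) -> bool, standing in for a real test
    Returns the list of names from block down to the failed part.
    """
    path = path or []
    for name, children in node.items():
        if is_faulty(name):
            if not children:  # reached an individual part
                return path + [name]
            return locate_fault(children, is_faulty, path + [name])
    return path  # nothing at this level tested faulty


# Hypothetical system: blocks contain units, units contain parts.
system = {
    "power supply": {
        "rectifier": {"diode D1": {}, "diode D2": {}},
        "regulator": {"IC1": {}},
    },
    "amplifier": {
        "input stage": {"transistor Q1": {}},
        "output stage": {"transistor Q2": {}},
    },
}

# Pretend test results: which elements show a fault when checked.
faulty = {"amplifier", "output stage", "transistor Q2"}

print(locate_fault(system, lambda name: name in faulty))
# ['amplifier', 'output stage', 'transistor Q2']
```

Each level of the hierarchy costs only a handful of tests, which is why a routine of this kind
reaches the failed part in far fewer steps than checking every part in turn.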

(c) Provision for Fault Location


Having decided roughly what the fault location routine is to be, the designer must consider
how he can make it easy to carry out. Possibly the best solution is to arrange for lights and
meters on the control panel, to indicate precisely where a fault is. The next possibility is to
provide check points, so that the required test equipment can be plugged in. It may also be
desirable to provide alternative units which can be plugged in and so allow the maintenance
engineer to eliminate sections of the actual system in turn.

(d) Test Equipment


The test equipment should be designed to be as reliable as possible, using techniques like
redundancy where necessary, and it should be as easy as possible to check and maintain.

(e) Fault Correction


Once a fault has been located in a system, then repairs and adjustments must be carried out
and again the designer can help to make this as easy as possible. Thus:
i) Parts that have high failure rates must be easy to identify and get at.
ii) Fixing devices must be easy to release and reassemble.
iii) Nuts, bolts and screws in awkward places should be avoided. It is difficult to
insert or tighten a screw in a position which one cannot see or perhaps cannot even
feel directly with the fingers.
iv) Plugs and sockets are attractive, because they make dismantling and reassembly
easier, although their own reliability may present a problem.
v) Make sure that visual inspection is as easy as possible, and that parts such as nuts and
screws do not immediately fall down inside when released.
vi) Where appropriate, make it possible to replace a complete unit, so that the faulty
one can be taken away for repair.

(f) Working conditions of the Maintenance Staff


The designer must try to anticipate the actual conditions under which maintenance may have
to be done. Thus:
• The skill of the maintenance staff may be very limited indeed, and in remote areas
there may be no one to whom they can turn for help, especially with mobile systems,
e.g. vehicles.
• The maintenance equipment available may be similarly limited.
• The repair workshop at base probably has reasonable working conditions, but out on
the job they may be most unreasonable.
• Consider what spares are likely to be available, remembering that it is much more
difficult to obtain special parts than standard ones.

(g) Instruction Manuals


A system must have a clear, concise instruction manual. However, the information in the
instruction manual should be kept to a minimum while remaining clear. Thus:
• The method of operation should, as far as possible, be clear without reference to the
manual. Controls should be arranged and labelled so that their use is self-explanatory.
• The same applies to maintenance. It is much better for adjustment points to be readily
visible and accessible, than to have to consult the instruction manual to find out where
they are hidden.

Thus the objective should be to make manuals as nearly unnecessary as possible, or to keep
their text very short, by:
• Using clear diagrams wherever they would be useful. A simple sketch often saves a
lot of writing, which the operator may not fully understand.
• Trying to put a diagram on the same page as the text to which it relates.
• Keeping the wording concise and simple. Avoid technical terms which may mean
little to the operating and maintenance staff.
• Making sure that it is easy to locate any piece of information which may be required.
• Providing a good cross-referenced index, so that users can find what they want even
though their terminology may be somewhat different from that used by the
manufacturer.

(h) After-Sales Service


The complexity of modern equipment makes it increasingly impracticable for customers to do
more than routine maintenance, and some customers anyway prefer to rely upon the services
of the manufacturing company. Hence after-sales services must be provided.

(i) Stock control of Spares

Spare parts and replacement parts should be stocked and supplied for models in current
production as well as for those which have gone out of production but are still used by
customers.

3.12 Design for Serviceability

Serviceability is concerned with the ease with which maintenance can be performed on a
product. Many products require some form of maintenance or service to keep them
functioning properly. Products often have parts that are subject to wear and that are expected
to be replaced at periodic intervals.

There are two general classes of maintenance:


• Preventive maintenance is routine service required to prevent operating failures, such
as changing the oil in your car.
• Breakdown maintenance is the service that must take place after some failure or
decline in function has occurred.

It is important to anticipate the required service operations during the design of the product.
Provision must be made for disassembly and assembly. For example a design that requires
the removal of a body panel of an automobile to access the oil filter is inappropriate. It should
also be remembered that service usually will be carried out in "the field" where special tools
and fixtures used in factory assembly will not be available.

A concept closely related to serviceability is testability. This is concerned with the ease with
which faults can be isolated in defective components and subassemblies. In complicated
electronic and electromechanical products, testability must be designed into the product.

The best way to improve serviceability is to reduce the need for service by improving
reliability. Reliability is the probability that a system or component will perform without
failure for a specified period of time. Failing this, the product must be designed so that
components that are prone to wear or failure, or which require periodic maintenance, are
easily visible and accessible. It means making covers, panels, and housings easy to remove
and replace. It means locating components that must be serviced in accessible locations. If
possible, press fits, adhesive bonding, riveting, welding, or soldering should be avoided for
parts that must be removed for service. Modular design is a great boon to serviceability.

4.0 Plant Maintenance


Maintaining the production capability of an organization is an important function in any
production system. Maintenance encompasses all those activities that relate to keeping
facilities and equipment in good working order and making necessary repairs when
breakdowns occur, so that the system can perform as intended.

Maintenance activities are often organized into two categories:


(1) Buildings and grounds, and
(2) Equipment maintenance.

Buildings and grounds is responsible for the appearance and functioning of buildings, parking
lots, lawns, fences, and the like. Equipment maintenance is responsible for maintaining
machinery and equipment in good working condition and making all necessary repairs. The
goal of maintenance is to keep the production system in good working order at minimal cost.
Decision makers have two basic options with respect to maintenance:

• One option is reactive: It is to deal with breakdowns or other problems when they
occur. This is referred to as breakdown maintenance.
• The other option is proactive: It is to reduce breakdowns through a program of
lubrication, adjustment, cleaning, inspection, and replacement of worn parts. This is
referred to as preventive maintenance.

Decision makers try to make a trade-off between these two basic options that will minimize
their combined cost. With no preventive maintenance, breakdown and repair costs would be
tremendous. Furthermore, hidden costs, such as lost production and the cost of wages while
equipment is not in service must be factored in. So must the cost of injuries or damage to
other equipment and facilities or to other units in production. However, beyond a certain
point, the cost of preventive maintenance activities exceeds the benefit.

As an example, if a person never had the oil changed in his or her car, never had it lubricated,
and never had the brakes or tires inspected, but simply had repairs done when absolutely
necessary, preventive costs would be negligible but repair costs could be quite high,
considering the wide range of parts (engine, steering, transmission, tires, brakes, etc.) that
could fail. In addition, property damage and injury costs might be incurred, plus there would
be the uncertainty of when failure might occur (e.g., on the expressway during rush hour, or
late at night). On the other hand, having the oil changed and the car lubricated every morning
would obviously be excessive because automobiles are designed to perform for much longer
periods without oil changes and lubrications.
The best approach is to seek a balance between preventive maintenance and breakdown
maintenance. The same concept applies to maintaining production systems: Strike a balance
between prevention costs and breakdown costs. This concept is illustrated in the figure below.
The age and condition of facilities and equipment, the degree of technology involved, the
type of production process, and similar factors enter into the decision of how much
preventive maintenance is desirable.
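
The trade-off can be illustrated numerically. The sketch below assumes, purely for
illustration, that preventive cost rises in proportion to the amount of preventive maintenance
while breakdown cost falls in inverse proportion to it. The cost functions, the unit costs, and
the range of levels are hypothetical and are not taken from plant data.

```python
def preventive_cost(level, unit_cost=100.0):
    # Assumed: cost grows in proportion to the amount of preventive work.
    return unit_cost * level


def breakdown_cost(level, base_cost=2500.0):
    # Assumed: breakdown and repair cost falls as preventive work increases.
    return base_cost / level


def total_cost(level):
    return preventive_cost(level) + breakdown_cost(level)


levels = range(1, 21)  # hypothetical "amounts" of preventive maintenance
optimum = min(levels, key=total_cost)

print(optimum)              # 5
print(total_cost(optimum))  # 1000.0
```

Below the optimum level, breakdown costs dominate the total. Above it, the extra preventive
work costs more than the breakdowns it avoids.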

Thus, in the example of a new automobile, little preventive maintenance may be needed since
there is slight risk of breakdowns. As the car ages and becomes worn through use, the
desirability of preventive maintenance increases because the risk of breakdowns increases.
Thus, when tires and brakes begin to show signs of wear, they should be replaced before they
fail; dents and scratches should be periodically taken care of before they begin to rust; and the
car should be lubricated and have its oil changed after exposure to high levels of dust and
dirt. Also, inspection and replacement of critical parts that tend to fail suddenly should be
performed before a road trip to avoid disruption of the trip and costly emergency repair bills.

[Figure: cost plotted against the amount of preventive maintenance, with the optimum marked]

4.1 Preventive Maintenance
The goal of preventive maintenance is to reduce the incidence of breakdowns or failures in
the plant or equipment to avoid the associated costs. Those costs can include loss of output;
idle workers; schedule disruptions; injuries; damage to other equipment, products, or
facilities; and repairs, which may involve maintaining inventories of spare parts, repair tools
and equipment, and repair specialists.

Preventive maintenance is periodic. It can be scheduled according to the availability of
maintenance personnel and to avoid interference with operating schedules. Preventive
maintenance is generally scheduled using some combination of the following:
i) The result of planned inspections that reveal a need for maintenance,
ii) The calendar (passage of time), and
iii) A predetermined number of operating hours.

Ideally, preventive maintenance will be performed just prior to a breakdown or failure
because this will result in the longest possible use of facilities or equipment without a
breakdown. Predictive maintenance is an attempt to determine when to perform preventive
maintenance activities. It is based on historical records and analysis of technical data to
predict when a piece of equipment or part is about to fail. The better the predictions of
failures are, the more effective preventive maintenance will be. A good preventive
maintenance effort relies on complete records for each piece of equipment. Records must include
information such as date of installation, operating hours, dates and types of maintenance, and
dates and types of repairs.

Predictive maintenance: An attempt to determine when best to perform preventive
maintenance activities.

Some Japanese companies have workers perform preventive maintenance on the machines
they operate, rather than use separate maintenance personnel for that task. Called total
preventive maintenance, this approach is consistent with JIT systems and lean production,
where employees are given greater responsibility for quality, productivity, and the general
functioning of the system.

Total preventive maintenance: JIT approach where workers perform preventive
maintenance on the machines they operate.

4.2 Breakdown Programs


The risk of a breakdown can be greatly reduced by an effective preventive maintenance
program. Nonetheless, occasional breakdowns still occur. Even firms with good preventive
practices have some need for breakdown programs. Of course, organizations that rely less on
preventive maintenance have an even greater need for effective ways of dealing with
breakdowns.

Unlike preventive maintenance, breakdowns cannot be scheduled but must be dealt with on
an irregular basis (i.e., as they occur). Among the major approaches used to deal with
breakdowns are the following:
i) Standby or backup equipment that can be quickly pressed into service,
ii) Inventories of spare parts that can be installed as needed, thereby avoiding lead times
involved in ordering parts, and buffer inventories, so that other equipment will be less
likely to be affected by short-term downtime of a particular piece of equipment,
iii) Operators who are able to perform at least minor repairs on their equipment, and
iv) Repair people who are well trained and readily available to diagnose and correct
problems with equipment.

The degree to which an organization pursues any or all of these approaches depends on how
important a particular piece of equipment is to the overall production system. At one extreme
is equipment that is the focal point of a system (e.g., printing presses for a newspaper, or vital
operating parts of a car, such as brakes, steering, transmission, ignition, and engine). At the
other extreme is equipment that is seldom used because it does not perform an important
function in the system, and equipment for which substitutes are readily available.

The implication is clear: Breakdown programs are most effective when they take into account
the degree of importance a piece of equipment has in the production system, and the ability of
the system to do without it for a period of time.

4.3 Replacement and Maintenance


When breakdowns become frequent and/or costly, the manager is faced with a trade-off
decision in which costs are an important consideration: What is the cost of replacement
compared with the cost of continued maintenance? This question is sometimes difficult to
resolve, especially if future breakdowns cannot be readily predicted. Historical records may
help to project future experience. Another factor is technological change; newer equipment
may have features that favour replacement over either preventive or breakdown maintenance.
On the other hand, the removal of old equipment and the installation of new equipment may
cause disruptions to the system, perhaps greater than the disruptions caused by breakdowns.
Also, employees may have to be trained to operate the new equipment.

Finally, forecasts of future demand for the use of the present or new equipment must be taken
into account. The demand for the replacement equipment might differ because of the different
features it has. For instance, demand for output of the current equipment might last only two
years, while demand for output of the replacement equipment might last much longer.

These decisions can be fairly complex, involving a number of different factors. On the other
hand, most of us are faced with a similar decision with our personal automobiles:
When is it time for a replacement? (The answer is given in section ………….)

4.4 Maintenance Models

Many mathematical models have been developed to better define and predict aspects of
maintenance. Some of the models available to assist in making decisions concerning product
maintenance are:

(a) Maintenance Model I – This model determines the optimum number of inspections
per facility per unit of time. An inspection is often disruptive, but it usually decreases
downtime because it means fewer breakdowns. The total downtime is expressed by:
\[T_D = x\,T_{pf} + \frac{k\,T_{Bf}}{x}\]
where
\(T_D\) is the total downtime per unit of time for a facility.
\(x\) is the number of inspections per facility per unit of time.
\(k\) is a constant for a specific facility.
\(T_{pf}\) is the downtime per inspection for a facility.
\(T_{Bf}\) is the downtime per breakdown for a facility.

Differentiating the equation with respect to \(x\) and setting the result equal to zero, we get:
\[\frac{dT_D}{dx} = T_{pf} - \frac{k\,T_{Bf}}{x^{2}} = 0, \qquad \text{therefore} \qquad x^{*} = \left[\frac{k\,T_{Bf}}{T_{pf}}\right]^{1/2}\]
where \(x^{*}\) is the optimum number of inspections per facility per unit of time.
Substituting \(x^{*}\) back into the expression for \(T_D\), the optimum total downtime per unit
of time for a facility is given by:
\[T_D^{*} = 2\sqrt{k\,T_{pf}\,T_{Bf}}\]

Example:
Assume that the following data are associated with a piece of engineering equipment:
k = 2; Tpf = 0.009 month; TBf = 0.2 month

Determine the optimum number of inspections per month.

\[x^{*} = \left[\frac{2(0.2)}{0.009}\right]^{1/2} = 6.67\]
The optimum number of inspections per month for the piece of engineering
equipment is 6.67.
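
The calculation can be checked numerically. The following is a minimal Python sketch, not part of the original manual; the function names are illustrative assumptions, and the data are those of the example above.

```python
import math

def total_downtime(x, k, t_pf, t_bf):
    """Total downtime per unit of time for x inspections (Model I)."""
    return x * t_pf + k * t_bf / x

def optimum_inspections(k, t_pf, t_bf):
    """Return (x*, T_D*) from the closed-form results of Model I."""
    x_star = math.sqrt(k * t_bf / t_pf)
    td_star = 2 * math.sqrt(k * t_pf * t_bf)
    return x_star, td_star

# Data from the example: k = 2, Tpf = 0.009 month, TBf = 0.2 month
x_star, td_star = optimum_inspections(k=2, t_pf=0.009, t_bf=0.2)
print(f"optimum inspections per month: {x_star:.2f}")
print(f"minimum downtime per month:    {td_star:.3f}")
```

Evaluating total_downtime at x_star returns the same value as td_star, which confirms that the two closed-form expressions agree.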

(b) Maintenance Model II – This model determines the optimum time interval between
replacements. The goal is to minimize average annual total cost with respect to the
time between replacements or the life of the equipment in years. The average cost
consists of three elements: mean investment cost, mean maintenance cost, and mean
operating cost. The total average cost is
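\[C_T = \mathrm{OC}_1 + \mathrm{MC}_1 + \frac{\mathrm{IC}}{x} + \frac{x-1}{2}\,(\alpha_{oc} + \alpha_{mc}) \qquad \text{(i)}\]
where
\(C_T\) is the average total cost.
\(\mathrm{OC}_1\) is the equipment’s operational cost for the first year.
\(\mathrm{MC}_1\) is the equipment’s maintenance cost for the first year.
\(\mathrm{IC}\) is the cost of investment.
\(x\) is the equipment life expressed in years.
\(\alpha_{oc}\) is the amount by which operational cost increases per year.
\(\alpha_{mc}\) is the amount by which maintenance cost increases per year.

Differentiating equation (i) with respect to \(x\) and setting the result equal to zero leads to
\[\frac{dC_T}{dx} = \frac{\alpha_{oc} + \alpha_{mc}}{2} - \frac{\mathrm{IC}}{x^{2}} = 0\]
Therefore, the optimum replacement interval is
\[x^{*} = \left[\frac{2\,\mathrm{IC}}{\alpha_{oc} + \alpha_{mc}}\right]^{1/2}\]
Substituting \(x^{*}\) in equation (i) gives the minimum average annual total cost:
\[C_T^{*} = \mathrm{OC}_1 + \mathrm{MC}_1 - \frac{\alpha_{oc} + \alpha_{mc}}{2} + \left[2\,\mathrm{IC}\,(\alpha_{oc} + \alpha_{mc})\right]^{1/2}\]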

Example: The following data apply to an engineering system:

\(\alpha_{oc}\) = $1,000, \(\alpha_{mc}\) = $500, IC = $50,000

Determine the optimum replacement interval.

\[x^{*} = \left[\frac{2(50{,}000)}{1{,}000 + 500}\right]^{1/2} = 8.16 \text{ years}\]
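
As a hedged illustration, the short Python sketch below reproduces this result. The function names are assumptions made for the sketch. Because the example does not give first-year costs, OC1 and MC1 are set to zero here purely for illustration; they shift the cost level but do not affect the optimum interval.

```python
import math

def average_total_cost(x, oc1, mc1, ic, alpha_oc, alpha_mc):
    """Average annual total cost for an equipment life of x years, equation (i)."""
    return oc1 + mc1 + ic / x + (x - 1) / 2 * (alpha_oc + alpha_mc)

def optimum_replacement_interval(ic, alpha_oc, alpha_mc):
    """Closed-form optimum replacement interval x* in years."""
    return math.sqrt(2 * ic / (alpha_oc + alpha_mc))

# Data from the example
ic, alpha_oc, alpha_mc = 50_000, 1_000, 500
x_star = optimum_replacement_interval(ic, alpha_oc, alpha_mc)

# OC1 = MC1 = 0 is an assumption for illustration only
cost_at_optimum = average_total_cost(x_star, 0, 0, ic, alpha_oc, alpha_mc)
closed_form = -(alpha_oc + alpha_mc) / 2 + math.sqrt(2 * ic * (alpha_oc + alpha_mc))

print(f"optimum replacement interval: {x_star:.2f} years")
print(f"cost from equation (i): {cost_at_optimum:.2f}, closed form: {closed_form:.2f}")
```

The two cost figures coincide, which is a useful check that the expression for the minimum cost is consistent with equation (i).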

(c) Maintenance Model III – This model is concerned with a parallel system composed
of k identical machines or pieces of equipment, with output fed into the next stage of
the production process. For system success, at least one machine must function
normally. The model finds the value of k that minimizes the total cost of system
operation and downtime losses.
Using queuing theory, the average proportion of a unit of time during which the
parallel system is unavailable is
\[\mathrm{UA} = \left[\frac{\lambda}{\lambda + \mu}\right]^{k}\]
where
UA is the system unavailability.
\(\lambda\) is the constant machine failure rate.
\(\mu\) is the constant machine repair rate.

Thus the total cost, TC, is given by:
\[\mathrm{TC} = (\mathrm{UA})(\mathrm{DC}) + k(\mathrm{MOC}) = \left[\frac{\lambda}{\lambda + \mu}\right]^{k}(\mathrm{DC}) + k(\mathrm{MOC})\]
where
DC is the downtime cost per unit of time.
MOC is the single machine’s operational cost per unit of time.
Differentiating with respect to k and setting the result equal to zero leads to
\[\frac{d(\mathrm{TC})}{dk} = \left[\frac{\lambda}{\lambda + \mu}\right]^{k} \ln\!\left[\frac{\lambda}{\lambda + \mu}\right](\mathrm{DC}) + \mathrm{MOC} = 0\]
Thus the optimum number of machines to be used in the parallel configuration for
minimum total cost is
\[k^{*} = \frac{\ln\left[-\mathrm{MOC}/(\mathrm{DC}\,\ln \mathrm{UA}_1)\right]}{\ln \mathrm{UA}_1}\]
where \(\mathrm{UA}_1 = \lambda/(\lambda + \mu)\).
Example: The following data apply to an engineering system used to produce certain
mechanical parts:
λ = 4 failures per month, μ = 10 repairs per month, MOC = $150, and
DC = $1,500

Determine the optimum number of machines to be used in the parallel configuration
to minimize total cost.

\(\mathrm{UA}_1 = 4/(4 + 10) = 0.2857\), and
\[k^{*} = \frac{\ln\left[-150/(1{,}500 \ln 0.2857)\right]}{\ln 0.2857} = 2.018,\]
say two machines. Thus using two machines in parallel configuration minimizes the
total cost of the system.
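
Because k must be a whole number, it is worth confirming the rounded answer by evaluating the total cost directly for neighbouring values of k. The Python sketch below does this under the data of the example; the function names are illustrative assumptions.

```python
import math

def total_cost(k, lam, mu, dc, moc):
    """Total cost per unit of time for k identical machines in parallel (Model III)."""
    ua1 = lam / (lam + mu)          # unavailability of a single machine
    return ua1 ** k * dc + k * moc

def optimum_machines(lam, mu, dc, moc):
    """Continuous optimum k* from setting d(TC)/dk = 0."""
    ua1 = lam / (lam + mu)
    return math.log(-moc / (dc * math.log(ua1))) / math.log(ua1)

# Data from the example
lam, mu, dc, moc = 4, 10, 1_500, 150
k_star = optimum_machines(lam, mu, dc, moc)

# Compare the total cost for whole numbers of machines around k*
costs = {k: total_cost(k, lam, mu, dc, moc) for k in (1, 2, 3)}
best_k = min(costs, key=costs.get)

print(f"continuous optimum k* = {k_star:.3f}")
for k, c in costs.items():
    print(f"k = {k}: total cost = {c:.2f}")
print(f"cheapest whole number of machines: {best_k}")
```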

(d) Maintenance Model IV – This model determines the optimum replacement time for
an item under ordinary periodic replacement policy. Under this policy an item is
replaced with a new one every xp accumulated hours of operation. If the item
malfunctions prior to xp hours, it is repaired only minimally so that its instantaneous
failure rate, λ(x), corresponding to its probability density function, f(x), remains the
same as it was before failure. It is assumed that each failure is detected
instantaneously and the minimal repair time is negligible.

The cost function for this model is expressed by
\[K = \frac{C_{pr} + C_{mr}\,E[\alpha(x_p)]}{x_p}\]
where
\(K\) is the cost per unit per operating hour.
\(C_{mr}\) is the cost of minimal repair.
\(C_{pr}\) is the cost associated with planned preventive replacement.
\(E[\alpha(x_p)]\) is the expected number of failures followed by minimal repair activity during
an interval \(x_p\).
Thus we have
\[E[\alpha(x_p)] = \int_0^{x_p} \lambda(x)\,dx, \qquad \lambda(x) = \frac{f(x)}{R(x)}\]
where
\(f(x)\) is the failure probability density function of a unit.
\(R(x)\) is the reliability function of a unit.
\(\lambda(x)\) is the time-dependent failure rate of a unit.

Therefore,
\[K = \frac{C_{pr} + C_{mr}\int_0^{x_p} \lambda(x)\,dx}{x_p}\]

Example: A mechanical unit receives preventive maintenance per the policy
described, and the Rayleigh probability density function expresses its times to failure
as follows:
\[f(x) = \frac{2x}{(20)^{2}} \exp\left[-\left(\frac{x}{20}\right)^{2}\right]\]
Assume that the planned preventive replacement cost is $10 and the minimal repair
cost is $50. Calculate the optimum preventive replacement time.

Integrating the density function over the time interval [0, x], we get
\[F(x) = \int_0^{x} \frac{2t}{(20)^{2}} \exp\left[-\left(\frac{t}{20}\right)^{2}\right] dt = 1 - \exp\left[-\left(\frac{x}{20}\right)^{2}\right]\]
where \(F(x)\) is the cumulative distribution function.
Subtracting from unity:
\[R(x) = 1 - F(x) = \exp\left[-\left(\frac{x}{20}\right)^{2}\right]\]
where \(R(x)\) is the reliability at time x.

Therefore,
\[\lambda(x) = \frac{f(x)}{R(x)} = \frac{2x}{(20)^{2}}\]
and
\[E[\alpha(x_p)] = \int_0^{x_p} \lambda(x)\,dx = \left(\frac{x_p}{20}\right)^{2}\]
Therefore,
\[K = \frac{C_{pr} + C_{mr}\,(x_p/20)^{2}}{x_p} = \frac{C_{pr}}{x_p} + \frac{C_{mr}\,x_p}{(20)^{2}}\]
Differentiating with respect to \(x_p\), and then setting the resulting equation equal to zero,
leads to
\[\frac{dK}{dx_p} = -\frac{C_{pr}}{x_p^{2}} + \frac{C_{mr}}{(20)^{2}} = 0\]
from which the optimal replacement time is given by:
\[x_p^{*} = 20\left(\frac{C_{pr}}{C_{mr}}\right)^{1/2} = 20\left(\frac{10}{50}\right)^{1/2} = 8.94 \text{ hours}\]
The optimum preventive replacement time for the mechanical unit is 8.94 hours.
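
The closed-form answer can be cross-checked by searching the cost function numerically. The Python sketch below is illustrative only; the function names and the 0.01-hour search grid are assumptions, while the Rayleigh scale of 20 and the two costs come from the example.

```python
import math

SCALE = 20.0  # Rayleigh scale parameter used in the example

def expected_minimal_repairs(xp, scale=SCALE):
    """E[alpha(xp)]: integral of the failure rate 2x/scale^2 from 0 to xp."""
    return (xp / scale) ** 2

def cost_rate(xp, c_pr, c_mr, scale=SCALE):
    """Cost per operating hour K for a replacement interval of xp hours."""
    return (c_pr + c_mr * expected_minimal_repairs(xp, scale)) / xp

def optimum_replacement_time(c_pr, c_mr, scale=SCALE):
    """Closed-form optimum from dK/dxp = 0."""
    return scale * math.sqrt(c_pr / c_mr)

c_pr, c_mr = 10, 50  # costs from the example
xp_star = optimum_replacement_time(c_pr, c_mr)

# Numerical check: evaluate K on a grid from 0.01 h to 40 h
grid = [i / 100 for i in range(1, 4001)]
xp_grid = min(grid, key=lambda xp: cost_rate(xp, c_pr, c_mr))

print(f"closed-form optimum: {xp_star:.2f} hours")
print(f"grid-search optimum: {xp_grid:.2f} hours")
print(f"cost per hour at optimum: {cost_rate(xp_star, c_pr, c_mr):.3f}")
```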

(e) Maintenance Model V – This model is similar to Model IV except that in this case
the objective is to minimize the total downtime per unit of time – in other words, to
minimize equipment unavailability. The model represents the constant interval
replacement policy. Two important factors associated with this policy are:
• Replacements are carried out at predetermined times irrespective of the age of
the equipment or unit being replaced.
• Replacements are performed when equipment fails.

Total equipment downtime per unit of time, \(\mathrm{DT}(x_p)\), is
\[\mathrm{DT}(x_p) = \frac{\mathrm{TDT}}{\mathrm{CL}}\]
where
TDT is the total downtime of the equipment under consideration.
CL is the cycle length or the length of the preventive replacement cycle.
In turn, TDT is expressed by
\[\mathrm{TDT} = \mathrm{DTF} + \mathrm{DTPR}\]
where
DTF is the equipment downtime due to failure.
DTPR is the equipment downtime due to preventive replacement.
Alternatively, TDT can also be expressed by
\[\mathrm{TDT} = (\mathrm{ENF})(\mathrm{TPR}) + X_{pr}\]
where
ENF is the expected number of failures in time interval \([0, x_p]\).
TPR is the time to perform a failure replacement.
\(X_{pr}\) is the time to perform a preventive replacement.

The cycle length, CL, is given by
\[\mathrm{CL} = X_{pr} + x_p\]
Therefore,
\[\mathrm{DT}(x_p) = \frac{\mathrm{TDT}}{\mathrm{CL}} = \frac{(\mathrm{ENF})(\mathrm{TPR}) + X_{pr}}{X_{pr} + x_p}\]

The optimal value of xp may be obtained in similar fashion to that for Maintenance
Model IV.
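
A rough numerical sketch of such a search is given below in Python. It rests on several assumptions that are not in the text above: the expected number of failures is taken as ENF = (xp/20)^2, borrowed from the Rayleigh example of Model IV as a simplification (a full treatment of failure replacement would use the renewal function instead), and the replacement times TPR = 2 hours and Xpr = 1 hour are hypothetical values chosen only to make the sketch run.

```python
def downtime_per_unit_time(xp, enf, t_failure_repl, t_prev_repl):
    """DT(xp) = [ENF(xp) * TPR + Xpr] / (Xpr + xp)."""
    return (enf(xp) * t_failure_repl + t_prev_repl) / (t_prev_repl + xp)

def enf_assumed(xp):
    """Assumed expected number of failures in [0, xp] (simplification, see lead-in)."""
    return (xp / 20.0) ** 2

T_FAILURE_REPL = 2.0  # hypothetical TPR, hours
T_PREV_REPL = 1.0     # hypothetical Xpr, hours

# Grid search over replacement intervals from 0.01 h to 100 h
grid = [i / 100 for i in range(1, 10001)]
xp_best = min(
    grid,
    key=lambda xp: downtime_per_unit_time(xp, enf_assumed, T_FAILURE_REPL, T_PREV_REPL),
)
dt_best = downtime_per_unit_time(xp_best, enf_assumed, T_FAILURE_REPL, T_PREV_REPL)

print(f"interval minimizing downtime: {xp_best:.2f} hours")
print(f"downtime per unit of time:    {dt_best:.4f}")
```

With a different failure distribution or different replacement times, only enf_assumed and the two constants need to change; the search itself stays the same.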

4.5 Maintenance cost estimation models


(a) Corrective maintenance cost estimation model - This model estimates the
corrective maintenance labour cost for a piece of equipment. The annual cost is
expressed by:
\[C_{cm} = \frac{(\mathrm{SOH})(\mathrm{LC})(\mathrm{MTTR})}{\mathrm{MTBF}}\]
where
SOH represents the scheduled operating hours of the equipment.
LC is the maintenance labour cost per hour.
MTBF is the mean time between failures for the equipment.
MTTR is the mean time to repair for the equipment.

Example 2: A heavy-duty motor is scheduled to operate for 3,000 hours annually. The
expected MTBF and MTTR of the motor are 1,000 hours and 10 hours, respectively.
Determine the annual labour cost of corrective maintenance for the motor if the maintenance
labour rate is $25 per hour.

Solution: \(C_{cm}\) = (3,000)(25)(10)/1,000 = $750
The yearly labour cost of corrective maintenance is $750.
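
The same calculation can be written as a small reusable function. This is a minimal Python sketch with an illustrative function name; the inputs are those of Example 2.

```python
def corrective_labour_cost(soh, lc, mttr, mtbf):
    """Annual corrective maintenance labour cost: (SOH)(LC)(MTTR)/MTBF.

    SOH/MTBF is the expected number of failures per year, and each
    failure consumes MTTR hours of labour at LC per hour.
    """
    return soh * lc * mttr / mtbf

# Heavy-duty motor from Example 2
cost = corrective_labour_cost(soh=3_000, lc=25, mttr=10, mtbf=1_000)
print(f"annual corrective maintenance labour cost: ${cost:,.0f}")
```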

(b) Equipment maintenance cost estimation model - This model calculates the cost of
equipment maintenance with the formula: MC=PMC+CMC+SPIC
Where MC is the equipment maintenance cost.
PMC is the cost of preventive maintenance
CMC is the cost of corrective maintenance
SPIC is the cost of spare parts inventory.

The cost of preventive maintenance, PMC, is defined by
\[\mathrm{PMC} = \frac{(ST_{pm} + TT_{pm})(\mathrm{UH})\,R}{SI_{pm}}\]
where
\(ST_{pm}\) is the scheduled time that preventive maintenance work will take.
\(TT_{pm}\) is the expected travel time for preventive maintenance.
\(SI_{pm}\) is the scheduled interval at which preventive maintenance takes place.
UH is the number of usage hours, or in-use time, per time period considered.
R is the servicing engineer’s hourly rate, including the prorated parts cost.

Similarly, the corrective maintenance cost, CMC, is expressed by
\[\mathrm{CMC} = \frac{(TT_{cm} + \mathrm{MTTR})(\mathrm{UH})\,R}{\mathrm{MTBF}}\]
where
\(TT_{cm}\) is the expected travel time for corrective maintenance.
MTTR is the mean time to repair for the equipment.
MTBF is the mean time between failures for the equipment.

The cost of spare parts inventory, SPIC, is given by:
\[\mathrm{SPIC} = (\mathrm{OMC})(\mathrm{ICR})\]
where
OMC is the original manufacturing cost of spare parts.
ICR is the inventory rate, expressed as a percentage, including such factors as interest,
handling cost, depreciation, etc.

Example 3: Assume that for maintenance of a personal computer, the following values are
given:
• MTTR = 2 hours, MTBF = 7,500 hours
• UH = 4,500 hours per annum
• R = $400 per hour; STpm = 0.35 hour
• ICR = 8% per year; SIpm = 2,500 hours
• OMC = $1,000, TTpm = 0.25 hour; TTcm = 0.25 hour

Determine the annual maintenance cost for the personal computer.

Solution:
SPIC = (1,000)(0.08) = $80
PMC = (0.35 + 0.25)(4,500)(400)/2,500 = $432
CMC = (0.25 + 2)(4,500)(400)/7,500 = $540
MC = 432 + 540 + 80 = $1,052

The annual maintenance cost is $1,052.
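
For repeated estimates it can help to code the three components separately. The Python sketch below reproduces Example 3; the function and variable names are illustrative assumptions, and the formulas are the ones given above.

```python
def preventive_cost(st_pm, tt_pm, uh, rate, si_pm):
    """PMC = (STpm + TTpm)(UH)R / SIpm."""
    return (st_pm + tt_pm) * uh * rate / si_pm

def corrective_cost(tt_cm, mttr, uh, rate, mtbf):
    """CMC = (TTcm + MTTR)(UH)R / MTBF."""
    return (tt_cm + mttr) * uh * rate / mtbf

def spares_inventory_cost(omc, icr):
    """SPIC = (OMC)(ICR), with ICR given as a fraction per year."""
    return omc * icr

# Data from Example 3 (personal computer)
uh, rate = 4_500, 400
pmc = preventive_cost(st_pm=0.35, tt_pm=0.25, uh=uh, rate=rate, si_pm=2_500)
cmc = corrective_cost(tt_cm=0.25, mttr=2, uh=uh, rate=rate, mtbf=7_500)
spic = spares_inventory_cost(omc=1_000, icr=0.08)
mc = pmc + cmc + spic

print(f"PMC = ${pmc:,.0f}, CMC = ${cmc:,.0f}, SPIC = ${spic:,.0f}")
print(f"annual maintenance cost MC = ${mc:,.0f}")
```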

4.5 Availability

Every designer would like to achieve the highest reliability at minimum cost; however, it must
be recognised that equipment will break down sooner or later. When this happens, the speed
and ease with which it can be repaired become vitally important, and hence we must also
consider maintainability.

For any equipment, the probability that it will be available for use is important. Hence,
availability is defined as

Average Availability = MTBF/(MTBF + MTTR).

Example: An average availability of 0.80 means that the equipment is working satisfactorily
for 80% of the time, and is under repair (including waiting for spares, etc.) for the remaining 20%.
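
As a brief illustration, the Python sketch below applies the availability formula to the heavy-duty motor of Example 2 (MTBF = 1,000 hours, MTTR = 10 hours). The function name is an assumption, and the second call uses hypothetical figures chosen only to reproduce the 0.80 case described above.

```python
def average_availability(mtbf, mttr):
    """Average availability = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

# Motor from Example 2: MTBF = 1,000 h, MTTR = 10 h
motor = average_availability(mtbf=1_000, mttr=10)

# Hypothetical figures that give the 0.80 availability discussed in the text
low = average_availability(mtbf=80, mttr=20)

print(f"motor availability: {motor:.4f}")
print(f"hypothetical unit:  {low:.2f}")
```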

4.5.1 Availability and Scheduled maintenance

Availability is often more important to the user than reliability. At first sight it might be
thought that a major objective of the maintenance engineer must be to ensure that every item
of equipment is available for use as continuously as possible. This is not necessarily so, for if
it were, scheduled maintenance would be pointless. The basis of scheduled
maintenance is that an item of equipment that is operating acceptably is deliberately made
non-available while it is checked and any necessary corrections are made. Scheduled maintenance
nearly always reduces the average availability. Its attraction, however, is that we can choose a
time when the equipment will not be required for use, whereas breakdowns often occur at
the most inconvenient times.

(Breakdowns – whenever a system fails to work as intended, i.e. breaks down, repairs and
adjustments must be made to put it right. Since we can never foresee exactly when such a
failure will occur, we cannot plan for it, and so this is called unscheduled maintenance, as
opposed to scheduled maintenance, also referred to as preventive maintenance, planned
maintenance or routine maintenance.)

4.5.2 Losses caused by non-availability of the system


Reliability and maintainability combine to give availability. We would like to
achieve the highest reliability at minimum cost, but we have to recognize that equipment
will break down sooner or later. When this happens, the speed and ease with which it can be
repaired become vitally important, and hence we must also consider the maintainability of the
system. The following factors, therefore, should be considered when purchasing equipment
and/or a system.

(a) Profit making organizations


Where the objective is to make a profit, then the cost of not having the system available
can be worked out. The losses may include:
(i) Cost of scrap or rework made at the time the system fails. To this the cost of
materials scrapped should be added.
(ii) Payment to operators and other workers who remain idle during the breakdown.
(iii) Cost and nuisance value of split batches, extra setting time, etc.
(iv) Loss of profits due to the system being out of operation, plus the cost of
unrecovered overheads while it is idle.
(v) Damage to good customer relations, which is caused by late deliveries.

(b) Non-profit making organizations


In some cases it is easy to work out the losses due to unreliability of the system, but in most
cases the cost of unreliability can be very difficult to calculate.

4.6 Downtime and Maintenance Strategies

4.6.1 Mean Downtime (MDT) and Mean time to repair (MTTR) [or Repair Time]

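[Figure: Elements of downtime, in sequence: (a) Realization, (b) Access, (c) Diagnosis,
(d) Spares, (e) Replace, (f) Checking, (g) Alignment. One bracket marks Downtime (MDT) and
another marks Repair Time (MTTR) across these elements.]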

Downtime, or outage, is the period during which equipment is in the failed state. Downtime
may commence before repair (e.g. a system not in continuous use may develop a fault while
it is idle; the fault condition may not become evident until the system is required for
operation). Repair often involves an element of checkout or alignment, which may extend
beyond the outage.

4.6.2 Active Repair Time


This is the segment of downtime during which repair staff work to effect a repair. Repair
always involves an element of checkout or alignment, which may extend beyond the downtime.
The elements of active repair time are:
i) Access time – This is the time from realization that a fault exists to making
contact with displays and test points and so commencing fault finding. It does not
include travel, but it does include the removal of covers and shields and the connection
of test equipment. It is determined largely by mechanical design.
ii) Diagnosis time – This is referred to as fault finding and includes adjustment of test
equipment (e.g. setting up a laptop or generator), carrying out checks (e.g. examining
waveforms for comparison with a handbook), interpretation of information gained,
verifying the conclusions drawn and deciding upon the corrective action.
iii) Spare part procurement – Part procurement can be from the “tool box”, by
cannibalisation or by taking a redundant identical assembly from some other part of
the system. The time taken to move parts from a depot or store to the system is not
included, being part of the logistic time.
iv) Replacement time – This involves removal of the faulty assembly followed by connection
and wiring, as appropriate, of a replacement. Replacement time is largely
dependent on the choice of LRA (Least Replaceable Assembly) and on
mechanical design features such as the choice of connectors.
v) Checkout time – This involves verifying that the fault condition no longer exists
and that the system is operational. It may be possible to restore the system to
operation before completing the checkout, in which case, although checkout is a repair
activity, not all of it constitutes downtime.
vi) Alignment time – As a result of inserting a new module into the system,
adjustments may be required. As in the case of checkout, some or all of the
alignment may fall outside the downtime.

Activities (i) – (vi) are called Active Repair Elements as opposed to Passive Repair
Activities which consist of:
• Realization time – this is the time, which elapses before the fault condition becomes
apparent. It does not constitute part of repair time;
• Logistic time – this is the time consumed waiting for spares, test gear, additional tools
and manpower to be transported to the system;
• Administrative time – this involves failure reporting, allocation of repair tasks,
manpower changeover due to demarcation arrangement, official breaks, disputes, etc.

Active repair elements are determined by design, maintenance arrangements, environment,
manpower, instructions, tools and test equipment. Logistic and administrative time is
determined mainly by the maintenance environment, that is, the location of spares, equipment
and manpower and the procedure for allocating tasks.
Another parameter related to outage is the repair rate (μ). It is the reciprocal of the mean
time to repair: μ = 1/MTTR

4.6.3 Factors Influencing Downtime

The two factors governing downtime are:


(a) Equipment design, and
(b) Maintenance philosophy

In general, the active repair elements are determined by the design, and the passive
elements are governed by the maintenance philosophy. Designers must be aware of
the maintenance strategy and of the possible equipment failure modes. Achieving acceptable
repair times involves simplifying diagnosis and repair.

(a) Equipment design

The key design areas are access, adjustment, built-in test equipment, circuit layout and
hardware partitioning, connections, displays and indicators, handling, human and ergonomic
factors, identification, interchangeability, Least Replaceable Assembly (LRA), mounting,
component selection, redundancy, software, standardization, test points, and safety. Apart
from legal and ethical considerations, safety-related hazards increase active repair time by
requiring greater care and attention; an unsafe design will encourage short cuts or the
omission of essential activities; and accidents add very substantially to the repair time.

(b) Maintenance Strategies (Maintenance Philosophy)

Both active and passive repair times are influenced by factors other than equipment design.
Consideration of maintenance procedures, personnel, and spares provisioning is known as
Maintenance Philosophy and plays an important part in determining overall availability. The
costs involved in these activities are considerable and it is therefore important to strike a
balance between over- and under-emphasizing each factor. They can be grouped under seven
headings:
(i) Organization of maintenance resources
(ii) Maintenance procedures
(iii) Tools and test equipment
(iv) Personnel selection, training and motivation
(v) Maintenance instructions and manuals
(vi) Spares provisioning
(vii) Logistics

4.7 Comparisons of Maintainability and maintenance costs

The level of maintainability of a product determines the kinds of maintenance work that can
and will need to be performed at each point in the product’s life cycle, and the difficulty and
expense of performing them. Maintainability features, such as mean time to repair (MTTR),
therefore influence maintenance costs such as required manpower. For example, if the design
calls for the inclusion of built-in test equipment, the time to fault detection and isolation
should be lower. Usually, higher maintainability means less required maintenance, and
therefore lower maintenance costs. In early equipment design, several alternative levels of
built-in test equipment and other factors that can reduce maintenance costs should be
considered.

The objective of performing an economic trade-off analysis is to determine all costs for each
alternative under consideration and then to compare them. Usually, the alternative with the
lowest cost should be selected. This approach is also useful in determining whether items
should be designed to be thrown away or to be repaired (cf. integrated circuits).

The cost factors include hardware, manpower, training, test equipment and tools,
repair facilities, replacement parts, packaging and shipping, repair parts, and supply,
administration, and cataloguing.

4.8 Comparisons of Reliability and maintenance Costs

The cost of achieving any desired reliability and the subsequent cost of maintenance are
related to each other roughly as shown in the figure below:

[Fig: The relation between reliability and maintenance costs. Horizontal axis: reliability,
from low to high, with Rm marked. Vertical axis: cost per item produced. Curve AB: cost of
achieving reliability. Curve CD: cost of maintenance and repairs. Curve EF: total cost, with
the minimum cost at Rm.]

At A the reliability is very low and the amount spent on reliability is also low. If we go on
improving reliability we gradually reach the situation where all the obvious things have been
done, and from now on we shall have to spend increasingly more to achieve very little
reliability improvement. If we were unwise enough to demand an impossible reliability of
1.00, costs would sweep away to infinity beyond B.

However, when reliability is low, maintenance costs from all the breakdowns are inevitably
high, as shown at C. As the reliability improves so the cost of maintenance falls, until at D, as
reliability approaches 1.00, maintenance costs approach zero.

By adding reliability and maintenance costs we get curve EF, and find that there is a
particular reliability Rm for which the overall cost is a minimum.

Although the above concept is useful, the actual case is nearly always more complicated than
it suggests, since:
i) It does not follow that an improvement in reliability must inevitably cost more.
Better designs, different materials, better quality control during production, etc.
may achieve improved reliability at little or no extra cost. If scrap is reduced at the
same time, the overall cost may actually come down (i.e. reliability cost is difficult
to estimate).
ii) Maintenance costs are also difficult to estimate. When equipment fails, we are
unlikely to be able to foresee associated costs such as:
o The value of production lost through breakdowns, including the cost of the
late deliveries, split batches, idle operators, etc. on the production line.
o The cost of having a piece of equipment out of action. This probably depends
very much on whether it happened to be required for use during the time it
was under repair.

iii) Costs are inextricably mixed up with all the factors related to reliability as
discussed above. We must define precisely what we mean by reliability costs, or
the figures we assign to them will have no meaning.

4.8.1 Factors affecting Reliability and Maintenance Costs

Any project may embrace some or all of the following costs:


(i) Research, Design and Development Costs:
• Research into reliability problems for which solutions are not yet known.
• Design costs.
• Cost of building and testing prototypes.
• Cost of modification to design and of further tests until the required reliability is
achieved.
(ii) Manufacturing and Installation Costs:
• Development and purchase of new manufacturing plant, equipment, tooling, etc.
• Installation and commissioning costs, when the equipment we have made is
installed in our customer’s premises.
(iii) Utilization Costs:
• Day to day running costs, including the costs of routine servicing, repairs etc.
• Costs of support equipment, which is necessary to operate the equipment
efficiently.
• Spares, test equipment, etc.
• Training of operators.

4.9 Reliability-Centred Maintenance (RCM)

Reliability-centred maintenance (RCM) systematically identifies the preventive maintenance
tasks required to sustain, in the most cost-effective manner possible, the maximum level of
reliability and safety that can be expected from a product when it receives effective
maintenance.

RCM determines the maintenance needs of any facility, system, or equipment in its operating
context. The process, therefore, entails asking questions on the following subjects:

• The functions and related performance standards of the asset in its current
operating context
• Possible ways in which the asset may fail to perform its required functions.
• Causes of each functional failure
• Events that follow each failure
• Significance of each failure
• Measures to prevent failure
• Corrective measures that may be taken if there is no appropriate preventive step

(a) The RCM Process – This takes place first during the equipment design and
development phase, when it is used to develop maintenance plans.

During product operation and deployment, these plans are then modified based on field
experience. The following two criteria are key to the maintenance plans:

• Parts that are not critical to safety – In this case, preventive maintenance tasks
should be chosen that will decrease the ownership life cycle cost.
• Parts that are critical to safety – In this case, preventive maintenance tasks should
be chosen that will help prevent reliability or safety from dropping to an
unacceptable level, or will help reduce the ownership life cycle costs.

It is through the preventive maintenance program that incipient failures are detected and
corrected, the probability of failure is reduced, hidden failures are detected, and the cost
effectiveness of the maintenance program is improved.

4.9.1 Basic steps in RCM Process


The basic steps in RCM process are:
(a) Determine parts with highest maintenance priority,
(b) Obtain appropriate failure data,
(c) Perform fault tree analysis,
(d) Apply decision logic to critical failure modes,
(e) Classify maintenance requirements,
(f) Implement RCM decisions,
(g) Base sustaining engineering on real-life experience data.

(a) Determine Parts with the Highest Maintenance Priority

Use failure mode and effect analysis to identify those parts whose failure would have the
most significant effects. This will provide the basis for defining the most important
preventive and corrective maintenance requirements. Fault Tree Analysis (FTA) can be used
to identify parts that are critical to safety and provide quantitative failure-mode data about
them. It is also more rigorous and accurate in determining the root cause of failures and their
consequences with respect to product safety.

(b) Obtain Appropriate Failure Data

Each step in the fault tree analysis requires data. The most important data are failure
probabilities, assessments of the criticality of the failures, part failure rates, probability of
operator error, and inspection efficiency data. Part failure rate data can come from experience,
banks of generic failure data, and other sources.

(c) Perform Fault Tree Analysis
Fault tree Analysis (FTA) defines an undesirable state of the system or product and then
analyses the system or product, in terms of its operation and environment, to determine all
possible ways in which the undesirable event can occur. An FTA is a useful tool to identify
all possible failure causes at all possible levels associated with a system and to identify the
relationship between causes. It can thus improve the design of any specified system, product
or process. An FTA normally takes place during the early design phase and then is
progressively refined and updated as the design develops.

Objectives of FTA: FTA of a system can be used to:


• Identify critical areas and cost-effective improvements;
• Provide input to testing, maintenance, and operational procedures and policies;
• Confirm the ability of the system to fulfil its imposed safety requirements;
• Meet jurisdictional requirements;
• Provide input for cost-benefit analysis of trade-offs;
• Evaluate performance of systems or products for bid-evaluation purposes; and
• Highlight requirements or targets for systems.

The prerequisites for a fault-tree analysis include clearly defined analysis scope and
objectives, clear identification of assumptions, well defined analysis resolution, thorough
understanding of the system’s design and its operation and maintenance aspects, well-defined
physical bounds and interfaces for the system, a comprehensive review of system operational
experience, and a clear definition of what constitutes system failure or undesirable events.
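
As an illustration of how a fault tree yields quantitative failure data, the following minimal sketch evaluates the probability of a top event from basic-event probabilities. It assumes that the basic events are statistically independent and that the tree contains only AND and OR gates. The cooling system, the event names, and all probabilities are hypothetical values chosen for illustration only.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if ALL independent input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if AT LEAST ONE independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities for a pumped cooling system.
p_pump_a_fails = 0.02
p_pump_b_fails = 0.02
p_power_lost = 0.001
p_valve_stuck = 0.005

# Pumping is lost only if both redundant pumps fail (AND gate).
p_pumping_lost = and_gate([p_pump_a_fails, p_pump_b_fails])

# Top event: cooling is lost if pumping is lost OR power is lost OR the valve sticks.
p_top = or_gate([p_pumping_lost, p_power_lost, p_valve_stuck])

print(f"P(pumping lost)    = {p_pumping_lost:.6f}")
print(f"P(loss of cooling) = {p_top:.6f}")
```

In this sketch the redundant pumps contribute far less to the top event than the single valve, which is the kind of result used to rank parts by maintenance priority.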

(d) Apply Decision Logic to Critical Failure Modes


This involves asking standard assessment questions and using the results to determine what
the most effective preventive maintenance tasks would be.
After the fault tree analysis identifies the critical failure modes, decision logic is used to
assess the relationship between each failure mode and each part with a high maintenance
priority. The next step is to establish what maintenance tasks are necessary to prevent or
reduce the incidence of each failure mode. The tasks considered necessary, and the
appropriate intervals at which they should be performed, make up the overall scheduled
preventive maintenance program.

(e) Classify Maintenance Requirements


This step uses the decision logic of the preceding step to sort the preventive maintenance
requirements into three classifications and define a maintenance task profile. The three
classifications are:
• Hard-time maintenance requirements: These are scheduled removals or
replacements of equipment or parts at predetermined intervals of age or usage.
• Condition monitoring maintenance requirements: These are unscheduled tests or
inspections conducted on parts when failure of the parts can be tolerated during
equipment operation or where impending failure can be discovered through
routine monitoring during usual operations.
• On-condition maintenance requirements: These are scheduled inspections or tests
that measure part deterioration. The level of part deterioration determines whether
corrective maintenance should be performed or whether the part should remain in
service.

(f) Implement RCM Decisions

Set and enact the maintenance tasks and their frequencies.

(g) Base Sustaining Engineering on Real-life Experience Data


During the rest of the asset’s life cycle, the RCM process focuses on reducing the burden of
scheduled maintenance and cost of support while keeping the equipment in a desirable state
of readiness. Once the system is operating and real-life data begin to accumulate, the goal is
to review previous decisions in order to eliminate excessive maintenance costs while
maintaining established and desirable reliability and safety levels.

4.9.2 Methods of Monitoring Equipment condition

Special instruments are employed for this purpose. Condition monitoring techniques fall into
one of six categories, according to the symptoms or potential failure effects they monitor –
dynamic effects, electrical effects, physical effects, temperature effects, particle effects, and
chemical effects.

(a) Dynamic effects – these methods detect failures, particularly of rotating equipment,
that result in abnormal energy emissions in the form of waves, e.g. vibrations, pulses, or noise.
Dynamic monitoring techniques include:

i) Broad band vibration analysis – this technique monitors changes in vibration
characteristics caused by problems such as wear, fatigue, mechanical looseness, or
misalignment of engines, shafts, electric motors, gearboxes, and pumps.
ii) Shock pulse monitoring – this technique monitors surface deterioration and lack
of lubrication in devices such as pneumatic impact tools, internal combustion engines, and
rolling element bearings.
iii) Proximity analysis – this technique is used to track misalignment, rubs, and oil
whirl in fans, shafts, and motor assemblies.
iv) Real time analysis – this approach monitors shock, transient, acoustic and
vibration signals in gearboxes, rotating machines, and shafts. It is capable of
simultaneously analysing bands of frequencies over the entire analysis range, and
short-duration signals such as transient vibrations and shocks.
v) Ultrasonic leak detection – this method detects leaks and other sources of very
high frequency noise in heat exchangers, air-operated contactors on electric
traction control, underground tanks, and steam condensers.
vi) Kurtosis – this technique monitors shock pulses in gears and rolling
element bearings.
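
To show why kurtosis responds to shock pulses, the following minimal sketch computes it for two synthetic signals. It assumes kurtosis is taken as the fourth standardised moment of the sampled signal, which is about 3 for Gaussian noise and rises when sharp impacts are present. The signals, the impact size, and the impact spacing are hypothetical and do not come from a real bearing.

```python
import random


def kurtosis(samples):
    """Fourth standardised moment of a sampled signal."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # variance
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / (m2 ** 2)


random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical healthy bearing: plain Gaussian vibration noise.
healthy = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Hypothetical damaged bearing: the same noise plus a sharp impact every 200 samples.
damaged = [x + (8.0 if i % 200 == 0 else 0.0) for i, x in enumerate(healthy)]

print(f"healthy kurtosis: {kurtosis(healthy):.2f}")
print(f"damaged kurtosis: {kurtosis(damaged):.2f}")
```

The rare impacts barely change the overall vibration level, yet they raise the kurtosis sharply, which is what makes the measure useful for detecting early surface damage.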

b) Electrical effects: Three techniques are used here:

i) Electrical resistance (corrometer) – This technique detects integrated metal
loss, including total corrosion, in facilities such as process plants, paper mills,
petroleum refineries, and gas transmission.
ii) Linear polarization resistance (corrator) – This technique monitors the rate of
corrosion in electrically conductive corrosive fluids in systems such as nuclear power
water heat exchangers, cooling water systems, and geothermal power
generating systems.
iii) Potential monitoring – This method detects problems such as stress corrosion
cracking, pitting corrosion, and selective phase corrosion in materials such as
stainless steel, titanium, and nickel-based alloys.
c) Physical effects: Six techniques are used here:
i) Magnetic particle inspection – This technique monitors surface and near-surface
cracks and discontinuities caused by wear, fatigue, heat treatment, and other
problems in ferromagnetic materials including welds, shafts, boilers, and
machined surfaces.
ii) X-ray radiography – This technique monitors surface and subsurface
discontinuities caused by gas porosity, stress, or fatigue, and also monitors
discontinuities such as loose wires in compressors, welds, gearboxes, pumps, and
steel structures.
iii) Strain gauges – This technique monitors strain in civil engineering structures such
as tunnels and bridges.
iv) Eddy current testing – This method monitors factors such as material hardness,
and surface and subsurface discontinuities caused by wear, stress, and fatigue in
ferrous materials used in items like heat exchanger tubes, railway lines, boiler
tubes, and hoist ropes.
v) Electron fractography – This technique tracks the growth of fatigue cracks in
motor vehicles, metallic components in aircraft, industrial equipment, and similar
items.
vi) Ultrasonics (a pulse echo technique) – This technique is used on welds, boilers,
tubes, compressors, receivers, steel structures, and other items of either ferrous or
non-ferrous materials to monitor the thickness of materials subject to wear and
corrosion, as well as surface and below-surface discontinuities caused by factors
such as inclusions, fatigue, heat treatment, and lamination.

d) Temperature effects: Three methods of monitoring temperature effects are:

i) Temperature-indicating paint – The paint, which is applied to hot spots or potential
points of insulation failure, reacts to surface temperature and provides a
permanent record of the highest temperature reached at that point.
ii) Fibre loop thermometry – This technique monitors temperature variations caused
by insulation deterioration, blocked cooling systems, leaks, or other problems in
items such as engines, power cables, transformer windings, and pipelines. The
technology is operable in hazardous environments, can reach otherwise
inaccessible locations, and is unaffected by the presence of electromagnetic
interference.
iii) Thermography – This method identifies changes in heat transfer characteristics due
to delamination of laminated materials, or variations in temperature caused by
fatigue, leaks, wear, or other problems in transformers, hydraulics, building
insulation, and electrical switchgear.

e) Particle effects: Five techniques are used here:

i) Magnetic chip detection – This technique monitors wear and fatigue in equipment
with enclosed lubricating systems, such as aircraft engines, gearboxes, compressors,
and turbines.
ii) Blot testing – This method detects fatigue, wear, and corrosion particles
in circulating oil systems such as compressors, gearboxes, and engine sumps.
iii) Ferrography – This monitors fatigue, wear, and corrosion in enclosed lubricating and
hydraulic oil systems, such as engine sumps, gearboxes, and hydraulics.
iv) X-ray fluorescence – It monitors wear and damage to filters in enclosed
lubricating and hydraulic oil systems, such as engine sumps, gearboxes, and
hydraulics.
v) Graded filtration – Also used for enclosed lubricating and hydraulic oil systems to
detect particles in lubricating oil that stem from fatigue, corrosion, and wear.

f) Chemical effects: Some of the techniques used to monitor chemical effects are:
(i) Gas chromatography – This method detects gases emitted as the result of faults
in nuclear power systems and turbine generators.
(ii) Thin-layer activation – This technique monitors wear in devices such as
turbine blades, electrical contacts, bearings, rails, and cooling systems.
(iii) Infrared spectroscopy – This method measures fluid degradation and the
presence of gases such as hydrogen, carbon monoxide, and methane in
enclosed oil systems such as compressor sumps, transformers, and engine
sumps.
(iv) Spectrometric oil analysis procedure – This technique is used in circulating oil
systems to track wear, leaks, and corrosion.

4.9.3 The Benefits of RCM Application


The benefits offered by RCM application are:

(a) Improvement in Operating Performance – Usually equipment performance is composed
of three elements: availability, efficiency and yield. The RCM process helps improve plant
performance in the following ways:
(i) The emphasis on on-condition tasks ensures that potential failures are
highlighted before they become functional failures.
(ii) This emphasis can also reduce the frequency of major overhauls, thus
improving the long-term availability of equipment.
(iii) Eliminating superfluous facilities, equipment or components results in a
corresponding increase in reliability.
(iv) Because it relates each failure mode to the corresponding functional failure,
the information sheet becomes a tool for quick failure diagnosis. This
ultimately leads to shorter repair times.
(v) Every failure that was not dealt with as a safety hazard has its operational
consequences systematically reviewed, and stringent criteria are employed to
determine the most effective tasks to address each failure mode.
(vi) Using the people most knowledgeable about the equipment to analyse failure
modes helps ensure that chronic failures are identified and eliminated and that
necessary preventive measures are taken.

(b) Improvement in Maintenance Cost-effectiveness – In many industries, maintenance
forms the largest portion of operating cost after raw materials and direct production labour or
energy. This means that controlling maintenance costs is central to cost-effectiveness. The
RCM process helps lower or at least control the maintenance cost growth rate, because it
reduces routine maintenance and the need for costly experts, improves purchasing of
maintenance services, and provides clearer directions for acquiring new maintenance
technology.

(c) Improvement in Teamwork – The practice of RCM not only fosters teamwork within the
RCM review groups but also helps to improve communication and co-operation among
various people and units: design engineers, equipment users, and maintainers; production
and operation units and maintenance personnel; and management, supervisors, technical
personnel, and operators.

(d) Improvement in Safety and Environmental Protection – The ways in which the practice
of RCM leads to improved safety and environmental protection include:
(i) The reduction in the total number or frequency of routine tasks automatically
lowers the chance of critical failures occurring either during maintenance or
shortly after equipment start-up.
(ii) The RCM decision process demands that all potential failures that would or could
affect safety or the environment be eliminated or addressed.
(iii) Examination of a failure’s safety and environmental implications prior to
considering its operational effects makes safety and environmental integrity a top
priority.
(iv) The attention to hidden failures and the systematic approach to failure-finding
result in considerable improvement to preventive maintenance. The probability
that multiple failures having serious consequences will occur is thereby
substantially reduced.

(e) Improvement in Individual Motivation – RCM helps improve the motivation of
individuals involved in the review process in the following ways:

• A clearer understanding of an asset’s functions, and of the expectations placed
on each individual who works with it, helps enhance his or her competence
and confidence.
• A clearer understanding of the issues beyond the control of each individual
enables him or her to work more comfortably within the framework of those
limitations.
• Knowledge about each group member’s part in formulating goals, and in
deciding what actions are required to achieve them and who should perform
these actions, leads to a strong sense of ownership.

5. Life Cycle Cost and the Cost of Equipment

The cost of a piece of equipment or an asset can be broadly divided into two:
(a) The costs of quality, and
(b) The Lifecycle Costs

5.1 The costs of quality


These are the manufacturer’s quality costs. Quality cost analysis entails extracting various
items from accounts and grouping these under three headings:
i) Prevention costs, i.e. costs of preventing failure – these include design review; quality
and reliability training; vendor quality planning; audits of systems, products and
processes; installation prevention activities; product qualification; and quality
engineering.
ii) Appraisal costs, i.e. costs related to measurement – these include test and inspection;
maintenance and calibration; test equipment depreciation; line quality
engineering; and installation testing.
iii) Failure costs, i.e. costs incurred as a result of scrap, rework, failure, etc. – these include
design changes; vendor rejects; rework; scrap and material renovation; warranty;
commissioning failures; and fault finding in test.

5.2 The Life Cycle Costs


Life cycle costs (or Cost of Ownership) are all costs expected during the life of an item over
some finite study period. This means costs associated with acquisition and ownership of a
system over its full life must be estimated and timed for the year of the expenditure. The costs
associated with ownership of a system are normally referred to as sustaining costs; these costs
can further be subdivided into costs of use and costs of administration.

The cost of ownership. The total cost of ownership of an asset consists of:
a) The purchase price of the asset concerned, plus any ancillary equipment which may be
required to operate it;
b) The cost of transporting the asset to the site and installing it there. Included here may be
the cost of foundations and of a building to house it;
c) Commissioning costs;
d) Maintenance and repair costs. These may well be the really expensive items, and they go
on for year after year;
e) The cost of training operators and maintenance staff.

The basic tree for LCC starts with a very simple tree based on the costs for acquisition and
the costs for sustaining the acquisition during its life as shown in Figure 1.

Life Cycle Cost Tree
    Acquisition Costs
    Sustaining Costs
        Cost of Use: maintenance and operating costs
        Cost of Administration

Figure 1: Top Levels of Life Cycle Costs Tree

These can be divided into three:

i) Cost of Acquisition: Procurement cost (or capital cost) plus cost of transport,
installation and commissioning.
ii) Cost of Use: (a) Maintenance cost – cost of preventive and corrective maintenance
and of modifications, and cost of training maintenance staff; (b) Operating cost
– cost of materials and energy, and cost of training operators.
iii) Cost of Administration: Cost of data acquisition and recording and of
documentation.
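
The three-way division above can be sketched as a small data structure that rolls costs up the tree. This is only an illustrative sketch: the class name, the field names, and all figures are hypothetical, and the amounts are undiscounted lifetime totals in shillings.

```python
from dataclasses import dataclass


@dataclass
class LifeCycleCost:
    # Cost of acquisition
    procurement: float
    transport_installation_commissioning: float
    # Cost of use
    maintenance: float
    operating: float
    # Cost of administration
    administration: float

    @property
    def acquisition(self) -> float:
        return self.procurement + self.transport_installation_commissioning

    @property
    def sustaining(self) -> float:
        # Sustaining costs = cost of use + cost of administration
        return self.maintenance + self.operating + self.administration

    @property
    def total(self) -> float:
        return self.acquisition + self.sustaining


# Hypothetical lifetime totals for one machine.
machine = LifeCycleCost(
    procurement=1_000_000,
    transport_installation_commissioning=150_000,
    maintenance=900_000,
    operating=1_200_000,
    administration=50_000,
)

print("Acquisition:", machine.acquisition)
print("Sustaining: ", machine.sustaining)
print("Total LCC:  ", machine.total)
```

With these assumed figures the sustaining costs are almost twice the acquisition costs, which is the pattern the manual describes for equipment whose maintenance and operation dominate its life cycle cost.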

These three costs are influenced by:

(i) Reliability, which determines the frequency of repair, fixes spares requirements, and
determines loss of revenue (together with maintainability).
(ii) Maintainability, which affects training, test equipment, downtime, and manpower.
(iii) Safety factors, which affect operating efficiency and maintainability.

Life cycle costs will clearly be reduced by enhanced reliability, maintainability and safety
but will be increased by the activities required to achieve them. In general, cost details for the
acquisition tree shown in Figure 2 are usually identified and collected correctly. The
collection of costs for the sustaining tree shown in Figure 3 is the major problem.

Figure 2: Acquisition Cost Tree

Figure 3: Sustaining Cost Tree

5.2.1 Life Cycle Costing (LCC)


Life cycle costing (LCC) is a methodology that attempts to capture all of the costs associated
with a product throughout its life cycle. The typical problem normally encountered is
whether it is more economical to spend more money in the initial purchase to obtain a
product with lower operating and maintenance costs, or whether it is less costly to purchase a
product with lower first costs but higher operating costs. However, life cycle costing goes
into the analysis in much greater detail in an attempt to evaluate all relevant costs, both
present and future.

The operation and maintenance costs of some equipment can be 60 to 80 percent of the life
cycle cost. Life cycle costing is combined with life cycle assessment to consider the costs of
energy consumption and pollution during manufacture and service, and the costs of retiring
the product when it reaches the end of its useful life.

The elements in the life cycle of a product include the overlooked impact on society costs
(OISC) that are rarely quantified and incorporated into a product life cycle analysis. We start
with design. The actual costs incurred at the design stage are a small part of the LCC, but the
costs committed here comprise about 75 percent of the avoidable costs within the life cycle of
the product. Moreover, it is about 10 times less costly to make a change or correct an error in
design than it is in manufacturing. Acquiring and processing the raw materials
can incur large environmental costs.

The costs of ownership of a product are the traditional aspect of LCC. Useful life is
commonly measured by cycles of operation, length of operation, or shelf life. In design, using
durable and reliable materials and components extends the life for use and service of a
product. Maintenance costs, especially maintenance labor costs, usually dominate other use
costs. Most analyses divide maintenance costs into scheduled or preventive maintenance and
unscheduled or corrective maintenance. The mean time between failure (MTBF) and the
mean time to repair (MTTR) influence the LCC. (LCC is reduced by high values of MTBF
and low values of MTTR). Other costs that must be projected for the operations and support
phase are: maintenance of support equipment; maintenance facility costs; pay and fringe
benefits for support personnel; warranty costs and service contracts.
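
The influence of MTBF and MTTR on cost can be illustrated with a minimal sketch. It assumes a constant failure rate, so that the expected number of failures in a year is the operating hours divided by the MTBF, and it uses the standard steady-state availability formula MTBF / (MTBF + MTTR). The function names, labour rate, material cost, and operating hours are hypothetical.

```python
def annual_corrective_cost(operating_hours, mtbf, mttr, labour_rate, material_cost_per_failure):
    """Expected yearly corrective maintenance cost, assuming a constant failure rate."""
    expected_failures = operating_hours / mtbf
    repair_hours = expected_failures * mttr
    return repair_hours * labour_rate + expected_failures * material_cost_per_failure


def availability(mtbf, mttr):
    """Steady-state (inherent) availability."""
    return mtbf / (mtbf + mttr)


# Hypothetical figures: 6000 operating hours per year, labour at 500 shillings per hour,
# and 20 000 shillings of material per failure.
base = annual_corrective_cost(6000, mtbf=500, mttr=8,
                              labour_rate=500, material_cost_per_failure=20_000)
better = annual_corrective_cost(6000, mtbf=1000, mttr=4,
                                labour_rate=500, material_cost_per_failure=20_000)

print(f"MTBF 500 h,  MTTR 8 h: cost {base:,.0f}, availability {availability(500, 8):.4f}")
print(f"MTBF 1000 h, MTTR 4 h: cost {better:,.0f}, availability {availability(1000, 4):.4f}")
```

Doubling the MTBF and halving the MTTR in this sketch cuts the yearly corrective cost by more than half and raises availability, which is the direction of the effect stated above.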

Once the product has reached the end of its useful life it enters the retirement stage of the life cycle.
High-value-added products may be candidates for remanufacturing. By value added we
mean the cost of materials, labor, energy, and manufacturing operations that have gone into
creating the product. Products that lend themselves to recycling are those with an attractive
reclamation value, which is determined by market forces and the ease with which different
materials can be separated from the product. Reuse components are subsystems from a
product that have not spent their useful life and can be reused in another product. Materials
that cannot be reused, remanufactured, or recycled are discarded in an environmentally safe
way. This may require labor and tooling for disassembly or treatment before disposal.

5.2.2 Life Cycle Costing Steps

The life cycle costing steps are:

• Estimate the useful life of the asset.
• Estimate all associated costs, including the costs of operation and maintenance.
• Estimate the terminal value of the asset and subtract it from the ownership cost of the asset.
• Take the result of this calculation and find its present value.
• Obtain the life cycle cost of the asset by adding the procurement cost to this
present value amount.
• Repeat these steps for each product being considered for acquisition.
• Compare the life cycle costs of these products.
• Choose the product with the lowest life cycle cost, in balance with other
considerations.
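
The steps above can be sketched as a short calculation. This is a minimal sketch under stated assumptions: ownership costs are taken as a uniform annual amount paid at the end of each year, the terminal value is received at the end of the useful life, and each amount is discounted from the year in which it occurs. The two machines, their costs, and the 10 percent discount rate are hypothetical.

```python
def present_value(amount, rate, year):
    """Discount an amount occurring at the end of the given year back to today."""
    return amount / (1.0 + rate) ** year


def life_cycle_cost(procurement, annual_cost, terminal_value, life_years, rate):
    """Procurement cost plus present value of ownership costs net of terminal value."""
    pv_ownership = sum(present_value(annual_cost, rate, t)
                       for t in range(1, life_years + 1))
    pv_terminal = present_value(terminal_value, rate, life_years)
    return procurement + pv_ownership - pv_terminal


# Hypothetical candidates: B costs more to buy but less to operate and maintain.
candidates = {
    "Machine A": dict(procurement=1_000_000, annual_cost=200_000,
                      terminal_value=100_000, life_years=10, rate=0.10),
    "Machine B": dict(procurement=1_400_000, annual_cost=120_000,
                      terminal_value=150_000, life_years=10, rate=0.10),
}

costs = {name: life_cycle_cost(**data) for name, data in candidates.items()}
best = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: life cycle cost = {cost:,.0f} shillings")
print("Lowest life cycle cost:", best)
```

The choice indicated by the lowest figure still has to be balanced against the other considerations mentioned in the final step.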

5.2.3 Advantages and Disadvantages of Life-cycle costing


Life cycle costing is now often used in the procurement of expensive systems or equipment.
Life cycle cost analysis examines the effect on cost of alternative equipment designs. Life
cycle costing plays an important role in maintainability analysis, particularly with respect to
operation and maintenance costs.

The advantages of life cycle costing are that it is an excellent tool for comparing the cost of
competing projects, controlling program costs, selecting among competing contractors,
making decisions associated with equipment replacement, reducing total cost, and conducting
planning and budgeting.

Some of the disadvantages of life cycle costing are that it is time consuming and expensive,
that collecting the data needed for analysis can be a trying task, and that the data available is
sometimes of doubtful accuracy.

5.2.4 Why Use LCC?
LCC helps change provincial perspectives for business issues with emphasis on enhancing
economic competitiveness by working for the lowest long term cost of ownership. Too often
parochial views result in ineffective actions best characterized by short term cost advantages
(but long term costly decisions). Consider these typical events observed in most companies:
• Engineering avoids specifying cost effective, redundant equipment needed to
accommodate expected costly failures so as to meet capital budgets,
• Purchasing buys lower grade equipment to get favorable purchase price variances,
• Project engineering builds plants with a 6 month view of successfully running the
plant only during start-ups rather than the long term view of low cost operation,
• Process engineering requires operating equipment in race car driver fashion, using a
philosophy that all equipment is capable of operating at 150% of its rated condition
without failure, and leaves other departments to clean up the equipment abuse,
• Maintenance defers required corrective/preventive actions to reduce budgets, and
thus long term costs increase because of neglect for meeting short term management
gains,
• Reliability engineering is assigned improvement tasks with no budgets for
accomplishing the goals.
Management is responsible for harmonizing these potential conflicts under the banner of
operating for the lowest long term cost of ownership. The glue binding these conflicts
together is a teamwork approach for minimizing LCC. When properly used with good
engineering judgment, LCC provides a rich set of information for making cost effective, long
term decisions. LCC can be used as a management decision tool for:
• Costing discipline - it is concerned with operating and support cost estimates.
• Procurement technique - it is used as a tool to determine cost per usage.
• Acquisition tool - it is concerned with balancing acquisition and ownership costs.
• Design trade-off - it integrates effects of availability, reliability, maintainability,
capability, and system effectiveness into x-y charts that are understandable for cost
effective screening methods.

5.2.5 The Conversion or Decommission Phase of Life Cycle Costing


This final phase of the life cycle costing is planning for end of life. No manufacturing facility
lasts forever. Consideration must be given for the conversion or decommission phase of life.
As remediation and decontamination efforts are looming for many facilities, this is now a
major cost impact driven by the actions taken during design and operation of existing plants.
It is unlikely that these costs will decline in the future and consideration should be given for
future costs. The cost decisions for this phase are difficult to quantify with any high degree of
accuracy. However, planning for conversion or decommission leads to more responsible
decisions during the life of the project for reducing contamination and waste products, which are
expensive to neutralize prior to disposal. How would this system work for forward looking
organizations?
1. Start with an objective such as: “We will build an economical and failure-free process
which will operate for 5 years between planned outages with an availability of 98%
(including lost production time during turnarounds), and 80% of all component
failures must be capable of being repaired in less than 24 hours”. This requires pricing
out alternatives for achieving the lowest long term cost of ownership using LCC
techniques to achieve a highly available process which is free from failures and thus
lacks instability of process changes so as to produce consistently large outputs as
advocated by process reliability techniques.
2. The purpose of planning for a failure-free process is to increase manufacturing
productivity and manufacturing throughput, recognizing the process is the king while
individual pieces of equipment are the pawns for the strategy. This requires planning for,
calculating, and understanding reliability principles, as improved reliability tends
toward lower life cycle cost.
3. The purpose of maintainability improvement is to design machinery and equipment
that can be quickly and safely repaired to reduce downtime of individual pieces of
equipment so the risk of failing the process is low in both probability of failure and
low in the money risk of exposure. This requires identifying planned maintenance
actions and improving the reliability and longevity of equipment which is driven by
economics. Planning for both quick and safe maintenance activities decreases the
exposure for safety issues and exposure for financial damage from the required
repairs.
Of course, pursuing this path for cost reductions has a prerequisite of knowing, understanding,
and using technology from the field of reliability and maintainability. Many papers
concerning reliability and the high cost of unreliability from failures are available for
download at http://www.barringer1.com/Papers.htm .

5.2.6 Life Expectancies


Possibly the most significant, yet least readily appreciated variable is the need to determine
appropriate LCC periods for each facility. Life expectancies and their relative perception may
be particularly difficult to forecast due to premature obsolescence related to one or more of
the following issues and/or their interpretations under various circumstances:
• Physical: The physical life of a facility is the period from construction to the time
when it is physically derelict. In reality, most buildings never reach this point as they
are demolished or refurbished for other reasons.
• Economic: The economic life of a facility is the point of time at which continued
occupation of a facility is considered to be the least cost effective option.
• Functional: The functional life of a facility is the period from occupation to when it
ceases to be functionally efficient or ‘fit for purpose’. Functional and economic
obsolescence are often closely related.
• Technological: This occurs when a facility or its components are no longer
technologically superior to alternatives and replacement is undertaken because of
expected lower operating costs or greater efficiency.
• Social: Community values and fashion can lead to the need for facility renovation or
replacement, such as environmental and social concerns which give rise to the
obsolescence of processes and products.
• Legal: Revised safety regulations, facility standards, compliance issues or emerging
case law may lead to legal obsolescence.

Notwithstanding these six defined life expectancies, and there could be others, the life of an
asset is generally thought to be equal to its economic life. This is the period of time during
which the asset is able to make a positive contribution to the financial position of its owners,
both present and future. The concept of using ‘financial interest’ as the basis for assessing the
time horizon of a LCC study is not unique, but is increasingly giving way to ‘sustainability
considerations’.

Techniques
LCC analysis is most commonly a technique for examining the future economic
consequences of choices made now. It may be used to assess the relative
financial merit of a particular proposal or to choose between options. Hence, LCC analysis,
regardless of the selected model, is typically characterised by:
• a defined time span or investment horizon,
• analysis of economic consequences of current decisions,
• the concept of discount rates to express future and present costs, and
• the use of sensitivity factors.
The central principle of LCC analysis is the value of money over time – a shilling today is
worth more than a shilling in the future, due to its earning potential if invested in the interim
versus the ‘opportunity cost’ of its alternative use.
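
The time value of money, and the sensitivity of a result to the chosen discount rate, can be illustrated with a minimal sketch. It uses the standard single-payment discounting relation, present value = future amount / (1 + rate)^years. The overhaul cost, the 10-year horizon, and the three discount rates are hypothetical.

```python
def present_value(future_amount, discount_rate, years):
    """Discount a single future payment back to today."""
    return future_amount / (1.0 + discount_rate) ** years


# Hypothetical case: an overhaul costing 1000 shillings, due in 10 years.
overhaul_cost = 1000.0
for rate in (0.05, 0.10, 0.15):
    pv = present_value(overhaul_cost, rate, 10)
    print(f"discount rate {rate:.0%}: present value = {pv:.2f} shillings")
```

Running the same case at several rates is a simple form of sensitivity analysis: the higher the assumed discount rate, the less weight deferred expenditure carries in the comparison.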

There are three principle methods of evaluating the LCC of a facility or its components;
1. Current Cost Aggregation: This method simply adds the total capital costs to the
total of all expenditure on operating, repairs, maintenance and replacement over the
facility’s life. This calculation is usually expected to show up in good light proposals
to spend more on the capital cost in order to reduce annual expenditure. However the
‘Simple Aggregation’ has no standing in management accounting terms; this is
because it ignores the highly significant effect of Discounted Cash Flow (DCF) on the
real value of deferred expenditure.
2. Net Present Value: This method, the aggregation of initial and annual expenditure is
modified by deduction of the interest (at the chosen rate) theoretically earned on the
money invested during the period from inception of the project to the actual date of
payment for the facility.
3. Annual Equivalent: This expresses the aggregated amounts in terms of the
‘mortgage payable’ on the initial cost added to the typical annual costs of operating,
repair and replacement. This quick method, which provides a ‘snapshot’ at a
point in time, can also be used for comparison of rental options. When considering the
provision for future liabilities, it forms the basis of a ‘sinking fund’ formulation.
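As a rough illustration, the three methods can be sketched in Python as below. The figures (a capital cost of 100,000, annual costs of 10,000 over 5 years and a 10% discount rate) and the function names are hypothetical and are not taken from this manual; annual costs are assumed to fall at the end of each year.

```python
# Illustrative sketch only: all figures and function names are hypothetical.

def current_cost_aggregation(capital, annual_costs):
    """Method 1: add the capital cost to all later expenditure, undiscounted."""
    return capital + sum(annual_costs)


def net_present_value(capital, annual_costs, rate):
    """Method 2: discount each year's expenditure back to the present.

    Assumes costs fall at the end of years 1, 2, ..., n.
    """
    return capital + sum(
        cost / (1 + rate) ** year
        for year, cost in enumerate(annual_costs, start=1)
    )


def annual_equivalent(capital, annual_cost, rate, years):
    """Method 3: 'mortgage' payment on the capital plus the typical annual cost."""
    mortgage_factor = rate / (1 - (1 + rate) ** -years)
    return capital * mortgage_factor + annual_cost


capital = 100_000            # assumed initial cost
annual_costs = [10_000] * 5  # assumed operating and repair cost per year
rate = 0.10                  # assumed discount rate

aggregate = current_cost_aggregation(capital, annual_costs)
npv = net_present_value(capital, annual_costs, rate)
annual = annual_equivalent(capital, annual_costs[0], rate, len(annual_costs))

print(f"Current cost aggregation: {aggregate:,.2f}")
print(f"Net present value:        {npv:,.2f}")
print(f"Annual equivalent:        {annual:,.2f}")
```

With these assumed figures the discounted total is lower than the simple aggregate, which is the effect of deferred expenditure that simple aggregation ignores.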

Each of these methods can be calculated with or without variables such as inflation, tax, etc.
Other important techniques and tools available to support LCC decisions include Internal
Rate of Return (IRR), Break-even Points, Pay-back Periods and Cost Benefit Analysis.

5.2.7 Life Cycle Cost models


Life cycle cost estimation models fall into two groups: General models and specific models.
The data to be input into a life cycle cost model include the purchase price of the product,
mean time between failures (MTBF), mean time to repair (MTTR), average material cost of a
failure, labour cost per preventive maintenance action, labour cost per corrective maintenance
action, installation costs, training costs, the warranty coverage period, the cost of carrying spares
in inventory, and shipment forecasts over the course of the product’s useful life.

(a) General Life Cycle cost estimation models
All these models determine the total cost of an item over its life span, but they vary in the
methods they use to estimate many of the major costs used in the calculation. The following
two life cycle cost models demonstrate this point:
(i) Life cycle cost model I - This model calculates two major kinds of costs:
recurring and nonrecurring. The life cycle cost is expressed by
LCC = RC + NRC
where LCC is the product’s life cycle cost,
RC is the recurring cost, and NRC is the nonrecurring cost.

The major elements of the recurring cost are operating cost, maintenance cost, support
cost, manpower cost, and inventory cost. For the non recurring cost, they are
procurement cost, reliability and maintainability improvement cost, research and
development cost, installation cost, training cost, support cost, qualification approval
cost, life cycle management cost, test equipment cost, and transportation cost.

The present value of the recurring cost must be obtained, using discounting formulas,
before it is added to the non-recurring cost.

(ii) Life cycle cost model II – Three major costs form the life cycle cost in this
model: research and development cost, investment cost, and operations and
maintenance cost. Thus the life cycle cost is given by
LCC = RDC + IC + OMC
where RDC is the research and development cost,
IC is the investment cost, and
OMC is the operations and maintenance cost.

The components of the RDC are engineering design cost, which covers system
engineering, reliability, maintainability, human factors, predictability, electrical design,
mechanical design, and logistic support analysis; advanced research and development cost;
engineering development and test cost, for example, the cost of engineering models and of
testing and evaluation; engineering data cost; and program management cost.

The investment cost consists of construction cost, that is, the cost of manufacturing facilities,
test facilities, and operational facilities; manufacturing cost, including the cost of manufacturing
engineering, quality control, fabrication, assembly, tools and test equipment, test and inspection,
material, packing and shipping; and initial logistic support cost, that is, the cost of program
management, test and support equipment, initial spare and repair parts, provisioning of initial
inventory, first destination transportation, technical data preparation, and initial training and
training equipment.

The elements of the operations and maintenance cost are modification cost; disposal cost;
operations cost, that is, the cost of operations personnel, operational facilities, support and
handling equipment, and operator training; and maintenance cost, including the cost of
maintenance personnel, spare and repair parts, maintenance facilities, maintenance training,
maintenance of test and support equipment, transportation and handling, and technical data.

(b) Specific Life Cycle Cost Estimation Models


These models are tailored to meet specific needs. Two of them are as follows:
(i) Specific Life Cycle Cost Estimation Model I – This model estimates the life
cycle cost of switching power supplies. The cost is expressed by
LCCsp = IC + FC
where LCCsp is the life cycle cost of switching power supplies, IC is the initial
cost, and FC is the failure cost.

The failure cost, FC, is expressed by FC = λT(RC + SC)
where RC is the cost of repairs,
SC is the cost of spares,
λ is the constant failure rate of switching power supplies, and
T is the expected life of the switching power supply.

The cost of spares SC, is given by: SC =(USC) (θ) where θ is the fractional
quantity of spares for each active unit. USC is the unit spare cost.
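The switching power supply model above can be sketched in a few lines of Python. All input values below are hypothetical and chosen only to show the arithmetic; the repair and spares costs are assumed to be costs per failure, and the failure rate and expected life are assumed to use the same time unit (hours).

```python
# Illustrative sketch only: function names and input values are hypothetical.

def spares_cost(unit_spare_cost, spares_fraction):
    """SC = USC x theta (fractional quantity of spares per active unit)."""
    return unit_spare_cost * spares_fraction


def failure_cost(failure_rate, expected_life, repair_cost, spare_cost):
    """FC = lambda x T x (RC + SC); lambda x T is the expected number of failures."""
    return failure_rate * expected_life * (repair_cost + spare_cost)


def lcc_switching_power_supply(initial_cost, fail_cost):
    """LCCsp = IC + FC."""
    return initial_cost + fail_cost


sc = spares_cost(unit_spare_cost=500.0, spares_fraction=0.1)  # assumed values
fc = failure_cost(
    failure_rate=0.00002,   # assumed failures per hour
    expected_life=50_000,   # assumed life in hours
    repair_cost=200.0,      # assumed cost of one repair
    spare_cost=sc,
)
lcc_sp = lcc_switching_power_supply(initial_cost=1_000.0, fail_cost=fc)

print(f"SC = {sc:.2f}, FC = {fc:.2f}, LCCsp = {lcc_sp:.2f}")
```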

(ii) Specific Life Cycle Cost Estimation Model II – This model was developed to
estimate the life cycle cost of an early warning radar system. The radar system’s
life cycle cost is expressed by LCCer = Ca + Co + Cl
where LCCer is the life cycle cost of the early warning radar system,
Ca is the acquisition cost of the system,
Co is the operational cost of the system, and
Cl is the logistic support cost of the system.
Past experience indicates that Ca accounts for 28% of LCCer, Co for 12% of LCCer, and Cl
for 60% of LCCer.
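One possible use of these percentages, sketched below in Python, is to scale a known acquisition cost up to a rough life cycle cost. The acquisition cost of 14 million is a hypothetical figure, and the function name is illustrative only.

```python
# Illustrative sketch only: the acquisition cost is a hypothetical figure.

# Shares of the radar system's life cycle cost quoted in the text.
COST_SHARES = {
    "acquisition": 0.28,
    "operation": 0.12,
    "logistic_support": 0.60,
}


def estimate_lcc_from_acquisition(acquisition_cost):
    """Scale a known acquisition cost up to the total using its 28% share."""
    total = acquisition_cost / COST_SHARES["acquisition"]
    breakdown = {name: share * total for name, share in COST_SHARES.items()}
    return total, breakdown


total_lcc, breakdown = estimate_lcc_from_acquisition(14_000_000)

print(f"Estimated life cycle cost: {total_lcc:,.0f}")
for name, cost in breakdown.items():
    print(f"  {name}: {cost:,.0f}")
```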

Example 1
Assume that a company is considering buying an electric generator. Manufacturers A, B, and
C are bidding to sell the system. The table below presents data for generators produced by all
three manufacturers. Determine which of the three electric generators has the lowest life
cycle cost.

ITEM                              Manufacturer A’s   Manufacturer B’s   Manufacturer C’s
                                  generator          generator          generator
Procurement cost                  $1.5m              $2.0m              $1.8m
Expected useful life              15 years           15 years           15 years
Expected yearly operating cost    $80,000            $30,000            $40,000
Expected cost of a failure        $10,000            $12,000            $11,000
Failure rate per year             0.07 failures      0.08 failures      0.075 failures
Disposal cost                     $30,000            $40,000            $35,000
Annual rate of increase in cost   5%                 5%                 5%

Solution. Manufacturer A’s Electric Generator

The annual expected failure cost is
EFCA = λ x Expected cost of a failure = (0.07)(10,000) = $700
The present value of the failure cost is given by:
FCA = PVA = P[1 - (1 + i)^(-n)]/i
where FCA is the present value of the failure cost,
PVA is the present value of a sum of uniform annual payments,
P is the payment made each year (here EFCA),
i is the annual rate of increase in cost, and
n is the expected useful life in years.

FCA = 700[1 - (1 + 0.05)^(-15)]/0.05 = $7,265.7


The present value of the disposal cost is given by
DCA = PV = AMn/(1 + i)^n = 30,000/(1 + 0.05)^15 = $14,430.5
where AMn is the value of the item at the end of n years.

The present value of the operating cost is given by
OCA = 80,000[1 - (1 + 0.05)^(-15)]/0.05 = $830,372.6

Adding these three costs to the procurement cost, the life cycle cost of the electric generator
from manufacturer A is LCCA=1,500,000+14,430.5+830,372.6+7,265.7 = $2,352,068.8

Manufacturer B’s Electric generator


The annual expected failure cost is EFCB = (12,000)(0.08) = $960
The present value of the failure cost:
FCB = (960)[1 - (1 + 0.05)^(-15)]/0.05 = $9,964.4

The present value of the disposal cost:
DCB = PV = (40,000)/(1 + 0.05)^15 = $19,240.6

The present value of the operating cost:
OCB = PVA = (30,000)[1 - (1 + 0.05)^(-15)]/0.05 = $311,389.7

Adding these costs to the procurement cost, the life cycle cost of the electric generator from
manufacturer B is
LCCB = 2,000,000 + 19,240.6 + 311,389.7 + 9,964.4 = $2,340,594.7

Manufacturer C’s Electric Generator:


The annual expected failure cost is EFCC = (11,000)(0.075) = $825
The present value of the failure cost:
FCC = (825)[1 - (1 + 0.05)^(-15)]/0.05 = $8,563.2

The present value of the disposal cost:
DCC = PV = (35,000)/(1 + 0.05)^15 = $16,835.5

The present value of the operating cost:
OCC = PVA = (40,000)[1 - (1 + 0.05)^(-15)]/0.05 = $415,186.3

The life cycle cost of the electric generator from manufacturer C is:
LCCC = 1,800,000 + 16,835.5 + 415,186.3 + 8,563.2 = $2,240,585.0

Manufacturer C’s generator therefore has the lowest life cycle cost.
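The calculation in Example 1 can be checked with a short Python sketch such as the one below. The function and variable names are illustrative; the 5% rate is used as the discount rate and the end-of-life cost is discounted as a single sum, as in the solution above.

```python
# Sketch reproducing Example 1; function and variable names are illustrative.

def pv_of_annuity(payment, rate, years):
    """Present value of a uniform yearly payment: P[1 - (1 + i)^(-n)]/i."""
    return payment * (1 - (1 + rate) ** -years) / rate


def pv_of_single_sum(amount, rate, years):
    """Present value of a single amount paid after n years: AMn/(1 + i)^n."""
    return amount / (1 + rate) ** years


def life_cycle_cost(procurement, operating, cost_per_failure, failure_rate,
                    disposal, rate=0.05, years=15):
    """Procurement cost plus present values of disposal, operating and failure costs."""
    expected_failure_cost = failure_rate * cost_per_failure  # per year
    return (
        procurement
        + pv_of_single_sum(disposal, rate, years)
        + pv_of_annuity(operating, rate, years)
        + pv_of_annuity(expected_failure_cost, rate, years)
    )


# Data from the table in Example 1: procurement cost, yearly operating cost,
# cost of a failure, failures per year, end-of-life (disposal) cost.
generators = {
    "A": (1_500_000, 80_000, 10_000, 0.07, 30_000),
    "B": (2_000_000, 30_000, 12_000, 0.08, 40_000),
    "C": (1_800_000, 40_000, 11_000, 0.075, 35_000),
}

lcc = {name: life_cycle_cost(*data) for name, data in generators.items()}
cheapest = min(lcc, key=lcc.get)

for name, cost in lcc.items():
    print(f"Manufacturer {name}: ${cost:,.1f}")
print("Lowest life cycle cost: manufacturer", cheapest)
```

The printed totals may differ from the hand calculation by a few cents, because the worked solution rounds each intermediate present value.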

(c) Maintenance cost estimation models.


(i) Corrective maintenance cost estimation model - This model estimates the
corrective maintenance labour cost for a piece of equipment. The annual cost is
expressed by:

Ccm = (SOH)(LC)(MTTR)/MTBF, where
SOH represents the scheduled operating hours of the equipment,
LC is the maintenance labour cost per hour,
MTBF is the mean time between failures for the equipment, and
MTTR is the mean time to repair for the equipment.

Example 2: A heavy-duty motor is scheduled to operate for 3,000 hours annually. The
expected MTBF and MTTR of the motor are 1,000 hours and 10 hours, respectively.
Determine the annual labour cost of corrective maintenance for the motor if the maintenance
labour rate is $25 per hour.

Solution: Ccm=(3000)(25)(10)/1000=$750
The yearly labour cost is $750
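The same calculation can be written as a small Python function, sketched below with the data of Example 2. The function name is illustrative; the intermediate step makes explicit that SOH/MTBF is the expected number of failures per year.

```python
# Sketch of the corrective maintenance labour cost model; names are illustrative.

def corrective_maintenance_labour_cost(scheduled_hours, labour_rate, mttr, mtbf):
    """Ccm = (SOH)(LC)(MTTR)/MTBF."""
    expected_failures = scheduled_hours / mtbf  # failures per year
    repair_hours = expected_failures * mttr     # labour hours per year
    return repair_hours * labour_rate


# Data from Example 2: 3,000 h/year, $25/h, MTTR = 10 h, MTBF = 1,000 h.
annual_cost = corrective_maintenance_labour_cost(3000, 25, 10, 1000)
print(f"Annual corrective maintenance labour cost: ${annual_cost:,.0f}")
```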

(ii) Equipment maintenance cost estimation model - This model calculates the cost
of equipment maintenance with the formula: MC=PMC+CMC+SPIC
Where MC is the equipment maintenance cost.
PMC is the cost of preventive maintenance
CMC is the cost of corrective maintenance
SPIC is the cost of spare parts inventory.

The cost of preventive maintenance, PMC, is defined by
PMC = (STpm + TTpm)(UH)(R)/SIpm
where STpm is the scheduled time that preventive maintenance work will take,
TTpm is the expected travel time for preventive maintenance,
SIpm is the scheduled interval at which preventive maintenance takes place,
UH is the number of usage hours, or in-use time, per time period considered, and
R is the servicing engineer’s hourly rate, including the prorated parts cost.

Similarly, the corrective maintenance cost, CMC, is expressed by
CMC = (TTcm + MTTR)(UH)(R)/MTBF
where TTcm is the expected travel time for corrective maintenance,
MTTR is the mean time to repair for the equipment, and
MTBF is the mean time between failures for the equipment.

The cost of spare parts inventory, SPIC, is given by
SPIC = (OMC)(ICR), where
OMC is the original manufacturing cost of spare parts, and
ICR is the inventory rate, expressed as a percentage, including such factors as interest,
handling cost, and depreciation.

Example 3: Assume that for maintenance of a personal computer, the following values are
given:
• MTTR = 2 hours; MTBF = 7,500 hours
• UH = 4,500 hours per annum
• R = $400 per hour; STpm = 0.35 hour
• ICR = 8% per year; SIpm = 2,500 hours
• OMC = $1,000; TTpm = 0.25 hour; TTcm = 0.25 hour

Determine the annual maintenance cost for the personal computer.

Solution: SPIC = (1,000)(0.08) = $80
PMC = (0.35 + 0.25)(4,500)(400)/2,500 = $432
CMC = (0.25 + 2)(4,500)(400)/7,500 = $540
MC = 432 + 540 + 80 = $1,052

The annual maintenance cost is $1,052.
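A minimal Python sketch of the equipment maintenance cost model, using the data of Example 3, is given below. The function and argument names are illustrative and are not part of the model itself.

```python
# Sketch of the equipment maintenance cost model; names are illustrative.

def preventive_cost(st_pm, tt_pm, usage_hours, rate, si_pm):
    """PMC = (STpm + TTpm)(UH)(R)/SIpm."""
    return (st_pm + tt_pm) * usage_hours * rate / si_pm


def corrective_cost(tt_cm, mttr, usage_hours, rate, mtbf):
    """CMC = (TTcm + MTTR)(UH)(R)/MTBF."""
    return (tt_cm + mttr) * usage_hours * rate / mtbf


def spares_inventory_cost(manufacturing_cost, inventory_rate):
    """SPIC = (OMC)(ICR), with ICR given as a fraction per year."""
    return manufacturing_cost * inventory_rate


# Data from Example 3 (personal computer).
pmc = preventive_cost(st_pm=0.35, tt_pm=0.25, usage_hours=4500, rate=400, si_pm=2500)
cmc = corrective_cost(tt_cm=0.25, mttr=2, usage_hours=4500, rate=400, mtbf=7500)
spic = spares_inventory_cost(manufacturing_cost=1000, inventory_rate=0.08)
mc = pmc + cmc + spic  # MC = PMC + CMC + SPIC

print(f"PMC = ${pmc:,.2f}, CMC = ${cmc:,.2f}, SPIC = ${spic:,.2f}")
print(f"Annual maintenance cost MC = ${mc:,.2f}")
```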

5.3 Cost and Performance of Equipment

5.3.1 Factors related to reliability


i) Cost - There is always the risk that in our efforts to achieve a more reliable
product, we may price ourselves out of the market.

ii) Delivery time – If our deliveries take too long, our customers may buy from a
competitor. However, the time allowed for delivery determines the time available
for design, development, manufacture and test, so if the delivery time is
impossibly short both the quality and reliability of our products may suffer.
iii) Quality and performance – We must ensure that what we supply satisfies our
customers. If our quality and performance are too low, our customers may not
buy; if they are unnecessarily high, our prices may also be too high.
iv) Safety – Safety is important and is part of reliability, since equipment which
injures a person is clearly not working as intended.
v) Weight and volume – Both weight and volume should be reduced to a minimum.
vi) Extensibility – The ease with which the equipment can be extended or enlarged
(as future market requirements increase).
vii) Adaptability – Adaptability is concerned with the addition of more facilities.
However, many facilities, even though they can be added later, are best and
cheapest if included in the original requirement.

5.3.2 The Design Profile


When we want to buy new equipment we want it to have the following attributes:
i) Absolute reliability
ii) Extreme cheapness
iii) Immediate delivery
iv) Very cheap and easy to maintain
v) Excellent performance, quality, etc.
vi) Absolute safety
vii) Negligible weight and volume, etc.

Unfortunately the above is quite impossible, because these factors are interrelated. The
greater our demand on quality, reliability, performance, safety, weight and volume, the
greater the cost is likely to be and the longer the delivery time. Therefore, for each individual
case we have to balance one against the other, i.e. optimise or trade off one factor against
another.

6.0 Maintainability and Reliability terms and definitions


1. Maintainability is the action taken during the design and development of assets to
include features that will increase ease of maintenance and will ensure that when used
in the field the asset will have minimum downtime and life-cycle support costs i.e. its
serviceability, reparability, and cost-effectiveness of maintenance are increased.

2. Maintenance is the action taken by the user of an asset to keep it in operable condition
or repair it to operable condition.

[Maintenance is the act of repairing or servicing equipment, while
maintainability is a design parameter intended to minimize repair time.]

3. Maintenance frequency: This is the frequency with which each maintenance
action must be conducted and is central to the preventive, scheduled, or corrective
maintenance requirements of a system.

4. Downtime: This is the total time during which the product is not in an adequate
operating state.

5. Reparability: This is the probability that a failed product will be repaired to its
operational state within a given active repair time.

6. Serviceability: This is the degree of difficulty or ease with which a product can be
restored to its operable state.

7. Accessibility: This is the ease and rapidity with which an item can be reached and the
required maintenance performed. Poor accessibility leads to increased downtime and,
in turn, lower revenue. Thus, one design goal should be to provide access to failed
parts that does not require removing other parts.

8. Simplicity: This is the simplification of maintenance tasks associated with the
system. System simplification helps to reduce the costs of spares and improves the
effectiveness of maintenance troubleshooting.

9. Visibility: This measures how readily the system part requiring maintenance can
be seen. A blocked view can significantly increase downtime.

10. Testability: This is the measure of fault detection and fault isolation ability. Fault
diagnosis speed can significantly influence downtime and maintenance costs.

11. Active repair time: This is that segment of downtime during which repair staff
work to effect a repair.

12. Logistic time: This is that segment of downtime occupied by the wait for a needed
part or tool.

13. Interchangeability: This is the extent to which one item can be readily replaced
with an identical item without a need for recalibration. Such flexibility in design
reduces maintenance work and in turn maintenance costs.

14. State-of-the-art: Technological advances can help to improve maintainability and
decrease maintenance costs.

15. Reliability is the probability that an item will carry out its stated function adequately
for the specified time interval when operated according to the designed conditions.
Reliability at time t = R(t) = (No. surviving at instant t)/(No. at start when t = 0)

16. Failure is the termination of the ability of an item to perform its required function
within the specified guidelines.

17. Hazard rate: This is the rate of change in the number of failed items divided by the
number of items that have not failed at time t.

18. The Mean Time Between Failures (MTBF) tells us how long on average, an
equipment operates before it fails, and this we want to be as long as possible. The
Mean Time To Repair (MTTR) tells us how long on average, it takes to put the
equipment right after it has failed, and this we want to be as short as possible.

19. Availability: This is the probability that a product is available for use when needed.
Average Availability = MTBF/(MTBF + MTTR)

20. Utilization factor = MTBF/(MTBF+ MTTR+ MTIBR)
(MTIBR = Mean time the equipment is idle between repairs).

21. In order to achieve reliability, we must have reliable parts/components and use as few
as possible.

22. Design adequacy: This is the probability that the product will complete its
intended mission successfully when it is used according to its design specifications.

23. Reliability and Maintenance Costs

[Figure: cost per item produced plotted against reliability, from low to high. Curve AB is the
cost of achieving reliability, curve CD is the cost of maintenance and repairs, and curve EF is
the total cost, which has its minimum at reliability Rm.]

At A the reliability is very low and the amount spent on reliability is also low. At B
reliability is very high and the amount spent on reliability is also high. However,
when reliability is low, maintenance costs from all the breakdowns are inevitably
high, as shown at C. As the reliability improves, the cost of maintenance falls, until
at D, as reliability approaches 1.00, maintenance costs approach zero.

By adding reliability and maintenance costs together we get curve EF, and find that
there is a particular reliability Rm for which the overall cost is a minimum.

24. Redundancy is the provision of more than one means of accomplishing a given
function. There are two types of redundancy – active redundancy and standby
redundancy.
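The ratio definitions in items 15, 19 and 20 above can be illustrated with a short Python sketch. The MTBF and MTTR values are those of the motor in Example 2; the idle time (MTIBR) and the survival counts are hypothetical figures chosen only for illustration.

```python
# Illustrative sketch of definitions 15, 19 and 20; names are illustrative.

def reliability(surviving, initial):
    """R(t) = (No. surviving at instant t)/(No. at start when t = 0)."""
    return surviving / initial


def average_availability(mtbf, mttr):
    """Average availability = MTBF/(MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def utilization_factor(mtbf, mttr, mtibr):
    """Utilization factor = MTBF/(MTBF + MTTR + MTIBR)."""
    return mtbf / (mtbf + mttr + mtibr)


r_t = reliability(surviving=95, initial=100)               # hypothetical counts
availability = average_availability(mtbf=1000, mttr=10)    # motor of Example 2
utilization = utilization_factor(mtbf=1000, mttr=10, mtibr=240)  # MTIBR assumed

print(f"R(t) = {r_t:.3f}")
print(f"Average availability = {availability:.4f}")
print(f"Utilization factor = {utilization:.4f}")
```

Because idle time is added to the denominator, the utilization factor can never exceed the average availability.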

7.0 Depreciation and Equipment Replacement


7.1 Economic Life and Obsolescence
When a company invests in an income-producing asset, the productive life of the asset is
estimated. For accounting purposes, the asset is depreciated over this period. It is assumed
that the asset will perform its function during this time and then be considered obsolete or
worn out, and replacement will be required.

Assume that a machine expected to have a productive life of 10 years is purchased. If at any
time during the ensuing 10 years a new machine is developed that can perform the same task
more efficiently or economically, the old machine has become obsolete. Whether it is “worn
out” or not is irrelevant.

The economic life of a machine is the period over which it provides the best method for
performing its task. When a superior method is developed, the machine has become obsolete.
Thus, the stated book value of a machine can be a meaningless figure.

7.2 Depreciation
Whenever a machine or piece of equipment performs useful work, wear and tear is bound to
occur. This can be minimised to some extent by proper care and maintenance but cannot
be totally eliminated. Its efficiency also falls with the lapse of time, and eventually it
becomes uneconomical to use further and needs to be replaced by a new unit.

This reduction in the equipment’s efficiency and value with the lapse of time during use is called
Depreciation. Some money, therefore, must be set aside yearly from the profits, so that
when the equipment becomes uneconomical, it can be replaced by a new one. Therefore,
the initial cost of equipment plus installation charges plus repair charges minus scrap value
is charged against overheads and is spread over the equipment’s useful life.

7.3 Obsolescence
Suppose a factory owner purchases a machine for his production shop, but after some time
a better machine comes onto the market whose production rate is much higher and which is
much more economical. Although the old machine is still efficient, it becomes outdated and
uneconomical because of the new, better machine. This is known as
Obsolescence, and some money should also be set aside from the profits for this cause. Since
it is very difficult to predict when a new and better machine will come to market, the general
practice is to reduce the assumed life of a machine so as to account for the effect of obsolescence.
The depreciation and obsolescence charges are then calculated over the reduced life.

Example: Suppose the life of a machine is expected to be 20 years; then the depreciation
rate will be 100/20 = 5%. Allowing for obsolescence, its life may be taken as 15 years.
Then the combined depreciation and obsolescence charge will be 100/15 = 6.67% instead of
5%. Therefore, the difference (6.67 - 5.00) = 1.67% will be the obsolescence charge.

7.4 Causes of Depreciation

Apart from wear and tear the other causes of depreciation are physical decay, accidents,
deferred maintenance and neglect, inadequacy and obsolescence. Depreciation can be broadly
classified as Depreciation due to physical condition and Depreciation due to Functional
condition.

Depreciation
• Depreciation due to physical condition: W/T, P/D, ACC, D/M/N
• Depreciation due to functional condition: INAD, OBSL

W/T – Wear and tear; P/D – Physical decay; ACC – Accidental; INAD – Inadequacy;
D/M/N – Deferred maintenance and neglect; OBSL – Obsolescence.

(i) Depreciation due to Wear and Tear – Whenever machinery performs work, wear
and tear of certain components takes place. Precautions such as proper lubrication
and cooling minimise wear and tear, but it cannot be totally prevented. Hence, the
cost of replacement that arises despite these precautions is the value of
depreciation due to wear and tear.
(ii) Depreciation due to physical decay – Certain items in a factory, such as
insulation material, furniture, electric cables, poles, buildings, chemicals and vessels,
decay because of climatic and atmospheric effects, so that the
value of these articles goes on reducing with the lapse of time. Although every effort
is made by the owner to keep them in serviceable condition, climatic and atmospheric
effects still reduce their value. This reduction
in value is depreciation due to physical decay.
(iii) Accidental Depreciation - Even if a machine was installed only a few
days ago and sufficient care is taken to prevent accidents, an accident may still
occur due to wrong operation, a loose component or some other cause,
and may result in heavy damage. The depreciation of the machine caused in this
way is called accidental depreciation. This kind of depreciation is taken care of by
insuring the equipment; the amount of the insurance premium depends upon the
estimated cost and life of the equipment.
(iv) Depreciation due to “Deferred Maintenance and Neglect” – If the manufacturer’s
instructions for any equipment are neglected and proper maintenance is not carried out as
recommended by the manufacturer, the value of the equipment may be reduced.
The resulting depreciation is called depreciation due to “deferred
maintenance and neglect.”
(v) Inadequacy – Inadequacy means reduction in efficiency of an asset. Even if
equipment is serviced with proper precautions and given sufficient
maintenance, its efficiency falls with the lapse of time. Secondly,
suppose that after, say, 2-3 years of running, the demand for the products manufactured by
a certain plant increases, but the plant cannot cope with the increased demand.
Additional money is then needed either to replace it with bigger machinery or
to install more plants of similar size. This is what is called depreciation due to
inadequacy.
(vi) Depreciation by Obsolescence - Because of technological and scientific
advancement, new machinery comes onto the market which is more efficient because of
new inventions and better design. Products of the same type made by the
new machine are much cheaper and better than those made by the existing one; hence the
existing machinery has to be replaced to withstand market competition. This is called
“depreciation by obsolescence” and is of the functional type.

7.4.1 Methods of calculating Depreciation


There are several methods used to calculate the depreciation fund. Some of them are:
• Straight Line Method
• Diminishing Balance Method
• Sinking Fund Method
• Insurance Policy Method
• Machine Hour Basis Method, etc.

(i) Straight Line Method - This method assumes that the loss of value of machine is
directly proportional to its age. It means one should deduct the scrap value from
the original value and divide the remaining value by the number of years of useful
life.
Let, C be the initial cost plus installation of a machine
S be the scrap value
N be number of years of life of machine, and
D be the depreciation amount per year.
Then D= (C-S)/N

This method is also known as the “Fixed Instalment” method, since the same fixed amount is
deducted each year and no account is taken of the maintenance and repair charges, which
gradually increase as the machine gets older.

Example 1: A boiler was purchased for Kshs.900,000/= on 1st January 1946; the erection and
installation cost was Kshs.140,000/=. The boiler was to be replaced by a new one on 31st
Dec. 1965. If the scrap value was estimated to be Kshs.300,000/=, what should be the rate of
depreciation, and what is the depreciation fund on 15th June 1955?

Solution:
(a) Total cost = Boiler cost + Erection and installation charges
:. C = Kshs.900,000/= + Kshs.140,000/= = Kshs.1,040,000/=
Scrap value, S = Kshs.300,000/=
Life of boiler = from 1st Jan. 1946 to 31st Dec. 1965 = 20 years
:. Rate of depreciation, D = (C – S)/N = (1,040,000 – 300,000)/20 = Kshs.37,000/= per year
By 15th June 1955, 9 complete yearly instalments will have been accumulated.
:. Depreciation fund collected up to 15th June 1955 = 9 x Kshs.37,000/= = Kshs.333,000/=

(b) Suppose that after 12 years the tubes are replaced at a cost of Kshs.30,000/=.
The book value after 12 years will be Kshs.1,040,000 – 12 x 37,000 =
Kshs.596,000/=. Adding the replacement cost, the new book value =
Kshs.596,000 + Kshs.30,000 = Kshs.626,000/=.

As the scrap value remains Kshs.300,000/=, the depreciation over the remaining 8 years will be
(Kshs.626,000 – Kshs.300,000) = Kshs.326,000/=.
:. New rate of depreciation p.a. = Kshs.(326,000/8) = Kshs.40,750/=.
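The straight line calculations of Example 1 can be reproduced with the Python sketch below. The function names are illustrative, and the number of instalments accumulated by 15th June 1955 is taken as 9, as in the solution.

```python
# Sketch of the straight line method using the boiler data; names are illustrative.

def straight_line_depreciation(cost, scrap, life_years):
    """D = (C - S)/N."""
    return (cost - scrap) / life_years


def book_value(cost, yearly_depreciation, years_elapsed):
    """Book value after a number of full yearly instalments."""
    return cost - yearly_depreciation * years_elapsed


cost = 900_000 + 140_000   # boiler plus erection and installation
scrap = 300_000
yearly = straight_line_depreciation(cost, scrap, life_years=20)
fund_june_1955 = 9 * yearly  # 9 instalments, as in the solution

# Part (b): tubes replaced after 12 years at Kshs.30,000/=.
value_after_12 = book_value(cost, yearly, years_elapsed=12)
revised_value = value_after_12 + 30_000
revised_yearly = straight_line_depreciation(revised_value, scrap, life_years=8)

print(f"Yearly depreciation: Kshs.{yearly:,.0f}")
print(f"Fund on 15th June 1955: Kshs.{fund_june_1955:,.0f}")
print(f"Revised yearly depreciation: Kshs.{revised_yearly:,.0f}")
```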

(ii) Diminishing Balance Method (also known as the “Reducing Balance” Method) – The
value of a machine falls fastest in the early years and more slowly later on.
Therefore, it is better to depreciate
more during the early years, when repairs and renewals are not costly.

So, under this method, the book value of the machine goes on decreasing as its
existence continues. A fixed percentage of the current book value is taken as the
depreciation each year.
Let X be the fixed percentage (expressed as a fraction) taken to calculate the yearly
depreciation on the book value. Then, X = 1 – (S/C)^(1/N)

Example 2: A lathe is purchased for Kshs.160,000/= and the assumed life is 10 years, and
scrap value is Kshs.40,000/=. If the depreciation is charged by Diminishing Balance
Method, calculate the percentage by which value of the lathe is reducing every year, and
depreciation fund after 2 years.
Solution: C = 160,000/=; S = 40,000/=; N = 10
Therefore, X = 1 – (S/C)^(1/N) = 1 – (40,000/160,000)^(1/10) = 0.1294
Therefore, required % = 12.94
:. Value of the lathe after 1 year = 160,000 (1 – 0.1294) = Kshs.139,296/=
:. Depreciation fund after 1 year = Kshs.160,000 – Kshs.139,296 = Kshs.20,704/=
Now, value of lathe after 2 years = Kshs.139,296 (1 – 0.1294) = Kshs.121,271/=
:. Depreciation of 2nd year = (139,296 – 121,271) = Kshs.18,025/=
:. Depreciation fund after 2 years = 20,704 + 18,025 = Kshs.38,729/=
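The diminishing balance calculation for the lathe can be sketched in Python as follows. The function names are illustrative. The rate is rounded to four decimal places to mirror the hand calculation, which is a choice made for this sketch only; using the unrounded rate changes the results slightly.

```python
# Sketch of the diminishing balance method using the lathe data; names are illustrative.

def diminishing_balance_rate(cost, scrap, life_years):
    """X = 1 - (S/C)^(1/N), as a fraction of the current book value."""
    return 1 - (scrap / cost) ** (1 / life_years)


def book_values(cost, rate, years):
    """Book value at the end of year 0, 1, ..., years."""
    values = [cost]
    for _ in range(years):
        values.append(values[-1] * (1 - rate))
    return values


exact_rate = diminishing_balance_rate(160_000, 40_000, 10)
rate = round(exact_rate, 4)  # rounded as in the hand calculation

values = book_values(160_000, rate, years=2)
fund_after_2_years = values[0] - values[2]

print(f"Rate of depreciation: {rate:.2%}")
print(f"Value after 1 year: Kshs.{values[1]:,.0f}")
print(f"Value after 2 years: Kshs.{values[2]:,.0f}")
print(f"Depreciation fund after 2 years: Kshs.{fund_after_2_years:,.0f}")
```

With the unrounded rate, the book value after the full 10 years returns exactly to the scrap value, which is the property the formula for X is designed to give.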

(iii) The Insurance Policy Method – This method covers the risk, if the machine becomes
unserviceable before its estimated life.
(iv) Machine Hour Basis Method – In this, rate of depreciation is calculated, considering
the total number of hours a machine runs in a year.

Example 3: The estimated life of a lathe is 10 years and it works 16 hours a day. The initial
cost of lathe is Kshs.160,000/= and scrap value after 10 years is Kshs.50,000/=. If the
machine works for 5,840 hours in a year, calculate the rate of depreciation charged annually
as per machine hour basis method.

Solution: N = 10; C = Kshs.160,000/=; S = Kshs.50,000/=
C – S = 160,000 – 50,000 = Kshs.110,000/=
:. Loss in the value of the lathe in 10 years = Kshs.110,000/=
Life of machine in hours = 10 x 365 x 16 = 58,400 hrs
:. Depreciation per hour = 110,000/58,400 = Kshs.1.884
:. Depreciation in one year = 1.884 x 5,840 ≈ Kshs.11,000/=
:. Rate of depreciation per year = Kshs.11,000/=.
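A minimal Python sketch of the machine hour basis calculation, using the data of Example 3, is shown below. The function name and the default of 365 working days per year (taken from the solution) are the only assumptions.

```python
# Sketch of the machine hour basis method; names are illustrative.

def depreciation_per_hour(cost, scrap, life_years, hours_per_day, days_per_year=365):
    """Spread the loss in value (C - S) over the total running hours of the machine."""
    life_hours = life_years * days_per_year * hours_per_day
    return (cost - scrap) / life_hours


# Data from Example 3: 10-year life, 16 h/day, 5,840 running hours per year.
per_hour = depreciation_per_hour(160_000, 50_000, life_years=10, hours_per_day=16)
per_year = per_hour * 5_840

print(f"Depreciation per hour: Kshs.{per_hour:.3f}")
print(f"Depreciation per year: Kshs.{per_year:,.0f}")
```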

7.5 The Decision Whether to Purchase


In principle, unless the benefits exceed the cost of ownership of a system by an acceptable
margin, the purchase should not be made. Once the cost of ownership has been
determined and the profits which will accrue have been estimated, the next step is to decide
whether or not to purchase a particular system. This will depend on the return on the capital
and the payback period. Three main methods are used to make this decision.

(a) The Return on Capital


In this method the total cost of owning the system and the total cash inflow which will be
received from it are calculated over a certain period.

Thus suppose the cost of ownership (capital investment) = Kshs.40,000/= over 10 years.
Total cash inflow = Kshs.60,000/= over 10 years.
Total profit = Kshs.20,000/=
:. Average profit per year = Kshs.2,000/=
:. Percentage return on outlay = 2,000/40,000 x 100% = 5%

The objection to this method is that it takes no account of when money has to be paid out or
of when it is received.
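The return on capital figure above can be reproduced with the short Python sketch below; the function name is illustrative. Note that the function takes only totals as input, which is exactly why the method cannot distinguish early cash flows from late ones.

```python
# Sketch of the return on capital calculation; the function name is illustrative.

def percentage_return_on_outlay(outlay, total_inflow, years):
    """Average yearly profit expressed as a percentage of the capital outlay."""
    average_profit = (total_inflow - outlay) / years
    return 100 * average_profit / outlay


# Figures from the text: Kshs.40,000/= outlay, Kshs.60,000/= inflow, 10 years.
return_pct = percentage_return_on_outlay(40_000, 60_000, 10)
print(f"Percentage return on outlay: {return_pct:.1f}%")
```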

(b) Payback Period


In this method we estimate how long it would take to recover the outlay, the assumption
being that after that the project starts to make a profit. The risk is that although a project may
recover its outlay very quickly it may then come to an end and never make any profit.

(c) Discounted Cash Flow


This is the only one of the three methods that takes full account of the financial situation. It
takes into account how much money is paid out or received and when this happens. The
rules are:
• The later we have to pay out money the better
• The earlier we receive money the better.
(This method is illustrated in section………….)
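As a rough illustration of methods (b) and (c), the Python sketch below applies them to a hypothetical project. The outlay of 40,000 and total inflow of 60,000 match the totals used under (a), but the even spread of 6,000 per year, the 10% discount rate, and the two alternative timing patterns are assumptions made only for this sketch.

```python
# Illustrative sketch only: cash flow timing and discount rate are assumed.

def payback_period(outlay, inflows):
    """First year in which cumulative inflows cover the outlay, or None if never."""
    cumulative = 0
    for year, inflow in enumerate(inflows, start=1):
        cumulative += inflow
        if cumulative >= outlay:
            return year
    return None


def discounted_net_value(outlay, inflows, rate):
    """Net present value: outlay at time 0, inflows at the end of each year."""
    return -outlay + sum(
        inflow / (1 + rate) ** year
        for year, inflow in enumerate(inflows, start=1)
    )


outlay = 40_000
rate = 0.10  # assumed discount rate

even_inflows = [6_000] * 10
payback = payback_period(outlay, even_inflows)
npv_even = discounted_net_value(outlay, even_inflows, rate)

# Same total inflow (60,000) received early or late.
npv_early = discounted_net_value(outlay, [12_000] * 5 + [0] * 5, rate)
npv_late = discounted_net_value(outlay, [0] * 5 + [12_000] * 5, rate)

print(f"Payback period: {payback} years")
print(f"NPV, even inflows:  {npv_even:,.2f}")
print(f"NPV, early inflows: {npv_early:,.2f}")
print(f"NPV, late inflows:  {npv_late:,.2f}")
```

Under these assumptions the project shows a 5% average return and pays back within its life, yet its discounted value is negative, and receiving the same money earlier gives a higher value than receiving it later, in line with the two rules stated above.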

7.6 Installation of new Equipment
Once we have decided to purchase a piece of equipment, we must plan for its installation.
This involves:
(a) The erection of a suitable building to house it.
(b) Preparation of a base or foundations to receive it and the provision of services such as
gas, electricity, water, drainage and compressed air.
(c) Receipt of the equipment and if necessary its storage elsewhere until its permanent
site is ready.
(d) Installation.
(e) Commissioning.
Items (a), (b) and (c) are usually the customer’s responsibility although they may be done to
instructions and drawings supplied by the manufacturer. The latter usually supervises items
(d) and (e).

7.7 Equipment Replacement

Equipment replacement decisions play an important role in the economic running of any
organization. There are three types of replacement decisions:
(a) The replacement of capital equipment, as it wears out
(b) The capital equipment required for expansion.
(c) The replacement of equipment due to obsolescence
i.e. the introduction of an improved technology/equipment in the market, which
may produce cheaper products.

Equipment is used to produce at a profitable rate, so that production can withstand
competition. A replacement decision is not easy; it requires several considerations.
Since it involves large capital investment, a wrong decision may adversely affect the
profitability of the whole concern. Therefore, a scientific approach to this problem is
essential. For this purpose Break-Even Analysis is very useful.

Almost all equipment is subject to deterioration and obsolescence to a varying degree with
the passage of time, so its operating inferiority increases with age. An old machine therefore
has a high operating inferiority and a low book value, while a new machine to be purchased
has minimum operating inferiority and maximum capital cost. The decision maker must choose
between more capital cost and less imperfection on the one hand, and less capital cost and
more imperfection on the other.

Although replacement reduces maintenance cost, it involves high capital cost. Therefore
equipment is replaced when the maintenance and capital cost of the existing equipment is
more than the average capital and operating cost of the new equipment.

7.7.1 Reasons for Replacement of Equipment


The main reasons for replacement of the equipment are:

(a) Deterioration
It becomes necessary to replace a machine when it wears out and does not function
properly. Such a machine lowers the quality of products, decreases production and
increases labour and maintenance costs.

(b) Obsolescence
Whenever new equipment (due to new technology) comes onto the market which is capable of
producing more good quality products with less labour and greater efficiency, the existing
machine is replaced with the new machine even though it is functioning well. Generally this
is necessitated because the products manufactured by the new machine will be cheaper.

(c) Inadequacy:
When the product design changes to meet customers' demand, or the quantity to be
manufactured changes, old machinery becomes inadequate and different manufacturing
equipment is called for.

7.7.2 Equipment Replacement Policy


A manufacturer sometimes faces the serious problem of replacing a piece of equipment.
Replacement is a continuous process, and therefore a set system or policy for it must be
evolved. The main consideration is when to replace, but deciding when requires many
considerations before a suitable conclusion can be reached.

When replacing equipment before the expiry of the estimated life, the following reasons or
factors should be considered:
(i) To reduce production cost
(ii) To reduce fatigue
(iii) To raise quality
(iv) To increase output
(v) To secure greater convenience, safety and reliability.

7.7.3 Guidelines in Replacement Analysis

(a) Financial Factors


There are certain rules, which may be used as guidelines for replacement analysis:
(i) For equipment in use:
• Do consider operating cost, repairs and maintenance cost, downtime cost,
salvage value, and rebuilding cost
• Do not consider original cost, money already spent on repairs and
maintenance, and unrealistic book value.

(ii) For new equipment:


• Do consider initial cost, interest on capital investment, salvage value at
the end of useful life, cost advantage of improved product, and labour
savings.
• Do not consider any savings not clearly assessable, and overhead
charges.

(b) Technical Factors


The status of present equipment i.e. whether it is worn out completely:
(i) Whether the present equipment has become obsolete.
(ii) Whether the present equipment is inadequate in meeting the production rate.
(iii) Whether the present equipment can hold tight tolerances
(iv) Whether the present equipment can provide the required surface finish
(v) Whether the new equipment is better from working condition point of view
including noise and vibrations and it has lesser pollution.
(vi) Whether the new equipment requires less maintenance.

7.7.4 Methods used for Replacement

The following methods are used for equipment replacement:


(a) Pay back period method
(b) Total life average method
(c) Present worth method
(d) Rate of return method
(e) Discounted rate of return method
(f) MAPI method

(a) Pay Back Period Method


This method determines as to how long it will take to pay back the invested capital. The
period can be determined using the following formula:
P=C/R, where P=pay back period
C=Original capital investment
R=Annual return expected (i.e. total annual earning after deducting taxes)

This method is not very reliable, as it does not take into account insurance, interest and
maintenance costs. Further, the return is generally low in the beginning and increases
gradually, whereas the formula treats it as constant. The method also does not consider
depreciation and obsolescence.
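
As a quick illustration of the formula P = C/R, the sketch below computes the pay back
period for a constant annual return and, for comparison, for returns that differ from year
to year (the case the simple formula ignores). The function names and all figures are
illustrative assumptions, not taken from the text.

```python
def payback_period(capital, annual_return):
    """Pay back period P = C / R for a constant annual return."""
    if annual_return <= 0:
        raise ValueError("annual return must be positive")
    return capital / annual_return


def payback_period_uneven(capital, returns):
    """Years needed to recover the capital when returns differ by year.

    Assumes each year's return arrives evenly through that year.
    Returns None if the capital is never recovered.
    """
    remaining = capital
    for year, r in enumerate(returns, start=1):
        if r > 0 and r >= remaining:
            return (year - 1) + remaining / r
        remaining -= r
    return None


# Hypothetical figures: Kshs.40,000 invested.
print(payback_period(40_000, 10_000))                                   # 4.0
print(payback_period_uneven(40_000, [5_000, 10_000, 15_000, 20_000]))   # 3.5
```

Neither function accounts for interest, maintenance or depreciation, which is exactly the
weakness of the method noted above.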

(b) Total Life Average Method


In this method, all the costs involved in buying, operating and maintaining a piece of
equipment or asset are added together into one total life figure, and this sum is divided
by the total estimated life to get an average annual cost.
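
The calculation can be sketched as follows. The cost categories follow the description
above; the function name and the figures are hypothetical.

```python
def average_annual_cost(purchase_cost, operating_costs, maintenance_costs):
    """Total life average: all life costs added together, divided by the life.

    operating_costs and maintenance_costs hold one figure per year of the
    estimated life, so their length is taken as the life in years.
    """
    if len(operating_costs) != len(maintenance_costs):
        raise ValueError("need one operating and one maintenance figure per year")
    life_years = len(operating_costs)
    total_life_cost = purchase_cost + sum(operating_costs) + sum(maintenance_costs)
    return total_life_cost / life_years


# Hypothetical machine with a 5-year estimated life.
cost = average_annual_cost(
    purchase_cost=50_000,
    operating_costs=[8_000] * 5,
    maintenance_costs=[1_000, 1_500, 2_000, 2_500, 3_000],
)
print(cost)  # 20000.0
```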

(c) Present worth Method


This method is more accurate and reasonable and is used to evaluate the present value of new
equipment. For the purpose of comparison, future costs are translated into today's money.

The “present worth” of a future sum of money is the amount which, if invested today at a
given interest rate, would grow to that sum after the given number of years.

Example – If we have Kshs.5,000/= to invest and want to know what it will be worth in ten
years at 10% interest, we use the formula F = P(1 + i)^n, where
F = worth of money in future; P = present amount of money
i = interest rate; n = number of years

F = 5,000(1 + 0.10)^10 = 5,000(2.594) = Kshs.12,970/=

Thus Kshs.5,000/= today is worth Kshs.12,970/= ten years from now or, in other words,
Kshs.12,970/= ten years from now has a present worth of Kshs.5,000/=.
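
The compounding formula and its inverse can be checked with a few lines of code. This is
only a sketch; the function names are assumptions introduced for illustration.

```python
def future_worth(present, rate, years):
    """F = P(1 + i)^n."""
    return present * (1 + rate) ** years


def present_worth(future, rate, years):
    """P = F / (1 + i)^n, the inverse of future_worth."""
    return future / (1 + rate) ** years


f = future_worth(5_000, 0.10, 10)
# Unrounded arithmetic gives about 12,969; the hand calculation's 12,970
# comes from rounding the compounding factor to 2.594 first.
print(round(f))
print(round(present_worth(f, 0.10, 10)))  # back to 5000
```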

While comparing two alternatives all the costs must be translated into present worth and they
must be compared for equal length of services, e.g. if machine A has its life of 3 years and
machine B has 6 years, then to compare the two machines we must compare the present
worth value of 6 years service of two machines of type A and one machine of type B.

Example: Machine A, operated manually, costs Kshs.2,000/= and has a life of 2 years, while an
automatic machine B costs Kshs.5,000/= but has a life of 4 years. The operating cost for
machine A is Kshs.4,000/= per year while that of machine B is Kshs.3,000/= per year. Which
should be purchased? Consider 10% interest.

Solution
To compare the two machines over an equal period we must consider two machines A against one
machine B. Each expense is to be converted into present worth.

Machine A
Expenses converted to present worth:
(i) Present worth of cost of first piece = Kshs.2,000/=
(ii) Present worth of operating cost of first piece:
• in the first year, P = F/(1 + i)^n = 4,000/(1 + 0.10) = Kshs.3,636/=
• in the second year, P = 4,000/(1 + 0.10)^2 = Kshs.3,306/=

(iii) Present worth of cost of second piece of machine A, purchased after the expiry
of life of the first piece, i.e. after 2 years,
= 2,000/(1 + 0.10)^2 = Kshs.1,653/=
(iv) Present worth of operating cost of second piece:
• in the third year = 4,000/(1 + 0.10)^3 = Kshs.3,005/=
• in the fourth year = 4,000/(1 + 0.10)^4 = Kshs.2,732/=

Thus the total expenditure in terms of present worth required when machine A is used =
Kshs.16,332/=.

Machine B
Expenses converted to present worth:
(i) Present worth of cost of machine B = Kshs.5,000/=
(ii) Present worth of operating cost:
• in the first year = 3,000/(1 + 0.10) = Kshs.2,727/=
• in the second year = 3,000/(1 + 0.10)^2 = Kshs.2,479/=
• in the third year = 3,000/(1 + 0.10)^3 = Kshs.2,254/=
• in the fourth year = 3,000/(1 + 0.10)^4 = Kshs.2,049/=

Total expenditure on machine B in terms of present worth = Kshs.14,509/=

Thus machine B is more economical, as its total present worth of expenses is lower, and
hence machine B should be purchased.
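
The comparison above can be reproduced with a short script. It is a sketch under the same
assumptions as the worked example (purchases paid at the start of the year in which they
are made, operating costs discounted from the end of each year); the function names are
hypothetical. Because the script does not round each term before adding, its totals can
differ from the hand calculation by a shilling.

```python
def present_worth(amount, rate, year):
    """Present worth of an amount paid 'year' years from now."""
    return amount / (1 + rate) ** year


def total_present_worth(purchases, operating_cost, rate, horizon):
    """Present worth of all purchases plus yearly operating costs.

    purchases maps the year of purchase (0 = today) to the purchase cost.
    operating_cost is paid once per year for years 1..horizon.
    """
    capital = sum(present_worth(c, rate, y) for y, c in purchases.items())
    operating = sum(
        present_worth(operating_cost, rate, y) for y in range(1, horizon + 1)
    )
    return capital + operating


# Two machines A (bought today and after 2 years) against one machine B.
pw_a = total_present_worth({0: 2_000, 2: 2_000}, 4_000, 0.10, 4)
pw_b = total_present_worth({0: 5_000}, 3_000, 0.10, 4)

print(round(pw_a), round(pw_b))  # 16332 14510
print("buy B" if pw_b < pw_a else "buy A")  # buy B
```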

(d) Rate of Return Method


In this method average annual net income (after tax and depreciation deductions) is expressed
as percentage of capital investment. The formula used is:

Percentage rate of return = (Earnings per year)/( Net Investment) x 100%

However, this method has the drawback that earnings in later years do not have the same
value as the same amounts today (their present worth is lower). The method is therefore more
useful and practical if the earnings of all years are first converted to present worth before
the rate of return is calculated. The following method is an improvement in this respect.

(e) Discounted Rate of Return


This is an improvement of the Rate of Return Method. In this case the following formula is
used: C = R/(1 + r)^n, where

C = Investment cost
R = Expected earning in the nth year
r = Rate of return

If we have to find the rate of return when a machine has worked for n years and the earnings
in each year are R1 (in the first year), R2 (in the second year), R3 (in the third year) etc.,
then the following formula is used (it is solved for r by trial and error):

C = R1/(1 + r)^1 + R2/(1 + r)^2 + R3/(1 + r)^3 + ... + Rn/(1 + r)^n + S/(1 + r)^n
where S = salvage value of the machine after n years.

Example: Find out whether a machine having the following particulars should be purchased or
not if the rate of interest is 10%.

Cost of Machine Kshs.4,000/=


Expected return in first year Kshs.2,400/=
Expected return in second year Kshs.1,600/=
Expected return in third year Kshs.1,400/=
Salvage value at the end of third year Kshs. 400/=

Solution: C = R1/(1 + r) + R2/(1 + r)^2 + R3/(1 + r)^3 + S/(1 + r)^3

∴ 4000 = 2400/(1 + r) + 1600/(1 + r)^2 + 1400/(1 + r)^3 + 400/(1 + r)^3
By trial and error, r ≈ 0.226, i.e. about 22.6% (r = 0.22 gives a right-hand side of about
4,033 and r = 0.23 gives about 3,976).

Since this return of about 22.6% on the investment is higher than the rate of interest of
10%, it is worthwhile to purchase the machine.
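
The trial and error can be made systematic. The sketch below solves the equation of the
example by bisection, which simply halves the interval containing r until it is narrow
enough. Bisection is one possible choice, not a method named in the text, and the function
names and the search interval of 0 to 100% are assumptions.

```python
def present_value(rate, earnings, salvage):
    """Present worth of yearly earnings plus salvage received in the last year."""
    n = len(earnings)
    pv = sum(r / (1 + rate) ** k for k, r in enumerate(earnings, start=1))
    return pv + salvage / (1 + rate) ** n


def discounted_rate_of_return(cost, earnings, salvage=0.0, lo=0.0, hi=1.0, tol=1e-6):
    """Find r such that cost = present_value(r), assuming lo <= r <= hi."""
    if not present_value(lo, earnings, salvage) >= cost >= present_value(hi, earnings, salvage):
        raise ValueError("rate of return is not between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Present value falls as the rate rises, so a value that is still
        # too high means the rate of return lies above mid.
        if present_value(mid, earnings, salvage) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


r = discounted_rate_of_return(4_000, [2_400, 1_600, 1_400], salvage=400)
print(f"rate of return = {r:.1%}")
print("purchase" if r > 0.10 else "do not purchase")
```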

(f) MAPI Method


The term MAPI stands for the Machinery and Allied Products Institute of Washington, which
developed this method. Almost all equipment is subject to deterioration and obsolescence to
a varying degree with the passage of time, so its operating inferiority increases with age.
An old machine therefore has a high operating inferiority and a low book value, while a new
machine to be purchased has minimum operating inferiority and maximum capital cost. The
decision maker must choose between more capital cost and less imperfection on the one hand,
and less capital cost and more imperfection on the other. The MAPI approach helps in making
this decision. The existing equipment, which is to be replaced, is known as the defender,
and the new equipment, which will replace it, is known as the challenger.

To estimate whether the proposed replacement is profitable, the adverse minimum of the
defender and that of the challenger are found and compared. The adverse minimum of the
defender or the challenger is the lowest sum of the time-adjusted average of capital cost
and operating inferiority (expressed in terms of money) obtainable from a machine. The
calculations can easily be done with the help of MAPI Charts.
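
To make the definition of the adverse minimum concrete, the sketch below computes it for a
challenger under deliberately simple assumptions that are not stated in the text: no
salvage value, and an operating inferiority that is zero in the first year and grows by a
fixed amount each year. It is not a substitute for the MAPI Charts; the function names and
figures are hypothetical.

```python
def capital_recovery_factor(rate, n):
    """Converts a present sum into n equal end-of-year amounts."""
    growth = (1 + rate) ** n
    return rate * growth / (growth - 1)


def time_adjusted_average(capital_cost, inferiority_gradient, rate, n):
    """Equivalent yearly cost of capital plus operating inferiority over n years.

    Assumes inferiority in year k is inferiority_gradient * (k - 1)
    and that the machine has no salvage value.
    """
    pv_inferiority = sum(
        inferiority_gradient * (k - 1) / (1 + rate) ** k for k in range(1, n + 1)
    )
    return (capital_cost + pv_inferiority) * capital_recovery_factor(rate, n)


def adverse_minimum(capital_cost, inferiority_gradient, rate, max_life):
    """Lowest time-adjusted average over service lives 1..max_life."""
    return min(
        ((n, time_adjusted_average(capital_cost, inferiority_gradient, rate, n))
         for n in range(1, max_life + 1)),
        key=lambda pair: pair[1],
    )


# Hypothetical challenger: Kshs.10,000, inferiority growing Kshs.500 per year.
life, cost = adverse_minimum(10_000, 500, 0.10, 20)
print(life, round(cost))
```

A short life spreads the capital cost over too few years, while a long life accumulates
operating inferiority; the adverse minimum is the lowest point between the two.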

8.0 Acquisition of Failure Data


Failure and repair data are invaluable in reliability and maintainability studies. The uses of
such data include determining product maintenance needs, predicting product maintainability,
performing life cycle cost studies, and determining product replacement policies. There are
many sources of such data. For example, during the product life cycle, data sources include
reports generated by the repair facility, past experience with similar or identical items,
failure reporting systems developed and used by customers, and warranty claims.

The most important data are failure probabilities and assessments of criticality of the failures,
part failure rates, probability of operator error, and inspection efficiency data. Part failure rate
data can come from experience, banks of generic failure data, and other sources.

The data banks for operator error data fall into three categories: experimentally based data
banks, field based data banks, and subjectively based data banks. The experimentally based
data banks contain data gathered in the laboratory. The field based data banks are based upon
data gathered during operation and so often provide more realistic information. The
subjectively based data banks contain data generated by various techniques.

8.1 Reasons for data collection


Failure data can be collected from prototypes and production models or from the field. In
either case a formal failure-reporting document is necessary in order to ensure that the
feedback is both consistent and adequate. Field information is far more valuable since it
concerns failures and repair actions which have taken place under real conditions. Since
recording field incidents relies on people, it is subject to errors, omissions and
misinterpretation. It is therefore important to collect all field data using a formal
document. Information of this type has a number of uses, the main two being feedback
resulting in modifications to prevent further defects, and the acquisition of statistical
reliability and repair data. In detail, failure data:

(a) Indicate design and manufacture deficiencies and can be used to support reliability
growth programs; these concern the improvement in reliability, during use, which
comes from field data feedback resulting in modifications. Improvements depend on
ensuring that field data actually lead to design modifications. Reliability growth, then,
is the process of eliminating design-related failures.
(b) Provide quality and reliability trends.
(c) Identify wear-out and decreasing failure rates.
(d) Provide subcontractor ratings.
(e) Contribute statistical data for future reliability and repair time predictions.
(f) Assist second-line maintenance, which is carried out for the purpose of:
• Scheduled overhaul and refurbishing of units returned from preventive
maintenance;
• Unscheduled repair and/or overhaul of modules which have failed or become
degraded.

Deeper diagnostic capability is needed for this work, and therefore the larger,
more complex test equipment will be found at the workshop together with full
system information.
(g) Enable spares provisioning to be refined.
(h) Allow routine maintenance intervals to be revised.
(i) Enable the field element of quality costs to be identified.

A failure-reporting system should be established for every project and product. Customer
cooperation with a reporting system is essential if feedback from the field is required (and
this could well be sought, at the contract stage, in return for some concession).

8.2 Information and Difficulties


A failure report form must collect information covering the following:
• Repair time – active and passive
• Type of fault – primary or secondary, random or induced, etc.
• Nature of fault – open or short circuit, drift condition, wear out, design deficiency.
• Fault location – exact position and detail of component or assembly.
• Environmental conditions – where these are variable, record conditions at time of
fault if possible.
• Action taken – exact nature of replacement or repair.
• Personnel involved.
• Spares used.
• Unit running time.
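
For computer based recording, the items above map naturally onto a fixed record structure.
The sketch below is one possible layout; every field name is an assumption chosen for
illustration, and the tag number field anticipates the item identification discussed later
in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureReport:
    """One failure report; field names are illustrative, not a standard."""
    tag_number: str              # identifies the specific item that failed
    active_time_hours: float     # time spent actually working on the fault
    passive_time_hours: float    # waiting time (spares, access, travel)
    fault_type: str              # e.g. "primary", "secondary", "induced"
    fault_nature: str            # e.g. "open circuit", "wear out"
    fault_location: str          # component or assembly affected
    environment: str             # conditions at the time of the fault
    action_taken: str            # exact replacement or repair carried out
    unit_running_hours: float    # running time of the unit at failure
    personnel: List[str] = field(default_factory=list)
    spares_used: List[str] = field(default_factory=list)

    def total_time_hours(self) -> float:
        """Active plus passive time for this report."""
        return self.active_time_hours + self.passive_time_hours


# Hypothetical report.
report = FailureReport(
    tag_number="P-101",
    active_time_hours=2.5,
    passive_time_hours=4.0,
    fault_type="primary",
    fault_nature="wear out",
    fault_location="drive-end bearing",
    environment="dusty, 35 C ambient",
    action_taken="bearing replaced",
    unit_running_hours=6200.0,
    personnel=["fitter on shift"],
    spares_used=["bearing"],
)
print(report.total_time_hours())  # 6.5
```
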
The main problems associated with failure recording are:
i) Inventories – whilst failure reports identify the numbers and types of failure they
rarely provide a source of information as to the total numbers of the items in question
and their installation dates and running times.
ii) Motivation – if the field personnel can see no purpose in recording information it is
likely that items will be either omitted or incorrectly recorded. If the person is
frustrated by unrealistic time standards, poor working conditions and inadequate
instructions, then the failure report is the first task which will be skimped or omitted.
iii) Verification – Once the failure report has left the person who completes it the
possibility of subsequent checking is remote. If repair times and diagnoses are suspect
then it is likely that they will go undetected or be unverified. Where failure data are
obtained from customer’s staff, the possibility of challenging information becomes
even more remote.
iv) Cost – Failure reporting is costly in terms of both the time to complete failure-report
forms and the hours of interpretation of the information. For this reason, both supplier
and customer are often reluctant to agree to a comprehensive reporting system. (If the
information is correctly interpreted and design or manufacturing action taken to
remove failure sources, then the cost of the activity is likely to be offset by the
savings and the idea must be sold on this basis).
v) Recording non-failures – The situation arises where a failure is recorded although
none exists. This can occur in two ways. First, there is the habit of locating faults by
replacing suspect but not necessarily failed components. When the fault disappears
the first (wrongly removed) component is not replaced and is hence recorded as a
failure. Failure rate data are therefore artificially inflated and spares depleted.
Second, there is the interpretation of secondary failures as primary failures. A failed
component may cause stress conditions upon another, which may, as a result, fail.
Diagnosis may reveal both failures but not always which one occurred first. Again,
failure rates become wrongly inflated. More complex maintenance instructions and
the use of higher-grade personnel will help reduce these problems at a cost.

vi) Times to Failure – These are necessary in order to establish wear out. In most cases
fault data schemes yield the numbers of failures/defects of equipment. Establishing
the inventories, and the installation dates of items, is also necessary if the cumulative
times are to be determined. This is not always easy, as plant records are often
incomplete (or out of date) and the exact installation date of items sometimes has to
be guessed.

Although this failure rate information provides a valuable input to reliability
prediction and to optimum spares provisioning, it does not enable the wear out and
burn-in characteristics of an item to be described.

For this to happen it is essential that each item is separately identified (usually by a
tag number) and that each failure is attributed to a specific item.

Furthermore, if an item is removed, replaced or refurbished as new then this needs to
be identified (by tag number) in order for the correct start times to be identified for
each subsequent failure time.

Another complication is in the use of operating time rather than calendar time. In
some ways the latter is more convenient if the data are to be used generically. In
some cases however, especially where the failure mode is related to wear and the
operating time is short compared with calendar time, operating hours will be more
meaningful. In any case consistency is the rule. If this information is available then
it is possible to list:
• Individual times to failure (calendar or operating)
• Times for items which did not fail
• Times for items which were removed without failing

In summary the following are needed:


• Installed (or replaced/refurbished) dates and tag numbers
• Failure dates and tag numbers
• Failure modes (by physical failure mechanism)
• Running times/profiles unless calendar time is to be used.
vii) Causes of failures: Failures might be divided into those due to:
• Design
• Manufacture/production
• Raw materials or purchased items
• Damage in the store or in transit
• Faulty installation
• Faulty customer operation
Of the above six causes of failure, it is sometimes difficult to decide with any certainty
which is responsible for each failure. Guessing gives analysis of very little value.
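
The records listed in the summary under item vi) above (installation or refurbishment
dates, failure dates and tag numbers) are sufficient to derive individual times to failure
and the times for items that did not fail. The sketch below shows one way of doing this
using calendar time; the event layout, the function name and the dates are hypothetical.

```python
from datetime import date


def times_to_failure(events, study_end):
    """Split item histories into failure times and non-failure times (days).

    events: (tag_number, date, kind) with kind "install", "failure" or
    "removed". An "install" event (including refurbishment as new) restarts
    the clock for that tag. Items still running at study_end are reported
    with the non-failures.
    """
    start = {}
    failures, non_failures = [], []
    # On the same day, process a failure or removal before the re-installation.
    for tag, day, kind in sorted(events, key=lambda e: (e[1], e[2] == "install")):
        if kind == "install":
            start[tag] = day
        elif tag in start:
            elapsed = (day - start.pop(tag)).days
            (failures if kind == "failure" else non_failures).append((tag, elapsed))
    for tag, day in start.items():
        non_failures.append((tag, (study_end - day).days))
    return failures, non_failures


events = [
    ("P-101", date(2006, 1, 1), "install"),
    ("P-102", date(2006, 1, 1), "install"),
    ("P-101", date(2006, 4, 11), "failure"),
    ("P-101", date(2006, 4, 11), "install"),   # replaced on the same day
    ("P-102", date(2006, 7, 20), "removed"),   # removed without failing
]
failed, survived = times_to_failure(events, study_end=date(2006, 12, 31))
print(failed)    # [('P-101', 100)]
print(survived)  # [('P-102', 200), ('P-101', 264)]
```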

8.3 Best Practice and Recommendations


The following list summarises best practice together with recommended enhancements
for both manual and computer based field failure recording. Recorded field information is
frequently inadequate, and it is necessary to emphasize that failure data must contain
sufficient information to enable failures to be identified precisely and failure
distributions to be established. They must, therefore, include:
(a) Adequate information about symptoms and causes of failure. This is important
because predictions are only meaningful when a system level failure is precisely
defined. Thus component failures, which contribute to a defined system failure
can only be identified if the failure mode is accurately recorded. There needs to
be a distinction between failures (which cause loss of system function) and defects
(which may only cause degradation of function).
(b) Detailed and accurate equipment inventories enabling each component item to be
separately identified. This is essential in providing cumulative operating times for
the calculation of assumed constant failure rates and also for obtaining individual
calendar times (or operating times or cycles) to each mode of failure and for each
component item. These individual times to failure are necessary if failure
distributions are to be analysed.
(c) Identification of common cause failures by requiring the inspection of redundant
units to ascertain if failures have occurred in both (or all) units. In order to
achieve this it is necessary to be able to identify that two or more failures are
related to specific field items in a redundant configuration. It is therefore
important that each recorded failure also identifies which specific item (i.e. tag
number) it refers to.
(d) Intervals between common cause failures. Because common cause failures do not
necessarily occur at precisely the same instant it is desirable to be able to identify
the time elapsed between them.
(e) The effect that a “component part” level failure has at the system level.
This will vary according to the type of system, the level of redundancy (which
may postpone system level failure) etc.
(f) Costs of failure such as the penalty cost of system outage (e.g. loss of production)
and the cost of corrective repair effort and associated spares and other
maintenance costs.
(g) The consequences in the case of safety related failures (e.g. death, injury, and
environmental damage), which are not so easily quantified.
(h) Consideration of whether a failure is intrinsic to the item in question or was
caused by an external factor. External factors might include:
• Process operator error induced failure
• Maintenance error induced failure
• Failure caused by a diagnostic replacement attempt
• Modification induced failure.
(i) Effective data screening to identify and correct errors and to ensure consistency.
There is a cost issue here in that effective data screening requires significant
man-hours to study the field failure returns.
(j) Adequate information about the environment (e.g. weather in the case of
unprotected equipment) and operating conditions (e.g. unusual production
throughput loadings).
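
Item (b) above refers to cumulative operating times being used to calculate an assumed
constant failure rate. A minimal sketch of that calculation follows, with the rate taken as
the number of failures divided by the cumulative operating time of all items in the
inventory. The function name and the figures are hypothetical.

```python
def constant_failure_rate(n_failures, operating_hours_per_item):
    """Assumed constant failure rate and MTBF from cumulative operating time.

    operating_hours_per_item must cover every item in the inventory,
    including those that never failed.
    """
    cumulative_hours = sum(operating_hours_per_item)
    if cumulative_hours <= 0:
        raise ValueError("cumulative operating time must be positive")
    rate = n_failures / cumulative_hours
    mtbf = cumulative_hours / n_failures if n_failures else float("inf")
    return rate, mtbf


# Hypothetical inventory of four pumps with 5 recorded failures in total.
rate, mtbf = constant_failure_rate(5, [8_000, 7_500, 8_760, 5_740])
print(f"{rate:.2e} failures per hour, MTBF = {mtbf:.0f} hours")
```

Leaving items that never failed out of the inventory shortens the cumulative time and so
overstates the failure rate, which is why the inventory must be complete.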

8.4 Analysis and Presentation of Results


Once collected, data must be analysed and put to use or the system of collection will lose
credibility and, in any case, the cost will have been wasted. If the frequency of each defect
type is totalled and the types then ranked in descending order of frequency, it will usually
be seen that a high percentage of the defects are concentrated in only a few types. A still
more useful approach, if cost information is available, is to multiply each defect type
frequency by its cost and then to re-rank the categories in descending order of cost. Thus
the most expensive group of defects, rather than the most frequent, heads the list.
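
The two rankings described above can be sketched in a few lines. The defect names, counts
and unit costs are hypothetical and are chosen so that the two rankings differ.

```python
from collections import Counter


def rank_defects(defects, unit_cost):
    """Rank defect types by frequency and by total cost (frequency x unit cost)."""
    counts = Counter(defects)
    by_frequency = counts.most_common()
    by_cost = sorted(
        ((name, n * unit_cost[name]) for name, n in counts.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return by_frequency, by_cost


# Hypothetical field returns and cost per occurrence (Kshs.).
defects = ["seal leak"] * 6 + ["bearing failure"] * 3 + ["motor burnout"]
unit_cost = {"seal leak": 500, "bearing failure": 4_000, "motor burnout": 30_000}

by_frequency, by_cost = rank_defects(defects, unit_cost)
print(by_frequency)  # [('seal leak', 6), ('bearing failure', 3), ('motor burnout', 1)]
print(by_cost)       # [('motor burnout', 30000), ('bearing failure', 12000), ('seal leak', 3000)]
```

Here the least frequent defect heads the cost ranking, which is the situation the cost
weighted approach is meant to reveal.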

It is also useful to know whether the failure rate of a particular failure type is increasing,
decreasing or constant. This will influence the engineering response. A decreasing failure
rate indicates the need for further action in tests to eliminate the failures. Increasing failure
rate shows wear out requiring either a design solution or preventive replacement. Constant
failure rate suggests a reliability level, which is inherent to that design configuration.
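
One common way of checking for a trend, not named in the text, is the Laplace test. It
compares the average of the observed failure times with the midpoint of the observation
period: failures bunched late give a positive statistic (increasing rate), and failures
bunched early give a negative one (decreasing rate). The sketch assumes failures are
observed over a fixed period starting at time zero; the data and the 1.96 threshold
(roughly 5% significance, two-sided) are illustrative choices.

```python
import math


def laplace_statistic(failure_times, observation_end):
    """Laplace trend statistic for failures observed over (0, observation_end]."""
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    mean_time = sum(failure_times) / n
    return (mean_time - observation_end / 2) / (
        observation_end * math.sqrt(1 / (12 * n))
    )


def classify_trend(u, critical=1.96):
    """Interpret the statistic; 'constant' means no significant trend was found."""
    if u > critical:
        return "increasing"
    if u < -critical:
        return "decreasing"
    return "constant"


# Hypothetical failure times (hours) over a 1,000 hour observation period.
wearing_out = [300, 550, 700, 800, 870, 920, 950, 980, 995]
early_failures = [5, 20, 45, 80, 130, 200, 300, 450, 700]

print(classify_trend(laplace_statistic(wearing_out, 1_000)))     # increasing
print(classify_trend(laplace_statistic(early_failures, 1_000)))  # decreasing
```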

8.5 Sources of Reliability Information


(a) Sources:
(i) Tests:
• Demonstration and acceptance tests;
• Measurements of failure rate, MTBF, MTTR etc.
• Data from suppliers' tests
(ii) Customer experience:
• Market research;
• Customer’s maintenance records;
• Returns for repair or replacement;
• Customer complaints.
(iii) Other sources of information:
• Quality control department
• Published data and internet
• Consultants.

(b) Advantages and Disadvantages of Reliability Tests


Direct tests on parts, assemblies of parts or even occasionally on complete systems have the
great advantage that we can control the conditions of test and record exactly what they are. In
this sense they yield reliable data. Unfortunately tests tend to be both expensive and time
consuming. Further, although we may know the precise conditions under which a test was
done, we can never be sure that they exactly reproduce the treatment the products will receive
in service. Tests are roughly divided into two:
(i) Demonstration and acceptance tests – These are tests to decide whether a
particular item or items can be accepted.
(ii) Reliability measurements – Tests to estimate values of reliability are likely to be
more expensive than demonstration tests. To measure reliability, a test to estimate
one of the following parameters is carried out, then results converted to reliability
if required:
• The rate of degradation or deterioration
• The MTBF
• The mean time between wear out failures, and the variability about this mean
time.
• The parameters of the Weibull distribution.

Since tests tend to be both expensive and difficult, we should not start one without full
consideration of its probable cost, balanced against the value of the information we hope to
obtain from it.
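
As an illustration of estimating the parameters of the Weibull distribution from test
results, the sketch below uses median rank regression, one of several possible estimation
methods and not one prescribed by the text. It assumes every item on test was run to
failure (no censored times) and uses Bernard's approximation (i - 0.3)/(n + 0.4) for the
median ranks. The function name and the data are hypothetical.

```python
import math


def weibull_fit(times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Fits the straight line ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta)
    by least squares, with F taken from Bernard's median rank approximation.
    """
    t = sorted(times)
    n = len(t)
    if n < 2:
        raise ValueError("at least two failure times are required")
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    beta = slope
    eta = math.exp(-intercept / beta)
    return beta, eta


# Hypothetical times to failure (hours) from a life test.
beta, eta = weibull_fit([410, 620, 790, 950, 1_100, 1_260, 1_450, 1_700])
print(f"shape beta = {beta:.2f}, scale eta = {eta:.0f} hours")
```

A shape parameter below 1 indicates a decreasing failure rate, a value close to 1 a
constant failure rate, and a value above 1 an increasing (wear out) failure rate.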

(c) Customer Experience:


Information based on customers' experience has the advantage that it relates to actual use
conditions, but unfortunately we seldom know exactly what these are. A typical situation
arises when an item of equipment fails and is returned to the manufacturer, who examines it,
anxious to discover what went wrong. He then finds that, while there is no doubt that the
item concerned has failed, there is no evidence whatever to establish how or why it happened.
He is unable to reconstruct the circumstances just before failure. Probably the customer will
not be able to help either, because he was not paying attention to that item of equipment at
that moment. As far as he knows it was being operated normally.
