Maintenance is an age-old function, which developed and progressed, knowingly or unknowingly, along with the
operation of equipment. In early ages, maintenance probably had no separate identity; the job of maintenance
was considered part and parcel of the operator’s job. This was possible because of the simplicity and openness of machines
and equipment. In those days, nailing a cracked wooden frame of a chariot or tying a broken or cracking piece with
rope was done by the same person who operated it. Even the ancient carvings on the Egyptian tomb of
Ra-En-Ka (about 2600 to 1700 BC) show a sledge being used to transport a stone monument and a man
pouring some liquid, probably to lubricate the runners of the sledge for easy movement. Thus even the lubrication job,
which is basically a maintenance job, was done by the same personnel without giving the job a separate identity.
However, with the growth of industrialization, the complexity of machines increased and
machines became less simple and less open. This started creating problems for operating personnel, and the
concept of maintenance as a separate discipline with a separate identity emerged. Maintenance Prevention (MP)
probably first became known in the 1960s through a factory magazine in the United States; it meant the
design, manufacture or purchase of equipment that is free from maintenance.
The growth and speed of inflation in the 1970s gave rise to a new awareness of the rising costs of
downtime or loss of use. Asset and equipment/component replacement costs also became so inflated
that well-managed maintenance programs, which extend the life of existing equipment and components,
became an essential aspect of all management strategies.
In today’s industry, Plant Engineering is probably the biggest force for increasing productivity (other
than a motivated workforce), and Maintenance Engineering / Maintenance Management is the most important
component of Plant Engineering. Maintenance is an investment that buys more production time. With the
increasing complexity, sophistication and automation of equipment, a very serious burden now falls on maintenance
engineers regarding the quality and quantity of maintenance, maintenance aids and their documentation etc. Problems
do not increase in linear proportion to the increase in production but in astronomical proportions. A rough analogy is
the problems that arise in a family of husband and wife when another woman or man comes in.
Also, we cannot cope with today’s jobs and problems using yesterday’s tools and techniques. We have to
use the tools and techniques of today, compatible with today’s problems and also with the anticipated problems of
tomorrow (the near future). Educational and research institutions should cooperate with industries to ease these
problems. If all this is not taken care of, maintenance personnel will be so busy fire-fighting breakdowns that they will
hardly have any time to prevent or eliminate them.
A maintenance system of today includes many aspects – some of which are given below:
1. Protecting the buildings, structures and plants.
2. Increasing equipment availability and reducing downtime; also helping to increase utilization of
equipment.
3. Controlling and directing labour forces.
4. Economy in maintenance department.
5. Maximizing utilization of available resources.
6. Ensuring safety of installations and reducing environmental pollution.
7. Recording expenditure and costing of individual jobs and of department / section.
8. Preventing waste of tools, spares and other materials.
9. Emphasis on waste recovery.
10. Preparing maintenance budgets.
11. Improving technical communications.
12. Measuring plant performance as a guide for future actions.
13. Training of maintenance personnel on related topics.
MAINTENANCE OBJECTIVES
From the various aspects of maintenance mentioned in the preceding paragraph, the maintenance objectives of a big
industrial undertaking, such as a steel plant, can be enumerated as below:
• To maintain plant and equipment at maximum operating efficiency, ensuring operational safety and
reducing downtime.
• To safeguard investments by minimizing the rate of deterioration and achieving this at optimum cost through
budgeting and control.
• To help management in taking decisions on replacements or new investments and to participate actively in
specification preparation, equipment selection, erection and commissioning.
• Development of resources for equipment and spares, and providing technical help for vendor / equipment
supplier selection and rating and for import substitution.
• Help in implementation of suitable procedures for procurement, storage and consumption of spares, tools and
consumables etc.
• Standardization of spares and consumables in conformity with plant, national and international standards, and
help in adoption of these standards by all users in the plant. Also help in variety reduction and inventory
control.
• Running of centralized services like Steam Generation, water supply, air supply and fuel supply etc.
• Running of captive workshops for repair and reconditioning and also making some new spares.
• Help in training and development of skilled workmen and executives.
In order to streamline the understanding of different types/systems of maintenance functions, the classification can be
done on the basis of planning and the criticality/essentiality of jobs. Some jobs may be planned in advance, but some jobs
may have to be taken up immediately and un-planned. Planned and un-planned jobs can be classified further
depending on the nature of the job and its essentiality. The detailed classification is shown below.
Maintenance system has two types:
1. Un-planned
2. Planned
Un-planned maintenance has three types:
1. Corrective Maintenance
2. Opportunistic Maintenance
3. Emergency or Break down maintenance
Planned maintenance has four types:
1. Routine Maintenance
2. Preventive Maintenance
3. Predictive Maintenance
4. Design-out Maintenance
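The classification listed above can also be written down as a small lookup structure. The following is only an illustrative sketch in Python; the names `MAINTENANCE_SYSTEM` and `category_of` are hypothetical and are not part of any standard.

```python
# Classification of the maintenance system, transcribed from the lists above.
MAINTENANCE_SYSTEM = {
    "Un-planned": [
        "Corrective Maintenance",
        "Opportunistic Maintenance",
        "Emergency or Breakdown Maintenance",
    ],
    "Planned": [
        "Routine Maintenance",
        "Preventive Maintenance",
        "Predictive Maintenance",
        "Design-out Maintenance",
    ],
}


def category_of(maintenance_type: str) -> str:
    """Return 'Planned' or 'Un-planned' for a maintenance type name."""
    for category, types in MAINTENANCE_SYSTEM.items():
        if maintenance_type in types:
            return category
    raise KeyError(maintenance_type)
```

For example, `category_of("Routine Maintenance")` returns `"Planned"`.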
Breakdown Maintenance
In a breakdown maintenance system, repair is undertaken only after the failure of the equipment. The equipment is
allowed to run undisturbed till it fails. Of course, lubrication and minor adjustments (pressure, flow etc.) are done
during this period. Only when the equipment fails to perform its designed functions or comes to a grinding halt is any other
maintenance/repair job taken up.
At face value, this system appears to be simple and less expensive, but it is not really so. It may work well
in a small factory/plant where:
• The number of equipment items is small.
• Equipment is very simple and repair does not call for specialist tools/tackles.
• Sudden stoppage/failure of equipment will not cause severe financial loss in terms of delivery
commitments or further damage to other equipment/components.
• Sudden failure will not cause severe safety or environmental hazards.
In such small factories, there is generally no specialized maintenance crew. Maintenance is normally done by
the persons operating the machine and other connected persons. Maintenance is generally done to put the broken-down
machine back into operation, but not much is done to prevent recurrence of such breakdowns. Spares are generally kept with
the persons operating the machine or their superiors.
CORRECTIVE MAINTENANCE
Corrective maintenance, as the name implies, means maintenance actions for correcting or restoring a failed unit (or units
about to fail). Its scope is very vast and may include different types of actions, from small actions like typical
adjustments and minor repairs to re-design of equipment. It includes both planned and un-planned actions and is
governed by failure of the items as well as the condition of the items.
Actions in corrective maintenance can be sub-divided, according to priority, as follows:
1. Emergency work: high priority, generally off-line, i.e. after stopping the equipment; normally less than 24
hours’ notice is given for taking up the job.
2. Deferred work: jobs of lower priority, generally off-line.
3. Jobs to eliminate/reduce repetitive breakdowns.
4. Reconditioning or re-design jobs.
Corrective maintenance is generally a one-time task, i.e. once taken up, it is completed fully. Each corrective
maintenance job may differ from the others. Some corrective maintenance jobs may call for collection of
extensive data/information about
breakdowns and their causes etc. and proper analysis of those data before coming to a conclusion about the actual
jobs to be done. Techniques like cause and effect analysis (fish-bone diagram/Pareto diagram) etc. help
in these cases. Some jobs may have the following stages:
• Collection of data/information and analysis.
• Identify likely causes.
• Find out the best possible solutions to eliminate likely causes.
• Implement those solutions, etc.
OPPORTUNISTIC MAINTENANCE
In a multi-component system with several failing components, it is often advantageous to follow opportunistic
maintenance as well. When an equipment or system is taken down for maintenance/changing of one or a few worn-out
components, the opportunity can be utilized for maintaining/changing other wearing-out components even though they
have not failed. This would probably be more economical in the long run than taking a shutdown each time another component
fails. Normally the cost of replacing several parts jointly is much less than the sum of the costs of several separate
replacements. However, the cost of the left-over life of the components that are going to be changed has to be taken into
consideration in such calculations.
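The cost comparison described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function names and all figures are hypothetical, and the shutdown cost is assumed to be the same for every stoppage.

```python
def separate_cost(part_costs, shutdown_cost):
    """Total cost when each part is replaced in its own shutdown."""
    return sum(cost + shutdown_cost for cost in part_costs)


def joint_cost(part_costs, shutdown_cost, residual_values):
    """Total cost when all parts are replaced in one shutdown.

    The value of the left-over life of parts changed early is
    charged as an extra cost, as the text requires.
    """
    return sum(part_costs) + shutdown_cost + sum(residual_values)


# Hypothetical figures, in any consistent currency unit.
parts = [400.0, 250.0, 150.0]    # replacement cost of each component
shutdown = 1000.0                # cost of one shutdown (lost output, labour)
residual = [0.0, 100.0, 60.0]    # value of remaining life thrown away

separate = separate_cost(parts, shutdown)   # three shutdowns
joint = joint_cost(parts, shutdown, residual)
saving = separate - joint
```

With these assumed figures the joint replacement is cheaper; if the residual values of the parts changed early were large enough, the comparison would reverse.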
Opportunistic maintenance is very beneficial for non-monitored components. For non-monitored components which are
inaccessible for inspection without replacement, a replacement policy can be considered. For non-monitored components
which can be inspected before replacement, an inspection policy can be considered. As a commonly used example, in an
automobile engine, if one valve gives trouble and needs grinding, all the other valves are also ground in the same
shutdown.
ROUTINE MAINTENANCE
Routine maintenance is the simplest form of planned maintenance but is very essential. As the name implies, routine
maintenance means carrying out minor maintenance jobs at regular intervals. It involves minor jobs such as cleaning,
lubrication, inspection, minor adjustment of pressure, flow, tightness etc. and tightening of loose parts etc. It also
includes inspection of bearings, V-belts, couplings, jointings, foundation bolts etc. Small and critical defects
observed during inspection are rectified immediately, and bigger jobs are planned for rectification during the next available
shutdown. Such maintenance is essential for effective scheduled and preventive maintenance.
Routine maintenance is not need-based. In an equipment, some motors may be running four
hours a day and some motors may be running twenty hours a day but, in routine maintenance, all are
inspected at the same frequency. This may lead to some amount of over-maintenance on some equipment
or components, but this system pays handsomely in the long run. “Regularity”, i.e. carrying out planned jobs
regularly in simple cyclic schedules, is very essential in routine maintenance. Such schedules are simple
(like check, clean, lubricate, tighten, adjust etc.) and repetitive. Routine maintenance may also include a
small portion of preventive maintenance.
PREVENTIVE MAINTENANCE
This is one of the oldest maintenance systems being practiced in industries. It is easy to understand and is still being
used extensively. Today corrective maintenance and Condition-Based Maintenance [diagnostic maintenance] etc. are
also added to this concept to some extent. Preventive maintenance [PM] is the planned maintenance of plants and
equipments [including and resulting from periodic inspections] in order to prevent or minimize breakdowns and
depreciation rates. As it covers vast areas, occasionally some people get misled about its coverage.
In general, the various components of PM are as follows:
1. Check drawings, design and installation of equipment, including subsequent redesign and minor modifications
depending on the specific nature of problems.
2. Proper identification of all items, proper documentation and conditions:
• History cards/records
• Spares catalogues, equipment catalogues and inventory list
• Job manuals etc.
• Maintenance work orders etc.
3. Periodic inspection of plants and equipment:
• Use of checklists by inspectors and its frequency: daily, shift-wise, weekly, monthly etc.
• Well-qualified and experienced inspectors.
• Use of necessary aids: test equipment, vibration meters, ultrasonic and X-ray equipment etc.
• Preparing total defect lists and their categorization.
4. Repetitive servicing, repairs, upkeep and overhauls:
• Minor repairs
• Medium repairs, roughly around 50% of the jobs of major overhauls.
• Major overhauls or capital repairs.
• Emergency repairs or corrective repairs.
• Recovery or salvaging, when equipment has undergone several major repairs.
TEROTECHNOLOGY
The life and reliability of equipment can be improved by proper application of tribology and terotechnology. Tribology
is the science and technology of friction, wear and tear, and their control.
Industrial studies suggest that the cost of replacing worn-out components may be as high as 70% of total component cost.
In any industry it is highly desirable that equipment remains available for operational use. This is the case for
design-out maintenance, which is the main theme of terotechnology.
The application of this concept requires the following features to be included in equipment design:
1. The design of equipment should be such that it requires no or minimum maintenance work.
This ensures better performance and productivity.
2. Provision should be made to send periodic feedback on the performance of the system to the
designer, to enable modifications that achieve the desired performance.
3. All concerned persons should be properly trained in the working and maintenance aspects of the
equipment.
4. Important areas of terotechnology are condition monitoring, protective devices and the application
of advanced maintenance engineering.
APPLICATION
[Figure: terotechnology feedback loop; recoverable labels: Solution, Manufacturing, Implementation, Operation, Feedback, R&D]
The nature of the maintenance activity was determined by the manner in which plant and equipment was
designed, selected, installed, commissioned, operated, removed and replaced. Major benefits could come to British
industry from the adoption of a broadly based technology which embraces all these areas, and because no suitable
word existed to describe such a multidisciplinary concept, the name ‘terotechnology’ (based on the Greek word
‘tero’, to guard or look after) was adopted.
In 1975 the committee for terotechnology defined terotechnology as follows: -
‘A combination of management, financial, engineering and other practices applied to physical assets in pursuit of
economic life cycle costs’
The following was then added:
‘Its practice is concerned with the specification and design for reliability and maintainability of plant, machinery,
equipment, buildings and structures, with their installation and replacement, and with the feedback of information on
design, performance and costs’.
It can be seen that the concept of terotechnology had drifted from a point where maintenance and unavailability costs
were of central importance to a very general and less tangible subject area, the relevance and applicability of which is,
as yet, far from gaining full acceptance by industry.
The author is inclined to the view that the definition attempts to encompass too diverse a range of ‘physical assets’
(e.g. from school buildings to steel processing plant) across which the main cost factors may differ by many orders of
magnitude. This book will therefore be mainly concerned with industrial plant where the main maintenance-related
costs are those of resources, unavailability and, in life cycle terms, useful life. The author has arrived at a clear
preference for understanding this important subject area in terms of the ‘optimization of total maintenance costs over
the equipment life cycle’. It is implicit in this definition that certain categories of unavailability can be classified as
indirect maintenance costs, and the costs of maintenance resources as direct maintenance costs.
ECONOMY OF MAINTENANCE
Maintenance economy is the approved allocation of funds within which maintenance engineers and in-charges can
operate with a reasonable amount of freedom in their activities/jobs.
Economy of maintenance is essential and also very advantageous for controlling maintenance costs. Maintenance is a
service organization and probably no better way is available to control its costs. Other benefits of such economy are that:
• It improves the system effectiveness and efficiency of the maintenance organization. Expenditure in excess of the
economy is generally the result of inadequate, wrong or untimely planning.
• Maintenance personnel know their economy in advance, so they plan their expenditure judiciously and in good time
so that no job is held up and there is never a shortage of funds.
• It is a very effective technique for projecting future and additional requirements of funds.
Generally there are no disadvantages to an economy for maintenance expenditure if it has been prepared properly and
with foresight. However, disadvantages and problems arise if the economy is not prepared properly and with foresight, or
if it is not flexible; in that case, a maintenance in-charge has to run from pillar to post to carry out some
urgent job, and delay is also caused.
TRAINING IN MAINTENANCE
The application factor is the product of the morale, skill and management factors and is always less than one. The job of
training and human resource development is to raise this application factor as near to one as possible, so that the
effectiveness of maintenance manpower improves towards its potential and they are able to give more and better
output.
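The product relation stated above can be illustrated with a short worked example. This is a sketch only: the text gives no numerical scale, so the assumption that each factor is a fraction between 0 and 1, and the sample values themselves, are hypothetical.

```python
def application_factor(morale: float, skill: float, management: float) -> float:
    """Product of the morale, skill and management factors.

    Assumption (not stated in the text): each factor lies in (0, 1].
    """
    for name, value in (("morale", morale), ("skill", skill),
                        ("management", management)):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{name} factor must be in (0, 1], got {value}")
    return morale * skill * management


# Hypothetical values: training raises the skill factor from 0.70 to 0.85.
before = application_factor(morale=0.8, skill=0.70, management=0.9)
after = application_factor(morale=0.8, skill=0.85, management=0.9)
```

Because the factors multiply, a weakness in any one of them limits the overall result, which is why training aimed at skill alone cannot bring the application factor close to one.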
OBJECTIVES OF MAINTENANCE
Training and development of maintenance personnel is primarily planned and designed to achieve the following
objectives:
1. Knowledge Objectives: These objectives refer to knowledge acquired during training regarding maintenance
policies and systems, preventive, predictive and corrective maintenance, condition monitoring and diagnosis,
fault and defect analysis, planning, scheduling and record keeping etc.
2. Attitudinal Objectives: It is commonly believed that attitudes influence behaviour, which affects the output
and effectiveness of maintenance personnel.
3. Skill Objectives: Skill generally refers to the ability and expertise of a person to perform the job. One of the
important objectives of training is to improve the skill of maintenance personnel in their own trade and to impart
the skills of other necessary trades, especially in the case of multi-trade concepts.
4. Job Behaviour Objectives: Here the attention is focused on the extent to which the knowledge, skills and abilities
acquired during training can be generalized or transferred to maintenance problem-solving efforts.
5. Advanced Technological Objectives: These objectives refer to short-term training programs aimed at acquainting
maintenance engineers and supervisors with modern developments taking place in the field of repair and
maintenance technologies.
Before designing any training and management development program, following principles should be considered
which would increase the productivity in maintenance personnel, which, in turn, will enhance the organizational
productivity.
Categories of Training
In an industry, training may be of different types and categories and on different subjects. The various training and
development programs for maintenance personnel can be roughly grouped into the following categories:
Modes of Training
The mode of training is selected mainly on the type and criticality of training and the number and profile of participants. Generally
a combination of modes is used. A few are mentioned below:
Maintenance Types/ Systems
In earlier days very few terms were used in maintenance management, such as repair, overhauling, P.M.
etc. With the involvement of management experts in maintenance, and with attempts to differentiate between
various maintenance jobs, several new terms were invented and used, such as planned maintenance, scheduled
maintenance, routine maintenance, periodic maintenance, breakdown maintenance, corrective maintenance, predictive
maintenance, opportunity maintenance, need-based maintenance, optimum maintenance, fixed-time maintenance,
condition-based maintenance and reliability-centered maintenance etc. However, with so many terms now available,
there are more chances of confusion in the minds of maintenance personnel.
In order to streamline the understanding of different types/systems of maintenance functions, the
classification can be done on the basis of planning and the criticality/essentiality of jobs. Some jobs may be planned in
advance, but some jobs may have to be taken up immediately and un-planned. Planned and un-planned jobs can also be
classified further depending on the nature of the job and its essentiality. The detailed classification is shown below:
[Figure: classification of the maintenance system; recoverable labels: Emergency or Breakdown Maintenance, Opportunistic Maintenance, Corrective Maintenance, Routine Maintenance]
Corrective Maintenance:
Corrective Maintenance, as the name implies, means maintenance actions for
correcting or restoring a failed unit (or the units going to fail). Its scope is very vast and may
include different types of actions from small actions like typical adjustments and minor
repairs to re-design of equipment. It includes both planned and unplanned (or scheduled and
unscheduled) actions and is governed by failure of the items as well as condition of the
items.
Actions in corrective maintenance can be sub-divided, according to priority, as follows:
(i) Emergency work: high priority, generally off-line, i.e. after stopping the equipment;
normally less than 24 hours’ notice is given for taking up the job.
(ii) Deferred work: jobs of lower priority, generally off-line.
(iii) Jobs to eliminate/reduce repetitive breakdowns.
(iv) Reconditioning or re-design jobs (both major and minor).
Corrective maintenance is generally a one-time task, i.e. once taken up, it is
completed fully. Each corrective maintenance job may differ from the others. Some
corrective maintenance jobs may call for collection of extensive data/information about
the breakdowns and their causes etc. and proper analysis of those data before coming to a
conclusion about the actual jobs to be done. Techniques like cause and effect analysis
(fish-bone diagram/Pareto diagram) etc. help in these cases. Some jobs may call for
research & development (R&D) activities. Thus, such corrective maintenance jobs may
have the following stages:
• Collection of data/information and analysis.
• Identify likely causes.
• Find out the best possible solutions to eliminate likely causes.
• Implement those solutions etc.
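The data collection and analysis stage can be supported by a simple Pareto ranking of breakdown causes. The sketch below assumes the breakdown records are available as a plain list of cause labels; the function name `pareto_table` and the sample log are hypothetical.

```python
from collections import Counter


def pareto_table(causes):
    """Rank causes by frequency and add the cumulative percentage.

    Returns a list of (cause, count, cumulative_percent) tuples,
    most frequent cause first.
    """
    counts = Counter(causes)
    total = sum(counts.values())
    table = []
    cumulative = 0
    for cause, count in counts.most_common():
        cumulative += count
        table.append((cause, count, round(100.0 * cumulative / total, 1)))
    return table


# Hypothetical breakdown log: one entry per recorded breakdown.
breakdown_log = (["lubrication failure"] * 6
                 + ["misalignment"] * 3
                 + ["overload"] * 1)

for cause, count, cum_pct in pareto_table(breakdown_log):
    print(f"{cause:<22} {count:>3} {cum_pct:>6}%")
```

The causes at the top of the table account for most breakdowns and are the natural first candidates for the "identify likely causes" and "find the best possible solutions" stages.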
Some of the differences between preventive maintenance (P.M.) and corrective
maintenance may be as follows:-
(i) P.M. jobs are generally taken before the equipment has stopped working whereas
corrective maintenance may be done before or after the equipment has stopped
working.
(ii) Level and type of P.M. jobs are generally decided within the maintenance
department whereas in corrective maintenance help of other departments may be taken.
(iii) P.M. jobs are planned well in advance whereas corrective maintenance jobs may be taken up
at shorter notice.
Corrective maintenance jobs may also include some of the “Design-out
maintenance” jobs.
Mr. V.Z. Priel of U.K., in his book Systematic Maintenance Organization,
explains that “the emphasis in corrective maintenance is on obtaining full information of all
the breakdowns and their causes. Efforts are made to identify and eliminate the cause by
activities such as improving maintenance practices, changing the frequency of
maintenance services, improving process control practices, modifying equipment or
components of equipment etc.”
Often, in an equipment complex that is taken down every year for statutory annual overhaul and
inspection (like boilers etc.), if any component fails a month or two earlier than the scheduled start of the next
shutdown, and if that repair is going to take some time, the next annual overhaul and inspection is brought forward to start
immediately and the total job is taken up together. This can be a case of opportunistic maintenance.
Opportunistic maintenance is actually not a specific maintenance system but a system of utilizing an
opportunity which may come up at any time. To carry out the actual jobs, we use the different maintenance systems.
Routine Maintenance:
Routine maintenance is the simplest form of planned maintenance but is very essential. As the name
implies, routine maintenance means carrying out minor maintenance jobs at regular intervals. It involves minor
jobs such as cleaning, lubrication, inspection, minor adjustment of pressure, flow, tightness etc. and tightening of
loose parts etc. It also includes inspection of bearings, V-belts, couplings, jointings, foundation bolts, earthings and
protective covers etc. Small and critical defects observed during such inspection are rectified, and bigger jobs are
planned for rectification during the next available shutdown. Such maintenance is essential for effective scheduled and
preventive maintenance.
Routine maintenance is not need-based. In an equipment, some motors may be running four hours a day and
some motors may be running twenty hours a day but, in routine maintenance, all are inspected at the same frequency.
This may lead to some amount of over-maintenance on some equipment or components, but this system pays
handsomely in the long run. “Regularity”, i.e. carrying out planned jobs regularly in simple cyclic schedules, is very
essential in routine maintenance. Such schedules are simple (like check, clean, lubricate, tighten, adjust etc.) and
repetitive. Routine maintenance may also include a small portion of preventive maintenance.
The frequency of routine maintenance is generally once every shift (at the start of the shift) or once every day. Of
course, in sophisticated and automatic equipment, or in equipment having enough condition monitoring
gadgets to indicate failures, the period of routine maintenance may change. Again, depending on the extent of jobs and
time available, either the same jobs may be planned for every day or one group of jobs may be planned for Monday,
another group of jobs for Tuesday and so on.
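A weekday-wise cyclic schedule of the kind just described can be sketched as a simple lookup. The job names and their grouping below are hypothetical; the text only says that one group of jobs may be assigned to each day.

```python
from datetime import date

# Jobs repeated every day (hypothetical selection).
DAILY_JOBS = ["clean", "lubricate", "inspect for leaks"]

# One extra group of jobs per weekday; 0 = Monday (hypothetical grouping).
JOB_GROUPS = {
    0: ["check and tighten foundation bolts"],
    1: ["inspect V-belts and couplings"],
    2: ["inspect bearings"],
    3: ["check earthings and protective covers"],
    4: ["check pressure and flow settings"],
}


def jobs_for(day: date) -> list:
    """Return the routine maintenance jobs planned for the given date."""
    return DAILY_JOBS + JOB_GROUPS.get(day.weekday(), [])


print(jobs_for(date(2024, 1, 1)))   # 1 January 2024 was a Monday
```

Days without an entry in `JOB_GROUPS` (here Saturday and Sunday) fall back to the daily jobs only.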
Routine maintenance needs very little investment in time and money. The duration of routine maintenance in
a day is generally so small that it does not appreciably affect the output from the machine. As the jobs are not big and
do not need many spares and materials, the cost of doing routine maintenance is also very small. However, the cost of not
doing routine maintenance may be very high, as a small defect which could have been rectified during routine
maintenance with little time and effort may lead to a major problem and crisis, causing severe production disruption
and needing a lot of money and resources for rectification.
One example of routine maintenance, in a suburban electric railway system, is that whenever a train
stops at a few of the bigger stations, a group of maintenance people immediately start checking brakes etc. The whole job is
over in 10 to 12 minutes, by the time the train is due to start its onward journey. In industries, during shift change
periods a small group of maintenance personnel carry out the necessary inspection, lubrication, adjustment and tightening
etc. for about 15 minutes, by which time operating personnel are ready to start the equipment. A rough flow diagram of
routine maintenance is shown in the figure. Similar flow diagrams can be made for other maintenance systems.
Preventive Maintenance:
This is one of the oldest maintenance systems being practiced in industries. It is easy to understand
and is still being used extensively. Today corrective maintenance and condition-based maintenance (diagnostic
maintenance) etc. are also added to this concept to some extent. Preventive maintenance (PM) is the planned
maintenance of plants and equipment (including and resulting from periodic inspections) in order to prevent or
minimise breakdowns and depreciation rates. As it covers vast areas, some people occasionally get misled about its
coverage. Some people think PM is just routine inspection, cleaning, lubrication, adjustment and minor
repairs/jobs on equipment. Others think that PM means internal cleaning of equipment and components,
lubrication and oil changing, and replacement of consumables like gaskets, belts, seals, bearings etc. Yet others
think that PM includes only major jobs like overhauling and reconditioning etc. Actually PM includes all three
types of activities mentioned here. After PM repairs, the equipment’s health is restored nearly to its
original condition. However, it does not include much improvement and upgradation work.
Flow Diagram for Routine Maintenance
In general, the various components of PM are as follows:-
(i) Check drawings, design and installation of equipments including subsequent re-design and minor
modifications depending on specific nature of problems.
(ii) Proper identification of all items, proper documentations and conditions:
• History Cards/Records
• Spares catalogues, equipment catalogue and inventory list.
• Job manuals etc.
• Maintenance work orders etc.
(iii) Periodic inspection of plants and equipments:
• Use of checklists by inspectors and its frequency: daily, shift-wise, weekly, monthly etc.
• Well qualified and experienced inspectors.
• Use of necessary aids – Test equipments, vibration meters, ultrasonic and X-ray equipments etc.
• Preparing total defect lists and their categorizations.
(iv) Repetitive servicing, repairs, upkeep and overhauls:
• Minor repairs
• Medium Repairs – roughly around 50% of jobs of major overhauls.
• Major overhauls or capital repairs.
• Emergency repairs or corrective repairs.
• Recovery or salvaging, when equipment has undergone several major repairs.
(v) Adequate lubrication, cleaning and painting of equipment. Changing of oils and lubricants of systems as
per inspection report.
(vi) Typical failures analysis and planning for their elimination.
(vii) Organization for P.M.
(viii) Budgetary Control of Repairs and P.M.
The crew carrying out P.M. jobs should normally be separate from the crew attending breakdowns. If the two types of
jobs are given to the same crew, P.M. jobs often get neglected or get less importance and less supervision, as not doing
P.M. jobs is not immediately reflected in downtime. But, in the long run, this practice would prove disastrous as
breakdowns would occur more frequently.
The essence of P.M. is proper planning of all activities beforehand so that the following common delays are
avoided:
• Waiting for job orders at the start of the shift, and also after finishing one job.
• Visiting the site to find out what to do and how to do it.
• Unnecessary trips to stores because complete lists of tools/spares are not available at the start of the shift and tools are
brought from stores as and when needed.
• Operating personnel not being clearly aware of the time for sparing the machine and doing their preparatory jobs (opening of
dies/jaws etc.) before handing over the machine to maintenance for P.M.
• Losing time because of lack of a safety permit etc.
Frequency of P.M.
P.M. jobs are generally cyclic in nature, but the interval between two P.M.
schedules for the same job, i.e. the frequency of P.M., is not the same throughout the life cycle of the equipment. As discussed
earlier, the failure rate follows the bathtub curve. As indicated in the figure, failure rates are high during the initial stage, i.e.
just after commissioning (also called the de-bugging phase or wear-in phase); failure rates are lower during the normal working
period (also known as the chance-failure phase); and failure rates are again high towards the end of the life cycle or working
cycle, i.e. just before the equipment is discarded or taken down for major revamping (also known as the wear-out
phase). The P.M. frequency in each of these three phases is decided accordingly.
Again, during the chance-failure phase, the inspection, cleaning, lubrication and minor repair components etc.
may have a pre-determined fixed frequency and interval, but the major overhauls and capital repair components etc. may
sometimes follow slightly differing frequencies and intervals. In the figure, the periods x1, x2 and x3 indicate the actual
operating periods during which the equipment condition deteriorates, and periods y1, y2 and y3 are the periods of capital
repairs/overhauls during which the deteriorated equipment is restored to near its original condition. In actual
practice, because of differences in local conditions at the time, the periods x1, x2 and x3 may or may not be equal,
and similarly periods y1, y2 and y3 may or may not be equal.
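The dependence of P.M. frequency on the life-cycle phase can be sketched in a few lines of Python. The phase boundaries, the base interval and the multipliers below are purely illustrative assumptions and are not taken from the text; in practice they would come from the plant's own failure records.

```python
def life_phase(age_hours, wear_in_end=500.0, wear_out_start=20000.0):
    """Classify equipment age into a bathtub-curve phase.

    The two boundary values are hypothetical placeholders.
    """
    if age_hours < wear_in_end:
        return "wear-in"
    if age_hours < wear_out_start:
        return "chance-failure"
    return "wear-out"


def pm_interval_hours(age_hours, base_interval=720.0):
    """Return a P.M. interval that is shorter where the failure rate is high."""
    # Illustrative multipliers only: P.M. is done twice as often in the
    # wear-in and wear-out phases as in the chance-failure phase.
    factor = {"wear-in": 0.5, "chance-failure": 1.0, "wear-out": 0.5}
    return base_interval * factor[life_phase(age_hours)]


for age in (100, 5000, 25000):
    print(age, life_phase(age), pm_interval_hours(age))
```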
Fig. Frequency of P.M.
CONDITION MONITORING
Condition monitoring is the process of monitoring a parameter of condition in machinery, such that a significant
change is indicative of a developing failure. It is a major component of predictive maintenance. The use of condition
monitoring allows maintenance to be scheduled, or other actions to be taken, to avoid the consequences of failure
before the failure occurs. It is typically much more cost effective than allowing the machinery to fail. Serviceable
machinery includes rotating machines and stationary plant such as boilers and heat exchangers.
OBJECTIVES OF CONDITION BASED MONITORING
In a broad sense, maintenance is “to keep fit any system for use”. It may be defined as the overall combination of all
those activities which are required to keep an item in its as-built condition so that it continues to have its original
productive capacity.
The following are the main objectives of condition based monitoring
WHAT TO MONITOR
A general impression is that once planning, design, manufacture, erection and commissioning are
done successfully and the plant/equipment goes into regular operation, the rest is a matter of planned periodic
maintenance schedules for upkeep of the plant/equipment. Condition monitoring is nevertheless essential: by measuring
emanated signals at the correct locations and in the correct manner, it is possible to judge the physical condition of plants
and equipments so that timely action can be initiated. All m/c emanate or indicate both “Primary signals” and “Secondary
signals”. Primary signals are generally those signals or parameters which are required to assess the performance of the
equipment and which the equipment is designed to emanate, such as oscillation in vibratory screens. All other signals, such as
vibration, sound, and chemical and physical changes, are secondary signals. They are generally the result or a form of loss of
output, so monitoring them becomes essential for equipment health monitoring and technical diagnostics.
Hence we need to monitor all the signals, i.e. Primary as well as Secondary signals.
WHEN TO MONITOR
Condition monitoring is almost essential: by measuring emanated signals at the correct location and in the correct
manner, it is possible to judge the physical condition of plants and equipments so that timely action can be initiated.
The question then arises of when to monitor the system. Monitoring is essential or required in several situations,
which are listed below:
• Undue noise from a machine or its parts.
• Unpleasant smell from any machine.
• Leakage from any part of the machine.
• One or more parts of a machine undergoing regular wear and tear.
• Failure of parts.
PRINCIPLES OF CONDITION BASED MAINTENANCE SYSTEMS (CBMS)
1. Listing and Codification of all machines/equipments: for proper identification and location.
2. Selecting Critical Machines and Systems: Machine and system may be classified as very critical, critical, less
critical and least critical. Criteria for very critical and critical machines and systems may be:
• If the machine/system is very important for the production.
• If the machine/system is very costly to repair.
• If the failure of the machine/system could result in injury or loss of life (health hazard) or serious
damage to environment
• Major machines not having any stand-by units.
CBMS is very beneficial for critical and very critical machines/systems, and continuous condition
monitoring programs are generally used for these machines. Sometimes expensive on-line continuous monitoring is
preferred for very critical machines and systems. On less critical machines/systems, periodic monitoring is
enough. For least critical machines, fixed time maintenance (P.M.) is preferred, or they may be allowed to run
to failure. If the machines are classified as Vital, Essential and Desirable (VED analysis), continuous
monitoring, periodic monitoring and fixed time maintenance may be preferred respectively.
3. Identifying Components/Items: Very critical, critical and less critical machines/systems are sub-divided into
process components, mechanical components and control components etc. From these, individual components
(items) are identified for monitoring such as roller bearing, seals, oils/lubricants etc.
4. Fixing Condition Parameters: These identified components (items) have to be transformed into condition
parameters which are actually measured or monitored such as temperature, pressure, flow, vibration, noise, strain,
level, magnetic flux, electrical insulation etc. For specific component, most appropriate condition parameters are
fixed.
5. Monitoring Techniques: For each condition parameter, relevant measuring and monitoring techniques are
selected and, for those, suitable equipments/instruments/implements are identified and obtained. Different
techniques have their own merits and demerits. Not all techniques can be used, and often not all are required, for all
machines. Selection of measuring techniques and instruments is primarily based on operating conditions, past
experience, the fluid handled and the defects likely to occur while in operation.
After deciding the techniques and instruments, the mode of monitoring is ascertained such as displacement mode,
velocity mode or acceleration mode for vibration monitoring. Then comes fixing up of monitoring points or
sampling points. These points should be so located that they give truly representative signals/symptoms and
conditions. Also, the same points are generally used for successive monitoring and trend monitoring. For example, in
vibration monitoring, points should be so selected that we get radial vibration (in two planes) and axial vibration
for each bearing. In lubricant monitoring, for wear-debris analysis, the sampling point should be so located that it
gives a representative value of the debris generated. It should not be at the extreme bottom of the tank, as debris normally
accumulates there, and it should generally not be at the topmost level, as debris normally does not reach the
surface.
6. Monitoring Schedule and Frequency: Having finalized the monitoring techniques, instruments and points, the
next is to decide about the frequency of monitoring (daily, weekly, monthly, continuous etc.). Some monitoring or
inspection can be done only “online” i.e., when the machines/systems are running and some can be done only
“offline”. Condition parameters such as vibration, temperature, bearing conditions, strain etc. can be monitored
only when machines/ systems are running (on-line). Again such monitoring can be periodic or continuous
depending on criticality. Some other condition parameters, such as internal clearances, gaps and
backlashes, can be inspected/monitored only when the machines/systems are not working or are dismantled (off-line).
Considering the above aspects and also considering the severity/criticality of the machines and the defects likely
to be generated, a master inspection/monitoring schedule is prepared; monitoring/inspections are carried out as per
the schedule and recorded. In case a defect or deviation is observed, rechecking is often done (though not
scheduled) or the next inspection date is brought forward to ascertain the defect/deviation. If possible, severity limits of
condition parameters are also ascertained.
7. Trend Monitoring: The inspection records, thus obtained, are decoded, analyzed and compared with the
maximum allowable limits or earlier data if available. Today the equipment manufacturer often supplies vibration
signature/records and records of the other condition parameters of the new machines along with their test-reports.
These can be taken as reference values for monitoring. However, for older machines/systems, where such reference
values are not available, successive inspection/monitoring records of the same machines are analyzed and a trend is
established, which helps in finding out the extent of deviation/defect. Comparison with inspection results of other
similar machines also helps in such trend monitoring. In case of an increasing trend, more frequent
inspections/monitoring are made. Failure statistics (MTBF, MTTR etc.), if available, can also be considered.
This assessment will indicate the deterioration developing in the machine and will also indicate the time when
corrective repairs are to be done to avoid failure.
8. Repair Schedule and Execution: Based on the assessments indicated by trend monitoring, necessary repair
actions are planned, scheduled and executed to correct the deterioration before it reaches higher limits.
9. Follow-up: After the repairs, the condition parameters are again inspected and analyzed to ensure that the defects
identified earlier were repaired correctly, and the cycle then continues.
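The selection rule described in step 2 can be written as a simple lookup. The sketch below only restates the pairings given in the text; the function name and the dictionary names are hypothetical.

```python
# Monitoring approach for each criticality class, as described in step 2.
STRATEGY_BY_CRITICALITY = {
    "very critical": "on-line continuous monitoring",
    "critical": "continuous monitoring",
    "less critical": "periodic monitoring",
    "least critical": "fixed time maintenance (P.M.) or run to failure",
}

# Equivalent pairing when machines are classified by VED analysis.
STRATEGY_BY_VED = {
    "vital": "continuous monitoring",
    "essential": "periodic monitoring",
    "desirable": "fixed time maintenance (P.M.)",
}


def monitoring_strategy(label):
    """Return the monitoring approach for a criticality or VED label."""
    key = label.strip().lower()
    if key in STRATEGY_BY_CRITICALITY:
        return STRATEGY_BY_CRITICALITY[key]
    if key in STRATEGY_BY_VED:
        return STRATEGY_BY_VED[key]
    raise ValueError(f"unknown classification: {label!r}")


print(monitoring_strategy("Less critical"))
print(monitoring_strategy("Vital"))
```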
The success of this program depends on the effective use of monitoring instruments and proper analysis of
inspection data by skilled persons and also by taking timely repair action.
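Trend monitoring (step 7) can be illustrated with a straight-line fit through successive readings of one condition parameter. This is a minimal sketch, assuming equally spaced inspections and a roughly linear deterioration; the function names and the sample readings are hypothetical.

```python
def linear_trend(readings):
    """Least-squares slope and intercept for equally spaced readings."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def inspections_until_limit(readings, limit):
    """Estimate how many further inspections remain before the fitted
    trend reaches the allowable limit.

    Returns 0 if the latest reading is already at or above the limit,
    and None if the trend is flat or falling.
    """
    if readings[-1] >= limit:
        return 0
    slope, intercept = linear_trend(readings)
    if slope <= 0:
        return None
    crossing_index = (limit - intercept) / slope
    return max(0.0, crossing_index - (len(readings) - 1))


# Hypothetical vibration velocity readings (mm/s) from successive inspections.
history = [2.0, 2.5, 3.0, 3.5]
print(inspections_until_limit(history, limit=5.0))
```

A small remaining count is a signal to inspect more frequently and to schedule the repair, as steps 7 and 8 describe.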
CONDITION BASED MONITORING TECHNIQUES
• Visual Monitoring.
• Leakage Monitoring.
• Temperature Monitoring.
• Lubricant Monitoring (Wear Debris Analysis).
• Vibration Monitoring.
• Sound/Acoustic Monitoring.
• Cracks Monitoring.
• Corrosion Monitoring.
• Noise/Sound Monitoring.
• Smell/Odour Monitoring.
VISUAL MONITORING
Visual monitoring is the most commonly used method. Such monitoring can be done using:
1) Human eyes
2) Optical probes
3) Optical probes with television
The selection is based on the sophistication/complexity involved. Optical probes are used where a person cannot easily
get close enough to see, or where it is hazardous to look with the naked eye. With a set of televisions, a person sitting in one
place can see and monitor the conditions at different places. These systems can also be computerized.
TEMPERATURE MONITORING
The techniques/instruments used in temperature monitoring are
• Temperature crayons and tapes
• Thermometers and optical pyrometers
• Thermocouples
• Fusible plugs
• Infrared meters
• Thermography (infrared radiation scanner)
a. Infrared Thermography. A non-contact technique employing either a video system or a scanning-type temperature
probe that measures infrared radiation emitted and reflected from surfaces. The technique is also effective in detecting
thermal cavities and roof leaks.
b. Contact Devices. Devices such as thermometers, resistance temperature detectors, thermocouples, decals, and
crayons that detect temperatures within 0.25 °C.
c. Deep-Probe Temperature Analysis. Using temperature probes inserted into the soil in the vicinity of buried pipes
carrying steam or hot fluid to determine the degree of leakage and energy loss.
Thermography
Thermography, or thermal imaging, is a type of infrared imaging. Thermographic cameras detect radiation in the
infrared range of the electromagnetic spectrum (roughly 900–14,000 nanometers or 0.9–14 µm) and produce images
of that radiation. Since infrared radiation is emitted by all objects based on their temperature, according to the black
body radiation law, thermography makes it possible to "see" one's environment with or without visible illumination.
The amount of radiation emitted by an object increases with temperature; therefore thermography allows one to see
variations in temperature, hence the name. With a thermographic camera, warm objects stand out well against cooler
backgrounds.
Thermal imaging photography finds many other uses. For example, firefighters use it to see through smoke, find
persons, and localize hotspots of fires. With thermal imaging, power lines maintenance technicians locate overheating
joints and parts, a telltale sign of their failure, to eliminate potential hazards. Where thermal insulation becomes faulty,
building construction technicians can see heat leaks to improve the efficiencies of cooling or heating air-conditioning.
The appearance and operation of a modern thermographic camera are often similar to those of a camcorder. Enabling the
user to see in the infrared spectrum is a function so useful that the ability to record the output is often optional. A recording
module is therefore not always built-in.
The basis for IR imaging technology is that any object whose temperature is above absolute zero (0 K) radiates infrared
energy. The amount of radiated energy is a function of the object's temperature and its relative efficiency of thermal radiation,
known as emissivity. Fig. shows an image of an aluminum housing.
Radiated energy (power) is proportional to the fourth power of the body's absolute temperature. For example, a black body
(emissivity of 100%) at 30 °C would have a radiation density of 5.4 mW/cm². That same black body at a temperature of
150 °C would have a radiation density of 139.2 mW/cm². This energy can be measured and an instrument calibrated to
indicate the corresponding temperature of the surface it's "looking at." Instruments which "scan" an object and create
an image or spatial map of surface temperatures are referred to as thermal imagers.
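The fourth-power relation is the Stefan-Boltzmann law. The sketch below is illustrative only. The two radiation densities quoted above appear to correspond to the net radiation of a black body towards surroundings at roughly 21 °C, rather than to the total emitted power; that ambient temperature is an assumption inferred from the numbers and is not stated in the text. The function names are hypothetical.

```python
SIGMA = 5.670374e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def radiant_exitance_mw_cm2(temp_c, emissivity=1.0):
    """Total power radiated per unit area by a surface, in mW/cm^2."""
    temp_k = temp_c + 273.15
    watts_per_m2 = emissivity * SIGMA * temp_k ** 4
    return watts_per_m2 / 10.0  # 1 W/m^2 = 0.1 mW/cm^2


def net_radiation_mw_cm2(temp_c, ambient_c=21.1, emissivity=1.0):
    """Net radiated power density towards surroundings at ambient_c.

    The 21.1 C default is an assumed ambient temperature, chosen because
    it roughly reproduces the figures quoted in the text.
    """
    return (radiant_exitance_mw_cm2(temp_c, emissivity)
            - radiant_exitance_mw_cm2(ambient_c, emissivity))


for t in (30.0, 150.0):
    print(t, round(radiant_exitance_mw_cm2(t), 1), round(net_radiation_mw_cm2(t), 1))
```

An emissivity below 1.0 scales both terms down, which is why an imager must be set to the emissivity of the surface before its reading can be interpreted as a temperature.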
LUBRICANT MONITORING
Oil analysis is used to determine the condition of a given oil, fuel, or grease sample by testing for viscosity; particle,
fuel, and water contaminants; acidity/alkalinity (pH); breakdown of additives; and oxidation.
Coupled with other technologies such as vibration and temperature measurements, oil analysis identifies the
equipment condition and aids in identifying the root cause of failures.
The methods of lubricant monitoring or wear debris monitoring can be classified into the following three categories:
1. Direct Detection Method: The wear debris in the machine is detected by arranging the oil to flow through a device
which is sensitive to the presence of the debris.
2. Debris collection Method: Wear debris is collected in a device fitted to the machine so that the debris can be
extracted periodically for the examination.
3. Lubricant sample Analysis: In such method, a representative sample of lubricant is taken out periodically from the
machine and is analyzed for wear debris types, concentration and pattern etc.
Based on these methods, the following techniques/equipments (which are component of wear debris monitoring) are
used for monitoring the health and condition of the plant/equipment with which the oils/lubricants come into contact.
• Magnetic plugs
• Oil monitoring filters
• Ferrography
• Spectroscopy
• Particle counter
Visguage: This is actually a viscosity comparator which quickly gives the approximate viscosity of the oil. However, the exact
viscosity of the oil can be measured in a laboratory with viscosity meters.
It is a small portable kit which can be taken near the engine of any D.G. set, rail or any other engine. It is a very
convenient, quick, and reasonably dependable and efficient oil monitoring system which helps in identifying
defective operating conditions in the engine caused by fuel dilution, dirty air filters, faulty combustion,
excessive wear etc. It also helps in indicating the correct engine oil change period and thus conserves precious oil.
Magnetic Plugs
These are simple magnetic probes positioned in the lubricating system for maximum catch efficiency of the wear
debris. They are generally fitted at the bottom of the sump or reservoir so that they can be periodically
removed along with the collected metallic debris.
Magnetic plugs are of the debris collection method type. The plug, in addition to collecting metallic debris for
monitoring, also helps in cleaning the oil.
Oil Monitoring Filters
These work on the debris collection method. These filters can be used with an “Integrated oil analyzer” or on their own.
A special filter is used which passes the very small particles of little interest but captures and retains small wear
particles. Fig. shows a filter unit which consists of a syringe pump to force the oil at constant flow, a filter to capture
the particles, a means to measure the pressure drop across the filter, and a magnet and flux sensor to measure the
magnetic particles. With clean oil, the pressure drop across the filter rises quickly to an equilibrium value proportional to
the viscosity and then remains nearly constant. When particles are present, the pressure drop continues to increase as the
syringe pump forces the oil through the filter at constant flow. Thus the rate of rise in pressure drop can be correlated to
particle concentration and also to the wear generated. Further analysis of the wear/debris particles deposited on the filter
can reveal the components which are wearing out.
Ferrography
Ferrography, or wear-particle analysis, is the identification of all particles suspended in the lubricating fluids of any
oil-wetted machinery. This technology was developed by the U.S. Navy in the 1970s. Today, it is available worldwide
through commercial laboratories. Ferrography provides a non-invasive look at historic, current and future conditions
of a machine's lubricated components. This is all accomplished without the time and expense of physical examination.
The core of ferrography is the set of analytical methods that identify the size, shape, composition and concentration of particles.
Once a trained analyst determines these factors, an association between the wear particles and the specific component
of origin can be determined. This is done through direct examination of the particles. Glass substrate, or Ferrogram
analysis, is one common method of particle identification. Predict/DLI of Cleveland developed a method of particle
distribution that uses a magnetic gradient field. A combination of incline, sample preparation and a magnetic field
ensure all particles present in the lubricant sample are deposited on the substrate for examination.
This technique uses a powerful magnetic field to separate wear particles from a sample of the machine's lubricating oil.
The technique works in the following two stages:
• Detection of the onset of a machine failure using a ‘direct reading ferrograph’.
• ‘Diagnosis’ of the failure type by identification of the wear particles using an ‘analytical ferrograph’.
In the ‘direct reading ferrograph’ a magnetic field aligns the particles according to size within a glass capillary tube.
Large particles (5 microns and above) are clustered near the entry point of the tube and the small particles (less than 5
microns) are concentrated downstream. The instrument measures the total number of large particles (DL) and the total
number of small particles (DS). When the condition of the machine deteriorates, the total number of wear
particles (DL+DS) increases and the ratio of large to small particles (DL/DS) also increases.
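The two direct-reading indicators can be compared between successive oil samples as sketched below. This is only an illustration of the rule stated above (both DL+DS and DL/DS rising); the function names and the sample values are hypothetical, and no alarm thresholds are implied.

```python
def wear_indices(dl, ds):
    """Return (DL + DS, DL / DS) for one direct-reading ferrograph sample."""
    if dl < 0 or ds < 0:
        raise ValueError("readings must be non-negative")
    total = dl + ds
    ratio = dl / ds if ds > 0 else float("inf")
    return total, ratio


def is_deteriorating(previous, current):
    """Flag a sample when both the total particle count and the
    large-to-small ratio have risen since the previous sample.

    Each argument is a (DL, DS) pair.
    """
    prev_total, prev_ratio = wear_indices(*previous)
    curr_total, curr_ratio = wear_indices(*current)
    return curr_total > prev_total and curr_ratio > prev_ratio


# Hypothetical readings from two successive samples of the same machine.
print(wear_indices(12.0, 8.0))
print(is_deteriorating((10.0, 8.0), (30.0, 12.0)))
```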
For a bad sample, diagnosis of the failure mode is done with the help of the ‘analytical ferrograph’. With this instrument,
lube oil is pumped at a very slow and controlled rate across a glass slide held at a slight inclination above the poles of
a very powerful magnet. The wear debris deposits on the slide in a fashion ideally suited to microscopic
examination of individual wear particles at magnifications from 100 to 1000 times. The shape, configuration
and surface characteristics of the particles indicate the type of wear and the type/mode of failure; for example, particles in
the form of loops, spirals and bent wires indicate abrasive wear.
Spectroscopy
Spectroscopy is the study of matter by investigating light, sound, or particles that are emitted, absorbed or scattered by
the matter under investigation.
Spectroscopy may also be defined as the study of the interaction between light and matter. Historically, spectroscopy
referred to a branch of science in which visible light was used for theoretical studies on the structure of matter and for
qualitative and quantitative analyses. Recently, however, the definition has broadened as new techniques have been
developed that utilize not only visible light, but many other forms of electromagnetic and non-electromagnetic
radiation: microwaves, radiowaves, x-rays, electrons, phonons (sound waves) and others.
Spectroscopy is often used in physical and analytical chemistry for the identification of substances through the
spectrum emitted from them or absorbed in them. A device for recording a spectrum is a spectrometer. Spectroscopy
can be classified according to the physical quantity which is measured or calculated or the measurement process.
Spectroscopy is also heavily used in astronomy and remote sensing. Most large telescopes have spectrographs, which
are used either to measure the chemical composition and physical properties of astronomical objects or to measure
their velocities from the Doppler shift of spectral lines.
TYPES OF SPECTROSCOPY
Emission spectroscopy uses the range of electromagnetic spectra in which a substance radiates. The
substance first absorbs energy and then radiates this energy as light. This energy can be from a variety of
sources, including collision (either due to high temperatures or otherwise), and chemical reactions.
Absorption spectroscopy uses the range of electromagnetic spectra in which a substance absorbs. In atomic
absorption spectroscopy, the sample is atomized and then light of a particular frequency is passed through the
vapour. After calibration, the amount of absorption can be related to the concentrations of various metal
ions.
Scattering spectroscopy measures certain physical properties by measuring the amount of light that a
substance scatters at certain wavelengths, incident angles, and polarization angles. Scattering spectroscopy
differs from emission spectroscopy due to the fact that the scattering process is much faster than the
absorption/emission process.
PERFORMANCE MONITORING
Performance monitoring (PM) here applies to an organization as a whole, not to individuals. We have identified two main ways of instituting such PM:
An in-depth evaluation of an organization’s processes and outcomes, typically involving a site-visit and large
amounts of documentation. Examples are OFSTED visits, police inspections, QAA in universities, HMI
Prison reports.
The collection and publication of summary performance indicators. These can be broad or narrow in focus.
For example, schools essentially face just three: truancy rates and two measures of GCSE pass rates. Local
Authorities face a long list.
The more detailed measures are more expensive to collect, and if it can be shown that the summary measures provide
as good a measure as more detailed ones, there is then a case for moving to such measures.
CURRENT MONITORING
Current monitoring enables the current consumed around the home as well as in industry to be analysed. Monitoring
supply currents can highlight areas of excessive current consumption and once the problem has been identified ideas
can be implemented to reduce this excess. This will help reduce bills, increase profits and in turn help the
environment.
As machinery gets older the current consumed will often increase as parts become worn and the machine becomes
inefficient. Data collected from separate machines over long periods of time can highlight problems before they
become critical. Servicing or replacement can then be scheduled before the machine fails, which avoids loss of
production.
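A simple way to use such data is to compare each machine's latest current with its own baseline, as in the sketch below. The 10% threshold, the function names and the sample readings are illustrative assumptions only.

```python
def excess_current_percent(baseline_amps, measured_amps):
    """Percentage by which the measured current exceeds the baseline."""
    if baseline_amps <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (measured_amps - baseline_amps) / baseline_amps


def machines_needing_service(readings, threshold_percent=10.0):
    """List machines whose current has risen above the threshold.

    readings maps a machine name to (baseline_amps, latest_amps).
    """
    return sorted(
        name
        for name, (baseline, latest) in readings.items()
        if excess_current_percent(baseline, latest) > threshold_percent
    )


# Hypothetical readings for two machines.
plant = {"pump A": (10.0, 11.5), "fan B": (8.0, 8.2)}
print(machines_needing_service(plant))
```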
VIBRATION MONITORING
The four characteristics of vibration listed below each carry a specific significance.
1. Displacement indicates “how much” vibration is present, which also indicates how good or bad the condition
of the machine is.
2. Peak velocity also indicates “how much” vibration is present, which again indicates how good or bad the
condition of the machine is.
3. Frequency of the vibration indicates “what” is causing the vibration. Such vibration readings are said to be in
frequency mode. This is the most important vibration characteristic.
4. Phase of the vibration indicates “where” the vibration or problem originates. Phase is the position of the vibrating
part at a given instant with reference to a fixed point. Phase measurement is mainly used in dynamic balancing
of machines and is helpful in identifying certain causes of vibration.
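Displacement and peak velocity both answer “how much”, and for a purely sinusoidal vibration at a single frequency they are linked through that frequency. The sketch below assumes such a single-frequency sinusoid, which real machine vibration only approximates; the function names and the sample values are hypothetical.

```python
import math


def peak_velocity_mm_s(displacement_peak_mm, frequency_hz):
    """Peak velocity of a sinusoid: V = 2 * pi * f * D."""
    return 2.0 * math.pi * frequency_hz * displacement_peak_mm


def peak_acceleration_m_s2(displacement_peak_mm, frequency_hz):
    """Peak acceleration of a sinusoid: A = (2 * pi * f)^2 * D."""
    omega = 2.0 * math.pi * frequency_hz
    return omega ** 2 * displacement_peak_mm / 1000.0  # mm/s^2 to m/s^2


# Hypothetical case: 0.05 mm peak displacement at 50 Hz (3000 rpm).
print(round(peak_velocity_mm_s(0.05, 50.0), 3))
print(round(peak_acceleration_m_s2(0.05, 50.0), 3))
```

Because velocity grows with frequency and acceleration with its square, the same displacement reading is more severe at higher frequencies, which is one reason the mode of measurement has to be chosen to suit the machine.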
For vibration study and monitoring, the following three basic principles are assumed:
a) All machines vibrate, because there is no perfect machine with zero vibration.
b) Vibration increases as mechanical trouble increases.
c) Different troubles cause vibrations in different ways.
Vibrations are measured by various types of vibration meters. Some of the vibration analysis methods are:
a) Spectral analysis
b) Statistical analysis and Kurtosis method
c) Envelope analysis
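As an illustration of the kurtosis method, the sketch below computes the kurtosis (fourth central moment divided by the square of the variance) of a smooth signal and of the same signal with periodic impacts added. A pure sine wave gives 1.5 and a Gaussian signal gives 3, while impulsive signals give higher values. The signals here are synthetic and the impact size is an arbitrary assumption.

```python
import math


def kurtosis(samples):
    """Fourth central moment divided by the squared variance."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("signal has zero variance")
    return m4 / m2 ** 2


# Ten full periods of a sine wave, 64 samples per period.
sine = [math.sin(2.0 * math.pi * k / 64) for k in range(640)]

# Same signal with one synthetic impact per period.
impulsive = list(sine)
for i in range(0, 640, 64):
    impulsive[i] += 5.0

print(round(kurtosis(sine), 3))
print(kurtosis(impulsive) > kurtosis(sine))
```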
For a rotating machine the spectrum changes with the speed of rotation. Arranging the various spectra vertically in
ascending value of this parameter results in a very useful diagram for fault identification. Such a plot is called a
spectral map or cascade diagram.
In a Campbell diagram the machine speed in rpm is plotted along the horizontal axis and frequency along the vertical axis.
Orders of harmonics are shown as broken lines originating at the lower left corner.
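Each order line in a Campbell diagram is a straight line through the origin, since the frequency of the n-th order is n times the running speed. The sketch below shows that relation and, as an illustrative use, the speed at which an order line reaches a given frequency; the function names and the 100 Hz value are hypothetical.

```python
def order_frequency_hz(rpm, order):
    """Frequency of a given order (multiple of running speed) at a shaft speed."""
    return order * rpm / 60.0


def speed_for_frequency_rpm(frequency_hz, order):
    """Shaft speed at which the given order line reaches frequency_hz."""
    if order <= 0:
        raise ValueError("order must be positive")
    return 60.0 * frequency_hz / order


print(order_frequency_hz(3000, 1))         # 50.0
print(speed_for_frequency_rpm(100.0, 2))   # 3000.0
```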
Spike energy, a measuring unit for judging bearing condition, is based on high frequency peak acceleration. It is
used for rolling element bearings, where damage is generally in the form of local spalling. Each time a rolling element
passes over the local damage, there is a short impact which results in the release of a certain amount of energy. This is
known as a short shock pulse. Spike energy measurement differs from ordinary acceleration measurement in that it detects
only high frequency vibrations and holds their peak amplitudes. The spike energy measuring circuit uses a high
frequency band pass filter to reject low frequency signals caused by unbalance, misalignment etc.
Based on these methods and characteristics, the following types of vibration monitoring are used for maintenance in
industries:
CORROSION MONITORING
The principle of corrosion monitoring equipment is based on the corrosion or chemical wear of the test
material. The use of such techniques for condition monitoring of machines/components (for helping in condition
based maintenance or other maintenance jobs) is very limited and selective. Again, some of these may not give an
accurate machine deterioration rate.
NOISE/SOUND MONITORING
Noise and sound are basically the same except that the noise may be considered harsh, unpleasant and undesirable
sound.
Noise is the result of transient vibrations of the structures and components of machines. Such vibrations are
induced by rapid energy release, rapid pressure and temperature rise, cavitation and air ingress, leakages and other
malfunctions etc.
For identifying the noise source following techniques may be used:
• Subjective assessment
• Acoustic duct(such as horn)
• Surface intensity approach(using accelerometer on vibrating surface and a microphone)
• Acoustic intensity approach
• Lead wrapping technique (using a blanket of sound absorbing porous material, pasted on lead sheet and placed on the
vibrating surface of the machine so that the porous material is sandwiched between the lead sheet and the machine
surface)
However, for condition based maintenance, vibration monitoring has overshadowed noise monitoring.
SMELL/ODOUR MONITORING
Efficient smell/odour monitoring systems have probably not yet been developed, though smelling through the nose has been
used for ages to detect leakages and the presence of a few gases (coke oven gas, ammonia etc.). However, many gas
detection systems (on-line and off-line) and instruments are available in the market, which mostly work on chemical,
electro-chemical and infrared actions/reactions, e.g.
• SO2 analyzer (based on a heated UV instrument) to measure SO2 in flue gases in sulphuric acid plants and
paper/pulp mills etc.
• CO, CO2, NH3 and CH4 detectors/analyzers based on infrared techniques using gas filled detectors.
• Hydrocarbon analyzers for monitoring stack gases of boilers/furnaces.
• NO2 analyzer based on the chemiluminescent technique etc.
Technical benefits:-
Organizational Benefits:-
RELIABILITY-CENTERED MAINTENANCE
Over the past twenty years, maintenance has changed, perhaps more so than any other management discipline. The
changes are due to a huge increase in the number and variety of physical assets (plant, equipment and buildings)
which must be maintained throughout the world, much more complex designs, new maintenance techniques and
changing views on maintenance organization and responsibilities.
Maintenance is also responding to changing expectations. These include a rapidly growing awareness of the extent to
which equipment failure affects safety and the environment, a growing awareness of the connection between
maintenance and product quality, and increasing pressure to achieve high plant availability and to contain costs.
The changes are testing attitudes and skills in all branches of industry to the limit. Maintenance people have to adopt
completely new ways of thinking and acting, as engineers and as managers. At the same time the limitations of
maintenance systems are becoming increasingly apparent, no matter how much they are computerized.
In the face of this avalanche of change, managers everywhere are looking for a new approach to maintenance. They
want to avoid the false starts and dead ends which always accompany major upheavals. Instead they seek a strategic
framework which synthesizes the new developments into a coherent pattern, so that they can evaluate them sensibly
and apply those likely to be of most value to them and their companies.
This chapter provides a brief introduction to RCM, starting with a look at how maintenance has evolved over the past
fifty years.
Reliability-centered Maintenance
Before looking at where we are going, let's see where we have come from. In the period up to and shortly after the First
World War, equipment was generally simple and robust. The ways in which it could fail were easily treated since the
simplicity aided diagnostics, and in some cases equipment failure was an acceptable reason for loss of production. In
this environment, maintenance was largely reactive: simply to fix things when they failed, supplemented with simple
tasks such as lubrication. As an example, consider a steam train. The operating principle is well understood, the
systems are simple with a low level of automation and low configurability (essentially doing one job), the construction
is robust and contains redundant elements, and operational tolerances are broad. There had to be a lot wrong before the
train actually stopped running.
However, during the Second World War, things began to change and the availability of manpower declined in
industrialized economies of the time. Equipment became more complex, thus replacing the need for manual
intervention and reducing manpower requirements. Loss of production through equipment failure also became
unacceptable leading to work on prevention of failures before they occurred. Conventional wisdom suggests that as
equipment gets older it "wears out" and becomes more likely to fail. Using this model it was believed that failures
could be avoided if equipment was maintained before items "wore out" and the failure occurred, i.e. planned
intervention at the right time would prevent failures; all that had to be determined was the right time.
Interestingly, this line of thought yields an insight into one of the principal maintenance performance indicators still
in use to this day, i.e. the ratio of planned to breakdown maintenance. If the likelihood of item failure increases with
age, then planned intervention before the failure should reduce the number of failures that occur. This model suggests
that if we continue to see failures then we have not intervened early enough, i.e. we do not yet know the right age.
Therefore it would seem appropriate to measure the effectiveness of our strategy by measuring the ratio of planned
to unplanned maintenance. This is widely reported in industry; improvement targets are even established for this ratio
(in most cases, the target is parity). However, as will shortly be shown, this takes no account of the technical
characteristics of the failure and assumes that we want to prevent all failures. This is not the case, and the measurement
is essentially meaningless; e.g. one German car plant has determined its most effective ratio of planned to unplanned
maintenance as 1:64!
The growth of civil aviation in the late 1940's and 1950's triggered the next step. At about the same time the Federal
Aviation Administration (FAA), the body responsible for regulating airlines in the USA, was worried about aircraft
reliability. In an effort to reduce the number of failures, the industry concluded that maintenance was being done
too late based on the accepted "wear out" model of failure. So the frequency of scheduled maintenance was increased.
This led to higher maintenance costs, which by the late 1950's prompted the industry to look at the concept of
preventive maintenance. In addition the FAA was concerned that the reliability of some engines had not been
improved by changing either the type or frequency of overhaul. The data available at the time indicated that although
the frequency of occurrence of some failures had been reduced, many more had remained unchanged or actually
increased! There was no way this finding could be explained using the model of failure accepted at that time.
A task force, consisting of representatives from both the FAA and the airlines, was established to investigate planned
maintenance policies. What evolved was a statement from the committee that the reliability and the overhaul
frequency of equipment were not necessarily directly related, and that the common belief that reliability declined with
increasing age was not generally true. In fact:
1. Scheduled overhaul has little effect on the overall reliability of a complex item unless there is a dominant failure
mode.
2. There are many items for which there is no effective form of scheduled maintenance.
It became obvious that too much emphasis had been placed on the "right age" model.
The task force went on to develop a propulsion system reliability program; each airline involved developed reliability
programs for their own particular areas of interest. These became the Handbook for the Maintenance Evaluation and
Program Development for the Boeing 747, more commonly known as MSG-1 (Maintenance Steering Group 1).
MSG-1 was subsequently improved and became MSG-2. In 1979 the Air Transport Association (ATA) reviewed MSG-2 to
incorporate further developments in preventive maintenance; this resulted in MSG-3, the Airline/Manufacturers
Maintenance Program Planning Document.
United Airlines was sponsored by the US Department of Defense to write a comprehensive document on the
relationships between Maintenance, Reliability and Safety. The report, prepared by Stanley Nowlan and Howard
Heap, was called "Reliability Centred Maintenance". Outside the aerospace industries, the application of MSG-3 is
generally known as RCM. The work of the airlines predated similar problems that spread throughout industry during
the 1980's; consequently, industry has been fortunate in being able to use the airlines' prior experience.
Thus, since the 1930's, the evolution of maintenance can be traced through three generations. RCM is
rapidly becoming a cornerstone of the Third Generation, but this generation can only be viewed in perspective in the
light of the First and Second Generations.
The First Generation
The First Generation covers the period up to World War II. In those days industry was not very highly mechanized, so
downtime did not matter much. This meant that the prevention of equipment failure was not a very high priority in the
minds of most managers. At the same time, most equipment was simple and much of it was over-designed. This made
it reliable and easy to repair. As a result, there was no need for systematic maintenance of any sort beyond simple
cleaning, servicing and lubrication routines. The need for skills was also lower than it is today.
The Second Generation
Things changed dramatically during World War II. Wartime pressures increased the demand for goods of all kinds
while the supply of industrial manpower dropped sharply. This led to increased mechanization. By the 1950's
machines of all types were more numerous and more complex. Industry was beginning to depend on them.
As this dependence grew, downtime came into sharper focus. This led to the idea that equipment failures could and
should be prevented, which led in turn to the concept of preventive maintenance. In the 1960's, this consisted mainly
of equipment overhauls done at fixed intervals.
The cost of maintenance also started to rise sharply relative to other operating costs. This led to the growth of
maintenance planning and control systems. These have helped greatly to bring maintenance under control, and are
now an established part of the practice of maintenance.
Finally, the amount of capital tied up in fixed assets together with a sharp increase in the cost of that capital led people
to start seeking ways in which they could maximize the life of the assets.
The Third Generation
Since the mid-seventies, the process of change in industry has gathered even greater momentum. The changes can be
classified under the headings of new expectations, new research and new techniques.
New expectations
Downtime has always affected the productive capability of physical assets by reducing output, increasing operating
costs and interfering with customer service. By the 1960's and 1970's, this was already a major concern in the mining,
manufacturing and transport sectors. In manufacturing, the effects of downtime are being aggravated by the
worldwide move towards just-in-time systems, where reduced stocks of work-in-progress mean that quite small
breakdowns are now much more likely to stop a whole plant. In recent times, the growth of mechanization and
automation has meant that reliability and availability have now also become key issues in sectors as diverse as health
care, data processing, telecommunications and building management.
Greater automation also means that more and more failures affect our ability to sustain satisfactory quality standards.
This applies as much to standards of service as it does to product quality. For instance, equipment failures can affect
climate control in buildings and the punctuality of transport networks as much as they can interfere with the consistent
achievement of specified tolerances in manufacturing.
More and more failures have serious safety or environmental consequences, at a time when standards in these areas
are rising rapidly. In some parts of the world, the point is approaching where organizations either conform to society's
safety and environmental expectations, or they cease to operate. This adds an order of magnitude to our dependence
on the integrity of our physical assets - one which goes beyond cost and which becomes a simple matter of
organizational survival.
At the same time as our dependence on physical assets is growing, so too is their cost - to operate and to own. To
secure the maximum return on the investment which they represent, they must be kept working efficiently for as long
as we want them to.
Finally, the cost of maintenance itself is still rising, in absolute terms and as a proportion of total expenditure. In some
industries, it is now the second highest or even the highest element of operating costs. As a result, in only thirty years
it has moved from almost nowhere to the top of the league as a cost control priority.
New research
Quite apart from greater expectations, new research is changing many of our most basic beliefs about age and failure.
In particular, it is apparent that there is less and less connection between the operating age of most assets and how
likely they are to fail.
Figure 1.2 shows how the earliest view of failure was simply that as things got older, they were more likely to fail. A
growing awareness of 'infant mortality' led to widespread Second Generation belief in the "bathtub" curve.
However, Third Generation research has revealed that not one or two but six failure patterns actually occur in
practice. This is discussed in detail later, but it too is having a profound effect on maintenance.
New techniques
There has been explosive growth in new maintenance concepts and techniques. Hundreds have been developed over
the past fifteen years, and more are emerging every week.
Figure 1.3 shows how the classical emphasis on overhauls and administrative systems has grown to include many new
developments in a number of different fields.
• Decision support tools, such as hazard studies, failure modes and effects analyses and expert systems
• New maintenance techniques, such as condition monitoring
• Designing equipment with a much greater emphasis on reliability and maintainability
• A major shift in organizational thinking towards participation, team-working and flexibility.
A major challenge facing maintenance people nowadays is not only to learn what these techniques are, but to decide
which are worthwhile and which are not in their own organizations. If we make the right choices, it is possible to
improve asset performance and at the same time contain and even reduce the cost of maintenance. If we make the
wrong choices, new problems are created while existing problems only get worse.
In a nutshell, the key challenges facing modern maintenance managers can be summarized as follows:
• To select the most appropriate techniques to deal with each type of failure process in order to fulfill all the
expectations of the owners of the assets, the users of the assets and of society as a whole
• In the most cost-effective and enduring fashion
• With the active support and co-operation of all the people involved.
Reliability Centered Maintenance (RCM) has its place, but many times plants jump into training programs and
attempt to implement Reliability Centered Maintenance long before they are ready for it. The academia of
maintenance management still argues about the definition of RCM. Some even say that if it is not done exactly the
way they prescribe, then it is not RCM. So what? The whole idea is that you want to achieve more cost-effective
reliability through the implementation of better operations and maintenance practices.
Reliability Centered Maintenance (RCM) has its definite place in the specification and design phase of new equipment
and systems, and for existing critical and complicated systems. The thought process used, for example, to analyze
existing preventive programs, is good, but can easily be made overcomplicated to serve the purpose. I have analyzed
the results of many RCM implementations, and the fact is that after a very lengthy criticality and failure mode
analysis, the end results have not changed the fact that a V-belt drive needs to be inspected for an obviously critical
belt conveyor! What is often missing is a document describing how to inspect it while the equipment is operating. In
the worst cases, belts, couplings, heat exchangers, control valves, and other common components are, even after the
RCM analyses, inspected during shutdowns. Perhaps some inspections have been deleted because equipment was not
critical. So, there you might have saved an inspection that only takes two minutes for an operator who will inspect the
process in that area every shift anyway!
The first two of the above activities are low cost and easy to implement because of high acceptance by people in your
organization. You can use standard training material to train people when and how to do inspections. What you do
with, for example, a coupling, can be decided without a complicated analysis. The failure developing period for
misalignment might be two to eight weeks, so you need to inspect it every week on the run using an infrared
thermometer. How to do this is described in a Condition Monitoring Standard for each common component.
The time to implement is short; a production area can have all inspections documented, people trained, and
inspections executed in less than four weeks. An RCM approach and implementation could take six months with no
different result. An RCM analysis might lead you to spend days deciding that the primary screen is critical, and that if
the bearings fail the screen goes down; therefore, you need to inspect the bearings—all of which is obvious.
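The coupling example above can be sketched in a few lines. This is only an illustration: the function name and the rule of dividing the shortest failure developing period by a margin of 2 are assumptions made here, chosen because they reproduce the "two to eight weeks, so inspect every week" reasoning in the text.

```python
def inspection_interval_weeks(shortest_developing_period_weeks, margin=2.0):
    """Suggest an on-the-run inspection interval.

    shortest_developing_period_weeks: shortest time from the first detectable
        sign of a problem to functional failure (2 weeks for the coupling
        misalignment example in the text).
    margin: hypothetical safety margin; 2.0 means at least two inspections
        fall inside the shortest developing period.
    """
    if shortest_developing_period_weeks <= 0:
        raise ValueError("developing period must be positive")
    if margin < 1:
        raise ValueError("margin must be at least 1")
    return shortest_developing_period_weeks / margin


# Misalignment develops over two to eight weeks; the shortest case governs.
interval = inspection_interval_weeks(2.0)
print(f"Inspect the coupling every {interval:g} week(s)")
```

A larger margin gives more chances to catch a developing failure at the price of more inspection rounds; the right value is a local judgment, not something the text prescribes.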
RCM does not consider planning and scheduling and people efficiency at all, nor does it include vital support systems
such as a technical database and its interface with stores. RCM is therefore a tool that should be used selectively
for critical and very complicated systems and equipment. It is not a complete reliability and maintenance system.
Do not fall into the trap of believing it is something completely new and different, or that it is a complete program for
reliability and maintenance. There are certain mills that have spent over three years on RCM implementation and they
still do not have the basics in place and/or executed well. It cannot be stressed often enough: do the basics well
before you start complicating things.
Reliability-Centered Maintenance (RCM) is the optimum mix of reactive, time- or interval-based, condition-based,
and proactive maintenance practices. The basic application of each strategy is shown in Fig. 1. These principal
maintenance strategies, rather than being applied independently, are integrated to take advantage of their respective
strengths in order to maximize facility and equipment reliability while minimizing life-cycle costs.
RCM includes reactive, time-based, condition-based, and proactive tasks. In addition, a user should understand
system boundaries and facility envelopes, system/equipment functions, functional failures, and failure modes, all of
which are critical components of the RCM program.
This modern concept of RCM has been adopted across several government and industry operations as a strategy for
performing maintenance. RCM applies maintenance strategies based on consequence and cost of failure. In addition,
RCM seeks to minimize maintenance and improve reliability throughout the life-cycle by using proactive techniques
such as improved design specifications, integration of condition monitoring in the commissioning process, and the
Age Exploration (AE) process.
RCM as a Process
• What are the functions and associated performance standards of the asset in its present operating context?
• In what ways does it fail to fulfill its function?
• What causes each functional failure?
• What happens when each failure occurs?
• In what way does each failure matter?
• What can be done to predict or prevent each failure?
• What should be done if a suitable proactive task cannot be found?
By answering each of the seven questions above, one can identify the failure modes of the equipment, the causes of
each failure, the criticality of each failure mode, and the corresponding action to prevent each failure mode.
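One possible way to record the answers to the seven questions is a small worksheet structure, sketched below. The class names, field names and the conveyor entries are hypothetical and are used only for illustration; they are not part of any RCM standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailureMode:
    cause: str                            # Q3: what causes the functional failure?
    effect: str                           # Q4: what happens when it occurs?
    consequence: str                      # Q5: in what way does it matter?
    proactive_task: Optional[str] = None  # Q6: task to predict or prevent it
    default_action: Optional[str] = None  # Q7: fallback if no proactive task fits

    def selected_action(self) -> str:
        # Prefer a proactive task; otherwise fall back to the default action.
        return self.proactive_task or self.default_action or "undecided"


@dataclass
class FunctionalFailure:
    function: str       # Q1: function and performance standard
    failed_state: str   # Q2: way in which the function is not fulfilled
    failure_modes: List[FailureMode] = field(default_factory=list)


conveyor = FunctionalFailure(
    function="Transfer material at the required rate",
    failed_state="Conveyor stops",
    failure_modes=[
        FailureMode(
            cause="V-belt wear",
            effect="Drive slips and the belt stops",
            consequence="operational",
            proactive_task="Inspect V-belt drive while running",
        ),
        FailureMode(
            cause="Indicator lamp burns out",
            effect="No local indication",
            consequence="non-operational",
            default_action="Run to failure",
        ),
    ],
)

actions = [mode.selected_action() for mode in conveyor.failure_modes]
print(actions)
```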
The seven key stages of the RCM process are shown in Figure 1 below:
Areas covered within the RCM Scorecard
The data collected through establishing an effective maintenance program allows a company to generate a range of
leading indicators: measures that lead performance, or tell you that something is likely to begin to perform badly
before it actually does.
The diagram depicts the relative impact of these areas of leading indicators, and the smaller impact of performance
measures established in the traditional lagging approaches. These are the key areas of the RCM Scorecard.
However, the basic thrust of the RCM scorecard is to allow companies to measure the effectiveness of their
maintenance policy initiatives through applying measures to the data captured in the course of doing the day-to-day
work.
RCM Principles
• RCM is Function Oriented—RCM seeks to preserve system or equipment function, not just operability for
operability's sake. Redundancy of function, through multiple pieces of equipment, improves functional reliability but
increases life-cycle cost in terms of procurement and operating costs.
• RCM is System Focused—RCM is more concerned with maintaining system function than with individual
component function.
• RCM is Reliability Centered—RCM treats failure statistics in an actuarial manner. The relationship between
operating age and the failures experienced is important. RCM is not overly concerned with simple failure rate; it
seeks to know the conditional probability of failure at specific ages (the probability that failure will occur in each
given operating age bracket).
• RCM Acknowledges Design Limitations—The objective of RCM is to maintain the inherent reliability of the
equipment design, recognizing that changes in inherent reliability are the province of design rather than of
maintenance. Maintenance can, at best, only achieve and maintain the level of reliability for equipment that was
provided for by design. However, RCM recognizes that maintenance feedback can improve on the original design.
In addition, RCM recognizes that a difference often exists between the perceived design life and the intrinsic or
actual design life and addresses this through the Age Exploration (AE) process.
• RCM is Driven by Safety, Security, and Economics—Safety and security must be ensured at any cost;
thereafter, cost-effectiveness becomes the criterion.
• RCM Defines Failure as "Any Unsatisfactory Condition"—Therefore, failure may be either a loss of
function (operation ceases) or a loss of acceptable quality (operation continues but impacts quality).
• RCM Uses a Logic Tree to Screen Maintenance Tasks—this provides a consistent approach to the
maintenance of all kinds of equipment.
• RCM Tasks Must Be Applicable—the tasks must address the failure mode and consider the failure mode
characteristics.
• RCM Tasks Must Be Effective—the tasks must reduce the probability of failure and be cost-effective.
• RCM Acknowledges Three Types of Maintenance Tasks—these tasks are time-directed (PM), condition-
directed (CM), and failure finding (one of several aspects of Proactive Maintenance). Time-directed tasks are
scheduled when appropriate. Condition-directed tasks are performed when conditions indicate they are needed.
Failure-finding tasks detect hidden functions that have failed without giving evidence of pending failure.
Additionally, performing no maintenance, Run-to-Failure, is a conscious decision and is acceptable for some
equipment.
• RCM is a Living System—RCM gathers data from the results achieved and feeds this data back to improve
design and future maintenance. This feedback is an important part of the Proactive Maintenance element of the RCM
program.
Types of RCM
There are several ways to conduct and implement an RCM program. The program can be based on Rigorous Failure
Modes and Effects Analysis (FMEA), complete with mathematically-calculated probabilities of failure based on
design or historical data, intuition or common-sense, and/or experimental data and modeling. These approaches may
be called Classical, Rigorous, Intuitive, Streamlined, or Abbreviated. Other terms sometimes used for these same
approaches include Concise, Preventive Maintenance (PM) Optimization, Reliability Based, and Reliability Enhanced.
All are applicable. The decision of what technique to use should be left to the end user and be based on:
• Consequences of failure
• Probability of failure
• Historical data available
• Risk tolerance
• Resource availability
Classical/Rigorous RCM
a. Benefits: Classical or rigorous RCM provides the most knowledge and data concerning system functions,
failure modes, and maintenance actions addressing functional failures of any of the RCM approaches. Rigorous
RCM analysis is the method first proposed and documented by Nowlan and Heap and later modified by John
Moubray, Anthony M. Smith, and others. In addition, this method should produce the most complete documentation
of all the methods addressed here.
b. Concerns: Classical or rigorous RCM historically has been based primarily on the FMEA with little, if any,
analysis of historical performance data. In addition, rigorous RCM analysis is extremely labor intensive and often
postpones the implementation of obvious condition monitoring tasks.
c. Applications: This approach should be limited to the following three situations:
• The consequences of failure result in catastrophic risk in terms of environment, health, or safety,
and/or complete economic failure of the business unit.
• The resultant reliability and associated maintenance cost is still unacceptable after performing and
implementing a streamlined type FMEA.
• The system/equipment is new to the organization and insufficient corporate maintenance and
operational knowledge exists on function and functional failures.
Abbreviated/Intuitive/Streamlined RCM
a. Benefits: The intuitive approach identifies and implements the obvious, usually condition-based, tasks with
minimal analysis. In addition, it culls or eliminates low value maintenance tasks based on historical data and
Maintenance and Operations (M&O) personnel input. The intent is to minimize the initial analysis time in order to
realize early-wins that help offset the cost of the FMEA and condition monitoring capabilities development.
b. Concerns: Reliance on historical records and personnel knowledge can introduce errors into the process that
may lead to missing hidden failures where a low probability of occurrence exists. In addition, the intuitive process
requires that at least one individual has a thorough understanding of the various condition monitoring technologies.
c. Applications: This approach should be utilized when:
• The function of the system/equipment is well understood.
• Functional failure of the system/equipment will not result in loss of life or catastrophic impact on
the environment or business unit.
For these reasons, the streamlined or intuitive approach has been recommended for DOS, NASA,
and NAVFAC facilities. In addition, a streamlined or intuitive approach has been successfully used in both discrete
and continuous manufacturing facilities.
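The application criteria listed for the two approaches can be read as a simple screening rule. The sketch below is one hedged interpretation: the function and parameter names are invented here, and each flag stands for one of the three situations in which the text says the classical approach should be used.

```python
def choose_rcm_approach(catastrophic_consequences,
                        unacceptable_after_streamlined,
                        new_to_organization):
    """Return "classical" or "streamlined" using the criteria in the text.

    catastrophic_consequences: failure could cause catastrophic environmental,
        health or safety risk, or complete economic failure of the business unit.
    unacceptable_after_streamlined: reliability and maintenance cost are still
        unacceptable after a streamlined FMEA has been implemented.
    new_to_organization: the system is new and its functions and functional
        failures are not yet well understood.
    """
    if (catastrophic_consequences
            or unacceptable_after_streamlined
            or new_to_organization):
        return "classical"
    return "streamlined"


# A well understood conveyor with no catastrophic failure consequences.
print(choose_rcm_approach(False, False, False))
# A newly installed system with possible safety consequences.
print(choose_rcm_approach(True, False, True))
```

In practice the remaining factors listed earlier (historical data available, risk tolerance, resource availability) would also influence the choice; they are left out here to keep the sketch short.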
RCM Logic
The RCM analysis should carefully consider and answer the following questions:
• What does the system or equipment do; what are the functions?
• What functional failures are likely to occur?
• What are the likely consequences of these functional failures?
• What can be done to reduce the probability of the failure(s), identify the onset of failure(s), or reduce the
consequences of the failure(s)?
Answers to these four questions can be used with the decision logic tree depicted in Fig. 3,
Reliability-Centered Maintenance (RCM) Decision Logic Tree, to determine the maintenance approach for the
equipment item or system.
Note that the analysis process as depicted in Fig. 3 has only four possible outcomes:
Failure
Failure is the cessation of proper function or performance. RCM examines failure at several levels: the system level,
sub-system level, component level, and sometimes even the parts level. The goal of an effective maintenance
organization is to provide the required system performance at the lowest cost. This means that the maintenance
approach must be based on a clear understanding of failure at each of the system levels. System components can be
degraded or even failed and still not cause a system failure. A simple example is the failed headlamp on an
automobile. That failed component has little effect on the overall system performance. Conversely, several degraded
components may combine to cause the system to have failed, even though no individual component has itself failed.
Reliability
Reliability is the probability that an item will survive a given operating period, under specified operating conditions,
without failure. It is usually expressed as B10 (L10) Life and/or Mean Time to Failure (MTTF) or Mean Time Between
Failure (MTBF). The conditional probability of failure measures the probability that an item entering a given age
interval will fail during that interval. If the conditional probability of failure increases with age, the item shows
wear-out characteristics. The conditional probability of failure reflects the overall adverse effect of age on reliability. It is
not a measure of the change in an individual equipment item.
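The two quantities defined above can be computed directly from age-interval failure counts. The sketch below assumes a fleet of identical items that all start at age zero and are not repaired or replaced during the observation; the function name and the failure counts are made up for illustration.

```python
def age_interval_statistics(initial_population, failures_per_interval):
    """Return (conditional probability of failure, reliability) per interval.

    conditional probability = failures in the interval / items entering it
    reliability             = items surviving the interval / initial population
    """
    rows = []
    surviving = initial_population
    for failures in failures_per_interval:
        if surviving <= 0:
            break  # nothing left to enter further intervals
        if failures > surviving:
            raise ValueError("more failures than items entering the interval")
        conditional = failures / surviving
        surviving -= failures
        rows.append((conditional, surviving / initial_population))
    return rows


# Hypothetical data: 100 items, failures counted in four equal age intervals.
rows = age_interval_statistics(100, [20, 16, 16, 24])
for number, (conditional, reliability) in enumerate(rows, start=1):
    print(f"interval {number}: P(fail | entered) = {conditional:.2f}, "
          f"reliability = {reliability:.2f}")
```

With these numbers the conditional probability stays at 0.20 for two intervals and then rises to 0.25 and 0.50, which is the wear-out signature described in the text, even though the raw failure count per interval barely changes at first.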
Failure rate or frequency plays a relatively minor role in maintenance programs because it is too simple a measure.
Failure frequency is useful in making cost decisions and determining maintenance intervals, but it tells nothing about
which maintenance tasks are appropriate or about the consequences of failure. A maintenance solution should be
evaluated in terms of the safety, security, or economic consequences it is intended to prevent. A maintenance task
must be applicable (i.e., prevent failures or ameliorate failure consequences) in order to be effective.
FMEA is applied to each system, sub-system, and component identified in the boundary definition. For every function
identified, there can be multiple failure modes. The FMEA addresses each system function (and, since failure is the
loss of function, all possible failures) and the dominant failure modes associated with each failure, and then examines
the consequences of the failure. What effect did the failure have on the mission or operation, the system, and on the
machine?
Even though there are multiple failure modes, often the effects of the failure are the same or very similar in nature.
That is, from a system function perspective, the outcome of any component failure may result in the system function
being degraded.
Likewise, similar systems and machines will often have the same failure modes. However, the system use will
determine the failure consequences. For example, the failure modes of a ball bearing will be the same regardless of the
machine. However, the dominant failure mode will often change from machine to machine, the cause of the failure
may change, and the effects of the failure will differ.
The failure characteristics shown in Figs. 4 and 5 were first reported in the Nowlan and Heap study mentioned earlier.
Follow-on studies in Sweden in 1973, and by the U.S. Navy in 1983, produced similar results. In these studies, random
failures accounted for 77-92% of the total failures and age-related failure characteristics for the remaining 8-23%.
Fig. 4. Random conditional probability of failure curves
The basic difference between the failure patterns of complex and simple items has important implications for
maintenance. Single-piece and simple items frequently demonstrate a direct relationship between reliability and age.
This is particularly true where factors such as metal fatigue or mechanical wear are present or where the items are
designed as consumables (short or predictable life spans). In these cases an age limit based on operating time or stress
cycles may be effective in improving the overall reliability of the complex item of which they are a part.
Complex items frequently demonstrate some infant mortality, after which their failure probability increases gradually
or remains constant. A marked wear-out age is not common. In many cases scheduled overhaul increases the overall
failure rate by introducing a high infant mortality rate into an otherwise stable system.
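The three behaviours mentioned above (infant mortality, constant failure probability and wear-out) are often illustrated with a Weibull hazard function. The Weibull model is not used in the text; it is brought in here purely as an assumption for illustration, and the hazard rate is the continuous-time counterpart of the conditional probability of failure per age interval.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1).

    shape < 1: hazard falls with age (infant mortality)
    shape = 1: hazard is constant (random failures)
    shape > 1: hazard rises with age (wear-out)
    """
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Illustrative parameters only; scale is in the same time unit as t.
for label, shape in [("infant mortality", 0.5), ("random", 1.0), ("wear-out", 3.0)]:
    early = weibull_hazard(1.0, shape, 10.0)
    late = weibull_hazard(4.0, shape, 10.0)
    print(f"{label:16s} h(1) = {early:.4f}  h(4) = {late:.4f}")
```

Under this model a scheduled overhaul only helps when the shape parameter is above 1. For a constant or falling hazard, an overhaul resets the item to the start of the curve and can reintroduce the high early-life failure rate, which is the point the paragraph above makes.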
Criticality assessment provides the means for quantifying how important a system function is relative to the identified
Mission. Table 1, Criticality/Severity Categories, provides a method for ranking system criticality. This system,
adapted from the automotive industry, provides 10 categories of Criticality/Severity. It is not the only method
available. The categories can be expanded or contracted to produce a site-specific listing.
Table 1. Criticality/Severity Categories

Ranking  Effect            Comment
...      ...               ... accomplished during trouble call.
3        Low               Minor disruption to facility function. Repair to failure may be longer than trouble call
                           but does not delay mission.
4        Low to Moderate   Moderate disruption to facility function. Some portion of mission may need to be
                           reworked or process delayed.
5        Moderate          Moderate disruption to facility function. 100% of mission may need to be reworked or
                           process delayed.
6        Moderate to High  Moderate disruption to facility function. Some portion of mission is lost. Moderate
                           delay in restoring function.
7        High              High disruption to facility function. Some portion of mission is lost. Significant
                           delay in restoring function.
8        Very High         High disruption to facility function. All of mission is lost. Significant delay in
                           restoring function.
9        Hazard            Potential safety, health, or environmental issue. Failure will occur with warning.

Source: Reliability, Maintainability, and Supportability Guidebook, Third Edition, Society of Automotive Engineers, Inc.,
Warrendale, PA, 1995.
The Probability of Occurrence (of Failure) is also based on work in the automotive industry. Table 2, Probability of
Occurrence Categories, provides one possible method of quantifying the probability of failure. If there is historical
data available, it will provide a powerful tool in establishing the ranking. If the historical data is not available, a
ranking may be estimated based on experience with similar systems in the facilities area. The statistical ("Effect")
column in Table 2 can be based on operating hours, day, cycles, or other unit that provides a consistent measurement
approach. The statistical bases ("Comment") may be adjusted to account for local conditions. For example, one
organization changed the statistical approach for ranking 1 through 5 to better reflect the number of cycles of the
system being analyzed.
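Once each failure mode has a criticality ranking (Table 1) and a probability of occurrence ranking (Table 2), the two can be combined to decide where to look first. The text does not prescribe how to combine them; multiplying the two rankings, as done below, is an assumption borrowed from common FMEA practice, and the items and rankings are hypothetical.

```python
def rank_failure_modes(items):
    """Sort (name, criticality, probability) tuples by descending priority.

    The priority score is criticality * probability. This product is an
    assumed combination rule, not one given in the text.
    """
    return sorted(items, key=lambda item: item[1] * item[2], reverse=True)


# Hypothetical rankings on the 1-10 scales described in the text.
candidates = [
    ("primary screen bearing", 8, 3),  # score 24
    ("V-belt drive", 5, 6),            # score 30
    ("indicator lamp", 3, 4),          # score 12
]

ordered = rank_failure_modes(candidates)
for name, criticality, probability in ordered:
    print(f"{name:24s} score = {criticality * probability}")
```

A site could equally decide that any item ranked as a hazard is treated first regardless of its score; the sketch leaves that policy choice out.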
RCM ANALYSIS
The RCM analysis is a systematic approach for identifying preventative maintenance tasks or scheduled maintenance
tasks for an equipment end item and establishing necessary preventative (or scheduled) maintenance task intervals.
One of the key objectives of the RCM analysis is to develop a maintenance schedule that ensures that the reliability
of a system (or end item) is enhanced. In essence, a maintenance task would be implemented prior to the failure occurrence
of the component in question.
The consolidated results from the RCM analysis process form the basis of a Preventive Maintenance (PM) program
for the system. An RCM analysis is conducted to determine which PM tasks would provide increased equipment
reliability for the life cycle. The RCM analysis would use the information generated by the FMEA to identify
which hardware components have the greatest effect on the equipment reliability and availability, by
identifying probable failure modes.
Using the decision tree process of RCM analysis, a complete analysis of each Functionally Significant Item and its
assigned failure modes can be conducted. The results of the analysis provide a clear decision as to which preventive
maintenance tasks should be developed to support the system. The RCM analysis when used in conjunction with the
FMECA can be used to identify potential hidden safety related failures for electronic systems.
Again if the RCM analyses in conjunction with the FMEA are implemented early in the design process, safety related
failure modes could be more easily removed from the system during the design phase. As the maturity of the design
progresses this option becomes increasingly more difficult and expensive to address.
RCM IMPLEMENTATION
There is no one set path for successfully implementing RCM because RCM is more than just performing a Failure
Modes and Effects Analysis (FMEA), adopting condition monitoring techniques, and/or optimizing a maintenance and
overhaul program through the application of an Age Exploration (AE) process. A successful RCM implementation
process first must recognize what and where the source of return on investment (ROI) resides. The source(s) of ROI
may be tangible and/or intangible. For the former, a quantifiable business case may be developed based on financial
benefit (savings, cost avoidance, reduced Work in Progress (WIP) and/or reduced liability) to the organization while
for the latter, the benefit may be unquantifiable (employee skills, morale, customer relations, etc.). In either case, a
baseline and goal must be established through some mechanism such as internal or external benchmarking, which
results in a defined gap between the "As-Is" and the "To-Be" state and the ROI identified for closing all or a portion of
the gap.
Remember, caveat emptor. That is, RCM is not for everyone and very few organizations will benefit from
implementing all elements of a classical RCM program. RCM like all tools/processes has an element of diminishing
return. Not all the elements of RCM which are applicable to a nuclear power plant, the aircraft industry, and/or a 24/7
continuous process plant in a sold out condition, will be applicable to a batch process operation or a non-production
facility. However, there are a few truths that everyone should follow, and accepting them requires neither a pilot nor
an FMEA. They are:
1. Key performance indicators (aka metrics/performance indicators) are essential for establishing the baseline,
goal, and the gap. Progress cannot be measured or sustained without KPIs. (See Section G-Key Performance
Indicator (KPIs) Selection)
2. Thermography works for electrical distribution, boilers, couplings, roofing systems and building façades.
3. If your specifications for alignment, imbalance, motor circuit phase impedance, oil condition and
cleanliness, and vibration are not quantified, the product you receive will have latent defects 80% of the time.
4. If you do not commission and check the sequence of operation of your equipment and buildings to a
predetermined quantifiable specification, you will not get what you expect.
5. Pareto analysis is the best tool for determining where to start your RCM process. Look for the bottlenecks,
the recurring failures, and follow the money.
6. RCM implementation in a team environment works better.
7. Failure modes for identical equipment are the same. It is only the consequence and probability of failure that
changes.
8. The impact of poor water chemistry is underestimated in terms of energy consumption and life-cycle cost.
9. The majority of failures are random. Very few machines understand how a calendar works. Age Exploration
can reveal hidden assets.
10. Celebrate and advertise your successes and address your failures. Credibility is a key to building support for
long-term success.
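The Pareto analysis recommended in item 5 can be sketched in a few lines of Python. This is only an illustration: the work-order records, asset names, cost figures and the 80% cut-off are assumptions made for the example, not data from any real plant.

```python
from collections import Counter


def pareto_ranking(failure_costs, cutoff=0.8):
    """Return the assets that account for the first `cutoff` share of total cost.

    failure_costs: iterable of (asset_name, cost) pairs, one per failure event.
    Each result entry is (asset_name, total_cost, cumulative_share).
    """
    totals = Counter()
    for asset, cost in failure_costs:
        totals[asset] += cost

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    # Largest contributors first, then accumulate until the cut-off is reached.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    vital_few = []
    cumulative = 0
    for asset, cost in ranked:
        cumulative += cost
        vital_few.append((asset, cost, cumulative / grand_total))
        if cumulative / grand_total >= cutoff:
            break
    return vital_few


# Hypothetical work-order history: (asset, repair cost)
work_orders = [
    ("Pump P-101", 4000),
    ("Conveyor C-3", 500),
    ("Pump P-101", 3500),
    ("Fan F-7", 1200),
    ("Boiler B-1", 800),
]

for asset, cost, share in pareto_ranking(work_orders):
    print(f"{asset}: cost {cost}, cumulative share {share:.0%}")
```

In this made-up history two of the four assets account for more than 80% of the repair cost, so they would be the natural place to start the RCM effort.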
Significant thought must go into the process of selecting KPIs to support the maintenance program. The value of
meaningful KPIs cannot be overstated; equally, the harm done by KPIs that are inaccurate or inapplicable should not be
underestimated. First identify the goals and objectives of the organization, because they will have an impact on the
selection of KPIs at all levels of maintenance activity. KPIs that cannot possibly be obtained should not be chosen,
and only those that may be controlled should be selected. Issues of concern should also be identified so that they will
be considered in the selection of KPIs. All process owners who are key to the implementation of the overall effort
should have a self-selected metric to indicate goals and progress in meeting those goals. This will foster the
acceptance of collecting data to support the KPIs and will also promote the use of the KPIs for continuous
improvement. One must also consider the capabilities of the organization to collect the data for KPIs, i.e. the process
used for collecting and storing the data and the ease of extracting and reporting the KPIs. In doing this, the cost of
obtaining data for the KPIs and the relative value they add to the overall program must be calculated. While
advocating doing the right things within the maintenance program with life-cycle cost as a driver, the cost of
capturing the supporting KPIs must also be watched closely.
INTRODUCTION
Total productive maintenance (TPM) is the systematic execution of maintenance by all employees through small
group activities.
The dual goals of TPM are zero breakdowns and zero defects; this obviously improves equipment efficiency rates and
reduces costs. It also minimises inventory costs associated with spare parts.
It is claimed that most companies can realise a 15-25 percent increase in equipment operation rates within three years
of adopting TPM. Labour productivity also generally increases by a significant margin, sometimes as high as 40-50
percent.
Nowadays, maintenance is understood to be about preserving the functions of physical assets: in other
words, carrying out tasks that serve the central purpose of ensuring that our machines are capable of doing what the
users want them to do, when they want them to do it. The possible maintenance policies can be grouped under four
headings, viz.
1. Corrective - wait until a failure occurs and then remedy the situation (restoring the asset to productive capability) as
quickly as possible.
2. Preventive - assume that regular maintenance attention will keep an otherwise troublesome failure mode at bay.
3. Predictive - rather than looking at a calendar and assessing what attention the equipment needs, we should examine
the 'vital signs' and infer what the equipment is trying to tell us. The term 'Condition Monitoring' has come to mean
using a piece of technology (most often a vibration analyzer) to assess the health of our plant and equipment.
4. Detective - applies to the types of devices that only need to work when required and do not tell us when they are in
the failed state e.g. a fire alarm or smoke detector. They generally require a periodic functional check to ascertain that
they are still working.
Apart from detective maintenance, the central problem that companies have struggled with is how to make the choice
between the other three. This has led to the increasing interest within industry in two strategies, which offer a path to
long term continuous improvement rather than the promise of a quick fix. These are Reliability Centered Maintenance
(RCM) and Total Productive Maintenance (TPM). The two strategies, although having similar names, actually have
very different strengths.
TPM is a manufacturing led initiative that emphasises the importance of people, a 'can do' and 'continuous
improvement' philosophy and the importance of production and maintenance staff working together. It is presented as
a key part of an overall manufacturing philosophy. In essence, TPM seeks to reshape the organization to liberate its
own potential.
The modern business world is a rapidly changing environment, so the last thing a company needs if it is to compete in
the global marketplace is to get in its own way because of the way in which it approaches the business of looking after
its income generating physical assets. So, TPM is concerned with the fundamental rethink of business processes to
achieve improvements in cost, quality, speed etc. It encourages radical changes, such as:
• Total
o all employees are involved
o it aims to eliminate all accidents, defects and breakdowns
• Productive
o actions are performed while production goes on
o troubles for production are minimized
• Maintenance
o keep in good condition
o repair, clean, lubricate
TPM - History:
TPM is an innovative Japanese concept. The origin of TPM can be traced back to 1951, when preventive maintenance
was introduced in Japan. However, the concept of preventive maintenance was taken from the USA. Nippondenso was the
first company to introduce plant wide preventive maintenance, in 1960. Under preventive maintenance, operators
produced goods using machines and the maintenance group was dedicated to the work of maintaining those machines.
However, with the automation of Nippondenso, maintenance became a problem because more maintenance personnel were
required. So the management decided that the routine maintenance of equipment would be carried out by the
operators. ( This is Autonomous maintenance, one of the features of TPM. ) The maintenance group took up only
essential maintenance works.
Thus Nippondenso, which already followed preventive maintenance, also added Autonomous maintenance done by
production operators. The maintenance crew went on to modify equipment in order to improve reliability. The
modifications were made or incorporated in new equipment. This led to maintenance prevention. Thus preventive
maintenance along with Maintenance prevention and Maintainability Improvement gave birth to Productive
maintenance. The aim of productive maintenance was to maximize plant and equipment effectiveness to achieve
optimum life cycle cost of production equipment.
By then Nippondenso had formed quality circles involving employee participation. Thus all employees took part
in implementing Productive maintenance. Based on these developments, Nippondenso was awarded the distinguished
plant prize for developing and implementing TPM by the Japanese Institute of Plant Engineers ( JIPE ). Thus
Nippondenso of the Toyota group became the first company to obtain the TPM certification.
Traditionally high buffer stocks were allowed to develop between major pieces of the plant & equipment to ensure
that if there was a problem with one piece of the plant or equipment then it would not affect production from the rest
of the plant. Hence the role of maintenance was to cost effectively ensure major pieces of plant & equipment were
available for an agreed period of scheduled time, for example 90%.
Because of the accepted practice of retaining high buffer stocks, most items of equipment could be considered
independent. If the equipment in a process was maintained such that it achieved 90% availability, the availability of
the process was 90%. If the equipment started to cause quality problems, these would probably be noticed in final
quality inspection and the cause traced back to the offending piece of equipment and corrected by maintenance.
At Nippondenso in 1970, with the introduction of the Toyota Production System, the buffer stocks were substantially
reduced in the quest for shorter lead times and improved quality. Statistical Process Control (SPC) supported by
"Quality at Source" was introduced to ensure quality right first time, so as to provide maximum customer value through
the highest quality at the lowest cost, supported by quick responsiveness and superior customer service. Hence in this
quest for maximum customer value, buffer stocks were reduced both to reduce lead times and to force the identification
of cost consuming problems. This resulted in individual equipment problems affecting the whole process.
If one piece of equipment stopped then shortly afterwards the whole process stopped. This made the equipment
interdependent. Under these circumstances, the availability of the process became the product of the individual
availabilities of each piece of equipment. Thus, a process involving four pieces of equipment maintained at 90% no
longer had an overall process availability of 90%, but an availability of 90% × 90% × 90% × 90%, or about 66%!
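The multiplication described above can be checked with a short Python sketch. The function name is an illustrative choice, and the model assumes that the pieces of equipment are fully interdependent (no buffer stock), so that any single stoppage stops the whole process.

```python
import math


def series_availability(availabilities):
    """Availability of a process whose equipment is interdependent (in series).

    With no buffer stock between machines, the process runs only when every
    machine runs, so the individual availabilities multiply.
    """
    return math.prod(availabilities)


# Four pieces of equipment, each maintained at 90% availability.
process = series_availability([0.90, 0.90, 0.90, 0.90])
print(f"Process availability: {process:.1%}")
```

The exact product is 0.6561, which the text rounds to 66%.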
Furthermore, as the quality approach changed to "Prevention at Source" by controlling process variables, equipment
performance problems were identified much earlier. Conformance and reliability became much more important.
As buffer stocks were reduced, substantial pressure was placed on the maintenance department to improve process
performance. From a maintenance perspective, the maintenance department's performance had not deteriorated, yet
the demand for a substantial improvement in equipment availability was overwhelming.
This caused friction between the production and maintenance departments. Production departments demanded former
levels of process availability and quicker response times from maintenance, who were often unable to comply due to
traditional organisation structures which keep maintenance as a separate function. After much conflict between
maintenance and production, engineering were called in to find a solution. They soon realised that, mathematically, for
the process of four pieces of equipment to achieve its original goal of 90% availability, the individual availabilities
needed to increase from 90% to about 97.4% (the fourth root of 0.90).
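The required individual availability follows from inverting the series calculation: if n identical, interdependent pieces of equipment must together reach a target availability, each one needs the n-th root of that target. The sketch below assumes identical equipment and uses a hypothetical function name; for four units and a 90% target it evaluates to roughly 0.974.

```python
def required_unit_availability(target, n_units):
    """Availability each of n identical series units needs to meet `target`."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in the interval (0, 1]")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return target ** (1.0 / n_units)


unit = required_unit_availability(0.90, 4)
print(f"Each unit must reach about {unit:.1%}")

# Cross-check: multiplying the four unit availabilities recovers the target.
print(f"Resulting process availability: {unit ** 4:.1%}")
```

The steep rise from 90% to the mid-97% range for each machine is what made simply adding maintenance resources an unattractive solution.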
The traditional view of maintenance was to balance maintenance cost with an acceptable level of availability and
reliability often influenced by the level of buffer stocks which hid the immediate impact of equipment problems. In
traditional companies, maintenance is seen as an expense that can easily be reduced in relation to the overall business,
particularly in the short term. Conversely, maintenance managers have always argued that to increase the level of
availability and reliability of the equipment, more expenditure needs to be committed to the maintenance budget. With
the onset of substantial availability problems caused by the new way of running the plant, management soon realised
that just giving more resources to the maintenance department was not going to produce a cost effective solution.
This conflict between maintenance cost and availability is similar to the old quality mind-set before the advent of
Total Quality Control (TQC): that higher quality required more resources, and hence cost, for final inspection and
rework. TQC emphasised "prevention at source" of the problem rather than by inspection at the end of the process.
Instead of enlarging the inspection department, all employees were trained and motivated to be responsible for
identifying problems at the earliest possible point in the process so as to minimise rectification costs. This did not
mean disbanding the quality control department but having it now concentrate on more specialist quality activities
such as variation reduction through process improvement. This new approach to quality demonstrated that getting
quality right first time does not cost money but actually reduces the total cost of operating the business.
This new Quality approach of "prevention at source" was translated to the maintenance environment through the
concept of TPM resulting in not only superior availability, reliability and maintainability of equipment but also
significant improvements in capacity with a substantial reduction in both maintenance costs and total operational
costs. TPM is based on "prevention at source" and is focused on identifying and eliminating the source of equipment
deterioration rather than the more traditional approach of either letting equipment fail before repairing it, or applying
preventive / predictive strategies to identify and repair equipment after the deterioration has taken hold and caused the
need for expensive repairs.
TPM has developed over the years since its first introduction in 1970. Originally there were 5 Activities of TPM, which
are now referred to as 1st Generation TPM (Total Productive Maintenance). It focused on improving equipment
performance or effectiveness only. Late in the 1980s it was realized that even if the shop floor were fully committed to
TPM and the elimination or minimization of the "six big losses", there were still opportunities being lost because of
poor production scheduling practices resulting in line imbalances or schedule interruptions. Hence the development of
2nd Generation TPM (Total Process Management), which focused on the whole production process.
Finally, in more recent times it has been recognized that the whole company must be involved if the full potential of
the capacity gains and cost reductions are to be realized. Hence 3rd Generation TPM (Total Productive Manufacturing
/ Mining) has evolved which now encompasses the 8 Pillars of TPM with the focus on the 16 Major Losses
incorporating the 4Ms - Man, Machine, Methods, Materials. At the CTPM we have expanded the Japanese 8 Pillars to
10 Pillars of Australasian 3rd Generation TPM to better suit our needs in Australia and New Zealand based on our
extensive research of the past two and a half years.
An important outcome of this new approach to equipment management which is now supported by many success
stories throughout the world in a variety of operational industries, has been that senior management have realised that
TPM is both strategically important for a world competitive business, and that TPM cannot be implemented by the
maintenance department alone. TPM is a company wide improvement initiative involving all employees.
Although each enterprise may approach TPM in its own unique way, most approaches recognise the importance of
measuring and improving overall equipment effectiveness along with the need to reduce both operational and
maintenance costs in an environment that promotes continuous improvement.
• To improve the equipment reliability and maintainability which will improve quality and productivity.
• To ensure maximum economy in equipment and management for the entire life of the equipment.
• To cultivate equipment-related expertise and skills among operators.
• To create an enthusiastic work environment.
Motives of TPM
1. Adoption of life cycle approach for improving the overall performance of production equipment.
2. Improving productivity by highly motivated workers which is achieved by job enlargement.
3. The use of voluntary small group activities for identifying the cause of failure and possible plant and equipment
modifications.
Uniqueness of TPM
The major difference between TPM and other concepts is that the operators are also involved in the
maintenance process. The concept of "I ( Production operators ) Operate, You ( Maintenance
department ) fix" is not followed.
(1) Improve equipment effectiveness: examine the effectiveness of facilities by identifying and examining all losses
which occur - downtime losses, speed losses and defect losses.
(2) Achieve autonomous maintenance: allow the people who operate equipment to take responsibility for at least
some of the maintenance tasks. This can be at :
• the repair level (where staff carry out instructions as a response to a problem);
• the prevention level (where staff take pro-active action to prevent foreseen problems); and
• the improvement level (where staff not only take corrective action but also propose improvements to prevent
recurrence).
(3) Plan maintenance: have a systematic approach to all maintenance activities. This involves the identification of the
nature and level of preventive maintenance required for each piece of equipment, the creation of standards for
condition-based maintenance, and the setting of respective responsibilities for operating and maintenance staff. The
respective roles of "operating" and "maintenance" staff are seen as being distinct. Maintenance staff are seen as
developing preventive actions and general breakdown services, whereas operating staff take on the "ownership" of the
facilities and their general care. Maintenance staff typically move to a more facilitating and supporting role where they
are responsible for the training of operators, problem diagnosis, and devising and assessing maintenance practice.
(4) Train all staff in relevant maintenance skills: the defined responsibilities of operating and maintenance staff require
that each has all the necessary skills to carry out these roles. TPM places a heavy emphasis on appropriate and
continuous training.
(5) Achieve early equipment management: the aim is to move towards zero maintenance through "maintenance
prevention" (MP). MP involves considering failure causes and the maintainability of equipment during its design
stage, its manufacture, its installation, and its commissioning. As part of the overall process, TPM attempts to track all
potential maintenance problems back to their root cause so that they can be eliminated at the earliest point in the
overall design, manufacture and deployment process.
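The three loss groups named in activity (1), downtime losses, speed losses and defect losses, are commonly combined into a single overall equipment effectiveness (OEE) figure as availability × performance × quality. The text does not spell out this formula, so the sketch below is an assumption based on the usual formulation, and the shift figures are invented for illustration.

```python
def oee(planned_time, downtime, ideal_cycle_time, total_count, good_count):
    """Return (availability, performance, quality, oee) for one period.

    planned_time, downtime and ideal_cycle_time share one time unit;
    ideal_cycle_time is the ideal time needed to produce one unit.
    """
    run_time = planned_time - downtime
    if planned_time <= 0 or run_time <= 0 or total_count <= 0:
        raise ValueError("planned time, run time and total count must be positive")

    availability = run_time / planned_time                     # downtime losses
    performance = (ideal_cycle_time * total_count) / run_time  # speed losses
    quality = good_count / total_count                         # defect losses
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift: 480 min planned, 48 min down, ideal cycle of 1 min per unit,
# 400 units produced, of which 380 were good.
a, p, q, overall = oee(480, 48, 1.0, 400, 380)
print(f"Availability {a:.1%}, performance {p:.1%}, quality {q:.1%}, OEE {overall:.1%}")
```

Keeping the three factors separate shows which loss group is dragging the overall figure down, which is the point of examining all losses rather than availability alone.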
PILLARS OF TPM
PILLAR 1 - 5S :
TPM starts with 5S. Problems cannot be clearly seen when the work place is unorganized. Cleaning and organizing
the workplace helps the team to uncover problems. Making problems visible is the first step of improvement.
SEITON - Organise :
The concept here is that "Each item has a place, and only one place". The items should be placed back after usage at
the same place. To identify items easily, name plates and colored tags have to be used. Vertical racks can be used for
this purpose, and heavy items occupy the bottom position in the racks.
SEISO - Shine the workplace :
This involves cleaning the work place free of burrs, grease, oil, waste, scrap etc. There should be no loosely hanging
wires or oil leakage from machines.
SEIKETSU - Standardization :
Employees have to discuss together and decide on standards for keeping the work place / Machines / pathways neat and
clean. These standards are implemented for the whole organization and are tested / inspected randomly.
SHITSUKE - Self discipline :
This means treating 5S as a way of life and bringing about self-discipline among the employees of the organization. This
includes wearing badges, following work procedures, punctuality, dedication to the organization etc.
PILLAR 2 - JISHU HOZEN ( Autonomous maintenance ) :
This pillar is geared towards developing operators to be able to take care of small maintenance tasks, thus freeing up
the skilled maintenance people to spend time on more value added activity and technical repairs. The operators are
responsible for the upkeep of their equipment to prevent it from deteriorating.
Policy :
1. Preparation of employees.
2. Initial cleanup of machines.
3. Take counter measures
4. Fix tentative JH standards
5. General inspection
6. Autonomous inspection
7. Standardization and
8. Autonomous management.
1. Train the Employees : Educate the employees about TPM, its advantages, JH advantages and the steps in JH.
Educate the employees about abnormalities in equipment.
2. Initial cleanup of machines :
o Supervisor and technician should discuss and set a date for implementing step 1.
o Arrange all items needed for cleaning.
o On the arranged date, employees should clean the equipment completely with the help of the maintenance
department.
o Dust, stains, oils and grease have to be removed.
o The following must be looked for while cleaning: oil leakage, loose wires,
unfastened nuts and bolts, and worn out parts.
o After clean up, problems are categorized and suitably tagged. White tags are placed where problems can
be solved by operators. Pink tags are placed where the aid of the maintenance department is needed.
o The contents of each tag are transferred to a register.
o Make a note of areas which were inaccessible.
o Finally close the open parts of the machine and run the machine.
3. Counter Measures :
o Inaccessible regions have to be made easy to reach. E.g. if there are many screws to open a fly wheel door,
a hinged door can be used. Instead of opening a door for inspecting the machine, acrylic sheets can be
used.
o To prevent wear of machine parts, necessary action must be taken.
o Machine parts should be modified to prevent accumulation of dirt and dust.
4. Tentative Standard :
o A JH schedule has to be made and followed strictly.
o The schedule should cover cleaning, inspection and lubrication, and it should also include
details like when, what and how.
5. General Inspection :
o The employees are trained in disciplines like pneumatics, electrical, hydraulics, lubricant and coolant,
drives, bolts, nuts and safety.
o This is necessary to improve the technical skills of employees and to use inspection manuals
correctly.
o After acquiring this new knowledge the employees should share it with others.
o By acquiring this new technical knowledge, the operators become well aware of machine parts.
6. Autonomous Inspection :
PILLAR 3 - KAIZEN :
"Kai" means change, and "Zen" means good ( for the better ). Basically kaizen is for small improvements, but carried
out on a continual basis and involving all people in the organization. Kaizen is the opposite of big spectacular innovations.
Kaizen requires little or no investment. The principle behind it is that "a very large number of small improvements are
more effective in an organizational environment than a few improvements of large value". This pillar is aimed at
reducing losses in the workplace that affect our efficiencies. By using a detailed and thorough procedure we eliminate
losses systematically, using various Kaizen tools. These activities are not limited to production areas and can
be implemented in administrative areas as well.
Kaizen Policy :
Kaizen Target :
Achieve and sustain zero losses with respect to minor stops, measurement and adjustments, defects and unavoidable
downtimes. It also aims to achieve a 30% manufacturing cost reduction.
1. PM analysis
2. Why - Why analysis
3. Summary of losses
4. Kaizen register
5. Kaizen summary sheet.
The objective of TPM is maximization of equipment effectiveness. TPM aims at maximization of machine utilization
and not merely machine availability maximization. As one of the pillars of TPM activities, Kaizen pursues efficient
equipment, operator and material and energy utilization, that is extremes of productivity and aims at achieving
substantial effects. Kaizen activities try to thoroughly eliminate 16 major losses.
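Loss | Category
9. Management loss | Losses that impede human work efficiency
10. Operating motion loss | Losses that impede human work efficiency
11. Line organization loss | Losses that impede human work efficiency
12. Logistic loss | Losses that impede human work efficiency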
Classification of losses :
Causation : For the first class of loss, causes for the failure can be easily traced and the cause-effect relationship is
simple to trace. For the second class, the loss cannot be easily identified and solved, even if various counter measures
are applied.
Remedy : For the first class, it is easy to establish a remedial measure. The second class of losses is caused by
hidden defects in machine, equipment and methods.
Corrective action : For the first class, usually the line personnel in production can attend to the problem. For the
second class, specialists in process engineering, quality assurance and maintenance are required.
PILLAR 4 - PLANNED MAINTENANCE :
It is aimed to have trouble free machines and equipment producing defect free products for total customer
satisfaction. This breaks maintenance down into 4 "families" or groups, which were defined earlier.
1. Preventive Maintenance
2. Breakdown Maintenance
3. Corrective Maintenance
4. Maintenance Prevention
With Planned Maintenance we evolve our efforts from a reactive to a proactive method and use trained maintenance
staff to help train the operators to better maintain their equipment.
Policy :
Target :
PILLAR 5 - QUALITY MAINTENANCE :
It is aimed at customer delight through the highest quality, achieved by defect free manufacturing. Focus is on
eliminating non-conformances in a systematic manner, much like Focused Improvement. We gain understanding of
what parts of the equipment affect product quality and begin to eliminate current quality concerns, then move to
potential quality concerns. Transition is from reactive to proactive (Quality Control to Quality Assurance).
QM activities set equipment conditions that preclude quality defects, based on the basic concept of maintaining
perfect equipment to maintain perfect quality of products. The conditions are checked and measured in time series to
verify that the measured values are within standard values to prevent defects. The transition of measured values is
watched to predict possibilities of defects occurring and to take counter measures beforehand.
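Watching the transition of measured values can be illustrated with a simple trend check. The sketch below fits a straight line through equally spaced readings and estimates how many further samples remain before the fitted line reaches the upper standard value. The linear fit, the function name and the readings are assumptions chosen for illustration; the text does not prescribe a particular prediction method.

```python
def samples_until_limit(values, upper_limit):
    """Estimate the remaining samples before a rising trend reaches upper_limit.

    Fits a least-squares line through equally spaced readings. Returns 0.0 if
    the fitted value at the latest reading is already at or above the limit,
    None if the trend is flat or falling, else the projected number of samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two readings are needed to fit a trend")

    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    fitted_now = mean_y + slope * ((n - 1) - mean_x)

    if fitted_now >= upper_limit:
        return 0.0
    if slope <= 0:
        return None
    return (upper_limit - fitted_now) / slope


# Hypothetical spindle temperature readings drifting upward; the standard maximum is 11.0.
readings = [10.0, 10.1, 10.2, 10.3, 10.4]
remaining = samples_until_limit(readings, upper_limit=11.0)
print(f"Projected samples before the limit is reached: {remaining:.1f}")
```

A small projected number would prompt a counter measure before any defect is produced, while a flat or falling trend needs no action.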
Policy :
Target :
Data requirements :
Quality defects are classified as customer end defects and in house defects. For customer-end data, we have to get data
on
In-house data include data related to products and data related to the process:
1. The operating condition for individual sub-process related to men, method, material and machine.
2. The standard settings/conditions of the sub-process
3. The actual record of the settings/conditions during the defect occurrence.
PILLAR 6 - TRAINING :
It is aimed to have multi-skilled, revitalized employees whose morale is high and who are eager to come to work and
perform all required functions effectively and independently. Education is given to operators to upgrade their skill. It
is not sufficient to know only the "Know-How"; they should also learn the "Know-Why". By experience they gain the
"Know-How" of what is to be done to overcome a problem. This they do without knowing the root cause of the problem
and why they are doing so. Hence it becomes necessary to train them in the "Know-Why". The employees should be
trained to achieve the four phases of skill. The goal is to create a factory full of experts. The different phases of skill
are
Policy :
Target :
1. Achieve and sustain zero downtime due to want of men on critical machines.
2. Achieve and sustain zero losses due to lack of knowledge / skills / techniques
3. Aim for 100 % participation in suggestion scheme.
1. Setting policies and priorities and checking present status of education and training.
2. Establishing a training system for upgrading operation and maintenance skills.
3. Training the employees for upgrading the operation and maintenance skills.
4. Preparation of training calendar.
5. Kick-off of the system for training.
6. Evaluation of activities and study of future approach.
PILLAR 7 - OFFICE TPM :
Office TPM should be started after activating four other pillars of TPM (JH, KK, QM, PM). Office TPM must be
followed to improve productivity and efficiency in the administrative functions and to identify and eliminate losses. This
includes analyzing processes and procedures towards increased office automation. Office TPM addresses twelve
major losses. They are
1. Processing loss
2. Cost loss including in areas such as procurement, accounts, marketing, sales leading to high inventories
3. Communication loss
4. Idle loss
5. Set-up loss
6. Accuracy loss
7. Office equipment breakdown
8. Communication channel breakdown, telephone and fax lines
9. Time spent on retrieval of information
10. Non availability of correct on line stock status
11. Customer complaints due to logistics
12. Expenses on emergency dispatches/purchases of equipment
The TPM program closely resembles the popular Total Quality Management (TQM) program. Many of the tools used in
TQM, such as employee empowerment, benchmarking, documentation, etc., are used to implement and optimize
TPM. The following are the similarities between the two.
1. Total commitment to the program by upper level management is required in both programmes
2. Employees must be empowered to initiate corrective action, and
3. A long range outlook must be accepted as TPM may take a year or more to implement and is an on-going
process. Changes in employee mind-set toward their job responsibilities must take place as well.
For this reason, the application of TPM as a company wide improvement strategy is highly advisable to ensure:
Before attempting a full blown RCM analysis or a partial RCM approach following the basic RCM process. Failure to
do this in an environment where basic equipment conditions and operator error are causing significant variation in the
life of your equipment parts will block your ability to cost effectively optimise your maintenance tactics and spares
holding strategies.
The other key difference between RCM and TPM is that RCM is promoted as a maintenance improvement strategy,
whereas TPM recognises that the maintenance function alone cannot improve reliability. Factors such as operator 'lack
of care' and poor operational practices, poor 'basic equipment conditions', and adverse equipment loading due to
changes in processing requirements (introduction of different products, raw materials, process variables etc.) all impact
on equipment reliability. Unless all employees become actively involved in recognising the need to eliminate or
reduce all "losses" and to focus on a 'defect avoidance' or 'early defect identification' environment, the maintenance
function alone cannot deliver that reliability.
CONCLUSION:
It should be acknowledged that a TPM implementation is not a short-term fix. It is a continuous journey based on
changing the work area and then the equipment so as to achieve a clean, neat, safe workplace through a "PULL" as
opposed to a "PUSH" culture change process. Significant improvement should be evident within six months; however,
full implementation can take many years to allow the full benefits of the new culture created by TPM to be
sustained. This time frame obviously depends upon where a company is in relation to its quality and maintenance
activities and the resources being allocated to introduce this new mind-set.
1. INTRODUCTION
In a broad sense, maintenance is “to keep fit any system for use”. It may be defined as the
overall combination of all those activities which are required to keep an item in its as-built
condition so that it continues to have its original productive capacity.
In order to streamline the understanding of the different types/streams of maintenance functions,
the classification can be done on the basis of planning and the criticality/essentiality of jobs. Some
jobs may be planned in advance, but some jobs may have to be taken up immediately and
unplanned.
Fig.1. The graph here shows a relation between amount of wear and time for which the
component is being used.
Autonomous maintenance by operators, therefore, is most important in TPM.
Moreover, offering workshops and training to stakeholders improves the interaction between
operations and maintenance people. The objective is the implementation of
improved OEE (Overall Equipment Effectiveness) metrics. The implementation of TPM will
be easier if “5S” is already working in the plant.
Undoubtedly, TPM is one of the most effective ways to create a lean organization with
reduced cycle time and improved operational efficiency.
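The OEE metric mentioned above is conventionally computed as the product of availability, performance and quality. The text itself does not spell this out, so the following is only a minimal sketch under that common definition; the record layout, field names and shift figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShiftRecord:
    """Hypothetical record of one production shift (all field names are assumptions)."""
    planned_time_min: float      # scheduled production time, minutes
    downtime_min: float          # time lost to breakdowns, set-up and adjustment, minutes
    ideal_cycle_time_min: float  # ideal time to produce one unit, minutes
    total_units: int             # all units produced, good or defective
    good_units: int              # units that passed inspection


def oee(record: ShiftRecord) -> dict:
    """Return availability, performance, quality and their product (OEE)."""
    run_time = record.planned_time_min - record.downtime_min
    if record.planned_time_min <= 0 or run_time <= 0 or record.total_units <= 0:
        raise ValueError("planned time, run time and total units must be positive")
    availability = run_time / record.planned_time_min
    performance = record.ideal_cycle_time_min * record.total_units / run_time
    quality = record.good_units / record.total_units
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


# Made-up figures for one 8-hour shift.
shift = ShiftRecord(planned_time_min=480, downtime_min=60,
                    ideal_cycle_time_min=1.0, total_units=400, good_units=380)
result = oee(shift)
print({name: round(value, 3) for name, value in result.items()})
```

Keeping the three factors separate, rather than reporting only the product, shows whether a low OEE comes from stoppages, slow running or defects.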
Meaning of total productive maintenance
Nowadays, maintenance is understood to be all about preserving the functions of
physical assets. In other words, it means carrying out tasks that serve the central purpose of ensuring
that our machines are capable of doing what the users want them to do, when they want them
to do it. The possible maintenance policies can be grouped under four headings, viz.
1. Corrective - wait until a failure occurs and then remedy the situation (restoring the asset to
productive capability) as quickly as possible.
2. Preventive - rely on regular maintenance attention to keep an otherwise
troublesome failure mode at bay.
3. Predictive - rather than looking at a calendar and assessing what attention the equipment
needs, we should examine the 'vital signs' and infer what the equipment is trying to tell us.
The term 'Condition Monitoring' has come to mean using a piece of technology (most often a
vibration analyzer) to assess the health of our plant and equipment.
4. Detective - applies to the types of devices that only need to work when required and do
not tell us when they are in the failed state e.g. a fire alarm or smoke detector. They
generally require a periodic functional check to ascertain that they are still working.
Apart from detective maintenance, the central problem that companies have struggled with is how
to make the choice between the other three. This has led to the increasing interest within
industry in two strategies, which offer a path to long term continuous improvement rather
than the promise of a quick fix. These are Reliability Centered Maintenance (RCM) and
Total Productive Maintenance (TPM). The two strategies, although having similar names,
actually have very different strengths.
TPM is a manufacturing led initiative that emphasises the importance of people, a 'can do'
and 'continuous improvement' philosophy and the importance of production and maintenance
staff working together. It is presented as a key part of an overall manufacturing philosophy.
In essence, TPM seeks to reshape the organization to liberate its own potential.
The modern business world is a rapidly changing environment, so the last thing a company
needs if it is to compete in the global marketplace is to get in its own way because of the way
in which it approaches the business of looking after its income generating physical assets.
So, TPM is concerned with the fundamental rethink of business processes to achieve
improvements in cost, quality, speed etc. It encourages radical changes, such as:
• flatter organisational structures - fewer managers, empowered teams,
• multi-skilled workforce,
• rigorous reappraisal of the way things are done - often with the goal of simplification.
EVOLUTION:
The concept of TPM originated in Japan’s manufacturing industries, initially with the aim of
eliminating production losses due to limitations in the JIT process for production operations
[8]. Seichi Nakajima is credited with defining the fundamental concepts of TPM and seeing
the procedure implemented in hundreds of plants in Japan; the key concept being
autonomous maintenance [9].
TPM is a major departure from the “you operate, I maintain” philosophy [10]. It is the
implementation of productive maintenance by all associated personnel (whether machine
operators or members of the management team), based on the involvement of all in the
continual improvement of performance. TPM endeavours to eliminate the root causes of
problems through team-based decisions and their implementation. It strives for low-cost
improvements and zero-defect product quality, while designing for minimum
LCC maintenance and using the JIT procedure. It should be implemented by all employees through
small-group activities, which include aiming for zero breakdowns and zero defects.
In essence, TPM seeks to integrate the organisation to recognise, liberate and utilise its own
potential and skills [11]. TPM combines the best features of productive and PM procedures
with innovative management strategies and encourages total employee involvement. TPM
focuses attention upon the reasons for energy losses from, and failures of, equipment due to
design weaknesses that the associated personnel previously thought they had to tolerate.
Autonomous maintenance looks into the means for achieving a high degree of cleanliness,
excellent lubrication and proper fastening (e.g. tightening of nuts on bolts in the system) in
order to inhibit deterioration and prevent machine breakdown. The Japan Institute of
Plant Maintenance in 1996 introduced autonomous maintenance for operations as a role for
all employees in order to achieve greater financial profits.
AIM OF TPM:
The aim of TPM is to bring together management, supervisors and trade union members to
take rapid remedial actions as and when required.
MAIN OBJECTIVES.
Its main objectives are to achieve zero breakdowns, zero defects and improved throughputs
by:
• Increasing operator involvement and ownership of the process.
• Improving problem-solving by the team.
• Refining preventive and predictive maintenance activities.
• Focussing on reliability and maintainability engineering.
• Upgrading each operator’s skills.
TPM strategies:
Human-oriented Strategy
Human-oriented strategies are, generally, those that actively involve people and the administrative
application of management methods in achieving a high extent of TPM. Three important
aspects that are often discussed as the core of Human-oriented strategy are:
(1) Top management commitment and leadership,
(2) Total Employee Involvement, and
(3) Training and Education
Top Management Commitment and Leadership
The role of top management’s commitment and leadership has frequently been emphasized
in the literature as having a decisive influence on successful TPM implementation
(Tsang & Chan, 2000). TPM requires a drastic change in the traditional mindset of work
culture and maintenance approaches. However, high resistance is
often encountered from the shop floor operators as well as the maintenance personnel.
Active top management support is therefore crucial to overcome such resistance,
especially during the transition period (Fredendall, 1997). Bamber et al. (1999) wrote that
the major obstacle to implementing TPM in the UK was the lack of top management
commitment to follow through, which caused many organizations to struggle when
attempting to implement TPM. Patterson (1996) explained that to successfully implement
TPM, an organization must be led by top management that is supportive, understanding and
committed to the various kinds of TPM activities. Top management has the primary
responsibility of preparing a suitable and supportive environment before the official kick-off
of TPM within their organization. This may include resource allocation and training and
education provided to the middle management level as well as the production floor
operators. Nakajima (1989) stated that top management’s primary responsibility is to
establish a favorable environment where the work environment can support autonomous
activities.
Total Employee Involvement
While top management commitment and leadership are essential for TPM success, they are not
sufficient on their own. TPM empowers production operators, establishing a
sense of ownership of the equipment they operate daily (Tsang & Chan, 2000). This sense of
ownership is an important factor underpinning the continued success of TPM, with every
operator being responsible for ensuring that his or her own machine is clean and maintained. It requires
the employees to have a common understanding of the basic principles of TPM. The
importance of total employee involvement is based on the belief that shop floor operators
have the most hands-on experience with the machines they operate daily. Thus, TPM
demands active participation from the shop floor operators in continuous improvement
activities, cross-functional teamwork and work suggestion schemes (Nakajima, 1989). A high
level of maintenance awareness and simple routine maintenance tasks are integrated into
their daily duties, and the final mission is to achieve profitable Autonomous
Maintenance by operators. TPM achieves the maximization of equipment effectiveness
through total employee participation and incorporates the use of Autonomous Maintenance
in small group activities to improve equipment reliability, maintainability and
productivity (Chen, 1997).
Training and Education
Blanchard (1997) pointed out that training and educational issues had become one of the
critical factors in establishing successful TPM implementation, where proper education begins as
early as the TPM introduction and initial preparation stages. The entire workforce in
the organization needs to acquire new knowledge, skills and abilities related to TPM.
Thiagarajan and Zairi (1997) further argued that education and training is the single most
important factor once the necessary commitment has been assured, and that it had become a
long-term strategy in the planning schedule to obtain aspirations and skills. As TPM
implementation proceeds, training remains essential to the implementation and to work
performance.
The findings show that there is a positive relationship between human-oriented strategy and
the extent of TPM implementation. This can be related to the fact that
TPM implementation mainly requires new system development and the adoption of a new
strategy by the organisation itself, and its success depends exclusively on work culture,
organisational practices and so on, which are human-related issues. The finding that training
is a significant determinant of TPM implementation supports the work of Blanchard (1997),
who pointed out that training and educational issues become the critical factors in establishing
successful TPM implementation. It is also supported by the study of Thiagarajan and Zairi
(1997), who singled out education and training as the important drivers after the initial
commitment to carry out the TPM implementation has been made.
Process-oriented Strategy
diminished by inefficient equipment that generates losses in terms of failure loss,
performance loss or defect loss. In detail, such losses can be refined as: equipment
breakdown; set-up and adjustment time loss; idling and minor stoppages; reduced
performance rate (slower speed); process failure (defects) and rework; and start-up time
losses. Therefore, one of the major features of Process-oriented Strategy is its emphasis on
hands-on and practical approaches to identify and quantify all the above losses on the
production floor (Suzuki, 1994). The sequential step-wise procedure of Process-oriented
Strategy begins with:
The immediate action after identifying and quantifying the equipment losses is to
stratify and analyze the relevant root causes. Some of the analytical methods that have been
developed and widely deployed to promote the thorough and systematic elimination of
defects in Process-oriented Strategy are PM analysis, Fault-tree analysis (FTA), Failure
Mode and Effect analysis (FMEA) and so on (Suzuki, 1994). With all failure phenomena
clearly described, improvements become crucial in the second stage to eliminate the
pertinent causes. Improvement plans have to be carried out, in balance, on both equipment
and process. Shimbun (1995) stated that, for improvement of equipment, the very basic
activities are to restore all forms of deterioration in the equipment and to establish the basic
condition.
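Quantifying and stratifying the equipment losses, as described above, is often done with a simple Pareto ranking before any root-cause method is applied. The sketch below is one possible way to do that; the function name and the loss figures in minutes are hypothetical and are not taken from the text.

```python
from typing import Dict, List, Tuple


def rank_losses(loss_minutes: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Sort losses from largest to smallest and attach the cumulative share of the total."""
    total = sum(loss_minutes.values())
    if total <= 0:
        raise ValueError("total loss must be positive")
    ranked = sorted(loss_minutes.items(), key=lambda item: item[1], reverse=True)
    table = []
    cumulative = 0.0
    for name, minutes in ranked:
        cumulative += minutes
        table.append((name, minutes, cumulative / total))
    return table


# Made-up weekly loss figures (minutes) for the six loss types named in the text.
losses = {
    "Equipment breakdown": 120,
    "Set-up and adjustment": 90,
    "Idling and minor stoppages": 45,
    "Reduced performance rate": 30,
    "Process defects and rework": 10,
    "Start-up losses": 5,
}
pareto = rank_losses(losses)
for name, minutes, cum_share in pareto:
    print(f"{name:30s} {minutes:6.0f} min  cumulative {cum_share:.0%}")
```

The cumulative column indicates which few loss types deserve root-cause analysis (PM analysis, FTA, FMEA) first.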
The impact and effectiveness of the human- and process-oriented strategies on the extent of
TPM were measured simultaneously in hypothesis 3, and Multiple Regression analysis showed
that Human-oriented Strategy has a greater impact than Process-oriented
Strategy. The reason is that the introduction and implementation of TPM are considered
a form of change management in the organisation, where changes in work culture,
process and management systems, organisational environment, and the individual
perspective within the organisation are crucial to the enforcement of commitment,
involvement and mature attitudes. Only when all these human-related issues have been resolved in
the environment do the technical skills and knowledge of maintenance have the
foundation to maximize their effectiveness.
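A comparison of this kind is usually read from standardized regression coefficients. The sketch below shows how two predictors could be compared in that way; it is not the study's analysis, and the scores are invented solely so that the arithmetic is easy to check. For two predictors the standardized coefficients follow directly from the pairwise correlations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def standardized_betas(x1, x2, y):
    """Standardized coefficients of y regressed on x1 and x2 (two-predictor case)."""
    r_y1 = pearson(x1, y)
    r_y2 = pearson(x2, y)
    r_12 = pearson(x1, x2)
    denom = 1.0 - r_12 ** 2
    if denom == 0:
        raise ValueError("predictors are perfectly correlated")
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Invented scores for eight hypothetical organisations (NOT data from the study).
human_score = [1, 2, 3, 4, 5, 6, 7, 8]
process_score = [3, 1, 1, 3, 3, 1, 1, 3]
tpm_extent = [2 * h + p for h, p in zip(human_score, process_score)]

beta_human, beta_process = standardized_betas(human_score, process_score, tpm_extent)
print(f"beta(human) = {beta_human:.3f}, beta(process) = {beta_process:.3f}")
```

A larger standardized coefficient for one predictor is what "greater impact" means in this kind of analysis; with real survey data a statistics package would also report significance levels.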
Small Autonomous Group Concept: this is a recent concept that has gained widespread
acceptance in Japanese firms. Here the small group activities are company led and
interwoven in the company’s overall activities. The company’s organization is built up so that it
overlaps at several levels, from small groups of senior executives down to small groups of
shop-floor workmen, and this company-led overlapping organization of small groups involves
the effective participation of almost all employees from top management to shop-floor
workmen. In this, the leader of each group works as a link and binder by acting as a member
of a small group at the level above. This improves the vertical and horizontal communication.
Each autonomous group is totally responsible for maintenance and all other jobs in its
area. The figure shows a TPM promotion system indicating roughly the overlapping of small
groups at different levels. These groups are also called productive maintenance circles at
different levels.
Normal Productive Maintenance Concept: in this, operational personnel, maintenance
personnel and connected planning groups are fully involved in the maintenance and upkeep
of the plant and work together for a common goal. Other centralized groups are also involved to some
extent by taking as one of their tasks “helping in good maintenance and upkeep of the plant by
assessing the materials needed and supplying the right quality at the right time.” This concept can
of course be adopted in big Indian industries.
Goals of TPM:
(1) Achieving sustainability,
(2) Standardisation,
(3) Pertinent education and training in TPM,
(4) Measuring TPM effectiveness,
(5) Developing an autonomous maintenance programme and
(6) Implementing Kaizen-teian programmes
The main goal of TPM is to create a production environment free from mechanical
breakdowns and technical disturbances by involving everybody in maintenance duties
without heavily relying on mechanics or engineers.
Workshop management is responsible for implementing TPM goals via group PM, small-
group activities, maximising equipment effectiveness, zero-accident and zero-pollution aims,
improving operating reliability, reducing the LCC, and problem solving.
Benefits
• Cultivate a sense of ownership in the operator by introducing autonomous maintenance –
the operator takes responsibility for the primary care of his/her plant. The tasks include
cleaning, routine inspection, lubrication, adjustments, minor repairs as well as the cleanliness
of the local workspace.
• Establish an optimal schedule of clean-up and PM to extend the plant’s life-span and
maximise its uptime.
Many TPM operators have achieved excellent progress [11], in instances such as:
• Better understanding of the equipment’s criticality and where and when it is financially
worth improving.
achieving high productivity depends on keeping the equipment functioning at peak levels
for as long as is feasible. Today, with competition increasing, successful TPM may be one of
the essential factors that determine whether some organisations survive.
• Overall plant’s productivity (i.e. more effective operation and resource utilisation as well as
the elimination of excessive inventory stocks).
• End-product quality (e.g. by insisting on purchasing better designs) and services (e.g.
through better-maintained plant and machines).
• Education and training of employees, so empowering them and raising morale, to keep
pace with the complexity of evolving technologies.
The process identifies the non-value-added activities within an organisation and then
systematically creates solutions to eliminate successively the most wasteful ones.
Maintenance affects all aspects of business effectiveness - risk, safety, environmental
sustainability achieved, energy efficiency, product quality and customer service, i.e. not just
plant availability and costs. Downtime has always affected adversely the capability of
physical assets by reducing output, increasing operating costs and lowering customer service
[5].
In a culture that stresses participation and autonomy, the function of the management should
not be solely to control but also to provide support and encouragement. Decisions on
broadly-based issues, such as the implementation of TPM and RCM or the introduction of a
new reward convention for employees, are made only after the management has entered into
a dialogue with those affected. The managers will provide overall direction for the work that
is clearly targeted and engaging. Their tasks will be those of consultants, mentors and
coaches to help the employees avoid unnecessary waste of effort so that they can
(ii) formulate creative, unique and appropriate performance strategies
that generate synergistic process gains.
They should also be responsible for answering requests from employees to ensure that the
resources required for increasing performance are available when needed.
Management processes, including training, should be designed from the point of view of the
recipient and with a built-in mechanism for feedback. Employees must be encouraged to set
measurable but attainable goals. Employee training should focus on appropriate multi-skills and knowledge.
Empowerment of employees by devolved authority to make decisions
autonomously (i.e. subsidiarity) regarding TQM, so that each individual “owns” the
particular process phase, is necessary. The objective throughout is continual improvement.
PROBLEM STATEMENT
Introducing TPM in a developing country, such as Malaysia, is still considered a major
challenge because the environment for adoption and implementation is in several respects
not conducive. Lack of commitment and leadership from top management has always been
discussed as one of the main factors that inhibit the implementation of TPM. On the other
hand, resistance from the employees involved in the TPM program is also regarded as another
major reason that explains why TPM fails in many local organizations. Employees refuse to
take on extra maintenance responsibilities without any rewards, recognition or compensation.
Lack of proper and adequate training and education about TPM is another
significant contributor to the pitfalls of TPM implementation in a developing country.
RESEARCH METHODOLOGY
A structured survey approach has been used as the research strategy in this study, where
questionnaires regarding the implementation of TPM were distributed to industrial
manufacturers. The questionnaire was divided into several sections to capture all the
relevant data and information, such as
(1) Organization profile,
(2) TPM background,
(3) Extent of current TPM activities,
(4) Success of TPM and technological complexity in the organization.
The target respondents of this study are the industrial manufacturers (operation/maintenance
managers) in Malaysia, especially from the north (Penang, Prai and Kulim industrial areas)
and from the central region (industrial estates of Shah Alam, Puchong and Petaling Jaya).
Sample organizations were selected from the FMM Directory of Malaysian Industry and are involved
in manufacturing various products, from consumer electrical & electronics products, food &
beverage, medical & health care products, precision tooling & machining, rubber and plastic
based products, to semiconductors and so on.
THEORETICAL FRAMEWORK
Based upon the literature review on TPM implementation and its practices, a research
framework of Implementing TPM in a Manufacturing Organization is developed. This
study focuses on the TPM Operational Strategy and its inter-relationship
with the Extent of TPM Implementation. The research framework is illustrated in Figure 1
below.
HYPOTHESIS DEVELOPMENT
Three hypotheses have been generated in this study to test the relationships in the research
framework elaborated earlier, in particular the relationship between
the Operational Strategy and the Extent of TPM Implementation.
In the literature review it has been postulated that organizations which extensively carry
out Human-oriented strategy while implementing TPM would have a higher Extent of TPM
within the organization. Basically, devoting more effort to Human-oriented
strategy, such as prominent commitment and leadership from top management and total
involvement of all employees, is expected to result in a higher achievement of TPM. The
more an organization implements the Process-oriented strategy (for example, process &
equipment improvement, equipment standardization, re-layout, etc.), the higher the maintenance
knowledge becomes and the more the Extent of TPM implementation will be enhanced.
However, the Human-oriented strategy has a greater impact than the
Process-oriented strategy on the Extent of TPM implementation. In brief, the three hypotheses
of this study are summarized as follows:
H3: Human-oriented strategy has a greater impact on the Extent of TPM than
Process-oriented strategy.
CONCLUSION
It can be concluded that a greater extent of both the human- and process-oriented strategies
leads to higher TPM implementation in the organisation. However, the impact of
Human-oriented Strategy is found to be greater than that of Process-oriented Strategy in fostering a higher
extent of TPM implementation, as the changes and adoption in the organisation are much
more related to human issues. Thus the management has to balance both these strategies in
order to achieve the maximal effect of implementation.
TOPICS
1. BASIC INGREDIENT
4. DOCUMENTATION
TIMING
FIXED TIME MAINTENANCE
Maintenance actions that are carried out at regular intervals, or after a fixed cumulative output, fixed number of cycles
of operation etc. This includes item replacement, repair and major strip-down for inspection (the author regards
condition-based maintenance as inspection carried out without major strip-down). In most cases, the periodicities,
actions and resources for such work can be anticipated and scheduled well beforehand with ample time tolerance.
However, this procedure is only effective where the failure mechanism of the item is clearly time dependent, the item
being expected to deteriorate over a period much less than the life of the unit to which it belongs.
Obviously, the more predictable the time to failure, the more effective is fixed-time maintenance. Whether this is
necessarily the best procedure, even then, will depend upon the cost of the alternative effective procedure. If the approach
of the failure is detectable, condition-based maintenance is effective and in the majority of cases more economic.
Comparing fixed-time maintenance to operate-to-failure, a rough guide is that fixed-time maintenance is only
effective where the total cost per maintenance action is substantially less than that of operating to failure. The
implication is that ‘fixed-time maintenance and adjust or replace’ is an effective procedure for simple replaceable
items, but is inappropriate for complex replaceable items because of their far less predictable time-to-failure and their
high cost. With such items, it is often better to seek a suitable condition monitoring technique.
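The rough guide above can be written down as a small decision rule. This is only a sketch of the reasoning in the text: the field names are hypothetical, and the threshold used for "substantially less" (half the cost of operating to failure) is an assumption chosen for illustration, not a figure from the source.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical description of a maintainable item (field names are assumptions)."""
    name: str
    failure_is_time_dependent: bool      # deterioration clearly related to age or usage
    deterioration_is_detectable: bool    # approach of failure can be monitored
    cost_per_fixed_time_action: float    # total cost of one scheduled action
    cost_of_operating_to_failure: float  # total cost of one unplanned failure


def suggest_procedure(item: Item, cost_ratio_threshold: float = 0.5) -> str:
    """Suggest a maintenance procedure following the rough guide in the text.

    cost_ratio_threshold is an assumed reading of 'substantially less'.
    """
    # If the approach of failure is detectable, condition-based maintenance is
    # effective and, in most cases, more economic.
    if item.deterioration_is_detectable:
        return "condition-based maintenance"
    # Fixed-time maintenance needs a time-dependent failure mechanism and a
    # scheduled action that is much cheaper than running to failure.
    cheap_enough = (item.cost_per_fixed_time_action
                    <= cost_ratio_threshold * item.cost_of_operating_to_failure)
    if item.failure_is_time_dependent and cheap_enough:
        return "fixed-time maintenance"
    return "operate-to-failure"


# Made-up examples.
filter_element = Item("filter element", True, False, 50.0, 400.0)
gearbox = Item("gearbox", False, True, 3000.0, 9000.0)
lamp = Item("indicator lamp", False, False, 20.0, 25.0)
for item in (filter_element, gearbox, lamp):
    print(item.name, "->", suggest_procedure(item))
```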
An attractive concept is that the proper time for performing maintenance ought to be determinable by monitoring
condition and performance, provided, of course, that a readily monitorable parameter of deterioration can be found.
The probabilistic element in failure prediction is therefore reduced or, indeed, almost eliminated, the life of the item
maximized and the effect of failure minimized. One of the major benefits of this policy is that the resulting corrective
maintenance can, in most cases, be scheduled in the short term without production loss.
The monitoring parameter can provide information about a single component, or provide
information that can indicate a change in any number of different components (e.g. vibration from a turbo
generator). The more specific the information provided, the better from the point of view of maintenance decision
making.
Condition-monitoring can be applied in three ways:
Simple inspection - qualitative checks based on look, listen and feel (e.g. rope worn/rope not worn).
Condition checking - done routinely, measuring some parameter which is not recorded but is only used
for comparison with a control limit. Such checking only has value where there is
extensive experience of identical systems.
Trend monitoring - measurements made and plotted in order to detect gradual departure from a norm.
The desirability of the monitoring, the technique used, and its periodicity will depend upon the deterioration
characteristics of the item and the costs involved.
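The difference between condition checking and trend monitoring can be sketched in a few lines. The example below assumes a parameter whose value rises as the item deteriorates (for instance a vibration level); the readings, the control limit and the function names are all hypothetical.

```python
def exceeds_control_limit(reading: float, control_limit: float) -> bool:
    """Condition checking: compare a single unrecorded reading with a control limit."""
    return reading >= control_limit


def trend_slope(times, readings) -> float:
    """Trend monitoring: least-squares slope of the recorded readings against time."""
    n = len(times)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two matching time/reading pairs")
    mean_t = sum(times) / n
    mean_r = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, readings))
    return sxy / sxx


# Made-up weekly vibration readings (mm/s) and an assumed control limit.
weeks = [0, 1, 2, 3, 4]
vibration = [2.0, 2.1, 2.3, 2.6, 3.0]
CONTROL_LIMIT = 4.5

slope = trend_slope(weeks, vibration)
print("latest reading over limit:", exceeds_control_limit(vibration[-1], CONTROL_LIMIT))
print(f"trend: {slope:.2f} mm/s per week")
```

Here the latest reading is still inside the control limit, so a condition check alone raises no alarm, while the positive slope shows the gradual departure from the norm that trend monitoring is meant to reveal.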
Simple inspection procedure is sufficiently effective to account for 70% of a typical condition-based programme.
Such a procedure is usually cheap and carried out as part of a routine. The important points are that the cost should
be insignificant relative to the cost of repair, and that the periodicity should be sufficiently short to detect minor and
often unexpected problems before they develop.
One form is ‘short period blanket inspection’ of on-line plant, carried out in order to identify obvious minor defects
before serious damage can arise.
Condition checking can be used on such items as brake pads, which have well documented linear deterioration
characteristics of the type shown in the figure. Because of the predictability of the time to failure, the actual inspection
need not begin until well into the item’s life. The subsequent periodicity can be adjusted to give the desired level of
warning using a control limit. Obviously, the deterioration characteristics shown in figure 3.6 are preferable to the more
usual characteristics shown in fig 3.7, in as much as the failure developing period, or ‘lead time’, is longer. The shorter
the lead time, the greater the inclination towards shorter inspection periods and, in special cases, continuous monitoring
(e.g. vibration monitoring of turbo-generators).
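For an item with linear deterioration, the control limit and the point at which inspection should start follow from simple arithmetic. The sketch below assumes a constant wear rate; the brake-pad dimensions, wear rate and warning period are made-up values, and the function names are hypothetical.

```python
def control_limit_for_warning(failure_level: float, wear_rate: float,
                              warning_time: float) -> float:
    """Level at which action is triggered so that 'warning_time' remains before failure."""
    return failure_level + wear_rate * warning_time


def time_to_reach(initial_level: float, target_level: float, wear_rate: float) -> float:
    """Time for a linearly wearing item to fall from initial_level to target_level."""
    if wear_rate <= 0:
        raise ValueError("wear rate must be positive")
    return (initial_level - target_level) / wear_rate


# Made-up brake pad: 12 mm when new, unusable at 2 mm, wearing 0.5 mm per 1000 km.
NEW_THICKNESS_MM = 12.0
FAILURE_THICKNESS_MM = 2.0
WEAR_MM_PER_1000_KM = 0.5
WARNING_1000_KM = 2.0  # desired warning before failure, in thousands of km

limit = control_limit_for_warning(FAILURE_THICKNESS_MM, WEAR_MM_PER_1000_KM, WARNING_1000_KM)
start = time_to_reach(NEW_THICKNESS_MM, limit, WEAR_MM_PER_1000_KM)
print(f"control limit: {limit} mm, expected to be reached after {start} thousand km")
```

With these numbers the pad is expected to reach the control limit only late in its life, which illustrates why inspection need not begin early when deterioration is predictable.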
Trend monitoring is most effective where little is known about the deterioration characteristics. Since this is mostly
the case, it will be appreciated that trend monitoring is of widest application. Experience is accumulated as monitoring
progresses and, when enough knowledge of deterioration characteristics has been acquired, condition checking can be
substituted for trend monitoring.
In general, monitoring techniques can be used for both condition checking and trend monitoring. It will be appreciated
that condition monitoring routines are predetermined and, in most cases, can be carried out with little or no unit/plant
unavailability (i.e. they can be categorized as on-line maintenance). If the resulting corrective maintenance requires
the plant to be stopped, this can be scheduled in the short term to minimize inconvenience (see fig 3.8).
Operate-to-failure
No action is taken to detect the onset of, or to prevent, failure. The corrective work that results occurs with random
incidence and with little or no warning. Where such a failure can cause a
complete plant outage, the determination of the ‘best’ action is both difficult and expensive. This will be discussed
below under ‘corrective maintenance’.
Opportunity maintenance
Timing is determined by the procedure for some other item in the same unit or plant. Consideration of this possibility
is a major factor in formulating the maintenance plan for the whole plant.
1. Determine critical plant units and production windows: This step determines the nature of the plant process
(continuous or batch type), classifies the plant into units and constructs a flow diagram. It also carries out a simple
consequences-of-failure analysis and estimates the cost of lost production. It determines the production plan, the
pattern of plant operation and the expected plant and unit availabilities (see diagram 1.1).
2. Classify the plant into constituent items: This will be a complete classification in the case of critical units, and a
partial classification in the case of non-critical units (see diagram 1.2).
3. Determine and rank the effective procedures: Determine the effective procedures for each item and the best of these
from a cost and safety viewpoint. In general, procedures for simple items will be reasonably certain and will be mostly
on-line maintenance. However, this will not necessarily be so in the case of complex items, and the best approach is
often to try to identify a simple method of condition checking.
4. Establish a plan for the identified work: This method will depend upon whether the plant is series-continuous, product
flow, batch product flow, batch, or vehicle fleet. The large series-continuous plant presents the most difficult problem
because it has to be considered as a whole for scheduling; the items in a batch plant or fleet can be considered
individually.
6. Establish corrective maintenance guidelines: In spite of preventive maintenance there will be some unexpected
failures, e.g. those due to items which fail randomly and without monitorable warning. Such failures have to be
planned for in terms of spares and manpower. In the case of critical units, careful consideration must also be given
to repair methods, documentation and decision guidelines.
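Planning spares for items that fail randomly is commonly approached by treating the number of failures during the resupply period as Poisson distributed. The text does not prescribe this model, so the sketch below is an assumption offered for illustration; the failure rate, lead time and service level are made-up values.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """Probability of at most k failures when the expected number is 'mean'."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))


def spares_needed(failures_per_year: float, lead_time_years: float,
                  service_level: float) -> int:
    """Smallest stock that covers demand during the lead time with the given probability."""
    if not 0.0 < service_level < 1.0:
        raise ValueError("service level must be between 0 and 1")
    mean = failures_per_year * lead_time_years
    stock = 0
    while poisson_cdf(stock, mean) < service_level:
        stock += 1
    return stock


# Made-up example: 4 random failures a year, 3-month resupply time, 95% cover.
stock = spares_needed(failures_per_year=4, lead_time_years=0.25, service_level=0.95)
print("spares to hold:", stock)
```

For critical units a higher service level would normally be chosen, at the price of a larger spares holding.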
[Diagram of plant units: charge vessel, batch reactor, filter, wash vessel, dryer]
MAINTENANCE PLANNING AND CONTROL SYSTEM
Maintenance planning and control is an important task in industry today. It is therefore important to keep the concept
of a maintenance planning and control system in mind so that the plant can be maintained efficiently.
The important components of a maintenance planning and control system are as follows:
Job planning: - Planning of a maintenance job basically deals with two questions: the WHAT and the HOW of the job. While
answering these two questions, many other supplementary questions need to be answered, e.g. “where is the job to be done”
or “why is the job to be done”. As such, it is a very important component, and here engineering knowledge must be
applied extensively to the maintenance jobs to develop an appropriate job plan using the most suitable techniques, tools,
materials etc.
The efficiency and cost of further actions, like job scheduling, execution and control, depend on the quality of job planning.
Steps of job planning: - The main steps to be followed for proper job planning are as below:
(1) Knowledge about equipment, job, available techniques and facilities etc. Knowledge about equipment and
jobs can be obtained from-
* Drawings.
* Instruction manuals.
* Job manuals.
* Experience on similar machines/jobs.
It is not absolutely essential that persons doing job planning have practical experience of the machines/jobs, but it is
advisable to associate such experienced persons with the planning.
(2) Job investigation at site: - After gaining some knowledge about the machines and the jobs through step (1), looking
at the actual job at site gives a clearer perception of the total job. This also helps in ascertaining the following:
* Physical access and space limitations. This may call for jobs like removing covers, guards and stoppers, or cutting a
portion of the machine housing, for actual approach.
* Whether the available lifting and handling facilities are adequate or additional facilities need to be brought in and, in
that case, the space for bringing in those facilities.
• Facilities for disposal of water, oil and gases which may leak or come out during dismantling.
• Space for keeping the dismantled parts and a safety enclosure for the machine under repair.
(3) Development of repair plan: - Preparation of the step-by-step procedure which would accomplish the job with the most
economical use of time, manpower and material. It includes making sketches, line diagrams, networks etc. The weight
of each item to be lifted should be determined beforehand, and planning should be done to avoid double and repeated
handling of the same items. The total job should be broken into smaller measurable activities at this stage.
(4) Preparation of list of materials required: - Depending on the defects list and condition monitoring results, a list of spare
components that need to be changed should be made. A list should also be made of all rubber items like seals, packing
and rings, and other items which are likely to be damaged during dismantling. Another list should be made of all consumables
which would be needed during and after the job, like cotton waste, nuts, bolts, grinding wheels and lubricating oils. A
separate list should be made of all tools and tackles needed for the job.
Job Manuals:- Job manuals are an almost permanent record of methodology, spares, tools etc. for all maintenance jobs
which may have to be done in future. The following steps are generally involved in preparing job manuals:
(i) Make a list of all major and medium maintenance jobs of the plant and codify them for proper identification.
(ii) For each coded job, a separate job manual is to be made, which should include the following:
* Sequence-wise break-up of the job into activities, with instructions as may be needed to carry them out. This also
takes care of fits and tolerances for dismantling and assembling etc.
• List of tools, tackles, spares and consumables needed to carry out each activity.
• List of special tools and facilities, jigs and fixtures and handling facilities for each activity.
• Necessary preparatory jobs which must be done before starting the main job, e.g. depressurizing the hydraulic and
pneumatic systems, taking levels of machine foundations and fixing other benchmarks, and removing operational
items like dies, jaws etc.
• Necessary safety and environmental protection instructions.
• Agencies required to do the job activities, with special reference to agencies outside the shop or outside the plant.
(iii) Each job manual thus prepared should be cross-checked and approved by the maintenance in-charge.
(iv) Different job manuals should be bunched and sent to the respective/potential users.
(v) Job manuals should be updated as and when needed, in consultation with their users.
Uses: -
* During actual planning of a particular maintenance job, the job manual provides ready information for further
micro-planning.
* The materials department may also use job manuals for a better materials procurement strategy.
Maintenance plans are of two types:
* Short-term plans.
* Long-term plans.
Short-term plans: - Short-term planning of maintenance jobs generally includes small preventive maintenance, small
defect rectification, adjustments and small corrective maintenance, which are planned and scheduled on a day-to-day basis.
The persons/group engaged in short-term planning, besides planning and scheduling, ensure timely availability of resources
and maintenance engineers as well as proper documentation. Some features of short-term planning are listed below:-
The planning and scheduling of such jobs are done taking into consideration the plans of different departments, without
necessitating the stoppage of running equipment, or on scheduled off days, or on days when the equipment is not used.
Occasionally small shutdowns for some urgent repair may also be included in short-term planning. It should be ensured
that during one isolation of the equipment, the jobs of all associated departments are completed. These are known as
short-term plans not only because of their small duration but also because they are not planned much in advance.
Long-term plans:- Long-term planning includes the planning functions/jobs which help in maintaining the original
level of performance in terms of output, efficiency and reliability. It also includes the maintenance jobs which lead to
improved availability and capability of equipment, improved maintenance facilities, and safety and environmental
protection measures. Planning of such jobs often includes betterment of design, material, techniques, infrastructure etc.
Following are some of the long-term maintenance plans commonly used in industries and plants:
• Major repair and capital repairs.
• Annual overhauls and statutory overhauls.
• Renovation and revamping.
• Modernisation.
• 5-year rolling plans.
• Strategic maintenance planning.
• Corporate planning.
• Turn around planning.
(1) Major repair and capital repairs: - The term “Major repair” is often confused with the term “Capital repair”.
Major repairs include jobs required to restore seriously deteriorated or broken-down equipment to a condition of
usability for its desired functions. In addition to maintaining the essential quality of the equipment, these repairs may
include minor modifications. “Major overhaul” may be another term to mean the same.
“Capital repairs” normally means major dismantling and near rebuilding of equipment, which amounts to big
jobs involving a lot of resources. The value of the asset is normally built back up to its original value. In capital repairs,
jobs of a major repair nature that need to be done at that time are also included.
(2) Annual overhaul and statutory overhauls :- Annual overhauls, though they may include some major repair and capital
repair jobs, differ slightly in that they are planned in the yearly “Action Plan” and are taken up at a nearly
predetermined time, about one year after the last overhaul. Such overhauls take in three types of jobs:-
(a) Overhauling and defect rectification jobs existing at that time or postponed for that overhaul.
(b) Annual maintenance cleaning, painting and corrosion prevention jobs of equipment, machines, tanks, reservoirs
and filters etc.
(c) Annual operational cleaning and painting jobs, like removal of mill-scales, dusts, debris, guards and dies etc.
(3) Renovation and Revamping: - Renovation and revamping are nearly the same type of job. These differ vastly
from “Capital Repair”, as renovation/revamping jobs are taken up only after the equipment condition has deteriorated to
such an extent that no amount of major or capital repairs yields the desired result and maintenance becomes
uneconomical. These jobs include almost total rebuilding of the equipment complex using either a similar type of
technology and equipment or an improved type of technology and equipment.
(4) Modernisation: - Modernisation jobs for a whole plant or a section of the plant are planned when the operation of that
plant becomes uneconomical. Modernisation may be necessitated when the existing plant, equipment and
technologies become outdated and better alternatives are available. However, the plant engineering department is more
responsible for planning such jobs, and maintenance engineers play a supporting role to ensure good maintainability and
reliability.
(5) Five-year Rolling Plans: - In industries, a five-year rolling plan is generally made for major overhauls and
replacement of components and assemblies. Such plans are essential as procurement of some of the major spares,
components and assemblies may require a long lead time and some of the components may become obsolete. Every
year this five-year plan is reviewed and updated for the next five years. The period of such plans may vary from three years
to ten years depending upon the type of plant/industry.
(6) Strategic Maintenance Planning: - Such planning mainly includes long-term planning of the services and facilities
of the maintenance function, like steam supply and fuel supply, to take care of anticipated needs in future or the possibility
of non-availability of the same services in future.
It also includes long-term planning for replacement of some equipment, which helps in ease of maintenance
and better availability of plant and equipment.
(7) Corporate Maintenance Planning:- This is also a reference type of plan, like 5-year or 10-year rolling plans, but is
made to suit the corporate planning goals of production and total operation of major
plants. Such plans include jobs like major repairs and replacements, which may have to be taken up at different times. Many of
these jobs require a lot of money and hence need approval from the corporate office. Detailed planning and scheduling of these
jobs starts when the time frame is actually fixed.
Let us now consider this in three dimensions. The resulting cube breaks down into eight segments, four for Garages and
four for Re-manufacturing. In addition, a few other aspects of the environment are important: volume and mix have a
profound effect on the control system needs.
GARAGES
1. Materials Planning
This is generally a low volume or repair to order environment, where materials are ordered when required and some
common long lead-time items may be forecast and scheduled. Some difficulties may be encountered in forecasting due to
the low volume and intermittent demands. Control systems techniques that are appropriate include:
• "Re-order Point" with blanket orders and call-offs for the common long lead time items
• "Replacement" for the expensive long lead time items (when you use one, replace it)
• "2 Bin Systems" for the low value items
Safety stocks may be dictated by the need to support AOG (Aircraft On Ground) or VOR (Vehicle Off Road)
requirements or support for other critical plant or computer systems.
Why not use sophistication and MRP1
Some sophisticated modeling systems have been developed to try to forecast when units are likely to arrive for repair or
need repair, coupled with a second model to determine which parts are likely to be replaced in each unit. This is used to
create a Bill of Material to drive an MRP1 provisioning system.
We would not use a provisioning model in this way except to establish initial stock holding requirements at new
equipment acquisition or product launch. This technique requires two forecasts:
1. Product returns
2. Component replacement rates
It is a lot easier (and probably just as accurate) to forecast demand based on historical usage at component level by
recording issues from stores. The last time we took out an MRP1 system which required product forecasts and replaced it
with a simple reorder-point system at component level, it reduced stock by 20% and stock-outs by a factor of 10!
Stock Strategy
However, there is a trade-off between service level and the cost of holding stock. This balance can be tipped in your
favour in several ways:
• Assess the likely risk of AOG/VOR/Critical failure situations arising out of a potential stock-out.
• Conduct a Pareto analysis of historical demand by ranking demand in volume times value sequence. The
resulting list can be split into the following categories:
• "A" items (the highest volume / value, typically 20% of the items or less)
• "B" items (the next highest category, typically 30% of the items or less)
• "C" items (the lowest volume / value, typically 50% of the items)
• No worries. Keep plenty to reduce the reorder frequency and risk of stock-out.
• Why are you stocking this at all? Or if you insist, keep one aircraft/vehicle/equipment set just in case.
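The Pareto split described above can be sketched in a few lines. This is only an illustration: the function name, the part numbers and the demand figures are invented for the example, and the 20% / 30% / 50% cut-offs are simply the typical shares quoted in the list.

```python
def abc_classify(items, a_share=0.20, b_share=0.30):
    """Rank parts by volume x value and label them A, B or C.

    items maps a part number to (annual_volume, unit_value).
    a_share and b_share are the fractions of part numbers placed in
    classes A and B; the remainder falls into class C.
    """
    ranked = sorted(items, key=lambda p: items[p][0] * items[p][1], reverse=True)
    a_cut = round(len(ranked) * a_share)
    b_cut = a_cut + round(len(ranked) * b_share)
    classes = {}
    for rank, part in enumerate(ranked):
        if rank < a_cut:
            classes[part] = "A"
        elif rank < b_cut:
            classes[part] = "B"
        else:
            classes[part] = "C"
    return classes


# Hypothetical issue history: part number -> (annual volume, unit value)
demand = {
    "P01": (12, 850.0),
    "P02": (400, 20.0),
    "P03": (30, 120.0),
    "P04": (150, 15.0),
    "P05": (60, 30.0),
    "P06": (200, 4.0),
    "P07": (500, 1.0),
    "P08": (90, 3.0),
    "P09": (40, 2.5),
    "P10": (10, 1.5),
}

classes = abc_classify(demand)
for part in sorted(classes):
    print(part, classes[part])
```

With ten part numbers the two highest volume-times-value parts become "A" items, the next three "B" items and the remaining five "C" items.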
Beware of seasonal demand. For example more aircraft fly in the summer, so you need to watch out for that. Build stock
in the spring, and reduce stock before the autumn.
Reorder Quantities
We call the period just before reorder "the point of vulnerability". It is particularly relevant in assembly type work where
the number of shortages of raw materials can be dramatically reduced by increasing batch sizes of the "C" items without
any real penalties in inventory stock holding. The mathematics is also very simple. If an assembly contains 100 parts and
each is replenished weekly, the risk of shortage is on average 100 per week. If the "C" items are replenished annually
(typically 50% of the part numbers) the risk reduces to 51 per week, i.e. 50 part numbers are still vulnerable every week
but 50 part numbers are now only vulnerable once every 50 weeks (one per week on average). If you then increase the
batch size of the "B" items (typically 30% of the items) to say 2 weeks, you reduce the risk of shortage further to 36 per
week (20 "A" risks + 15 "B" risks + 1"C" risk). If you then halve the batch size of the "A" items to twice per week
replenishment, you have taken out about 50% of the inventory carrying costs while simultaneously reducing the risk of
shortages by about 45%. In practice there is little extra risk simply because these "A" items are usually produced on a
flow type basis. Because the cost of material shortage is significantly higher than any other costs in terms of lost output
and dissatisfied customers (who may go elsewhere), you get the benefits of Pareto from both ends (reduced inventory
costs and reduced shortages).
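The arithmetic in this paragraph can be checked with a short sketch. It assumes, as the text does, 100 part numbers split 20 / 30 / 50 between "A", "B" and "C" items and a 50-week year; the function name is ours, and "exposures" here simply means the average number of part numbers reaching their point of vulnerability each week.

```python
def weekly_exposures(groups):
    """Average number of part numbers at the point of vulnerability per week.

    groups is a list of (number_of_part_numbers, replenishment_interval_weeks).
    A part replenished every n weeks is exposed 1/n times per week on average.
    """
    return sum(count / interval for count, interval in groups)


# (part numbers, replenishment interval in weeks) for A, B and C items
baseline = weekly_exposures([(20, 1), (30, 1), (50, 1)])         # everything weekly
c_annual = weekly_exposures([(20, 1), (30, 1), (50, 50)])        # C items once a year
b_fortnightly = weekly_exposures([(20, 1), (30, 2), (50, 50)])   # B items every 2 weeks
a_twice_weekly = weekly_exposures([(20, 0.5), (30, 2), (50, 50)])  # A items twice a week

reduction = 1 - a_twice_weekly / baseline

print(baseline, c_annual, b_fortnightly, a_twice_weekly)
print(f"overall reduction in exposures: {reduction:.0%}")
```

The four results are 100, 51, 36 and 56 exposures per week. The final figure is 44% below the baseline, which is the "about 45%" quoted above: halving the "A" batch size adds 20 exposures back, but the net position is still far better than replenishing everything weekly.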
In practice we have used this technique many times now and have produced benefits at one end of the spectrum of a 40%
reduction in stock holding without reducing a very high service level. At the other end of the spectrum we have reduced
shortages by a factor of 10 whilst simultaneously reducing stockholding by 10%. The last time we did this in an
aerospace company AOG disruption to production virtually disappeared, and service levels doubled (They were bad to
start with). Stock levels did not increase.
1. Watch out for short shelf life products. You will end up throwing them away.
2. Watch out for bulky items. They may be cheap but they will still fill your stores.
3. In most repair and overhaul situations it would be very difficult to approximate to a normal distribution pattern
from the very uncertain demand. Demands are statistically "sparse", so you cannot use statistical approaches
reliably.
A refinement to this technique is to forecast service intervals and quantities required at those intervals separately. In
principle this is attractive since it seems more likely to schedule the arrival of the parts when required and to allow for a
high probability that the demand at that time will be covered by stock. However it is still forecasting based on historical
demand and therefore suffers from that fundamental inaccuracy. But it is a sensible simplification of the MRP1 approach
above.
4. Do not ask your plant supplier what spares they recommend you should hold for your critical plant or, if you do,
halve what they say, because the last time we did this we had slow moving stock around for the life of the plant.
2. Materials Control
The call-offs above may be via Kanban systems. Otherwise stores stock recording and control is essential, particularly for
attractive items with a black market value. Strenuous efforts must be made to keep the shop clean and unused materials
returned to stores routinely. A technique very valuable in this environment is "5S's", which aims to improve
housekeeping. The last time we did this we filled many skips with rubbish and racking for the rubbish! We reduced the
floor space utilized by about one third, we started to be able to find things in stores and we stopped ordering things
we already had. This is particularly important if lot traceability is required.
Shop floor control and tracking of jobs features significantly in all but the smallest shops.
Version control is important in all safety critical situations, to plan and implement changes and to isolate problems. In
particular, suspect batches may need to be recalled.
Squirreling
It was once the proud boast (and it made the works magazine) that our repair shop could service a 30 year old vehicle
from floor stock. I was horrified that we could do this! And for three reasons:
• The stock should have been used or thrown away twenty five years ago
• Even if we had kept the stock, it should have been in the stores
• We did not charge an economic cost for the service if you take into account the cash tied up for 30 years.
3. Capacity planning
There is a classic dilemma in maintenance work. If the maintenance people are busy the place is not earning money. If
they are not busy they are usually first on the redundancy list. Scheduling of maintenance work exists against a
background of unusual breakdowns, which have to be accommodated in a hurry. The only 100% reliable way of
managing this situation is to have spare capacity either through sub-contracting or through re-deploying maintenance
personnel to other duties when not busy. This is very difficult unless routine scheduled maintenance predominates.
Another problem is the lack of routine scheduling information (standard methods and times) for non-routine operations. A
typical problem of this type of work measurement is the establishment of "loose standards", which if used to drive
incentive schemes gives rise to serious problems. As an aside: incentive schemes are no substitute for good supervision.
However, "rule of thumb" time estimates and Rough Cut Capacity Planning are possible. Skills are the usual resources
that need to be scheduled, not plant. If Total Productive Maintenance (see below) is being utilized, scheduling becomes
simpler because a higher proportion of the work is scheduled rather than breakdown dominated.
4. Capacity Control
This again is a classic dilemma. Do we do the urgent first or the very urgent? Frequently a job will be shelved to
accommodate a more urgent one. This process can degenerate into very cluttered workshops and high work-in-process
stock holding. Running a strict good housekeeping regime of operations control can alleviate this. The only satisfactory
way to avoid building unwanted work-in-process is to use a simple form of input/output control, i.e. do not issue another
job until the last is out of the way. One way which we have used to control work in process is to restrict the number of
work or kitting trolleys to one per individual so that they can only be working on one job at a time. The trolley is used as a
Kanban to request the next job from stores, when the previous job has been started. Also it is common to hold some
sub-assemblies in work-in-process. Unless these require significant lead-time to assemble it is hard to justify holding
sub-assemblies, and this situation often leads to cannibalizing one job to make a more urgent job. Our advice is do not do it
unless you really have to, and draw a "Commonality Tree" to assess the need.
The use of loading boards is common in this environment. More recently electronic loading boards with pick and place
facilities are being used.
Tools management is essential, with "Shadow Boards" used to ensure tools can be located when needed; in safety
critical situations such as aircraft assembly they also ensure that tools are not lost.
The use of housekeeping techniques such as "5S's" is appropriate. Diagnostic skills and possibly tools are required. These
may be required to support remote diagnostics. Considerable effort may be required to establish this infrastructure.
Re-manufacturing
1. Materials Planning
a) Sufficient stock must be maintained to support underlying demand for reconditioned units.
b) Managing the stripped component stock to keep balanced sets of parts for rebuild.
Using new items instead of salvaged items is costly. So in order to maintain components it may be necessary to strip
further units. Yields must be used as an input to calculate material requirements. Ultimately imbalances are bound to
occur. In this case an occasional purge may be required to restore the balance, by either throwing away surpluses or
buying new components depending on the economics of doing so.
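As a rough illustration of how yields feed the material requirement, the sketch below works out how many salvaged units would have to be stripped to cover a rebuild demand, and the surplus this leaves on the better-yielding components. The component names and yield percentages are hypothetical, and it assumes one of each component per unit; real yields would come from the shop's own strip records.

```python
def units_to_strip(rebuild_demand, yield_pct):
    """Units to strip so that every component covers the rebuild demand.

    yield_pct maps a component to the percentage of stripped units that
    give a reusable part (one part per unit is assumed).
    """
    # Ceiling division in integer arithmetic: ceil(demand * 100 / pct)
    return max(-(-rebuild_demand * 100 // pct) for pct in yield_pct.values())


def surplus_after_strip(rebuild_demand, yield_pct, stripped):
    """Good parts left over per component after covering the rebuild demand."""
    return {c: stripped * pct // 100 - rebuild_demand for c, pct in yield_pct.items()}


# Hypothetical yields from strip records, in percent
yields = {"housing": 80, "shaft": 50, "rotor": 90}
demand = 40  # reconditioned units required

stripped = units_to_strip(demand, yields)
surplus = surplus_after_strip(demand, yields, stripped)

print(stripped)  # driven by the worst-yielding component
print(surplus)
```

Here 80 units must be stripped because the shaft yields only 50%, leaving 24 surplus housings and 32 surplus rotors. That is the imbalance the text refers to: the choice is between stripping more units and purging the surplus, or stripping fewer and buying new shafts, depending on the economics.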
Because there is a greater volume, medium to large batch rules apply with many similarities to original production /
assembly operations. Forecasting is easier and there is more repetition. MRP1 systems may be appropriate where demand
can be forecast with some certainty. We have encountered situations where a negative bill of material was constructed to
accommodate yields expected from salvaged units which were then offset against the requirements for remanufacture.
This method was later abandoned in favour of change of manufacturing strategy where salvaged units were stripped as
soon as possible to determine availability of good components.
Generally Re-order Point techniques are most appropriate for forecasting demand, with blanket orders/schedules for
repetitious component requirements.
2. Materials Control
Call offs are more likely to be via Kanban control because of increased repetition. It is vital to monitor yields in this
situation to ensure that stocks of unsalvaged units are sufficient to satisfy demand. Re-manufacturing
creates a special problem for lot traceability. If a part has been recycled and it fails, what is the cause of the failure? Is it
the original manufacture or the recycling process?
3. Capacity Planning
Because more time standards on work content are available (however informally) estimating jobs is easier. Because
processes are more predictable Routes (Routings) can be established to use in shop loading. Because there is repetition,
demand is also smoother. The combined effect of these factors makes capacity planning easier. The use of "Level
Scheduling" is recommended.
4. Capacity Control
Because demand is smoother and more repetition is present, skills management is less important and in fact more
deskilling or automation may be possible. Switching effort to stripping rather than rebuilding can accommodate troughs
and conversely reducing stripping to satisfy immediate demands can accommodate peaks.
Sometimes the organization may be slightly schizophrenic, flipping from job shop to volume producer. At this point it is
worth considering some method of segmentation along resource utilization lines.
Volume increases are much easier to manage than mix increases. When volume increases, segmentation of the product is
possible and automation of the ring-fenced product implemented. Kanban systems are most appropriate in this situation.
Forecasting of demand is also easier.
As mix increases the overall business complexity increases. Considerable thought needs to be given to the proliferation of
variety.
Measures of Performance
Quality is a given these days; however, faults measured in parts per million are less applicable to this situation because
volumes are generally lower. It is more common to measure the utilization of the equipment that is the result of the
maintenance process rather than the process itself. However, a useful technique is "FRACAS" (Failure Reporting, Analysis
and Corrective Action Systems, or Operations Management for Continuous Improvement), which aims to follow up all faults
to prevent a recurrence.
Often maintenance has a primary goal of meeting demand, which can often be measured directly in up-time (of
computers), down time (of critical plant), on the ground time (of aircraft), or off the road time (of vehicles). Contributory
measures to this include response time and mean time between failures. These may need categorization by critical plant or
critical components.
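A minimal sketch of these measures is given below. The formulas are the usual textbook definitions (availability as up-time over total time, mean time between failures as up-time per failure); the figures and the plant name are invented for illustration.

```python
def availability(up_hours, down_hours):
    """Fraction of the period for which the plant was available."""
    return up_hours / (up_hours + down_hours)


def mean_time_between_failures(up_hours, failures):
    """Average running hours between failures."""
    return up_hours / failures


def mean_response_time(response_hours):
    """Average time from a call being raised to work starting."""
    return sum(response_hours) / len(response_hours)


# Hypothetical record for one critical plant item over a 1000-hour period
record = {
    "plant": "Compressor K-1",
    "up_hours": 950.0,
    "down_hours": 50.0,
    "failures": 5,
    "response_hours": [0.5, 1.0, 1.5, 0.5, 1.5],
}

avail = availability(record["up_hours"], record["down_hours"])
mtbf = mean_time_between_failures(record["up_hours"], record["failures"])
response = mean_response_time(record["response_hours"])

print(f"{record['plant']}: availability {avail:.1%}, MTBF {mtbf:.0f} h, "
      f"mean response {response:.1f} h")
```

Keeping one such record per critical plant item or component gives the categorization suggested above.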
The cost of maintenance can be significant, so productivity/costs are important measures. However, productivity
presupposes a standard output such as vehicles, repairs, etc., which, as we have said before, may be difficult to establish
because of the variable work content involved. This often causes up-time etc. (above) to be used as a substitute. One key
feature of productivity is cash utilisation, which can be significantly influenced by good materials and capacity
management as described above.
Replacement Theory
It is a fact that the cost of maintenance can become prohibitive as plant, vehicles, aircraft or software ages. Replacement
theory dictates that this is monitored and that there is a planned event to replace the item before it actually dies but also
before the cost of maintenance becomes uneconomic. It is not the intention to explore the mathematics of this here but
simply to point out the fact that it can be cheaper to replace than maintain or continue to maintain, so you should not
automatically choose the repair option, and you should keep records of repair costs.
“DOCUMENTATION”
PREVENTIVE MAINTENANCE DOCUMENTATION:
Most preventive maintenance documentation systems are based on similar principles but differ because of the nature of
the process and the size of the plant. The principles involved are best discussed with reference to the traditional
manual system. This is based on the plant unit and is suitable for small machine shops or transport fleets where the unit is
clearly identifiable, can be scheduled separately and the plan is mainly inspection based. The system is made up of three main
parts: a plan for each unit, a specification for each job (or routine) and a schedule for the plant. The maintenance plan is a
list of the identified maintenance procedures for a unit of plant, classified by trade and frequency. Job specifications can
be written for:
An individual procedure - if the procedure contains a large work content or some particular technical difficulty.
A job - a set of off-line procedures on one unit, to be carried out by one section.
A routine - a set of on-line procedures (usually simple inspection or lubrication) in one area, involving
one trade and carried out on one occasion.
The maintenance schedule is arranged with the aid of a bar chart, with the aim of achieving a balanced
workload. Jobs of short periodicity (less than a month) are not included and are scheduled separately, usually by first-line
supervision. The great advantage of the bar chart is that the preventive load for a plant can be seen and smoothed to the
required profile.
In most of the traditional preventive documentation systems the bar chart is used mainly for scheduling jobs into
some form of card index, sometimes called the ‘job tickler file’. The index, made up of 52 slots, can then be used directly
for scheduling and control of preventive work, i.e. each week it feeds a weekly load of job specifications, plus a summary,
into the work planning system. Each specification is accompanied by a work order on its way to the shop floor. The
index can be updated and rescheduled as necessary on the return of the job specification cards. Resulting corrective work
is noted on the completed work order and stays in the work planning system.
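The 52-slot tickler file can be pictured as a list of weekly slots into which each job specification is dropped according to its frequency. The sketch below is only an illustration of that idea; the job codes, frequencies and starting weeks are hypothetical, and jobs with a periodicity of less than a month are left out, as in the text.

```python
def build_tickler(jobs, weeks=52):
    """Place job specifications into weekly slots.

    jobs is a list of (job_code, frequency_in_weeks, first_week), with weeks
    numbered from 1. Returns a list of `weeks` slots; slot 0 holds week 1.
    """
    slots = [[] for _ in range(weeks)]
    for code, frequency, first_week in jobs:
        for week in range(first_week, weeks + 1, frequency):
            slots[week - 1].append(code)
    return slots


# Hypothetical preventive jobs: (code, frequency in weeks, first week due)
jobs = [
    ("LUB-PUMP-01", 4, 1),       # four-weekly lubrication routine
    ("INSP-GEARBOX-02", 13, 3),  # quarterly inspection
    ("OVH-COMP-03", 26, 10),     # half-yearly overhaul
]

tickler = build_tickler(jobs)

# Weekly load fed to the work planning system, here for week 29
print(tickler[28])

# Summary of the preventive load per week, as would be read off the bar chart
load = [len(slot) for slot in tickler]
print(max(load))
```

Moving a job's first week changes which slots it falls into, which is the manual smoothing of the workload that the bar chart is used for.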
In the case of large process plants, e.g. the batch chemical plant or paper machine, scheduling of preventive work is much
more complex and not always inspection-based. There is a need to integrate the scheduling of maintenance work on
many units of plant and, if a manual system is being used, a bar chart must be used for the initial schedule and to control
the completion of work. The work load is read off the bar chart and the specifications are then selected from their file and
sent, as before, to the shop floor.
A simplification of the traditional system is to use a work manual containing coded job specifications, which is
made available to the maintenance trade force. The work order from the planning office will contain a brief job description
and a job specification code.
The main variation on the foregoing traditional ideas is the extension of the job tickler file into a
maintenance job catalogue. The latter includes all preventive jobs (frequency jobs) and as wide a range of corrective jobs
(non-frequency jobs) as possible. Each job, and routine, in the catalogue is listed under the relevant plant inventory
number, and the job specification includes job method, standard time, frequency, trade(s), spares, tools and an indication of
whether it is on-line or off-line. If it is off-line, the unit that has to be shut down in order to complete the job is indicated. Such
information, stored in a computer file, can form the basis of a preventive maintenance scheduling program which can also
take into account resource constraints, the possibility of multi-unit scheduling, opportunity maintenance and deferred
work. This is a more detailed and flexible approach than that of the traditional systems and is appropriate for large complex
processes where a computer is already being used for data processing.
The main documentation and planning aids used are as follows:
WORK REQUEST: ‘A document requesting work to be carried out’. It usually carries such information as person
requesting, plant number, plant description, work description, defect, priority and date requested.
JOB CATALOGUE: A file of job specifications (preventive and corrective) as previously described.
ALLOCATION BOARD: A short-term planning board showing the men available on each day of one week, which allows
jobs to be allocated to men. This can be supplemented by the allocation board.
WORK PLANNING: Requests for emergency work are made verbally to area supervision, who raise the work orders.
Requests for deferred corrective work and modifications are made to the planning office on work request
forms. A work order is raised directly or by reference to the job catalogue (if held). The priority of such work is
decided at the weekly plant meeting and the work is then loaded onto the corrective maintenance planning board.
Preventive work is planned and scheduled as explained in the previous section. The preventive maintenance system
feeds a weekly load of work into the work planning system to be considered for the weekly programme alongside the
corrective and modification work. Work orders are raised (by reference to the job catalogue) and work not to be carried out
is rescheduled using the planning board.
Work orders are raised in triplicate, one copy remaining in the planning office, one with the supervisor and one sent as the
order to the tradesman. As the work order is returned through the system the copies can be filed or destroyed. An
important point is that, for effective control, the execution of all work should be covered by a work order and a copy of all
completed orders should return to the planning office.
CONTROL:
The main information necessary for control comes from the completed work order and stores requisition
forms. The main information necessary to complete the work order (job completed, hours taken, action taken, etc.) comes
from the tradesman and is checked and augmented by the supervisor.
Work control is completed by the daily updating of the allocation board and the weekly updating of the planning boards.
Completed work orders are classified by trade and analyzed to establish overtime hours and the
proportion of time spent on planned and unplanned work. If a work measurement scheme is in operation the work
analysis can extend into performance calculation for the consequent report.
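The weekly analysis of completed work orders can be illustrated with a minimal Python sketch. The record fields (`trade`, `hours`, `overtime_hours`, `planned`) and the sample figures are hypothetical, chosen only for illustration; a real work order would carry whatever fields the plant's own form defines.

```python
from collections import defaultdict

# Hypothetical completed work orders; field names and figures are illustrative only.
work_orders = [
    {"trade": "fitter", "hours": 6.0, "overtime_hours": 1.0, "planned": True},
    {"trade": "fitter", "hours": 2.0, "overtime_hours": 0.0, "planned": False},
    {"trade": "electrician", "hours": 4.0, "overtime_hours": 2.0, "planned": False},
]

def analyse_by_trade(orders):
    """Return, per trade, total hours, overtime hours and the share of planned work."""
    totals = defaultdict(lambda: {"hours": 0.0, "overtime": 0.0, "planned_hours": 0.0})
    for order in orders:
        t = totals[order["trade"]]
        t["hours"] += order["hours"]
        t["overtime"] += order["overtime_hours"]
        if order["planned"]:
            t["planned_hours"] += order["hours"]
    return {
        trade: {
            "hours": t["hours"],
            "overtime": t["overtime"],
            # Fraction of the trade's hours spent on planned work.
            "planned_share": t["planned_hours"] / t["hours"] if t["hours"] else 0.0,
        }
        for trade, t in totals.items()
    }

summary = analyse_by_trade(work_orders)
```

With these sample orders the fitters show 8 hours in total, three quarters of it planned, while all of the electrician's time was unplanned.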
The first level of plant condition control operates via information passed from the tradesman to the supervisor, either
verbally or through the work order. The supervisor is responsible for seeing that the cause and consequences section of
the work order is completed by the tradesman and, where this is not the case, for establishing the reason. The second level
of condition control operates through the planning office and depends partly on an effective history record. This in turn
depends upon the information on causes, cost and work conveyed on the work order and on the transfer of this
information, at the right level, to the history record.
The information should also include the item affected, components replaced, possible causes, downtime and total
hours worked.
If the history record is to perform both its functions it must be easily accessed and interrogated. In addition, it
should be designed to provide automatic indication of the main problem areas.
COMMENTS:
The documentation system that has been outlined does not have to be used in its totality. Any part, or parts,
can be used as needed. For example, many companies use only the preventive maintenance system plus a limited plant
inventory and history record. Others use only the work order system, and that mainly as a means of conveying work control
information for incentive schemes.
A COMPLETE SYSTEM
A system in use in many plants in the UK is that devised by the Production Engineering Research Association
(PERA) and described by Carder.
CONTROL
Work control is achieved through the return of the work order and inspection reports and the updating of the
allocation board and planning office schedules. The maintenance supervisor checks and signs the work orders and
inspection reports (entering the maintenance cost codes) before returning them to the planning office. The job
specifications are returned to their file for future issue.
A unique electro-mechanical method of card storage and selection has been developed by Kalamazoo. This, coupled
with photocopying, allows a maintenance documentation system to be developed around a catalogue
of jobs, the majority of which can be covered by individual job specifications. The following description will
concentrate on these features, since the other parts of the system are similar to those already described.
Each unit of plant is numbered, using a method similar to that of the PERA system. Plant inventory record cards
and history record cards are held in visible-edge files.
This also divides the preventive work into on-line routines – carried out without special planning – and off-line jobs that
do need such planning.
The specification for each off-line job (there might be a number of these for each unit) is entered on a blank
punched-edge card. Included on the card is a list of spares and references to the manuals and drawings required.
On the reverse side of the card is the unit’s description, identification number, location details, etc. and job information
such as frequency, priority and trades and time required. Each job is scheduled into an appropriate week (with the help of
a twelve-month bar chart) and the card is punched at that week number. Other selection information is then punched on
the card edges and might include identification number, trade, job type, priority, etc.
The on-line work is classified into area, trade and periodicity. For each area, a card is made out for every routine,
with periodicity, scheduled week or month, and other details punched on the appropriate edge index.
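The way edge-punched cards are selected (by week number, trade, priority and so on) is in effect a filter over a set of job records. A minimal Python sketch follows; the `JobCard` fields, the `select_cards` helper and the sample jobs are hypothetical and stand in for whatever is actually punched on the card edges.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JobCard:
    """One job card; each field stands for a value punched on the card edge."""
    unit_id: str
    job: str
    week: int       # scheduled week number
    trade: str
    priority: int

def select_cards(cards, **criteria):
    """Return the cards whose punched values match every given criterion."""
    return [c for c in cards if all(getattr(c, k) == v for k, v in criteria.items())]

cards = [
    JobCard("P-101", "Replace pump seal", week=12, trade="fitter", priority=1),
    JobCard("M-204", "Inspect motor brushes", week=12, trade="electrician", priority=2),
    JobCard("P-101", "Check coupling alignment", week=30, trade="fitter", priority=2),
]

# All fitter jobs scheduled for week 12.
week_12_fitter = select_cards(cards, week=12, trade="fitter")
```

Selecting on `week=12` alone would return both week-12 jobs, which corresponds to pulling every card punched at that week number.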
Computer technology has developed extremely rapidly over the last ten years and a wide variety of computerized
documentation systems have become available. Early applications in the field of maintenance were based on expensive
and powerful mainframe machines. The development of on-line maintenance systems has been rendered possible by the
introduction of improved data-handling techniques for storing and retrieving data. Such techniques allowed time sharing,
i.e. on-line use of the computer by a number of users. In addition, the advent of less expensive but relatively powerful
machines has meant that a computer can be dedicated solely to the maintenance function.
Maintenance systems currently in operation fall into the following categories.
(1) Those which use a main-frame, jointly with other departments or sites, on a time-sharing basis. Such systems are
restricted to large organizations that can justify the possession of a main-frame computer. The maintenance system is
multi-user, i.e. several terminals (VDUs) are connected to the computer, but often has low priority of usage and, although
the computer is on-line, the response time might be slow.
(2) Those which use a dedicated mini-computer backed up by a time shared main-frame computer (distributed
processing). The maintenance system is multi-user and, to facilitate immediate access, the main functions are on the mini.
Such systems are expensive but permit adoption of the full software package required by a large plant.
(3) Those which use a dedicated mini-computer only. These systems are
mostly multi-user and can provide on-line processing of the totality of the documentation required by medium-sized and small
plants. Such systems are less expensive than those of (2) above and the majority of computerized maintenance packages in
current use fall into this category.
(4) Those which use a micro-computer. These are relatively cheap. However, they are single-user only, have small core
storage, and hold their main storage on floppy disks. The processing capacity of a computer is indicated by the size of its
core storage. Typically a micro has 32 Kbytes of core storage and a main-frame 8000 Kbytes (1000 Kbytes = 1 Mbyte = 200
A4 pages). The main storage of a computer is held on disk, usually on a hard disk connected permanently to the
computer. A floppy disk can hold only 500 Kbytes, which is much less than can be held on a hard disk. Floppy
disks are stored and loaded into a disk drive as required. A micro-computer is therefore slow in operation and only allows
full documentation in the case of a small plant. If, however, its use is limited to one aspect of documentation, e.g.
compilation of the history record, this can be tackled on a considerable scale.
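To put these storage figures in perspective, the rule of thumb quoted above (1000 Kbytes = 1 Mbyte = 200 A4 pages) can be applied to each capacity. The short Python sketch below does only that arithmetic; the function and variable names are illustrative.

```python
KBYTES_PER_MBYTE = 1000   # the text's convention: 1000 Kbytes = 1 Mbyte
PAGES_PER_MBYTE = 200     # the text's rule of thumb: 1 Mbyte = 200 A4 pages

def kbytes_to_pages(kbytes):
    """Convert a capacity in Kbytes to the equivalent number of A4 pages."""
    return kbytes * PAGES_PER_MBYTE / KBYTES_PER_MBYTE

capacities_kbytes = {
    "micro core storage": 32,
    "floppy disk": 500,
    "main-frame core storage": 8000,
}
pages = {name: kbytes_to_pages(kb) for name, kb in capacities_kbytes.items()}
```

On this basis a 500 Kbyte floppy disk holds the equivalent of about 100 A4 pages, and the 32 Kbyte core storage of a micro only about six, which is why full documentation is practical only for a small plant.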
Micro-computers of increased power are beginning to appear on the market. These allow hard-disk storage and may
be connected to a small number of terminals. Such machines are cheaper than those of (3) and will compete in the same
market. Another development is the network connection of a number of micro-computers (distributed processing) to form
a multi-user system.
Workshop: The workshops are essential parts of the maintenance organization; their structure, as well as the
equipment located therein, largely depends upon the type of machines to be maintained. However, the general layout of the
workshops should be more or less of a similar type. The basic characteristics of a good workshop are the following:
It must have all the basic facilities to enable effective utilization of equipment used in plants.
It must have all the facilities for carrying out minor modifications to parts of the equipment.
It must have a research and development wing for devising new methods of test and repair.
It must have provisions to ensure quality of raw materials to be used for maintenance work.
It must have the requisite facilities for training the maintenance personnel working in the field.
It must maintain a proper record system for the utilization of manpower and equipment at its disposal.
It must have sufficient space for future expansion to accommodate modern facilities.
Store: Stores form an important part of the maintenance function. For completion of maintenance work on time, the
supply of spare parts must be maintained regularly through the stores organization. Before setting up a store,
classification of the equipment must be done to identify each machine and its particulars and the particulars of its spare
parts. Equipment history cards also help in planning the stores. The working conditions of equipment may also affect the
requirements for spare parts. For efficient working, it is advisable to store the items category-wise, e.g. mechanical,
electrical, hydraulic, and so forth. The duties of the stores in-charge are crucial, since the success of the maintenance
organization largely depends on ready availability of spare parts.
Lubrication: Lubricants are used in almost all mechanical systems and equipment and they play a very important role in
maintenance engineering. The proper use of lubricants contributes towards increased life of equipment and plant. To
achieve a sound quality of maintenance, therefore, the maintenance management needs to take care of the following:
To ensure that all the components or parts are provided with the proper type and quality of lubricants.
To select the correct type of lubricants as well as the lubrication systems.
Periodic monitoring of the quality of lubricants in use and their replacement within the specified time period.
Proper storage and handling of lubricants.
Spares control: For the success of the maintenance function, the following important factors need to be looked into:
Provision of accurate quality of spare parts.
Reduction in the non-moving lot of spare parts.
Maintaining the optimum level of spare parts inventory.
The following are a few steps to minimize the range and scale of spares.
Efficient indenting of spares.
Control over consumption.
Reconditioning/overhauling of used spares.
Use of management techniques.
Effective involvement of maintenance/operational personnel.
COMPUTER INTEGRATED MAINTENANCE SYSTEM
An intelligent computer integrated maintenance system and method includes an electronically stored parts manual which
contains a hierarchical listing of all parts in production machines, and a maintenance operations computer controller
which includes a maintenance schedule management subsystem, an engineering change control subsystem, a parts manual
management subsystem and a spares inventory management subsystem. The maintenance schedule management
subsystem obtains a schedule of actual and planned production, and groups maintenance activities in order to minimize
lost production time. The engineering change control subsystem integrates engineering change activities with maintenance
activities to maximize production time. The automated parts manual is also updated to account for engineering changes.
The spare parts inventory management subsystem orders spare parts based on predicted maintenance rather than on
prescribed inventory levels. Production efficiency is thereby maximized, as is the use of available maintenance
manpower. Engineering changes are easily accommodated and spare parts inventory is kept to a minimum.
SOME CLAIMS
1. A computer integrated maintenance system for use with a computer integrated manufacturing system, the computer
integrated manufacturing system including a computer controller for controlling a plurality of production complexes each
of which includes a plurality of production machines, the manufacturing system computer controller including an
electronically stored master schedule file having therein a schedule of actual production and planned production for the
plurality of complexes, the manufacturing system computer controller controlling the plurality of production machines
based upon the planned production in the master schedule file; said computer integrated maintenance system comprising:
an electronically stored parts manual, containing a hierarchical listing of parts in the plurality of production machines in
the plurality of production complexes; and maintenance operations computer controlling means, communicatively
connected to said electronically stored parts manual and adapted to be communicatively connected to the master schedule
file, comprising: first means for obtaining a schedule of actual production and planned production for the plurality of
complexes from the master schedule file; second means for identifying parts in the hierarchical listing to be maintained
during a predetermined time period, and a corresponding maintenance time during the predetermined time period for each
identified part, based upon the obtained schedule; third means for reassigning the corresponding maintenance times for the
identified parts, based upon the hierarchical listing of parts in the electronically stored parts manual, to reduce lost
production time for each of the plurality of complexes;
fourth means for generating a revised schedule of planned production based upon the reassigned maintenance times for
the identified parts; and fifth means for communicating the revised schedule of planned production to the master schedule
file; whereby the plurality of complexes are controlled based upon the revised schedule of planned production to allow for
maintenance activities while maximizing production.
2. The computer integrated maintenance system of claim 1 wherein said third means comprises: means for determining
when a complex is inactive, based upon the obtained schedule; and means for reassigning the corresponding maintenance
times for at least some of the identified parts to the time when the complex including at least some of the identified parts is
inactive.
3. The computer integrated maintenance system of claim 1 wherein said third means comprises means for grouping at
least some of the corresponding maintenance times for identified parts in a complex, to reduce lost production time for
that complex.
4. The computer integrated maintenance system of claim 3 wherein said third means further comprises means for
identifying a critical part to be maintained in a complex and a corresponding critical maintenance time, and means for
reassigning at least some of the corresponding maintenance times for other parts in the complex to the critical
maintenance time.
5. The computer integrated maintenance system of claim 1 wherein said third means further comprises means for
determining manpower needed to perform maintenance according to the reassigned maintenance times, and means for
further reassigning the reassigned maintenance times to permit maintenance to be performed with available manpower.
6. The computer integrated maintenance system of claim 1 wherein said electronically stored parts manual further
includes an end of life indicator for selected ones of the production machines, the end of life indicator indicating that the
associated production machine is scheduled to be replaced or modified; and wherein said third means comprises means for
eliminating the corresponding maintenance for parts in machines having an associated end of life indicator, to thereby
reduce lost production time.
7. The computer integrated maintenance system of claim 1 wherein said electronically stored parts manual further
contains means for identifying the type of maintenance for a part to be one of time dependent maintenance or usage
dependent maintenance; and wherein said second means comprises means for identifying parts to be maintained and a
corresponding maintenance time for identified parts having usage dependent maintenance based upon the obtained
schedule.
8. The computer integrated maintenance system of claim 1 wherein said second means further comprises means for
accepting a user selection of said predetermined time period.
9. The computer integrated maintenance system of claim 1 wherein said electronically stored parts manual further
contains an image file, including a corresponding image for parts in the hierarchical listing.
10. The computer integrated maintenance system of claim 1 wherein said hierarchical listing comprises a complete bill of
materials for each complex.
In order to increase the production efficiency and manufacturing flexibility of large manufacturing operations, computer
integrated manufacturing systems are now being widely installed and used. Representative computer integrated
manufacturing systems are described in U.S. Pat. Nos. 4,346,446 to Erbstein et al. entitled "Management and Analysis
System for Web Machines and the Like"; 4,472,783 to Johnstone et al. entitled "Flexible Manufacturing System";
4,457,772 to Haynes et al. entitled "Management Control System for Forming Glassware"; and 4,803,634 to Ohno et al.
entitled "Production Process Control System in Newspaper Printing".
A computer integrated manufacturing system which includes multiple levels of computer control to organize and
disseminate the information for controlling shop floor level systems is described in U.S. Pat. No. 4,827,423 to Beasley et
al. entitled "Computer Integrated Manufacturing System", assigned to the assignee of the present invention, the disclosure
of which is hereby expressly incorporated herein by reference. In Beasley et al., manufacturing scheduling data and data
relating to process, product and material specifications as well as bills of material are generated in an upper level
computer system and refined and downloaded as needed to lower level computers controlling the shop floor process. The
upper level computers are capable of communication with the computers on the lower levels, and computers on the same
level are capable of communication with each other as needed to pass information back and forth.
The art has heretofore suggested adding a maintenance module to a computer integrated manufacturing system in order to
integrate maintenance of the production machines into the computer integrated manufacturing system. For example, the
Haynes et al. '772 patent noted above discloses a glassware production control system which also provides maintenance
information. The Ohno et al. '634 patent noted above also describes a production process control computer which includes
a materials and maintenance control subsystem. The materials and maintenance control subsystem controls the timing of
parts replacement. The timing of parts replacement is calculated in advance from the cumulative total of the predicted life
of consumable parts and operation time and displayed or printed so as to enable order placement for parts. The
maintenance system includes a parts list file containing a list of all consumable parts in the system. The parts list file is
updated by collecting information on the operation of the machine so that residual service lives of consumable parts may
be calculated. When parts replacement is needed, the quantity of parts used for replacement is deducted from the stock
volume in the parts inventory file. When the stock volume of parts in the parts inventory file becomes smaller than at the
time of parts ordering, an order form slip is printed. In other words, a "point of ordering" system is provided. A running
total of elapsed time is computed and compared with the durable life of parts so that the time and date of actual
replacement can be calculated and a schedule of maintenance may thereby be derived.
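The residual-life and "point of ordering" logic just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the method of the Ohno et al. patent: the function names are hypothetical, and the constant `hours_per_day` operating rate is an assumption introduced to turn residual hours into a replacement date.

```python
def residual_life(durable_life_hours, elapsed_hours):
    """Remaining service life of a consumable part, never below zero."""
    return max(durable_life_hours - elapsed_hours, 0.0)

def days_until_replacement(durable_life_hours, elapsed_hours, hours_per_day):
    """Days left before replacement, assuming a constant daily operating time."""
    return residual_life(durable_life_hours, elapsed_hours) / hours_per_day

def needs_order(stock, order_point):
    """Point-of-ordering rule: order when stock falls below the order point."""
    return stock < order_point

# Hypothetical part: 2000 h durable life, 1760 h run so far, machine runs 16 h/day.
remaining = residual_life(2000.0, 1760.0)
days_left = days_until_replacement(2000.0, 1760.0, 16.0)
order_now = needs_order(stock=3, order_point=4)
```

Note that the ordering rule looks only at the stock level; it knows nothing about what is planned for the machine, which is the limitation discussed in the paragraphs that follow.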
The art has recognized the potential advantage of providing a computer integrated maintenance system for a computer
integrated manufacturing system. Indeed, for a sophisticated computer integrated manufacturing system, which controls
many production machines in many production lines in one or more plants, it is almost essential that maintenance be
controlled and scheduled by computer. Unfortunately, heretofore known computer integrated maintenance systems did not
intelligently integrate maintenance into manufacturing. For example, in presently available computer integrated
maintenance systems, the computer may schedule a low priority maintenance operation such as an oil change for one
machine in a production line even though a major maintenance operation for the production line may be taking place a
week later. Similarly, a "point of ordering" system for spare parts may order new parts when the number in inventory falls
below the number stored in the system, even though in reality the machine is scheduled to be replaced in the near future.
Similarly, a computer integrated maintenance system may prescribe a number of maintenance operations to be performed
at one time even though insufficient manpower exists for performing all of that maintenance at that time.
Accordingly, there is a need for an "intelligent" computer integrated maintenance system which does more than merely
schedule maintenance by adding total accumulated hours and scheduling maintenance when the hours reach a
predetermined number. An intelligent maintenance system must also do more than merely function as a point of order
system to order maintenance parts when inventory falls below a predetermined number.
The need for an intelligent computer integrated maintenance system has become more pressing as the complexity of
computer integrated manufacturing systems has increased. As the number of machines being controlled and the number of
simultaneous manufacturing lines being controlled increases, it becomes difficult for a human to understand the overall
work flow in sufficient detail to intelligently modify maintenance instructions generated by a computer integrated
maintenance system. Similarly, it is difficult for humans to assimilate all of the maintenance data and intelligently modify
spare parts ordering instructions generated by a point of ordering system.
An intelligent computer integrated maintenance system is provided for use with a computer integrated manufacturing
system, where the computer integrated manufacturing system includes a computer controller for controlling many
production lines, each of which includes many production machines for producing a particular product. The
manufacturing system computer controller contains an electronically stored master schedule file which includes a
schedule of actual production and planned production for the production lines so that the manufacturing system computer
controller controls the production machines based upon the planned production in the master schedule file.
According to the invention, the computer integrated maintenance system includes an electronically stored parts manual
which contains a hierarchical listing of parts in the plurality of production machines in the plurality of production lines.
The electronically stored parts manual does not merely contain a listing of consumable or maintenance parts. Preferably it
contains a complete bill of materials for each machine in each line. The bill of materials is contained in a hierarchical
listing, which breaks each machine into assemblies and breaks each assembly into its subassemblies, down to the level of
individual parts. Preferably, the electronically stored parts manual includes corresponding image files which illustrate the
hierarchical listing of parts at each level.
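A hierarchical listing of this kind is naturally represented as a tree. The Python sketch below is one possible representation, not the patent's data format; the `Node` class, the part numbers and the example machine are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One entry in the listing: a machine, assembly, subassembly or part."""
    number: str
    name: str
    children: list = field(default_factory=list)

    def walk(self, level=0):
        """Yield (level, node) for this node and everything below it, depth first."""
        yield level, self
        for child in self.children:
            yield from child.walk(level + 1)

    def leaf_parts(self):
        """Return the individual parts, i.e. the nodes with no children."""
        return [node for _, node in self.walk() if not node.children]

# Hypothetical machine broken down into an assembly, a subassembly and parts.
machine = Node("M-1", "Filling machine", [
    Node("A-10", "Drive assembly", [
        Node("S-11", "Gearbox", [
            Node("P-111", "Input shaft bearing"),
            Node("P-112", "Oil seal"),
        ]),
        Node("P-12", "Drive belt"),
    ]),
])

# Indented listing, one line per level of the hierarchy.
listing = [("  " * level) + f"{node.number} {node.name}" for level, node in machine.walk()]
```

An image file reference could be added as a further field on `Node` to mirror the per-level illustrations mentioned above.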
The intelligent computer integrated maintenance system also includes a maintenance operations computer controller
which is connected to the electronically stored parts manual and is adapted to be connected to the master schedule file.
According to the invention, the maintenance operations computer controller includes four subsystems: (1) a maintenance
schedule management subsystem; (2) an engineering change control subsystem; (3) a parts manual management
subsystem; and (4) a spares inventory management subsystem.
The maintenance schedule management subsystem generates a master maintenance schedule. The maintenance schedule
management subsystem obtains a schedule of actual production and planned production for all of the production lines
from the master schedule file. It also interfaces with the parts manual management subsystem to identify parts in the
hierarchical listing to be maintained during a predetermined time period, and also identifies a corresponding maintenance
time during the predetermined time period for each identified part based upon the obtained schedule of actual production
and planned production.
However, rather than generating maintenance orders based solely upon the predetermined time period calculated for each
identified part, the maintenance schedule management subsystem of the present invention reassigns the corresponding
maintenance times for the identified parts based upon the hierarchical listing of parts in the electronically stored parts
manual, so that lost production time for each production line is reduced. A revised schedule of planned production, based
on the reassigned maintenance times, is then generated and communicated back to the master schedule file in the
computer integrated manufacturing system. Accordingly, the plurality of production lines is controlled based upon the
revised schedule of planned production to allow for maintenance activities while maximizing production.
According to the present invention, the maintenance operations computer does not merely schedule maintenance time
based upon the schedule of actual production and planned production. Rather, the maintenance times identified during a
predetermined time period are rearranged based upon the hierarchical listing of parts in the electronically stored
parts manual to reduce lost production time for each production line. For example, the production schedule for each of the
production lines is analyzed to determine whether the line is scheduled to be offline during a time interval which is
sufficiently close to the calculated maintenance time to allow maintenance to be postponed or moved forward to the
machine offline time.
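The idea of moving a maintenance task to a nearby offline period can be sketched as below. The text does not quantify "sufficiently close", so the `tolerance` parameter is an assumption, as are the function name and the use of integer day numbers.

```python
def reassign_to_offline(due_day, offline_days, tolerance):
    """Move a task to the nearest scheduled offline day that lies within
    `tolerance` days of its calculated due day; otherwise keep the due day.
    When two offline days are equally close, the earlier one is chosen."""
    candidates = [d for d in offline_days if abs(d - due_day) <= tolerance]
    if not candidates:
        return due_day
    return min(candidates, key=lambda d: (abs(d - due_day), d))

offline_days = [10, 20, 34]   # hypothetical days on which the line is offline

moved = reassign_to_offline(due_day=17, offline_days=offline_days, tolerance=5)
unchanged = reassign_to_offline(due_day=27, offline_days=offline_days, tolerance=5)
```

In the first call the task due on day 17 is postponed to the offline day 20; in the second no offline day lies within five days of day 27, so the calculated day is kept.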
The present invention realizes that production and maintenance both compete for the use of machines, and accordingly
schedules non-critical maintenance tasks for machine down times so that production time is maximized. Similarly, when a
number of maintenance tasks at a production line are scheduled for a short time interval, the maintenance tasks are
grouped together so that they may all be performed simultaneously. For example, a most critical maintenance task may be
identified and all other maintenance tasks for the line may be scheduled to be performed at the same time as the critical
maintenance tasks. Down time is thereby minimized.
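Grouping tasks around the most critical one might look like the following Python sketch. The `window` parameter (how close in time a task must be to be pulled in), the numeric criticality scale and the task names are assumptions made for illustration.

```python
def group_with_critical(tasks, window):
    """Move every task due within `window` days of the most critical task
    on to that task's day. `tasks` maps task name -> (due_day, criticality),
    where a higher criticality number means a more critical task."""
    critical_name = max(tasks, key=lambda name: tasks[name][1])
    critical_day = tasks[critical_name][0]
    return {
        name: critical_day if abs(day - critical_day) <= window else day
        for name, (day, _) in tasks.items()
    }

# Hypothetical tasks on one production line.
line_tasks = {
    "oil change": (12, 1),
    "bearing overhaul": (15, 5),
    "belt inspection": (17, 2),
    "annual calibration": (60, 3),
}

grouped = group_with_critical(line_tasks, window=5)
```

Here the three tasks clustered around day 15 are performed together during one stoppage, while the calibration due on day 60 is left where it is.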
According to another aspect of the invention, after a revised maintenance schedule is calculated, the manpower
requirement for performing the maintenance is calculated. If the manpower requirement exceeds the available manpower,
the maintenance tasks are rescheduled in a hierarchy of importance/criticality, so that a group of tasks may be performed
with the available manpower.
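One simple way to fit a group of tasks to the available manpower is to take tasks in order of criticality until the available hours are used up. The sketch below uses that greedy rule as an assumption; the text says only that tasks are rescheduled "in a hierarchy of importance/criticality", and the names and figures are hypothetical.

```python
def fit_to_manpower(tasks, available_hours):
    """Schedule the most critical tasks that fit into `available_hours`.
    `tasks` is a list of (name, criticality, hours); a higher criticality
    number means a more critical task. Returns (scheduled, deferred)."""
    scheduled, deferred, used = [], [], 0.0
    for name, _criticality, hours in sorted(tasks, key=lambda t: -t[1]):
        if used + hours <= available_hours:
            scheduled.append(name)
            used += hours
        else:
            deferred.append(name)
    return scheduled, deferred

tasks = [
    ("oil change", 1, 2.0),
    ("bearing overhaul", 5, 6.0),
    ("belt inspection", 2, 3.0),
    ("guard repair", 4, 4.0),
]

scheduled, deferred = fit_to_manpower(tasks, available_hours=10.0)
```

With ten man-hours available, the bearing overhaul and the guard repair are scheduled and the two less critical tasks are deferred to a later period.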
The intelligent computer integrated maintenance system also intelligently schedules maintenance at the end of the
machine life. In particular, an indication is provided to the computer integrated maintenance system when a machine is
reaching the end of its useful life, either because the machine is worn out or because the machine is scheduled to be
replaced or modified in an upgrade. The intelligent computer integrated maintenance system postpones selected
maintenance activity on machines which are scheduled to be taken out of service in the near future.
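The end-of-life rule can be sketched as a filter over the task list. The `horizon` parameter stands in for "the near future", which the text does not quantify; the function name, machine names and day numbers are hypothetical.

```python
def postpone_for_retirement(tasks, retirement_day, horizon):
    """Split tasks into (kept, postponed). A task is postponed when its machine
    is due to be taken out of service within `horizon` days after the task.
    `tasks` is a list of (machine, task, day); `retirement_day` maps a machine
    to the day on which it leaves service."""
    kept, postponed = [], []
    for machine, task, day in tasks:
        retire = retirement_day.get(machine)
        if retire is not None and 0 <= retire - day <= horizon:
            postponed.append((machine, task, day))
        else:
            kept.append((machine, task, day))
    return kept, postponed

tasks = [
    ("press-1", "oil change", 40),
    ("press-1", "belt inspection", 10),
    ("lathe-2", "oil change", 40),
]
retirement_day = {"press-1": 50}   # press-1 carries an end of life indicator

kept, postponed = postpone_for_retirement(tasks, retirement_day, horizon=14)
```

Only the oil change on press-1, due ten days before the machine is retired, is postponed; the earlier inspection and the work on the other machine stay in the schedule.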
The intelligent computer integrated maintenance system of the present invention also allows iterative maintenance
operations planning to be performed. For example, strategic planning of maintenance operations for a multi-year period
may be performed in order to determine manpower requirements, spare parts requirements, and actual production
capabilities which include maintenance time. Maintenance operations planning may also be performed for intermediate
range periods such as a yearly period in order to determine parts ordering requirements, manpower availability and the
like. Then, maintenance operations planning may be performed for a short range period such as daily, in order to generate
daily maintenance schedules. Accordingly, maintenance operations planning may be performed in long-range,
intermediate range and short-term iterations.
As described above, the intelligent computer integrated maintenance system of the present invention also includes a parts
manual management subsystem which controls a parts manual file. The parts manual file contains a complete bill of
materials for each production machine. The electronically stored parts manual file does not merely include consumables
or maintenance parts. Rather, it includes all parts in the machine in a hierarchical listing, commonly using 5-6 levels of
parts, so that a complete subsystem description of the machine is available. Preferably, an electronically stored image of
each level is also stored with the listing of parts so that maintenance parts can be identified and repairs are simplified.
According to the invention, all parts in the hierarchical listing are categorized as either "consumable", "replaceable",
"generic" or "non-stocked". Consumable parts are those for which spare parts planning is based on the number of hours
used. For replaceable parts, the mean-time to failure rate versus the actual run time determines the maintenance schedule.
For generic parts such as screws, bulk inventory is maintained and a point of ordering system is used. Finally non-stocked
parts, which are typically not maintenance parts, are typically not stocked and are not ordered until actually needed.
The electronically stored parts manual file may include more than one part number for each part in the system. In
particular, each part may include a "generic parts identifier" or "international part code" to indicate that a generic, often
less expensive industry standard part may be used instead of the manufacturer's specified part number. Also, a "substitute
part number" may be used to indicate that more than one part may be used in the particular maintenance operation. Also a
"changed part number" may be used to indicate that as of a certain date, or other change criteria a revised part number
should be used as part of an "engineering change control" procedure described below. The electronically stored parts
manual may be downloaded to local computers associated with each production machine so that a hierarchical description
of each associated production machine may be found in its associated computer. The electronically stored parts manual
may also be included in a personal computer, on CD-ROM or other storage means. The electronically stored parts manual
may be included in the same computer as the intelligent computer integrated maintenance system or in a separate
computer therefore.
The spare parts inventory management subsystem of the intelligent computer integrated maintenance system allows
ordering of spare parts based on predicted maintenance, rather than on the prescribed inventory levels. Spare parts
budgeting is also accommodated. According to the invention, generic parts are ordered using a conventional order point
system when the inventory quantities fall below a predetermined order point. For replaceable parts, however, the parts
requirements are calculated based on time phased manufacturing requirements and mean-times to failure. The automated
parts manual file is used to extend the production plan to parts replacement. A requirement is generated to replace a part
in the week that it will exceed its mean-time to failure, and order forms for the parts are generated, or the parts may be
ordered electronically.
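The two ordering rules described above can be sketched as follows. This is a minimal illustration, assuming that the production plan is available as planned operating hours per week and that mean-time to failure is expressed in operating hours; the function names are hypothetical.

```python
from typing import List, Optional


def needs_reorder(on_hand: int, order_point: int) -> bool:
    """Conventional order point rule for generic parts:
    reorder once the stock falls below the order point."""
    return on_hand < order_point


def replacement_week(hours_in_service: float, mttf_hours: float,
                     planned_hours_per_week: List[float]) -> Optional[int]:
    """Return the 1-based week in which a replaceable part is expected to
    exceed its mean-time to failure, or None if that does not happen
    within the planning horizon."""
    cumulative = hours_in_service
    for week, hours in enumerate(planned_hours_per_week, start=1):
        cumulative += hours          # extend the production plan to the part
        if cumulative > mttf_hours:
            return week              # a parts requirement is generated for this week
    return None
```

For example, a part with 900 hours in service and a mean-time to failure of 1000 hours, on a machine planned for 40 hours per week, would generate a requirement in week 3.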
The engineering change control management subsystem interfaces with an engineering change control file in the computer
integrated manufacturing system in order to intelligently accommodate engineering changes. The engineering change
control file indicates engineering changes to be made in the production machines in order to upgrade the machines or
reconfigure the production machines to produce new products. This schedule of engineering changes is integrated into the
maintenance schedule management subsystem, the parts manual management subsystem and the spares inventory
management subsystem. For example, at the end of a machine's useful life, scheduled maintenance is postponed or
eliminated. Similarly, maintenance parts are not ordered for these machines even though inventory falls below a
predetermined level, to allow for depletion of inventory when the machine is taken off line. According to the invention,
engineering changes may be phased into maintenance operation by controlling the phase-in by a specified date, by a
specified spare parts inventory level or by assigning engineering changes to be made by a specific maintenance request.
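The three phase-in controls named above (a date, a spare parts inventory level, or a specific maintenance request) can be sketched as a simple check. The class, field names and the "at or below" comparison for stock are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EngineeringChange:
    """Hypothetical engineering change record with its phase-in criteria."""
    change_id: str
    phase_in_date: Optional[date] = None          # phase in on or after this date
    phase_in_stock_level: Optional[int] = None    # phase in once old-part stock is at or below this
    maintenance_request_id: Optional[str] = None  # phase in during this specific request


def change_applies(ec: EngineeringChange, today: date, old_part_stock: int,
                   current_request_id: Optional[str] = None) -> bool:
    """Return True if any configured phase-in criterion is satisfied."""
    if ec.phase_in_date is not None and today >= ec.phase_in_date:
        return True
    if ec.phase_in_stock_level is not None and old_part_stock <= ec.phase_in_stock_level:
        return True
    if (ec.maintenance_request_id is not None
            and current_request_id == ec.maintenance_request_id):
        return True
    return False
```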
The computer integrated maintenance system and method of the present invention allows maintenance operations to be
integrated into production in an intelligent manner. When used, production efficiency is maximized as is the use of
available maintenance manpower. Engineering changes and machine upgrades are easily accommodated and spare parts
inventory is kept at a minimum with minimum waste of spare parts.
The computer integrated maintenance system and method of the present invention need not be used in a production line
environment as described above. Indeed, the computer integrated maintenance system and method of the present invention
need not be used in connection with a computer integrated manufacturing system, or in connection with manufacturing at
all. The computer integrated maintenance system and method of the present invention may be used in connection with
any collection of machines or apparatus which are used to perform a primary or main function and also require
maintenance. Such a collection of machines will be referred to herein as a "complex".
A "complex", according to the present invention, may include a production line as described above. A complex may also
include a plurality of independent machines which are not structurally or functionally interconnected in a production line.
For example, the present invention may be used to control maintenance in a machine shop having many independent
machine tools.
A "complex", according to the present invention, may also include machines which are not related to production or
manufacturing at all. For example, an airplane or automobile fleet operated by an airline, car rental agency,
corporation or government agency is a complex, according to the present invention, because the airplanes or automobiles
have a primary function but also have maintenance requirements. Similarly, a building may include a bank of elevators
which also have maintenance requirements. The present invention may be used to intelligently control airplane,
automobile or elevator maintenance, consistent with the primary function.
During the past two decades advances in CMMS technology have changed forever the face of maintenance management
and how we, as an industry, conduct our business. We now have the ability to automate many of our standard maintenance
processes and to analyze in detail various parts of our businesses and the performance of our equipment. We are able to plan
shutdowns, technical change projects and operational maintenance procedures down to a very fine level of detail. As
maintenance management generally makes up around 40-50% of operational budgets, the savings made possible from
increased efficiency and reduction of waste are staggering.
However, one of the sad realities of this extraordinary rate of change is that business processes have not
kept pace with the advances in technology. As a result, the capabilities of many
CMMS systems far exceed the capabilities of maintenance organizations to fully utilize them. Most companies with a
CMMS, whether a stand-alone MRO system or a sub-module of an EAM or ERP system, are not realizing the full benefits of their
investments.
So with this in mind, what is the future of CMMS systems? How far down the road to the optimal state of maintenance
management can they take us? And furthermore, how far do we want them to take us? Many of the technological
requirements of the future maintenance departments exist today. But, as with the standard functionality of CMMS, the
level of acceptance is not yet there to make this economically viable.
While there are definitely other areas that will also advance markedly, it is within these three areas that there will be the
greatest gains to an organization's bottom line. And with these will come the possibilities of further services not yet
accepted in the world of maintenance.
When we buy a CMMS, or purchase the modules of enterprise-level management systems, what we are really buying is a
philosophy on how to execute our maintenance management. These philosophies have started to converge greatly over the past five
years, and the thin line between ERP and EAM systems is beginning to disappear as far as the functionality of
maintenance modules is concerned. The current systems in each sphere, although extremely advanced, generally lack
functionality covered by the other in areas of maintenance and/or operational planning.
Soon all systems, from a maintenance viewpoint, will be basically Enterprise Management Systems and be generic
enough for effective application both within the capital-intensive industries of mining, oil and gas, defense and utilities
and within the standard ERP spheres such as manufacturing and process line planning and control.
All will be based on standard maintenance procedures and business processes and will have the screens and fields
necessary to support these. For example, the generic process of operational maintenance can be reduced to four steps.
Flexibility will need to be included to ensure that any one of the four paths to execution can be taken, with each step
having its pre-determined series of functional and performance reporting structures built into the programs.
1. Initiate
Via work request systems or other work vetting mechanisms, or directly into the work order streams. This can also be
managed automatically as a result of condition monitoring tasks or as tasks pre-programmed to coincide with machine
hours or other operational statistics.
2. Plan
For work orders that meet or exceed the corporate guidelines on what work orders should be planned.
3. Schedule
For all work orders meeting the corporate criteria for scheduling.
4. Execute
For all work orders, with the capability of accepting data for later use in root cause analysis and maintenance strategy
overviews, as well as general maintenance performance data.
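One possible reading of the "four paths to execution" is that planning and scheduling are each optional between initiation and execution, which gives exactly four routes. The sketch below illustrates that reading; it is an interpretation, not something the text spells out, and the names are hypothetical.

```python
from enum import Enum
from typing import List


class Step(Enum):
    INITIATE = 1
    PLAN = 2
    SCHEDULE = 3
    EXECUTE = 4


def route(needs_planning: bool, needs_scheduling: bool) -> List[Step]:
    """Build the path a work order takes, given the corporate criteria it meets."""
    path = [Step.INITIATE]
    if needs_planning:
        path.append(Step.PLAN)
    if needs_scheduling:
        path.append(Step.SCHEDULE)
    path.append(Step.EXECUTE)       # every work order is eventually executed
    return path


def all_paths() -> List[List[Step]]:
    """Enumerate the four possible paths from initiation to execution."""
    return [route(p, s) for p in (False, True) for s in (False, True)]
```

An emergency job, for instance, might go straight from Initiate to Execute, while a shutdown task would pass through all four steps.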
Although the general work flows always remain the same, the specifics of each installation will sometimes vary
dramatically depending on the particular approach and philosophy to maintenance delivery. As such, the CMMS of the
future will need to have the ability to add or remove fields and screens as required, even to the point of being able to
create user-specific fields.
One of the issues generally surrounding CMMS implementation is the reports available. There can be any number
of reports required for measuring maintenance effectiveness and performance, although general maintenance reporting
and analysis is largely standard. Beyond these, the client will need to be able to create their own
reports quickly and easily, in the format that they choose. This functionality is generally lacking today, and with the advances in
software technology it shouldn't be.
Increased Functionality
The future CMMS will be judged far more harshly than those of today. As a minimum, the following modules or areas of
functionality will be demanded:
From this baseline the functionality demands will be centered around automating the general maintenance processes, for
example automatic weekly scheduling: first the inclusion of the preventative and predictive maintenance tasks that are
required, then automatic inclusion of the corrective maintenance actions in order of priority. Tasks will be measured against the
known human resource levels available to be scheduled, as opposed to all resources available, and then in accordance with
materials availability. Any re-scheduling will be done on the same prioritized basis.
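The scheduling rule just described can be sketched as a simple greedy pass. The task fields, the convention that a lower priority number means more urgent, and the use of a single pool of labor hours are all simplifying assumptions for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    task_id: str
    kind: str                  # "preventive", "predictive" or "corrective"
    priority: int              # assumption: lower number = more urgent
    labor_hours: float
    materials_available: bool


def build_weekly_schedule(tasks: List[Task],
                          available_labor_hours: float) -> Tuple[List[Task], List[Task]]:
    """Return (scheduled, deferred) tasks for the week."""
    def sort_key(t: Task) -> Tuple[int, int]:
        # Preventive and predictive tasks come first, then corrective by priority.
        planned_first = 0 if t.kind in ("preventive", "predictive") else 1
        return (planned_first, t.priority)

    scheduled: List[Task] = []
    deferred: List[Task] = []
    remaining = available_labor_hours   # labor actually available to be scheduled
    for task in sorted(tasks, key=sort_key):
        if task.materials_available and task.labor_hours <= remaining:
            scheduled.append(task)
            remaining -= task.labor_hours
        else:
            deferred.append(task)       # re-offered next cycle on the same basis
    return scheduled, deferred
```

Re-scheduling would simply call the same function again with the deferred tasks and the updated labor and materials situation.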
Further functionality developments will centre around automatic creation of work orders for pre-set equipment conditions, automatic
generation of work orders based on various operational statistics, and so forth.
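A minimal sketch of such automatic work order creation is given below, assuming that equipment conditions and operational statistics arrive as numeric readings and that each rule is a simple upper limit. The rule structure and the work order fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ConditionRule:
    equipment_id: str
    variable: str          # e.g. a vibration level or accumulated engine hours
    limit: float           # pre-set condition: trigger at or above this value
    description: str       # standard description for the resulting work order


def generate_work_orders(rules: List[ConditionRule],
                         readings: Dict[Tuple[str, str], float]) -> List[dict]:
    """Create one work order per rule whose pre-set limit has been reached."""
    orders = []
    for rule in rules:
        value = readings.get((rule.equipment_id, rule.variable))
        if value is not None and value >= rule.limit:
            orders.append({
                "equipment_id": rule.equipment_id,
                "description": rule.description,
                "trigger_variable": rule.variable,
                "trigger_value": value,
            })
    return orders
```

The same mechanism covers both cases in the text: a condition monitoring limit (such as vibration) and an operational statistic (such as machine hours) are just different variables.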
Another drawback of many of today's CMMS is that, while they cater well for the requirements of managing maintenance
in general, they rarely possess the functionality required to optimize maintenance performance. For this reason there is a
wide range of peripheral or additional software for the optimizing of such tasks.
Many of these can be modified to interface with CMMS systems; however, they need to be integrated in such a way that
historical data within the CMMS can become part of an automated decision-making process. These fall into three key
areas, all of which are vital to the progression of maintenance management as well as to the continued reduction of maintenance
costs.
• Root Cause Analysis modules
• Inventory Optimization or Criticality Analysis modules
• In the case of maintenance service providers, built-in Customer Relations Managers
Although the prospect of increased automation of maintenance processes is extremely interesting, the area which holds
my attention most is that of delivery platforms. The platforms of the future will be aimed at reducing hardware requirements, reducing data entry
requirements and creating applications that are more suited to working in the terrain that they are designed to manage.
Of course the principal CMMS delivery platform of the future will be the internet. This technology is still in its infancy
but already shows great promise. Already there are a couple of exclusively internet-based systems providing CMMS services.
Even some of the larger systems have created internet-style versions, although few are true ASPs requiring only an
internet browser to run them.
The next step is to truly integrate these with the wireless devices that are beginning to flood the marketplace. I refer here
to palmtop computers, digital video recorders or even internet-enabled mobile phones. For example, picture the following
scenario.
Joe Mechanic arrives at work for the day and downloads his daily schedule to his palmtop, re-scheduled as required by
plant operating conditions. He has all of the information required to do the work in his palm: materials, procedures, safety
information, special tools required and estimates of durations and total man-hours required. He even has the option of
viewing a training video clip to revise any areas in which he feels his knowledge is lacking.
While working through his first task of the day he finds a problem that wasn't planned for. A quick flick through his palmtop
raises the warehouse requisition for the material required and, if it is available (which it will be, as the inventory has been
optimized by the built-in inventory optimizer), it is delivered to him within 15 minutes of the need occurring to him.
On completion of his task he then punches in, or scans, any relevant failure codes and completion commentary, inclusive
of any tips for doing the task more easily, and then closes or reschedules the work order as required. This data, after any required
revisions, is immediately updated on the work order template for that task, if one exists.
Of course while working on the job at hand, a higher priority task can be sent to his palmtop workstation, or he even has
the option to create further work requests or work orders if he so requires. All without leaving the job site and without the
need for any paperwork at all.
How many times have you seen a CMMS fail due to the poor computer literacy of supervisors or other key users?
In my time working with CMMS systems I have yet to see this problem diminish. People, despite the advances in technology, are
often not inclined to learn basic computer skills, for whatever reason. And in any case, they exist to do what they do
best: fix and maintain equipment in an efficient and safe manner. Why draw them needlessly away from their comfort
zones?
Further advances will eventually arise in the area of barcode usage. Although barcoding is now widely used, there are always
exciting new developments in this area that will grow in acceptance as time passes. An example here would be the use
of bar-coding devices for work order creation and completion. An operator, on noticing a fault, will be able to scan a
series of barcodes relating to the equipment that he is operating. This will then create the work order required with a
standard description and the relevant work order description for that task.
Today this functionality does exist; however, it rarely writes directly to the database system, nor does it operate in real time. In
order to capture all of the data required by operational and conditional decision-making modules, it is necessary to have
some form of interface with plant or equipment operating systems, for example inputs from engine monitoring devices or
inputs from plant and production monitoring DCS systems.
As mentioned, many of the solutions on the marketplace today are able to accommodate this form of data interfacing;
however, they do it in a batch manner. This takes away the ability to monitor operations information that may change
rapidly and require rapid reaction to that change.
The CMMS of the future will need to take this into account and provide for the measurement of process variables in a real
time fashion. Many CMMS systems may even grow to encompass the functionality of today's process management
technologies as well as the increased functionality required for managing 21st century maintenance.
There will also be increases in the use of wireless technologies, for example on-line interfacing with engine monitoring
systems on haulage equipment, thus further augmenting the condition monitoring abilities of the software and its
ability to avoid potentially costly failures.
There also need to be further advances in the use of standard office software. An example here would be the widespread
usage of Gantt chart applications, with the ability to transfer data from the CMMS to the application and then update the
CMMS with any changes.
With these advances in place, the services available in the future are mind-boggling. One that has impressed itself greatly
on me is the possibility of outsourcing the maintenance management function in its entirety, but in a manner far removed
from the basic outsourcing contracts and arrangements in place today.
By the use of a web-based CMMS, a company could easily provide the functions of maintenance planning and scheduling, as
well as root cause analysis and strategy optimization, in an outsourced manner. This could easily reduce operating costs, as
one company could provide these services for any number of sites. In this manner it could also develop
standard work order templates and other maintenance management items that could easily be transferred from one
operation to another, thus reducing the costs of maintenance optimization and system development for each individual
site.
Maintenance call centers could receive work requests or work orders, with their corresponding agreed priorities, via cell
phone, telephone, email or the client's own access to the web-based maintenance management system. They could then
coordinate their work force to best manage the work over all of the sites that they had under their control. In a world
where profits are now very dependent on the ability of corporations to control their operating costs, this could prove
extremely beneficial to entire industrial parks, or even entire industry sectors.
Conclusion
As can be seen, the future of this important tool in the fight to continually reduce costs is both interesting and exciting.
Although many of these developments may take a few years to become reality, the vast majority of the technologies
mentioned here are available today. And with time they will become even more effective and less cost-prohibitive at the
purchase stage. As always, what is required more than anything else is end user acceptance and understanding of the
benefits of such technologies. Once this is achieved, the business of maintenance can achieve a quantum leap in its
state of efficiency.
MAINTAINABILITY PROGRAMS
The approach to a reliability and maintainability (R&M) program depends upon many factors, including the customer's
requirements, the business strategy of the company and the size of the project. The effective implementation of an
R&M Engineering Program must take into consideration these and other factors. Mil-Std-785 and Mil-Std-470 detail
the various tasks associated with a reliability and maintainability engineering program. Careful task selection must be
made for each particular program, to ensure that the reliability and maintainability requirements and objectives are achieved.
How do you determine the reliability of a system, taking into consideration the mission operation profiles? How do you
optimize the reliability (and availability) of a system with respect to the life cycle cost? Where should you focus your
engineering efforts to minimize program cost? These are just a few of many questions that need to be asked and answered
prior to implementing a reliability and/ or maintainability program.
Program requirements can be derived from the customer or the company's business strategy. For example:
Customer's requirements: The customer may request specific R&M engineering tasks to be implemented. These may
include the development of reliability and maintainability models, reliability and maintainability testing, or the collection of
field data. Availability analysis could also be a requirement.
From the viewpoint of a customer, R&M performance characteristics may be critical in terms of their impact upon a
system's availability, safety and cost. In one scenario, the end user of a military product requires a certain amount of
confidence that the product will perform its operational function when required to do so and for a set duration. In another
scenario, a commercial enterprise that releases a product to market with inadequate reliability, whether it is a television
or an automobile, would be severely penalized with warranty costs. It is true that any expected unreliability can
be offset by increasing the warranty charge built into the product price, but this only serves to dull the competitive edge of the
manufacturer's product. The same applies to the maintainability characteristics of a product. Inadequacies here could result
in excessive downtime, affecting the overall availability and/or repair cost to the consumer, and directly impacting the product's
and the company's reputation.
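The link between reliability, maintainability and availability can be made concrete with the standard inherent availability relation, A = MTBF / (MTBF + MTTR). This relation is not given in the text above; it is added here as a minimal sketch, and the figures in the usage note are invented for illustration.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Inherent availability from mean time between failures (reliability)
    and mean time to repair (maintainability)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

For instance, a product with an MTBF of 990 hours and an MTTR of 10 hours has an inherent availability of 0.99. Doubling the repair time to 20 hours lowers it to about 0.98, which shows how a maintainability shortfall alone can erode availability even when reliability is unchanged.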
Company Strategy: The main concern of a company could be to adopt a strategy to develop a reliable and maintainable
product. A key business objective must be to provide a product which is highly reliable and maintainable, as these two
characteristics have a direct impact on a product's operational and maintenance cost to the end user. This is commonly
referred to as the Life Cycle Cost (LCC) of ownership, and can be of great importance to the company, whether it invests in a
single product or in multiple products for more than one customer. The products may range from relatively simple standalone items
such as a television to more complex electronic and mechanical systems, such as a ground-based early warning radar
system.
It should also be realised that these days many government and commercial organisations, when requesting bids, are
asking for LCC data elements to be provided with a proposal submission. The LCC data elements consist of key cost
drivers, such as the reliability and maintainability performance characteristics. It is quite obvious from this that these
organisations are placing the selection criteria emphasis not just on functional performance, but also on R&M and LCC
attributes and characteristics. If reliability and maintainability are not considered and integrated into the product design,
it is highly unlikely that the product will stand out from the competition.
For more complex R&M programs, developing an R&M program plan becomes a must. The
R&M program plan will identify, as a minimum, the following:
Program requirements: Specify what the program R&M requirements are, in terms of quantitative and qualitative
objectives and also the required R&M engineering activities;
Program scope and objectives: Define the limits of the program in terms of scope, detailing the main objectives;
Program Management: Detail the general program management effort, in terms of Work Breakdown Structure (WBS)
and program schedule, showing relationships through organization charts to other key program players and with the
customer etc.;
Program Tasks: Detail the actual R&M program tasks that will be implemented, making reference to the specific
requirements and specifications;
Interface with design engineering and other groups: Identify the interface, particularly where critical inputs are
required. These inputs could be in the form of program milestones and engineering support data;
Interface with the customer or end user: This interface can be captured by working group reviews, telephone hot lines
and status reports;
Interface with Subcontractors: Depending on the involvement and level of effort required from a subcontractor, key
information required may be the Point-of-Contact, their deliverables and the schedule tied to their deliverables; and
Delivery schedules: List the deliverables that are required; these may be the results of reliability and maintainability
analyses and testing.
This field is very wide, and any definition of it risks being either tedious or too general, so I will try to bring readers aboard
the Operational Reliability (OR) ship using some thinking exercises:
1. Think about low reliability and make a list of facts associated with it (take 2 minutes).
2. Read the list created above and, for 3 minutes, try to find somebody in the company who is not involved with these
problems.
3. Take one minute to make a list of people who could benefit from an Operational Reliability Improvement plan.
4. Do you still think that "OR" is just maintenance stuff?
During several workshops we have held, we have collected the following answers:
Question 1:
Facts associated with low reliability.
• Failures
• Losses
• Emergency maintenance
• General dissatisfaction
• Emergency spare parts
• Accidents
• Management dissatisfaction
• Production overtime
• Missed sales orders
• Low production
• High job/staff turnover
• Low productivity
• Low performance
• Low efficiency
• Work-related illness
• Stress among personnel
• Environmental problems
• Legal liability
• Client penalties
• Higher energy consumption
• Union problems
• Outsourcing
• Bad maintenance
• Bad operation
• Lack of training
• General mistrust
• Etc.
Every item in the list above points to a high-value opportunity for improvement.
Who is involved in the above list?
Everybody, from management down, at every level of the organization.
Who could benefit from an Operational Reliability Improvement plan?
Everybody will benefit from such a plan.
Do you still think that OR is just maintenance stuff?
Absolutely not!
By now, we are clear about what OR means and who is involved in it. Companies that insist on confining OR to the
maintenance department are neglecting many aspects that could improve their productivity. On the other hand, those that
accept OR as a collective issue and try to improve continuously gain a series of competitive advantages over the
former. In our experience as worldwide consultants, we have seen that companies that treat OR as a collective topic
get better results from their improvement plans than companies that do not, so most of the failed attempts are found in the
second group.
Let’s look at OR in depth. The next figure illustrates the main idea:
As we can see, OR has four big feeders, and we need to act on all of them if we want to have a long-term continuous
improvement plan. This process, called Operational Reliability Improvement (ORI), generates changes in the organization’s
culture, turning it into a different organization with a broad sense of productivity, a clear business vision and a fact-driven approach.
Every isolated improvement attempt in one of the four feeders of OR may bring benefits; in fact it will. But without taking
into account the other factors, these benefits may well be limited and/or diluted within the organization,
remaining only projects rather than becoming transformations. Typical cases are isolated implementations of
Reliability Centered Maintenance (RCM), which is focused on equipment/system reliability, or of Total Quality Management
(TQM), which is focused on, and powerful in, process/quality reliability.
Things are different in Japanese industrial culture, where aggressive continuous improvement plans use a
mixture of tools. This allowed Japanese companies to keep the right rhythm and generate an industrial revolution in quality. Their TQM
is used together with Total Productive Maintenance (TPM) and visionary plans for improving human reliability, in this way
covering the four factors of OR.
In the western world the story is different. In general we have very well defined boundaries (fenced and mined)
between production, maintenance, human resources, engineering, etc. This isolates the continuous improvement projects,
and they are always bumping into the need for collaboration from their "neighbors". Here we may find the limits (sometimes lethal) of
such projects. Have you ever faced one of these situations? How many times have you heard from maintenance, "If
production worked properly, it would be wonderful"; from production, "That is not my job" (and vice versa); or simply, "That sounds
great, but here it is not possible"?
Well, some companies are daring to do it (fortunately the number is growing), and what looked like a fantasy world is
becoming real for them: a festive teamwork environment involving everyone from maintenance to
engineering, from delivery to purchasing; problems seen as improvement opportunities and
solved according to business impact rather than personal rank; training supplied according to business needs
rather than individual wishes; everyone accepting his or her responsibility for productivity; and the concept of "guilt"
giving way to a stronger one:
Accordingly, what we have seen is an Operational Reliability Improvement (ORI) process, meaning a structured way of
improving every aspect involved in OR. What could be achieved with it? We could talk a lot here, but we can
summarize it all in only two words: Improved Productivity.
"Benchmarking", "Vision/Mission statements", maintenance strategy reviews and "business process reengineering" are
very popular activities these days. Huge amounts of money are invested, and the results are often disappointing or
nonexistent. So, what makes ORI different?
ORI is a flexible tool tailored for companies looking for business excellence and optimal asset management. It is a
fact-based continuous improvement process, achieved through the harmonious use of risk-based tools and techniques. The
companies that have integrated tools, techniques and organizational development have been rewarded with several million dollars
in yearly benefits.
An Operational Reliability process is a mixture of technical solutions, structured thinking, employee motivation and
organizational development, with everything tied together by proven first-hand experience and hard data.
Operational Reliability is based on a common-sense approach towards business excellence. It is not a magical recipe, but
it introduces a systematic approach to eliminating the causes of failure and the bad actors affecting critical processes
and overall company profitability.
It is the workforce that solves the problems and provides the input that assures success. But without management commitment and
involvement even the biggest effort will not succeed. Operational Reliability creates a new role for the manager:
creating the environment in which results can be achieved.
Results may be fabulous. Not only in improved productivity and profitability terms but also in terms of motivation,
attitudes, safety and long term understanding.
Let us see some opinions:
"Operational Reliability is not an initiative - it’s just a better way of running the business. It changes the way the
workforce thinks and acts and provides the reliability tools to help them."
"When we started with Operational Reliability people thought they knew it all. Doing Root Cause Analysis they
realized they had much to learn and they have been learning."
MAINTAINABILITY CRITERIA:
This memorandum is the first deliverable. Its objectives are to define the concept of maintainability, to describe the
factors influencing it and to define criteria by which maintainability can be quantitatively evaluated.
2. - Maintainability
Maintenance is the activity of modifying a software product after initial delivery. Maintainability is the ease with which a
software product can be modified. Maintainability is a requirement of the CEI and SWCI specifications. Its importance
stems from the fact that MDSF will have to evolve and adapt to a changing environment over the next 30 years. Using
sound software engineering principles, the cost of maintenance can be minimized.
Following [LS80], we divide maintenance into three categories:
- corrective maintenance: the correction of faults when the system does not behave according to its specification;
- adaptive maintenance: the adaptation of the system to changes in the operational environment while keeping the same functionality;
- perfective maintenance: the extension of a system's functionality and improvement in the services provided.
3. - Factors
Maintainability is a component of a more general concept, software quality, which is described in terms of a hierarchy (see
figure 1 [EM87]) of factors, criteria and metrics. A factor is a top-level expression of software status for management
reporting. Each factor is described by a set of criteria. Each criterion is measured by a set of metrics. A criterion may
describe more than one factor and some criteria may be measured by the same metric.
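The factor, criterion and metric hierarchy is a many-to-many structure, which a small sketch may make clearer. The criterion and metric names below are hypothetical placeholders chosen for illustration; they are not taken from [EM87].

```python
from typing import Dict, List, Set

# Hypothetical example data: each factor is described by a set of criteria.
CRITERIA_BY_FACTOR: Dict[str, List[str]] = {
    "maintainability": ["modularity", "simplicity"],
    "reliability": ["simplicity", "accuracy"],     # "simplicity" serves two factors
}

# Hypothetical example data: each criterion is measured by a set of metrics.
METRICS_BY_CRITERION: Dict[str, List[str]] = {
    "modularity": ["module_coupling", "module_size"],
    "simplicity": ["cyclomatic_complexity", "module_size"],  # shared metric
    "accuracy": ["numerical_error"],
}


def metrics_for_factor(factor: str) -> Set[str]:
    """Collect every metric that contributes to a factor through its criteria."""
    metrics: Set[str] = set()
    for criterion in CRITERIA_BY_FACTOR.get(factor, []):
        metrics.update(METRICS_BY_CRITERION.get(criterion, []))
    return metrics


def factors_for_criterion(criterion: str) -> List[str]:
    """List the factors that a given criterion helps to describe."""
    return [f for f, crits in CRITERIA_BY_FACTOR.items() if criterion in crits]
```

Storing the two mappings separately keeps shared criteria and shared metrics from being duplicated, which matches the statement that one criterion may describe several factors.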
Fault diagnosis
Rapid diagnosis of faults and problems in equipment is a vitally important contributor to throughput and efficiency in
production environments.
Typically, a few key individuals will be the definitive sources of knowhow for diagnosing faults. This represents a
bottleneck in the fault diagnosis process. What happens if a fault arises and the key people aren't around? "If only we
knew what x would do …." is the plea in countless situations where things go wrong and x can't be located or has just
gone on two weeks' well-earned vacation.
IT Innovation's approach to improving fault diagnosis efficiency is to capture and reuse knowhow that exists in the heads
of the key individuals who really know how a piece of complex equipment works. We make this knowhow accessible to, and
usable by, the equipment operators.
Similar approaches have been taken before in the form of expert systems. The take up of expert systems has not been
great as they've simply provided rigid structures to guide diagnostic analysis. Expert systems haven't provided the
supportive environment needed by skilled operators in order to exercise their own problem solving capabilities.
Our approach provides a rich environment of context-specific supporting information closely integrated with decision
support tools based on descriptions of known fault scenarios.
Kaoru Ishikawa
The Fishbone diagram is one of the many management tools created by Dr. Kaoru Ishikawa. Ishikawa-san was thirty
years old when the "little boy" and "fat man" bombs dropped on two of Japan's cities. Working in the Kawasaki
shipyards, Dr Ishikawa was also witness to Japan's long road towards recovery and rebuilding which required a lot of
hard work coupled with innovation and creativity.
This was when Japan turned to advanced countries such as the United States for ideas and techniques for application in
their own context. Quickly discarding their pre-war biases and prejudices, Japanese businesses embraced all management
concepts developed by the Americans - for there was just no other way to lower their costs and boost their efficiency.
From an importer of knowledge and ideas, Japan became an exporter of the same when Dr. Kaoru Ishikawa's inventions
and contributions in the management field began to be adopted by management and businesses throughout the world.
Simply adopting and blindly implementing American management concepts was not enough. They had to be
suitably molded and seamlessly blended with the traditional Sogo shosha and Keiretsu business styles, one of whose
unique features is collaborative effort expended through small groups of people. This exercise took a lot of churn and
simmer. Out of this process have emerged several of the groundbreaking ideas and concepts of Japanese management; and
they have taken up a special place in the world of management science today. One such tool in the realm of root cause
analysis is Dr. Ishikawa's Fishbone diagram, also known as the
Cause and Effect Diagram, as well as the eponymous Ishikawa Diagram.
As with any brilliant idea, the basic foundation of the Fishbone is extremely simple and practical. Used and understood
even by non-specialists, the Cause and Effect diagram is used as a team brainstorming tool to provoke, tease and evoke
more and more ideas and issues (causes) to be captured that can go into any particular conclusion (effect) being reached.
When finished, after a few iterations of analyses, the diagram identifies and explains in a graphical format all the possible
causes of a particular effect. All the possible causes are depicted at various levels of detail in connected branches. The
level of detail increases as the branch goes outward, which means that an outer branch is a cause of the inner branch that it
is attached to. This means that the outermost branches indicate the root causes of the problem.
While drawing the chart, care is taken to have the inner branches meet a horizontal straight line, called the "spine" of the
chart. The statement of the problem - or the effect - is to the right of the spine inside a box, which makes it look like the
head of a fish. When finished, the entire map resembles a fishbone.
The mandate for the collaborating team, when they sit down across the table to draw the Ishikawa diagram, is to focus on
why the problem occurs. There is no effort to look at the history or symptoms of the problem, or anything else that might
digress from the intent of the session. When the team comprises members from different departments or functions, each of
them provides their own specialist view about why the problem (the "fish-head") occurs. It might be discovered through
this brainstorming session that there are causes common across two or more departments or functions, or perhaps that some
causes permeate the entire organization. Thus, in one single snapshot, the top management gets to see exactly why the
problem is likely to be occurring.
Usually, this is how a typical Ishikawa Diagram drawing-and-analysis session pans out:
• First, a large writing area is put up in the center where everybody can see it. This writing area could be a flipchart or a
whiteboard.
• The problem that needs to be addressed is defined. All team members have to be very clear about what exactly the
problem is. The problem statement is described clearly and succinctly in the fish head portion.
• To set the ball rolling and to ensure logical control over the brainstorming process, the following fundamental blocks
are listed to begin with: manpower, machines, methods, materials and environment - in case of a problem related to
manufacturing; and equipment, policies, procedures and people - in case the problem facing the team relates to
administration and service. Of course, when listing them, it should be clarified that these blocks are suggestive and
not exhaustive. These blocks along with any other identified are major branches connecting to the spine.
• Each member of the team then gets a chance to come up with what they think is the cause of the problem. Per turn,
only one cause may be contributed by every member, else they simply "pass" if they can't think of any cause in any
particular round.
• Each cause thus identified is then "hung" on the branch of the category that it belongs to. For example, if "Moisture
Content" is a major cause, then "Dryer's RPM" is a cause that is hung on to moisture content.
• In case the cause happens to be the cause of another cause which is already present, then it must be hung on the
branch of the latter. For instance, "Materials" is a major branch that goes to the spine of the problem of "Recurrent
pipe leakage". "Defective measurement tools" is a branch that connects to materials. "Lack of suppliers" or
"Substandard supply of tools" is a cause that hangs on to defective measurement tools.
• Pareto Principle
• Scatter Plots
• Control Charts
• Flow Charts
• Cause and Effect , Fishbone, Ishikawa Diagram
• Histogram or Bar Graph
• Check Lists
• Check Sheets
Pareto Principle
The Pareto principle suggests that most effects come from relatively few causes. In quantitative terms: 80% of the
problems come from 20% of the causes (machines, raw materials, operators etc.); 80% of the wealth is owned by 20% of
the people etc. Therefore effort aimed at the right 20% can solve 80% of the problems. Double (back to back) Pareto
charts can be used to compare 'before and after' situations. Use: to decide where to apply initial effort for
maximum effect.
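The 80/20 reasoning above can be sketched in a few lines of Python. This is only an illustration: the function names `pareto_table` and `vital_few`, the 80% threshold parameter and the breakdown counts are all assumptions made for the example, not data from any plant.

```python
def pareto_table(counts):
    """Sort causes by frequency and attach cumulative percentages."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    table, running = [], 0
    for cause, count in ranked:
        running += count
        table.append((cause, count, 100.0 * running / total))
    return table


def vital_few(counts, threshold=80.0):
    """Top-ranked causes whose cumulative share first reaches the threshold."""
    selected = []
    for cause, _count, cumulative in pareto_table(counts):
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical breakdown counts per cause, for illustration only.
breakdowns = {"bearing": 42, "seal": 31, "coupling": 9, "motor": 7,
              "belt": 5, "gearbox": 3, "wiring": 2, "other": 1}

for cause, count, cumulative in pareto_table(breakdowns):
    print(f"{cause:10s} {count:3d} {cumulative:6.1f}%")
print("Apply initial effort to:", vital_few(breakdowns))
```

The sorted counts are the bars of a Pareto chart and the cumulative percentages are its line; the "vital few" are simply the causes to the left of the point where the line crosses the chosen threshold.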
Scatter Plots
A scatter plot is effectively a line graph with no line - i.e. the point intersections between the two data sets are plotted but
no attempt is made to physically draw a line. The Y axis is conventionally used for the characteristic whose behaviour we
would like to predict. Use: to explore the relationship, if any, between two variables.
Warning: There may appear to be a relationship on the plot when in reality there is none, or both variables actually relate
independently to a third variable.
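The strength of the straight-line relationship seen on a scatter plot is often summarised by the Pearson correlation coefficient. The sketch below is a minimal illustration; the function name `pearson_r` and the temperature/thickness readings are hypothetical. As the warning above says, a coefficient close to +1 or -1 does not by itself show that one variable causes the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical readings: oven temperature (X) and coating thickness (Y).
temperature = [150, 160, 170, 180, 190]
thickness = [2.1, 2.4, 2.6, 3.0, 3.1]
r = pearson_r(temperature, thickness)
print(f"r = {r:.3f}")
```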
Control Charts
Control charts are a method of Statistical Process Control, SPC. (Control system for production processes). They enable
the control of distribution of variation rather than attempting to control each individual variation. Upper and lower control
and tolerance limits are calculated for a process and sampled measures are regularly plotted about a central line between
the two sets of limits. The plotted line corresponds to the stability/trend of the process. Action can be taken based on trend
rather than on individual variation. This prevents over-correction/compensation for random variation, which would lead to
many rejects.
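The idea of acting only when a point falls outside calculated limits can be sketched as follows. This is a deliberately simplified illustration: it places the limits three standard deviations either side of the mean of a baseline sample, whereas practical X-bar and R charts usually derive limits from subgroup ranges and tabulated constants. The function names and the measurements are hypothetical.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower limit, centre line, upper limit) from baseline data."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    return center - k * sigma, center, center + k * sigma


def out_of_control(samples, limits):
    """Indices of samples falling outside the control limits."""
    lower, _center, upper = limits
    return [i for i, x in enumerate(samples) if x < lower or x > upper]


# Hypothetical shaft diameters (mm) from a period when the process was stable.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
limits = control_limits(baseline)

# New measurements are judged against the limits, not against each other.
new_samples = [10.1, 9.7, 10.6]
print("limits:", tuple(round(v, 3) for v in limits))
print("points needing action:", out_of_control(new_samples, limits))
```

Points inside the limits are treated as random variation and left alone, which is what prevents the over-correction described above.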
Flow Charts
Pictures, symbols or text are coupled with lines; arrows on the lines show the direction of flow. Enables modelling of processes;
problems/opportunities and decision points etc. Develops a common understanding of a process by those involved. No
particular standardisation of symbology, so communication to a different audience may require considerable time and
explanation.
Cause and Effect Diagram
The cause-and-effect diagram is a method for analysing process dispersion. The diagram's purpose is to relate causes and
effects. There are three basic types: dispersion analysis, process classification and cause enumeration. Effect = problem to be
resolved, opportunity to be grasped, result to be achieved. Excellent for capturing team brainstorming output and for
filling in from the 'wide picture'. Helps organise and relate factors, providing a sequential view. Deals with time direction
but not quantity. Can become very complex. Can be difficult to identify or demonstrate interrelationships.
Histogram
A Histogram is a graphic summary of variation in a set of data. It enables us to see patterns that are difficult to see in a
simple table of numbers. Can be analysed to draw conclusions about the data set.
A histogram is a graph in which the continuous variable is clustered into categories and the value of each cluster is plotted
to give a series of bars. The example (figure not reproduced) reveals the skewed distribution of a set of product measurements that
remain nevertheless within specified limits. Without using some form of graphic this kind of problem can be difficult to
analyse, recognise or identify.
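Clustering a continuous variable into categories is a simple counting exercise. The sketch below is a minimal illustration with a hypothetical function name and hypothetical measurements; it uses equal-width classes and ignores values outside the stated range.

```python
def histogram(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width classes."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v <= high:                      # values outside are ignored
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    return counts


# Hypothetical product measurements, all within limits of 0 to 10.
measurements = [1, 2, 2, 3, 3, 3, 9]
counts = histogram(measurements, low=0, high=10, bins=5)

# A text rendering of the bars: one '#' per observation in the class.
for i, count in enumerate(counts):
    print(f"{i * 2:2d}-{i * 2 + 2:2d} | {'#' * count}")
```

Even this crude rendering shows a skew towards the low end of the range that is hard to spot in the raw list of numbers.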
Check Sheets
A Check Sheet is a data recording form designed so that results can be readily interpreted from the form itself. It needs to
be designed for the specific data it is to gather. Used for the collection of quantitative or qualitative repetitive data.
Adaptable to different data gathering situations. Minimal interpretation of results required. Easy and quick to use. No
control for various forms of bias - exclusion, interaction, perception, operational, non-response, estimation.
RELIABILITY
Reliability engineering is the technology concerned with prediction, control and continuous improvement in materials and
technology, and thus with the continuous reduction of equipment failure rates. Reliability differs from quality in that it
places more emphasis on the activities of design, manufacturing and operation in the field. In industry generally,
reliability does not necessarily mean failure-free operation. Of course, failure-free operation is important for
one-shot devices (missiles, unmanned spacecraft) and for systems such as aircraft, high-hazard equipment or life-saving
components.
The concern about reliability can be felt from the comment of an astronaut:
“The most nerve-racking part of any space flight is the fact that your life depends upon thousands of critical parts, each
produced probably by the lowest bidder.”
In the measurement sense, reliability is the precision of a measurement, as measured by the variance of repeated
measurements of the same object.
“Engineering reliability is the probability that a product, device or equipment will give failure-free performance of its
intended functions for the required duration of time.”
DESIGN ASPECTS FOR RELIABILITY IMPROVEMENT OF INDUSTRIAL EQUIPMENT ARE:
Fewer parts increase the reliability of an equipment or system, as does using a supplier's proven, standard components
instead of asking for special or tailor-made ones. This may call for some over-design or redesign but may prove
cheaper overall.
3} Derating of Equipment:
A 50 ton electric arc furnace at the Alloy Steel Plant, Durgapur, was derated and supplied as a 40 ton furnace; electric
motors are also often derated for reliability.
Making the design such that using or fitting parts incorrectly is very difficult also improves reliability.
Identifying critical components or parts with low reliability and taking the necessary action is another main task of
reliability improvement. The 80-20 concept can be applied here as well, i.e. 20% of the parts account for 80% of the
failures/problems.
"Operational Reliability is basically not an initiative; it is just a better way of running the business. It changes the way
the workforce thinks and acts, and provides the reliability tools to help them."
Reliability improvement is a continuous engineering process. It involves an enormous amount of data collection (from operating
equipment, service equipment etc.). As failures are of three types (early failures, chance failures and wear-out
failures), their analysis must be done in the right perspective.
(i) The reliability programme starts in the conceptual phase of the product or equipment and continues
through the design, development, production, testing, field evaluation and service stages.
(ii) Adequate management and organisational support should be there. Involvement of all department units
that affect reliability is essential.
(iii) A proper failure reporting system from all concerned agencies has to be built up. Necessary signal measuring
devices should be installed and their feedback monitored.
(iv) Proper action plans, specifying responsibilities, procedures, schedules and budgets (if necessary), are to be
issued and followed up.
(v) The execution of the programme is to be monitored and deviations reported so that corrective action can be taken.
Reliability is the probability that a product, device or equipment will give failure-free performance of its intended functions
for the required duration of time.
Reliability improvement is a continuous engineering process. It involves an enormous amount of data collection and
analysis, especially with respect to failure modes and stresses. As failures are of three types (early failures, chance failures
and wear-out failures), their analysis must be done in the right perspective.
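One common way to model the three failure types is the Weibull distribution, whose shape parameter decides whether the failure rate falls, stays constant or rises with age. The Weibull model is brought in here purely as an illustration (the text above does not name it), and the parameter values in the sketch are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Failure rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical parameters: scale of 100 hours, three different shapes.
phases = {
    "early failure (shape < 1)": 0.5,     # failure rate falls with age
    "chance failure (shape = 1)": 1.0,    # failure rate is constant
    "wear-out failure (shape > 1)": 3.0,  # failure rate rises with age
}
for name, shape in phases.items():
    young = weibull_hazard(10, shape, 100)
    old = weibull_hazard(100, shape, 100)
    print(f"{name}: h(10) = {young:.4f}, h(100) = {old:.4f}")
```

Fitting the shape parameter to recorded failure times is one way of judging which of the three types dominates, and therefore which corrective action (burn-in, redundancy or planned replacement) is likely to help.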
IMPROVEMENT OF COMPONENT:
We can use superior components and parts with low failure rates. However, we
would immediately realize that components of high reliability require more time and money for development. They
may also be larger in size and weight. Generally the objective is not merely to produce a system with the highest reliability, but to
evolve a system which reflects an optimum total cost. The major items contributing to total cost are research and
development, production, spares and maintenance. Production facilities must be sufficiently sophisticated to enable the
manufacture of precision components, with the result that production cost also increases with the requirement for greater
reliability. On the other hand, the cost of maintenance and spares reduces as reliability increases. The
objective in the majority of designs is to attain this optimum cost. However, reliability assumes greater
significance when the goal is not so much cost as meeting a set mission requirement for the unit or equipment.
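Example (figure not reproduced): a pump stage, a valve and a cylinder connected in series.
Pump 1: Ps (P1) = 70%, Pf (P1) = 30%
The 91% figure for the pump stage corresponds to two such pumps in parallel: 1 - (30% × 30%) = 91%.
System reliability = pump stage × valve × cylinder
= 91% × 90% × 80%
= 66%
With a single pump the same product would be 70% × 90% × 80% = 50%. Thus, by redundancy, the system reliability can be improved.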
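The arithmetic of the example can be checked with a short Python sketch. It assumes that the 91% pump stage arises from two 70% pumps in parallel, that the valve and cylinder reliabilities are 90% and 80%, and that all failures are independent; the function names `series` and `parallel` are illustrative.

```python
from math import prod


def series(reliabilities):
    """All elements must work: multiply the individual reliabilities."""
    return prod(reliabilities)


def parallel(reliabilities):
    """The stage fails only if every element fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


pump = 0.70
valve, cylinder = 0.90, 0.80   # assumed from the 91% x 90% x 80% product

without_redundancy = series([pump, valve, cylinder])
pump_stage = parallel([pump, pump])            # two pumps in parallel
with_redundancy = series([pump_stage, valve, cylinder])

print(f"single pump: {without_redundancy:.2%}")
print(f"pump stage with redundancy: {pump_stage:.2%}")
print(f"system with redundancy: {with_redundancy:.2%}")
```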
In addition to cost and space limitations, there are some additional constraints on reliability through redundancy, such as:
Parallel equipment is sometimes connected through a changeover switch (for automatic changeover), which may not be
fail-proof and may introduce another reliability factor.
With duplication or triplication of components, failed or non-working components may adversely affect the working
components (e.g. possible internal leakage through failed or non-working hydraulic valves or pumps, which may cause
malfunction).
If the state of the art is such that either it is not possible to produce highly reliable components or the cost of producing
such components is very high, we can improve the system reliability by introducing redundancies. This
involves the deliberate creation of new parallel paths in a system.
There are many methods of introducing redundancies in a system. A few of these are considered below.
Standby redundancy:
Another type of redundancy that can be introduced in a system is standby redundancy. In the two-element
parallel system used for comparison, all the channels or paths are active from the beginning of operation until the
system fails. In a standby system, not all paths are active at the same time.
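The difference between the two arrangements can be illustrated numerically. The sketch below rests on assumptions that the text does not state: two identical units with a constant failure rate, a spare that cannot fail while idle, and a perfectly reliable changeover. Under those assumptions the standard results are R = 2e^(-λt) - e^(-2λt) for the active parallel pair and R = e^(-λt)(1 + λt) for the standby pair.

```python
import math


def active_parallel(rate, t):
    """Two identical units, both running from the start."""
    return 2 * math.exp(-rate * t) - math.exp(-2 * rate * t)


def standby(rate, t):
    """One unit running, one idle spare, perfect changeover assumed."""
    return math.exp(-rate * t) * (1 + rate * t)


# Hypothetical unit: one failure per 1000 hours on average.
rate = 0.001
for hours in (100, 500, 1000):
    print(f"t = {hours:4d} h  active parallel = {active_parallel(rate, hours):.4f}"
          f"  standby = {standby(rate, hours):.4f}")
```

The standby figure is the higher of the two only because the switch is assumed perfect; a changeover switch that can itself fail, as noted earlier, reduces this advantage.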
OPTIMIZATION:
The reliability of a system can be improved considerably by introducing redundancy either in the subsystem
or in the element. It can also be shown that element or component redundancy is superior to subsystem or unit
redundancy.
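The comparison can be illustrated for a chain of n identical elements, each of reliability r, duplicated either as a whole unit or element by element. The sketch assumes independent failures and ideal connections between elements; the function names and the values r = 0.9, n = 3 are hypothetical.

```python
def unit_redundancy(r, n):
    """Two complete chains of n elements placed in parallel."""
    chain = r ** n
    return 1 - (1 - chain) ** 2


def component_redundancy(r, n):
    """Each of the n elements duplicated, then the pairs placed in series."""
    pair = 1 - (1 - r) ** 2
    return pair ** n


r, n = 0.9, 3
print(f"no redundancy:        {r ** n:.4f}")
print(f"unit redundancy:      {unit_redundancy(r, n):.4f}")
print(f"component redundancy: {component_redundancy(r, n):.4f}")
```

Both arrangements use the same number of elements, yet duplicating at element level gives the higher figure, because a single failed element no longer takes a whole chain out of service.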
Maintainability Criteria:
Executive Summary:
Faced with shrinking maintenance budgets and increasingly competitive markets, maintainability is an issue of growing
importance for many companies. Although maintainability is not a “new” concept, many companies struggle with
consistent, standardized maintenance input during the project delivery process. An important characteristic of any design,
maintainability pertains to the ease, accuracy, safety, and economy in the performance of maintenance actions. This
research examines the opportunities available through the effective inclusion of maintainability concepts during the
project delivery process.
The Construction Industry Institute (CII) defines maintainability as the optimum use of facility maintenance knowledge
and experience in the design/engineering of a facility that meets project objectives (Constructability Implementation
Guide 1993). In this context, maintainability refers to a formal process to include relevant maintenance input during all
phases of the facility delivery process. The Maintainability Research Team adopted a format similar to constructability for
its research methodology and developed model process.
The investigation is limited to maintainability activities during six phases of the project delivery process: (1) planning; (2)
design; (3) procurement; (4) construction; (5) start-up; and (6) operations and maintenance. This research surveyed a
broad cross-section of companies engaged in many different types of construction, ranging from general building to
petrochemical. Capital and retrofit projects for equipment, systems, and facilities were included in this research. As
maintainability most directly impacts the owner of constructed projects, this research focused on owner organizations.
Conclusions: Implementation of a formal maintainability process involves a fundamental shift in the role of maintenance,
from a “necessary evil”
to a value adding activity, in the project delivery process. Maintenance helps achieve and sustain optimum reliability and
performance for all projects. Formal maintainability programs provide benefits to both owner and contractor
organizations. Owners benefit from improved control over maintenance costs and improved facility availability. Designers
and constructors can increase client satisfaction and use success with a maintainability process as a value-adding service
for owner clients.
Recommendations: Each company must assess the need for maintainability on future projects and then determine the
appropriate level of
maintainability efforts. Development of the formal process should reflect the organizational need, with the purpose of
ensuring maintainability objectives are met. A maintainability process has the potential for greatest (and most cost
effective) impact if it can be integrated with existing company work processes and related improvement initiatives, such
as Total Quality Management, etc.
Maintainability Program Plan
Overview
The primary purpose of the 'Maintainability Program Plan' is to improve operational readiness, reduce maintenance
manpower needs, reduce system life cycle cost and provide data essential for management.
The objective shall be to ensure attainment of the maintainability requirements of the acquisition.
The maintainability aspect during the system's development is extremely important, and it is vital that suppliers are aware of
their responsibilities in this respect, as the results can have serious effects for the user. The maintainability requirements
must be expressed as definitively as possible. The requirements shall apply to planned maintenance in the support
environment and shall be expressed in quantitative terms:
• time (e.g., turn around time, time to repair, time between maintenance actions);
• rate (e.g., maintenance hours per operating hours, frequency of preventative maintenance);
• complexity (e.g., number of people and skill levels, variety of support equipment).
The expectation of carrying out repairs in the field by substitution of components (e.g., the replacement of a faulty card or
module in an electronic item) shall be defined.
Ishikawa diagram
Definition:
A graphic tool used to explore and display opinion about sources of variation in a process. (Also called a Cause-and-
Effect or Fishbone Diagram.)
Purpose:
To arrive at a few key sources that contribute most significantly to the problem being examined. These sources are then
targeted for improvement. The diagram also illustrates the relationships among the wide variety of possible contributors
to the effect.
The figure below shows a simple Ishikawa diagram. Note that this tool is referred to by several different names: Ishikawa
diagram, Cause-and-Effect diagram, Fishbone diagram, and Root Cause Analysis. The first name is after the inventor of
the tool, Kaoru Ishikawa (1969) who first used the technique in the 1960s.
The basic concept in the Cause-and-Effect diagram is that the name of a basic problem of interest is entered at the right of
the diagram at the end of the main "bone". The main possible causes of the problem (the effect) are drawn as bones off
of the main backbone. The "Four-M" categories are typically used as a starting point: "Materials", "Machines",
"Manpower", and "Methods". Different names can be chosen to suit the problem at hand, or these general categories can
be revised. The key is to have three to six main categories that encompass all possible influences. Brainstorming is
typically done to add possible causes to the main "bones" and more specific causes to the "bones" on the main "bones".
This subdivision into ever increasing specificity continues as long as the problem areas can be further subdivided. The
practical maximum depth of this tree is usually about four or five levels. When the fishbone is complete, one has a rather
complete picture of all the possibilities about what could be the root cause for the designated problem.
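The bones-on-bones structure described above is a tree, and it can be held in a nested data structure. The Python sketch below is only an illustration: it reuses the "Recurrent pipe leakage" example given earlier in these notes, and the nested-dictionary layout and the function name `root_causes` are assumptions, not part of the Ishikawa method itself.

```python
# Fishbone as a nested dictionary: each key is a cause, each value holds
# the more specific causes hung on it. The layout is an illustrative choice.
fishbone = {
    "Recurrent pipe leakage": {             # the effect (fish head)
        "Materials": {                      # main bone on the backbone
            "Defective measurement tools": {
                "Lack of suppliers": {},
                "Substandard supply of tools": {},
            },
        },
    },
}


def root_causes(tree, path=()):
    """Return the outermost bones (candidate root causes) with their paths."""
    leaves = []
    for cause, sub_causes in tree.items():
        here = path + (cause,)
        if sub_causes:
            leaves.extend(root_causes(sub_causes, here))
        else:
            leaves.append(here)
    return leaves


# Each printed chain reads from the effect out to a candidate root cause.
for chain in root_causes(fishbone):
    print(" <- ".join(chain))
```

Listing the outermost bones in this way gives the team the set of candidates from which the most likely root causes are then chosen by discussion.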
The Cause-and-Effect diagram can be used by individuals or teams; probably most effectively by a group. A typical
utilization is the drawing of a diagram on a blackboard by a team leader who first presents the main problem and asks for
assistance from the group to determine the main causes which are subsequently drawn on the board as the main bones of
the diagram. The team assists by making suggestions and, eventually, the entire cause and effect diagram is filled out.
Once the entire fishbone is complete, team discussion takes place to decide what the most likely root causes of the
problem are. These causes are circled to indicate items that should be acted upon, and the use of the tool is complete.
The Ishikawa diagram, like most quality tools, is a visualization and knowledge organization tool. Simply collecting the
ideas of a group in a systematic way facilitates the understanding and ultimate diagnosis of the problem. Several computer
tools have been created for assisting in creating Ishikawa diagrams. A tool created by the Japanese Union of Scientists and
Engineers (JUSE) provides a rather rigid tool with a limited number of bones. Other similar tools can be created using
various commercial tools.
Only one tool has been created that adds computer analysis to the fishbone. Bourne et al. (1991) reported using Dempster-
Shafer theory (Shafer and Logan, 1987) to systematically organize the beliefs about the various causes that contribute to
the main problem. Based on the idea that the main problem has a total belief of one, each remaining bone has a belief
assigned to it based on several factors; these include the history of problems of a given bone, events and their causal
relationship to the bone, and the belief of the user of the tool about the likelihood that any particular bone is the cause of
the problem.
How to Construct:
5. Combine each bone in turn, ensuring that the process variables are specific, measurable, and controllable. If they are
not, branch or "explode" the process variables until the ends of the branches are specific, measurable, and
controllable.
Cause & Effect Diagram:
The cause & effect diagram is the brainchild of Kaoru Ishikawa, who pioneered quality management processes in the
Kawasaki shipyards, and in the process became one of the founding fathers of modern management. The cause and effect
diagram is used to explore all the potential or real causes (or inputs) that result in a single effect (or output). Causes are
arranged according to their level of importance or detail, resulting in a depiction of relationships and hierarchy of events.
This can help you search for root causes, identify areas where there may be problems, and compare the relative
importance of different causes.
Causes in a cause & effect diagram are frequently arranged into four major categories. While these categories can be
anything, you will often see:
• Manpower, methods, materials, and machinery (recommended for manufacturing)
• Equipment, policies, procedures, and people (recommended for administration and service).
These guidelines can be helpful but should not be used if they limit the diagram or are inappropriate. The categories you
use should suit your needs. At Sky Mark, we often create the branches of the cause and effect tree from the titles of the
affinity sets in a preceding affinity diagram. The C&E diagram is also known as the fishbone diagram because it was
drawn to resemble the skeleton of a fish, with the main causal categories drawn as "bones" attached to the spine of the
fish, as shown below.
Cause & effect diagrams can also be drawn as tree diagrams, resembling a tree turned on its side. From a single outcome
or trunk, branches extend that represent major categories of inputs or causes that create that single outcome. These large
branches then lead to smaller and smaller branches of causes all the way down to twigs at the ends. The tree structure has
an advantage over the fishbone-style diagram. As a fishbone diagram becomes more and more complex, it becomes
difficult to find and compare items that are the same distance from the effect because they are dispersed over the diagram.
With the tree structure, all items on the same causal level are aligned vertically.
To successfully build a cause and effect diagram:
Other uses for the Cause and Effect tool include organization diagramming, parts hierarchies, project planning, tree
diagrams, and the 5 Why's.
In the very early 1900’s, an Italian economist by the name of Vilfredo Pareto created a mathematical formula
describing the unequal distribution of wealth he observed and measured in his country: Pareto observed that roughly
twenty percent of the people controlled or owned eighty percent of the wealth. In the late 1940s, Dr. Joseph M. Juran, a
Quality Management pioneer, attributed the 80/20 Rule to Pareto, calling it Pareto's Principle. While some may claim that
Juran’s broad attribution of this scientific observation to Pareto is inaccurate, Pareto's Principle or “Pareto's Law” as it is
sometimes called, can be a very effective business tool – one that can help us manage more effectively.
The example below is from Dale H. Besterfield's book Quality Control, Sixth Edition, which includes a CD of
Excel macros.
Paint Nonconformities
How Pareto’s Principle Can Help Us
The value of the Pareto Principle in management is in reminding us to stay focused on the “20 percent that matters”. Of all
the tasks performed throughout the day, one could say (based on Pareto’s Principle) that only 20 percent really matter.
Those tasks in the 20 percent very likely will produce 80 percent of our results. Thus, it’s critical that we identify and
focus on those things. When the fire drills surrounding the “crisis of the day” begin to eat up precious time, remind
yourself of the critical 20 percent you need to focus on. If anything in the list of activities and action items has to fall by
the wayside – left undone – be sure it isn’t listed in that critical 20 percent.
DEFINITIONS OF MAINTAINABILITY:
Maintainability can be defined as the characteristic of equipment design and installation which is expressed in
terms of ease and economy of maintenance, availability of equipment, safety and accuracy of performance parameters of
equipment.
Its aim is to design and develop a system or equipment that can be easily maintained at a reasonable cost with minimum
resources, without affecting the performance and safety of equipment.
Maintainability is associated with the design of the assets to be maintained. It is a measure of the ease of
maintenance; the parameter for expressing maintainability is Mean Time to Repair (MTTR).
The concept of maintainability is different from reliability. Reliability is the probability that an asset or a system will
operate satisfactorily for some determined period of time; the parameter expressing reliability is Failure Rate (FR) or
Mean Time Between Failures (MTBF).
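The parameters named above can be computed from ordinary maintenance records, and MTBF and MTTR together give the commonly used inherent availability, MTBF / (MTBF + MTTR). The sketch below is an illustration with hypothetical figures and function names; taking the failure rate as 1 / MTBF assumes a constant failure rate.

```python
def mtbf(operating_hours, failures):
    """Mean Time Between Failures: operating time per failure."""
    return operating_hours / failures


def mttr(repair_hours, repairs):
    """Mean Time to Repair: repair time per repair action."""
    return repair_hours / repairs


def failure_rate(mtbf_hours):
    """Failures per operating hour, assuming a constant failure rate."""
    return 1.0 / mtbf_hours


def inherent_availability(mtbf_hours, mttr_hours):
    """Fraction of time the equipment is available, counting repairs only."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical yearly record for one machine.
machine_mtbf = mtbf(operating_hours=4000, failures=8)   # 500 hours
machine_mttr = mttr(repair_hours=40, repairs=8)         # 5 hours
print(f"MTBF = {machine_mtbf:.0f} h, MTTR = {machine_mttr:.0f} h")
print(f"failure rate = {failure_rate(machine_mtbf):.4f} per hour")
print(f"availability = {inherent_availability(machine_mtbf, machine_mttr):.4f}")
```

The same availability can be raised either by improving reliability (a larger MTBF) or by improving maintainability (a smaller MTTR), which is why the two concepts are kept distinct.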
• Maintainability is not a factor - this is a short term tactical solution. It is agreed that the system will be replaced or
rewritten before maintenance costs become a problem. It is particularly important that any decision to build a
tactical solution is documented as there is a tendency for such systems to become long term corporate business
systems.
• Maintainability is key - this is a long term system which needs to be easily maintained. Here the criterion for
deployment is not just to provide required business functionality in a robust way, but that the design and code
meet a maintainability standard before the system is accepted and released to the business. This means that
activities such as documentation and tested production should not be allowed to be cut out if time runs short in a
time box. It could mean designing the system so that logic is externalized so that it can be easily changed during
maintenance activities. This may mean that time boxes need to be a little longer than usual.
• Maintainability will be built in later - the business priority is to elicit and implement required functionality
quickly. The system needs a long life and to be maintainable, but the business is prepared to pay for subsequent
(behind the scenes) re-engineering after implementation. This means a greater development cost than engineering
for maintainability first time, but gives a quicker initial delivery, and may produce a lower lifetime ownership cost
than struggling for years with maintenance problems. (This is often the case where time to market is critical -
either in software for sale into a fast moving market, or software to satisfy a fast moving business).
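The second option above mentions designing the system so that logic is externalized and can be changed easily during maintenance. One possible reading of that idea is sketched below: a business rule is held as data (here a JSON string standing in for a configuration file) rather than hard-coded. The rule, the role names, and the function names are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical externalized rule set; in practice this might live in a file
# or a database table that maintainers can edit without touching the code.
RULES_JSON = """
{"approval_limits": [
  {"role": "clerk", "max_amount": 500},
  {"role": "manager", "max_amount": 5000}
]}
"""


def load_limits(text):
    """Parse the externalized rules into a role -> limit mapping."""
    data = json.loads(text)
    return {rule["role"]: rule["max_amount"] for rule in data["approval_limits"]}


def can_approve(limits, role, amount):
    """Apply the rule: a known role may approve amounts up to its limit."""
    return role in limits and amount <= limits[role]


limits = load_limits(RULES_JSON)
print(can_approve(limits, "clerk", 400))
print(can_approve(limits, "clerk", 600))
```

With this layout, changing a limit or adding a role is a data change rather than a code change, which is the kind of maintenance activity the text has in mind.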
Maintenance can be said to start after the first increment of a system has been delivered. In most cases, any maintenance
on this will need to be undertaken by the DSDM development team during the second increment. If the system is not
maintainable, then the second increment will be slowed down at this stage. Any requirements not covered during the main
part of the development lifecycle because they were prioritized out due to time boxing or using MoSCoW are often held
over to be considered for future work done by the maintenance team.
However, maintenance is usually considered to begin post implementation. It should not make any difference whether this
process is undertaken by a separate maintenance team or by the development team. In practice, maintenance is often
transferred to a different team, which means that the goal of maintainability is essential since the maintenance team will
not have gained knowledge of the system during development. It is important that the maintenance team are represented
during the development process - this role is recommended in DSDM.
The Quality Strategy for the project needs to consider how quality control will be applied to ensure that the
Maintainability Objective is met.
Appropriate staff should be involved in implementing maintainability objectives. The Technical Co-ordinator is
responsible for ensuring the maintainability objectives are met. The Project Manager is responsible for identifying and
calling in specialist roles as required - these could be Support and Maintenance team representatives and the Service/Help
Desk Manager to assist with planning for maintainability and to consider the eventual support of this system once it is
running live.
DSDM does not ensure maintainability by itself. Maintainability is made possible by a combination of four factors within
a well managed DSDM project:
• Tools - The use of tools to cover such areas as configuration management, testing and impact analysis aids
maintainability.
• People - The people aspects affecting maintainability are development team skills/experience/business
knowledge/user contribution/maintenance team skills/motivation.
• Documentation - A minimum documentation set is needed for maintenance. This does not have to be paper-based
documentation but could be information residing in a toolset. This documentation set will vary according to
installation guidelines.
• Good practice guidelines - such aspects as standards, style guides, use of DSDM for this installation, etc. - in fact
everything that we might have done automatically for a waterfall approach and should not forget just
because a RAD method is being used.
Limitations
It cannot be built into complex machine systems because of design considerations and, secondly, it does not improve
the performance of the equipment.
OBJECTIVES
1. To design equipment that can be maintained easily, in minimum time and at minimum cost. This implies that the
requirement for other supporting resources such as spare parts, manpower, and facilities for tools and test
equipment must also be minimal.
2. Good maintainability can also improve the safety of personnel.
3. Maintainability increases the cost of production of a machine, but it reduces the operating cost considerably.
4. To reduce the product’s life-cycle cost of maintenance.
5. Good maintainability provisions can help the maintenance department to carry out maintenance successfully with
proper cooperation of the equipment.
6. During the maintainability implementation program, reliability and other characteristics can be evaluated.
7. Its objectives are system readiness and achievement of the desired results.
Maintainability objectives are governed by the maintenance strategy selected for the project. Maintainability objectives
must be clearly defined and must support business goals. While qualitative maintainability objectives are very useful,
preference is for quantitative maintainability objectives that can be measured and recorded. Both programs were able to
define quantitative maintainability objectives. The owner-led program uses quantitative
objectives such as (1) total spare parts inventory per unit of sales value of production; (2) maintenance cost per unit of
sales value of production; (3) maintenance cost per unit of product produced; (4) planned versus unplanned maintenance
cost; (5) start-up costs (training, travel, checkout, materials); (6) annual maintenance costs; and (7) overall equipment
effectiveness. During project planning and design, maintainability objectives are established by a joint effort of the project
engineer and plant/maintenance engineer. The cooperative joint effort increases the likelihood of designed-in
maintainability and realistic attainment of maintainability objectives. In comparison, the contractor-led program identified mean
time to repair as a maintainability objective. Other maintainability objectives were not as clearly defined, a common
occurrence throughout industry. Maintainability results can be difficult to track or measure because maintenance has
initial and long-term impacts occurring over the life cycle of a project. Nonetheless, continuous tracking and
measurement of maintainability objectives provides a means to assess true value and performance. The continuous
assessment also allows informed decisions to be made for appropriate changes to improve long-term maintainability.
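Several of the quantitative objectives listed above are simple ratios that can be tracked period by period. The sketch below shows one way they might be computed. The record fields, the function names, and the sample figures are assumptions made for illustration. Overall equipment effectiveness is taken here in its commonly used form, availability × performance × quality, which the text itself does not define.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceRecord:
    """Hypothetical figures for one reporting period."""
    planned_cost: float           # planned maintenance cost
    unplanned_cost: float         # unplanned (breakdown) maintenance cost
    units_produced: int           # units of product produced
    sales_value: float            # sales value of production
    spare_parts_inventory: float  # value of spare parts held


def objectives(r):
    """Compute ratio-type maintainability objectives from one record."""
    total = r.planned_cost + r.unplanned_cost
    return {
        "spares_per_sales_value": r.spare_parts_inventory / r.sales_value,
        "cost_per_sales_value": total / r.sales_value,
        "cost_per_unit_produced": total / r.units_produced,
        "planned_share": r.planned_cost / total,
    }


def oee(availability, performance, quality):
    """Overall equipment effectiveness as the product of three ratios in [0, 1]."""
    return availability * performance * quality


record = MaintenanceRecord(60000.0, 40000.0, 50000, 2000000.0, 100000.0)
for name, value in objectives(record).items():
    print(name, round(value, 4))
print("OEE", round(oee(0.90, 0.95, 0.99), 4))
```

Recording these values for every period gives the continuous tracking the paragraph calls for, so that trends rather than single readings drive decisions.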
Fault Diagnosis
Types of Diagnosis
Circuit Partitioning
Cause-Effect Diagnosis
• Fault models
• Fault simulators
• Fault dictionaries
• Diagnosis algorithms
Fault Models
Some Diagnostic Fault Models
Gate Fault
Net Fault
Bridging Fault
Path Fault
Fault Simulators
Inputs:
o Circuit (netlist)
o Test set
Fault Dictionaries
• A fault dictionary is a database of the simulated responses for all faults in faultlist
• Used by some diagnosis algorithms for convenience:
o Fast: no simulation at time of diagnosis
o Self-contained: netlist, simulator, and test set not needed after dictionary creation
• Can be very large, however!
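The idea of a fault dictionary can be illustrated with a toy example. The sketch below assumes a three-input circuit y = (a AND b) OR c, a single stuck-at fault model, and a four-vector test set. All of these, and the function names, are choices made for this illustration rather than anything specified in the text. The dictionary is built once by simulation; diagnosis is then a lookup with no further simulation.

```python
NETS = ["a", "b", "c", "n1", "y"]  # n1 is the output of the AND gate


def simulate(vector, fault=None):
    """Evaluate y = (a AND b) OR c; fault is (net, stuck_value) or None."""
    def val(net, value):
        # A faulty net ignores its computed value and holds the stuck value.
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a = val("a", vector[0])
    b = val("b", vector[1])
    c = val("c", vector[2])
    n1 = val("n1", a & b)
    return val("y", n1 | c)


TEST_SET = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
FAULT_LIST = [(net, v) for net in NETS for v in (0, 1)]


def build_dictionary(test_set, fault_list):
    """Map each fault to its simulated output response over the test set."""
    return {f: tuple(simulate(v, f) for v in test_set) for f in fault_list}


def diagnose(dictionary, observed):
    """Return every fault whose stored response matches the observed one."""
    return sorted(f for f, resp in dictionary.items() if resp == observed)


dictionary = build_dictionary(TEST_SET, FAULT_LIST)
print(diagnose(dictionary, (1, 0, 0, 0)))  # → [('c', 0)]
print(diagnose(dictionary, (0, 0, 0, 1)))  # → [('a', 0), ('b', 0), ('n1', 0)]
```

The second lookup returns three candidates because those faults produce identical responses under this test set. The size concern is also visible: the dictionary stores one response per fault per test vector, which grows quickly for real circuits.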
Dynamic Diagnosis
Diagnosis Algorithms
Probabilistic Scoring
Diagnosis in Practice
• Using a diagnosis
• Translating the results: circuit navigation
• Evaluating diagnosis quality
• Commercial diagnosis tools
Using a Diagnosis
• Netlist
o Examine RTL (Verilog/VHDL etc) for gates and data paths
• Schematic
o Symbolic view of gates and wires
• Layout/artwork
o Graphical view of metal lines, poly, vias, cell boundaries, etc.
Netlist Navigation
Schematic Navigation
• Either hand-drawn (from netlist navigation) or tool-generated gate symbols and wires
• Schematic tools in simulators also allow forward and backward traversal and display of logic values
• Used to verify fault propagation
• Does not reflect physical distances
Layout Navigation
• Useful for determining (x,y) values
• Also good for evaluating physical implications of a set of fault candidates
o Faults clustered in a small area are good
o Faults/nets spread around large die areas are bad
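The physical spread of a candidate set can be put into numbers once each candidate has an (x, y) location. A minimal sketch, assuming candidate coordinates are already known and using the bounding-box area relative to the die area as the measure, follows. This measure, the function names, and the coordinates are assumptions for illustration, not a method described in the text.

```python
def bounding_box_area(points):
    """Area of the axis-aligned rectangle enclosing all candidate locations."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def spread_fraction(points, die_width, die_height):
    """Fraction of the die area covered by the candidates' bounding box."""
    return bounding_box_area(points) / (die_width * die_height)


# Hypothetical candidate locations on a 1000 x 1000 die (arbitrary units).
clustered = [(100, 100), (110, 105), (104, 112)]
scattered = [(50, 60), (900, 120), (400, 950)]

print("clustered:", spread_fraction(clustered, 1000, 1000))
print("scattered:", spread_fraction(scattered, 1000, 1000))
```

A small fraction corresponds to candidates clustered in a small area, which is the favourable case for physical inspection; a large fraction corresponds to candidates spread across the die.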
Evaluating a Diagnosis
Commercial Tool:
Mentor Graphics
Prior Art
Research Directions
• Diagnosability
o What makes a particular circuit easy or hard to diagnose?
o What can we do to make diagnosis easier?
• Evaluation of diagnoses
o What makes a good diagnosis?
o Can we quantify our confidence in a diagnosis?