Lifetime Reliability Solutions

Mob: 0402 731 563 Email: info@lifetime-reliability.com Fax: (+61 8) 9457 8642 Post: PO Box 2091, Rossmoyne, WA, 6148, Australia ABN 66 032 495 857

What is Equipment Reliability and How Do You Get It?


Abstract

High equipment reliability is a choice and not an accident of fortune. To a great extent you can
choose how long you want between equipment failures. You can deliver high equipment reliability
by ensuring the chance of incidents that cause failures of equipment parts is low. This article
explains what to do to get remarkably long and trouble-free equipment lives. The secret is in
keeping parts and components at low stress within good local environmental conditions so there is
little risk they are unable to handle their duty. If there is nothing to cause a failure, the failure will
not happen and your equipment continues in service at full capacity and full availability.

Keywords: equipment reliability, downtime, failure avoidance, reliability of parts, equipment risk

Without getting into the mathematics, equipment reliability is described as how long an item of
equipment is likely to last before it fails to perform its duty. It is a measure of the chance of
remaining in-service to a point in time. When equipment is expected to operate at its duty capacity
for a long time, it is considered reliable. When the period is short, and the equipment is out-of-
service often, it is unreliable.

You measure the reliability of equipment by its trouble-free time. If it is meant to last for 10,000
hours (about 14 months of continuous operation), and does last that long, it is 100% reliable to
10,000 hours. But if after 10,000 hours there is an occasional failure, the reliability beyond 10,000
hours is less than 100%. When we talk about reliability, we must also say what time period is
involved. The problem is we expect plant and equipment to operate trouble-free for a long time,
and are unhappy when they do not. It is best to set how long we want our machines to last in-
service before we will accept failure downtime, and then put into place the necessary design,
purchasing, operating and maintenance regimes that will deliver the required lifetime reliability.
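
As a rough illustration of why reliability always needs a time period attached, the sketch below estimates reliability at a chosen time as the fraction of identical items still running at that time. The service lives are invented for the example and the function name is hypothetical.

```python
def reliability_at(hours_to_failure, target_hours):
    """Fraction of items that ran at least target_hours before failing."""
    survivors = sum(1 for h in hours_to_failure if h >= target_hours)
    return survivors / len(hours_to_failure)


# Hypothetical service lives, in operating hours, of ten identical items.
lives = [10500, 12000, 9000, 15000, 11000, 10200, 13000, 8000, 14000, 16000]

print(reliability_at(lives, 10_000))  # 0.8
print(reliability_at(lives, 12_000))  # 0.5
```

The same ten items are 80% reliable to 10,000 hours but only 50% reliable to 12,000 hours, which is why the time period must be stated.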

Measuring the Impact of Equipment Reliability on a Business

The standard definition of reliability leaves much unsaid about the effects of equipment failure on
businesses and people. You know when you have unreliable plant and equipment because people
are angry that it fails so often. Everywhere you look in companies with equipment reliability
problems people are busy ‘doing’, often repairing failures over and over again. It never ends, and
you go home each day uncertain about what tomorrow will bring. You also know when you have
reliable equipment, because you have the time to do your work well. There are few unexpected
equipment failures, because equipment behaves as it is expected to behave. The business likely
makes good profits because operating costs are low and controlled to a narrow, known range. A
place with reliable equipment is a happy place. It is a pleasure to go to work in such an operation.

I believe equipment reliability needs to be seen as more than just a chance time span. It is about
building great businesses that are world-class performers. You will never get great equipment
reliability if you do not want to live in the sort of world that it will bring to you. You have to want
equipment reliability as strongly as you hang onto life before you will do what is necessary to get it.

In the end you will need to measure equipment reliability if you want to improve it. That measure
will naturally be the time between the equipment’s failures to perform its duty, typically in hours of
operation. One obvious drawback with only measuring time is that there is no indication of the cost
of that level of reliability. If you do not know what it costs you for reliability, you may spend a lot
of money on small improvements. Or worse still, not spend enough money to get great
improvements. In my opinion, reliability measured only by expected time in service is a poor
business indicator. Reliability must also be measured by the money it made or lost for the business.

You measure reliability in money terms by tracking the unit cost of production between
failure-caused outages. It is a simple equation. For a facility that measures production by weight it is:

Unit Cost of Production ($/T Hr) = [Operating Costs in the Period ($) / Total Saleable Throughput (T)] x [1 / Production Hours (Hr)]

Figure 1 shows you what these measures indicate. Do not include non-failure stoppage time in
Production Hours, but do include operating costs incurred during that time in the Operating Costs.
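
A minimal sketch of the unit cost calculation follows. The dollar, tonnage and hour figures are invented for illustration, and the function name is hypothetical.

```python
def unit_cost_of_production(operating_costs, saleable_tonnes, production_hours):
    """Unit cost in $ per tonne per production hour between failure outages.

    production_hours excludes non-failure stoppage time, but operating_costs
    includes the costs incurred during that stoppage time.
    """
    return operating_costs / saleable_tonnes / production_hours


# Hypothetical periods between failure-caused outages, with the same spend.
before = unit_cost_of_production(500_000, 10_000, 400)
after = unit_cost_of_production(500_000, 12_500, 500)

print(f"{before:.3f} $/T Hr")  # 0.125 $/T Hr
print(f"{after:.3f} $/T Hr")   # 0.080 $/T Hr
```

In this invented case the longer run between failures gives both more production hours and more saleable tonnes, so the unit cost falls even though the money spent is unchanged.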

[Figure 1: two charts sharing a Time (Hrs) axis. Upper chart, Throughput: production periods interrupted twice by Duty Failure, Time to Repair, Stopped and Available for Service, with Production Time + Production Time + Production Time marked as Hours between Duty Failures. Lower chart, Costs Spent: Operating Costs in Period, with Failure Costs and Condition Monitoring Costs marked.]

Figure 1 – Reliability ought to be in Measures of Time and Operating Cost

This is a measure of the value of reliability in real money and time. It is also a means for
comparison across operations and businesses. Higher reliability reduces unit cost of production in
two ways - you get more time to make product and you get more product out. As reliability
improves there is more time and throughput. That is why every company in the world wants more
reliable equipment; there is a lot of free money to be made if you have highly reliable plant!

We now have two measures for equipment reliability – ‘Hours between Duty Failures’ and ‘Unit
Cost of Production between Duty Failures’. These are all-encompassing measures, meaning they
measure the effect of everything that impacts the equipment’s reliability. To improve equipment
reliability we need to rip those measures apart and look inside them to find what caused those
results. You can only do that if you know the costs spent on an item of equipment, exactly what
they were spent on, and what they were spent to do. Getting the level of detail you need to fully
analyse the cause of your reliability problems requires tracking the life and cost of your equipment
right down to the individual part level. The secret to successful equipment reliability lies within the
lives of the equipment parts.

Equipment Reliability Depends on the Reliability of Parts and Components

Equipment is made of parts and components combined in assemblies that work together to allow the
equipment to operate. Figure 2 shows how we combine parts to form a shaft bearing assembly. It
shows a bearing in a pillow block housing carrying a shaft, a typical situation in many industrial
machines. There are 14 parts in the assembly. The 14th item is the lubricant.

[Figure 2: drawing of the shaft bearing assembly with numbered part callouts.]

Figure 2 – An Arrangement of Parts Allows a Shaft to Turn in a Pillow Block Housing

When you look closely at how the assembly is built, you find that it is configured such that one part
contacts another in a sequence. Figure 3 identifies a portion of the sequence of parts that allow the
shaft to turn in the bearing pillow block. Notice that they are organised in a series arrangement.

Shaft Journal -> Inner race -> Roller bearings -> Lube -> Outer race -> Housing Bore -> Housing -> Housing Mount

Figure 3 – The Series Arrangements of Parts that Allow the Shaft to Turn in the Bearing Housing

This is the way all industrial equipment is built. It is a series arrangement of parts and
components that together perform the required duty. Once you have a series arrangement, you have
huge risks to equipment reliability. Some simple examples show you what happens.

Figure 4 is the same as Figure 3, except that a part is missing from the sequence. Without the outer
race the assembly cannot work because the series connection has been lost. If this assembly were in
a piece of equipment, the equipment would have failed. A sequence arrangement of parts only requires
one item in the series to fail and the whole assembly fails. When the assembly fails, the equipment stops.

Shaft Journal -> Inner race -> Roller bearings -> Lube -> X (outer race missing) -> Housing Bore -> Housing -> Housing Mount

Figure 4 – The Reliability of a Series Arrangement Depends on the Reliability of its Components

The clear message in Figures 3 and 4 is that the reliability of a piece of equipment is totally
dependent on the reliability of its individual parts. If one part fails the machine fails. If one part is
in a bad condition the entire equipment is at ever increasing risk of breakdown. We can calculate
the reliability of a series system with the following equation.

Rseries = R1 x R2 x R3 x ... x Rn          (Eq. 1)

Equation 1 says that the reliability of a series of items is the multiplication of the reliabilities of the
individual items in the series. It makes more sense if we use numbers. In Figure 5, each part has its
own reliability. For the sake of the example, say the reliability of each item is 99% (usually written
as 0.99), which means 99 parts out of 100 are failure-free for as long as they should be.

Shaft Journal (R1) -> Inner race (R2) -> Roller bearings (R3) -> Lube (R4) -> Outer race (R5) -> Housing Bore (R6) -> Housing (R7) -> Housing Mount (R8)

Figure 5 – Every Part in a Machine has its Own Reliability

The series reliability for the eight parts is:

Rseries = 0.99 x 0.99 x 0.99 x 0.99 x 0.99 x 0.99 x 0.99 x 0.99 = 0.92 (or 92%)

Once parts with reliability of 0.99 are in an assembly of eight items, the reliability of the series falls
to 0.92. Now only 92% of the assemblies will last to the time they should, and 8% will not. As the
number of parts in a series grows, its reliability declines because there are more things to go wrong.
But see what happens to reliability when we have a part with a serious problem. If the outer race has a
reliability of 0.5, where half of the races fail before their expected life, the equation becomes:

Rseries = 0.99 x 0.99 x 0.99 x 0.99 x 0.5 x 0.99 x 0.99 x 0.99 = 0.47 (or 47%)

Now only 47% of the assemblies last to their full life. It only takes one part in a series arrangement
to be of low reliability and it brings the entire machine down to below that reliability. We have now
arrived at a most important principle in equipment reliability:

An assembly of parts (i.e. a machine) can never be more reliable than its least reliable part!

This reliability principle should be uppermost in your mind from now on in everything you do in
the field of maintenance and reliability improvement. You can never improve your equipment’s
reliability beyond that of its least reliable part. If you want highly reliable machines and equipment,
you first must ensure each of their parts is even more highly reliable.
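
The series calculation is easy to check for yourself. The sketch below applies Equation 1 to the two eight-part examples used above; the function name is hypothetical and the reliabilities are the article's illustrative values, not measured data.

```python
from math import prod


def series_reliability(part_reliabilities):
    """Equation 1: multiply the reliabilities of the parts in the series."""
    return prod(part_reliabilities)


healthy = [0.99] * 8
# Same series, but the fifth item (the outer race, R5) has reliability 0.5.
one_bad = [0.99] * 4 + [0.5] + [0.99] * 3

print(round(series_reliability(healthy), 2))       # 0.92
print(round(series_reliability(one_bad), 2))       # 0.47
print(series_reliability(one_bad) < min(one_bad))  # True
```

The last line confirms the principle: the series result sits below the reliability of its least reliable part.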

When Machines and Equipment Fail We Replace Parts

When we do Preventive Maintenance or Breakdown Maintenance, we replace parts and/or lubricant
in a machine and then put the equipment back into service. The new parts start their life, and the
parts not replaced continue theirs. Within the machine are now old parts still in good health, parts
that have accumulated stress and are approaching end-of-life, distressed parts ready to fail when next
overloaded, and new parts starting into service with their inherent design limitations. What is the
reliability of the machine now?

Hopefully it is clear that the equipment reliability would depend on the reliability of each of the
parts. The distressed parts would have a very poor reliability (they are likely to fail soon); while the
new parts would have much higher reliability, (they are likely to fail sometime in the future).
Overall, the equipment is no more reliable than the most distressed part. What could you do right
now to improve the reliability of the distressed part?

You could stop the equipment and replace the part with new. The Operations Group would be very
unhappy to learn that the equipment again needs to stop. Also, you must know which parts are in
distress; else, you may replace the wrong ones and the equipment will still fail soon. There is
another thing you can do – reduce the chance of overstressing the part. With the chance of excess
stress substantially reduced, the distressed parts have a greater prospect of lasting longer. Simply
by lowering the stress on machine parts, you greatly improve the odds of higher equipment reliability.

Measuring the Rate of Equipment Failure

Parts working together form a series system we call a machine. We know that when a part fails the
machine fails. You can draw the failure rate curve for the machine from the rate of its parts’ failure.
Figure 6 shows the life of a 3-part machine. The bottom chart shows when individual ‘green’,
‘blue’ and ‘orange’ parts failed. Every time one of these parts failed, the machine also failed. The
‘green’ part went onto preventive maintenance after the second failure. The top chart reflects the
machine’s rate of failure. When parts fail close in time, there is a rise in failure rate. When they do
not, the rate falls. The name of the curve is ROCOF - Rate of Occurrence of Failure.

[Figure 6: two charts. Top chart: ROCOF, the Rate of Occurrence of Failure in a Machine System, against Time or Usage Age of System, with the note "A machine’s rate of occurrence of failure changes as its parts do, or do not fail." Bottom chart: z(t), the Rate of Failure of Part, against Time or Usage Age of Part, with the note "When a part fails it is replaced and starts a new ‘life’. But the machine’s life does not start from new; it continues to accumulate time from the second it was first put to work and the not-replaced parts within continue to age."]

Figure 6 – Machine Failures are the Accumulated Effect of Parts’ Failures
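
One simple way to build a machine's ROCOF from its parts' failure records is sketched below. The part names follow Figure 6, but the failure times, the 1,000-hour window and the function names are assumptions made only for the example.

```python
def machine_failure_times(part_failures):
    """Merge each part's failure times into one sorted machine history."""
    return sorted(t for times in part_failures.values() for t in times)


def rocof(failure_times, window_hours, total_hours):
    """Failures per hour in each consecutive window of the machine's life."""
    n_windows = total_hours // window_hours
    counts = [0] * n_windows
    for t in failure_times:
        # A failure recorded at or after total_hours is counted in the last window.
        index = min(int(t // window_hours), n_windows - 1)
        counts[index] += 1
    return [c / window_hours for c in counts]


# Hypothetical failure times (operating hours) for a 3-part machine.
parts = {
    "green": [900, 1800],   # no more failures once it goes onto PM
    "blue": [1500, 4200],
    "orange": [1900, 4700],
}

events = machine_failure_times(parts)
print(events)                     # [900, 1500, 1800, 1900, 4200, 4700]
print(rocof(events, 1000, 5000))  # [0.001, 0.003, 0.0, 0.0, 0.002]
```

The rate rises in the second window, where three part failures fall close together, and drops to zero when no part fails.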

The ROCOF curve for a machine reflects what happens to its parts, and moves up and down as
parts fail. But when we take many identical machines and collect their parts’ failure history
together, we get a ‘steady average’ ROCOF, which is representative of the reliability of the machine
design, the quality of manufacture, the precision of its installation, the abuse it suffers in
production, the quality control in purchasing and storage, and the standard of maintenance workmanship.

[Figure 7: two charts. Top chart: System Rate of Failing against Time or Usage Age of System, showing the curve for A Single System (machine) and the Mean of Many Systems (machines), with the notes "The ‘failure curve’ for a machine has a special name – ROCOF – Rate of Occurrence of Failure" and "With more parts, ROCOF becomes approximately constant". Bottom chart: Component Rates of Failing against Time or Usage Age of Parts. Causes of failure listed on the figure:
• Defective parts
• Poor quality assembly
• Manufacture error
• Operating overload
• Aging of some parts
• Local environment degradation
• Operator error
• Poor operating practices
• Poor maintenance practices
• Poor design choice
• Many aging parts
• Many parts degraded]

Figure 7 – Failure Curve for a Machine (i.e. a System of Parts)

Figure 7 lists many reasons why our equipment and machines fail. In reality, our parts fail and then
our machines stop. The solution to equipment reliability is to improve parts’ lifetime reliability.

[Figure 8: two charts. Top chart: System Rate of Failing against Time or Usage Age of System, showing the Old ROCOF and a lower New ROCOF. Bottom chart: Component Rates of Failing against Time or Usage Age of Parts, marked "Remove Causes of Parts’ Failure". Note on the figure: "If we remove parts’ failure from our machines by changing our policies, the machine reliability improves". Actions listed on the figure:
• Better quality control
• More training
• Precision assembly
• Precision installation
• Do more preventive maintenance
• Better operator training
• Total Productive Maintenance
• Precision Maintenance
• Better design/material choices
• Machine protection devices
• More parts on PM
• Better materials]

Figure 8 – Improving the Reliability of Machines and Equipment

Failure rates for parts do not readily change without redesign. Once a part is in a machine we are
stuck with its characteristic performance, i.e. it will behave as its design allows. However, as
Figure 8 shows, the failure rate of machines is highly malleable depending on the applied
maintenance policies, the operating policies, purchasing and stores practices, and the assembly
accuracy of the machine’s parts. All these things are totally in the control of every business.

Stop Equipment Risk and You Stop Equipment Failures

How long your equipment lasts before the next failure depends on the chance its parts will survive
to that point in time. Reliability is a high-risk gamble if you do not understand the game you are
playing. I bet that you did not know you are gambling with the future of your business when you
work in Engineering, Operations or Maintenance. But to turn the odds in your favour you only
need to control what risks you permit your equipment to experience.

Risk is a measure of the cost of chance. It is usually calculated by the following equation.

Risk ($/yr) = Consequence ($) x No. of Occurrences (/yr) x Chance of an Occurrence          (Eq. 2)

Equation 2 indicates that you will be hit with a cost every time the occurrence happens. To prevent
a bad cost we must first reduce the chance of a bad event occurring, because once it does you will
pay the full price of that occurrence. If we want to prevent equipment failure, and not incur the
associated cost penalty, we must reduce the chance of the failure event happening. Conversely, if
you want a good risk to happen, you must provide for the chance of a good occurrence. We can
make risk good or bad for our equipment by our choice of actions.
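
The sketch below puts numbers into Equation 2. It assumes that "No. of Occurrences" means the number of situations per year in which the failure could happen, and that "Chance" is the chance that any one of those situations causes the failure. All figures and function names are invented for illustration.

```python
def annual_risk(consequence, occurrences_per_year, chance):
    """Equation 2: Risk ($/yr) = Consequence x No. of Occurrences x Chance."""
    return consequence * occurrences_per_year * chance


def expected_failures(occurrences_per_year, chance):
    """Expected number of failures per year, before any cost is applied."""
    return occurrences_per_year * chance


# Hypothetical cases: (consequence in $, situations per year, chance per situation)
cases = {
    "as found": (50_000, 12, 0.10),
    "chance reduced": (50_000, 12, 0.02),
    "consequence reduced": (10_000, 12, 0.10),
}

for name, (cost, n, p) in cases.items():
    print(f"{name}: {annual_risk(cost, n, p):,.0f} $/yr, "
          f"{expected_failures(n, p):.2f} failures/yr")
```

With these invented figures both changes bring the risk down from 60,000 $/yr to 12,000 $/yr, but only the reduced chance lowers the expected failures, from 1.20 to 0.24 a year. When the chance is left alone, the equipment still fails just as often.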

Figure 9 helps us to understand how we control the risk placed on our equipment parts. Once we
know which occurrences cause our parts to fail, and which extend their life, we only accept those
that lead to good lifetime reliability outcomes, and not those that cause increased chance of failure.

[Figure 9: curve of Chance of Event against Consequence of Event. A middle range of outcomes, marked Good, Better, Best, Better, Good, carries the note "Only accept this range of outcomes because they produce good risk". A region on each side, marked X, carries the note "Do not accept these outcomes because they produce high risk".]

Figure 9 – Control the Chance of an Equipment Failure Event

To reduce the chance of equipment failure, prevent situations where equipment parts are
overstressed or fatigued. If a bad-chance situation does not happen, the part does not
experience the risk and it continues its life unaffected. Focus your efforts on removing the causes
of bad-chance to equipment parts. Once the chance of a bad event is reduced, fewer of them happen;
with fewer bad occurrences, less money is lost and equipment risk falls.

Control the Chance of Overstress and Fatigue Occurrences and You Control Reliability

Figure 10 identifies strategies available to control the risk placed on equipment and its parts. Those
in the left-hand column reduce the chance of failure. Those in the right-hand column only reduce the
cost of failure; the chance remains unchanged.

Risk = Chance x Consequence

Chance Reduction Strategies (remove opportunity for failure to start)

• Engineering and Maintenance Standards
• Standard Operating Procedures
• Failure Mode Effects Criticality Analysis (FMECA)
• Hazard and Operability Study (HAZOP)
• Hazard Identification (HAZID)
• Root Cause Failure Analysis (RCFA)
• Precision Maintenance (shaft alignment, oil particle filtration, deformation prevention, etc)
• Training and Up-skilling
• Quality Management Systems
• Planning and Scheduling
• Continuous Improvement
• Supply Chain Management
• Accuracy Controlled Enterprise (ACE 3T SOPs)
• Design and Operations Cost Totally Optimised Risk (DOCTOR)
• Defect and Failure True Cost (DAFTC)
• De-rate/Oversize Equipment
• Reliability Engineering

Consequence Reduction Strategies (reduce the loss after a failure has started)

• Preventive Maintenance
• Corrective Maintenance
• Total Productive Maintenance (TPM)
• Non-Destructive Testing
• Vibration Analysis
• Oil Analysis
• Thermography
• Motor Current Analysis
• Prognostic Analysis
• Emergency Management
• Computerised Maintenance Management System (CMMS)
• Key Performance Indicators (KPI)
• Risk Based Inspection (RBI)
• Operator Watch-keeping
• Value Contribution Mapping (Process step activity based costing)
• Logistics, stores and warehouses
• Maintenance Engineering

Figure 10 – Various Risk Management Methods

Notice from Figure 10 that Condition Monitoring, as it is usually done, does not reduce the chance
of failure. It only identifies impending failure and lets you turn the repair work into a planned job,
instead of a breakdown. The failure has already been initiated, and if it is not addressed in time
your equipment breaks down. Fortunately, depending only on when you start using it, Condition
Monitoring can also be used to reduce the risk on parts.

Figure 11 shows how Condition Monitoring can be a tool to stop failures, rather than only used to
spot failures. When used to gather information on the chance of failure of equipment parts and
components, Condition Monitoring becomes a risk management tool. Condition Monitoring used this
way spots ‘bad-chance’ by checking if the workmanship standards used to rebuild plant and
equipment will deliver high reliability. For example, when rebuilding machinery, test that the
workmanship met the required precision standard by checking that vibration levels are very low
after the rebuild. If the vibration levels are too high after assembly, identify the cause and
rectify it before letting the equipment go into service.
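
The post-rebuild check described above can be as simple as comparing each reading with an acceptance limit. In the sketch below the measurement points, the readings and the 1.5 mm/s limit are all hypothetical; use the precision standard set for your own machines.

```python
def rebuild_accepted(vibration_readings_mm_s, limit_mm_s):
    """Accept the rebuild only if every reading is at or below the limit.

    Returns (accepted, offenders), where offenders maps each measurement
    point that exceeds the limit to its reading.
    """
    offenders = {
        point: level
        for point, level in vibration_readings_mm_s.items()
        if level > limit_mm_s
    }
    return (not offenders, offenders)


# Hypothetical readings (mm/s RMS) taken after a rebuild.
readings = {
    "drive end horizontal": 0.8,
    "drive end vertical": 0.7,
    "non-drive end horizontal": 2.1,
}

accepted, offenders = rebuild_accepted(readings, limit_mm_s=1.5)
print(accepted)   # False
print(offenders)  # {'non-drive end horizontal': 2.1}
```

A rejected rebuild points you to where to look for the cause before the machine goes back into service.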

Through persistent use of Condition Monitoring to measure work quality, you will build knowledge
and improve workmanship to deliver plant and equipment in excellent condition, with low operating
risk to parts and maximum chance of a highly reliable and long equipment lifetime.

Figure 11 – Using Condition Monitoring to Control Risk of Low Reliability

It is appropriate to draw together the key issues on equipment reliability raised in the article. We can
say that:

1. Plant and equipment are a series arrangement of individual parts.
2. The reliability of a part is the chance it will survive in-service for a required length of time.
3. The reliability of equipment depends on the reliability of its individual parts, and to achieve the
required equipment reliability each part must have higher reliability than needed for the equipment.
4. When you reduce the chance of parts’ failure, you increase the chance of higher equipment
reliability.
5. Decide the equipment reliability you want and then identify what to do to the parts to reduce the
chance of them failing before they deliver their necessary reliability.
6. Prove that the precision standards needed for high parts reliability are present at start-up.
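
Points 3 and 5 can be turned around into a quick calculation: given the equipment reliability you want, what must each part deliver? The sketch below assumes, for simplicity only, that all parts in the series have the same reliability, which real machines will not. The function name is hypothetical.

```python
def required_part_reliability(equipment_target, number_of_parts):
    """Reliability each part needs when all parts in the series are equal.

    Rearranges Equation 1: if Rseries = R ** n, then R = Rseries ** (1 / n).
    """
    return equipment_target ** (1 / number_of_parts)


# Eight parts in series, as in the shaft bearing example.
print(round(required_part_reliability(0.92, 8), 4))  # 0.9896
print(round(required_part_reliability(0.99, 8), 4))  # 0.9987
```

To get 99% reliability from an eight-part series, each part needs a reliability of close to 99.9%, well above the target for the equipment itself.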

Equipment reliability is mostly in our control. We improve equipment reliability by choosing the
policies and practices that reduce the chance of a bad event happening and that increase the chance
of a good event. You can use condition monitoring as a tool to detect the onset of failure. But you
get greater worth from it if you also use it to ensure that the high quality work and precision
standards needed to produce long lifetime reliability are present for your machine parts at the start
of their lives.

My best regards to you,

Mike Sondalini
www.lifetime-relability.com
