
Reliability Analysis and Stochastic Processes

Lecture Overview
Reliability Data Analysis
Complex Systems
Reliability/Hazard Function
Stochastic Process
Reliability Models

Fundamentals of Reliability Analysis

The collection and analysis of reliability data require a systematic approach, with clearly defined reliability parameters and comprehensive collection and analysis procedures.

Weibull analysis is an important technique for modelling the failure process, as we shall see later. Other methods are relevant in certain situations, and simpler methods of analysis can be beneficial, particularly in the initial stages of a project.

Presenting results in a clear and concise form, with emphasis on the benefits obtained from the analysis, is an essential ingredient of any study.

Recall reliability definition


The term reliability generally expresses a certain degree of assurance
that a system will operate successfully in a specified environment
during a certain time period.
This concept is dynamic and does not refer just to instantaneous
events. If a system fails, that does not necessarily imply that it is
unreliable: every piece of equipment fails once in a while.
The question is how frequently failures occur in a specified period
of time.

Reliability Analysis

Reliability analysis requires recognising and modelling the system characteristics, including the post-failure behaviour of components and the contribution of each component failure or failure mode to the overall system condition.

Statistical analysis of failure modes and of the actions carried out, such as preventive maintenance or repair, can give an approximate assessment of system reliability and of the actions needed to restore the performance of the system at all levels.

Reliability requirements are sometimes based on practical experience and engineering intuition, but setting them usually requires statistical data about technical systems, so the determination needs both expertise and statistics. Ideal conditions for determining reliability requirements exist when system performance can be measured in common units, such as the cost of production and maintenance, alongside its operating characteristics.

Reliability data

The analysis of reliability data depends on the types of observations available. Systems on test that fail yield complete information on the time to failure, provided monitoring is continuous.

For systems that have not failed, we have only partial information. Such data are called time-censored.

If all systems start operating at the same time and observation terminates at a predetermined time, the censoring is single and is called Type I censoring. If observation instead terminates at the instant of the rth failure, where r is a predetermined integer, the failures are censored according to Type II censoring.

If systems start operating at different time points in an interval (0, t], and observation terminates at time t, we have multiple censoring of the data. If a system is known to have failed before observation started, we have left censoring. The other type of censored information, where the system is still in operation when monitoring ends, is called right censoring.
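
The difference between stopping at a fixed time and stopping at the rth failure can be sketched in a few lines of Python. This is only an illustration: the lifetimes, the stop time and the function names are hypothetical, and all systems are assumed to start at time zero.

```python
def type_i_censor(lifetimes, stop_time):
    """Observation stops at a fixed time (Type I censoring).

    Returns (observed_time, failed) pairs; failed is False for
    right-censored systems still running at stop_time.
    """
    return [(min(t, stop_time), t <= stop_time) for t in lifetimes]


def type_ii_censor(lifetimes, r):
    """Observation stops at the instant of the r-th failure (Type II censoring)."""
    if not 1 <= r <= len(lifetimes):
        raise ValueError("r must be between 1 and the number of systems")
    stop_time = sorted(lifetimes)[r - 1]  # time of the r-th failure
    return [(min(t, stop_time), t <= stop_time) for t in lifetimes]


# Hypothetical true lifetimes in hours; in practice these are not all observed.
lifetimes = [120.0, 340.0, 75.0, 510.0, 260.0]

type_i = type_i_censor(lifetimes, stop_time=300.0)
type_ii = type_ii_censor(lifetimes, r=3)
```

In both cases the censored systems contribute only the information that their lifetime exceeds the recorded time.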

Identifying Suitable Distributions


To understand the failure or repair process of a system, knowledge of the characteristics of the theoretical distributions, together with statistical analysis of the data, will assist in selecting a failure or repair distribution. An ideal approach is to:

Construct a histogram of the failure or repair times
Compute descriptive statistics
Analyse the empirical failure rate
Use prior knowledge of the failure process
Use properties of the theoretical distribution
Construct a probability plot

Reliability Data

The reliability function can be defined as

R(t) = P(system operates during [0, t)),

where P(A) denotes the probability of an event A. Understanding this definition requires the concepts of probability and of random variables.

[Figure: Data collected for a repairable electronic system. Labels: collector, data, manufacturer, customer, statistics; failures = events ("bad news"); failure-free time = censored times ("good news").]

The systems were progressively introduced into service and NOT


operated continuously or in a uniform manner

Source : Ansell and Phillips (1994)

Example of Reliability Data

[Figure: Data example from a group of 10 repairable electronic systems. Horizontal axis: time in hours (400 to 2400). X = failure; 0 = last time withdrawn; the line between events marks failure-free time. Installation of new equipment, start of data collection and end of data collection are marked on the time axis.]

[Figure: Method of data collection for mechanical equipment fitted to a fleet. Vertical axis: number of observed renewals (r1, r2, ..., rs); horizontal axis: time (t0, t1, t2, ..., ts); x denotes an observed renewal.]

Equipment Reliability

A piece of equipment is an assembly of components that performs a specific function.

It fails as a result of component failure (assuming no catastrophic failure) and can be restored by replacing the failed component.

The history of component failures for a piece of equipment is mapped in the diagram.

Mechanical components may exhibit constant, decreasing or increasing failure rates. In certain circumstances it may be necessary to determine these characteristics in different ways.

System reliability

Systems are more complex: they generally comprise a combination of equipment arranged in series and in parallel.

During the system lifetime, components may be replaced (renewed) and equipment repaired.

Modifications may also be carried out to improve performance or meet operational requirements.

System reliability techniques are needed to quantify the overall equipment reliability and to make sure the system can operate at full capability.
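
As a rough illustration of how series and parallel arrangements combine, the sketch below uses the standard product rules. It assumes that the items fail independently, which the slides do not state, and the reliability values and names are hypothetical.

```python
from math import prod


def series_reliability(reliabilities):
    """All items must work: the system reliability is the product."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """At least one item must work: one minus the product of unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical example: two redundant pumps (0.90 each) feeding one valve (0.95).
pumps = parallel_reliability([0.90, 0.90])
system = series_reliability([pumps, 0.95])
```

Redundancy raises the pump stage above either pump alone, while the series valve pulls the overall figure back below the weakest stage.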

Complex system

Complex repairable systems in business, industry, medicine and nature


frequently incorporate preventive maintenance actions in attempts to
improve operational performance and reliability.

These typically involve providing systematic inspection, detection and


eradication of partial or incipient failures.

Several researchers have proposed mathematical models for such


systems, though most of these contain fundamental flaws and serve
only as statistical approximations, as we shall see.


Complex repairable systems

A complex system consists of any structure of more than one component,


which performs a particular function. A complex repairable system is a
system, which after it has failed to perform properly, can be restored to a
satisfactory performance by any method except complete replacement of the
entire system (Crowder et al., 1991).

Therefore, it is extremely important that the assumed stochastic process


accurately characterizes system failure. Typical systems include industrial and
domestic machinery, such as production lines, motor vehicles and computers.
They also include biological and ecological structures, such as the human
body and natural ecosystems. We can imagine other applications arising in
society and commerce but concentrate here on industrial systems, which
benefit greatly from reliability and maintenance modelling.

Refinery Pump Data

Main pump in a petroleum refinery
Collection period: 7 years
65 event observations
First half: 15 CM and 11 PM
Second half: 29 CM and 10 PM

xi = inter-event times (days)
ci = censoring indicator variables (0 = preventive maintenance, 1 = corrective maintenance)

xi: 34 14 81 86 156 20 96 47 45 971 88 30 4
ci: 0 1 0 0 0 0 0 0 0 0 1 1

xi: 1 4 13 27 8 148 92 13 13 67 29 12 1
ci: 1 1 1 1 1 0 1 1 1 0 1 1 1

xi: 37 28 38 20 28 44 3 56 64 8 62 8 46
ci: 1 1 1 1 0 1 1 1 1 0 0 1 1

xi: 22 51 51 15 18 1 26 37 36 2 12 27 102
ci: 1 1 1 1 1 1 0 1 1 1 0 1 1

xi: 3 21 6 26 15 35 44 61 84 12 65 43 4
ci: 1 1 1 1 0 1 0 1 0 1 0 0 1

Model: Power-Law Process
Estimation Method: Maximum Likelihood

Parameter Estimates
                      Standard   95% Normal CI
Parameter  Estimate   Error      Lower      Upper
Shape      0.518739   0.064      0.407537   0.660284
Scale      0.0499202  0.051      0.0067508  0.369145

Trend Tests
                MIL-Hdbk-189  Laplace's  Anderson-Darling
Test Statistic  250.61        -7.39      28.92

[Figure: Distribution plot]

Coal-Fired Power Generating Plant Unit

[Figure: Schematic of a plant unit showing ID fans A and B, the boiler, FD fans A and B, the wind box, the coal mills, PA fans A and B and the main bus.]

Process Failure/Maintenance Activity

[Figure: Timeline of the important periods in the process. Labels: interval, priority, testing, evidence, plant state; events: failure, WOR, WOC entry, PFW issued, SPR, repair (do work), ROMP, PFW off, WOC closed; WOC: what to do, who is to do it; interval scales: hours, days.]

The sequence of activities varies from job to job. The interval between failure and isolation is a priority. Given the activities above, work out the state of the equipment at a specific time and consider how precisely the state of the plant can be known.

Failure Modes

The failure mode is defined as the effect by which a failure is observed on the item, rather than the effect the failure has on the system containing the item. Specifying the equipment boundary supports a standard selection of equipment classes. We classify failures as:

Mechanical failure (process failure)
Electrical failure
Equipment failure (etc.)

Classification of failure modes


Information regarding failure modes is further classified as follows.

Critical failure: A failure which is sudden and causes cessation of one or more fundamental functions. This failure requires immediate corrective maintenance action in order to return the item to a satisfactory condition.

Degraded failure: A failure which is gradual, partial or both. Such a failure does not cease the fundamental functions but compromises one or several functions. The function may be compromised by any combination of reduced, increased or erratic output, and the failure may lead to a critical failure.

Incipient failure: An imperfection in the state or condition of an item or equipment such that a degraded or critical failure can be expected to result if corrective maintenance is not taken.

Plant Database
Data are monitored in the control room, collected and stored in:

Genysis (VAX platform)
Efor database
PI database
Shift logs

Genysis Database
WO_NUM
85700
85700
85800
85800

EQUIP, WORK / DESCRIPTION


2AA

2AA

203700

"C"PFMILLCOAL/AIROUTLETTEMPIND'N

292900

2AA01

293000
350500
350500
351100
351100
351500
351500
733000
733000

PULVERISEDFUELMILLS---CARRYOUT
1AA033014

293000

PULVERISEDFUELMILLS---CARRYOUT

203700

292900

MO_TYPE

"A"PFMILL---HOLEATWESTROPEBOX
3AA06

"F"PFMILL---LARGECRACKINWELD
2AA0716

"G"PFMILLREJECTSSYSTEM&
CONTROLS
4AA06

"F"PFMILL---CCRPAFLOWINDICATION
4AA03

"C"PFMILL---PLEASEINSPECT3.3Kv
4AA0316
"C"PFMILLREJECTSSYSTEM&
CONTROLS

FAILURE
CAUSE

DURATION
(DAYS)

DATE

12/01/2000

ENTERED

20/01/2000

CLOSED

12/01/2000

ENTERED

60

210

09/08/2000

CLOSED

24/01/2000

ENTERED

413

24/01/2000

CLOSED

31/01/2000

ENTERED

12

15

15/02/2000

CLOSED

01/02/2000

ENTERED

12

14

15/02/2000

CLOSED

04/02/2000

ENTERED

50

17

21/02/2000

CLOSED

04/02/2000

ENTERED

408

05/02/2000

CLOSED

05/02/2000

ENTERED

18

05/02/2000

CLOSED

13/03/2000

ENTERED

15/03/2000

CLOSED


Efor Database

Date on     Unit  MW   Duration  MWH     Cause                             Type
09/01/1999        57   96        5521    Mills/fuel quality                FR
15/01/1999        47   167       7917    Mills/fuel quality                FR
20/01/1999        150  0.5       75      Mill fire                         FR
29/01/1999        28   48        1338    Mills/coal condition              FR
26/02/1999        101  25        2522    Gearbox lub. oil pp & mill avail  PR
14/07/1999        104  20.66     2149    Mills                             FR
10/08/1999        56   72        4032    Mill availability                 PR
12/08/1999        80   1.16      93      Mills                             FR
13/08/1999        44             44      Mills                             FR
30/08/1999        50   5.25      263     Mills                             FR
31/08/1999        74             296     Mills                             FR
23/09/1999        100  0.5       50      Loss of mill                      FR
07/10/1999        50   17.5      875     Mills                             FR
14/10/1999        76   9.17      697     Mills                             FR
12/01/2000        103  6.5       669.5   Mills                             FR
03/02/2000        115  0.33      37.95   Coal feeder                       FR
03/02/2000        80   1.5       115     Mills                             FR
17/02/2000        70             350     Lost mill                         FR
24/02/2000        70   0.5       35      Mill                              FR

PI

Tags: Unit = L1L; Mill A to Mill H = L1BMEA01 to L1BMEA08

Date        Time      Unit   MillA  MillB  MillC  MillD  MillE  MillF  MillG  MillH
01/01/2003  00:00:00  314.3  0.0    54.1   50.3   52.7   0.0    0.0    60.0   0.0
01/01/2003  01:00:00  314.2  0.0    56.4   50.8   57.4   0.0    0.0    60.0   0.0
01/01/2003  02:00:00  317.1  0.0    51.2   53.1   54.6   0.0    0.0    61.5   0.0
01/01/2003  03:00:00  318.1  0.0    51.4   54.7   56.1   0.0    0.0    58.6   0.0
01/01/2003  04:00:00  312.7  0.0    49.3   53.2   56.9   0.0    0.0    57.5   0.0
01/01/2003  05:00:00  312.6  0.0    49.3   55.0   54.6   0.0    0.0    58.0   0.0
01/01/2003  06:00:00  314.9  0.0    51.7   52.4   54.3   0.0    0.0    56.6   0.0
01/01/2003  07:00:00  311.8  0.0    51.7   47.8   57.2   0.0    0.0    56.6   0.0
01/01/2003  08:00:00  313.4  0.0    49.6   49.2   54.4   0.0    0.0    57.6   0.0
01/01/2003  09:00:00  311.8  0.0    50.1   50.9   51.3   0.0    0.0    57.1   0.0
01/01/2003  10:00:00  312.7  0.0    45.8   55.4   52.3   0.0    0.0    55.1   0.0
01/01/2003  11:00:00  312.2  0.0    48.0   55.1   52.3   0.0    0.0    57.0   0.0
01/01/2003  12:00:00  314.2  0.0    48.0   54.3   54.1   0.0    0.0    56.9   0.0
01/01/2003  13:00:00  315.8  0.0    48.0   54.1   51.6   0.0    0.0    56.0   0.0
01/01/2003  14:00:00  322.6  0.0    48.4   53.4   50.5   0.0    0.0    57.7   0.0
01/01/2003  15:00:00  266.7  0.0    1.1    53.5   50.0   0.0    0.0    57.7   0.0
01/01/2003  16:00:00  349.3  0.0    1.1    46.5   48.8   0.0    48.1   57.1   0.0
01/01/2003  17:00:00  361.7  0.0    1.1    48.8   49.0   0.0    52.4   57.1   0.0
01/01/2003  18:00:00  365.4  0.0    1.1    48.8   50.2   0.0    53.3   59.3   0.0
01/01/2003  19:00:00  318.7  0.0    1.1    48.8   50.0   0.0    52.0   59.4   0.0
01/01/2003  20:00:00  312.2  0.0    1.1    50.6   48.9   0.0    51.4   58.0   0.0
01/01/2003  21:00:00  311.6  0.0    1.1    50.6   50.8   0.0    51.5   58.5   0.0
01/01/2003  22:00:00  306.4  0.0    1.1    45.8   49.0   0.0    56.5   58.5   0.0
01/01/2003  23:00:00  314.1  0.0    0.0    50.2   51.0   0.0    59.0   61.1   0.0

PI data information

Date        Time   Motor current  State  Event description (mill B, unit 1)
01/01/2003  15:00  1.1            0      Mill B not producing and oos
15/01/2003  4:00   51.9           1      Mill B is running
15/01/2003  22:00  0.3            0      B FD fan not making contact causing start faults
16/01/2003  1:00   51.8           1      B mill running
16/01/2003  15:00  0.7            0      B FD fan to investigate at next opportunity
14/02/2003  11:00  47.2           1      B mill is running
26/02/2003  3:00   0.2            0      Mill B not producing and unit is off
26/05/2003  7:00   45.3           1      B mill is running
27/05/2003  14:00  0.5            0      PFW issued for mill B not producing
29/05/2003  10:00  53.3           1      B mill is running
01/06/2003  5:00   0.8            0      B feeder drag chain
06/06/2003  8:00   56.8           1      B mill is running
10/06/2003  22:00  0.2            0      Service
11/06/2003  7:00   56.6           1      B mill is running
13/07/2003  18:00  0.7            0      B mill is not running
14/07/2003  20:00  51.8           1      B g/box not controlling below 50%
19/07/2003  1:00   0.8            0      B feeder carter g/box
22/07/2003  4:00   42.6           1      B mill is on and running
23/07/2003  5:00   1.7            0      B PFW, carter g/box, east roller faulty, o/h to plan
24/07/2003  9:00   68.4           1      B operating
22/08/2003  10:00  0.1            0      B east roller faulty, o/h
14/11/2003  1:00   47.4           1      B mill is running
06/12/2003  8:00   0.6            0      Internal inspection of rejects required

Shift logs
MILL

UNIT1

UNIT2

UNIT3

UNIT4

service

available

available

overhaul

available

available

overhaul

available

available

service

available

available

available

available

available

available

available

available

service

available

available

available

service

available

available

available

available

service

available

overhaul

available

available

For illustrative purposes, data extracted from the shift logs are presented in this form. There are 8 mills in each unit and 7 are required for full production load. Availability is interpreted as the number of mills operational in a given time interval. The availability of mills in the plant is interrupted by services and overhauls.

Data on mill availability

Information about the number of mills available in each unit between 1999 and 2005 was extracted from the daily shift logs.

[Figure: Mill availability over time, one panel each for Unit 1, Unit 2, Unit 3 and Unit 4.]

Statistical Analysis

We conducted a study of the 2003 PI data records. The objective was to assess the suitability of a renewal process for modelling the availability of the mills. Using the state variable described above, we calculated the number of consecutive hours that each mill spent in state 1 and in state 0.

We explored these data for trend, seasonality and autocorrelation, and finally fitted a suitable model. The downtime data are analysed first, followed by the uptime data.
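
Counting consecutive hours in each state amounts to a run-length encoding of the hourly state series. The sketch below shows one way to do it; the hourly series and the function name are hypothetical, and the study's own procedure may have differed.

```python
from itertools import groupby


def run_lengths(states):
    """Return (state, consecutive_hours) for each run in an hourly 0/1 series."""
    return [(state, sum(1 for _ in group)) for state, group in groupby(states)]


# Hypothetical hourly state series: 1 = mill running, 0 = mill not running.
hourly_states = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1]

runs = run_lengths(hourly_states)
uptimes = [hours for state, hours in runs if state == 1]
downtimes = [hours for state, hours in runs if state == 0]
```

The first and last runs are cut off by the start and end of the record, so in a real analysis they would normally be treated as censored durations.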


Downtime Analysis
We conducted a Laplace test on the series of downtime data. Mills A, E and H tested positive for a decreasing mean downtime, and mill B tested positive for an increasing mean downtime. Mill E appears to have a change point in the series at about the 20th event; the Laplace test on the mill E data after the 20th event suggests no trend.

There was no convincing evidence of autocorrelation in the data. We therefore conclude that there is no trend in the downtimes of the mills and propose treating them as independent and identically distributed data.

Uptime analysis
Although we analysed these data through correlations, the results are best illustrated by categorising the data into long and short durations.

The correlation is not strong; for example, the autocorrelation coefficient was 0.227. We therefore conclude that there is no trend in the uptimes for mills A, B, C, D, F and G, and propose treating the times between events in the data set as independent and identically distributed data.

Uptime analysis
Mill H appears to have a change point at about the 100th event rather than a continuously changing mean uptime. Mill E appears to have a mean uptime that changes predictably.

We fitted a second-degree polynomial to the cumulative uptime for mill E and measured an R² of 99.2%. The model implies that the mean uptime increases linearly between successive events.

For the actual uptimes, rather than the cumulative data, the series showed no sign of autocorrelation but did exhibit signs of heteroscedasticity, with a linearly increasing variance.

Residuals adjusted for linearly increasing variance for Mill E

The residuals were obtained by estimating the successive uptimes from the first differences of the polynomial model fitted to the cumulative data. We divided each residual by its sequential rank (i.e. r_i/i) to stabilise the variability within the data. There were unusually large residuals for the first three events, and there appears to be a skew towards positive residuals throughout the data.

Mean residual uptime for Mill A

This strongly suggests an increasing residual mean uptime, which is consistent with
models such as a Pareto distribution, obtained through a mixture of exponentials

Reliability requirements

These requirements should be well known in advance in order to determine a satisfactory level of confidence.

However, maintenance actions and renewals can show that a system can meet its reliability requirements.

In reliability terms, repair is often assumed to renew the system to its original condition, so that the time between failures of a repaired system is treated like the time to failure of a non-repairable item.

This assumption is very unrealistic for probability modelling and leads to distortion of the statistical analysis (Ascher and Feingold, 1984).

Preliminaries on life distributions

The failure distribution represents an attempt to describe mathematically the length of life of a device (Barlow and Proschan, 1996).

On the basis of actual observations of time to failure, we work with the cumulative (life) distribution function, denoted by F(t), which is the probability that the lifetime T does not exceed t:

$$F(t) = P(T \le t), \qquad 0 < t < \infty.$$

The probability density function f(t) corresponding to F(t) is its derivative (if it exists). This is a non-negative function such that

$$F(t) = \int_0^t f(u)\,du, \qquad 0 \le t < \infty.$$

Reliability function

The reliability function, R(t), of a system having life distribution F(t) is

$$R(t) = 1 - F(t) = P(T > t).$$

This is the probability that the lifetime of the system will exceed t. Another important function related to the life distribution is the hazard function.

The hazard function is the ratio of the density function to the reliability function.

Hazard function

The hazard function represents the instantaneous hazard of a system that has survived t units of time. It is given by

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t \mid T > t)}{\Delta t},$$

where

$$P(t < T \le t + \Delta t \mid T > t) = \frac{P(t < T \le t + \Delta t,\; T > t)}{P(T > t)} = \frac{P(t < T \le t + \Delta t)}{P(T > t)}.$$

From the multiplication law of probability, notice that $h(t)\,\Delta t$ is approximately, for small $\Delta t$, the probability that a system still functioning at age t will fail during the time interval $(t, t + \Delta t)$.

By definition, $P(t < T \le t + \Delta t) = F(t + \Delta t) - F(t)$, so

$$h(t) = \frac{1}{1 - F(t)} \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} = \frac{f(t)}{R(t)}.$$

Hazard Function

The hazard function is of great importance to practitioners, and the expression can be used in estimating:

The time to failure (or time between failures)
Repair crew size for a given repair policy
Availability of a system
Warranty cost
Behaviour of system failure with time

Mean Time To Failure (MTTF)

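The mean time to failure (MTTF), $\mu$, is the expected value of T. The general definition of the expected value of a lifetime random variable T is

$$\mu = E(T) = \int_0^\infty t\, f(t)\,dt.$$

The equation can be rewritten so that the expected life is computed from the reliability function:

$$\int_0^\infty t\, f(t)\,dt = \int_0^\infty \left(\int_0^t ds\right) f(t)\,dt = \int_0^\infty \left(\int_s^\infty f(t)\,dt\right) ds = \int_0^\infty R(s)\,ds = \int_0^\infty R(t)\,dt.$$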
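
The identity between the MTTF and the area under the reliability function can be checked numerically. The sketch below is only an illustration: it assumes a constant hazard, for which R(t) = exp(-rate * t) and the MTTF is 1/rate, and the rate value, the truncation point and the function name are hypothetical choices.

```python
import math


def mttf_from_reliability(reliability, upper, steps=100_000):
    """Trapezoidal approximation of the integral of R(t) from 0 to upper."""
    width = upper / steps
    total = 0.5 * (reliability(0.0) + reliability(upper))
    for k in range(1, steps):
        total += reliability(k * width)
    return total * width


rate = 0.01  # hypothetical constant hazard, failures per hour

# The infinite upper limit is truncated at 2000 h, where R(t) is negligible.
mttf = mttf_from_reliability(lambda t: math.exp(-rate * t), upper=2000.0)
```

For this rate the result should be close to 1/rate = 100 hours, the small difference coming from the truncation and the step size.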


Repairable and non-repairable items

It is important to distinguish between repairable and non-repairable items when predicting reliability measures.

For a non-repairable item such as a light bulb, the reliability is the survival probability over the expected life.

During the item's life, the instantaneous probability of the first and only failure is called the hazard rate.

The pattern of failure with time (non-repairable systems)

There are three basic ways in which the pattern of failure can change with time: the hazard rate may be decreasing, increasing or constant.

A decreasing hazard rate is observed in items that become less likely to fail as their survival time increases, for example electronic equipment and parts.

A constant hazard rate is characteristic of failures that are caused by excess load or stress occurring at a constant average rate.

An increasing hazard rate is observed in items that become more likely to fail as they age, for example through wear-out.

Repairable system

For items that are repaired when they fail, the reliability is the probability that failure will not occur in the period of interest, when more than one failure can occur.

It can also be expressed as the rate of occurrence of failures (ROCOF).

Repairable systems can also be characterised by the mean time between failures (MTBF), but only under the condition of a constant failure rate.

The pattern of failure for repairable systems

The failure rate or ROCOF of repairable items can follow one of three trends over time: constant (CFR), decreasing (DFR) or increasing (IFR).

A constant failure rate (CFR) is indicative of externally induced failures. It is typical of complex systems subject to repair and overhaul, where different parts exhibit different patterns of failure.

A repairable system can show a decreasing failure rate (DFR) when reliability is improved by progressive repair, as defective parts are replaced by good parts.

An increasing failure rate (IFR) occurs when wear-out failure modes begin to predominate. The pattern of failure can be illustrated on a bathtub curve.

Example
Suppose the hazard of a given system is constant in time, that is, $h(t) = \lambda$ for all $0 \le t < \infty$. Then the reliability function is

$$R(t) = \exp\left(-\int_0^t \lambda\,du\right) = e^{-\lambda t}, \qquad t \ge 0.$$

This reliability function corresponds to the exponential life distribution, which has cumulative distribution function

$$F(t) = 1 - e^{-\lambda t}, \qquad t \ge 0.$$
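
The step from a constant hazard to the exponential reliability function rests on a general relation between h(t) and R(t), which follows from the definitions h(t) = f(t)/R(t) and R(t) = 1 - F(t). A short derivation, with u used as a dummy integration variable (a notational choice made here), is:

```latex
h(t) = \frac{f(t)}{R(t)} = -\frac{R'(t)}{R(t)} = -\frac{d}{dt}\ln R(t)
\quad\Longrightarrow\quad
R(t) = \exp\!\left(-\int_0^t h(u)\,du\right), \qquad R(0) = 1 .
```

With a constant hazard the integral equals the hazard multiplied by t, which gives the exponential form in the example.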

Graph of F(t), R(t) and h(t) for exponential distribution

[Figure: Three panels showing the exponential cumulative distribution function F(t), the exponential reliability function R(t) and the exponential hazard function h(t).]

Weibull Function

The Weibull distribution is the asymptotic distribution of the smallest extreme for an initial underlying distribution that is bounded.

The Weibull model gives a non-linear expression for the hazard function. It is used when the hazard cannot be represented as a linear function of time.

The cumulative distribution function of a Weibull variate T is given as

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right], \qquad t \ge 0,$$

where $\beta > 0$ and $\eta > 0$ are the shape and scale parameters respectively.

The Weibull Model

In the context of reliability modelling, the extreme value distributions for the minimum are frequently encountered. For example, if a system consists of n identical components and the system fails when the first of these components fails, then the system failure time is the minimum of n random component failure times.

Extreme value theory says that, independent of the choice of component model, the system model will approach a Weibull as n becomes large.

Weibull Distribution

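The density function is expressed as

$$f(t) = \frac{\beta}{\eta^{\beta}}\, t^{\beta - 1} \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right],$$

the hazard is of the form

$$h(t) = \frac{\beta\, t^{\beta - 1}}{\eta^{\beta}},$$

and the reliability function is given as

$$R(t) = \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right],$$

where $\beta$ is the shape parameter and $\eta$ is the scale parameter.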

[Figure: General failure curve]

Weibull Reliability

β: shape parameter of the distribution
η: scale parameter of the distribution

β > 1: we get the increasing hazard rate reliability function (the wear-out region of the bathtub curve)
β = 1: reduces to the exponential reliability function (the constant failure rate region)
β < 1: we get the decreasing hazard rate reliability function (the early failure rate region)

Measure the overall reliability

β < 1 Implies Infant Mortality

The term infant mortality stems from the high mortality rate of infants. Electronic and mechanical systems may initially have a high failure rate. Manufacturers use production acceptance tests, burn-in and environmental stress screening to end infant mortality before shipment to clients. Therefore β < 1 leads us to suspect:

Inadequate burn-in or stress screening
Production problems, misassembly, quality control
Overhaul problems
Solid-state electronic failure

If the dominant failure mode for a component has β < 1 and the component survives infant mortality, it will improve with age. Conditional on survival, the failure rate decreases and the reliability increases.

β = 1 Implies Random Failures

By random we mean that the failures are independent of time. These failure modes are ageless: an old part is as good as new if the failure mode is random. Therefore we might suspect:

Maintenance errors, human errors
Failures due to nature, foreign object damage, lightning strikes
A mixture of data from three or more failure modes (assuming they all have different β)

Here again, overhauls are not appropriate.
A Weibull with β = 1 is identical to the exponential distribution.

1.0 < β < 4 Implies Early Wear-Out

If these failures occur within the design life, they are unpleasant surprises.
There are many mechanical failure modes in this class:
Low-cycle fatigue
Most bearing failures
Corrosion, erosion
Overhaul or part replacement at low B lives may be cost effective.
The period for overhaul is read off the Weibull plot at the appropriate B life.

Tutorial Questions

An item is known to have a failure time that is Weibull distributed with characteristic life 250 h and shape parameter 2.5.

What is the reliability at 100 h, and at what time is the reliability 95%?

Tutorial (A)

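$$R(100) = \exp\left[-\left(\frac{100}{250}\right)^{2.5}\right] = 0.904.$$

$$R(t) = 0.95 = \exp\left[-\left(\frac{t}{250}\right)^{2.5}\right] \quad\Longrightarrow\quad \left(\frac{t}{250}\right)^{2.5} = -\ln(0.95) = 0.0513,$$

$$t = 250 \times (0.0513)^{1/2.5} \approx 76.2 \text{ hours}.$$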
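
The tutorial quantities can be checked with a few lines of Python. This is a minimal sketch assuming the two-parameter Weibull form R(t) = exp(-(t/scale)^shape), with the characteristic life taken as the scale parameter; the function names are illustrative only.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape) for a two-parameter Weibull."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_time_at_reliability(r, shape, scale):
    """Invert R(t) = r for t: t = scale * (-ln r) ** (1 / shape)."""
    return scale * (-math.log(r)) ** (1.0 / shape)


r_100 = weibull_reliability(100.0, shape=2.5, scale=250.0)       # about 0.904
t_95 = weibull_time_at_reliability(0.95, shape=2.5, scale=250.0)  # about 76.2 hours
```

Substituting the returned time back into the reliability function should recover 0.95, which is a useful check on a hand calculation. Because R(100) is already below 0.95, the answer must also be less than 100 hours.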


Stochastic processes

The word stochastic derives from the Greek στοχάζεσθαι (to aim, to guess) and means random or chance.

A stochastic process may be thought of as a family of random variables indexed by a parameter such as time.

Stochastic processes are ways of quantifying the dynamic relationships of sequences of random events (Taylor and Karlin, 1994). They describe random phenomena changing with time. Such phenomena occur in complex repairable systems, such as industrial machinery, and in other fields, and have attracted increasing attention in recent years.

Stochastic Process
A set of random variables, or observations of the same random variable over time: $\{X(t),\ t \ge 0\}$ (continuous-parameter) or $\{X_n,\ n = 0, 1, \ldots\}$ (discrete-parameter).

X(t) may be either discrete-valued or continuous-valued.

A counting process is a discrete-valued, continuous-parameter stochastic process that increases by one each time some event occurs. The value of the process at time t is the number of events that have occurred up to (and including) time t.

Basic concept of stochastic processes

A stochastic model predicts a set of possible outcomes weighted by their probabilities. Such models play an important role in elucidating many areas of application, and they can be used to analyse the inherent reliability of many processes.

Stochastic models give insight into the uncertainties affecting managerial decisions (Taylor and Karlin, 1994).

For example, a repair model can be used to determine the optimal time for preventive maintenance before a failure occurs.

Stochastic point processes

Stochastic point processes have been applied to repairable systems. They are mathematical models characterized by highly localized events distributed randomly in a continuum.

The continuum is time and the highly localized events are failures, which are assumed to occur at instants within the continuum (Crowder et al., 1991).

The whole body of techniques developed for point processes is potentially applicable to system failure data.

Stochastic point process

A stochastic point process represents the successive arrival and inter-arrival times of system failures, under the assumption that the system is operated whenever possible and that repair times are negligible.

The pattern of failures necessarily develops in calendar time. If a system is sometimes shut down and no repair is considered, the exact connection to calendar time disappears, but the successive failures are still ordered in calendar time (Ascher and Feingold, 1984).

Operating time can be used for reliability studies.

Poisson Process
Let $\{X(t),\ t \ge 0\}$ be a stochastic process where X(t) is the number of events (arrivals) up to time t. Assume X(0) = 0 and
(i) Pr(arrival occurs between t and $t + \Delta t$) = $\lambda \Delta t + o(\Delta t)$, where $o(\Delta t)$ is some quantity such that $\lim_{\Delta t \to 0} o(\Delta t)/\Delta t = 0$;
(ii) Pr(more than one arrival between t and $t + \Delta t$) = $o(\Delta t)$;
(iii) if t < u < v < w, then X(w) − X(v) is independent of X(u) − X(t).
Let $p_n(t)$ = P(n arrivals occur during the interval (0, t)). Then

$$p_n(t) = \frac{e^{-\lambda t} (\lambda t)^n}{n!}, \qquad n \ge 0.$$
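
The Poisson probabilities can be evaluated directly, and a sample path can be simulated from independent exponential gaps between arrivals, which is the standard equivalent description of this process. The sketch below is illustrative only: the rate, the horizon, the seed and the function names are hypothetical choices.

```python
import math
import random


def poisson_pmf(n, rate, t):
    """Probability of exactly n arrivals in (0, t] for a Poisson process."""
    mean = rate * t
    return math.exp(-mean) * mean ** n / math.factorial(n)


def simulate_hpp(rate, horizon, rng):
    """Arrival times in (0, horizon], built from exponential inter-arrival gaps."""
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(42)  # fixed seed so the sketch is repeatable
arrivals = simulate_hpp(rate=0.5, horizon=100.0, rng=rng)
p_none = poisson_pmf(0, rate=0.5, t=2.0)  # equals exp(-1)
```

Averaged over many simulated paths, the number of arrivals in the horizon would be close to rate times horizon, here 50.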

Poisson Process

Let $T_1, T_2, T_3, \ldots$ be the times of successive failures of the system and let $X_i = T_i - T_{i-1}$ be the time between failure i − 1 and failure i, where $T_0 = 0$. The $T_i$ and $X_i$ are random variables, and we define $t_i$ and $x_i$ to be their corresponding realized values. We define N(t) as the number of failures in the time interval (0, t].

[Figure: Time line starting with a new system, with successive failures marked at times t1, t2, t3 and t4.]

Intensity function

The intensity function of a stochastic point process is

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr\{N(t + \Delta t) - N(t) \ge 1\}}{\Delta t}.$$

A point process is said to be regular or orderly if

$$\Pr\{N(t + \Delta t) - N(t) \ge 2\} = o(\Delta t).$$

That is, two or more failures cannot occur simultaneously. We will assume this property throughout.

Point process models

The important point in system failure data analysis is that failures occur in a specific sequence, and the rate at which they occur can be increasing, decreasing or constant.

The point process models that have been applied to repairable system reliability are the homogeneous Poisson process (HPP), the non-homogeneous Poisson process (NHPP) and the superimposed renewal process (SRP).

The modelling terms used in reliability for components (parts) and for systems have often been confused by reliability engineers and scientists.

Non-homogeneous Poisson
process (NHPP)

Most repairs involve the replacement of a very small fraction of a system's constituent parts.

It is therefore plausible to assume that system reliability after repair is essentially the same as it was immediately before failure.

This assumption leads to the NHPP as a system reliability model.

Characteristics of NHPP
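$N(0) = 0$ (system initialisation at time t = 0)

$N(t) - N(s)$ is independent of $N(s)$ for s < t (independence of increments)

$$N(t) - N(s) \sim \mathrm{Po}\left(\int_s^t \lambda(u)\,du\right),$$

where

$$\lambda(t) = \frac{dE[N(t)]}{dt}.$$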
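
One way to see these properties at work is to simulate an NHPP by transforming the arrival times of a unit-rate Poisson process through the inverse of the mean function. The sketch below assumes a power-law mean function E[N(t)] = (t/scale)^shape, which is one common parameterisation and not necessarily the one used elsewhere in these slides; the parameter values, seed and function names are hypothetical.

```python
import random


def power_law_mean(t, shape, scale):
    """Expected number of failures by time t: (t / scale) ** shape."""
    return (t / scale) ** shape


def simulate_power_law_nhpp(shape, scale, horizon, rng):
    """Failure times in (0, horizon] for a power-law NHPP.

    Arrival times s of a unit-rate Poisson process are mapped to
    t = scale * s ** (1 / shape), the inverse of the mean function.
    """
    limit = power_law_mean(horizon, shape, scale)
    times = []
    s = rng.expovariate(1.0)
    while s <= limit:
        times.append(scale * s ** (1.0 / shape))
        s += rng.expovariate(1.0)
    return times


rng = random.Random(1)  # fixed seed so the sketch is repeatable
failure_times = simulate_power_law_nhpp(shape=2.0, scale=100.0, horizon=200.0, rng=rng)
```

With a shape above 1 the intensity increases with time, so simulated failures tend to bunch towards the end of the horizon; a shape below 1 gives the opposite pattern.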

NHPP

There are many connections between the NHPP and the distribution of time to failure.

A system of age t, modelled by an NHPP, can for some purposes be considered to have age x (numerically equal to t).

This results from the independent increments property of the NHPP (a non-stationary process).

Homogeneous Poisson process (HPP)

The most straightforward way to define the HPP is as a sequence of independent and identically distributed (IID) exponential inter-arrival times $X_i$.

Several equivalent definitions refer to the HPP as an orderly stochastic process with stationary, independent increments.

Renewal Process
The renewal process is defined as a sequence of independent and identically distributed non-negative random variables $X_1, X_2, X_3, \ldots$ which, with probability 1, are not all zero. Hence it is a generalisation of the HPP.

Example

[Figure: Time line starting with a new system, with successive failures marked at times t1, t2, t3 and t4.]

N(t) = number of failures to time t
H(t) = history of failures to time t

Reliability Model relationships


MODELS
Minimal Repair

Non-homogeneous
Poisson Process

Partial Repair

Rejuvenation

Maximal Repair

Correction

Renewal Process

Proportional Intensities Model (Cox, 1972)
Nt

t 0 t si

where

i 1

si

0 t

= constant scaling factor

baseline intensity function

s1 s N t

0 t

constant (exponential)

0 t t

loglinear (truncated gumbel)

0 t t

power law (Weibull)



Hypothetical Data from Ascher and Feingold (1984)

happy system  sad system  noncommittal system
15            177         51
27            65          43
32            51          27
43            43          177
51            32          15
65            27          65
177           15          32
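
The three hypothetical systems contain the same seven values in different orders, so a trend statistic is what separates them. The sketch below applies a Laplace trend statistic to each column, assuming the values are successive inter-failure times and that observation ends at the last failure (the failure-truncated form of the test). These assumptions and the function name are illustrative and are not stated in the slides.

```python
import math
from itertools import accumulate


def laplace_statistic(inter_failure_times):
    """Laplace trend statistic for failure-truncated data.

    Uses the arrival times of all failures except the last, compared
    with the midpoint of the observation period (the last arrival time).
    Approximately standard normal when there is no trend.
    """
    arrival_times = list(accumulate(inter_failure_times))
    end = arrival_times[-1]
    earlier = arrival_times[:-1]
    m = len(earlier)
    if m < 1:
        raise ValueError("at least two failures are needed")
    return (sum(earlier) / m - end / 2.0) / (end * math.sqrt(1.0 / (12.0 * m)))


happy = [15, 27, 32, 43, 51, 65, 177]
sad = [177, 65, 51, 43, 32, 27, 15]
noncommittal = [51, 43, 27, 177, 15, 65, 32]

u_happy = laplace_statistic(happy)
u_sad = laplace_statistic(sad)
u_noncommittal = laplace_statistic(noncommittal)
```

A clearly negative value indicates failures concentrated early, with inter-failure times growing (an improving system); a clearly positive value indicates the opposite; a value near zero is consistent with no trend.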

Log-likelihoods for Hypothetical Data

Model                 Baseline intensity  happy system  sad system  noncommittal system
PIM (partial repair)  constant            -33.7         -33.7       -35.5
                      loglinear           -32.4         -28.6       -31.0
                      power-law           -29.2         -32.0       -34.7
maximal repair        constant            -35.5         -35.5       -35.5
                      loglinear           -34.8         -34.8       -34.8
                      power-law           -35.1         -35.1       -35.1
minimal repair        constant            -35.5         -35.5       -35.5
                      loglinear           -34.8         -32.0       -35.2
                      power-law           -35.0         -31.8       -35.3

Contour Plots for Log-likelihood Fits

[Figure: Log-likelihood contour plots for three fits: power-law fit to happy data, loglinear fit to sad data, loglinear fit to noncommittal data.]

Intensity Functions for Chosen Models

[Figure: Fitted intensity functions for the happy, sad and noncommittal systems. Each panel has a time axis running to 410 and an intensity axis running to 0.18.]

Revision: Possible Questions/Problems

Definition of reliability, hazard and stochastic processes
Show the reliability function and give an applied example
Derivation of important equations
Hazard function
Stochastic point process
Weibull model
Point process models
Intensity function
Bathtub curves and their significance for repairable and non-repairable systems