Methods in Reliability
Larry Leemis
Department of Mathematics
College of William and Mary
Williamsburg, VA 23187-8795
leemis@math.wm.edu 757-221-2034
Undergraduate Simulation, Modeling and Analysis
February 14, 2000
Outline
1. Introduction
2. Coherent Systems Analysis
3. Lifetime Distributions
4. Parametric Lifetime Models
5. Specialized Models
6. Repairable Systems
7. Lifetime Data Analysis
8. Fitting Parametric Models to Data
9. Parametric Estimation for Models with Covariates
10. Nonparametric Methods
11. Assessing Model Adequacy
1. Introduction
Motivation
Space Shuttle Challenger accident
Chernobyl and Three Mile Island accidents
product liability
customer goodwill
corporate reputation
Three closely-related fields of study
Actuarial science
Biostatistics
Reliability engineering
Terminology
The event at the end of a lifetime is called
a failure by reliability engineers
death by actuaries and biostatisticians
an epoch by point process researchers
The object of a study is called
a system, component or item by reliability
engineers
an individual by actuaries
an organism by biostatisticians
To avoid switching terms, failure of an item will be
used here.
Adequate performance
Must be stated unambiguously
Standards
Example: a ball bearing has failed when its
diameter falls outside of 3 ± 0.05 mm
Binary models
the item is in either the functioning or failed
state (e.g., a fuse)
Purpose
Intended use
Example: a drill may have one grade for a
handyman and another for a contractor
Time
Units
must be specified (e.g., hours, years)
Notation
many lifetime models use the random
variable T
Time need not be taken literally
consider an automobile tire, light switch
Time duration must be specified
Example: 1000 hour reliability is 0.8
Continuous operation vs. on/off cycling
time alone may not be the only consideration
(e.g., motors, computers)
Environmental conditions
Factors
temperature, humidity, and turning speed all
affect the lifetime of a machine tool
Preventive maintenance
usually effective in prolonging the lifetime of
the item and hence increasing the reliability
Reliability vs. quality
reliability incorporates the passage of time
quality is a static descriptor of an item
Example 1: Two transistors of equal quality. One
used in a television set, the other in a cannon
launch environment. Identical quality, different
reliabilities.
Example 2: Two automobile tires, each of high
quality. One was produced in 1957, the other in
1994. Same purpose, different reliabilities due
to improved design (e.g., tread or steel belts),
components (e.g., rubber) or processes (e.g.,
manufacturing advances).
Some quality
improvements (e.g., improved tread design)
improve the reliability of the tire, while others
(e.g., improved white wall design) will not.
[Figure 1.3: Launch temperature versus number of field joint failures. Horizontal axis: Temperature (55 to 85); vertical axis: Failures (0.0 to 3.0).]
Conclusion: temperature was indeed significant
components
two key modeling decisions
which elements of the system are included
the level of detail
the first two sections: structural properties
the next two sections: probabilistic properties
outline
structure functions
minimal path and cut sets
reliability functions
reliability bounds
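Series system:
$$\phi(x) = \prod_{i=1}^{n} x_i .$$
Parallel system:
$$\phi(x) = \begin{cases} 0 & \text{if } x_i = 0 \text{ for all } i = 1, 2, \ldots, n \\ 1 & \text{if there exists an } i \text{ such that } x_i = 1 \end{cases}$$
$$\phi(x) = \max\{x_1, x_2, \ldots, x_n\} = 1 - \prod_{i=1}^{n} (1 - x_i).$$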
[Figure 2.2: A parallel system block diagram with components 1, 2, 3, ..., n.]
Applications
kidneys
brake system on an automobile with two brake
fluid reservoirs
Series and parallel systems are special cases of k-out-of-n systems, where the system functions if k or more
of the n components function.
Applications
$$\phi(x) = \begin{cases} 0 & \text{if } \sum_{i=1}^{n} x_i < k \\ 1 & \text{if } \sum_{i=1}^{n} x_i \ge k. \end{cases}$$
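Series system:
$$r(p) = P[\phi(X) = 1] = P\left[\prod_{i=1}^{n} X_i = 1\right] = \prod_{i=1}^{n} P[X_i = 1] = \prod_{i=1}^{n} p_i$$
[Figure: r(p) versus p for a series system with n = 1, 2, 5, 10 components.]
Parallel system:
$$r(p) = E[\phi(X)] = E\left[1 - \prod_{i=1}^{n} (1 - X_i)\right] = 1 - \prod_{i=1}^{n} E[1 - X_i] = 1 - \prod_{i=1}^{n} (1 - p_i)$$
Special case: identical components
$$r(p) = 1 - (1 - p)^n$$
[Figure: r(p) versus p for a parallel system with n = 1, 2, 5, 10 components.]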
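The structure and reliability functions for series, parallel, and k-out-of-n systems can be sketched in a few lines of Python. This is a minimal illustration, assuming independent components; the function names and the component reliabilities in the usage note are hypothetical and do not come from the slides. The k-out-of-n case enumerates all 2^n component states, which is only practical for small n.

```python
from itertools import product
from math import prod


def series_reliability(p):
    """r(p) for a series system: the product of the component reliabilities."""
    return prod(p)


def parallel_reliability(p):
    """r(p) for a parallel system: 1 minus the product of the unreliabilities."""
    return 1.0 - prod(1.0 - pi for pi in p)


def k_out_of_n_reliability(k, p):
    """r(p) for a k-out-of-n system, by enumerating every state vector x.

    A state contributes when at least k components function (sum of x >= k).
    """
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        if sum(x) >= k:
            total += prod(pi if xi else 1.0 - pi for xi, pi in zip(x, p))
    return total
```

For example, with the hypothetical reliabilities `[0.9, 0.8, 0.7]`, a 3-out-of-3 system agrees with the series result and a 1-out-of-3 system agrees with the parallel result, which is one way to check that series and parallel systems are special cases.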
3. Lifetime Distributions
Motivation
Up to this point, reliability has only been considered
at one particular instance of time.
Outline
lifetime distribution representations
discrete distributions
moments and fractiles
system lifetime distributions
distribution classes
Interpretations
S(t) is the probability that an individual item is
functioning at time t
S(t) is the expected fraction of items surviving to
time t
[Figure: two survivor functions, S1(t) and S2(t), plotted against t.]
$$P[a \le T \le b] = \int_a^b f(t)\,dt \qquad \int_0^\infty f(t)\,dt = 1$$
[Figure: probability density function f(t) versus t, with the areas F(t0) and S(t0) on either side of t0.]
$$\int_0^\infty h(t)\,dt = \infty$$
[Figure: hazard functions h(t) versus t for the IFR, DFR, and BT classes.]
$$H(t) = \int_0^t h(\tau)\,d\tau \qquad t \ge 0$$
Applications
variate generation in Monte Carlo simulation
implementing certain procedures in statistical
inference
defining certain distribution classes
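$$L(t) = E[T - t \mid T \ge t] = \frac{1}{S(t)} \int_t^\infty \tau f(\tau)\,d\tau - t \qquad t \ge 0$$
All mean residual life functions satisfy
$$L(t) \ge 0 \qquad L'(t) \ge -1 \qquad \int_0^\infty \frac{dt}{L(t)} = \infty$$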
e d t =
t0
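$$H(t) = \int_0^t h(\tau)\,d\tau = \int_0^t \frac{f(\tau)}{S(\tau)}\,d\tau = -\log S(t)$$
$$E[u(T)] = \int_0^\infty u(t) f(t)\,dt$$
$$\mu = E[T] = \int_0^\infty t f(t)\,dt = \int_0^\infty S(t)\,dt$$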
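Variance
$$\sigma^2 = V[T] = E[(T - \mu)^2] = E[T^2] - (E[T])^2$$
Coefficient of variation
$$\gamma = \frac{\sigma}{\mu}$$
Skewness
$$\gamma_3 = E\left[\left(\frac{T - \mu}{\sigma}\right)^3\right]$$
Kurtosis
$$\gamma_4 = E\left[\left(\frac{T - \mu}{\sigma}\right)^4\right]$$
Fractiles: $t_p$ satisfies
$$F(t_p) = P[T \le t_p] = p \quad \text{or} \quad t_p = F^{-1}(p)$$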
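Example: exponential lifetime with failure rate $\lambda$
$$\mu = E[T] = \int_0^\infty S(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$$
$$E[T^2] = \int_0^\infty t^2 f(t)\,dt = \int_0^\infty t^2 \lambda e^{-\lambda t}\,dt = \frac{2}{\lambda^2}$$
$$\sigma^2 = E[T^2] - (E[T])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$
$$\gamma_3 = \lambda^3 \left(\frac{6}{\lambda^3} - \frac{6}{\lambda^3} + \frac{3}{\lambda^3} - \frac{1}{\lambda^3}\right) = 2$$
$$t_p = -\frac{1}{\lambda} \log(1 - p)$$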
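The moment and fractile definitions can be checked numerically. The sketch below assumes an exponential lifetime with a hypothetical failure rate of 2 (a value chosen only for illustration), approximates the mean by integrating the survivor function with the trapezoid rule, and finds the median by bisection on F(t). The helper names are not from the slides.

```python
import math


def mean_from_survivor(S, upper, steps=100_000):
    """Approximate E[T] as the integral of S(t) over [0, upper] (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (S(0.0) + S(upper))
    for i in range(1, steps):
        total += S(i * h)
    return total * h


def fractile(F, p, hi=1.0):
    """Solve F(t_p) = p by bisection, doubling hi until it brackets t_p."""
    lo = 0.0
    while F(hi) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam = 2.0  # hypothetical failure rate, for illustration only
S = lambda t: math.exp(-lam * t)
F = lambda t: 1.0 - S(t)

mean = mean_from_survivor(S, upper=20.0)  # should be close to 1 / lam
median = fractile(F, 0.5)                 # should be close to log(2) / lam
```

The truncation point `upper=20.0` is an assumption that works here because the survivor function is negligible beyond it; a heavier-tailed distribution would need a larger value.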
4. Parametric Lifetime Models
Motivation
Survival patterns of a drill bit, a fuse, and an
automobile are vastly different.
Outline
parameters
exponential
Weibull
gamma
other distributions
4.1 Parameters
Three types of parameters:
location
scale
shape
Location (or shift) parameters
Shift a distribution along the time axis. If $c_1$
and $c_2$ are two values of a location parameter
for a lifetime distribution with survivor function
$S(t; c)$, then there exists a constant $\alpha$ such that
$S(t; c_1) = S(\alpha + t; c_2)$.
Example: Mean $\mu$ in the normal distribution.
Scale parameters
Used to expand or contract the time axis by a
factor of $\alpha$. If $\lambda_1$ and $\lambda_2$ are two values for a
scale parameter for a lifetime distribution with
survivor function $S(t; \lambda)$, then there exists a
constant $\alpha$ such that $S(\alpha t; \lambda_1) = S(t; \lambda_2)$.
Example: Exponential scale parameter $\lambda$.
Shape parameters
Affect the shape of the probability density
function.
Example: Weibull shape parameter $\kappa$.
[Figure 4.2: A mixed discrete-continuous survivor function, S(t) versus t.]
[Figure: lifetime distribution representations for the exponential distribution, including S(t), h(t), H(t), and L(t) versus t, for $\lambda = 1$ and $\lambda = 2$.]
$$S(t) = e^{-\lambda t} \qquad f(t) = \lambda e^{-\lambda t} \qquad h(t) = \lambda \qquad H(t) = \lambda t \qquad L(t) = \frac{1}{\lambda} \qquad t \ge 0$$
Motivation
The exponential distribution's constant failure rate
is often too restrictive or inappropriate.
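Weibull distribution:
$$S(t) = e^{-(\lambda t)^{\kappa}} \qquad f(t) = \kappa \lambda^{\kappa} t^{\kappa - 1} e^{-(\lambda t)^{\kappa}}$$
$$h(t) = \kappa \lambda^{\kappa} t^{\kappa - 1} \qquad H(t) = (\lambda t)^{\kappa}$$
for all $t \ge 0$.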
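Notes
$\lambda$ is a positive scale parameter
$\kappa$ is a positive shape parameter
exponential distribution is a special case ($\kappa = 1$)
hazard function increases from 0 when $\kappa > 1$ (IFR)
hazard function decreases from $\infty$ to 0 when $\kappa < 1$ (DFR)
$\kappa = 2$ known as the Rayleigh distribution
when $3 < \kappa < 4$ the probability density function resembles that of a normal random variable
the mode and median of the distribution are equal when $\kappa \approx 3.26$
the characteristic life is a special fractile defined by $t_c = 1/\lambda$; all Weibull survivor functions pass through the point $(1/\lambda, e^{-1})$
self-reproducing property: if $T_i \sim \text{Weibull}(\lambda_i, \kappa)$ for $i = 1, 2, \ldots, n$, then
$$\min\{T_1, T_2, \ldots, T_n\} \sim \text{Weibull}\left(\left(\sum_{i=1}^{n} \lambda_i^{\kappa}\right)^{1/\kappa}, \kappa\right)$$
moments
$$\mu = \frac{1}{\lambda} \Gamma\left(1 + \frac{1}{\kappa}\right) = \frac{1}{\lambda \kappa} \Gamma\left(\frac{1}{\kappa}\right)$$
$$\sigma^2 = \frac{1}{\lambda^2} \left\{ \frac{2}{\kappa} \Gamma\left(\frac{2}{\kappa}\right) - \left[\frac{1}{\kappa} \Gamma\left(\frac{1}{\kappa}\right)\right]^2 \right\}$$
$$\gamma = \frac{\sigma}{\mu} = \frac{\left[2\kappa\,\Gamma(2/\kappa) - \Gamma(1/\kappa)^2\right]^{1/2}}{\Gamma(1/\kappa)}$$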
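Several of the Weibull notes can be verified with a short sketch. The code below uses the scale-shape parameterization with survivor function exp(-(lam * t) ** kappa); the function names are hypothetical, and the self-reproducing check additionally assumes the lifetimes are independent.

```python
import math


def weibull_survivor(t, lam, kappa):
    """S(t) = exp(-(lam * t) ** kappa)."""
    return math.exp(-((lam * t) ** kappa))


def weibull_hazard(t, lam, kappa):
    """h(t) = kappa * lam ** kappa * t ** (kappa - 1), for t > 0."""
    return kappa * lam ** kappa * t ** (kappa - 1.0)


def weibull_mean(lam, kappa):
    """mu = Gamma(1 + 1 / kappa) / lam."""
    return math.gamma(1.0 + 1.0 / kappa) / lam


def weibull_variance(lam, kappa):
    """sigma^2 = [Gamma(1 + 2 / kappa) - Gamma(1 + 1 / kappa) ** 2] / lam ** 2."""
    g1 = math.gamma(1.0 + 1.0 / kappa)
    g2 = math.gamma(1.0 + 2.0 / kappa)
    return (g2 - g1 * g1) / lam ** 2


def min_weibull_scale(lams, kappa):
    """Scale parameter of min{T_1, ..., T_n} for independent Weibull lifetimes
    sharing the shape kappa (independence is an assumption of this sketch)."""
    return sum(lam ** kappa for lam in lams) ** (1.0 / kappa)
```

With `kappa = 1` the mean and variance reduce to the exponential values 1/lam and 1/lam^2, and `weibull_survivor(1 / lam, lam, kappa)` equals exp(-1) for any shape, which is the characteristic life property.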
[Figure: Weibull f(t), S(t), h(t), H(t), and L(t) versus t for $\kappa = 0.5, 1, 2, 3$.]
Distribution     IFR   DFR   BT    UBT
Exponential      YES   YES   NO    NO
Muth             YES   NO    NO    NO
Weibull          YES   YES   NO    NO
Gamma            YES   YES   NO    NO
Uniform          YES   NO    NO    NO
Log normal       NO    NO    NO    YES
Log logistic     NO    YES   NO    YES
Inv. Gaussian    NO    NO    NO    YES
Expon. power     YES   NO    YES   NO
Pareto           NO    YES   NO    NO
Gompertz         YES   NO    NO    NO
Makeham          YES   NO    NO    NO
IDB              YES   YES   YES   NO
Gen. Pareto      YES   YES   NO    NO
5. Specialized Models
Motivation
There are several ways to combine and extend the
continuous lifetime models previously outlined.
Outline
competing risks
mixtures
accelerated life
proportional hazards
lifetime
number of risks
net life for risk j
crude life for risk j
$\pi_j$: probability risk j causes failure
net lifetimes: causes C 1 , . . . , C k are viewed
individually
crude lifetimes: lifetimes are considered in the
presence of all other risks
5.2 Mixtures
Mixture models are appropriate when items are
drawn from one of several populations (finite
mixtures) or can be differentiated by a continuous
parameter.
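Finite mixtures
$$f(t) = \sum_{l=1}^{m} p_l f_l(t \mid \theta_l)$$
where $\sum_{l=1}^{m} p_l = 1$, $p_l \ge 0$ for $l = 1, 2, \ldots, m$.
Continuous mixtures
$$f(t) = \int_{\text{all } \theta} f(t \mid \theta)\, p(\theta)\, d\theta$$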
$$f(t) = \frac{1}{3} e^{-t} + \frac{4}{3} e^{-2t} \qquad t \ge 0$$
which is a finite mixture of the two
populations. This model is a special case of
the hyperexponential distribution.
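A finite mixture of exponential populations can be sketched as follows. Reading the density above as a mixture with weights 1/3 and 2/3 on exponential populations with rates 1 and 2 is an interpretation (it reproduces the coefficients 1/3 and 4/3), and the function names are hypothetical.

```python
import math


def mixture_pdf(t, weights, rates):
    """f(t) = sum over l of p_l * lam_l * exp(-lam_l * t)."""
    return sum(p * lam * math.exp(-lam * t) for p, lam in zip(weights, rates))


def mixture_survivor(t, weights, rates):
    """S(t) = sum over l of p_l * exp(-lam_l * t)."""
    return sum(p * math.exp(-lam * t) for p, lam in zip(weights, rates))


def mixture_mean(weights, rates):
    """E[T] = sum over l of p_l / lam_l."""
    return sum(p / lam for p, lam in zip(weights, rates))


weights = [1.0 / 3.0, 2.0 / 3.0]  # mixing probabilities, must sum to 1
rates = [1.0, 2.0]                # exponential failure rates of the populations
```

Evaluating `mixture_pdf(t, weights, rates)` at any t matches `(1/3) * exp(-t) + (4/3) * exp(-2 * t)`, since the second term carries the factor (2/3) * 2.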
=
pl h lj (t) e 0 j = 1
j=1
f (t) =
l=1
l=1
pl = 1,
burglary
Notation
$z = (z_1, z_2, \ldots, z_q)$: covariates
$\beta = (\beta_1, \beta_2, \ldots, \beta_q)$: regression coefficients
$\psi(z)$: link function
$S_0(t), f_0(t), h_0(t), H_0(t)$: baseline functions
consuming
         Accelerated Life               Proportional Hazards
S(t)     $S_0(t\psi(z))$                $[S_0(t)]^{\psi(z)}$
f(t)     $\psi(z) f_0(t\psi(z))$        $\psi(z) f_0(t) [S_0(t)]^{\psi(z) - 1}$
h(t)     $\psi(z) h_0(t\psi(z))$        $\psi(z) h_0(t)$
H(t)     $H_0(t\psi(z))$                $\psi(z) H_0(t)$
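The two models can be compared with a small sketch. The code assumes a Weibull baseline and a log-linear link, psi(z) = exp(beta . z); both choices, the function names, and the numeric values in the usage note are illustrative assumptions rather than anything specified above.

```python
import math


def psi(z, beta):
    """Log-linear link function exp(beta . z) (an assumed form of the link)."""
    return math.exp(sum(b * zj for b, zj in zip(beta, z)))


def h0(t, lam, kappa):
    """Baseline Weibull hazard function."""
    return kappa * lam ** kappa * t ** (kappa - 1.0)


def H0(t, lam, kappa):
    """Baseline Weibull cumulative hazard function."""
    return (lam * t) ** kappa


def al_hazard(t, z, beta, lam, kappa):
    """Accelerated life: h(t) = psi(z) * h0(t * psi(z))."""
    p = psi(z, beta)
    return p * h0(t * p, lam, kappa)


def ph_hazard(t, z, beta, lam, kappa):
    """Proportional hazards: h(t) = psi(z) * h0(t)."""
    return psi(z, beta) * h0(t, lam, kappa)


def al_survivor(t, z, beta, lam, kappa):
    """Accelerated life: S(t) = S0(t * psi(z)) = exp(-H0(t * psi(z)))."""
    return math.exp(-H0(t * psi(z, beta), lam, kappa))


def ph_survivor(t, z, beta, lam, kappa):
    """Proportional hazards: S(t) = S0(t) ** psi(z) = exp(-psi(z) * H0(t))."""
    return math.exp(-psi(z, beta) * H0(t, lam, kappa))
```

With an exponential baseline (`kappa = 1`) the two hazards coincide, for instance at `t = 0.8`, `z = [1.0]`, `beta = [0.5]`, `lam = 2.0`; for other shapes they differ because the accelerated life model rescales time while the proportional hazards model rescales the hazard.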
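[Figure: hazard functions versus t, comparing the baseline $h_0(t)$ with the accelerated life hazard $h_{AL}(t)$ and the proportional hazards hazard $h_{PH}(t)$.]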
6. Repairable Systems
Motivation
So far, only nonrepairable systems of components
have been considered. Most systems are
repairable.
Outline
Introduction
Point processes
Availability
Birth-death processes
6.1 Introduction
A repairable item may be returned to an operating
condition after failure to perform a required
function by any method other than replacement of
the entire item.
Replacement models
used when a nonrepairable item is replaced
with another item upon failure
"socket models"
unlimited spares
redundancy allocation problem (optimal
number of spares)
replacement policies
[Figure: hazard function h(t) for a nonrepairable item and intensity function $\lambda(t)$ for a repairable item, each versus t, with failure times marked by X along the time axis.]
                 Gets better                    Gets worse
Nonrepairable    burn-in, $h'(t) < 0$           wear out, $h'(t) > 0$
Repairable       improving, $\lambda'(t) < 0$   deteriorating, $\lambda'(t) > 0$
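[Figure: point process realization versus t, showing the intensity function $\lambda(t)$, the counting function N(t), the times between failures $X_1, X_2, X_3, X_4$, and the failure times $T_1, T_2, T_3, T_4$.]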
[Figure: two nonoverlapping time intervals, $(t_1, t_2]$ and $(t_3, t_4]$.]
$$P[N(b) - N(a) = n] = \frac{\left[\int_a^b \lambda(t)\,dt\right]^n e^{-\int_a^b \lambda(t)\,dt}}{n!}$$
for $n = 0, 1, \ldots$.
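The count probability above can be evaluated numerically once an intensity function is chosen. The sketch below integrates the intensity with composite Simpson's rule and applies the Poisson form; the linear intensity used in the usage lines is a hypothetical choice for illustration, as are the function names.

```python
import math


def integrate(f, a, b, steps=1000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def count_probability(n, intensity, a, b):
    """P[N(b) - N(a) = n] for a process with the given intensity function."""
    mean = integrate(intensity, a, b)
    return mean ** n * math.exp(-mean) / math.factorial(n)


intensity = lambda t: 2.0 * t  # hypothetical deteriorating system
p_no_failures = count_probability(0, intensity, 0.0, 1.5)
```

Here the expected number of failures on (0, 1.5] is the integral of 2t, which is 2.25, so `p_no_failures` is exp(-2.25).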
6.3 Availability
Notation
X i denotes the i th time to failure, i = 1, 2, . . .
Ri denotes the i th time to repair, i = 1, 2, . . .
[Figure 6.10: Failure and repair process realization, showing the times to failure $X_1, X_2, X_3$ and the times to repair $R_1, R_2, R_3$.]
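$$L(t, \theta) = \prod_{i=1}^{n} f(t_i, \theta) \qquad \theta = (\theta_1, \theta_2, \ldots, \theta_p)'$$
$$\log L(t, \theta) = \sum_{i=1}^{n} \log f(t_i, \theta)$$
$\hat{\theta}$ is the maximum likelihood estimator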
[Figure: probability density function f(t) versus t.]
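the score vector has components
$$U_i(\theta) = \frac{\partial \log L(t, \theta)}{\partial \theta_i} \qquad i = 1, 2, \ldots, p$$
the score vector components have expectation
$$E[U_i(\theta)] = 0 \qquad i = 1, 2, \ldots, p$$
and variance-covariance matrix
$$I(\theta) = E[U(\theta)\, U'(\theta)]$$
this variance-covariance matrix is called the
Fisher information matrix with components
$$\text{Cov}(U_i(\theta), U_j(\theta)) = E\left[-\frac{\partial^2 \log L(t, \theta)}{\partial \theta_i\, \partial \theta_j}\right]$$
for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, p$.
the observed information matrix $O(\hat{\theta})$ has components
$$\left[-\frac{\partial^2 \log L(t, \theta)}{\partial \theta_i\, \partial \theta_j}\right]_{\theta = \hat{\theta}} \qquad i = 1, 2, \ldots, p \quad j = 1, 2, \ldots, p$$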
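$$L(t, \theta) = \prod_{i=1}^{n} f(t_i, \theta) = \prod_{i=1}^{n} \frac{1}{\theta} e^{-t_i/\theta} = \theta^{-n} e^{-\sum_{i=1}^{n} t_i/\theta}$$
The log likelihood function is
$$\log L(t, \theta) = -n \log \theta - \sum_{i=1}^{n} t_i/\theta$$
$$U(\theta) = \frac{\partial \log L(t, \theta)}{\partial \theta} = -\frac{n}{\theta} + \frac{\sum_{i=1}^{n} t_i}{\theta^2}$$
The maximum likelihood estimator is
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} t_i$$
$$\frac{\partial^2 \log L(t, \theta)}{\partial \theta^2} = \frac{n}{\theta^2} - \frac{2}{\theta^3} \sum_{i=1}^{n} t_i$$
$$I(\theta) = E\left[-\frac{\partial^2 \log L(t, \theta)}{\partial \theta^2}\right] = E\left[-\frac{n}{\theta^2} + \frac{2}{\theta^3} \sum_{i=1}^{n} t_i\right] = -\frac{n}{\theta^2} + \frac{2}{\theta^3} E\left[\sum_{i=1}^{n} t_i\right] = -\frac{n}{\theta^2} + \frac{2n}{\theta^2} = \frac{n}{\theta^2}$$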
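The likelihood quantities for the exponential distribution with mean theta can be sketched directly from their definitions. The data values below are hypothetical and chosen only so the arithmetic is easy to follow; the function names are not from the slides.

```python
import math


def log_likelihood(t, theta):
    """log L = -n * log(theta) - sum(t) / theta."""
    return -len(t) * math.log(theta) - sum(t) / theta


def score(t, theta):
    """U(theta) = -n / theta + sum(t) / theta ** 2."""
    return -len(t) / theta + sum(t) / theta ** 2


def mle(t):
    """theta-hat = sample mean, the root of the score function."""
    return sum(t) / len(t)


def observed_information(t, theta):
    """Negative second derivative of log L, evaluated at theta."""
    return -len(t) / theta ** 2 + 2.0 * sum(t) / theta ** 3


times = [1.0, 2.0, 3.0, 6.0]  # hypothetical complete data set
theta_hat = mle(times)
```

At `theta_hat` the score is zero and the observed information equals n / theta_hat ** 2, which matches the Fisher information with theta replaced by its estimate.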
7.5 Censoring
A censored observation occurs when only a bound
is known on the time of failure.
Notation
n: number of items on test
r: number of observed failures
c: censoring time
A data set where all failure times are known is
called a complete data set.
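$$f(t, \lambda) = \lambda e^{-\lambda t} \qquad H(t, \lambda) = \lambda t$$
$$L(\lambda) = \prod_{i=1}^{n} f(t_i, \lambda)$$
or
$$\log L(\lambda) = \sum_{i=1}^{n} [\log \lambda - \lambda t_i] = n \log \lambda - \lambda \sum_{i=1}^{n} t_i$$
The MLE is
$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i}$$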
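$$\frac{\hat{\lambda}\, \chi^2_{2n,\, 1 - \alpha/2}}{2n} < \lambda < \frac{\hat{\lambda}\, \chi^2_{2n,\, \alpha/2}}{2n}$$
where $\chi^2_{2n,\, p}$ is the value exceeded with probability $p$ by a chi-square random variable with $2n$ degrees of freedom.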
Example 8.5 Consider the n = 23 ball
bearing failure times. The maximum
likelihood estimator is
$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i} = \frac{23}{1661.16} = 0.0138$$
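The arithmetic in Example 8.5 can be reproduced from the two summary values quoted there (n = 23 failures and a total time on test of 1661.16). The standard error line is an addition of this sketch: it assumes the usual large-sample approximation based on the information n / lambda^2, which is not the interval method used elsewhere in these notes.

```python
import math


def exponential_rate_mle(n_failures, total_time):
    """lambda-hat = number of failures / sum of the failure times."""
    return n_failures / total_time


n = 23
total_time = 1661.16
lam_hat = exponential_rate_mle(n, total_time)

# Assumed large-sample standard error: sqrt(1 / I(lambda-hat)) = lambda-hat / sqrt(n).
std_error = lam_hat / math.sqrt(n)
```

Rounding `lam_hat` to four decimal places gives 0.0138, the value reported in the example.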
[Figure: plot versus t (0 to 150) for the ball bearing data set.]
and $H(t, \lambda, \kappa) = (\lambda t)^{\kappa}$
Notation
n is the number of items on test
r is the number of observed failures
t 1 , t 2 , . . . , t n are the failure times
c 1 , c 2 , . . . , c n are the censoring times
x i = min {t i , c i } for i = 1, 2, . . . , n
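Assuming random censoring
$$\log L(\lambda, \kappa) = \sum_{i \in U} \log h(x_i, \lambda, \kappa) - \sum_{i=1}^{n} H(x_i, \lambda, \kappa)$$
$$= r \log \kappa + \kappa r \log \lambda + (\kappa - 1) \sum_{i \in U} \log x_i - \lambda^{\kappa} \sum_{i=1}^{n} x_i^{\kappa}$$
The 2 × 1 score vector has elements
$$U_1(\lambda, \kappa) = \frac{\partial \log L(\lambda, \kappa)}{\partial \lambda} = \frac{\kappa r}{\lambda} - \kappa \lambda^{\kappa - 1} \sum_{i=1}^{n} x_i^{\kappa}$$
$$U_2(\lambda, \kappa) = \frac{\partial \log L(\lambda, \kappa)}{\partial \kappa} = \frac{r}{\kappa} + r \log \lambda + \sum_{i \in U} \log x_i - \sum_{i=1}^{n} (\lambda x_i)^{\kappa} \log (\lambda x_i)$$
Setting the score vector equal to zero gives
$$\frac{\kappa r}{\lambda} - \kappa \lambda^{\kappa - 1} \sum_{i=1}^{n} x_i^{\kappa} = 0$$
$$\frac{r}{\kappa} + r \log \lambda + \sum_{i \in U} \log x_i - \sum_{i=1}^{n} (\lambda x_i)^{\kappa} \log (\lambda x_i) = 0$$
The first equation can be solved for $\lambda$:
$$\lambda = \left(\frac{r}{\sum_{i=1}^{n} x_i^{\kappa}}\right)^{1/\kappa}$$
Information matrices
$$-\frac{\partial^2 \log L(\lambda, \kappa)}{\partial \lambda^2} = \frac{\kappa r}{\lambda^2} + \kappa (\kappa - 1) \lambda^{\kappa - 2} \sum_{i=1}^{n} x_i^{\kappa}$$
$$-\frac{\partial^2 \log L(\lambda, \kappa)}{\partial \kappa^2} = \frac{r}{\kappa^2} + \sum_{i=1}^{n} (\lambda x_i)^{\kappa} (\log \lambda x_i)^2$$
$$-\frac{\partial^2 \log L(\lambda, \kappa)}{\partial \lambda\, \partial \kappa} = -\frac{r}{\lambda} + \lambda^{\kappa - 1} \left[\kappa \sum_{i=1}^{n} x_i^{\kappa} \log x_i + (1 + \kappa \log \lambda) \sum_{i=1}^{n} x_i^{\kappa}\right]$$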
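Because the Weibull score equations have no closed-form solution, they are solved numerically. The sketch below is one possible approach, not necessarily the one used for the results in these notes: substituting lambda = (r / sum of x_i^kappa)^(1/kappa) into the second score equation leaves a single equation in kappa, which is solved by bisection. The bracketing interval, the function names, and the small data set are hypothetical.

```python
import math


def weibull_mle(x, delta, lo=1e-3, hi=50.0, iters=200):
    """Return (lam_hat, kappa_hat) for right-censored data.

    x[i] = min(t_i, c_i); delta[i] = 1 for an observed failure, 0 if censored.
    Assumes the profile score g changes sign on [lo, hi].
    """
    r = sum(delta)
    sum_log_obs = sum(math.log(xi) for xi, di in zip(x, delta) if di)

    def g(kappa):
        # Second score equation after eliminating lambda.
        s0 = sum(xi ** kappa for xi in x)
        s1 = sum(xi ** kappa * math.log(xi) for xi in x)
        return r / kappa + sum_log_obs - r * s1 / s0

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    lam = (r / sum(xi ** kappa for xi in x)) ** (1.0 / kappa)
    return lam, kappa


def score(x, delta, lam, kappa):
    """Evaluate the score vector (U1, U2) at (lam, kappa)."""
    r = sum(delta)
    u1 = kappa * r / lam - kappa * lam ** (kappa - 1.0) * sum(xi ** kappa for xi in x)
    u2 = (r / kappa + r * math.log(lam)
          + sum(math.log(xi) for xi, di in zip(x, delta) if di)
          - sum((lam * xi) ** kappa * math.log(lam * xi) for xi in x))
    return u1, u2


x = [0.5, 1.2, 1.9, 2.3, 3.1]  # hypothetical observed times
delta = [1, 1, 0, 1, 1]        # third item is right-censored
lam_hat, kappa_hat = weibull_mle(x, delta)
```

Evaluating `score(x, delta, lam_hat, kappa_hat)` should return two values that are numerically zero, which confirms that the reduced equation and the original score equations agree.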
[Figure: fitted survivor functions versus t (0 to 150), including the exponential fit.]
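$$O(\hat{\lambda}, \hat{\kappa}) = \begin{pmatrix} 681{,}000 & 875 \\ 875 & 10.4 \end{pmatrix}$$
$$O^{-1}(\hat{\lambda}, \hat{\kappa}) = \begin{pmatrix} 0.00000165 & -0.000139 \\ -0.000139 & 0.108 \end{pmatrix}$$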
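The inverse of the observed information matrix can be checked with the closed-form inverse of a 2 × 2 matrix. The matrix entries below are the rounded values quoted above, so the result agrees only to the displayed precision. Reading the square roots of the diagonal of the inverse as asymptotic standard errors is standard practice but is an assumption of this sketch.

```python
import math


def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


observed_info = [[681000.0, 875.0], [875.0, 10.4]]
info_inverse = inverse_2x2(observed_info)

# Assumed interpretation: asymptotic standard errors of the two estimators.
se_lambda = math.sqrt(info_inverse[0][0])
se_kappa = math.sqrt(info_inverse[1][1])
```

The off-diagonal entries of `info_inverse` are negative, as they must be when the off-diagonal entries of the original matrix are positive.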
regression
Applications
determine which covariates significantly
impact survival
determine the impact of changing the values of
covariates
The accelerated life model
$$S(t, z) = S_0(t\psi(z))$$
The proportional hazards model
$$h(t, z) = \psi(z) h_0(t)$$
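Notation (for $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, q$)
$x_i = \min\{t_i, c_i\}$
$\delta_i$ is a censoring indicator variable
$z_i = (z_{i1}, z_{i2}, \ldots, z_{iq})$
$z_{ij}$ is the value of covariate j for item i
extra parameters: $S(t, z, \theta, \beta)$
Matrix formulation
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \qquad \delta = \begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_n \end{pmatrix} \qquad \text{and} \qquad Z = \begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1q} \\ z_{21} & z_{22} & \cdots & z_{2q} \\ \vdots & \vdots & & \vdots \\ z_{n1} & z_{n2} & \cdots & z_{nq} \end{pmatrix}$$
$$L(\theta, \beta) = \prod_{i \in U} f(x_i, z_i, \theta, \beta) \prod_{i \in C} S(x_i, z_i, \theta, \beta)$$
$$\log L(\theta, \beta) = \sum_{i \in U} \log f(x_i, z_i, \theta, \beta) + \sum_{i \in C} \log S(x_i, z_i, \theta, \beta)$$
or
$$\log L(\theta, \beta) = \sum_{i \in U} \log h(x_i, z_i, \theta, \beta) - \sum_{i=1}^{n} H(x_i, z_i, \theta, \beta)$$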
Notes
the maximum likelihood estimators for $\theta$ and
$\beta$ cannot be expressed in closed form
the number of unique covariate vectors and n
determine whether to use regression models
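[Figure: three items on test. Items 1 and 2 are rated 100 w ($z_{11} = 1$, $z_{21} = 1$) and item 3 is rated 60 w ($z_{31} = 0$). The failure times are $t_1 = 80$, $t_2 = 20$, $t_3 = 50$, so the ordered failure times are $t_{(1)} = 20$, $t_{(2)} = 50$, $t_{(3)} = 80$.]
$$x = \begin{pmatrix} 80 \\ 20 \\ 50 \end{pmatrix} \qquad \delta = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad Z = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$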
Example 9.8 North Carolina collected
recidivism data on n = 1540 prisoners
in 1978 (Schmidt and Witte, 1988). T
is the time from release until
return to prison. The purpose of the
study is to assess the impact of the
q = 15 covariates.
Table 9.2 North Carolina recidivism model.
$z_i$     Covar.    $\hat\beta$   $\sqrt{\hat V[\hat\beta]}$   $\hat\beta / \sqrt{\hat V[\hat\beta]}$   p-value
$z_2$     AGE       -3.34         0.52                         -6.43                                    0.000
$z_3$     PRIORS     0.84         0.14                          6.10                                    0.000
$z_1$     TSERV      1.17         0.20                          5.96                                    0.000
$z_6$     WHITE     -0.44         0.09                         -5.07                                    0.000
$z_8$     ALCHY      0.43         0.10                          4.11                                    0.000
$z_{13}$  FELON     -0.58         0.16                         -3.54                                    0.000
$z_9$     JUNKY      0.28         0.10                          2.91                                    0.002
$z_7$     MALE       0.67         0.24                          2.78                                    0.003
$z_{15}$  PROPTY     0.39         0.16                          2.47                                    0.007
$z_4$     RULE       3.08         1.69                          1.82                                    0.034
$z_{10}$  MARRY     -0.15         0.11                         -1.42                                    0.077
$z_5$     SCHOOL    -0.25         0.19                         -1.30                                    0.097
$z_{12}$  WORK       0.09         0.09                          0.96                                    0.169
$z_{14}$  PERSON     0.07         0.24                          0.30                                    0.381
$z_{11}$  SUPER     -0.01         0.10                         -0.09                                    0.464
[Figure: survivor function versus t (0 to 150), comparing the nonparametric estimator with the exponential fit.]
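An asymptotic $100(1 - \alpha)\%$ confidence interval for $S(t)$ is
$$\hat{S}(t) \pm z_{\alpha/2} \sqrt{\frac{\hat{S}(t)(1 - \hat{S}(t))}{n}}$$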
$$\hat{S}(50) = \frac{16}{23} = 0.696,$$
and a 95% confidence interval for S(50) is
$$\hat{S}(50) \pm 1.96 \sqrt{\frac{\hat{S}(50)(1 - \hat{S}(50))}{23}}$$
or $0.508 < S(50) < 0.884$.
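The interval for S(50) can be reproduced with a few lines of arithmetic. The count of 16 survivors out of 23 is inferred from the reported estimate of 0.696 and is therefore an assumption of this sketch; the function name is hypothetical.

```python
import math


def survivor_interval(surviving, n, z=1.96):
    """Point estimate and normal-approximation interval for S(t)."""
    s_hat = surviving / n
    half_width = z * math.sqrt(s_hat * (1.0 - s_hat) / n)
    return s_hat, s_hat - half_width, s_hat + half_width


s_hat, lower, upper = survivor_interval(16, 23)
```

The normal approximation can produce limits outside [0, 1] when the estimate is close to 0 or 1, so the interval is best treated as a rough guide for small n.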
$$\hat{S}(t) = \prod_{j \in R(t)} \left[1 - \frac{d_j}{n_j}\right]$$
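A product-limit estimate of this form can be computed from right-censored data as sketched below. The code takes the product over the distinct observed failure times up to t, with d_j the number of failures at the j-th such time and n_j the number of items at risk just before it; that reading of the index set, the function name, and the four data points are assumptions made for illustration.

```python
def product_limit(x, delta):
    """Return [(y_j, S_hat(y_j))] at each distinct observed failure time y_j.

    x[i] = min(t_i, c_i); delta[i] = 1 for an observed failure, 0 if censored.
    """
    failure_times = sorted({xi for xi, di in zip(x, delta) if di})
    s_hat = 1.0
    curve = []
    for y in failure_times:
        n_j = sum(1 for xi in x if xi >= y)  # at risk just before y
        d_j = sum(1 for xi, di in zip(x, delta) if di and xi == y)
        s_hat *= 1.0 - d_j / n_j
        curve.append((y, s_hat))
    return curve


# Hypothetical data: the item observed at time 2 is right-censored.
curve = product_limit([1, 2, 3, 4], [1, 0, 1, 1])
```

With no censoring the estimate reduces to the fraction of items surviving past each failure time, which is the estimator used for complete data sets.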
Computational formulas
$$D_n^+ = \max_{i = 1, 2, \ldots, n} \left[\frac{i}{n} - F_0(t_{(i)})\right]$$
$$D_n^- = \max_{i = 1, 2, \ldots, n} \left[F_0(t_{(i)}) - \frac{i - 1}{n}\right]$$
$$D_n = \max\{D_n^+, D_n^-\}$$
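The computational formulas translate directly into code for a complete data set. In the sketch below the hypothesized distribution function is passed in as a Python callable; the function name, the uniform hypothesized distribution, and the three data values are hypothetical.

```python
def ks_statistic(data, F0):
    """Return (D_n, D_n_plus, D_n_minus) for a complete data set.

    F0 is the hypothesized cumulative distribution function.
    """
    t = sorted(data)
    n = len(t)
    # enumerate is 0-based, so order statistic i (1-based) uses (i + 1) / n and i / n.
    d_plus = max((i + 1) / n - F0(ti) for i, ti in enumerate(t))
    d_minus = max(F0(ti) - i / n for i, ti in enumerate(t))
    return max(d_plus, d_minus), d_plus, d_minus


# Hypothetical check against a uniform(0, 1) hypothesized distribution.
d_n, d_plus, d_minus = ks_statistic([0.7, 0.1, 0.4], lambda t: t)
```

The resulting `d_n` is compared with a tabulated critical value for the sample size n; the hypothesized distribution is rejected when `d_n` exceeds it.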
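[Figure: survivor function S(t) versus t (0 to 150), showing the nonparametric estimator, the exponential fit, and the distance $D_{23}$.]
[Figure: cumulative distribution function F(t) versus t (0 to 150), showing the nonparametric estimator, the exponential fit, and the distance $D_{23}$.]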
n      0.80    0.90    0.95    0.99
1      0.900   0.950   0.975   0.995
2      0.683   0.775   0.841   0.929
3      0.565   0.636   0.708   0.829
4      0.493   0.565   0.624   0.734
5      0.447   0.510   0.564   0.668
6      0.410   0.468   0.519   0.615
7      0.381   0.435   0.483   0.576
8      0.358   0.410   0.455   0.543
9      0.339   0.387   0.430   0.513
10     0.323   0.369   0.409   0.490
15     0.266   0.304   0.338   0.405
20     0.231   0.264   0.294   0.352
23     0.216   0.248   0.275   0.330
25     0.208   0.237   0.264   0.317
30     0.190   0.217   0.242   0.290
40     0.166   0.189   0.210   0.252
50     0.148   0.170   0.188   0.226
$$H_1 : F(t) \ne 1 - e^{-(0.01t)^2}$$
The test statistic is $D_{23} = 0.274$. At
$\alpha = 0.10$, the critical value is 0.248, so
$H_0$ is rejected.
[Figure: cumulative distribution function F(t) versus t (0 to 150), showing the nonparametric estimator, the hypothesized Weibull distribution, and the distance $D_{23}$.]