Reliability Engineering and System Safety 80 (2003) 133-141
www.elsevier.com/locate/ress

Availability of systems with self-diagnostic components: applying Markov model to IEC 61508-6

Tieling Zhang a,*, Wei Long b, Yoshinobu Sato b
a HAL Corporation, 6-21-17-701 Nishikasai, Edogawa-Ku, Tokyo 134-0088, Japan
b Tokyo University of Mercantile Marine, 2-1-6 Etchujima, Koto-Ku, Tokyo 135-8533, Japan
Received 11 December 2000; accepted 19 December 2002
Abstract
Each of the techniques applicable to safety-related analyses may be suited to some aspects of system safety behavior. Several of them can be applied to the same aspect of risk-related system behavior, but they do not always lead to the same results. Rouvroye and Brombacher compared these techniques and indicated that Markov and Enhanced Markov analysis techniques can cover most aspects of the safety-related behavior of systems. Following their conclusion, this paper applies the Markov method to Part 6 of the standard IEC 61508 for quantitative analysis. The purpose is to explain in detail the solutions given in the standard, because many results are not clearly described and it is not easy for a safety engineer to follow how they were obtained. In addition, the down time $t_{c1}$ shown in the standard is newly defined, because it is the basis for the average probability of failure on demand of the system architectures and its meaning is not clearly explained. The derivation, however, reveals a discrepancy in the standard. New suggestions are therefore proposed on the basis of the results obtained.
© 2003 Elsevier Science Ltd. All rights reserved.
Keywords: IEC 61508; Self-diagnosis; Probability of failure on demand; Markov model
1. Introduction
IEC 61508 [1] was recently compiled and published as an up-to-date international standard. Many studies discussing and applying the standard have been carried out, such as those published in the special issue of Reliability Engineering & System Safety in 1999 [2] and some others [3-6]. The standard is concerned with two frameworks. One is risk reduction with the Safety-Related System (SRS) and the other is the Overall Safety Life-Cycle. In order to understand the first framework more profoundly, the dependence of the risk reduction on both the Safety Integrity Levels (SILs) of the SRS and the demands from the Equipment Under Control to the SRS has to be clarified. Misumi and Sato [7] studied this point and expressed the mutual relationship mathematically by means of a simple fault tree analysis (FTA). Their research needs to be developed further. The configuration of the SRS, the proof test, the self-diagnostic coverage and the failure rates of components are often used in the evaluation of the SILs of SRSs.
* Corresponding author. E-mail addresses: zhangtling@yahoo.com (T. Zhang), yoshi@ipc.tosho-u.ac.jp (Y. Sato).
The SILs of an SRS need to be evaluated by quantitative analyses, as required by the standard IEC 61508 and some others such as ISA-S84.01 [8]. There are many quantitative analysis techniques, such as Markov analysis, reliability block diagrams, hybrid techniques, parts count analysis and FTA. Each of them may be suited to several aspects of the safety-related behavior of a system. At the same time, one aspect of the risk-related behavior of a system may be suitably analyzed by several of them, but they do not always lead to the same results. Rouvroye and Brombacher [9] outlined these techniques and compared them with each other. The results calculated by these techniques for the same example showed large differences. They therefore pointed out that applying different (quantitative) techniques to practical systems would not always lead to the same, definite results. However, they also clearly stated that Markov analysis covers most aspects of the quantitative safety evaluation of systems. The aspects that this approach cannot cover are limited to uncertainty and sensitivity analyses. These aspects, however, can be carried out with
the Enhanced Markov analysis. A combination of the two types of Markovian methods is therefore preferable and hence recommended.
The present paper takes up the SRSs. Their system configurations are composed of channels that include both detectable failures with self-diagnosis and undetectable failures. The Markovian approach is applied to the quantitative SIL analyses. In most cases, it can give satisfactory results. The intention of this paper is to show how the many probabilistic parameters specified in IEC 61508-6 [10] are obtained for the specific system structures and associated conditions. It is not easy for an ordinary safety engineer to understand the results for these probabilistic parameters because no detailed derivation is given. In particular, the down time $t_{c1}$ is newly defined in this paper, since it is the basis of the typical system architectures and its meaning is not clearly explained in the standard. The detailed derivation, however, reveals difficulties and a discrepancy in some of the examples specified in Part 6 of IEC 61508. Suggestions for resolving them are therefore proposed.

0951-8320/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved.
doi:10.1016/S0951-8320(03)00004-8

Nomenclature
$A$: steady state availability
FF: failure frequency
$F(t)$: probability of failure of a system
MDT: mean down time
MTTR: mean time to restoration
$P_{CC}$: probability of failure caused by common cause
pdf: probability density function
$P_i(t)$, $P_i$: probability and steady state probability of system in the $i$th state
PFD: average probability of failure on demand for one channel
$\mathrm{PFD}_G$: average probability of failure on demand for system architectures
$T_1$: proof-test interval
$t_{c1}$: equivalent mean down time for the undetected failure of a channel
$t_{c2}$: equivalent mean down time for the detected failure of a channel
$t_{CE}$: channel equivalent mean down time for 1oo1, 1oo2, 2oo2, 2oo3 architectures
$t'_{CE}$: channel equivalent mean down time for 1oo2D architecture
$t_{GE}$: voted group equivalent mean down time for 1oo2 and 2oo3 architectures
$t'_{GE}$: voted group equivalent mean down time for 1oo2D architecture
$\lambda$, $\mu$: failure and equivalent repair rates of a channel
$\lambda_D$: dangerous failure rate ($\lambda_D = \lambda_{DD} + \lambda_{DU}$) of a channel in a subsystem
$\lambda_{DD}$: detected dangerous failure rate of a channel in a subsystem
$\lambda_{DU}$: undetected dangerous failure rate
$\lambda_{SD}$: detected safe failure rate of a channel
$\mu_{DD}$: repair rate of detected dangerous fault in a channel
$\mu_{DU}$: repair rate of undetected dangerous fault in a channel
$\beta$: the fraction of undetected failures that have a common cause (expressed as a fraction in the equations and as a percentage elsewhere)
$\beta_D$: of those failures that are detected by the diagnostic tests, the fraction that have a common cause (expressed as a fraction in the equations and as a percentage elsewhere)
For details, refer to Table B.1 in Annex B of IEC 61508-6 [10].

2. Assumptions
In order to describe the state transitions of systems as clearly as possible, we make the following assumptions:
i. Component failure and repair rates are constant over the life of the system.
ii. The resulting average probability of failure on demand for the subsystem is less than $10^{-1}$, or the resultant probability of failure per hour for the subsystem is less than $10^{-5}$.
iii. The input subsystem comprises the actual sensor(s) and any other components and wiring, up to but not including the component(s) where the signals are first combined by voting or other processing. For example, the configuration for two sensor channels is shown in Annex B of Part 6 of IEC 61508 [10].
iv. The hardware failure rates used as inputs to the calculations and tables are for a single channel of the subsystem. For example, if 2-out-of-3 sensors are used, the failure rate is for a single sensor and the resulting failure rate for the 2-out-of-3 group is calculated separately.
v. All channels in a voted group have the same failure rate and diagnostic coverage rate.
vi. The overall hardware failure rate of a channel in a subsystem is the sum of the dangerous and safe failure rates for that channel. These two rates are assumed to be equal.
vii. For each safety function, proof testing and repair are perfect. Namely, all failures that remain undetected are assumed to be detected by the proof test.
viii. The proof test interval is at least one order of magnitude greater than the diagnostic test interval.
ix. The demand rate and the expected interval between demands are not considered in this paper. Therefore, we can analyze the SRS failures separately from the demand.
x. For each subsystem, there is a single $T_1$ and MTTR. MTTR is defined to include the time taken to detect a failure. It is at least one order of magnitude less than $T_1$. In this paper, the single assumed value of MTTR for both detected and undetected failures includes the diagnostic test interval but not $T_1$. For undetected failures, the MTTR used in the calculations should not include the diagnostic test interval, since the mean time to restoration is always added to the proof test interval, which is at least one order of magnitude greater than the diagnostic test interval. The error introduced here is not significant.
xi. Multiple repair teams (each assumed to have the same repair rate) are available to work on all known faults in a system.
xii. In a channel, detected and undetected faults can exist simultaneously, i.e. if one has occurred, the other can occur before the former is repaired.
xiii. Repairs of the detected and undetected faults are viewed as independent for the sake of conservative consideration, though some dependence may be involved in the process.
Other assumptions can be found in Annex B of IEC 61508-6 [10].
3. Meaning of down time, $t_{c1}$

In the standard IEC 61508, system configurations are composed of channels. Each channel includes both failures detectable by self-diagnosis, with rate $\lambda_{DD}$, and undetectable failures, with rate $\lambda_{DU}$. See Figs. 1 and 2 for the physical and reliability block diagrams of five typical system architectures. The failure rates $\lambda_{DD}$ and $\lambda_{DU}$ are assumed to be constant. Hence, the times of occurrence of these two kinds of failures follow exponential distributions. For a detected dangerous fault, the time to restore the channel is the MTTR. An undetectable dangerous fault, however, cannot be detected until the next proof test. It follows the process shown in Fig. 3, where $t$ is the time of occurrence of the failure and $t_d$ is the duration of the down time.
In the standard, $t_{c1}$ is not clearly defined (refer to Fig. 2). However, it is the basis for all results for the average probability of failure on demand of the typical system architectures. If its meaning is not clear, it is not easy for a safety engineer to understand the other solutions. Here, $t_{c1}$ is named the equivalent mean down time for the undetectable fault in a channel. It is defined as follows:
Suppose $t_a$ is the time at which the average probability of failure for the undetectable fault in the interval $[0, T_1]$ occurs in a system. Then

$$t_{c1} = T_1 - t_a + \mathrm{MTTR}. \tag{1}$$

For the one-channel architecture,

$$t_a = \frac{\int_0^{T_1} t\,\lambda_{DU}\exp(-\lambda_{DU}t)\,dt}{\int_0^{T_1} \lambda_{DU}\exp(-\lambda_{DU}t)\,dt} = \frac{\int_0^{T_1} t\,\lambda_{DU}\exp(-\lambda_{DU}t)\,dt}{1-\exp(-\lambda_{DU}T_1)} \approx T_1/2 \tag{2}$$

since $\lambda_{DU}T_1 \ll 1$ and $\exp(-\lambda_{DU}T_1) \approx 1 - \lambda_{DU}T_1$. Therefore

$$t_{c1} = T_1/2 + \mathrm{MTTR}. \tag{3}$$

If $\lambda_D T_1 < 0.1$, $T_1/2$ is a quite good approximation to the real value of $t_a$.
For the 1oo2 architecture (see Fig. 2b), the probability of failure for the undetectable fault is $[1-\exp(-\lambda_{DU}t)]^2$. Hence

$$t_a = \frac{\int_0^{T_1} 2\lambda_{DU}\,t\,[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt}{\int_0^{T_1} 2\lambda_{DU}[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt} = \frac{\int_0^{T_1} 2\lambda_{DU}\,t\,[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt}{[1-\exp(-\lambda_{DU}T_1)]^2} \approx \frac{2}{3}T_1 - \frac{3}{4}\lambda_{DU}T_1^2 + \cdots \tag{4}$$

since $\lambda_{DU}T_1 \ll 1$ and $\exp(-\lambda_{DU}T_1) \approx 1 - \lambda_{DU}T_1$. As $\lambda_{DU}T_1 < 0.1$, $t_a$ is a little less than $2T_1/3$ but approaches this value, so $t_{c1}$ can be evaluated as

$$t_{c1} = T_1 - t_a + \mathrm{MTTR} = T_1/3 + \mathrm{MTTR}. \tag{5}$$

Similar to the case of the 1oo2 architecture, $t_{c1}$ for the 1oo2D and 2oo3 architectures can be obtained as given in Eq. (5). If $\lambda_{DU}T_1 < 0.1$, one can verify with numerical examples that $2T_1/3$ is a quite good approximation to the real value of $t_a$ for these three architectures.

Fig. 1. Physical block diagrams.
Fig. 2. Reliability block diagrams.
Fig. 3. Process for the undetected dangerous fault.

4. Availability of system architectures by Markov model

4.1. 1oo1 architecture
This system consists of a single channel, where any dangerous failure leads to a failure of the safety function when a demand arises. Its physical and reliability block diagrams are shown in Figs. 1a and 2a. The dangerous failures are divided into two parts, regarded as components c1 and c2. The system is thus composed of two components connected in series, so that there are four system states as follows:

| System state | Component c1 | Component c2 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 1 | 0 |
| 3 | 1 | 1 |

For each component, state 0: operation state; state 1: failure state.
The resulting system states transition diagram is shown in Fig. 4.
This figure gives a set of differential equations:

$$\begin{cases}
P_0'(t) = -(\lambda_{DD}+\lambda_{DU})P_0(t) + \mu_{DD}P_1(t) + \mu_{DU}P_2(t), \\
P_1'(t) = \lambda_{DD}P_0(t) - (\lambda_{DU}+\mu_{DD})P_1(t) + \mu_{DU}P_3(t), \\
P_2'(t) = \lambda_{DU}P_0(t) - (\lambda_{DD}+\mu_{DU})P_2(t) + \mu_{DD}P_3(t), \\
P_3'(t) = \lambda_{DU}P_1(t) + \lambda_{DD}P_2(t) - (\mu_{DD}+\mu_{DU})P_3(t).
\end{cases} \tag{6}$$
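As an illustration, the coefficient matrix of Eq. (6) can be written down directly from the state table. The following is a minimal sketch in plain Python and is not part of the original derivation. The function names are hypothetical, the numerical rates are those of Example 1 in Section 5.1 with $\mu_{DD} = 1/\mathrm{MTTR}$ and $\mu_{DU} = 1/(T_1/2 + \mathrm{MTTR})$, and the product-form steady state assumes that the two components fail and are repaired independently (assumption xiii). The sketch checks that every column of the matrix sums to zero and that the product-form vector makes the right-hand sides of Eq. (6) vanish.

```python
# Sketch only: coefficient matrix of Eq. (6) for the four-state 1oo1 model.
# State k = 0..3 encodes (c1, c2) = (0,0), (0,1), (1,0), (1,1);
# c1 carries the undetected (DU) failures, c2 the detected (DD) failures.


def generator_1oo1(lam_dd, lam_du, mu_dd, mu_du):
    """Return M as a list of rows so that P'(t) = M P(t)."""
    return [
        [-(lam_dd + lam_du), mu_dd, mu_du, 0.0],
        [lam_dd, -(lam_du + mu_dd), 0.0, mu_du],
        [lam_du, 0.0, -(lam_dd + mu_du), mu_dd],
        [0.0, lam_du, lam_dd, -(mu_dd + mu_du)],
    ]


def product_form_steady_state(lam_dd, lam_du, mu_dd, mu_du):
    """Steady-state probabilities, assuming independent components."""
    a_dd = mu_dd / (lam_dd + mu_dd)  # availability of the detected part
    a_du = mu_du / (lam_du + mu_du)  # availability of the undetected part
    return [
        a_dd * a_du,
        (1 - a_dd) * a_du,
        a_dd * (1 - a_du),
        (1 - a_dd) * (1 - a_du),
    ]


def mat_vec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Example 1 of Section 5.1 (same time unit as used there).
LAM_DD, LAM_DU, MTTR, T1 = 1e-5, 5e-6, 10.0, 5e3
MU_DD = 1.0 / MTTR
MU_DU = 1.0 / (T1 / 2 + MTTR)

M = generator_1oo1(LAM_DD, LAM_DU, MU_DD, MU_DU)
P = product_form_steady_state(LAM_DD, LAM_DU, MU_DD, MU_DU)

column_sums = [sum(M[j][k] for j in range(4)) for k in range(4)]
residual = mat_vec(M, P)

print("column sums:", column_sums)
print("M P at steady state:", residual)
print("steady-state availability P0:", P[0])
```

Writing the matrix once and testing it against a known steady state is a cheap way to catch a misplaced rate before any time-dependent solution is attempted.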
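They are simply rewritten as

$$P' = MP, \tag{7}$$

where

$$P' = (P_0'(t), P_1'(t), P_2'(t), P_3'(t))^T, \quad P = (P_0(t), P_1(t), P_2(t), P_3(t))^T,$$

and $M$ is given as

$$M = \begin{pmatrix}
-(\lambda_{DD}+\lambda_{DU}) & \mu_{DD} & \mu_{DU} & 0 \\
\lambda_{DD} & -(\lambda_{DU}+\mu_{DD}) & 0 & \mu_{DU} \\
\lambda_{DU} & 0 & -(\lambda_{DD}+\mu_{DU}) & \mu_{DD} \\
0 & \lambda_{DU} & \lambda_{DD} & -(\mu_{DD}+\mu_{DU})
\end{pmatrix}.$$

For a system that is in state 0 at $t = 0$, the system availability $A(t)$ is

$$A(t) = P_0(t) = \frac{1}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})}\{\mu_{DD}\mu_{DU} + \lambda_{DD}\mu_{DU}\exp[-(\lambda_{DD}+\mu_{DD})t] + \lambda_{DU}\mu_{DD}\exp[-(\lambda_{DU}+\mu_{DU})t] + \lambda_{DD}\lambda_{DU}\exp[-(\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU})t]\}. \tag{8}$$

In general, the system probabilistic parameters at steady state are of interest. At the steady state, the system availability is

$$A = \frac{\mu_{DD}\mu_{DU}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})}$$

and the system failure frequency, FF, is given as [11]

$$\mathrm{FF} = \sum_{k\in W}\sum_{j\in F} P_k a_{jk}, \tag{9}$$

where
$F$ is the set of failure states of the system,
$W$ is the set of operating states of the system,
$P_k$ is the probability of the system being in working state $k$, and
$a_{jk}$ is the element of $M$ in row $j$ and column $k$, i.e. the coefficient of $P_k$ in the $j$th equation of Eq. (6).
From Eqs. (6) and (9),

$$\mathrm{FF} = P_0(\lambda_{DD}+\lambda_{DU}) = \frac{(\lambda_{DD}+\lambda_{DU})\mu_{DD}\mu_{DU}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})}. \tag{10}$$

Further,

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\lambda_{DU})\mu_{DD}\mu_{DU}} \approx \frac{\lambda_{DU}}{\lambda_D}\left(T_1/2+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}$$

as $\lambda_D = \lambda_{DD}+\lambda_{DU}$,

$$\frac{1}{\mu_{DU}} = T_1/2 + \mathrm{MTTR}, \quad \frac{1}{\mu_{DD}} = \mathrm{MTTR},$$

and $\lambda_{DD}$ and $\lambda_{DU}$ are less than $10^{-5}$. Therefore

$$t_{CE} = \mathrm{MDT} = \frac{\lambda_{DU}}{\lambda_D}\left(T_1/2+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}. \tag{11}$$

For a channel with down time $t_{CE}$, the resulting average probability of failure is

$$\mathrm{PFD} \approx \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})} < \lambda_D\left(\frac{\lambda_{DU}}{\lambda_D\mu_{DU}}+\frac{\lambda_{DD}}{\lambda_D\mu_{DD}}\right) = \lambda_D\left[\frac{\lambda_{DU}}{\lambda_D}\left(T_1/2+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right] = \lambda_D t_{CE}.$$

Hence, for a 1oo1 architecture,

$$\mathrm{PFD}_G = (\lambda_{DU}+\lambda_{DD})t_{CE} = \lambda_D t_{CE}. \tag{12}$$

Fig. 4. Markov states transition diagram for 2-component system.

4.2. 1oo2 architecture
The architecture includes two channels connected in parallel. See Figs. 1b and 2b for the system block diagrams. Assuming that the two channels are identical, the system states can be simply defined as follows:

| System state | System state definition |
|---|---|
| 0 | Two channels are operative (up state) |
| 1 | Only one channel is in operation (up state) |
| 2 | Both channels are in fault (down state) |

The states transition diagram is then easily obtained, as shown in Fig. 5, where two repair teams are available to work on all known failures in the system (see assumption xi).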
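Fig. 5. Markov states transition diagram.
This figure stands for the following differential equations:

$$\begin{cases}
P_0'(t) = -2\lambda P_0(t) + \mu P_1(t), \\
P_1'(t) = 2\lambda P_0(t) - (\lambda+\mu)P_1(t) + 2\mu P_2(t), \\
P_2'(t) = \lambda P_1(t) - 2\mu P_2(t).
\end{cases} \tag{13}$$

It is rewritten as

$$\begin{pmatrix} P_0'(t) \\ P_1'(t) \\ P_2'(t) \end{pmatrix} = \begin{pmatrix} -2\lambda & \mu & 0 \\ 2\lambda & -(\lambda+\mu) & 2\mu \\ 0 & \lambda & -2\mu \end{pmatrix} \begin{pmatrix} P_0(t) \\ P_1(t) \\ P_2(t) \end{pmatrix}. \tag{14}$$

Similar to the case of the 1oo1 architecture, the probabilistic parameters at steady state are of concern here. From Eq. (14), with $P_0$ and $P_1$ the steady state probabilities,

$$A = P_0 + P_1 = \frac{2\lambda\mu+\mu^2}{(\lambda+\mu)^2}.$$

Hence, $\mathrm{FF} = \lambda P_1$ according to Eq. (9). Then

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda^2}{(\lambda+\mu)^2}\cdot\frac{1}{\lambda P_1} = \frac{1}{2\mu}.$$

Referring to Eqs. (5) and (11), the equivalent mean down time of a channel in this system architecture, $t_{CE}$, is

$$t_{CE} = \frac{\lambda_{DU}}{\lambda_D}\left(T_1/3+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}. \tag{15}$$

As $\mu$ is the equivalent repair rate of a channel,

$$\mu = 1/t_{CE} = \left[\frac{\lambda_{DU}}{\lambda_D}\left(T_1/3+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right]^{-1}.$$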
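The steady-state relations for the 1oo2 chain can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the rates are those of Example 1 in Section 5.1, and the channel failure rate is taken as $\lambda = \lambda_{DD} + \lambda_{DU}$ with common cause failures ignored. It verifies that the steady-state vector satisfies Eq. (13) and that the resulting MDT equals $1/(2\mu)$.

```python
# Sketch only: steady state of the three-state 1oo2 chain of Eq. (13).


def t_ce_1oo2(lam_dd, lam_du, t1, mttr):
    """Channel equivalent mean down time, Eq. (15)."""
    lam_d = lam_dd + lam_du
    return lam_du / lam_d * (t1 / 3 + mttr) + lam_dd / lam_d * mttr


def steady_state_1oo2(lam, mu):
    """Steady-state probabilities (P0, P1, P2) of Eq. (13)."""
    total = (lam + mu) ** 2
    return [mu * mu / total, 2 * lam * mu / total, lam * lam / total]


def rhs_eq13(p, lam, mu):
    """Right-hand sides of Eq. (13); all three vanish at steady state."""
    return [
        -2 * lam * p[0] + mu * p[1],
        2 * lam * p[0] - (lam + mu) * p[1] + 2 * mu * p[2],
        lam * p[1] - 2 * mu * p[2],
    ]


# Example 1 of Section 5.1 (same time unit as used there).
LAM_DD, LAM_DU, MTTR, T1 = 1e-5, 5e-6, 10.0, 5e3
lam = LAM_DD + LAM_DU          # channel failure rate, common cause ignored
t_ce = t_ce_1oo2(LAM_DD, LAM_DU, T1, MTTR)
mu = 1.0 / t_ce                # equivalent repair rate of a channel

p = steady_state_1oo2(lam, mu)
availability = p[0] + p[1]
failure_frequency = lam * p[1]           # Eq. (9) applied to Fig. 5
mdt = (1 - availability) / failure_frequency

print("t_CE:", t_ce)
print("residual of Eq. (13):", rhs_eq13(p, lam, mu))
print("MDT:", mdt, "1/(2 mu):", 1 / (2 * mu))
```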
The equivalent mean down time for this system architecture is obtained as

$$t_{GE} = \mathrm{MDT} = \frac{1}{2\mu} = \frac{1}{2}\left[\frac{\lambda_{DU}}{\lambda_D}\left(T_1/3+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right]. \tag{16}$$

Hence, the $\mathrm{PFD}_G$ for this architecture is

$$\mathrm{PFD}_G = P_2 + P_{CC} = \frac{\lambda^2}{(\lambda+\mu)^2} + P_{CC} \approx \lambda^2/\mu^2 + P_{CC} = \lambda^2 t_{CE}^2 + P_{CC} = 2[(1-\beta_D)\lambda_{DD} + (1-\beta)\lambda_{DU}]^2 t_{CE}t_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR}) \tag{17}$$

by considering the effects of common causes and $\lambda \ll \mu$.
Referring to Fig. 2b, the two channels and the common cause component compose a series system. According to the definitions of $\beta$ and $\beta_D$, the probability of failure for the detectable fault caused by common cause is $\beta_D\lambda_{DD}\mathrm{MTTR}$ and the probability of failure for the undetectable fault due to common cause is $\beta\lambda_{DU}(T_1/2+\mathrm{MTTR})$. Here, $T_1/2 + \mathrm{MTTR}$ is the equivalent mean down time of the undetectable fault in a channel. Hence, we have Eq. (17).

4.3. 1oo2D architecture
Two channels in this architecture are connected in parallel. During normal operation, both channels need to demand the safety function before it can take place. In addition, if the diagnostic tests detect a fault in either channel, then the output voting is adapted so that the overall output state then follows that given by the other channel. If the diagnostic tests find faults in both channels or a discrepancy that cannot be allocated to either channel, then the output goes to the safe state. In order to detect a discrepancy between the channels, either channel can determine the state of the other via a means independent of the other channel. See Figs. 1c and 2c for the system block diagrams.
Since each component follows the exponential distribution, comparing Fig. 2b and c, the values of the equivalent mean down times for each channel and the architecture are:

$$t'_{CE} = \frac{\lambda_{DU}(T_1/3+\mathrm{MTTR})+(\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{\lambda_{DU}+\lambda_{DD}+\lambda_{SD}},$$

$$t'_{GE} = \frac{\lambda_{DU}(T_1/3+\mathrm{MTTR})+(\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{2(\lambda_{DU}+\lambda_{DD}+\lambda_{SD})}.$$

The $\mathrm{PFD}_G$ for this architecture is then obtained by referring to Eq. (17) as follows:

$$\mathrm{PFD}_G = 2[(1-\beta)\lambda_{DU} + \lambda_{SD} + (1-\beta_D)\lambda_{DD}](1-\beta)\lambda_{DU}\,t'_{CE}t'_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR}). \tag{18}$$

4.4. 2oo2 architecture
This system consists of two channels connected in parallel. The system is in fault whenever either channel fails. See Figs. 1d and 2d for the system block diagrams. From Eq. (12), the $\mathrm{PFD}_G$ for this architecture is easily obtained on the basis of the reliability block diagram:

$$\mathrm{PFD}_G = 2\lambda_D t_{CE}, \tag{19}$$

where $t_{CE}$ is given in Eq. (11).
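The closed-form expressions of Eqs. (17)-(19) lend themselves to a small calculator. The following is a sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, and the values chosen for $\beta$, $\beta_D$ and $\lambda_{SD}$ are purely illustrative and do not come from the paper. The remaining rates are those of Example 1 in Section 5.1.

```python
# Sketch only: closed-form PFD_G of Eqs. (17)-(19).


def t_ce(lam_dd, lam_du, t1, mttr, divisor):
    """Channel equivalent mean down time.

    divisor=2 gives Eq. (11); divisor=3 gives Eq. (15).
    """
    lam_d = lam_dd + lam_du
    return lam_du / lam_d * (t1 / divisor + mttr) + lam_dd / lam_d * mttr


def common_cause(lam_dd, lam_du, t1, mttr, beta, beta_d):
    """Common cause contribution shared by Eqs. (17) and (18)."""
    return beta_d * lam_dd * mttr + beta * lam_du * (t1 / 2 + mttr)


def pfd_1oo2(lam_dd, lam_du, t1, mttr, beta, beta_d):
    """Eq. (17), with t_GE = t_CE / 2 from Eq. (16)."""
    tce = t_ce(lam_dd, lam_du, t1, mttr, 3)
    tge = tce / 2
    lam = (1 - beta_d) * lam_dd + (1 - beta) * lam_du
    ccf = common_cause(lam_dd, lam_du, t1, mttr, beta, beta_d)
    return 2 * lam ** 2 * tce * tge + ccf


def pfd_1oo2d(lam_dd, lam_du, lam_sd, t1, mttr, beta, beta_d):
    """Eq. (18), with t'_CE and t'_GE as given in Section 4.3."""
    total = lam_du + lam_dd + lam_sd
    tce = (lam_du * (t1 / 3 + mttr) + (lam_dd + lam_sd) * mttr) / total
    tge = tce / 2
    rate = (1 - beta) * lam_du + lam_sd + (1 - beta_d) * lam_dd
    ccf = common_cause(lam_dd, lam_du, t1, mttr, beta, beta_d)
    return 2 * rate * (1 - beta) * lam_du * tce * tge + ccf


def pfd_2oo2(lam_dd, lam_du, t1, mttr):
    """Eq. (19), using t_CE of Eq. (11)."""
    return 2 * (lam_dd + lam_du) * t_ce(lam_dd, lam_du, t1, mttr, 2)


# Rates of Example 1 (Section 5.1); BETA, BETA_D and LAM_SD are hypothetical.
LAM_DD, LAM_DU, MTTR, T1 = 1e-5, 5e-6, 10.0, 5e3
BETA, BETA_D, LAM_SD = 0.02, 0.01, 1e-5

print("1oo2 :", pfd_1oo2(LAM_DD, LAM_DU, T1, MTTR, BETA, BETA_D))
print("1oo2D:", pfd_1oo2d(LAM_DD, LAM_DU, LAM_SD, T1, MTTR, BETA, BETA_D))
print("2oo2 :", pfd_2oo2(LAM_DD, LAM_DU, T1, MTTR))
```

Keeping the common cause term in one helper makes it explicit that Eqs. (17) and (18) differ only in their independent-failure part.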

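4.5. 2oo3 architecture
The block diagrams of this architecture are shown in Figs. 1e and 2e. Referring to Section 4.2, the Markov states transition diagram for this architecture is given in Fig. 6, where assumption xi applies.
Fig. 6. Markov states transition diagram.
Furthermore, the following equations hold according to Fig. 6:

$$\begin{cases}
P_0'(t) = -3\lambda P_0(t) + \mu P_1(t), \\
P_1'(t) = 3\lambda P_0(t) - (2\lambda+\mu)P_1(t) + 2\mu P_2(t), \\
P_2'(t) = 2\lambda P_1(t) - (\lambda+2\mu)P_2(t) + 3\mu P_3(t), \\
P_3'(t) = \lambda P_2(t) - 3\mu P_3(t).
\end{cases} \tag{20}$$

We investigate the steady state probabilistic parameters. At steady state, the system unavailability $(1 - A)$ and FF are obtained as

$$1 - A = \frac{\lambda^2(\lambda+3\mu)}{(\lambda+\mu)^3} \tag{21}$$

and

$$\mathrm{FF} = 2\lambda P_1 = 2\lambda\,\frac{3\lambda\mu^2}{(\lambda+\mu)^3} \tag{22}$$

based on Eq. (9). Hence, the MDT of the system is given by

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda+3\mu}{6\mu^2}, \tag{23}$$

where $\mu = 1/t_{CE}$ and $t_{CE}$ is given in Eq. (15). Since $\mu \gg \lambda$, $\mathrm{MDT} \approx 1/(2\mu)$. Therefore

$$t_{GE} = \mathrm{MDT} = \left[\frac{\lambda_{DU}}{\lambda_D}\left(T_1/3+\mathrm{MTTR}\right)+\frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right]/2.$$

For this architecture, the following is easily obtained by referring to Eqs. (17) and (21):

$$\mathrm{PFD}_G = 6[(1-\beta_D)\lambda_{DD} + (1-\beta)\lambda_{DU}]^2 t_{CE}t_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR}) \tag{24}$$

since $\mu \gg \lambda$.

5. Discussions
5.1. Discrepancies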
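In Section 4, the equivalent mean down times and average probabilities of failure on demand for five system architectures are obtained by the Markov model. They are the steady state values of the corresponding systems. However, the equivalent mean down times $t_{CE}$, $t_{GE}$ and $t'_{GE}$ for the 1oo2, 1oo2D and 2oo3 architectures shown in Section 4 are different from those described in Annex B of Part 6 of IEC 61508 [10]. In the standard, $t_{CE}$ expressed in Eq. (11) is used for all of the 1oo1, 1oo2, 1oo2D, 2oo2 and 2oo3 typical architectures. For comparison, the results for $t_{CE}$, $t_{GE}$ and $t'_{GE}$ used for these system architectures are listed in Table 1. In the standard, no description can be found of how the expressions for $t_{CE}$, $t_{GE}$, $t'_{GE}$ and the average probability of failure on demand are obtained.
The average probabilities of failure on demand for these three architectures in IEC 61508 are of the same forms as those given in Section 4, but $t_{CE}$, $t_{GE}$ and $t'_{GE}$ are different.

Table 1
Comparisons among equivalent mean down times for the three system architectures

| System | $t_{CE}$, $t_{GE}$ and $t'_{GE}$ obtained by Markov model | $t_{CE}$, $t_{GE}$ and $t'_{GE}$ given in IEC 61508-6 |
|---|---|---|
| 1oo2 | $t_{CE} = (\lambda_{DU}/\lambda_D)(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ | $t_{CE} = (\lambda_{DU}/\lambda_D)(T_1/2 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ |
| 2oo3 | $t_{GE} = (1/2)[(\lambda_{DU}/\lambda_D)(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}]$ | $t_{GE} = (\lambda_{DU}/\lambda_D)(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ |
| 1oo2D | $t'_{CE} = (\lambda_{DU}/\lambda)(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda)\mathrm{MTTR}$, $\lambda = \lambda_{DD} + \lambda_{DU} + \lambda_{SD}$ | $t'_{CE} = (\lambda_{DU}/\lambda)(T_1/2 + \mathrm{MTTR}) + (\lambda_{DD}/\lambda)\mathrm{MTTR}$, $\lambda = \lambda_{DD} + \lambda_{DU} + \lambda_{SD}$ |
| 1oo2D | $t'_{GE} = \dfrac{\lambda_{DU}(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD} + \lambda_{SD})\mathrm{MTTR}}{2(\lambda_{DU} + \lambda_{DD} + \lambda_{SD})}$ | $t'_{GE} = \dfrac{\lambda_{DU}(T_1/3 + \mathrm{MTTR}) + (\lambda_{DD} + \lambda_{SD})\mathrm{MTTR}}{\lambda_{DU} + \lambda_{DD} + \lambda_{SD}}$ |

In fact, the systems discussed in this paper cannot reach steady state within the proof test interval, as the mean down time of the undetectable fault in a channel is $T_1/2 + \mathrm{MTTR}$ and thus the corresponding equivalent repair rate is much smaller. The average probabilities of failure on demand for the five typical architectures should be calculated by the following equation:

$$\mathrm{PFD}_G = \frac{1}{T_1}\int_0^{T_1}[1 - A(t)]\,dt. \tag{25}$$

For the 1oo1 architecture,

$$\mathrm{PFD}_G = \frac{1}{T_1}\int_0^{T_1}[1 - P_0(t)]\,dt = \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})} - \frac{1}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})T_1}\Big\{\frac{\lambda_{DD}\mu_{DU}}{\lambda_{DD}+\mu_{DD}}[1-\exp(-(\lambda_{DD}+\mu_{DD})T_1)] + \frac{\lambda_{DU}\mu_{DD}}{\lambda_{DU}+\mu_{DU}}[1-\exp(-(\lambda_{DU}+\mu_{DU})T_1)] + \frac{\lambda_{DD}\lambda_{DU}}{\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU}}[1-\exp(-(\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU})T_1)]\Big\}. \tag{26}$$

For the 1oo2 architecture,

$$\mathrm{PFD}_G = \frac{1}{T_1}\int_0^{T_1}P_2(t)\,dt = \frac{\lambda^2}{(\lambda+\mu)^2} - \frac{2\lambda^2}{(\lambda+\mu)^3T_1}[1-\exp(-(\lambda+\mu)T_1)] + \frac{\lambda^2}{(\lambda+\mu)^3T_1}[1-\exp(-2(\lambda+\mu)T_1)], \tag{27}$$

where $P_2(t)$ is obtained from Eq. (13).
For the 2oo3 architecture,

$$\mathrm{PFD}_G = \frac{1}{T_1}\int_0^{T_1}[P_2(t)+P_3(t)]\,dt = \frac{\lambda^2(\lambda+3\mu)}{(\lambda+\mu)^3} - \frac{6\lambda^2\mu}{(\lambda+\mu)^4T_1}[1-\exp(-(\lambda+\mu)T_1)] - \frac{3\lambda^2(\lambda-\mu)}{2(\lambda+\mu)^4T_1}[1-\exp(-2(\lambda+\mu)T_1)] + \frac{2\lambda^3}{3(\lambda+\mu)^4T_1}[1-\exp(-3(\lambda+\mu)T_1)], \tag{28}$$

where $P_2(t)$ and $P_3(t)$ are obtained from Eq. (20).
Through the above derivations, it is found that the values of $\mathrm{PFD}_G$ calculated by Eq. (25) and by the system steady state values are different. In the following, let us investigate the differences by numerical examples.
Example 1: $\lambda_{DD} = 10^{-5}$; $\lambda_{DU} = 5 \times 10^{-6}$; $\mathrm{MTTR} = 10$; $T_1 = 5 \times 10^3$.
$\mathrm{PFD}_G$ values are shown in Table 2, where the effects of common cause failures are not considered.

Table 2
$\mathrm{PFD}_G$ calculated by two methods

| System architecture | $\mathrm{PFD}_G$ by Eq. (25) | $\mathrm{PFD}_G$ by steady state values |
|---|---|---|
| 1oo1 | $7.17 \times 10^{-3}$ | $1.25 \times 10^{-2}$ |
| 1oo2 | $6.28 \times 10^{-5}$ | $7.08 \times 10^{-5}$ |
| 2oo3 | $1.76 \times 10^{-4}$ | $2.11 \times 10^{-4}$ |

Example 2: $\lambda_{DD} = 10^{-6}$; $\lambda_{DU} = 10^{-6}$; $\mathrm{MTTR} = 8$; $T_1 = 8600$.
See Table 3 for $\mathrm{PFD}_G$ values calculated by two different methods, where the effects of common cause failures are not involved.

Table 3
$\mathrm{PFD}_G$ obtained by two different methods

| System architecture | $\mathrm{PFD}_G$ by Eq. (25) | $\mathrm{PFD}_G$ by steady state values |
|---|---|---|
| 1oo1 | $2.45 \times 10^{-3}$ | $4.30 \times 10^{-3}$ |
| 1oo2 | $6.89 \times 10^{-6}$ | $8.26 \times 10^{-6}$ |
| 2oo3 | $1.86 \times 10^{-5}$ | $2.47 \times 10^{-5}$ |

By comparing the values shown in Tables 2 and 3, it can be concluded that the value of $\mathrm{PFD}_G$ obtained from the steady state system values is a little larger than the one calculated by Eq. (25) for the same system architecture, but the two are close to each other. Therefore, it is reasonable to calculate $\mathrm{PFD}_G$ from the steady state system values. Moreover, this is simple to apply in engineering.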
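The two examples can be recomputed with a few lines of code. The sketch below is an independent check, not the authors' program: the function names are hypothetical, each channel of the 1oo2 and 2oo3 groups is assumed to behave as an independent repairable unit with unavailability $\frac{\lambda}{\lambda+\mu}[1-\exp(-(\lambda+\mu)t)]$ (which solves Eqs. (13) and (20) for a system starting in state 0), and $\mu = 1/t_{CE}$ with $t_{CE}$ from Eq. (15). Under these assumptions the sketch agrees with the tabulated 1oo1 and 2oo3 entries and with all steady state entries to within rounding. For the 1oo2 entries by Eq. (25) it gives roughly $5.9 \times 10^{-5}$ and $6.2 \times 10^{-6}$; the tabulated $6.28 \times 10^{-5}$ and $6.89 \times 10^{-6}$ are reproduced when the last term of Eq. (27) is taken without the factor 1/2 that integrating $\exp(-2(\lambda+\mu)t)$ would contribute. This is an observation from the sketch only.

```python
import math


def g(rate, t1):
    """Time average of exp(-rate * t) over [0, t1]."""
    return -math.expm1(-rate * t1) / (rate * t1)


def pfd_1oo1(lam_dd, lam_du, mttr, t1):
    """Return (value by Eq. (25), steady state value) for Section 4.1."""
    mu_dd, mu_du = 1 / mttr, 1 / (t1 / 2 + mttr)
    r_dd, r_du = lam_dd + mu_dd, lam_du + mu_du
    steady = 1 - mu_dd * mu_du / (r_dd * r_du)
    # Time average of P0(t) from Eq. (8), term by term.
    avg_p0 = (mu_dd * mu_du
              + lam_dd * mu_du * g(r_dd, t1)
              + lam_du * mu_dd * g(r_du, t1)
              + lam_dd * lam_du * g(r_dd + r_du, t1)) / (r_dd * r_du)
    return 1 - avg_p0, steady


def channel(lam_dd, lam_du, mttr, t1):
    """Return (lambda/(lambda+mu), lambda+mu) with mu = 1/t_CE, Eq. (15)."""
    lam = lam_dd + lam_du
    t_ce = lam_du / lam * (t1 / 3 + mttr) + lam_dd / lam * mttr
    r = lam + 1 / t_ce
    return lam / r, r


def pfd_1oo2(lam_dd, lam_du, mttr, t1, as_printed=False):
    """Return (time average of P2(t), steady state P2).

    as_printed=True drops the factor 1/2 on the last term (see lead-in).
    """
    p, r = channel(lam_dd, lam_du, mttr, t1)
    last = g(2 * r, t1) * (2 if as_printed else 1)
    return p * p * (1 - 2 * g(r, t1) + last), p * p


def pfd_2oo3(lam_dd, lam_du, mttr, t1):
    """Return (time average of P2(t) + P3(t), steady state value)."""
    p, r = channel(lam_dd, lam_du, mttr, t1)
    g1, g2, g3 = g(r, t1), g(2 * r, t1), g(3 * r, t1)
    avg_q2 = 1 - 2 * g1 + g2           # average of (1 - exp(-r t))**2
    avg_q3 = 1 - 3 * g1 + 3 * g2 - g3  # average of (1 - exp(-r t))**3
    steady = 3 * p ** 2 - 2 * p ** 3
    return 3 * p ** 2 * avg_q2 - 2 * p ** 3 * avg_q3, steady


# (lambda_DD, lambda_DU, MTTR, T1) as given in Examples 1 and 2.
EXAMPLES = {
    "Example 1": (1e-5, 5e-6, 10.0, 5e3),
    "Example 2": (1e-6, 1e-6, 8.0, 8600.0),
}

for name, args in EXAMPLES.items():
    print(name)
    print("  1oo1:", pfd_1oo1(*args))
    print("  1oo2:", pfd_1oo2(*args))
    print("  1oo2 (last term as printed):", pfd_1oo2(*args, as_printed=True))
    print("  2oo3:", pfd_2oo3(*args))
```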
5.2. Effects of common cause failures
Common cause failure is an important element in the construction of redundant system architectures. The common cause part and the rest of the redundant structure form a logical series system; see Fig. 2b, c and e. The effects of common cause failures on the system average probability of failure on demand are then represented by

$$\beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2 + \mathrm{MTTR})$$

for the 1oo2, 1oo2D and 2oo3 system architectures. The contribution of common cause failure to $\mathrm{PFD}_G$ is influenced by the factors $\beta_D$ and $\beta$, which depend on the physical system.
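To give a feel for how strongly $\beta$ and $\beta_D$ influence the result, the sketch below splits Eq. (24) for the 2oo3 architecture into its independent and common cause terms and reports the common cause share over a range of $\beta$. It is an illustration only: the function name is hypothetical, the $\beta$ values and the choice $\beta_D = \beta/2$ are assumptions made for the example and are not taken from the paper, and the remaining rates are those of Example 1 in Section 5.1.

```python
# Sketch only: share of the common cause term in Eq. (24) (2oo3).


def pfd_2oo3_terms(lam_dd, lam_du, t1, mttr, beta, beta_d):
    """Return (independent term, common cause term) of Eq. (24)."""
    lam_d = lam_dd + lam_du
    t_ce = lam_du / lam_d * (t1 / 3 + mttr) + lam_dd / lam_d * mttr  # Eq. (15)
    t_ge = t_ce / 2
    lam = (1 - beta_d) * lam_dd + (1 - beta) * lam_du
    independent = 6 * lam ** 2 * t_ce * t_ge
    ccf = beta_d * lam_dd * mttr + beta * lam_du * (t1 / 2 + mttr)
    return independent, ccf


LAM_DD, LAM_DU, MTTR, T1 = 1e-5, 5e-6, 10.0, 5e3  # Example 1, Section 5.1
BETAS = (0.0, 0.01, 0.02, 0.05, 0.1)              # hypothetical values

shares = []
for beta in BETAS:
    beta_d = beta / 2  # hypothetical relation between beta_D and beta
    independent, ccf = pfd_2oo3_terms(LAM_DD, LAM_DU, T1, MTTR, beta, beta_d)
    shares.append(ccf / (independent + ccf))
    print(f"beta={beta:.2f}  PFD_G={independent + ccf:.3e}  "
          f"common cause share={shares[-1]:.1%}")
```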
6. Remarks
In IEC 61508-6, the down time $t_{c1}$ is not defined. If its meaning is not clearly known, all other results presented in the standard can be difficult for an ordinary safety engineer to understand, because it is the basis of all the typical system architectures. Here $t_{c1}$ is newly named the equivalent mean down time of the undetected failure in a channel. It is defined as $T_1 - t_a + \mathrm{MTTR}$, where $t_a$ stands for the time at which the average probability of failure for the undetectable fault occurs in a system in the interval $[0, T_1]$. With the meaning of $t_{c1}$ defined, all other results for the average probability of failure on demand of the typical architectures can be understood.
As discussed in the above sections, there is a discrepancy in the calculation of $t_{CE}$, $t_{GE}$ and $t'_{GE}$ for the 1oo2, 1oo2D and 2oo3 system architectures between the expressions given in the standard IEC 61508-6 and the new ones obtained by the Markov model. These differences are presented in Table 1.
The average probabilities of failure on demand obtained by the Markov model for the 1oo2, 1oo2D and 2oo3 system architectures are of the same forms as those presented in IEC 61508-6. However, the expressions for $t_{CE}$, $t_{GE}$ or $t'_{GE}$ given there for these three architectures differ from the newly obtained ones. No detailed description can be found in the standard of how $t_{CE}$, $t_{GE}$ and $t'_{GE}$ are obtained. As a result, it is suggested that the new expressions for $t_{CE}$, $t_{GE}$ and $t'_{GE}$ be applied.
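The definition of $t_{c1}$ can be checked numerically. The sketch below interprets $t_a$ as the mean time of occurrence of the undetected failure, given that it occurs within $[0, T_1]$, which is the reading used in Section 3; the function names and the use of Simpson's rule are choices made for this illustration. With $\lambda_{DU}$, $T_1$ and MTTR taken from Example 1 of Section 5.1, it compares $t_a$ with $T_1/2$ for one channel and with $2T_1/3$ for two channels.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def t_a(lam_du, t1, channels):
    """Mean failure time given that all channels have failed within [0, t1]."""
    def pdf(t):
        # Derivative of [1 - exp(-lam_du * t)] ** channels.
        e = math.exp(-lam_du * t)
        return channels * lam_du * (1 - e) ** (channels - 1) * e

    return simpson(lambda t: t * pdf(t), 0.0, t1) / simpson(pdf, 0.0, t1)


LAM_DU, T1, MTTR = 5e-6, 5e3, 10.0  # Example 1, Section 5.1

ta_1 = t_a(LAM_DU, T1, channels=1)
ta_2 = t_a(LAM_DU, T1, channels=2)
tc1_1 = T1 - ta_1 + MTTR  # compare with T1/2 + MTTR
tc1_2 = T1 - ta_2 + MTTR  # compare with T1/3 + MTTR

print("one channel : t_a/T1 =", ta_1 / T1, " t_c1 =", tc1_1)
print("two channels: t_a/T1 =", ta_2 / T1, " t_c1 =", tc1_2)
```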
References
[1] IEC 61508. Functional safety of electric/electronic/programmable electronic safety-related systems, Parts 1-7; October 1998-May 2000.
[2] Karydas DM, Brombacher AC (Guest editors). Special issue: Reliability certification of programmable electronic systems. Reliab Engng Syst Safety, No. 2; 1999. p. 66.
[3] Kato E, Sato Y. Safety integrity levels model for IEC 61508: examination of modes of operation. IEICE Trans A 2000;E83-A(5):863-5.
[4] Muta H, Ibe H, Sugiyama E. Safety design of oil reclamation system using IEC 61508. PSAM5, Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27-Dec. 1, 2000. p. 479-84.
[5] Kawahara T, Kushibiki T, et al. Safety-integrity of safety-related systems with human beings. PSAM5, Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27-Dec. 1, 2000. p. 2411-7.
[6] Kato E, Sato Y. Safety integrity levels model for IEC 61508. PSAM5, Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27-Dec. 1, 2000. p. 2787-93.
[7] Misumi Y, Sato Y. Estimation of average hazardous-event-frequency for allocation of safety-integrity levels. Reliab Engng Syst Safety 1999;66:135-44.
[8] ISA-S84.01.1996. Application of safety instrumented systems for process industries. Instrument Society of America, Research Triangle Park; 1996.
[9] Rouvroye JL, Brombacher AC. New quantitative safety standards: different techniques, different results? Reliab Engng Syst Safety 1999;66:121-5.
[10] IEC 61508-6. Functional safety of electric/electronic/programmable electronic safety-related systems. Part 6. Guidelines on the application of IEC 61508-2 and IEC 61508-3; April 2000.
[11] Cao JH, Cheng K. An introduction to mathematics of reliability. Beijing: Publication of Science; 1986. p. 2. p. 210-30, in Chinese.
