
Reliability Engineering and System Safety 80 (2003) 133–141

www.elsevier.com/locate/ress

Availability of systems with self-diagnostic components: applying Markov model to IEC 61508-6

Tieling Zhang (a,*), Wei Long (b), Yoshinobu Sato (b)

(a) HAL Corporation, 6-21-17-701 Nishikasai, Edogawa-Ku, Tokyo 134-0088, Japan
(b) Tokyo University of Mercantile Marine, 2-1-6 Etchujima, Koto-Ku, Tokyo 135-8533, Japan

Received 11 December 2000; accepted 19 December 2002

Abstract
Each of the techniques applicable to safety-related analyses is suited to some aspects of system safety behavior. Conversely, several techniques may be applied to the same risk-related aspect of system behavior, but they do not always lead to the same results. Rouvroye and Brombacher compared these techniques and indicated that the Markov and Enhanced Markov analysis techniques cover most aspects of the safety-related behavior of systems. Following their conclusion, this paper applies the Markov method to the quantitative analysis in Part 6 of the standard IEC 61508. The purpose is to explain in detail the solutions given in the standard, because many results are not clearly described there and it is not easy for a safety engineer to find out how they were obtained. In addition, the down time $t_{c1}$ shown in the standard is newly defined, because it is the basis for the average probability of failure on demand of the system architectures and its meaning is not clearly explained. The derivation, however, reveals a discrepancy in the standard. New suggestions are therefore proposed based on the results obtained.
© 2003 Elsevier Science Ltd. All rights reserved.
Keywords: IEC 61508; Self-diagnosis; Probability of failure on demand; Markov model

1. Introduction
Recently IEC 61508 [1] was compiled and published as a modern international standard. Many studies discussing and applying the standard have been carried out, such as those published in the special issue of Reliability Engineering & System Safety in 1999 [2] and some others [3–6]. The standard is concerned with two frameworks. One is risk reduction with a Safety-Related System (SRS) and the other is the Overall Safety Life-Cycle. In order to understand the first framework more profoundly, the dependence of the risk reduction on both the Safety Integrity Levels (SILs) of the SRS and the demands from the Equipment Under Control on the SRS has to be clarified. Misumi and Sato [7] studied this point and expressed the mutual relationship mathematically by means of a simple fault tree analysis (FTA). Their research needs to be developed further. The configuration of the SRS, the proof test, the self-diagnostic coverage and the failure rates of components are often utilized in the evaluation of the SILs of SRSs.

* Corresponding author.
E-mail addresses: zhangtling@yahoo.com (T. Zhang), yoshi@ipc.tosho-u.ac.jp (Y. Sato).

The SILs of an SRS need to be evaluated by quantitative analyses, as required by the standard IEC 61508 and some others such as ISA-S84.01 [8]. There are many quantitative analysis techniques, such as Markov analysis, reliability block diagrams, hybrid techniques, parts count analysis and FTA. Each of them may cover several aspects of the safety-related behavior of a system. At the same time, one aspect of the risk-related behavior of a system may be suitably analyzed by several of them, but they do not always lead to the same results. Rouvroye and Brombacher [9] outlined these techniques and compared them with each other. The results calculated by these techniques for the same example showed large differences. They therefore pointed out that applying different (quantitative) techniques to practical systems would not always lead to the same, definite results. However, they also clearly stated that Markov analysis covers most aspects of the quantitative safety evaluation of systems. The aspects that this approach cannot cover are limited to uncertainty and sensitivity analyses. These aspects, however, can be carried out with
the Enhanced Markov analysis. So the combination of the above two types of Markovian methods is preferable and hence recommended.

The present paper takes up the SRSs. Their system configurations are composed of channels that include both detectable failures with self-diagnosis and undetectable failures. The Markovian approach is applied to the quantitative SIL analyses. In most cases, it can give satisfactory results. The intention of this paper is to present clues to the solutions for many probabilistic parameters specified in IEC 61508-6 [10], based on the specific system structures and associated conditions. It is not easy for an ordinary safety engineer to understand the results for these probabilistic parameters because the standard gives no detailed derivation. In particular, the down time $t_{c1}$ is newly defined in this paper, since it is the basis of the typical system architectures and its meaning is not clearly explained in the standard. After detailed derivation, however, difficulties and a discrepancy are found for some specified examples in Part 6 of IEC 61508. Suggestions for solving them are hence proposed.

0951-8320/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved.
doi:10.1016/S0951-8320(03)00004-8

Nomenclature

$A$    steady state availability
FF    failure frequency
$F(t)$    probability of failure of a system
MDT    mean down time
MTTR    mean time to restoration
$P_{CC}$    probability of failure caused by common cause
pdf    probability density function
$P_i(t)$, $P_i$    probability and steady state probability of system in the $i$th state
PFD    average probability of failure on demand for one channel
$\mathrm{PFD}_G$    average probability of failure on demand for system architectures
$T_1$    proof-test interval
$t_{c1}$    equivalent mean down time for the undetected failure of a channel
$t_{c2}$    equivalent mean down time for the detected failure of a channel
$t_{CE}$    channel equivalent mean down time for 1oo1, 1oo2, 2oo2, 2oo3 architectures
$t'_{CE}$    channel equivalent mean down time for 1oo2D architecture
$t_{GE}$    voted group equivalent mean down time for 1oo2 and 2oo3 architectures
$t'_{GE}$    voted group equivalent mean down time for 1oo2D architecture
$\lambda$, $\mu$    failure and equivalent repair rates of a channel
$\lambda_D$    dangerous failure rate ($\lambda_D = \lambda_{DD} + \lambda_{DU}$) of a channel in a subsystem
$\lambda_{DD}$    detected dangerous failure rate of a channel in a subsystem
$\lambda_{DU}$    undetected dangerous failure rate
$\lambda_{SD}$    detected safe failure rate of a channel
$\mu_{DD}$    repair rate of detected dangerous fault in a channel
$\mu_{DU}$    repair rate of undetected dangerous fault in a channel
$\beta$    the fraction of undetected failures that have a common cause (expressed as a fraction in the equations and as a percentage elsewhere)
$\beta_D$    of those failures that are detected by the diagnostic tests, the fraction that have a common cause (expressed as a fraction in the equations and as a percentage elsewhere)

For details, refer to Table B.1 in Annex B of IEC 61508-6 [10].

2. Assumptions

In order to describe the state transitions of systems as clearly as possible, we make the following assumptions:

i. Component failure and repair rates are constant over the life of the system.
ii. The resulting average probability of failure on demand for the subsystem is less than $10^{-1}$, or the resultant probability of failure per hour for the subsystem is less than $10^{-5}$.
iii. The input subsystem comprises the actual sensor(s) and any other components and wiring, up to but not including the component(s) where the signals are first combined by voting or other processing. For example, the configuration for two sensor channels is shown in Annex B of Part 6 of IEC 61508 [10].
iv. The hardware failure rates used as inputs to the calculations and tables are for a single channel of the subsystem. For example, if 2-out-of-3 sensors are used, the failure rate is for a single sensor and the resulting failure rate for the 2-out-of-3 group is calculated separately.
v. All channels in a voted group have the same failure rate and diagnostic coverage rate.
vi. The overall hardware failure rate of a channel in a subsystem is the sum of the dangerous and safe failure rates for that channel. These values are assumed to be equal.
vii. For each safety function, proof testing and repair are perfect. Namely, all failures that remain undetected are assumed to be detected by the proof test.
viii. The proof test interval is at least one order of magnitude greater than the diagnostic test interval.
ix. The demand rate and expected interval between demands are not considered in this paper. Therefore, we can analyze the SRS failures separately from the demand.
x. For each subsystem, there is a single $T_1$ and MTTR. MTTR is defined to include the time taken to detect a failure. It is at least one order of magnitude less than $T_1$. In this paper, the single assumed value of MTTR for both detected and undetected failures includes the diagnostic test interval but not $T_1$. For undetected failures, the MTTR used in the calculations should not include the diagnostic test interval, since the mean time to restoration is always added to the proof test interval, which is at least one order of magnitude greater than the diagnostic test interval. The error introduced here is not significant.
xi. Multiple repair teams (each of them is assumed to have the same repair rate) are available to work on all known faults in a system.
xii. In a channel, the detected and undetected faults can exist simultaneously, i.e. if one has occurred, the other can occur before the former is repaired.
xiii. Repairs of the detected and undetected faults are viewed as independent for the sake of conservative consideration, though some dependence may be involved in the process.

For other assumptions, refer to Annex B of IEC 61508-6 [10].

3. Meaning of down time, $t_{c1}$

In the standard IEC 61508, system configurations are composed of channels. Each channel includes both failures detectable by self-diagnosis, with rate $\lambda_{DD}$, and undetectable failures, with rate $\lambda_{DU}$. See Figs. 1 and 2 for the physical and reliability block diagrams of five typical system architectures. The failure rates $\lambda_{DD}$ and $\lambda_{DU}$ are assumed to be constant. Hence, the times of occurrence of these two kinds of failures follow exponential distributions. For the repair of a detected dangerous fault, the MTTR is used. However, an undetectable dangerous fault cannot be detected until the next proof test. It follows the process shown in Fig. 3, where $t$ is the time of occurrence of the failure and $t_d$ is the duration of the down time.

In the standard, $t_{c1}$ is not clearly defined (refer to Fig. 2). However, it is the basis for all results for the average probability of failure on demand of the typical system architectures. If its meaning is not clear, it is not easy for a safety engineer to understand all the other solutions. Here, $t_{c1}$ is named the equivalent mean down time for the undetectable fault in a channel. It is defined as follows:

Suppose $t_a$ is the time when the average probability of failure for the undetectable fault in the interval $[0, T_1]$ occurs in a system. Then

$$t_{c1} = T_1 - t_a + \mathrm{MTTR}. \tag{1}$$

Fig. 1. Physical block diagrams.

For one channel architecture,

$$t_a = \frac{\int_0^{T_1} t\,\lambda_{DU}\exp(-\lambda_{DU}t)\,dt}{\int_0^{T_1} \lambda_{DU}\exp(-\lambda_{DU}t)\,dt} = \frac{\int_0^{T_1} t\,\lambda_{DU}\exp(-\lambda_{DU}t)\,dt}{1-\exp(-\lambda_{DU}T_1)} \approx T_1/2 \tag{2}$$

since $\lambda_{DU}T_1 \ll 1$ and $\exp(-\lambda_{DU}T_1) \approx 1 - \lambda_{DU}T_1$. Therefore

$$t_{c1} = T_1/2 + \mathrm{MTTR}. \tag{3}$$

If $\lambda_D T_1 < 0.1$, $T_1/2$ is quite a good approximation to the real value of $t_a$.
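The quality of the approximation $t_a \approx T_1/2$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the ratio of integrals in Eq. (2) with a simple midpoint rule. The function name and the numerical values of `lam_du`, `t1` and `mttr` are hypothetical and were chosen only so that $\lambda_{DU}T_1 = 0.05$.

```python
import math


def t_a_single_channel(lam_du, t1, steps=100_000):
    """Mean failure time of an undetected fault, given failure within [0, t1].

    Evaluates the ratio of integrals in Eq. (2) with the midpoint rule.
    """
    h = t1 / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        density = lam_du * math.exp(-lam_du * t)  # exponential pdf
        numerator += t * density * h
        denominator += density * h
    return numerator / denominator


# Hypothetical values (assumptions for illustration only).
lam_du = 1.0e-5   # undetected dangerous failure rate
t1 = 5000.0       # proof-test interval
mttr = 8.0        # mean time to restoration

t_a = t_a_single_channel(lam_du, t1)
ratio = t_a / (t1 / 2.0)          # slightly below 1 when lam_du * t1 is small
t_c1_exact = t1 - t_a + mttr      # Eq. (1)
t_c1_approx = t1 / 2.0 + mttr     # Eq. (3)
print(ratio, t_c1_exact, t_c1_approx)
```

With these assumed inputs the ratio stays within about one percent of unity, which is in line with the statement that $T_1/2$ is a good approximation when the product of failure rate and proof-test interval is below 0.1.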

Fig. 2. Reliability block diagrams.

For 1oo2 architecture (see Fig. 2b), the probability of failure for the undetectable fault is

$$[1-\exp(-\lambda_{DU}t)]^2.$$

Fig. 3. Process for the undetected dangerous fault.

Hence

$$\begin{aligned}
t_a &= \frac{\int_0^{T_1} 2\lambda_{DU}\,t\,[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt}{\int_0^{T_1} 2\lambda_{DU}[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt} \\
&= \frac{\int_0^{T_1} 2\lambda_{DU}\,t\,[1-\exp(-\lambda_{DU}t)]\exp(-\lambda_{DU}t)\,dt}{[1-\exp(-\lambda_{DU}T_1)]^2} \\
&\approx \frac{2}{3}T_1 - \frac{3}{4}\lambda_{DU}T_1^2 + \frac{7}{12}\lambda_{DU}^2T_1^3
\end{aligned} \tag{4}$$

since $\lambda_{DU}T_1 \ll 1$ and $\exp(-\lambda_{DU}T_1) \approx 1 - \lambda_{DU}T_1$. As $\lambda_{DU}T_1 < 0.1$, $t_a$ is a little less than $2T_1/3$ but approaches that value, so $t_{c1}$ can be evaluated as

$$t_{c1} = T_1 - t_a + \mathrm{MTTR} = T_1/3 + \mathrm{MTTR}. \tag{5}$$

Similar to the case for 1oo2 architecture, $t_{c1}$ for the 1oo2D and 2oo3 architectures can be obtained as given in Eq. (5). If $\lambda_{DU}T_1 < 0.1$, one can verify by numerical examples that $2T_1/3$ is quite a good approximation to the real value of $t_a$ for these three architectures.

4. Availability of system architectures by Markov model

4.1. 1oo1 architecture

This system consists of a single channel, where any dangerous failure leads to a failure of the safety function when a demand arises. Its physical and reliability block diagrams are shown in Figs. 1a and 2a. The dangerous failures are divided into two parts, which are regarded as components c1 and c2. Thus the system is composed of two components connected in series, so that there are four system states as follows:

| System state | Component c1 | Component c2 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 1 | 0 |
| 3 | 1 | 1 |

State 0: operation state; State 1: failure state.

Then the system states transition diagram obtained is shown in Fig. 4. This figure gives a set of differential equations:

$$\begin{cases}
P_0'(t) = -(\lambda_{DD}+\lambda_{DU})P_0(t) + \mu_{DD}P_1(t) + \mu_{DU}P_2(t),\\
P_1'(t) = \lambda_{DD}P_0(t) - (\lambda_{DU}+\mu_{DD})P_1(t) + \mu_{DU}P_3(t),\\
P_2'(t) = \lambda_{DU}P_0(t) - (\lambda_{DD}+\mu_{DU})P_2(t) + \mu_{DD}P_3(t),\\
P_3'(t) = \lambda_{DU}P_1(t) + \lambda_{DD}P_2(t) - (\mu_{DD}+\mu_{DU})P_3(t).
\end{cases} \tag{6}$$

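They are simply rewritten as

$$P' = MP$$

where

$$P' = (P_0'(t),\ P_1'(t),\ P_2'(t),\ P_3'(t))^T, \qquad P = (P_0(t),\ P_1(t),\ P_2(t),\ P_3(t))^T$$

and $M$ is given as

$$M = \begin{pmatrix}
-(\lambda_{DD}+\lambda_{DU}) & \mu_{DD} & \mu_{DU} & 0 \\
\lambda_{DD} & -(\lambda_{DU}+\mu_{DD}) & 0 & \mu_{DU} \\
\lambda_{DU} & 0 & -(\lambda_{DD}+\mu_{DU}) & \mu_{DD} \\
0 & \lambda_{DU} & \lambda_{DD} & -(\mu_{DD}+\mu_{DU})
\end{pmatrix}.$$

Fig. 4. Markov states transition diagram for 2-component system.

The system availability $A(t)$ is

$$\begin{aligned}
A(t) = P_0(t) = \frac{1}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})}\Big\{&\mu_{DD}\mu_{DU} + \lambda_{DD}\mu_{DU}\exp[-(\lambda_{DD}+\mu_{DD})t] \\
&+ \lambda_{DU}\mu_{DD}\exp[-(\lambda_{DU}+\mu_{DU})t] \\
&+ \lambda_{DD}\lambda_{DU}\exp[-(\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU})t]\Big\}.
\end{aligned} \tag{7}$$

In general, the system probabilistic parameters at steady state are of interest. At the steady state, the system availability is

$$A = \frac{\mu_{DD}\mu_{DU}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})} \tag{8}$$

and the system failure frequency, FF, is given as [11]

$$\mathrm{FF} = \sum_{k\in W}\sum_{j\in F} P_k a_{jk}, \tag{9}$$

where
$F$ is the set of failure states of the system,
$W$ is the set of operating states of the system,
$P_k$ is the probability of the system being in working state $k$ and
$a_{jk}$ is the element of $M$ given in Eq. (6).

From Eqs. (6) and (9),

$$\mathrm{FF} = P_0(\lambda_{DD}+\lambda_{DU}) = (\lambda_{DD}+\lambda_{DU})\frac{\mu_{DD}\mu_{DU}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})}. \tag{10}$$

Further,

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\lambda_{DU})\mu_{DD}\mu_{DU}} \approx \frac{\lambda_{DU}}{\lambda_D}(T_1/2+\mathrm{MTTR}) + \frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}$$

as $\lambda_D = \lambda_{DD}+\lambda_{DU}$,

$$\frac{1}{\mu_{DU}} = T_1/2+\mathrm{MTTR}, \qquad \frac{1}{\mu_{DD}} = \mathrm{MTTR},$$

and $\lambda_{DD}$ and $\lambda_{DU}$ are less than $10^{-5}$. Therefore

$$t_{CE} = \mathrm{MDT} = \frac{\lambda_{DU}}{\lambda_D}(T_1/2+\mathrm{MTTR}) + \frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}. \tag{11}$$

For a channel with down time $t_{CE}$, the resulting average probability of failure is

$$\begin{aligned}
\mathrm{PFD} &\approx \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})} < \lambda_D\left(\frac{\lambda_{DU}}{\lambda_D\mu_{DU}}+\frac{\lambda_{DD}}{\lambda_D\mu_{DD}}\right) \\
&= \lambda_D\left[\frac{\lambda_{DU}}{\lambda_D}(T_1/2+\mathrm{MTTR}) + \frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right] = \lambda_D t_{CE}.
\end{aligned}$$

Hence, for a 1oo1 architecture,

$$\mathrm{PFD}_G = (\lambda_{DU}+\lambda_{DD})t_{CE} = \lambda_D t_{CE}. \tag{12}$$

4.2. 1oo2 architecture

The architecture includes two channels connected in parallel. See Figs. 1b and 2b for the system block diagrams. Assuming that the two channels are identical, the system states can be simply defined as follows:

| System state | System state definition |
|---|---|
| 0 | Two channels are operative (up state) |
| 1 | Only one channel is in operation (up state) |
| 2 | Both channels are in fault (down state) |

So, the states transition diagram is easily obtained as shown in Fig. 5, where two repair teams are available to work on all known failures in the system (see assumption xi).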

Fig. 5. Markov states transition diagram.

This figure stands for the following differential equations:

$$\begin{cases}
P_0'(t) = -2\lambda P_0(t) + \mu P_1(t),\\
P_1'(t) = 2\lambda P_0(t) - (\lambda+\mu)P_1(t) + 2\mu P_2(t),\\
P_2'(t) = \lambda P_1(t) - 2\mu P_2(t).
\end{cases} \tag{13}$$

It is rewritten as

$$\begin{pmatrix} P_0'(t) \\ P_1'(t) \\ P_2'(t) \end{pmatrix} = \begin{pmatrix} -2\lambda & \mu & 0 \\ 2\lambda & -(\lambda+\mu) & 2\mu \\ 0 & \lambda & -2\mu \end{pmatrix}\begin{pmatrix} P_0(t) \\ P_1(t) \\ P_2(t) \end{pmatrix}. \tag{14}$$

Similar to the case of 1oo1 architecture, the probabilistic parameters at steady state are of concern here. From Eq. (14),

$$A = P_0 + P_1 = \frac{2\lambda\mu+\mu^2}{(\lambda+\mu)^2}.$$

Hence, $\mathrm{FF} = \lambda P_1$ according to Eq. (9). Then

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda^2}{(\lambda+\mu)^2}\cdot\frac{1}{\lambda P_1} = \frac{1}{2\mu}.$$

Referring to Eqs. (5) and (11), the equivalent mean down time of a channel in the system architecture, $t_{CE}$, is

$$t_{CE} = \frac{\lambda_{DU}}{\lambda_D}(T_1/3+\mathrm{MTTR}) + \frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}. \tag{15}$$

As $\mu$ is the equivalent repair rate of a channel,

$$\mu = 1/t_{CE} = \left[\frac{\lambda_{DU}}{\lambda_D}(T_1/3+\mathrm{MTTR}) + \frac{\lambda_{DD}}{\lambda_D}\mathrm{MTTR}\right]^{-1}.$$

The equivalent mean down time for this system architecture is obtained as

$$t_{GE} = \mathrm{MDT} = \frac{1}{2\mu} = \frac{1}{2}\left[(\lambda_{DU}/\lambda_D)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}\right]. \tag{16}$$

Hence, the $\mathrm{PFD}_G$ for this architecture is

$$\begin{aligned}
\mathrm{PFD}_G &= P_2 + P_{CC} = \frac{\lambda^2}{(\lambda+\mu)^2} + P_{CC} \approx \lambda^2/\mu^2 + P_{CC} = \lambda^2 t_{CE}^2 + P_{CC} \\
&= 2[(1-\beta_D)\lambda_{DD} + (1-\beta)\lambda_{DU}]^2 t_{CE}t_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR})
\end{aligned} \tag{17}$$

by considering the effects of common causes and $\lambda \ll \mu$. Referring to Fig. 2b, the two channels and the common cause component compose a series system. According to the definitions of $\beta$ and $\beta_D$, the probability of failure for a detectable fault caused by common cause is $\beta_D\lambda_{DD}\mathrm{MTTR}$ and the probability of failure for an undetectable fault due to common cause is $\beta\lambda_{DU}(T_1/2+\mathrm{MTTR})$. Here, $T_1/2+\mathrm{MTTR}$ is the equivalent mean down time of an undetectable fault in a channel. Hence, we have Eq. (17).

4.3. 1oo2D architecture

Two channels in this architecture are connected in parallel. During normal operation, both channels need to demand the safety function before it can take place. In addition, if the diagnostic tests detect a fault in either channel, then the output voting is adapted so that the overall output state then follows that given by the other channel. If the diagnostic tests find faults in both channels or a discrepancy that cannot be allocated to either channel, then the output goes to the safe state. In order to detect a discrepancy between the channels, either channel can determine the state of the other via a means independent of the other channel. See Figs. 1c and 2c for the system block diagrams.

Since each component follows the exponential distribution, comparing Fig. 2b and c, the values of the equivalent mean down times for each channel and the architecture are:

$$t'_{CE} = \frac{\lambda_{DU}(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{\lambda_{DU}+\lambda_{DD}+\lambda_{SD}},$$

$$t'_{GE} = \frac{\lambda_{DU}(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{2(\lambda_{DU}+\lambda_{DD}+\lambda_{SD})}.$$

The $\mathrm{PFD}_G$ for this architecture is then obtained by referring to Eq. (17) as follows:

$$\mathrm{PFD}_G = 2(1-\beta)\lambda_{DU}\left[\lambda_{SD} + (1-\beta_D)\lambda_{DD} + (1-\beta)\lambda_{DU}\right]t'_{CE}t'_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR}). \tag{18}$$

4.4. 2oo2 architecture

This system consists of two channels connected in parallel. The system is in fault whenever either channel fails. See Figs. 1d and 2d for the system block diagrams. From Eq. (12), $\mathrm{PFD}_G$ for this architecture is easily obtained on the basis of the reliability block diagram:

$$\mathrm{PFD}_G = 2\lambda_D t_{CE}, \tag{19}$$

where $t_{CE}$ is given in Eq. (11).

4.5. 2oo3 architecture

The block diagrams of this architecture are shown in Figs. 1e and 2e. Referring to Section 4.2, the Markov states transition diagram for this architecture is given in Fig. 6, where assumption xi applies.

Fig. 6. Markov states transition diagram.

Furthermore, the following equations are obtained according to Fig. 6.

$$\begin{cases}
P_0'(t) = -3\lambda P_0(t) + \mu P_1(t),\\
P_1'(t) = 3\lambda P_0(t) - (2\lambda+\mu)P_1(t) + 2\mu P_2(t),\\
P_2'(t) = 2\lambda P_1(t) - (\lambda+2\mu)P_2(t) + 3\mu P_3(t),\\
P_3'(t) = \lambda P_2(t) - 3\mu P_3(t).
\end{cases} \tag{20}$$

We investigate the steady state probabilistic parameters. At steady state, the system unavailability $1-A$ and FF are obtained as:

$$1-A = \frac{\lambda^2(\lambda+3\mu)}{(\lambda+\mu)^3} \tag{21}$$

and

$$\mathrm{FF} = 2\lambda P_1 = 2\lambda\frac{3\lambda\mu^2}{(\lambda+\mu)^3} \tag{22}$$

based on Eq. (9). Hence, MDT of the system is given by

$$\mathrm{MDT} = \frac{1-A}{\mathrm{FF}} = \frac{\lambda+3\mu}{6\mu^2}$$

where $\mu = 1/t_{CE}$ and $t_{CE}$ is given in Eq. (15). Since $\mu \gg \lambda$, $\mathrm{MDT} \approx 1/(2\mu)$. Therefore

$$t_{GE} = \mathrm{MDT} = \left[(\lambda_{DU}/\lambda_D)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}\right]/2. \tag{23}$$

For this architecture, the following is easily obtained by referring to Eqs. (17) and (21).

$$\mathrm{PFD}_G = 6[(1-\beta_D)\lambda_{DD} + (1-\beta)\lambda_{DU}]^2 t_{CE}t_{GE} + \beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR}) \tag{24}$$

since $\mu \gg \lambda$.

5. Discussions

5.1. Discrepancies

In Section 4, the equivalent mean down times and average probabilities of failure on demand for five system architectures are obtained by the Markov model. They are the steady state values of the corresponding systems. However, the equivalent mean down times $t_{CE}$, $t_{GE}$ and $t'_{GE}$ for the 1oo2, 1oo2D and 2oo3 architectures shown in Section 4 are different from those described in Annex B of Part 6 of IEC 61508 [10]. In the standard, $t_{CE}$ as expressed in Eq. (11) is used for all of the 1oo1, 1oo2, 1oo2D, 2oo2 and 2oo3 typical architectures. For comparison, the expressions of $t_{CE}$, $t_{GE}$ and $t'_{GE}$ used for these system architectures are listed in Table 1. The standard gives no description of how the expressions of $t_{CE}$, $t_{GE}$, $t'_{GE}$ and the average probability of failure on demand are obtained.

Table 1
Comparisons among equivalent mean down times for the three system architectures

| System | $t_{CE}$, $t_{GE}$ and $t'_{GE}$ obtained by Markov model | $t_{CE}$, $t_{GE}$ and $t'_{GE}$ given in IEC 61508-6 |
|---|---|---|
| 1oo2 and 2oo3 | $t_{CE} = (\lambda_{DU}/\lambda_D)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ | $t_{CE} = (\lambda_{DU}/\lambda_D)(T_1/2+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ |
| 1oo2 and 2oo3 | $t_{GE} = (1/2)[(\lambda_{DU}/\lambda_D)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}]$ | $t_{GE} = (\lambda_{DU}/\lambda_D)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda_D)\mathrm{MTTR}$ |
| 1oo2D | $t'_{CE} = (\lambda_{DU}/\lambda)(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}/\lambda)\mathrm{MTTR}$, $\lambda = \lambda_{DD}+\lambda_{DU}+\lambda_{SD}$ | $t'_{CE} = (\lambda_{DU}/\lambda)(T_1/2+\mathrm{MTTR}) + (\lambda_{DD}/\lambda)\mathrm{MTTR}$, $\lambda = \lambda_{DD}+\lambda_{DU}+\lambda_{SD}$ |
| 1oo2D | $t'_{GE} = \dfrac{\lambda_{DU}(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{2(\lambda_{DU}+\lambda_{DD}+\lambda_{SD})}$ | $t'_{GE} = \dfrac{\lambda_{DU}(T_1/3+\mathrm{MTTR}) + (\lambda_{DD}+\lambda_{SD})\mathrm{MTTR}}{\lambda_{DU}+\lambda_{DD}+\lambda_{SD}}$ |

The average probabilities of failure on demand for these three architectures in IEC 61508 are of the same forms as those given in Section 4, but $t_{CE}$, $t_{GE}$ and $t'_{GE}$ are different.

In fact, the systems discussed in this paper could not reach steady state within the proof test interval, as the mean down time of an undetectable fault in a channel is $T_1/2+\mathrm{MTTR}$ and thus the corresponding equivalent repair rate is much smaller. The average probabilities of failure on demand for the five typical architectures should be calculated by the following equation

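$$\mathrm{PFD}_G = \frac{1}{T_1}\int_0^{T_1}[1-A(t)]\,dt. \tag{25}$$

For 1oo1 architecture

$$\begin{aligned}
\mathrm{PFD}_G &= \frac{1}{T_1}\int_0^{T_1}[1-P_0(t)]\,dt \\
&= \frac{\lambda_{DD}\lambda_{DU}+\lambda_{DD}\mu_{DU}+\lambda_{DU}\mu_{DD}}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})} - \frac{1}{(\lambda_{DD}+\mu_{DD})(\lambda_{DU}+\mu_{DU})T_1} \\
&\quad\times\Big\{\frac{\lambda_{DD}\mu_{DU}}{\lambda_{DD}+\mu_{DD}}\left[1-\exp(-(\lambda_{DD}+\mu_{DD})T_1)\right] + \frac{\lambda_{DU}\mu_{DD}}{\lambda_{DU}+\mu_{DU}}\left[1-\exp(-(\lambda_{DU}+\mu_{DU})T_1)\right] \\
&\qquad + \frac{\lambda_{DD}\lambda_{DU}}{\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU}}\left[1-\exp(-(\lambda_{DD}+\lambda_{DU}+\mu_{DD}+\mu_{DU})T_1)\right]\Big\}.
\end{aligned} \tag{26}$$

For 1oo2 architecture

$$\begin{aligned}
\mathrm{PFD}_G &= \frac{1}{T_1}\int_0^{T_1}P_2(t)\,dt \\
&= \frac{\lambda^2}{(\lambda+\mu)^2} - \frac{2\lambda^2}{(\lambda+\mu)^3T_1}\left[1-\exp(-(\lambda+\mu)T_1)\right] + \frac{\lambda^2}{(\lambda+\mu)^3T_1}\left[1-\exp(-2(\lambda+\mu)T_1)\right],
\end{aligned} \tag{27}$$

where $P_2(t)$ is obtained from Eq. (13).

For 2oo3 architecture

$$\begin{aligned}
\mathrm{PFD}_G &= \frac{1}{T_1}\int_0^{T_1}[P_2(t)+P_3(t)]\,dt \\
&= \frac{\lambda^2(\lambda+3\mu)}{(\lambda+\mu)^3} - \frac{6\lambda^2\mu}{(\lambda+\mu)^4T_1}\left[1-\exp(-(\lambda+\mu)T_1)\right] \\
&\quad - \frac{3\lambda^2(\lambda-\mu)}{2(\lambda+\mu)^4T_1}\left[1-\exp(-2(\lambda+\mu)T_1)\right] + \frac{2\lambda^3}{3(\lambda+\mu)^4T_1}\left[1-\exp(-3(\lambda+\mu)T_1)\right],
\end{aligned} \tag{28}$$

where $P_2(t)$ and $P_3(t)$ are obtained from Eq. (20).

Through the above derivations, it is found that the values of $\mathrm{PFD}_G$ calculated by Eq. (25) and by the system steady state values are different. In the following, let us investigate the differences by numerical examples.

Example 1: $\lambda_{DD} = 10^{-5}$, $\lambda_{DU} = 5\times10^{-6}$, $\mathrm{MTTR} = 10$, $T_1 = 5\times10^3$.
$\mathrm{PFD}_G$ values are shown in Table 2, where the effects of common cause failures are not considered.

Table 2
$\mathrm{PFD}_G$ calculated by two methods

| System architecture | $\mathrm{PFD}_G$ by Eq. (25) | $\mathrm{PFD}_G$ by steady state values |
|---|---|---|
| 1oo1 | $7.17\times10^{-3}$ | $1.25\times10^{-2}$ |
| 1oo2 | $6.28\times10^{-5}$ | $7.08\times10^{-5}$ |
| 2oo3 | $1.76\times10^{-4}$ | $2.11\times10^{-4}$ |

Example 2: $\lambda_{DD} = 10^{-6}$, $\lambda_{DU} = 10^{-6}$, $\mathrm{MTTR} = 8$, $T_1 = 8600$.
See Table 3 for the $\mathrm{PFD}_G$ values calculated by the two different methods, where the effects of common cause failures are not involved.

Table 3
$\mathrm{PFD}_G$ obtained by two different methods

| System architecture | $\mathrm{PFD}_G$ by Eq. (25) | $\mathrm{PFD}_G$ by steady state values |
|---|---|---|
| 1oo1 | $2.45\times10^{-3}$ | $4.30\times10^{-3}$ |
| 1oo2 | $6.89\times10^{-6}$ | $8.26\times10^{-6}$ |
| 2oo3 | $1.86\times10^{-5}$ | $2.47\times10^{-5}$ |
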
Comparing the values shown in Tables 2 and 3, it can be concluded that the value of $\mathrm{PFD}_G$ obtained from the steady state system values is a little larger than the one calculated by Eq. (25) for the same system architecture, but the two are close to each other. Therefore, it is reasonable to calculate $\mathrm{PFD}_G$ from the steady state system values. Moreover, this is simple to apply in engineering.
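The comparison can be retraced with a few lines of code. The sketch below is an illustration under stated assumptions, not the authors' program: it evaluates the steady state unavailabilities of the 1oo1, 1oo2 and 2oo3 Markov models for the data of Example 1, and the time average of Eq. (25) for the 1oo1 case only, by integrating Eq. (7) with a midpoint rule. All function and variable names are hypothetical; $\mu_{DD} = 1/\mathrm{MTTR}$ and $\mu_{DU} = 1/(T_1/2+\mathrm{MTTR})$ are taken from Section 4.1, and $\mu = 1/t_{CE}$ with $t_{CE}$ from Eq. (15).

```python
import math


def channel_down_time(lam_dd, lam_du, t1, mttr, divisor):
    """t_CE: divisor=2 gives Eq. (11), divisor=3 gives Eq. (15)."""
    lam_d = lam_dd + lam_du
    return lam_du / lam_d * (t1 / divisor + mttr) + lam_dd / lam_d * mttr


def steady_state_pfd(lam_dd, lam_du, t1, mttr):
    """Steady-state unavailability of the 1oo1, 1oo2 and 2oo3 Markov models."""
    mu_dd = 1.0 / mttr
    mu_du = 1.0 / (t1 / 2.0 + mttr)
    a_1oo1 = mu_dd * mu_du / ((lam_dd + mu_dd) * (lam_du + mu_du))  # Eq. (8)
    lam = lam_dd + lam_du
    mu = 1.0 / channel_down_time(lam_dd, lam_du, t1, mttr, 3.0)
    return {
        "1oo1": 1.0 - a_1oo1,
        "1oo2": lam ** 2 / (lam + mu) ** 2,                     # steady-state P2
        "2oo3": lam ** 2 * (lam + 3.0 * mu) / (lam + mu) ** 3,  # Eq. (21)
    }


def p0_1oo1(t, lam_dd, lam_du, mu_dd, mu_du):
    """Transient availability of the 1oo1 model, Eq. (7)."""
    a = lam_dd + mu_dd
    b = lam_du + mu_du
    return (mu_dd * mu_du
            + lam_dd * mu_du * math.exp(-a * t)
            + lam_du * mu_dd * math.exp(-b * t)
            + lam_dd * lam_du * math.exp(-(a + b) * t)) / (a * b)


def time_averaged_pfd_1oo1(lam_dd, lam_du, t1, mttr, steps=200_000):
    """Eq. (25) for the 1oo1 model, integrated with the midpoint rule."""
    mu_dd = 1.0 / mttr
    mu_du = 1.0 / (t1 / 2.0 + mttr)
    h = t1 / steps
    total = sum(1.0 - p0_1oo1((i + 0.5) * h, lam_dd, lam_du, mu_dd, mu_du)
                for i in range(steps))
    return total * h / t1


# Data of Example 1; common cause failures are left out, as in Table 2.
example_1 = dict(lam_dd=1.0e-5, lam_du=5.0e-6, t1=5.0e3, mttr=10.0)
steady = steady_state_pfd(**example_1)
averaged_1oo1 = time_averaged_pfd_1oo1(**example_1)
for name, value in steady.items():
    print(f"{name}: steady state {value:.3g}")
print(f"1oo1: Eq. (25) {averaged_1oo1:.3g}")
```

Under these assumptions the three steady state values and the 1oo1 time average agree with the corresponding Table 2 entries to the printed precision.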
5.2. Effects of common cause failures
Common cause failure is an important part in the construction of redundant system architectures. This part and the rest of the redundant structure logically compose a series system; see Fig. 2b, c and e. The effects of common cause failures on the system average probability of failure on demand are then represented by

$$\beta_D\lambda_{DD}\mathrm{MTTR} + \beta\lambda_{DU}(T_1/2+\mathrm{MTTR})$$

for the 1oo2, 1oo2D and 2oo3 system architectures. The contribution of common cause failure to $\mathrm{PFD}_G$ is influenced by the factors $\beta_D$ and $\beta$, which depend on the physical system.
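To give a feel for the size of this term, the following sketch evaluates the common cause contribution together with Eqs. (17) and (24). It is only an illustration: the failure data are those of Example 1, while the values `beta = 0.02` and `beta_d = 0.01` are hypothetical and not taken from the paper, as are the function names.

```python
def common_cause_term(beta, beta_d, lam_dd, lam_du, t1, mttr):
    """Common cause contribution shared by the 1oo2, 1oo2D and 2oo3 formulas."""
    return beta_d * lam_dd * mttr + beta * lam_du * (t1 / 2.0 + mttr)


def pfd_g_voted(k, beta, beta_d, lam_dd, lam_du, t1, mttr):
    """PFD_G from Eq. (17) (k=2, 1oo2) or Eq. (24) (k=6, 2oo3)."""
    lam_d = lam_dd + lam_du
    t_ce = lam_du / lam_d * (t1 / 3.0 + mttr) + lam_dd / lam_d * mttr  # Eq. (15)
    t_ge = t_ce / 2.0                                                  # Eq. (16)
    independent = ((1.0 - beta_d) * lam_dd + (1.0 - beta) * lam_du) ** 2 * t_ce * t_ge
    return k * independent + common_cause_term(beta, beta_d, lam_dd, lam_du, t1, mttr)


# Failure data of Example 1; beta and beta_d are assumed values for illustration.
params = dict(beta=0.02, beta_d=0.01, lam_dd=1.0e-5, lam_du=5.0e-6, t1=5.0e3, mttr=10.0)
cc = common_cause_term(**params)
pfd_1oo2 = pfd_g_voted(2, **params)
pfd_2oo3 = pfd_g_voted(6, **params)
print(cc, pfd_1oo2, pfd_2oo3)
```

Because the common cause term is identical in Eqs. (17) and (24), the two results differ only in the independent-failure part, which is three times larger for 2oo3 than for 1oo2.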

6. Remarks
In IEC 61508-6, the down time $t_{c1}$ is not defined. If its meaning is not clearly known, all the other results presented in the standard could be difficult for an ordinary safety engineer to understand, because it is the basis of all the typical system architectures. $t_{c1}$ is newly named the equivalent mean down time of undetected failure in a channel. It is defined as $T_1 - t_a + \mathrm{MTTR}$, where $t_a$ stands for the time when the average probability of failure for the undetectable fault occurs in a system in the interval $[0, T_1]$. As the meaning of $t_{c1}$ is now defined, one can understand all the other results for the average probability of failure on demand of the typical architectures.

As discussed in the above sections, there is a discrepancy in the calculation of $t_{CE}$, $t_{GE}$ and $t'_{GE}$ for the 1oo2, 1oo2D and 2oo3 system architectures between the expressions given in the standard IEC 61508-6 and the new ones obtained by the Markov model. These differences are presented in Table 1.

The average probabilities of failure on demand obtained by the Markov model for the 1oo2, 1oo2D and 2oo3 system architectures are of the same forms as those presented in IEC 61508-6. However, the expressions of $t_{CE}$, $t_{GE}$ or $t'_{GE}$ for these three architectures in the standard are different from those newly obtained. No detailed description of how $t_{CE}$, $t_{GE}$ and $t'_{GE}$ are obtained can be found in the standard. As a result, the new expressions for $t_{CE}$, $t_{GE}$ and $t'_{GE}$ are suggested to be applied.
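The two sets of expressions for the 1oo2 and 2oo3 architectures in Table 1 can be put side by side numerically. The sketch below is a hedged illustration rather than part of the paper: it evaluates $t_{CE}$ and $t_{GE}$ from both columns of Table 1 for the failure data of Example 1. The function name and the dictionary keys are hypothetical.

```python
def down_times(lam_dd, lam_du, t1, mttr):
    """t_CE and t_GE for 1oo2/2oo3 as listed in the two columns of Table 1."""
    lam_d = lam_dd + lam_du

    def weighted(divisor):
        # Weighted mean of the undetected and detected channel down times.
        return lam_du / lam_d * (t1 / divisor + mttr) + lam_dd / lam_d * mttr

    return {
        "markov": {"t_ce": weighted(3.0), "t_ge": weighted(3.0) / 2.0},
        "iec": {"t_ce": weighted(2.0), "t_ge": weighted(3.0)},
    }


# Failure data of Example 1 (Section 5.1).
result = down_times(lam_dd=1.0e-5, lam_du=5.0e-6, t1=5.0e3, mttr=10.0)
for source, values in result.items():
    print(source, round(values["t_ce"], 1), round(values["t_ge"], 1))
```

In this example both down times from the Markov column are smaller than their counterparts from the IEC 61508-6 column, and the Markov $t_{CE}$ coincides with the $t_{GE}$ listed for the standard.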

References
[1] IEC 61508. Functional safety of electric/electronic/programmable electronic safety-related systems, Parts 1–7; October 1998–May 2000.
[2] Karydas DM, Brombacher AC (Guest editors). Special issue: Reliability certification of programmable electronic systems. Reliab Engng Syst Safety, No. 2; 1999. p. 66.
[3] Kato E, Sato Y. Safety integrity levels model for IEC 61508: examination of modes of operation. IEICE Trans A 2000;E83-A(5):863–5.
[4] Muta H, Ibe H, Sugiyama E. Safety design of oil reclamation system using IEC 61508. PSAM5: Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27–Dec. 1, 2000. p. 479–84.
[5] Kawahara T, Kushibiki T, et al. Safety-integrity of safety-related systems with human beings. PSAM5: Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27–Dec. 1, 2000. p. 2411–7.
[6] Kato E, Sato Y. Safety integrity levels model for IEC 61508. PSAM5: Proceedings of the Fifth International Conference on Probabilistic Safety Assessment and Management, Osaka, Japan; Nov. 27–Dec. 1, 2000. p. 2787–93.
[7] Misumi Y, Sato Y. Estimation of average hazardous-event-frequency for allocation of safety-integrity levels. Reliab Engng Syst Safety 1999;66:135–44.
[8] ISA-S84.01.1996. Application of safety instrumented systems for process industries. Instrument Society of America, Research Triangle Park; 1996.
[9] Rouvroye JL, Brombacher AC. New quantitative safety standards: different techniques, different results? Reliab Engng Syst Safety 1999;66:121–5.
[10] IEC 61508-6. Functional safety of electric/electronic/programmable electronic safety-related systems. Part 6. Guidelines on the application of IEC 61508-2 and IEC 61508-3; April 2000.
[11] Cao JH, Cheng K. An introduction to mathematics of reliability. Beijing: Publication of Science; 1986. p. 2. p. 210–30, in Chinese.
