Risk, Reliability and Safety 2013 Answers

Questions 1-12 below refer to the Meteorology Telemetry System Case Study below. Make sure
you refer to this case when answering these questions.
You are working for the New Products section in the Communications Division of Electrosystems Ltd
which has an annual turnover of $30m for the local and export markets. A new request for 10
identical remote meteorology telemetry systems has been received. The order is expected to be
worth around $2million, with additional sales in subsequent years being worth up to $6million.
The requirements of each system are to:
Scan a number of weather recording instruments once per minute

Transmit the information by frequency modulated VHF carrier to a terminal station
You are part of a team that develops a conceptual design as shown in Figure 1.
[Block diagram: instrument module; signal processing board (CPU, ROM, A/D, I/O, etc.); modulator and transmitter; power supply unit (battery and generator)]
Figure 1: Preliminary design
The instrument module provides 14 channels, scanned by a microprocessor-controlled signal
processing board. The digital output from this board is used to modulate a VHF carrier. Power is
supplied by rechargeable batteries and a solar charger.
Five of the 14 weather inputs are classified as major. These inputs are barometric pressure, wet
and dry bulb temperature, wind direction and speed. Failure of a sensor will result in grossly
inaccurate (or loss of) readings, and the loss of the corresponding input parameter. The system fails
if any one of these five parameters fails, or if the signal processing board, communication board or
power supply unit fails. The customer requires a 10 year MTBF for the system.
First consider failure of the instrument modules. The required meteorological instrument modules
are available from specialist suppliers. One supplier showed evidence from three contracts, each for
50 instrument modules over a period of 5 years. In this data set there were 12 sensor failures that
resulted in the failure of a major parameter. Data on when the failures occurred (based on number
of months in service) is provided in Table 1. When a sensor failed it was not replaced and the
instrument module was considered failed. The combined operation time for these 12 sensors is 324
months (=27.0 years). The remaining units that did not fail survived the 60 month period.
GENG5507 2013 pg. 1

Table 1: Times to failure for the sensors on the instrument modules (months from start of service)
Failure number      1   2   3   4   5   6   7   8   9  10  11  12
Months in service   6   9  11  15  17  18  28  30  33  45  55  57

An industry database for failure data on instrumentation was used to check the calculated failure
rate.
Table 2: Extract from FARADIP database from Smith (2007)
Failure rate in failures per million hours

                     Lowest value quoted   Highest value quoted
Pressure sensor      2                     10
Level indicator      1                     10
Temperature sensor   0.2                   10

A Failure Modes and Effects Analysis (FMEA) was done on the proposed design shown in Figure 1. In
the FMEA process the following failure rates were used:
Table 3: Failure rates used in FMEA analysis
Component                                         Failure rate in failures per million hours (pmh)
Signal processing board                           2
Communication board (modulator and transmitter)   2.4
Power supply on the power supply unit             0.79
Battery on the power supply unit                  0.5
Instrument module                                 As per calculation based on supplier data

1. What is the expected failure rate of the instrument module based on the total time on test?
A. 1.82 failures per million hours
B. 1.91 failures per million hours
C. 5.48 failures per million hours
D. 30.45 failures per million hours
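
A minimal sketch of how a total-time-on-test estimate could be set up for the supplier data. It assumes 150 modules (three contracts of 50), that the 138 units without a failure each accumulated the full 60 months, and a conversion of 730 hours per month (8760 / 12). The function name and the conversion constant are illustrative assumptions, not part of the paper.

```python
HOURS_PER_MONTH = 8760 / 12  # assumed conversion: 730 hours per month

def failure_rate_pmh(failure_months, n_units, test_months):
    """Failures per million hours from total time on test (time-terminated test)."""
    n_fail = len(failure_months)
    survivors = n_units - n_fail
    # Failed units contribute their time to failure; survivors contribute the full test period.
    total_months = sum(failure_months) + survivors * test_months
    total_hours = total_months * HOURS_PER_MONTH
    return n_fail / total_hours * 1e6

table_1 = [6, 9, 11, 15, 17, 18, 28, 30, 33, 45, 55, 57]  # months in service, Table 1
rate = failure_rate_pmh(table_1, n_units=150, test_months=60)
print(f"{rate:.2f} failures per million hours")
```

Ignoring the surviving units (dividing 12 failures by 324 months alone) would give a much higher rate, which is why the censored operating time matters here.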

2. The data presented in Table 1 is an example of which of the following (select one only)
A. Right censoring - Failure terminated
B. Right censoring - Time terminated data
C. Left censoring
D. Interval censoring

3. The failure data on the instrument module provided in Table 1 is plotted using the software
package Weibull++. Which of the following figures is the correct output? Correct means that
the correct number of events is used in the calculations and the correct software selection
for MLE (Maximum Likelihood Estimator) or RRX (Rank Regression on X) is used. The other
codes shown are immaterial to the question; for information they are FM (Fisher Matrix)
and MED (Median Rank).

[Figure: four Weibull probability plots (Weibull-2P) of unreliability, F(t), against time, (t), each showing data points and a probability line. Panel annotations:]
Top left:     RRX SRM MED FM, F=12/S=0,   β=1.5864, η=30.2644, ρ=0.9862
Top right:    RRX SRM MED FM, F=12/S=138, β=1.2318, η=346.9859, ρ=0.9599
Bottom left:  MLE SRM MED FM, F=12/S=0,   β=1.6787, η=30.3709
Bottom right: MLE SRM MED FM, F=12/S=138, β=1.0038, η=710.2208
A. Top left
B. Top right
C. Bottom left
D. Bottom right

4. What is the mean time to failure (years) of the meteorological instrument modules based on
the Weibull analysis in the previous question?
Note: Gamma function values: Γ(1.60) = 0.892; Γ(1.63) = 0.897; Γ(1.81) = 0.934; Γ(2) = 1
A. 2.2 years
B. 7.7 years
C. 27.0 years
D. 59.2 years

5. The failure of any sensor or component leads to the loss of a major parameter. Using the
estimated failure rate for the instrument modules based on the total time on test and the
data provided on failure rate for the other components given in Table 3, what is the
predicted failure rate of the system? Hint: Assume that the failure rate for an individual
sensor is what you calculated in Question 1. You need 5 of these sensors on your instrument
module.

A. 7.60 per million hours
B. 15.24 per million hours
C. 18.11 per million hours
D. 33.09 per million hours
6. Assuming for this question only that the failure distribution for each component in the
system is exponential, what is the estimated MTBF of your system?

A. 7.49 years
B. 10.0 years
C. 15.0 years
D. 59.5 years
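
A minimal sketch of the series-system reasoning behind Questions 5 and 6, assuming every item must work for the system to work and that all failure times are exponential, so failure rates add and the MTBF is the reciprocal of the total rate. The per-sensor rate of 1.91 pmh is an assumed input (a total-time-on-test estimate from the Table 1 data), and 8760 hours per year is an assumed conversion.

```python
HOURS_PER_YEAR = 8760  # assumed conversion

def series_failure_rate(rates_pmh):
    """For a series system of exponential components, the failure rates add."""
    return sum(rates_pmh)

def mtbf_years(rate_pmh):
    """MTBF of an exponential system, converted from hours to years."""
    return 1e6 / rate_pmh / HOURS_PER_YEAR

sensor_rate = 1.91  # assumed per-sensor rate in failures per million hours
# Five major sensors plus the four Table 3 items (board, comms, power supply, battery).
rates = [sensor_rate] * 5 + [2.0, 2.4, 0.79, 0.5]
system_rate = series_failure_rate(rates)
print(f"{system_rate:.2f} pmh, MTBF {mtbf_years(system_rate):.2f} years")
```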

7. You need to evaluate the potential improvement in system reliability over 10 years from
installing a 2nd instrument module in parallel with the 1st. You have already calculated the
following information. The Reliability of the Instrument Module over 10 years = 0.4506; The
reliability of the sub-system comprising the signal processing board, communication board,
power supply and battery, over 10 years is 0.607. Assuming that the failures of the two
instrument modules in parallel are exponentially distributed, what is the expected reliability
over 10 years of the system with this new configuration and does it meet the client's
requirement?

A. 0.123, No
B. 0.123, Yes
C. 0.424, No
D. 0.424, Yes
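
A short sketch of the redundancy calculation described in Question 7, assuming the two instrument modules are identical, fail independently, and run in active parallel, with the rest of the system in series. The two reliability values are the ones given in the question; the helper names are illustrative.

```python
def parallel(r, n=2):
    """Reliability of n identical, independent units in active parallel."""
    return 1 - (1 - r) ** n

def series(*reliabilities):
    """Reliability of independent blocks in series: the product of the block values."""
    out = 1.0
    for r in reliabilities:
        out *= r
    return out

r_module = 0.4506  # instrument module over 10 years, given in the question
r_rest = 0.607     # signal processing, communication, power supply and battery, given
r_system = series(parallel(r_module), r_rest)
print(f"{r_system:.3f}")
```

Whether the result satisfies the client is a separate judgement against the stated 10 year MTBF requirement.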
8. You and your team have conducted a risk identification and risk assessment. The major risks
you have identified are a) financial exposure and opportunity, b) the reliance on a limited
number of specialist suppliers for the instrument module, c) the maintenance costs involved
and lack of access to the remote sites where the meteorological instruments will be located,
d) uncertainties over how the instruments will perform in a range of operating
environments.
What is the most appropriate risk control measure for managing the exposure in supply of the
instrument module?
A. Develop a preferred relationship with one supplier

B. Engage with multiple suppliers
C. Develop internal manufacturing capability to make the instrument boards
D. None of the above.

9. In the process of the study to develop this system you have used a number of processes
covered by Standards discussed in the GENG5507 course. Which one of the following
Standards have you NOT used in this case study?

A. AS IEC 61883: 2005 Hazard Identification Guide
B. AS/NZS ISO 31000: 2009 Risk Management
C. AS IEC 60812:2008 Procedure for failure modes and effects analysis
D. AS IEC 60300.3.1:2003 Dependability management, Application Guide Analysis techniques
for dependability, guide on methodology.

10. In the course of this design process you conducted a FMEA. Key steps in the FMEA process
are defining the function, functional failures and failure modes. In doing this you realise that
people in your team have a number of misconceptions about failure modes. Which one of
the following statements is NOT correct? A failure mode occurs when

A. a desired function is not obtained
B. a physical process initiates deterioration
C. a specified function is outside acceptable operating limits
D. there is an immediate and critical impact on equipment function

11. One of the main outcomes of the FMEA process is the determination of an RPN. Which one
of the following statements is correct?

A. RPN stands for Ranking Priority Number and is calculated by multiplying likelihood and
consequence
B. RPN stands for Risk Priority Number and is calculated by multiplying likelihood and
consequence
C. RPN stands for Risk Potential Number and is calculated by multiplying likelihood,
consequence and detectability
D. RPN stands for Risk Priority Number and is calculated by multiplying likelihood,
consequence and detectability

12. Based on the data presented in Table 1 and your calculations which of the following hazard
rate curves would best represent a plot of the hazard rate of the instrument module against
time?

A. Infant mortality curve
B. Wear out profile
C. Constant (horizontal) line
D. Bathtub curve
End of questions on the telemetry case study

Questions 13-16 below refer to the Gold Bug Tailings Case Study below. Make sure you refer to this
case when answering these questions.
You are part of a team involved in doing a quantitative risk assessment for the tailings dam at the
Gold Bug mine near Meekatharra. Gold Bug is a relatively small and remote mine producing about
80k ozs/ year. The dam is about 2 km from the plant and has been running about 10 years since its
construction in 2003. It was built using conventional processes in which an outer wall ~50 m high is
constructed and then an inner drywall from consolidated slimes. Slurry is deposited in the dam from
spigots. As slurry runs down the inside of the dam it spreads into thin layers, allowing the solids to
settle and compact over a period of weeks. Excess (clear) water drains to a pool in the centre of the
dam from where it is pumped back to the plant.
Due to problems with topography and Aboriginal heritage issues during construction, a contractors'
camp was located downhill from the tailings dam. This is where the off-shift contractors are housed.
During construction there were 6 people in the camp at all times. The initial modelling work
assumed that, when the dam was more than 20% full, a breach (failure) of the dam wall would
release 60,000 m³ of mud, which would likely result in the burial of the camp. The initial risk assessment
in 2003 used the following assumptions:
The frequency of dam failures in the WA mid-west is 1 x 10^-5 per annum.
The conditional probability of the dam contents reaching the camp, if the dam fails, is 0.3.
The probability of a camp occupant being killed, if the dam contents due to a failure reach
the camp, is 0.5
It was understood that once construction was completed in 2003 that the construction camp would
be moved. However this has not happened. The construction camp is still being used to house
visiting contractors and consultants.
Recently there have been problems with the tailings dam as follows:
Process upsets in the gold plant have resulted in large volumes of lower density material
being sent to the dam.
The dam has a set of piezometers monitoring wall stability.
There is only about 300 mm of vertical freeboard which includes the height of the tailings
dry wall.
Instead of the pond sitting in the middle of the dam it is now up against the northern wall.
In 2013 you and your colleagues update the probability of failure of the dam wall to 1 x 10^-3 per annum
and the probability of the dam contents reaching the camp to 0.75. All other assumptions remain
the same. You compare your calculated values with the AGS (2000) suggested tolerable risk criteria
of 10^-5 per annum.
Questions
13. In the original design, what is the probability of a camp occupant being killed given a failure of
the tailings dam?

A. 0.15

B. 0.20
C. 0.30
D. 0.50

14. When you update the probability values in 2013, what is the revised value for the probability of a
potential loss of a camp occupant's life in any year?
A. 3.75 x 10^-6
B. 2.25 x 10^-4
C. 3.75 x 10^-4
D. 5.00 x 10^-4
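
A minimal sketch of the event chain used in the tailings case, assuming the annual individual risk is the product of the annual dam failure frequency, the conditional probability that the contents reach the camp, and the conditional probability that an occupant is killed. The function name is illustrative; the numbers are the 2003 and 2013 assumptions stated in the case and the AGS (2000) criterion.

```python
def annual_fatality_probability(p_failure, p_reach_camp, p_death_given_reach):
    """Chain of conditional probabilities for one camp occupant in one year."""
    return p_failure * p_reach_camp * p_death_given_reach

p_2003 = annual_fatality_probability(1e-5, 0.3, 0.5)   # original assumptions
p_2013 = annual_fatality_probability(1e-3, 0.75, 0.5)  # updated assumptions
tolerable = 1e-5  # AGS (2000) suggested tolerable risk per annum

print(f"2003: {p_2003:.2e} per annum, above criterion: {p_2003 > tolerable}")
print(f"2013: {p_2013:.2e} per annum, above criterion: {p_2013 > tolerable}")
```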
15. The regulation says that the storage capacity should be sufficient to ensure a freeboard of
at least 0.5 m above the expected maximum water level, which shall be based on the
average monthly rainfall figures less the gross mean evaporation in that area, plus the
maximum precipitation to be expected over a period of 24 hours with a frequency of once in
100 years. The phrase "with a frequency of once in 100 years" means which of the
following (select one)?

A. This will occur once every 100 years.
B. The time between events follows a distribution with a mean of 100 years
C. The time between events is 100 years
D. None of the above

16. There are a number of barriers engineers traditionally consider when selecting risk controls,
which one of the following barriers will NOT assist in managing the risk of dam failure.
A. Engineering controls
B. Physical separation
C. Monitoring and Control
D. PPE
End of questions on the Tailings dam case study

Questions 17-24 below refer to the Maleny ClearWater Pumping Station Case Study below. Make
sure you refer to this case when answering these questions.
Overview of the case:

The Maleny ClearWater Pumping station was built in 1975. The four Clearwater pumps transfer
clean water from the station to a reservoir (and two backwash pumps). Normally there are two
pumps operating in parallel. The function of the pump station is to move water from the clear
water tank to the reservoir at a rate of 20-54 Ml/day depending on demand.
A functional diagram of the key elements is given below
Citect. Function: to provide control and monitoring capability for the circuit
Contactor. Function: a device that uses a small control current to energize or de-energize a load
Electric power supply. Function: provide electric power at the required voltage, current and phase to the system
Reset switch. Function: element of the control and diagnostic system for the electrical supply to the motor
AC induction motor. Function: to convert electrical energy to mechanical energy to supply torque to the pump shaft
Pump. Function: to move water at a specified rate from source to destination
Control valve. Function: to control the flow of water through the system
Piping. Function: to convey the water from source to destination
Isolation valve. Function: to isolate required sections of the piping system

Figure 2: Functional diagram for the Maleny pump station
The maintenance strategy for the station is as follows:
Pumps are replaced based on their age.

Condition monitoring using vibration analysis is conducted monthly on the motor and pump
bearings.
Once a year the electrical connections are supposed to be checked with thermographic
methods to identify hot spots and loose connections.
Investigation
The section below describes the data collected and analysis conducted in 2004 at the pump station.
Planned vs unplanned work: Asset managers seek to manage equipment to avoid unexpected
failures or unplanned work. Unplanned work is generally classified as work that is not part of the
scheduled maintenance plan (planning window). When failures occur within the planning window,
unplanned work is often initiated to address the failures.

When work is unplanned there are additional risks that need to be managed due to the urgent
nature of the work and costs are generally about three times higher than planned work. Also as
resources are finite, doing unplanned work means that planned work has to be deferred.
Sources of failures: In order to investigate the source of these costs unplanned work orders were
broken down into mechanical and electrical failures. 90% of all failures were found to be associated
with electrical equipment. The remaining 10% were mechanical failures and were due to broken
flow switches, pipe work and valve leaks.
Reset failures are associated with electrical problems when the pump has either failed to start or
failed in operation. They are generally associated with poor troubleshooting skills. Contactor failures
are associated with wear on the contactors in the electrical starter system or problems with the PLC
control logic. Protection failures are associated with the motor protection and soft start system.
Citect failures are associated with the PLC due to logic errors, control system crashes and
communications issues. Flow failures are associated with the flow instrumentation; these are often
associated with the physical parts of the measurement system such as the paddle. Thermal failures
are associated with the thermistor in the motor when it detects high temperature.
Reliability analysis: Reliability analysis was conducted based on failure data from November 2000 to
August 2004. Table 4 shows the results, including analysis of failure events after the electrical system
upgrade in 2003. During this upgrade the starter and main switchboard were replaced and the Citect
control system improved. This change was motivated by problems of finding spare parts.
Table 4: Results from Weibull analysis of selected failure events
Failure            Beta   Eta (hours)   MTBF (hours)
Contactors         0.46   274           646
Power bump         0.57   4384          7012
Leakage failures   0.42   55839         158828
Flow failures      1.35   7363          6749
Thermal failures   1.25   8298          7725
Citect             0.47   3083          7102
Protection         0.52   1913          3538
Reset failures     0.45   1391          3514
Pump               1.01   26280         26280

Questions
17. Prior to the electric upgrade the failure data for the Citect system had a Beta value of 1.11
and an Eta value of 805 hours. What was the MTBF of the Citect system?

A. 509 hrs
B. 762 hrs
C. 774 hrs
D. 805 hrs
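
A brief sketch of how a Weibull mean can be evaluated, assuming a two-parameter Weibull distribution so that the mean is eta multiplied by Gamma(1 + 1/beta). It uses Python's standard `math.gamma`; the function name is illustrative and the parameter values are the pre-upgrade Citect figures quoted in Question 17.

```python
import math

def weibull_mean(beta, eta):
    """Mean of a two-parameter Weibull distribution: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1 + 1 / beta)

mtbf = weibull_mean(beta=1.11, eta=805)
print(f"{mtbf:.0f} hours")
```

When beta equals 1 the Gamma term is exactly 1 and the mean equals eta, which is a useful sanity check on any hand calculation.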
18. Hazard rate plots of data from Table 4 for the Citect, Contactor, Flow failures and Pump
failures are shown in the Figure below. They are labelled A, B, C and D. Identify which plot is
for which failure.
[Figure: hazard rate (vertical axis, ticks from 0.0005 to 0.0025) against time (hrs), showing four curves labelled A, B, C and D]

A. A-Citect; B-Contactor; C-Flow Failures; D-Pump Failures

B. A-Contactor; B-Flow Failures; C- Pump Failures; D-Citect
C. A-Citect; B-Flow Failures; C-Pump Failures; D-Contactor
D. A-Contactor; B-Pump Failures; C-Flow Failures; D-Citect
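
One way to reason about the shapes of such curves is to evaluate the two-parameter Weibull hazard function, h(t) = (beta / eta) * (t / eta)^(beta - 1), for each failure type. The sketch below assumes the three numeric columns of the Weibull results table are beta, eta (hours) and MTBF (hours); the evaluation times of 100 and 5000 hours are arbitrary choices for illustration.

```python
def weibull_hazard(t, beta, eta):
    """Hazard rate of a two-parameter Weibull distribution at time t (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

# (beta, eta in hours) for the four failure types named in the question
params = {
    "Contactors": (0.46, 274),
    "Citect": (0.47, 3083),
    "Flow failures": (1.35, 7363),
    "Pump": (1.01, 26280),
}

for name, (beta, eta) in params.items():
    early = weibull_hazard(100, beta, eta)
    late = weibull_hazard(5000, beta, eta)
    trend = "decreasing" if late < early else "increasing"
    print(f"{name}: h(100)={early:.6f}, h(5000)={late:.6f} ({trend})")
```

A beta below 1 gives a falling hazard, a beta near 1 gives an almost flat one, and a beta above 1 gives a rising one.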

19. Based on the data in Table 4, what is the expected reliability of one pump over 3 years?
A. 0.368
B. 0.632
C. 0.716
D. 0.999
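
A minimal sketch of the Weibull reliability function, R(t) = exp(-(t / eta)^beta), applied to the pump. It assumes the pump parameters in the Weibull results table are beta = 1.01 and eta = 26280 hours, and converts 3 years to hours at 8760 hours per year.

```python
import math

def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t under a two-parameter Weibull model."""
    return math.exp(-((t / eta) ** beta))

t_hours = 3 * 8760  # three years, assuming 8760 hours per year
r_pump = weibull_reliability(t_hours, beta=1.01, eta=26280)
print(f"{r_pump:.3f}")
```

Because the mission time here equals eta, the result is exp(-1) regardless of beta.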

20. In this case we are told that 2 of the 4 pumps need to operate. However if only one of the
four pumps needed to operate, what is the reliability of the pump system? (Consider only
the pump not the other equipment and failure modes)
A. 0.007
B. 0.101
C. 0.840
D. 0.993
21. Based on your answer to the previous question and without doing any calculations, use your
judgement to estimate what the reliability of the system for 2 out of 4 pumps is?
A. 0 - 0.20
B. 0.20 - 0.40
C. 0.40 - 0.70
D. 0.70 - 1.00
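
Although Question 21 asks for a judgement rather than a calculation, a k-out-of-n check can be sketched with the binomial distribution, assuming four identical pumps that fail independently. The single-pump reliability of 0.368 is an assumed input for illustration, and the function name is hypothetical.

```python
from math import comb

def k_out_of_n(k, n, r):
    """Probability that at least k of n identical, independent units survive."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))

r_pump = 0.368  # assumed single-pump reliability over the period of interest
one_of_four = k_out_of_n(1, 4, r_pump)
two_of_four = k_out_of_n(2, 4, r_pump)
print(f"1-out-of-4: {one_of_four:.3f}, 2-out-of-4: {two_of_four:.3f}")
```

Requiring more pumps to survive can only lower the system reliability, so the 2-out-of-4 value must sit below the 1-out-of-4 value.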
22. Examining the failure data collected on the pumps and the other information presented in
the case, is an age-based maintenance replacement strategy appropriate for the pumps?
A. Yes because you know the MTBF value of the pump and this is used to set the age of change
out
B. Yes because this is what has been done in the past
C. No because the failure times are exponentially distributed and therefore an age based
replacement strategy is not appropriate
D. No because the failure behaviour is indicative of wear in and therefore an age based
replacement strategy is not appropriate

23. Which of the following is not true in this case study?
A. Unplanned work is more costly than planned work
B. Planned work is work completed as per the weekly maintenance plan
C. The risks associated with planned and unplanned work are the same
D. Unplanned work often results in deferring planned work

24. Given the experience and knowledge available concerning the operation and maintenance of
the Clearwater Pumping Station, what would be the most economical and practical
approach for the reliability engineer to review and update the maintenance strategies/
tactics for assets at the station?
A. Hazard and Operability Study (HAZOP)
B. PMO (also known as Reverse RCM)
C. Reliability Centred Maintenance (RCM)
D. Risk Management Study
End of questions on the Clearwater Pumping Station Case study

Other questions
25. The Piper Alpha disaster in 1988 is now considered a classic example of the failure of what
system
A. Communication
B. Design with respect to location of the gas compression system
C. Emergency management
D. Permit to work

26. After the initial gas explosion on Piper Alpha caused by the start of pump A and the failure of
the blind closing the line, there were a number of events that aggravated the situation
contributing to more casualties than might otherwise have occurred. Which one of the
following was NOT an aggravating event?
A. Gas continued to rise up through the Piper Alpha drilling system from the reservoir
B. The fire deluge system failed to start
C. Adjacent rigs Tartan and Claymore continued to pump oil
D. The helicopter deck was unusable due to smoke and high winds

27. The Cullen Enquiry identified a number of design decisions that had contributed to the scale
of the disaster. Which one of the following was NOT considered a contributing factor.
A. The rig was powered by gas-fired generators which were reliant on either gas pump A or B
operating to supply power to the drill
B. The design of the pressure relief valve on the discharge of pumps A and B
C. The weakness of the walls separating the gas compression area from the oil area
D. Co-location of the main oil and gas trunk lines

28. The hierarchy of controls is an important concept in the selection of reactive and proactive
controls. What is the correct order of this hierarchy from least to most effective?
A. Administrative controls - PPE - Engineering controls - Substitution - Elimination
B. PPE - Engineering controls - Elimination - Substitution - Administrative controls
C. PPE - Administrative controls - Engineering controls - Substitution - Elimination
D. Administrative controls - PPE - Engineering controls - Elimination - Substitution

29. The work of the Centre for Safety at UWA has been instrumental in trying to develop ways of
communicating about safety. One result of this has been to distil the core concept of safety
culture into two terms. These were core themes in the 3.5 min video made by Rio Tinto and
the Centre for Safety, and in the 12th workshop these were presented as an equation to assist
you to remember them. What was the equation?
A. Safety culture = function (reliability, performance)
B. Safety culture = function ( compliance, proactivity)
C. Safety culture = function (behaviour, shared values)
D. Safety culture = function (participation, mental models)


30. The recording on "how do we create a safety culture that people want to be part of", made
by Rio Tinto and the Centre for Safety, identified three things that you could do to assist in
the creation of a safety culture. Which one of the following is not part of this list?
A. Lead by example
B. Ability to adapt and modify behaviour
C. Communicate safety concerns
D. Follow managers' instructions
31. ISO 31000:2009 defines the external context as being "the external environment in which
the organization seeks to achieve its objectives". Which of the following is not considered
part of the external context?

A. Form and extent of contractual relationships
B. Cultural, social, political, legal, regulatory, financial, technological, economic, natural and
competitive environment
C. Key drivers and trends having impact on the objectives of the organization
D. Relationships with, and perceptions and values of external stakeholders

32. The establishment of external communication and reporting processes is described in the
ISO 31000:2009 risk management standard. In the context of applying this requirement to
the proposed Driverless Car Trial discussed in the class, which of the following actions would
NOT be appropriate.
A. Engaging appropriate external stakeholders such as local councils (Nedlands, Subiaco
and the City of Perth), residents of these areas and groups such as Main Roads.
B. External reporting to comply with legal, regulatory and government regulations.
C. Advertising to promote the clean green image of the Driverless Car project
D. Communication with stakeholders involved in crisis management such as Police, Fire and
Emergency Services.

33. How do we assess the reliability of a software system?
A. Determine system reliability using reliability block diagrams
B. Examining repair history
C. Examining the testing history
D. Calculating the reliability of the individual software modules

34. Which of the following is NOT a common assumption of software reliability models?
A. The failure rate is proportional to the number of remaining defects
B. New defects can be introduced during the repair process
C. Defects are fixed very soon after discovery (MTTR is small)
D. Defects are independent

35. The definition for risk in ISO 31000:2009 is
A. Deviation from the expected, positive or negative
B. The effect of uncertainty on objectives
C. The probability of something happening multiplied by the resulting cost or benefit if it does
D. The probability of uncertain future events