
10

Safety, Availability
& Security

To properly operate plant processes, factory assembly lines, and
building utilities, we depend on automation systems. The automation
systems must, therefore, be dependable. Dependability comprises
availability and credibility. Availability, in turn, comprises
reliability and maintainability, while credibility comprises
integrity and security (see Figure 10-1). Security is also related
to availability in the sense that a Denial-of-Service (DoS) attack
can make the system unavailable, and to safety in the sense that,
if unauthorized persons can make changes, the system may become
unsafe.

Figure 10-1. Dependability According to IEC 1069-5

Every application has different requirements for safety, security,
reliability, regulatory compliance, etc. Every company has a
different policy for dealing with these issues. Every automation
product has different capabilities to enable the required functions
and to make them easy to deploy. Familiarize yourself with
product manuals in order to customize hardware and software to
meet those different company policies and other requirements.

242 Software for Automation

Safety Integrity
Special care must be taken when using software in applications with
safety implications. The complexity of workstation and server oper-
ating systems, such as Windows, makes it impossible to prove them
functionally safe. In most applications the basic automation system
is adequate to handle the safety aspects, but some tasks have a few
critical functions that require safety with even greater integrity.
Safety-related functions may include, for example, machinery that
cuts and stamps, and processing dangerous chemicals.

Basic Process Control and Automation
When integrating a SIS, the OPC servers and clients are aspects to
consider along with architecture and logic solver configuration. If
the OPC server is no longer providing data to the OPC clients, the
clients must flag this to the operator so the operator is not misled
by invalid data on the screen. (See section on OPC troubleshooting
in Chapter 6.) An operator should always be able to trust a numer-
ical value. Showing frozen, stale, or otherwise invalid values may
be very unsafe. For example, the OPC client may use the OPC
status to show some invalid symbol in a specific color, instead of
the value, and to issue an alarm. In some applications, even if the
OPC server is only used for visualization, a loop should not run
blindly for extended periods of time. If the problem persists the
loop may have to be shut down in case of OPC failure.
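
The display rule described above can be sketched in a few lines of Python. This is only an illustration, not an OPC client: the `TagSample` type, the `MAX_AGE_S` staleness limit, and the `####` placeholder are assumptions made for the example. Only the idea of a good/bad status delivered with each value comes from OPC.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_AGE_S = 5.0          # assumed staleness limit; site-specific in practice
INVALID_SYMBOL = "####"  # assumed placeholder shown instead of a value


@dataclass
class TagSample:
    value: float
    quality_good: bool   # True when the server reports "good" status
    timestamp: float     # seconds; time of the last update from the server


def render(sample: TagSample, now: float) -> Tuple[str, bool]:
    """Return (text to display, whether to raise an alarm)."""
    stale = (now - sample.timestamp) > MAX_AGE_S
    if not sample.quality_good or stale:
        # Never show a number the operator cannot trust.
        return INVALID_SYMBOL, True
    return f"{sample.value:.1f}", False
```

A frozen value is caught by the age check even when the last reported status was good, which is the case most likely to mislead an operator.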

In rare cases where OPC-DA and DX are used in closed loop
control, a failure must result in the loop output freezing at its
present value or the output going to a predetermined state.
Special mechanisms may have to be configured in the receiving
end to monitor the status and trip accordingly. OPC-DX has some
basic mechanisms to detect failed communication links based on
time-out and to trip by substituting a pre-configured fail-safe
value. Make sure to verify these mechanisms are properly config-
ured to propagate status between servers and clients, ensuring
trips operate as required.
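
A time-out with fail-safe substitution at the receiving end might look like the following sketch. It does not reproduce the OPC-DX mechanism itself; the class name, the time-out, and the fail-safe value are hypothetical, and it implements the "predetermined state" option rather than freezing the output.

```python
from typing import Optional, Tuple


class FailSafeInput:
    """Keeps the last good value and substitutes a fail-safe value
    when no good update has arrived within the time-out."""

    def __init__(self, timeout_s: float, fail_safe_value: float):
        self.timeout_s = timeout_s
        self.fail_safe_value = fail_safe_value
        self._value = fail_safe_value
        self._last_good: Optional[float] = None  # time of last good update

    def update(self, value: float, quality_good: bool, now: float) -> None:
        # Updates with bad status are ignored and do not reset the timer.
        if quality_good:
            self._value = value
            self._last_good = now

    def read(self, now: float) -> Tuple[float, bool]:
        """Return (value to use, tripped)."""
        if self._last_good is None or now - self._last_good > self.timeout_s:
            return self.fail_safe_value, True
        return self._value, False
```

Because bad-status updates do not reset the timer, a link that keeps delivering invalid data trips just like a link that has gone silent.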

Safety Related
Many automation applications also have safety-related aspects. These
may include machine safety and process safety in factories, or fire
fighting in buildings. The safety-related aspects are a very small
part of the overall automation system: first, because most processes
and buildings are designed to be as inherently safe as reasonably
possible, and second, because most hazards are prevented by the
controls and alarms in the basic automation system. Remaining
safety-related functions, such as emergency shutdown in a process
plant, are handled by dedicated high-integrity systems designed for
safety, such as a Safety Instrumented System (SIS) in the case of
process plants. However, it is common that, although the
safety-related system is standalone, operators want it networked to
the basic automation system’s consoles so it can be easily viewed
together with the rest of the controls. The reader is advised to
refer to IEC 61508 and IEC 61511.

Before networking a SIS, read the safety manual and certification
reports to understand the restrictions. Some solutions may not
permit networking, or may only permit use in read-only mode.
However, most SIS built specifically for safety-related applications
do not have this restriction. Today, OPC servers are available for
most SIS, and most basic automation systems have OPC client
software. OPC makes integration between SIS and basic automation
very easy. Ideally, OPC servers for the SIS logic solver should
support both OPC-DA for reading and writing parameters and
OPC-A&E for capturing and propagating alarms. Typically, OPC
data is not safety-related and changes to the logic solver through
OPC are prevented, rendering it interference free.

WARNING – Read the safety manual of the SIS before networking
it to understand the limitations and safety implications.

Communication
Do not make OPC part of the shutdown chain. However, OPC can
be used to display SIS status to operators in the operator
visualization software. Consider what action the logic solver needs
to take if operators lose visibility because, for example, the
communication link is severed or software such as the OPC server or
clients stops functioning. A handshake mechanism between the
operator visualization software and the logic solver may be required
to detect such situations. In special cases where writing to
parameters in the SIS is permitted, after enabling writes to the
SIS, perform a read-back check to make sure that the value has
changed and that only the intended variable has changed. When the
operator makes a change, the value will typically be read back while
the operator waits to see that it was indeed accepted. However, in
the case of scripts and other automated functions, special
provisions are required to read back the value to ascertain that it
indeed changed.
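
For scripts, the read-back check can be sketched as below. The `read_all` and `write` callables are hypothetical stand-ins for whatever interface the OPC client offers, and the snapshot is assumed to contain only writable parameters, not live process values that change on their own.

```python
from typing import Callable, Dict


def write_with_readback(
    read_all: Callable[[], Dict[str, float]],
    write: Callable[[str, float], None],
    tag: str,
    value: float,
) -> bool:
    """Write one parameter, then confirm that it changed and that
    no other parameter changed with it."""
    before = read_all()
    write(tag, value)
    after = read_all()
    if after.get(tag) != value:
        return False  # the write was not accepted
    # Exact comparison is used here; a real check may need a tolerance.
    others_unchanged = all(
        after.get(name) == old for name, old in before.items() if name != tag
    )
    return others_unchanged and after.keys() == before.keys()
```

A script would call this instead of a bare write and raise an alarm or abort when the result is `False`.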

The SIS should not be connected to the enterprise LAN or public
Internet. However, users may want the SIS to be monitored from
the basic control system workstations. If the control system is
connected to the enterprise LAN, and the enterprise LAN is
connected to the public Internet, the safety system may end up
connected to the public Internet through a mishmash of routers and
switches.

Operator Display
There are several ways to improve the safety of the system by
employing system features and designing operator graphics to be
user friendly. Operators must be able to see if a trip in the SIS has
actually occurred, receive alarm notifications, and get an overview
of the overrides and the health of system components.

Alarms
Operating a fully automated plant is boring, as very little human
intervention is required. This results in lowered alertness and vigi-
lance among operators. Refer to the alarm management discussion
in Chapter 4.

Bypass/Override
Bypass or override refers to the ability to ignore the actual signal
(e.g., from a sensor or the logic solver) and instead use a manually
entered value or state. It should not be possible to enable or
disable an override in the SIS through OPC or DDE. Overrides should
only be possible from the SIS engineering console. However, operators
at consoles of the basic automation system must be alerted to any
override in the SIS. Overrides shall be indicated on the screens and
operators should also be notified by means of alarms. It is also
important that, during override, it is possible to monitor the actual
value from the field so the operator can know the true state of the
process and machines.

Write
It should not be possible to modify the operation of the SIS from
the basic automation system’s operator console through OPC or
DDE. SIS logic changes should only be possible from the SIS engi-
neering console. Possible exceptions to this rule are certain batch
process applications that must permit certain trip limits to be set
according to product, batch size, and so on. Therefore, writing to a
SIS through OPC is extremely rare. If writes are permitted at all, a
mechanism must be configured to first write a parameter to
unlock the logic solver before the actual parameter of interest is
written, and the logic solver must subsequently be write-locked
again. Refer to the safety manual and user manual of the logic
solver to determine what is permitted and how it can be done. In
addition to configuring the OPC client and the logic solver to
permit writes, it is important to establish proper written
procedures for making changes. The SIS must perform a plausibility
check on the values received before accepting them.

The few parameters that can be written need access restriction, as
explained in the subsection on software security below. There
should be limits on operator entries, both in the operator
visualization software, to give the operator range violation
feedback, and in the SIS logic solver, for safety. There should be
an easy-to-read summary screen where all settings can be reviewed.
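
The unlock, write, and re-lock sequence, together with the plausibility check in the logic solver, can be illustrated with a small stand-in. `LogicSolverStub` and `guarded_write` are hypothetical names for this sketch. A real logic solver exposes its own unlock parameter and limits, as described in its safety manual.

```python
class LogicSolverStub:
    """Stand-in for a logic solver; not a real SIS interface."""

    def __init__(self, limits):
        self.limits = limits  # tag -> (low, high) accepted range
        self.values = {tag: low for tag, (low, high) in limits.items()}
        self.unlocked = False

    def write(self, tag, value):
        if not self.unlocked:
            raise PermissionError("logic solver is write-locked")
        low, high = self.limits[tag]
        if not (low <= value <= high):  # plausibility check in the solver
            raise ValueError(f"{tag}={value} is outside {low}..{high}")
        self.values[tag] = value


def guarded_write(solver, tag, value):
    solver.unlocked = True        # step 1: unlock
    try:
        solver.write(tag, value)  # step 2: write the parameter of interest
    finally:
        solver.unlocked = False   # step 3: always lock again
```

The `finally` clause ensures that the solver is locked again even when the value is rejected.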

Feedback
Operators need feedback to see that a tripped valve has actually
moved. Therefore feedback from valve instrumentation should
show actual valve position to alert operators if the valve does not
fully close. It may be a good idea to use intelligent valves, possibly
networked with a safety fieldbus in conjunction with OPC to
bring feedback information such as actual position, health, mode,
and override status all the way to the operator. Check if the OPC
server for the logic solver supports feedback for such information.

Status
The operator visualization software must show if a SIS subsystem,
component, or network has failed or is degraded. Check if the
OPC server for the logic solver supports feedback for such infor-
mation. It may be a good idea to use field instruments networked
with a safety fieldbus in order to access status of intelligent field
devices such as transmitters and valves. If communication
networks in the SIS have failed or are degraded, this must be
indicated by the status in the logic solver OPC server. The OPC
client may
use the OPC status to show some invalid symbol in a specific
color, instead of the value, and to issue an alarm.

Access
Access should be limited to authorized persons as a means of protec-
tion. The software must therefore have security built in, preferably
integrated with the operating system. Authentication and
authorization are explained further in the software security
subsection.

Others
There are a few other points of consideration for displaying SIS
data on the consoles of the basic automation system. First, make
sure the update time is fast enough even during emergency
conditions when there may be a storm of alarms, and timely access to
data is critical. Also, make sure graphics and displays use colors
that can be distinguished by all operators.

Consider the impact of loss of operator visualization for the SIS
which would result, for example, from failure of client software,
operating system, computers, networking, or the OPC server.
Other means of operating the SIS may have to be put in place,
such as also communicating the bare essentials of parameters
through Modbus. If a shutdown is required when visualization is
lost, then a handshake mechanism is required to detect such a
failure.
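
One common form of handshake is a counter that the visualization software increments periodically and the logic solver watches. The sketch below shows the watching side only; the class name and time-out are assumptions, and a real implementation would live in the logic solver's own configuration language.

```python
from typing import Optional


class HandshakeMonitor:
    """Declares visualization lost when the counter written by the
    operator station stops changing for longer than the time-out."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s
        self._last_counter: Optional[int] = None
        self._last_change = now

    def on_counter(self, counter: int, now: float) -> None:
        # Only a changed counter proves the far end is still alive.
        if counter != self._last_counter:
            self._last_counter = counter
            self._last_change = now

    def visualization_lost(self, now: float) -> bool:
        return now - self._last_change > self.timeout_s
```

Watching for a change, rather than for any received message, also catches the case where a gateway keeps repeating the last value after the client has stopped.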

If time-stamped alarms are propagated from the SIS to the basic
automation system, make sure the SIS and basic automation
system are time synchronized from the same time master, such as
by using SNTP.

Availability
Open systems based on OPC interfaces and other open technolo-
gies must meet the same availability requirements as proprietary
systems. If automation systems stop functioning, the result could
be production downtime, people stuck in lifts, etc. In other words,
automation system failures mean loss of revenue, reduced produc-
tivity, frustration, and so on. Failure of the automation system may
also result in lost data, such as production records, which may
have regulatory implications. Hot-standby redundancy may there-
fore be implemented for networking, OPC servers, and disk drives.
Other measures include backup power, industrially hardened or
fault tolerant computers, as well as using solid-state hard disks.

Network Redundancy
To provide full benefits, automation software at the automation
system level requires uninterrupted communication with the
underlying automation hardware. Ring topology or redundancy
can be used for control networking. Redundancy at the execution
and business levels is rare, but this may change in the future as
supply chain management, etc. requires a continuous flow of data
through all levels. Using Web services also requires high connec-
tion availability.

Ethernet Ring Topology
A simple way to increase availability is to use ring topology for
the Ethernet LAN. In ring topology, a physical ring is formed by
connecting special LAN switches together (see Figure 10-2). The
LAN switches have two ports to form the ring and use a propri-
etary protocol to detect faults and quickly direct communication
the other way. The ring may use copper or fiber optic media.
Ethernet devices, servers, and workstations only need to have a
single Ethernet port to connect to the LAN switches.

Figure 10-2. Ring Topology

If the ring wiring is broken, the communication simply goes the
other way. Ring topology is not quite redundant because the
switches constitute a possible single point of failure. Ring
topology is not yet an IEEE standard. Therefore all LAN switches
in the ring must be the same brand.

Dual Ethernet
Another simple approach to redundancy in control networking is
to use automation hardware such as controllers that have dual
network ports, while computers only use single network ports.
Most automation hardware such as controllers is readily available
with two network ports. By forming two independent networks
using two LAN switches, with individual sets of OPC servers and
client software, the automation system can still be operated from
the secondary system if the primary system should experience a
failure in a device, LAN switch, network cabling, or client or
server software (see Figure 10-3).

Figure 10-3. Two Ethernet Networks

This simple approach may work as a redundancy scheme in a
small system. However, synchronization between software and
hardware on the primary and secondary networks cannot be
complete and, therefore, this solution may not be adequate for
medium-to-large systems.

Full LAN Redundancy
In a fully redundant network all devices, workstations, and servers
have two network ports. Two networks are formed using two LAN
switches, and all nodes are connected to both networks. Devices,
servers, and workstations are all redundant (see Figure 10-4).

Figure 10-4. Full LAN Redundancy

Full redundancy means any one component can fail and the
automation system is still able to operate and be supervised. Full
LAN redundancy requires the controller networking to have an
application layer protocol that provides this redundancy scheme.
Not all Ethernet-based industrial networking protocols have this.
An example of an industrial application layer protocol that
supports full LAN redundancy is FOUNDATION™ Fieldbus HSE.
Full LAN redundancy can also be combined with ring topology
forming two rings. For more information on redundancy, refer to
“Fieldbuses for Process Control: Engineering, Operation and
Maintenance”.1

OPC does not natively support any form of redundancy. OPC
servers can have dual network interface cards (NIC) and can, at
the hardware end, support the redundancy mechanism of the
industrial Ethernet. However, for OPC, redundancy has to be
implemented by switching at the client end.

Make sure to use OPC servers that support two Network Interface Cards
(NIC).

Hot-standby Redundancy OPC Servers
If the OPC server fails, the operators go blind, and any software
gateways that bridge OPC between different systems and industrial
Ethernet protocols will cease to function. Therefore, the OPC
server is critical. In many systems, this requires hot-standby
redundancy. Since some critical loops may have to shut down
when the operator goes blind, and this may cause a stop, it is very
undesirable to have OPC failures. Data from the underlying
automation hardware must be provided to the automation soft-
ware without interruption. Therefore, hot-standby redundancy
can be used with OPC servers to increase availability.

The status capability built into OPC allows detection of a failed
server and automatic switching to a working server. In a redundant
OPC architecture, the OPC redundancy management software
connects to both the primary and secondary OPC servers. The
OPC redundancy management application executes on every
client workstation. It monitors the primary and secondary OPC
servers and switches to the secondary should the primary fail. The
status functionality indicates the parameter as being “good” or
“bad” and thus allows the redundancy management application to
switch automatically from a failed OPC server to the secondary. The
OPC clients, in turn, are always connected to the redundancy
manager and never notice a single server failure (Figure 10-5).
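
The switching logic of such a redundancy manager can be sketched as follows. This is a simplified model, not a real product: each "server" is just a callable returning a value and a good/bad flag, a lost connection is modeled as `ConnectionError`, and the manager switches on the first bad read, which is cruder than a real implementation would be.

```python
from typing import Callable, List, Optional, Tuple

# A "server" is any callable that returns (value, quality_good) for a tag.
Server = Callable[[str], Tuple[float, bool]]


class RedundancyManager:
    def __init__(self, servers: List[Server]):
        self.servers = servers  # ordered: primary first, then backups
        self.active = 0         # index of the server currently in use

    def read(self, tag: str) -> Tuple[Optional[float], bool]:
        # Try the active server first, then the others in configured order.
        order = [self.active] + [
            i for i in range(len(self.servers)) if i != self.active
        ]
        for i in order:
            try:
                value, good = self.servers[i](tag)
            except ConnectionError:
                continue  # server unreachable; try the next one
            if good:
                self.active = i
                return value, True
        return None, False  # every server failed; the client sees "bad"
```

Clients call `read` on the manager and never address a server directly. In this sketch the manager stays on the backup after the primary recovers.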

Figure 10-5. OPC Redundancy Architecture

The switchover is completely automatic and transparent; there is
no need to reconfigure the client and no scripts need to be created
for the switching to work. Servers are selected by browsing the
network, and the primary server is then identified. It is possible to
select more than one backup server, and the order in which they
are chosen can be rearranged (Figure 10-6).

Figure 10-6. OPC Hot-Standby Redundancy Configuration (Screenshot:
SMAR SYSTEM302)

Diagnostic tags are available to see if the primary is being used.
This allows display in system status screens and notification to
technicians in case of server failure. Once the primary recovers,
there is an option to switch back to primary, or retain the
secondary. Another option prompts the user for confirmation
before switching back to the primary server.

RAID Drives
Servers require high availability for the data. High availability for
data storage can be achieved using fault tolerant hard disk archi-
tectures such as RAID (Redundant Array of Independent Disks).
Using RAID controllers for disks on the servers, fault tolerance
can be achieved using regular hard disks. There are many
different RAID architectures available: 0, 1, 2, 3, 4, 5, 6, 10, 50, and
0+1. However, the most common architectures are RAID 1 and
RAID 5. RAID 0 is not fault tolerant. RAID 1 uses disk mirroring,
requiring a one-for-one duplication of disks. RAID 5 uses a parity
scheme requiring one additional disk, but requires a minimum of
three disks. Figure 10-7 illustrates that RAID 1 mirroring needs
eight disks to hold four disks of data while RAID 5 parity only
needs five disks to hold four disks of data.
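
The disk counts quoted above follow from simple arithmetic, shown here as a small sketch (the function names are ours):

```python
def raid1_disks(data_disks: int) -> int:
    """Mirroring: every data disk needs a duplicate."""
    return 2 * data_disks


def raid5_disks(data_disks: int) -> int:
    """Parity: one additional disk, with a minimum of three disks."""
    return max(data_disks + 1, 3)
```

For four disks of data, `raid1_disks(4)` gives 8 and `raid5_disks(4)` gives 5.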

Figure 10-7. Computers with RAID 1 and 5 Architectures

An additional benefit of this architecture is that many servers
using RAID permit adding more drives online without having to
stop the SQL server or operating system. Thus, the capacity can be
increased without stopping if existing storage space or speed is
insufficient.

Use RAID 1 or RAID 5 for servers in order to protect the data.

RAID 1 (Mirroring)
A RAID 1 controller implements mirroring by writing the infor-
mation to two drives in a mirrored pair. If a drive fails, the
mirrored drive still contains all the data. Once the failed drive is
replaced, it is immediately rebuilt using data from the mirror
drive. Most RAID 1 solutions are implemented in hardware,
permitting this to be done online. The mirroring scheme is expen-
sive, as it requires twice the number of disks.

RAID 5 (Parity)
A RAID 5 controller implements parity by computing recovery
data and storing it in parts across other disks. If a drive fails, its
data can be regenerated from the parity data stored on the other
drives. The parity scheme is lower cost because it requires only
one additional disk. RAID 5 writes data more slowly and is not as
fault tolerant as RAID 1.
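
Parity recovery is commonly based on the exclusive-OR (XOR) operation. The sketch below shows the principle on three small data blocks. For simplicity it keeps the parity in one separate block, whereas a RAID 5 controller distributes parity across the disks. The block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


data = [b"AAAA", b"BBBB", b"CCCC"]  # contents of three data disks
parity = xor_blocks(data)           # recovery data for the stripe

# The disk holding data[1] fails: rebuild it from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block reproduces the missing block, so `rebuilt` equals `data[1]`.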

Power
Uninterrupted operation of software requires continuous power
supply to the computers. Several measures can be taken to ensure
clean and uninterrupted power.

Redundant Power Supply Modules
Server chassis can be ordered with hot-standby redundant power
supplies, meaning if one power supply module fails, the other
takes over.

UPS
An uninterruptible power supply (UPS) sits between the line
power outlet and the computer. It includes a battery and an
inverter that can be dimensioned to power a computer for about
15 minutes if line power is lost. The UPS also contains power
conditioning that carries the server through short power
disturbances such as brownout sags and over-voltage conditions.
While the server operates on power from the UPS, data can be saved
or even backed up before a graceful, orderly shutdown is performed.
A sophisticated UPS can be linked to the communication ports of the
server to provide unattended file-saving, followed by automatic
shutdown in the event of a sustained power outage.
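
The unattended behavior can be pictured as a simple decision based on how long the server has been running on battery. The thresholds below are illustrative assumptions only (a 15-minute runtime, a 10-second glitch window, and a 5-minute reserve for shutting down); real UPS monitoring software has its own settings.

```python
def ups_action(
    on_battery_s: float,
    runtime_s: float = 15 * 60,  # assumed battery runtime
    glitch_s: float = 10.0,      # assumed length of a short glitch
    reserve_s: float = 5 * 60,   # assumed time kept back for shutdown
) -> str:
    """Decide what the server should do while on UPS power."""
    if on_battery_s <= glitch_s:
        return "ride-through"    # short glitch: keep running
    if on_battery_s < runtime_s - reserve_s:
        return "save-data"       # sustained outage: save and back up
    return "shutdown"            # reserve reached: shut down gracefully
```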

To enable continuous operation during prolonged blackouts, it is
necessary to use a backup generator.

Power Conditioning
Surges and harmonics present on the utility power line caused by
nearby lightning strikes and switching of heavy electrical loads –
both common in industrial environments – may damage
computers and other equipment. Transient voltage surge suppres-
sors can be installed for each computer for line conditioning.

Also consider surge protection on other external connections to
the computers, such as the phone line for the modem and the
network connection.

Industrial Computers
Industrial grade computers are hardened to survive a much
harsher environment and may even be used on a factory floor.
Industrial computers are typically installed in a 19-inch rack in a
control panel or cabinet. Carefully designed ventilation creates a
slight excess pressure inside the chassis. This, together with filters,
prevents dust from entering and extends the operating tempera-
ture range. A rugged chassis with holders for plug-in cards makes
industrial computers less sensitive to shocks and vibrations.
Solid-state hard disks based on Flash technology are available up
to several gigabytes, and they are constantly getting larger. Since
computers used in automation run continuously and are often
unattended, industrial computers have a lockable front covering
the removable disk, on/off button, reset switch, etc. (Figure 10-8).

Figure 10-8. Industrial Computer (Courtesy: Beckhoff)

Fault Tolerant Server
A single fault tolerant server is an alternative to using two separate
dual redundant servers. A fault tolerant server has a redundant
architecture internally, permitting it to tolerate faults and to
continue operating. A fault tolerant server is an ideal platform for
supervisory control and batch control functions executing on
computers, but also ideal as alarm & event server, alarm & event
logger, historian logger, etc. Batch control software may include,
for example, recipe execution, equipment allocation, and batch
history. These are tasks for which high availability is required, but
for which redundancy in two separate servers would be hard to
achieve. For such applications the 99.999% “five nines” avail-
ability of fault tolerant servers is ideal.
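
To put "five nines" in perspective, availability can be converted into expected downtime per year. The calculation below assumes a 365-day year and continuous operation.

```python
def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability (0..1)."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes, ignoring leap years
    return (1.0 - availability) * minutes_per_year
```

An availability of 99.999% corresponds to roughly 5.3 minutes of downtime per year, whereas 99.9% corresponds to almost 9 hours.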

Fault tolerant servers are computers with dual or triple redundant
hardware to achieve fault tolerance. While simple computer part
redundancy only provides for redundant network interface cards,
power supplies, fans, and hard disks, fault tolerant servers also
have redundant motherboards including CPU, memory, and I/O
buses (Figure 10-9). Unlike fault tolerant servers, clusters use
multiple servers in a warm-standby scheme. Once a fault has been
detected, another computer is charged with the task. Therefore, it
takes time to load the application and data of the failed computer,
and during this time the system is down. Clustered computer
redundancy requires the backup computer to load applications
and data before it can pick up the task of a failed computer.
However, fault tolerant servers have no downtime. The redundant
CPUs in a fault tolerant server execute the same code simultaneously
in lockstep, permitting the remaining healthy CPU to
continue execution uninterrupted if a CPU fails. The redundant
I/O bus boards are separate from the CPU boards.

Figure 10-9. Fault Tolerant Server (Courtesy: Stratus Technologies)

A fault tolerant server executes regular software applications; no
special cluster-aware version of software is required. This means
any software can be used on a fault tolerant server. No special
scripting has to be created and tested to achieve the redundancy.
However, a notable exception may be some license-controlled
software, often found in automation. Verify with the software
vendor if there may be license issues when running the software
on fault tolerant computers and if special versions exist.

Many problems attributed to software are actually caused by
transient hardware errors. The redundancy in the fault tolerant server
architecture detects and prevents such hardware faults. While a
clustered computer would actually fail, and the backup would
take some time to get running, a fault tolerant server does not fail
in the first place.

Availability can also be improved using auto-reboot to provide
quick recovery.

Cyber Security
In the past, most automation systems were isolated islands not
connected to other networks or systems outside the control room;
some were connected to the PIMS and LIMS, which were also
isolated islands. From the point of view of security, this is far
easier to manage than a system permanently connected to other
networks, particularly the Internet. The push for enterprise
integration is very strong, but connect the automation system
externally only if you must.

Today’s automation system may be connected to the enterprise
LAN, and even the corporate Intranet and the public Internet.
Such networking enables internal staff as well as external
customers and suppliers to access the information they
need online themselves, and is also a platform for SCADA. Such
publicly accessible connections could make the automation
system and important internal databases vulnerable to remote
cyber attacks. The system could be compromised without the
culprit even getting near it. In this book, the term “hacker” is used
in the broad sense of anyone unauthorized trying to gain access to
networks and computers. The correct term may sometimes be
“cracker” or something else; the person may even be a terrorist.

Contrary to popular belief, obscurity in the form of proprietary
technologies is not a protection. Hackers can bring an automation
system to its knees without understanding the protocol or even
knowing it is an automation system, thereby jeopardizing a plant
and everything around it. An example of this is a so-called
denial-of-service attack, where a device, gateway, or computer is
flooded with more requests than it can handle, preventing access by
the operator and other machinery. Invalid messages can make the
device behave unexpectedly. Similarly, valid messages in a
proprietary protocol can be captured at one time and replayed at
another. It is relatively easy to figure out the frame structure of
an automation protocol, since such protocols are not encrypted.
Although the exact effect of a replayed message may not be known, it
is sure to cause disruptions and possibly trips and hazards.
Therefore, a network using a proprietary protocol or computers with
an exotic operating system is just as vulnerable as open
standards-based equivalents. In the past, systems were protected
simply because they were not connected.

The perceived obscurity of proprietary protocols may give users a
false sense of security. One conceivable safety aspect of security is
the possibility for a terrorist to “spoof” control room commands to
create a disaster. However, in addition to network protocol
knowledge, an attempt to move a particular valve or toggle a
particular piece of equipment requires detailed knowledge of the
unique configuration of the system at the particular site, the
wiring, and the process. In other words, even knowing a standard
protocol, this approach is too difficult. Without the
application-specific configuration file for the specific system, the
communication messages are just as obscure as a proprietary
protocol. A denial-of-service attack is a more likely terrorist
threat, since using brute force to create a hazard by bogging down
or even crashing the system can be done without requiring any
knowledge. Vulnerabilities in open systems are the same as those in
proprietary systems.

To make the task as difficult for the hacker as possible, limit the
distribution of system documentation to those who really need to know.

Connect the automation system to the plant information network
and Internet only if you must in order to share data with the rest of
the enterprise and the world. Not connecting the automation system
to the outside world saves you a great deal of trouble, but you will
not be able to reap the benefits you otherwise could. Use network
and software security regardless of whether the protocols and
operating system used in the automation system are proprietary or
open.

A cyber security threat may include an attack through any
external connection such as an Intranet or Internet connection
through a LAN switch or router, a dial-up modem, or a wireless
access point. Another threat often forgotten is somebody walking
into the control room and connecting a laptop, which might have
contracted malicious software elsewhere. Such a laptop may also
have internal wireless LAN enabled.

Network and computer security is a big topic covered
comprehensively in general information technology (IT) books. This section
of the book gives an overview of the security topic and tries to put
it in an automation perspective. Additionally the reader may wish
to refer to the ISA TR99 technical reports.

Security technology such as firewalls and reverse proxies is only
half the solution. There is no silver bullet for security because
hackers are constantly getting smarter. Security is warfare
constantly using new measures and countermeasures. Security is
not an event. It is a constant process, part of the life-cycle manage-
ment of the automation system. A comprehensive security solu-
tion needs a security policy to keep the system secure over time
because there is no “set and forget” solution for security. New
threats are created all the time.

If you need to connect the automation system to the Internet, then
you may need a full time administrator to manage the automation
system’s security. Hackers are constantly looking for weaknesses
not yet discovered, and new technologies introduce new threats.
For example, enterprise integration, permanent connection to the
Internet, and wireless networking have all introduced new
challenges that had to be dealt with. Similarly, the next new
technology, in addition to enabling a quantum leap in performance,
will also introduce new possible openings for a cyber attack. For
example, the current trend of using port 80 for SOAP messages
that may contain XML-RPC commands to execute code on the
server will require more sophisticated screening.

A denial-of-service attack on the internal automation system network
would mean operators would no longer be able to see what is happening
on the factory floor or take action, and would be unable to do their
job safely. The operation may have to be shut down. A denial-of-service
attack on a Web server could, for example, prevent a portal from
working properly; customers might be unable to enter orders or get
updates, and automatic raw material reordering would not work. If you
need to connect the automation system to the office LAN, do so with
caution. In a business-to-business networking scenario, make sure
customers, vendors, and other extranet partners are also secure.

There are critical systems where no amount of security provides
adequate protection to justify a network connection to other systems.
For emergency shutdown systems, the best option may be to not integrate
them with the enterprise. It is sometimes better to keep the system an
isolated island. However, simply having the system disconnected does
not ensure security. Portable laptop computers, file transfer through
removable disks, etc. are threats that remain even for an otherwise
isolated system.

There are two aspects to security technology in addition to physical
security: network security and software security. Network security is
the first line of defense, designed to prevent unauthorized remote
connection to the private network and to the devices and computers on
it. Connecting to the network is one thing; accessing the resources on
that network is another. Software security includes authentication and
verification to make sure the user connected on a network trying to
access some resource really is who he or she claims to be, and to
restrict rights accordingly.

Protecting the automation system requires many layers of security.
This includes security for the network connection, security for the
operating system, for the software applications, etc. The network in
turn may require many lines of defense: router, firewall, and reverse
proxy. Similarly, the network architecture can have multiple
perimeters: Internet-ERP, ERP-MES, and MES-PAS, thus ensuring security
in depth by reducing vulnerability. Compare the single network security
layer architecture in Figure 10-10 with the multiple layers in
Figure 10-11.

Figure 10-10. A Single Network Security Layer Architecture Provides Less Security

WARNING – Ultimately there is nothing that is absolutely secure;
therefore, in this book “secure” simply means “difficult to break.”

Network Security
Because security measures such as a firewall between the automation
network and the information network make integration
more difficult and costly, it is tempting to not use any form of
security. However, security measures are necessary to protect the
automation network from problems permissible in the execution
and business environment, but not in automation. The risk associ-
ated with connecting the automation system to the outside world
includes the risk of a hacker destroying valuable data and falsi-
fying operator commands that could jeopardize assets, possibly
even the environment and human life. The other risk is denial-of-
service attacks that could make system operation unacceptably
slow or even force a stop.

Device-level automation networks such as FOUNDATION™ Fieldbus H1 and
Modbus/RTU are all enclosed within the walls of a building or the
perimeter fence of the plant, meaning they are not connected directly
to the Internet. Because these networks are physically protected, they
are not points of entry for a remote cyber attack. The need to secure
the system arises at the IP end, at the connection to the corporate
Intranet and the public Internet.

A firewall is used in high-level networks based on Ethernet and IP. A
firewall can have many different roles depending on how the network is
used. The firewall may be used to prevent internal clients from
accessing certain external servers. The firewall may be used to prevent
outside clients from accessing servers on the internal private network.
The most difficult scenario is when outside clients must be permitted
to access some internal servers, such as a Web portal, while at the
same time preventing these clients from accessing everything else. To
make matters worse, the Web portal must be able to access the internal
network to get the data it presents. The firewall is located at the
network perimeter to the external networks.

Administering workstations and servers in the automation system is
somewhat different from the office environment. Computers in the
automation system execute critical applications that cannot be shut
down while the plant is running, sometimes running for years.
Automation software requires automation expertise rather than IT
expertise. For these and other reasons the computers and networks in
the automation system should not be administered by the corporate IT
department. The automation system should therefore be separated from
the execution and business systems using back-to-back routers or
firewalls. The automation system-side firewall is administered by the
automation system group, while the other is administered by the IT
department. This way, there is no issue of ownership and
responsibility. The IT department simply needs to allocate a few blocks
of IP addresses to be used on the automation subnets.

Figure 10-11. DMZ Provides Clear Demarcation of Responsibilities

Connecting the automation network to the company’s own business IT
network is a security risk for several reasons. First, the business LAN
is invariably connected to the Internet; hence, the automation network
will indirectly be connected to the Internet. Second, a majority of
cyber security attacks are carried out by company employees, not
anonymous strangers. Third, the casual use of networking in the
business environment may result in inadvertent upsets of the automation
network. Plug-and-play devices and software perform considerable
communication to discover each other, consuming bandwidth in the
process, so direct connection to the Internet is not the only security
consideration.

Automation network protocols are not firewall-friendly. The best option
is to present data through a portal using Web technologies. This
creates the need for additional hardware and software. It may be
tempting to run industrial fieldbus and Ethernet outside the automation
system boundaries, but this should be avoided.

Router, Firewall, and Proxy Server


The terminology for router, firewall, and proxy server is not
completely consistent. All these devices, among other measures, are
useful to protect the network and individual devices by restricting
access and forwarding only legitimate messages to their destination.
Many times a router contains a firewall for the WAN port. Sometimes a
proxy server is simply referred to as a firewall, although a proxy
server typically provides two functions: caching for higher speed, and
application proxy firewall for security. The features that routers,
firewalls, and proxies support vary considerably. All of them, however,
have the basic functionality of disabling response to the PING command.
Typical functionality is explained below, although from time to time
you may find that the functionality is available in another device. In
this book the following definitions are used, in order of increasing
sophistication:

• Router - a device that performs stateless packet filtering, NAT,
and VPN

• Network Firewall - or simply firewall, a device that adds intrusion
detection with alert and logging

• Application Proxy Server - or simply proxy, a computer with software
that adds filters for services at the application protocol level, and
optionally virus scanning. The complexity of these filters introduces
a slight communication delay.

The primary task of a router is not security, but the router is a first
line of cyber security defense. Even simple routers and some LAN
switches have basic security features built in, such as discarding
PING and performing packet filtering (see Figure 10-12). These
are simple measures that make it much more difficult for the
hacker to locate and access the network and may be sufficient to
make the casual hacker choose another, easier target instead.
However, they are not sufficient to deter a more determined hacker.

Figure 10-12. Router Configured to Discard PING



Each location must evaluate its cyber security needs. To put it simply,
a hacker would grab more attention hacking into the plant of a
multinational company and disrupting the power in a major city, than
turning off the HVAC of a small hotel. Essentially, the higher the
profile of the company, the better the security may have to be. The
router is the first line of defense. Next comes a firewall or proxy
server, which may possibly include a virus scanner. Additional
technologies include SSL encrypted HTTP, that is HTTPS, as well as VPN.
Last but not least, technologies alone are not sufficient. A cyber
security policy is necessary.

Modern router devices contain much more functionality, including some
basic capabilities previously only found in dedicated firewall devices.
Networking hardware such as ADSL modems for broadband connection to the
Internet usually contains router and basic firewall capabilities. Even
LAN switches may have one WAN port providing simple packet filtering.

Skilled hackers can, with time and determination, penetrate simple
packet-filtering routers using techniques such as port scanning.
Automation systems that become targets of cyber attacks may have to
employ a firewall or a reverse-application proxy.

The automation system needs just one firewall. Place it between the
automation system network and its connection to the rest of the
enterprise IT network. This firewall protects all workstations and
automation devices on the automation network. The firewall should be
part of the automation system, so that it becomes an asset under the
control of the automation department and maintained by the automation
vendor.

Packet Filtering
Packet filtering is communication screening based on only IP
address and port number. Each port number essentially corre-
sponds to one application layer protocol – for example, HTTP is
port 80. Enabling and disabling ranges of addresses and port
numbers is done using simple fill-in-the-blank fields (see Figure
10-13). Therefore, using packet filtering, it is possible to enable
and disable access from different computers and prevent using
specific protocols, such as FTP and Telnet. Typically, all IP
addresses except those that really need access are disabled.
Similarly, all ports are blocked, except for those few protocols
necessary. For example, only a few computers on the Intranet in a
restricted IP address range may be permitted to access the automation
network, and they can only do so using a limited number of permitted
protocols. As another example, only certain IP addresses are permitted
to access a Web server, and may do so only using HTTP.
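
The default-deny screening described above can be sketched in a few
lines of Python. This is only an illustration of the idea, not the
configuration language of any particular router; the address range, the
port, and the names Rule and is_permitted are assumptions made for the
example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One permit entry: a source address range and a destination port."""
    source: ipaddress.IPv4Network
    port: int


# Hypothetical rule set: one small Intranet range may reach the Web
# server, and only on the HTTP port. Everything else is blocked.
PERMIT_RULES = [
    Rule(ipaddress.ip_network("10.20.30.0/28"), 80),
]


def is_permitted(src_ip: str, dst_port: int, rules=PERMIT_RULES) -> bool:
    """Default deny: a packet passes only if a rule explicitly permits it."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in rule.source and dst_port == rule.port for rule in rules)
```

With this rule set, is_permitted("10.20.30.5", 80) is true, while the
same computer is refused on the FTP port 21, and a computer outside the
permitted range is refused on every port.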

Figure 10-13. Fill-in-the-Blanks Packet Filter Configuration

Because viruses and other malicious software are often contracted
through Web browsing, file transfer, or Email, the computers in the
automation system should not have HTTP, FTP, or SMTP access outside the
automation network. This is easily achieved by blocking the respective
ports. Similarly, TELNET often permits powerful functions to be
performed. Since FTP and TELNET are not encrypted, they should be
blocked. In short, block all ports except the few actually needed. An
advantage of router packet filtering is that no special configuration
is required in client computers.

Block FTP and TELNET if not needed. If FTP is used, do not permit
anonymous login.

The software firewall in Windows XP is a packet filter, and it blocks
OPC by default. However, it is recommended that a firewall be placed at
the perimeter of the automation network connecting up to higher-level
networks, protecting the entire automation network from the outside,
such that software firewalls need not be enabled in individual
workstations. If it is used, the software firewall can be configured to
open specific ports or ports used by specific applications (see
Chapter 3).

The packet filtering scheme would not work in cases where the
same port or same set of ports is used to perform several diverse
functions – for instance, whenever one protocol is used for many
different things. When a single protocol is used for many functions
it becomes impossible to make fine distinctions. It is necessary to
permit either all or nothing. A case in point is the powerful DCOM
protocol, into which Windows has placed all of its diverse
functionality. OPC uses DCOM, but permitting DCOM would be a security
risk because many other functions also use DCOM, and opening up a range
of ports for DCOM may permit hackers to perform a whole range of
tricks. This is why DCOM is not suitable for communication between the
automation system and the rest of the enterprise.

New technologies such as OPC-XML-DA, the upcoming OPC-UA, and the
entire Microsoft .NET platform use underlying technologies other than
DCOM to avoid these problems. The realm of DCOM should be within the
automation system, not outside. From a cyber security perspective, it
is better to use a scheme where one port is used for only one protocol.
This reduces the risk by providing finer control of what is permitted
and what is not. This is particularly so in conjunction with
application proxies that can examine the protocol in even greater
detail.

Other filters include IP fragment filtering, meaning fragmented packets
are not allowed to pass. Option filtering means that packets that have
IP options enabled are discarded. A secondary benefit of packet
filtering is that unnecessary traffic is kept away from the automation
network, ensuring performance is not degraded.

Packet filtering based on MAC address can be used to guard against
vulnerabilities introduced by laptops connected for service or by
casually connected wireless access points. Packet filtering can also be
used to mitigate a broadcast storm. For example, when one or more
computers are set up to broadcast lots of messages, more than some
devices can process, the result would be denial-of-service. By limiting
the number of connections to a device, it is possible to prevent a
denial-of-service that would otherwise result in resource starvation.

A Web client such as an HTML browser uses the HTTP “GET” command to
retrieve pages and files from a Web server. This makes it easy for the
client-side firewall to open and close ports as necessary and to only
permit replies from the outside to enter if they correspond to a GET
request. However, the server-side firewall is more complex and will
require a more advanced solution to protect the server.

Network Address Translation (NAT)


NAT is a basic router feature. Its primary purpose is to permit
many devices and computers on a network to share one single
Internet address. The primary benefits are that the costly rental of
IP addresses is reduced and that fewer of the limited addresses
provided by IP version 4 are consumed.

Another benefit is that NAT increases security by hiding the IP
addresses of devices and computers on the inside network. NAT can thus
be used to shield the automation network from external users. The only
address visible to the external network is the address of the router.
The less the hacker knows about the internal network, the more
difficult it is to attack.
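
How several internal computers can hide behind the single router
address can be sketched as a translation table. The sketch below
assumes the port-based variant of NAT, in which each internal
connection is given its own port on the router; the class name NatTable,
the starting port 40000, and the addresses are hypothetical.

```python
class NatTable:
    """Port-based address translation: many private hosts, one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the (public_ip, public_port) seen by the external network."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port):
        """Return the internal destination, or None if no mapping exists."""
        return self._inbound.get(public_port)
```

An external observer only ever sees the router address and a port
number. A packet arriving on a port with no mapping has no internal
destination and is simply dropped.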

Access List
Access list is a filtering mechanism found only in advanced
routers and in firewalls. An access list is configured as a list of
statements used to build filters that permit or deny inbound and
outbound traffic between the internal and external networks. It is
possible to have different access criteria for each Ethernet port on
the router or firewall. Filtering is primarily based on IP source
addresses but can also include protocol and destination address.
Options include address wildcards, checking whether the message is a
response to a TCP request, priority and type, discarding PING requests,
and suppressing error messages, as well as logging and remarks. Thus,
access list configuration is more programmatic in nature and therefore
more difficult than simple packet filtering. However, it is also more
flexible and powerful.

Address spoofing is a basic hacker trick that can be mitigated using
access lists. Basic measures against address spoofing include access
list statements to:
• Block all inbound packets with a source address in the three RFC 1918
address blocks (10.0.0.0 - 10.255.255.255, 172.16.0.0 - 172.31.255.255,
and 192.168.0.0 - 192.168.255.255)
• Block all packets with IP address 0.0.0.0, 255.255.255.255, or
127.0.0.0
• Block class D (multicast) and class E (reserved) addresses if not used
• Block all Internet packets with an IP address claiming to be on the
automation network
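
The anti-spoofing statements listed above can be sketched as a check on
the source address of a packet arriving from the external network. The
function name is_spoofed and the automation network range 203.0.113.0/24
(a documentation address block) are assumptions for the example, and the
whole 127.0.0.0/8 loopback block is assumed to be what the list intends
by 127.0.0.0.

```python
import ipaddress

# Source ranges that should never appear on packets arriving from the
# external network, taken from the list above.
BLOCKED_SOURCES = [ipaddress.ip_network(net) for net in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",     # RFC 1918 blocks
    "0.0.0.0/32", "255.255.255.255/32", "127.0.0.0/8",   # unspecified, broadcast, loopback
    "224.0.0.0/4", "240.0.0.0/4",                        # class D and class E
)]

# Hypothetical address range of the automation network itself.
AUTOMATION_NET = ipaddress.ip_network("203.0.113.0/24")


def is_spoofed(src_ip: str) -> bool:
    """True if an inbound packet claims a source address it cannot have."""
    addr = ipaddress.ip_address(src_ip)
    if addr in AUTOMATION_NET:
        return True  # an outside packet claiming to come from the inside
    return any(addr in net for net in BLOCKED_SOURCES)
```

A packet from 198.51.100.9 passes this check, while packets claiming to
come from 10.1.2.3, 127.0.0.1, or 203.0.113.7 are rejected.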

Dynamic Access List


In many systems, the IP addresses are dynamically assigned. In these
cases, an access list based on static IP addresses would not do the
job, since one user may be assigned different IP addresses from one
session to the next. Sophisticated routers and firewalls support a
mechanism where users log in with username and password, and the filter
dynamically selects a pre-configured access list accordingly each time
the user logs in. A serious drawback is that this mechanism requires
user accounts with names and passwords to be managed in the router.
This is an additional administrative task.

Reflexive Access List


Using a reflexive access list, the inbound access list is dynamically
opened and closed in response to an outbound communication.
This permits all external IP addresses and ports to be blocked by
default. When an outbound request is transmitted from a client,
the filter is automatically opened to permit the response message
from the server to return. This increases the security on networks
with clients, as all ports and addresses are blocked most of the
time, only opened temporarily when required, then closed again
after a period of inactivity.
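
The open-on-request, close-on-inactivity behavior can be sketched as a
small state table. This is a simplified illustration rather than the
mechanism of any specific router; the class name ReflexiveFilter and the
30-second timeout are assumptions, and the current time is passed in
explicitly so the behavior is easy to follow.

```python
class ReflexiveFilter:
    """Inbound traffic is blocked unless it answers a recent outbound request."""

    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self._open = {}  # (remote_ip, remote_port, local_port) -> last activity

    def outbound(self, remote_ip, remote_port, local_port, now):
        """Record an outbound request, temporarily opening the return path."""
        self._open[(remote_ip, remote_port, local_port)] = now

    def inbound_allowed(self, remote_ip, remote_port, local_port, now):
        """Permit a reply only while its matching entry is still open."""
        key = (remote_ip, remote_port, local_port)
        last = self._open.get(key)
        if last is None:
            return False                 # nothing was requested: blocked
        if now - last > self.idle_timeout:
            del self._open[key]          # inactive too long: close again
            return False
        self._open[key] = now            # activity keeps the entry open
        return True
```

An unsolicited packet is refused, a reply shortly after a request is
accepted, and the same reply long after the last activity is refused
again because the entry has been closed.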

Other Access Lists


An access list can also include filtering based on time of day. This
permits certain functions to be carried out only at certain times of
the day. Other access list features include the ability to open
multiple ports in response to a request, as required by protocols
using more than one port.
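
A time-of-day condition amounts to a simple window check. The sketch
below is a minimal illustration; the function name permitted_at and the
working hours used in the example are assumptions.

```python
from datetime import time


def permitted_at(now: time, start: time, end: time) -> bool:
    """True if 'now' falls inside the permitted window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window spans midnight, for example 22:00 to 06:00.
    return now >= start or now < end
```

For a window of 08:00 to 17:00, a request at 09:00 is permitted and one
at 18:00 is not. The second branch covers windows such as a night shift
that runs past midnight.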

Intrusion Detection System (IDS)


Firewall software running on computers often includes sophisticated
Intrusion Detection Systems (IDS) that examine network packet patterns
and sequences, recognizing patterns and thus detecting possible
attempts to break in or commit denial-of-service attacks. For example,
by looking at several packets over a period of time, it is possible to
detect if someone is probing the network ports. Common attack methods
are recognized by comparison against known “signatures”; they are then
logged and blocked, and the network administrator is alerted.
Intrusion techniques commonly detected include (see Figure 10-14):
• Port scan
• Windows out-of-band
• Land
• Ping of death
• IP half-scan
• UDP bomb

Figure 10-14. Intrusion Detection (Screenshot: Microsoft ISA server 2004)

These common intrusion techniques are explained in regular IT books.
Once an intrusion is detected it can be reported by automatically
sending an Email. It can be logged, and the server can execute a
command to take immediate action. The firewall software should run on a
dedicated computer, not on one of the servers it is assigned to
protect.
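
Detecting a port scan by looking at several packets over a period of
time can be sketched as counting the distinct ports one source touches
within a sliding window. Real intrusion detection products use far
richer signatures; the class name PortScanDetector and the thresholds
below are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class PortScanDetector:
    """Flags a source that probes many different ports in a short time."""

    def __init__(self, window=10.0, max_ports=20):
        self.window = window        # seconds of history to keep
        self.max_ports = max_ports  # distinct ports tolerated per window
        self._seen = defaultdict(deque)  # src_ip -> deque of (time, port)

    def observe(self, src_ip, dst_port, now):
        """Record one packet; return True if src_ip now looks like a scanner."""
        events = self._seen[src_ip]
        events.append((now, dst_port))
        # Forget packets that have fallen out of the time window.
        while events and now - events[0][0] > self.window:
            events.popleft()
        distinct_ports = {port for _, port in events}
        return len(distinct_ports) > self.max_ports
```

A client that repeatedly uses one port never trips the detector, while
a source that touches more than max_ports different ports within the
window is reported so that the event can be logged and alerted.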

Application Proxy
A proxy server is intermediary software somewhere between the
clients and the ultimate source servers. A proxy server includes
features such as an application filter firewall for security and
cache providing accelerated Web access. Proxy servers can work
in both directions. By blocking outgoing messages, the proxy can
prevent internal users from accessing certain Web sites or servers
– preventing, for example, Web browsing or file download. By
blocking incoming messages, the proxy servers prevent many
forms of attacks.

In the incoming mode, the server is often referred to as a reverse
proxy. A reverse proxy is often used to protect a Web server that
should be available to the outside network, such as the business
Intranet or the Internet. A proxy server is software running on a
computer with a regular operating system for servers, such as a Windows
NT family server. Thus, proxy servers are vulnerable to operating
system and software application bugs. The application proxy software
should run on a dedicated computer, not on one of the servers it is
assigned to protect. Make sure to use well-proven Web server software
and keep the software current by applying any security patches as soon
as possible after they come out.

Packet filters and access lists don’t understand the application layer
protocol and therefore cannot permit certain protocol services while
denying others. For example, they cannot permit FTP read of a file
while denying writing files. Since other security methods don’t examine
the message contents and don’t understand its function, they offer only
a primitive all-or-nothing approach. However, an application proxy
understands the application protocol and can therefore perform more
sophisticated filtering, permitting certain functions of a protocol
while blocking others, such as permitting read while denying write.
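
Permitting read while denying write can be sketched for FTP by looking
at the command word of each control message. The command names come
from the FTP specification (RFC 959), but which commands count as
read-only is an assumption made for this example, as is the function
name permit_ftp_command.

```python
# FTP commands treated as read-only in this sketch. Anything not listed,
# including STOR, APPE, DELE, and the rename commands, is refused.
READ_ONLY_COMMANDS = {
    "USER", "PASS", "CWD", "PWD", "TYPE", "PASV",
    "LIST", "NLST", "RETR", "QUIT",
}


def permit_ftp_command(line: str) -> bool:
    """Pass an FTP control line only if its command is known to be read-only."""
    parts = line.strip().split(maxsplit=1)
    if not parts:
        return False  # empty line: nothing recognizable, so block
    return parts[0].upper() in READ_ONLY_COMMANDS
```

A request such as "RETR report.csv" passes, while "STOR report.csv" or
"DELE report.csv" is blocked even though all three travel over the same
port, which is a distinction a packet filter cannot make.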

The proxy servers typically include filters for common application
protocols such as TELNET, ICMP, HTTP, SMTP, and FTP and, in modern
servers, also SOAP. However, they do not provide filters for automation
protocols, although it is typically possible to program filters that
plug in to the proxy server. A proxy server can detect hacker tricks
such as transmitting protocols other than HTTP through port 80. A proxy
server is able to detect intentionally malformed packets, for example,
a packet with an illegal data field such as length but a correct
checksum, crafted to cause not-so-robust software to crash and devices
to fail. ICMP is a protocol used to transmit error messages but also
includes the basic troubleshooting command PING. The PING command can
be used by hackers as a denial-of-service attack as well as a way to
discover IP addresses and to see which devices are operational. An
ICMP proxy function can detect continuous pinging as in a
denial-of-service attack, or scanning IP addresses to discover
devices.

DMZ
Connecting the automation system to the Intranet and Internet requires
tight security and it should only be done if really necessary. If
connected, consider permitting only read access. If external writes to
controllers, actuators, etc. are to be permitted, even tighter security
has to be in place. One way to provide good security is to provide this
interface through Web technologies and place the Web server in a
De-Militarized Zone (DMZ).

A DMZ is not really a firewall feature per se but more about where in
the network architecture the firewall is connected. A DMZ is formed by
connecting two firewalls back-to-back and refers to the network between
the firewalls, also known as a perimeter network (see Figure 10-15).
However, many firewalls and routers do have special features built in
to simplify creation of a DMZ. One purpose of the DMZ is to establish a
clear demarcation line for areas of responsibility between connected
networks, such as between IT and automation. Another purpose is to
protect the internal private network while at the same time providing
outside access to a Web server. A DMZ is also a good location for
wireless access points. For a simple network that does not connect to
other departments and does not include Web servers or wireless access,
a DMZ may not be required.

Figure 10-15. Simplified Schematic of De-Militarized Zone (DMZ)

The typical IT application for a Web server is simply to serve up
static pages with product data and prices. For this purpose, Web
servers are placed outside the firewall and therefore no access is
given back into the corporate networks through the Web server
connection. This makes it easier to keep the private network secure.
The worst that can happen is that the Web server is interrupted.
Automation and e-business are more difficult because the Web server
must connect not only to the Internet to make information available,
but also to the corporate network to constantly refresh with up-to-date
information. For this scenario, a DMZ is one solution.

The basic rule for establishing a DMZ is that ports opened in the
two firewalls are mutually exclusive so a single protocol cannot
go right through both. For example, if the firewall facing the
external network has port 80 open, the firewall facing the internal
network should not have port 80 open. The Web portal must
access data on the internal private network through another port,
such as 30080, or use HTTPS on port 443.
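
The mutual-exclusion rule is easy to check mechanically when reviewing
a pair of firewall configurations. The following sketch simply compares
the two sets of open ports; the function name ports_open_in_both and
the port assignments are hypothetical.

```python
def ports_open_in_both(outer_open, inner_open):
    """Return the ports open in both firewalls; an empty set satisfies the rule."""
    return set(outer_open) & set(inner_open)


# Hypothetical port assignments matching the example in the text.
outer_firewall = {80}      # external clients reach the Web portal over HTTP
inner_firewall = {30080}   # the portal fetches plant data on a different port

print(ports_open_in_both(outer_firewall, inner_firewall))   # prints set()
print(ports_open_in_both({80, 443}, {80}))                  # prints {80}
```

Any port reported by the check is one through which a single protocol
could pass straight through both firewalls, and should be closed on one
side.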

Make sure to use well proven Web server software, and keep the
software current by applying any security patches that come out
as soon as possible.

Keep in mind that, by permitting writes to the automation system from
the corporate Intranet or public Internet, devices and the process they
control may change. The consequences may be very serious in a plant.

There are two basic ways to connect a Web server in a DMZ: to the outer
router or the inner firewall.

DMZ for Inner MES Portal


When providing automation data to the execution level through
an MES portal, a suitable architecture may be to use one port on
the outer router and two ports on the inner firewall. The Web
server connects through the second port on the inner firewall
(Figure 10-16).

Figure 10-16. DMZ between Automation and Execution for Inner MES Portal

DMZ for Outer ERP Portal


E-business forces manufacturers to provide online order status
data to customers and permits suppliers to check inventory to
automate reordering. The ERP firewall is the responsibility of the
IT department, not the automation engineers, but is included here
as an example of another connection of a DMZ. The data is best
made available through a Web server, while at the same time the
internal private network must be secured. The firewall should
primarily protect the internal private network. For this scenario,
the best architecture may be to use two ports on the outer router
and one port on the inner firewall. The Web server connects
through the second port on the outer router (Figure 10-17).

Figure 10-17. DMZ between Business LAN and the Internet for Outer ERP
Portal

VPN
Encryption is not required on the automation network, since it is
securely contained within the plant perimeter, factory, or building.
However, data transmitted over a public network such as the
Internet should be encrypted to ensure the data remains confiden-
tial and to prevent any attempt to tamper with the data. Virtual
Private Network (VPN) is a means to provide secure access to a
private network for a limited number of well-known clients or
networks. It permits one private network or client to securely
connect to another private network across a public network such
as the Internet using encrypted communication. VPN also controls
access by requiring authentication such as user name, password,
and an optional additional authentication such as smart card.
Authentication and authorization are explained in the section on
software security.

There are a few different protocols for VPN, but the most common
are Point-to-Point Tunnelling Protocol (PPTP) and IP Security
Protocol (IPSec). Single client computers typically use PPTP when
dialing into a Remote Access Server (RAS) whereas IPSec is used
when connecting LAN to LAN through routers. Although VPN provides
good security, a drawback is that client software must be configured
on the client computer when using the PPTP mode.
Connecting LAN to LAN requires network administrator skills.
VPN cannot be used ad-hoc from any computer. This limits the
flexibility for any customer to access when required. It may also
be a hindrance if access is urgently required in a crisis.

VPN uses public key cryptography, which requires a much longer key
than symmetric key cryptography and is also much slower to encrypt and
decrypt. The advantage of using public keys is that distribution of
the key is easier and more secure.

Split ‘Air Gap’ Application Firewall


A split reverse proxy, a.k.a. “air gap” application firewall, only
supports the HTTP protocol and therefore is only suitable when
protecting Web servers. However, the most secure setup for the
automation network is exactly that: access using HTTP only.

A regular application proxy is software that runs on a regular
operating system and Web server software on a computer connected to
both the external and internal network at the same time. A hacker can
conceivably discover an operating system or Web server vulnerability
that permits the hacker to breach the security from the external
network and use the proxy to stage an attack against resources on the
internal network.

An air gap application firewall is split into one external server and
one internal server with a switched buffer in between. The switch
is connected to either server at any one time, never to both at the
same time (Figure 10-18). Technically, there is never any direct
connection from the external through to the internal network. The
firewall connects to the external network through the external
server. External messages are terminated at the external server,
which removes the message header from the message. The switch
transfers only the application data, taking away the ability to
address specific resources on the internal network. Filtered appli-
cation data is sent to a designated Web server on the internal
network by the internal server. The internal server generates a
totally new IP session to pass only the application data to the
internal Web server.

Figure 10-18. Air Gap Switch

Regular application proxies use negative-logic filtering based on
“signatures” of known attacks. They recognize the signatures and stop
these known forms of attacks. Therefore, in a regular application
proxy, anything that does not explicitly look invalid is considered
valid and allowed to pass through. Thus, new, previously unseen attack
methods may be successful even if extreme care was taken when
configuring the firewall access lists. Moreover, intrusions may look
like legitimate HTTP or HTTPS messages and go undetected.

Positive-logic filtering uses signatures for valid requests: only
messages that explicitly look valid are allowed to pass through, and
anything unknown is blocked. The filter signatures are based on the
application layer protocol, URL, and parameters included in the
request. Filters reject overly long or malformed URLs and requests
where values are out of range. The filters can be created manually,
or automatically through a learning process and then edited
manually. Application attacks such as buffer overflow and code or
script injection are thus countered. Without positive filters, care
must be taken when designing Web pages.
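
The positive-logic principle can be illustrated with a minimal Python
sketch. The paths, parameter names, value ranges, and length limit
below are hypothetical and are not taken from any particular air gap
product; the point is only that a request passes when it matches an
explicit description of a valid request and is blocked otherwise.

```python
import re
from urllib.parse import parse_qs, urlsplit

MAX_URL_LENGTH = 200  # assumed limit; overly long URLs are rejected

# Hypothetical signatures of valid requests:
# path -> {parameter name: (lowest valid value, highest valid value)}
VALID_REQUESTS = {
    "/status": {},
    "/trend": {"tag_id": (1, 500), "hours": (1, 24)},
}


def is_valid_request(method: str, url: str) -> bool:
    """Return True only if the request matches a known-valid signature."""
    if method != "GET" or len(url) > MAX_URL_LENGTH:
        return False
    parts = urlsplit(url)
    rules = VALID_REQUESTS.get(parts.path)
    if rules is None:  # unknown path: blocked by default
        return False
    try:
        params = (
            parse_qs(parts.query, keep_blank_values=True, strict_parsing=True)
            if parts.query
            else {}
        )
    except ValueError:  # malformed query string
        return False
    for name, values in params.items():
        if name not in rules or len(values) != 1:
            return False  # unknown or repeated parameter
        if not re.fullmatch(r"[0-9]{1,6}", values[0]):
            return False  # anything but a short number is rejected
        low, high = rules[name]
        if not low <= int(values[0]) <= high:
            return False  # value out of range
    return True
```

Because the default answer is "blocked," an injection attempt does not
need its own signature; it fails simply because it is not on the list
of valid requests.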

The communication on the external network is SSL encrypted, for
example using HTTPS. The HTTPS protocol is supported by most Web
browsers. The internal server handles the encryption and decryption,
permitting all the applications on the internal network to use
regular HTTP communication. Moreover, the SSL certificate and key can
reside on the secure internal network instead of in a non-secure DMZ.

Web pages delivered from Web servers to browsers usually contain the
name or address of data sources on the internal network, permitting a
hacker to figure out the internal network architecture. These names
or addresses can instead automatically be translated by the air gap
to something entirely different in the pages provided to clients,
with client requests conversely translated back, thus hiding the
internal architecture. Without translation, great care must be taken
when designing Web pages.

When accessing data on the internal network from the external network
through the air gap, the user is prompted to enter a login name and
password on a Web page.

Other Features
Several other features in firewalls and some routers are available
and additional measures can be taken to improve network security.

Limiting and Time-outs


A router protects the internal network from denial-of-service (DoS)
attacks, but may itself be subject to a DoS attack. Such an attack
could overload the router, preventing users on the internal network
from reaching the external network, such as the Internet. To counter
DoS attacks on the router itself, it is possible to configure the
maximum number of connections as well as the maximum rate of new
connections being established.
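
The two limits can be sketched in a few lines of Python. This is an
illustration of the principle only, not the configuration syntax of
any router; the class name and the one-second window are assumptions.

```python
from collections import deque


class ConnectionLimiter:
    """Admit a new connection only while both configured limits hold."""

    def __init__(self, max_connections: int, max_new_per_second: int):
        self.max_connections = max_connections
        self.max_new_per_second = max_new_per_second
        self.active = 0
        self._recent = deque()  # admission times within the last second

    def try_open(self, now: float) -> bool:
        # Forget admissions that are at least one second old.
        while self._recent and now - self._recent[0] >= 1.0:
            self._recent.popleft()
        if self.active >= self.max_connections:
            return False  # too many connections open at once
        if len(self._recent) >= self.max_new_per_second:
            return False  # new connections are arriving too fast
        self._recent.append(now)
        self.active += 1
        return True

    def close(self) -> None:
        if self.active > 0:
            self.active -= 1
```

A flood of connection attempts is then refused once either limit is
reached, leaving capacity for sessions that are already established.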

Alert
Firewalls can alert administrators to a possible attack when a DoS
attack or intrusion is detected, for example by Email or pager, so
that the administrator can act.

Logging
Firewalls can log activity to create an audit trail. This includes
logging external DoS and intrusion attacks as well as logging
legitimate access made from the internal network, such as sites
visited. The network administrator configures triggers in the fire-
wall that log events as they occur. Triggers are selected and
configured based on specific site needs. The log generates an
audit trail that can alert a network administrator to an ongoing
attack and may later be helpful to analyze a successful or unsuc-
cessful attack.

Anti-Virus Scanning
A careless user can inadvertently transfer viruses and other
malicious software to the automation system by Email, Web browsing,
or file transfer. Therefore, Email, Web browsing, and file transfer
from computers in the automation system should be blocked, for
example by packet filtering in the firewall. If Email is used for
alarm and event reporting to the system, as in some SCADA solutions,
Email must of course be enabled. However, packet filtering and
proxies cannot tell a good attachment from a bad one. Therefore, if
Email is necessary for the automation system, make sure to block the
use of attachments.

Some firewall software that executes on computers can perform virus
checking on HTTP, SMTP, and FTP traffic. Virus scanning software
detects viruses by comparing incoming data against the signatures of
known viruses in its virus definition database. For virus scanning
software to be effective, the latest virus definitions have to be
loaded. Virus definitions can be downloaded through the Internet, but
the development of virus definitions always lags the creation of a
new virus, so there is always a period of time during which a new
virus is not detected.

Virus scanning can be done from a server at the perimeter of the
automation system on incoming traffic for the network, or by software
on each individual workstation. It may be a good idea to do virus
scanning at the perimeter of the automation system because not every
automation system vendor permits users to install anti-virus software
on computers.

Physical Access
Routers and firewalls provide the network security, and LAN
switches are the basis for the network infrastructure. If these
devices are switched off or damaged, the entire network is
brought down. Similarly, if routers or firewalls are tampered with
or bypassed, security is compromised. Therefore, it is also neces-
sary to consider the physical security of the network hardware.
Keep firewalls, routers, switches, etc. locked in a closet. Proxies
and other servers should likewise be locked up.

Configuration and Management


Routers can be remotely managed, but make sure hackers can’t
access them. Routers do not support strong authentication for
access. Therefore, the most secure way to manage routers is to
only permit configuration through a terminal directly connected
through a console port.

For firewalls, if remote configuration is at all enabled, use SSL to
encrypt the communication, plus authentication to prevent hackers
from changing the configuration.

SNMP Considerations
Managed LAN switches and many Ethernet devices can be moni-
tored and configured using the Simple Network Management
Protocol (SNMP). The first versions of SNMP can pose a security
risk because they do not have encrypted authentication, making it
possible to sniff out the “community string” used as a password.
After gaining access, a hacker can use SNMP to discover network
properties and even reconfigure the network. Some network
devices that use early versions of SNMP, therefore, do not imple-
ment the configuration commands. Version 3 of SNMP supports
authentication and encryption, making it more difficult to break.
Either way, it may be a good idea to disable SNMP in the devices and
block it with packet filters, or to use access lists to limit the
users that can access devices using SNMP. Make sure to change the
default community name to a proper password.

SNMP traps can be set in switches to detect new links, and network
management software can discover, for example, if a laptop or other
new device is connected to the network.

Modem and RAS


Modems can pose a security risk, since they connect externally
and thus can be a point of entry for a hacker. In particular, Remote
Access Server (RAS) functionality permitting external dial-in to a
network may be a security risk. Therefore, preferably only permit
dial out, and do not permit the modem to answer incoming calls.

If RAS is used, for example to permit remote maintenance and
management, there are some measures that can be taken. These include
disconnecting if the login name or password is wrong more than three
times, using caller ID so that the calling number is checked and
logged, and using callback on a different line with a limited list of
permitted numbers. If callback is done through an exchange, it should
drop the incoming line before dialing out.
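
A minimal Python sketch of these dial-in rules follows. The class and
function names are hypothetical and the telephone numbers are
fictional; a real RAS product exposes such rules as configuration
settings rather than code, and the separate callback line is not
modeled here.

```python
from dataclasses import dataclass, field

MAX_FAILED_LOGINS = 3  # disconnect when this count is exceeded


@dataclass
class DialInSession:
    caller_id: str
    failed_logins: int = 0
    log: list = field(default_factory=list)

    def record_failed_login(self) -> bool:
        """Log the failure; return True if the call must be dropped."""
        self.failed_logins += 1
        self.log.append(f"failed login from {self.caller_id}")
        return self.failed_logins > MAX_FAILED_LOGINS


def callback_number(caller_id: str, permitted: set):
    """Return the number to call back, or None if it is not permitted."""
    return caller_id if caller_id in permitted else None
```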

WEP
Wireless networks enable several attractive possibilities using
mobile handheld computers for data entry and interrogation while
moving about the plant, factory, or building. However, in spite of
Wired Equivalent Privacy (WEP) encryption, wireless networking
should generally be considered insecure and should only be used with
caution when other options are not possible. Wireless networks are
vulnerable to eavesdropping and denial-of-service attacks.

If a wireless network is used, make sure to connect wireless access
points outside the private network firewall, in a DMZ. Do not put
wireless access points on the control network.

A serious problem with wireless access is that the user does not
really know to which Web server or network the computer has
connected. It is possible that an unsuspecting user will unwittingly
enter a secret user name and password on a Web page from a server set
up by a hacker right outside the premises for the purpose of tricking
users into revealing secret information. The hacker then uses the
identification to gain access to network resources.

Another concern is that some employee, for convenience, connects a
wireless access point inside the firewall. This permits an outsider
to connect directly to the private network, bypassing all network
security. SNMP traps can be set in switches to detect new links, and
network management software can be used to monitor the network and
alert administrators.

Software Security
Software security primarily includes authentication and authori-
zation. Software security is built into computer operating systems,
Web server software, and other software applications. Software
security is required in addition to network firewall measures.

Authentication and Authorization


Security in operating systems such as Windows uses a concept of Users
and User Groups that are assigned different access rights to files
and software components. Rights are typically assigned based on
roles, such as operators, engineers, and managers.

Password Authentication
Authentication means a user is identified. The simplest form of
authentication requires logon with user name and password as the
credentials. If passwords are not managed properly, they are
surprisingly easy to break. A password policy is an essential part
of the overall security policy. Login name and password are also used
in some network security schemes such as VPN and dynamic access
lists. Passwords may also be used as a key in encryption.

Windows NT security can be based on the domain or the workgroup
network access model, depending on the system philosophy. In the
workgroup model, the user is authenticated in every workstation
accessed. In the domain model, the authentication is done in a
central domain controller.

Single Sign-On (SSO) enables a user to log in once to gain access to
multiple computers across a network. Usually, operator software makes
use of the operating system security for logging in and
authentication of users. The security configuration should be based
on the plant’s security policy. Generally, groups of users are
created for operators, supervisors, engineers, administrators, etc.,
each having different rights. Individual users and groups can be
assigned different privileges to access files and applications.

Make sure applications used in a networked environment encrypt
passwords before sending them across a network or storing them.
Operating systems such as Windows NT encrypt all passwords, making
them inaccessible even to the administrator.

Guessing a password is far simpler than trying to break an encryption
algorithm or operating system. Some simple applications and embedded
devices support only letters or numbers. Numeric passwords are quick
to break using brute force to try every combination. The number of
characters a password may contain depends on the operating system and
software; for Windows NT it may be 127 characters, permitting an
entire phrase. Programs for automatic password cracking exist that
use a number of very large “password dictionaries,” trying popular
passwords one by one. These are capable of breaking virtually any
single-word password in most languages, and even phrases such as
lyrics and scripts, in a matter of days, hours, or minutes. Common
words and phrases should be avoided. Modified capitalization, altered
spelling, and replacing letters with numbers do not help much. Strong
passwords can take months to crack. Do not write passwords down and
do not tell them to anyone. It is better if the applications generate
passwords that the users have to learn, because users tend to select
weak, easy-to-remember passwords and to reuse the same password in
many places, potentially giving access to many resources if it is
found out.
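
The effect of alphabet size and length on a brute-force search is
simple arithmetic, sketched below in Python. The guessing rate is an
assumption chosen only for illustration; real rates vary enormously
with the attacker's hardware and with how the password is stored.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly the given length."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size: int, length: int,
                       guesses_per_second: float) -> float:
    """Time to try every combination at an assumed guessing rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


GUESSES_PER_SECOND = 1_000_000  # assumption for illustration only

for label, alphabet, length in [
    ("4-digit number", 10, 4),
    ("7 lowercase letters", 26, 7),
    ("7 mixed characters", 94, 7),  # 94 printable ASCII characters
]:
    seconds = worst_case_seconds(alphabet, length, GUESSES_PER_SECOND)
    print(f"{label}: {keyspace(alphabet, length):,} combinations, "
          f"{seconds / 86400:.2f} days worst case")
```

This arithmetic covers exhaustive search only. A dictionary attack
skips most of the keyspace, which is why a long but common word or
phrase is weaker than its length suggests.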

A strong password should:
• Be at least seven characters long
• Be complex:
  - Contain at least one lowercase letter
  - Contain at least one uppercase letter
  - Contain at least one symbol character:
    `~!@#$%^&*()_+-={}|[]\:";'<>?,./
  - Contain at least one number
• Not be a common word or name
• Not be a common number or date
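
The checklist can be expressed as a short Python function. This is a
sketch under stated assumptions: the word list is a tiny hypothetical
stand-in for a real dictionary, and the "common number or date" rule
is approximated by rejecting passwords made up only of digits.

```python
import string

# Tiny illustrative list; a real check would use a large dictionary.
COMMON_WORDS = {"password", "welcome", "operator", "admin"}


def password_problems(password: str) -> list:
    """Return the checklist rules that the password violates."""
    problems = []
    if len(password) < 7:
        problems.append("shorter than seven characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol character")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if password.lower() in COMMON_WORDS:
        problems.append("common word or name")
    if password.isdigit():
        problems.append("only a number or date")
    return problems
```

An empty list means the password passes every rule in the checklist;
it does not by itself prove that the password is hard to guess.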

Passwords in Automation
In an office environment, there are many computers, people with
different roles and responsibilities, and many outside visitors.
Therefore, computers in an office require strict security. In
contrast, operator workstations in many installations remain
permanently logged on, and all operators share the same pool of
computers. Often individual passwords are not used; instead all
operators share one login name and password, and the password may
never expire, or there may not even be a password.

In this respect an automation system is different from the IT world,
and this must be reflected in the security policy for the automation
system. In many applications, the password policy must not under any
circumstances lock out the operator. Accessibility is critical in an
automation system, and therefore the user should not be allowed to
change the password. Although policies such as locking access for a
period after a number of unsuccessful attempts are common in the IT
world, they may not be fit for the automation system. In a crisis the
operator may fail several times to log in by hastily mistyping the
password, or by not noticing that the caps lock is on. Similarly, if
the operator has difficulty recalling the password, the operator
will be locked out and unable to act. Serious accidents that could
otherwise be avoided may result. Likewise, in the IT world, passwords
expire every few months, cannot be changed more often than once a
day, and cannot be reused over several iterations. This may not fit
the automation system policy. Because the computers effectively do
not use passwords, there must instead be restrictions on persons
entering the control room, such as some form of lock.

However, in a plant that must conform to FDA 21 CFR Part 11 for
electronic records and signatures, the stricter IT world policy must
be used also for the automation system. Similarly, if some server,
such as a Web portal, can be accessed from the Intranet or Internet,
it needs a stricter policy. For example:
• User names must be unique and individual, cannot be
repeated or shared among users
• Force regular password change
• Lock after five failed attempts
• Do not use the same password in many different places
• Enforce minimum password length
• Enforce mix of numbers and characters
• Prevent rotating between a few passwords

An operating system such as Windows contains features that enforce
many aspects of the password policy (Figure 10-19).

Figure 10-19. Password Policy Enforcement (Screenshot: Microsoft Windows)

Beware that some software applications transmit passwords
unencrypted. Make sure to use software that encrypts passwords that
are transmitted or stored.

Strong Authentication
Because security built on static reusable passwords has proven easy
for hackers to beat, simple passwords are not sufficient for remote
access to the automation system. Strong authentication uses two
factors of identification: something that you know and something
that you have. One is typically a password; the other may be a
biometric or a token generated from hardware or software. One form
of token is a key fob or smart card that generates a new unique
authentication code every minute, and which therefore is not
reusable. Tokens are ideal for remote access control. Biometric
authentication includes retina scanning and fingerprinting, the
latter of which may be built into a computer mouse. Two-factor
authentication is recommended for remote access, particularly when
the access also permits writes.
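
How a token can produce a code that changes every minute is sketched
below in Python. The text does not say which algorithm a given key
fob uses; this sketch assumes the publicly specified HMAC-based
one-time password of RFC 4226, driven by a time counter in the manner
of RFC 6238, with the one-minute step taken from the description
above. The shared secret is the RFC test value, not a real key.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password as specified in RFC 4226."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def token_code(secret: bytes, unix_time: float, step_seconds: int = 60) -> str:
    """Code shown by the token during the time step containing unix_time."""
    return hotp(secret, int(unix_time // step_seconds))


def verify(secret: bytes, submitted: str, unix_time: float,
           step_seconds: int = 60) -> bool:
    """Server-side check of the code the user typed in."""
    expected = token_code(secret, unix_time, step_seconds)
    return hmac.compare_digest(expected, submitted)
```

The token and the server share the secret and a clock, so a code
captured by an eavesdropper is useless once the time step has passed.
The password, the second factor, is checked separately.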

Authorization
Authorization means a logged-in user is granted certain rights
and access. Windows NT security can be based on domain or
workgroup network access model, depending on the system
philosophy. The domain model has a single access control list and
is therefore easy to manage. Single Sign-On (SSO) enables a user
to log in once to gain access to the operating system and many
applications, provided the applications have security integrated
with the operating system.

Rights for the operating system and applications should be assigned
on a need-to-do and a need-to-know basis, such as based on the role
of each user or group. Access rights should be tailored according to
responsibilities. Make authorization as restrictive as practically
possible. Few users on the Intranet need access to the automation
system, and those that don’t should not have privileges. Similarly,
not all users on the automation system need access to the Intranet or
Internet.

When an application such as the operator visualization software for
the operator console is integrated with the operating system, such as
Windows, the operating system handles the authentication while the
application takes care of the authorization for the different
features of the software.

Rights can be granted on an individual user or group basis. Using
groups, rights of roles such as operators, engineers, and technicians
can be changed independently of the actual persons to match changing
responsibilities of each role (see Figure 10-20). The rights of
persons are changed automatically if they are transferred from one
role to another, or join another group. Establishing new user
accounts for persons that start in a new role is greatly simplified.

Figure 10-20. Users and Group Accounts (Screenshot: Microsoft Windows)

The operating system rights assigned to users and groups include
basic capabilities, ranging from logging onto the computer to
permissions to manage security (see Figure 10-21). The process
operator group may only need rights to log on and to access the
computer remotely. A backup operator group will need rights to back
up and restore files and directories. A system engineer group may
also need the ability to set the time and shut the system down.
Lastly, the system administrator should also be able to manage
security and take ownership of files, etc. In other words, users with
administrator rights have full access to the system, whereas
operators and engineers only have access to the resources they need
to do their jobs.

Figure 10-21. Operating System Rights (Screenshot: Microsoft Windows)



To enhance security:
• Remove all unused accounts, especially those with adminis-
trator rights
• Disable the guest and anonymous accounts
• Review and disable or remove accounts for unauthorized
users, such as staff that leave
• Minimize the number of accounts with administrator
rights; use non-privileged accounts as much as possible
• Use non-privileged accounts for services and continuously
running applications
• Minimize the number of accounts for services and applications
that are not associated with a person
• Remove default accounts and accounts created by system
integrators, etc.

The operator visualization security adds several important features
on top of the operating system security. For example, it is common
that several operators work from the same console. Therefore, the
operator visualization security allows multiple users to be logged in
simultaneously, resulting in the effective rights of all the users
combined. As with the operating system, security can be configured
for individuals or for groups of users. A user can belong to multiple
groups and have the effective rights of all the groups combined.
Rights can typically be granted and denied to workstations,
application actions such as adding pens to a trend, files such as
displays, and writing parameters. By default, systems often have no
security at all, granting full access to all functions. As a first
step when configuring security, therefore, remove these rights.
Typically the automation software uses the same login name and
password as the operating system to eliminate the need for two
logins.
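
The idea of effective rights as the combination of several groups,
and of several users logged in at one console, can be sketched in
Python with set unions. The group, user, and right names are
hypothetical, and explicit "deny" entries, which real products may
also support, are left out of this sketch.

```python
# Hypothetical groups and the rights granted to each.
GROUP_RIGHTS = {
    "operators": {"view_displays", "acknowledge_alarms", "write_setpoint"},
    "engineers": {"view_displays", "write_tuning", "edit_trends"},
    "supervisors": {"view_displays", "acknowledge_alarms", "disable_alarms"},
}

# Hypothetical users and their group memberships.
USER_GROUPS = {
    "alice": {"operators"},
    "bob": {"engineers", "supervisors"},
}


def user_rights(user: str) -> set:
    """Effective rights of one user: the union over all of the user's groups."""
    rights = set()
    for group in USER_GROUPS.get(user, set()):
        rights |= GROUP_RIGHTS.get(group, set())
    return rights


def console_rights(logged_in_users) -> set:
    """Effective rights at a console where several users are logged in."""
    rights = set()
    for user in logged_in_users:
        rights |= user_rights(user)
    return rights
```

An unknown user receives the empty set, which matches the advice to
start from no rights and grant only what each role needs.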

Access can also be authorized based on the physical console used. For
example, selected workstations on the network can be denied access to
some functions or certain areas of the plant, factory lines, or
building floors. In this way access can be limited from a console on
the plant floor or in a satellite control room.

Application actions such as configuration of trend pens, logs and
reports, and graphics are also authorized by security (Figure 10-22).

Figure 10-22. Application Accesses Authorization (Screenshot: SMAR SYSTEM302)

Files are protected by denying access for unauthorized users or
groups. Configure which displays, etc. individual users and groups
can access by granting or denying access to the associated file.
Thus, certain users can be authorized to access certain plant,
factory, or building areas or system status, while others are not.

Parameter access can be authorized based on the associated OPC server
and item name, such as tag and parameter name. Thus, it is possible
to deny selected users and groups the right to write to operational
parameters such as mode, set point, output, alarm settings, and
tuning, as well as to acknowledge or disable alarms and to apply
overrides, while permitting it for others (Figure 10-23). The exact
configuration of privileges for the groups depends on the plant
policy and system capability. The lowest level of authorization may
be view-only.
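
A parameter authorization check of this kind can be sketched in
Python as a rule table matched against the OPC server and item name.
The server name, tag names, item patterns, and group names are all
hypothetical, and the rule format is an assumption made for this
sketch, not that of any product.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (group, OPC server, item name pattern, access).
# "write" access implies "read"; anything not listed is denied.
RULES = [
    ("operators", "Plant.OPCServer", "*.SP", "write"),
    ("operators", "Plant.OPCServer", "*.MODE", "write"),
    ("operators", "Plant.OPCServer", "*", "read"),
    ("engineers", "Plant.OPCServer", "*", "write"),
    ("visitors", "Plant.OPCServer", "*", "read"),  # view-only
]


def may_access(groups: set, server: str, item: str, access: str) -> bool:
    """Return True if any group of the user permits the requested access."""
    for group, rule_server, pattern, allowed in RULES:
        if group not in groups or rule_server != server:
            continue
        if not fnmatchcase(item, pattern):
            continue
        if access == "read" or allowed == "write":
            return True
    return False
```

With these rules an operator can change the set point of FIC101 but
not its tuning, while an engineer can change both.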

Figure 10-23. Parameter Authorization (Screenshot: SMAR SYSTEM302)



Parameter access security and audit trail usually only work within
operator visualization software, which typically has more
sophisticated security. Many other OPC client applications do not
support authorization based on users and groups. Therefore, to
prevent unauthorized access, it may be necessary to use other
security features in the operator visualization software to prevent
operators from switching from the visualization software to
applications that do not have access restrictions.

Lastly, access can also be authorized based on time, meaning certain
functions may only be available during selected days and times of
day, for example, not on Saturdays and Sundays (Figure 10-24).
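
Time-based authorization reduces to a check of the current day and
time against a schedule, as in the following Python sketch. The
permitted days and hours are assumptions chosen for illustration.

```python
from datetime import datetime, time

ALLOWED_WEEKDAYS = {0, 1, 2, 3, 4}  # Monday=0 ... Friday=4 (assumed)
ALLOWED_FROM = time(7, 0)           # assumed start of permitted period
ALLOWED_UNTIL = time(19, 0)         # assumed end of permitted period


def access_allowed(now: datetime) -> bool:
    """Return True if the function is available at the given time."""
    on_allowed_day = now.weekday() in ALLOWED_WEEKDAYS
    in_allowed_hours = ALLOWED_FROM <= now.time() < ALLOWED_UNTIL
    return on_allowed_day and in_allowed_hours
```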

Figure 10-24. Scheduled Access (Screenshot: SMAR SYSTEM302)

Operators need to take urgent action and therefore share a single
password on workstations that are always logged on. Since the
computers are always logged on, and operators are not distinguished,
the operators are assigned limited rights. There is no need for
operators to make configuration changes to the automation system.
However, engineers and administrators who have more rights should
always have individual passwords. Engineering stations should
automatically log out after a period of inactivity.

Because only automation staff understands the automation software and
how it affects the plant, factory, or building, the normal IT staff
should not have administrator rights on the computers in the
automation system. The automation system should be on a separate
subnet from the rest of the enterprise, separated by routers and
firewalls which clearly demarcate the separate areas of
responsibility.

Challenge Response Authentication


Another form of authentication is a response to a challenge from the
software. This may, for example, be used to unlock files that have
been locked down due to too many failed attempts or when the user
cannot recall the password. The user then calls the software
manufacturer and reports the challenge. After ascertaining that the
caller is authorized, the manufacturer provides the proper response
for the challenge (Figure 10-25).
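
The text does not describe how the response is derived from the
challenge. One common way to build such a scheme, assumed here purely
for illustration, is a keyed hash over the challenge using a secret
known to the software and the manufacturer. The key, the code length,
and the function names in this Python sketch are all hypothetical.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> str:
    """Random challenge displayed by the locked software."""
    return secrets.token_hex(4).upper()  # eight hexadecimal characters


def expected_response(vendor_key: bytes, challenge: str) -> str:
    """Response the manufacturer reads back to an authorized caller."""
    digest = hmac.new(vendor_key, challenge.encode("ascii"),
                      hashlib.sha256).hexdigest()
    return digest[:8].upper()  # shortened so it can be read over the phone


def unlock(vendor_key: bytes, challenge: str, response: str) -> bool:
    """Check the response typed in by the user."""
    entered = response.strip().upper()
    return hmac.compare_digest(expected_response(vendor_key, challenge), entered)
```

Because each challenge is random, a response overheard during one
call cannot be reused to unlock the software the next time.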

Figure 10-25. Challenge/Response (Screenshot: SMAR SYSTEM302)

Audit Trail
By enabling the audit trail functionality, both successful and failed
attempts to make changes can be logged, whether they are made by
legitimate users or by a hacker. Logs are therefore useful to see
what damage may have been done and assist in repairing it, as well as
to trace the culprit.

User activities can be tracked by auditing security events and then
placing entries in a security log. Operating systems such as Windows
permit selecting the types of security events that will be logged
(Figure 10-26). Both the success and failure of activities can be
logged in the audit trail. Events that can be monitored include logon
and logoff, file and object access, use of rights, user and group
management, security policy changes, restart and shutdown.
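
The selection of which successes and failures to log can be pictured
as a small policy table, as in this Python sketch. The event type
names and the policy values are hypothetical; in Windows the
equivalent choices are made in the audit policy settings rather than
in code.

```python
from datetime import datetime

# Hypothetical audit policy: which outcomes are logged per event type.
AUDIT_POLICY = {
    "logon": {"success": True, "failure": True},
    "file_access": {"success": False, "failure": True},
    "policy_change": {"success": True, "failure": True},
}


def record(log: list, when: datetime, user: str,
           event_type: str, succeeded: bool) -> bool:
    """Append an entry if the policy audits this outcome; report if logged."""
    outcome = "success" if succeeded else "failure"
    if not AUDIT_POLICY.get(event_type, {}).get(outcome, False):
        return False
    log.append({
        "time": when.isoformat(),
        "user": user,
        "event": event_type,
        "outcome": outcome,
    })
    return True
```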

Figure 10-26. Operating System Audit Trail Policy (Screenshot: Microsoft Windows)

An event viewer can be used to view entries logged in the security
log.

Additionally, operator visualization software typically includes an
event logger that records all changes made through access via
operator consoles.

Focus
To prevent operators from launching, switching and shutting
down applications, as well as installing or modifying applications
that could interfere with the automation system, many of the
operating system functions have to be disabled for certain users or
groups of users. In an operating system such as Windows, it is
possible to lock out functionality to prevent a user from doing
things that may disable the computer or system, or distract atten-
tion. Security may be used to ensure operators cannot switch to
other applications having no security, preventing security from
being circumvented.

For example, the Task bar “Start” button may have to be disabled, as
may special keys and key combinations such as the Windows key,
ALT+TAB, CTRL+ESC, and ALT+ESC, and the functions of CTRL+ALT+DEL may
have to be limited (Figure 10-27). This ability to selectively
disable keys is usually built into the operator software, but
generally no disabling is done by default. Therefore, disabling must
be configured based on the plant’s security policy. The administrator
can define for which users or groups of users certain key
combinations should be disabled.

Figure 10-27. Disabled Keys (Screenshot: SMAR SYSTEM302)

It may be possible to define several groups with combinations of
blocked keys corresponding to functions disallowed for operators,
supervisors, and engineers.

Many functions in Windows can only be disabled by editing the
registry. Edit the Windows registry to disable:
• Shutdown button
• Desktop icons
• Find menu
• Logoff button
• Run menu
• Control Panel
• Taskbar
• Tray
• Windows context menu
• Drive Auto-Run
• Folder Options
• Favorites Menu
• Recent Documents
• Active Desktop
• Save on exit
• Network neighborhood
• Drives
• Lock
• Password change

Others
There are several other measures that can be taken to keep the
system as secure as possible. This includes keeping software and
anti-virus definitions up to date.

Anti-Virus Scanning
Not every automation system vendor permits users to install
anti-virus software on servers and workstations, since
incompatibilities may occur when the virus definitions and the
different components of the anti-virus software are updated.
Moreover, the updating process requires connecting to the Internet or
using removable disks, thus exposing the system (Figure 10-28).
Therefore, if Email, Web browsing, or file transfer has to be
enabled, it may be a good idea to do virus scanning at the perimeter
of the automation system. This eliminates the need for scanning on
each station.

Figure 10-28. Live Update of Virus Definitions (Screenshot: Symantec Norton AntiVirus)

The oldest forms of virus propagate through removable media such as
floppy disks. Consider employing locking mechanisms for removable
media.

Upgrades, Patches, and Hot-fixes


A system is best protected if the latest security hot-fixes for the
operating system are installed on the computers. However, some
automation systems may not work after the operating system has been
updated. It may therefore be a good idea to find out from the
software vendor which patches have been tested on the system, and
whether the operating system patches in turn require patches or
changes to the automation software. The same may apply to new virus
definitions for anti-virus software. Because keeping the operating
system and anti-virus software up to date is not straightforward,
automation system administrators may not be able to afford weekly, or
even monthly, updates of security patches on the computers.

Component Security
DCOM provides security for distributed applications, even
though these applications are not specifically designed to be
secure. The security for any application component, such as the
OPC servers, can be customized. DCOM security configuration is
explained in Chapter 3.

Loading ActiveX components from unknown sources can be very risky.
See Chapter 5 for a discussion on this topic.

Cyber Security Policy


Most of all, cyber security is a habit instilled through corporate
culture. Security is not an event; it is a process that is constantly
reviewed and updated. A cyber security policy should be established
containing the security mandates for the automation system. The
policy has to be periodically reviewed and updated.

As per ANSI/ISA-TR99.00.02-2004 - Integrating Electronic Security
into the Manufacturing and Control Systems Environment, a security
policy should include the following areas:
• Legal and regulatory compliance
• Training and certification
• Hiring, evaluating and terminating personnel
• Assignment of appropriate security clearance levels
• Authentication
• Authorization
• Logical rights
• Change control
• Logging
• Passwords
• Accounts
• Modem access
• Other remote access
• Unused resources

Cyber Security Manual


In addition to one-time installation of hardware and software
measures, it is necessary to put procedures in place in order to
ensure the system remains secure. This includes reviewing the
security impact of any changes before they are implemented. A
manual of standard operating procedures may be helpful. Proce-
dures may include, for example, steps for installing new software
on the automation system, updating with patches, use of remov-
able media, and connection of mobile computers from visitors.

Other examples of procedures to be included in a cyber security
manual:
• Security audit procedures
• Incident response procedures
• Contingency procedures
• Disaster recovery procedures

21CFR11 Electronic Records and Electronic Signatures
Throughout all automation areas, automatically generated electronic
records with electronic signatures are taking the place of manual
paper records with handwritten signatures. The benefits of electronic
records are listed in Chapter 2. Industries with processes regulated
by the Food and Drug Administration (FDA) in the USA are held to high
accountability standards that require records of their processes to
be kept. Special regulations apply if the records and signatures are
electronic. Affected industries include pharmaceuticals, beverages,
blood handling, medical devices, food processing, cosmetics, etc.
Similar requirements may exist in other countries.

In the U.S., this regulation is “Code of Federal Regulations, Title
21, Volume 1, Food and Drugs, Chapter I—Food And Drug Administration,
Department Of Health and Human Services, Part 11—Electronic Records;
Electronic Signatures,” or for short simply “21CFR11.” This
regulation does not go into details of applicability and
implementation. It is open to interpretation and some judgement is
therefore required when interpreting the regulation for a specific
application. This section is intended as a generic overview; each
process and software will require some adaptation.

If the site complies with 21CFR11, the electronic records and
signatures are considered equivalent to paper records and
handwritten signatures, and are legally binding. The regulation
relates to tracking changes and alarm acknowledgements made by
operators, and to security functions that restrict access and
authenticate the operator.

Software Considerations
Many operator visualization software applications have been
designed with 21CFR11 in mind, making it easier for system inte-
grators and users to build a system that meets the requirements.

Many of the features explained for software security are used to
meet 21CFR11 requirements. Note that it is not the software that
must comply with 21CFR11 but the user's system implementation.
Once integrated, the system can be validated by a third party.
However, software designed for 21CFR11 makes it easier to meet
the regulation by reducing integration and validation costs.

If any software application involved in the automation project
does not meet the requirements of 21CFR11, the required functions
must be performed manually. For example, if the engineering tool
used does not contain an automatic audit trail, changes must be
tracked manually. The more built-in automated functions the
software has, the less remains to be configured in scripts and
done manually according to procedures.

Authentication and Authorization
The 21CFR11 regulation requires that system access be limited to
authorized individuals. Users should be identified using
biometrics or a user name and password. Authentication and
authorization using user name and password is supported in
operating systems such as Windows. Third-party biometric devices
such as thumbprint scanners come with drivers to integrate with
the operating system security.

Operators must have individual, unique user names for login. Many
automation systems let all operators share the same user name,
but this is not permitted in FDA-regulated manufacturing.
Managing multiple user accounts is not difficult, provided user
groups are used. Moreover, user names should not be reassigned or
reused. For example, if one “John Doe” leaves the company and
another “John Doe” joins, the second person must have a slightly
modified user name. Uniqueness is best ensured by never deleting
user accounts; simply disable unused accounts instead. This
prevents future users from being confused with past users.
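
The rule of never deleting or reusing a user name can be sketched
in a few lines of Python. This is only an illustration, not how
any particular operating system or visualization package manages
accounts; the class names, the method names, and the "jdoe" and
"jdoe2" user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    user_name: str
    full_name: str
    enabled: bool = True


@dataclass
class AccountRegistry:
    """Hypothetical registry: user names are never deleted or reissued."""
    _accounts: dict = field(default_factory=dict)

    def create(self, user_name: str, full_name: str) -> Account:
        key = user_name.lower()
        if key in self._accounts:
            # Disabled accounts still occupy their user name.
            raise ValueError(f"user name {user_name!r} has already been issued")
        account = Account(user_name, full_name)
        self._accounts[key] = account
        return account

    def disable(self, user_name: str) -> None:
        # Disable instead of delete, so past records stay attributable.
        self._accounts[user_name.lower()].enabled = False

    def is_active(self, user_name: str) -> bool:
        account = self._accounts.get(user_name.lower())
        return account is not None and account.enabled


registry = AccountRegistry()
registry.create("jdoe", "John Doe")
registry.disable("jdoe")                 # the first John Doe leaves

try:
    second = registry.create("jdoe", "John Doe")
except ValueError:
    # The second John Doe gets a slightly modified user name.
    second = registry.create("jdoe2", "John Doe")
```

Because the disabled account stays in the registry, an audit trail
entry signed "jdoe" can only ever refer to the first person.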

Operating system and operator visualization software can be
configured to automatically log out the user after a certain
period of time. This ensures the operator at the console must
authenticate on a regular basis.

Passwords must be changed on a regular basis. Operating system
and operator visualization software can be configured to
automatically force the operator to select a new password
periodically as the old one expires.

Unauthorized use of accounts must be prevented and reported.
Operating system and operator visualization software can be
configured to automatically lock out an account after several
failed login attempts. Both also generate events that are logged
in their respective event logs for review by the administrator.
Moreover, it is possible to tie in e-mail, mobile messaging,
etc., to alert the persons in charge.
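
The lockout and password-aging behavior described above can be
sketched as follows. In practice these are configuration settings
of the operating system or visualization software rather than
custom code; the threshold of three attempts, the 90-day password
age, and all names below are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MAX_FAILED_ATTEMPTS = 3                 # assumed lockout threshold
PASSWORD_MAX_AGE = timedelta(days=90)   # assumed password lifetime


@dataclass
class LoginState:
    user_name: str
    password_set_at: datetime
    failed_attempts: int = 0
    locked: bool = False
    events: list = field(default_factory=list)  # stands in for the event log

    def record_attempt(self, success: bool, now: datetime) -> str:
        if self.locked:
            self.events.append(f"{now.isoformat()} {self.user_name}: attempt on locked account")
            return "locked"
        if not success:
            self.failed_attempts += 1
            self.events.append(f"{now.isoformat()} {self.user_name}: failed login")
            if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
                self.locked = True
                self.events.append(f"{now.isoformat()} {self.user_name}: account locked out")
                return "locked"
            return "denied"
        self.failed_attempts = 0
        if now - self.password_set_at > PASSWORD_MAX_AGE:
            return "password_expired"   # user must choose a new password
        return "ok"


state = LoginState("jdoe", password_set_at=datetime(2024, 1, 1))
now = datetime(2024, 1, 10)
results = [state.record_attempt(False, now) for _ in range(3)]
after_lockout = state.record_attempt(True, now)

aged = LoginState("asmith", password_set_at=datetime(2024, 1, 1))
aged_result = aged.record_attempt(True, datetime(2024, 6, 1))
```

The `events` list is where a real system would write to its event
log and, if so configured, trigger an e-mail or mobile message.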

More details on authentication and authorization, including
policy for password length and expiration, etc., can be found in
the section on software security.

Audit Trail
Operator actions during production can have a significant impact
on the product. When operators make entries, each entry must be
logged in a time-stamped audit trail. The application may either
log the currently logged-in user as performing the change or, for
critical points, pop up a dialog box requiring the user to
authenticate once more to confirm the change before the action is
completed. Optionally, a second person may be required to sign
electronically (see Figure 10-29).

Figure 10-29. Electronic Signature (Screenshot: GE Fanuc iFix)

Operator visualization software typically includes an OPC-A&E
server that automatically generates an event whenever an operator
keys in a value or manipulates other controls such as sliders and
pushbuttons. The event is captured by an OPC-A&E logger client
that records the tag name, original and new value, operator name,
time-stamp, computer name, etc., using a secure database engine
(see Figure 10-30).

Figure 10-30. Audit Trail in A&E Log ActiveX Viewer (Screenshot: SMAR
SYSTEM302)

Operator entries such as changed set points, modes, and limits, as
well as alarm acknowledgements, are logged along with logins and
logouts. The audit trail makes it possible to trace back through
all changes to the original data.
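
As a rough illustration of what such an audit trail holds, the
sketch below records the fields named above (tag name, original
and new value, operator name, time-stamp, and computer name) in an
append-only list and traces a tag back to its original value. It
does not use OPC-A&E; the class names, tag name, and workstation
names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)          # frozen: a logged event cannot be altered
class AuditEvent:
    timestamp: datetime
    tag_name: str
    old_value: float
    new_value: float
    operator: str
    computer: str


class AuditTrail:
    """Append-only log of operator changes (illustrative only)."""

    def __init__(self):
        self._events = []

    def log_change(self, tag_name, old_value, new_value, operator, computer,
                   timestamp=None):
        event = AuditEvent(timestamp or datetime.now(timezone.utc),
                           tag_name, old_value, new_value, operator, computer)
        self._events.append(event)
        return event

    def history(self, tag_name):
        # Events are returned in the order they were logged.
        return [e for e in self._events if e.tag_name == tag_name]

    def original_value(self, tag_name):
        events = self.history(tag_name)
        if not events:
            raise KeyError(tag_name)
        return events[0].old_value


trail = AuditTrail()
trail.log_change("FIC-101.SP", 50.0, 55.0, "jdoe", "OPS-01")
trail.log_change("FIC-101.SP", 55.0, 52.5, "asmith", "OPS-02")
original = trail.original_value("FIC-101.SP")
operators = [e.operator for e in trail.history("FIC-101.SP")]
```

Each change carries both the old and the new value, so the chain
of events can be followed back step by step.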

A concern about electronic record keeping is electronic fraud, as
data on networked computers is accessible to more people. It must
not be possible to falsify or destroy records in the database
without a trace, permanently losing the true original data.
Several security features are built into database engines to
resist tampering. The database engine itself must also include an
audit trail to trace any changes made directly to the data logged
in the database, such as modification or deletion of data.
Sophisticated SQL database engines are used to ensure that
original data is not truly deleted but can be recovered. Moreover,
database files are compressed in a binary format, making file
tampering difficult. The database engine security ensures the
validity of the data and ultimately the credibility of the
records. Finally, access to the database can be restricted to
authorized users or groups.
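
One way a database can resist modification and deletion of logged
data is to reject such statements outright. The sketch below uses
SQLite triggers from Python purely as a stand-in; it is not the
mechanism of any specific product mentioned in this chapter, and
the table and column names are hypothetical. A production system
would combine something like this with the access restrictions
described above, since anyone able to alter the schema could
remove the triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audit_log (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    logged_at TEXT NOT NULL,
    tag_name  TEXT NOT NULL,
    old_value TEXT,
    new_value TEXT,
    operator  TEXT NOT NULL
);
-- Reject any attempt to change or remove a logged record.
CREATE TRIGGER audit_log_no_update BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit records cannot be modified');
END;
CREATE TRIGGER audit_log_no_delete BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit records cannot be deleted');
END;
""")

conn.execute(
    "INSERT INTO audit_log (logged_at, tag_name, old_value, new_value, operator) "
    "VALUES (?, ?, ?, ?, ?)",
    ("2024-01-10T08:30:00", "FIC-101.SP", "50.0", "55.0", "jdoe"),
)
conn.commit()


def statement_succeeds(sql: str) -> bool:
    """Return True if the statement ran, False if the database rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False


deleted = statement_succeeds("DELETE FROM audit_log")
modified = statement_succeeds("UPDATE audit_log SET new_value = '99'")
remaining = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

Corrections are then made by inserting a new record rather than
overwriting the old one, which keeps the original data available.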

Alarm Log
When an operator acknowledges an alarm, it may be sufficient
merely to record that the operator action was an alarm
acknowledgement. However, it may be helpful to record additional
comments on the reason for and circumstances of the alarm. It may
therefore be a good idea to use operator visualization software
that lets operators enter such comments when acknowledging alarms
(see Figure 10-31).

Figure 10-31. Alarm Explanation (Screenshot: ICONICS Genesis)

The operator comments will then be logged along with other alarm
information in the alarm database.

Sequencing Enforcement
Some FDA-regulated products require a specific sequence of steps
and phases for manufacturing. Software used in manufacturing, such
as operator visualization software and batch management, must
include mechanisms to enforce the correct execution order in the
special circumstances when this is done manually. This ensures the
product is manufactured as per specification and with a minimum of
variance. A simple method is to disable or hide the controls
associated with the next step until the previous step in the phase
has been completed. More advanced schemes may use VBA scripting to
ensure operator commands are valid.
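
The simple method of enabling only the next step can be sketched
as a small sequencer. The text mentions VBA scripting inside the
visualization software; Python is used here only for illustration,
and the class name and the step names are hypothetical.

```python
class PhaseSequencer:
    """Allows manual steps to be completed only in the defined order."""

    def __init__(self, steps):
        self._steps = list(steps)
        self._next_index = 0

    def enabled_steps(self):
        # Only the control for the next pending step is enabled;
        # every other control would be disabled or hidden.
        if self._next_index < len(self._steps):
            return [self._steps[self._next_index]]
        return []

    def complete(self, step):
        if step not in self.enabled_steps():
            raise RuntimeError(f"step {step!r} is not allowed yet")
        self._next_index += 1

    def finished(self):
        return self._next_index == len(self._steps)


sequencer = PhaseSequencer(["charge", "heat", "hold", "discharge"])
sequencer.complete("charge")

try:
    sequencer.complete("hold")      # operator tries to skip "heat"
    skipped = True
except RuntimeError:
    skipped = False

enabled_now = sequencer.enabled_steps()
```

An operator screen would call `enabled_steps()` to decide which
buttons to show, so an out-of-order command cannot even be issued.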

System Revision Control
The system integrator must control changes to the system
documentation. Many of the operator screens, such as recipe entry,
may be considered part of this documentation. Graphics screens and
the configuration are stored as files, so they can be tied into the
document control software used for revision control of other
documents related to manufacturing FDA-regulated products.
Operator visualization software designed for 21CFR11 includes
tools to simplify managing the configurations (see Figure 10-32).

Figure 10-32. Embedded Revision Control (Screenshot: ICONICS
ProjectWorX)

Closed System
21CFR11 distinguishes between “open system” and “closed
system.” A closed system means system access is controlled by
persons who are responsible for the content of electronic records
on the system. When the automation system administrator
controls access, it is a closed system. This is the most straightfor-
ward approach. A “closed system” in the 21CFR11 sense of the
word does not mean proprietary technologies.

An open system means system access is not controlled by persons
responsible for the content of electronic records on the system.
If the system is connected to the corporate IT networks or the
Internet, or has modem access, so that data is transferred “in the
open,” it is classified as an open system. An open system would
require additional measures such as document encryption and use of
appropriate digital signature standards to ensure, as necessary
under the circumstances, record authenticity, integrity, and
confidentiality.

Procedures
Several requirements in 21CFR11 cannot be met using software.
The user is therefore responsible for making sure these
requirements are met, such as by putting procedures in place.
These additional requirements include getting applications
validated, for example by third parties specializing in FDA
approval. Requirements include:
• Ensuring records are available, meaning not lost, and that
applications that display data remain compatible and not
obsolete as software is upgraded;
• Physically limiting system access to authorized persons,
such as by restricting access to the control room;
• Ensuring system integrators are properly trained;
• Holding individuals accountable for actions performed
under their signature;
• Limiting the distribution of system documentation to
those who really need to know;
• Requiring manual procedures to maintain an audit trail of
changes to documentation;
• Verifying the identity of persons given system access;
• Certifying that electronic signatures are legally binding;
• Ensuring that user names and passwords are used only by
their unique owner, meaning they are not shared or stolen;
• Ensuring passwords are changed regularly, such as by
enabling password aging and lockout in the operating system;
• Deactivating stolen passwords, tokens, etc.;
• Periodically testing tokens, etc., if used.

Exercises
1. Is it permitted to set OPC to write to a SIS logic solver?

2. Is Ethernet ring topology a form of redundancy?

3. In one year, how many minutes of downtime is equivalent
to an availability of five nines?

4. Why is a proprietary protocol not a cyber security protection?

5. What is the basic way of preventing the internal network
from being accessed by a specific application layer protocol
such as FTP?

6. What is the potential weakness of using the automation
system server to also connect to the information network
through a separate network card?

7. Which port or ports of the inner firewall in a DMZ
configuration may be opened?

8. Is it acceptable to leave operator workstations permanently
logged on?

9. What is the difference between authentication and authorization?

10. How is it possible to prevent games from being installed and
played on the workstations?

11. What is an electronic signature?

12. If some 21CFR11 requirement does not have a corresponding
feature in a software application, is it possible to use this
software in an FDA-regulated application?

References and Bibliography
1. Berge, Jonas. Fieldbuses for Process Control: Engineering,
Operation and Maintenance. ISA – The Instrumentation, Systems,
and Automation Society, 2002.

2. ANSI/ISA-TR99.00.02-2004 - Integrating Electronic Security
into the Manufacturing and Control Systems Environment. ISA –
The Instrumentation, Systems, and Automation Society, 2004.

3. Code of Federal Regulations. Title 21-Food and Drugs,
Chapter I-Food and Drug Administration/Department of
Health and Human Services, Part 11-Electronic Records;
Electronic Signatures. Government Printing Office, 2003.

4. IEC 61511-1 (1st edition 2003-01) Functional Safety – Safety
Instrumented Systems for the Process Industry Sector -
Part 1: Framework, Definitions, System, Hardware, and Software
Requirements. IEC, 2003.

5. TÜV Product Service, Ltd. Maintenance Override Procedure,
Version 3.0 (draft).
