Real-Time Systems
Faculty of Engineering
Bachelor of Engineering
Study Book
Written by
http://www.usq.edu.au
Copyrighted materials reproduced herein are used under the provisions of the Copyright Act 1968 as amended,
or as a result of application to the copyright owner.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any
means electronic, mechanical, photocopying, recording or otherwise without prior permission.
Camera-ready copy produced using LaTeX 2e by the author. Style file supplied by Ted Siebuhr, DEC. MATLAB
was used for the majority of PostScript graphs. The following LaTeX 2e packages were utilized: dvips (PostScript
output); pstricks (most diagrams); hyperref (hyperlinking for Acrobat PDF format). AMS font system used for
some mathematics.
TABLE OF CONTENTS
PAGE
1.2 Introduction 1
1.6 Terminology 5
1.6.1 Systems 5
1.6.2 Events 7
1.8.1 Platforms 9
2.2 Introduction 1
3.2 Introduction 1
3.3.1 Starting 2
3.3.6 Pointers 18
3.4.1 Portability 24
3.4.2 Makefiles 25
3.5.6 Input/Output 39
3.5.12 Functions 49
3.5.13 References 50
4.2 Introduction 1
4.5 C to Assembler 6
5.2 Introduction 1
Module 6 Multitasking 0
6.2 Multitasking 1
6.2.1 Processes 2
6.2.2 Threads 3
6.4 Threads 11
7.4 Pipes 3
8.4 Semaphores 7
This module gives an overview of the field of real-time operating systems. As such, it
forms a basis for what is presented in the following modules by introducing many of the
important concepts involved in studying operating systems, and in particular real-time
operating systems. The examples given in later modules are aimed at giving concrete,
working examples of the principles discussed in this module. The first part of this
module discusses some more general themes, which are expanded upon in the later
modules:
The latter sections of this module outline the materials which are necessary to complete
the suggested exercises:
Note that obtaining and installing a C/C++ compiler is discussed here, with programming
aspects deferred until the following module.
1.2 Introduction
Mention the word computer and many people will automatically think of the desk-
top variety, common in offices and the home. However, applications of computer-
and microprocessor-based systems abound elsewhere from television remote con-
trollers and washing machines, to cell phones and engine management systems in
vehicles, to large corporate databases. At the hardware level, a common characteristic
of all these is the fetch-decode-execute processing cycles. The hardware is of course
controlled by the software sitting above it. The software in fact consists of a multitude
of functions: monitoring a keypad, accepting network connections, controlling
hardware such as disk drives and video displays. It is this illusion of a single processor
performing many, many tasks which makes software design and implementation,
particularly real-time software, a challenging art. Design issues come to the fore:
not only the ability of the software to complete the requisite tasks, but the ability to
complete them in a timely fashion, usually with minimal resources.
1.2 70935 Real-Time Systems
The term real-time system is generally taken to mean a computer system which must
respond within a given amount of time. Failure to do so means that the particular job
at hand cannot be completed, and the overall system fails. The definition of failure is
a crucial one. For example, a computer controlling the lift motor in a lift in a high-rise
building must ensure that the lift cage is stopped at precisely the right level on each
floor. If the lift did not stop so that it was exactly level with the floor, it would be difficult
(and possibly dangerous) for the passengers to enter or exit. The computer system
must monitor the position of the lift at precise intervals and take action according to
the present position and the target position. An overshoot of even a few centimeters
would be considered unacceptable. A great many embedded control systems fall into
the category of requiring precise, guaranteed response times. Medical devices used to
administer patient care in hospitals, on-board controllers for aircraft, even fuel injection
controllers for passenger cars, could be given as examples.
Contrast the lift control example with, say, using a word processor or spreadsheet on
your desktop Personal Computer (PC). If the response to a request such as selecting
a table column, or a formatting menu item, or even scrolling the text, was delayed by
a fraction of a second, the overall system is not considered to have failed. Even if the
response to a mouse selection took several seconds, the system has not failed,
although the delay might be an inconvenience to the user.
Many authors (such as [1]) use the terms soft real-time and hard real-time to distinguish
these types of situations. It could be argued, for example, that if the response time of
a word processor was of the order of minutes then the system would effectively be un-
usable. However, the degradation in response time is not accompanied by catastrophic
failure. In the case of the lift controller, failure could indeed be catastrophic.
3. A memory manager, to give each task the memory it requires for operation.
The core of the operating system, usually called the kernel or executive, is responsible
for supervising the overall system. Context switching is managed by the kernel
such that each task is given an appropriate amount of time in which to execute. What
constitutes an appropriate amount of time depends on the task and what duties it has
to perform.
The system interface is the method by which the systems programmer gains access
to the underlying subsystem. This may mean hardware or software resources, or a
combination of both. Sending some data to a remote computer using a network con-
nection, for example, entails copying the data to the appropriate buffers and sequenc-
ing the transmission through the network hardware. For several reasons, the systems
programmer does not normally have, or wish to have, direct access to hardware de-
vices. To continue the example, writing code that directly manipulates the registers of
a network controller chip is tedious and error-prone. Worse, the location and functions
of the registers will differ depending on the manufacturer of the chipset! This is where
the device driver is used. This is the appropriate code to control the hardware devices
and provide services to the system. These services are accessed from the user-level
programs (tasks) via a well-known API, or Applications Programming Interface.
So it can be seen that in order for several tasks to co-operate, a common manager
must be called upon. User-level programs may be several levels removed from the
underlying hardware. For example, it is usually convenient to read disk files as a se-
quence of characters or lines. The underlying storage is quite different from this, as
disks use fixed-size blocks called sectors. The operating system pieces together the
sectors in the right order for user programs to access. The access may be via the API
calls open() and read(). These usually (but not always) appear to the programmer
as simple function calls in the C language. Underneath, the kernel is joining sectors
together to form a contiguous byte-stream file for the application code. The device
driver is called to read the correct sector at a certain offset from the start of the disk. It
must manipulate the control registers and set up buffers in order to access the physical
disk. A different device driver is required for, say, a hard disk compared to a CD-ROM.
Above this, the data bytes constituting the file may be viewed as records in a database.
This progression of views (user view of database records, application view of a byte-stream,
kernel view of disk blocks, and device driver view of read/write control registers) is called
layering or abstraction. This is shown diagrammatically in Figure 1.1.
[Figure 1.1 depicts the system hierarchy as a stack of layers: User Applications,
system calls, Kernel, device drivers, Hardware.]
Figure 1.1
System hierarchy. User programs cannot access hardware and resources
such as memory directly, only through the appropriate system calls
(called the API or Applications Programming Interface).
It is important to understand that real-time does not necessarily mean fast. Although
some emphasis is placed in subsequent modules on fast and efficient algorithms and
coding techniques, speed is not the sole element. The design aspect of considering
what is important and what is relatively less important in terms of time is crucial to the
overall success in meeting real-time performance constraints.
1.6 Terminology
As with any technical field there are a number of terms significant to real-time sys-
tems. In this section we will look at a number of definitions and briefly expand on their
application.
1.6.1 Systems
When we consider a real-time system as a single entity the following terms are useful
to define system level concepts:
System Terms 1
A system may be considered as any single entity that has a number of inputs and
outputs.
A response time is the time interval between the presentation of a set of inputs
to a system and the appearance of the resulting set of outputs.
A set of inputs or outputs is considered bounded if it always remains within
a set of specified limits.
A system failure is the result of failing to satisfy one or more of the specified
system requirements.
The use of these terms is fairly generic. You will come across them in many fields
associated with real-time applications. In this course of study the above terms will
be more specifically applied to software systems, but this does not exclude the computer
hardware on which the software is operating. There is often a very close tie between the
software system and the hardware system.
The depth to which the software and hardware systems are integrated broadly falls into
three categories:
System Terms 2
Examples of real-time systems falling into each of the three classifications might in-
clude:
The timeliness of the response of a real-time system is a major issue in its ability to
meet an acceptable performance standard. What is considered acceptable performance,
however, varies widely from one application to another. You might be prepared
to wait five seconds for an automatic teller machine to dispense your cash,
but the same response time would render most personal computer software useless.
Deadlines in real-time systems are often referred to as soft, hard or firm deadlines.
Consequently systems may also be classified as:
System Terms 3
Hard real-time systems - where failure to meet response time constraints would
cause a system failure.
Firm real-time systems - where failure to meet response time constraints can be
tolerated occasionally.
Soft real-time systems - where failure to meet response time constraints degrades
performance but does not cause a system failure.
1.6.2 Events
When we look closer at the operation of a real-time system we come to realise that
it is changes in input conditions or changes in the internal state of the system that
trigger its dynamics. By dynamics we mean the way the system responds to change.
A change in either of these conditions is referred to as an event. In real-time systems
the following terms are used to describe event related concepts:
A synchronous event is any event which occurs at a predictable time in the flow
of control in the system.
A system state is any unique condition that a system can attain, as defined by a
set of system variables.
The concept of asynchronous and synchronous events may not at first be obvious to
all readers. Consider a software controlled real-time system which, like any software,
contains decision points. It is at these decision points that a predictable change in
system state or flow of control occurs. Other program events can also be classified as
synchronous, such as errors in arithmetic calculations and software interrupt instructions.
There is another concept related to events that is very important to the operation of real-
time systems, that of determinism. Determinism is the ability to predict how a system
will behave under all possible conditions, which includes all system states and event
combinations. Assuming that a real-time system operates with bounded inputs and a
finite number of system states, it should be possible to predict all system responses
and any resulting change in system state. Consider the following terms related to
determinism.
Determinism
A deterministic system is one for which a unique set of responses and the next
state can be determined for each possible state and set of inputs.
Event determinism is where a unique set of responses and the next state can be
determined for that event.
Temporal determinism is where the response times of the system can be deter-
mined for each possible state and set of inputs.
1.8.1 Platforms
C is available in several flavours:
Unix The original C compiler called cc; it comes as standard in most Unix installa-
tions.
DOS Various commercial versions (for example, Microsoft QuickC and Borland Turbo
C), and the free 32-bit Gnu C compiler (called gcc for C and g++1 for C++).
Windows Various commercial versions such as Microsoft Visual C/C++ and Borland
C++Builder. The Gnu C compiler is also able to compile Windows programs, but
uses a command-line interface.
The Gnu C compiler is available for Unix, with a port to DOS called DJGPP and a
Windows port called Cygwin.
http://www.delorie.com
ftp://mirror.aarnet.edu.au/pc/simtelnet/gnu/djgpp/
Instructions for obtaining and installing GnuC for DOS may be obtained at
http://www.usq.edu.au/users/leis/gnuc/gnuc.html
http://sourceware.cygnus.com/cygwin/
http://mirror.aarnet.edu.au/cygwin/
Instructions for obtaining and installing GnuC for Windows may be obtained at
http://www.usq.edu.au/users/leis/cygnus/cygnus.html
The Cygwin distribution includes tools and software libraries for both Unix and Win32
system calls.
What's the difference between the Cygnus and DJGPP ports of the Gnu C compiler?
DJGPP does not (as of this writing) support long filenames on Windows NT.
The DJGPP C++ compiler is invoked by gxx whereas the Cygwin C++ compiler
is invoked by g++. This is due to historical limitations on DOS filenames. For
compiling C programs, both use gcc.
The Gnu C compiler ports mentioned above are both free, and provide a 32-bit environment
for C and C++ programming. This means that the memory limitations of other
DOS language tools (such as DOS C compilers and QBasic) do not exist.
For an easier-to-use integrated code development system, the author uses Borland
C++Builder (version 4 as of this writing).
activity 1.1
This module has:
Served to introduce some of the important concepts and terms pertaining to computer
operating systems, with the extension to real-time systems.
Indicated the requirements for a C compiler for completing the unit, and given
instructions on obtaining the free Gnu compiler.
The linked articles on real-time system failures should be read and considered before
continuing.
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 2
REAL-TIME
SOFTWARE DESIGN
To describe several software design techniques and design tools for real-time
system design
2.2 Introduction
The best way to ensure that the implementation of a real-time system meets all the
requirements of its system specification is to adopt a solid engineering methodology. A
traditional engineering design and development process involves the following phases:
conception
specification
design
implementation
testing
maintenance
Real-time software is such an integral part of most modern real-time systems that it is
common to hear software referred to as an engineering material. Hence the phases
listed above should equally apply to the creation of software as they do to building
a sky-scraper or manufacturing a jet engine, given that we want to produce a quality
real-time system.
The characteristics we associate with the term quality, such as conformity, precision,
cost effectiveness, efficiency, maintainability and reliability, are often linked with
engineered objects we can see, but what about the bits we can't see?
In this module we will learn about good software design methodology and how to use
several design tools and techniques to produce quality real-time software. We will also
be looking at the ways that system specifications can be defined/described in order to
unambiguously state the requirements of the real-time system.
There are six distinct phases in the software life cycle which parallel the six phases
used in engineering. In software engineering these are referred to as:
Software engineering for real-time systems can be subdivided into the six phases
above with fairly distinct outcomes for each phase. Table 2.3 identifies the processes
and outcomes involved in each phase.
While all phases in the software life cycle are important it is often the design and test
phases that new-comers to the field tend to under-value. In fact it is in the design
phase that qualities such as conformity, precision and reliability are built-in to real-
time software. It is also in the design phase that a test plan is devised from which the
final system performance will be determined. The test phase is then used to check the
correspondence of the system with respect to the system specification.
All things begin with an idea. In this phase of a real-time system design ideas are
proposed and discussed for new products, enhanced products or solutions to a problem
posed. This phase is often initiated by market forces, changes in technology or a
request from a client, where an opportunity is identified to fulfill a need.
In the conceptual phase of the project the main objectives are to:
While these tasks may not always fall into the domain of the real-time systems engineer,
he or she is often involved in evaluating and preparing the technical aspects of proposals
and estimating a product's potential performance and practicalities.
In this phase the operational and contractual details of the project are identified and
documented. The operational part of the specification lists and describes the func-
tions, processes and performance requirements of the system including characteris-
tics such as speed, accuracy, stability and response times. The contractual part of
the specification details the scheduling, budgetary and legal (if any) elements of the
project. These two sets of documentation are sometimes referred to as the functional
and non-functional requirements. The non-functional requirements may also include
specification of programming language, time-loading, system hardware etc.
The specification phase is also the stage at which the test plan is produced. The
express purpose of the test plan is to define how the system will be tested to verify
conformity and performance with respect to the system specification.
In the specification phase of the project the main objectives are to:
The system specification is often written by or in close contact with the client or target
user of the product. The client may be a traditional customer, your boss, another sec-
tion of a large organisation or a particular industry sector. Real-time system engineers
are most likely to have direct input to this phase when the objective is to produce an
enhanced version of an existing product.
In the design phase the system specification is transformed from a list of functions
and requirements to a detailed statement of implementation referred to as the system
design document. The system design document specifies how the system require-
ments are to be met by partitioning the functions and processes prescribed in the sys-
tem specification into functional modules, supported by a collection of data structures.
Techniques and tools for creating the system design are explained in section 2.4.
During the design phase it is possible to identify flaws in the system specification which
cannot be worked around, or requirements that cannot be met with current technology.
In either case a request must be made to the originators of the system specification
for a system change. Design changes should only be implemented as the result of an
authorised system change.
In the design phase of the project the main objectives are to:
The system design document is often prepared by a team of design engineers and
analysts. This should be done according to a recognised international software design
standard such as DOD-STD-2167A or IEEE standard 1016.
In the programming phase of the project the main objectives are to:
For the real-time design engineer the most important aspect of this phase is to properly
manage the software development process to ensure a level of quality control. This can
be accomplished through careful management techniques, sometimes with the aid of
software management software.
The importance of the test phase cannot be overstated. In some cases it is only when
the system is tested as a complete unit that some errors can manifest. Some perfor-
mance measures such as response time and time loading can only be fully evaluated
when the system is fully operational. Any failure to meet the requirements of the system
specification will require corrective action being taken by returning to the programming
phase or possibly even the design phase.
To thoroughly test a complex system the test phase should include a comprehensive
set of loading models. Loading models are sets of operational states and conditions
under which the real-time system is expected to operate. The purpose of these tests
is to thoroughly exercise or stress the system to prove its stability and performance.
The likelihood of various loading models can be estimated from a probabilistic analysis
based on the system's operation and structure.
In the test phase of the project the main objectives are to:
perform system stability and performance tests under various loading models
While some consider this revision process as part of the test phase it has a distinct
maintenance aspect which is bound to customer support issues. It is often essen-
tial that product maintenance include the capacity to revise system software, but after
some predefined period the product is usually no longer supported.
In the maintenance phase of the project the main objectives are to:
The software engineer may be involved during this phase in ongoing product enhancement
or regression testing. In small companies the engineer may also find himself or herself
with some direct involvement in marketing and customer support.
The two foundations of good real-time software design are a sound system specifica-
tion and a definitive system design document. The purpose of the concept phase is
to produce a collection of features and objectives for the desired product, but this is a
far cry from a definitive statement of requirements. The purpose of the Specification
Phase is to transform the conceptual into the practical.
This task can be proceduralised to some degree, by using specific techniques to define
the system requirements. Most system requirements are initially described in words
which are later replaced by more specific mathematical, diagrammatic or pseudo-code
definitions. These techniques attempt to unambiguously define the system requirements,
processes and anticipated structures for the data and the program. We use the
term 'attempt' as most system specifications use two or more techniques in combination
to ensure clarity. For all but the simplest systems any one technique alone is usually
inadequate.
While the specification and design phases are presented as separate steps in the
software life cycle, these two phases are often closely tied through the development of
the system design documentation.
Natural Language
Often the first step in developing a system design is to write down a description of what
is desired. This may take the form of a formal request from a client or your boss, or
summarised in point form scribbled down during or after a discussion. Either way the
form is descriptive using phrases or sentences outlining the desired result.
As one might expect this approach can create a degree of ambiguity when concepts
are not clearly presented and often suffers from problems related to cultural perception
and non-native languages. This technique is not recommended as a means of detailing
a specification, but may be used to complement other techniques where an explanation
is beneficial.
A more precise descriptive technique is available which greatly reduces the potential
for ambiguity. By converting a description into a series of statements and writing them
down in a structured way, it is possible to create a descriptive form called pseudo-code,
or structured English. This technique adopts a procedural form. The following example
shows how this can be done.
example 2.1
do
    display "Add Coins"
    wait for coin to be inserted
    identify coin
    count coin value
    if coin value exceeds product cost
    begin
        display "Select Product"
        wait for product selection
        eject product
        calculate change
        eject change
    end
until product dispenser empty
display "No Stock"
Most real-time applications are based on the monitoring and/or control of some physical
object or process. This presents the design engineer with an ideal opportunity to apply
some of that mathematics learnt at university in an attempt to define the physics of
the problem as a series of mathematical equations. We use the term 'attempt' because
practical systems can be difficult to accurately define as a set of equations, as they are
often too complex or exhibit non-linearities.
The concept of using mathematics to define the behaviour of a physical system is one
of engineering's fundamentals. It provides a concise, commonly understood technique
for defining a set of static and dynamic conditions; offers little chance for ambiguity and
is in a form which translates easily into software. In addition to these features math-
ematical equations can be manipulated to achieve simplifications and optimisations
which can greatly improve the performance of real-time software. In many circum-
stances formal proving of the stability of a system is possible through mathematical
analysis.
The following example details a mathematical specification for the age-old physics problem
of projectile motion. The quantities involved include initial velocity (V), angle of
projectile (A), the force of gravity (G), time (t) and the resulting distances traveled
horizontally (x) and vertically (y).
example 2.2
x(t) = t V cos(A)
y(t) = t V sin(A) - (1/2) G t^2
Where: t is time.
V, A are the initial velocity and angle respectively.
G is the gravitational constant.
x, y are the resulting horizontal and vertical distances.
In many cases the design engineer is aware of the processes and procedures that
are required to be executed in a real-time system; these include: start-up and shut-down
sequences, specific algorithm execution, user data entry, a series of functional
decisions, sequences of events and others.
Several design tools are available that help the software engineer to detail procedures
which will form an integral part of the real-time system. Some of these techniques
have the distinction of being capable of defining increasingly finer levels of detail as the
design develops. The following sections outline the most common of these tools.
Flowcharts
Perhaps one of the earliest developed and most widely recognised graphical tech-
niques is the flowchart. Figure 2.1 shows an example of the most commonly used
subset of flowcharting symbols. There are actually many other symbols available for
flowcharting, but most of them relate to file and record handling for administrative pro-
gram design.
[Figure 2.1: the most commonly used flowchart symbols: Start and Stop terminators,
Process and Sub-process boxes, the Decision diamond with True and False exits,
the Input/Output box, numbered Connectors, and flow arrows.]
The flowchart is designed to show unidirectional program flow, decision points, in-
put/output, processes and sub-processes. The few basic rules to drawing flowcharts
include:
Each symbol has a maximum of one entry and exit point, except the decision
diamond, which has two exit points.
Program flow can only be joined arrow to arrow.
Arrows cannot divide program flow.
Flow should generally be top to bottom.
Flow charts are not recommended for use in the specification phase of real-time system
design, but can be useful in defining/documenting specific procedures of small parts
of a larger system. In multi-tasking systems flowcharts do not easily represent the
interaction between the tasks, or between tasks and the operating system. There is
also no way of indicating temporal relationships in flowcharting.
Dataflow Diagrams
Figure 2.3 shows an example of how these simple symbols can be used to convey
information about a process. Data sources/sinks are typically hardware elements such
as peripherals and Input/Output devices. The steps used to create a dataflow diagram
include:
Dataflow diagrams are highly recommended and widely used in the design of real-
time systems. They offer a structured approach for identifying the main data flows in a
system and for partitioning software into modules(processes). Interrupts can be shown
as an input from a hardware source that triggers an interrupt service routine. Dataflow
diagrams can be used to form a hierarchy of system structure with varying levels of
detail. Processes in upper layers can be represented by their own dataflow diagram
showing any underlying processes.
A particular feature of dataflow diagrams is that they provide the designer with the capa-
bility to identify concurrent processes, that is processes which can be run at the same
time on either multiple processors or as multiple tasks. This can be achieved by locating
sections of the dataflow diagram which connect to other sections only through
common data storage. In Figure 2.3 there are three such sections: the sample and
control section, the FFT section and the display section. In such cases the designer
has the option of considering any section as a separate task, which may significantly
influence the structural design of the overall system.
[Figure 2.3: dataflow diagram of a frequency-display system. An Analog-to-Digital
Converter supplies raw samples, under sample-rate control, to a 512-value time buffer;
an F.F.T. process transforms the time data into frequency data held in a 256-value
frequency buffer; a Graphical Display presents the frequency data; a User Interface
provides time/frequency selection and file storage of system data.]
Structure Charts
Structure Charts are widely used for describing the hierarchical structure of a system.
They can be used to describe not only software, but any system where a layered structure
exists. In software the layers are related to the depth of subroutine and function
calls. In other applications the structure chart may depict the hierarchical structure of a
chain of command, a written document or the physical components of a device.
The elements which form a structure chart are quite simple: a box represents a pro-
cess, vertical position indicates hierarchy, lines show links between processes and
arrows show major data and control flow. The advantages of structure charts include:
Some variants of the structure chart can be used to illustrate decisions and interrupt
processing. Figure 2.4 shows an example of a structure chart including a decision and
interrupts.
[Figure 2.4: structure chart example. A Main Process sits at the top; Interrupt A and
Interrupt B sources feed Interrupt Service A and Interrupt Service B; below the Main
Process are an Initialising/Debugging process and a Process with Decision ("Either");
lines show control and data flow, and a common Sub-Process is shared by the
lower levels.]
While structure charts are useful for describing the general structure of a system, they lack the ability to depict concurrent processes, significant data storage and temporal relationships. Thus structure charts are only recommended for use in the initial stages of a design to outline the expected modularity and hierarchy of a system. They may also be used as a good documentation tool to summarise a completed system's structure for later reference.
Warnier-Orr Notation
In Figure 2.5 each set is identified by a label. In this example they indicate the type of
element, but in a real application they should indicate the function or purpose of the
element. The elements, in order of appearance, include:
a condition - requiring a test condition and corresponding true and false actions
which are statements or sets
a while loop - requiring a test condition, W for while and statement to be executed
while the condition is true
Module 2 Real-Time Software Design 2.15
a loop until - requiring a test condition, U for until and statement to be executed
until the condition becomes true
Many real-time systems have clearly defined conditions or states in which the system
operates. The identification of these states allows the design engineer to logically
divide a system into distinct operational components. Not only does this method of
analysis help partition the software but it also helps identify the system events that
trigger changes in system operation. Several design tools are available that use state-
based techniques to develop well structured and modular code. The following sections
outline a few of these tools.
Finite State Machines and Finite State Automata are terms used for the technique of
defining a system as a fixed number of unique states between which the system moves
in response to events. A state is identified as a distinct condition a system may occupy,
based on system parameters called state variables. Transitions between states are
triggered by system inputs (events) or increments of time.
There are actually two types of Finite State Automata: the Moore and Mealy implementations. The difference between the two implementations lies in the way that outputs are defined. The Moore machine can only define system outputs in terms of the state variables, whereas the Mealy machine can use input conditions as well as state variables.
The significance of this distinction will be made clear later when we consider its effect
on implementation.
State Diagrams
[Figure 2.6: State diagram of a Moore machine for recognising the word CAT. Transitions are labelled with inputs such as "C", Not "C" and "T"; the states (including Second and Third) carry the outputs, with Out=0 in every state except the final one, which shows Out=1.]
states are typically labelled with capital letters or short descriptions to represent system conditions
outputs/actions are either placed inside states or associated with inputs on transitions
Figure 2.6 shows a Moore machine for a simple task to recognise the word CAT from a string of letters. The input is the next letter of the string; the output is a single bit which is 1 when CAT is recognised. Note that inputs are shown on the transitions and the outputs are shown inside the states. Recall that a Moore machine's outputs are dependent only on the system state variables and hence are only defined within states. In the Moore machine the outputs are static while the system stays in any individual state.
Figure 2.7 shows the Mealy machine implementation for the same task of Figure 2.6.
Note the outputs are shown on the transitions along with the input causing the tran-
sition, separated by a /. In the Mealy machine the outputs are static if based only on
state variables or may be transitional if based on input conditions as well.
[Figure 2.7: State diagram of the Mealy machine for recognising CAT. Each transition carries both input and output, separated by a slash, e.g. "C" / 0, "A" / 0, Not ("T" or "C") / 0 and Space / 0; only the final transition from the Third state, "T" / 1, produces an output of 1.]
State Diagrams are recommended as a good design technique for real-time systems,
particularly for state driven tasks which operate equipment in a range of sequences
like traffic light control, medical equipment, aircraft flight control, teller machines etc.
Statecharts
Statecharts are a combination of Finite State Machines and Data Flow Diagrams which
feature the ability to depict not only states of operation, but also states within states
and orthogonality. The structure of the statechart allows these and other features to be
incorporated in the following way.
Small letters in parentheses represent conditions that must be true for the transi-
tion to occur.
[Figure 2.8: A sample statechart. Six states A to F, each carrying a default "function" label, are divided by a dashed line into two orthogonal processes, one containing states B and C, the other states D, E and F. Transitions are labelled a, b, c, e, f, d/f and f(g).]
Figure 2.8 shows a sample statechart containing each of the above features. This statechart comprises six states A to F, with two orthogonal (concurrent) processes - one containing states B and C, the other states D, E and F. Each state shows a default label function which would be substituted with a descriptive statement of the function of that state. The dynamics of the system are as follows:
The presence of the dashed line indicates that two processes may be run concurrently,
in this case triggered by separate events a and b. Each process can run indepen-
dently but some transitions can be synchronised by common events called broadcast
communications, such as c and f .
Statecharts are highly recommended for real-time system design as they offer representations for many of the features required for modern software design, in particular concurrency, modularity and intertask communication. When combined with the state-based decomposition offered by Finite State Machines, this design technique is probably one of the best available.
For some real-time applications, particularly for embedded systems, commercial op-
erating systems are too big and complex to be efficiently applied. In these cases the
system designer is much more likely to write a simple kernel to meet the requirements
of the application. In this section we will describe how to implement the basic structures
of several real-time kernels and outline the more advanced features of commercial op-
erating systems.
The kernel, sometimes referred to as the executive or nucleus, is the smallest portion
of an operating system that provides the primary functions listed above. This is not to
say that all real-time applications are implemented using multiple tasks, but in one form
or another they implement each of these primary functions.
Later in this unit we will take a closer look at two of the most widely used commercial operating systems, Unix and Windows NT. Several examples will be provided to show some of the basic functions of each of these operating systems (kernels). In this module we will be focusing on the fundamentals of implementing purpose-written kernels.
devices, they offer little other functionality. Each device service needs to be as short
as possible to achieve short response times for all events.
The C program shown in Figure 2.9 illustrates the concept of a polled system with a
polled single key input triggering either of two events.
[Figure 2.9: A C program illustrating a polled system, with a single polled key input triggering either of two events.]
Polled systems are simple to write, easy to debug, their response time is easy to determine and they are good for high-speed data channel interfacing. However, polled loops are inefficient with CPU time, they do not handle bursts of events unless specifically designed with a buffer, and polled systems cannot satisfy the requirements of all but the simplest systems.
Machine. The division of the total program function into smaller distinct sections also provides the ability for the program to be suspended at the end of each section's execution, as would be required in a multitasking application.
This technique particularly lends itself to the implementation of real-time systems designed using the Finite State Machine approach. Figure 2.10 shows a State Diagram for a simple parity generator for a bit stream. The program in Figures 2.11 and 2.12 implements the parity function as State-Driven code using the switch/case approach.
[Figure 2.10: State Diagram of the parity generator, with states A (output EVEN) and B (output ODD). An input of 1 toggles between the states (A to B with output ODD, B to A with output EVEN), while an input of 0 leaves the state and output unchanged.]
Note how each of the states is identified by a letter, to which an integer value is assigned. The state variable state is initialised to the starting state and the program drops into an endless loop. The input is received and control is passed to the case in the switch statement that corresponds to the current state. The input value is tested to determine if a transition is to be made in the state machine. For example, in case 0 (state A) there are two possibilities - if input = 1 then the output is changed to ODD and the new state becomes B, or input = 0 and no change is required. Note it is good practice to re-affirm the output condition and state variable in this case.
                 Current State
    Input        A            B
    0            A / EVEN     B / ODD
    1            B / ODD      A / EVEN
#include <stdio.h>
#include <stdlib.h>

#define A 0
#define B 1
#define EVEN 0
#define ODD 1

char *parity[] = { "EVEN", "ODD" };

int output;

double rnd(double m) {
    double r;
    r = m * (double)rand() / RAND_MAX;
    return r;
}

int main() {
    int state;
    int input;
    output = EVEN;
    state = A;
    while(1)
    {
        //delay
        input = (int)(rnd(1.0) + 0.5);
        printf("%d\t",input);
        switch (state)
        {
            case 0 : if (input == 1)
                     {
                         output = ODD;
                         state = B;
                     }
                     else
                     {
                         output = EVEN;
                         state = A;
                     }
                     break;
            case 1 : if (input == 1)
                     {
                         output = EVEN;
                         state = A;
                     }
                     else
                     {
                         output = ODD;
                         state = B;
                     }
                     break;
        }
        printf("%s\n",parity[output]);
    }
}

Figures 2.11 and 2.12: The switch/case implementation for the parity generator.
In the implementation of the tabular form the present state and input are used as in-
dices into an array (table) holding pointers to individual functions representing each
transition. In each of these functions the output values are set and the desired next
state is specified by the return value of the transition function. This is achieved in a
single line in the main program using:
state = (next[input][state])();
The disadvantages of this approach include:
many entries in the table may be unused as some states may not use all input
conditions
In the last case a default error handling function should be specified in each unused
entry to trap illegal combinations of current state and input as appropriate.
Figures 2.13 and 2.14 show the implementation of the same parity function for State-
Driven code using tabulated function pointers. Note that in the main program the table of addresses for each of the transition functions is defined individually using assignments of the form next[0][A] = &AtoA;.
#include <stdio.h>
#include <stdlib.h>

#define A 0
#define B 1
#define EVEN 0
#define ODD 1

char *parity[] = { "EVEN", "ODD" };

int (*next[2][2])();

int output;

int AtoB() {
    output = ODD;
    return B;
}
int BtoB() {
    output = ODD;
    return B;
}
int AtoA() {
    output = EVEN;
    return A;
}
int BtoA() {
    output = EVEN;
    return A;
}

double rnd(double m) {
    double r;
    r = m * (double)rand() / RAND_MAX;
    return r;
}

Figure 2.13 The tabular implementation for the parity generator - part 1 of 2

int main() {
    int state = A;
    int input;
    output = EVEN;
    next[0][A] = &AtoA;
    next[0][B] = &BtoB;
    next[1][A] = &AtoB;
    next[1][B] = &BtoA;
    while(1)
    {
        //delay
        input = (int)(rnd(1.0) + 0.5);
        printf("%d\t",input);
        state = (*next[input][state])();
        printf("%s\n",parity[output]);
    }
}

Figure 2.14 The tabular implementation for the parity generator - part 2 of 2
Each of the transition functions sets the output response and exits with the desired next
state as the return value. One particular advantage with this approach is that response
times can be easily calculated as the output update can be made to occur in one place
only, in the main program after the return from the transition function. Also the tabular
approach is very easy to modify and maintain.
In all modern computer systems the hardware supports single or multiple interrupt
inputs. These interrupts can be associated with external event triggers, internal or
external clock sources, software instructions or both hardware and software error traps.
This rich source of asynchronous and synchronous event information is the underlying
feature upon which interrupt driven systems are based.
Instead of polling for events as previously described, a real-time system can be pro-
grammed to respond to interrupt events. The basic concept is that each interrupt
utilises its own interrupt service routine (ISR) to service that event. Servicing may
include transferring one or many pieces of data, counting events, starting or stopping processes and many other functions. Systems which receive only aperiodic interrupts are called sporadic systems, whereas systems which utilise only periodic interrupts are called fixed-rate systems. Systems that use both types of interrupts are called hybrid systems.
One of the difficulties associated with interrupt operation is the restricted means of interaction between these separate event handlers and the main program. Because interrupts are designed to carefully save and restore the CPU status before and after the ISR is executed, almost all communication between ISRs and the main program must be through shared memory. While this is achievable, it greatly increases the complexity of the system.
One of the main uses for interrupts in real-time systems is to provide a means of switch-
ing between processes/tasks in multi-task applications. In this case the saving and
restoring of CPU status along with other system parameters can be used to stop one
process and start another in a procedure called context switching.
Context switching is the process by which the kernel of a real-time system suspends
the operation of one task/process via an interrupt by saving CPU registers, co-processor
registers, memory page registers, stack pointers and other significant system informa-
tion, before restoring an alternate context for the next process to run. The data is
typically saved on the run-time stack of the suspended process, and the new context
restored from the run-time stack of the next process. This is usually achieved through
the use of multiple process stacks.
In full-featured operating systems the multiple stack model is often replaced by the Task Control Block model. In more advanced operating systems it is advantageous to allocate an area called the task control block to each task in the system. This area holds not only the process's run-time stack but also areas for task-specific information, input/output buffers and inter-task communications.
Figures 2.15, 2.16 and 2.17 show a program to set up the Intel 8253 timer on the IBM PC to generate a regular interrupt at approximately 55 ms intervals. The main program comprises the initialisation of the interrupt, a simulated three-task/state implementation based on the state variable intswitch, a sample use of the timer for time measurement and the closing down of the interrupt.
The task of the interrupt service routine is to count twenty timer interrupts and change the state variable intswitch to switch between three simulated tasks approximately once every second. While there is no context switching implemented here for the tasks themselves, the example serves to illustrate the concept of an interrupt driven system and task selection.
void main(void) {
    int n;
    unsigned long start, end;
    intswitch = 0;
    OpenMicroTimer();
    n = 20;
    while (n > 10)
    {
        switch (intswitch)
        {
            case 0:
                printf("IN CASE 0\n");
                break;
            case 1:
                printf("IN CASE 1\n");
                break;
            case 2:
                printf("IN CASE 2\n");
                break;
        }
    }
    CloseMicroTimer();
}

[Figures 2.15-2.17 (remainder): the interrupt handler is installed on vector 0x1c with oldvect = _dos_getvect(0x1c); and _dos_setvect(0x1c, inthndlr);, and an assembler routine (label TIMERDONE) returns the timer value with the high word in DX and the low word in AX - a compiler-dependent convention.]
The simplest is the round-robin system, which simply divides the available CPU time into short intervals, of the order of 10 ms, and allocates consecutive time slices to each task in turn. Each task runs until it is complete or until its allocated time slice has expired. At that time the task is suspended and its context saved for later retrieval. The context for the next task is loaded and it runs for up to one full time slice. All tasks in this system are assumed to be of the same importance and no task has priority over any other.
While using equal priorities works well for systems with low time loading or few tasks,
most systems require the ability to assign priorities to tasks to ensure response times
can be guaranteed. In such systems we need the ability for higher priority tasks to
interrupt or preempt lower priority tasks. Such a system is called a preemptive priority
system. Priorities may be assigned at the design or programming phases based on
the importance of the task, or may be assigned dynamically by a section of the kernel
called the scheduler.
Preemptive priority systems have the disadvantage that higher priority tasks can tend
to hog system resources such as CPU time, single user input/output devices etc. This
effect can be minimised by careful assignment of priorities or dynamic assignment of
priorities.
In systems which have a number of fixed rate interrupts it has been shown that the
best performance is achieved when higher priorities are assigned to the interrupts with
higher execution rates. Such systems are called rate-monotonic systems.
incrementing task counters which get reset in the task to show the task is running
self testing
printing
improved reliability through the use of interrupt triggered events and scheduling
of tasks.
they are not very suitable for a system requiring a variable number of tasks.
Several techniques for creating system specifications and design documentation were
presented for use as design tools for creation of real-time systems. The concept of
a real-time kernel was introduced along with examples on how to implement several
types of basic kernel.
2. Draft a Pseudo-code description for the procedure to change a flat tyre on a car.
4. Draft a Flowchart for the procedure described above for changing a flat tyre.
5. Draft a Dataflow Diagram for a system to measure and control the temperature of
a furnace according to a temperature set point which is entered by the user via a
keypad.
6. Draft a Structure Chart for a system which might be used to operate a library
catalog system.
7. Draft a State Diagram for a system to detect the sequence 10101 in a data stream (single bit input) and output a logic 1 corresponding to the last bit of the sequence and a logic 0 otherwise.
9. Briefly outline the key features of the following real-time kernels: polled systems,
state driven systems, and interrupt driven systems.
Module 3
THE C AND C++
PROGRAMMING
LANGUAGES
Module 3 The C and C++ Programming Languages 3.1
Later modules will cover more advanced features such as linking C and assembly code
and dynamic memory allocation.
3.2 Introduction
C is sometimes referred to as a low-level language. The term low-level refers to its closeness to the hardware, which suits the components commonly found in operating systems, device drivers and embedded systems. Most operating systems are written wholly or substantially in C. This module will give a brief overview of the C language by way of a set of examples. For completeness, an overview of the most important features of the C++ language is also given. It is emphasized, however, that the treatment is definitely not introductory and the student is expected to have a grasp of the principles of computer programming.
The Cygnus Gnu C/C++ compiler was used for all of the examples presented here. However, the examples given in this module are sufficiently general to compile with virtually any C++ compiler without change.
C is one of the oldest programming languages, and arguably the most widely used. The original specification is called K&R C after its originators, Kernighan and Ritchie. The standard now is termed ANSI C (American National Standards Institute). Note that the C++ language is a derivative of C, in that C++ compilers are able to compile C programs (but the converse is not true: you cannot compile a C++ program with a plain C compiler). This document concentrates on the C language, although the second part contains an introduction to and overview of the main concepts of C++. Note that C++ programs normally have a file extension of .cpp, whereas standard C programs have a .c extension.
3.3.1 Starting
It must be understood that C has no intrinsic input or output (I/O) functions of its own. The language includes constructs for variable declaration, numerical calculation, looping and the like, together with the ability to extend the basic functionality via library function calls. It might seem strange to have no inherent I/O, but remember that the notion of I/O is quite different in a DOS program, a Windows program or (for example) an embedded control computer for a car's engine. Of course, some form of I/O is required, and thus a set of standard library functions like printf() for screen printing and scanf() for keyboard input are provided with each compiler. Further examples of code libraries include windowing code and network access functions.
[Figure 3.1: Code split across two source files. myapp.c contains int main() { ... someFunc(); ... } together with other code; other.c contains the function code for someFunc().]
The overall process is generally termed build or make. Note that simply compiling will not produce an executable program, as compilation is only the first step.
The source files normally have a .c extension. The compiler takes each .c file and produces an object file, which normally has an extension of .o (under Unix and the Gnu DOS/Windows compilers) or .obj in other DOS compilers. The object files are linked together using the linker program, which is normally invoked automatically after compilation to produce an executable file. In DOS, this normally has a .exe extension, while in Unix the extension is not significant. The fact that a Unix file is executable is seen by typing ls -l filename and examining the x flag. Other aspects, such as dynamic linking and the use of makefiles, are discussed at the end of this module.
The Gnu C compiler is invoked by the command gcc. The simplest usage is as follows:

gcc myprog.c -o myprog.exe

This runs the compiler and linker combined (the gcc program) on the source file myprog.c to produce the output file (as denoted by the -o option) myprog.exe. Note that the output of the compiler alone is only an object file, whereas the output of the linker is a full executable file. This simple invocation will work if the various components required by the compiler and linker are in the default directories.
Breaking up the code into more than one source file is the normal practice for anything
but the simplest of programs. If there is more than one source file, some references to
code in other files (modules) will be required, as depicted in Figure 3.1.
In this case, the compiler processes each code module separately into an object file,
and the linker resolves the references to external code or data. The compiler must
be told to expect certain variables or data to remain unresolved. This is done via the
(Note: the .exe extension should not be used on Unix systems.)
[Figure 3.2: Each source module (e.g. more.c) is compiled to an object file (more.o); the linker then combines the object files into the executable.]
extern declaration. For the example shown in Figure 3.1, myapp.c would require the following declaration at the start of the file:

extern void someFunc( long arg );

which essentially states that the code for function someFunc() is in another module; it expects a long argument and returns a void data type.
There is no limit on the number of separate code modules - as many as are required for clarity. Normally code modules contain groups of related functions. The process is depicted graphically in Figure 3.2.
It is common to have constants and/or data structures shared amongst code modules.
Such constants may be, for example, the expected maximum number of students in a
class. Data structures, which will be detailed further in a later section, are compos-
ite data types which group together related data (for example, a student name and
grades). These constants and data structures are likely to be required by several code
modules. It would be a maintenance nightmare if each separate file maintained sep-
arate definitions, as one change in a constant would require a change in each one of
the source files. Of course, this could introduce hard-to-trace errors. So, include files
are used for this purpose, which are shared between code modules. These files have
a .h extension. They are included into the source code module by a statement of the
form
#include <constant.h>
[Figure 3.3: Compiling and linking a C program with include and library files. Source modules such as more.c are compiled to object files (more.o) and linked together with library files such as libwin.a.]
or
#include "constant.h"
The latter specifies the file as being in the same directory as the source file. The former
specifies a standard location as will be seen shortly.
Common code modules are also provided for various purposes. For example, a windowing system will have a common set of primitives to draw a window, resize a window and so forth. These are called libraries or archives and are effectively pre-compiled. That is to say, the user does not have access to the source code, only the object code. This is illustrated in Figure 3.3.
In order to specify the library, the -l switch is used. Standard libraries reside in a directory called lib. On Unix systems, the standard math library (providing functions such as sin and sqrt) resides in the file libm.a. The library switch is such that the linkage specification -lx searches for a standard library of the form libx.a. The path to the library may be specified with the -L switch.
Similarly, standard include files reside in a directory called include. The path to the include files may be specified with the -I switch. A default path is also defined - normally include under the compiler root directory.
Putting this all together, the following DOS batch file compiles the source file specified
on the command line into the corresponding executable file:
Here,
%1.c specifies the first command-line argument with .c appended
-I specifies the path to the include (.h) files
-o specifies that an output file follows (here the program name with .exe appended)
-L specifies the path to the library files (unnecessary here)
-lm specifies loading of the math library functions such as sin() from libm.a
The batch file is then invoked as, for example:

gc myprog
A similar script file for Unix may also be used (using the correct path separators and command-line switches). Unix shells use $1 for command-line argument 1. Unix does not use the extension to signify the executable nature of the script - simply create the file gc using a text editor and then change the mode using

chmod +x gc
Note that the libraries are only searched on demand. The code for library functions
is only added to the executable image if needed. This reduces the size of the final
executable.
activity 3.1

Create the following program in a file ctest.c:

#include "constant.h"

int main(void)
{
    short x;
    x = SOME_CONSTANT;
}

and create a file constant.h containing the single line:

#define SOME_CONSTANT 45

Compile the program using

gcc ctest.c -o ctest.exe

(Omit the .exe on Unix systems.) Then execute the program by typing
ctest at the command prompt. Now remove the #include line in the
source C file. Compile and note the error messages. Restore the line.
Now change the #include line to read
#include <constant.h>
Try to compile and note the error messages. Now compile using

gcc ctest.c -I.

where -I sets the search path for include files specified with < >. The . after -I specifies the current directory. It should compile correctly.
activity 3.2
#include <math.h>
int main()
{
double x;
x = sin(3.14);
}
Now remove the #include line in the source C file. Compile and note the
error messages about the sin() function. This is because the function prototype in the include file math.h has been omitted. Look at this file - normally in a directory called include below the compiler installation path on Windows, or /usr/include on Unix. Note the function prototype
for the sin() function shows the data types of the arguments expected
and returned. Restore the #include line.
Now compile using

gcc ctest.c -nostdlib

Note the error messages. Because the standard library is not included due to the -nostdlib switch, the library code for the sin() function will not be defined.
activity 3.3
Compile-Only
del *.o
dir ctest.*
activity 3.4
Libraries
Compile the ctest.c program with the -lm option to link in the math
library libm.a. Now look at the size of the resulting executable file and
the size of the math library. Clearly the entire contents of the math library
have not been included in the executable!
activity 3.5
Assembly Output
int main(void)
{
short y, x;
x = 3;
y = x + 1;
}
gcc ctest.c -S
A rudimentary C program is shown in Figure 3.6. The main function is the first point of entry after the program begins. It is mandatory to have a function called main(). (In a Windows program, rather than a command-shell one, the entry point must be called WinMain().) The arguments to main() are void, meaning nothing; similarly, the return value of main() (on the left-hand side) is int. This does not have to be the case, but for a simple example it will suffice.
Comments begin with /* and end with */. If the compiler is C++ aware (the majority of C compilers), then a single-line comment may also be entered by starting the line with //.
[Figure: Anatomy of a function declaration - the return value (out), function name, and arguments (in):
short someFunc( short arg1, char *arg2, double arg3 )]

[Figure 3.5: A character string stored as an array of characters, from first location to last; the last location holds the null terminator (= null).]
A function prototype tells the compiler what the function looks like, in terms of arguments passed and return value. This is because the compiler will encounter the call to the function PrintMessage() before the actual function itself. Thus, it is a form of consistency checking.
The lines beginning with #include are directives to the C preprocessor; in this case, it means literally include the files stdio.h and stdlib.h. These are called header files, and contain (amongst other things) function prototypes for the functions which will be used. Here, the printf() function is prototyped in stdio.h, and atoi() is prototyped in stdlib.h. The compiler's help screens or manual will inform you as to which include files are required for each library function. The include files are located in a special dedicated directory, normally called include under the compiler's installation directory.
In Figure 3.6, gets() gets the character string from the user. (Note that gets() performs no bounds checking, and modern code should prefer fgets().) The string is to be interpreted as a number, hence the function atoi() (ASCII to integer) is used to convert the string (in C, an array of characters or chars) into an integer (in this program, a short integer of 16 bits size). In C, character strings are stored in null-terminated form as depicted in Figure 3.5.
#include <stdio.h>
#include <stdlib.h>

// function prototype
void PrintMessage( short NumTimes );

int main(void)
{
    char str[20];
    short NumTimesToPrint;
    gets( str );                          // read the user's string
    NumTimesToPrint = (short)atoi( str ); // convert string to integer
    PrintMessage( NumTimesToPrint );
}
activity 3.6
Uninitialized Variables
activity 3.7
Compiler Warnings
int main()
{
double x;
The data type int may be used, however it can cause problems because it is defined to be the native word size of the machine on which the program is compiled. This may be either 16 or 32 bits, and is thus ambiguous. The data type float also exists for single-precision floating point values, however the extra precision of double is preferred because of the prevalence of floating-point coprocessors in modern CPUs (double generally takes no longer to calculate with). A string is simply an array of characters: char MyName[20]. The array must be terminated by a character value of 0 (the null character).
C will allow certain invalid assignments to be made, and issue a warning. For example, if pi was declared as a short, and we had the statement pi = 3.14, then a warning such as loss of precision may be issued. The program would still run, however (pi would be set to 3). In many systems-level programming tasks, this truncating behaviour may be desirable, but generally it is the cause of many bugs.
This issue is termed data typing, or usually just typing. Type rules catch the passing
of incorrect arguments to a function, which may compile correctly but not execute cor-
rectly. An example is the passing of a double where an int was expected. Languages
are said to be strongly typed if the data types are strongly enforced. Weakly typed
languages do not enforce correct argument typing. Assembler is the weakest in this
regard, as the discipline of enforcing correct data storage sizes is entirely up to the pro-
grammer. Strongly-typed languages, though desirable, often make required operations
impossible. C is somewhere in between; for example, the following is incorrect but on
most compilers will not issue a warning by default:
/* implicit conversion -- may or may not draw a warning */
int x;
double d;
d = 3.14159;
x = d;

/* the same conversion made explicit with a typecast */
int x;
double d;
d = 3.14159;
x = (int)d;
Presumably the effort of placing the typecast (int) means that the programmer was
sufficiently aware of the implications (in this case, truncating 3.14159 to 3 when
stored as an integer).
Module 3 The C and C++ Programming Languages 3.15
activity 3.8
#include <stdio.h>
int main()
{
    short shortVar;
    long longVar;
    longVar = 100000L;          /* too large to fit in a 16-bit short */
    shortVar = (short)longVar;  /* value is truncated */
    printf("longVar=%ld shortVar=%hd \n",
            longVar, shortVar );
    return 0;
}
C has automatic variables, which are declared after an opening brace { and remain in
existence until the matching closing brace }. These are local to the function in which
they are declared. Passed-in arguments appear between the parentheses () of a function
and are passed by value: the function may alter its local copy, but the caller's variable
is unchanged. Global variables (accessible by all functions) are declared
outside any function scope. Figure 3.7 illustrates these concepts, with the function
TestFunc() shown in Figure 3.8.
The output of scope.c is shown in Figure 3.9. Note that local values are not altered
within the scope of the calling function, and that RetVar is initially unassigned and has
a random value.
/* scope.c
* Simple illustration of variable scoping in the C language
*
* John Leis
*/
#include <stdio.h>
/* a global variable
* If we wish to use this variable from other modules (C files)
* we put the same declaration with the keyword "extern" in front.
*/
short GlobalVar;
/* function declaration */
short TestFunc( short InVar1, short InVar2, short *PtrVar);
int main()
{
    short Var1, Var2, RetVar;
    Var1 = 4;
    Var2 = 5;
    GlobalVar = 6;
    printf("Before function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
        RetVar, Var1, Var2);
    RetVar = TestFunc( Var1, Var2, &Var1 );
    printf("After function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
        RetVar, Var1, Var2);
    printf("GlobalVar = %hd\n", GlobalVar);
    return 0;
}
/* a test function
* The function return value is the sum of the passed-in arguments
* (InVar1 + InVar2)
* The contents of the pointer variable PtrVar are replaced
* with the product (InVar1 * InVar2)
* The global variable GlobalVar is changed to (InVar1 - InVar2)
* Note that the function *attempts* to change InVar1 and InVar2,
* but that they are not changed (pass-by-value).
*/
short TestFunc( short InVar1, short InVar2, short *PtrVar)
{
    short SumResult, ProductResult;
    SumResult = InVar1 + InVar2;
    ProductResult = InVar1 * InVar2;
    GlobalVar = InVar1 - InVar2;
    InVar1 = 0;    /* attempt to change the arguments: the */
    InVar2 = 0;    /* caller's copies are unaffected */
    *PtrVar = ProductResult;
    return SumResult;
}
3.3.6 Pointers
In the preceding example, the pointer data type was used. Variables in any program
are stored in memory, and the pointer is just another variable that happens to hold
the memory address of another variable. Although not strictly required for elementary
programming tasks, the pointer is quite powerful and gives extreme flexibility in many
situations.
A code example of pointer variables and their use was given in the previous example
(scope.c, Figure 3.7).
Two symbols, * and &, are used in connection with pointers. Read them as
* contents of
& address of
short Var1;
short *PointerToVar1;
Var1 = 4;
PointerToVar1 = &Var1;
*PointerToVar1 = 5;
may be read as "the contents of the location pointed to by PointerToVar1 are set to 5".
This is shown in Figure 3.10. Note that the value contained in PointerToVar1 is not
assigned directly by the programmer, but depends on where the compiler, linker and
run-time loader allocate the memory addresses. Accessing a variable through a pointer
in this way is called dereferencing, or more generally indirection; the & operator
performs the complementary taking-the-address-of operation.
The question may be asked as to why this added complexity is necessary. The answer
is that such memory addressing facilitates rapid moving through arrays of characters
(strings) or other data types and is often necessary for low-level operations.
Sometimes it is very useful to have a more advanced construct, that of double indi-
rection. This is where a pointer-to-a-pointer is required, as shown in Figure 3.11.
Note how the pointer-to-pointer declaration syntax is consistent: simply declare two
contents-of:
[Figure 3.10: memory layout, showing the 16-bit variable Var1 (contents 4) and PointerToVar1 holding its address.]
short **ppVar;
Again, this added complexity gives much greater flexibility in accessing memory.
Memory-efficient dynamic data structures such as doubly-linked lists use pointer dereferencing.
In DOS or Unix command-line (shell) programs, arguments may be sent to the program
via the command-line itself. Windows also allows the specification of parameters when
an application is started. For example, a command such as

copy somefile.txt other.bak

requires the arguments somefile.txt and other.bak to be passed to the copy program.
The program cmdargs.c, shown in Figure 3.12, simply prints out all of its command-
line arguments. Note in the sample output that the program name itself is the zeroth
argument.
It is often necessary to load in some data files, or save calculated data for later refer-
ence or plotting.
[Figure 3.11: double indirection -- ppVar points to pVar, which in turn points to Var.]
/* cmdargs.c
 * Command-line arguments
 *
 * example:
 * John Leis
 */
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int n;
    for( n = 0; n < argc; n++ )
        printf("Argument %d is %s\n", n, argv[n]);
    return 0;
}
0.001 0.564
0.193 0.809
0.585 0.480
0.350 0.896
0.823 0.747
0.174 0.859
0.711 0.514
0.304 0.015
0.091 0.364
0.147 0.166
An important distinction to be made is between binary and text (or ASCII) formatted
files. Text files contain plain, readable text which can be viewed with any text editor.
For example, temp.txt might contain the data as shown in Figure 3.13.
Text files may be created from a C program using the code fragment shown in Fig-
ure 3.14.
Note the use of fscanf() to read formatted data. The format specifier %lf specifies that
a double-precision variable is being used; on output, a specifier such as %5.3lf limits
the field to a width of 5 with 3 decimal places.
We can then continue on to read back the text file using the code fragment shown in
Figure 3.15.
In the preceding example, the read and write functions are contained in the main pro-
gram for ease of illustration. Of course, it is good practice to separate them into func-
tions.
In addition to text files as discussed above, you may encounter binary data files. These
are not viewable using text editors: they consist of raw 8 or 16-bit quantities (usu-
ally), which are machine-readable and not human-readable. The exact representation
depends on the CPU being used: for example, a Pentium CPU has a different repre-
sentation for 16-bit integers from that of SUN Sparc CPUs. The integer representations
may be converted, but the situation for floating-point numbers is much more problematic.
For this reason, the Institute of Electrical and Electronics Engineers (IEEE) format is
often used.
Binary files may be written by using the "wb" (write, binary) mode when calling fopen().
Instead of writing formatted data with fprintf(), we use fwrite() to write a certain
number of bytes (using the sizeof() operator), and fread() to read the raw byte
stream back.
Figures 3.16 and 3.17 show the code necessary to write and read binary files. Of
course, the binary data files themselves will not be directly printable; try this by
loading temp.bin into a text editor such as DOS edit.
Note the format specifier "rb" (read, in binary mode) as opposed to the text files
discussed previously, which were opened using "r" mode (text is the default). On Unix
systems, there is no need to explicitly specify b as part of the mode string.
/* textfile.c
* Illustrates reading and writing a text (ASCII) file
* containing numerical data samples.
*/
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main()
{
FILE *fp;
double x, y;
short SampNum, NumSamples, NumSamplesRead;
char LineBuf[100];
char *FileName = "temp.txt";
fp = fopen( FileName, "r" );   /* open for reading, text mode */
if( fp == NULL )
    exit(1);                   /* could not open file */
SampNum = 0;
do
{
fscanf( fp, "%lf %lf", &x, &y);
if( ! feof(fp))
{
printf("x=%5.3lf y=%5.3lf\n", x, y);
SampNum += 1;
}
} while( ! feof(fp) );
NumSamplesRead = SampNum;
printf("Read back %hd samples.\n", NumSamplesRead);
fclose(fp);
exit(0);
}
The files temp.txt (ASCII) and temp.bin (binary) generated by the above code should
be examined using the type command in DOS or cat in Unix.
Note that there is considerable scope for generating machine-dependent code: that is,
code which will work on one platform (such as DOS) and not on another (such as Unix),
and vice-versa.
Most binary files will contain, in addition to the data itself, some header information
at the start of the file describing the characteristics of the data.
This header information is somewhat specific to the type of data file. For example,
a picture (image) file will require information about the width of the picture, the height
of the picture, the number of colours present, and so forth. Some standards exist:
for example, the Windows Paint program uses bitmap files with extension .bmp.
Sound (audio, or wave) files often use the extension .wav or .au. You should be aware,
however, that there is a plethora of different file formats, even for (ostensibly) the same
type of data.
Note that there are several possible methods of declaring such a structure, for example
struct Student
{
char Name[20];
double gpa;
};
3.4.1 Portability
Generally, C code is highly portable between different platforms and operating sys-
tems, provided care is taken with library functions. The exceptions are file handling (as
discussed above) and data sizes. The latter is why the int data type is not recom-
mended. File handling in binary mode, if coded carefully, can be platform-independent
and hence portable.
More subtle structure alignment problems can occur when porting. If a structure must
have an exact size and member alignment, the #pragma pack(1) directive may be used.
3.4.2 Makefiles
Typing the compilation command(s) on the command line can become tedious and
tiring. A batch file with the appropriate commands can go a long way to simplifying this.
However, for large projects, a better solution is required. As mentioned previously, it
is good practice to break a project into code modules in separate files. A large project
may run to dozens of files. Making a single change, even just adding a comment
(which of course does not change the executable code), requires compiling and linking
all over again.
For more than relatively simple projects, a makefile is recommended. In a large multi-
source project, if one file is edited then it is the only one which needs to be re-compiled;
the unchanged sources need not be recompiled, only re-linked from their existing object
files. The make utility solves this problem, and automates the building of large software
projects. make does this by checking and comparing the dates on the source files
and the corresponding object files. If the object file is newer, then it must have been
compiled after the last modification to the source. If the source is newer than the
corresponding object file, then that source file must be re-compiled. If any sources are
re-compiled, the link stage must be performed again.
Figure 3.19 shows an elementary makefile. The most important concepts are:
Target is normally the executable program to be generated. More than one target may
exist, but only one is built at a time. The target may be a dummy one in order to
invoke some other operating-system command.
Dependencies specify what files need to be re-built to create a specific target.
Suffix Rules are the rules for building the targets, for example an object file from a
C source, an object file from an assembly-code source, and an executable from
object files.
The first section contains some comments regarding the project. Following this, the
dependencies (object files) are listed; here they are called DEPS and contain only one
object (basic.o), but normally there would be many. The executable target is named
TARGET. Following are the suffix rules; here, the only rule specified is to create an
object file from a C source (compile only). Of course, other language compilers could
be invoked or a rule added for creating an object from an assembly language source
(assembly only). The executable target basic.exe specifies a dependency on all the
object files (although here only one), followed by the appropriate command to run. This
is specified on the following line and must be indented one tab stop. In this case, it is
effectively just a link phase.
Note well: the rule to be executed after the dependency must begin with a tab charac-
ter, not spaces. Otherwise the error missing separator will result. So the second line
of
# suffix rules
.c.o:
	$(CC) -c $(CCOPTS) $(INCDIR) $*.c -o $*.o
# make targets
basic.exe: $(DEPS)
	$(CC) -o $(TARGET) $(DEPS) $(LDOPTS)
Other targets may be specified to enhance project maintenance. Here, the targets
noexe and clean specify an action but no dependencies.
activity 3.9
Make Exercises
make
make
Note that the target is now up to date. Now delete the executable:
make noexe
make
Note that only the link phase is invoked. If there were more than one
source C file, only the file(s) which have been changed would be re-compiled.
Now delete the object files:
make clean
make
For more advanced work, the Frequently Asked Questions (FAQ) may be consulted on
the newsgroup comp.lang.c. Users of Gnu C should consult the FAQ which is available
with the distribution.
// binfile.c
// Illustrates reading and writing a binary file
// containing 16-bit integer data samples.
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main()
{
    FILE *fp;
    short Sample;
    short SampNum, NumSamples, NumSamplesRead;
    double NumCycles=2, pi = 4.0*atan(1.0);
    char *FileName = "temp.bin";

    fp = fopen( FileName, "rb" );   // read, binary mode
    if( fp == NULL )
        exit(1);                    // could not open file

    SampNum = 0;
    do
    {
        fread( &Sample, sizeof(short), 1, fp);
        if( ! feof(fp))
        {
            // found a valid sample
            SampNum += 1;
        }
    } while( ! feof(fp) );
    NumSamplesRead = SampNum;
    printf("Read back %hd samples.\n", NumSamplesRead);
    fclose(fp);
    exit(0);
}
#include <stdio.h>
#include <string.h>
struct Student { char Name[20]; double gpa; };
struct Student StudentRecords[20];
int main(void) {
    strcpy( StudentRecords[4].Name, "fred");
    StudentRecords[4].gpa = 1.0;
    return 0;
}
# compiler flags
CC = gcc
CCOPTS =
BASEDIR =
INCDIR = -I $(BASEDIR)/include
# linker flags
MATHLIB = m
LIBS = -l$(MATHLIB) # note no space after -l
LDOPTS = $(LIBS)
# dependencies
DEPS = basic.o
# executable target
TARGET = basic.exe
# suffix rules
.c.o:
	$(CC) -c $(CCOPTS) $(INCDIR) $*.c -o $*.o

# executable
basic.exe: $(DEPS)
	$(CC) -o $(TARGET) $(DEPS) $(LDOPTS)

clean:
	del *.o

noexe:
	del *.exe
Reliability The need for reliability and the problem of error-checking in large projects.
Reuse The ability to re-use code in one or several projects.
Co-ordination The ability for several members of a design team to work on different
sections of the same project and have the resulting software work seamlessly.
The object-oriented paradigm refers to the joining of data and the functions that operate
on that data. The two entities, data and code, taken together are termed an object.
In a traditional procedural language such as C, the software engineering task normally
begins by defining the data structures and then the code which operates on that data.
The linkage is not tight, and relies on the programmer(s) to ensure that the correct
portion of code operates on the correct data. Although the data-code connection
should be clear at the design stage, it is not immediately obvious at the implementation
stage and much less clear during the maintenance phase.
C++ is a superset of C. That is, a C++ compiler can quite happily compile a standard
C program, and conventional C code can be freely intermixed with C++ code; they
look the same except for some additional syntax. This probably accounts for the pop-
ularity of C++ for programming projects.
Having spelled out the advantages of C++ in a broad sense, some disadvantages must
be mentioned. First is the requirement to upgrade software tools and of course pro-
gramming skills. If the GnuC compiler tools were installed as described previously, then
the GnuC++ compiler is automatically available. Second, some performance may be
lost in terms of raw speed. Whether this loss outweighs the code portability, readabil-
ity and maintainability advantages outlined earlier depends on the particular situation.
In general, large application programming tasks will invariably benefit from the use of
C++. Low-level code such as device drivers and system functions may require a pro-
gramming style appropriate to attaining the utmost performance. C is probably more
appropriate in this situation. However these sorts of tasks are unlikely to benefit from
an object-oriented approach as the tasks tend to be procedural and hardware-oriented.
Note the use of the cpp extension for C++ sources. The batch file gcpp.bat is provided
in the example sources, and is invoked as follows:
gcpp myprog
Object files, include files, linking and so forth remain as per the previous discussion for
the C compiler.
Two seemingly superficial differences between C and C++ must first be mentioned.
The first is the comment delimiter. C uses the /* and */ constructs to mark the
beginning and end of comment blocks, respectively. The comments may span several
lines. C++ has a single-line comment: everything from // to the end of line defines a
comment. Most C compilers now also support the C++ comment syntax.
activity 3.10
C++ Compilation
// enter as cpptest.cpp
// This is a one-line comment
/* This is a
* multi-line
* comment.
*/
#include <iostream>
using namespace std;
int main(void)
{
double someVar = 3.14159;
// this is a comment
cout << "the value of someVar" ;
cout << " is " << someVar << "\n";
cout << "we could also use ";
cout << "this syntax. " << someVar << endl;
}
Attempt to compile using the gcc compiler as per the previous examples.
Note that many errors are produced, because the C++ libraries are not
linked. Now compile using g++ (or the supplied gcpp batch file) instead.
In standard C, a data structure (also called an Abstract Data Type or ADT) is repre-
sented by a structure and denoted by the struct keyword. Functions are essentially
independent in their declaration: it is up to the programmer to ensure that functions op-
erate on the correct data structure(s). The advantage of declaring data as a structure,
rather than simply as a collection of individual types, is twofold: the logical association
becomes a focus at design time, and the code becomes more readable and hence
easier to maintain.
We could declare a structure prototype Student for storing student records as follows:
It is of course possible to declare the Name and gpa fields separately, but that would not
indicate to the reader that the two are related. Additionally, it forces the code designer
to view the two as one entity. A particular Name is associated with one, and one only,
gpa.
The type definition above is a prototype: it does not reserve any memory space. If we
wish to declare a particular instance of a data structure using the above prototyping
method, the syntax is identical to any other intrinsic data type:
#define MAX_STUDENTS 20
struct Student StudentRecords[MAX_STUDENTS];
struct Student *pStudentRecord;
pStudentRecord = &StudentRecords[4];
Finally, declaring functions that use a data structure is consistent with other intrinsic
data types:
In the preceding example, the connection between the Student data and the
updateStudentRecord() function is not especially tight. It is not enforced by the com-
piler (except perhaps for the typing of the pointer argument). The main connection
is through the naming convention; if the function were called updateRecord(), for
example, the connection would be lost.
Recall that a struct is a prototype and that instances of a structure must be declared.
The class is the prototype. An object represents a particular piece of data and the
code to operate on that data it is an instance of a class. As the class contains not
only data but also code, special names are used to define the components of the class.
The functions within a class are termed methods, and the data is termed the attributes
of that class. For a window object, the attributes may be:
1. width in pixels;
2. height in pixels;
3. horizontal position;
4. vertical position.
(Footnote to the earlier StudentRecords example: C indices start at zero, hence the 5th record has index 4.)
If, on the other hand, access functions were used to return and set the year, then
only those access functions would need to be changed. All code that accesses the
attributes of an object should thus do so using access methods and not alter the at-
tributes themselves directly. Of course, this requires some additional overhead in terms
of the larger amount of code required.
Some concrete examples of these will be given in the following section, using C++
syntax.
class Vehicle
{
public:
    int getNumWheels( void );
    void printColor( void );
private:
    int numWheels;
    char *color;
};
class Vehicle
{
int numWheels;
char *color;
};
This really isn't much different from a C structure. The principle of data hiding may be
implemented by defining the public and private components of the class, together with
the access functions, as shown in Figure 3.20.
Now the attributes are private and any attempt to access them will be deemed illegal
by the compiler. In this case that is useful: the number of wheels would be set when
the object is first declared and never changed. There should be no need to change the
component numWheels, for example. However, a reasonable requirement would be to
return the number of wheels contained in a particular object of the Vehicle class.
Thus the existence of the public access function getNumWheels().
When a data object is created, the data should be set to some sensible default value.
A common error in C is to declare a data structure and forget to initialize some or all
of its members. This may be a difficult problem to trace. So C++ has the constructor,
which is a function called when an object belonging to a particular class is declared.
class Vehicle
{
public:
// no return type allowed for constructor
Vehicle( char *color, int numWheels );
private:
int numWheels;
char *color;
};
A constructor function may be added to the vehicle class as shown in Figure 3.21.
The body of the function may be implemented within the class definition, as has been
done so far. This is fine for short functions but may make the class definition difficult or
impossible to read if long functions are involved. So naturally the function body may be
declared elsewhere in the source files. However, some notation is required to associate
a function with a particular class. Figure 3.22 illustrates this for the constructor function
of the vehicle object.
The :: notation explicitly associates a function with an object. On the left is the class
name; on the right is the member function name. Since constructors must have the
same name as the class, we end up with the syntax for the constructor declaration
being Vehicle::Vehicle().
Member functions are implicitly passed a pointer to the object which is always called
this. It is implicit because it does not appear in the function definitions. However,
there must be some underlying mechanism for determining which particular object is
being referenced: the this object. The use of this is optional, as in the above
example (there are some exceptions to this rule).
// constructor
// note no return type allowed
Vehicle::Vehicle( char *color, int nw )
{
    // explicit use of 'this'
    // needed because the argument 'color' has the
    // same name as the class member
    this->color = color;
    numWheels = nw;
}
Function overloading allows C++ to have more than one function with the same name
but different arguments. This may be useful in situations where, for example, sensible
defaults are to be assumed for an object. Which particular function is called is deter-
mined by the compiler, using the number and data types of the arguments.
Just as a constructor allocates memory and sets the initial attributes for an object, a
destructor function releases the memory and performs any additional tasks which may
be required when an object is no longer needed. The destructor function must also
have the same name as the object but is prefixed by a tilde sign ( ~ ). Figure 3.23
shows both an overloaded constructor function and a destructor function.
The declaration of an object as an instance of a class may now be done. Figure 3.24
shows several alternatives, including the declaration of a pointer to a new object. Note
the use of the new keyword in the latter case.
activity 3.11
C++ Objects
Compile and run the program objects.cpp using the g++ Gnu C++ com-
piler. Note the output from the program and trace through the listing,
comparing the output to the source code.
3.5.6 Input/Output
C++ builds on the inherent operator overloading available in the language to provide
more robust formatting. The basic iostream constructs allow for formatted output in
a somewhat different manner using the streams cout and cin in conjunction with the
class Vehicle
{
public:
// no return type allowed for constructor & destructor
Vehicle( char *color, int numWheels );
Vehicle( void );
~Vehicle();
private:
int numWheels;
char *color;
};
// constructor
// note no return type allowed
Vehicle::Vehicle( char *color, int nw )
{
    // explicit use of 'this'
    // needed because the argument 'color' has the
    // same name as the class member
    this->color = color;
    numWheels = nw;
}
// destructor
// note no return type allowed
Vehicle::~Vehicle()
{
cout << "vehicle destructor for " << color << " vehicle\n";
}
// member function
void Vehicle::printColor(void)
{
cout << "color is " << color << "\n";
}
Vehicle myCar( "red", 4 );
myCar.printColor();
cout << "myCar.numWheels returns " << myCar.getNumWheels() << "\n";
// illegal: numWheels is private
// myCar.numWheels = 0;
// OK, provided 'value' is declared as a public member
myCar.value = 10;
operators << and >>. Even more precise control over field widths, degree of precision
and so forth may be obtained using the manipulators defined in iomanip.h.
activity 3.12
C++ Input/Output
Work through the supplied iocons.cpp for simple examples of the new
formatting methods.
The C printf() approach requires some attention to detail on the part of the programmer,
in that the variable theAnswer must be of type double in order that the %lf format
specifier work correctly. If, for example, theAnswer were an integer, then the results
would be unpredictable.
activity 3.13
C Formatting
#include <stdio.h>
int main()
{
    short theShortAnswer;
    long theLongAnswer;
    double theDoubleAnswer;
    theShortAnswer = 1342;
    theLongAnswer = 1342L;
    theDoubleAnswer = 1342.0;
    /* note: the mismatched specifiers below are deliberate */
    printf("theShortAnswer: ");
    printf("hd format:%hd ld format:%ld lf format:%lf\n",
            theShortAnswer, theShortAnswer, theShortAnswer);
    printf("theLongAnswer: ");
    printf("hd format:%hd ld format:%ld lf format:%lf\n",
            theLongAnswer, theLongAnswer, theLongAnswer);
    printf("theDoubleAnswer: ");
    printf("hd format:%hd ld format:%ld lf format:%lf\n",
            theDoubleAnswer, theDoubleAnswer, theDoubleAnswer);
    return 0;
}
activity 3.14
C++ Formatting
Work through the supplied format.cpp for simple examples of the new
formatting methods.
The sample program objects.cpp has already shown the basic method for declaring
classes and objects (specific instances of a class). Note especially:
3. The way in which member functions are declared and the function body defined,
either within the class definition or elsewhere using the scope operator ::.
but that the member function in the base class may be explicitly called using the scope
resolution operator as follows:
The keywords public and private enforce encapsulation of a class's data and meth-
ods. Sometimes, however, it is necessary (or convenient) to have a common function
which can access the private data of several objects, thus temporarily over-riding the
private specification. This is done using the friend keyword. Essentially, this de-
clares other friendly classes which are allowed to access the private information in the
current class. In the example friend.cpp, the class CustomerRecord is defined as
having a friend function.
Thus the Capitalize() member function of the object myName, an instance of class
MyNameClass, is allowed to access the variable customerName, which is private to the
CustomerRecord class. Under normal circumstances this would be illegal.
activity 3.15
Friend Functions
Compile and run friend.cpp. Now remove the friend keyword and
note the compilation errors pertaining to private member variables.
activity 3.16
Object Pointers
activity 3.17
Derived Classes
There are two methods by which objects may incorporate other objects; derivation is
considered first.
activity 3.18
Just as one class may inherit attributes and methods from another class, any single
class may inherit attributes from several other classes. This is termed multiple inheri-
tance. The example multint.cpp shows a generic Window class for drawing a window
on screen, which inherits attributes and methods from both the Button class and the
TitleBar class. The Button class may have attributes such as foreground color, back-
ground color, and methods to simulate pushing of the button on screen. The syntax
simply extends the previously-mentioned single-inheritance case:
activity 3.19
Multiple Inheritance
Virtual functions are member functions of a class which are normally expected to be
over-ridden when the class is derived. The example virtual.cpp gives an example of
such a case. The Employee class is used to derive a sub-class, ContractEmployee.
This would be an appropriate use of class derivation: generic attributes such as em-
ployee name, employee number and so forth would be used in the more specific con-
tract employee class. However, the method of calculating the pay for contract employ-
ees is quite different to the method of calculating the pay for salaried employees. Thus
the member function calcPay() is declared in the base class. It is expected that this
function be over-ridden in each derived class. To enforce this, a pure virtual function
is declared in the base class. The compiler will expect a derived version of this function
to be found in all derived classes.
activity 3.20
Virtual Functions
Normally all attributes, whether private or public, are stored separately for each object,
even though the objects may belong to the same class. This is the expected behaviour.
Sometimes it is necessary to define an attribute that has a single value across all
instances of a class (all objects of that class). Such attributes are given the qualifier
static.
The example statics.cpp shows the use of statics in a simplistic way. The Bank object
contains the balance for a person's bank account. The objects myBank and yourBank
are thus quite distinct. However, the interest rate paid across all accounts is the same.
Therefore, a static variable is required to maintain the interest rate irrespective of the
instances of the Bank class.
activity 3.21
Static Scope
Compile and run statics.cpp and verify that changing the currentRate
variable changes the value in both Bank objects myBank and yourBank.
Vector &Vector::operator=( const Vector &rhs )
{
    numValues = rhs.numValues;
    vecValues = new int[rhs.numValues];
    for( int n = 0; n < numValues; n++ )
        vecValues[n] = rhs.vecValues[n];   // deep copy of the elements
    return *this;
}
Vector v(4);
Vector newV(4);
newV = v;
implicitly calls Vector::operator=. If the new operator= function were not provided,
the assignment would cause the compiler to generate code performing a member-by-
member copy of the attributes of a Vector object. In some cases, particularly those
where objects contain pointers to other data, this may not be the desired behaviour.
It is then up to the programmer to allocate the new storage and set/copy elements as
appropriate.
activity 3.22
Operator Overloading
Naturally, arrays of objects as well as single objects may be created. An array may be
created using a declaration such as
Vehicle carPark[10];
however this does not allow the explicit specification of a constructor to be called (the
void, or default, constructor is called for each element).
An alternative is to declare a pointer to the object and then allocate the necessary
storage using new, as follows:
Vehicle *pCarLot;
pCarLot = new Vehicle[10];
Using this approach, a specific constructor cannot be called directly; the void
constructor is invoked for each element of the array.
It must be kept in mind that because the array is dynamically allocated using new, the
memory must be freed using delete (compare malloc() and free() in standard C).
The array declared as above is freed using
delete [] pCarLot;
Although it may seem that the empty array specifier is either unnecessary or should
contain the number of objects, the run-time system is able to determine the amount of
memory to be freed (just as free() does not require the number of bytes, only a
pointer to the start of the block). The size is kept internally in a data structure
associated with the memory pointer.
Lastly another method is static initialization, which allows the specific constructor to be
called:
activity 3.23
Arrays of Objects
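The listings referred to above are not shown in this extract; a self-contained sketch of the three approaches, using a hypothetical Vehicle class, pulls them together:

```cpp
#include <cassert>

class Vehicle
{
public:
    Vehicle() : wheels( 4 ) {}         // the void (default) constructor
    Vehicle( int w ) : wheels( w ) {}  // a specific constructor
    int numWheels() const { return wheels; }
private:
    int wheels;
};

int countWheels()
{
    Vehicle carPark[3];                 // plain array: void constructor only

    Vehicle *pCarLot = new Vehicle[3];  // dynamic array: void constructor only
    int total = 0;
    for( int i = 0; i < 3; i++ )
        total += carPark[i].numWheels() + pCarLot[i].numWheels();
    delete [] pCarLot;                  // empty brackets free the whole array

    // static initialization: a specific constructor for each element
    Vehicle trucks[2] = { Vehicle(6), Vehicle(6) };
    return total + trucks[0].numWheels() + trucks[1].numWheels();
}
```

Each element of both carPark and pCarLot is constructed with four wheels, while the statically initialized trucks each receive six.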
3.5.12 Functions
The example code module func.cpp illustrates several important principles of C++
relating to functions. Firstly, the concept of default arguments to a function is a C++
feature not found in C. A function may be declared with a default value for a trailing
parameter; it will then be called with the two values for x and y if both are supplied,
but if only one argument is supplied, the second takes on the default value given in the
function prototype (50 here).
A related concept is that of overloading a function, resolved via the data types of the
arguments supplied. For example, two prototypes sharing the same function name, one
taking an int parameter and one a double, cause the compiler to call the appropriate
function depending upon whether the argument supplied is an int or a double.
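The prototypes from func.cpp are not reproduced in this extract; a minimal sketch of both features, with function names that are illustrative only, might be:

```cpp
#include <cassert>

// Default argument: if the caller omits y, the value 50 from the
// prototype is used (addEm is a hypothetical name; func.cpp's actual
// names are not shown here).
int addEm( int x, int y = 50 )
{
    return x + y;
}

// Overloading: the same name, resolved by the argument's data type.
int showVal( int x )    { return x; }       // chosen for an int argument
int showVal( double x ) { return (int)x; }  // chosen for a double argument
```

A call addEm(1) therefore behaves as addEm(1, 50), while showVal(7) and showVal(7.9) dispatch to the int and double versions respectively.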
Functions declared as inline are compiled as inline code rather than as a separate
function: a copy of the function's code is inserted wherever a call to the function
appears. Thus several copies of the function's code may actually appear in the output
file. This increases the code size but may yield improved performance (though this is
certainly not guaranteed).
Another important matter in function calls is the ability to call native C functions.
Although standard C functions should not normally be written for a C++ project, the need
may arise to call C code from a library, a C-interfaced assembly language routine, or
similar. Because the C++ compiler mangles function names (encoding the argument types
into the linker symbol) whereas the C compiler does not, C functions cannot be called
directly. They must be declared with the extern "C" qualifier, which suppresses the
C++ name-mangling rules for the function name which follows.
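The actual declarations live in func.cpp and cfunc.c, which are not reproduced here; in outline, with a hypothetical function name, the pattern is:

```cpp
// Declaration as the C++ caller would write it; cAverage is a
// hypothetical name standing in for the real C function.
extern "C" int cAverage( int a, int b );

// In a real project the definition lives in a C-compiled module
// (such as cfunc.c); here it is defined in the same file, with C
// linkage, purely so that the sketch is self-contained.
extern "C" int cAverage( int a, int b )
{
    return ( a + b ) / 2;
}

int callFromCpp()
{
    // the linker now looks for the plain C symbol, not a mangled C++ name
    return cAverage( 10, 20 );
}
```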
The sample program func.cpp demonstrates these principles, together with the use
of the const qualifier and local/global scoping of variables. const signifies a constant
value which must not be changed by the programmer (such as an array limit). The
scope-resolution operator :: may also be used to resolve the conflict that arises
if a function name is the same at a local (function) and global (module) level (not a
desirable practice however).
activity 3.24
Compile func.cpp and cfunc.c using the batch file cfunc.bat. Step
through the code to verify the features discussed above.
3.5.13 References
A reference in C++ behaves in a manner somewhat akin to a pointer. A reference is
declared as follows:
int x;
int &refX = x; // declare and initialize
where the ampersand (&) indicates a reference or indirection to another variable, much
as a pointer is an indirection to another variable. Using either a pointer or a reference is
termed pass-by-reference. Contrast this to pass-by-value where a copy of the entire
data variable is passed and the original cannot be modified by the called function. The
main difference between a pointer and a reference is that a reference must be initialized
when it is declared.
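A short sketch contrasting the three parameter-passing styles (the function names here are illustrative, not taken from the example programs):

```cpp
#include <cassert>

void incByValue( int x )     { x = x + 1; }    // copy: caller's variable unchanged
void incByRef( int &x )      { x = x + 1; }    // reference: caller's variable changed
void incByPointer( int *px ) { *px = *px + 1; } // pointer: same effect, different syntax
```

Calling incByValue leaves the caller's variable untouched, while incByRef and incByPointer both modify it; the reference version simply hides the indirection syntax.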
activity 3.25
References
activity 3.26
Reference Initialization
class ListItem
{
public:
ListItem( char *name );
~ListItem();
private:
char *itemName;
ListItem *pNext;
};
Pointers to objects are often used in C++ in the same way as pointers to structs in
C. The example program objptr.cpp implements a simple linked list using a pointer to
the List object as in Figure 3.26.
The list is thus dynamically created, rather than using a static array. Adding an item to
the list requires traversal of the list as in Figure 3.27.
activity 3.27
Object Pointers
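objptr.cpp and Figures 3.26-3.27 are not reproduced in this extract; a sketch of the traversal idea, built on the ListItem class above (with the members made public here for brevity, which is an assumption), might be:

```cpp
#include <cassert>
#include <cstring>

class ListItem
{
public:
    ListItem( const char *name ) : pNext( 0 )
    {
        itemName = new char[ strlen( name ) + 1 ];
        strcpy( itemName, name );
    }
    ~ListItem() { delete [] itemName; }
    char *itemName;   // public for brevity; the original declares these
    ListItem *pNext;  // private, with accessors assumed
};

// Append a new item by walking to the end of the list,
// in the manner of the traversal of Figure 3.27.
void addItem( ListItem *pHead, ListItem *pNew )
{
    ListItem *p = pHead;
    while( p->pNext )        // traverse until the last node
        p = p->pNext;
    p->pNext = pNew;
}

int listLength( const ListItem *pHead )
{
    int n = 0;
    for( const ListItem *p = pHead; p; p = p->pNext )
        n++;
    return n;
}
```

Each call to addItem traverses from the head, so the list is built dynamically rather than sized in advance.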
Just as the = operator may be overloaded, so may the << operator. For standard
output, or output to files, the << operator is used. An example of overloading this
operator is given in ionew.cpp, where a data type PhoneNumber is declared. A special
output formatting function for this data type, which separates the area code and the
phone number, is shown in Figure 3.28.
Now the programmer is freed from specific output formatting considerations when using
the PhoneNumber data type:
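Figure 3.28 itself is not reproduced in this extract; the overload, with the field names assumed, would follow this pattern:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Field names areaCode and number are assumptions; ionew.cpp's
// actual declaration is not shown in this extract.
struct PhoneNumber
{
    int  areaCode;
    long number;
};

// Overloading << lets any output stream (cout, an ofstream, a
// stringstream) format a PhoneNumber with the area code separated.
std::ostream &operator<<( std::ostream &os, const PhoneNumber &pn )
{
    os << "(" << pn.areaCode << ") " << pn.number;
    return os;   // returning the stream allows chained << calls
}
```

Returning the stream reference is what permits the usual chained style, such as cout << pn << endl.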
activity 3.28
File I/O is a special case of screen/keyboard I/O, so the new output stream concept
may equally be applied there. Instead of using fopen(), the output stream class
ofstream is used as in Figure 3.29:
ofstream *outStream;
outStream = new ofstream( "test.txt" ); // filename is illustrative
if( ! outStream )
{
cout << "could not open output file" << endl;
exit(1);
}
outStream->close();
delete outStream;
The include file fstream is now required. The cout-type operators may now be used as
shown in Figure 3.30. Reading in a text file is similar (Figure 3.31). The complete code
may be found in iofile.cpp.
activity 3.29
Compile, run and step through the file I/O example program iofile.cpp.
ifstream *inStream;
const int MAXLINE = 50;
char lineBuffer[MAXLINE];
inStream = new ifstream( "test.txt" ); // filename is illustrative, as before
if( ! inStream )
{
cout << "could not open input file" << endl;
exit(1);
}
while( ! inStream->eof() )
{
inStream->getline( lineBuffer, MAXLINE );
if( inStream->eof() )
{
cout << "end of file\n";
}
else
{
cout << "read back: " << lineBuffer << endl;
}
}
inStream->close();
delete inStream;
Further Reading
a
http://www.usq.edu.au/users/leis/units/70835/835link.html
Module 4
CODING TECHNIQUES
Error handling.
Assembly language.
Software optimization.
Linking C and assembly code.
Dynamic memory allocation.
Mixed-language programs.
4.2 Introduction
There are three main thrusts to this module: error handling, performance and memory
management. These are aspects of system coding which the designer and implemen-
tor ought to be aware of from the beginning.
The module begins with a brief examination of some methods of error-handling in complex
systems. As many computer systems (embedded systems, servers, and the like) often have
to run unattended, some method of recording system errors when they occur is useful.
Although relatively straightforward, this is often a vital aspect of any system design,
especially when things go wrong!
The other aspect examined in this module is performance optimization. Attaining the
best performance is not simply a worthy goal; often, real-time constraints mean that
every last bit of performance must be teased out of a system. More commonly, certain
sections of code present bottlenecks to the overall system performance. These sections
must be targeted for enhancement.
Finally, the issue of memory management is examined. This is probably the area which
causes most grief for software engineers. Bugs in this area can be very, very subtle and
difficult to track down. Some of the more common problems in dealing with memory
are presented in order to make the student aware of them immediately, rather than
through painful experience.
Systems such as network servers and embedded systems must often run without any
supervision. Thus the traditional technique of debugging statements may not be use-
ful. To begin with, there may be no system operator present to examine the error
messages. Error messages may often mean little to a supervisor of a system, but be
invaluable to the software engineer who designed the product. For example, a cryptic
Error # 23 is of little use unless 23 actually has some meaning to whoever sees the
message. Furthermore, many devices such as handheld devices do not have the ca-
pability to output diagnostic codes. Even if they did, non-technical operators will expect
that a reset of the device will cure the problem.
Figure 4.1 shows one method of reporting internal errors. Firstly, note the perror()
Unix system call. This is used after system calls (such as fopen() in the example)
to print a string appropriate to the last error. The internal variable errno is used to
store this error. Secondly, the compiler may be used to help track down the specific
section of code responsible for the error using the C preprocessor macros __FILE__
and __LINE__, which correspond to the file (module) name and the line number within
the module. Obviously such information is of little use to the end-user, but may save
valuable debugging time.
/* errhand.c
* To illustrate some methods of system error trapping.
* The macros __FILE__ and __LINE__ are defined by the preprocessor.
* Output:
fopen: No such file or directory
Error occurred in module errhand.c on line 17
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
if( ! fopen("AnInvalidFilename", "r") )
{
perror("fopen");
fprintf( stderr, "Error occurred in module %s on line %d\n",
__FILE__, __LINE__ );
exit(1);
}
exit(0);
}
In the Windows NT environment, code such as presented in Figures 4.2 and 4.3 may be
used. Note the use of the system function GetLastError(), which performs a function
similar to perror(). The error-handling routine has been condensed into one function,
SysErr(), as it is likely to be called many times.
activity 4.1
Error Handling
Compile both versions of error-handling code and check that they work
as expected. Explain why the __FILE__ and __LINE__ symbols are de-
fined in a macro SysErr() rather than in the function body of doSysErr()
itself.
In systems which operate unattended, a log file may be used to record significant sys-
tem events. In Unix systems, log files are typically stored under /var/log/. Under Win-
dows NT, the system log may be accessed via Administrative Tools-Event Viewer.
Simply writing the error message to a file may be satisfactory, but consider that the
system may be running 24 hours a day, 7 days a week: the log files generated may become
quite large. Thus, a circular log file may be implemented, such that the messages wrap
around, with only the most recent messages available (for example, the most recent 100
messages). New messages overwrite the oldest messages in the system.
activity 4.2
Log Files
How would you implement a circular log file? (Hint: use fixed-length lines
when writing to the file.)
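One possible sketch along the lines of the hint: with fixed-length records, message i simply overwrites slot i mod N, so the file never grows beyond N records. The file name and sizes below are illustrative only.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

const int LINELEN  = 64;    // fixed record length, including newline
const int MAXLINES = 100;   // keep only the most recent 100 messages

// Write one message into the circular log. msgNumber counts messages
// ever written, and is reduced modulo MAXLINES to a slot in the file.
void logMessage( const char *filename, long msgNumber, const char *text )
{
    FILE *fp = fopen( filename, "r+b" );  // open without truncating
    if( !fp )
        fp = fopen( filename, "w+b" );    // create on first use
    if( !fp )
        return;

    char record[LINELEN];
    memset( record, ' ', LINELEN );       // pad to the fixed length
    int n = (int)strlen( text );
    if( n > LINELEN - 1 )
        n = LINELEN - 1;                  // truncate over-long messages
    memcpy( record, text, n );
    record[LINELEN - 1] = '\n';

    long slot = msgNumber % MAXLINES;     // wrap around
    fseek( fp, slot * LINELEN, SEEK_SET ); // seek to the record's slot
    fwrite( record, 1, LINELEN, fp );
    fclose( fp );
}
```

Because every record is the same length, seeking to slot * LINELEN always lands on a record boundary, which is why the hint suggests fixed-length lines.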
Figure 4.4 shows a simplified view of the Pentium family registers. The general purpose
registers ax, bx, cx and dx are 16 bits wide. These are split into 8-bit (1 byte) low and
high halves: al is the lower half of ax, ah is the upper (high) half of ax; ch is the high half
of cx and so forth. These registers may be loaded from each other, and loaded/stored
from/to memory, using several addressing modes. The extended versions of these
registers are 32 bits (4 bytes) wide and begin with the prefix e. Thus eax is a 32-bit
register, whose lower 16 bits is ax, whose upper and lower 8 bits in turn are designated
ah/al. The ax register is sometimes called the accumulator, and in certain special
block-move instructions the cx register is referred to as the count register.
/* errhand.c
* Windows NT system call error handling
*
* Example output:
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
HANDLE hFile;
DWORD errorCode;
LPVOID lpMsgBuf;
Register esp is the extended stack pointer. It is used in the manner of a traditional stack
pointer: when an item is pushed onto the stack, the stack pointer is decremented by
the correct number of bytes. When a value is popped from the stack, the stack pointer
is incremented as appropriate.
The source index (esi) and destination index (edi) registers are used to index memory.
Like the stack pointer, these cannot be split in the way the general-purpose registers
may be split. Certain instructions reference these registers implicitly.
Finally, the base pointer ebp is similar in some ways to a stack pointer. It is used in the
design of high-level languages, when arguments are passed to functions. This will be
examined further in later sections in the context of the assembly-code output from a C
compiler.
The following assumes, for simplicity, a flat memory model. In reality, the protected
mode of the Pentium processor means that hardware-enforced memory bounds check-
ing stops inadvertent or malicious access to memory areas that a program is not meant
to access. This is done via Descriptor Tables (DTs). A discussion of this is an ad-
vanced topic, beyond the scope of this module.
[Figure 4.4: Pentium general-purpose register layout. eax (32 bits) contains ax (16 bits), which is split into ah (8 bits) and al (8 bits).]
4.5 C to Assembler
Most C compilers are able to output assembly code. Some are even able to interleave
the C and assembler codes, so that the block of C code and corresponding assembler
may be easily seen. The Gnu C compilers use the switch -S to do this. All of the
examples in this chapter have been generated in this way.
The simple function shown in Figure 4.5 will be used to illustrate the conversion of C to
assembler. Referring to this figure, three types of variable have been used:
long aLongGlobalVar;                 /* global variable */

long asmfunc( long val1, long val2 ) /* signature reconstructed from context */
{
long local1, local2;                 /* local (stack) variables, 4 bytes each */
local1 = 1234;
local2 = 56789;
local1 = val1;
local2 = val2;
aLongGlobalVar = 1245678;
return local1;
}
The annotated assembly code output is shown in Figure 4.6. This was generated using
gcc -S asmfunc.c -o asmfunc.s
where the suffix .s indicates assembler (some compilers use .asm or some other
extension).
To begin with, note the global declaration for the function using .globl. The function
name, as with global variables, has an underscore prepended. The entry or prolog
code to the function:
1. Saves the caller's base pointer on the stack and copies the current stack pointer
into the base pointer ebp.
2. Adjusts the stack pointer to reserve sufficient space for the local variables.
In this way, the ebp base pointer points to the base in memory of the local variables.
This is illustrated in Figure 4.7, with the steps numbered 1 through 5.
Following on, the value 1234 is moved into variable local1. This is at offset -4 from
the base pointer, hence the indexed addressing instruction
movl $1234,-4(%ebp)
.globl _asmfunc
_asmfunc: # prolog
pushl %ebp # save base ptr
movl %esp,%ebp # base ptr to stack base
subl $8,%esp # reserve space for 2 local
# variables (4 bytes each)
leave # epilog
ret
Figure 4.6 Assembly code output for the C function of Figure 4.5.
The local variable local2 is stored next on the stack (remember that the stack grows
downwards towards lower memory).
Accessing the function parameters such as inval1 means accessing a higher address
in memory, hence a positive offset.
The global variable is accessed directly, in the data segment of the program. Finally,
the return value is in the eax register. Normally a register is used for return values,
although it may be different on different compilers.
activity 4.3
C to Assembler
Using the example C code and assembler output, draw a diagram sim-
ilar to Figure 4.7 as each instruction is processed. Show the memory
locations of each variable.
[Figure 4.7: Stack frame layout, from high memory (function arguments, return address, saved base pointer) down to low memory (local variables), with the prolog steps numbered 1 through 5.]
The easiest way to examine a C-callable assembly code function is to first write the
function in C and examine the output from the compiler, and then optimize the assembly
code.
Figure 4.8 shows the C function which calls an assembly-coded function, asmfunc().
Not only integer data items will be used, but also an array, which is passed by reference.
Passing by reference means that the address of the array (arrptr in the figure) is
passed as a parameter. The output of the program is shown in Figure 4.9.
The initial code shown in Figure 4.10 is similar to that examined in the previous exam-
ple. The stack and base pointer registers are adjusted to obtain the stack frame.
In Figure 4.11, a loop (beginning with the label fill) is used to access the array. This fills
the array with values 7, 12, 17 and 22. This type of coding (rapid access to an array
of values) is commonly the type of code which must be optimized.
In the second half of Figure 4.11, some more advanced processor operations are
shown. The instruction sequence rep movsb is used to move (mov) a string (s)
consisting of byte values (b). The rep instruction prefix signals that the instruction
is to be repeated the number of times given in the cx register.
#include <stdio.h>
#include <stdlib.h>
long aLongGlobalVar;
unsigned char srcdata[] = { 200, 201, 202, 203 };
unsigned char destdata[] = { 9, 8, 7, 6 };
int main()
{
short val1, val2, retval, i;
long arrptr;
short array[4];
aLongGlobalVar = 87;
val1 = 12345;
val2 = 6789;
arrptr = (long)&array[0];
.file "asmfunc.s"
.data
# 0x for hexadecimal
#somedata: .long 0x56781234
somedata: .long 9876
.text
# int asmfunc(int inval1, int inval2, long arrptr);
# note _ underscore required
.globl _asmfunc
_asmfunc:
# prolog
pushl %ebp # save base ptr
movl %esp,%ebp # base ptr to stack base
# epilog
leave
ret
Figure 4.11 Called assembly code (part 2 of 2).
Of course, the most efficient algorithm must be used for the task (algorithm optimization).
Although this depends almost entirely on the task at hand, some simplifications
are often overlooked. For example, suppose some section of code had to be implemented
if the conditions "today is Tuesday" and "today is the first day of the month" were
both true. Either ordering of the tests produces the same output, but since the second
condition is less likely to be true, it should be tested first, so as to rule out the
subsequent test most of the time.
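The code for the two orderings is not shown in this extract; the effect can be sketched as follows, where the condition functions are hypothetical and the counters merely record how often each test actually runs:

```cpp
#include <cassert>

int tuesdayTests = 0;
int firstDayTests = 0;

bool isTuesday( int dayOfWeek )   { tuesdayTests++;  return dayOfWeek == 2; }
bool isFirstDay( int dayOfMonth ) { firstDayTests++; return dayOfMonth == 1; }

// Because && short-circuits, placing the less likely test first
// means the second test is usually skipped entirely.
int countHits( int days )
{
    int hits = 0;
    for( int d = 0; d < days; d++ )
    {
        int dayOfWeek  = d % 7;          // simplified calendar
        int dayOfMonth = ( d % 30 ) + 1;
        if( isFirstDay( dayOfMonth ) && isTuesday( dayOfWeek ) )
            hits++;
    }
    return hits;
}
```

Over 420 simulated days, isFirstDay runs 420 times but isTuesday only on the 14 first-of-month days; reversing the order would run the Tuesday test 420 times instead.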
Re-writing portions of the code in assembly is tedious and error-prone, but a speedup
of the order of two, ten or even one hundred times is not uncommon. This of course
assumes the performance bottleneck has been identified and that the programmer has
sufficient skill to work in the given low-level assembly language.
In the following discussion, the execution time reductions may only be of the order of
one or two instructions in many cases. A one-off reduction of this magnitude is hardly
worth the effort expended. But where code portions appear in loops or nested loops,
the small speedup is leveraged many times over. For example, consider the code to
move a window on the screen from one location to another, as would be encountered
when moving or resizing a window on a PC desktop. Suppose the window is 400
pixels wide by 300 pixels high; the number of bytes moved is then 120 000. Suppose
(optimistically) that the movement of each pixel could be reduced in time by 10
microseconds. The user would then observe a total reduction of over a second, which is
certainly noticeable. It can easily be seen that real-time video games, for example, are
ideal candidates for optimization.
Constant Propagation Using constants directly when they need not be stored.
Dead Code Elimination of code which is present but whose result is not used.
Use of Registers Using registers rather than memory to store often-used variables.
Loop Invariance Removing code which is executed within a loop but need not be.
Function Inlining Duplicating the code for functions to save the call/return overhead.
Data alignment Aligning data for the most efficient memory bus access.
The code shown in Figure 4.12 is used for the following examples.
/* optim.c
* For GnuC
* Use
gcc -S -O3 -funroll-loops optim.c -o optim.s
gcc -S -O -finline-functions optim.c -o optim.s
*
* John Leis
*/
r = 4*5;
r = r * 9;
r = r * r;
//r = r + i*someArg;
}
return r;
}
The statement r = 4*5 is reduced to a single move of a constant: the value 20 has been
pre-calculated and inserted into the compiled code, so the multiplication is not done at
run-time. Note also that the stack in memory is used to hold the variable r at offset -8
from the base pointer register ebp.
activity 4.4
Compiler Optimization
Using the GnuC compiler, compile the C code optim.c and
examine the assembly listing optim.s. Use the command line
gcc -S -O3 -funroll-loops optim.c -o optim.s
Consider the function call
result = someFunction(theArg);
Without optimization, the value 34 is stored in local variable theArg, transferred to
the ax register and then pushed onto the stack to be passed to someFunction. With
optimization this is reduced to:
pushl $34
call _someFunction
Similarly, the statements
r = r * 9;
r = r * r;
have not been compiled as run-time multiply operations: since the value of r is known at
compile time, the whole sequence is folded down to the constant 32400 (which can be seen
later, in the optimized listing of _someFunction).
Dead or unused code is quite often produced during the debugging phase. For example
the assignment
k = 6;
in the example code is obviously redundant, as the value of k is not subsequently used.
Efficient compiler optimization may be able to detect this.
Storing frequently-used variables in registers rather than in memory (the default) saves
time. For example, the loop section
for( i = 0; i <= 3; i++ )
{
k = 6;
}
compiles to
movw $0,-4(%ebp)
L3:
cmpw $3,-4(%ebp)
jle L6
jmp L4
L6:
movw $6,-6(%ebp)
L5:
incw -4(%ebp)
jmp L3
The variable i is stored in memory location -4(%ebp). The register qualifier may be
placed in front of a variable declaration to request that the compiler use a register
rather than memory. Compare the above to using the declaration
register short i, k, r;
The variable i has then been stored in register dx. Note that the register declaration
does not require the compiler to use a register at all: it is only a suggestion. As the
number of registers is limited, there may be none available. Furthermore, optimizing
compilers may implicitly generate code which uses registers in this way when the
optimization option is enabled. This is done using a command line of the form
gcc -S -O2 optim.c -o optim.s
where -O indicates use of optimization, with the number following indicating the level
of optimization to use (0, 1, 2, or 3).
At each iteration of the loop there is an increment (incw, increment word), test (cmpw
compare word) and branch (jle, jump if less than or equal to). These instructions
effectively constitute the loop overhead. Small loops may be replaced by direct copies
of the code executed within the loop.
r = r + i*someArg;
_someFunction:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
movl %eax, %edx
addl $32400, %edx
movswl %dx,%eax
leave
ret
activity 4.5
Compilation Checking
The above code has assigned register eax for variable someArg and reg-
ister edx for variable r. Verify that the computation is still correct.
When an array element is accessed using an index, as in
myClass[studentNumber].age = 0;
the element's address must be calculated from the index and the element size on each
iteration. The pointer alternative simply increments the pointer by a fixed amount (the
size of the STUDENT structure) at each iteration, which is more efficient.
Figure 4.14 shows some sample code for comparing indexed and pointer-based mem-
ory access for a simple memory block copy. Typically the pointer-based loop is around
5% faster.
activity 4.6
Compile the code module ptrspeed.c shown in Figure 4.14 and compare
the resulting execution times. On faster machines it may be necessary
to increase the variable NITERATIONS to force the execution to
take longer and to obtain a more accurate comparison. (Sample figures
on a Pentium 200 MHz are indexing = 9.07 seconds and pointers = 8.46
seconds; a Pentium 450 MHz gave indexing 3.19 seconds, pointers 2.91
seconds.)
/* ptrloop.c
*
* John Leis
*/
typedef struct
{
char initial;
short age;
} STUDENT;
#define NUM_STUDENTS 20
STUDENT myClass[NUM_STUDENTS];
// using pointers (loop reconstructed; the original listing is truncated here)
short n;
STUDENT *pCurrentStudent = &myClass[0];
for( n = 0; n < NUM_STUDENTS; n++ )
{
pCurrentStudent->age = 0;
pCurrentStudent++ ;
}
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define BLOCK_SIZE 4096        /* sizes assumed; not shown in this extract */
#define NITERATIONS 100000L
long Block[BLOCK_SIZE];
int main()
{
clock_t tstart, tend;
double telapsed;
short n;
long iter, *ptr1, *ptr2;
// using pointers
tstart = clock();
for( iter = 0; iter < NITERATIONS; iter++)
{
ptr1 = &Block[0];
ptr2 = &Block[1];
for( n = 0; n < BLOCK_SIZE-1; n++)
{
*ptr1++ = *ptr2++ ;
}
}
tend = clock();
telapsed = (double)(tend - tstart)/(double)CLOCKS_PER_SEC;
printf("Pointers: Elapsed time = %lf seconds\n", telapsed);
exit(0);
}
Loop invariance refers to calculations or memory transfers which are coded inside a
loop, but which may be performed outside the loop. Consider, for example, a loop whose
body contains the expression k*j, where neither k nor j changes within the loop.
The value k*j could be pre-calculated and assigned to a single variable prior to the
loop, thus saving time.
Each function call has entry code (also called the prolog) and exit code (the epilog).
The entry code looks like this:
_someFunction:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
Here the stack frame is set up by storing the local stack pointer (the base pointer) and
reserving space for local variables (here 8 bytes). The exit code restores the stack
pointer and pops the return address:
leave
ret
The leave instruction was introduced in later processors because this sequence is a
common requirement in procedure-oriented code. It effectively performs
movl %ebp,%esp
popl %ebp
Function inlining replaces small function calls with verbatim copies of the function's
code, thus removing the overhead of the entry/exit code. The disadvantage is that the
executable code becomes longer.
activity 4.7
Inline Functions
For processors with 16 or 32 bit memory buses, accessing an 8-bit quantity still takes
one memory access. Using a data structure containing a mixture of one or two byte
quantities means that for the most efficient memory usage, the quantities should be
packed together. However this may increase the number of memory accesses re-
quired, as the two-byte quantity may straddle a 16-bit boundary. That is, the lower byte
may be stored in one 16-bit location with the higher byte stored in the next highest
location as shown in Figure 4.15.
Figure 4.16 defines structures ASTUDENT and BSTUDENT with identical elements. A
char is a one-byte quantity, while a short is a two-byte quantity. The structure ASTUDENT
is packed, so that the total storage space is 3 bytes. The structure BSTUDENT
is padded so as to align the data items on word boundaries, thus requiring 4 bytes. One
byte is effectively wasted, but the access time is substantially reduced.
Figure 4.17 shows the corresponding code output for the memory accesses for the
aligned and misaligned structures, produced using the GnuC compiler. Note that the
stack offsets from the ebp register are 3 and 4 for the packed structure, and 6 and 8 for
the word-aligned structure. The instruction movw $88,-3(%ebp) will incur two accesses
to memory.
[Figure 4.15: A two-byte quantity straddling two 16-bit words at addresses n and n+1 (misaligned), compared with one stored wholly within a single word (aligned).]
activity 4.8
Using the GnuC compiler, compile the C code align.c and examine the
assembly listing align.s. Use the command line
gcc -S align.c -o align.s
2. The array size would need to be compiled into the program. The array size could
be hard-coded to some maximum value within the program when compiled.
/* align.c
* For GnuC
* Compile using
gcc -S align.c -o align.s
*
* John Leis
*/
#include <stdio.h>
typedef struct              /* structure bodies reconstructed from the text */
{
char initial;
short age;
} __attribute__((packed)) ASTUDENT;   /* packed: 3 bytes total */
typedef struct
{
char initial;
short age;
} BSTUDENT;                           /* word-aligned: 4 bytes total */
int main()
{
ASTUDENT me;
BSTUDENT you;
me.initial = 23;
me.age = 88;
you.initial = 24;
you.age = 99;
}
movb $23,-4(%ebp)
movw $88,-3(%ebp)
movb $24,-8(%ebp)
movw $99,-6(%ebp)
Dynamic memory allocation is done using malloc() to allocate the required number of
bytes of storage. The memory must be released using the free() call. The memory
allocated is guaranteed to be contiguous; that is, one single block and not several
fragmented blocks. This allows the memory to be used as a conventional array.
Figure 4.18 gives a sample program for testing the malloc() function. Note the corre-
sponding free() call, which releases the memory block back into the pool. A typical
output of this program is shown in Figure 4.19. Note that it will vary from system to sys-
tem, depending on the amount of RAM installed, the number of applications running,
and other factors.
1. Not declaring the malloc() and free() functions using the include file stdlib.h.
This may cause a compiler warning about int/long/pointer mismatches which, if
ignored, could incorrectly interpret the size of the arguments to (and return values
from) the library functions.
2. Not allocating the memory that is, omitting to call malloc() when needed.
3. Not checking the return value from malloc(). If the function returns a NULL
pointer, then it means that there is not a contiguous block of the requested size
available in the system at all.
4. Writing to memory beyond the end of the allocated block. If malloc() is given a
size argument of N bytes, then the allowable byte offsets are 0 to N-1. Locations
after this belong to the operating system (or perhaps another application)
and must not be used. This normally causes a memory protection fault.
5. Not calling free() after the application has finished using the memory. This is
termed a memory leak.
6. Using the memory after free() has released the memory. The block previously
allocated no longer belongs to the application after a call to free().
7. Changing the pointer value returned from malloc() and then passing the changed
value to free(). The original returned value should be kept as free() uses it to
determine the start of the allocated memory block (and indirectly, its length). If
pointer addressing is required within the block, it is necessary to assign the allo-
cated pointer to another pointer which may be altered. See the example following
this.
8. Incorrectly declaring a pointer type for the allocated block or not typecasting the
returned pointer. The former can cause fatal errors; the latter compiler warnings.
See the example following this.
9. Freeing a block which contains a pointer to another memory block. This can
occur if a block of pointers is allocated, with each pointer pointing to another
dynamically allocated block. The order of allocating and freeing the blocks is
vitally important.
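The examples referred to in points 7 and 8 do not appear in this extract; a sketch of the safe pattern they describe (the function name is illustrative) might be:

```cpp
#include <cassert>
#include <cstdlib>

// Points 7 and 8 in practice: the pointer returned by malloc() is
// typecast to the correct type, kept unchanged, and a second working
// pointer is used for any address arithmetic within the block.
double *fillBlock( int n )
{
    double *pBlock = (double *)malloc( n * sizeof(double) ); // point 8: typecast
    if( !pBlock )
        return 0;                   // point 3: always check the return value

    double *pWalk = pBlock;         // working copy for pointer arithmetic
    for( int i = 0; i < n; i++ )
        *pWalk++ = (double)i;       // pBlock itself is never modified

    return pBlock;                  // point 7: pass THIS value to free()
}
```

The caller then writes double *p = fillBlock(4); uses the block, and releases it with free(p), passing back exactly the pointer malloc() returned.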
Arrays of data structures are easily allocated. Using the previous student-record data
structure, Figure 4.20 gives a complete example.
#include <stdio.h>
#include <stdlib.h>
int main()
{
long nBytes;
char *memPtr;
nBytes = 1024*16;
do
{
memPtr = (char *)malloc(nBytes);
if( ! memPtr )
{
printf("Malloc of %ld failed\n", nBytes );
}
else
{
printf("Malloc of %ld is OK\n", nBytes );
free(memPtr);
}
nBytes += 1024*16;
} while(memPtr);
exit(0);
}
typedef struct
{
char Name[20+1]; // name - 20 bytes (+zero terminator)
double gpa; // grade-point average
} Student;
int main()
{
short nStudents, currStudent;
Student *pRecords, *pCurrent;
activity 4.9
Memory Allocation
activity 4.10
In malloc.c, comment out the free() call. Then re-compile the program
and examine the output. Can you explain what is happening? This
is called a memory leak.
activity 4.11
char src[IO_BUFLEN];
char dest[IO_BUFLEN];
To copy from src (source) to dest (destination), the following code could be used:
int n;
char *pSrc, *pDest;
pSrc = src;
pDest = dest;
for( n = 0; n < IO_BUFLEN; n++ )
*pDest++ = *pSrc++ ;
Alternatively, the library function memcpy() may be used:
memcpy( dest, src, IO_BUFLEN );
This requires the include file
#include <string.h>
Note that memcpy() takes void-type pointers, thus enabling it to copy any arbitrary
memory block irrespective of the actual data type. Many other library functions exist
for memory access, such as memset() for setting a block of memory to a particular
value and memchr() for searching for a particular character. The string functions such
as strcpy() (copy a string) and strlen() (length of a string) operate on null (zero)-
terminated strings, whereas the mem functions assume nothing about the blocks of
memory supplied.
Error handling.
Assembly language.
Software optimization.
Further Reading
a
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 5
ALGORITHMS
5.2 Introduction
Many different algorithms are used in the implementation of complex computer soft-
ware systems. Choosing the correct approach for a given problem has many benefits
in terms of ease of implementation and speed of execution. This module presents only
a brief overview of some important concepts. Many standard libraries in C and C++
are available which implement algorithms for various purposes, and often it may be
preferable to use these off-the-shelf solutions rather than wasting project time coding
and debugging from scratch. In order to make the most use of standard libraries it
is necessary to understand the underlying algorithm and how it is likely to perform in
terms of speed of execution, memory usage and so forth. In many circumstances, possibly because of the uniqueness of a particular problem, it may be necessary to code an algorithm from scratch.
Operating systems utilize a great many algorithms and data structures internally. Multitasking, for example (considered in a subsequent module), relies on priority queues
to schedule tasks for execution. Simply typing on a keyboard invokes a buffering al-
gorithm, so that the keystrokes are delivered in order when the receiving application is
ready to process them. Network data may need to be queued upon arrival, and pos-
sibly processed out of order. These aspects should be considered when reading the
following.
several forms of system buffering. As seen in the figure, differences of more than an
order of magnitude are possible. The reason why buffering is important in this instance
is that the disk is a physical device and can only spin so fast. An intelligent pre-reading
method, which stores more data blocks from the disk than are actually requested by
the application, can have enormous benefits.
[Figure: elapsed time in seconds to checksum a file under each buffering scheme: default library buffering, no buffer, 16-byte buffer, 32K buffer, and no standard library. The upper panel spans 0-40 seconds; the lower panel shows the faster cases in detail over 0-1.5 seconds.]
Figure 5.2 shows the initial portion of the program. Note that the file buffers must be declared static: if they were automatic (within a function), the buffer space may be re-allocated, resulting in two sections of code trying to use the same memory space. Note in Figure 5.2 the use of the clock() system function to time the operation.
Figure 5.3 shows a subsequent section, using first the default library buffering, followed by a call using no buffer. The C library buffer is disabled by calling setvbuf() with the _IONBF (no buffering) mode.
Figure 5.4 illustrates the assignment of the static file buffers, which are simply arrays
of bytes (unsigned char data type in C):
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h> // open() close()
#include <io.h> // O_ definitions
#include <time.h>
#define SMALLBUF 16
#define LARGEBUF 32768
if( argc != 2)
{
printf("Usage: filebuf filename\n");
exit(1);
}
fclose(fp);
// second - no buffer
if( !(fp = fopen(argv[1], "rb")) )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);
startTime = clock();
checksum = calcChecksumNolib(fd);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
close(fd);
exit(0);
}
Figure 5.5 shows the portion of the test that does not use the standard library buffering.
The buffered routines use fopen(), fread() and fwrite() to access the files, and
require a FILE structure. The unbuffered portion of the test uses the direct system
calls open(), read() and close().
Figure 5.6 shows the two routines which access the file's contents. For the example, each calculates a modulo-checksum (the sum of all bytes in the file). Such an operation is required, for example, in transmitting data frames over a network. The only difference between the two functions is the choice of file access routine: fread(), which is a buffered read, and read(), which is a direct system call. fread() requires a pointer to a FILE structure, which encompasses the buffer. The read() function calls the system file read function directly, and requires an integer file descriptor.
Manipulating buffers causes many problems. If not thought through carefully, the routines used for buffer manipulation may slow the overall performance. Figure 5.7 shows one method of accessing a pool of buffers, using an array of pointers to store the memory address of the start of each buffer. If the buffers need to be sorted (for example, a set of student records to be sorted in alphabetical order), the relative position of the buffers must be changed. One solution is to copy all the bytes comprising each buffer. This is potentially very slow. A better solution is simply to swap the pointers to the buffers, as illustrated in Figure 5.7. Variations on this theme are discussed in the following sections.
return cksm;
}
return cksm;
}
[Figure 5.7: an array of pointers to buffers; sorting is achieved by swapping the pointers, not by copying the buffer contents.]
A circular buffer is often useful when data must be processed in order. For example, keystrokes from a keyboard may arrive at unequal intervals, but must be handed over to the application by the system kernel when requested. Since memory is a linear concept (from start to finish), any array must have some structured handling superimposed upon it so that it may be viewed as being circular.
Figure 5.8 illustrates this concept. In practice, the circular buffer is laid out as shown
in Figure 5.9. Pointers in and out point to the memory location of the next item to be
stored in the buffer, and the next item to be read from the buffer, respectively. If these
are equal, there is nothing stored in the buffer, as shown in A in the figure.
After receiving the characters o, n, and e, the situation is as depicted in part B. The in
pointer is advanced to the next free location. Following that, the characters t, w, and o
are received. These are also stored in the buffer.
If the receiving application is then able to read characters, it reads them in order as
shown in Figure 5.10. The characters o, n and e happen to be read out, and the out
pointer is advanced. Finally, consider the case when the letters t, h, r, e, and e must be
queued in the buffer. Once the in pointer reaches index 9 in the buffer, it wraps around
to index 0. This is no problem, since the byte that was formerly at index 0 has been
read out (of course it will still remain there, but the application has read out the data
item so it may as well be erased).
In effect, the linear array has become circular. Of course, if the in pointer wraps and eventually catches up to the out pointer, valid data will be overwritten. The buffer must be large enough for this not to happen. The receiving application must have an average processing speed at least equal to the average input rate. The buffer merely absorbs the bursty nature of the input; hence the term elastic buffer is sometimes used.
Figure 5.12 shows the initialization necessary: the input and output pointers are set to the start of the buffer.
The get and put routines (Figures 5.13 and 5.14) perform the necessary testing for
wrapping of the FIFO.
Figures 5.15, 5.16 and 5.17 show the testing of the FIFO, with the output shown in
Figure 5.18.
[Figures 5.9 and 5.10: snapshots of a 10-element circular buffer (indices 0-9). A: buffer empty, in and out both at index 0. B: 'o', 'n', 'e' stored (3 items); in advances to index 3. C: 't', 'w', 'o' also stored (6 items); in advances to index 6. D: 'o', 'n', 'e' read out; out advances to index 3, leaving 't', 'w', 'o'. E: 't', 'h', 'r', 'e', 'e' stored (8 items); in wraps from index 9 back to index 0.]
/* fifo.c
* Simple first-in, first-out buffering.
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#define QUEUE_SIZE 4
typedef struct
{
char queue[QUEUE_SIZE];
char *qin, *qout;
int charsInQueue, inCtr, outCtr;
} FIFO;
void initFifo();
int fifoPut(int nextChar);
int fifoGet();
Figure 5.11 First-In, First-Out (FIFO) code main test code (part 1 of 4).
int fifoGet()
{
int nextChar = -1;
if( fifo.charsInQueue == 0)
{
return nextChar; // -1 for nothing in queue
}
nextChar = (int)*fifo.qout++;
if( ++fifo.outCtr == QUEUE_SIZE )
{
fifo.qout = &fifo.queue[0];
fifo.outCtr = 0;
}
--fifo.charsInQueue;
return nextChar;
}
return 1;
}
int main()
{
int ch, i;
Figure 5.16 FIFO test 2, normal operation when the FIFO buffer wraps.
exit(0);
}
activity 5.1
Compile the FIFO test program. Verify normal operation of the FIFO,
the situation when the pointers wrap around, and the result when the
fifo.qin pointer attempts to overtake the fifo.qout pointer and data
is lost.
As a final note, consider the dynamic handling of several FIFO buffers. In this code the FIFO data structure is static; however, several FIFOs could be managed by modifying each routine to take a pointer to a FIFO and having initFifo() allocate the buffer via malloc() (this would also require a destroyFifo() routine).
Thus, each element is threaded into the list as depicted in Figure 5.19 using the next pointer. In the example shown, each entry is simply a character string, in order to have a workable example while limiting the complexity. The data structure may be, for example, a record from a database, a list of processes to be scheduled in an operating system, or a list of sequential timeouts to be checked in a real-time system.
In operation, the list is traversed by beginning at the root pointer as shown in Fig-
ure 5.19. To find the next item, the next pointer is followed from the current item. The
last item in the list has a next pointer set to a special indicator, normally the NULL
pointer in C (a value of zero).
Consider the case where items A and B exist on the list. Insertion of item C in the list
requires unlinking or redirecting the root pointer to point to C, followed by setting the
1
Languages such as Java, which do not have the pointer type, use other mechanisms.
[Figure 5.19: a linked list referenced by the root pointer node *first; item C has been inserted at the head, ahead of the existing items B and A.]
next pointer of the new item to point to the item which was formerly first on the list (here
B). Note that this implies addition at the front of the list, but this need not necessarily
be the case. In addition, dynamic memory allocation must be used to create the new
list entry, rather than static declaration of the array. Code examples will demonstrate
this shortly.
The linked list is able to implement many queue disciplines (modes of operations). For
example, a First-In, First-Out (FIFO) discipline as discussed in the previous section may
be implemented using a linked list. If each item is a simple character, a structure using
a pointer as outlined above may not be the most memory-efficient design, especially if
the list is to become large. A last-in, first-out discipline may be implemented, which is
similar to the operation of a processor stack. In-order addition to the list may be done,
so that the list is always maintained in a sorted state.
A linked list is thus suitable for applications where the number of items on the list is
highly variable over time. The maximum number of items which may be queued is
limited only by the memory resources of the system.
Figure 5.20 shows the declaration of the list data structure as well as the root pointer.
This code implements three types of addition to a linked list:
1. Addition at the head of the list, making the new item the first on the list.
2. Addition at the tail of the list, making the new item the last on the list.
3. Addition subject to another criteria, in this case that the new item is in the correct
alphabetical sequence with respect to other items already on the list.
Note that the in-order addition could be, for example, in terms of a numeric quantity, a
time stamp, or some other condition.
Figures 5.21, 5.22 and 5.23 show addition in alphabetical order, at the head, and at the
tail, respectively.
#define MAX_NAMELEN 40
#define ENTRY struct entry
#define NULL_ENTRY (ENTRY *)NULL
ENTRY
{
char name[MAX_NAMELEN+1];
ENTRY *next;
};
ENTRY *ListAnchor = NULL_ENTRY; // the root pointer
// function prototypes
int main();
void InitList();
void AddEntryToHead(char *NewName);
void AddEntryToTail(char *NewName);
void AddEntryInOrder(char *NewName);
ENTRY *CreateNewEntry(char *NewName);
void PrintList(void);
void FreeList(void);
Figure 5.21 Linked list test 1: add items in alphabetical order (part 2 of 4).
Figure 5.22 Linked list test 2: add items at the head of the list (part 3 of 4).
exit(0);
}
Figure 5.23 Linked list test 3: add items at the tail of the list (part 4 of 4).
void InitList()
{
ListAnchor = NULL_ENTRY;
}
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}
Figure 5.25 Linked list function to add to the head of the list.
Initialization of the list is quite simple, as shown in Figure 5.24. There is only one root
pointer, which is set to NULL to indicate that the list is empty.
Adding to the head of the list is shown in Figure 5.25. This was previously shown
diagrammatically. Adding to the head of a list is evidently straightforward. Adding to
the tail of the list as shown in Figure 5.26 is marginally more complicated, as it requires
traversal of the list. In practice, if addition at the end of the list is often performed, it
may be more efficient to store a static pointer to the last item.
Adding in order (Figure 5.27) is more complicated. Firstly, the search (ascending numerical, descending numerical, alphabetic, etc.) must be performed. For numerical quantities, a simple comparison may suffice. For more complex criteria, a function such as the library function strcmp() must be used. Note in the figure that double indirection is used: a pointer to a pointer (variable ENTRY **ppCurrent). This is because not only must the subsequent list item be pointed to by the new entry, the previous item must also point to the new item. As the list is traversed, the current item has no knowledge of the previous item, and therefore the pointer one stage back must be known.
Figure 5.28 shows the output of the linked-list test examples. The memory addresses of the list items are shown in order to help distinguish each item as it is allocated.
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}
Figure 5.26 Linked list function to add to the tail of the list.
activity 5.2
Work through each of the list insertion C functions, using the example
output. Draw a diagram of each scenario, showing the items on the list
and the pointers.
Figures 5.29, 5.30 and 5.31 show helper functions to create a new item, print the
list, and free the list (since it is dynamically created). To re-use the list, it must be
re-initialized. Specifically, the root pointer must be set to NULL.
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}
return NewEntry;
}
Figure 5.30 Linked list function to traverse the list and print items.
activity 5.3
Sometimes it is necessary to delete not the entire list, but a single item
in the list. Write a function to do this, given a string as the item to be
deleted.
activity 5.4
Finally, some applications of linked lists are mentioned. Multitasking operating systems use linked lists (or a variation thereof) to maintain the ordered lists of tasks to be run. Priority queues, with each list corresponding to tasks at a certain scheduling priority, may be used to quickly locate the next task to be scheduled. The older DOS FAT (File Allocation Table) filesystem (still used on floppy disks) uses a linked-list variation to locate the sectors on a disk corresponding to each file.
Another important data structure is the binary tree (or btree). Figure 5.32 shows the fundamental idea: instead of a linear list, each item forms a leaf in a tree, containing not only the data but also left and right pointers (consider the diagram rotated so the tree grows downwards). This type of data structure is able to locate items in O(log N) time, as opposed to O(N), or linear, time for a linked list.2
The fundamental idea is that items are added in order, where order is defined by the
task at hand. Consider the tree of Figure 5.32. Beginning at the root node and consid-
ering each item in turn, if the item to be added is less than the current item, the lower
(or left) pointer (branch) is accessed, which becomes the current node. Conversely, if
the new item is greater in the defined ordering, the right branch is accessed. Travers-
ing the tree to retrieve all the items requires recursively descending the left branch until
a null branch is encountered. This means that a leaf node has been encountered.
The right branch then becomes the new starting point. All left branches from this point
are followed. The code examples shortly will help clarify this.
[Figure: a binary tree; each node holds its data together with left and right branch pointers.]
Figure 5.34 shows a test program for the binary tree implementation, which will be
examined below.
2 O() means 'order of'.
Figure 5.35 shows addition to the tree. The decision as to whether to access the left
or right branch at each node depends on the value of the new item with respect to the
current item in the tree. When a decision is made to go left or right, that branch is
checked to see if it exists.
Creating a new node is done in the helper function shown in Figure 5.36. This simply
allocates space for the new node and initializes its value to that requested. The left/right
pointers are set to null.
The complexity of adding to the tree is offset to some degree by the relative simplicity
of accessing the tree in order. This is done using the recursive function shown in
Figure 5.37, which follows the left pointer as far as possible. When the leaf node is
encountered, the right node is taken as a new starting point and the tree descended.
activity 5.5
Binary Trees
By drawing a diagram similar to that shown for a binary tree, show each
stage of the addition of strings to the tree. Once the tree is populated
with several strings, show how the tree is recursively traversed in-order
and that this results in the correct alphabetical sequence being printed.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAMELEN 20
#define NODE struct node
#define NULL_NODE_PTR (NODE *)NULL
NODE
{
char name[MAX_NAMELEN+1];
NODE *Parent;
NODE *LeftChild, *RightChild;
};
NODE *RootPtr = NULL_NODE_PTR; // root of the tree
int main();
void AddNode( char *NewName );
NODE *CreateChildNode( NODE *Parent, char *name);
void DescendTree( NODE *StartNode );
int main()
{
char NameBuffer[MAX_NAMELEN+1];
while(1)
{
printf("Enter a name:");
if( fgets(NameBuffer, sizeof(NameBuffer), stdin) == NULL )
break; // end of input
NameBuffer[strcspn(NameBuffer, "\n")] = '\0'; // strip the newline (fgets replaces the unsafe gets())
if( NameBuffer[0] != '\0')
AddNode(NameBuffer);
else
break;
}
DescendTree(RootPtr);
exit(0);
}
void AddNode( char *NewName )
{
NODE *Current = RootPtr;
if( Current == NULL_NODE_PTR )
{
// empty tree - the new node becomes the root
RootPtr = CreateChildNode(NULL_NODE_PTR, NewName);
return;
}
while(1)
{
if( strcmp(NewName, Current->name) < 0)
{
// go left
if( Current->LeftChild == NULL_NODE_PTR )
{
Current->LeftChild =
CreateChildNode(Current, NewName);
return;
}
else
{
Current = Current->LeftChild;
}
}
else
{
// go right
if( Current->RightChild == NULL_NODE_PTR )
{
Current->RightChild =
CreateChildNode(Current, NewName);
return;
}
else
{
Current = Current->RightChild;
}
}
}
}
return NewNodePtr;
}
A number of other issues, such as searching algorithms and text parsing, have not
been covered. The references give some starting points for investigating these as the
need arises.
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 6
MULTITASKING
Processes, and
Threads.
These concepts are introduced by way of short example code segments for both Unix
and Windows NT.
6.2 Multitasking
The process (task) concept is the idea of subdividing the CPU's time into smaller timeslices in order to do a small portion of each job when possible. The complete job will likely not be finished in one timeslice. Threads, or strictly threads of execution, are a newer variation, and may be termed (for now) processes-within-processes. After all, there is only one CPU (normally), and there may be dozens of processes running at any given instant. If a particular process has no work to do, perhaps because it is waiting for some data to arrive across a network or is waiting for a user keystroke, then the task may be temporarily suspended.
Figure 6.1 Conceptual view of tasks and transitions.
Note, however, that the process does not actually execute in a loop, polling the resource. This would be very inefficient. Instead, the task is blocked; examples of this will be seen throughout this module.
6.2.1 Processes
A task or process is a separate program which is run concurrently with other tasks.
The terms task and process are in common use, and are used interchangeably here.
Each process is scheduled, normally in a time-sliced fashion and subject to the avail-
ability of the CPU, by the operating system. The scheduling is modified depending
upon the priority of the task and any resources the task has requested. Each task has
its own separate data segment, code segment, open files and other resources. Nor-
mally each task is constrained to access memory strictly within its own allocated areas.
However, as the following two modules will show, tasks may co-operate to perform the
overall operations required of the system and may also require synchronization with
one another.
1. Read the executable code from disk and determine the initial memory resource
requirements.
2. Allocate space for the process, typically in conjunction with the memory manage-
ment subsystem of the operating system kernel.
As can be seen, the act of starting a process involves considerable overhead. When this is infrequent, such as when a user application starts, the overhead is necessary and unavoidable. However, when a given system requires processes to be started and stopped at regular intervals, the overhead becomes significant, and is a drain on the overall system. An example of this is a web page server, where each request is handled by a separate process. Examples of starting, suspending, and stopping a process are given in the first sections of the module.
6.2.2 Threads
Conceptually similar to a process is a thread. A thread is a lightweight process. A given process may own one or more threads of execution. Implementation of threads in operating systems is comparatively recent: many Unix flavors contain the POSIX threads library, and Windows NT supports threads. Unlike separate processes, which are loaded on demand, threads are loaded with the process itself. In addition, threads are able to access the memory space of their parent process directly. This makes for much faster startup of a thread, but means special care must be taken in protecting areas of memory which are accessed by more than one thread. Examples of this are given later in the module.
Both processes and threads are discussed in this module, for both Unix (POSIX)
and Windows NT. Code in this section is written by the author, derived principally
from [17], [18] and [19].
activity 6.1
In Figure 6.5, the command-line arguments are printed out. The following exercise illustrates these. Note the use of the fflush() system call. This is because system output is, by default, buffered. This means that the output of a program may not immediately appear on the screen; it depends on the system scheduling and the buffer sizes. In later exercises, several processes will be forked concurrently, and the system output may be confusing unless each process's output is flushed immediately. Note the use of the system calls sleep(), which temporarily suspends the process, and exit(), which returns an exit status code to the operating system.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // sleep()
int main()
{
printf("Child sleeping...");
sleep(2);
printf("Child exiting.\n");
fflush(stdout);
exit(0);
}
activity 6.2
3. Kill the process using kill -SIGKILL <pid>, where <pid> is the
PID from ps.
Under NT:
3. Kill the process by selecting it from the task list and using the
End Process button.
Note that under Windows the alt-tab sequence may be used to step
through the task list.
Figure 6.6 shows the fork() system call, which is used in Unix to create a copy of the
current process. fork is interesting, in that it appears as a function that returns in two
places! If the call returns 0, the execution is in fact resuming in the child process. If the
call returns some other positive value, the execution has resumed in the context of the
parent process. The value returned is the PID of the child process.
Creating a child process as a duplicate of the current process is often not what is required. The exec() family of calls takes the current process and overlays it with a new process. Figure 6.7 shows this. As shown, the parent waits for the child to terminate using the wait() system call. This is not mandatory; the child process may continue executing after the parent has finished. Figure 6.8 shows a sample output from these test programs.
/* fork.c
* Demonstration of process duplication via 'fork()'
* See also forke.c
*
* Platform: SunOS or cygwin/NT
* Compiler: gcc
*
* Example output:
phanes (leis) [49] fork
child process - fork() returned 0, my pid via getpid() is 18200
Parent process ID 18199, child process is 18200
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h> // fork(), getpid()
int main()
{
pid_t ForkPID;
switch( ForkPID = fork() )
{
case 0:
// child process - fork() returns 0 in the child
printf("child process - fork() returned 0, my pid via getpid() is %d\n",
getpid());
fflush(stdout);
break;
case -1:
// error in fork() - too many processes ?
perror("fork() failed:");
exit(1);
default:
// parent process. fork() returns the child PID
printf("Parent process ID %d, child process is %d\n",
getpid(), ForkPID);
fflush(stdout);
}
exit(0);
}
int main()
{
pid_t ForkPID, ChildPID;
unsigned int ChildStatus;
// Wait for the child to exit & get its exit status.
ChildPID = wait(&ChildStatus);
Figure 6.7 Using fork() for duplication then overlaying using exec().
tightly integrated into NT, and that each process has a main thread. The output of this
example is shown in Figure 6.10 note that the same child code is used.
activity 6.3
Note that graphical or GUI (graphical user interface) applications under Windows have
the function WinMain() as their main entry point rather than main(), with different ar-
guments.
6.4 Threads
A process, under either Unix or NT, is a separate stand-alone program with its own
separate local and global variables, open files and so forth. Although child processes
are given command-line arguments from the parent and can inherit a copy of the open
file handles of the parent, the child has its own execution context. That is, its own
variables, stack, and so forth. The child process has a main() or WinMain() at which
execution begins. When main() reaches the end, or the child calls exit(), the process
terminates.
Threads are a different proposition. Threads are essentially a special function within
the main parent process. However, when that function is invoked as a thread, it is
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
STARTUPINFO si;
PROCESS_INFORMATION pi;
BOOL fCreated;
char *processName = "child.exe";
int nProcesses = 4, processNum;
DWORD errorCode;
LPVOID lpMsgBuf;
memset(&si, 0, sizeof(STARTUPINFO) );
si.cb = sizeof(STARTUPINFO);
----
Process created:
process handle: 116
process main thread handle: 112
process ID 253:
main thread ID: 245
----
----
Process created:
process handle: 140
process main thread handle: 144
process ID 186:
main thread ID: 189
----
----
Process created:
process handle: 164
process main thread handle: 168
process ID 270:
main thread ID: 269
----
----
Process created:
process handle: 124
process main thread handle: 128
process ID 225:
main thread ID: 247
----
child process...
child process...
child process...
child process...
Goodbye.
Goodbye.
Goodbye.
Goodbye.
scheduled separately. The same function may be invoked as a thread as many times
as desired.
One coding issue brought about by multi-tasking is that of re-entrancy. Essentially, this
means that it is now possible for one function to be called by more than one task or
thread. If the function being called maintains static data representing the current state
of an object, the very act of calling the function in more than one context may mean that
the state information is changed several times. Some examples of this and where it
causes failure will be seen shortly.
This section shows how to create threads under Unix first, then under Windows NT.
Since 50 threads are created and each thread has a loop that increments nCounter
50 times, it would be expected that nCounter would be incremented to 5000. How-
ever, Figure 6.13 shows that this is not the case. This is due to the interaction of the
scheduling, and the fact that incrementing a variable is not necessarily an atomic op-
eration. An atomic operation is one that is completed entirely, and never interrupted
when partially finished. In the case of incrementing a counter, the C code
nCounter++ ;
may in fact translate into several assembly-language instructions (recall the earlier
module on compiler operation). Each assembly language instruction may be inter-
rupted by the system timer, and thus the thread currently executing may happen to be
suspended while another thread/process is allowed to run. Now consider the possible
sequence of operations in incrementing a variable in memory:
1. Read the current value from memory into a CPU register.
2. Increment the value in the register.
3. Write the new value back to the memory location.
At each stage, a system timer interrupt and re-schedule is possible. In the example
code, the actual increment has been split into several C lines to make failure more
likely. So in Figure 6.13, the value does not reach 5000 but only 4617. A simple C
operation such as
void *threadFunc()
{
long c;
long tmp;
pthread_rwlock_unlock(&lock);
}
nCounter++ ;
may only very occasionally fail, but it will still fail eventually. Such a bug is difficult to track down. This situation may occur in other scenarios; for example, consider a database for checking seat reservations on an airline. The operation of checking for an available seat and actually booking the seat must be atomic, otherwise the last seat on the plane may occasionally be sold twice!
The solution to the above dilemma is that the programmer must use lock functions
and atomic calls which are provided as part of the thread library. In Figure 6.13, the
functions
pthread_rwlock_wrlock()
and
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *threadFunc();
long nCounter;
pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
int main()
{
pthread_t tid;
pthread_attr_t tattr;
pthread_rwlockattr_t lattr;
int threadNum;
int nThreads = 50;
nCounter = 0L;
pthread_attr_init(&tattr);
pthread_rwlockattr_init(&lattr);
pthread_rwlock_init(&lock, &lattr);
thread id 4 here...
counter=0
counter=1
counter=2
counter=3
counter=4
counter=5
...
thread id 5 here...
...
counter=4612
counter=4613
counter=4614
counter=4615
counter=4616
counter=4617
main() exiting
Figure 6.13 Creating a thread using POSIX threads output without locking.
pthread_rwlock_unlock()
are used, to place a lock around the critical regions of code: those that must be executed atomically. Why not simply lock the entire thread function? The answer is that this would mean the entire thread function must execute to completion, and other threads would be blocked. The threads would execute sequentially, thus defeating the purpose of having concurrent threads.
activity 6.4
Compile and run the pthread.c code. Test with and without the atomic
operation in place. How often does the counter reach an incorrect value
when simply using an increment? When using the temporary variable?
Check that failure never occurs when using the thread lock functions.
long nCounter = 0;
// guaranteed thread-safe
//InterlockedIncrement(&nCounter);
return NO_ERROR;
}
The function shown in Figure 6.14 attempts to increment the variable nCounter 1000 times. The calling program, Figure 6.15, creates 50 threads; thus nCounter should be incremented 50,000 times upon completion. Like the previous Unix example, if the operation of incrementing the counter is not atomic it may be interrupted. The function
InterlockedIncrement(&nCounter)
is guaranteed by Windows to perform the increment as a single, indivisible operation.
int main()
{
DWORD threadID;
int threadNum, nThreads = 50;
HANDLE hThread;
DWORD errorCode;
LPVOID lpMsgBuf;
printf("starting...\n");
nCounter = 0L;
for( threadNum = 1; threadNum <= nThreads; threadNum++)
{
hThread = CreateThread(NULL, 0,
(LPTHREAD_START_ROUTINE)doIncrement, // thread routine
(void *)threadNum, // passed to thread function
0, &threadID); // flags, returned thread ID
if( hThread )
{
printf("thread id is %ld\n", threadID);
}
else
{
printf("Thread creation error \n");
exit(1);
}
}
return 0;
}
starting...
thread id is 262
thread id is 263
thread id is 264
thread id is 265
thread id is 266
thread id is 267
thread id is 268
thread id is 269
thread id is 270
thread id is 248
thread id is 206
thread id is 216
thread id is 255
Thread: lParam=1
counter is 0
counter is 1
counter is 2
...
counter is 40343
counter is 40344
counter is 40345
counter is 40346
counter is 40347
end of process, counter is 40348
Figure 6.16 Code output when creating a thread under Windows NT.
activity 6.5
Compile and run the thread.c code. Test with and without the atomic
operation in place. How often does the counter reach an incorrect value
when simply using an increment? When using the temporary variable?
Verify that failure never occurs when using the atomic increment function.
Module 6 Multitasking 6.21
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 7
INTERPROCESS
COMMUNICATION
As shown in the previous module, one processor (CPU) may be required to execute
several processes in a round-robin fashion. These processes may be largely independent,
or they may be the result of a conscious decision by the designer to break up
the functions into several separate, but co-operating, processes. As described in the
previous module, the processes (or tasks) are largely independent and cannot, by
themselves, communicate with each other. But such a communication mechanism is very
important, so as to enable processes to share data, request that another perform some
action, and so forth. The exact mechanism may be synchronous, via messages, or
asynchronous, via software interrupts. In the first case, messages are posted on a
message queue by the sender, addressed to the receiver. The message may be only one
or two bytes, or may be a more elaborate data structure. It is said to be synchronous
because the recipient process attends to messages as time permits, within its normal
flow of execution. The second method involves stopping the normal flow of execution
of the recipient task, typically by calling a nominated function to handle the
situation. The topics covered in this module are
The creation of processes and threads was discussed in the previous module, for both
Unix (POSIX) and Windows NT. Co-operating processes (and/or threads) are commonly
used in large, complicated systems where many concurrent transactions must
be handled, or where the overall system may be logically broken up into separate tasks.
In a sense, this is much like structured programming, which teaches the principle that a
single, monolithic main program should be broken down into constituent components.
The key point is that the processes co-operate, which implies that they must share
information at various stages of the processes' lifecycle. This is the aspect which is
discussed in this module. Synchronization, covered in the next module, may be viewed
as an extension of process communication. In many ways, one cannot be done without
the other.
/* cmdargs.c
 * Command-line arguments
 *
 * Example usage:
 * C:\usr\c\examples>gcc cmdargs.c -o cmdargs
 * John Leis
 */
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int i;
// print each command-line argument; argv[0] is the program name
for(i = 0; i < argc; i++)
printf("argv[%d] = %s\n", i, argv[i]);
exit(0);
}
Such a method is easy and convenient for passing configuration variables, filenames
and so forth. However, it is a once-off transfer, occurring when the child process starts.
Processes which need to continually exchange information need some other mechanism.
7.4 Pipes
The pipe provides for a one-way or bidirectional flow of bytes in a sequenced order. It
is effectively a first-in, first-out queue of bytes. The application processes must impose
some structure on the sequence: for example, as character strings or a struct data
structure. Both Unix and Windows NT support pipes. Conceptually they are similar to
files accessed in order. Furthermore, the application programming interface (API) is
similar to accessing a file using file descriptors (handles).
Figure 7.2 shows the basic idea of a pipe communicating between two processes.
Figure 7.3 shows the establishment of a Unix pipe. The system call pipe() creates two
file descriptors. In the example, they are in the array int fifo[2]. Element fifo[0] is
created for reading, and element fifo[1] is for writing. In preparation for sending the
pipe handles to the child process, the file descriptor is converted into an ASCII string.
After the pipe (or FIFO) is created, a child process is created. As shown in Figure 7.4,
this is similar to the examples discussed in the previous module. The child process
expects a command-line argument which is the string representation of the pipe's file
descriptor, hence the string buffer pipestr is passed in the execl() call.
Turning to the child process (Figure 7.5), the command-line arguments are checked
and parsed. Recall that argv[0] is the name of the process itself. Parsing here is
simple, and consists of converting the command-line argument argv[1] string into an
integer using atoi().
Note that this only works because, under Unix, child processes inherit the open file
descriptors of their parents. The command-line passing of the file descriptor is merely a
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main()
{
int fifo[2];
char *msg = "this is a test";
int pid;
char pipestr[10];
if( pipe(fifo) == -1 )
{
perror("pipe() failed:");
exit(1);
}
// pass the read descriptor to the child as an ASCII string
sprintf(pipestr, "%d", fifo[0]);
switch( pid = fork() )
{
case 0:
// child process - overlay with the receiving program
execl("./pipercv", "pipercv", pipestr, (char *)NULL);
perror("execl() failed:");
exit(3);
case -1:
// error in fork() - too many processes ?
perror("fork() failed:");
exit(2);
default:
// parent process. fork() returns the child PID
printf("Parent process ID %d, child process is %d\n",
getpid(), pid);
fflush(stdout);
}
printf("parent sleeping...\n");
fflush(stdout);
sleep(4);
printf("parent %d about to write on pipe\n", getpid() );
fflush(stdout);
write(fifo[1], msg, strlen(msg));
exit(0);
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
int fd, n;
char buf[80];
if(argc != 2)
{
printf("expect pipercv fd\n");
exit(1);
}
fd = atoi(argv[1]);
printf("pipercv: pid %d, pipe fd %d\n", getpid(), fd);
fflush(stdout);
// read() blocks until the parent writes on the pipe
n = read(fd, buf, sizeof(buf)-1);
buf[n] = '\0';
printf("pipercv: read %d bytes: %s\n", n, buf);
exit(0);
}
mechanism for the child to be informed of the value of the file descriptor corresponding
to the pipe.
Reading the pipe is similar to reading from a file, using the read() call. Thus, a byte-
stream sequence is established between parent and child. If bidirectional communica-
tion is required (child to parent as well), the parent would have to pass the file descriptor
for the write end of the pipe to the child.
Figure 7.6 shows the output from the combined parent/child pipe communications. It is
now evident why the fflush() call is used after printf(): both the parent and child
output strings appear interleaved on the same output screen device. If the buffer
were not flushed, the output would be somewhat confusing, as the timing order would be
lost. Note that the inclusion of the fflush() call does not guarantee that
the parent/child output is not mixed up, as a process context switch may occur at any
time.
The following example is of a named pipe under Windows NT. The server process
(Figures 7.7 and 7.8) creates the pipe using CreateNamedPipe(). The pipe has a special
form of filename, \\.\pipe\mypipe, where the \\.\pipe\ prefix is mandatory and
the final component is the name of the pipe itself. Note also the additional security
attributes which may be set up under NT.
In this example, the server (parent) is expecting a message through the pipe from
the client (child) process. Although the passing of the pipe name could be done via
command-line arguments, since the name is predefined no special mechanism is used
and the child process accesses the pipe directly. The PIPENAME string would normally
be defined in a header (.h) file which is included by both parent and child.
pBuf = buf;
fRead = ReadFile(hPipe, pBuf, NBUF, &nRead, NULL);
exit(0);
}
The child process of Figure 7.10 assumes that the pipe file has already been created,
and opens a file descriptor (Unix terminology) or handle (Windows terminology) to the
pipe using CreateFile() with the appropriate access flags for opening an existing
file (actually a pipe). A message is then written using WriteFile().
An example output is shown in Figure 7.9. When executing this example, the output
is more easily understood if the client pclient and server pserver are invoked from
separate console windows. In addition, the server must be executed first in order to
create the pipe. In a practical situation, this means that the server would have to create
the client in a parent/child situation, rather than having them invoked separately as is
done here.
activity 7.1
Windows NT Pipes.
Compile the code examples pserver and pclient. Create two separate
command (DOS) windows. Invoke the client in one window: it should
fail, as the code assumes the pipe has already been created and simply
tries to open the existing pipe file. Now run the server in one window
first, and then the client in the other window.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
HANDLE hPipe;
char *msg = "hello";
DWORD written, nBytes;
LPSTR pBuf;
BOOL fWrite;
// PIPENAME (e.g. \\.\pipe\mypipe) is defined in a shared header
hPipe = CreateFile(PIPENAME, GENERIC_WRITE, 0, NULL,
OPEN_EXISTING, 0, NULL);
if( hPipe == INVALID_HANDLE_VALUE )
{
printf("client: cannot open pipe\n");
exit(1);
}
pBuf = msg;
nBytes = strlen(msg)+1;
fWrite = WriteFile( hPipe, pBuf, nBytes, &written, NULL);
if( fWrite == FALSE)
{
printf("client:write error\n");
exit(1);
}
printf("%ld bytes written\n", written);
exit(0);
}
Normally each process has its own memory space, and cannot (deliberately or
accidentally) read from or write to the address space of any other process. However, direct
access to shared memory is the fastest possible type of interprocess communication, and it
allows random access (rather than sequential access, as in pipes).
As shown in Figure 7.11, the shared memory segment does not necessarily appear at
the same address in the address space of each process.
Examining the code, the file is first created using open() with the create and read-write
flags. The mapping is performed using mmap(), which takes the file descriptor and
returns a pointer to the shared memory block. This pointer may then be used as a
conventional memory pointer. In the example, the first byte of the shared memory is
incremented in a loop, once every second.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
int main()
{
int fd, i, len = 10;
off_t off = 0;
char *pMem, c;
Starting two of these processes (preferably in two separate command windows, for
saner output) results in two processes attempting to access the same area of shared
memory. Figure 7.13 shows the situation when one is started and, several seconds
later, the second is started. If only one process were running, the character c which was
written would always correspond to the character returned from the shared memory, *pMem.
The first read of the second process results in the character h being returned, which
was placed there by the first process. The second process then writes an a, which is
read by the first process.
Evidently, the benefits of faster and random access to shared memory have introduced
another problem which was not present with pipes: that of synchronization. Some
method of guaranteeing exclusive access to a shared memory area is necessary, so
that operations may be performed atomically. This topic is discussed in the next module.
activity 7.2
Explain the output of Figure 7.13 in terms of the source code. When did
the scheduler interrupt one process and start another?
In a manner similar to the Unix example, Figure 7.16 shows the result when two pro-
cesses attempt to access the same shared memory block.
activity 7.3
Explain the output of Figure 7.16 in terms of the source code. When did
the scheduler interrupt one process and start another?
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
#define BYTES_TO_MAP 1
int main()
{
HANDLE hFile, hMap;
char *pMapMem, c;
SECURITY_ATTRIBUTES sa;
SECURITY_DESCRIPTOR *psd;
int n;
c++ ;
}
CloseHandle(hMap);
CloseHandle(hFile);
exit(0);
}
C:\usr\c\NT\console>mmap
ptr 0x14030000
1000: c=a *pMapMem = a
1000: c=b *pMapMem = b
1000: c=c *pMapMem = c
1000: c=d *pMapMem = a
1000: c=e *pMapMem = b
1000: c=f *pMapMem = c
1000: c=g *pMapMem = d
1000: c=h *pMapMem = e
1000: c=i *pMapMem = f
1000: c=j *pMapMem = g
Process 2:
C:\usr\c\NT\console>mmap
ptr 0x14030000
1001: c=a *pMapMem = e
1001: c=b *pMapMem = f
1001: c=c *pMapMem = g
1001: c=d *pMapMem = h
1001: c=e *pMapMem = i
1001: c=f *pMapMem = j
1001: c=g *pMapMem = g
1001: c=h *pMapMem = h
1001: c=i *pMapMem = i
1001: c=j *pMapMem = j
Figure 7.18 shows the basic structure of the initial portion of a Windows program. All
Windows programs include the header file windows.h. The main entry point is not
main() but WinMain(). The main routine is relatively simple, and does not perform any
interactive processing. Instead, messages are sent to an event queue, to be handled
by the window-processing callback function, which is called by the window manager.
1. Registers the window class and creates the window using CreateWindowEx().
This does not make the window visible. Note the registration of the callback
function, here WndProc().
2. Displays the main window using ShowWindow() and UpdateWindow(). The latter
function sends a message to the window message queue.
Figure 7.19 shows the first portion of the callback function, WndProc(). This function is
called via the window manager when an event is to be processed for the application.
The parameters to this function are:
wndclass.cbSize = sizeof(WNDCLASSEX);
wndclass.style = CS_HREDRAW | CS_VREDRAW;
wndclass.lpfnWndProc = WndProc;
wndclass.cbClsExtra = 0;
wndclass.cbWndExtra = 0;
wndclass.hInstance = hInstance;
wndclass.hIcon = LoadIcon(NULL, IDI_APPLICATION);
wndclass.hIconSm = LoadIcon(NULL, IDI_WINLOGO);
wndclass.hCursor = LoadCursor(NULL, IDC_ARROW);
wndclass.hbrBackground = (HBRUSH)GetStockObject(LTGRAY_BRUSH);
wndclass.lpszMenuName = "";
wndclass.lpszClassName = szAppName;
if( !RegisterClassEx(&wndclass) )
return 0;
hwnd = CreateWindowEx(WS_EX_CLIENTEDGE, szAppName,
"C Coded Windows", WS_OVERLAPPEDWINDOW, // normal
CW_USEDEFAULT, CW_USEDEFAULT, // x, y
400, 150, // width, height may be CW_USEDEFAULT
HWND_DESKTOP, NULL, hInstance, NULL);
hInstSave = hInstance;
ShowWindow(hwnd, nWinMode);
UpdateWindow(hwnd);
The body of this procedure consists of a switch statement, invoking the appropriate
code sections according to the message. The predefined WM_COMMAND message is used
to indicate a command message from a child control; here, these are simply the button
controls within the window.
Figure 7.20 shows the portion of the switch statement handling the WM_CREATE mes-
sage. This message is sent to the callback procedure when the window is created. As
can be seen from the figure, the various graphical objects on the window are created
here. In the example, these are 3 buttons and an edit box.
Figure 7.21 shows the portion of the switch statement handling the WM_SIZE, WM_PAINT
and WM_DESTROY messages.
The WM_SIZE message is sent to the callback procedure when the size of the window
changes. This typically involves re-drawing any graphical objects which may change according to
the size of the overall window. The WM_PAINT message is sent to the callback procedure
when the window needs repainting (re-drawing), because its status has changed (pos-
sibly from an icon to a normal window) or it is no longer obscured by other windows.
The WM_DESTROY message is sent when the window is closing down. Finally, any mes-
sages not handled are passed to the default windows procedure, DefWindowProc().
#define MAX_EDIT 20
switch( message )
{
case WM_COMMAND: // menu & commands
switch(LOWORD(wParam))
{
case IDB_BUTTON_1: // set the text in the edit box
nbPress++;
wsprintf(editBuf, "count=%d", nbPress);
SendMessage(hWndEdit, WM_SETTEXT,
(WPARAM)0, (LPARAM)editBuf);
break;
case IDB_BUTTON_2: // get the text from the edit box
len = (int)SendMessage(hWndEdit,
EM_GETLINE, (WPARAM)0,
(LPARAM)editBuf);
editBuf[len] = '\0'; // null-terminate
MessageBox(hwnd, editBuf, "Retrieve Text",MB_OK);
break;
case IDB_EXIT:
dialogResp = MessageBox(hwnd,
"Exit the program?", "Exit",
MB_YESNO);
if( dialogResp == IDYES )
PostMessage(hwnd, WM_DESTROY,
(WPARAM)0, (LPARAM)0);
break;
default:
break;
}
return 0; // WM_COMMAND handled
case WM_CREATE:
hdc = GetDC(hwnd);
SelectObject(hdc, GetStockObject(SYSTEM_FIXED_FONT));
GetTextMetrics(hdc, &tm);
cxChar = tm.tmAveCharWidth;
cyChar = tm.tmHeight + tm.tmExternalLeading;
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"button 1",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
50, 20, 100, 20,
hwnd, (HMENU)IDB_BUTTON_1,
hInstSave, (LPVOID)NULL);
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"button 2",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
50, 80, 100, 20,
hwnd, (HMENU)IDB_BUTTON_2,
hInstSave, (LPVOID)NULL);
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"exit",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
220, 20, 100, 20,
hwnd, (HMENU)IDB_EXIT,
hInstSave, (LPVOID)NULL);
return 0; // WM_CREATE handled
case WM_SIZE:
cxClient = LOWORD(lParam);
cyClient = HIWORD(lParam);
return 0;
case WM_PAINT:
hdc = BeginPaint(hwnd, &ps) ;
EndPaint(hwnd, &ps) ;
return 0;
case WM_DESTROY:
//MessageBox(hwnd, "destroy", "TestApp", MB_OK);
PostQuitMessage(0);
return 0;
default:
return DefWindowProc(hwnd, message, wParam, lParam);
}
return 0;
}
activity 7.4
Windows Applications
Add the following calls to the window callback procedure, WndProc():
1. Just before the return statement in the WM_SIZE handler, the call
MessageBox(hwnd, "size", "TestApp", MB_OK);
2. Just before the return statement in the WM_PAINT message, the
call
MessageBox(hwnd, "paint", "TestApp", MB_OK);
3. Just before the PostQuitMessage() statement in the WM_DESTROY
message, the call
MessageBox(hwnd, "destroy", "TestApp", MB_OK);
Re-compile and test what happens when the window is created, de-
stroyed, iconified, maximized, and obscured by another window then
made visible.
The use of queued message buffers or shared memory areas depends on the particular
problem at hand. Understanding the differences between message queues and shared
memory is quite important.
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 8
PROCESS
SYNCHRONIZATION
AND TIMING
In any given operating system, some or all of these may be supported to varying
degrees.
Figure 8.1 shows the setting up of signal handling for a timer and for an interrupt
(control-C). The timer is set using setitimer(). The signal handlers are registered
following this. The while(1) loop simply keeps the process from exiting while the
signals are trapped. Such a loop is termed a busy wait or a spin loop, and should be
avoided in practice because it simply wastes processor time. In reality, the sleep()
or pause() functions are preferable.
The signal handlers are shown in Figure 8.2. As noted previously, they all take no
arguments and return nothing. A short run of the program is shown in Figure 8.3.
requests Windows to send the WndProc() the timer message at the appropriate time
intervals. The timer is destroyed using KillTimer(hwnd, IDTIMER_MESSAGE). The switch
statement in the main callback procedure may be used to handle the timer event by
checking which timeout the event belongs to:
case WM_TIMER:
if( wParam == (WPARAM)IDTIMER_MESSAGE )
{
messageCount++ ;
// other operations
}
return 1;
Another mechanism is to define a callback function exclusively for the timeout. This is
done using
VOID CALLBACK TimerFunc(HWND hwnd, UINT msg, UINT timerID, DWORD SysTime)
{
procCount++ ;
// other processing
}
#include <signal.h>
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
// signal-handling functions
void SigIntHandler();
void SigHupHandler();
void SigTermHandler();
void SigAlrmHandler();
int main()
{
struct itimerval Timeout ;
Timeout.it_interval.tv_sec = 0;
Timeout.it_interval.tv_usec = 500000;
Timeout.it_value.tv_sec = 3;
Timeout.it_value.tv_usec = 0;
// start the interval timer, which delivers SIGALRM
setitimer(ITIMER_REAL, &Timeout, NULL);
// SIGALRM = 14
signal(SIGALRM, SigAlrmHandler);
// SIGINT = 2
signal(SIGINT, SigIntHandler);
// SIGHUP = 1
signal(SIGHUP, SigHupHandler);
// SIGTERM = 15
// send via kill -15 pid or kill -TERM pid
signal(SIGTERM, SigTermHandler);
while(1)
{
getchar();
printf("waiting...\n");
}
printf("Exiting.\n");
exit(0);
}
// SIGHUP = 1
void SigHupHandler()
{
printf("HANGUP SIGNAL\n");
}
// SIGINT = 2
void SigIntHandler()
{
printf("INTERRUPT SIGNAL\n");
}
// SIGTERM = 15
void SigTermHandler()
{
printf("TERMINATE SIGNAL\n");
printf("Exiting now.\n");
exit(1);
}
// SIGALRM = 14
void SigAlrmHandler()
{
printf("ALARM TIMER SIGNAL\n");
}
Example output:
ALARM TIMER SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
INTERRUPT SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
TERMINATE SIGNAL
Exiting now.
Figure 8.3
Output of the Unix signals test, for pid 213. kill -INT 213 and
kill -TERM 213 were entered at another console.
Suppose one withdrawal request is part-way completed. The check for sufficient funds
has been completed and the withdrawal has been allowed. The balance is about to
be decremented. Now suppose the other process runs, but for whatever reason (faster
CPU, less loading on that CPU, etc) it runs the entire algorithm whilst the first is paused.
The check for sufficient funds succeeds, as the balance has not yet been decremented
by the first process. Now the second process allows the withdrawal and decrements the
balance. If, say, the original balance was $100, the first requested $70, and the second
requested $60, then both could succeed, with the net result that the account is
overdrawn (a balance of -$30). If the bank does not allow overdrawn accounts
(negative balances), the above represents a processing failure.
This situation may arise in many real-world situations: for example, an airline database
checking for available seats and then booking the seats, or a computer disk being
backed up whilst users are writing files to the disk.
From the above discussion, it is clear that the entire operation (testing availability
followed by the actual transaction, here the bank balance deduction) must be atomic. In more
general terms, in order to acquire exclusive access to a shared resource, a process must:
1. check that the resource is not locked by another process;
2. lock the resource;
3. use the resource, then release the lock when finished.
The first two must be performed as an atomic operation, a concept which has been met
before. If, between checking whether the resource is free and actually locking the resource,
the process is interrupted, another process may run and capture the resource lock.
Thus there is the potential for two processes to have captured the resource lock, and
hence both to access the shared resource simultaneously: the very situation that was to
be avoided. Another scenario, that of deadlock, is also possible. In this case, both
processes are waiting for a resource which the other holds but cannot release. The
processes will then wait indefinitely.
One other issue which ought to be noted here is the wait for the shared resource. If
this is a busy wait, then processor time is needlessly spent polling a resource lock.
Figure 8.4 shows a process which locks a resource before attempting to use it. One
possible method of handling lock files is shown in the support functions of Figure 8.5.
The getLock() function enters what appears to be an infinite loop. Inside the loop, it
attempts to create the lock file. If it can be created, the code returns as the existence
of the lock file signifies to other processes that the resource is in use. If creation of
the lock file fails, it will be because the file already exists (the open() function is called
with the create and exclusive-access flags). Note that this only works because the exclusive
creation of the lock file is performed atomically by the kernel.
The attempt to acquire the lock is designed to fail gracefully. If the number of times
around the loop exceeds WAIT_LOCKTIME then the function gives up. In addition, the
loop is not a busy wait, due to the sleep(1) call. This allows intervals of 1 second
between polls of the lock file. Such mechanisms are required in robust systems. For
example, if a process acquired the lock and then died or was killed for some
reason, the lock file would remain but belong to nobody. Other processes would wait
indefinitely for the lock file to be released. Ideally this program should incorporate
signal handlers, so that interrupt signals (SIGINT, SIGHUP, SIGTERM) are caught and the
lockfile removed. This will prevent problems if the application is killed with a lock held
(leaving the lock file in existence). On Unix systems, lock files are usually created in a
common directory such as /var/lock/. This directory can then be checked and orphaned
lock files removed on startup.
Figure 8.6 shows the example running as two separate processes. The first starts and
acquires the lock. The second waits and checks the lock file at 1 second intervals, until
finally the first process releases the lock by removing the lockfile.
8.4 Semaphores
Another method which is supported in many systems is the semaphore. These are
similar in concept to the thread locks and atomic increments met in the module on
multitasking, in the context of threads. The so-called Dijkstra PV operations on
semaphores are defined as:
P decrease (not available, capture)
V increase (available, release)
Thus a semaphore acts as a flag for a resource, indicating whether the resource is free
and able to be used. A P operation decrements the count and flags the semaphore
as locked. A V operation increments the count and thus releases the flag. These
operations must be performed using a kernel function.
Figures 8.7 and 8.8 show a Unix client program which uses semaphores. The semaphore
is first created (Figure 8.9). After all processes have terminated, the semaphore may
be deleted from the system. Obviously this must be done by the parent process.
Figure 8.10 shows the encapsulation of the semop() system call to acquire and release
the semaphore. After the semaphore is created, its status may be examined using the
ipcs -s command as shown in Figure 8.11.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#define WAIT_LOCKTIME 10
int getLock(char *lockfile);
void releaseLock(char *lockfile, int lockfd);
int main()
{
int lockfd;
char *lockfile = "file.lk";
lockfd = getLock(lockfile);
if( lockfd != -1)
{
printf("use resources...\n");
fflush(stdout);
sleep(10);
releaseLock(lockfile, lockfd);
}
else
{
printf("failed to obtain lock\n");
fflush(stdout);
}
exit(0);
}
int getLock(char *lockfile)
{
int lockfd, tryCount = 0;
do
{
// check if the number of iterations has been exceeded
if(++tryCount > WAIT_LOCKTIME)
{
printf("getLock(): timeout on obtaining lock\n");
return -1;
}
// try to create the lock file (create + exclusive flags)
lockfd = open(lockfile, O_CREAT | O_EXCL, 0644);
if(lockfd == -1)
{
printf("getLock(): lockfile already exists\n");
sleep(1); // 1 second between polls - not a busy wait
}
} while(lockfd == -1);
printf("getLock(): lockfile created\n");
return lockfd;
}
Example output:
Process 1
phanes (leis) [4] lockfile
getLock(): lockfile created
use resources...
phanes (leis) [5]
Process 2 (run slightly later)
phanes (leis) [57] lockfile
getLock(): lockfile already exists
getLock(): lockfile already exists
getLock(): lockfile already exists
getLock(): lockfile already exists
getLock(): lockfile already exists
getLock(): lockfile already exists
getLock(): lockfile created
use resources...
phanes (leis) [58]
Semaphores are created under Windows NT using the system call CreateSemaphore().
Figure 8.12 shows the creation of the semaphore, which is then captured using
WaitForSingleObject() as shown in Figure 8.13. Note the possible return codes from
this call: the semaphore was acquired within the specified timeout, the timeout expired
before the semaphore became available, or the wait was abandoned altogether.
Any number of client processes may then attach to the system semaphore and wait on
it. Figure 8.14 shows that a handle to the semaphore is requested using OpenSemaphore(),
in much the same way as a file is opened. Next, WaitForSingleObject() is used to
atomically protect any accesses to shared resources. In the example of Figure 8.15
this is simply a Sleep() call, but of course in practice some more realistic operation
would be performed. The actual operation must be done as quickly as possible, as the
longer a process holds a semaphore, the longer other processes may be delayed
waiting for the semaphore to become available.
Figure 8.16 shows the actual output of the semaphore client/server combination. Note,
however, that the timing of the output is important but not conveyed in the printed listing.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdio.h>
int main()
{
int SemID, TestNum, MyPID;
SemID = CreateSem();
MyPID = getpid();
exit(0);
}
return SemID ;
}
SemOpBuf.sem_num = 0;
SemOpBuf.sem_op = 1;
SemOpBuf.sem_flg = SEM_UNDO ;
/* semserver.c
* Windows NT semaphores
* John Leis
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
HANDLE hSemaphore;
LONG semIncr = 1L, semPrev;
DWORD waitStatus;
// no security attributes,
// initial count 1 (released), max count 1
hSemaphore = CreateSemaphore( NULL, 1L, 1L, SEM_NAME);
if( hSemaphore == NULL )
{
printf("CreateSemaphore() failed\n");
exit(1);
}
activity 8.1
Windows NT Semaphores.
Compile the code examples semserver and semclient. Create two
separate command (DOS) windows. Invoke the client in one window: it
should fail, as the code assumes the semaphore has already been
created and simply tries to open the existing semaphore. Now run the
server in one window first, and then the client in the other window.
// decrements count
// timeout is in milliseconds, or INFINITE
waitStatus = WaitForSingleObject(hSemaphore, INFINITE);
printf("semaphore wait done, status = ");
fflush(stdout);
switch( waitStatus )
{
case WAIT_ABANDONED:
printf("abandoned\n");
break;
case WAIT_OBJECT_0:
printf("available within timeout\n");
break;
case WAIT_TIMEOUT:
printf("timeout\n");
break;
case WAIT_FAILED:
printf("failed\n");
break;
}
Sleep(10000L);
printf("awake\n");
fflush(stdout);
printf("releasing semaphore\n");
fflush(stdout);
Sleep(10000L);
printf("awake\n");
fflush(stdout);
CloseHandle(hSemaphore);
return 0;
}
/* semclient.c
* Windows NT semaphores
* John Leis
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
HANDLE hSemaphore;
LONG semIncr = 1L, semPrev;
DWORD waitStatus;
// decrements count
// timeout is in milliseconds, or INFINITE
waitStatus = WaitForSingleObject(hSemaphore, INFINITE);
printf("semaphore wait done, status = ");
fflush(stdout);
switch( waitStatus )
{
case WAIT_ABANDONED:
printf("abandoned\n");
break;
case WAIT_OBJECT_0:
printf("available within timeout\n");
break;
case WAIT_TIMEOUT:
printf("timeout\n");
break;
case WAIT_FAILED:
printf("failed\n");
break;
}
Sleep(2000L);
printf("awake\n");
fflush(stdout);
printf("releasing semaphore\n");
fflush(stdout);
return 0;
}
C:\usr\c\NT>semclient
waiting for semaphore....
semaphore wait done, status = available within timeout
sleeping for 2 seconds...awake
releasing semaphore
C:\usr\c\NT>semserver
waiting for semaphore....
semaphore wait done, status = available within timeout
sleeping for 10 seconds...awake
releasing semaphore
sleep...awake
The conceptual difference between each of these mechanisms, and where each would
be used, should be fully understood.
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html
Module 9
VIDEO GRAPHICS
AND ANIMATION
Module 9 Video Graphics and Animation 9.1
Although video displays are considered here, similar principles are applicable to real-time
audio output. These ideas extend to the implementation of networking protocols
and real-time control systems.
The video screen image is effectively a matrix of successive picture elements, called
pixels. The pixels are rendered in a particular color according to the values stored in
video memory (RAM). Each pixel corresponds to one or more bits in the video memory.
Because memory is linear (successive memory locations form a one-dimensional
list of stored bytes), the video memory must be mapped onto the two-dimensional
screen. (As an aside, the projection of three-dimensional images onto a
two-dimensional viewing plane is done using mathematical viewing transformations
from three dimensions to two, normally in software, sometimes using special-purpose
video processors.)
This mapping is illustrated in the accompanying figure. The convention is that
the origin, or point (0, 0), is in the upper-left corner of the
screen, but this need not always be the case. Assuming one byte is required per pixel,
a screen resolution of 1024 (width) by 768 (height) requires almost 1 MByte of video
RAM.
[Figure: raster scan of the video screen, showing the visible area, the horizontal retrace, and the vertical retrace.]
One or more bits in each video memory location define, directly or indirectly, the color
of the displayed pixel. At the most basic level, a single bit could be used to represent
black or white. Representing color becomes more complicated. The three primary
colors, red, green, and blue (RGB), may be combined in the appropriate amounts
to produce any desired color. Thus one common scheme is to reserve three bytes
for each pixel, the bytes together forming an RGB triplet. Each byte is used
as an intensity for the red, green and blue electron guns of the display. A DAC or
digital to analog converter performs the conversion from a binary value into a relative
voltage strength. Thus the relative value of each byte, 0 to 255, represents the relative
strength of each color. For example, red 0%, green 100% and blue 100% may be mixed
to represent cyan; red 100%, green 100% with no blue gives a strong yellow. Equal
red, green and blue gives a shade of gray ranging from black to white.
Using a direct representation gives the best possible color rendition, at the expense
of the amount of memory required. Using three bytes per pixel, a screen resolution of
1024 (width) by 768 (height) requires over 2MBytes of video RAM.
The byte offset of pixel (x, y) in linear video memory is offset = y × w + x, where w is the screen width in pixels (assuming one byte per pixel).
With an indexed (palette-based) scheme, where each pixel stores a small index into
a table of colors, less memory is required for the video display. The obvious
disadvantage is that fewer colors are available. A more subtle problem is that of
color re-use. In a windowing system, if one window is very graphics-intensive and
uses up most of the color palette entries, other applications suffer.
High-speed graphics cards often use an interleaving scheme as shown in Figure 9.4.
This is because the data access time of RAM, defined as the time from when the
address is presented to when the data is available, must be quite small for
high-resolution, fast-refresh video displays. Video RAM having a fast access time
is generally more expensive. To use lower-speed memory, interleaving fetches the
byte for pixel n from bank 1, the byte for pixel n + 1 from bank 2, and so forth.
Using, say, four memory banks in an interleaved fashion means that RAM with
approximately one-quarter of the access speed suffices, which reduces system cost.
9.4 70935 Real-Time Systems
[Figure: palette lookup, with each screen pixel value indexing the red, green, and blue palette entries.]
[Figure 9.4: successive pixels stored one byte at a time across interleaved memory banks.]
As Figure 9.5 indicates, this may be controlled for full-screen animation through a
special control register in hardware. If the animation is to be done within one
window on a Windows system, the double-buffering technique is still applicable,
although of course it is not done using special hardware (naturally, the hardware
cannot reflect the position of all screen windows, only the entire screen). However,
special-purpose high-speed memory copy operations, used with appropriate
synchronization, allow the effect of animation in application windows.
[Figure 9.5: double buffering, showing the visible display buffer.]
2. The new position of the objects is calculated. This is normally a small offset
depending on the current direction of motion. However, if the object appears to
hit the window's edge, the direction is reversed.
This cartoon-like sequence of redraws provides the desired animation, provided the
processor is fast enough to handle the timer-based redraws and still allow sufficient
time for other processing. However, the performance is not entirely satisfactory:
the display is plagued by a horizontal flicker which is quite noticeable, even on a
400 MHz CPU.
Before considering how to solve this problem, the internal workings of the application
will be dissected.
activity 9.1
The window event-handling is not unlike that discussed in the previous module. The
window callback function processes events, which are passed as an integer identifier
from the operating-system event queue. It is defined as

LRESULT CALLBACK WndProc(HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)

where message is the event identifier. Certain event messages have additional
parameters associated with them, such as the identity of the timer which expired
in the case of timeout messages. These are passed in the wParam and lParam variables.
Because the code is event-driven (meaning it is only activated when an event occurs,
as necessary), the state of the application window must be saved. In this case, the
current position of the objects is saved using

static int xPosBall, xStepBall;
static int xPosRect, xStepRect;

These are declared as static variables, meaning that they retain their value from one
invocation to the next. This is in contrast to automatic variables, the space for which
is allocated each time the function is called. Internally, static variables are stored in
the static data area, whereas automatic variables are stored on the stack. If the above
were not declared as static, their values would not persist from one invocation of
WndProc() to the next. For this reason, Windows applications tend to use many
global and/or static variables.
case WM_TIMER:
InvalidateRect(hwnd, NULL, TRUE);
return 0;
The timer event is used to handle the window update. Figure 9.7 shows the WM_TIMER
event, which simply calls InvalidateRect(). This has the effect of queueing a paint
message for the entire window.
Upon receipt of a WM_PAINT message, the window is re-drawn by filling the rectangle
with a background color, followed by re-drawing of the objects in their new positions.
Figure 9.8 shows the WM_PAINT event, which implements the erase-redraw strategy.
The objects' positions are then calculated in readiness for the next iteration.
In order to allocate the buffer, the size must be known. This changes when the window
is resized, and is captured when a WM_SIZE event is sent to the callback procedure. The
necessary code is shown in Figure 9.9. GetClientRect() captures the new size of the
window, and a memory bitmap is created using CreateCompatibleBitmap().
The redraw is implemented as follows. Figure 9.9 shows the WM_TIMER event, which
now draws to a memory bitmap rather than to the screen directly. The device context
of the visible window is obtained using GetDC(hwnd). A bitmap, which represents
the contents of the window, is selected using CreateCompatibleDC() and
SelectObject(). The drawing commands such as Rectangle() use the handle to
the memory buffer, hdcMem. Finally, the bitmap buffer is copied in its entirety to the
visible window using BitBlt(). This function is optimized for high-speed copying of
an entire bitmap, and thus eliminates the flicker which was present in the previous
example.
To summarize, the difference between the two methods is as follows. In the first, the
timer is used to trigger a repaint of the window. The drawing is done directly to the
visible window. In the second approach, the timer is used to draw to a hidden bitmap,
which is copied to the visible window. The size event is used to capture the required
size of the hidden bitmap buffer. This double-buffering approach may be used in real-
time systems to guarantee satisfactory performance where direct hardware I/O (typi-
cally video and audio) is required.
case WM_PAINT:
hdc = BeginPaint(hwnd, &ps) ;
SetMapMode(hdc, MM_TEXT); // pixel mode
GetClientRect(hwnd, &ClientRect);
// pen and brushes for drawing (deleted again below)
hpen = CreatePen(PS_SOLID, 1, RGB(200, 0, 0));
SelectObject(hdc, hpen);
hbrushb = CreateSolidBrush(RGB(200, 200, 200));
SelectObject(hdc, hbrushb);
hbrush = CreateSolidBrush(RGB(0, 200, 0));
SelectObject(hdc, hbrush);
// ball bouncing
Ellipse(hdc, xPosBall, 150, xPosBall+BallWidth, 180);
if( xPosBall+BallWidth > ClientRect.right )
xStepBall = -xStepBall; // hit right-hand side
if( xPosBall < ClientRect.left )
xStepBall = -xStepBall; // hit left-hand side
xPosBall += xStepBall; // update position
// box bouncing
Rectangle(hdc, xPosRect, 50, xPosRect+RectWidth, 100);
if( xPosRect+RectWidth > ClientRect.right )
xStepRect = -xStepRect; // hit right-hand side
if( xPosRect < ClientRect.left )
xStepRect = -xStepRect; // hit left-hand side
xPosRect += xStepRect; // update position
DeleteObject(hpen);
DeleteObject(hbrush);
DeleteObject(hbrushb);
EndPaint(hwnd, &ps) ;
return 0;
case WM_SIZE:
hdc = GetDC(hwnd);
cxClient = LOWORD(lParam);
cyClient = HIWORD(lParam);
GetClientRect(hwnd, &ClientRect);
hdcMem = CreateCompatibleDC(hdc);
// create a memory bitmap matching the new client size
hBitMap = CreateCompatibleBitmap(hdc, cxClient, cyClient);
ReleaseDC(hwnd, hdc);
return 0;
activity 9.2
case WM_TIMER:
if( ! hBitMap )
break;
hdc = GetDC(hwnd);
SetMapMode(hdc, MM_TEXT); // pixel mode
GetClientRect(hwnd, &ClientRect);
hdcMem = CreateCompatibleDC(hdc);
SelectObject(hdcMem, hBitMap);
// filled rectangle
hbrushb = CreateSolidBrush(RGB(200, 200, 200));
SelectObject(hdcMem, hbrushb);
Rectangle(hdcMem, ClientRect.left, ClientRect.top,
ClientRect.right, ClientRect.bottom);
hbrush = CreateSolidBrush(RGB(0, 200, 0));
SelectObject(hdcMem, hbrush);
// line drawing
hpen = CreatePen(PS_SOLID, 1, RGB(200, 0, 0));
SelectObject(hdcMem, hpen);
MoveToEx(hdcMem, 0, 0, &oldPoint);
LineTo(hdcMem, ClientRect.right, ClientRect.bottom);
DeleteObject(hpen);
// ball bouncing
Ellipse(hdcMem, xPosBall, 150, xPosBall+BallWidth, 180);
if( xPosBall+BallWidth > ClientRect.right )
xStepBall = -xStepBall; // hit right-hand side
if( xPosBall < ClientRect.left )
xStepBall = -xStepBall; // hit left-hand side
xPosBall += xStepBall; // update position
// box bouncing
Rectangle(hdcMem, xPosRect, 50, xPosRect+RectWidth, 100);
if( xPosRect+RectWidth > ClientRect.right )
xStepRect = -xStepRect; // hit right-hand side
if( xPosRect < ClientRect.left )
xStepRect = -xStepRect; // hit left-hand side
xPosRect += xStepRect; // update position
// copy the completed memory bitmap to the visible window
BitBlt(hdc, 0, 0, ClientRect.right, ClientRect.bottom,
hdcMem, 0, 0, SRCCOPY);
DeleteObject(hbrush);
DeleteObject(hbrushb);
DeleteDC(hdcMem);
ReleaseDC(hwnd, hdc);
return 0;
Further Reading
http://www.usq.edu.au/users/leis/units/70935/935link.html