
70935
Real-Time Systems
Faculty of Engineering
Bachelor of Engineering
Study Book
Written by
John Leis (Modules 1, 3-9)
BEng, MEngSc, PhD
Senior Lecturer
The University of Southern Queensland
Mark Phythian (Modules 1, 2)
BEng, MEng
Lecturer
The University of Southern Queensland
Published by
Distance Education Centre

The University of Southern Queensland
Toowoomba Qld 4350
Australia
http://www.usq.edu.au
Copyrighted materials reproduced herein are used under the provisions of the Copyright Act 1968 as amended,
or as a result of application to the copyright owner.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any
means electronic, mechanical, photocopying, recording or otherwise without prior permission.
Camera-ready copy produced using LaTeX 2ε by the author. Style file supplied by Ted Siebuhr, DEC. MATLAB
was used for the majority of PostScript graphs. The following LaTeX 2ε packages were utilized: dvips (PostScript
output); pstricks (most diagrams); hyperref (hyperlinking for Acrobat PDF format). AMS font system used for
some mathematics.
TABLE OF CONTENTS
Module 1 Real-Time Systems
1.1 Module Overview
1.2 Introduction
1.3 Real-Time Concepts
1.4 What is an Operating System?
1.5 What is a Real-Time System?
1.6 Terminology
1.6.1 Systems
1.6.2 Events
1.7 Real-Time Applications
1.8 Compilers for Code Examples
1.8.1 Platforms
1.8.2 Gnu Compilers
1.9 Links to Real-Time Systems Information
1.10 Module Summary

Module 2 Real-Time Software Design
2.2 Introduction
2.3 The Software Life Cycle
2.3.1 The Concept Phase
2.3.2 The Specification Phase
2.3.3 The Design Phase
2.3.4 The Programming Phase
2.3.5 The Test Phase
2.3.6 The Maintenance Phase
2.4 The Real-time System Specification and Design Techniques
2.4.1 Descriptive Techniques
2.4.2 Mathematical Techniques
2.4.3 Procedural Techniques
2.4.4 Structural Techniques
2.4.5 State-based Techniques
2.5 Implementation of Real-Time Kernels
2.5.1 The Function of a Real-Time Kernel
2.5.2 Polled Systems
2.5.3 Phase-Driven and State-Driven Systems
2.5.4 Interrupt Driven Systems
2.5.5 Types of Multi-tasking Systems
2.5.6 Foreground / Background Systems
2.7 Self Assessment Questions

Module 3 The C and C++ Programming Languages
3.2 Introduction
3.3 The C Language
3.3.1 Starting
3.3.2 Compiling C Programs
3.3.3 Basic Structure of a C Program
3.3.4 Data Types
3.3.5 Variable Scope
3.3.6 Pointers
3.3.7 Command-Line Arguments
3.3.8 Loading & Saving Data
3.3.9 Data Structures
3.3.10 Bitwise Operators
3.4 Advanced Topics
3.4.1 Portability
3.4.2 Makefiles
3.4.3 Public-Domain C Resources
3.5 The C++ Language
3.5.1 Compiling C++ Programs
3.5.2 Superficial Differences
3.5.3 Abstract Data Types
3.5.4 Object-Oriented Principles & Concepts
3.5.5 Object Definitions in C++
3.5.6 Input/Output
3.5.7 Declaring Classes and Creating Objects
3.5.8 Inheritance and Derived Classes
3.5.9 Setting Attributes
3.5.10 Operator Overloading
3.5.11 Object Arrays
3.5.12 Functions
3.5.13 References
3.5.14 Object Pointers
3.5.15 Stream Input/Output
Module 4 Coding Techniques
4.2 Introduction
4.3 Error Handling
4.4 Assembly Language
4.5 C to Assembler
4.6 Calling Assembly Code
4.7 Software Optimizations
4.7.1 Constant Folding
4.7.2 Constant Propagation
4.7.3 Strength Reduction
4.7.4 Dead Code
4.7.5 Use of Registers
4.7.6 Loop Unrolling
4.7.7 Use of Pointers
4.7.8 Loop Invariance
4.7.9 Function Inlining
4.7.10 Data Alignment
4.8 Memory Allocation
4.9 Memory Copying and Searching

Module 5 Algorithms
5.2 Introduction
5.3 Arrays and Buffers
5.4 Circular Buffers
5.5 Linked Lists
5.6 Binary Trees
Module 6 Multitasking
6.2 Multitasking
6.2.1 Processes
6.2.2 Threads
6.2.3 Examining System Processes
6.3 Multitasking Examples
6.3.1 Processes under Unix
6.3.2 Processes under Windows NT
6.4 Threads
6.4.1 Threads under Unix
6.4.2 Threads under Windows NT

Module 7 Interprocess Communication
7.2 Interprocess Communication
7.3 Command Arguments
7.4 Pipes
7.4.1 Pipes under Unix
7.4.2 Pipes under Windows NT
7.5 Shared Memory
7.5.1 Shared Memory under Unix
7.5.2 Shared Memory under Windows NT
7.6 Windows Interprocess Communications
Module 8 Process Synchronization and Timing
8.2 Process Notification
8.2.1 Unix Signals
8.2.2 Windows NT Events
8.3 Atomic Resource Locking
8.3.1 Lock Files
8.4 Semaphores
8.4.1 Unix Semaphores
8.4.2 Windows NT Semaphores

Module 9 Video Graphics and Animation
9.2 Video Subsystem
9.3 Animation Effects
9.3.1 Direct Repaint Approach
9.3.2 Memory-Copy Approach

Abbreviations
ASCII American Standard Code for Information Interchange

GUI Graphical User Interface
CPU Central Processing Unit
CISC Complex Instruction Set Computer
RISC Reduced Instruction Set Computer
DSP Digital Signal Processor (or Processing)
OS Operating System
DOS Disk Operating System
NT Windows NT (New Technology) Operating System
DLL Dynamic Link Library
POSIX Portable Operating System Interface
IEEE Institute of Electrical and Electronics Engineers
I/O Input/Output
IP Internet Protocol
TCP Transmission Control Protocol
TCP/IP Generic term for TCP and IP protocols
Module 1
REAL-TIME SYSTEMS
1.1 Module Overview
This module gives an overview of the field of real-time operating systems. As such, it
forms a basis for what is presented in the following modules by introducing many of the
important concepts involved in studying operating systems, and in particular real-time
operating systems. The examples given in later modules are aimed at giving concrete,
working examples of the principles discussed in this module. The first part of this
module discusses some more general themes, which are expanded upon in the later
modules:
The main concepts which underpin operating systems.
The extension of those concepts to real-time operating systems.
The latter sections of this module outline the materials which are necessary to complete
the suggested exercises:
Obtaining and installing a C++ compiler.
The differences between hardware platforms, operating systems and compiler
versions used for the examples.
Note that obtaining and installing a C/C++ compiler is discussed here, with program-
ming aspects deferred until the following module.
1.2 Introduction
Mention the word "computer" and many people will automatically think of the desktop
variety, common in offices and the home. However, applications of computer-based
and microprocessor-based systems abound elsewhere, from television remote controllers
and washing machines, to cell phones and engine management systems in
vehicles, to large corporate databases. At the hardware level, a common characteristic
of all these is the fetch-decode-execute processing cycle. The hardware is of course
controlled by the software sitting above it. The software in fact consists of a multitude
of functions: monitoring a keypad, accepting network connections, controlling
hardware such as disk drives and video displays. It is this illusion of a single processor
performing many, many tasks which makes software design and implementation
(particularly of real-time software) a challenging art. Design issues come to the fore:
not only the ability of the software to complete the requisite tasks, but the ability to
complete them in a timely fashion, usually with minimal resources.
1.3 Real-Time Concepts
The term real-time system is generally taken to mean a computer system which must
respond within a given amount of time. Failure to do so means that the particular job
at hand cannot be completed, and the overall system fails. The definition of failure is
a crucial one. For example, a computer controlling the lift motor in a lift in a high-rise
building must ensure that the lift cage is stopped at precisely the right level on each
floor. If the lift did not stop so that it was exactly level with the floor, it would be difficult
(and possibly dangerous) for the passengers to enter or exit. The computer system
must monitor the position of the lift at precise intervals and take action according to
the present position and the target position. An overshoot of even a few centimeters
would be considered unacceptable. A great many embedded control systems fall into
the category of requiring precise, guaranteed response times. Medical devices used to
administer patient care in hospitals, on-board controllers for aircraft, even fuel injection
controllers for passenger cars, could be given as examples.
Contrast the lift control example with, say, using a word processor or spreadsheet on
your desktop Personal Computer (PC). If the response to a request such as selecting
a table column, or a formatting menu item, or even scrolling the text, were delayed by
a fraction of a second, the overall system would not be considered to have failed. Even if
the response to a mouse selection took several seconds, the system would not have
failed, although the delay might inconvenience the user.
Many authors (such as [1]) use the terms soft real-time and hard real-time to distinguish
these types of situations. It could be argued, for example, that if the response time of
a word processor was of the order of minutes then the system would effectively be unusable.
However, the degradation in response time is not accompanied by catastrophic
failure. In the case of the lift controller, failure could indeed be catastrophic.
1.4 What is an Operating System?
An Operating System is often, erroneously, thought to mean the user interface to a
Personal Computer (PC). The user interface, whether text-based
or a Graphical User Interface (GUI), is a separate issue. It may, however, be tightly
integrated into the operating system itself.
A minimal operating system comprises:
1. A resource manager, to control access to attached devices.
2. A task manager, to ensure that several jobs or tasks may be accomplished
together.
3. A memory manager, to give each task the memory it requires for operation.
4. A system interface: a set of software entry points to request operations to be
done on behalf of the application.
Resources might include, for example: a network connection, a keyboard, an output
port connected to a motor speed control, or input ports connected to controlling
switches. The model is to have several tasks or processes running concurrently.
This means that each is allocated a certain proportion of time, called the timeslice,
in which to execute. After this time has elapsed, another process is run. The act of
switching between tasks is called a task switch or context switch.
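As a rough illustration of timeslicing, the following C sketch simulates several tasks sharing one processor in round-robin order. The task_t record, the run_round_robin() function and the idea of measuring work in abstract "ticks" are assumptions made for this example only. A real kernel would also save and restore the processor state at each switch, which is not modelled here.

```c
#include <stddef.h>

/* Hypothetical task record: only the outstanding work is tracked. */
typedef struct {
    int id;
    int remaining;   /* ticks of work still to be done */
} task_t;

/*
 * Run all tasks round-robin, giving each at most `timeslice` ticks per turn.
 * Returns the number of context switches (the processor being handed from
 * one task to a different task), or -1 if the timeslice is not positive.
 */
int run_round_robin(task_t *tasks, size_t n, int timeslice)
{
    int switches = 0;
    int last = -1;            /* index of the task that ran most recently */
    size_t unfinished = 0;

    if (timeslice <= 0)
        return -1;

    for (size_t i = 0; i < n; i++)
        if (tasks[i].remaining > 0)
            unfinished++;

    while (unfinished > 0) {
        for (size_t i = 0; i < n; i++) {
            if (tasks[i].remaining <= 0)
                continue;                 /* finished tasks are skipped */
            if (last != -1 && last != (int)i)
                switches++;               /* a different task takes over */
            last = (int)i;

            /* Run for one timeslice, or less if the task finishes early. */
            int run = tasks[i].remaining < timeslice ? tasks[i].remaining
                                                     : timeslice;
            tasks[i].remaining -= run;
            if (tasks[i].remaining == 0)
                unfinished--;
        }
    }
    return switches;
}
```

For two tasks needing 3 ticks and 1 tick with a timeslice of 2 ticks, the first task runs, is switched out for the second, and is then switched back in to finish.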
The core of the operating system, usually called the kernel or executive, is responsible
for supervising the overall system. Context switching is managed by the kernel
such that each task is given an appropriate amount of time in which to execute. What
constitutes an appropriate amount of time depends on the task and what duties it has
to perform.
The system interface is the method by which the systems programmer gains access
to the underlying subsystem. This may mean hardware or software resources, or a
combination of both. Sending some data to a remote computer using a network con-
nection, for example, entails copying the data to the appropriate buffers and sequenc-
ing the transmission through the network hardware. For several reasons, the systems
programmer does not normally have, or wish to have, direct access to hardware de-
vices. To continue the example, writing code that directly manipulates the registers of
a network controller chip is tedious and error-prone. Worse, the location and functions
of the registers will differ depending on the manufacturer of the chipset! This is where
the device driver is used. This is the appropriate code to control the hardware devices
and provide services to the system. These services are accessed from the user-level
programs (tasks) via a well-known API, or Applications Programming Interface.
So it can be seen that in order for several tasks to co-operate, a common manager
must be called upon. User-level programs may be several levels removed from the
underlying hardware. For example, it is usually convenient to read disk files as a se-
quence of characters or lines. The underlying storage is quite different from this, as
disks use fixed-size blocks called sectors. The operating system pieces together the
sectors in the right order for user programs to access. The access may be via the API
calls open() and read(). These usually (but not always) appear to the programmer
as simple function calls in the C language. Underneath, the kernel is joining sectors
together to form a contiguous byte-stream file for the application code. The device
driver is called to read the correct sector at a certain offset from the start of the disk. It
must manipulate the control registers and set up buffers in order to access the physical
disk. A different device driver is required for, say, a hard disk compared to a CD-ROM.
Above this, the data bytes constituting the file may be viewed as records in a database.
This process (user view of database records, application view of a byte-stream, kernel
view of disk blocks, and device driver view of read/write control registers) is called
layering or abstraction. This is shown diagrammatically in Figure 1.1.
[Diagram layers, top to bottom: User Applications (user level); system calls; Kernel
(filesystem, process control, memory); device drivers; Hardware]
Figure 1.1
System hierarchy. User programs cannot access hardware and resources
such as memory directly, only through the appropriate system
calls (called the API or Applications Programming Interface).
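The application's byte-stream view of a file can be sketched using the open() and read() calls mentioned above, here in their POSIX form. The function name count_bytes() and the 512-byte buffer size are arbitrary choices for this illustration. The program never sees sectors or device registers; those details remain in the kernel and the device driver.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Count the bytes in a file using the POSIX open()/read()/close() calls.
 * Returns the byte count, or -1 if the file cannot be opened or read.
 */
long count_bytes(const char *path)
{
    char buf[512];            /* arbitrary buffer size for this sketch */
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* read() returns the bytes delivered, 0 at end of file, -1 on error. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return (n < 0) ? -1 : total;
}
```

A call such as count_bytes("data.txt") (a hypothetical filename) returns the file length in bytes regardless of how the file is laid out on the disk.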
1.5 What is a Real-Time System?
A real-time operating system differs in design from a conventional operating system. Real-time
operating systems usually include methods for precise timing of events, priority-based
scheduling of tasks according to their importance, and methods to ensure reliability.
This is not to say that any or all of these features may not be present in a
conventional operating system. Priority-based scheduling, for example, is standard
practice in Unix systems; however, Unix is not considered to be a real-time operating
system. As an example of priority, the task which controls the lift position in the previous
lift control example is more important than the task which monitors the user-request
buttons for each floor. The motor control is crucial, and hence must not be held up by
polling the press-buttons.
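The idea of priority-based selection can be sketched in C as follows. The rt_task_t structure, the pick_next_task() function and the convention that a larger number means a more important task are assumptions of this example, not the interface of any particular kernel.

```c
#include <stddef.h>

/* Hypothetical task descriptor for this sketch. */
typedef struct {
    const char *name;
    int priority;   /* larger value = more important (assumed convention) */
    int ready;      /* non-zero if the task has work to do */
} rt_task_t;

/* Return the index of the highest-priority ready task, or -1 if none. */
int pick_next_task(const rt_task_t *tasks, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0 || tasks[i].priority > tasks[best].priority)
            best = (int)i;
    }
    return best;
}
```

With a "motor control" task at priority 10 and a "button scan" task at priority 1, the motor control task is chosen whenever both are ready, so the button polling cannot hold it up.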
Sometimes hardware assistance is available to help ensure the reliability of mission-critical
systems. So-called watchdog or dead-man timers are often used. The tasks
and/or kernel must poll certain ports or memory locations periodically. If this is not
done, it is assumed the system has failed (possibly due to software malfunction) and
the hardware is reset. Although not designed to happen in the normal course of events,
such a design means that the system recovers automatically. Such a design may be
augmented using polling or supervisory tasks, which check on the contents of certain
memory areas (for example) at regular intervals. If the contents are not as expected, the
task under supervision is assumed to have failed and action must be taken to restore
the state of the system.
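The behaviour of a watchdog timer can be modelled in software as a simple countdown. In the sketch below, the watchdog_t structure and the function names are hypothetical. A real watchdog is a hardware counter, and "kicking" it typically means writing to a device-specific port or register.

```c
/* Software model of a watchdog timer (illustrative only). */
typedef struct {
    unsigned timeout_ticks;   /* ticks allowed between kicks */
    unsigned counter;         /* ticks remaining before expiry */
    int expired;              /* set once the watchdog has fired */
} watchdog_t;

void watchdog_init(watchdog_t *wd, unsigned timeout_ticks)
{
    wd->timeout_ticks = timeout_ticks;
    wd->counter = timeout_ticks;
    wd->expired = 0;
}

/* Called periodically by a healthy task: restarts the countdown. */
void watchdog_kick(watchdog_t *wd)
{
    wd->counter = wd->timeout_ticks;
}

/*
 * Called once per timer tick. Returns non-zero once the countdown has
 * reached zero, meaning the system should be reset.
 */
int watchdog_tick(watchdog_t *wd)
{
    if (wd->expired)
        return 1;
    if (wd->counter > 0)
        wd->counter--;
    if (wd->counter == 0)
        wd->expired = 1;
    return wd->expired;
}
```

As long as watchdog_kick() is called more often than every timeout_ticks ticks, watchdog_tick() keeps returning zero. If the supervised code hangs and the kicks stop, the countdown reaches zero and a reset is signalled.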
It is important to understand that real-time does not necessarily mean fast. Although
some emphasis is placed in subsequent modules on fast and efficient algorithms and
coding techniques, speed is not the sole element. The design aspect of considering
what is important and what is relatively less important in terms of time is crucial to the
overall success in meeting real-time performance constraints.
1.6 Terminology
As with any technical field there are a number of terms significant to real-time sys-
tems. In this section we will look at a number of definitions and briefly expand on their
application.
1.6.1 Systems
When we consider a real-time system as a single entity the following terms are useful
to define system level concepts:
System Terms 1
A system may be considered as any single entity that has a number of inputs and
outputs.
A response time is the time interval between the presentation of a set of inputs
to a system and the appearance of the resulting set of outputs.
A set of inputs or outputs is considered bounded if they always remain within
a set of specified limits.
A system specification is a document which defines the requirements of a system.
A real-time system is one which must meet specified bounded requirements,
including response time constraints, or cause a system failure.
A system failure is the result of failing to satisfy one or more of the specified
system requirements.
The use of these terms is fairly generic. You will come across them in many fields
associated with real-time applications. In this course of study the above terms will
be more specifically applied to software systems, but this does not exclude the computer
hardware on which software is operating. There is often a very close tie between the
software system and hardware system.
The depth to which the software and hardware systems are integrated broadly falls into
three categories:
System Terms 2
Embedded systems - in which the software is completely encompassed by the
hardware system on which it runs.
Organic systems - in which the software is independent of the hardware system
on which it runs.
Semi-detached systems - in which the software may run on an alternate hardware
system with some hardware related modifications.
Examples of real-time systems falling into each of the three classifications might include:
embedded systems - a car's fuel injection system, a lift control system.
organic systems - an airline reservation system, a library catalogue system.
semi-detached systems - a production line quality monitoring system, some computer
operating systems.
The timeliness of the response of a real-time system is a major issue in its ability to
meet an acceptable performance standard. What is considered as acceptable per-
formance however varies widely from one application to another. You might be pre-
pared to wait for five seconds for an automatic teller machine to dispense your cash,
but the same response time would render most personal computer software useless.
Deadlines in real-time systems are often referred to as soft, hard or firm deadlines.
Consequently systems may also be classified as:
System Terms 3
Soft real-time systems - where performance is degraded by missed deadlines,
but a missed deadline does not cause a system failure.
Hard real-time systems - where failure to meet response time constraints would
cause a system failure.
Firm real-time systems - where failure to meet response time constraints can be
tolerated occasionally.
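One possible way of expressing these three classifications in code is sketched below. The enumerations, the judge_deadlines() function and in particular the firm_tolerance parameter (how many misses count as "occasional") are assumptions introduced for this example. The definitions above do not specify a numeric tolerance.

```c
typedef enum { RT_SOFT, RT_FIRM, RT_HARD } rt_class_t;
typedef enum { RT_OK, RT_DEGRADED, RT_FAILED } rt_outcome_t;

/*
 * Judge a system from the number of deadlines missed in some observation
 * window. firm_tolerance is a hypothetical parameter: the number of misses
 * a firm real-time system is assumed to tolerate in that window.
 */
rt_outcome_t judge_deadlines(rt_class_t cls, unsigned misses,
                             unsigned firm_tolerance)
{
    if (misses == 0)
        return RT_OK;

    switch (cls) {
    case RT_HARD:
        return RT_FAILED;       /* any missed deadline is a system failure */
    case RT_FIRM:
        return (misses <= firm_tolerance) ? RT_DEGRADED : RT_FAILED;
    case RT_SOFT:
    default:
        return RT_DEGRADED;     /* performance suffers, but no failure */
    }
}
```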
When it comes to stating a measure of performance for a real-time system we could
simply state that a system either meets its system specification or it does not, but that
would really only work for hard real-time systems. A more general measure of real-time
performance often used is time-loading, defined as:
Time Loading
Time-loading is a measure of the useful processing the computer is doing, expressed
as a percentage of its full capacity.
Time-loading may be measured using modern logic analysers or estimated by calculating
the execution time of all sections of the application software. A satisfactory level
of time-loading for a well-designed real-time system should be around 70%. Higher
values may be acceptable in some systems if no expansion is expected, but very low
values (< 10%) would indicate under-utilised hardware.
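An estimate of time-loading can be sketched in C under the assumption (made for this example, not stated above) that the software consists of periodic tasks, each with a known execution time and period. The load contributed by each task is then its execution time divided by its period. The task_timing_t structure and time_loading_percent() function are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical timing record for one periodic task. */
typedef struct {
    double exec_time_ms;   /* execution time per activation */
    double period_ms;      /* interval between activations */
} task_timing_t;

/*
 * Estimate time-loading as a percentage of full processor capacity.
 * Returns -1.0 if any task has a period that is not positive.
 */
double time_loading_percent(const task_timing_t *tasks, size_t n)
{
    double load = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].period_ms <= 0.0)
            return -1.0;
        load += tasks[i].exec_time_ms / tasks[i].period_ms;
    }
    return 100.0 * load;
}
```

For example, three tasks taking 2 ms every 10 ms, 5 ms every 20 ms and 25 ms every 100 ms contribute 20%, 25% and 25% respectively, giving a time-loading of 70%.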
1.6.2 Events
When we look closer at the operation of a real-time system we come to realise that
it is changes in input conditions or changes in the internal state of the system that
trigger its dynamics. By dynamics we mean the way the system responds to change.
A change in either of these conditions is referred to as an event. In real-time systems
the following terms are used to describe event related concepts:
Event Related Terms 1
An event is any occurrence which causes a change in system state or flow of
control of a real-time system.
A synchronous event is any event which occurs at a predictable time in the flow
of control in the system.
An asynchronous event is any event whose occurrence cannot be predicted in
the flow of control in the system.
A system state is any unique condition that a system can attain, as defined by a
set of system variables.
The concept of asynchronous and synchronous events may not at first be obvious to
all readers. Consider a software controlled real-time system which, like any software,
contains decision points. It is at these decision points that a predictable change in
system state or flow of control occurs. Other program events can also be classified as
synchronous such as errors in arithmetic calculations and software interrupt instruc-
tions.
Asynchronous events are characterised by their unpredictability with respect to the
program execution. Examples include hardware interrupts and changes in input quantities.
Even a regular clock source such as a 1 ms interval timer input is considered
asynchronous, as the event occurs at random with respect to the sequence of instructions
in the program.
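A common way of dealing with an asynchronous event is for a handler to record that the event occurred, leaving the main flow of control to act on it at a convenient point. The sketch below uses the standard C signal() facility, with SIGINT standing in for a hardware interrupt. The function names are hypothetical, and using a signal in place of a real interrupt is an assumption made purely for illustration.

```c
#include <signal.h>

/* Flag shared between the handler and the main flow of control. */
static volatile sig_atomic_t event_pending = 0;

/* Handler: may run between any two instructions of the main program. */
static void on_event(int signo)
{
    (void)signo;
    event_pending = 1;   /* only record the event; the work is done later */
}

/* Install the handler for SIGINT. Returns 0 on success, -1 on failure. */
int install_event_handler(void)
{
    return (signal(SIGINT, on_event) == SIG_ERR) ? -1 : 0;
}

/*
 * Called from the main loop at a point of its own choosing.
 * Returns 1 if an event had been recorded since the last call, else 0.
 */
int consume_event(void)
{
    if (!event_pending)
        return 0;
    event_pending = 0;
    return 1;
}
```

For experimentation, the event can be triggered from within the program by calling raise(SIGINT), or from the keyboard with the interrupt key on systems that support it. Note that several events arriving between two calls to consume_event() are reported only once in this simple scheme.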
There is another concept related to events that is very important to the operation of real-
time systems, that of determinism. Determinism is the ability to predict how a system
will behave under all possible conditions, which includes all system states and event
combinations. Assuming that a real-time system operates with bounded inputs and a
finite number of system states, it should be possible to predict all system responses
and any resulting change in system state. Consider the following terms related to
determinism.
Determinism
A deterministic system is one for which a unique set of responses and the next
state can be determined for each possible state and set of inputs.
Event determinism is where a unique set of responses and the next state can be
determined for that event.
Temporal determinism is where the response times of the system can be deter-
mined for each possible state and set of inputs.
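Event determinism can be illustrated with a state transition table, in which every combination of state and input has exactly one next state. The states and inputs below describe a much simplified, hypothetical lift controller invented for this sketch. Temporal determinism (response times) is not modelled.

```c
/* Hypothetical states and inputs of a simplified lift controller. */
typedef enum { ST_IDLE, ST_MOVING, ST_DOOR_OPEN, ST_COUNT } state_t;
typedef enum { IN_CALL, IN_ARRIVED, IN_DOOR_TIMEOUT, IN_COUNT } input_t;

/* One entry for every (state, input) pair, so the next state is unique. */
static const state_t next_state[ST_COUNT][IN_COUNT] = {
    /*                  CALL          ARRIVED       DOOR_TIMEOUT */
    /* IDLE      */ { ST_MOVING,    ST_IDLE,      ST_IDLE   },
    /* MOVING    */ { ST_MOVING,    ST_DOOR_OPEN, ST_MOVING },
    /* DOOR_OPEN */ { ST_DOOR_OPEN, ST_DOOR_OPEN, ST_IDLE   },
};

/* Return the next state; out-of-range arguments leave the state unchanged. */
state_t lift_step(state_t s, input_t in)
{
    if ((int)s < 0 || (int)s >= ST_COUNT ||
        (int)in < 0 || (int)in >= IN_COUNT)
        return s;
    return next_state[s][in];
}
```

Because the table is complete, the response to any input in any state can be predicted in advance simply by reading the corresponding entry.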
1.7 Real-Time Applications

Practical real-time systems are present in much of the high-tech equipment that sur-
rounds us in this modern world. Some applications include:
aircraft - flight control, navigation, environment control
automotive - engine management, cruise control
military - weapon guidance, enhanced vehicle control
space exploration - rocket control, navigation, environment control
medical equipment - pacemakers, critical care equipment
electricity generation - system stability and security, nuclear reactor control
lift control - direct operation and safety, multi-lift management
teller machine - transaction control, money dispensing
library catalog - inquiry, inventory support
reservation system - inquiry, reservation
stock control - stock movement, inventory support
video games - user interface, video and audio generation
1.8 Compilers for Code Examples

Throughout this unit of study, many code examples are presented both to illustrate cer-
tain language features or API function calls, and to give concrete examples of various
principles and algorithms. The student is encouraged to study these by compiling, ex-
ecuting and (where appropriate) changing the source. This section explains what is
required in order to be able to do this.
1.8.1 Platforms
C is available in several flavours:
Unix The original C compiler called cc; it comes as standard in most Unix installa-
tions.
DOS Various commercial versions (for example, Microsoft QuickC and Borland Turbo
C), and the free 32-bit Gnu C compiler (called gcc for C and g++ for C++; see footnote 1).
Windows Various commercial versions such as Microsoft Visual C/C++ and Borland
C++Builder. The GnuC compiler is also able to compile Windows programs but
uses a command-line interface.
One advantage of C is that it is available on several different computer platforms, notably
Windows and Unix. Windows-based compiler offerings from Borland and Microsoft
are available in academic versions for around $150-$200. An alternative is
the free Gnu C compiler software, available for Windows (using a shell) and Unix.
Commercial compilers normally come in the form of an Integrated Development Environment
(IDE), whereas Unix compilers come in the traditional command-line form.
The IDE provides an integrated editor with a mouse-based menu system for compiling,
debugging and running programs. Command-line systems, such as Unix and the Gnu
C compiler for DOS/Windows (to be discussed), require the user to run a separate editor
and compile programs using command-line sequences (these will be summarized
shortly).
Footnote 1: Previously gxx because of filenaming problems.
1.8.2 Gnu Compilers
The Gnu C compiler is available for Unix, with a port to DOS called DJGPP and a
Windows port called Cygwin.
Firstly, DJGPP is available from
http://www.delorie.com
or a (faster) mirror site
ftp://mirror.aarnet.edu.au/pc/simtelnet/gnu/djgpp/
Instructions for obtaining and installing GnuC for DOS may be obtained at
http://www.usq.edu.au/users/leis/gnuc/gnuc.html
The GnuC port for Windows by Cygnus is freely downloadable, from
http://sourceware.cygnus.com/cygwin/
or a (faster) mirror site
http://mirror.aarnet.edu.au/cygwin/
Instructions for obtaining and installing GnuC for Windows may be obtained at
http://www.usq.edu.au/users/leis/cygnus/cygnus.html
The Cygwin compiler includes tools and software libraries for Unix and Win32 system
calls.
What's the difference between the Cygnus and DJGPP ports of the GnuC compiler?
DJGPP does not (as of this writing) support long filenames on Windows NT.
DJGPP supports numerous hardware-access and graphics facilities, for example
access to input/output ports using inportb() and outportb(). Cygwin does
not.
Cygwin supports Unix and Win32 Application Programming Interfaces (APIs),
for example the Unix fork() to create child processes, together with the network
sockets API.
Cygwin is easier to install: only one file need be downloaded.
The DJGPP C++ compiler is invoked by gxx whereas the Cygwin C++ compiler
is invoked by g++. This is due to historical limitations on DOS filenames. For
compiling C programs, both use gcc.
If unsure, it is recommended that you use the Cygwin GnuC package.
The GnuC compiler ports mentioned above are both free, and provide a 32-bit envi-
ronment for C and C++ programming. This means that the memory limitations of other
DOS language tools (such as DOS C compilers and QBasic) do not exist.
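Because the two ports differ in the facilities they offer, it is sometimes useful for a program to know which one is compiling it. The sketch below relies on the macros __DJGPP__, __CYGWIN__ and __GNUC__, which the respective compilers predefine. The function name gnu_port_name() is hypothetical.

```c
/* Return a short name for the compiler port, based on predefined macros. */
const char *gnu_port_name(void)
{
#if defined(__DJGPP__)
    return "DJGPP";                     /* DOS port */
#elif defined(__CYGWIN__)
    return "Cygwin";                    /* Windows port */
#elif defined(__GNUC__)
    return "GNU C (other platform)";    /* for example, gcc on Unix */
#else
    return "not GNU C";
#endif
}
```

The same conditional-compilation technique could be used to include port-specific code, such as inportb() and outportb() calls, only when compiling under DJGPP.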
For an easier-to-use integrated code development system, the author uses Borland
C++Builder (version 4 as of this writing).
1.9 Links to Real-Time Systems Information

Recently, real-time system failures have been documented, some with very serious
consequences. Amongst the more interesting of recent times are those of the Therac
medical treatment system, and the Ariane rocket. The exact cause of each (design
failure, hardware failure, software failure, operator failure) is debatable. However, the
following activity should be completed before continuing.
activity 1.1
Failures in Real-Time Systems.
1. Read one of the articles on the Therac medical accelerator, linked
from 70935 Real-Time Systems Further References (footnote a). It is only
necessary to read one (several links are provided in case one is
unavailable).
2. Read one of the articles on the Ariane rocket, linked from
70935 Real-Time Systems Further References (footnote b). It is only necessary
to read one (again, more than one link is provided in case one
is unavailable).
a
http://www.usq.edu.au/users/leis/units/70935/935link.html
b
1.10 Module Summary

This module has:
Served to introduce some of the important concepts and terms pertaining to com-
puter operating systems, with the extension to real-time systems.
Indicated the requirements for a C compiler for completing the unit, and given
instructions on obtaining the free Gnu compiler.
The linked articles on real-time system failures should be read and considered before
continuing.
Further Reading
70935 Real-Time Systems Further Referencesa
a
Module 2
REAL-TIME
SOFTWARE DESIGN
2.1 Module Overview

The objectives of this module include:
To introduce the concept of the software life cycle
To explain the importance of using best-practice techniques of software engineering
for developing reliable real-time systems
To describe the elements which form a Real-Time System Specification
To describe several software design techniques and design tools for real-time
system design
To introduce the concept of the kernel
To explain how to implement basic real-time kernels
To explain the concept of foreground and background functions in kernels
2.2 Introduction
The best way to ensure that the implementation of a real-time system meets all the
requirements of its system specification is to adopt a solid engineering methodology. A
traditional engineering design and development process involves the following phases:
conception
specification
design
implementation
testing
maintenance
Real-time software is such an integral part of most modern real-time systems that it is
common to hear software referred to as an engineering material. Hence the phases
listed above should equally apply to the creation of software as they do to building
a sky-scraper or manufacturing a jet engine, given that we want to produce a quality
real-time system.
The characteristics we associate with the term quality, such as conformity, precision,
cost effectiveness, efficiency, maintainability and reliability, are often linked with engineered
objects we can see, but what about the bits we can't see? These characteristics
should be at least as important to the production of quality software as they are to
our other engineering achievements, perhaps even more so. Would you want to trust
your life to an aircraft's autopilot that you knew to be of suspect quality?
In this module we will learn about good software design methodology and how to use
several design tools and techniques to produce quality real-time software. We will also
be looking at the ways that system specifications can be defined/described in order to
unambiguously state the requirements of the real-time system.
2.3 The Software Life Cycle

Good software development is a controlled process, with a clear methodology and
predictable outcomes. It is much more than someone just sitting at a computer writing
code. As described earlier, applying a solid engineering approach to software design
and development can significantly improve the quality of the final product. And we must
begin to think of our work as a product to be consumed by a discerning customer, if we
intend to produce a system embodying quality characteristics.
There are six distinct phases in the software life cycle which parallel the six phases
used in engineering. In software engineering these are referred to as:
The Concept phase
The Specification phase
The Design phase
The Programming phase
The Test phase
The Maintenance phase
While an engineering methodology is often applied rigorously in a pure engineering
application, software engineering tends to exhibit a certain flexibility, as software is
inherently a more pliable engineering material than steel or concrete. While the malleability
of software can be a very beneficial characteristic, its misuse can also result in
a substandard product.
Software engineering for real-time systems can be subdivided into the six phases
above with fairly distinct outcomes for each phase. Table 2.3 identifies the processes
and outcomes involved in each phase.
While all phases in the software life cycle are important, it is often the design and test
phases that newcomers to the field tend to undervalue. In fact it is in the design
phase that qualities such as conformity, precision and reliability are built into
real-time software. It is also in the design phase that the test cases are devised, from
which the final system performance will be determined. The test phase is then used to
check the correspondence of the system with respect to the system specification.
2.3.1 The Concept Phase
All things begin with an idea. In this phase of a real-time system design, ideas are
proposed and discussed for new products, enhanced products or solutions to a problem
posed. This phase is often initiated by market forces, changes in technology or a
request from a client, where an opportunity is identified to fulfill a need.
In the conceptual phase of the project the main objectives are to:
identify the opportunity/need for the product
identify the features/objectives of the product
produce market/feasibility studies
While these tasks may not always fall into the domain of the real-time systems engineer,
he/she is often involved in evaluating and preparing the technical aspects of proposals
and estimating a product's potential performance and practicalities.
2.3.2 The Specification Phase
In this phase the operational and contractual details of the project are identified and
documented. The operational part of the specification lists and describes the func-
tions, processes and performance requirements of the system including characteris-
tics such as speed, accuracy, stability and response times. The contractual part of
the specification details the scheduling, budgetary and legal (if any) elements of the
project. These two sets of documentation are sometimes referred to as the functional
and non-functional requirements. The non-functional requirements may also include
specification of programming language, time-loading, system hardware etc.
The specification phase is also the stage at which the test plan is produced. The
express purpose of the test plan is to define how the system will be tested to verify
conformity and performance with respect to the system specification.
Phase          Processes                                Primary Outcome
Concept        Outline project objectives/feasibility   Project outline
Specification  Detail system requirements               System specification
Design         Detail the system structure              System design document
Programming    Implement design methodology             Program code
Test           Verify system meets specification        Test reports
Maintenance    Maintain code                            Maintenance reports
Table 2.1 Software Engineering phases

In the specification phase of the project the main objectives are to:
identify the functions/requirements of the product
produce the system specification documentation
produce schedule, budget and contractual documentation
produce a test plan for system verification
The system specification is often written by or in close contact with the client or target
user of the product. The client may be a traditional customer, your boss, another sec-
tion of a large organisation or a particular industry sector. Real-time system engineers
are most likely to have direct input to this phase when the objective is to produce an
enhanced version of an existing product.
2.3.3 The Design Phase
In the design phase the system specification is transformed from a list of functions
and requirements to a detailed statement of implementation referred to as the system
design document. The system design document specifies how the system require-
ments are to be met by partitioning the functions and processes prescribed in the sys-
tem specification into functional modules, supported by a collection of data structures.
Techniques and tools for creating the system design are explained in section 2.4.
During the design phase it is possible to identify flaws in the system specification which
cannot be worked around, or requirements that cannot be met with current technology.
In either case a request must be made to the originators of the system specification
for a system change. Design changes should only be implemented as the result of an
authorised system change.
In the design phase of the project the main objectives are to:
partition the functions and processes into modules
identify problems in the system specification for possible change
produce the system design documentation to a recognised design standard
produce a set of test cases based on the test plan
The system design document is often prepared by a team of design engineers and
analysts. This should be done according to a recognised international software design
standard such as DOD-STD-2167A or IEEE standard 1016.
2.3.4 The Programming Phase

Once a system design document is produced the next phase of the project involves pro-
gramming the functional modules and creating the data structures to operate as per the
system specification. This task may involve one or many software engineers depend-
ing on the complexity and diversity of skills required for the implementation. Different
approaches can be used to achieve this task but the most common is a bottom-up
implementation strategy, where low level modules are developed first following on to
more complex interactions of modules.
In the programming phase of the project the main objectives are to:
decide on a kernel structure and implement the kernel
create the data structures and write the software modules
debug the software
develop test cases and procedures
integrate modules to form a functional system
For the real-time design engineer the most important aspect of this phase is to properly
manage the software development process to ensure a level of quality control. This can
be accomplished through careful management techniques, sometimes with the aid of
software project management tools.
2.3.5 The Test Phase

The formal testing of real-time software involves the strictly controlled application of
the test plan that was produced in the specification phase. While testing is performed
during the programming phase of the project to verify the operation of various modules,
this process tends to be relatively ad hoc. To show that the complete real-time system
meets all requirements as set down in the system specification, a rigid test program
must be adopted.
The importance of the test phase cannot be overstated. In some cases it is only when
the system is tested as a complete unit that some errors manifest. Some performance
measures such as response time and time loading can only be fully evaluated
when the system is fully operational. Any failure to meet the requirements of the system
specification will require corrective action, which means returning to the programming
phase or possibly even the design phase.
To thoroughly test a complex system, the test phase should include a comprehensive
set of loading models. Loading models are sets of operational states and conditions
under which the real-time system is expected to operate. The purpose of these tests
is to thoroughly exercise or stress the system to prove its stability and performance.
The likelihood of various loading models can be estimated from a probabilistic analysis
based on the system's operation and structure.
In the test phase of the project the main objectives are to:
verify system function and performance according to the test plan
perform system stability and performance tests under various loading models
produce test reports detailing level of compliance
This phase of system development is performed either by the development team or by an
independent third party. Either way the test phase must be carefully supervised and
documented. In traditional engineering this phase is referred to as commissioning.
2.3.6 The Maintenance Phase

Beyond the test phase there are invariably ongoing corrections and revisions to soft-
ware systems in response to reports of errors and suggestions from users. A controlled
release program of a Beta version of a system is often used to help identify significant
deficiencies in complex systems before the true product launch. Revisions of complete
systems typically use a system of regression testing to verify system function after
selected sections of a system are modified.
While some consider this revision process to be part of the test phase, it has a distinct
maintenance aspect which is bound to customer support issues. It is often essential
that product maintenance include the capacity to revise system software, but after
some predefined period the product is usually no longer supported.
In the maintenance phase of the project the main objectives are to:
deploy the system into practical application
provide a customer support system including error reporting
maintain the system as a product through a system of revisions
The software engineer may be involved during this phase in ongoing product enhancement
or regression testing. In small companies the engineer may also find himself or herself
with some direct involvement in marketing and customer support.
2.4 The Real-time System Specification and Design Techniques
The two foundations of good real-time software design are a sound system specifica-
tion and a definitive system design document. The purpose of the concept phase is
to produce a collection of features and objectives for the desired product, but this is a
far cry from a definitive statement of requirements. The purpose of the Specification
Phase is to transform the conceptual into the practical.
This task can be proceduralised to some degree, by using specific techniques to define
the system requirements. Most system requirements are initially described in words
which are later replaced by more specific mathematical, diagrammatic or pseudo-code
definitions. These techniques attempt to unambiguously define the system requirements,
processes and anticipated structures for the data and the program. We use the
term attempt as most system specifications use two or more techniques in combination
to ensure clarity. For all but the simplest systems any one technique alone is usually
inadequate.
While the specification and design phases are presented as separate steps in the
software life cycle, these two phases are often closely tied through the development of
the system design documentation.
The system design document is typically a collection of descriptive, mathematical,
procedural, structural or state-based models of the required system. In the following
sections we will outline several techniques used to create system specifications and
design real-time software.
2.4.1 Descriptive Techniques
Natural Language
Often the first step in developing a system design is to write down a description of what
is desired. This may take the form of a formal request from a client or your boss, or
a summary in point form scribbled down during or after a discussion. Either way the
form is descriptive, using phrases or sentences outlining the desired result.
As one might expect this approach can create a degree of ambiguity when concepts
are not clearly presented and often suffers from problems related to cultural perception
and non-native languages. This technique is not recommended as a means of detailing
a specification, but may be used to complement other techniques where an explanation
is beneficial.
Pseudo-code or Structured English
A more precise descriptive technique is available which greatly reduces the potential
for ambiguity. By converting a description into a series of statements and writing them
down in a structured way it is possible to create a descriptive form called pseudo-code,
or structured English. This technique adopts a procedural form. The following example
shows how this can be done.
example 2.1
Example of pseudo-code for a vending machine.
do
    display "Add Coins"
    wait for coin to be inserted
    identify coin
    count coin value
    if counted value equals or exceeds product cost
    begin
        display "Select Product"
        wait for product selection
        eject product
        calculate change
        eject change
    end
until product dispenser empty
display "No Stock"
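As a rough illustration of how directly such pseudo-code maps onto a program, the sketch below follows Example 2.1 in Python. The function name, the coin values and the product cost are assumptions made for this example only, coins are supplied as a list rather than read from hardware, and the comparison is taken as "at least the product cost" so that exact payment is accepted.

```python
def run_vending_machine(coins, product_cost, stock):
    """Simulate the loop of Example 2.1 for a list of inserted coin values.

    Returns a list of (event, value) tuples describing what the machine did.
    All names and values here are illustrative assumptions.
    """
    log = []
    credit = 0
    for coin in coins:
        if stock == 0:                       # until product dispenser empty
            break
        log.append(("display", "Add Coins"))
        credit += coin                       # identify coin, count coin value
        if credit >= product_cost:           # enough money has been inserted
            log.append(("display", "Select Product"))
            log.append(("eject product", 1))
            stock -= 1
            change = credit - product_cost   # calculate change
            if change > 0:
                log.append(("eject change", change))
            credit = 0
    if stock == 0:
        log.append(("display", "No Stock"))
    return log
```

Calling run_vending_machine([100, 50], 120, 5) would log one ejected product followed by 30 units of change. Writing the steps this way also exposes decisions the pseudo-code leaves open, such as what happens to the credit after a sale.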
2.4.2 Mathematical Techniques
Most real-time applications are based on the monitoring and/or control of some physical
object or process. This presents the design engineer with an ideal opportunity to apply
some of that mathematics learnt at university in an attempt to define the physics of
the problem as a series of mathematical equations. We use the term attempt because
practical systems can be difficult to accurately define as a set of equations, as they are
often too complex or exhibit non-linearities.
The concept of using mathematics to define the behaviour of a physical system is one
of engineering's fundamentals. It provides a concise, commonly understood technique
for defining a set of static and dynamic conditions, offers little chance for ambiguity and
is in a form which translates easily into software. In addition to these features,
mathematical equations can be manipulated to achieve simplifications and optimisations
which can greatly improve the performance of real-time software. In many circumstances
formal proof of the stability of a system is possible through mathematical
analysis.
The following example details a mathematical specification for the age-old physics
problem of projectile motion. The quantities involved include initial velocity (V), angle of
projection (A), the acceleration due to gravity (G), time (t) and the resulting distances
travelled horizontally (x) and vertically (y).
example 2.2
Example of a mathematical specification for projectile motion.
\[ x(t) = t V \cos(A) \]
\[ y(t) = t V \sin(A) - \frac{1}{2} G t^2 \]
Where: t is time.
V, A are the initial velocity and angle respectively.
G is the acceleration due to gravity.
x, y are resulting horizontal and vertical distances.
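To show how readily a mathematical specification of this kind translates into software, the following sketch evaluates the two position equations in Python. The function names, the use of degrees for the angle and the value 9.81 m/s^2 for G are assumptions of this example, and the vertical equation is taken in its standard constant-acceleration form with the factor of one half.

```python
import math

G = 9.81  # assumed acceleration due to gravity, in m/s^2


def position(t, v, angle_deg, g=G):
    """Return (x, y) at time t for launch speed v and launch angle in degrees."""
    a = math.radians(angle_deg)
    x = t * v * math.cos(a)
    y = t * v * math.sin(a) - 0.5 * g * t ** 2
    return x, y


def time_of_flight(v, angle_deg, g=G):
    """Time at which y returns to zero (flat ground, no air resistance)."""
    return 2.0 * v * math.sin(math.radians(angle_deg)) / g
```

Evaluating position at the time returned by time_of_flight should give a height of zero to within rounding error, which is a simple consistency check between the code and the specification.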
Mathematical specification is a widely accepted and well understood technique and is
recommended for use with real-time systems involved with monitoring and/or control of
a physical object or process. A word of warning, however: as a physical system is
approximated through mathematical modelling, one must not fool oneself into believing
the model is the system. A strict set of operational limits and checks is usually
required to ensure that the system operates within a predefined range over which the
model is valid.
2.4.3 Procedural Techniques
In many cases the design engineer is aware of the processes and procedures that
are required to be executed in a real-time system. These include start-up and shut-down
sequences, specific algorithm execution, user data entry, a series of functional
decisions, sequences of events and others.
Several design tools are available that help the software engineer to detail procedures
which will form an integral part of the real-time system. Some of these techniques
have the distinction of being capable of defining increasingly finer levels of detail as the
design develops. The following sections outline the most common of these tools.
Flowcharts
Perhaps one of the earliest developed and most widely recognised graphical tech-
niques is the flowchart. Figure 2.1 shows an example of the most commonly used
subset of flowcharting symbols. There are actually many other symbols available for
flowcharting, but most of them relate to file and record handling for administrative pro-
gram design.
[Figure: flowchart symbols for Start, Stop, Process, Sub-Process, Decision (with True
and False exits), Input/Output and Connector. Process flow is indicated by arrows.]
Figure 2.1 Example of the basic set of flowcharting symbols.
The flowchart is designed to show unidirectional program flow, decision points,
input/output, processes and sub-processes. The few basic rules for drawing flowcharts
include:
Each symbol has at most one entry point and one exit point, except the decision
diamond which has two exit points.
Program flow can only be joined arrow to arrow
Arrows cannot divide program flow
Flow should generally be top to bottom
The application of this type of procedural representation is not restricted to software
development and is often used to describe all sorts of procedures from changing a tyre
to operating a photocopier.
Flowcharts are not recommended for use in the specification phase of real-time system
design, but can be useful in defining/documenting specific procedures of small parts
of a larger system. In multi-tasking systems flowcharts do not easily represent the
interaction between the tasks, or between tasks and the operating system. There is
also no way of indicating temporal relationships in flowcharting.
Dataflow Diagrams
A dataflow diagram is a simplified system representation showing the major flows of data
through a series of processes. The four main elements of dataflow diagrams are the
data source/sink, data storage, processes and data flow arrows. Figure 2.2 shows the
symbols used to represent these elements.
[Figure: the four labelled symbols: Data source/sink (Hardware), Data Store
(Memory/Disk), Process (Software) and Data flow.]
Figure 2.2 Symbols for Dataflow diagram elements.
Figure 2.3 shows an example of how these simple symbols can be used to convey
information about a process. Data sources/sinks are typically hardware elements such
as peripherals and Input/Output devices. The steps used to create a dataflow diagram
include:
Identify the major data flows based on the system requirements
Starting on the outer edges draw the sources and sinks for data, usually the
system hardware elements
Draw and label arrows indicating data flow between hardware, memory and
software modules
Ensure symbols are clearly labeled according to their function
Do not show initialisation or flow-of-control and keep detail to a minimum
Dataflow diagrams are highly recommended and widely used in the design of real-time
systems. They offer a structured approach for identifying the main data flows in a
system and for partitioning software into modules (processes). Interrupts can be shown
as an input from a hardware source that triggers an interrupt service routine. Dataflow
diagrams can be used to form a hierarchy of system structure with varying levels of
detail. Processes in upper layers can be represented by their own dataflow diagram
showing any underlying processes.
A particular feature of dataflow diagrams is that they provide the designer with the
capability to identify concurrent processes, that is, processes which can be run at the
same time either on multiple processors or as multiple tasks. This can be achieved by
locating sections of the dataflow diagram which connect to other sections only through
common data storage. In Figure 2.3 there are three such sections: the sample and
control section, the FFT section and the display section. In such cases the designer
has the option of considering any section as a separate task, which may significantly
influence the structural design of the overall system.
[Figure: dataflow diagram with elements labelled Analog to Digital Converter, Sample,
Control, User Interface, 512 value time buffer, F.F.T., 256 value frequency buffer,
Display Graphic, Graphical Display and File System, connected by flows labelled raw
data, sample rate, selection, time data and frequency data.]
Figure 2.3 A dataflow diagram for a Fast-Fourier Transform application.
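The search for concurrent sections described above can be sketched as a small graph problem: ignore the data stores, then group whatever remains connected. The Python below is one possible way to do this. The function name is hypothetical, and the edge list is an assumed, simplified reading of Figure 2.3 rather than a complete transcription of it.

```python
def concurrent_sections(edges, stores):
    """Group nodes into sections that touch each other only via data stores.

    edges  -- iterable of (node_a, node_b) data flows
    stores -- set of node names that are data stores
    Returns a list of sets, one per candidate task.
    """
    neighbours = {}
    for a, b in edges:
        for node in (a, b):
            if node not in stores:
                neighbours.setdefault(node, set())
        if a not in stores and b not in stores:
            neighbours[a].add(b)
            neighbours[b].add(a)

    sections, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        section, stack = set(), [start]
        while stack:                      # depth-first walk of one section
            node = stack.pop()
            if node in section:
                continue
            section.add(node)
            stack.extend(neighbours[node] - section)
        seen |= section
        sections.append(section)
    return sections


# Assumed, simplified reading of the flows in Figure 2.3.
EDGES = [
    ("ADC", "Sample"), ("User Interface", "Control"), ("Control", "Sample"),
    ("Sample", "time buffer"), ("time buffer", "FFT"),
    ("FFT", "frequency buffer"), ("frequency buffer", "Display Graphic"),
    ("Display Graphic", "Graphical Display"),
]
STORES = {"time buffer", "frequency buffer"}
```

With these assumed flows, concurrent_sections(EDGES, STORES) yields three groups corresponding to the sample and control section, the FFT section and the display section, each of which is a candidate for a separate task.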
2.4.4 Structural Techniques
When adopting an engineering approach to real-time software design it is important
to establish a good structural framework around which to build an application. This is
why procedures such as top-down design are so successful, because they encourage
modularity and hierarchy in a design. Several design tools are available to help the
software engineer create well structured and modular code. The following sections
outline a few of these tools.
Structure Charts
Structure Charts are widely used for describing the hierarchical structure of a system.
They can be used to describe not only software, but any system where a layered
structure exists. In software the layers are related to the depth of subroutine and function
calls. In other applications the structure chart may depict the hierarchical structure of a
chain of command, a written document or the physical components of a device.
The elements which form a structure chart are quite simple: a box represents a
process, vertical position indicates hierarchy, lines show links between processes and
arrows show major data and control flow. The advantages of structure charts include:
execution sequence is shown as a left-to-right progression across each layer in
the diagram
they encourage top-down design
they help identify the modularity of a system
Some variants of the structure chart can be used to illustrate decisions and interrupt
processing. Figure 2.4 shows an example of a structure chart including a decision and
interrupts.
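[Figure: structure chart with a Main Process at the top, a layer of processes beneath it
(an Initialising / Debugging Process, a Process with Decision and a further Process),
three Sub-Processes below them, one of which is selected by an Either decision, and a
Common Sub-Process at the lowest level. Control and Data arrows are shown, together
with Interrupt Service A and Interrupt Service B, driven by Interrupt A Source and
Interrupt B Source.]
Figure 2.4 Example of the basic set of structure chart symbols.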
The position of processes in a structure chart is significant, as it shows the
depth of subroutine/function call required to reach that module of code, its relationship
with processes above and below it and its relative functional level. In the case of
an interrupt service process, its position and the dashed line above it can be used to
indicate the scope of the interrupt. For example, Figure 2.4 shows that interrupt service
B is only enabled during the execution of the first layer of processes under the main
process, while interrupt service A is capable of interrupting all but the common
sub-process.
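The link between position in a structure chart and call depth can be made concrete with a small sketch. In the Python below a chart is held as a mapping from each process to its sub-processes, and the depth of each process is found by walking down from the top. The chart contents and the function name are hypothetical and are only loosely modelled on Figure 2.4.

```python
# Hypothetical structure chart: each process maps to its sub-processes.
CHART = {
    "Main Process": ["Initialise", "Control", "Data"],
    "Initialise": [],
    "Control": ["Common"],
    "Data": ["Common"],
    "Common": [],
}


def depth(chart, process, level=0):
    """Return {process: call depth} for everything reachable from `process`."""
    depths = {process: level}
    for child in chart[process]:
        for name, d in depth(chart, child, level + 1).items():
            # A shared sub-process is recorded at its deepest position.
            depths[name] = max(d, depths.get(name, 0))
    return depths
```

Here depth(CHART, "Main Process") places the main process at level 0, its three sub-processes at level 1 and the common sub-process at level 2, mirroring the layers that would be drawn on the chart.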
While structure charts are useful to describe the general structure of a system, they
lack the ability to depict concurrent processes, significant data storage and temporal
relationships. Thus structure charts are only recommended for use in the initial stages
of a design to outline the expected modularity and hierarchy of a system. They may also
be used as a good documentation tool to summarise a completed system's structure
for later reference.
Warnier-Orr Notation
Warnier-Orr notation is a semi-descriptive, semi-structural system of notation which
can represent program structure, data and their conditional relationships. It is somewhat
like a structure chart on its side with conditional elements indicating options for
execution of subprocesses. Warnier-Orr also uses set-theoretic notation to carefully
indicate the conditions under which various processes are executed.
Warnier-Orr notation is written top-to-bottom in order of execution and is formed from a
combination of sets of steps, a little like pseudo-code. Each set can be made of a
combination of other sets and steps. Each step can include a conditional decision which
indicates alternate sets for execution. As the notation is written left-to-right, increasing
levels of detail are introduced. Logical operators of exclusive or (⊕) and or (+) are used
to indicate mutually exclusive cases and combinational cases respectively. Figure 2.5
illustrates the elements which can be used in Warnier-Orr notation.
label - program {
    label (optional) { statement
    label - set example {
        statement
        set { statement
              statement
    label - condition example {
        test condition { true action - statement
        complementary condition { false action - statement
    label - case example {
        option1 { statement
        option2 { statement
        option3 { statement
    label - while loop (test condition, W) { statement
    label - loop until (test condition, U) { statement
    label - indexed loop (n) { statement
Figure 2.5 Example of Warnier-Orr Notation syntax
In Figure 2.5 each set is identified by a label. In this example they indicate the type of
element, but in a real application they should indicate the function or purpose of the
element. The elements, in order of appearance include:
a statement - a statement for execution
a set - a set of statements or sets
a condition - requiring a test condition and corresponding true and false actions
which are statements or sets
a case - requiring a series of options and corresponding statements
a while loop - requiring a test condition, W for while and statement to be executed
while the condition is true
a loop until - requiring a test condition, U for until and statement to be executed
until the condition becomes true
an indexed loop - requiring a counter n and a statement to be executed while n > 0,
which will typically include a statement like n = n - 1
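Each of these elements corresponds to a familiar control structure, which the following Python sketch tries to make explicit. The function names are invented for the illustration, statements are passed in as callables, and the loop until element is assumed to test its condition after each execution of the statement.

```python
def condition(test, true_action, false_action):
    """Condition: run exactly one of two mutually exclusive actions."""
    return true_action() if test else false_action()


def case(option, actions):
    """Case: select the statement matching the option."""
    return actions[option]()


def while_loop(test, statement):
    """While loop (test condition, W): test first, repeat while true."""
    count = 0
    while test():
        statement()
        count += 1
    return count


def loop_until(test, statement):
    """Loop until (test condition, U): run, then repeat until the test is true."""
    count = 0
    while True:
        statement()
        count += 1
        if test():
            break
    return count


def indexed_loop(n, statement):
    """Indexed loop (n): repeat while n > 0, decrementing n each pass."""
    count = 0
    while n > 0:
        statement()
        n = n - 1
        count += 1
    return count
```

Each loop helper returns the number of times the statement ran, which makes the difference between the while loop (possibly zero executions) and the loop until (at least one execution, under the assumption above) easy to observe.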
Warnier-Orr notation is recommended as an alternative to structure charts as it
exhibits the features of modularity, clear sequencing, decision and case capability, looping
and counters. Multi-tasking can be incorporated through the inclusion of flags and
tests. However, Warnier-Orr notation can become laden with detail and become
unclear if the designer is not careful to state elements concisely.
2.4.5 State-based Techniques
Many real-time systems have clearly defined conditions or states in which the system
operates. The identification of these states allows the design engineer to logically
divide a system into distinct operational components. Not only does this method of
analysis help partition the software but it also helps identify the system events that
trigger changes in system operation. Several design tools are available that use state-
based techniques to develop well structured and modular code. The following sections
outline a few of these tools.
Finite State Machines
Finite State Machines and Finite State Automata are terms used for the technique of
defining a system as a fixed number of unique states between which the system moves
in response to events. A state is identified as a distinct condition a system may occupy,
based on system parameters called state variables. Transitions between states are
triggered by system inputs (events) or increments of time.
There are actually two types of Finite State Automata, the Moore and Mealy
implementations. The difference between the two implementations lies in the way that
outputs are defined. The Moore machine can only define system outputs in terms of the
state variables, whereas the Mealy machine can use input conditions and state variables.
The significance of this distinction will be made clear later when we consider its effect
on implementation.
Finite state machines can be represented by mathematical notation, graphically as
a State Diagram or State Chart, or in a tabular form. As the graphical and tabular
techniques are the most easily applied to software design we shall only consider these
as tools for real-time system design.
State Diagrams
The State Diagram is a graphical representation of a finite state machine in which:
circles are used to represent system states
states are typically labeled with capital letters or short descriptions to represent
system conditions
connecting arrows represent transitions between states
inputs/events are used as labels on transitions indicating trigger conditions
outputs/actions are either placed inside states or associated with inputs on
transitions
starting or terminating states may be depicted by double circles
[Figure: state diagram with four states, Start (Out=0), First (Out=0), Second (Out=0)
and Third (Out=1), joined by transitions labelled with the inputs "C", "A", "T", space
and their complements.]
Figure 2.6 Example of a State Diagram (Moore implementation).
Figure 2.6 shows a Moore machine for a simple task to recognise the word CAT from
a string of letters. The input is the next letter of the string, the output is a single bit
which is 1 when CAT is recognised. Note that inputs are shown on the transitions
and the outputs are shown inside the states. Recall that a Moore machine's outputs
are dependent only on the system state variables and hence are only defined within
states. In the Moore machine the outputs are static while the system stays in any
individual state.
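A Moore machine of this kind can be sketched in a few lines of Python. The state names follow Figure 2.6, but the transition rules below are an assumed simplification (any unexpected letter returns to Start, except "C", which begins a new candidate word), and the function names are invented for the example.

```python
# Moore machine: the output is looked up from the state alone.
OUTPUT = {"Start": 0, "First": 0, "Second": 0, "Third": 1}

# Letter that advances each state, and the state it leads to (assumed rules).
ADVANCE = {"Start": "C", "First": "A", "Second": "T"}
NEXT = {"Start": "First", "First": "Second", "Second": "Third"}


def next_state(state, letter):
    """Return the state reached from `state` on input `letter`."""
    if ADVANCE.get(state) == letter:
        return NEXT[state]
    if letter == "C":          # a "C" always begins a new candidate word
        return "First"
    return "Start"


def moore_outputs(text):
    """Return the machine output after each input letter."""
    state, outputs = "Start", []
    for letter in text:
        state = next_state(state, letter)
        outputs.append(OUTPUT[state])   # depends only on the current state
    return outputs
```

Note that the output table is indexed by state only, which is the defining property of the Moore form: the output stays at 1 for as long as the machine remains in the state Third.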
Figure 2.7 shows the Mealy machine implementation for the same task as Figure 2.6.
Note that the outputs are shown on the transitions along with the input causing the
transition, separated by a /. In the Mealy machine the outputs are static if based only on
state variables or may be transitional if based on input conditions as well.
[Figure: state diagram with the states Start, First, Second and Third, in which every
transition is labelled input / output. Only the transition "T" / 1 produces an output
of 1; all other transitions produce 0.]
Figure 2.7 Example of a State Diagram (Mealy implementation).
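For comparison, a Mealy version of the same recogniser can be sketched as a single step function that returns both the next state and the output. As before, the transition rules are an assumed simplification of the figure and the function names are hypothetical.

```python
def mealy_step(state, letter):
    """Return (next_state, output) for one input letter."""
    if state == "Second" and letter == "T":
        return "Third", 1            # output is attached to the transition
    if state == "First" and letter == "A":
        return "Second", 0
    if letter == "C":                # a "C" always begins a new candidate word
        return "First", 0
    return "Start", 0


def mealy_outputs(text):
    """Return the output produced on each transition while reading `text`."""
    state, outputs = "Start", []
    for letter in text:
        state, out = mealy_step(state, letter)
        outputs.append(out)
    return outputs
```

Because the output is computed from the state and the input together, it exists only at the moment of the transition. This is the transitional behaviour referred to above, in contrast to the static outputs of the Moore form.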
State Diagrams are recommended as a good design technique for real-time systems,
particularly for state driven tasks which operate equipment in a range of sequences
like traffic light control, medical equipment, aircraft flight control, teller machines etc.
Statecharts
Statecharts are a combination of Finite State Machines and Data Flow Diagrams which
feature the ability to depict not only states of operation, but also states within states
and orthogonality. The structure of the statechart allows these and other features to be
incorporated in the following way.
States are represented by loops with labels.
States within states (depth) are represented as loops within loops.
Orthogonality is represented by a dashed line separating concurrent processes.
Small letters a, b, ..., z represent events that trigger transitions.
Small letters in parentheses represent conditions that must be true for the
transition to occur.
Simultaneous transitions in orthogonal states, called broadcast communications,
are represented by transitions with the same event.
Outputs are represented as actions associated with states or transitions.
Cascade events can be attached to triggering events.
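[Figure: statechart in which state A encloses two orthogonal regions separated by a
dashed line, one holding states B and C and the other holding states D, E and F. Each
state carries the label function, and the transitions are labelled a, b, c, d/f, e, f
and f(g).]
Figure 2.8 Example of a State Chart.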
Figure 2.8 shows a sample statechart containing each of the above features. This
statechart comprises six states A to F, with two orthogonal (concurrent) processes,
one containing states B and C, the other states D, E and F. Each state shows a default
label function which would be substituted with a descriptive statement of the function
of that state. The dynamics of the system are as follows:
Transition to state B is triggered by event a.
Transition to state D is triggered by event b.
Transitions to states C and E are triggered simultaneously (synchronously) by
event c.
Control returns to states B and D respectively, triggered by event f, with the
transition to D conditional on g.
Transition to state F is triggered by event d, which causes a cascade trigger of
event f.
Control returns to state A from state F on event e.
States B to F are said to be nested states of state A, indicating an increasing depth
of detail and function. This concept of states-within-states encourages software
designers to utilise top-down design principles and create modular code. The concept of
depth is similar to sets containing sets in Warnier-Orr notation.
The presence of the dashed line indicates that two processes may be run concurrently,
in this case triggered by separate events a and b. Each process can run independently
but some transitions can be synchronised by common events called broadcast
communications, such as c and f.
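The behaviour described for Figure 2.8 can be approximated in code by keeping one state variable per orthogonal region and delivering every event to both. The Python sketch below is only one interpretation: the class and attribute names are invented, event d is assumed to be accepted from state D, and event e is assumed to leave both regions and return control to A.

```python
class Statechart:
    """Two orthogonal regions inside state A, loosely after Figure 2.8."""

    def __init__(self):
        self.left = None     # region holding states B and C
        self.right = None    # region holding states D, E and F

    def dispatch(self, event, g=True):
        """Deliver one event to both regions (broadcast communication)."""
        if event == "a" and self.left is None:
            self.left = "B"
        elif event == "b" and self.right is None:
            self.right = "D"
        elif event == "c":
            if self.left == "B":
                self.left = "C"
            if self.right == "D":
                self.right = "E"
        elif event == "f":
            if self.left == "C":
                self.left = "B"
            if self.right == "E" and g:      # transition guarded by g
                self.right = "D"
        elif event == "d" and self.right == "D":
            self.right = "F"
            self.dispatch("f", g)            # cascade event d/f
        elif event == "e" and self.right == "F":
            self.left = self.right = None    # control returns to state A
```

Dispatching a, b and then c moves both regions at once, to C and E, which is the synchronisation by broadcast event described above. Dispatching d from D also returns the other region from C to B through the cascaded f.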
Statecharts are highly recommended for real-time system design as they offer
representations for many of the features required for modern software design, in
particular concurrency, modularity and intertask communication. When combined with the
state-based decomposition offered by Finite State Machines this design technique is
probably one of the best available.
2.5 Implementation of Real-Time Kernels

The programming or implementation phase of real-time software is a critical stage in
the development of real-time systems. One of the primary tasks is to decide on a
suitable kernel structure and efficiently implement that kernel. The kernel is the underlying
structure or core of a real-time system or operating system. Several basic types of
kernel are available for use which vary from simple polled loops through to full-featured
commercial operating systems.
For some real-time applications, particularly for embedded systems, commercial op-
erating systems are too big and complex to be efficiently applied. In these cases the
system designer is much more likely to write a simple kernel to meet the requirements
of the application. In this section we will describe how to implement the basic structures
of several real-time kernels and outline the more advanced features of commercial op-
erating systems.
2.5.1 The Function of a Real-Time Kernel

The three primary functions performed by an operating system are:
Task Scheduling - which determines which task runs next
Task Dispatching - which performs the housekeeping required to switch from task to task
Intertask Communications - which allows for data transfer and synchronisation
The kernel, sometimes referred to as the executive or nucleus, is the smallest portion
of an operating system that provides the primary functions listed above. This is not to
say that all real-time applications are implemented using multiple tasks, but in one form
or another they implement each of these primary functions.
Later in this unit we will take a closer look at two of the most widely used commer-
cial operating systems in use today, Unix and WindowsNT. Several examples will be
provided to show some of the basic functions of each of these operating systems (ker-
nels). In this module we will be focusing on the fundamentals of implementing purpose
written kernels.
2.5.2 Polled Systems

The simplest of all real-time kernels is based on the polled loop structure, where one
or more devices are repetitively polled to check for changes (events) upon which to
take action. While polled systems can achieve fast response for a small number of
devices, they offer little other functionality. Each device service needs to be as short
as possible to achieve short response times for all events.
The C program shown in Figure 2.9 illustrates the concept of a polled system with a
polled single key input triggering either of two events.
/* poll.c - a sample polled system
 *
 * Mark Phythian
 *
 */
#include <stdio.h>
#include <conio.h>      /* _kbhit(), _getch(): DOS/Windows compilers only */
#include <stdlib.h>

void process_event1(void);
void process_event2(void);

int main(void)
{
    char c = 0;

    while (c != 'q') {
        while (!_kbhit())
            ;                   /* while no key press do nothing */
        c = _getch();
        if (c == '1')
            process_event1();
        if (c == '2')
            process_event2();
    }
    exit(0);
}

void process_event1(void) { printf("Event 1 \n"); }
void process_event2(void) { printf("Event 2 \n"); }
Figure 2.9 Sample Polled Application
Polled systems are simple to write and easy to debug, their response time is easy to
determine, and they are good for high speed data channel interfacing. However, polled
loops are inefficient with CPU time, they do not handle bursts of events unless
specifically designed with a buffer, and they cannot satisfy the requirements of any but
the simplest systems.
2.5.3 Phase-Driven and State-Driven Systems
Phase-Driven or State-Driven implementations utilise case statements, nested if
statements or tables of function pointers to divide the code into manageable segments.
These segments may be phases of a larger process or states defined in a Finite State
Machine. The division of the total program function into smaller distinct sections also
provides the ability for the program to be suspended at the end of each section's
execution, as would be required in a multitasking application.
This technique particularly lends itself to the implementation of real-time systems
designed using the Finite State Machine approach. Figure 2.10 shows a State Diagram
for a simple parity generator for a bit stream. The program shown in Figures 2.11 and
2.12 implements the parity function as State-Driven code using the switch/case
approach.
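[Figure: two states, A and B. In state A, input 0 gives output EVEN and remains in A;
input 1 gives output ODD and moves to B. In state B, input 0 gives output ODD and
remains in B; input 1 gives output EVEN and returns to A.]
Figure 2.10 State Diagram for a parity generator. (Mealy implementation).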
Note how each state is identified by a letter, to which an integer is assigned.
The state variable state is initialised to the starting state and the program drops into
an endless loop. The input is received and control is passed to the case in the switch
statement that corresponds to the current state. The input value is tested to determine
if a transition is to be made in the state machine. For example: in case 0 (state A) there
are two possibilities - if input = 1 then the output is changed to ODD and the new state
becomes B, or input = 0 and no change is required. Note it is good practice to re-affirm
the output condition and state variable in this case.
An alternative to the switch/case approach is to use a rather elegant solution derived
from the tabulated form for the Finite State Machine. In this approach each possible
transition is tabulated for current state and input, where each entry indicates the
next state and resulting output. The table below shows the tabular form for the parity
generator of Figure 2.10.
              Current State
Input     A             B
0         A / EVEN      B / ODD
1         B / ODD       A / EVEN
Table 2.2 Tabular Representation for the Parity Generator

/* statesw.c - a sample state driven application using switch/case
 *
 * Mark Phythian
 *
 */
#include <stdio.h>
#include <stdlib.h>

const int A = 0;
const int B = 1;
const int EVEN = 0;
const int ODD = 1;

int output;

// rnd(m) produces a random number between 0 and m
double rnd(double m)
{
    double r;
    r = m * (double)rand() / RAND_MAX;
    return r;
}

char parity[2][6] = {"EVEN\n", "ODD\n"};

int main(void)
{
    int state;
    int input;

    output = EVEN;
    state = A;
    while (1)
    {
        // delay
        input = (int)(rnd(1.0) + 0.5);      // random input bit, 0 or 1
        printf("%d\t", input);
        switch (state)
        {
        case 0:                             // state A
            if (input == 1)
            {
                output = ODD;
                state = B;
            }
            else
            {
                output = EVEN;
                state = A;
            }
            break;
        case 1:                             // state B
            if (input == 1)
            {
                output = EVEN;
                state = A;
            }
            else
            {
                output = ODD;
                state = B;
            }
            break;
        }
        printf("%s", parity[output]);       // the strings already end in a newline
    }
}
Figures 2.11 and 2.12 The switch/case implementation for the parity generator
In the implementation of the tabular form the present state and input are used as in-
dices into an array (table) holding pointers to individual functions representing each
transition. In each of these functions the output values are set and the desired next
state is specified by the return value of the transition function. This is achieved in a
single line in the main program using:
state = (next[input][state])();
The disadvantages of this approach include:
input values must be mapped to a set of consecutive integer indices
different inputs may be required for some states
many entries in the table may be unused as some states may not use all input
conditions
In the last case a default error handling function should be specified in each unused
entry to trap illegal combinations of current state and input as appropriate.
Figures 2.13 and 2.14 show the implementation of the same parity function for
State-Driven code using tabulated function pointers. Note that in the main program the
table of addresses for each of the transition functions is defined individually using:
next[0][A] = &AtoA;
/* statetab.c - a sample state driven application
 * using tabulated function pointers
 *
 * Mark Phythian
 *
 */
#include <stdio.h>
#include <stdlib.h>

const int A = 0;
const int B = 1;
const int EVEN = 0;
const int ODD = 1;

int output;

// Each transition function sets the output and returns the next state,
// following Table 2.2.
int AtoB(void)
{
    output = ODD;
    return B;
}

int BtoB(void)
{
    output = ODD;
    return B;
}

int AtoA(void)
{
    output = EVEN;
    return A;
}

int BtoA(void)
{
    output = EVEN;
    return A;
}

// rnd(m) produces a random number between 0 and m
double rnd(double m)
{
    double r;
    r = m * (double)rand() / RAND_MAX;
    return r;
}

int (*next[2][2])(void);                // indexed as next[input][state]
char parity[2][6] = {"EVEN\n", "ODD\n"};

int main(void)
{
    int state = A;                      // start in state A (even parity)
    int input;

    output = EVEN;
    // state transition functions
    next[0][A] = &AtoA;
    next[0][B] = &BtoB;
    next[1][A] = &AtoB;
    next[1][B] = &BtoA;
    while (1)
    {
        // delay
        input = (int)(rnd(1.0) + 0.5);  // random input bit, 0 or 1
        printf("%d\t", input);
        state = (*next[input][state])();
        printf("%s", parity[output]);   // the strings already end in a newline
    }
}
Figures 2.13 and 2.14 The tabular implementation for the parity generator
Each of the transition functions sets the output response and exits with the desired next
state as the return value. One particular advantage with this approach is that response
times can be easily calculated as the output update can be made to occur in one place
only, in the main program after the return from the transition function. Also the tabular
approach is very easy to modify and maintain.
2.5.4 Interrupt Driven Systems
In all modern computer systems the hardware supports single or multiple interrupt
inputs. These interrupts can be associated with external event triggers, internal or
external clock sources, software instructions or both hardware and software error traps.
This rich source of asynchronous and synchronous event information is the underlying
feature upon which interrupt driven systems are based.
Instead of polling for events as previously described, a real-time system can be
programmed to respond to interrupt events. The basic concept is that each interrupt
utilises its own interrupt service routine (ISR) to service that event. Servicing may
include transferring one or many pieces of data, counting events, starting or stopping
processes and many other functions. Systems which receive only aperiodic interrupts
are called sporadic systems, whereas systems which utilise only periodic interrupts
are called fixed-rate systems. Systems that use both types of interrupts are called
hybrid systems.
One of the difficulties associated with interrupt operation is the restricted means of
interaction between these separate event handlers and the main program. Because
interrupts are designed to carefully save and restore the CPU status before and after
the ISR is executed, almost all communications between ISRs and the main program
must be through shared memory. While this is achievable it greatly increases the
complexity of the system.
One of the main uses for interrupts in real-time systems is to provide a means of switch-
ing between processes/tasks in multi-task applications. In this case the saving and
restoring of CPU status along with other system parameters can be used to stop one
process and start another in a procedure called context switching.
Context switching is the process by which the kernel of a real-time system suspends
the operation of one task/process via an interrupt by saving CPU registers, co-processor
registers, memory page registers, stack pointers and other significant system informa-
tion, before restoring an alternate context for the next process to run. The data is
typically saved on the run-time stack of the suspended process, and the new context
restored from the run-time stack of the next process. This is usually achieved through
the use of multiple process stacks.
In full featured operating systems the multiple stack model is often replaced by the Task
Control Block model, in which an area called the task control block is allocated to each
task in the system. This area not only holds the process's run-time stack but also areas
for task specific information, input/output buffers and inter-task communications.
Figures 2.15, 2.16 and 2.17 show a program to set up the Intel 8253 timer on the IBM
PC to generate a regular interrupt at approximately 55 ms intervals. The main program
comprises the initialisation of the interrupt, a simulated three task/state
implementation based on the state variable intswitch, a sample use of the timer for time
measurement and the closing down of the interrupt.
The task of the interrupt service routine is to count twenty timer interrupts and change
the state variable intswitch to switch between three simulated tasks approximately
once every second. While there is no context switching implemented here for the tasks
themselves the example serves to illustrate the concept of an interrupt driven system
and task selection.
/* timer functions for 8253 programmable timer chip
 *
 * J. Leis, modified M Phythian
 */
#include <stdio.h>
#include <bios.h>
#include <dos.h>
#include <conio.h>

#define CRYSTAL_RATE (unsigned long)(1193180)   /* crystal rate in Hz */

void OpenMicroTimer(void);
void CloseMicroTimer(void);
unsigned long ReadMicroTimer(void);

void (_interrupt _far *oldvect)(void);
static volatile unsigned ticks = 0;     /* modified by the interrupt handler */
void _interrupt _far inthndlr(void);

/* example program to demonstrate use */
volatile int intswitch;                 /* task selector, modified by the interrupt handler */

void main(void)
{
    int n;
    unsigned long start, end;

    intswitch = 0;
    OpenMicroTimer();
    n = 20;
    while (n > 10)  /* NOTE: n is never changed in the listing as printed, so this loop does not exit */
    {
        switch (intswitch)
        {
        case 0:
            printf("IN CASE 0\n");
            break;
        case 1:
            break;
        case 2:
            break;
        }
    }
Figure 2.15 A sample timer interrupt application - part 1 of 3

    /* just reading the timer alone */
    start = ReadMicroTimer();
    end = ReadMicroTimer();
    printf("Latency time to read timer = %lu ticks.", end - start);
    printf("Equivalent to %7.0f microseconds.\n",
           (end - start) * 1000000.0 / (float)CRYSTAL_RATE);

    /* time the printf() function */
    start = ReadMicroTimer();
    printf("The time to print this is ");
    end = ReadMicroTimer();
    printf("%lu ticks.", end - start);
    printf("Equivalent to %7.0f microseconds.\n",
           (end - start) * 1000000.0 / (float)CRYSTAL_RATE);

    CloseMicroTimer();
}
/* OpenMicroTimer() - must be called to initialize the
 * high-precision timer.
 */
void OpenMicroTimer(void)
{
    outp(0x43, 0x34);   /* channel 0, mode 2 */
    outp(0x40, 0);      /* lsb = 0, */
    outp(0x40, 0);      /* msb = 0 -> count = 65536 */
    oldvect = _dos_getvect(0x1c);
    _dos_setvect(0x1c, inthndlr);
}

/* CloseMicroTimer() - must be used to de-install the timer */
void CloseMicroTimer(void)
{
    outp(0x43, 0x36);   /* mode 3 */
    outp(0x40, 0);      /* lsb = 0 */
    outp(0x40, 0);      /* msb = 0 */
    _dos_setvect(0x1c, oldvect);
}
Figure 2.16 A sample timer interrupt application - part 2 of 3

/* ReadMicroTimer() - read the current value
 * of the micro-interval timer. Value returned
 * is an 'unsigned long', representing a time
 * value in crystal clock ticks, where one
 * clock tick is 838.1 ns ( crystal rate = 1.19318 MHz )
 */
unsigned long ReadMicroTimer(void)
{
    _asm
    {
        mov bx, ticks       ; ticks on entry
        mov al, 0x06
        out 0x43, al        ; latch tick count
        in al, 0x40         ; get lsb
        mov ah, al
        in al, 0x40         ; get msb
        xchg al, ah         ; correct order
        not ax              ; convert from downcount
        inc ax              ; to upcount
        mov dx, ticks       ; has tick count incremented ?
        cmp bx, dx
        je TIMERDONE
        ; tick count has incremented
        cmp ax, 0x8000      ; past half way ?
        jb TIMERDONE        ; yes-> ok ( nb downcount ! )
        mov dx, bx          ; no -> use latter value
    TIMERDONE:
        ; return value:
        ; high word in DX
        ; low word in AX
        ; NOTE: COMPILER DEPENDENT !
    }
}

/* interrupt handler. counts clock ticks ( 55 ms ticks ) */
void _interrupt _far inthndlr(void)
{
    ticks++;                /* increment count of ticks */
    if (ticks > 19)
    {
        ticks = 0;
        intswitch++;
        if (intswitch > 2)
            intswitch = 0;
    }
    (*oldvect)();           /* chain to other timer handlers */
}
Figure 2.17 A sample timer interrupt application - part 3 of 3

2.5.5 Types of Multi-tasking Systems

Having the ability to change tasks in a controlled manner raises the question: What is
the best way in which tasks can be swapped in a real-time system? That is - how does
the programmer or the kernel decide which task is to run next? No definitive answer
to that question has yet to be found for an arbitrarily complex system, but several
types of scheduling systems have been developed for various applications.
The simplest is the round-robin system which simply divides the available CPU time
into short intervals, of the order of 10 ms, and allocates consecutive time slices to each
task in turn. Each task runs until it is complete or until its allocated time slice has
expired. At that time the task is suspended and its context saved for later retrieval. The
context for the next task is loaded and it runs for up to one full time slice. All tasks in
this system are assumed to be of the same importance and none has priority over any other.
While using equal priorities works well for systems with low time loading or few tasks,
most systems require the ability to assign priorities to tasks to ensure response times
can be guaranteed. In such systems we need the ability for higher priority tasks to
interrupt or preempt lower priority tasks. Such a system is called a preemptive priority
system. Priorities may be assigned at the design or programming phases based on
the importance of the task, or may be assigned dynamically by a section of the kernel
called the scheduler.
Preemptive priority systems have the disadvantage that higher priority tasks can tend
to hog system resources such as CPU time, single user input/output devices etc. This
effect can be minimised by careful assignment of priorities or dynamic assignment of
priorities.
In systems which have a number of fixed rate interrupts it has been shown that the
best performance is achieved when higher priorities are assigned to the interrupts with
higher execution rates. Such systems are called rate-monotonic systems.
2.5.6 Foreground / Background Systems

Foreground / Background Systems are a combination of polled and interrupt driven sys-
tems where a polled loop is used to run useful background processing, and interrupts
are used for foreground processing to service critical events or operate a multitasking
kernel. Foreground / Background Systems are often used for embedded systems.
Background processes are typically used for non-critical functions including:
incrementing a counter to measure the system's time loading
incrementing task counters which get reset in the task to show the task is running
self testing
printing
The advantages of Foreground / Background Systems include:
the ability to achieve good response times
improved reliability through the use of interrupt triggered events and scheduling
of tasks.
The disadvantages of Foreground / Background Systems include:
interrupt handlers must be written for each device
they are not very suitable for a system requiring a variable number of tasks.
2.6 Module Summary

This module has presented the concept of the software life cycle, outlining the six
phases through which a real-time software design progresses. The concept of software
engineering was discussed in terms of adopting an engineering methodology for the
design and implementation of real-time systems to ensure a quality end product.
Several techniques for creating system specifications and design documentation were
presented for use as design tools for creation of real-time systems. The concept of
a real-time kernel was introduced along with examples on how to implement several
types of basic kernel.
2.7 Self Assessment Questions

1. Briefly outline the six phases of the Software Life Cycle by listing the major ob-
jectives of each phase.
2. Draft a Pseudo-code description for the procedure to change a flat tyre on a car.
3. Draft a mathematical specification for the computational section of a system to
monitor the position $(x, y)$ of a robot moving on a flat surface at velocity $v(t)$ and
at an angle of $\theta(t)$, where $v$ and $\theta$ can change at intervals of 1 second.
4. Draft a Flowchart for the procedure described above for changing a flat tyre.
5. Draft a Dataflow Diagram for a system to measure and control the temperature of
a furnace according to a temperature set point which is entered by the user via a
keypad.
6. Draft a Structure Chart for a system which might be used to operate a library
catalog system.
7. Draft a State Diagram for a system to detect the sequence 10101 in a data stream
(single bit input) and output a logic 1 corresponding to the last bit and 0 otherwise.
8. Draft a Statechart that represents the task(s) of driving a car.
9. Briefly outline the key features of the following real-time kernels: polled systems,
state driven systems, and interrupt driven systems.
Module 3
THE C AND C++ PROGRAMMING LANGUAGES
3.1 Module Overview

The C language is the basis of most low-level computer engineering tasks and is also
commonly used for writing higher-level applications. The topics covered in the section
on C are:
Compiling, linking and running a C program.
Variable types and variable scope.
Reading and writing data files.
Large build management using the makefile.
The section on C++ covers:
Compiling, linking and running a C++ program.
Objects, classes and object-oriented programming.
Functions in C++, including calling C-language functions.
C++ variable types and variable scope.
Reading and writing data files in C++.
Later modules will cover more advanced features such as linking C and assembly code
and dynamic memory allocation.
3.2 Introduction
C is sometimes referred to as a low-level language. The term low-level refers to the
components commonly found in operating systems, device drivers and embedded sys-
tems. Most operating systems are written wholly or substantially in C. This module will
give a brief overview of the C language by way of a set of examples. For completeness,
an overview of the most important features of the C++ language is also given. It is em-
phasized, however, that the treatment is definitely not introductory and the student is
expected to have a grasp of the principles of computer programming.
The Cygnus Gnu C/C++ compiler was used for all of the examples presented here.
However, the examples given in this module are sufficiently general to be able to be
compiled with virtually any C++ compiler without change.
3.3 The C Language

C is a programming language which is very widely used in Engineering applications.
These range from simple numerical and text processing tasks through to application
programs such as text editors and databases, all the way to operating systems them-
selves. If carefully written, C code is highly portable across different operating systems
and hardware platforms.
C is one of the oldest programming languages still in widespread use, and arguably the
most widely used. The original specification is called K&R C after its originators,
Kernighan and Ritchie. The standard now is termed ANSI C (American National Standards
Institute). Note that the C++ language is a derivative of C, in that C++ compilers are
able to compile C programs (but the converse is not true: you cannot compile a C++
program with a plain C compiler). This document concentrates on the C language, although
the second part contains an introduction to and overview of the main concepts of C++.
Note that C++ programs normally have a file extension of .cpp, whereas standard C
programs have a .c extension.
3.3.1 Starting
C code produces a stand-alone, or executable, program that may be run independently
of the compilation process. This is unlike MATLAB (for example), which requires the
MATLAB interpreter to be available on the user's computer system in order to run. Note
however that some compiler environments require more than the plain executable file
to be distributed, and require certain library files (typically .lib, .dll or .so). In that
case, the linker or development environment will normally have one or more options in
order to produce a stand-alone executable file.
It must be understood that C has no intrinsic input or output (I/O) functions of its own.
The language includes constructs for variable declaration, numerical calculation,
looping and the like, together with the ability to extend the basic functionality via
library function calls. It might seem strange to have no inherent I/O, but remember that
the notion of I/O is quite different in a DOS program, a Windows program or (for example)
an embedded control computer for a car's engine. Of course, some form of I/O is
required, and thus a set of standard library functions like printf() for screen printing
and scanf() for keyboard input are provided with each compiler. Further examples of
code libraries include windowing code and network access functions.
3.3.2 Compiling C Programs
The so-called Integrated Development Environment (IDE) compilers are menu-based,
and as such have on-screen help facilities. To compile programs, the menu choice
is generally termed build or make. Note that simply compile will not produce an
executable program, as compilation is only the first step.

myapp.c                             other.c
extern void someFunc(void);         void someFunc()
int main()                          {
{                                       // function code
    someFunc();                     }
    // other code
}
Figure 3.1 External code modules.
The source files normally have a .c extension. The compiler takes each .c file and
produces an object file, which normally has the extension .o (under Unix and the Gnu
DOS/Windows compilers) or .obj in other DOS compilers. The object files are linked
together, using the linker program, which is normally invoked automatically after
compilation to produce an executable file. In DOS, this normally has a .exe extension,
while in Unix the extension is not significant. The fact that a Unix file is executable is
seen by typing ls -l filename and examining the x flag. Other aspects, such as
dynamic linking and the use of makefiles, are discussed at the end of this module.
The Gnu C compiler is invoked by the command gcc. The simplest usage is as follows:
gcc myprog.c -o myprog.exe
This runs the compiler and linker combined (the gcc program) on the source file
myprog.c to produce the output file (as denoted by the -o option) myprog.exe. (The .exe
extension should not be used on Unix systems.) Note that the output of the compiler is
only an object file, whereas the output of the linker is a full executable file. This
simple invocation will work if the various components required by the compiler and
linker are in the default directories.
Breaking up the code into more than one source file is the normal practice for anything
but the simplest of programs. If there is more than one source file, some references to
code in other files (modules) will be required, as depicted in Figure 3.1.
In this case, the compiler processes each code module separately into an object file,
and the linker resolves the references to external code or data. The compiler must
be told to expect certain variables or data to remain unresolved. This is done via the
extern declaration. For the example shown in Figure 3.1, myapp.c would require the
following declaration at the start of the file:
extern void someFunc(void);
which essentially states that the code for function someFunc() is in another module; it
takes no arguments and returns no value (void).
There is no limit on the number of separate code modules; use as many as are required
for clarity. Normally code modules contain groups of related functions. The process is
depicted graphically in Figure 3.2.

Source files        Object files
          Compile             Link
myapp.c  -------->  myapp.o  ----+
other.c  -------->  other.o  ----+---->  Executable myapp.exe
more.c   -------->  more.o   ----+
Figure 3.2 Compiling and linking a C program.
The compile-link command required now is
gcc myapp.c other.c more.c -o myapp.exe
Of course only one output file must be specified.
It is common to have constants and/or data structures shared amongst code modules.
Such constants may be, for example, the expected maximum number of students in a
class. Data structures, which will be detailed further in a later section, are compos-
ite data types which group together related data (for example, a student name and
grades). These constants and data structures are likely to be required by several code
modules. It would be a maintenance nightmare if each separate file maintained sep-
arate definitions, as one change in a constant would require a change in each one of
the source files. Of course, this could introduce hard-to-trace errors. So, include files
are used for this purpose, which are shared between code modules. These files have
a .h extension. They are included into the source code module by a statement of the
form
#include <constant.h>
or
#include "constant.h"
The latter specifies the file as being in the same directory as the source file. The former
specifies a standard location as will be seen shortly.
Common code modules are also provided for various purposes. For example, a
windowing system will have a common set of primitives to draw a window, resize a window
and so forth. These are called libraries or archives and are effectively pre-compiled.
That is to say, the user does not have access to the source code, only the object code.
This is illustrated in Figure 3.3.

Include files    Source files        Object files
                           Compile             Link
constant.h ----> myapp.c  -------->  myapp.o  ----+
                 other.c  -------->  other.o  ----+---->  Executable myapp.exe
                 more.c   -------->  more.o   ----+
                       Library files:  libwin.a --+
Figure 3.3 Compiling and linking a C program with include and library files.
In order to specify the library, the -l switch is used. Standard libraries reside in a
directory called lib. On Unix systems, the standard math library (providing functions
such as sin and sqrt) resides in the file libm.a. The library switch is such that the
linkage specification -lx searches for a standard library of the form libx.a. The path
to the library may be specified with the -L switch.
Similarly, standard include files reside in a directory called include. The path to the
include files may be specified with the -I switch. A default path is also defined,
normally include under the compiler root directory.
Putting this all together, the following DOS batch file compiles the source file specified
on the command line into the corresponding executable file:
@rem compile using Gnu C, in a DOS box under Windows
gcc -I incdir %1.c -o %1.exe -L libdir -lm
Here,
%1.c specifies the first command-line argument with .c appended
-I specifies the path to the include (.h) files
-o specifies that an output file follows (here the program name with .exe appended)
-L specifies the path to the library files (unnecessary here)
-lm specifies loading of the math library functions such as sin() from libm.a
It would be used on the command-line as follows:
gc myprog
where gc.bat is the name of the batch file as outlined.
A similar script file for Unix may also be used (using correct path separators and
command-line switches). Unix shells use $1 for command-line argument 1. Unix does
not use the extension to signify the executable nature of the script; simply create the
file gc using a text editor and then change the mode using
chmod +x gc
Note that the libraries are only searched on demand. The code for library functions
is only added to the executable image if needed. This reduces the size of the final
executable.
activity 3.1
Compilation using include Files
Enter the following text and name it ctest.c
#include "constant.h"
int main(void)
{
short x;
x = SOME_CONSTANT;
}
Enter the following text and name it constant.h
#define SOME_CONSTANT 45
Compile ctest.c using
gcc ctest.c
The default output is called a.exe on DOS systems, or a.out on Unix
systems. Normally you would explicitly name the executable, so use
gcc ctest.c -o ctest.exe
(Omit the .exe on Unix systems.) Then execute the program by typing
ctest at the command prompt. Now remove the #include line in the
source C file. Compile and note the error messages. Restore the line.
Now change the #include line to read
#include <constant.h>
Try to compile and note the error messages. Now compile using
gcc ctest.c -I. -o ctest.exe
Where -I sets the search path for include files specified with < >. The
. after -I specifies the current directory. It should compile correctly.
activity 3.2
Function Prototypes and Library Linkage
#include <math.h>
int main()
{
double x;
x = sin(3.14);
}
Now remove the #include line in the source C file. Compile and note the
error messages about the sin() function. This is because the function
prototype in the include file math.h has been omitted. Look at this file,
normally in a directory called include below the compiler installation
path on Windows or /usr/include on Unix. Note the function prototype
for the sin() function shows the data types of the arguments expected
and returned. Restore the #include line.
Now compile using
gcc ctest.c -o ctest.exe -nostdlib
Note the error messages. Because the standard library is not included
due to the -nostdlib switch the library code for the sin() function will
not be defined.
activity 3.3
Compile-Only
Test the compile-only option. Remove any object files:
del *.o
(or rm *.o in Unix)
Now compile without linking, using the -c (compile-only) switch:
gcc -c ctest.c
Now look at the object file:
dir ctest.*
(or ls -la ctest.* in Unix)
In order to link this object file with the standard library enter
gcc ctest.o -o ctest.exe
This will produce the executable.
activity 3.4
Libraries
Compile the ctest.c program with the -lm option to link in the math
library libm.a. Now look at the size of the resulting executable file and
the size of the math library. Clearly the entire contents of the math library
have not been included in the executable!
Lastly, it is worth noting that it is possible to view the assembly-language instructions
corresponding to the C source code. A complete discussion of this topic is somewhat
beyond the scope of this introductory tutorial; however, the following exercise shows
what is possible.
activity 3.5
Assembly Output
int main(void)
{
short y, x;
x = 3;
y = x + 1;
}
gcc ctest.c -S
The -S option (note capitalization) runs the C-to-assembly stage only;
it does not produce an executable. The output file will have the same name as
the source file, but with a .s extension (assembly). View the file ctest.s with
a text editor. Note the processor assembly instructions such as movw and
incl.
3.3.3 Basic Structure of a C Program
A rudimentary C program is shown in Figure 3.6. The main function is the first point
of entry after the program begins. It is mandatory to have a function called main(). In
a Windows program, rather than a command-shell one, the entry point must be called
WinMain(). The arguments to main() are void, meaning nothing; similarly, the return
value of main() (on the left-hand side) is int. This does not have to be the case, but
for a simple example it will suffice.
Comments begin with /* and end with */. If the compiler is C++ aware (the majority
of C compilers), then a single-line comment may also be entered by starting the comment
with //.
As with any programming language, C allows the decomposition of problems into
smaller sub-problems via functions. In other languages, functions are variously called
subroutines or procedures. Functions may have any number of arguments passed
in (enclosed by the brackets ()), but only return one value (on the left-hand side). This
is illustrated in Figure 3.4. This is a function prototype, or what the function looks
like to the compiler, in terms of arguments passed and return value. In Figure 3.6, a
prototype is needed because the compiler will encounter the call to the function
PrintMessage() before the actual function itself. Thus, it is a form of consistency checking.
short someFunc( short arg1, char *arg2, double arg3 )
(return value (out): short; function name: someFunc; arguments (in): arg1, arg2, arg3)
Figure 3.4 C function arguments and return value.
n a m e 0 x x x x
(memory address increases from the first location, n, to the last location; 0 = null;
x = allocated but unused)
Figure 3.5 In C, strings (arrays of characters) are always null-terminated.
The lines beginning with #include are directives to the C preprocessor; in this case, it
means literally include the files stdio.h and stdlib.h. These are called header files,
and contain (amongst other things) function prototypes for the functions which will be
used. Here, the printf() function is prototyped in stdio.h, and atoi() is prototyped
in stdlib.h. The compiler's help screens or manual will inform you as to which include
files are required for each library function. The include files are located in a special
dedicated directory, normally called include under the compiler's installation directory.
In Figure 3.6, fgets() reads the line of characters typed by the user. (It is used in place
of the older gets(), which cannot limit the length of the input and has been removed from
the current C standard.) The input is to be interpreted
as a number, hence the function atoi (ASCII to integer) is used to convert the string (in
C, an array of characters or chars) into an integer (in this program, a short integer
of 16 bits size). In C, character strings are stored in null-terminated form as depicted
in Figure 3.5.
/* basic.c - this is a basic C program */
#include <stdio.h>
#include <stdlib.h>
// function prototype
void PrintMessage( short NumTimes );
// main entry point
int main(void)
{
    char UserBuf[100] = "";
    short NumTimesToPrint;
    printf("How many times do you want it printed? ");
    fgets( UserBuf, sizeof(UserBuf), stdin );
    NumTimesToPrint = atoi( UserBuf );
    PrintMessage( NumTimesToPrint );
    return 0;
}
void PrintMessage( short NumTimes )
{
    short NumMessages;
    for( NumMessages = 0; NumMessages < NumTimes; NumMessages++ )
    {
        printf("This is message number %hd \n", NumMessages);
    }
}
Figure 3.6 The C program basic.c

activity 3.6
Uninitialized Variables
In the program basic.c remove, by commenting out, the three
lines which prompt the user and subsequently set the variable
NumTimesToPrint. Re-compile and run the program. Are the results
predictable?
activity 3.7
Compiler Warnings
Enter the following program, called ctest.c
int main()
{
double x;
// other code would follow here

}
Now compile using
gcc ctest.c -o ctest.exe -Wunused
This should issue a warning about unused variable x. This is an example
of the compiler warnings which may be available.
3.3.4 Data Types

C has moderately strong typing: you must declare all variables before use, and
take care to assign appropriate quantities to variables. The main data types used are:
short    a short integer, 16 bits
long     a long integer, 32 bits
char     a character, 8 bits
double   a double-precision floating point variable
The data type int may be used, however it can cause problems because it is defined
to be the native size on the machine on which it was compiled. This may be either 16
or 32 bits, and is thus ambiguous. The data type float also exists for single-precision
floating point values, however the extra precision of double is preferred because of the
prevalence of floating-point coprocessors in modern CPUs (double generally takes no
longer to calculate with). A string is simply an array of characters: char MyName[20].
The array must be terminated by a character value of 0 (the null character value).
C will allow certain invalid assignments to be made, and issue a warning. For example,
if pi was declared as a short, and we had the statement pi = 3.14, then a warning
such as loss of precision may be issued. The program would still run, however
(pi would be set to 3). In many systems-level programming tasks, this truncation
behaviour may be desirable, but generally it is the cause of many bugs.
This issue is termed data typing, or usually just typing. Type rules catch the passing
of incorrect arguments to a function, which may compile correctly but not execute cor-
rectly. An example is the passing of a double where an int was expected. Languages
are said to be strongly typed if the data types are strongly enforced. Weakly typed
languages do not enforce correct argument typing. Assembler is the weakest in this
regard, as the discipline of enforcing correct data storage sizes is entirely up to the pro-
grammer. Strongly-typed languages, though desirable, often make required operations
impossible. C is somewhere in between; for example, the following is incorrect but on
most compilers will not issue a warning by default:
int x;
double d;
d = 3.14159;
x = d;
whereas the following, using a type cast, is acceptable:
int x;
double d;
d = 3.14159;
x = (int)d;
Presumably the effort of placing the typecast (int) means that the programmer was
sufficiently aware of the implications (in this case, truncating down 3.14159 to 3 when
stored as an integer).
activity 3.8
Data Size Errors
Enter the following program:
#include <stdio.h>
int main()
{
short shortVar;
long longVar;
// something greater than the

// maximum `short' allowable
longVar = 75000;
shortVar = (short)longVar;
printf("longVar=%ld shortVar=%hd \n",
longVar, shortVar );
}
What is wrong here? Can you explain the output?
3.3.5 Variable Scope
C has automatic variables, which are declared after the opening brace {, remaining
in existence until the matching closing brace }. These are local to the function in which
they are declared. Passed-in arguments appear between the brackets () of functions; they
are passed by value, so any change which a function makes to them does not affect the
caller. Global variables (accessible by all functions) are declared
outside any function scope. Figure 3.7 illustrates these concepts, with the function
TestFunc() shown in Figure 3.8.
The output of scope.c is shown in Figure 3.9. Note that arguments passed by value are
not altered in the calling function (Var2 remains 5), whereas Var1 changes because its
address is also passed to the function. Note also that RetVar is initially unassigned and has
a random value.
The pointer data type is discussed in the following section.

/* scope.c
 * Simple illustration of variable scoping in the C language
 *
 * John Leis
 */
#include <stdio.h>
/* a global variable
 * If we wish to use this variable from other modules (C files)
 * we put the same declaration with the keyword "extern" in front.
 */
short GlobalVar;
/* function declaration */
short TestFunc( short InVar1, short InVar2, short *PtrVar);
/* the "main" entry point */
int main(void)
{
    short Var1, Var2, RetVar;
    short *PointerToVar;
    Var1 = 4;
    Var2 = 5;
    GlobalVar = 6;
    /* RetVar is deliberately printed before it has been assigned */
    printf("Before function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
           RetVar, Var1, Var2);
    RetVar = TestFunc( Var1, Var2, &Var1 );
    printf("After function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
           RetVar, Var1, Var2);
    printf("GlobalVar = %hd\n", GlobalVar);
    printf("\n ---------------- \n\n");
    printf("Before function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
           RetVar, Var1, Var2);
    PointerToVar = &Var1;
    RetVar = TestFunc( Var1, Var2, PointerToVar );
    printf("After function call, RetVar = %hd, Var1 = %hd, Var2 = %hd\n",
           RetVar, Var1, Var2);
    printf("Contents of pointer variable = %hd\n", *PointerToVar);
    return 0;
}
Figure 3.7 The main section of the program scope.c

/* a test function
* The function return value is the sum of the passed-in arguments
* (InVar1 + InVar2)
* The contents of the pointer variable PtrVar are replaced
* with the product (InVar1 * InVar2)
* The global variable GlobalVar is changed to (InVar1 - InVar2)
* Note that the function *attempts* to change InVar1 and InVar2,
* but that they are not changed (pass-by-value).
*/
short TestFunc( short InVar1, short InVar2, short *PtrVar)
{
    short SumResult, ProductResult;
    SumResult = InVar1 + InVar2;
    ProductResult = InVar1 * InVar2;
    *PtrVar = ProductResult;
    GlobalVar = InVar1 - InVar2;
    /* this will *not* change the values in the calling function */
    InVar1 = 45;
    InVar2 = 54;
    return SumResult;
}
Figure 3.8 The function TestFunc() from scope.c
Before function call, RetVar = 888, Var1 = 4, Var2 = 5
After function call, RetVar = 9, Var1 = 20, Var2 = 5
GlobalVar = -1

 ----------------

Before function call, RetVar = 9, Var1 = 20, Var2 = 5
After function call, RetVar = 25, Var1 = 100, Var2 = 5
Contents of pointer variable = 100
Figure 3.9 The output of the program scope.c

3.3.6 Pointers
In the preceding example, the pointer data type was used. Variables in any program
are stored in memory, and the pointer is just another variable that happens to hold
the memory address of another variable. Although not strictly required for elementary
programming tasks, the pointer is quite powerful and gives extreme flexibility in many
situations.
A code example of pointer variables and their use was given in the previous section
(scope.c, Figure 3.7).
Two symbols, * and &, are used in connection with pointers. Read them as
* contents of
& address of
So the code fragment as follows:
short Var1;
short *PointerToVar1;
Var1 = 4;
PointerToVar1 = &Var1;
*PointerToVar1 = 5;
may be read as
Declare Var1 to be a short integer.
The contents of PointerToVar1 is a short integer.
Set Var1 to 4 as a direct assignment.
PointerToVar1 is set to the address of Var1.
Set the contents of PointerToVar1 to be 5. That is, set Var1 indirectly.
This is shown in Figure 3.10. Note that the value contained in PointerToVar1 is not
assigned directly by the programmer, but depends on where the compiler, linker and
run-time loader allocate the memory addresses. Accessing a variable through its
address in this way is called dereferencing, or more generally indirection.
The question may be asked as to why this added complexity is necessary. The answer
is that such memory addressing facilitates rapid moving through arrays of characters
(strings) or other data types and is often necessary for low-level operations.
short Var1 (16 bits): 0 0 0 4
short *PointerToVar (a memory address): 5 8 A B 7 6 9 8
Figure 3.10 Pointer dereferencing in the example program.
Sometimes it is very useful to have a more advanced construct, that of double indirection.
This is where a pointer-to-a-pointer is required, as shown in Figure 3.11.
Note how the pointer-to-pointer declaration syntax is consistent: simply declare two
contents-of:
short **ppVar;
Again, this added complexity gives much greater flexibility in accessing memory. Memory-
efficient dynamic data structures such as doubly-linked lists use pointer dereferencing.
3.3.7 Command-Line Arguments
In DOS or Unix command-line (shell) programs, arguments may be sent to the program
via the command-line itself. Windows also allows the specification of parameters when
an application is started. For example, a command such as
copy somefile.txt other.bak
requires the arguments somefile.txt and other.bak to be used in the copy program.
The program cmdargs.c, shown in Figure 3.12, simply prints out all of its command-
line arguments. Note in the sample output that the program name itself is the zeroth
argument.
3.3.8 Loading & Saving Data
It is often necessary to load in some data files, or save calculated data for later refer-
ence or plotting.
short **ppVar -> short *pVar -> short Var
Figure 3.11 Pointer-to-pointer dereferencing.
/* cmdargs.c
 * Command-line arguments
 *
 * example:
 *   d:\c\gnuc> cmdargs one two three
 *   There are 4 command-line arguments
 *   Argument number 0 is d:/c/gnuc/cmdargs.exe
 *   Argument number 1 is one
 *   Argument number 2 is two
 *   Argument number 3 is three
 *
 * John Leis
 */
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int argNum;
    printf("There are %d command-line arguments\n", argc);
    for( argNum = 0; argNum < argc; argNum++)
    {
        printf("Argument number %d is %s\n", argNum, argv[argNum] );
    }
}
Figure 3.12 Processing of command-line arguments.

0.001 0.564
0.193 0.809
0.585 0.480
0.350 0.896
0.823 0.747
0.174 0.859
0.711 0.514
0.304 0.015
0.091 0.364
0.147 0.166
Figure 3.13 A text (ASCII) data file
An important distinction to be made is between binary and text (or ASCII) formatted
files. Text files contain plain, readable text which can be viewed with any text editor.
For example, temp.txt might contain the data as shown in Figure 3.13.
Text files may be created from a C program using the code fragment shown in Figure 3.14.
Note the use of fprintf() to write formatted data. The format specifier %lf specifies that a
double-precision variable is being used (in this case, %5.3lf limits the output to a field
of width 5 with 3 decimal places).
We can then continue on to read back the text file, using fscanf() to read the formatted
data, with the code fragment shown in Figure 3.15.
In the preceding example, the read and write functions are contained in the main pro-
gram for ease of illustration. Of course, it is good practice to separate them into func-
tions.
In addition to text files as discussed above, you may encounter binary data files. These
are not viewable using text editors; they consist of raw 8 or 16-bit quantities (usually),
which are machine-readable and not human-readable. The exact representation
depends on the CPU being used: for example, a Pentium CPU has a different representation
for 16-bit integers to SUN Sparc CPUs (the byte order differs). The integer representations may be
converted, but the situation for floating-point numbers is much more problematic. For
this reason, the Institute of Electrical and Electronics Engineers (IEEE) floating-point
format is often used.
Binary files may be written by using the "wb" (write, binary) mode when calling fopen(),
and read back by using the "rb" mode and fread() to read the raw byte stream. Instead of
the formatted data handled by fprintf(), we use fwrite() to write a certain number of
bytes (using the sizeof() operator).
Figures 3.16 and 3.17 show the code necessary to write and read binary files. Of
course, the binary data files themselves will not be directly printable; try this by
loading temp.bin into a text editor such as DOS edit.
Note the format specifier "rb" (read-only, in binary mode) as opposed to the text files
discussed previously, which were opened using "r" mode (text is the default). On Unix
systems, there is no need to explicitly specify b as part of the mode string.
/* textfile.c
* Illustrates reading and writing a text (ASCII) file
* containing numerical data samples.
*/
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main()
{
FILE *fp;
double x, y;
short SampNum, NumSamples, NumSamplesRead;
char LineBuf[100];
char *FileName = "temp.txt";
// Create the data text file.

// Format is two numbers on each line - space to separate
fp = fopen( FileName, "w" );
if( ! fp )
{
printf("cannot open output file\n");
exit(1);
}
NumSamples = 10; // number of samples to write
SampNum = 1;
do
{
x = rand()/(double)RAND_MAX; y = rand()/(double)RAND_MAX;
printf("%hd %5.3lf %5.3lf\n", SampNum, x, y);

fprintf(fp, "%5.3lf %5.3lf\n", x, y);
SampNum += 1;
} while ( SampNum <= NumSamples );
printf("Wrote %hd samples\n", NumSamples);

fclose( fp );
Figure 3.14 Text file access program textfile.c (writing).

// now read the data file back

fp = fopen( FileName, "r");
if( ! fp )
{
printf("Cannot open input file `%s'\n", FileName);
exit(1);
}
SampNum = 0;
do
{
fscanf( fp, "%lf %lf", &x, &y);
if( ! feof(fp))
{
printf("x=%5.3lf y=%5.3lf\n", x, y);
SampNum += 1;
}
} while( ! feof(fp) );
NumSamplesRead = SampNum;
printf("Read back %hd samples.\n", NumSamplesRead);
fclose(fp);
exit(0);
}
Figure 3.15 Text file access program textfile.c (reading).
The files temp.txt (ASCII) and temp.bin (binary) generated by the above code should
be examined using the type command in DOS or cat in Unix.
Note that there is considerable scope for generating code which is machine-dependent;
that is, code which will work on one platform (such as DOS) and not on another (such
as Unix), and vice-versa.
Most binary files will contain, in addition to the data itself, some information at the start
of the file pertaining to the characteristics of the data.
This header information is somewhat specific to the type of data file. For example,
a picture (image) file will require information about the width of the picture, the height
of the picture, the number of colours present, and so forth. Some standards exist:
for example, the Windows Paint program uses bitmap files with extension .bmp.
Sound (audio, or wave) files often use the extension .wav or .au. You should be aware,
however, that there is a plethora of different file formats, even for (ostensibly) the same
type of data.
3.3.9 Data Structures

C provides for grouping of logically related data into a data structure, declared using
the struct keyword. Figure 3.18 shows the method for declaration of a data structure.
Note that there are several possible methods of declaring such a structure, for example:
struct Student
{
    char Name[20];
    double gpa;
};
// an array of these structures
struct Student StudentRecords[10];
3.3.10 Bitwise Operators

C can perform most, if not all, of the functions of assembly language. This extends
to bitwise operators. The following table illustrates these. The variables must be of an
integer type (int, short, char, long).
operator   meaning                      example
x << n     shift x left n bits          z = x << 2
x >> n     shift x right n bits         x >>= 1
~x         complement x (invert bits)   z = ~x
x | y      bitwise or of x and y        z = z | y
x & y      bitwise and of x and y       z &= 0x00ff
x ^ y      exclusive-or of x and y      z ^= 1
3.4 Advanced Topics

This section gives an overview of several topics which should be considered in various
C development situations: portability across different hardware, some hardware issues
themselves, dynamic memory allocation and compilation management for larger
projects. Some of these are examined in greater detail later in the unit.
3.4.1 Portability
Generally, C code is highly portable between different platforms and operating sys-
tems, provided care is taken with library functions. The exceptions are file handling (as
discussed above) and data sizes. The latter is why the int data type is not recom-
mended. File handling in binary mode, if coded carefully, can be platform-independent
and hence portable.
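More subtle structure alignment problems can occur when porting, because compilers may
insert padding between structure members. If a structure must have an exact size and
member alignment, the #pragma pack(1) directive (supported by many compilers, including
Gnu C) may be used.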
3.4.2 Makefiles
Typing the compilation command(s) on the command line can become tedious and
tiring. A batch file with the appropriate commands can go a long way to simplifying this.
However, for large projects, a better solution is required. As mentioned previously, it
is good practice to break a project into code modules in separate files. A large project
may run to dozens of files. With a batch file, making a single change, even just adding a
comment (which of course does not change the executable code), requires compiling and
linking everything all over again.
For more than relatively simple projects, a makefile is recommended. In a large multi-
source project, if one file is edited then it is the only one which needs to be re-compiled;
not all of the source files need be recompiled. Compilation is then followed by linking of
the object files. The make utility solves this problem, and automates the building of large
software projects. make does this by checking and comparing the dates on the source files
and the corresponding object files. If the object file is newer, then it must have been
compiled after the last modification to the source. If the source is newer than the
corresponding object file, then that source file must be re-compiled. If any sources are
re-compiled, the link stage must be performed again.
Figure 3.19 shows an elementary makefile. The most important concepts are:
Target is normally the executable program to be generated. More than one target may
exist, but only one is built at a time. The target may be a dummy one in order to
invoke some other operating-system command.
Dependencies specify what files need to be re-built to create a specific target.
Suffix Rules are the rules for building the targets, for example an object file from a
C source, an object file from an assembly-code source, and an executable from
object files.
The first section contains some comments regarding the project. Following this the
dependencies (object files) are listed; here they're called DEPS and contain only one
object (basic.o), but normally there would be many. The executable target is named
TARGET. Following are the suffix rules; here, the only rule specified is to create an
object file from a C source (compile only). Of course, other language compilers could
be invoked or a rule added for creating an object from an assembly language source
(assembly only). The executable target basic.exe specifies a dependency on all the
object files (although here only one), followed by the appropriate command to run. This
is specified on the following line and must be indented one tab stop. In this case, it is
effectively just a link phase.
Note well: the rule to be executed after the dependency must begin with a tab charac-
ter, not spaces. Otherwise the error missing separator will result. So the second line
of
# suffix rules
.c.o:
	$(CC) -c $(CCOPTS) $(INCDIR) $*.c -o $*.o
must start with a tab, as must the second line of
# make targets
basic.exe: $(DEPS)
	$(CC) -o $(TARGET) $(DEPS) $(LDOPTS)
Other targets may be specified to enhance project maintenance. Here, the targets
noexe and clean specify an action but no dependencies.
activity 3.9
Make Exercises
Run the make utility on the sample makefile:
make
Note the compile and link phases. Run it again:
make
Note that the target is now up to date. Now delete the executable:
make noexe
Then build the target again:
make
Note that only the link phase is invoked. If there were more than one
source C file, only the file(s) which have been changed would be re-compiled.
Now delete the object files:
make clean
Then build again:
make
Note that now both compile and link are performed.

3.4.3 Public-Domain C Resources

A number of public-domain resources for C programming exist, in the form of C libraries
for special functions (matrix calculations, graphics & animation, signal processing, to
name a few).
For more advanced work, the Frequently Asked Questions (FAQ) may be consulted on the
newsgroup comp.lang.c. Users of Gnu C should consult the FAQ which is available
with the distribution.
// binfile.c
// Illustrates reading and writing a binary file
// containing 16-bit integer data samples.
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main()
{
FILE *fp;
short Sample;
short SampNum, NumSamples, NumSamplesRead;
double NumCycles=2, pi = 4.0*atan(1.0);
char *FileName = "temp.bin";
// Create the binary file.

// 16 bits (2 bytes) per sample.
fp = fopen( FileName, "wb");
if( ! fp )
{
printf("Cannot open output file `%s'\n", FileName);
exit(1);
}
printf("Opened file `%s'. Data size = %d bytes\n",
FileName, (int)sizeof(short) );
// write 10 samples to the file

NumSamples = 10; // number of samples to write
SampNum = 1;
do
{
/* write samples of a sine wave to the file.
* Note scaling by 1000 (maximum is approximately 32000),
* because we are dealing with 16-bit signed integer quantities.
*/
Sample = (short)(1000.0 *
sin(NumCycles*2.0*pi*(double)SampNum/(double)NumSamples));
fwrite( &Sample, sizeof(short), 1, fp);
printf("%hd %hd \n", SampNum, Sample);
SampNum += 1;
} while( SampNum <= NumSamples );
//printf("Wrote %hd samples\n", NumSamples);

fclose( fp );
Figure 3.16 Binary file access program binfile.c (writing).

// Now read the data file back.

fp = fopen( FileName, "rb");
if( ! fp )
{
printf("cannot open input file\n");
exit(1);
}
SampNum = 0;
do
{
fread( &Sample, sizeof(short), 1, fp);
if( ! feof(fp))
{
// found a valid sample
SampNum += 1;
printf("Read back sample number=%hd, value=%hd\n",

SampNum, Sample);
}
} while( ! feof(fp) );
NumSamplesRead = SampNum;
printf("Read back %d samples.\n", NumSamplesRead);
fclose(fp);
exit(0);
}
Figure 3.17 Binary file access program binfile.c (reading).
#include <stdio.h>
#include <string.h>
// declare the data structure

typedef struct
{
char Name[20];
double gpa;
} Student;

Student StudentRecords[10];
void main(void) {
strcpy( StudentRecords[4].Name, "fred");
StudentRecords[4].gpa = 1.0;
}
Figure 3.18 Structures in the C language.

# sample makefile which builds basic.exe

# For djgpp (GnuC for DOS)
#
# type "make" or "make basic" or "make -f makefile"
# Makefile reference:
# "Topics in C Programming", S. Kochan & P. Wood,
# Hayden Books, Chapter 7
#
# Notes:
# The command after dependency lines must be on
# the next line, and *must* begin with a tab character.
# Use / not \ as path separator in makefiles.
#
# John Leis
# compiler flags
CC = gcc
CCOPTS =
BASEDIR =
INCDIR = -I $(BASEDIR)/include
# linker flags
MATHLIB = m
LIBS = -l$(MATHLIB) # note no space after -l
LDOPTS = $(LIBS)
# dependencies
DEPS = basic.o
# executable target
TARGET = basic.exe
# suffix rules
.c.o:
	$(CC) -c $(CCOPTS) $(INCDIR) $*.c -o $*.o
# make targets - note tab on rule line
# executable
basic.exe: $(DEPS)
	$(CC) -o $(TARGET) $(DEPS) $(LDOPTS)
clean:
	del *.o
noexe:
	del *.exe
Figure 3.19 A makefile for building a project.

3.5 The C++ Language

The C language is excellent for low-level programming tasks. It is also quite good
for high-level tasks, although several key concepts have led to the development of an
object-oriented version of C called C++. Some of these key concepts are:
Reliability The need for reliability and the problem of error-checking in large projects.
Reuse The ability to re-use code in one or several projects.
Co-ordination The ability for several members of a design team to work on different
sections of the same project and have the resulting software work seamlessly.
The object-oriented paradigm refers to the joining of data and the functions which operate
on that data. The two entities, data and code, taken together are termed an object.
In a traditional procedural language such as C, the software engineering task normally
begins by defining the data structures and then the code which operates on that data.
The linkage is not tight, and relies on the programmer(s) to ensure that the correct
portion of code operates on the correct data. Although the data-code connection
should be clear at the design stage, it is not immediately obvious at the implementation
stage and much less clear during the maintenance phase.
Thus, the object-oriented approach is as much a conceptual framework as a particular
programming language. However, the programming language constructs provided by
the C++ language help to enforce the object-oriented principle.
C++ is, for most practical purposes, a superset of C. That is, a C++ compiler can compile
most standard C programs, and conventional C code can be freely intermixed with C++ code;
they look the same except for some additional syntax. This probably accounts for the
popularity of C++ for programming projects.
Having spelled out the advantages of C++ in a broad sense, some disadvantages must
be mentioned. First is the requirement to upgrade software tools and of course pro-
gramming skills. If the GnuC compiler tools were installed as described previously, then
the GnuC++ compiler is automatically available. Second, some performance may be
lost in terms of raw speed. Whether this loss outweighs the code portability, readabil-
ity and maintainability advantages outlined earlier depends on the particular situation.
In general, large application programming tasks will invariably benefit from the use of
C++. Low-level code such as device drivers and system functions may require a pro-
gramming style appropriate to attaining the utmost performance. C is probably more
appropriate in this situation. However these sorts of tasks are unlikely to benefit from
an object-oriented approach as the tasks tend to be procedural and hardware-oriented.
3.5.1 Compiling C++ Programs

The Gnu C++ compiler is invoked by the command g++. The simplest usage is as
follows:
g++ myprog.cpp -o myprog.exe
Note the use of the cpp extension for C++ sources. The batch file gcpp.bat is provided
in the example sources, and is invoked as follows:
gcpp myprog
Object files, include files, linking and so forth remain as per the previous discussion for
the C compiler.
3.5.2 Superficial Differences
Two seemingly superficial differences between C and C++ must first be mentioned.
The first is the comment delimiter. C uses the /* and */ constructs to mark the
beginning and end of comment blocks, respectively. The comments may span several
lines. C++ has a single-line comment: everything from // to the end of line defines a
comment. Most C compilers now also support the C++ comment syntax.
An alternate approach to console (screen) printing and keyboard input is available in
C++. Although offering a plethora of features, only the simplest will be introduced now.
The include file, rather than C's stdio.h, is now iostream.h (named iostream, without the
extension, in standard C++, where cout and endl belong to the std namespace). If a variable
someVar is declared then the following prints the value of someVar:
cout << "the value of someVar" << " is ";
cout << someVar << "\n";
cout << "we could also use ";
cout << "this syntax. " << someVar << endl;
This approach is considerably less error-prone than C's printf() mechanism: note
that the programmer does not need to specify the data type of someVar in the example.
Internally, the C++ function overload mechanism is used to implement the iostream
class. This will be discussed further in a later section.
activity 3.10
C++ Compilation
Enter the following program called cpptest.cpp
// enter as cpptest.cpp
// This is a one-line comment
/* This is a
* multi-line
* comment.
*/
#include <iostream.h>
int main(void)
{
double someVar = 3.14159;
// this is a comment
cout << "the value of someVar" ;
cout << " is " << someVar << "\n";
cout << "we could also use ";
cout << "this syntax. " << someVar << endl;
}
Attempt to compile using the gcc compiler as per the previous examples.
Note that many errors are produced, because the C++ libraries are not
linked. Now compile using
g++ cpptest.cpp -o cpptest.exe
Check that the compilation is successful and run the program.
3.5.3 Abstract Data Types

Standard C contains intrinsic definitions for data such as short, char and double.
The real world frequently requires definitions for compound data, such as a screen
window or a student. A screen window may be made up of a collection of primitive
data types such as width and height (which may be represented as the primitive type
short). A student may require the definition and storage of a name as an array of
characters, together with a grade-point average (gpa) represented as a floating point

item (double).
In standard C, a data structure (also called an Abstract Data Type or ADT) is repre-
sented by a structure and denoted by the struct keyword. Functions are essentially
independent in their declaration: it is up to the programmer to ensure that functions op-
erate on the correct data structure(s). The advantage of declaring data as a structure,
rather than simply as a collection of individual types, is twofold: the logical association
becomes a focus at design time, and the code becomes more readable and hence
easier to maintain.
We could declare a structure prototype Student for storing student records as follows:
// declare a data structure

typedef struct
{
char Name[20];
double gpa;
} Student;
C++ compilers also allow the syntax
// declare a data structure -- another method

struct Student
{
char Name[20];
double gpa;
};
which is consistent with the syntax for declaring a class.
It is of course possible to declare the Name and gpa fields separately, but that would not
indicate to the reader that the two are related. Additionally, it forces the code designer
to view the two as one entity. A particular Name is associated with one, and one only,
gpa.
The type definition above is a prototype: it does not reserve any memory space. If we
wish to declare a particular instance of a data structure using the above prototyping
method (there are other methods of declaring structs), the syntax is identical to that
for any intrinsic data type:
// an instance of this structure
Student someStudent;
An array of structures follows logically, as does a pointer to an instance of a data
structure:
#define MAX_STUDENTS 20

Student StudentRecords[MAX_STUDENTS];
// a pointer to one of these structures

Student *pStudentRecord;
We could point to the fifth student record (see footnote 3) using
pStudentRecord = &StudentRecords[4];
Finally, declaring functions that use data structures is consistent with other intrinsic data
types:
// a function that operates on a structure

void updateStudentRecord( Student *pCurrentStudent );
In the preceding example, the connection between the Student data and the
updateStudentRecord() function is not especially tight. It is not enforced by the
compiler (except perhaps for the typing of the pointer argument). The main connection
is through the naming convention: if the function were called updateRecord(), for
example, the connection would be lost.
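The C-style arrangement described above might be sketched as follows. The structure and array declarations are those of the text; the two functions setStudentRecord() and updateFifthRecord() and their bodies are hypothetical, since the text gives only a prototype.

```cpp
#include <cstring>

#define MAX_STUDENTS 20

typedef struct
{
    char Name[20];
    double gpa;
} Student;

Student StudentRecords[MAX_STUDENTS];

// Hypothetical helper: fills in one record. The copy is limited to the
// size of the Name field and is always terminated.
void setStudentRecord( Student *pStudent, const char *name, double gpa )
{
    std::strncpy( pStudent->Name, name, sizeof(pStudent->Name) - 1 );
    pStudent->Name[sizeof(pStudent->Name) - 1] = '\0';
    pStudent->gpa = gpa;
}

// Hypothetical helper: updates the fifth record (index 4) via a pointer
// and returns the stored grade-point average.
double updateFifthRecord( void )
{
    Student *pStudentRecord = &StudentRecords[4];
    setStudentRecord( pStudentRecord, "A. Student", 5.5 );
    return pStudentRecord->gpa;
}
```

Nothing in this code, other than the name and the argument type, tells the reader or the compiler that setStudentRecord() belongs with Student.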
3.5.4 Object-Oriented Principles & Concepts

It should now be clear that binding code to data is a desirable objective. C language
extensions are able to help in this regard, hence the development of C++.
In object-oriented programming, a class defines a data structure and its associated
code. The data portion defines the characteristics of the class. The code is bound to
that class in order that it may operate on the data which is also defined for that class.
So, in C++ a class is essentially a struct which also contains code.
Recall that a struct is a prototype and that instances of a structure must be declared.
The class is the prototype. An object represents a particular piece of data and the
code to operate on that data: it is an instance of a class. As the class contains not
only data but also code, special names are used to define the components of the class.
The functions within a class are termed methods, and the data is termed the attributes
of that class. For a window object, the attributes may be
1. width in pixels;
2. height in pixels;
3. horizontal position in pixels, and
4. vertical position.
The methods may be
1. To resize the window to different width and height, and
2. To move the window to a different horizontal and vertical position.
Footnote 3: Remember that C indices start at zero, hence the 5th record has index 4.
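The window attributes and methods listed above could be sketched as a C++ class along the following lines. All names here (Window, resize(), move() and the attribute names) are hypothetical, and the class uses a constructor, which is introduced in Section 3.5.5, to set the initial values.

```cpp
// A minimal sketch of a window class: four attributes and two methods,
// plus access functions so that the attributes can be read back.
class Window
{
public:
    Window( int w, int h, int x, int y )
    {
        width = w;
        height = h;
        xPos = x;
        yPos = y;
    }
    // method 1: resize to a different width and height
    void resize( int newWidth, int newHeight )
    {
        width = newWidth;
        height = newHeight;
    }
    // method 2: move to a different horizontal and vertical position
    void move( int newX, int newY )
    {
        xPos = newX;
        yPos = newY;
    }
    int getWidth( void )  { return width; }
    int getHeight( void ) { return height; }
    int getX( void )      { return xPos; }
    int getY( void )      { return yPos; }
private:
    int width;    // pixels
    int height;   // pixels
    int xPos;     // horizontal position, pixels
    int yPos;     // vertical position, pixels
};
```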
The object-oriented paradigm is extended further by inheritance. This is a very powerful
concept, and allows code reusability: the re-use of portions of code in large software
projects.
Data hiding or encapsulation is another concept employed in object-oriented design so
as to increase reusability. If a particular object has attributes, then at some point in the
code those attributes may be modified in value. The data types of those attributes will
be fixed at the design stage. Now suppose the need arises (during implementation or
maintenance of the software) to change the data types. Each access in the code to
those attributes would need to be checked and modified. A case in point may be the
storage of a date object using two digits for the year. It would be a mammoth task to
locate every instance of access to the two-digit date and change the appropriate code
to use a four-digit representation.
If, on the other hand, access functions were used to return and set the year, then
only those access functions would need to be changed. All code that accesses the
attributes of an object should thus do so using access methods and not alter the at-
tributes themselves directly. Of course, this requires some additional overhead in terms
of the larger amount of code required.
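The date example above might be sketched as follows. The class name Date and the functions getYear() and setYear() are hypothetical. The point of the sketch is that the private representation (here, years counted from 1900) could later be changed to a plain four-digit year by editing only the two access functions.

```cpp
// A minimal sketch of encapsulation using access functions.
class Date
{
public:
    Date( int fullYear )
    {
        setYear( fullYear );
    }
    // access function: always returns a four-digit year
    int getYear( void )
    {
        return 1900 + yearsSince1900;
    }
    // access function: always accepts a four-digit year
    void setYear( int fullYear )
    {
        yearsSince1900 = fullYear - 1900;
    }
private:
    // internal representation, hidden from all calling code
    int yearsSince1900;
};
```

Code elsewhere in the project sees only four-digit years, whatever the storage format happens to be.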
So to summarize, the key concepts are:
class          The generic definition of data and code for a particular entity.
object         An instantiation of a class. Several objects may belong to the
               same class but be physically different objects.
methods        The code which belongs to a particular class of object.
attributes     The unique characteristics which define an object.
encapsulation  Having special access or interface functions for an object's data,
               and not allowing direct access to an object's attributes.
inheritance    Where one class derives its characteristics from another.
Some concrete examples of these will be given in the following section, using C++
syntax.
class Vehicle
{
public:
void printColor( void );
// access functions for private members

char *getColor( void )
{
return color;
}
int getNumWheels( void )

{
return numWheels;
}
private:
int numWheels;
char *color;
};
Figure 3.20 A simple class definition.
3.5.5 Object Definitions in C++

Suppose the task is to represent a vehicle object. The attributes may be the color
and the number of wheels. The simplest definition of this class may look like:
class Vehicle
{
int numWheels;
char *color;
};
This really isn't much different from a C structure, except that the members of a class
are private unless declared otherwise. The principle of data hiding may be
implemented by defining the public and private components of the class explicitly,
together with the access functions as shown in Figure 3.20.
Now the attributes are private and any attempt to access them from outside the class
will be deemed illegal by the compiler. In this case it could be useful: the number of
wheels would be set when the object is first declared and never changed. There should
be no need to change the component numWheels, for example. However a reasonable
requirement would be to return the number of wheels contained in a particular object
of the vehicle class, hence the existence of the public function getNumWheels().
When a data object is created, the data should be set to some sensible default value.
A common error in C is to declare a data structure and forget to initialize some or all
of its members. This may be a difficult problem to trace. So C++ has the constructor,
which is a function called when an object belonging to a particular class is declared.
class Vehicle
{
public:
    // no return type allowed for constructor
    Vehicle( char *color, int numWheels );
    // access functions for private members
    char *getColor( void )
    {
        return color;
    }
    int getNumWheels( void )
    {
        return numWheels;
    }
private:
    int numWheels;
    char *color;
};
Figure 3.21 A class with a constructor method.
In C++, the constructor function must:
1. Have the same name as the class.
2. Not have any return value or type.
A constructor function may be added to the vehicle class as shown in Figure 3.21.
The body of the function may be implemented within the class definition, as has been
done so far. This is fine for short functions but may make the class definition difficult or
impossible to read if long functions are involved. So naturally the function body may be
declared elsewhere in the source files. However, some notation is required to associate
a function with a particular class. Figure 3.22 illustrates this for the constructor function
of the vehicle object.
The :: notation explicitly associates a function with a class. On the left is the class
name; on the right is the member function name. Since constructors must have the
same name as the class, we end up with the syntax for the constructor declaration
being Vehicle::Vehicle().
Member functions are implicitly passed a pointer to the object, which is always called
this. It is implicit because it does not appear in the function definitions. However,
there must be some underlying mechanism for determining which particular object is
being referenced: the this object. The use of this is optional, as in the example of
Figure 3.22 (there are some exceptions to this rule).
// constructor
// note no return type allowed
Vehicle::Vehicle( char *color, int nw )
{
// explicit use of 'this'
// need to do this if the arguments to the function have the
// same name as the class members
this->color = color;
// implicit use of 'this'

numWheels = nw;
}
Figure 3.22 The body of a constructor function.
Function overloading allows C++ to have more than one function with the same name
but different arguments. This may be useful in situations where, for example, sensible
defaults are to be assumed for an object. Which particular function is called is
determined by the compiler, using the number and data type of the arguments.
Just as a constructor allocates memory and sets the initial attributes for an object, a
destructor function releases the memory and performs any additional tasks which may
be required when an object is no longer needed. The destructor function must also
have the same name as the class but is prefixed by a tilde sign ( ~ ). Figure 3.23
shows both an overloaded constructor function and a destructor function.
The declaration of an object as an instance of a class may now be done. Figure 3.24
shows several alternatives, including the declaration of a pointer to a new object. Note
the use of the new keyword in the latter case.
activity 3.11
C++ Objects
Compile and run the program objects.cpp using the g++ Gnu C++ com-
piler. Note the output from the program and trace through the listing,
comparing the output to the source code.
class Vehicle
{
public:
    // no return type allowed for constructor & destructor
    Vehicle( char *color, int numWheels );
    Vehicle( void );
    ~Vehicle();
    void printColor( void );
    // public attribute: may be accessed directly
    int value;
    // access functions for private members
    char *getColor( void )
    {
        return color;
    }
    int getNumWheels( void )
    {
        return numWheels;
    }
private:
    int numWheels;
    char *color;
};
// constructor
Vehicle::Vehicle( char *color, int nw )
{
    // explicit use of 'this'
    // need to do this if the arguments to the function have the
    // same name as the class members
    this->color = color;
    // implicit use of 'this'
    numWheels = nw;
}
// destructor
Vehicle::~Vehicle()
{
    cout << "vehicle destructor for " << color << " vehicle\n";
}
// member function
void Vehicle::printColor(void)
{
    cout << "color is " << color << "\n";
}
Figure 3.23 Overloading the constructor function.
Vehicle myCar( "blue", 4 );
//class Vehicle myCar( "blue", 4 ); // OK but unnecessary
// note pointer to object
Vehicle *yourCar = new Vehicle( "red", 3 );
myCar.printColor();
cout << "myCar.numWheels returns " << myCar.getNumWheels() << "\n";
// illegal
// myCar.numWheels = 0;
// OK
myCar.value = 10;
// delete the object allocated by new
// delete expects a pointer to an object
// Cannot delete objects that were not allocated via new
// (for example, myCar)
delete yourCar;
Figure 3.24 Class declarations.
3.5.6 Input/Output
C++ builds on the inherent operator overloading available in the language to provide
more robust formatting. The basic iostream constructs allow for formatted output in
a somewhat different manner using the streams cout and cin in conjunction with the
operators << and >>. Even more precise control over field widths, degree of precision
and so forth may be obtained using the manipulators defined in iomanip.h.
activity 3.12
C++ Input/Output
Work through the supplied iocons.cpp for simple examples of the new
formatting methods.
As mentioned previously, C++ has a different input/output mechanism to the printf()
method of C. In C, a statement such as
printf("The answer is %5.2lf\n", theAnswer);
requires some attention to detail on the part of the programmer, in that the variable
theAnswer must be of the type double in order that the %lf format specifier work cor-
rectly. If, for example, theAnswer were an integer then the results would be unpre-
dictable.
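For comparison, the same output can be produced with stream manipulators, where no format specifier has to agree with the variable type. The following is a minimal sketch, assuming the standard header iomanip (iomanip.h on older compilers); the function name formatAnswer() is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: the stream equivalent of
//     printf("The answer is %5.2lf", theAnswer);
// An ostringstream stands in for cout so the text can be inspected.
std::string formatAnswer( double theAnswer )
{
    std::ostringstream out;
    out << "The answer is "
        << std::fixed              // fixed-point notation
        << std::setprecision(2)    // two digits after the decimal point
        << std::setw(5)            // minimum field width, next item only
        << theAnswer;
    return out.str();
}
```

With an argument of 3.14159 the result is the text "The answer is " followed by the five-character field " 3.14".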
activity 3.13
C Formatting
To illustrate the potential problems in format conversion and output
specifiers, enter, compile and run the following short program.
#include <stdio.h>
int main()
{
    short theShortAnswer;
    long theLongAnswer;
    double theDoubleAnswer;
    theShortAnswer = 1342;
    theLongAnswer = 1342L;
    theDoubleAnswer = 1342.0;
    /* the mismatched format specifiers below are deliberate */
    printf("theShortAnswer: "
           "hd format:%hd ld format:%ld lf format:%lf\n",
           theShortAnswer, theShortAnswer, theShortAnswer);
    printf("theLongAnswer: "
           "hd format:%hd ld format:%ld lf format:%lf\n",
           theLongAnswer, theLongAnswer, theLongAnswer);
    printf("theDoubleAnswer: "
           "hd format:%hd ld format:%ld lf format:%lf\n",
           theDoubleAnswer, theDoubleAnswer, theDoubleAnswer);
    return 0;
}
activity 3.14
C++ Formatting
Work through the supplied format.cpp for simple examples of the new
formatting methods.
3.5.7 Declaring Classes and Creating Objects
The sample program objects.cpp has already shown the basic method for declaring
classes and objects (specific instances of a class). Note especially:
1. The way classes are defined.
2. The many ways in which objects themselves may be defined.
3. The way in which member functions are declared and the function body defined,
either within the class definition or elsewhere using the scope operator ::.
4. How member functions are called, for example myCar.printColor();
The example inherit.cpp shows the implementation of inheritance. In this example,
the generic class Employee is defined, with a derived class ContractEmployee. The
syntax is as follows:
class ContractEmployee : public Employee
Note in inherit.cpp that the (correct) derived function is called using
// function in the derived class

pay = theEmployee.calcPay();
but that the member function in the base class may be explicitly called using the scope
resolution operator as follows:
// function of the same name in the base class

// note scope resolution operator
pay = theEmployee.Employee::calcPay();
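The two calls above can be tried in a self-contained form as follows. The class names and calcPay() are taken from the text, but the attributes, the constructors, setSalary() and the pay rules are hypothetical; inherit.cpp itself may differ.

```cpp
// Base class: a salaried employee paid a fixed monthly amount.
class Employee
{
public:
    Employee( void )                 { monthlySalary = 0.0; }
    void setSalary( double salary )  { monthlySalary = salary; }
    double calcPay( void )           { return monthlySalary; }
private:
    double monthlySalary;
};

// Derived class: inherits setSalary() and supplies its own calcPay().
class ContractEmployee : public Employee
{
public:
    ContractEmployee( double rate, double hours )
    {
        hourlyRate = rate;
        hoursWorked = hours;
    }
    // same name as the base-class function; this version is used by
    // default for ContractEmployee objects
    double calcPay( void )           { return hourlyRate * hoursWorked; }
private:
    double hourlyRate;
    double hoursWorked;
};
```

For an object declared as ContractEmployee theEmployee( 50.0, 10.0 ), theEmployee.calcPay() uses the derived version, whereas theEmployee.Employee::calcPay() uses the base-class version.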
The keywords public and private enforce encapsulation of a class's data and
methods. Sometimes, however, it is necessary (or convenient) to have a common function
which can access the private data of several objects, thus over-riding the
private specification. This is done using the friend keyword. Essentially, this
declares other friendly classes or functions that are allowed to access the private
information in the current class. In the example friend.cpp, the class CustomerRecord
is defined as having a friend class:
friend class MyNameClass;
Thus the Capitalize() member function of the object myName (an instance of class
MyNameClass) is allowed to access the variable customerName, which is private to the
CustomerRecord class. Under normal circumstances this would be illegal.
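A minimal sketch of this arrangement is given below. The names CustomerRecord, MyNameClass, Capitalize() and customerName come from the text; the constructor, getName(), the array size and the function bodies are assumptions, and friend.cpp itself may differ.

```cpp
#include <cctype>
#include <cstring>

class CustomerRecord
{
    // grant MyNameClass access to the private members of this class
    friend class MyNameClass;
public:
    CustomerRecord( const char *name )
    {
        std::strncpy( customerName, name, sizeof(customerName) - 1 );
        customerName[sizeof(customerName) - 1] = '\0';
    }
    const char *getName( void ) { return customerName; }
private:
    char customerName[32];
};

class MyNameClass
{
public:
    // A pointer is passed so that the caller's record is modified.
    // The access to customerName is legal only because of the friend
    // declaration above.
    void Capitalize( CustomerRecord *pCustomer )
    {
        for( char *p = pCustomer->customerName; *p != '\0'; p++ )
        {
            *p = (char) std::toupper( (unsigned char) *p );
        }
    }
};
```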
activity 3.15
Friend Functions
Compile and run friend.cpp. Now remove the friend keyword and
note the compilation errors pertaining to private member variables.
activity 3.16
Object Pointers
As it stands, the member function Capitalize() must be passed an
instance of a CustomerRecord class. This is not efficient if this class is
large and contains many attributes. A more efficient approach is to
pass-by-reference: that is, to pass a pointer to the object rather than the object
itself. Change the variable someCustomer to a pointer to CustomerRecord
and modify the appropriate other sections of code. Re-compile and
check that the code works the same as previously.
3.5.8 Inheritance and Derived Classes

A derived class is one which derives or inherits attributes from a parent class. All
of the functionality and attributes associated with a base object are inherited by the
derived object; however, the access privileges may vary according to the derivation
specification. derived.cpp shows the declaration of a base object and derived objects,
and introduces protected members, which are accessible within the base class and
its derived classes but not from outside them.
activity 3.17
Derived Classes
Compile and run derived.cpp and step through its operation.
There are two methods by which objects may incorporate other objects: derivation and
composition. Derivation uses a special syntax in C++. Composition is simply nesting
objects, much like nested structures in C. In using derivation to construct classes, all
of the functionality and attributes associated with a base object are inherited by the
derived object. However the level of access may be modified by the way in which the
parent class is inherited: public or private. derived1.cpp illustrates public and private
inheritance.
activity 3.18
More on Derived Classes
Compile and run derived1.cpp and step through its operation.
Just as one class may inherit attributes and methods from another class, any single
class may inherit attributes from several other classes. This is termed multiple inheri-
tance. The example multint.cpp shows a generic Window class for drawing a window
on screen, which inherits attributes and methods from both the Button class and the
TitleBar class. The Button class may have attributes such as foreground color, back-
ground color, and methods to simulate pushing of the button on screen. The syntax
simply extends the previously-mentioned single-inheritance case:
class Window : Button, TitleBar
activity 3.19
Multiple Inheritance
Compile and run multint.cpp and step through its operation.
Virtual functions are member functions of a class which are normally expected to be
overridden when the class is derived. The example virtual.cpp gives an example of
such a case. The Employee class is used to derive a sub-class, ContractEmployee.
This would be an appropriate use of class derivation: generic attributes such as
employee name, employee number and so forth would be used in the more specific
contract employee class. However, the method of calculating the pay for contract
employees is quite different to the method of calculating the pay for salaried employees.
Thus the member function calcPay() is declared in the base class. It is expected that
this function be overridden in each derived class. To enforce this, a pure virtual function
is declared in the base class. The compiler will then refuse to create an object of any
derived class that does not supply its own version of this function.
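The pure virtual mechanism might be sketched as follows. Employee, ContractEmployee and calcPay() are named in the text; SalariedEmployee, payFor(), the attributes and the pay rules are hypothetical, and virtual.cpp itself may differ.

```cpp
// Base class: calcPay() is pure virtual ("= 0"), so Employee objects
// cannot be created directly.
class Employee
{
public:
    virtual ~Employee() {}
    virtual double calcPay( void ) = 0;
};

class SalariedEmployee : public Employee
{
public:
    SalariedEmployee( double salary ) { annualSalary = salary; }
    virtual double calcPay( void )    { return annualSalary / 12.0; }
private:
    double annualSalary;
};

class ContractEmployee : public Employee
{
public:
    ContractEmployee( double rate, double hours )
    {
        hourlyRate = rate;
        hoursWorked = hours;
    }
    virtual double calcPay( void )    { return hourlyRate * hoursWorked; }
private:
    double hourlyRate;
    double hoursWorked;
};

// Works for any kind of employee: the version of calcPay() that runs
// is chosen at run time from the actual type of the object.
double payFor( Employee *pEmployee )
{
    return pEmployee->calcPay();
}
```

Removing calcPay() from either derived class makes that class abstract as well, and any attempt to declare an object of it is rejected at compile time.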
activity 3.20
Virtual Functions
Compile and run virtual.cpp and step through its operation.
3.5.9 Setting Attributes

Public and private attributes have already been discussed. The concept of a private
variable enforces information hiding and minimizes change-propagation when storage
types are changed in one code location.
Normally all attributes, whether private or public, are set separately for each object,
even though the objects may belong to the same class. This is the expected behaviour.
Sometimes it is necessary to define an attribute that has a single value shared across
all instances of a class (all objects of that class). These attributes are given the
qualifier static.
The example statics.cpp shows the use of statics in a simplistic way. The Bank object
contains the balance for a person's bank account. The objects myBank and yourBank
are thus quite distinct. However, the interest rate paid across all accounts is the same.
Therefore, a static variable is required to maintain the interest rate irrespective of the
instances of the Bank class.
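A minimal sketch of a static attribute follows. The names Bank and currentRate come from the text and the activity below; the constructor, the balance attribute and interestDue() are hypothetical, and statics.cpp itself may differ.

```cpp
class Bank
{
public:
    Bank( double openingBalance ) { balance = openingBalance; }
    // uses the per-object balance and the shared rate
    double interestDue( void )    { return balance * currentRate; }
    // static: one copy only, shared by every Bank object
    static double currentRate;
private:
    double balance;               // separate copy in each object
};

// A static attribute must also be defined once, outside the class.
double Bank::currentRate = 0.05;
```

Assigning a new value to Bank::currentRate changes the result of interestDue() for every Bank object at once.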
activity 3.21
Static Scope
Compile and run statics.cpp and verify that changing the currentRate
variable changes the value in both Bank objects myBank and yourBank.
3.5.10 Operator Overloading

Just as member functions of a class may be overloaded, so operators may be
redefined. The code opereq.cpp defines a Vector object which is meant for holding
variable-sized vectors of numbers. In addition to the array of values themselves, the
length of the array (number of elements) must also be stored. In opereq.cpp, the
assignment operator is redefined using the member function as shown in Figure 3.25.
void Vector::operator=( const Vector &rhs )
{
    cout << "operator=\n";
    // deleting a null pointer is safe in C++; vecValues must therefore
    // have been set to null or to allocated storage by the constructor
    delete [] vecValues;
    numValues = rhs.numValues;
    vecValues = new int[rhs.numValues];
}
Figure 3.25 The operator= method.
So an assignment sequence such as
Vector v(4);
Vector newV(4);
newV = v;
implicitly calls Vector::operator=. If the new operator= function were not provided,
the compiler would generate code to perform a member-by-member copy of the
attributes of a Vector object. In some cases, particularly those where objects contain
pointers to other data, this may not be the desired behaviour: only the pointer would
be copied, so that both objects would refer to the same storage. It is then up to the
programmer to define the new storage and set/copy elements as appropriate.
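A complete version of such a class, in which the assignment operator also copies the element values, might look like the following sketch. It is not the code of opereq.cpp: the accessor names, the copy constructor, the self-assignment check and the reference return type (which permits chained assignments such as a = b = c) are all assumptions added for illustration.

```cpp
class Vector
{
public:
    // constructor: allocate n elements and set each to -1
    Vector( int n )
    {
        numValues = n;
        vecValues = new int[n];
        for( int i = 0; i < n; i++ )
            vecValues[i] = -1;
    }
    // copy constructor: start empty, then reuse operator=
    Vector( const Vector &rhs )
    {
        numValues = 0;
        vecValues = 0;      // null, so the delete in operator= is safe
        *this = rhs;
    }
    ~Vector()
    {
        delete [] vecValues;
    }
    // assignment: allocate new storage and copy each element
    Vector &operator=( const Vector &rhs )
    {
        if( this != &rhs )  // nothing to do for v = v
        {
            int *newValues = new int[rhs.numValues];
            for( int i = 0; i < rhs.numValues; i++ )
                newValues[i] = rhs.vecValues[i];
            delete [] vecValues;
            vecValues = newValues;
            numValues = rhs.numValues;
        }
        return *this;
    }
    void setValue( int index, int value ) { vecValues[index] = value; }
    int  getValue( int index )            { return vecValues[index]; }
private:
    int  numValues;
    int *vecValues;
};
```

After newV = v, the two objects hold separate copies of the data, so a later change to v does not affect newV.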
activity 3.22
Operator Overloading
Comment-out the references to operator= in opereq.cpp. Compile and
run the program, and verify that the assignment works as expected (the
constructor sets the initial values to -1 for convenience). Now restore the
portions of code referring to operator=, re-compile and run the program.
Note the different results.
3.5.11 Object Arrays
Naturally arrays of objects as well as single objects may be created. An array may be
created using
const int numCars = 4;

Vehicle carLot[numCars];
however it does not allow the explicit specification of a constructor to be called (the
void constructor is called).
An alternative is to declare a pointer to the object and then allocate the necessary
storage using new, as follows:
Vehicle *pCarLot;
pCarLot = new Vehicle[numCars];

pCarLot[2].value = 3;
Using this approach, the constructor cannot be called directly. For example
// can't call the constructor

pCarLot[2].Vehicle("color", 1);
It must be kept in mind that because the array is dynamically allocated using new, the
memory must be freed using delete (compare to malloc() and free() in standard C).
The array declared as above is freed using
delete [] pCarLot;
Although it may seem that the empty array specifier is either unnecessary or should
contain the number of objects, the run-time system is able to determine the amount of
memory to be deleted (just as free() does not require the number of bytes to be deleted,
only a pointer to the start of the block). The number of bytes is kept internally in a data
structure associated with the memory pointer. The [] is nevertheless required: it
indicates that an array, rather than a single object, is being deleted, so that the
destructor is called for every element.
Lastly another method is static initialization, which allows the specific constructor to be
called:
// another method of initialization

Vehicle anotherCarLot[] = { Vehicle("yellow", 5),
Vehicle("green", 6)
};
These examples are all shown in arrays.cpp.
activity 3.23
Arrays of Objects
Compile and work through arrays.cpp.

3.5.12 Functions
The example code module func.cpp illustrates several important principles of C++
relating to functions. Firstly, the concept of default arguments to a function is a C++
feature not found in C. A function declared as
void defaultFunc( int x, int y = 50 );
will be called with the two values for x and y if both are supplied. However if only
one argument is supplied then the second takes on the value supplied in the function
prototype (50 here).
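As a small illustration, the following variant of the prototype returns the sum of its arguments so that the effect of the default can be observed. The name defaultSum() and the return value are assumptions; the function in func.cpp returns void.

```cpp
// The default value appears in the prototype only.
int defaultSum( int x, int y = 50 );

// The definition does not repeat the default.
int defaultSum( int x, int y )
{
    return x + y;
}
```

Here defaultSum( 1, 2 ) uses both supplied values, while defaultSum( 1 ) is compiled as if defaultSum( 1, 50 ) had been written.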
A related concept is that of overloading a function. This is done via the data type of the
arguments supplied. For example the function prototypes
void overloadFunc( int x );
void overloadFunc( double x );
cause the appropriate function to be called, depending upon whether the argument
supplied is an int or a double.
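The selection can be observed with the following sketch, in which each version reports which one was chosen. The string return value is an assumption made for illustration; the prototypes in the text return void.

```cpp
#include <string>

// Two functions with the same name: the compiler chooses between them
// using the type of the argument in each call.
std::string overloadFunc( int x )
{
    (void) x;               // value not needed in this sketch
    return "int version";
}

std::string overloadFunc( double x )
{
    (void) x;
    return "double version";
}
```

A call such as overloadFunc( 3 ) selects the first function and overloadFunc( 3.0 ) the second.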
Functions declared as inline are compiled as inline code rather than a separate
function. A copy of the function's code is included where the call to the function appears.
Thus several copies of the function's code may actually appear in the output file. This
increases the code size but may yield improved performance (but this is certainly not
guaranteed).
Another important matter in function calls is the ability to call native C functions.
Although standard C functions should not normally be written for a C++ project, the need
may arise to call C code from a library or a C-interfaced assembly language routine or
similar. Because the C++ compiler alters (mangles) function names to encode the
argument types, whereas the C compiler does not, the linker cannot match a C function
to a normal C++ declaration. C functions must instead be declared using the syntax
extern "C" void cFunc(void);
where the extern "C" qualifier tells the compiler to use C linkage, that is, to suppress
the C++ name-changing rules for the function name which follows.
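The declaration can be exercised in a single file as sketched below. The int argument and return value, and the wrapper callCFunc(), are assumptions for illustration. So that the sketch links on its own, a stand-in body for cFunc() is included; in a real project that body would be in a separately compiled C source file such as cfunc.c.

```cpp
// Declaration as seen by the C++ caller: C linkage, no name mangling.
extern "C" int cFunc( int x );

// Ordinary C++ function which calls the C function.
int callCFunc( int x )
{
    return cFunc( x ) + 1;
}

// Stand-in definition (assumption): normally written in C and compiled
// by the C compiler, then linked with the C++ object file.
extern "C" int cFunc( int x )
{
    return 2 * x;
}
```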
The sample program func.cpp demonstrates these principles, together with the use
of the const qualifier and local/global scoping of variables. const signifies a constant
value which must not be changed by the programmer (such as an array limit). The
scope-resolution operator :: may also be used to resolve the conflict that arises
if a function name is the same at a local (function) and global (module) level (not a
desirable practice however).
activity 3.24
Calling C from C++ Linkage
Compile func.cpp and cfunc.c using the batch file cfunc.bat. Step
through the code to verify the features discussed above.
3.5.13 References
A reference in C++ behaves in a manner somewhat akin to a pointer. A reference is
declared as follows:
int x;
int &refX = x; // declare and initialize
where the ampersand (&) indicates a reference or indirection to another variable, much
as a pointer is an indirection to another variable. Using either a pointer or a reference is
termed pass-by-reference. Contrast this to pass-by-value where a copy of the entire
data variable is passed and the original cannot be modified by the called function. The
main difference between a pointer and a reference is that a reference must be initialized
when it is declared.
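The three ways of passing an argument can be compared using the following sketch; the function names are hypothetical and are not taken from ref.cpp.

```cpp
// pass-by-value: the function receives a copy, the caller's variable
// is unchanged
void incrementByValue( int x )
{
    x = x + 1;
}

// pass-by-reference using a pointer: the caller writes &variable
void incrementByPointer( int *pX )
{
    *pX = *pX + 1;
}

// pass-by-reference using a reference: the caller writes the variable
// name only, and no dereferencing is needed inside the function
void incrementByReference( int &refX )
{
    refX = refX + 1;
}
```

Note that the call incrementByReference( a ) looks exactly like a pass-by-value call, so the prototype must be consulted to see that a may be modified.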
activity 3.25
References
Compile and run ref.cpp to illustrate references. Note the performance
advantage in using references, but that the called function may modify
the caller's variable.
activity 3.26
Reference Initialization
Remove the initialization in one of the references in ref.cpp. Then try
to compile the program.
class ListItem
{
public:
ListItem( char *name );
~ListItem();
void LinkItem( ListItem *pNextItem );

void PrintAll( void );
private:
char *itemName;
ListItem *pNext;
};
Figure 3.26 Declaration of an object pointer within a class.
void ListItem::LinkItem( ListItem *pNextItem )

{
ListItem *pCurrItem;
// find end of list (ie item whose next pointer is null)

pCurrItem = this;
while( pCurrItem->pNext )
{
pCurrItem = pCurrItem->pNext;
}
pCurrItem->pNext = pNextItem;
cout << "adding " << pNextItem->itemName ;

cout << " after " << pCurrItem->itemName << "\n";
}
Figure 3.27 Linked list traversal using classes.
3.5.14 Object Pointers
Pointers to objects are often used in C++ in the same way as pointers to structs in
C. The example program objptr.cpp implements a simple linked list using a pointer to
the ListItem object as in Figure 3.26.
The list is thus dynamically created, rather than using a static array. Adding an item to
the list requires traversal of the list as in Figure 3.27.
activity 3.27
Object Pointers
Compile and run objptr.cpp

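// overloaded output function
ostream& operator<<( ostream& output, PhoneNumber& phoneNumber )
{
    // write to the stream that was passed in, so that the same function
    // serves cout and file streams alike
    output << "cout overload: " ;
    output << "(" << phoneNumber.areaCode << ") ";
    output << phoneNumber.localNumber;
    output << endl;
    // return the stream so that << operations may be chained
    return output;
}
Figure 3.28 Overloading the cout operator.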
3.5.15 Stream Input/Output

As mentioned previously, the new object-oriented features of C++ allow for a more
systematic programming style and a reduction in the requirement for dangerous
programming practices: that is, programming constructs which may have side-effects
as they are, or if seemingly unrelated modifications to the code are made. Foremost
amongst the potentially dangerous constructs is the output function printf(). C++
replaces this with the stream called cout. The include file iostream is now required.
Just as the = operator may be overloaded, so the << operator may be overloaded. For
standard output or output to files, the << operator is used. An example of overloading this
operator is given in ionew.cpp, where a data type PhoneNumber is declared. A special
output formatting function for this data type, which separates the area code and the
phone number, is shown in Figure 3.28.
Now the programmer is freed from specific output formatting considerations when using
the PhoneNumber data type:
PhoneNumber myPhone("00", "123456");
cout << myPhone;
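A self-contained sketch of such a data type is given below. The names PhoneNumber, areaCode and localNumber follow Figure 3.28; the use of std::string, the friend declaration, the constructor and the helper phoneToString() are assumptions, and ionew.cpp itself may differ.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class PhoneNumber
{
    // the output function needs access to the private attributes
    friend std::ostream &operator<<( std::ostream &output,
                                     const PhoneNumber &phoneNumber );
public:
    PhoneNumber( const char *area, const char *local )
    {
        areaCode = area;
        localNumber = local;
    }
private:
    std::string areaCode;
    std::string localNumber;
};

// overloaded output function: works for cout, files and string streams
std::ostream &operator<<( std::ostream &output,
                          const PhoneNumber &phoneNumber )
{
    output << "(" << phoneNumber.areaCode << ") "
           << phoneNumber.localNumber;
    return output;
}

// Hypothetical helper: formats into a string rather than onto the
// screen, using exactly the same overloaded operator.
std::string phoneToString( const PhoneNumber &phoneNumber )
{
    std::ostringstream out;
    out << phoneNumber;
    return out.str();
}
```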
activity 3.28
C++ Formatted File I/O
Compile, run and step through the formatting example program ionew.cpp.
File I/O is a special case of screen/keyboard I/O. So the new output stream concept
may equally be applied there. Instead of using fopen() the output stream ofstream is
used as in Figure 3.29.
ofstream *outStream;
outStream = new ofstream( "outfile.txt", ios::out );
// test the state of the stream itself: a failed open does not make
// the pointer null
if( ! *outStream )
{
    cout << "could not open output file" << endl;
    exit(1);
}
Figure 3.29 Opening an output file stream.
The include file fstream is now required. The cout-type operators may now be used as
shown in Figure 3.30. Reading in a text file is similar (Figure 3.31). The complete code
may be found in iofile.cpp.
*outStream << "this is line one" << endl;
*outStream << "this is line two" << endl;
outStream->close();
delete outStream;
Figure 3.30 Writing to an output stream.
activity 3.29
C++ Binary File I/O
Compile, run and step through the file I/O example program iofile.cpp.
ifstream *inStream;
const int MAXLINE = 50;
char lineBuffer[MAXLINE];
inStream = new ifstream( "outfile.txt", ios::in );
// test the state of the stream itself, not the pointer
if( ! *inStream )
{
cout << "could not open input file" << endl;
exit(1);
}
while( ! inStream->eof() )
{
inStream->getline( lineBuffer, MAXLINE );
if( inStream->eof() )
{
cout << "end of file\n";
}
else
{
cout << "read back: " << lineBuffer << endl;
}
}
inStream->close();
delete inStream;
Figure 3.31 A file input stream.
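The same write and read-back sequence can also be written without new and delete, by declaring the stream objects directly. The following is a minimal sketch assuming a standard-conforming compiler; the function name writeAndCountLines() is hypothetical and the code is not taken from iofile.cpp.

```cpp
#include <fstream>
#include <string>

// Writes two lines to the named file, reads them back and returns the
// number of lines read, or -1 if either open fails.
int writeAndCountLines( const char *fileName )
{
    {
        std::ofstream outStream( fileName );
        if( ! outStream )
            return -1;
        outStream << "this is line one" << std::endl;
        outStream << "this is line two" << std::endl;
    }   // outStream goes out of scope here: its destructor closes the file

    std::ifstream inStream( fileName );
    if( ! inStream )
        return -1;

    int lineCount = 0;
    std::string lineBuffer;
    // getline() is tested directly, so the loop ends on end-of-file or
    // on any other read failure
    while( std::getline( inStream, lineBuffer ) )
        lineCount++;
    return lineCount;
}
```

Because the streams are ordinary objects, the destructor mechanism described in Section 3.5.5 releases the files automatically.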

3.6 Module Summary

This module has given an overview of both the C and C++ languages. C++ is a
superset of C with enhancements for the object-oriented programming style.
The important concepts from this module are:
How to construct a skeleton C program.
How C programs are entered, compiled and linked.
A brief introduction to object-oriented principles.
How objects are implemented using C++.
Further Reading
70835 Computer Engineering III: Further References
Module 4
CODING TECHNIQUES
4.1 Module Overview

Some important implementation issues for real-time systems are examined in this mod-
ule. Whilst most coding is done in third-generation languages (typically C), low-level
coding constitutes an important part of performance optimization. The use of C and
assembly language is discussed for this reason, with some tutorial-type examples ex-
plained. Memory allocation, particularly dynamic memory allocation at run-time, often
causes problems at the design stage because the underlying mechanisms are not
properly understood by the programmer/designer. This aspect of system implemen-
tation is also examined here, with an emphasis on learning by examining common
memory usage problems.
Topics covered in this module are:
Error handling.
Assembly language.
Software optimization.
Linking C and assembly code.
Dynamic memory allocation.
Mixed-language programs.
4.2 Introduction
There are three main thrusts to this module: error handling, performance and memory
management. These are aspects of system coding which the designer and implemen-
tor ought to be aware of from the beginning.
The module begins with a brief examination of some methods of error-handling in complex
systems. As many computer systems (embedded systems, servers, and the like)
often have to run unattended, some method of recording system errors, when they
occur, is useful. Although relatively straightforward, it is often a vital aspect of any
system design, especially when things go wrong!
The other aspect which is examined in this module is that of performance optimization.
Attaining the best performance is not simply a worthy goal: oftentimes, real-time
constraints mean that every last bit of performance must be teased out of a system.
More commonly, certain sections of code present bottlenecks to the overall system
performance. These sections must be targeted for enhancement.
Finally, the issue of memory management is examined. This is probably the area which
causes most grief for software engineers. Bugs in this area can be very, very subtle and
difficult to track down. Some of the more common problems in dealing with memory
are presented in order to make the student aware of them immediately, rather than
through painful experience.
4.3 Error Handling
Systems such as network servers and embedded systems must often run without any
supervision. Thus the traditional technique of debugging statements may not be use-
ful. To begin with, there may be no system operator present to examine the error
messages. Error messages may often mean little to a supervisor of a system, but be
invaluable to the software engineer who designed the product. For example, a cryptic
Error # 23 is of little use unless 23 actually has some meaning to whoever sees the
message. Furthermore, many devices such as handheld devices do not have the ca-
pability to output diagnostic codes. Even if they did, non-technical operators will expect
that a reset of the device will cure the problem.
Figure 4.1 shows one method of reporting internal errors. Firstly, note the perror()
standard C library function. This is used after a failed system or library call (such as
fopen() in the example) to print a string appropriate to the last error. The internal
variable errno is used to store this error. Secondly, the compiler may be used to help track down the specific
section of code responsible for the error using the C preprocessor macros __FILE__
and __LINE__, which correspond to the file (module) name and the line number within
the module. Obviously such information is of little use to the end-user, but may save
valuable debugging time.
/* errhand.c
* To illustrate some methods of system error trapping.
* The macros __FILE__ and __LINE__ are defined by the preprocessor.
* Output:
fopen: No such file or directory
Error occurred in module errhand.c on line 17
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
if( ! fopen("AnInvalidFilename", "r") )
{
perror("fopen");
fprintf( stderr, "Error occurred in module %s on line %d\n",
__FILE__, __LINE__ );
exit(1);
}
exit(0); /* normal termination */
}
Figure 4.1 Error handling using compiler built-in macros.

In the Windows NT environment, code such as presented in Figures 4.2 and 4.3 may be
used. Note the use of the system function GetLastError(), which together with
FormatMessage() performs a function similar to perror(). The error-handling routine
has been condensed into one function, doSysErr(), called through the macro
SysErr(), as it is likely to be called many times.
activity 4.1
Error Handling
Compile both versions of error-handling code and check that they work
as expected. Explain why the __FILE__ and __LINE__ symbols are de-
fined in a macro SysErr() rather than in the function body of doSysErr()
itself.
In systems which operate unattended, a log file may be used to record significant system
events. In Unix systems, log files are typically stored under /var/log/. Under Windows
NT, the system log may be accessed via Administrative Tools > Event Viewer.
Simply writing the error message to a file may be satisfactory, but consider the fact
that the system may be running 24 hours a day, 7 days a week. The log files generated
may become quite large. Thus, a circular log file may be implemented, such that the
messages wrap around, with only the most recent messages available (for example, the
most recent 100 messages). New messages overwrite the oldest messages in the
system.
activity 4.2
Log Files
How would you implement a circular log file? (Hint: use fixed-length lines
when writing to the file.)
4.4 Assembly Language

This section discusses a simplified model of the Pentium architecture. Only enough
detail is given to follow the subsequent sections.
Figure 4.4 shows a simplified view of the Pentium family registers. The general purpose
registers ax, bx, cx and dx are 16 bits wide. These are split into 8-bit (1 byte) low and
high halves: al is the lower half of ax, ah is the upper (high) half of ax; ch is the high half
of cx and so forth. These registers may be loaded from each other, and loaded/stored
/* errhand.c
* Windows NT system call error handling
*
* Example output:
Application error: opening source file.

Source file errhand.c Line 40
System error #2, The system cannot find the file specified.
*
* Adapted from:
* Borland C++ Builder Help file/Microsoft NT API
*
* John Leis
* Feb 2000
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DUMB_FILENAME "afilewhichdoesnotexist"
#define SysErr(msg) doSysErr(msg, __FILE__, __LINE__)
void doSysErr(char *msg, char *filename, int linenum);
int main()
{
HANDLE hFile;
DWORD errorCode;
LPVOID lpMsgBuf;
hFile = CreateFile( TEXT(DUMB_FILENAME),

GENERIC_READ,
0, NULL, OPEN_EXISTING, 0, NULL);
if( hFile == INVALID_HANDLE_VALUE )

{
SysErr("opening source file");
exit(1);
}
exit(0);
}
Figure 4.2 Error handling for NT (part 1 of 2).

void doSysErr(char *msg, char *filename, int linenum)

{
DWORD errorCode;
LPVOID lpMsgBuf;
char buf[400], tmpbuf[400];
sprintf(tmpbuf,"Application error: %s. Source file %s Line %d\n",

msg, filename, linenum);
strcpy(buf, tmpbuf);
// get last system error

errorCode = GetLastError();
FormatMessage(
FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM,
NULL, errorCode,
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
(LPTSTR)&lpMsgBuf, 0, NULL );
sprintf(tmpbuf, "System error #%lu, %s\n", errorCode, (char *)lpMsgBuf);
strcat(buf, tmpbuf);
// Free the buffer.

LocalFree( lpMsgBuf );
// Display the string.

fputs(buf, stderr); // never pass message text as the format string
fflush(stderr);
}
Figure 4.3 Error handling for NT (part 2 of 2).
from/to memory, using several addressing modes. The extended versions of these
registers are 32 bits (4 bytes) wide and begin with the prefix e. Thus eax is a 32-bit
register, whose lower 16 bits are ax, whose upper and lower 8 bits in turn are designated
ah and al. The ax register is sometimes called the accumulator, and in certain special
block-move instructions the cx register is referred to as the count register.
Register esp is the extended stack pointer. It is used in the manner of a traditional stack
pointer: when an item is pushed onto the stack, the stack pointer is decremented by
the correct number of bytes. When a value is popped from the stack, the stack pointer
is incremented as appropriate.
The source index (esi) and destination index (edi) registers are used to index memory.
Like the stack pointer, these cannot be split in the way the general-purpose registers
may be split. Certain instructions reference these registers implicitly.
Finally, the base pointer ebp is similar in some ways to a stack pointer. It is used in the
design of high-level languages, when arguments are passed to functions. This will be
examined further in later sections in the context of the assembly-code output from a C
compiler.
The following assumes, for simplicity, a flat memory model. In reality, the protected
mode of the Pentium processor means that hardware-enforced memory bounds check-
ing stops inadvertent or malicious access to memory areas that a program is not meant
to access. This is done via Descriptor Tables (DTs). A discussion of this is an ad-
vanced topic, beyond the scope of this module.
Register groups:
general purpose: eax ebx ecx edx
index: esi edi
stack: esp ebp
General purpose register layout: eax (32 bits) contains ax (16 bits), which is split into ah (8 bits) and al (8 bits)
Source/Destination Index (esi/edi): operand address
Stack Pointer (esp): return address
Base Pointer (ebp): parameter address
Figure 4.4 x86 processor family registers.
4.5 C to Assembler
Most C compilers are able to output assembly code. Some are even able to interleave
the C and assembler codes, so that the block of C code and corresponding assembler
may be easily seen. The Gnu C compilers use the switch -S to produce assembly
output. All of the examples in this chapter have been generated in this way.
The simple function shown in Figure 4.5 will be used to illustrate the conversion of C to
assembler. Referring to this figure, three types of variable have been used:
1. global variables (aLongGlobalVar);
2. local variables (local1, local2), and
3. function parameters (inval1, inval2, and arrptr).
long aLongGlobalVar;
int asmfunc( int inval1, int inval2, long arrptr)
{
int local1, local2;
local1 = 1234;
local2 = 56789;
local1 = inval1;
local2 = inval2;
aLongGlobalVar = 1245678;
return local1;
}
Figure 4.5 C to assembly language: C code.
The annotated assembly code output is shown in Figure 4.6. This was generated using
gcc -S asmeg.c -o asmeg.s
where the suffix .s indicates assembler (some compilers use .asm or some other
extension).
To begin with, note the global declaration for the function using .globl. The function
name, as with global variables, has an underscore prepended. The entry or prolog
code to the function:
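1. Saves the base pointer by pushing it on the stack.
2. Copies the stack pointer into the base pointer, so that the base pointer marks the start of the new stack frame.
3. Adjusts the stack pointer to reserve sufficient space for the local variables.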
In this way, the ebp base pointer points to the base in memory of the local variables.
This is illustrated in Figure 4.7, with the steps numbered 1 through 5.
Following on, the value 1234 is moved into variable local1. This is at offset -4 from
the base pointer, hence the indexed addressing instruction
movl $1234, -4(%ebp)

.globl _asmfunc
_asmfunc: # prolog
pushl %ebp # save base ptr
movl %esp,%ebp # base ptr to stack base
subl $8,%esp # reserve space for 2 local
# variables (4 bytes each)
movl $1234,-4(%ebp) # move into local1 (offset -4)

movl $56789,-8(%ebp) # move into local2 (offset -8)
movl 8(%ebp),%eax # get inval1 (offset +8)

movl %eax,-4(%ebp) # move into local1 (offset -4)
movl 12(%ebp),%eax # get inval2 (offset +12)

movl %eax,-8(%ebp) # move into local2 (offset -8)
movl -4(%ebp),%edx # get local1 (offset -4)

movl %edx,%eax # put return value in eax
movl $1245678,_aLongGlobalVar # move into global variable
leave # epilog
ret
Figure 4.6 C to assembly language: assembly code output.
The l in movl indicates a long value, or a 32-bit integer.
The local variable local2 is stored next on the stack (remember that the stack grows
downwards towards lower memory).
Accessing the function parameters such as inval1 means accessing a higher address
in memory, hence a positive offset.
The global variable is accessed directly, in the data segment of the program. Finally,
the return value is in the eax register. Normally a register is used for return values,
although it may be different on different compilers.
activity 4.3
C to Assembler
Using the example C code and assembler output, draw a diagram sim-
ilar to Figure 4.7 as each instruction is processed. Show the memory
locations of each variable.
High Memory
vars in
ret addr (2)
old base ptr (3) <- base ptr, stack ptr
local vars (5)
Low Memory
Figure 4.7 Stack frame calling/return sequence.
4.6 Calling Assembly Code

Sometimes in time-critical systems it is necessary to speed up certain sections of code
using assembler code. It is more tedious to write assembly code than high-level lan-
guage code, but sometimes it is unavoidable. Furthermore, care must be taken to fit in
with the high-level compiler. The calling conventions (which differ from one language
to another) must be adhered to.
The easiest way to examine a C-callable assembly code function is to first write the
function in C and examine the output from the compiler, and then optimize the assembly
code.
Figure 4.8 shows the C function which calls an assembly-coded function, asmfunc().
Not only integer data items will be used, but also an array, which is passed by reference.
Passing by reference means that the address of the array (arrptr in the figure) is
passed as a parameter. The output of the program is shown in Figure 4.9.
The initial code shown in Figure 4.10 is similar to that examined in the previous exam-
ple. The stack and base pointer registers are adjusted to obtain the stack frame.
In Figure 4.11, a loop (beginning with label fill) is used to access the array. This fills
the array with values 7, 12, 17 and 22. This type of coding (rapid access to an array
of values) is commonly the type of code which must be optimized.
In the second half of Figure 4.11, some more advanced processor operations are
shown. The instruction sequence rep movsb is used to move (mov) a string (s) con-
/* asmeg.c linking assembly functions. see asmfunc.s

* To compile, either gcc asmeg.c asmfunc.s -o asmeg.exe
* or make -f asmeg.mak
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
extern long asmfunc(int inval1, int inval2, long arrptr);
long aLongGlobalVar;
unsigned char srcdata[] = { 200, 201, 202, 203 };
unsigned char destdata[] = { 9, 8, 7, 6};
int main()
{
short val1, val2, retval, i;
long arrptr;
short array[4];
aLongGlobalVar = 87;
val1 = 12345;
val2 = 6789;
arrptr = (long)&array[0];
// call the assembler function

retval = asmfunc( val1, val2, arrptr);
printf("retval=%hd decimal or %hx hex\n", retval, retval);
printf("global var now %ld\n", aLongGlobalVar );
for( i = 0; i < 4; i++)

{
printf("array[%d] = %hd\n", i, array[i]);
}
for(i = 0; i < 4; i++)

{
printf("src[%d]=%d\n", i, (int)srcdata[i]);
}
for(i = 0; i < 4; i++)

{
printf("dest[%d]=%d\n", i, (int)destdata[i]);
}
exit(0);
}
Figure 4.8 Assembly language calling example: C calling code.

retval=9876 decimal or 2694 hex

global var now 91
array[0] = 7
array[1] = 12
array[2] = 17
array[3] = 22
src[0]=200
src[1]=201
src[2]=202
src[3]=203
dest[0]=200
dest[1]=201
dest[2]=202
dest[3]=6
Figure 4.9 Assembly language calling example: output.
sisting of byte values (b). The rep instruction prefix signals that the instruction is to
be repeated the number of times held in the count register (ecx in this 32-bit code).
# asmfunc.s see also asmeg.c

# GnuC assembly function which is called from C
# John Leis
.file "asmfunc.s"
.data
# 0x for hexadecimal
#somedata: .long 0x56781234
somedata: .long 9876
.text
# int asmfunc(int inval1, int inval2, long arrptr);
# note _ underscore required
.globl _asmfunc
_asmfunc:
# prolog
pushl %ebp # save base ptr
movl %esp,%ebp # base ptr to stack base
# store registers on stack

pushl %esi # save source index
pushl %edi # save destination index
# integer argument inval1

# lowest on the stack (offset +8)
movl 8(%ebp), %eax
# integer argument inval2

# higher on the stack (offset +12)
movl 12(%ebp), %eax
# array address arrptr (offset +16)

movl 16(%ebp), %ebx
Figure 4.10 Called assembly code (part 1 of 2).

# put data in the array

movw $4, %cx # count of array elements.
movl $7, %edx # first value to be put in array
fill:
movw %dx, (%ebx) # move 16-bit data into array (elements are short)
addl $2, %ebx # point to next item (short, 2 bytes)
addl $5, %edx # next value to be put in (add 5)
decw %cx # decrement count
jne fill # loop until array filled
# modify the global variable

# must be global (not automatic in caller)
# prefix with underscore
mov _aLongGlobalVar, %eax
add $4, %eax
mov %eax, _aLongGlobalVar
# block copy instruction

# here copies first 3 elements of srcdata to destdata,
# considered as byte arrays
lea _srcdata, %esi # address of source
lea _destdata, %edi # address of destination
cld # increasing address
movl $3, %ecx # repetition count
rep # repeat next instruction
movsb # source->destination, loop on count
# int return value is in eax

# $ for immediate
#movl $0x12345678, %ax
movl somedata, %eax
# pop registers off stack in reverse order

popl %edi # restore destination index
popl %esi # restore source index
# epilog
leave
ret
Figure 4.11 Called assembly code (part 2 of 2).
4.7 Software Optimizations

In the case of hardware performance optimization, the manufacturer has control over
the specific features implemented. When writing software, however, the programmer
has considerable control over the performance optimization on the given hardware
platform. How can this be? As mentioned previously, the vast majority of code is
written in a high-level language.
Of course, the most efficient algorithm must be used for the task (algorithm optimization).
Although this depends almost entirely on the task at hand, some simplifications
are often overlooked. For example, suppose some section of code had to be executed
if the conditions "today is Tuesday" and "today is the first day of the month" were
both true, as follows:
if( dayOfWeek == 2 ) // fairly likely

{
if( dayOfMonth == 1 ) // unlikely but often tested
{
// code executed if both are true
}
}
// continue on ...
Either ordering of the tests produces the same output, but since the second is less
likely, it should be performed first, so as to rule out the subsequent test most of the
time:
if( dayOfMonth == 1 ) // unlikely

{
if( dayOfWeek == 2 )
{
// code executed if both are true
}
}
// continue on ...
If performance profiling shows particular bottlenecks, the programmer has essentially

two options:
1. Re-write specific time-critical sections in assembly language, and/or
2. Use the optimization facilities provided by the compiler.

Re-writing portions of the code in assembly is tedious and error-prone, but a speedup
of the order of two, ten or even one hundred times is not uncommon. This of course
assumes the performance bottleneck has been identified and that the programmer has
sufficient skill to work in the given low-level assembly language.
In the following discussion, the execution time reductions may only be of the order of
one or two instructions in many cases. A one-off reduction of this magnitude is hardly
worth the effort expended. But where code portions appear in loops or nested loops,
the small speedup is leveraged many times over. For example, consider the code to
move a window on the screen from one location to another, as would be encountered
when moving or resizing a window on a PC desktop. Suppose the window is 400
pixels wide by 300 pixels high. Assuming one byte per pixel, the number of bytes
moved is 120 000. Suppose (optimistically) that the movement of each pixel could be
reduced in time by 10 microseconds. The user will observe a reduction of 1.2 seconds
(120 000 × 10 microseconds), which is
certainly noticeable. It can easily be seen that real-time video games, for example, are
ideal candidates for optimization.
The major methods of code optimization which will be examined are:
Constant Folding: Performing calculations at compile-time which reduce to constant values.
Constant Propagation: Using constants directly when they need not be stored.
Strength Reduction: Converting arithmetic operations to faster logical operations.
Dead Code: Elimination of code which is present but whose result is not used.
Use of Registers: Using registers rather than memory to store often-used variables.
Loop Unrolling: Creating sequential code instead of loops.
Use of Pointers: Using address indexes to simplify address calculation.
Loop Invariance: Removing code which is executed within a loop but need not be.
Function Inlining: Duplicating the code for functions to save the call/return overhead.
Data Alignment: Aligning data for the most efficient memory bus access.
A good reference for further details on this topic is [7].
The code shown in Figure 4.12 is used for the following examples.
4.7.1 Constant Folding

The assembled code corresponding to
r = 4*5;
/* optim.c
* For GnuC
* Use
gcc -S -O3 -funroll-loops optim.c -o optim.s
gcc -S -O -finline-functions optim.c -o optim.s
*
* John Leis
*/
short someFunction( short someArg );
inline short inlineFunction( short x )

{
return x+28 ;
}
int main( void )

{
short result, theArg = 34;
//theArg = inlineFunction( theArg );
result = someFunction( theArg );

}
short someFunction( short someArg )

{
short i, k, r;
//register short i, k, r;
r = 4*5;
r = r * 9;
r = r * r;
for( i = 0; i < 4; i++)

{
k = 6;
//r = r + i*someArg;
}
return r;
}
Figure 4.12 C source code fragment for optimization tests.

is reduced to
movw $20, -8(%ebp)
The constant 4*5 has been pre-calculated, with the value 20 inserted into the compiled
code. The multiplication is not done at run-time. Note also that the stack in memory is
used to hold the variable r at offset -8 from the base pointer register ebp.
activity 4.4
Compiler Optimization
Using the GnuC compiler, compile the C code optim.c and
examine the assembly listing optim.s. Use the command line
gcc -S optim.c -o optim.s
Repeat with optimization level 1:
gcc -S -O1 optim.c -o optim.s
4.7.2 Constant Propagation

Consider the call to the function:
short result, theArg = 34;
result = someFunction(theArg);
Without optimization this compiles to:
movw $34, -4(%ebp)

movswl -4(%ebp), %eax
pushl %eax
call _someFunction
The value 34 is stored in local variable theArg, transferred to the ax register and then
pushed on the stack to be transferred to the function someFunction. With optimization
this is reduced to:
pushl $34
call _someFunction
4.7.3 Strength Reduction
This refers to replacing time-consuming operations (typically time-consuming arithmetic
operations) with simpler logical operations. The source section
r = r * 9;
has not been compiled as a multiply operation. Instead it appears in the output listing
as
sall $3, %eax

addl %edx, %eax
This is a shift left by 3 bits (corresponding to a multiply by 8), followed by an addition.
That is, r × 9 = r × (8+1) = (r × 8) + r. Otherwise, an integer multiplication would be required as in the
calculation of
r = r * r;
which compiles into
imulw -8(%ebp), %ax

4.7.4 Dead Code
Dead or unused code is quite often produced during the debugging phase. For example
the assignment
k = 6;
in the example code is obviously redundant, as the value of k is not subsequently used.
Efficient compiler optimization may be able to detect this.
4.7.5 Use of Registers
Storing frequently-used variables in registers rather than in memory (the default) saves
time. For example, the loop section
for( i = 0; i < 4; i++)

{
k = 6;
}
compiles to
movw $0,-4(%ebp)
L3:
cmpw $3,-4(%ebp)
jle L6
jmp L4
L6:
movw $6,-6(%ebp)
L5:
incw -4(%ebp)
jmp L3
The variable i is stored in memory location -4(%ebp). The register qualifier may be
used in front of variables to request the compiler to use registers rather than memory.
Compare the above to using the declaration
register short i, k, r;
which yields assembly code
xorl %edx, %edx

L3:
cmpw $3, %dx
jle L6
jmp L4
L6:
movl $6, %ecx
L5:
incl %edx
jmp L3
L4:
The variable i has been stored in register dx. Note that the register declaration does
not require the compiler to use a register at all: it is only a suggestion. As the number
of registers is limited, there may be none available. Furthermore, optimizing compilers
may implicitly generate code which uses registers in this way by using the optimization
option. This is done using
gcc -S -O1 optim.c -o optim.s
where -O requests optimization, with the number following indicating the level of
optimization to use (0, 1, 2 or 3).
4.7.6 Loop Unrolling
The code loop
for( i = 0; i < 4; i++)

{
k = 6;
}
compiles to
movw $0, -4(%ebp)

L3:
cmpw $3, -4(%ebp)
jle L6
jmp L4
L6:
movw $6, -6(%ebp)
L5:
incw -4(%ebp)
jmp L3
L4:
At each iteration of the loop there is an increment (incw, increment word), test (cmpw,
compare word) and branch (jle, jump if less than or equal to). These instructions
effectively constitute the loop overhead. Small loops may be replaced by direct copies
of the code executed within the loop.
If we now add the line
r = r + i*someArg;
to the loop and compile using
gcc -S -O1 -funroll-loops optim.c -o optim.s
we end up with no loops at all in the function. It becomes as follows:
_someFunction:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
movl %eax, %edx
addl $32400, %edx
movl %eax, %ecx

addl %eax, %ecx
addl %ecx, %edx
addl %eax, %ecx
addl %ecx, %edx
movswl %dx,%eax
leave
ret
activity 4.5
Compilation Checking
The above code has assigned register eax for variable someArg and reg-
ister edx for variable r. Verify that the computation is still correct.
4.7.7 Use of Pointers

Loops are often used to iterate over data stored in an array. In addressing the data
item in the array, an index is used. The address corresponding to the index must be
calculated at run time. The alternative is to step a pointer through the array.
Figure 4.13 illustrates both methods.
Using a construct such as
myClass[studentNumber].age = 0;
forces the compiler to calculate the offset
Memory address of myClass
+ studentNumber × bytes occupied by STUDENT structure
+ offset of age element
on each iteration. The pointer alternative simply increments by a fixed amount (the size
of the STUDENT structure) at each iteration, which is more efficient.
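Figure 4.14 shows some sample code for comparing indexed and pointer-based memory
access for a simple memory block copy. In the sample timings quoted in the activity
below, the pointer-based loop is roughly 7 to 9% faster; the gain depends on the
compiler and the optimization level used.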
activity 4.6
Pointers and Indexing
Compile the code module ptrspeed.c shown in Figure 4.14 and com-
pare the resulting execution times. On faster machines it may be nec-
essary to increase the variable NITERATIONS to force the execution to
take longer and to obtain a more accurate comparison. (Sample figures
on a Pentium 200MHz are indexing = 9.07 seconds and pointers = 8.46
seconds; A Pentium 450MHz gave indexing 3.19 seconds, pointers 2.91
seconds).
/* ptrloop.c
*
* John Leis
*/
typedef struct
{
char initial;
short age;
} STUDENT;
#define NUM_STUDENTS 20
STUDENT myClass[NUM_STUDENTS];
int main( void )

{
}
void someFunction( short someArg )

{
short studentNumber;
STUDENT *pCurrentStudent;
// point to address of first item

pCurrentStudent = &myClass[0];
for( studentNumber = 0;
studentNumber < NUM_STUDENTS; studentNumber++)
{
// using indexes
myClass[studentNumber].age = 0;
// using pointers
pCurrentStudent->age = 0;
pCurrentStudent++ ;
}
}
Figure 4.13 C source code illustrating pointers and indexing.

/* ptrspeed.c - Compare speed of pointer and array indexing

*/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define BLOCK_SIZE 10000L

#define NITERATIONS 10000L
long Block[BLOCK_SIZE];
int main()
{
clock_t tstart, tend;
double telapsed;
short n;
long iter, *ptr1, *ptr2;
// using array indices

tstart = clock();
for( iter = 0; iter < NITERATIONS; iter++)
{
for( n = 0; n < BLOCK_SIZE-1; n++)
{
Block[n] = Block[n+1];
}
}
tend = clock();
telapsed = (double)(tend - tstart)/(double)CLOCKS_PER_SEC;
printf("Indexing: Elapsed time = %lf seconds\n", telapsed);
// using pointers
tstart = clock();
for( iter = 0; iter < NITERATIONS; iter++)
{
ptr1 = &Block[0];
ptr2 = &Block[1];
for( n = 0; n < BLOCK_SIZE-1; n++)
{
*ptr1++ = *ptr2++ ;
}
}
tend = clock();
telapsed = (double)(tend - tstart)/(double)CLOCKS_PER_SEC;
printf("Pointers: Elapsed time = %lf seconds\n", telapsed);
exit(0);
}
Figure 4.14 C source code for pointer/indexing speed test.

4.7.8 Loop Invariance
Loop invariance refers to calculations or memory transfers which are coded inside a
loop, but which may be performed outside a loop. Consider the following example:
for( i = 0; i < 4; i++)

{
s = i + k*j;
}
The value k*j could be pre-calculated and assigned to a single variable prior to the
loop, thus saving time.
4.7.9 Function Inlining
Each function call has entry code (also called the prolog) and exit code (the epilog).
The entry code looks like this:
_someFunction:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
Here the stack frame is set up by storing the local stack pointer (the base pointer) and
reserving space for local variables (here 8 bytes). The exit code restores the stack
pointer and pops the return address:
leave
ret
The leave instruction was introduced in later processors because it is a common
requirement in procedure-oriented code. It effectively performs
movl %ebp, %esp
popl %ebp
Function inlining replaces calls to small functions with verbatim copies of the function's
code, thus removing the overhead of the entry/exit code. The disadvantage is that the
executable code becomes longer.
activity 4.7
Inline Functions
Uncomment the call to inlineFunction in the example optim.c and

examine the assembled code to determine the effect of using/not using
optimization. The command line required for compilation with function
inlining is
gcc -S -O -finline-functions optim.c -o optim.s
4.7.10 Data Alignment
For processors with 16 or 32 bit memory buses, accessing an 8-bit quantity still takes
one memory access. Using a data structure containing a mixture of one or two byte
quantities means that for the most efficient memory usage, the quantities should be
packed together. However this may increase the number of memory accesses re-
quired, as the two-byte quantity may straddle a 16-bit boundary. That is, the lower byte
may be stored in one 16-bit location with the higher byte stored in the next highest
location as shown in Figure 4.15.
Figure 4.16 defines structures ASTUDENT and BSTUDENT with identical elements. A
char is a one-byte quantity, while a short is a two-byte quantity. The structure
ASTUDENT is packed, so that the total storage space is 3 bytes. The structure BSTUDENT
is packed so as to align the data items on word boundaries, thus requiring 4 bytes. One
byte is effectively wasted, but the access time is substantially reduced.
Figure 4.17 shows the corresponding code output for the memory accesses for the
aligned and misaligned structures, produced using the GnuC compiler. Note that the
stack offsets from the ebp register are 3 and 4 for the packed structure, and 6 and 8 for
the word-aligned structure. The instruction movw $88,-3(%ebp) will incur two accesses
to memory.
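[Figure: two columns of 16-bit memory words at address n and address n+1. In the misaligned case a two-byte quantity straddles both words; in the aligned case it lies within one word.]
Figure 4.15 Data alignment in memory.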
activity 4.8
Memory Data Alignment
Using the GnuC compiler, compile the C code align.c and examine the
assembly listing align.s. Use the command line
gcc -S align.c -o align.s
4.8 Memory Allocation

Memory allocation may be done dynamically in C. This is typically done where a
temporary buffer is required for calculation, processing of data records or the like.
For example, a spell checker may need to load a large dictionary of words.
The dictionary could be declared as a single large array, but this has two disadvan-
tages:
1. The amount of free memory available to other applications is reduced by this

amount while the program is running. This reduces the number of concurrent
applications which may be run. If the memory were allocated on an on-demand
basis, the extra memory would only be allocated to the program when needed.
2. The array size would need to be compiled into the program. The array size could
be hard-coded to some maximum value within the program when compiled. This
/* align.c
* For GnuC
* Compile using
gcc -S align.c -o align.s
*
* John Leis
*/
#include <stdio.h>
// with byte alignment

#pragma pack(1)
typedef struct
{
char initial;
short age;
} ASTUDENT;
// with word alignment

#pragma pack(2)
typedef struct
{
char initial;
short age;
} BSTUDENT;
int main()
{
ASTUDENT me;
BSTUDENT you;
me.initial = 23;
me.age = 88;
you.initial = 24;
you.age = 99;
}
Figure 4.16 C source code fragment for data alignment tests.

movb $23,-4(%ebp)
movw $88,-3(%ebp)
movb $24,-8(%ebp)
movw $99,-6(%ebp)
Figure 4.17 Assembly output for data alignment tests.
would have to be a worst-case maximum using a substantial amount of memory

and is thus inefficient. Similarly, the array size could be hard-coded to some more
acceptable value, but it would mean that no more memory could be used for that
task (for example, the spell checker dictionary could never be expanded with new
words).
Dynamic memory allocation is done using malloc() to allocate the required number of
bytes of storage. The memory must be released using the free() call. The memory
allocated is guaranteed to be contiguous (that is, one single block and not several
fragmented blocks). This allows the memory to be used as a conventional array.
Figure 4.18 gives a sample program for testing the malloc() function. Note the corre-
sponding free() call, which releases the memory block back into the pool. A typical
output of this program is shown in Figure 4.19. Note that it will vary from system to sys-
tem, depending on the amount of RAM installed, the number of applications running,
and other factors.
Common programming errors using dynamic allocation are:
1. Not declaring the malloc() and free() functions using the include file stdlib.h.
This may cause a compiler warning about int/long/pointer mismatches which, if
ignored, could incorrectly interpret the size of the arguments to (and return values
from) the library functions.
2. Not allocating the memory; that is, omitting to call malloc() when needed.
3. Not checking the return value from malloc(). If the function returns a NULL
pointer, then it means that there is not a contiguous block of the requested size
available in the system at all.
4. Writing to memory beyond the end of the allocated block. If malloc() is given a
size argument of N bytes then the allowable byte offsets are from 0 to N-1. Locations
after this belong to the operating system (or perhaps another application)
and must not be used. This normally causes a memory protection fault.
5. Not calling free() after the application has finished using the memory. This is
termed a memory leak.
6. Using the memory after free() has released the memory. The block previously
allocated no longer belongs to the application after a call to free().
7. Changing the pointer value returned from malloc() and then passing the changed
value to free(). The original returned value should be kept as free() uses it to
determine the start of the allocated memory block (and indirectly, its length). If
pointer addressing is required within the block, it is necessary to assign the allo-
cated pointer to another pointer which may be altered. See the example following
this.
8. Incorrectly declaring a pointer type for the allocated block or not typecasting the
returned pointer. The former can cause fatal errors; the latter compiler warnings.
See the example following this.
9. Freeing a block which contains a pointer to another memory block. This can
occur if a block of pointers is allocated, with each pointer pointing to another
dynamically allocated block. The order of allocating and freeing the blocks is
vitally important.
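The ordering issue in error 9 can be sketched as follows. This is an illustration only: the function names allocTable() and freeTable() are hypothetical, and the table is simply a block of pointers with each pointer referring to its own dynamically allocated row.

```c
#include <stdlib.h>

/* Allocate a table of nRows pointers, each pointing to its own
 * block of rowBytes bytes. Returns NULL if any allocation fails. */
char **allocTable(size_t nRows, size_t rowBytes)
{
    size_t i;
    char **table = (char **)malloc(nRows * sizeof(char *));

    if( !table )
        return NULL;
    for( i = 0; i < nRows; i++ )
    {
        table[i] = (char *)malloc(rowBytes);
        if( !table[i] )
        {
            /* undo the rows allocated so far, then the table itself */
            while( i > 0 )
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Free the inner blocks first; only then free the block of pointers.
 * Freeing 'table' first would lose the only copies of the row
 * pointers, and the rows could never be released. */
void freeTable(char **table, size_t nRows)
{
    size_t i;

    for( i = 0; i < nRows; i++ )
        free(table[i]);
    free(table);
}
```

The blocks are released in the reverse order of allocation: the block of pointers is allocated first and freed last.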
Arrays of data structures are easily allocated. Using the previous student-record data
structure, Figure 4.20 gives a complete example.
/* malloc.c - To illustrate malloc() function

* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
int main()
{
long nBytes;
char *memPtr;
nBytes = 1024*16;
do
{
memPtr = (char *)malloc(nBytes);
if( ! memPtr )
{
printf("Malloc of %ld failed\n", nBytes );
}
else
{
printf("Malloc of %ld is OK\n", nBytes );
free(memPtr);
}
nBytes += 1024*16;
} while(memPtr);
exit(0);
}
Figure 4.18 Dynamic memory allocation using malloc().
output: (machine with 64M RAM, Win95 running)

Malloc of 16384 is OK
...
Malloc of 16793600 failed
output: (machine with 128M RAM, Windows NT4)

...
[main] malloc.exe 1000 (0) commit_and_inc: VirtualAlloc failed
Malloc of 134234112 failed
Figure 4.19 Output of dynamic memory allocation test.
/* malloceg.c - Example on correct usage of malloc()

*/
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
typedef struct
{
char Name[20+1]; // name - 20 bytes (+zero terminator)
double gpa; // grade-point average
} Student;
int main()
{
short nStudents, currStudent;
Student *pRecords, *pCurrent;
printf("One student takes %lu bytes\n", (unsigned long)sizeof(Student));

nStudents = 10;
// allocate the correct number of bytes

pRecords = (Student *)malloc(nStudents*sizeof(Student));
// check the return value

if( ! pRecords )
{
perror("malloc()");
exit(1);
}
// assign start-of-block to indexing pointer

pCurrent = pRecords;
for( currStudent = 0; currStudent < nStudents; currStudent++ )
{
// Note the -> construct for accessing a structure member
// when the left-hand side is a pointer to a structure.
pCurrent->gpa = 0.0;
// Point to next record. Note that the pointer is

// incremented by the correct number of bytes
// (sizeof(Student)) as determined by the compiler
pCurrent++ ;
printf("pCurrent is address %p\n", (void *)pCurrent);
}
free(pRecords); // free the memory
exit(0);
}
Figure 4.20 Typical use of dynamic memory allocation.

activity 4.9
Memory Allocation
Compile the malloc.c program as it stands and compare the output on

your system to that in the comments in the header. Explain in your own
words what is happening.
activity 4.10
Incorrect Memory Usage: Omitting free()
In malloc.c, comment out the free() call. Then re-compile the program
and examine the output. Can you explain what is happening? This
is called a memory leak.
activity 4.11
Viewing System Memory Allocation
The following requires Windows NT.
Select Start-Programs-Administrative Tools-Performance

Monitor. In the Performance Monitor, select Edit-Add and select
from the various categories via object and counter. Select memory
and other resource categories of interest such as disk and processor
time. Run some applications and note the effect on the resource.
4.9 Memory Copying and Searching

Fast and efficient memory copying is a very important function in virtually every soft-
ware project. For this reason, library functions have been provided which copy bytes
from one location to another. Suppose two buffers are defined as follows:
#define IO_BUFLEN 1024

char src[IO_BUFLEN];
char dest[IO_BUFLEN];
To copy from src (source) to dest (destination), the following code could be used:
int n;
char *pSrc, *pDest;
for( pSrc = src, pDest = dest, n = 0; n < IO_BUFLEN; n++)

{
// copy one byte and advance both pointers
*pDest++ = *pSrc++;
}
However, an equivalent using the library function memcpy() is as follows:
#include <string.h>
memcpy( (void *)dest, (void *)src, (size_t)IO_BUFLEN);
Note that memcpy() takes void-type pointers, thus enabling it to copy any arbitrary
memory block irrespective of the actual data type. Many other library functions ex-
ist for memory access, such as memset() for setting a block of memory to a particular
value and memchr() for searching for a particular character. The string functions such
as strcpy() (copy a string) and strlen() (length of a string) operate on null (zero)-terminated
strings, whereas the mem functions assume nothing about the blocks of
memory supplied.
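The difference between the mem and string functions can be seen in a block that contains zero bytes. The sketch below is illustrative only; countByte() is a hypothetical helper, not a library function. It uses memchr() to count how many times a byte value occurs in a block of a given length, and it keeps searching past any zero bytes.

```c
#include <string.h>

/* Count occurrences of the byte 'value' in a block of 'len' bytes.
 * memchr() is given an explicit length, so zero bytes inside the
 * block do not stop the search. */
size_t countByte(const void *block, size_t len, int value)
{
    const unsigned char *p = (const unsigned char *)block;
    const unsigned char *end = p + len;
    size_t count = 0;

    while( p < end &&
           (p = memchr(p, value, (size_t)(end - p))) != NULL )
    {
        count++;
        p++;        /* resume the search after the match */
    }
    return count;
}
```

For an eight-byte buffer cleared with memset() and then given the character 'a' at offsets 1 and 5, countByte() finds both occurrences, whereas strlen() on the same buffer returns 0 because the first byte is the terminator.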
4.10 Module Summary

This module has introduced a number of issues pertaining to the coding of systems.
The relationship between high- and low-level languages has been discussed.
Topics covered in this module were:
Error handling.
Assembly language.
Software optimization.
Linking C and assembly code.
Dynamic memory allocation.
Further Reading
Module 5
ALGORITHMS
5.1 Module Overview

Knowledge of algorithms is fundamental to understanding real-time and non-real-time
operating systems. Appropriate choice of an algorithm to solve a design sub-problem
often makes the difference between a design which is good and one which is not. The
choice of algorithm also impacts system efficiency to a large degree. This module
is not an introduction to algorithms and their use; this is such a large field that it is
expected that the reader will have covered algorithms and data structures in a separate
course. This module is intended to refresh the reader's knowledge of a few of the
more important algorithms and data structures used in operating systems and real-time
device drivers.
The main topics examined in this module are:
The need for buffering and queue structures.

The differences, and similarities, between various queueing methods.
5.2 Introduction
Many different algorithms are used in the implementation of complex computer soft-
ware systems. Choosing the correct approach for a given problem has many benefits
in terms of ease of implementation and speed of execution. This module presents only
a brief overview of some important concepts. Many standard libraries in C and C++
are available which implement algorithms for various purposes, and often it may be
preferable to use these off-the-shelf solutions rather than wasting project time coding
and debugging from scratch. In order to make the most use of standard libraries it
is necessary to understand the underlying algorithm and how it is likely to perform in
terms of speed of execution, memory usage and so forth. In many circumstances,
possibly because of the uniqueness of a particular problem, it may be necessary to
code an algorithm from scratch.
Operating systems utilize a great many algorithms and data structures internally. Mul-
titasking, for example (considered in a subsequent module) relies on priority queues
to schedule tasks for execution. Simply typing on a keyboard invokes a buffering al-
gorithm, so that the keystrokes are delivered in order when the receiving application is
ready to process them. Network data may need to be queued upon arrival, and pos-
sibly processed out of order. These aspects should be considered when reading the
following.
5.3 Arrays and Buffers

Where is buffering important? To give an example, consider the results shown in Fig-
ure 5.1. This shows the time taken to access the data sequentially in a disk file, using
several forms of system buffering. As seen in the figure, differences of more than an
order of magnitude are possible. The reason why buffering is important in this instance
is that the disk is a physical device and can only spin so fast. An intelligent pre-reading
method, which stores more data blocks from the disk than are actually requested by
the application, can have enormous benefits.
Figure 5.1 Standard I/O file buffering performance. (Two bar charts titled "stdio File Buffering", showing elapsed time in seconds for five cases: default, no buffer, 16, 32K, and no standard lib. One chart is for a FAT filesystem on a 486DX2 66MHz CPU, with a time axis from 0 to 50 seconds; the other is for an NTFS filesystem on a Pentium III 450MHz CPU, with a time axis from 0 to 2 seconds.)
Figure 5.2 shows the initial portion of the program. Note that the file buffers must be
declared static; if they were automatic (within a function) then the buffer space may be
re-allocated, resulting in two sections of code trying to use the same memory space.
Note in Figure 5.2 the use of the clock() system function to time the operation.
Figure 5.3 shows a subsequent section, using first the default library buffering, followed
by a call using no buffer. The C library buffer is disabled by calling
setvbuf(fp, NULL, _IONBF, 0);
Figure 5.4 illustrates the assignment of the static file buffers, which are simply arrays
of bytes (unsigned char data type in C):
setvbuf(fp, &smallbuf[0], _IOFBF, SMALLBUF);

/* filebuf.c - illustrating importance of buffering I/O streams

* From an idea in the Microsoft QuickC help files.
* This tests the standard open()/read()/close() system calls
* as well as the standard library stdio fopen()/fread()/fclose()
* with internal buffering.
* The stdio buffer size is set by setvbuf().
*
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h> // O_ definitions
#include <io.h> // open() read() close()
#include <time.h>
unsigned long calcChecksum(FILE *fp);

unsigned long calcChecksumNolib(int fd);
#define SMALLBUF 16
#define LARGEBUF 32768
// Note that the I/O library buffers *must* be static

// to stay in scope
static char smallbuf[SMALLBUF];
static char largebuf[LARGEBUF];
int main(int argc, char *argv[])

{
FILE *fp; // pointer to buffer structure
int fd; // "handle"
unsigned long checksum;
clock_t startTime, endTime, elapsedTime;
if( argc != 2)
{
printf("Usage: filebuf filename\n");
exit(1);
}
printf("Filename %s \n", argv[1]);
Figure 5.2 Disk buffering part 1 of 5.

// first - default buffering

if( !(fp = fopen(argv[1], "rb")) )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
printf("Default buffering: \n");

startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);
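// second - no buffer
if( !(fp = fopen(argv[1], "rb")) )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
printf("No buffering: \n");

setvbuf(fp, NULL, _IONBF, 0);
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);

// third - small buffer

if( !(fp = fopen(argv[1], "rb")) )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
printf("Buffer size %d: \n", SMALLBUF);

setvbuf(fp, &smallbuf[0], _IOFBF, SMALLBUF);
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);
// fourth - large buffer

if( !(fp = fopen(argv[1], "rb")) )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
printf("Buffer size %d: \n", LARGEBUF);

setvbuf(fp, &largebuf[0], _IOFBF, LARGEBUF);
startTime = clock();
checksum = calcChecksum(fp);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
fclose(fp);

// fifth - no standard library fopen/fgetc/fread

if( (fd = open(argv[1], O_RDONLY | O_BINARY)) < 0 )
{
printf("Could not open %s\n", argv[1]);
exit(2);
}
printf("No standard library: \n");
startTime = clock();
checksum = calcChecksumNolib(fd);
endTime = clock();
elapsedTime = endTime - startTime;
printf("checksum %lx. Elapsed %.2lf\n",
checksum, (double)elapsedTime/(double)CLOCKS_PER_SEC);
close(fd);
exit(0);
}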
Figure 5.5 shows the portion of the test that does not use the standard library buffering.
The buffered routines use fopen(), fgetc() and fclose() to access the files, and
require a FILE structure. The unbuffered portion of the test uses the direct system
calls open(), read() and close().
Figure 5.6 shows the two routines which access the file's contents. For the example,
each calculates a modulo-checksum (the sum of all bytes in the file). Such an
operation is required, for example, in transmitting data frames over a network. The
only difference between the two functions is the file access routine used:
fgetc(), which is a buffered read, or read(), which is a standard system call.
fgetc() requires a pointer to a FILE structure, which encompasses the buffer. The
read() function calls the system file read function directly, and requires an integer file
descriptor.
Manipulating buffers causes many problems. If not thought through carefully, the routines
used for buffer manipulation may slow the overall performance. Figure 5.7 shows
one method of accessing a pool of buffers, using an array of pointers to store the memory
address of the start of each buffer. If the buffers need to be sorted (for example,
a set of student records to be sorted in alphabetical order), the relative position of
the buffers must be changed. One solution is to copy all the bytes comprising each
buffer. This is potentially very slow. A better solution is simply to swap the pointers to
the buffers, as illustrated in Figure 5.7. Variations on this theme are discussed in the
following sections.
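The pointer-swapping approach can be sketched using the standard library function qsort(), which is given the array of pointers rather than the records themselves. The Record type and the function names below are assumptions made for illustration; they do not come from the course code.

```c
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char name[20+1];
    double gpa;
} Record;

/* Comparison function for qsort(). Each argument points to an
 * element of the pointer array, that is, to a (Record *). */
static int compareByName(const void *a, const void *b)
{
    const Record *ra = *(const Record * const *)a;
    const Record *rb = *(const Record * const *)b;

    return strcmp(ra->name, rb->name);
}

/* Sort the array of pointers into alphabetical order of name.
 * Only the pointers are exchanged; no record is copied or moved. */
void sortRecords(Record **ptrs, size_t nRecords)
{
    qsort(ptrs, nRecords, sizeof(Record *), compareByName);
}
```

Each exchange moves only sizeof(Record *) bytes, regardless of how large each record is. After the sort, the records are read in order through the pointer array while remaining at their original addresses.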
unsigned long calcChecksum(FILE *fp)

{
unsigned long cksm, nBytes;
int inch;
// note that the return value of fgetc() is in fact an

// int, to allow it to return EOF
cksm = 0L;
nBytes = 0L;
while(! feof(fp))
{
if( (inch = fgetc(fp)) != EOF )
{
cksm += (unsigned long)inch;
nBytes++ ;
}
}
printf("processed %lu bytes. ", nBytes);
return cksm;
}
unsigned long calcChecksumNolib(int fd)

{
unsigned long cksm, nBytes;
unsigned char inch;
// note that the buffer passed to read() is

// an unsigned char, as expected
cksm = 0L;
nBytes = 0L;
while( read(fd, &inch, 1) > 0 )
{
cksm += (unsigned long)inch;
nBytes++ ;
}
printf("processed %lu bytes. ", nBytes);
return cksm;
}

Figure 5.7 An array of pointers to data structures. (Diagram: a pointer array whose elements point to separate data structures, with two of the pointers being swapped.)
5.4 Circular Buffers
A circular buffer is often useful when data must be processed in order. For example,
keystrokes from a keyboard may arrive at unequal intervals, but must be handed over to
the application by the system kernel when requested. Since memory is a linear concept
(from start to finish), any array must have some structured handling superimposed
upon it so that it may be viewed as being circular.
Figure 5.8 illustrates this concept. In practice, the circular buffer is laid out as shown
in Figure 5.9. Pointers in and out point to the memory location of the next item to be
stored in the buffer, and the next item to be read from the buffer, respectively. If these
are equal, there is nothing stored in the buffer, as shown in A in the figure.
After receiving the characters o, n, and e, the situation is as depicted in part B. The in
pointer is advanced to the next free location. Following that, the characters t, w, and o
are received. These are also stored in the buffer.
If the receiving application is then able to read characters, it reads them in order as
shown in Figure 5.10. The characters o, n and e happen to be read out, and the out
pointer is advanced. Finally, consider the case when the letters t, h, r, e, and e must be
queued in the buffer. Once the in pointer reaches index 9 in the buffer, it wraps around
to index 0. This is no problem, since the byte that was formerly at index 0 has been
read out (of course it will still remain there, but the application has read out the data
item so it may as well be erased).
Figure 5.8 Conceptual diagram of a circular buffer. (Diagram: a ring of storage locations with an in pointer and an out pointer.)
In effect, the linear array has become circular. Of course, if the in pointer wraps
and eventually catches up to the out pointer, valid data will be overwritten. The buffer
must be large enough for this not to happen. The receiving application must have an
average speed at least equal to the average input speed. The buffer is merely absorbing the
bursty nature of the input; hence the term elastic buffer is sometimes used.
This type of buffer (a FIFO, or First-In, First-Out buffer) is relatively straightforward
to implement. Figure 5.11 shows the initial part of a FIFO example program. Note the
type definition for the FIFO data structure.
Figure 5.12 shows the initialization necessary: the input and output pointers are set
to the start of the buffer.
The get and put routines (Figures 5.13 and 5.14) perform the necessary testing for
wrapping of the FIFO.
Figures 5.15, 5.16 and 5.17 show the testing of the FIFO, with the output shown in
Figure 5.18.
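bufidx  0 1 2 3 4 5 6 7 8 9  buflen
A       - - - - - - - - - -  0
B       o n e - - - - - - -  3
C       o n e t w o - - - -  6
Figure 5.9 Circular buffer. (A dash marks a location holding no unread data. The in and out pointers are marked on the original diagram.)
bufidx  0 1 2 3 4 5 6 7 8 9  buflen
D       - - - t w o - - - -  3
E       e - - t w o t h r e  8
Figure 5.10 Circular buffer (continued).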

/* fifo.c
* Simple first-in, first-out buffering.
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#define QUEUE_SIZE 4
typedef struct
{
char queue[QUEUE_SIZE];
char *qin, *qout;
int charsInQueue, inCtr, outCtr;
} FIFO;
static FIFO fifo;
void initFifo();
int fifoPut(int nextChar);
int fifoGet();
Figure 5.11 First-In, First-Out (FIFO) code main test code (part 1 of 4).
// Initialize the fifo data structure

void initFifo()
{
fifo.qin = &fifo.queue[0];
fifo.qout = &fifo.queue[0];
fifo.charsInQueue = 0;
fifo.inCtr = 0;
fifo.outCtr = 0;
}
Figure 5.12 FIFO initialization.

// Retrieve a character from the fifo.

// Returns the character in the lowest byte of the
// integer on success, -1 when no data is in the fifo.
int fifoGet()
{
int nextChar = -1;
if( fifo.charsInQueue == 0)
{
return nextChar; // -1 for nothing in queue
}
nextChar = (int)*fifo.qout++;
if( ++fifo.outCtr == QUEUE_SIZE )
{
fifo.qout = &fifo.queue[0];
fifo.outCtr = 0;
}
--fifo.charsInQueue;
return nextChar;
}
Figure 5.13 Retrieving an item from a FIFO buffer.

// Put a character onto the fifo buffer.

// returns 1 on success, 0 on fail (over-run)
int fifoPut(int nextChar)
{
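// check for over-run -- should never happen
// (note: the count is incremented before the test, so it stays
// too high after an over-run; see the last line of test 3)
if( ++fifo.charsInQueue > QUEUE_SIZE)
{
printf("fifo over-run\n");
return 0;
}
// store the next character

*fifo.qin++ = nextChar;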
// test for circular wrap of the fifo

if( ++fifo.inCtr == QUEUE_SIZE )
{
fifo.qin = &fifo.queue[0];
fifo.inCtr = 0;
}
return 1;
}
Figure 5.14 Putting an item in a FIFO buffer.
int main()
{
int ch, i;
printf("test 1: fewer characters than fifo size:\n");

initFifo();
fifoPut('a');
fifoPut('b');
fifoPut('c');
for( i = 0; i < 3; i++)
{
ch = fifoGet();
if( ch < 0 )
printf("Nothing in fifo!\n");
else
printf("Got '%c'\n", (char)ch);
}
printf("----\n");
Figure 5.15 First-In, First-Out (FIFO) test 1, FIFO not full.

printf("test 2: normal fifo operation with wrapping:\n");

initFifo();
fifoPut('a');
fifoPut('b');
fifoPut('c');
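for( i = 0; i < 3; i++)
{
ch = fifoGet();
if( ch < 0 )
printf("Nothing in fifo!\n");
else
printf("Got '%c'\n", (char)ch);
}
fifoPut('d');
fifoPut('e');
fifoPut('f');
for( i = 0; i < 3; i++)
{
ch = fifoGet();
if( ch < 0 )
printf("Nothing in fifo!\n");
else
printf("Got '%c'\n", (char)ch);
}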
printf("----\n");
Figure 5.16 FIFO test 2, normal operation when the FIFO buffer wraps.
printf("test 3: characters greater than fifo size:\n");

initFifo();
fifoPut('a');
fifoPut('b');
fifoPut('c');
fifoPut('d');
fifoPut('e');
for( i = 0; i < 5; i++)
{
ch = fifoGet();
if( ch < 0 )
printf("Nothing in fifo!\n");
else
printf("Got '%c'\n", (char)ch);
}
printf("----\n");
exit(0);
}
Figure 5.17 FIFO test 3, when FIFO wraps and overflows.
test 1: fewer characters than fifo size:

Got 'a'
Got 'b'
Got 'c'
----
test 2: normal fifo operation with wrapping:
Got 'a'
Got 'b'
Got 'c'
Got 'd'
Got 'e'
Got 'f'
----
test 3: characters greater than fifo size:
fifo over-run
Got 'a'
Got 'b'
Got 'c'
Got 'd'
Got 'a'
----
Figure 5.18 FIFO test output.
activity 5.1
FIFO Character Buffer
Compile the FIFO test program. Verify normal operation of the FIFO,
the situation when the pointers wrap around, and the result when the
fifo.qin pointer attempts to overtake the fifo.qout pointer and data
is lost.
As a final note, consider the dynamic handling of several FIFO buffers. In this code the
FIFO data structure is static; however, several FIFOs could be managed by modifying
each routine to take a pointer to a FIFO and having the initFifo() allocate the buffer
via malloc() (this would also require a destroyFifo() routine).
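One possible shape for such a modification is sketched below. It is not the course's fifo.c: the DynFifo type is hypothetical, array indices are used in place of the qin and qout pointers, and the over-run test is made before the count is changed so that a failed put leaves the FIFO untouched. A buffer size greater than zero is assumed.

```c
#include <stdlib.h>

typedef struct
{
    char *queue;         /* buffer allocated by initFifo() */
    int size;            /* capacity in characters */
    int inIdx, outIdx;   /* next write and next read positions */
    int charsInQueue;
} DynFifo;

/* Returns 1 on success, 0 if the buffer could not be allocated */
int initFifo(DynFifo *f, int size)
{
    f->queue = (char *)malloc((size_t)size);
    if( !f->queue )
        return 0;
    f->size = size;
    f->inIdx = f->outIdx = f->charsInQueue = 0;
    return 1;
}

/* Release the buffer; the FIFO must be re-initialized before re-use */
void destroyFifo(DynFifo *f)
{
    free(f->queue);
    f->queue = NULL;
    f->size = f->charsInQueue = 0;
}

/* Returns 1 on success, 0 on over-run (FIFO left unchanged) */
int fifoPut(DynFifo *f, int nextChar)
{
    if( f->charsInQueue == f->size )
        return 0;
    f->queue[f->inIdx] = (char)nextChar;
    if( ++f->inIdx == f->size )
        f->inIdx = 0;                /* circular wrap */
    f->charsInQueue++;
    return 1;
}

/* Returns the character, or -1 when the FIFO is empty */
int fifoGet(DynFifo *f)
{
    int nextChar;

    if( f->charsInQueue == 0 )
        return -1;
    nextChar = (unsigned char)f->queue[f->outIdx];
    if( ++f->outIdx == f->size )
        f->outIdx = 0;               /* circular wrap */
    f->charsInQueue--;
    return nextChar;
}
```

Because every routine takes a pointer to the FIFO it operates on, any number of independent FIFOs may exist at once, each with its own buffer size.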
5.5 Linked Lists

The linked list is a fundamentally important data structure. The basic idea is depicted
in Figure 5.19. Each item, shown as a block in the figure, contains not only the data for
the particular entry but a link to the next subsequent entry. Typically, this is done via a
C pointer (see footnote 1) as follows:
#define ENTRY struct entry

ENTRY
{
char name[MAX_NAMELEN+1];
ENTRY *next;
};
Thus, each element is threaded into the list as depicted in Figure 5.19 using the next
pointer. In the example shown, each entry is simply a character string, in order to have a
workable example while limiting the complexity. The data structure may be, for example,
a record from a database, a list of processes to be scheduled in an operating system,
or a list of sequential timeouts to be checked in a real-time system.
In operation, the list is traversed by beginning at the root pointer as shown in Fig-
ure 5.19. To find the next item, the next pointer is followed from the current item. The
last item in the list has a next pointer set to a special indicator, normally the NULL
pointer in C (a value of zero).
Consider the case where items A and B exist on the list. Insertion of item C in the list
requires setting the next pointer of the new item to point to the item which was formerly
first on the list (here B), followed by redirecting the root pointer to point to C.
Note that this implies addition at the front of the list, but this need not necessarily
be the case. In addition, dynamic memory allocation must be used to create the new
list entry, rather than static declaration of an array. Code examples will demonstrate
this shortly.
Figure 5.19 A linked list diagram. Item C is to be added. (Diagram: the root pointer, node *first, and items A, B and C joined by next pointers.)
Footnote 1: Languages such as Java, which do not have the pointer type, use other mechanisms.
The linked list is able to implement many queue disciplines (modes of operations). For
example, a First-In, First-Out (FIFO) discipline as discussed in the previous section may
be implemented using a linked list. If each item is a simple character, a structure using
a pointer as outlined above may not be the most memory-efficient design, especially if
the list is to become large. A last-in, first-out discipline may be implemented, which is
similar to the operation of a processor stack. In-order addition to the list may be done,
so that the list is always maintained in a sorted state.
A linked list is thus suitable for applications where the number of items on the list is
highly variable over time. The maximum number of items which may be queued is
limited only by the memory resources of the system.
Figure 5.20 shows the declaration of the list data structure as well as the root pointer.
This code implements three types of addition to a linked list:
1. Addition at the head of the list, making the new item the first on the list.
2. Addition at the tail of the list, making the new item the last on the list.
3. Addition subject to another criterion, in this case that the new item is in the correct
alphabetical sequence with respect to other items already on the list.
Note that the in-order addition could be, for example, in terms of a numeric quantity, a
time stamp, or some other condition.
Figures 5.21, 5.22 and 5.23 show addition in alphabetical order, at the head, and at the
tail, respectively.
/* llist.c - simple demonstration of a singly-linked list.

* Addition routines cater for three types: addition at
* the head of the list, addition at the tail of the list,
* and addition in alphabetical order.
* These routines manage one list only, but could be modified
* to manage several lists by passing a pointer to the list head
* and using dynamic allocation.
*
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAMELEN 40
#define ENTRY struct entry
#define NULL_ENTRY (ENTRY *)NULL
ENTRY
{
char name[MAX_NAMELEN+1];
ENTRY *next;
};
// function prototypes
int main();
void InitList();
void AddEntryToHead(char *NewName);
void AddEntryToTail(char *NewName);
void AddEntryInOrder(char *NewName);
ENTRY *CreateNewEntry(char *NewName);
void PrintList(void);
void FreeList(void);
// pointer to start of list

static ENTRY *ListAnchor;
Figure 5.20 Linked list test code (part 1 of 4).

// main test program

int main()
{
printf("test 1: entries added in alphabetical order:\n");
InitList();
AddEntryInOrder("gamma");
AddEntryInOrder("alpha");
AddEntryInOrder("beta");
PrintList();
FreeList();
printf("----\n");
Figure 5.21 Linked list test 1: add items in alphabetical order (part 2 of 4).
printf("test 2: entries added to the start of the list:\n");

InitList();
AddEntryToHead("gamma");
AddEntryToHead("alpha");
AddEntryToHead("beta");
PrintList();
FreeList();
printf("----\n");
Figure 5.22 Linked list test 2: add items at the head of the list (part 3 of 4).
printf("test 3: entries added at the end of the list:\n");

InitList();
AddEntryToTail("gamma");
AddEntryToTail("alpha");
AddEntryToTail("beta");
PrintList();
FreeList();
printf("----\n");
exit(0);
}
Figure 5.23 Linked list test 3: add items at the tail of the list (part 4 of 4).
void InitList()
{
ListAnchor = NULL_ENTRY;
}
Figure 5.24 Linked list functions to initialize the list.
// AddEntryToHead() - add an entry to the linked-list structure

// at the head of the list.
void AddEntryToHead(char *NewName)
{
ENTRY *NewEntry;
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}
// list has one or more entries

NewEntry->next = ListAnchor;
ListAnchor = NewEntry;
}
Figure 5.25 Linked list function to add to the head of the list.
Initialization of the list is quite simple, as shown in Figure 5.24. There is only one root
pointer, which is set to NULL to indicate that the list is empty.
Adding to the head of the list is shown in Figure 5.25. This was previously shown
diagrammatically. Adding to the head of a list is evidently straightforward. Adding to
the tail of the list as shown in Figure 5.26 is marginally more complicated, as it requires
traversal of the list. In practice, if addition at the end of the list is often performed, it
may be more efficient to store a static pointer to the last item.
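A minimal sketch of the last-item pointer idea follows. It departs from llist.c in that the head and tail pointers are kept together in a small structure (here called List, a hypothetical name) rather than in static variables, and the function names are likewise illustrative.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_NAMELEN 40

typedef struct node
{
    char name[MAX_NAMELEN+1];
    struct node *next;
} Node;

typedef struct
{
    Node *head;   /* first item, or NULL when the list is empty */
    Node *tail;   /* last item, or NULL when the list is empty */
} List;

/* Append to the end of the list without traversing it.
 * Returns 1 on success, 0 if memory could not be allocated. */
int appendToTail(List *list, const char *newName)
{
    Node *n = (Node *)malloc(sizeof(Node));

    if( !n )
        return 0;
    strncpy(n->name, newName, MAX_NAMELEN);
    n->name[MAX_NAMELEN] = '\0';
    n->next = NULL;

    if( list->tail == NULL )
        list->head = n;           /* list was empty */
    else
        list->tail->next = n;     /* link after the old last item */
    list->tail = n;
    return 1;
}

/* Free every item and leave the list empty */
void freeAll(List *list)
{
    Node *current = list->head, *save;

    while( current != NULL )
    {
        save = current->next;
        free(current);
        current = save;
    }
    list->head = list->tail = NULL;
}
```

The cost of an append no longer depends on the length of the list. The price is one extra pointer, which must be kept up to date by every routine that adds or removes the last item.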
Adding in order (Figure 5.27) is more complicated. Firstly, the ordering criterion (ascending
numerical, descending numerical, alphabetic, etc.) must be chosen and the list searched
for the correct position. For numerical quantities, a simple comparison may suffice. For
more complex criteria, a function such as the library function strcmp() must be used. Note
in the figure that double indirection is used: a pointer to a pointer (variable ENTRY
**ppCurrent). This is because not only must the subsequent list item be pointed to by the
new entry, the previous item must also point to the new item. As the list is traversed, the
current item has no knowledge of the previous item, and therefore the pointer one stage
back must be known.
Figure 5.28 shows the output of the linked-list test examples. The memory addresses of
the list items are shown in order to help distinguish each item as it is allocated.
// AddEntryToTail() - add an entry to the tail of the linked list

void AddEntryToTail(char *NewName)
{
ENTRY *NewEntry;
ENTRY *pCurrent;
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}
// list has one or more entries, so find the last

for( pCurrent = ListAnchor;
pCurrent->next != NULL_ENTRY; pCurrent = pCurrent->next)
{
}
pCurrent->next = NewEntry;
}
Figure 5.26 Linked list function to add to the tail of the list.
activity 5.2
Linked List Addition
Work through each of the list insertion C functions, using the example
output. Draw a diagram of each scenario, showing the items on the list
and the pointers.
Figures 5.29, 5.30 and 5.31 show helper functions to create a new item, print the
list, and free the list (since it is dynamically created). To re-use the list, it must be
re-initialized. Specifically, the root pointer must be set to NULL.
// AddEntryInOrder() - add an entry to the linked-list structure

// in ascending alphabetical order.
// Note that if addition to the list was to be unordered,
// the routine would be considerably simpler.
void AddEntryInOrder(char *NewName)
{
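ENTRY **ppCurrent;
ENTRY *NewEntry;
NewEntry = CreateNewEntry(NewName);
if( ListAnchor == NULL_ENTRY )
{
// list is currently empty
ListAnchor = NewEntry;
return;
}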
for( ppCurrent = &ListAnchor;

(*ppCurrent != NULL_ENTRY);
ppCurrent = &((*ppCurrent)->next) )
{
if( strcmp(NewName, (*ppCurrent)->name) < 0)
{
// found place - insert item in list
NewEntry->next = *ppCurrent;
*ppCurrent = NewEntry;
return;
}
}
// reached end of list and no place found
// add entry at the end
*ppCurrent = NewEntry;
}
Figure 5.27 Linked list function to add to the list in order.

* Example output, adding the strings "gamma", "alpha", "beta":

test 1: entries added in alphabetical order:
Allocated a new item at address 0xa030c30
Allocated a new item at address 0xa031098
Allocated a new item at address 0xa0310d0
Here is the list:
item 0: name='alpha'
item 1: name='beta'
item 2: name='gamma'
Freeing the item at address 0xa031098
Freeing the item at address 0xa0310d0
Freeing the item at address 0xa030c30
----
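test 2: entries added to the start of the list:
...
Here is the list:
item 0: name='beta'
item 1: name='alpha'
item 2: name='gamma'
...
----
test 3: entries added at the end of the list:
...
Here is the list:
item 0: name='gamma'
item 1: name='alpha'
item 2: name='beta'
...
----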
Figure 5.28 Linked list test output.
// CreateNewEntry() - create a new entry. Allocates

// space for the item and sets 'next' pointer to NULL
ENTRY *CreateNewEntry(char *NewName)
{
ENTRY *NewEntry;
if( (NewEntry = (ENTRY *)malloc(sizeof(ENTRY))) == NULL_ENTRY)

{
printf("Cannot allocate memory for list entry!\n");
exit(1);
}
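printf("Allocated a new item at address %p\n", (void *)NewEntry);
// copy at most MAX_NAMELEN characters and always terminate
strncpy( NewEntry->name, NewName, MAX_NAMELEN);
NewEntry->name[MAX_NAMELEN] = '\0';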

NewEntry->next = NULL_ENTRY;
return NewEntry;
}
Figure 5.29 Linked list function to create a new item.
// PrintList() - print the list out

void PrintList()
{
ENTRY *Current;
int ItemID;
printf("Here is the list:\n");
for( Current = ListAnchor, ItemID = 0;

Current != NULL_ENTRY;
Current = Current->next, ItemID++)
{
printf("item %d: name='%s'\n", ItemID, Current->name);
}
}
Figure 5.30 Linked list function to traverse the list and print items.
// FreeList() - free all the allocated storage

void FreeList()
{
ENTRY *Current, *Save;
for( Current = ListAnchor; Current != NULL_ENTRY; )

{
Save = Current->next;
printf("Freeing the item at address %p\n", Current);
free( Current );
Current = Save;
}
}
Figure 5.31 Linked list function for freeing the list.
activity 5.3
Linked List Deletion
Sometimes it is necessary to delete not the entire list, but a single item
in the list. Write a function to do this, given a string as the item to be
deleted.
In some applications, a doubly-linked list is preferable. As well as having a next pointer,

a prev pointer is used to point to the previous item. This requires a little more storage
space (for the pointer) and slightly more complicated coding.
activity 5.4
Doubly Linked List
Draw a diagram of a doubly-linked list. How would this structure help

insertion into, and deletion from, the list?
Finally, some applications of linked lists are mentioned. Multitasking operating systems
use linked lists (or a variation thereof) to maintain the ordered lists of tasks to be run.
Priority queues, with each list corresponding to tasks at a certain scheduling priority,
may be used to quickly locate the next task to be scheduled. The older DOS FAT (File
Allocation Table) filesystem (still used on floppy disks) uses a linked-list variation to
locate the sectors on a disk corresponding to each file.
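The per-priority task lists described above can be sketched as follows. This is a minimal illustration only, not the scheduler of any particular operating system. The names TASK, EnqueueTask() and NextTask(), and the choice of four priority levels with 0 as the highest, are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_PRIORITIES 4        // assumed; 0 is the highest priority

typedef struct task
{
    int id;                     // hypothetical task identifier
    struct task *next;
} TASK;

// one singly linked FIFO list per priority level
static TASK *QueueHead[NUM_PRIORITIES];
static TASK *QueueTail[NUM_PRIORITIES];

// EnqueueTask() - append a task to the list for its priority.
void EnqueueTask(int id, int priority)
{
    TASK *t;

    assert(priority >= 0 && priority < NUM_PRIORITIES);
    if( (t = malloc(sizeof(TASK))) == NULL )
    {
        printf("Cannot allocate memory for task.\n");
        exit(1);
    }
    t->id = id;
    t->next = NULL;
    if( QueueTail[priority] == NULL )
        QueueHead[priority] = t;        // list was empty
    else
        QueueTail[priority]->next = t;
    QueueTail[priority] = t;
}

// NextTask() - remove and return the id of the first task on the
// highest-priority non-empty list, or -1 if nothing is ready.
int NextTask(void)
{
    int p, id;
    TASK *t;

    for( p = 0; p < NUM_PRIORITIES; p++)
    {
        if( (t = QueueHead[p]) != NULL )
        {
            QueueHead[p] = t->next;
            if( QueueHead[p] == NULL )
                QueueTail[p] = NULL;
            id = t->id;
            free(t);
            return id;
        }
    }
    return -1;
}
```

For example, after EnqueueTask(10, 2), EnqueueTask(11, 0) and EnqueueTask(12, 2), successive calls to NextTask() return 11, 10 and 12, and then -1. Locating the next task costs at most one test per priority level, regardless of how many tasks are queued.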
5.6 Binary Trees
Another important data structure is the binary tree (or btree). Figure 5.32 shows the
fundamental idea: instead of a linear list, each item forms a node in a tree, containing
not only the data but also left and right pointers (consider the diagram rotated so the
tree grows downwards). Provided the tree is reasonably balanced, this type of data
structure is able to locate items in O(log N) time, as opposed to O(N) or linear time
for a linked list (see footnote 2).
The fundamental idea is that items are added in order, where order is defined by the
task at hand. Consider the tree of Figure 5.32. Beginning at the root node and consid-
ering each item in turn, if the item to be added is less than the current item, the lower
(or left) pointer (branch) is accessed, which becomes the current node. Conversely, if
the new item is greater in the defined ordering, the right branch is accessed. Travers-
ing the tree to retrieve all the items requires recursively descending the left branch until
a null branch is encountered. This means that a leaf node has been encountered.
The right branch then becomes the new starting point. All left branches from this point
are followed. The code examples shortly will help clarify this.
Figure 5.32 Binary tree.
Figure 5.33 shows the output of the test program.
Figure 5.34 shows a test program for the binary tree implementation, which will be
examined below.
Footnote 2: O() means order.
* Output with names "gamma", "beta", "alpha" entered:

Enter a name: gamma
Enter a name: beta
Enter a name: alpha
Enter a name:
name:'alpha'
name:'beta'
name:'gamma'
Figure 5.33 Binary tree test output.
Figure 5.35 shows addition to the tree. The decision as to whether to access the left
or right branch at each node depends on the value of the new item with respect to the
current item in the tree. When a decision is made to go left or right, that branch is
checked to see if it exists.
Creating a new node is done in the helper function shown in Figure 5.36. This simply
allocates space for the new node and initializes its value to that requested. The left/right
pointers are set to null.
The complexity of adding to the tree is offset to some degree by the relative simplicity
of accessing the tree in order. This is done using the recursive function shown in
Figure 5.37, which follows the left pointer as far as possible. When the leaf node is
encountered, the right node is taken as a new starting point and the tree descended.
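Locating a single item follows the same left/right decisions as insertion. The sketch below is an illustration and not part of the bintree.c listing: the type TNODE and the functions InsertName() and FindName() are hypothetical names, and a trimmed node layout (no parent pointer) is assumed. The number of comparisons is proportional to the depth of the tree, which is about log2(N) only when the tree is reasonably balanced.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 20             // assumed maximum name length

typedef struct tnode
{
    char name[NAME_LEN + 1];
    struct tnode *left, *right;
} TNODE;

// InsertName() - add a name below 'root'; returns the (possibly new) root.
TNODE *InsertName(TNODE *root, const char *name)
{
    assert(name != NULL);
    if( root == NULL )
    {
        TNODE *n = malloc(sizeof(TNODE));
        if( n == NULL )
        {
            printf("Cannot allocate memory for new node.\n");
            exit(1);
        }
        strncpy(n->name, name, NAME_LEN);
        n->name[NAME_LEN] = '\0';       // strncpy() may not terminate
        n->left = n->right = NULL;
        return n;
    }
    if( strcmp(name, root->name) < 0 )
        root->left = InsertName(root->left, name);
    else
        root->right = InsertName(root->right, name);
    return root;
}

// FindName() - return the node holding 'name', or NULL if absent.
// Each comparison discards one whole subtree.
TNODE *FindName(TNODE *root, const char *name)
{
    assert(name != NULL);
    while( root != NULL )
    {
        int cmp = strcmp(name, root->name);
        if( cmp == 0 )
            return root;
        root = (cmp < 0) ? root->left : root->right;
    }
    return NULL;
}
```

If names are inserted in already-sorted order, every node has only a right child and the search degrades to the linear time of a linked list.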
activity 5.5
Binary Trees
By drawing a diagram similar to that shown for a binary tree, show each
stage of the addition of strings to the tree. Once the tree is populated
with several strings, show how the tree is recursively traversed in-order
and that this results in the correct alphabetical sequence being printed.
/* bintree.c - Binary Tree example for sorting alphabetically.

* Currently the list head is a static pointer.
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAMELEN 20
#define NODE struct node
#define NULL_NODE_PTR (NODE *)NULL
NODE
{
char name[MAX_NAMELEN+1]; // the string stored at this node
NODE *Parent;
NODE *LeftChild, *RightChild;
};
static NODE *RootPtr = NULL_NODE_PTR;
int main();
void AddNode( char *NewName );
NODE *CreateChildNode( NODE *Parent, char *name);
void DescendTree( NODE *StartNode );
int main()
{
char NameBuffer[MAX_NAMELEN+1];
while(1)
{
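printf("Enter a name:");
fflush(stdout);
// fgets() limits the read to the buffer size; gets() cannot
if( fgets(NameBuffer, sizeof(NameBuffer), stdin) == NULL )
break; // end of input
// remove the trailing newline kept by fgets()
NameBuffer[strcspn(NameBuffer, "\n")] = '\0';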
if( NameBuffer[0] != '\0')
AddNode(NameBuffer);
else
break;
}
DescendTree(RootPtr);
exit(0);
}
Figure 5.34 Binary tree main() calling code.

// AddNode() - add an item to the binary tree.

// The item is added to the correct leaf of the
// tree for the alphabetical sort.
void AddNode(char *NewName)
{
NODE *Current;
if( (Current = RootPtr) == NULL_NODE_PTR)

{
// nothing in tree yet
RootPtr = CreateChildNode( RootPtr, NewName);
return ;
}
while(1)
{
if( strcmp(NewName, Current->name) < 0)
{
// go left
if( Current->LeftChild == NULL_NODE_PTR )
{
Current->LeftChild =
CreateChildNode(Current, NewName);
return;
}
else
{
Current = Current->LeftChild;
}
}
else
{
// go right
if( Current->RightChild == NULL_NODE_PTR )
{
Current->RightChild =
CreateChildNode(Current, NewName);
return;
}
else
{
Current = Current->RightChild;
}
}
}
}
Figure 5.35 Binary tree function to add a node (calls CreateChildNode()).

// CreateChildNode() - create a child of the nominated parent.

// Returns the address of the new child.
NODE *CreateChildNode( NODE *Parent, char *NewName)
{
NODE *NewNodePtr;
if( (NewNodePtr = (NODE *)malloc(sizeof(NODE)))

== NULL_NODE_PTR )
{
printf("Cannot allocate memory for new node.\n");
exit(1);
}
strcpy( NewNodePtr->name, NewName);
NewNodePtr->Parent = Parent;
NewNodePtr->LeftChild = NewNodePtr->RightChild = NULL_NODE_PTR;
return NewNodePtr;
}
Figure 5.36 Binary tree helper function for AddNode().
// DescendTree() - go down the tree and print out items

void DescendTree(NODE *StartNode)
{
// nothing to print if the tree is empty
if( StartNode == NULL_NODE_PTR )
{
return;
}
if( StartNode->LeftChild != NULL_NODE_PTR )
{
// left branch is there (not null) - so go down it.
DescendTree(StartNode->LeftChild);
}
// Returned from descent of this branch

// Print name of visited node.
printf("name:'%s'\n", StartNode->name);
// now go to right node, if it exists

if( StartNode->RightChild != NULL_NODE_PTR )
{
DescendTree(StartNode->RightChild);
}
}
Figure 5.37 Binary tree function to descend the tree.

5.7 Module Summary

The importance of algorithms in real-time systems was examined in this module. The
main topics examined in this module were:
The need for buffering and queue structures.
The differences, and similarities, between various queueing methods.
Algorithm performance considerations.
A number of other issues, such as searching algorithms and text parsing, have not
been covered. The references give some starting points for investigating these as the
need arises.
Further Reading
Module 6
MULTITASKING
6.1 Module Overview

Multitasking is a very fundamental concept to the software engineer dealing with oper-
ating systems and real-time systems. After all, how can a single CPU on your desktop
PC appear to be running several programs simultaneously? In embedded and control
systems, a great number of tasks must be attended to, and usually some priority order
must be attached to these. Checking the turn indicator on a car to see if it must be
flashed, for example, is less time-critical, but no less important, than pulsing the fuel
injectors and firing the spark plugs at the right instant. Operating systems use pro-
cesses, also termed tasks, to allow several jobs to be performed simultaneously. Of
course, they only appear to be simultaneous, as a single CPU can only handle one set
of instructions at a time. The secret is to share the CPU amongst all the tasks which
must be performed, with regard to the appropriate amount of time each requires and
whether there are any more urgent matters pending. Although Unix and Windows NT
are not, in the strictest sense, real-time operating systems, they will be used in this and
the two subsequent modules to demonstrate the concepts of multitasking. The main
concepts introduced here are:
Processes, and
Threads.
These concepts are introduced by way of short example code segments for both Unix
and Windows NT.
6.2 Multitasking
The process (task) concept is the idea of subdividing the CPU's time into smaller
timeslices in order to do a small portion of each job when possible. The complete job will
likely not be finished in one timeslice. Threads, or strictly threads of execution, are
a newer variation, and may be termed (for now) processes-within-processes. After
all, there is only one CPU (normally), and there may be dozens of processes running
at any given instant. If a particular process has no work to do, perhaps because it is
waiting for some data to arrive across a network or is waiting for a user keystroke, then
the task may be temporarily suspended.
The concept of multitasking or multiprocessing is a critical one in all operating systems.

Real-time systems have their own extensions to handle real-time situations and allow
for fast exchange of information between various tasks. Figure 6.1 shows a system
with three tasks. The scheduling is performed in a round-robin fashion, with each
process given a time in which to execute. When this time is up, or the process requests
resources which are not immediately available, the process blocks (is not scheduled)
until the resources become available. This is indicated by the loopback arrows in the
figure.
[Diagram: task 1, task 2 and task 3, with loopback transitions labelled "blocked" and "sleep".]
Figure 6.1 Conceptual view of tasks and transitions.
Note, however, that the process does not actually execute in a loop, polling the
resource. This would be very inefficient. Instead, the task is blocked; examples of this
will be seen throughout this module.
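As a small illustration of blocking rather than polling, the sketch below assumes a POSIX system and uses a pipe as the "resource". The function name BlockingReadDemo() is hypothetical. The parent makes a single read() call; the operating system suspends it until the child writes a byte about one second later, so no CPU time is spent testing for data.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// BlockingReadDemo() - the parent blocks in read() until the child
// writes one byte. Returns that byte, or -1 on failure.
int BlockingReadDemo(void)
{
    int fd[2];
    char ch = 0;
    pid_t pid;
    ssize_t n;

    if( pipe(fd) != 0 )
        return -1;
    fflush(stdout);     // avoid duplicating buffered output in the child
    pid = fork();
    if( pid == -1 )
    {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if( pid == 0 )
    {
        // child: the "resource" becomes available after one second
        close(fd[0]);
        sleep(1);
        if( write(fd[1], "x", 1) != 1 )
            _exit(1);
        _exit(0);
    }
    // parent: no polling loop; read() suspends the process until
    // data arrives, leaving the CPU free for other tasks
    close(fd[1]);
    n = read(fd[0], &ch, 1);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (n == 1) ? ch : -1;
}
```

The call BlockingReadDemo() returns the character 'x' once the child has written it.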
6.2.1 Processes
A task or process is a separate program which is run concurrently with other tasks.
The terms task and process are in common use, and are used interchangeably here.
Each process is scheduled, normally in a time-sliced fashion and subject to the avail-
ability of the CPU, by the operating system. The scheduling is modified depending
upon the priority of the task and any resources the task has requested. Each task has
its own separate data segment, code segment, open files and other resources. Nor-
mally each task is constrained to access memory strictly within its own allocated areas.
However, as the following two modules will show, tasks may co-operate to perform the
overall operations required of the system and may also require synchronization with
one another.
In order to execute a process, the operating system must:
1. Read the executable code from disk and determine the initial memory resource
requirements.
2. Allocate space for the process, typically in conjunction with the memory manage-
ment subsystem of the operating system kernel.
3. Load the code into memory and perform appropriate initializations.

4. Place the process on the appropriate scheduling queue.
While the process is running, the operating system must:
1. Schedule the process at appropriate time intervals.

2. Suspend the process when the allocated timeslice expires.
3. Suspend and reschedule the process when resources are not available and later
become available.
4. Provide resources to the process as requested, such as memory.
5. Disallow the process from accessing memory owned by the kernel or other pro-
cesses. Similarly, the process is protected from other errant processes currently
executing. Normally this requires additional functionality provided by the Memory
Management Unit hardware.
As can be seen, the act of starting a process involves considerable overhead. When
this is infrequent, such as when a user application starts, the overhead is necessary
and unavoidable. However when a given system requires processes to be started
and stopped at regular intervals, the overhead becomes significant, and is a drain on
the overall system. An example of this is a web page server, where each request
is handled by a separate process. Examples of starting, suspending, and stopping a
process are given in the first sections of the module.
Suspending a process is also called blocking when the process is waiting on a resource
or event; for example, waiting for a network connection to be established so that data
may be transferred. While the resource is unavailable, the scheduling algorithm can
make more efficient use of the CPU by blocking the requesting task and scheduling
another. A process may voluntarily suspend itself by means of the Unix sleep(seconds)
or Windows Sleep(milliseconds) system call.
6.2.2 Threads
Conceptually similar to a process is a thread. A thread is a lightweight process. A
given process may own one or more threads of execution. Implementation of threads
in operating systems is comparatively new; many Unix flavors contain the POSIX
threads library, and Windows NT supports threads. Unlike separate processes, which
are loaded on demand, threads are loaded with the process itself and not on demand.
In addition, threads are able to access the memory space of their parent process di-
rectly. This makes for much faster startup of a thread, but means special care must
be taken in protecting areas of memory which are accessed by more than one thread.
Examples of this are given later in the module.
Both processes and threads are discussed in this module, for both Unix (POSIX)
and Windows NT. Code in this section is written by the author, derived principally
from [17], [18] and [19].
6.2.3 Examining System Processes

Figure 6.2 shows the applications running on a Windows NT workstation. Some of the
application names may be familiar. Figure 6.3 shows the processes (tasks) on the
same system; note that each application may create (fork) several child processes
to perform various functions. In addition, there are a number of system processes
which do not correspond directly to applications. Figure 6.4 shows the time history of
the system; note the extra resource burden when a new application was started.
Figure 6.2 Windows NT applications running on a system.
activity 6.1
Examining the processes running on a system.
In Windows NT, control-alt-delete may be used to bring up a system

dialog box. Press Task Manager to examine the tasks on your system.
Under Unix, ps may be used to show the current login processes; ps -A
shows all processes.
Figure 6.3 Windows NT tasks (processes) running on a system.
Figure 6.4 The Windows NT performance monitor.

6.3 Multitasking Examples

This section shows how to create processes under Unix first, then under Windows NT.
6.3.1 Processes under Unix

Figure 6.5 shows a simple process for use with the following examples. It is termed
a child process because it is intended to be created (forked) by another process,
termed the parent. When the process runs, it will be given a unique Process Identifier
(PID) by the system. A C program always begins with the main() function, which
appears like every other function. However, since it is called not by another function
in the process but by the operating system itself, it is given special arguments. These
are:
1. The number of command-line arguments, an integer variable usually called argc

(argument count) by convention.
2. The command-line arguments themselves, a pointer to an array of pointers to

null-terminated strings. This variable is usually called argv (argument vector) by
convention.
In Figure 6.5, the command-line arguments are printed out. The following exercise
illustrates these. Note the use of the fflush() library call. This is needed because
standard output is, by default, buffered, which means that the output of a program may
not immediately appear on the screen; when it appears depends on the system
scheduling and the buffer sizes. In later exercises, several processes will be forked
concurrently, and the system output may be confusing unless each process's output is
flushed immediately. Note the use of the calls sleep(), which temporarily suspends the
process, and exit(), which returns an exit status code to the operating system.
/* child.c - a simple child process to be run in multiple contexts

* Used with forke.c
*
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // getpid(), sleep() (POSIX)
int main(int argc, char *argv[])
{
int argn;
printf("Child process here with PID %d\n", (int)getpid() );

for( argn = 0; argn < argc; argn++)
{
printf("Child: arg %d = '%s'\n", argn, argv[argn]);
}
printf("Child sleeping...");
// Without an fflush of the standard output stream,

// the output strings above are buffered and not
// seen immediately
fflush(stdout);
sleep(2);
printf("Child exiting.\n");
fflush(stdout);
// normally an exit code of 0 signifies success,

// any other signifies failure.
exit(5);
}
Figure 6.5 A simple child process.

activity 6.2
A simple child process.
Compile and run the above example (either Windows NT command-

line or Unix). Add various command-line options and verify the result.
Change the sleep argument to 100 seconds. Under Unix:
1. Start the process in the background using

child argone argtwo &
2. Find the PID using ps -a | grep child
3. Kill the process using kill -SIGKILL <pid>, where <pid> is the
PID from ps.
Under NT:
1. Start the process in the background using

start child argone argtwo
2. Invoke the task manager using control-alt-delete and then se-
lecting the task manager.
3. Kill the process by selecting it from the task list and using the
End Process button.
Note that under Windows the alt-tab sequence may be used to step
through the task list.
Figure 6.6 shows the fork() system call, which is used in Unix to create a copy of the
current process. fork is interesting, in that it appears as a function that returns in two
places! If the call returns 0, the execution is in fact resuming in the child process. If the
call returns some other positive value, the execution has resumed in the context of the
parent process. The value returned is the PID of the child process.
Creating a child process as a duplicate of the current process is often not what is
required. The exec() family of calls takes the current process and overlays it with
a new process. Figure 6.7 shows this. As shown, the parent waits for the child to
terminate using the wait() system call. This is not mandatory; the child process may
continue executing after the parent has finished. Figure 6.8 shows a sample output
from these test programs.
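The status value returned by wait() packs several fields together, so POSIX provides the macros WIFEXITED() and WEXITSTATUS() to decode it portably rather than shifting bits by hand. The sketch below assumes a POSIX system. The function name RunChildAndWait() is hypothetical, and the child simply exits with a chosen code instead of overlaying another program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// RunChildAndWait() - fork a child that exits with 'code', then wait
// for it. Returns the child's exit code, or -1 on any failure.
int RunChildAndWait(int code)
{
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);   // only the low 8 bits are reported
    fflush(stdout);     // avoid duplicating buffered output in the child
    pid = fork();
    if( pid == -1 )
    {
        perror("fork() failed");
        return -1;
    }
    if( pid == 0 )
    {
        // child: terminate immediately with the requested code
        _exit(code);
    }
    // parent: wait for this particular child
    if( waitpid(pid, &status, 0) != pid )
        return -1;
    if( WIFEXITED(status) )
        return WEXITSTATUS(status);
    return -1;          // child was terminated by a signal
}
```

Calling RunChildAndWait(5) returns 5, the same exit code that the child of Figure 6.5 passes to exit().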
6.3.2 Processes under Windows NT

Windows NT does not use the indirect fork-execute method of Unix. Rather, the system
call CreateProcess() is used as shown in Figure 6.9. Note that threads are more
/* fork.c
* Demonstration of process duplication via 'fork()'
* See also forke.c
*
* Platform: SunOS or cygwin/NT
* Compiler: gcc
*
* Example output:
phanes (leis) [49] fork
child process - fork() returned 0, my pid via getpid() is 18200
Parent process ID 18199, child process is 18200
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h> // fork(), getpid()
int main()
{
pid_t ForkPID;
switch( ForkPID = fork() )

{
case 0:
// child process - fork() returns 0
printf("child process - fork() returned %d ", ForkPID);
printf(", my pid is %d\n", getpid() );
fflush(stdout);
// zombie child exists from now until parent exits
break;
case -1:
// error in fork() - too many processes ?
perror("fork() failed:");
exit(1);
default:
// parent process. fork() returns the child PID
printf("Parent process ID %d, child process is %d\n",
getpid(), ForkPID);
fflush(stdout);
}
exit(0);
}
Figure 6.6 Using fork() to create a duplicate process.

// forke.c - Process duplication via 'fork()'

// and overlaying of another process via 'execl()'
// gcc under Unix or cygwin/NT.
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h> // fork(), execl(), getpid()
int main()
{
pid_t ForkPID, ChildPID;
int ChildStatus; // wait() expects a pointer to int
switch( ForkPID = fork() )

{
case 0: // child process - fork() returns 0
printf("child process - fork() returned %d ", ForkPID);
// overlay the process "child"

printf("About to execl()...\n");
fflush(stdout);
execl("child", "child", "some arg", NULL);
// get here only if execl fails

printf("Should never get here!!\n");
fflush(stdout);
break;
case -1: // error in fork() - too many processes ?
perror("fork() failed");
exit(1);
default: // parent process. fork() returns the child PID
printf("Parent process ID %d, child process is %d\n",
getpid(), ForkPID);
fflush(stdout);
// Wait for the child to exit & get its exit status.
ChildPID = wait(&ChildStatus);
// high-order 8 bits contain the child's exit code

printf("Parent: child PID %d gave exit status %d\n",
ChildPID, (ChildStatus >> 8) );
fflush(stdout);
}
exit(0);
}
Figure 6.7 Using fork() for duplication then overlaying using exec().

About to execl()...
Child process here with PID 1001
Child: arg 0 = 'child'
Child: arg 1 = 'some arg'
Child sleeping...Child exiting.
Parent: child PID 1001 gave exit status 5
Figure 6.8 Output of fork()+exec() test.
tightly integrated into NT, and that each process has a main thread. The output of this
example is shown in Figure 6.10; note that the same child code is used.
activity 6.3
Processes under Windows NT.
Compile and run the process.c code. What happens if child.exe is

not present?
Change the child process name (variable char *processName) to an-

other program, such as notepad.exe. If using notepad.exe it will be
necessary to copy it from \winnt\notepad.exe
Note that graphical or GUI (graphical user interface) applications under Windows have
the function WinMain() as their main entry point rather than main(), with different ar-
guments.
6.4 Threads
A process, under either Unix or NT, is a separate stand-alone program with its own
separate local and global variables, open files and so forth. Although child processes
are given command-line arguments from the parent and can inherit a copy of the open
file handles of the parent, the child has its own execution context. That is, its own
variables, stack, and so forth. The child process has a main() or WinMain() at which
execution begins. When main() reaches the end, or the child calls exit(), the process
terminates.
Threads are a different proposition. Threads are essentially a special function within
the main parent process. However, when that function is invoked as a thread, it is
// process.c - Windows NT processes (tasks).

// Creates 4 child processes.
// John Leis
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // memset()
#include <windows.h> // CreateProcess(), Sleep()
int main()
{
STARTUPINFO si;
PROCESS_INFORMATION pi;
BOOL fCreated;
char *processName = "child.exe";
int nProcesses = 4, processNum;
DWORD errorCode;
LPVOID lpMsgBuf;
memset(&si, 0, sizeof(STARTUPINFO) );
si.cb = sizeof(STARTUPINFO);
for( processNum = 1; processNum <= nProcesses; processNum++)

{
fCreated = CreateProcess(
// application name, command line
TEXT(processName), NULL,
NULL, NULL, TRUE, 0, NULL, TEXT("."), &si, &pi);
if( fCreated )
{
printf("----\n");
printf("Process created:\n");
printf("process handle: %ld\n", (long)pi.hProcess);
printf("process main thread handle: %ld\n",
(long)pi.hThread);
printf("process ID %ld: \n", (long)pi.dwProcessId);
printf("main thread ID: %ld\n", (long)pi.dwThreadId);
printf("----\n");
}
else
{
printf("Process creation error \n");
// normally additional error-handling code here
// to explain the reason for failure
exit(1);
}
}
Sleep(4000L);
}
Figure 6.9 Creating a process under Windows NT.

----
Process created:
process handle: 116
process main thread handle: 112
process ID 253:
main thread ID: 245
----
----
Process created:
process handle: 140
process ID 186:
main thread ID: 189
----
----
Process created:
process handle: 164
process ID 270:
main thread ID: 269
----
----
Process created:
process handle: 124
process ID 225:
main thread ID: 247
----
child process...
child process...
child process...
child process...
Goodbye.
Goodbye.
Goodbye.
Goodbye.
Figure 6.10 Output of Windows NT process creation test.

scheduled separately. The same function may be invoked as a thread as many times
as desired.
One coding issue brought about by multi-tasking is that of re-entrancy. Essentially, this
means that it is now possible for one function to be called by more than one task or
thread. If the function being called maintains static data representing the current state
of an object, the very act of calling the function in more than one context may mean that
the state information is changed several times. Some examples of this and where it
causes failure will be seen shortly.
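A small illustration of the problem, assuming a routine that formats a task label into a static buffer (FormatLabelStatic() and FormatLabelReentrant() are hypothetical names). Even without threads, two calls share the same storage, so the second call silently changes the result of the first. With threads, the same corruption can occur part-way through a call. The re-entrant version keeps all of its state in storage supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LABEL_LEN 32

// NOT re-entrant: every caller shares the one static buffer.
const char *FormatLabelStatic(int taskNum)
{
    static char label[LABEL_LEN];

    snprintf(label, sizeof(label), "task-%d", taskNum);
    return label;
}

// Re-entrant: all state lives in the caller's buffer.
char *FormatLabelReentrant(int taskNum, char *buf, size_t bufLen)
{
    assert(buf != NULL && bufLen > 0);
    snprintf(buf, bufLen, "task-%d", taskNum);
    return buf;
}
```

If a = FormatLabelStatic(1) is followed by b = FormatLabelStatic(2), both a and b point to the same buffer, which now holds "task-2". With separate caller buffers, FormatLabelReentrant() leaves "task-1" and "task-2" intact.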
This section shows how to create threads under Unix first, then under Windows NT.
6.4.1 Threads under Unix

Many implementations of Unix support so-called p-threads or POSIX threads, mean-
ing the POSIX Thread Library. Figure 6.11 shows a simple thread function, which out-
wardly appears to be an ordinary C function. This code exists in the same module (file)
as Figure 6.12, which is the main thread calling program. In Figure 6.12, the thread
function threadFunc() is invoked a number of times using pthread_create(). In con-
trast to a process, however, each instance of threadFunc() can access any global
variables in pthread.c. The thread function here is quite basic; it simply increments
(or attempts to increment) the global variable nCounter.
Since 50 threads are created and each thread has a loop that increments nCounter
100 times, it would be expected that nCounter would be incremented to 5000.
However, Figure 6.13 shows that this is not the case when the lock calls are omitted.
This is due to the interaction of the scheduling, and the fact that incrementing a
variable is not necessarily an atomic operation. An atomic operation is one that is
completed entirely, and never interrupted
when partially finished. In the case of incrementing a counter, the C code
nCounter++ ;
may in fact translate into several assembly-language instructions (recall the earlier
module on compiler operation). Each assembly language instruction may be inter-
rupted by the system timer, and thus the thread currently executing may happen to be
suspended while another thread/process is allowed to run. Now consider the possible
sequence of operations in incrementing a variable in memory:
1. Load the memory location into a register.

2. Increment the value in the register.
3. Store the new value in the register to the memory location
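The effect of a re-schedule between these steps can be traced without any real threads. The sketch below is a single-threaded simulation under assumed names (SimulatedThread, Load(), Increment(), Store(), LostUpdateDemo()): each simulated thread has its own register, and the steps of two threads are interleaved by hand in an order that a badly timed timer interrupt might produce.

```c
#include <assert.h>
#include <stdio.h>

static long nCounter = 0;           // the shared memory location

typedef struct
{
    long reg;                       // this thread's private register
} SimulatedThread;

void Load(SimulatedThread *t)      { t->reg = nCounter; }   // step 1
void Increment(SimulatedThread *t) { t->reg++; }            // step 2
void Store(SimulatedThread *t)     { nCounter = t->reg; }   // step 3

// LostUpdateDemo() - two increments are attempted, but thread A is
// "interrupted" after its load, so one update is lost.
long LostUpdateDemo(void)
{
    SimulatedThread a, b;

    nCounter = 0;
    Load(&a);           // A reads 0, then is suspended
    Load(&b);           // B reads 0
    Increment(&b);
    Store(&b);          // B writes 1
    Increment(&a);      // A resumes with its stale copy
    Store(&a);          // A also writes 1, overwriting B's update
    assert(nCounter <= 2);
    return nCounter;
}
```

LostUpdateDemo() returns 1 rather than 2: both increments ran to completion, yet one of them had no effect.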
At each stage, a system timer interrupt and re-schedule is possible. In the example
code, the actual increment has been split into several C lines to make failure more
likely. So in Figure 6.13, the value does not reach 5000 but only 4617. A simple C
operation such as
void *threadFunc()
{
long c;
long tmp;
// note that the thread instance number will not

// necessarily equal the thread id returned by thr_self()
printf("thread id %d here...\n", thr_self() );
fflush(stdout);
for( c = 0; c < 100L; c++)

{
pthread_rwlock_wrlock(&lock);
// will occasionally fail

// nCounter++ ;
// will fail often

tmp = nCounter;
printf("counter=%ld\n", nCounter);
fflush(stdout);
tmp++;
nCounter = tmp;
pthread_rwlock_unlock(&lock);
}
return (void *)NULL;

}
Figure 6.11 A thread function using POSIX threads.
nCounter++ ;
may only very occasionally fail, but it will still fail eventually. Such a bug is difficult
to track down. This situation may occur in other scenarios; for example, consider a
database for checking seat reservations on an airline. The operation of checking for an
available seat and actually booking the seat must be atomic, otherwise the last seat on
the plane may occasionally be sold twice!
The solution to the above dilemma is that the programmer must use lock functions
and atomic calls which are provided as part of the thread library. In Figure 6.11, the
functions
pthread_rwlock_wrlock()
and
/* pthread.c - Posix 4 threads

* gcc pthread.c -o pthread -lpthread
* See man pthreads
* See also the NT version. thread.c
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *threadFunc();
long nCounter;
pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
int main()
{
pthread_t tid;
pthread_attr_t tattr;
pthread_rwlockattr_t lattr;
int threadNum;
int nThreads = 50;
nCounter = 0L;
pthread_attr_init(&tattr);
pthread_rwlockattr_init(&lattr);
pthread_rwlock_init(&lock, &lattr);
for(threadNum = 1; threadNum <= nThreads; threadNum++)

{
if( pthread_create(&tid, &tattr, threadFunc, NULL) != 0 )
{
printf("could not create task #%d\n", threadNum);
fflush(stdout);
exit(1);
}
}
sleep(10);
fflush(stdout);
printf("main() exiting\n");
exit(0);
}
Figure 6.12 Creating a thread using POSIX threads.

thread id 4 here...
counter=0
counter=1
counter=2
counter=3
counter=4
counter=5
...
thread id 5 here...
...
counter=4612
counter=4613
counter=4614
counter=4615
counter=4616
counter=4617
main() exiting
Figure 6.13 Creating a thread using POSIX threads output without locking.
pthread_rwlock_unlock()
are used to place a semaphore (lock) on the critical regions of code, that is, those that must
be executed atomically. Why not simply lock the entire thread function? The answer is
because this would mean that the entire thread function must execute to completion,
and other threads would be blocked. The threads would execute sequentially, thus
defeating the purpose of having concurrent threads.
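A minimal sketch of a short critical region, assuming a POSIX threads implementation. It uses a pthread_mutex_t rather than the read-write lock of Figure 6.11, since a plain mutex is a common choice when every thread writes, and it waits for the threads with pthread_join() instead of sleeping. The names Worker(), RunCounterTest() and the two constants are illustrative.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define N_THREADS 50
#define N_LOOPS   100

static long nCounter;
static pthread_mutex_t counterLock = PTHREAD_MUTEX_INITIALIZER;

// Worker() - increment the shared counter N_LOOPS times.
static void *Worker(void *arg)
{
    int c;

    (void)arg;                          // unused
    for( c = 0; c < N_LOOPS; c++)
    {
        // only the read-modify-write is locked, not the whole loop
        pthread_mutex_lock(&counterLock);
        nCounter++;
        pthread_mutex_unlock(&counterLock);
    }
    return NULL;
}

// RunCounterTest() - start the threads, wait for all of them, and
// return the final counter value (-1 if a thread could not be created).
long RunCounterTest(void)
{
    pthread_t tid[N_THREADS];
    int i, started = 0;

    nCounter = 0;
    for( i = 0; i < N_THREADS; i++)
    {
        if( pthread_create(&tid[i], NULL, Worker, NULL) != 0 )
            break;
        started++;
    }
    for( i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return (started == N_THREADS) ? nCounter : -1L;
}
```

With the lock in place, RunCounterTest() returns 5000 (50 threads of 100 increments each). Because the lock is held only around the increment, the threads can still interleave everywhere else in the loop.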
activity 6.4
Threads and atomic operations under Unix/POSIX.
Compile and run the pthread.c code. Test with and without the atomic
operation in place. How often does the counter reach an incorrect value
when simply using an increment? When using the temporary variable?
Check that failure never occurs when using the thread lock functions.
6.4.2 Threads under Windows NT

Windows NT also provides support for threads, with atomic operations and mutual
exclusion of critical sections of code. The library function CreateThread() serves
long nCounter = 0;
long WINAPI doIncrement(long lParam)

{
long c;
long tmp;
printf("Thread: lParam=%ld\n", lParam);

for( c = 0; c < 1000L; c++)
{
// may occasionally fail
//nCounter++;
// guaranteed thread-safe
//InterlockedIncrement(&nCounter);
// fudge, will fail quite often

tmp = nCounter;
printf("counter is %ld\n", nCounter);
tmp++ ;
nCounter = tmp;
}
return NO_ERROR;
}
Figure 6.14 A thread function under Windows NT.
a similar purpose to the CreateProcess() function encountered earlier, except that it

takes a pointer to a thread function rather than the name of a disk executable.
The function shown in Figure 6.14 attempts to increment the variable nCounter 1000
times. The calling program, Figure 6.15, creates 50 threads. Thus nCounter should
reach 50,000 upon completion. Like the previous Unix example, if the
operation of incrementing the counter is not atomic it may be interrupted. The function
InterlockedIncrement(&nCounter)
appears (commented out) in Figure 6.14 and performs an atomic (non-interruptible) increment of a variable.

Many other synchronizing primitives are provided; these are discussed in a later mod-
ule. Figure 6.16 shows a sample output.
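InterlockedIncrement() is specific to the Windows API. For comparison, the same idea can be expressed with the standard C11 <stdatomic.h> facilities. This is a different mechanism, swapped in here purely for illustration, and is not what Figure 6.14 uses. The sketch assumes a C11 compiler and uses POSIX threads only to generate contention; AtomicWorker() and RunAtomicTest() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define N_THREADS 50
#define N_LOOPS   1000

static atomic_long nCounter;

// AtomicWorker() - increment the shared counter without any lock.
static void *AtomicWorker(void *arg)
{
    int c;

    (void)arg;                              // unused
    for( c = 0; c < N_LOOPS; c++)
        atomic_fetch_add(&nCounter, 1);     // indivisible read-modify-write
    return NULL;
}

// RunAtomicTest() - start the threads, wait for all of them, and
// return the final counter value (-1 if a thread could not be created).
long RunAtomicTest(void)
{
    pthread_t tid[N_THREADS];
    int i, started = 0;

    atomic_store(&nCounter, 0);
    for( i = 0; i < N_THREADS; i++)
    {
        if( pthread_create(&tid[i], NULL, AtomicWorker, NULL) != 0 )
            break;
        started++;
    }
    for( i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return (started == N_THREADS) ? atomic_load(&nCounter) : -1L;
}
```

RunAtomicTest() returns 50000, the full 50 x 1000 increments, because no update can be lost between the load and the store.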
/* thread.c - Windows NT threads and atomic counters

* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <windows.h> // CreateThread(), Sleep(), InterlockedIncrement()
long nCounter = 0L;
long WINAPI doIncrement(long lParam);
int main()
{
DWORD threadID;
int threadNum, nThreads = 50;
HANDLE hThread;
DWORD errorCode;
LPVOID lpMsgBuf;
printf("starting...\n");
nCounter = 0L;
for( threadNum = 1; threadNum <= nThreads; threadNum++)
{
hThread = CreateThread(NULL, 0,
(LPTHREAD_START_ROUTINE)doIncrement, // thread routine
(void *)threadNum, // passed to thread function
0, &threadID); // flags, returned thread ID
if( hThread )
{
printf("thread id is %ld\n", threadID);
}
else
{
printf("Thread creation error \n");
exit(1);
}
}
// MUST have a delay here to continue processing.

// If the task exits, the thread will be terminated
printf("sleeping 30 seconds ....\n");
fflush(stdout);
Sleep(30000L); // Windows Sleep (capital S) is in milliseconds
printf("end of process, counter is %ld\n", nCounter);
return 0;
}
Figure 6.15 Creating a thread under Windows NT.

starting...
thread id is 262
thread id is 263
thread id is 264
thread id is 265
thread id is 266
thread id is 267
thread id is 268
thread id is 269
thread id is 270
thread id is 248
thread id is 206
thread id is 216
thread id is 255
Thread: lParam=1
counter is 0
counter is 1
counter is 2
...
counter is 40343
counter is 40344
counter is 40345
counter is 40346
counter is 40347
end of process, counter is 40348
Figure 6.16 Code output when creating a thread under Windows NT.
activity 6.5
Threads and atomic operations under Windows NT.
Compile and run the thread.c code. Test with and without the atomic
operation in place. How often does the counter reach an incorrect value
when simply using an increment? When using the temporary variable?
Verify that failure never occurs when using the atomic increment function.
6.5 Module Summary

The fundamental concept of multi-tasking was introduced in this module, with some
concrete examples of threads and processes in Windows NT and Unix environments.
The important concepts from this module are:
The idea of a process;
The idea of a thread, and how threads differ from processes.
Further Reading
Module 7
INTERPROCESS
COMMUNICATION
7.1 Module Overview
As shown in the previous module, one processor (CPU) may be required to execute
several processes in a round-robin fashion. These processes may be largely independent,
or they may be the result of a conscious decision by the designer to break up
the functions into several separate, but co-operating, processes. As described in the
previous module, the processes (or tasks) are largely independent and cannot communicate
with each other. A communications mechanism is nevertheless very important, as it
enables processes to share data, request that another process perform some action, and
so forth. The exact mechanism may be synchronous via messages, or asynchronous
via software interrupts. In the first instance, messages are posted on a message queue
by the sender, addressed to the receiver. The message may be only one or two bytes,
or may be a more elaborate data structure. It is said to be synchronous because
the recipient process attends to messages as time permits, within its normal flow of
execution. The second method involves stopping the normal flow of execution of the
recipient task, typically by calling a nominated function to handle the situation.
The topics covered in this module are:
Message queues and pipes.
Shared memory and file memory mapping.
The first, message queues, involves posting and retrieving messages as described above.
The second involves a block of memory which is common to two processes; remember
that normally each process has its own memory, and one process cannot access the
memory of another process.
7.2 Interprocess Communication
The creation of processes and threads was discussed in the previous module, for both
Unix (POSIX) and Windows NT. Co-operating processes (and/or threads) are commonly
used in large, complicated systems where many concurrent transactions must
be handled or where the overall system may be logically broken up into separate tasks.
In a sense, this is much like structured programming, which teaches the principle that a
single, monolithic main program should be broken down into constituent components.
The key point is that the processes co-operate, which implies that they must share
information at various stages of their lifecycle. This is the aspect which is
discussed in this module. Synchronization, covered in the next module, may be viewed
as an extension of process communication. In many ways, one cannot be done without
the other.
As in the previous module, the conceptual fundamentals are introduced followed by

examination of working code for Unix and Windows NT. Code in this section is written
by the author, derived principally from [10] [18] [19] and [20].
7.3 Command Arguments
Processing of command-line arguments was introduced in the previous module. In
effect, it is a one-way, once-off communication from parent to child process.
Figure 7.1 repeats a simple example of this for convenience. Note that the arguments
contained in argv[] are strings, and so numeric arguments must first be converted to
string form using a function such as sprintf() in the parent, and then converted back
to the required data type (integer, floating point, etc.) using sscanf() or a function such
as atoi() (ASCII to integer).
/* cmdargs.c
* Command-line arguments
*
* Example output:
C:\usr\c\examples>gcc cmdargs.c -o cmdargs
C:\usr\c\examples>cmdargs one two three

There are 4 command-line arguments
Argument number 0 is cmdargs
Argument number 1 is one
Argument number 2 is two
Argument number 3 is three
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int argNum;
printf("There are %d command-line arguments\n", argc);

for( argNum = 0; argNum < argc; argNum++)
{
printf("Argument number %d is %s\n", argNum, argv[argNum] );
}
exit(0);
}
Figure 7.1 Passing command-line arguments to a process via main().
Such a method is easy and convenient for passing configuration variables, filenames
and so forth. However, it happens only once, when the child process starts. Processes which
need to exchange information continually need some other mechanism.
7.4 Pipes
The pipe provides for a one-way or bidirectional flow of bytes in a sequenced order. It
is effectively a first-in, first-out queue of bytes. The application processes must impose
some structure on the sequence, for example as character strings or a struct data
structure. Both Unix and Windows NT support pipes. Conceptually they are similar to
files accessed in order. Furthermore, the application programming interface (API) is
similar to that for accessing a file using file descriptors (handles).
Figure 7.2 shows the basic idea of a pipe communicating between two processes.
[Diagram: process 1 and process 2 connected by a pipe]
Figure 7.2 Illustrating an interprocess pipe.
7.4.1 Pipes under Unix
Figure 7.3 shows the establishment of a Unix pipe. The system call pipe() creates two
file descriptors. In the example, they are in the array int fifo[2]. Element fifo[0] is
created for reading, and element fifo[1] is for writing. In preparation for sending the
pipe handles to the child process, the file descriptor is converted into an ASCII string.
After the pipe (or FIFO) is created, a child process is created. As shown in Figure 7.4,
this is similar to the examples discussed in the previous module. The child process
expects a command-line argument which is the string representation of the pipe's file
descriptor, hence the string buffer pipestr is passed in the execl() call.
Turning to the child process (Figure 7.5), the command-line arguments are checked
and parsed. Recall that argv[0] is the name of the process itself. Parsing here is
simple, and consists of converting the command-line argument argv[1] string into an
integer using atoi().
Note that this only works because under Unix, child processes inherit the open file
descriptors of their parents. The command-line passing of the file descriptor is merely a
/* pipesnd.c - pipe sender (parent process)

* Use in conjunction with pipercv.c
* The file descriptor for the pipe is passed via the argv
* command-line arguments. This works because file descriptors
* are inherited by the child process
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main()
{
int fifo[2];
char *msg = "this is a test";
int pid;
char pipestr[10];
printf("parent pid %d\n", getpid());

fflush(stdout);
// create fifo[0] for reading, fifo[1] for writing

if( pipe(fifo) == -1)
{
printf("cannot create pipe\n");
exit(1);
}
// create a string for the fifo[0] read end of the pipe

sprintf(pipestr, "%d", fifo[0]);
printf("parent pipestr %s\n", pipestr);
Figure 7.3 Writing on a Unix pipe (part 1 of 2).

switch( pid = fork() )

{
case 0:
// child process - fork() returns 0
printf("child process - fork() returned %d ", pid);
// overlay the process "child"

printf("About to execl()...\n");
fflush(stdout);
execl("pipercv", "pipercv", pipestr, NULL);
// get here only if execl fails

printf("Should never get here!!\n");
fflush(stdout);
exit(1);
case -1:
// error in fork() - too many processes ?
exit(2);
default:
// parent process. fork() returns the child PID
printf("parent %d - fork() returned child pid %d\n",
getpid(), pid);
fflush(stdout);
}
printf("parent sleeping...\n");
fflush(stdout);
sleep(4);
printf("parent %d about to write on pipe\n", getpid() );
fflush(stdout);
write(fifo[1], msg, strlen(msg));
printf("end of pipesnd process\n");

fflush(stdout);
exit(0);
}
Figure 7.4 Writing on a Unix pipe (part 2 of 2).

/* pipercv.c - pipe receiver (child process)

* Use in conjunction with pipesnd.c
* The file descriptor for the pipe is passed via the argv
* command-line arguments. This works because file descriptors
* are inherited by the child process
*
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define NBUF 100
int main(int argc, char *argv[])
{
int fd;
char buf[NBUF];
if(argc != 2)
{
printf("expect pipercv fd\n");
exit(1);
}
fd = atoi(argv[1]);
printf("pipercv: pid %d, pipe fd %d\n", getpid(), fd);
fflush(stdout);
// simply read the first 8 chars

read(fd, buf, 8);
buf[8] = '\0';
printf("child %d got from pipe: ", getpid() );
printf("%s\n", buf);
fflush(stdout);
printf("end of pipercv process\n");

fflush(stdout);
exit(0);
}
Figure 7.5 Reading a Unix pipe.

parent pid 1000

parent pipestr 3
About to execl()...
parent sleeping...
pipercv: pid 1001, pipe fd 3
parent 1000 about to write on pipe
end of pipesnd process
child 1001 got from pipe: this is
end of pipercv process
Figure 7.6 Unix pipe output.
mechanism for the child to be informed of the value of the file descriptor corresponding
to the pipe.
Reading the pipe is similar to reading from a file, using the read() call. Thus, a byte-
stream sequence is established between parent and child. If bidirectional communica-
tion is required (child to parent as well), the parent would have to pass the file descriptor
for the write end of the pipe to the child.
Figure 7.6 shows the output from the combined parent/child pipe communications. It is
now evident why the fflush() call is used after printf(): both the parent and child
output strings appear interleaved on the same output screen device. If the buffer
were not flushed, the output would be somewhat confusing as the timing order would be
lost. Note that the inclusion of the fflush() call does not necessarily guarantee that
the parent/child output is not mixed up, as a process context switch may occur at any
time.
7.4.2 Pipes under Windows NT
The following example is of a named pipe under Windows NT. The server process
(Figures 7.7 and 7.8) creates the pipe using CreateNamedPipe(). The pipe name is a special
form of filename, of the form \\.\pipe\mypipe, where the dot denotes the local machine,
the component pipe is mandatory, and the final component is the name of the pipe itself.
Note also the additional security attributes which may be set up under NT.
In this example, the server (parent) is expecting a message through the pipe from
the client (child) process. Although the passing of the pipe name could be done via
command-line arguments, since the name is predefined no special mechanism is used
and the child process accesses the pipe directly. The PIPENAME string would normally
be defined in a header (.h) file which is included by both parent and child.
The child process of Figure 7.10 assumes that the pipe file has already been created,
and opens a file descriptor (Unix terminology) or handle (Windows terminology) to the
/* pserver.c Windows NT named pipe - server

* John Leis
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NBUF 100

#define PIPENAME "\\\\.\\pipe\\mypipe"
int main()
{
HANDLE hPipe;
SECURITY_ATTRIBUTES sa;
SECURITY_DESCRIPTOR *psd;
BOOL fRead;
LPSTR pBuf;
char buf[NBUF];
DWORD nRead;
BOOL fConnect;
printf("Pipename = %s\n", PIPENAME);
// null acl - unlimited access

psd = (SECURITY_DESCRIPTOR *)LocalAlloc(LPTR,
SECURITY_DESCRIPTOR_MIN_LENGTH);
InitializeSecurityDescriptor(psd, SECURITY_DESCRIPTOR_REVISION);
SetSecurityDescriptorDacl(psd, TRUE, NULL, FALSE);
sa.nLength = sizeof(sa);
sa.lpSecurityDescriptor = psd;
sa.bInheritHandle = TRUE;
hPipe = CreateNamedPipe(TEXT(PIPENAME), // \\machine\pipe\name

PIPE_ACCESS_DUPLEX,
// structured, blocking mode
PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
1, 4096, 4096, // max instances, out and in buffer sizes
INFINITE, // timeout for use with WaitNamedPipe()
&sa); // security descriptor
if( hPipe == INVALID_HANDLE_VALUE )

{
printf("server:pipe error\n");
exit(1);
}
Figure 7.7 A message pipe server under Windows NT (part 1 of 2).

printf("about to connect to named pipe...\n");
fConnect = ConnectNamedPipe(hPipe, NULL);

if( fConnect )
{
printf("connect pipe OK\n");
}
else
{
printf("could not wait on pipe\n");
exit(1);
}
pBuf = buf;
fRead = ReadFile(hPipe, pBuf, NBUF, &nRead, NULL);
if( fRead == FALSE)

{
printf("read error\n");
exit(1);
}
printf("%ld bytes read: %s", nRead, buf);
exit(0);
}
Figure 7.8 A message pipe server under Windows NT (part 2 of 2).

* Example output (server)

C:\usr\c\NT>pserver
Pipename = \\.\pipe\mypipe
about to connect to named pipe...
connect pipe OK
6 bytes read: hello
* Example output (client)
C:\usr\c\NT>pclient
Pipename = \\.\pipe\mypipe
6 bytes written
Figure 7.9 Windows NT pipe outputs.
pipe using CreateFile() with the appropriate flags for opening an existing
file (actually a pipe) for reading and writing. A message is then written using WriteFile().
An example output is shown in Figure 7.9. When executing this example, the output
is more easily understood if the client pclient and server pserver are invoked from
separate console windows. In addition, the server must be executed first in order to
create the pipe. In a practical situation, this means that the server would have to create
the client in a parent/child situation, rather than having them invoked separately as is
done here.
activity 7.1
Windows NT Pipes.
Compile the code examples pserver and pclient. Create two separate
command (DOS) windows. Invoke the client in one window; it should
fail, as the code assumes the pipe has already been created and simply
tries to open the existing pipe file. Now run the server in one window
first, and then the client in the other window.
/* pclient.c Windows NT named pipe - client

* John Leis
*/
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NBUF 100

#define PIPENAME "\\\\.\\pipe\\mypipe"
int main()
{
HANDLE hPipe;
char *msg = "hello";
DWORD written, nBytes;
LPSTR pBuf;
BOOL fWrite;
printf("Pipename = %s\n", PIPENAME);
hPipe = CreateFile( TEXT(PIPENAME), // \\machine\pipe\name

GENERIC_READ | GENERIC_WRITE,
0, NULL, OPEN_EXISTING, 0, NULL);
if( hPipe == INVALID_HANDLE_VALUE )

{
printf("pipe open error\n");
exit(1);
}
pBuf = msg;
nBytes = strlen(msg)+1;
fWrite = WriteFile( hPipe, pBuf, nBytes, &written, NULL);
if( fWrite == FALSE)
{
printf("client:write error\n");
exit(1);
}
printf("%ld bytes written\n", written);
exit(0);
}
Figure 7.10 A message pipe client under Windows NT.

7.5 Shared Memory

The previous section introduced pipes, which are a queued, one-way mechanism for
exchange of data. The pipe is inherently serial or sequential. As a result, communication
is potentially slower, due to the queueing of the pipe buffers in the operating
system kernel. A more fundamental problem is that the data must be accessed in the
order in which it was sent; any urgent or high-priority messages must wait their turn.
In some applications, another solution is needed.
Normally each process has its own memory space, and cannot (deliberately or acci-
dentally) read from or write to the address space of any other process. However, direct
access to memory is the fastest possible type of interprocess communication, and it
allows random access (rather than sequential as in pipes).
As shown in Figure 7.11, the shared memory segment does not necessarily appear at
the same address within the address space of each process.
[Diagram: process 1 and process 2, each mapping the same shared segment]
Figure 7.11 Illustrating process memory sharing or memory mapping.
7.5.1 Shared Memory under Unix

Different Unix variants implement different API methods for accessing shared memory.
The POSIX file-mapping method is illustrated here, as it is more portable and quite
similar to the Windows NT method shown in the next section. Figure 7.12 shows a
complete shared-memory program. The abstraction uses a file, which is mapped into
the address space of the process. The name of the file (here mapfile) is the common
key for all processes using the shared memory segment.
Examining the code, the file is first created using open() with the create and read-write
flags. The mapping is performed using mmap(), which takes the file descriptor and
returns a pointer to the shared memory block. This pointer may then be used as a
conventional memory pointer. In the example, a character is written to the first byte of
the shared memory once per second in a loop, with the character advancing on each pass.
/* mmap.c - illustrates mmap() system call

* This program creates a shared memory segment via a file
* and writes some data into it. If two processes access
* the shared memory, the output becomes unpredictable.
* Platform: SunOS or cygwin/NT Compiler: gcc
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
int main()
{
int fd, i, len = 10;
off_t off = 0;
char *pMem, c;
// initialize device file

fd = open("mapfile", (O_CREAT | O_RDWR), 0666 );
if( fd < 0 )
{
perror("open(fmap)");
exit(1);
}
c = 'x';
write( fd, &c, 1);
// map the file to memory

pMem = (char *)mmap( (caddr_t)0, len,
(PROT_READ | PROT_WRITE), MAP_SHARED, fd, off);
printf("pMem is %p\n", pMem);

c = 'a';
for(i=0; i < 10; i++)
{
*pMem = c;
sleep(1);
printf("%d: c=%c *pMem = %c\n", getpid(), c, *pMem);
c++ ;
}
munmap(pMem, len);
close(fd);
exit(0);
}
Figure 7.12 Shared memory via mmap() under Unix.

phanes (leis) [16] mmap

pMem is ff380000
18105: c=a *pMem = a
18105: c=b *pMem = b
18105: c=c *pMem = c
18105: c=d *pMem = d
18105: c=e *pMem = e
18105: c=f *pMem = f
18105: c=g *pMem = a
18105: c=h *pMem = b
18105: c=i *pMem = c
...(above & below run simultaneously)...
phanes (leis) [38] mmap

pMem is ff380000
18106: c=a *pMem = h
18106: c=b *pMem = i
18106: c=c *pMem = j
18106: c=d *pMem = d
18106: c=e *pMem = e
18106: c=f *pMem = f
18106: c=g *pMem = g
18106: c=h *pMem = h
18106: c=i *pMem = i
Figure 7.13 Output of Unix shared memory example.
Starting two of these processes (preferably in two separate command windows for a
saner output) results in two processes attempting to access the same area of shared
memory. Figure 7.13 shows the situation when one process is started and the second is
started several seconds later. If only one process were running, the character c which was
written would always correspond to the character returned from the shared memory, *pMem.
Instead, the first read of the second process returns the character h, which
was placed there by the first process. Similarly, the a written by the second process
when it starts is read by the first process in place of the g it had just written.
Evidently, the benefits of faster and random access to shared memory have introduced
another problem which was not present with pipes: that of synchronization. Some
method of guaranteeing exclusive access to a shared memory area is necessary so
that operations may be performed atomically. This topic is discussed in the next mod-
ule.
activity 7.2
Unix shared memory.
Explain the output of Figure 7.13 in terms of the source code. When did
the scheduler interrupt one process and start another?
7.5.2 Shared Memory under Windows NT

Windows NT uses a similar analogy for accessing shared memory. The memory segment
is mapped to a filename, much as the pipe message queues were. Figure 7.14
shows the creation of the file using CreateFile(). This is effectively a standard disk file
initially (here called mapfile). Next (Figure 7.15), a second handle is obtained for the
file using CreateFileMapping(). This prepares the file for memory mapping, and
uses a different name (here mapmemory). Finally, the file contents are mapped to
a memory address range within the process using MapViewOfFile(). This returns a
pointer, pMapMem, to the start of the shared memory block.
In a manner similar to the Unix example, Figure 7.16 shows the result when two pro-
cesses attempt to access the same shared memory block.
activity 7.3
Windows NT shared memory.
Explain the output of Figure 7.16 in terms of the source code. When did
the scheduler interrupt one process and start another?
/* mmap.c - Windows NT shared memory via file mapping.

* Run two of these processes in separate windows. Start
* one a few seconds after the other.
* Normally semaphores or some other access mechanism would
* be used to control access to shared objects.
* John Leis
*/
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
#define MAP_FILENAME "mapfile"

#define MAP_MAPNAME "mapmemory"
#define BYTES_TO_MAP 1
int main()
{
HANDLE hFile, hMap;
char *pMapMem, c;
SECURITY_ATTRIBUTES sa;
SECURITY_DESCRIPTOR *psd;
int n;
// null acl - unlimited access

psd = (SECURITY_DESCRIPTOR *)LocalAlloc( LPTR,
SECURITY_DESCRIPTOR_MIN_LENGTH);
InitializeSecurityDescriptor(psd, SECURITY_DESCRIPTOR_REVISION);
SetSecurityDescriptorDacl(psd, TRUE, NULL, FALSE);
sa.nLength = sizeof(sa);
sa.lpSecurityDescriptor = psd;
sa.bInheritHandle = TRUE;
hFile = CreateFile(MAP_FILENAME, GENERIC_READ | GENERIC_WRITE,

FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
CREATE_ALWAYS, FILE_ATTRIBUTE_TEMPORARY, NULL);
if( hFile == INVALID_HANDLE_VALUE ) // CreateFile() does not return NULL on failure
{
printf("CreateFile() failed\n");
exit(1) ;
}
Figure 7.14 Shared memory under Windows NT (part 1 of 2).

hMap = CreateFileMapping(hFile, NULL,

PAGE_READWRITE, 0,
BYTES_TO_MAP, MAP_MAPNAME);
if( hMap == NULL )
{
printf("CreateFileMapping() failed\n");
exit(1);
}
pMapMem = (char *)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS,

0, 0, BYTES_TO_MAP);
if( pMapMem == NULL )
{
printf("MapViewOfFile() failed\n");
exit(1);
}
printf("ptr %p\n", pMapMem);

c = 'a';
for( n = 0; n < 10; n++)
{
*pMapMem = c;
Sleep(1000L);
printf("%d: c=%c *pMapMem = %c\n", getpid(), c, *pMapMem);
c++ ;
}
CloseHandle(hMap);
CloseHandle(hFile);
exit(0);
}
Figure 7.15 Shared memory under Windows NT (part 2 of 2).

Process 1:
C:\usr\c\NT\console>mmap
ptr 0x14030000
1000: c=a *pMapMem = a
1000: c=b *pMapMem = b
1000: c=c *pMapMem = c
1000: c=d *pMapMem = a
1000: c=e *pMapMem = b
1000: c=f *pMapMem = c
1000: c=g *pMapMem = d
1000: c=h *pMapMem = e
1000: c=i *pMapMem = f
1000: c=j *pMapMem = g
Process 2:
C:\usr\c\NT\console>mmap
ptr 0x14030000
1001: c=a *pMapMem = e
1001: c=b *pMapMem = f
1001: c=c *pMapMem = g
1001: c=d *pMapMem = h
1001: c=e *pMapMem = i
1001: c=f *pMapMem = j
1001: c=g *pMapMem = g
1001: c=h *pMapMem = h
1001: c=i *pMapMem = i
1001: c=j *pMapMem = j
Figure 7.16 Output of shared memory test for Windows NT.

7.6 Windows Interprocess Communications

The Windows mechanism of inter-task communication is based on asynchronous event
notification. The basic concepts are discussed in this section, although a complete
discussion of the workings of the Windows graphical API is beyond the scope
of this treatment (students are directed to the references for further study).
The Unix-based X Window System works in a similar way.
Figure 7.17 shows
the screen view of a very basic Windows program which will be examined.
Figure 7.17 Snapshot of the Windows NT application nores.
Figure 7.18 shows the basic structure of the initial portion of a Windows program. All
Windows programs include the header file windows.h. The main entry point is not
main() but WinMain(). The main routine is relatively simple, and does not perform any
interactive processing. Instead, messages are sent on an event queue to be handled
by the window-processing callback function, which is called by the window manager.
The main routine simply:
1. Registers the window class and creates the window using CreateWindowEx().
This does not make the window visible. Note the registration of the callback
function, here WndProc().
2. Displays the main window using ShowWindow() and UpdateWindow(). The latter
function sends a message to the window message queue.
3. Enters a GetMessage() loop, which calls DispatchMessage() for each message
retrieved and runs until the application quits.
Figure 7.19 shows the first portion of the callback function, WndProc(). This function is
called via the window manager when an event is to be processed for the application.
The parameters to this function are:
1. hwnd, a handle (identifier) for the window.
2. message, an identifier of the type of message dispatched from the message

queue.
// nores.c - windows example without resource files.

#include <windows.h>
#include <string.h>
#include <stdio.h>
#include "nores.h"
LRESULT CALLBACK WndProc(HWND, UINT, WPARAM, LPARAM);

char szAppName[] = "NoResApp";
HINSTANCE hInstSave;
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,

LPSTR lpszCmdLine, int nWinMode)
{
HWND hwnd;
MSG msg;
WNDCLASSEX wndclass;
wndclass.cbSize = sizeof(WNDCLASSEX);
wndclass.style = CS_HREDRAW | CS_VREDRAW;
wndclass.lpfnWndProc = WndProc;
wndclass.cbClsExtra = 0;
wndclass.cbWndExtra = 0;
wndclass.hInstance = hInstance;
wndclass.hIcon = LoadIcon(NULL, IDI_APPLICATION);
wndclass.hIconSm = LoadIcon(NULL, IDI_WINLOGO);
wndclass.hCursor = LoadCursor(NULL, IDC_ARROW);
wndclass.hbrBackground = (HBRUSH)GetStockObject(LTGRAY_BRUSH);
wndclass.lpszMenuName = "";
wndclass.lpszClassName = szAppName;
if( !RegisterClassEx(&wndclass) )
return 0;
hwnd = CreateWindowEx(WS_EX_CLIENTEDGE, szAppName,
"C Coded Windows", WS_OVERLAPPEDWINDOW, // normal
CW_USEDEFAULT, CW_USEDEFAULT, // x, y
400, 150, // width, height may be CW_USEDEFAULT
HWND_DESKTOP, NULL, hInstance, NULL);
hInstSave = hInstance;
ShowWindow(hwnd, nWinMode);
UpdateWindow(hwnd);
// main message loop

while( GetMessage(&msg, NULL, 0, 0) )
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
return msg.wParam;
}
Figure 7.18 A basic Windows program: Part 1 of 4 WinMain().

3. wParam, a word parameter containing message-specific data.
4. lParam, a long parameter containing message-specific data.
The body of this procedure consists of a switch statement, invoking the appropriate
code sections according to the message. The predefined WM_COMMAND message is used
to indicate a command message from a child control; here, these are simply the button
controls within the window.
Figure 7.20 shows the portion of the switch statement handling the WM_CREATE mes-
sage. This message is sent to the callback procedure when the window is created. As
can be seen from the figure, the various graphical objects on the window are created
here. In the example, these are 3 buttons and an edit box.
Figure 7.21 shows the portion of the switch statement handling the WM_SIZE, WM_PAINT
and WM_DESTROY messages.
The WM_SIZE message is sent to the callback procedure when the size of the window
changes, including when the window is first shown. Handling it
typically involves re-drawing any graphical objects which may change according to
the size of the overall window. The WM_PAINT message is sent to the callback procedure
when the window needs repainting (re-drawing), because its status has changed (possibly
from an icon to a normal window) or it is no longer obscured by other windows.
The WM_DESTROY message is sent when the window is closing down. Finally, any messages
not handled are passed to the default windows procedure, DefWindowProc().
#define MAX_EDIT 20
LRESULT CALLBACK WndProc(HWND hwnd, UINT message,

WPARAM wParam, LPARAM lParam)
{
short cxClient, cyClient;
HDC hdc;
PAINTSTRUCT ps;
int dialogResp, len;
TEXTMETRIC tm;
char *prompt;
static int nbPress = 0;
static int cxChar, cyChar;
static HWND hWndEdit;
static char editBuf[MAX_EDIT+1];
switch( message )
{
case WM_COMMAND: // menu & commands
switch(LOWORD(wParam))
{
case IDB_BUTTON_1: // set the text in the edit box
nbPress++;
wsprintf(editBuf, "count=%d", nbPress);
SendMessage(hWndEdit, WM_SETTEXT,
(WPARAM)0, (LPARAM)editBuf);
break;
case IDB_BUTTON_2: // get the text from the edit box
// EM_GETLINE takes the buffer size from the first word of the buffer
*(WORD *)editBuf = MAX_EDIT;
len = (int)SendMessage(hWndEdit,
EM_GETLINE, (WPARAM)0,
(LPARAM)editBuf);
editBuf[len] = '\0'; // null-terminate
MessageBox(hwnd, editBuf, "Retrieve Text",MB_OK);
break;
case IDB_EXIT:
dialogResp = MessageBox(hwnd,
"Exit the program?", "Exit",
MB_YESNO);
if( dialogResp == IDYES )
PostMessage(hwnd, WM_DESTROY,
(WPARAM)0, (LPARAM)0);
break;
default:
break;
}
return 0; // WM_COMMAND handled
Figure 7.19 WndProc() and the WM_COMMAND message.

case WM_CREATE:
hdc = GetDC(hwnd);
SelectObject(hdc, GetStockObject(SYSTEM_FIXED_FONT));
GetTextMetrics(hdc, &tm);
cxChar = tm.tmAveCharWidth;
cyChar = tm.tmHeight + tm.tmExternalLeading;
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"button 1",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
50, 20, 100, 20,
hwnd, (HMENU)IDB_BUTTON_1,
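hInstSave, (LPVOID)NULL);
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"button 2",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
50, 80, 100, 20,
hwnd, (HMENU)IDB_BUTTON_2,
hInstSave, (LPVOID)NULL);
CreateWindowEx(BS_PUSHBUTTON,
"button", // window class name
"exit",
WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
220, 20, 100, 20,
hwnd, (HMENU)IDB_EXIT,
hInstSave, (LPVOID)NULL);
// create a one-line edit box
// (placed later in WM_SIZE event)
hWndEdit = CreateWindowEx(ES_LEFT,
"edit", NULL,
WS_VISIBLE | WS_CHILD |
ES_LEFT | WS_BORDER |
ES_AUTOHSCROLL | WS_TABSTOP |
DS_3DLOOK,
0, 0, 0, 0,
hwnd, (HMENU)IDE_EDIT,
hInstSave, (LPVOID)NULL);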
// default text in the edit box
strcpy( editBuf, "default");
ReleaseDC(hwnd, hdc);
return 0;
Figure 7.20 Processing the WndProc() WM_CREATE message.

case WM_SIZE:
cxClient = LOWORD(lParam);
cyClient = HIWORD(lParam);
// draw the edit window

MoveWindow(hWndEdit, 240, 80, 100, 20, TRUE);
// set the default text in the edit window

SendMessage(hWndEdit, WM_SETTEXT,
(WPARAM)0, (LPARAM)editBuf);
//MessageBox(hwnd, "size", "TestApp", MB_OK);
return 0;
case WM_PAINT:
hdc = BeginPaint(hwnd, &ps) ;
// set text prompt

//SetBkColor(hdc, GetSysColor(COLOR_WINDOW) );
SetBkMode(hdc, TRANSPARENT);
SetTextColor(hdc, GetSysColor(COLOR_WINDOWTEXT));
prompt = "Enter:";
TextOut(hdc, 190, 82, prompt, lstrlen(prompt));
EndPaint(hwnd, &ps);
return 0;
case WM_DESTROY:
//MessageBox(hwnd, "destroy", "TestApp", MB_OK);
PostQuitMessage(0);
return 0;
default:
return DefWindowProc(hwnd, message, wParam, lParam);
}
return 0;
}
Figure 7.21 Remaining WndProc() messages.

activity 7.4
Windows Applications
Compile and run the supplied nores.c code, using
gcc -mwindows -mno-cygwin -e_mainCRTStartup %1.c -o %1.exe
Check that it runs and the buttons operate as expected.

Now add the following:
1. Just before the return statement in the WM_SIZE message, the call
MessageBox(hwnd, "size", "TestApp", MB_OK);
2. Just before the return statement in the WM_PAINT message, the
call
MessageBox(hwnd, "paint", "TestApp", MB_OK);
3. Just before the PostQuitMessage() statement in the WM_DESTROY
message, the call
MessageBox(hwnd, "destroy", "TestApp", MB_OK);
Re-compile and test what happens when the window is created, de-
stroyed, iconified, maximized, and obscured by another window then
made visible.
7.7 Module Summary

Methods of task communication have been discussed, including
Message queues and pipes.
Shared memory and file memory mapping.
The use of queued message buffers or shared memory areas depends on the particular
problem at hand. Understanding the differences between message queues and shared
memory is quite important.
Further Reading
Module 8
PROCESS
SYNCHRONIZATION
AND TIMING
8.1 Module Overview

The previous two modules have introduced the notion of multitasking, using processes
and threads, and the means by which co-operating processes may communicate. The
other side of the coin is how to synchronize processes when necessary. If
the communication is asynchronous via message queues or sockets, the normal flow of
control proceeds in the recipient task. However, the situation of an expected message
not arriving in time must also be catered for. If the process communication is via shared
memory, access to the memory segment must be controlled in some way, otherwise the
memory may become corrupted (as was demonstrated in the threads example). This
is a particular case of the more general problem of access to shared resources, typically
disk files and memory. If the communication is to flag some extraneous event, there
is no choice but to have a non-sequential method of handling the response. Because
there are many problems, a number of solutions have developed. Those covered here
are:
Events, signals & timers.
File locking & semaphores.
In any given operating system, some or all of these may be supported to varying degrees.
8.2 Process Notification

This section examines some methods of signalling processes in Windows NT and Unix.
8.2.1 Unix Signals

Unix systems support signals, which are asynchronous notifications of some event
outside the process. They are termed asynchronous because they are not handled in the
normal flow of execution, but are more akin to a hardware interrupt. A signal handler is
a function which is invoked on receipt of a particular signal. Many signal types are
defined, such as SIGKILL and SIGTERM when the process is to terminate, and SIGALRM
when a timer alarm expires. These are defined in the include file signal.h. (SIGKILL
itself cannot be caught by a handler.) To register a signal handler, the
signal(SIGNAME, SigHandler) function must be called. The
function SigHandler() is invoked when the named signal is received. Because this
could potentially occur at any time, both the application itself and the signal handler
must be written with this in mind. The signal handler returns nothing (that
is, a void data type) because it is not explicitly invoked (called) by another function.
The handlers in the listings that follow are declared with empty parameter lists;
strictly, the system passes the signal number to the handler as a single int argument.
Note that on systems with the traditional System V behaviour the signal trigger is a
one-off, and it is necessary to reset the signal action in the signal handler routine.
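Where the POSIX sigaction() call is available, a handler can be installed so that it stays registered after each delivery, which removes the need to call signal() again inside the handler. The following is a minimal sketch under that assumption and is not a replacement for Figure 8.1. The names InstallCounter(), CountingHandler() and SigCount are illustrative only, and the handler merely increments a counter because few library functions are safe to call from inside a handler.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

/* Incremented by the handler; sig_atomic_t may safely be written there. */
volatile sig_atomic_t SigCount = 0;

/* A handler receives the number of the signal that was delivered. */
static void CountingHandler(int signo)
{
    (void)signo;            /* not needed in this sketch */
    SigCount++;
}

/* Install CountingHandler for signal signo.
 * Returns 0 on success, -1 on error.
 * With sigaction() the handler remains installed after each delivery. */
int InstallCounter(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = CountingHandler;
    sigemptyset(&sa.sa_mask);       /* block no extra signals in the handler */
    sa.sa_flags = SA_RESTART;       /* restart interrupted system calls */
    return sigaction(signo, &sa, NULL);
}
```

A caller would typically invoke InstallCounter(SIGINT) once during start-up and then examine SigCount from the normal flow of control.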
Figure 8.1 shows the setting up of signal handling for a timer and for an interrupt
(control-c). The timer is set using the setitimer() function, with an initial delay of
3 seconds and a repeat interval of 0.5 seconds. The signal handlers are registered
following this. The while(1) loop is simply to keep the process from exiting while the
signals are trapped. Here the loop blocks in getchar(); a loop which instead polls
continuously is termed a busy wait or a spin loop, and should be
avoided in practice because it is simply wasting processor time. In reality, blocking
calls such as sleep(), pause() or wait() are preferable.
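As an illustration of waiting without a spin loop, the sketch below blocks in sigsuspend() until a given number of timer signals has arrived. It is an assumption-laden example rather than part of the supplied code: WaitForTicks(), TickHandler() and Ticks are hypothetical names, and SIGALRM is kept blocked except while waiting so that a tick cannot be lost between testing the counter and going to sleep.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t Ticks = 0;

static void TickHandler(int signo)
{
    (void)signo;
    Ticks++;
}

/* Sleep until `count` SIGALRM ticks, `usec` microseconds apart, have
 * arrived. Returns the number of ticks seen, or -1 on error. */
int WaitForTicks(int count, long usec)
{
    struct sigaction sa;
    struct itimerval tv;
    sigset_t block, old, waitmask;
    int seen;

    if (usec <= 0)
        return -1;                      /* a zero interval disarms the timer */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = TickHandler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0)
        return -1;

    /* Keep SIGALRM pending except while inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;
    waitmask = old;
    sigdelset(&waitmask, SIGALRM);

    Ticks = 0;
    tv.it_interval.tv_sec = usec / 1000000;
    tv.it_interval.tv_usec = usec % 1000000;
    tv.it_value = tv.it_interval;
    if (setitimer(ITIMER_REAL, &tv, NULL) < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (Ticks < count)
        sigsuspend(&waitmask);          /* process sleeps; no CPU is used */

    memset(&tv, 0, sizeof tv);
    setitimer(ITIMER_REAL, &tv, NULL);  /* stop the timer */
    seen = (int)Ticks;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return seen;
}
```

For example, WaitForTicks(3, 500000L) would return after roughly 1.5 seconds while consuming essentially no processor time.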
The signal handlers are shown in Figure 8.2. Each returns nothing and, apart from the
SIGTERM handler (which exits), re-registers itself before returning. A short run of the
program is shown in Figure 8.3.
8.2.2 Windows NT Events

As seen in the previous module, Windows uses a callback function for synchronous
event notification. A system timer identified by IDTIMER_MESSAGE may be given
an expiry time (in milliseconds) using the SetTimer() function. For example:
SetTimer(hwnd, IDTIMER_MESSAGE, TIMEOUT_MESSAGE*1000, NULL);
requests Windows to send a WM_TIMER message to WndProc() each time the interval
elapses. The timer is destroyed using KillTimer(hwnd, IDTIMER_MESSAGE). The switch
statement in the main callback procedure may be used to handle the timer event by
checking which timer the event belongs to:
case WM_TIMER:
if( wParam == (WPARAM)IDTIMER_MESSAGE )
{
messageCount++ ;
// other operations
}
return 0;
Another mechanism is to define a callback function exclusively for the timeout. This is
done using
SetTimer(hwnd, IDTIMER_FUNCTION, TIMEOUT_FUNCTION*1000, TimerFunc);
The callback function is then called at the timeout intervals:
VOID CALLBACK TimerFunc(HWND hwnd, UINT msg, UINT timerID, DWORD SysTime)
{
procCount++ ;
// other processing
}
#include <signal.h>
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
// signal-handling functions
void SigIntHandler();
void SigHupHandler();
void SigTermHandler();
void SigAlrmHandler();
int main()
{
struct itimerval Timeout ;
Timeout.it_interval.tv_sec = 0;
Timeout.it_interval.tv_usec = 500000;
Timeout.it_value.tv_sec = 3;
Timeout.it_value.tv_usec = 0;
if( setitimer(ITIMER_REAL, &Timeout, NULL) < 0)

{
perror("setitimer()");
exit(1);
}
// SIGALRM = 14
signal(SIGALRM, SigAlrmHandler);
// SIGINT = 2
signal(SIGINT, SigIntHandler);
// SIGHUP = 1
signal(SIGHUP, SigHupHandler);
// SIGTERM = 15
// send via kill -15 pid or kill -TERM pid
signal(SIGTERM, SigTermHandler);
printf("Signals have been set up...\n");
while(1)
{
getchar();
printf("waiting...\n");
}
printf("Exiting.\n");
exit(0);
}
Figure 8.1 Unix signals initializing (part 1 of 2).

// SIGHUP = 1
void SigHupHandler()
{
printf("HANGUP SIGNAL\n");
// reset the signal handler

signal(SIGHUP, SigHupHandler);
}
// SIGINT = 2
void SigIntHandler()
{
printf("INTERRUPT SIGNAL\n");

signal(SIGINT, SigIntHandler);
}
// SIGTERM = 15
void SigTermHandler()
{
printf("TERMINATE SIGNAL\n");

//signal(SIGTERM, SigTermHandler);
printf("Exiting now.\n");
exit(1);
}
// SIGALRM = 14
void SigAlrmHandler()
{
printf("ALARM TIMER SIGNAL\n");

signal(SIGALRM, SigAlrmHandler);
}
Figure 8.2 Unix signal-handling functions (part 2 of 2).

* Example output:
ALARM TIMER SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
INTERRUPT SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
ALARM TIMER SIGNAL
waiting...
TERMINATE SIGNAL
Exiting now.
Figure 8.3 Output of the Unix signals test, for pid 213. The commands kill -INT 213
and kill -TERM 213 were entered at another console.
8.3 Atomic Resource Locking

In many instances, access to resources must be exclusive or atomic. For example,
if two processes run concurrently on a system, and one writes a particular file and
the other reads the same file, each read or write must run to completion without inter-
ference from the other. Attempting to read a partially written file, for example, gives
incorrect results.
Another easily-understood situation where this problem may occur is in accessing
money in a bank. Suppose two people share a joint bank account. Each is
independently making a withdrawal of funds from the account. A naive algorithm for the
bank's processing would be:
Given CurrentBalance, RequestedFunds

if RequestedFunds <= CurrentBalance
    Allow withdrawal
    CurrentBalance = CurrentBalance - RequestedFunds
else
    Disallow withdrawal, reason = "insufficient funds"
endif
Suppose one withdrawal request is part-way completed. The check for sufficient funds
has been completed and the withdrawal has been allowed. The balance is about to
be decremented. Now suppose the other process runs, but for whatever reason (faster
CPU, less loading on that CPU, etc.) it runs the entire algorithm whilst the first is paused.
The check for sufficient funds succeeds, as the balance has not yet been decremented
by the first process. Now the second process allows the withdrawal and decrements the
balance. If, say, the original balance was $100, the first requested $70, and the second
requested $60, then they could both succeed, with the net result that $130 is withdrawn
and the balance becomes negative (-$30). If the bank does not allow overdrawn accounts
(negative balances), the above represents a processing failure.
This situation may arise in many real-world settings: for example, an airline database
checking for available seats and then booking the seats, or a computer disk being
backed up whilst users are writing files to the disk.
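One way to remove the race in the bank example, sketched here for threads within a single process, is to hold a lock across both the balance test and the deduction. The Account type and the functions AccountInit() and Withdraw() are hypothetical names introduced for illustration; separate processes would need a lock visible to all of them, such as a lock file or semaphore, in place of the mutex.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;   /* protects balance */
    long balance;           /* whole dollars, for simplicity */
} Account;

void AccountInit(Account *a, long opening)
{
    pthread_mutex_init(&a->lock, NULL);
    a->balance = opening;
}

/* Returns 1 if the withdrawal was allowed, 0 if funds were insufficient.
 * The test and the deduction are made while holding the same lock, so
 * no other thread can run between them. */
int Withdraw(Account *a, long requested)
{
    int allowed = 0;

    pthread_mutex_lock(&a->lock);
    if (requested <= a->balance) {
        a->balance -= requested;
        allowed = 1;
    }
    pthread_mutex_unlock(&a->lock);
    return allowed;
}
```

With an opening balance of $100, a request for $70 followed by a request for $60 now results in the second request being refused, whichever thread happens to run first.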
From the above discussion, it is clear that the entire operation, comprising the
availability test and the actual transaction (bank balance deduction), must be atomic.
In more general terms, in order to acquire exclusive access to a shared resource, a
process must:
1. Wait for the shared resource to become available.
2. Signify (flag) that it is using the shared resource.
3. Use the shared resource.
4. Release the resource and the locking flag.
The first two must be performed as an atomic operation, a concept which has been met
before. If, between checking if the resource is free and actually locking the resource,
the process gets interrupted, another process may run and capture the resource lock.
Thus there is the potential that two processes have captured the resource lock, with
both accessing the shared resource simultaneously: the very situation that was to
be avoided. Another scenario, that of deadlock, is possible. In this case, each of two
processes is waiting for a resource which the other holds but cannot release. Both
processes will then wait indefinitely.
One other issue which ought to be noted here is the wait for the shared resource. If
this is a busy wait, then processor time is needlessly spent polling a resource lock.
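Steps 1 and 2 of the list above can be combined into a single indivisible test-and-set. The sketch below uses the C11 atomic_flag type to show the idea for threads sharing one address space; it is not how the later examples in this module are implemented, and the names ResourceBusy, TryAcquire() and Release() are assumptions made for illustration.

```c
#include <stdatomic.h>

/* Clear means the resource is free; set means it is in use. */
static atomic_flag ResourceBusy = ATOMIC_FLAG_INIT;

/* Test the flag and set it in one indivisible step.
 * Returns 1 if the caller now holds the resource, 0 if it was busy. */
int TryAcquire(void)
{
    /* atomic_flag_test_and_set() returns the value the flag had before */
    return !atomic_flag_test_and_set(&ResourceBusy);
}

/* Release the resource and the locking flag together (step 4). */
void Release(void)
{
    atomic_flag_clear(&ResourceBusy);
}
```

Because the test and the set cannot be separated, two callers can never both receive 1 without an intervening Release(). A caller that receives 0 should sleep or block before trying again, rather than polling in a tight loop.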
8.3.1 Lock Files
One method of handling the resource-locking problem is to use semaphores, discussed
in the next section. Another method is to use a lock file. Simply put, this is a file whose
contents are unimportant, but the mere fact of its existence signals that exclusive
access to a resource (possibly another file or a database record, for example) has been
granted to a particular process.
Figure 8.4 shows a process which locks a resource before attempting to use it. One
possible method of handling lock files is shown in the support functions of Figure 8.5.
The getLock() function enters what appears to be an infinite loop. Inside the loop, it
attempts to create the lock file. If it can be created, the function returns, as the existence
of the lock file signifies to other processes that the resource is in use. If creation of
the lock file fails, it is assumed to be because the file already exists (the open()
function is called with the create and exclusive flags). Note that this only works because
the combination of the O_CREAT and O_EXCL flags guarantees an atomic test-and-create
operation in Unix.
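The open() call can fail for reasons other than the lock file already existing (a missing directory or lack of permission, for instance). A single lock attempt that distinguishes these cases by examining errno might look like the following sketch. The function name TryLockOnce() and its return convention are assumptions for illustration and differ from getLock() in Figure 8.5.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Make one attempt to create the lock file.
 * Returns the file descriptor (>= 0) if the lock was obtained,
 *         -1 if the lock file already exists (resource in use),
 *         -2 on any other error; errno is left as set by open(). */
int TryLockOnce(const char *lockfile)
{
    /* O_CREAT | O_EXCL: create the file, failing if it already exists */
    int fd = open(lockfile, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd >= 0)
        return fd;
    return (errno == EEXIST) ? -1 : -2;
}
```

A polling loop would then sleep and retry only on a return of -1, and report the error immediately on -2 rather than waiting for a lock that can never be obtained.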
The attempt to acquire the lock is designed to fail gracefully. If the number of times
around the loop exceeds WAIT_LOCKTIME then the function gives up. In addition, the
loop is not a busy wait due to the sleep(1) call. This allows intervals of 1 second
between polling of the lock file. Such mechanisms are required in robust systems. For
example, if a process acquired the lock and the process died or was killed for some
reason, the lock file would remain but belong to nobody. Without the timeout, other
processes would wait indefinitely for the lock file to be released. Ideally this program
should incorporate a signal handler, so that termination signals (SIGINT, SIGHUP,
SIGTERM) are caught and the lock file removed.
This will prevent problems if the application is killed with a lock held (the lock file still
exists). On Unix systems, lock files are usually created in a common directory such as
/var/lock/. This can then be checked and orphaned lock files removed on startup.
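The signal-based clean-up suggested above could be arranged roughly as follows. This is a sketch under stated assumptions, not part of lockfile.c: InstallLockCleanup(), CleanupHandler(), LockPath and TerminateRequested are hypothetical names. The handler removes the lock file and sets a flag, leaving the main program to notice the flag and exit in an orderly way.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static char LockPath[256];                  /* set before handlers are installed */
volatile sig_atomic_t TerminateRequested = 0;

/* unlink() is among the functions POSIX lists as safe inside a handler. */
static void CleanupHandler(int signo)
{
    int savedErrno = errno;                 /* do not disturb the interrupted code */

    (void)signo;
    unlink(LockPath);
    TerminateRequested = 1;
    errno = savedErrno;
}

/* Arrange for lockfile to be removed on SIGINT, SIGHUP or SIGTERM.
 * Returns 0 on success, -1 on error. */
int InstallLockCleanup(const char *lockfile)
{
    static const int sigs[] = { SIGINT, SIGHUP, SIGTERM };
    struct sigaction sa;
    size_t i;

    if (strlen(lockfile) >= sizeof LockPath)
        return -1;                          /* path too long for the buffer */
    strcpy(LockPath, lockfile);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = CleanupHandler;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++)
        if (sigaction(sigs[i], &sa, NULL) < 0)
            return -1;
    return 0;
}
```

The main program would call InstallLockCleanup() just before attempting to take the lock, and test TerminateRequested at convenient points. Note that SIGKILL cannot be caught, so checking for orphaned lock files on startup remains necessary.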
Figure 8.6 shows the example running as two separate processes. The first starts and
acquires the lock. The second waits and checks the lock file at 1 second intervals, until
finally the first process releases the lock by removing the lockfile.
8.4 Semaphores
Another method which is supported in many systems is the semaphore. Semaphores are
similar in concept to the thread locks and atomic increments met in the module on
multitasking, in the context of threads. The so-called Dijkstra "PV" operations on
semaphores are defined as:
P decrease (not available, capture)
V increase (available, release)
Thus a semaphore is a "free" flag for a resource, indicating that the resource is free
and able to be used. A P operation decrements the count and flags the semaphore
as locked. If the count is already zero, the caller waits until another process releases
the semaphore. A V operation increments the count and thus releases the flag. These
operations must be done using a kernel function.
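The counting logic behind the P and V operations can be shown in a few lines. The sketch below uses C11 atomics and is for illustration only: it works between threads of one process and returns immediately instead of blocking, whereas a kernel semaphore puts the caller to sleep until the count becomes positive. CountSem, SemInit(), TryP() and V() are names assumed for this example.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int count;       /* > 0: available, 0: not available */
} CountSem;

void SemInit(CountSem *s, int initial)
{
    atomic_init(&s->count, initial);
}

/* P operation: decrement the count if it is positive.
 * Returns 1 on success, 0 if the resource is not available. */
int TryP(CountSem *s)
{
    int c = atomic_load(&s->count);

    while (c > 0) {
        /* Succeeds only if the count is still c; otherwise c is
         * reloaded with the current value and the test is repeated. */
        if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
            return 1;
    }
    return 0;
}

/* V operation: increment the count, releasing the resource. */
void V(CountSem *s)
{
    atomic_fetch_add(&s->count, 1);
}
```

Initialized to 1, such a counter behaves as a simple lock; initialized to N, it admits up to N holders at once.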
8.4.1 Unix Semaphores
Figures 8.7 and 8.8 show a Unix client program which uses semaphores. The semaphore
is first created (Figure 8.9). After all processes have terminated, the semaphore may
be deleted from the system. Obviously this must be done by the parent process.
Figure 8.10 shows the encapsulation of the semop() system call to acquire and release
the semaphore. After the semaphore is created, its status may be examined using the
ipcs -s command as shown in Figure 8.11.
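When several processes share one semaphore key, it is usually desirable that only the process which actually creates the semaphore sets its initial value. One possible arrangement, sketched below, adds the IPC_EXCL flag so that creation fails if the semaphore already exists. This is an assumption-based variation rather than the method of Figure 8.9: OpenOrCreateSem() and union semun_arg are hypothetical names, and a short window remains between creation and initialization during which another process could attach.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On many systems the caller must define this union (see man semctl). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Returns the semaphore ID, or -1 on error. *created is set to 1 if
 * this call created and initialized the semaphore, 0 if it existed. */
int OpenOrCreateSem(key_t key, int *created)
{
    union semun_arg arg;
    int id = semget(key, 1, IPC_CREAT | IPC_EXCL | 0666);

    if (id >= 0) {
        arg.val = 1;                        /* initially available */
        if (semctl(id, 0, SETVAL, arg) < 0) /* union passed by value */
            return -1;
        *created = 1;
        return id;
    }
    if (errno != EEXIST)
        return -1;                          /* a genuine error */
    *created = 0;
    return semget(key, 1, 0666);            /* attach to the existing one */
}
```

The process for which *created is 1 is then the natural candidate to delete the semaphore once all others have finished.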
/* lockfile.c - creating a lockfile for exclusive access amongst

* several processes.
* John Leis
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h> // open(), close()

#include <unistd.h> // for unlink()
#define WAIT_LOCKTIME 10
int getLock(char *lockfile);
void releaseLock(char *lockfile, int lockfd);
int main()
{
int lockfd;
char *lockfile = "file.lk";
lockfd = getLock(lockfile);
if( lockfd != -1)
{
printf("use resources...\n");
fflush(stdout);
sleep(10);
releaseLock(lockfile, lockfd);
}
else
{
printf("failed to obtain lock\n");
fflush(stdout);
}
exit(0);
}
Figure 8.4 Using lock files: main calling code.

// returns -1 on fail, otherwise the lockfile descriptor

int getLock(char *lockfile)
{
int tryCount = 0;
int lockfd;
do
{
// check if the number of iterations has been exceeded
if(++tryCount > WAIT_LOCKTIME)
{
printf("getLock(): timeout on obtaining lock\n");
return -1;
}
// Try to open the lockfile.

// O_CREAT | O_EXCL fails (returns -1) if the
// file already exists.
// The test is atomic (see man page)
lockfd = open(lockfile, O_CREAT | O_EXCL, 0666);
if(lockfd == -1)
{
// lockfd == -1, file creation error
printf("getLock(): lockfile already exists\n");
sleep(1);
}
} while( lockfd == -1 );
printf("getLock(): ok\n");
return lockfd;
}
// release the lock

// Ideally this should be invoked on a signal as well
// to clean up the lockfile
void releaseLock(char *lockfile, int lockfd)
{
close(lockfd);
unlink(lockfile);
}
Figure 8.5 Lock file support functions.

* Example output:
Process 1
phanes (leis) [4] lockfile
getLock(): lockfile created
use resources...
phanes (leis) [5]
Process 2 (run slightly later)
phanes (leis) [57] lockfile
getLock(): lockfile already exists
getLock(): lockfile created
use resources...
phanes (leis) [58]
Figure 8.6 Output of lock file test.
8.4.2 Windows NT Semaphores
Semaphores are created under Windows NT using the system call CreateSemaphore().
Figure 8.12 shows the creation of the semaphore, which is then captured using
WaitForSingleObject() as shown in Figure 8.13. Note the possible return codes from
this call: the semaphore was acquired within the specified timeout, the timeout expired
before the semaphore was available, or the wait was abandoned altogether.
Any number of client processes may then attach to the system semaphore and wait on
it. Figure 8.14 shows that a handle to the semaphore is requested using OpenSemaphore()
in much the same way as a file is opened. Next, WaitForSingleObject() is used to
protect any accesses to shared resources. In the example of Figure 8.15
this is simply a Sleep() call, but of course in practice some more realistic operation
would be performed. The actual operation must be done as quickly as possible, as the
longer a process holds a semaphore the longer other processes may be delayed in
waiting for the semaphore to become available.
Figure 8.16 shows the actual output of the semaphore client/server combination. The
timing of the output is important, but is not conveyed in the printed listing.
/* semop.c - semaphore operations under Unix
 * Platform: SunOS
 * Compiler: gcc
 * Run several of these processes in separate terminal windows
 * and verify the result. Use
 *     ipcs -s
 * to view semaphores. Use
 *     ipcrm -s <semid>
 * to remove the semaphore, where <semid> is the ID from ipcs -s
 * John Leis
 */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdio.h>
#include <stdlib.h> // exit()
#include <unistd.h> // getpid(), sleep()
// required definition - see man page for semctl

union semun
{
int val;
struct semid_ds *buf;
ushort_t *array;
} arg ;
/* system-wide key for our semaphore

* This will be the key shown in "ipcs -s"
*/
#define SEM_KEY 0x7645
/* easier access to semaphores when only a single

* semaphore is required (Unix allows an array of semaphores
* as well as many access primitives )
*/
int CreateSem(); // create a semaphore - return ID
void GetSem(int SemID); // set a semaphore - blocking
void ReleaseSem(int SemID);// release the semaphore
void DeleteSem(int SemID); // delete semaphore from the system
Figure 8.7 Semaphores under Unix (part 1 of 2).

int main()
{
int SemID, TestNum , MyPID;
printf("Warning: if the process is stopped before completion, ");

printf("use \n ipcs -s\n");
printf("to check for the semaphore.\n");
printf("Remove the semaphore using \n");
printf("ipcrm -s \n");
SemID = CreateSem();
MyPID = getpid();
// spin the test a few times...

for( TestNum = 0; TestNum < 10; TestNum++)
{
// get the semaphore. will block in GetSem()
// until we have the semaphore
GetSem(SemID);
printf("Process %d has the semaphore\n", MyPID);
fflush(stdout);
// Do some critical operation that requires the

// semaphore. This is usually access to a shared resource
// This section should always be as short as possible!
sleep(1);
printf("Process %d releasing the semaphore\n", MyPID);
// flush the output queue - same reason as before

fflush(stdout);
// release access to the semaphore

ReleaseSem(SemID);
// used to simulate the rest of the process.

// necessary so that other processes can access the
// semaphore (and hence shared resource)
sleep(2);
}
// Delete the semaphore from the system.

// This is normally done only by the 'master'
// process who created the semaphore.
DeleteSem(SemID);
exit(0);
}
Figure 8.8 Semaphores under Unix (part 2 of 2).

/* CreateSem() - create a semaphore.

* Returns the ID of a semaphore
*/
int CreateSem()
{
int SemID;
union semun SemCtlArg;
if( (SemID = semget((key_t)SEM_KEY, 1, IPC_CREAT | 0666)) < 0 )

{
perror("semget()");
exit(1);
}
/* set the semaphore to 1 initially.
 * semctl() takes the union itself as its fourth argument, not a pointer to it
 */
SemCtlArg.val = 1;
if( semctl( SemID, 0, SETVAL, SemCtlArg) < 0)
{
perror("semctl()");
exit(1);
}
return SemID ;
}
/* DeleteSem() - delete a semaphore resource from the system

*/
void DeleteSem(int SemID)
{
union semun SemCtlArg;
if( semctl( SemID, 0, IPC_RMID, &SemCtlArg) < 0)

{
perror("semctl()");
exit(1);
}
}
Figure 8.9 Semaphore functions for Unix (part 1 of 2).

/* GetSem() - get exclusive access to a semaphore

* for this process. This function will sleep until
* the semaphore is available.
*/
void GetSem(int SemID)
{
struct sembuf SemOpBuf;
/* set the semaphore operation field to -1

* ie. try and decrement the semaphore to zero.
* If it's already zero, sleep until someone
* else releases it.
*/
SemOpBuf.sem_num = 0;
SemOpBuf.sem_op = -1;
SemOpBuf.sem_flg = SEM_UNDO ;
if( semop( SemID, &SemOpBuf, 1) < 0)

{
perror("semop()");
exit(1);
}
}
/* ReleaseSem() - release the semaphore for someone else.

* Note the use of SEM_UNDO in case this process does
* a premature exit() before proper release of the semaphore.
*/
void ReleaseSem(int SemID)
{
struct sembuf SemOpBuf;
SemOpBuf.sem_num = 0;
SemOpBuf.sem_op = 1;
SemOpBuf.sem_flg = SEM_UNDO ;
if( semop( SemID, &SemOpBuf, 1) < 0)

{
perror("semop()");
exit(1);
}
}
Figure 8.10 Semaphore functions for Unix (part 2 of 2).

Example output:
phanes (leis) [11] semop
Process 17516 has the semaphore
Process 17516 releasing the semaphore
Process 17516 has the semaphore
Process 17516 releasing the semaphore
Example ipcs -s output:

phanes (leis) [39] ipcs -s
IPC status from <running system> Sunday February 13 15:06:32
T ID KEY MODE OWNER GROUP
Semaphores:
s 0 0x187cf --ra-ra-ra- root root
s 327681 0x7645 --ra-ra-ra- leis staff
phanes (leis) [40]
...
phanes (leis) [41] ipcrm -s 393217
phanes (leis) [42] ipcs -s
IPC status from <running system> Sunday February 13 15:08:54
T ID KEY MODE OWNER GROUP
Semaphores:
s 0 0x187cf --ra-ra-ra- root root
Figure 8.11 Output of Unix semaphore test.
/* semserver.c
* Windows NT semaphores
* John Leis
*/
#include <windows.h> // HANDLE, CreateSemaphore(), WaitForSingleObject()
#include <stdio.h>
#include <stdlib.h>
#define SEM_NAME "mysem"
int main()
{
HANDLE hSemaphore;
LONG semIncr = 1L, semPrev;
DWORD waitStatus;
// no security attributes,
// initial count 1 (released), max count 1
hSemaphore = CreateSemaphore( NULL, 1L, 1L, SEM_NAME);
if( hSemaphore == NULL )
{
printf("CreateSemaphore() failed\n");
exit(1);
}
Figure 8.12 Windows semaphore server: semaphore setup (part 1 of 2).
activity 8.1
Windows NT Semaphores.
Compile the code examples semserver and semclient. Create two separate
command (DOS) windows. Invoke the client in one window; it
should fail, as the code assumes the semaphore has already been created
and simply tries to open the existing semaphore. Now run the
server in one window first, and then the client in the other window.
printf("waiting for semaphore....\n");

fflush(stdout);
// decrements count
// timeout is in milliseconds, or INFINITE
waitStatus = WaitForSingleObject(hSemaphore, INFINITE);
printf("semaphore wait done, status = ");
fflush(stdout);
switch( waitStatus )
{
case WAIT_ABANDONED:
printf("abandoned\n");
break;
case WAIT_OBJECT_0:
printf("available within timeout\n");
break;
case WAIT_TIMEOUT:
printf("timeout\n");
break;
case WAIT_FAILED:
printf("failed\n");
break;
}
printf("sleeping for 10 seconds...");

fflush(stdout);
Sleep(10000L);
printf("awake\n");
fflush(stdout);
printf("releasing semaphore\n");
fflush(stdout);
ReleaseSemaphore(hSemaphore, semIncr, &semPrev);
// wait before closing

printf("sleep...");
fflush(stdout);
Sleep(10000L);
printf("awake\n");
fflush(stdout);
CloseHandle(hSemaphore);
return 0;
}
Figure 8.13 Windows semaphore server: capturing the semaphore (part 2 of 2).

/* semclient.c
* Windows NT semaphores
* John Leis
*/
#include <windows.h> // HANDLE, OpenSemaphore(), WaitForSingleObject()
#include <stdio.h>
#include <stdlib.h>
#define SEM_NAME "mysem"
int main()
{
HANDLE hSemaphore;
LONG semIncr = 1L, semPrev;
DWORD waitStatus;
// enable NT event waits

hSemaphore = OpenSemaphore(SEMAPHORE_MODIFY_STATE | SYNCHRONIZE,
TRUE, SEM_NAME);
if( hSemaphore == NULL )
{
printf("OpenSemaphore() failed\n");
exit(1);
}
Figure 8.14 Windows semaphore client: semaphore setup (part 1 of 2).

printf("waiting for semaphore....\n");

fflush(stdout);
// decrements count
// timeout is in milliseconds, or INFINITE
waitStatus = WaitForSingleObject(hSemaphore, INFINITE);
printf("semaphore wait done, status = ");
fflush(stdout);
switch( waitStatus )
{
case WAIT_ABANDONED:
printf("abandoned\n");
break;
case WAIT_OBJECT_0:
printf("available within timeout\n");
break;
case WAIT_TIMEOUT:
printf("timeout\n");
break;
case WAIT_FAILED:
printf("failed\n");
break;
}
printf("sleeping for 2 seconds...");

fflush(stdout);
Sleep(2000L);
printf("awake\n");
fflush(stdout);
printf("releasing semaphore\n");
fflush(stdout);
ReleaseSemaphore(hSemaphore, semIncr, &semPrev);

CloseHandle(hSemaphore);
return 0;
}
Figure 8.15 Windows semaphore client: capturing the semaphore (part 2 of 2).

C:\usr\c\NT>semclient
waiting for semaphore....
semaphore wait done, status = available within timeout
sleeping for 2 seconds...awake
releasing semaphore
C:\usr\c\NT>semserver
waiting for semaphore....
semaphore wait done, status = available within timeout
sleeping for 10 seconds...awake
releasing semaphore
sleep...awake
Figure 8.16 Output from Windows semaphore client/server.

8.5 Module Summary

Synchronization is the cause of many problems in real-time systems, so an understand-
ing of some methods of synchronization is required. The important concepts from this
module are:
Events, signals & timers.
File locking & semaphores.
The conceptual difference between each of these, and where they would be used,
should be fully understood.
Further Reading
Module 9
VIDEO GRAPHICS
AND ANIMATION
9.1 Module Overview

Controlling graphics and audio output devices represents one of the most difficult
real-time design challenges. So-called "hard" real-time systems such as these pose
particular challenges to the software engineer. Not only must the principles of the
underlying hardware be understood, they must be mastered subject to the performance
constraints of the processing system on which they are implemented. For this reason,
a case study of a simple graphics animation represents a very instructive example of a
real-time system. The topics examined in this module are:
Video display basics.
Creating a real-time animation under Windows.
Although video displays are considered here, similar principles are applicable to
real-time audio output. These ideas extend to the implementation of networking protocols
and real-time control systems.
9.2 Video Subsystem

The basic scanned video device, such as a television display or conventional computer
monitor, is depicted in Figure 9.1. An electron beam scans from left to right, top to
bottom, at a very rapid rate. This rate must be fast enough so as to prevent flicker, or
at least make it nearly imperceptible. The illusion of motion is brought about by the
projection of several successive pictures onto the screen in quick succession. If the
images are redrawn too slowly, the motion of objects becomes visible. This is at the
hardware level, and the parameters are set by the design of the monitor (and by impli-
cation, the video graphics display hardware). When constructing synthetic animation,
the problem is compounded by the need to erase the moving object(s) whilst keeping
the background constant. The latter problem is of fundamental interest in what follows.
The video screen image is effectively a matrix of successive picture elements, called
pixels. The pixels are rendered in a particular color according to the values stored in
video memory (RAM). Each pixel corresponds to one or more bits in the video memory.
Because memory is linear (meaning that successive memory locations form a
one-dimensional list of bytes), the video memory must be mapped onto the
two-dimensional screen. (As an aside, the projection of three-dimensional images onto
a two-dimensional viewing plane is done using mathematical viewing transformations
from three dimensions to two, normally in software, sometimes using special-purpose
video processors.)
The mapping of one-dimensional memory into two-dimensional viewing space is
depicted in Figure 9.2. For a given width $w$ and height $h$ in pixels, the position of a
pixel with Cartesian co-ordinate $(x, y)$ is defined as an offset or displacement from the
start of the video memory as $d = yw + x$. This may be derived easily from inspection of
the figure. The convention is that the origin, or point $(0, 0)$, is in the upper-left corner
of the screen, but this need not always be the case. Assuming one byte is required per
pixel, a screen resolution of 1024 (width) by 768 (height) requires 786 432 bytes
(0.75 MByte) of video RAM.
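The offset formula and the memory requirement can be checked with two small functions. This is a sketch for illustration; the names PixelOffset() and VideoBytes() are not taken from the course code, and the origin is assumed to be in the upper-left corner as in the text.

```c
#include <stddef.h>

/* Offset d = y*w + x of pixel (x, y) from the start of video memory,
 * counted in pixels, for a screen that is w pixels wide. */
size_t PixelOffset(size_t x, size_t y, size_t w)
{
    return y * w + x;
}

/* Bytes of video RAM needed for a w-by-h screen when each pixel
 * occupies bytesPerPixel bytes. */
size_t VideoBytes(size_t w, size_t h, size_t bytesPerPixel)
{
    return w * h * bytesPerPixel;
}
```

For a 1024 by 768 screen, PixelOffset(3, 2, 1024) gives 2051 (two full rows plus three pixels), and VideoBytes(1024, 768, 1) gives 786432. When a pixel occupies more than one byte, the pixel offset must also be multiplied by the number of bytes per pixel to obtain a byte address.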
Figure 9.1 Scanning of the beam on a video screen, showing the visible scan lines and
the horizontal and vertical retrace.
One or more bits in each video memory location define, directly or indirectly, the color
of the displayed pixel. At the most basic level, a single bit could be used to represent
black or white. Representing color becomes more complicated. The three primary
colors (red, green, and blue, or RGB) may be combined in the appropriate amounts
to produce any desired color. Thus one common scheme is to reserve three bytes
for each pixel, one byte for each component of the RGB triplet. Each byte is used
as an intensity for the red, green and blue electron guns of the display. A DAC or
digital to analog converter performs the conversion from a binary value into a relative
voltage strength. Thus the value of each byte, 0 to 255, represents the relative
strength of each color. For example, red 0%, green 100% and blue 100% may be mixed
to represent cyan; red 100%, green 100% with no blue gives a strong yellow. Equal
red, green and blue gives a shade of gray ranging from black to white.
Using a direct representation gives the best possible color rendition, at the expense
of the amount of memory required. Using three bytes per pixel, a screen resolution of
1024 (width) by 768 (height) requires over 2 MBytes of video RAM.
Figure 9.2 Mapping of linear screen memory into two dimensions (offset = yw + x).
An alternative is an indirect representation using a palette as shown in Figure 9.3. Each
pixel then (usually) corresponds to one byte. The byte is used as an offset into the
palette. The palette is used as a color lookup table. Each entry is comprised of three
values, for red, green and blue. The relative strength of the RGB triplet defines the color
represented by that palette entry. The advantage of this scheme is that substantially
less memory is required for the video display. The obvious disadvantage is that fewer
colors are available. A more subtle problem is that of color re-use. In a windowing
system, if one window is very graphics-intensive and uses up most of the color palette
entries, other applications suffer.
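The palette lookup amounts to using the pixel byte as an array index. A minimal sketch follows, assuming one byte per pixel and therefore a 256-entry palette; the RGB type and the names Palette, SetPaletteEntry() and PixelColor() are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t red, green, blue;   /* relative strength, 0 to 255 */
} RGB;

/* Color lookup table: one entry for each possible pixel byte value. */
static RGB Palette[256];

void SetPaletteEntry(uint8_t index, uint8_t r, uint8_t g, uint8_t b)
{
    Palette[index].red = r;
    Palette[index].green = g;
    Palette[index].blue = b;
}

/* Color of the pixel stored at the given offset in video memory:
 * the byte found there is an index into the palette. */
RGB PixelColor(const uint8_t *videoMem, size_t offset)
{
    return Palette[videoMem[offset]];
}
```

With this layout a 1024 by 768 screen needs 786432 bytes of pixel data plus 768 bytes of palette, compared with 2359296 bytes for the direct three-byte scheme, at the cost of only 256 distinct colors on screen at once.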
High-speed graphics cards often use an interleaving scheme as shown in Figure 9.4.
This is because the data access time of RAM, defined as the time from when the
address is presented to when the data is available, must be quite small for high-resolution,
fast-refresh video displays. Video RAM having a fast access time is generally more
expensive. To use lower-speed memory, interleaving fetches the byte for pixel $n$ from
bank 1, the byte for pixel $n+1$ from bank 2, and so forth. Using, say, four memory
banks in an interleaved fashion means that each bank need only deliver data at
approximately one-quarter of the rate, so slower RAM may be used, which reduces
system cost.
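The assignment of successive pixels to banks can be expressed with a remainder and a quotient. The sketch below assumes four banks and a simple round-robin layout, which is one plausible reading of the scheme described; the names NUM_BANKS, BankOf() and IndexInBank() are illustrative, and banks are numbered from 0 here rather than from 1.

```c
#define NUM_BANKS 4     /* assumed; the example in the text uses four banks */

/* Bank holding the byte for pixel n (banks numbered 0 to NUM_BANKS-1). */
unsigned BankOf(unsigned long n)
{
    return (unsigned)(n % NUM_BANKS);
}

/* Position of the byte for pixel n within its bank. */
unsigned long IndexInBank(unsigned long n)
{
    return n / NUM_BANKS;
}
```

Pixels 0, 1, 2 and 3 fall in banks 0, 1, 2 and 3 at index 0, pixel 4 returns to bank 0 at index 1, and so on, so any one bank is accessed only once in every four pixels.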
Figure 9.3 Using a pixel value to lookup the color palette (each palette entry holds red,
green and blue values).
Figure 9.4 Video RAM interleaving (the bytes for successive pixels are held in
successive memory banks).

9.3 Animation Effects

In order to effect the illusion of motion and animation, an image on a screen or in a
window must be erased and then redrawn at a slightly offset position many times per
second. This erase/redraw cycle must be accomplished in less than about 20ms in
order to avoid flicker. This redraw may be assisted by special-purpose hardware. In
effect, two video memories are required: one which is currently being displayed, whilst
the other is being erased and re-drawn. After the nominated display hold-time has
elapsed, the roles of the two buffers are reversed: one is used for output whilst the
other is being reloaded. Such a technique is termed double-buffering, and is also used
in audio playback to avoid audible clicks when a sound playback buffer is reloaded.
As Figure 9.5 indicates, this may be controlled for full-screen animation through a
special control register in hardware. If the animation is to be done within one window on a
Windows system, the double-buffering technique is still applicable, although of course it
is not done using special hardware: naturally, the hardware cannot reflect the position
of all screen windows (only the entire screen). However, special-purpose high-speed
memory copy operations, used with appropriate synchronization, allow the effect of
animation in application windows.
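The bookkeeping involved in double-buffering is small: two pages of memory and an index saying which one is currently displayed. The following sketch shows only that bookkeeping, with a deliberately tiny buffer, and does not touch any real display hardware or Windows API; DoubleBuffer, DbInit(), DbHidden(), DbVisible() and DbSwap() are names assumed for illustration.

```c
#include <string.h>

#define BUF_PIXELS 16   /* tiny page size, for illustration only */

typedef struct {
    unsigned char page[2][BUF_PIXELS];
    int visible;        /* index (0 or 1) of the page being displayed */
} DoubleBuffer;

void DbInit(DoubleBuffer *db)
{
    memset(db, 0, sizeof *db);
}

/* The hidden page: may be erased and redrawn without disturbing
 * what is currently on display. */
unsigned char *DbHidden(DoubleBuffer *db)
{
    return db->page[1 - db->visible];
}

/* The page currently being displayed (read-only). */
const unsigned char *DbVisible(const DoubleBuffer *db)
{
    return db->page[db->visible];
}

/* Reverse the roles of the two pages once the hidden page is complete. */
void DbSwap(DoubleBuffer *db)
{
    db->visible = 1 - db->visible;
}
```

Each animation frame then consists of drawing into DbHidden() and calling DbSwap(). In hardware the swap corresponds to writing the control register; within a window it corresponds to a fast memory copy of the completed page.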
Figure 9.5 Video screen page swapping (a control register selects which of the two
pages, hidden or visible, is displayed).
9.3.1 Direct Repaint Approach

Figure 9.6 shows a snapshot of the winanim1.c application which will now be
discussed. The box and oval are moving slowly in opposite directions, and reverse
direction (bounce) when they hit the window edges. The objective is to achieve smooth
animation without flickering or distortion. The basic idea is from [21], although the
following represents a standard approach to the problem.
Figure 9.6 Windows animation example.
In the first approach, after each timer interval expires:
1. The window is erased by setting it to the background color.
2. The new position of the objects is calculated. This is normally a small offset
depending on the current direction of motion. However, if the object appears to
hit the window's edge, the direction is reversed.
3. The window is re-drawn with the objects in their new position.
This "cartoon-sequence" of animation provides the desired result, provided the
processor is fast enough to handle the timer-based redraws and allow sufficient time for
other processing. However, the performance is not entirely satisfactory. The display is
plagued by a horizontal flickering which is quite noticeable, even on a 400MHz CPU.
Before considering how to solve this problem, the internal workings of the application
will be dissected.
activity 9.1
Windows animation: direct repaint approach.
Compile and run the winanim1.c code using
gcc -mwindows -e_mainCRTStartup winanim1.c -o winanim1.exe

where the flags are:
-m emulation
-e start address
-lgdi32 may be required for some systems
-mno-cygwin is for stand-alone applications
Verify that, as stated in the text, the animation is not very satisfactory
and contains significant flickering in the display window.
The window event-handling is not unlike that discussed in the previous module. The
window callback function processes events, which are passed as an integer identifier
from the operating-system event queue. It is defined as
    LRESULT CALLBACK WndProc(HWND hwnd, UINT message,
                             WPARAM wParam, LPARAM lParam)
where message is the event identifier. Certain event messages have additional
parameters associated with them, such as the identity of the timer which expired in the case
of timeout messages. These are passed in the wParam and lParam variables.
Because the code is event-driven (meaning that it is only activated when an event
occurs), the state of the application window must be saved between events. In this case,
the current position of the objects is saved using
    static int xPosRect = 0;     // must be static
    static int xPosBall = 50;    // must be static
    static int xStepRect = 2;    // must be static
    static int xStepBall = 1;    // must be static
    static int BallWidth = 60;   // must be static
    static int RectWidth = 50;   // must be static
These are declared as static variables, meaning that they retain their value from one
invocation to the next. This is in contrast to automatic variables, the space for which is
allocated each time the function is called. Internally, static variables are stored in the
static data area, whereas automatic variables are stored on the stack. If the above were
not declared as static, they would be re-initialized each time WndProc() is invoked, and
the updated positions would be lost.
For this reason, Windows applications tend to use many global and/or static variables.
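The difference between the two storage classes can be seen in a short sketch. The function
names below are hypothetical and serve only to contrast the two cases.

```c
/* Counter held in a static variable: initialized once, retained between calls. */
static int next_static(void)
{
    static int count = 0;
    count += 1;
    return count;
}

/* Counter held in an automatic variable: re-initialized on every call. */
static int next_automatic(void)
{
    int count = 0;
    count += 1;
    return count;
}
```

Successive calls to next_static() return 1, 2, 3 and so on, whereas next_automatic()
returns 1 every time. The object positions in WndProc() need the first behaviour.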
    case WM_TIMER:
        InvalidateRect(hwnd, NULL, TRUE);   // whole window, erase background
        return 0;
Figure 9.7 Windows animation, version 1: handling the WM_TIMER message.
The timer event is used to handle the window update. Figure 9.7 shows the WM_TIMER
event, which simply calls InvalidateRect(). This has the effect of queueing a paint
message for the entire window.
Upon receipt of a WM_PAINT message, the window is re-drawn by filling the rectangle
with a background color, followed by re-drawing of the objects in their new positions.
Figure 9.8 shows the WM_PAINT event, which implements the erase-redraw strategy.
The objects' positions are then calculated in readiness for the next iteration.
9.3.2 Memory-Copy Approach

The erase-redraw strategy as outlined in the previous section, although correct,
produces a less-than-satisfactory animation. A better strategy is to use double buffering.
In essence, this means that the updates to the window are done using a buffer which
represents the window's contents. The buffer itself is not visible whilst it is being
updated. Then, the contents of the buffer are copied as a bitmap (pixel map) to the
visible window. Thus two buffers are required: the hidden buffer and the visible screen
window itself. This is termed double-buffering.
In order to allocate the buffer, the size must be known. This changes when the window
is resized, and is captured when a WM_SIZE event is sent to the callback procedure. The
necessary code is shown in Figure 9.9. GetClientRect() captures the new size of the
window, and a memory bitmap of that size is created using CreateCompatibleBitmap().
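The resize handling can be illustrated without the Windows API. In the following sketch,
the structure backbuf and the function backbuf_resize() are hypothetical stand-ins for the
memory bitmap and the WM_SIZE handler; one byte per pixel is assumed for simplicity.

```c
#include <stdlib.h>

/* Hypothetical off-screen buffer, one byte per pixel. */
struct backbuf {
    unsigned char *pixels;
    int width, height;
};

/* Called when the window size changes: discard the old buffer (if any)
   and allocate one matching the new size. Returns 0 on success, -1 on
   allocation failure. */
static int backbuf_resize(struct backbuf *b, int width, int height)
{
    free(b->pixels);                    /* free(NULL) is a no-op */
    b->pixels = NULL;
    b->width = b->height = 0;
    if (width <= 0 || height <= 0)
        return 0;                       /* nothing to draw into */
    b->pixels = calloc((size_t)width * (size_t)height, 1);
    if (b->pixels == NULL)
        return -1;
    b->width = width;
    b->height = height;
    return 0;
}
```

The drawing code should check that pixels is not NULL before using the buffer, in the
same way that the timer handler in Figure 9.10 checks hBitMap.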
The redraw is implemented as follows. Figure 9.10 shows the WM_TIMER event, which
now draws to a memory bitmap rather than to the screen directly. The device context
of the visible window is obtained using GetDC(hwnd). A memory device context is
created using CreateCompatibleDC(), and the bitmap which represents the contents of
the window is selected into it using SelectObject(). The drawing commands such as
Rectangle() use the handle to the memory buffer, hdcMem. Finally, the bitmap buffer is
copied in its entirety to the visible window using BitBlt(). This function is optimized
for high-speed copying of an entire bitmap, and thus eliminates the flicker which was
present in the previous example.
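The draw-then-copy sequence can be modelled in portable C as follows. This is a sketch
under stated assumptions: the two arrays stand in for the memory bitmap and the visible
window, and memcpy() plays the role that BitBlt() plays in the Windows code. The names
hidden, visible and render_frame are illustrative only.

```c
#include <string.h>

#define W 8                     /* buffer width in "pixels" (illustrative) */
#define H 2                     /* buffer height in "pixels" (illustrative) */

static unsigned char hidden[H][W];      /* off-screen buffer */
static unsigned char visible[H][W];     /* stands in for the window */

/* Compose a complete frame off-screen, then copy it in one operation. */
static void render_frame(int x)
{
    memset(hidden, '.', sizeof hidden);         /* erase to background */
    if (x >= 0 && x < W)
        hidden[0][x] = '#';                     /* draw the object */
    memcpy(visible, hidden, sizeof visible);    /* analogous to BitBlt() */
}
```

Because the erase and the drawing happen only in the hidden array, the visible array
never holds a partially drawn frame; it changes from one complete frame to the next.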
To summarize, the difference between the two methods is as follows. In the first, the
timer is used to trigger a repaint of the window. The drawing is done directly to the
visible window. In the second approach, the timer is used to draw to a hidden bitmap,
which is copied to the visible window. The size event is used to capture the required
size of the hidden bitmap buffer. This double-buffering approach may be used in real-
time systems to guarantee satisfactory performance where direct hardware I/O (typi-
cally video and audio) is required.
    case WM_PAINT:
        hdc = BeginPaint(hwnd, &ps);
        SetMapMode(hdc, MM_TEXT);           // pixel mode
        GetClientRect(hwnd, &ClientRect);
        // filled rectangle for background
        hbrushb = CreateSolidBrush(RGB(200, 200, 200));
        SelectObject(hdc, hbrushb);
        Rectangle(hdc, ClientRect.left, ClientRect.top,
                  ClientRect.right, ClientRect.bottom);
        hbrush = CreateSolidBrush(RGB(0, 200, 0));
        SelectObject(hdc, hbrush);
        // line drawing
        hpen = CreatePen(PS_SOLID, 1, RGB(200, 0, 0));
        SelectObject(hdc, hpen);
        MoveToEx(hdc, 0, 0, &oldPoint);
        LineTo(hdc, ClientRect.right, ClientRect.bottom);
        // ball bouncing
        Ellipse(hdc, xPosBall, 150, xPosBall+BallWidth, 180);
        if( xPosBall+BallWidth > ClientRect.right )
            xStepBall = -xStepBall;         // hit right-hand side
        if( xPosBall < ClientRect.left )
            xStepBall = -xStepBall;         // hit left-hand side
        xPosBall += xStepBall;              // update position
        // box bouncing
        Rectangle(hdc, xPosRect, 50, xPosRect+RectWidth, 100);
        if( xPosRect+RectWidth > ClientRect.right )
            xStepRect = -xStepRect;         // hit right-hand side
        if( xPosRect < ClientRect.left )
            xStepRect = -xStepRect;         // hit left-hand side
        xPosRect += xStepRect;              // update position
        DeleteObject(hpen);
        DeleteObject(hbrush);
        DeleteObject(hbrushb);
        EndPaint(hwnd, &ps);
        return 0;
Figure 9.8 Windows animation, version 1: handling the WM_PAINT message.

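    case WM_SIZE:
        hdc = GetDC(hwnd);
        cxClient = LOWORD(lParam);          // new client-area width
        cyClient = HIWORD(lParam);          // new client-area height
        GetClientRect(hwnd, &ClientRect);   // capture the new size (see text)
        hdcMem = CreateCompatibleDC(hdc);
        // if bitmap exists (old size), delete it
        if( hBitMap )
            DeleteObject(hBitMap);
        // the size arguments are truncated in the source listing;
        // cxClient and cyClient are assumed here
        hBitMap = CreateCompatibleBitmap(hdc, cxClient, cyClient);
        ReleaseDC(hwnd, hdc);               // release the DC obtained by GetDC()
        return 0;
Figure 9.9 Windows animation, version 2: handling the WM_SIZE message.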
activity 9.2
Windows animation: memory copy.
Compile and run the winanim2.c code using
gcc -mwindows -e_mainCRTStartup winanim2.c -o winanim2.exe

Verify that, as stated in the text, the animation is far superior and without
flickering in the display window.
    case WM_TIMER:
        if( ! hBitMap )
            break;
        hdc = GetDC(hwnd);
        SetMapMode(hdc, MM_TEXT);           // pixel mode
        hdcMem = CreateCompatibleDC(hdc);
        SelectObject(hdcMem, hBitMap);
        // filled rectangle
        hbrushb = CreateSolidBrush(RGB(200, 200, 200));
        SelectObject(hdcMem, hbrushb);
        Rectangle(hdcMem, ClientRect.left, ClientRect.top,
                  ClientRect.right, ClientRect.bottom);   // as in Figure 9.8
        hbrush = CreateSolidBrush(RGB(0, 200, 0));
        SelectObject(hdcMem, hbrush);
        // line drawing
        hpen = CreatePen(PS_SOLID, 1, RGB(200, 0, 0));
        SelectObject(hdcMem, hpen);
        MoveToEx(hdcMem, 0, 0, &oldPoint);
        LineTo(hdcMem, ClientRect.right, ClientRect.bottom);
        DeleteObject(hpen);
        // ball bouncing
        Ellipse(hdcMem, xPosBall, 150, xPosBall+BallWidth, 180);
        if( xPosBall+BallWidth > ClientRect.right )
            xStepBall = -xStepBall;         // hit right-hand side
        if( xPosBall < ClientRect.left )
            xStepBall = -xStepBall;         // hit left-hand side
        xPosBall += xStepBall;              // update position
        // box bouncing
        Rectangle(hdcMem, xPosRect, 50, xPosRect+RectWidth, 100);
        if( xPosRect+RectWidth > ClientRect.right )
            xStepRect = -xStepRect;         // hit right-hand side
        if( xPosRect < ClientRect.left )
            xStepRect = -xStepRect;         // hit left-hand side
        xPosRect += xStepRect;              // update position
        // copy from memory bitmap to window bitmap
        BitBlt(hdc, ClientRect.left, ClientRect.top,
               ClientRect.right - ClientRect.left,
               ClientRect.bottom - ClientRect.top,
               hdcMem, ClientRect.left, ClientRect.top, SRCCOPY);
        DeleteObject(hbrush);
        DeleteObject(hbrushb);
        DeleteDC(hdcMem);
        ReleaseDC(hwnd, hdc);               // release the DC obtained by GetDC()
        return 0;
Figure 9.10 Windows animation, version 2: handling the WM_TIMER message.

9.4 Module Summary

This module has brought together a number of concepts from previous modules. The
basic operation of a video-display was outlined, followed by a design and implemen-
tation study of real-time animation. The most important new principle examined in
this module is that of double-buffering. The need for double-buffering should be well
understood.
Further Reading
References
[1] Phillip A. Laplante, Real-Time Systems Design and Analysis: An Engineer's
Handbook, IEEE Press, 1993.
[2] Phillip A. Laplante, "Basic real-time concepts," in Real-Time Systems Design and
Analysis: An Engineer's Handbook, chapter 1. IEEE Press, 1993.
[3] Nancy G. Leveson and Clark S. Turner, "An Investigation of the Therac-25
Accidents," IEEE Computer, vol. 26, no. 7, pp. 18-41, July 1993.
[4] Stuart Ritchie, "Systems Programming in Java," IEEE Micro, vol. 17, no. 3, pp.
30-35, May/June 1997.
[5] Thomas J. Pennello, "Compiler Challenges with RISCs," IEEE Micro, vol. 10, no.
1, pp. 37-43, Feb. 1990.
[6] Ramesh Subramaniam and Kiran Kundargi, "Programming the Pentium
Processor," Dr. Dobb's Journal, pp. 34-42, June 1993.
[7] A. Holub, Compiler Design in C, Prentice-Hall, 1990.
[8] Donald Lewine, POSIX Programmer's Guide: Writing Portable UNIX Programs,
O'Reilly & Associates, Inc., 1991.
[9] Bill O. Gallmeister, POSIX.4: Programming for the Real World, O'Reilly &
Associates, Inc., 1995.
[10] Stephen G. Kochan and Patrick H. Wood, "The UNIX system interface," in Topics
in C Programming, chapter 5. Hayden Books, 1987.
[11] Donald E. Knuth, The Art of Computer Programming, Addison-Wesley, 1973.
[12] Niklaus Wirth, Algorithms + Data Structures = Programs, Prentice-Hall, 1976.
[13] Robert Sedgewick, Algorithms, Addison-Wesley, 1988.
[14] Robert Sedgewick, Algorithms in C, Addison-Wesley, 1990.
[15] Robert Sedgewick, Algorithms in C++, Addison-Wesley, 1992.
[16] Jeffrey H. Kingston, Algorithms and Data Structures: Design, Correctness,
Analysis, Addison-Wesley, 1990.
[17] Mickey Williams, "Threads," in Programming Windows NT4, chapter 22. SAMS
Publishing, 1996.
[18] Herbert Schildt, "Thread-based multitasking," in Windows NT4 Programming,
chapter 15. Osborne McGraw-Hill, 1997.
[19] Borland/Inprise, Win32s API Help, Inprise Corporation, 1999.
[20] Mickey Williams, "Pipes," in Programming Windows NT4, chapter 23. SAMS
Publishing, 1996.
[21] Charles Petzold, Programming Windows 3.1, Microsoft Press, 1993.
