
UNIX SYSTEM PROGRAMMING

(BSIT 61)

Contributing Authors

Dr. G.Raghvendra Rao


Prof. & H.O.D. Dept. of Computer Science
NIE, Mysore


Course Introduction

The present course, Unix Systems Programming, can be considered an extension of
the operating systems course, where you were introduced to the fundamentals of
Unix. As you know, an operating system is a platform which helps the user, or
most often several users, to make optimal use of the system resources such as
CPU time, printers, memory and other storage devices. Unix is one of the most
popular operating systems and was introduced as a multi user, multi process
operating system. The student is expected to have undergone a basic study of
these features in the Unix course.

A study of an operating system can be done at two levels: at the user level or at
the system level. The user level programming must have already been studied by the
student. Here, the user becomes familiar with how to create and open files and
directories, save them and execute them using the facilities of the operating system.
Additional features like protecting them with a password, storing the files under read,
write or execute restrictions, creating different users and so on can also be a part of
user level programming.

At the next level is system programming. Here, we study the environment
under which a Unix program works. Since Unix is a multi user system, it has to
store programs and data pertaining to different users and to ensure that they are made
available only to the respective users (or, in some cases, to other users authorized by the
owner). For this purpose an elaborate system of storing the environment, sharing data
etc. has been designed. The study of this environment manipulation forms the first part
of the study of Unix system programming.

Unix deals with all programs by looking at them as processes. A process can be
thought of as a program, or a portion of one, which can be executed
independently. The combination of all such process interactions gives the desired
program output. Hence Unix has an elaborate system of creating processes, storing
them, taking them on and off resources etc. The sequence in which the processes
execute depends on various parameters and on the system environment. Understanding
this environment and being able to manipulate it is the key to success. This also forms
a part of the study.

Since a number of processes keep interacting and also use the
system resources in a shared manner, it becomes desirable i) to control the access to
resources by the processes depending on the process requirements and ii) to ensure
that there is no overlapping of the resource usage. To do this, the system provides
synchronization mechanisms such as signals and semaphores. We also learn the use of
these mechanisms.

I/O is a very important aspect of programming. Here too, it is
essential that the system be able to perform input and output operations pertaining to
each of the processes. It is also essential to ensure that there is no
overlapping of I/O operations, since that could mix up data or
information pertaining to different processes. Hence, semaphores or record
locking may be resorted to.

Also, there should be a capability for the processes to interact with each other.
Such inter process communication needs communication channels. These are
achieved by pipes, forks, streams etc. We also talk of co-processes and daemon
processes, which run in the background to carry out a variety of operations.

A few programs to illustrate these operations are also included, but detailed
laboratory exercises should supplement the course.

After this bird's eye view of the topics covered, we have some
words about the presentation of the material here.

Since this is an advanced course, several prerequisite courses need to have been
completed by the student. Two of them are C programming and fundamentals of Unix.
Since C and Unix are not separable, a reasonably sound knowledge of C is a must.
In particular, a good understanding of pointers, structures, user defined
data structures etc. is essential to follow the course material.

Talking about user defined data structures, a large number of data types are
defined and used in the course. It is neither possible nor desirable to give an
exhaustive list of them at the beginning itself, since to understand many of them it
is necessary to know something about the context in which they are used. Hence, one often
comes across defined data types like pid_t, sigset_t etc., which represent
process IDs and signal sets respectively. The candidate is advised to assimilate them
as and when they are encountered.

There is an associated problem. Often several descriptions or functions are so
interconnected that one cannot be explained without the other. For example, if two
concepts a and b each need the support of the other, they still have to be
dealt with in some order, a before b or b before a. In each case, the student is advised to
take the unexplained concept for granted until a detailed analysis is made available. In some
cases, that analysis may even be in a different block, to ensure continuity of the concepts involved.
It is for instructors to see that the candidate does not feel out of place. Though every effort
is taken to maintain a sequential and steady flow of the issues involved, in certain cases
it becomes impossible to maintain such a flow. Hence, the instructors need to
guide the candidate at such junctures.

Unix System Programming is mostly about functions and function calls.
However, certain theoretical and conceptual details need to be well understood before
the functions are explained. Since it is a Unix course, many of the basic Unix functions,
even if they are presumed to have been covered in the operating system course, are
reiterated in the first few pages. The student is advised to make sure that these
concepts are clearly understood in detail.

The other important aspect is implementation. Unix is not a single
system. There are several Unix implementations: SVR4, 4.3BSD and 4.3+BSD, to
name a few. There are subtle differences between them. Some of the
functions available in one version may differ in details in another, or may not be
available at all. Hence, implementation details need to be fine tuned to the
version available.

Further, many of the options, like flag settings, are too complicated and detailed to
be included in the course work. Though many of them have been included for the sake
of completeness, in a few cases they have been left out to avoid ambiguity and to
ensure continuity. If the candidate needs further details on them, the
system manuals should be consulted.

The candidates are strongly advised to go through the reference text books,
especially book (1), to get any of their other doubts clarified.

Now we shall go on with the course.

Unit – I

Unit Introduction

In this unit, we recollect the various aspects of Unix fundamentals with which we
are already familiar. Though this unit is supposed to be a repetition, the student is
advised to go through it diligently, since a sound knowledge of not only the
functions available to do the various operations, but also the various ways in which the
arguments are passed, the type of return values one expects from the functions etc., is
vital for following the rest of the course.

The unit itself is divided into 3 blocks. The first block discusses the
fundamentals of Unix file I/O operations. It is expected to clarify the various operations
on Unix files like open, read, write, lseek and close. It also introduces the concept of
file descriptors. For each of the operations, the list of optional arguments that change
the type of interaction with the system is also included. We also talk about
file sharing, duplication of descriptors, atomic operations and how to change the
properties of an already open file.

The second block discusses the various file and directory functions. It begins with
the stat functions, which give us an idea about the status of the files. Then we look
into the different types of files one normally comes across. The block then discusses
the concept of group and user ids and the various access permissions for
operating on files. The block also deals with the way files are actually stored, with the
arrangement of the various data blocks, and with the concept of symbolic links.

The third block deals with the standard I/O functions. It discusses the
need for and concept of buffering, the methods of opening a stream and getting data
into and out of such streams. The other concept dealt with is that of positioning in a
stream. The block also discusses temporary files (which are created
by a program while it runs and go away once the program is done with them), password
files, encrypted passwords and shadow password files.

Block – I

Block Introduction

This block introduces the students to the fundamentals of Unix file I/O operations.
The understanding of these concepts is doubly important because Unix treats
most of its devices and resources as files. The block gives a brief description of each
of the operations, the function prototype, the arguments to be passed and the various
options available. It also gives an account of what return value to expect when the
function succeeds or when it fails.

The following functions are dealt with:

Open function : To open a file
Create function : To create a new file
Close function : To close a file
lseek function : To set the file offset
Read function : To read from an opened file
Write function : To write into an opened file

We also look into the concepts of file sharing and atomic
operations. We also see how to duplicate file descriptors and how to change the
properties of an already open file.

Contents

1.0 Introduction

1.1 Open Function

1.2 Create Function

1.3 Close Function

1.4 Lseek Function

1.5 Read Function

1.6 Write Function

1.7 File Sharing

1.8 Atomic operations

1.9 Duplication of file descriptors

1.10 Changing the properties of an already open file

1.11 Block summary

1.12 Review questions and answers

Fundamentals of Unix file I/O

In this block, we reintroduce ourselves to the fundamentals of file I/O. Though the
student is expected to have studied these and other concepts in some detail in the
earlier courses, it is essential that all concepts are unambiguously understood. So, we
take a bird's eye view of the same again, before going into the more advanced topics of
Unix system programming.

Most Unix file I/O operations can be performed, in fact, by using a few primitive
functions: open, write, read, lseek and close. We examine the effect of each and also the
effect of varying the buffer size on these functions. Each of these functions
takes one or more arguments, which further qualify the action indicated.

Before we go into the actual study of the operations, it is also desirable that we
are clear about the concept of file descriptors. To the kernel, all open files are referred
to by their respective file descriptors. A file descriptor is a non-negative
integer. When a file (either an existing file or a new file) is opened, the kernel returns a file
descriptor. When a read or write operation is to be done, the file needs to be identified by
the file descriptor which the kernel associated with it at the time of opening or
creating the file.

By convention, the file descriptor 0 (zero) is associated with the standard input, 1 with
the standard output, and 2 with the standard error. The upper limit on file
descriptors is system dependent; on many older systems they range from 0 to 63.

Now, we go on to the descriptions of the I/O functions.

1.1 Open function

The open function can create a new file or open an existing file. Its standard format
is as given below:

int open(const char *pathname, int oflag, ... /* mode_t mode */);

This returns a file descriptor if the operation is successful, else returns -1 on error.
Now to the arguments.

The pathname is the name of the file to be opened or created. The options available for
this function are specified by the oflag argument. There are a large number of options
that can be specified under this head. Some are listed below:

O_RDONLY : Open for reading only
O_WRONLY : Open for writing only
O_RDWR : Open for reading as well as writing

One and only one of these three options must be specified. In addition, a
number of optional flags can be combined with it using the bitwise OR operator.

Some of these optional flags are listed below:

O_CREAT : Creates the file if it does not exist. This option demands another
argument, "mode", which specifies the access permissions of the new file.

O_APPEND : Appends whatever is written to the end of the file, on each write.

O_EXCL : Generates an error if O_CREAT is also specified and the file
already exists. If the file does not exist, it is created.

O_TRUNC : If the file exists and is successfully opened for either write-only
or read-write, truncate its length to 0.

O_SYNC : Have each write wait for the physical I/O to complete, i.e. write does
not return until the data has actually been transferred to the device.

The successful open operation returns a file descriptor, which is guaranteed to be the
lowest numbered unused descriptor. This property may become useful in some cases.
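
As a minimal sketch of the call described above, the following C fragment opens a file for writing, creating it if necessary, and then checks the lowest-numbered-descriptor property. The function names open_for_writing and lowest_descriptor_demo and the permission bits chosen are illustrative assumptions, not part of the course text.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) a file for writing with
 * permissions rw-r--r--. Returns the file descriptor, or -1 on error. */
int open_for_writing(const char *pathname)
{
    /* O_CREAT requires the third argument, the mode of the new file. */
    return open(pathname, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}

/* Returns 1 if open() reuses the lowest unused descriptor,
 * 0 if it does not, and -1 if a file could not be opened. */
int lowest_descriptor_demo(const char *pathname)
{
    int fd1, fd2, fd3, reused;

    fd1 = open_for_writing(pathname);
    if (fd1 < 0)
        return -1;
    fd2 = open(pathname, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    close(fd1);                          /* fd1's number is free again */
    fd3 = open(pathname, O_RDONLY);      /* expected to get fd1's number */
    reused = (fd3 == fd1);
    if (fd3 >= 0)
        close(fd3);
    close(fd2);
    return reused;
}
```

Calling lowest_descriptor_demo with any writable path (for example "/tmp/demo.txt", chosen here only for illustration) should return 1, because the descriptor freed by close is the lowest one available to the next open.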

1.2. Create function

This creates a new file. If the file already exists, it is opened and truncated to
length 0 rather than an error being returned.

The format is

int creat(const char *pathname, mode_t mode);

It returns the file descriptor if OK, -1 on error. The call is equivalent to

open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode);

One problem with creat is that the file is opened only for writing. If it
is to be read after writing into it, it should be closed and then opened again using
open.
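
The behaviour of creat on a file that already exists can be checked with a short sketch such as the one below. It assumes a POSIX system, where creat behaves like open with O_WRONLY | O_CREAT | O_TRUNC; the helper name size_after_second_creat is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes 5 bytes to a file, calls creat() on the
 * same path again and returns the resulting file size, or -1 on error. */
off_t size_after_second_creat(const char *pathname)
{
    int fd;
    off_t size;

    fd = creat(pathname, 0644);      /* create the file, write-only */
    if (fd < 0)
        return -1;
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = creat(pathname, 0644);      /* file exists: it is truncated */
    if (fd < 0)
        return -1;
    size = lseek(fd, 0, SEEK_END);   /* offset of the end of file = size */
    close(fd);
    return size;
}
```

The function should return 0, showing that the second creat succeeded and discarded the five bytes written earlier.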

1.3. Close function


This will close an already opened file.

The format is

int close(int filedes);

It returns 0 if OK, -1 on error.

When a process terminates, all of its open files are automatically closed by the
kernel, so they need not be explicitly closed by the program.

1.4. Lseek function

The position at which a file is read from or written into is referred to as the
"current file offset". This non-negative integer measures the number of bytes from
the beginning of the file at which the current activity (of reading or writing) is taking
place. The read and write operations increment the offset by the number of bytes
transferred.

By default, the offset is initialized to 0 (the beginning of the file) when the file is
opened.

However, if the O_APPEND option is specified, the offset is set to the current length
of the file, in bytes, before each write.

The offset of an open file can also be explicitly positioned
by calling lseek.

The format is

off_t lseek(int filedes, off_t offset, int whence);

This returns the new file offset if OK, else returns -1.

The value of the "whence" argument determines the interpretation of the offset.
i) If whence is SEEK_CUR, the file's offset is set to its current value plus the
offset (the offset can be positive or negative).
ii) If whence is SEEK_SET, the file's offset is set to offset bytes from the
beginning of the file (so the offset cannot be negative).
iii) If whence is SEEK_END, the file's offset is set to the size of the file plus the
offset (the offset can be positive or negative).

Note: Look at the arguments above carefully. SEEK_END adds the offset to the
size of the file. Suppose the current size of the file is 100 bytes and we specify 50
bytes as the offset. The next operation then takes place at byte 150, i.e. if data is
written there, a "hole" is created in the file between the old end of the file and the
newly written data. These intermediate bytes read back as 0 when read at a later stage.

Holes can be created with the other two whence values also, whenever the resulting
offset lies beyond the end of the file.
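
The creation of a hole can be illustrated with a small sketch like the following. The function name make_hole, the byte counts and the permission bits are assumptions made for the example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes 10 bytes, seeks 50 bytes past the end of
 * the file and writes 10 more. Returns the final file size, or -1 if
 * any step fails or the hole does not read back as zero. */
off_t make_hole(const char *pathname)
{
    char c = 1;
    off_t size = -1;
    int fd = open(pathname, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (write(fd, "abcdefghij", 10) == 10 &&   /* offset is now 10 */
        lseek(fd, 50, SEEK_END) == 60 &&       /* 50 bytes past the end */
        write(fd, "ABCDEFGHIJ", 10) == 10 &&   /* file now ends at 70 */
        lseek(fd, 30, SEEK_SET) == 30 &&       /* position inside the hole */
        read(fd, &c, 1) == 1 && c == 0)        /* hole bytes read as 0 */
        size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

On success the function returns 70: 10 bytes of data, a 50 byte hole and 10 more bytes of data.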

1.5.Read function

This is to read data from an opened file.

The general format is

ssize_t read(int filedes, void *buff, size_t nbytes);

It returns the number of bytes read if the read operation is successful, and -1 on
error. If the end of file is encountered before the read operation starts, it returns 0.

The read operation starts from the current file offset and, as the reading continues,
the offset is incremented by the number of bytes actually read.

1.6 Write function

This is to write data into an opened file.

The typical format is

ssize_t write(int filedes, const void *buff, size_t nbytes);

This returns the number of bytes written if OK, -1 on error.

The value returned is usually the same as the nbytes argument. Normally, the
write operation starts at the current offset of the file. If the file was opened with
O_APPEND, the file's offset is first set to the end of the file. In either case the
offset is incremented by the number of bytes written.
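
The read and write functions are typically used together in a loop. The sketch below copies data from one descriptor to another; the name copy_fd and the buffer size of 4096 bytes are assumptions chosen for illustration.

```c
#include <unistd.h>

#define BUFFSIZE 4096   /* illustrative buffer size */

/* Hypothetical helper: copies everything readable from descriptor in
 * to descriptor out. Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char buf[BUFFSIZE];
    ssize_t n;
    long total = 0;

    while ((n = read(in, buf, sizeof buf)) > 0) {   /* 0 means end of file */
        ssize_t done = 0;
        while (done < n) {   /* write may transfer fewer bytes than asked */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

For example, copy_fd(0, 1) would copy the standard input to the standard output, relying on the descriptor numbers introduced earlier in this block.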

1.7. File sharing

Often, different processes need to share open files. Unix supports such operations.
To understand what file sharing actually is and how it works,
we need to know the data structures used by the kernel for I/O.

Every process has an entry in the process table. Each such entry has a table of
open file descriptors, in the form of a vector, with one entry per descriptor. Each file
descriptor entry has i) the file descriptor flags and ii) a pointer to a file table entry.

The kernel maintains a file table for all open files. Each file table entry includes i)
the file status flags of the file (read, write, append etc.), ii) the current file offset and
iii) a pointer to the v-node entry for the file. Each open file has a
v-node structure. It contains information about the type of file and pointers to the
functions that operate on the file. In many cases, the v-node also contains the i-node for
the file. This information is read from the disk when the file is opened, so that all
relevant information about the file is readily available.

The figure gives some idea about the data structures we have just now
discussed.

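[Figure: kernel data structures for open files. The process table entry holds, for
each descriptor (fd 0, fd 1, fd 2), the fd flags and a pointer to a file table entry.
Each file table entry holds the file status flags, the current file offset and a v-node
pointer. Each v-node table entry holds the v-node information, the i-node information
and the current file size.]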

Let us examine the case of a single process that has two different files opened:
one file is open on standard input (file descriptor 0) and the other on standard
output (file descriptor 1).

For each of these open files there is a file table entry, which contains details like the
file status flags and the current file offset. There is also a v-node table entry containing
the v-node information, i-node information and current file size. This can be
reached only through the v-node pointer field of the file table entry.

Now suppose two or more independent processes open the same file. Each of
them has its own process table entry. Since it is the same file that
is being referred to by all of them, they share a common v-node table entry. But the file
table entries have to be different, because each of the processes may be reading or
writing a different part of the file, i.e. their offsets are different.

[Figure: two independent processes with the same file open. Each process table entry
(fd 0, fd 1, fd 2 etc.) points to its own file table entry (file status flags etc.), and
both file table entries point to the same v-node table entry.]

The student is advised to verify what happens when operations like write, append and
lseek are undertaken, by tracing the resulting changes in these
tables.
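
One way to trace this is sketched below. For simplicity the sketch opens the same file twice within a single process rather than in two processes; each call to open still gets its own file table entry, so the offsets are independent in the same way. The function name offsets_are_independent is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if two separate open() calls on the
 * same file keep independent offsets, 0 if not, -1 on error. */
int offsets_are_independent(const char *pathname)
{
    int fd1, fd2, result = -1;

    fd1 = open(pathname, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd1 < 0)
        return -1;
    fd2 = open(pathname, O_RDONLY);      /* second file table entry */
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    if (write(fd1, "0123456789", 10) == 10) {
        off_t pos1 = lseek(fd1, 0, SEEK_CUR);   /* advanced by the write */
        off_t pos2 = lseek(fd2, 0, SEEK_CUR);   /* untouched by the write */
        result = (pos1 == 10 && pos2 == 0);
    }
    close(fd2);
    close(fd1);
    return result;
}
```

The write through the first descriptor moves only that descriptor's offset, while the file size seen through either descriptor is the same because it is kept in the shared v-node entry.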

1.8. Atomic operations

Since Unix allows two or more processes to operate on a file simultaneously, several
problems arise. Consider the following situation:
Processes A and B are operating on the same file. Both are trying to append
data of their own to the file. Different scenarios arise.

For an append to succeed, a process first has to do an lseek
to the end of the file (since the new data is to be written at the end of the file).
Suppose process A has done the lseek and set its offset to the end of the file, say at
2000 bytes. Meanwhile the kernel switches to process B. B also seeks to the end of the
file, which is still at 2000 bytes, and this time has enough time to write its data as
well. If B writes 200 bytes, the end of the file is
extended to 2200 bytes. Suppose now that process A is given the CPU again. Since A's
offset still points to 2000, it starts writing from that location, overwriting the
data written by B.

A similar situation arises if A is allowed to write
after the lseek of B, only that now the data of A will be overwritten.

The problem arises because we are using two separate operations (lseek and
write) to perform the append. After the first of them there is a
possibility that a process switch takes place, leading to the problems described above.

Modern versions of Unix overcome this problem by providing the O_APPEND flag,
specified when the file is opened. Though each write still involves the two stages of
seeking to the end of the file and then writing the data into the file, the kernel
performs them as a single atomic operation, so no other process can write to the file in
between and none of the anomalies that we have described above can take place.
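
The effect of O_APPEND can be observed even in a single process, as in the sketch below: the descriptor is deliberately moved back to the start of the file, yet the next write still lands at the end. The function name append_demo is an assumption for the example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the final size of a file written twice
 * through an O_APPEND descriptor, or -1 on error. */
off_t append_demo(const char *pathname)
{
    off_t size = -1;
    int fd = open(pathname, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644);

    if (fd < 0)
        return -1;
    if (write(fd, "first", 5) == 5 &&
        lseek(fd, 0, SEEK_SET) == 0 &&   /* move the offset back to 0 */
        write(fd, "again", 5) == 5)      /* O_APPEND still writes at the end */
        size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

The function should return 10. Without O_APPEND the second write would overwrite the first and the size would be 5.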

Another example of such an anomaly is the creation of a file. When a new file
is to be created, it is desirable that the existence of the file be checked and that the
file be created only if no such file exists.

Suppose the checking and the creation are two different operations. Then if
a process checks for the non-existence of the file and comes back to create the file in
its next time slot, the file may have been created by another process in the intermediate
period. This would bring in anomalies. The solution again is to make the check and the
creation a single atomic operation, which is what open does when both O_CREAT and
O_EXCL are specified.
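
A minimal sketch of the atomic check-and-create, using open with O_CREAT and O_EXCL, is given below. The function name create_exclusively and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if this call created the file,
 * 0 if the file already existed, and -1 on any other error. */
int create_exclusively(const char *pathname)
{
    /* The existence test and the creation happen inside one system call. */
    int fd = open(pathname, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return (errno == EEXIST) ? 0 : -1;
}
```

If two processes call this function for the same path at about the same time, at most one of them can receive 1; the other sees the file as already existing.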

Now we can define an atomic operation. It is an operation composed of multiple
steps. If the operation is performed, then all the steps are completed in one go. It is not
possible to perform some of the steps in one time slot and the remaining steps in
the next slot.

1.9 Duplication of file descriptors

An existing file descriptor is duplicated by using either of the following functions:

int dup(int filedes);
int dup2(int filedes, int filedes2);

Both of them return the new file descriptor if OK, -1 on error.

The new file descriptor returned by dup is guaranteed to be the lowest numbered
available file descriptor.

The new file descriptor to be returned by dup2 is specified in the filedes2 argument.
If filedes2 is already open, it is first closed and then reused as the duplicate of filedes.
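
A duplicated descriptor refers to the same file table entry as the original, so the two share one file offset. The sketch below checks this; the function name dup_shares_offset is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor obtained from dup()
 * shares its file offset with the original, 0 if not, -1 on error. */
int dup_shares_offset(const char *pathname)
{
    int fd, copy, shared = -1;

    fd = open(pathname, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    copy = dup(fd);                       /* lowest unused descriptor */
    if (copy >= 0) {
        if (write(fd, "abc", 3) == 3)     /* advances the shared offset */
            shared = (lseek(copy, 0, SEEK_CUR) == 3);
        close(copy);
    }
    close(fd);
    return shared;
}
```

This differs from opening the same file twice, where each open gets its own file table entry and therefore its own offset.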

1.10. Changing the properties of an already open file:

The fcntl function:

The general format is

#include <fcntl.h>
int fcntl(int filedes, int cmd, ... /* int arg */);

The return value depends on cmd if OK; on error it returns -1.

The fcntl function can be used for different purposes:

1. Let cmd = F_DUPFD. It returns a duplicate file descriptor
(note that this is another method of duplication, apart from using dup).
2. Let cmd = F_GETFD or cmd = F_SETFD. Correspondingly the function
gets or sets the file descriptor flags.
3. Let cmd = F_GETFL or cmd = F_SETFL. Correspondingly the function gets
or sets the file status flags.
4. Let cmd = F_GETOWN or cmd = F_SETOWN. Correspondingly the
function gets or sets the asynchronous I/O ownership.
5. Let cmd = F_GETLK or cmd = F_SETLK. The function gets or sets
record locks.

We will quickly see how each of these operates.

F_DUPFD : Duplicates the file descriptor filedes. The new file descriptor is
returned as the value of the function. It is the lowest numbered descriptor
that is not already open and is greater than or equal to the value of the third
argument. The new descriptor has its own set of file descriptor flags, but shares the
same file table entry as filedes.

F_GETFD : Returns the file descriptor flags for filedes as the value of
the function.

F_SETFD : Set the file descriptor flags for filedes. The new flag value
is set from the third argument of the function.

F_GETFL : Returns the file status flags for filedes as the value of the
function.

The different file status flags are as below:

O_RDONLY : Open for reading only

O_WRONLY : Open for writing only

O_RDWR : Open for reading and writing

O_APPEND : Append on each write

O_NONBLOCK : Non-blocking mode

O_SYNC : Wait for writes to complete

F_SETFL : Sets the file status flags to the value of the third argument.

F_GETOWN : Gets the process ID or process group ID currently receiving the
SIGIO and SIGURG signals.

F_SETOWN : Sets the process ID or process group ID that is to receive the
SIGIO and SIGURG signals.
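
As an illustration of F_GETFL and F_SETFL, the sketch below reads the file status flags, extracts the access mode and turns on additional flags. The function names access_mode and set_fl are assumptions for the example. Note that F_SETFL replaces the whole set of flags, so the current value is fetched first and the new flags are ORed into it.

```c
#include <fcntl.h>

/* Hypothetical helper: returns O_RDONLY, O_WRONLY or O_RDWR for an
 * open descriptor, or -1 on error. */
int access_mode(int fd)
{
    int val = fcntl(fd, F_GETFL, 0);

    if (val < 0)
        return -1;
    return val & O_ACCMODE;   /* mask out everything but the access mode */
}

/* Hypothetical helper: turns on the given file status flags (for
 * example O_APPEND or O_NONBLOCK). Returns 0 if OK, -1 on error. */
int set_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);

    if (val < 0)
        return -1;
    val |= flags;             /* keep the flags that are already set */
    return fcntl(fd, F_SETFL, val) < 0 ? -1 : 0;
}
```

For instance, set_fl(fd, O_APPEND) makes subsequent writes on fd append to the file even though the flag was not given to open.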

Block Summary

We have looked into the fundamentals of Unix file I/O operations. In particular
we have seen the functions open, creat, close, lseek, read and write. We have
discussed the concept of file sharing and the need for atomic operations. The
duplication of file descriptors and changing the properties of an already open file
were also discussed.

Review Question

1. What is a file descriptor?
2. What value is associated with the standard input and what value with the standard
output?
3. Give the format of the function used to open a file.
4. What are the modes in which a file can be opened?
5. Which option makes each write wait for the physical I/O to complete?
6. In which mode does a file get opened by the creat function?
7. Explain the concept of "current file offset".
8. Explain how a hole gets created in a file.
9. What is file sharing? How is it implemented in Unix?
10. Explain the concept of atomic operations.
11. Which function is used for changing the properties of an open file? What is its format?

Answers to review questions:

1. When a file is opened, the kernel returns a file descriptor, a non-negative integer.
The file is identified by this descriptor for reading and writing.
2. Standard input is associated with 0, standard output with 1.
3. int open(const char *pathname, int oflag, ... /* mode_t mode */);
4. A file can be opened for reading only, for writing only, or for reading and writing.
5. O_SYNC
6. Write only mode.
7. The position at which a file is read from or written into is referred to as the
current file offset.
8. When lseek sets the offset beyond the end of the file (for example with SEEK_END
and a positive offset) and data is then written there, a hole is created in
the file between the old end of file and the beginning of the newly written data.
9. When more than one process wants to access the same file, the concept of file
sharing comes in. Unix implements this by giving each such
process its own file table entry, with all of these entries pointing to the same
v-node table entry.
10. When two or more steps have to be performed to complete a task, it may so
happen that the process is switched out after performing only some of them. For
example, take finding the end of file and appending data to it. If the process
is switched out after finding the end of file and comes back to append at a later stage,
some other process might have operated on the file in the meantime, thereby changing
the scenario. To overcome this, the entire set of steps needs to be completed in
one shot. This is called an atomic operation.
11. fcntl. The format is
int fcntl(int filedes, int cmd, ... /* int arg */);

Block –II
Block Introduction

In this block, we look into the various aspects of files and directories and the various
functions operating on them. We begin with the stat functions, which are designed
to return status information about the file specified. This status information is
helpful to us in a variety of ways, as will become obvious later on.

We then get an idea of the various types of files like regular files, directory files,
special files, FIFOs, sockets etc. Each of them has its own unique features and is
useful in certain specific situations. We then get some idea about the various
permissions on files. To be able to use a file, first of all one should have a valid user
id. There is also the concept of a group id, wherein a group of users shares a single
group id. In addition, files can be accessed with read, write or execute permission or
various combinations thereof. There is also the concept of ownership of files and
directories.

Given suitable "access rights", a user will be able to set and modify any or all of the
above. We will be familiarizing ourselves with these facts in this block.

Then we get an insight into the way files are actually stored, with the
arrangement of the various data blocks, and into how links to a file can be created,
removed or renamed.

We will also be introduced to the concept of a symbolic link, which is an
indirect pointer to a file and which overcomes some of the drawbacks of normal links.
We see the various ways of operating on such symbolic links.

We also get some idea about the concept of times – as viewed from the file point
of view.

Contents
2.1 Stat function
2.2 Types of files
2.3 Set-user-id and set-group-id
2.4 File access permission.
2.5 Ownership of new file and directories
2.6 Access function
2.7 Umask function
2.8 Chmod and fchmod functions
2.9 Chown, fchown and lchown functions
2.10 Filesize
2.11 File truncation
2.12 File systems
2.13 Link, unlink and rename functions
2.14 Symbolic links
2.15 Symlink and readlink functions
2.16 File times
2.17 Utime function
2.18 Mkdir and rmdir functions
2.19 Chdir, fchdir and getcwd
2.20 Block summary
2.21 Review question and answers

Block – II

Files and Directories

The previous block covered the basic I/O functions around regular files. Additional
features of the files, the file systems and other properties of files are examined in this
section.

2.1 Stat functions

There are three stat functions.

Typically they appear as below:

int stat(const char *pathname, struct stat *buf);
int fstat(int filedes, struct stat *buf);
int lstat(const char *pathname, struct stat *buf);

All of them return 0 if successful and -1 on error. Now let us see their functions.

stat : Returns a structure of information about the file whose name
and path is given.

lstat : Behaves like stat, except when the
named file is a symbolic link. (One should note that it then
returns data about the symbolic link and not the file
referenced by the symbolic link. At this stage it suffices to
say that this is useful in situations where we are
walking down a directory hierarchy.)

fstat : Obtains information about the file that is already open on the
descriptor filedes.

It may be noted that the second argument is a pointer to a structure that we must
supply. The function fills in the structure pointed to by buf. Of course the actual
structure fields may differ from implementation to implementation. One typical structure
could be as follows:

struct stat {
    mode_t st_mode; /* file type, mode and permissions */
    ino_t st_ino; /* i-node number (serial number) */
    dev_t st_dev; /* device number */
    dev_t st_rdev; /* device number for special files */
    nlink_t st_nlink; /* number of links */
    uid_t st_uid; /* user ID of owner */
    gid_t st_gid; /* group ID of owner */
    off_t st_size; /* size of regular files, in bytes */
    time_t st_atime; /* time of last access */
    time_t st_mtime; /* time of last modification */
    time_t st_ctime; /* time of last file status change */
    long st_blksize; /* best I/O block size */
    long st_blocks; /* number of 512-byte blocks allocated */
};

2.2 Types of Files

Most of the files encountered in Unix belong to one of two types:
- regular files or
- directories

However, these are not the only types that a Unix system supports. We list
all the types below:

1. Regular file: The normal data files, either text or binary files. The
Unix kernel treats both text and binary files in the same manner,
the interpretation being left to the application programs.
2. Directory file: Unix treats directories also as files. This file
contains the names of the files contained in that directory and pointers
to them. Only the kernel can write information into the directory,
but any process that has sufficient permissions is able to
read the contents of the file.
3. Character special file: This is used for certain types of devices on
a system.
4. Block special file: This file is used for disk devices. All devices on
the system are either character special files or block special files.
5. FIFO (first in, first out): This type of file is used for
communication between processes. Sometimes it is referred to
as a named pipe (the analogy is that any information meant for a
particular location is sent through a pipe; essentially, what enters
the pipe first is what comes out first at the other end).
6. Socket: A type of file used for network communication
between processes. (A socket can also be used for non-network
communication between processes on a single host.) The analogy
is with the electrical socket, which connects an electrical device to
the electrical network for power transfer.
7. Symbolic link: This type of file points to another file.

The type of a file is encoded in the st_mode member of the stat structure. It is tested
with the following macros, each of which takes st_mode as its argument:

Macro          Type of file

S_ISREG( )     Regular file
S_ISDIR( )     Directory file
S_ISCHR( )     Character special file
S_ISBLK( )     Block special file
S_ISFIFO( )    FIFO
S_ISLNK( )     Symbolic link
S_ISSOCK( )    Socket

Just to make some of these concepts clear, and also to get some programming practice,
we write a simple program which accepts a list of command line arguments and
prints the type of the file named by each argument.

#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    int i;
    struct stat buf;
    const char *ptr;

    for (i = 1; i < argc; i++)
    {
        printf("%s: ", argv[i]);
        /* lstat, not stat, so that a symbolic link is reported as a link */
        if (lstat(argv[i], &buf) < 0)
        {
            perror("lstat error");   /* stands in for err_ret of ourhdr.h */
            continue;
        }
        if (S_ISREG(buf.st_mode)) ptr = "regular";
        else if (S_ISDIR(buf.st_mode)) ptr = "directory";
        else if (S_ISCHR(buf.st_mode)) ptr = "character special";
        else if (S_ISBLK(buf.st_mode)) ptr = "block special";
        else if (S_ISFIFO(buf.st_mode)) ptr = "FIFO";
        else if (S_ISLNK(buf.st_mode)) ptr = "symbolic link";
        else if (S_ISSOCK(buf.st_mode)) ptr = "socket";
        else ptr = "unknown mode";
        printf("%s\n", ptr);
    }
    exit(0);
}

The program is too straightforward to need a detailed explanation. Basically it
accepts a number of command line arguments and prints out their file types.

For example, if the command is

$ a.out /vmunix /etc /bin /var/spool/cron/FIFO

the output could be
/vmunix: regular
/etc: directory
/bin: symbolic link
/var/spool/cron/FIFO: FIFO

In most systems, regular files and directories form about 90% to 95% of the
total number of files stored.

2.3 Set-user-ID and set-group-ID

Every process has several IDs associated with it. Some are listed as follows:
i) The real user ID and real group ID : These IDs identify who the user really is.
These two do not change during a login session and are in fact
taken from the user's entry in the password file.
ii) The effective user ID, effective group ID and supplementary group
IDs : These determine the file access
permissions.
iii) The saved set-user-ID and saved set-group-ID : These contain copies of the
effective user ID and effective group ID, made when a program is executed.
In most cases the effective user ID and effective group ID are
identical to the real user ID and real group ID respectively.

Every file has an owner and a group owner. The owner is indicated by st_uid and
the group owner by st_gid of the stat structure. A program file can carry a special flag
in its st_mode, the set-user-ID bit, which says that when the file is executed the effective
user ID of the process is to be set to the owner of the file (st_uid). The set-group-ID bit
does the same for the effective group ID.

2.4 File access permissions.

The st_mode value also encodes the access permission bits for the file. All types of
files have permissions.

The permission bits for a file can be listed as follows:

st_mode mask      Meaning

S_IRUSR : user-read
S_IWUSR : user-write
S_IXUSR : user-execute
--------------------------------------------------
S_IRGRP : group-read
S_IWGRP : group-write
S_IXGRP : group-execute
--------------------------------------------------
S_IROTH : others-read
S_IWOTH : others-write
S_IXOTH : others-execute
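The nine masks can be tested one by one against st_mode. The following is a minimal sketch that turns them into the familiar "rwxr-xr-x" style string; the helper name perm_string is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: write the nine permission bits of mode into out
 * as a string such as "rwxr-x---". out must have room for 10 bytes. */
void perm_string(mode_t mode, char out[10])
{
    out[0] = (mode & S_IRUSR) ? 'r' : '-';
    out[1] = (mode & S_IWUSR) ? 'w' : '-';
    out[2] = (mode & S_IXUSR) ? 'x' : '-';
    out[3] = (mode & S_IRGRP) ? 'r' : '-';
    out[4] = (mode & S_IWGRP) ? 'w' : '-';
    out[5] = (mode & S_IXGRP) ? 'x' : '-';
    out[6] = (mode & S_IROTH) ? 'r' : '-';
    out[7] = (mode & S_IWOTH) ? 'w' : '-';
    out[8] = (mode & S_IXOTH) ? 'x' : '-';
    out[9] = '\0';
}
```

The mode argument would normally be the st_mode value returned by one of the stat functions.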

2.5. Ownership of new files and directories.

When a new file or directory is created, its ownership must also be set. The rules for
the ownership of a new directory are the same as those for a new file.

The user ID of a new file is set to the effective user ID of the process. The group
ID of a new file can be either the effective group ID of the process or the group ID of
the directory in which the file is created, depending on the implementation.

2.6. Access function

The typical access function is

int access(const char *pathname, int mode);

returns 0 if OK, -1 on error

When accessing a file with the open function, the kernel performs its access
tests based on the effective user ID and the effective group ID. The access function, on
the other hand, bases its tests on the real user ID and real group ID.

The argument mode is the bitwise OR of the following values:

mode    Description

R_OK    Test for read permission
W_OK    Test for write permission
X_OK    Test for execute permission
F_OK    Test for existence of file

2.7 Umask function

This function sets the file mode creation mask for the process and returns the
previous value.

The typical umask function is given by

mode_t umask(mode_t cmask);

returns previous file mode creation mask.
The cmask argument is formed as the bitwise OR of any or all of the access
permission bits listed in section 2.4. The file mode creation mask is used whenever
the process creates a new file or a new directory: any bit that is on in the mask is
turned off in the mode of the new file.

A typical umask call could be

umask(S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);

(Which permissions are masked off by this call?)

It is to be noted that most Unix users never explicitly set the umask value at all.
At the time of login, the shell's start-up file sets the value and it is never changed
by the user. However, when a program creates new files and must ensure that
specific access permission bits are enabled, it should set the umask value itself while it
is running, so that an inherited mask does not turn those bits off.
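The effect of the mask can be observed with a short experiment. The sketch below is only an illustration; the helper name create_with_mask is hypothetical. It asks for read and write permission for everybody, and reports which bits survive the mask.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: create path requesting rw-rw-rw- under the given
 * mask and return the permission bits the file actually received, or
 * (mode_t)-1 on error. The previous mask is restored before returning. */
mode_t create_with_mask(const char *path, mode_t cmask)
{
    struct stat buf;
    mode_t result = (mode_t)-1;
    mode_t old = umask(cmask);          /* umask returns the previous mask */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);

    if (fd >= 0) {
        if (fstat(fd, &buf) == 0)
            result = buf.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO);
        close(fd);
    }
    umask(old);
    return result;
}
```

Calling it with a mask of 0 and then with a non-zero mask on a fresh file name shows directly which bits the mask removes.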

2.8 Chmod and fchmod functions

These two functions allow the user to change the file access permissions of an
existing file.

The typical chmod function is

int chmod(const char *pathname, mode_t mode);

returns 0 if OK, -1 on error
A typical fchmod function is
int fchmod(int filedes, mode_t mode);
returns 0 if OK, -1 on error

The student might have already observed that while chmod operates on a
file specified by its path, fchmod operates on a file that is already open.
To change the permission bits, the effective user ID of the process must equal the
owner of the file, or the process must have superuser privileges.

The argument mode is specified as the bitwise OR of the constants listed in the
table below:

Mode       Description

S_ISUID    Set user ID on execution
S_ISGID    Set group ID on execution
S_ISVTX    Saved text (sticky bit)

S_IRWXU    Read, write and execute by owner
S_IRUSR    Read by owner
S_IWUSR    Write by owner
S_IXUSR    Execute by owner

S_IRWXG    Read, write and execute by group
S_IRGRP    Read by group
S_IWGRP    Write by group
S_IXGRP    Execute by group

S_IRWXO    Read, write and execute by others
S_IROTH    Read by others
S_IWOTH    Write by others
S_IXOTH    Execute by others

There are two additional points to be noted.

1. If we try to set the saved-text bit (S_ISVTX) of a regular file
and we do not have superuser privileges, the bit is automatically
turned off in the mode. Stated the
other way, if a non-privileged user tries to use this option, the mode
argument is silently modified to nullify the effort. This way,
the system prevents malicious users from turning on the
saved-text bit on many files and filling up the swap area.
2. Secondly, the group ID of a newly created file may be a group that the
calling process does not belong to. If the group ID of the file does not
equal the effective group ID of the process or one of its supplementary
group IDs, and the process does not have superuser privileges, the
set-group-ID bit is automatically turned off. This prevents a user from
creating a set-group-ID file owned by a group the user does not belong to.

2.9. Chown, fchown and lchown functions

The chown functions allow the user to change the user ID of a file, and also its group
ID.

The typical formats are

int chown(const char *pathname, uid_t owner, gid_t group);
returns 0 if OK, -1 on error

int fchown(int filedes, uid_t owner, gid_t group);
returns 0 if OK, -1 on error

int lchown(const char *pathname, uid_t owner, gid_t group);
returns 0 if OK, -1 on error

They work in almost the same fashion, except when the file referenced is a
symbolic link.

When a symbolic link is referenced, lchown changes the ownership of
the symbolic link itself, instead of changing the file pointed to by the symbolic link.

2.10. File Size

The st_size member of the stat structure contains the size of the file in bytes.
This field is meaningful only for regular files, directories and symbolic links.
(Why not in other cases?)

For a regular file, a size of 0 bytes is allowed.
For a directory file, the size is usually a multiple of a number such as 16.
For a symbolic link, the file size is the number of bytes in the file name the link contains.

It may be noted that holes can be created in a file if we seek
past the current end of file and then write, instead of writing at the exact
end of the file. The bytes in a hole read back as 0.

However, when a file with holes is copied with an ordinary utility such as cat, the
holes are written out as actual bytes of 0, so the copy occupies real disk blocks.
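A hole is easy to produce on purpose. The sketch below is an illustration only, with a hypothetical helper name make_file_with_hole: it writes one byte, seeks forward over a gap, writes one more byte and reports the resulting st_size.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: create path holding one byte, a gap of `gap`
 * unwritten bytes, and one more byte. Returns the resulting st_size,
 * or -1 on error. */
off_t make_file_with_hole(const char *path, off_t gap)
{
    struct stat buf;
    off_t size = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;
    if (write(fd, "a", 1) == 1 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&   /* seek past end of file */
        write(fd, "b", 1) == 1 &&
        fstat(fd, &buf) == 0)
        size = buf.st_size;                        /* counts the gap as well */
    close(fd);
    return size;
}
```

The size reported includes the gap, even though nothing was ever written there. Whether the gap actually occupies disk blocks depends on the file system; comparing st_size with st_blocks is one way to find out.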

2.11. File truncation

Sometimes we may like to truncate a file by knocking off the data at the end of the
file (for whatever reason).

The typical format is

int truncate(const char *pathname, off_t length);
returns 0 if OK, -1 on error.

If the file is already open, we use ftruncate.

int ftruncate(int filedes, off_t length);
returns 0 if OK, -1 on error.

Both functions chop off the portion of the file extending beyond the length
specified.

This of course presumes that the present size of the file is larger than that
specified by the length argument. What if it is not so? Suppose the existing size of the file is
less than the length given to truncate. Some implementations leave the file as it is, while
others extend it, with the added portion reading back as bytes of 0 (possibly creating a
hole in the file).
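Because the behaviour for a length beyond the current end of file differs between implementations, a cautious program can check the size first. The following is a minimal sketch of that idea; the helper name shrink_to is hypothetical.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: cut path down to at most length bytes and return
 * the resulting size, or -1 on error. A file that is already short
 * enough is left alone, so the implementation-dependent case of a
 * length beyond the current end of file never arises. */
off_t shrink_to(const char *path, off_t length)
{
    struct stat buf;

    if (stat(path, &buf) < 0)
        return -1;
    if (buf.st_size > length) {
        if (truncate(path, length) < 0 || stat(path, &buf) < 0)
            return -1;
    }
    return buf.st_size;
}
```

For a file that is already open, the same logic can be written with fstat and ftruncate on the descriptor.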

2.12. File systems.

Before we can appreciate the various complexities involved with the links to a
file, symbolic links etc., it is desirable that we have a clear concept of the organization
and operational mechanisms of the Unix file structure. Obviously, different implementations
of Unix file systems exist today. However, what we are aiming at in this subsection is a
broad outline that suffices for understanding the concepts involved.

[Figure: a disk drive divided into partitions 1, 2 and 3. Each partition holds a file system
made up of a boot block, a super block, an i-list, and directory blocks and data blocks. The
i-list is a sequence of i-nodes.]

A disk drive is divided into a number of partitions and each partition holds a file system,
as indicated in the second level of the figure. Each i-list is actually a list of i-nodes, as in the
lowest level.

Suppose we look at a file system in more detail.

[Figure: the i-list, a row of i-nodes, followed by data blocks and directory blocks. Each
entry in a directory block holds an i-node number and a file name.]

A little explanation is in order.

Each file system is made up of a series of directory blocks and data blocks
(after all, Unix manages directories also as files).

The i-list contains a number of i-nodes, each pointing to data block(s). The
i-node contains all the information about the file: the file type, the file's access permission bits,
the size of the file, pointers to the data blocks of the file etc. Most of the information in the
stat structure is obtained from the i-node.

One more interesting point: when the same file is to appear in different directories,
or when it is renamed within a file system, it is not necessary to physically duplicate the
contents of the file.
All that is needed is to enter the file name in the corresponding directory and make it
point to the respective i-node. This is a very important concept that becomes useful in
the next section.

2.13. Link, unlink, remove and rename functions

We have seen that any file can have multiple directory entries pointing to its
i-node. We create such an additional entry, a link to the existing file, using the link function.

A typical format is as follows:

int link(const char *existingpath, const char *newpath);

returns 0 if OK, -1 on error.

This function creates a new directory entry, newpath, that references the
existing file existingpath.

If newpath already exists, an error is returned.

One other point to note is that every i-node has a "link count" field,
which holds the number of directory entries that point to the i-node.

When a link operation is executed, the link count of the file is incremented
by one.

The creation of the new directory entry and the incrementing of the link count must be an
atomic operation.
(Exercise to students: why?)

It may be noted that hard links to directories can cause loops in the file
system, creating problems. Thus, creating a link to a directory is normally allowed only for
the superuser.

To remove an existing directory entry, we use the unlink function.

Its format is

int unlink(const char *pathname);

returns 0 if OK, -1 on error

This function removes the directory entry and decrements the link count of
the file referenced by pathname. If there are other links to the file, the data in the file
is still accessible through those links.

Only when the link count reaches 0 can the contents of the file be deleted.

One other condition prevents the contents from being deleted: as long as some process
has the file open, they are kept. In fact, whenever a file is closed, the kernel does a
two-stage check.

i) It counts the number of processes that have the file in question open.
ii) Once the count in (i) becomes 0, it checks the link count; if that is also
0, the file's contents are deleted.
If pathname is a symbolic link, unlink removes the symbolic link, not
the file referenced by the link.
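The link count can be watched from a program through the st_nlink member of the stat structure. The sketch below is a minimal illustration; link_count and add_name are hypothetical helper names.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: number of directory entries (hard links) that
 * point to the i-node of path, or -1 if stat() fails. */
long link_count(const char *path)
{
    struct stat buf;

    if (stat(path, &buf) < 0)
        return -1;
    return (long)buf.st_nlink;
}

/* Hypothetical helper: give the existing file a second name and report
 * the link count afterwards (-1 on error). */
long add_name(const char *existingpath, const char *newpath)
{
    if (link(existingpath, newpath) < 0)
        return -1;              /* fails, for example, if newpath exists */
    return link_count(existingpath);
}
```

After add_name succeeds, unlinking either name leaves the data reachable through the other, and the count drops back by one.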

A file can also be unlinked using the remove function.

The typical format is

int remove(const char *pathname);
returns 0 if OK, -1 on error.

For a file, remove is the same as unlink, i.e. the file becomes
inaccessible through that directory entry, whereas for a directory, remove is the same as
rmdir, so the directory must already be empty.

A file or directory can be renamed with the rename function.

Typical format
int rename(const char *oldname, const char *newname);
returns 0 if OK, -1 on error.

a) If oldname specifies a file (and not a directory), then we are
renaming a file. If newname already exists, it cannot refer to a
directory; the existing file is removed and oldname is renamed as
newname. Note that renaming involves changing the contents of
both the directory containing oldname and the directory containing
newname. Hence to execute this call, one should have write
permission for both directories.
b) If oldname specifies a directory, then we are renaming
the directory. If newname exists, it must be a directory and it must be
empty; it is then removed and oldname is renamed as newname.
A non-empty newname directory causes an error.
c) If oldname and newname refer to the same file, then the
function returns successfully without doing anything.
2.14 Symbolic links:

We have talked about symbolic links often in the previous sections. We look into
more formal details in this section.

A symbolic link is an indirect pointer to a file, whereas a hard link points directly to
the i-node of the file. Symbolic links were introduced to overcome some of the limitations
of hard links:

a) Hard links normally require that the link and the file reside in
the same file system.
b) Only the superuser can create a hard link to a directory.

In the case of symbolic links, there is no restriction on where the link and the file it
points to reside. Also, anybody can create a symbolic link to a directory.

When a function takes a pathname, we need to know whether it follows a symbolic link
to the file it points to or operates on the link itself.

The following table lists, for each function, whether it follows symbolic links.
Function    Follows symbolic link
access      Yes
chdir       Yes
chmod       Yes
chown       Yes
creat       Yes
exec        Yes
lchown      No
link        Yes
lstat       No
mkdir       Yes
mkfifo      Yes
mknod       Yes
open        Yes
opendir     Yes
pathconf    Yes
readlink    No
remove      No
rename      No
stat        Yes
truncate    Yes
unlink      No

2.15 Symlink and readlink functions

The symlink function creates a symbolic link.

The format is
int symlink(const char *actualpath, const char *sympath);
returns 0 if OK, -1 on error

A new directory entry, sympath, is created that points to actualpath. It may look
strange, but actualpath need not exist when sympath is being created. Also, actualpath
and sympath need not reside in the same file system.

Since the open function follows a symbolic link, we need a way to open the link
itself and read the name in the link. The readlink function does this.
int readlink(const char *pathname, char *buf, int bufsize);
is the typical format.
If the function is successful, it returns the number of bytes placed into buf. Note that
the contents placed in buf are not null terminated.
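Because readlink does not add a terminating null byte, callers usually add it themselves. The following is a minimal sketch, assuming the POSIX declaration of readlink (which returns ssize_t and takes a size_t buffer size); the helper name link_target is hypothetical.

```c
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical helper: copy the name stored in the symbolic link at path
 * into buf as a null-terminated string. Returns 0 if OK, -1 on error.
 * A name longer than bufsize - 1 bytes is silently cut short. */
int link_target(const char *path, char *buf, size_t bufsize)
{
    ssize_t n;

    if (bufsize == 0)
        return -1;
    n = readlink(path, buf, bufsize - 1);   /* leave room for the null byte */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

The helper works even when the file named in the link does not exist, since only the link itself is read.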
2.16 File times

The system maintains three time fields for each file. Their names and purposes
are listed in the table below.

Field       Purpose
st_atime    Last access time of the file data
st_mtime    Last modification time of the file data
st_ctime    Last change time of the i-node status

The modification time is the time when the file data was last modified,
whereas the changed-status time is when the i-node was last changed. A write
operation changes the contents of the file, whereas chmod, chown etc. change the
status of the i-node. The access time can be used by system administrators to
delete files which have not been accessed for a long time. Similarly,
the modification time and changed-status time can be used to archive only those files
whose contents or status have changed since the previous archive.

2.17 utime function

The access time and modification time of a file can be changed with the utime
function.
int utime(const char *pathname, const struct utimbuf *times);
returns 0 if OK, -1 on error
If times is a null pointer, both times are set to the current time. Otherwise the
function uses the structure
struct utimbuf {
time_t actime;    /* access time */
time_t modtime;   /* modification time */
};

2.18 mkdir and rmdir functions

mkdir is used to create directories.

The typical format is
int mkdir(const char *pathname, mode_t mode);
This creates a new empty directory.
Similarly, an empty directory is deleted with the rmdir function
int rmdir(const char *pathname);
Both return 0 if OK, -1 on error.

If the link count of the directory becomes 0 with this call and no other process
has the directory open, then the space occupied by the directory is freed. If one or more
processes have the directory open when the link count becomes 0, the last link is removed
and no new files can be created in the directory, but the directory is freed only after the last of
the processes closes it.

2.19 Chdir, fchdir and getcwd

Every process has a current working directory. All pathnames that do
not begin with a slash are searched starting from the working directory. The current
working directory is, in fact, an attribute of the process. We can change the current
working directory of the calling process by calling the chdir function.
int chdir(const char *pathname);

Similarly, we have fchdir, which specifies the new working directory by an open file
descriptor.

int fchdir(int filedes);
Both return 0 if OK, -1 on error.
A function is also provided for finding the complete absolute path of the
current working directory.
char *getcwd(char *buf, size_t size);
Returns buf, holding the pathname, if successful and NULL on error.
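The directory functions of the last two sections can be exercised together. The sketch below is an illustration only, with a hypothetical helper name visit_new_dir: it creates a directory, moves into it, records the absolute path, returns to the starting directory through a saved descriptor, and removes the directory again.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: create directory `name`, change into it, store its
 * absolute path in out, change back and remove it again.
 * Returns 0 if every step worked, -1 otherwise. */
int visit_new_dir(const char *name, char *out, size_t size)
{
    int ok = -1;
    int home = open(".", O_RDONLY);     /* remember where we started */

    if (home < 0)
        return -1;
    if (mkdir(name, S_IRWXU) == 0) {
        if (chdir(name) == 0) {
            if (getcwd(out, size) != NULL)
                ok = 0;
            if (fchdir(home) < 0)       /* go back via the open descriptor */
                ok = -1;
        }
        if (rmdir(name) < 0)            /* works only on an empty directory */
            ok = -1;
    }
    close(home);
    return ok;
}
```

Saving a descriptor for the starting directory and using fchdir avoids having to remember (and later re-resolve) its pathname.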

Block Summary

We began with a discussion of the file information made available by the stat
functions. Then we were introduced to the several types of files: regular files,
directory files, special files, sockets etc. We discussed the concept of user and
group IDs and also the ownership of files and directories, and how to change
them.

We also discussed symbolic links, which are indirect pointers to files that overcome
some of the shortcomings of hard links. We closed with file times and the functions for
creating, removing and changing directories.

REVIEW QUESTIONS
1. What is the need for the stat functions? Name the various functions and give their formats.
2. What are the different types of files one normally comes across?
3. What are the different file access permissions?
4. What does the umask function do?
5. How do you change the file access permissions of an existing file?
6. What functions are available to change the ownership of files?
7. What do the truncate functions do?
8. Give the format of the remove function.
9. Give the format of the rename function.
10. In what way are symbolic links an improvement over normal links?

Answers:
1. The stat functions return a structure of information about the file indicated.
There are three stat functions:
i) int stat(const char *pathname, struct stat *buf);
ii) int fstat(int filedes, struct stat *buf);
iii) int lstat(const char *pathname, struct stat *buf);

2. The different types of files one encounters in Unix are regular files, directory files, character
special files, block special files, FIFOs, sockets and symbolic links.
3. The different file access permissions are read, write and execute, each enabled separately for
user, group and others. So there are actually 9 permission bits.
4. It sets the file mode creation mask for the process and returns the previous value.
5. By the chmod function, whose format is
int chmod(const char *pathname, mode_t mode);
(fchmod does the same for a file that is already open.)
6. We have the functions chown, fchown and lchown.
7. There are two truncate functions:
truncate chops off the portion of the named file beyond the specified length;
ftruncate does the same for an already opened file.
8. int remove(const char *pathname);
9. The typical format is
int rename(const char *oldname, const char *newname);
10. i) The link and the file the link points to need not reside in the same file system.
ii) There is no restriction that only the superuser can create a link to a directory.

Block Introduction (BLOCK – III)

In this block, we look into some of the standard I/O library concepts. The basic I/O
routines operate on what are known as file descriptors. When a file is opened with the
standard I/O library, the function instead returns a pointer to a FILE object, and this
pointer is used for further operations.
In this block, we look at some of the aspects of managing standard I/O.

We begin with the concept of buffering, which is normally assumed to be available
unless specified otherwise. We discuss fully buffered, line buffered and
unbuffered operation, looking at the respective functions and the methods of using
them.

Then we go on to the methods of opening a stream. It can be done in various
modes like read, write and various combinations such as appending at the end of file,
truncating the file etc.

After having opened a stream, we look into the concept of taking input from it and
writing output to it. This may be one character at a time, one line at a time or a block at a
time. We have various functions for each of them.

There is also the concept of positioning a stream, wherein we can start our
operations at any desired place in the stream by suitably setting the offset. We will
then look into the concept of temporary files, which are created on the fly by
programs in execution and are closed and removed once the program creating them terminates.

We close the discussion with the important concept of password files. There
are drawbacks in storing passwords in a simple file. Hence, Unix systems store
them in encrypted form. Even this may not totally solve the problem. So, sometimes we
resort to storing them elsewhere in the system, by making use of the concept of
shadow password files.

Contents:

3.1 Introduction
3.2 Concept of buffering
3.3 Opening a stream
3.4 input into and output out of a stream
3.5 Line at a time I/O
3.6 Positioning a stream
3.7 Concept of Temporary Files
3.8 Password file
3.9 Shadow passwords
3.10 Review Question & Answers

3. Standard I/O Library
3.1 Introduction:

Here we briefly look into the concept of the standard I/O library, which handles
details such as buffer allocation and performing I/O operations optimally.

Normally all I/O routines are centred around file descriptors. When a file is
opened, a descriptor is returned for the file by the kernel. All subsequent I/O operations
on the file are done using the descriptor. With the standard I/O library, the focus shifts
to "streams". When a file is opened or created with the standard I/O library, a stream is
associated with the file, and the stream is used in further I/O operations.

When a stream is opened, the standard I/O function fopen returns a pointer to
a FILE object. This object is in fact a structure that contains the information needed by
the standard I/O library. The normal fields of this structure include the file descriptor to
be used for I/O, a pointer to a buffer for the stream, the size of the buffer, a count
of the number of characters currently in the buffer etc.

In this section, we get a detailed view of the standard I/O library. As in other
cases, what we get is some insight into the library, though different implementations
may provide slightly different operations.

Three streams are predefined and automatically available to every process: standard
input, standard output and standard error. They refer to the files indicated by the file
descriptors STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO. The streams
themselves are referenced through the predefined file pointers stdin, stdout and stderr,
which are defined in the <stdio.h> header.

3.2 Concept of buffering

Buffering is used to keep the number of read and write calls to a minimum.
When several similar input or output operations are to be done sequentially,
buffering avoids issuing a separate system call for each of them.

Further, buffering should ideally be automatic for each I/O stream, so that
application programmers need not worry about it. In general, three types of
buffering are provided:

a) Fully buffered: Here the actual I/O takes place when the standard I/O buffer
is full. Files residing on disk are normally fully buffered by the standard I/O
library, which usually obtains the buffer by calling malloc the first time I/O is
performed on the stream. A buffer can be "flushed" automatically by the
standard I/O routines or by calling the fflush function. In the standard I/O
library, flushing means writing out the contents of the buffer, even if it is only
partially filled. (In the terminal driver, by contrast, flushing means discarding
the data already stored in a buffer.)
b) Line buffered: In this case, the actual I/O takes place when a newline character is
encountered. The characters can arrive at any speed, even one character at a
time, and the actual I/O takes place only when the line is complete. Line
buffering is normally used for a stream that refers to a terminal, where the data
is typed manually.
c) Unbuffered: Here the characters are not buffered. The actual I/O
is expected to take place as soon as possible after the function call. The
standard error stream, for example, is normally unbuffered so that error
messages appear at once.

The functions used to change the buffering are setbuf and setvbuf.

Typically they appear as follows
void setbuf(FILE *fp, char *buf);
int setvbuf(FILE *fp, char *buf, int mode, size_t size);

setvbuf returns 0 if successful and a nonzero value on error; setbuf returns nothing.
These functions must be called after the stream has been opened,
but before any other operation is performed on it.

setbuf can be used to turn buffering on or off. To turn on buffering,
buf must point to a buffer of length BUFSIZ (a constant defined in <stdio.h>). To turn off
buffering, buf must be NULL.

setvbuf is used to specify exactly the type of buffering needed. The
mode argument chooses among the three types of buffering discussed above.

- _IOFBF : fully buffered
- _IOLBF : line buffered
- _IONBF : unbuffered

If we specify an unbuffered stream, the buf and size arguments are ignored.

The following table gives an idea about the various options of buffering.

Function   mode     buf        Buffer and length                Type of buffering

setbuf              non-null   user buf of length BUFSIZ        fully buffered or line buffered
                    NULL       (no buffer)                      unbuffered
setvbuf    _IOFBF   non-null   user buf of length size          fully buffered
                    NULL       system buffer                    fully buffered
           _IOLBF   non-null   user buf of length size          line buffered
                    NULL       system buffer                    line buffered
           _IONBF   (ignored)  (no buffer)                      unbuffered

This table is nothing but a summary of the previous discussion in this section.
We can also force a stream to be flushed:
int fflush(FILE *fp);
This forces any unwritten data for the stream to be passed to the kernel.
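The effect of full buffering can be observed by looking at the file size before and after a flush. The sketch below is only an illustration; the helper name buffered_write_demo is hypothetical, and the sizes it reports assume a short write that fits well inside the BUFSIZ buffer.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: write one short line to path through a fully
 * buffered stream and record the file size seen by stat() before and
 * after fflush(). Returns 0 if OK, -1 on error. */
int buffered_write_demo(const char *path, long *before, long *after)
{
    static char buf[BUFSIZ];    /* must stay valid while the stream is open */
    struct stat st;
    int ok = -1;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    /* setvbuf comes right after the open, before any other operation */
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) == 0 &&
        fputs("hello\n", fp) != EOF &&
        stat(path, &st) == 0) {
        *before = (long)st.st_size;     /* data still sits in buf */
        if (fflush(fp) == 0 && stat(path, &st) == 0) {
            *after = (long)st.st_size;  /* data has reached the kernel */
            ok = 0;
        }
    }
    fclose(fp);
    return ok;
}
```

The buffer is declared static because a buffer handed to setvbuf must not disappear while the stream is still open.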

3.3 Opening a stream

fopen, freopen and fdopen

The following are the typical formats of the functions
FILE *fopen(const char *pathname, const char *type);
FILE *freopen(const char *pathname, const char *type, FILE *fp);
FILE *fdopen(int filedes, const char *type);
All of them return a file pointer if OK, NULL on error.

fopen opens a specified file.

freopen opens a specified file on a specified stream. If the stream is already open,
it is closed first. This is typically used to open a specified file as one of the
predefined streams: standard input, standard output or standard error.

fdopen takes an existing file descriptor and associates a standard I/O stream
with the descriptor.

type                Description
r or rb             Open for reading
w or wb             Truncate to 0 length or create for writing
a or ab             Append: open for writing at end of file, or create for writing
r+ or r+b or rb+    Open for reading and writing
w+ or w+b or wb+    Truncate to 0 length or create for reading and writing
a+ or a+b or ab+    Open or create for reading and writing at end of file

You may wonder why there are several type strings for the same
job. The character b lets the standard I/O system differentiate between text and binary
files. On Unix it has no effect, since the Unix kernel does not differentiate
between text and binary files.
However, a close examination of the type descriptions and the functions available
to open a stream also brings home another point. Not every type behaves the same
way with every function: with fdopen, for instance, the file is already open, so opening
for writing does not truncate it. The six different ways to open a stream are summarised
in the following table. Students are advised to work out the reason for each entry.

Condition                               r    w    a    r+   w+   a+
File must already exist                 y    -    -    y    -    -
Previous content of file discarded      -    y    -    -    y    -
Stream can be read                      y    -    -    y    y    y
Stream can be written                   -    y    y    y    y    y
Stream can be written only at the end   -    -    y    -    -    y

By default, a newly opened stream is fully buffered, unless it refers to a
terminal device, in which case it is line buffered. However, once the
stream is opened, but before any other operation is done on it, we can change the
buffering if we want to, by using the setbuf or setvbuf functions.

A stream which is already open may be closed by using fclose.

int fclose(FILE *fp);

Any buffered output data is flushed before the file is closed, and any buffered
input data is discarded. Also, when a process terminates normally, all unwritten
buffered data are flushed and all open standard I/O streams are closed.
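The difference between the w and a types can be seen with a small experiment. The following is a minimal sketch; the helper name write_with_type is hypothetical and simply opens the file with the requested type, writes a string and closes the stream.

```c
#include <stdio.h>

/* Hypothetical helper: open path with the given type string ("w", "a",
 * ...), write text to it and close the stream.
 * Returns 0 if OK, -1 on error. */
int write_with_type(const char *path, const char *type, const char *text)
{
    FILE *fp = fopen(path, type);

    if (fp == NULL)
        return -1;              /* e.g. type "r+" on a file that is missing */
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;    /* fclose flushes buffered output */
}
```

Writing "one" with type "w" and then "two" with type "a" leaves both strings in the file, whereas a second "w" would have discarded the first.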

3.4 Input into and output out of a stream


There are three different types of unformatted I/O.

a) One character at a time: We can read or write one character at a
time, with the buffering operations being taken care of by the I/O
functions.
b) One line at a time: The normal fgets and fputs will do the job. Each
line is terminated with a newline character, with the maximum length
of the line being specified by the caller.
c) Direct I/O: This type of I/O is supported by the fread and fwrite functions.
Each read or write operation transfers a certain number of objects,
each of a specified size.

There are three input functions to read one character at a time:


int getc(FILE *fp);
int fgetc(FILE *fp);
int getchar(void);

There are subtle differences between getc and fgetc. getc can be implemented
as a macro, while fgetc is guaranteed to be a function. This means the argument of getc
should not be an expression that has side effects, since a macro may evaluate it more
than once. Further, since fgetc is a function, we can pass its address as an argument to
another function. Calls to getc are also likely to be faster, since a macro avoids the
overhead of a function call.

All of them return the next character if OK, and EOF on end of file or error. Notice
that EOF is returned both at end of file and on an error. To distinguish between the two,
either ferror or feof is used.

int ferror(FILE *fp);
int feof(FILE *fp);

Both of them return a nonzero value if the condition is true and 0 otherwise.

After reading from the stream, we can push back a character by using ungetc.

int ungetc(int c, FILE *fp);

The characters so pushed back are returned by subsequent reads on the stream
in the reverse order of pushing, i.e. the last character pushed will appear first.
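The following sketch puts these functions together. read_number and count_chars are
hypothetical helper names. Note that the result of getc is kept in an int, not a char,
so that EOF can be told apart from every valid character.

```c
#include <stdio.h>
#include <ctype.h>

/* Reads an unsigned decimal number from fp. The first character that is
   not a digit is pushed back with ungetc, so the next read sees it again.
   Returns the number, or -1 if no digit was available. */
long read_number(FILE *fp)
{
    long value = 0;
    int c, ndigits = 0;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        ndigits++;
    }
    if (c != EOF)
        ungetc(c, fp);              /* not part of the number: give it back */
    return ndigits > 0 ? value : -1;
}

/* Counts the characters remaining in fp. Returns the count, or -1 if
   reading stopped because of an error rather than end of file. */
long count_chars(FILE *fp)
{
    long n = 0;

    while (getc(fp) != EOF)
        n++;
    if (ferror(fp))
        return -1;                  /* EOF was returned because of an error */
    return n;                       /* otherwise feof(fp) is true here */
}
```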

Similarly we have three output functions

int putc(int c, FILE *fp);
int fputc(int c, FILE *fp);
int putchar(int c);

3.5 Line at a time I/O

Line at a time input is provided by the following functions


char *fgets(char *buf, int n, FILE *fp);
char *gets (char *buf);

In both cases buf is the buffer to read into. gets reads from the standard input,
and hence no stream is specified, whereas fgets reads from the specified stream. n in
fgets is the size of the buffer. fgets reads up to and including the next newline
character, but never more than (n-1) characters. The buffer is always terminated by a
null character. gets, on the other hand, has no way of knowing the size of the buffer
and can overflow it, so it should not be used.

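Line at a time output is provided by two similar functions.

int fputs(const char *str, FILE *fp);
int puts(const char *str);

fputs writes the null terminated string to the specified stream, without adding a
newline. puts writes the string to the standard output and then adds a newline.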
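A short sketch of line at a time input follows. The function name count_lines and the
deliberately small buffer of 16 bytes are choices made for this illustration only. It
shows that a line longer than the buffer is delivered by fgets in several pieces, and
that only the piece ending in a newline completes the line.

```c
#include <stdio.h>
#include <string.h>

/* Counts the lines in fp using fgets with a small fixed buffer.
   A final line without a terminating newline is also counted. */
long count_lines(FILE *fp)
{
    char buf[16];
    long lines = 0;
    int partial = 0;                /* 1 while inside an unfinished line */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        if (len > 0 && buf[len - 1] == '\n') {
            lines++;                /* this piece ends the line */
            partial = 0;
        } else {
            partial = 1;            /* buffer filled before the newline */
        }
    }
    return lines + partial;
}
```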

3.6 Positioning a stream:


Positioning a standard I/O stream becomes important when reads and writes are
done on a stream that already has content.
There are two ways to position a standard I/O stream. The first method makes
use of the functions ftell and fseek, which treat the file's current position as a long
integer. ftell returns the current file position and fseek moves to a given position.

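The typical formats are as below.

long ftell(FILE *fp);
int fseek(FILE *fp, long offset, int whence);

ftell returns the current file position as a long integer, whereas fseek returns 0 if it can
successfully position the stream; otherwise it returns a nonzero value. The whence
argument can be SEEK_SET, SEEK_CUR or SEEK_END, meaning that the offset is counted
from the beginning of the file, from the current position or from the end of the file
respectively.
Of course, there is also a rewind function to return to the beginning of the stream.

void rewind(FILE *fp);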

However, ANSI C also provides two other functions to do the job.

int fgetpos(FILE *fp, fpos_t *pos);
int fsetpos(FILE *fp, const fpos_t *pos);

fgetpos stores the current value of the file's position in the object pointed to by
pos.

fsetpos positions the stream to the location stored in the object pointed to by pos,
which must have been obtained by an earlier call to fgetpos. Both functions return 0 if
successful and a nonzero value on error.
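The two methods can be combined, as in the following sketch. The helper peek_at is a
hypothetical name: it remembers the current position with fgetpos, moves with fseek,
reads one character and then returns to the remembered position with fsetpos.

```c
#include <stdio.h>

/* Returns the character at byte offset `offset` of fp and restores the
   previous position of the stream. Returns EOF if there is no such
   character or if positioning fails. */
int peek_at(FILE *fp, long offset)
{
    fpos_t saved;
    int c;

    if (fgetpos(fp, &saved) != 0)           /* remember where we are */
        return EOF;
    if (fseek(fp, offset, SEEK_SET) != 0)   /* offset from the beginning */
        return EOF;
    c = getc(fp);
    if (fsetpos(fp, &saved) != 0)           /* go back; also clears EOF */
        return EOF;
    return c;
}
```

After a call to peek_at, ftell reports the same position as before the call.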

3.7 Concept of Temporary Files:

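Temporary files are created on the fly by programs in execution, and they are
removed when they are closed or when the program terminates, as the case may be.

The standard I/O library provides two functions to assist in the creation of temporary
files.

char *tmpnam(char *ptr);
FILE *tmpfile(void);

tmpnam generates a valid path name that does not match the name of any existing file.
tmpfile creates a temporary file, opened for reading and writing, which is automatically
removed when it is closed or when the program terminates.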

To demonstrate how these functions work, we look at a small piece of code that
works using them. MAXLINE and err_sys come from the text book's own header; simple
stand-ins for them are defined here so that the program is complete.

#include <stdio.h>
#include <stdlib.h>

#define MAXLINE 4096                    /* size of the line buffer (assumed value) */

static void err_sys(const char *msg)    /* stand-in for the text book helper */
{
    perror(msg);
    exit(1);
}

int main(void)
{
    char name[L_tmpnam], line[MAXLINE];
    FILE *fp;

    printf("%s\n", tmpnam(NULL));       /* first temporary name */
    tmpnam(name);                       /* second temporary name */
    printf("%s\n", name);

    /* create the temporary file; a null pointer means an error */
    if ((fp = tmpfile()) == NULL)
        err_sys("tmpfile error");
    fputs("This is a temporary file\n", fp);    /* write into the temporary file */
    rewind(fp);                         /* rewind to read the contents */
    if (fgets(line, sizeof(line), fp) == NULL)
        err_sys("fgets error");         /* display error */
    fputs(line, stdout);                /* print the line on standard output */
    exit(0);
}

3.8 Password file

The unix password file contains the following fields


Description                  Member
user name                    char  *pw_name
encrypted password           char  *pw_passwd
numerical user id            uid_t  pw_uid
numerical group id           gid_t  pw_gid
comment field                char  *pw_gecos
initial working directory    char  *pw_dir
initial shell program        char  *pw_shell

The description fields refers to the fields that describe each entry in the password
file.

The corresponding entries are all included in a structure called struct passwd.
The fields as stored in the structure are indicated by the member field in the above
table.

There will be one entry for each user.

There will be at least one entry for the user called root. The user id for this entry is 0.

The encrypted password is a copy of the user's password that has been put through
a one-way encryption algorithm. This means that even if some unauthorized person were
to get hold of this field, he may still not be able to decipher the password of the user.
Some of the fields may be empty as well.
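Programs do not normally read the password file directly. As a sketch, assuming a
POSIX system, the entry for a user can be fetched into a struct passwd with the
getpwnam function declared in <pwd.h>. The wrapper lookup_user is a hypothetical name
used only here.

```c
#include <sys/types.h>
#include <pwd.h>
#include <string.h>

/* Looks up the password file entry for `user` and copies the initial
   working directory into home (at most size bytes, always terminated).
   Returns the numerical user id, or -1 if there is no such user. */
long lookup_user(const char *user, char *home, size_t size)
{
    struct passwd *pw = getpwnam(user);     /* points to static storage */

    if (pw == NULL)
        return -1;
    if (size > 0) {
        strncpy(home, pw->pw_dir, size - 1);
        home[size - 1] = '\0';
    }
    return (long) pw->pw_uid;
}
```

Looking up root in this way is expected to give the user id 0.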

3.9 Shadow passwords


In the previous section, we talked about the password entries being encrypted,
so that even if one were to get the password file, it may not be possible to directly get
back the passwords. However, many unauthorized users make it a point to get a
copy of the password file and make intelligent guesses about the passwords. Each
guess is run through the same one-way algorithm to see if the result matches an entry in
the password file. One may say it is a laborious process, but the success rate, in the
hands of experienced persons, is quite high. To make it harder for such persons to obtain
the encrypted passwords, some systems store them elsewhere, in what is often called the
shadow password file.
They may also require that the user changes the password at regular intervals;
this concept is called password aging.

With this background, we are in a position to look into the actual details of the
Unix process operations.

Block Summary

In this block, we got an introduction to the standard I/O functions.
We began with the concept of buffering and discussed full buffering, line buffering
and unbuffered streams.

We then looked into the various methods of opening a stream and the various
combinations of allowed operations. We then looked into input and output on a stream,
the ability to read one character at a time, one line at a time or one block at a time.
Also, the methods of positioning at desired points in a stream, by the use of offsets,
were discussed.

We also discussed the concepts of temporary files, the password file and the encrypted
passwords used by Unix, and also the meaning of shadow passwords to control the
onslaught of password breakers. With this, we are now in a position to actually start
discussions about the Unix processes.

Review Questions

1. What is the need for buffering?
2. What are the 3 types of buffering?
3. What functions are used to set or turn off buffering? Explain.
4. What functions are available for opening a stream?
5. What are the 3 modes of I/O into a stream?
6. What is a temporary file?
7. What is the specialty of the Unix password file?

Answers
1. When a file is accessed repeatedly, it is wasteful to go to the device for every read
/ write call. So, to ensure efficient input and output, the data are buffered.
2. a) Fully buffered b) Line buffered c) Unbuffered.

3. The functions setbuf and setvbuf are used.
The typical formats are
void setbuf(FILE *fp, char *buf);
int setvbuf(FILE *fp, char *buf, int mode, size_t size);
The difference is that while setbuf is used for turning buffering on or off, setvbuf
allows us to specify the type of buffering needed.

4. There are 3 functions available for opening a stream. They are fopen, freopen and
fdopen. The typical formats are
FILE *fopen(const char *pathname, const char *type);
FILE *freopen(const char *pathname, const char *type, FILE *fp);
FILE *fdopen(int filedes, const char *type);
fopen opens a specified file.
freopen opens a specified file on a specified stream. If the stream is already open, it
is closed first.
fdopen takes an existing file descriptor and associates a standard I/O stream
with the descriptor.

5. The three modes are i) one character at a time ii) one line at a time
iii) direct I/O.

6. A temporary file is created on the fly by the program in execution and is removed
when it is closed or when the program terminates.

7. Unix systems store passwords in encrypted form, i.e. the user's password is
encrypted through a one-way encryption algorithm and is stored in the password file.

UNIT – II
Unit Introduction
This unit is made up of three blocks and is expected to give the student an
insight into some of the most important concepts of system programming.

The first block deals with the various aspects of processes and their control. We
note that each process is identified by its id and can create child processes and also
interact with other processes. Each program maintains details of the environment in
which it is working. Processes also share system resources and they may sometimes
have to be made to wait until the resource they need becomes available. It may also
happen that the resource in question is held by a process which itself is waiting for
some other process, and so on, thereby making the wait infinite. This concept of
"race" and how to avoid it are also studied. We also get ideas about setting and
resetting user ids, the concept of accounting and the measurement of process times.

The second block deals with the concept of signaling. Signals are essential for
process synchronization and to meet a variety of conditions like hardware exceptions,
terminal generated signals etc. This block gives an idea about the various signals that
are encountered, the actions that processes take on encountering a signal (ignore,
catch or default action) and the need for the alarm and pause functions. We also see the
concept of masking of signals and of signal sets. We briefly list a few job control signals
also.

The third block gives ideas about what happens during a login. We study the
difference between terminal login and network login and how they are handled. We look
into the idea of process groups and how to get or set their ids. We familiarise ourselves
with the idea of sessions and controlling terminals, and finally we see how
foreground and background job control can be done.

BLOCK – IV

Block Introduction

In this block, we get ourselves acquainted with several of the system
concepts. In fact, this could be considered as one of the more important blocks in
the course.

The block begins with the environment of a single process. Each program possesses
an environment list, which gives the details of the system environment in which it is
working. There are functions that allow a program to get the values of the environment
variables, as also to set those variables. Further, each program is also handed
certain limits on the system resources that it can use. These can be set or
modified if the need arises. We study these concepts in brief.

Then we move on to the control of processes. Each process is identified
by its identification number (id). A process can create child processes to perform specific
tasks by calling the fork function. There are several aspects to the relationship
between a parent process and a child process. These are discussed in some detail.
We also discuss various possibilities arising out of the order of termination of the
processes.

There is also a possibility of making processes wait till the resources they
are looking for become available. We may wait in general or for some specific
resource. Different types of wait operations will be studied.

We will see the concept of race condition, wherein two or more processes
are cyclically waiting for some system resource to become available, which is held
by some other process. In such a situation, no process will get to complete its task
and the wait becomes infinitely long. We see how we can counter such situations.

We also have a section on real and effective user ids, how to set or reset them
etc. We touch upon the concepts of process accounting, user identification and
process times. This, in brief, is the overall view of this block.

Contents

4.0 Introduction
4.1 Process termination
4.2 Memory allocation
4.3 Finding the limits of a process
4.4 Unix process control
4.5 The fork function.
4.6 vfork function
4.7 Exit functions
4.8 wait and waitpid functions
4.9 wait3 and wait4 functions
4.10 Race conditions.
4.11 exec functions
4.12 User Ids and Group ids
4.13 System Function
4.14 Process accounting
4.15 User identification
4.16 Process time
4.17 Review Questions

THE UNIX PROCESS

4.0 Introduction : We now look into the concepts of the Unix process. To begin with, we
talk of a single process environment and then move on to the process control primitives,
wherein several processes are controlled.

Any C program starts with the execution of the main function.

The prototype for the main function is

int main(int argc, char *argv[]);

argc is the number of command line arguments and argv is an array of pointers to the
arguments.

When a C program is started by the kernel, a special startup routine is called
before the main function is called. This startup routine takes certain values like the
command line arguments and the environment from the kernel and prepares the stage
for the execution of the main function.

4.1 Process termination


Processes can be terminated in any of the following ways
a) return from main
b) calling exit
c) calling _exit
d) calling abort
e) by a terminating signal.

Of these, (a), (b) and (c) may be termed normal terminations, while (d) and (e) are
abnormal terminations.

Return from main is the most natural way of terminating a process.
The startup routine mentioned in the previous section ensures that when the main
function completes execution, exit is called.

We look into the other methods of process termination briefly.

4.1.1 exit and _exit functions:

Both these functions terminate the process normally. Whereas _exit returns to the
kernel immediately, exit returns to the kernel only after certain cleanup operations,
such as flushing and closing the standard I/O streams, are performed.

Most Unix shells provide us a way to examine the exit status of a process. If either
function is called without an exit status, or main returns without a value, the exit
status of the process is undefined.
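The difference between the two functions can be observed with the following sketch,
which assumes a POSIX system. It uses fork and waitpid, which belong to the process
control functions listed in the contents of this block, to run the experiment in a child
process. The name size_after_child is hypothetical. The child writes 8 bytes to a
fully buffered file stream and then terminates either with exit or with _exit; the
parent reports the size of the file that results.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the size of the file at path after a child process has written
   "buffered" to it through stdio and terminated, or -1 on failure.
   use_exit != 0: the child calls exit; use_exit == 0: it calls _exit. */
long size_after_child(const char *path, int use_exit)
{
    struct stat st;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                     /* child process */
        FILE *fp = fopen(path, "w");

        if (fp == NULL)
            _exit(2);
        fputs("buffered", fp);          /* 8 bytes, still in the stdio buffer */
        if (use_exit)
            exit(0);                    /* cleanup is done: buffer is flushed */
        _exit(0);                       /* no cleanup: buffered data is lost */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_size;
}
```

With exit the file ends up 8 bytes long; with _exit it is created but remains empty.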

4.1.2 Environment list:

Each program is also handed an environment list. Like the argument list, the
environment list is an array of character pointers, each pointing to the address of a null
terminated string. One typical environment could be as below.

environment pointer     environment list     environment strings
environ ------------->  pointer ---------->  HOME=- - -
                        pointer ---------->  PATH=- - -
                        pointer ---------->  SHELL=- - -
                        pointer ---------->  USER=- - -
                        NULL

The entire structure is pointed to by the environ pointer. Using this, individual
strings of the list may be accessed. Later on, we will also see that the functions getenv
and putenv are useful in working with the environment.
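As a minimal sketch, the list can be walked directly through the environ pointer.
The functions count_env and find_env are hypothetical names; find_env does by hand
roughly what the library function getenv does.

```c
#include <stddef.h>
#include <string.h>

extern char **environ;      /* set up by the startup routine */

/* Returns the number of strings in the environment list. */
int count_env(void)
{
    int n = 0;
    char **p;

    for (p = environ; p != NULL && *p != NULL; p++)
        n++;                /* the list ends with a null pointer */
    return n;
}

/* Returns a pointer to the value of `name`, or NULL if it is not present. */
const char *find_env(const char *name)
{
    size_t len = strlen(name);
    char **p;

    for (p = environ; p != NULL && *p != NULL; p++)
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;    /* skip over "name=" */
    return NULL;
}
```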

4.1.3 The memory layout of a C program:

Unless specified otherwise, a C program has the following components.

a) Text segment: consists of the instructions to be executed by the machine.
The text segment is sharable with other processes; only a single copy needs to
be maintained, even if the program is being run by a number of users.
b) Initialized data segment: contains the variables that are initialized in the program.
c) Uninitialized data segment: the data here is initialized to zero or null pointers by
the kernel before the program starts.
d) Stack: All the automatic variables, along with the function call information
etc., are stored on the stack. This is also what makes recursive C functions possible.
e) Heap: Dynamic memory allocation takes place in the heap.

The typical arrangement, from the highest address to the lowest, is as below.

Environment variables
stack

heap

uninitialised data
initialized data
Text

4.2 Memory allocation:

There are three functions specified in C for memory allocation.

a) malloc: allocates the specified number of bytes of memory, with their initial
values being indeterminate.
b) calloc: allocates space for the specified number of objects, each of a specified
size, with all the bits initialized to zero.
c) realloc: changes the size of a previously allocated area. When the size
decreases, it does not lead to any problems, but when there is a size increase
the contents may have to be moved to some other location to provide the
additional space required. In either case the old contents are preserved, but the
initial values of the space between the end of the old contents and the end of the
new area are indeterminate.

The typical formats of these functions are as below:

void *malloc(size_t size);
void *calloc(size_t nobj, size_t size);
void *realloc(void *ptr, size_t newsize);

The descriptions above should indicate the need for each of the parameters
concerned.

All of them return a pointer to the memory space if the operation is successful,
and a null pointer in case of error.

There is also a function to free the space once its use is over.

void free(void *ptr);
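A common use of realloc is an array that grows as elements are added. The sketch
below is only one possible arrangement; struct int_vec, vec_push and vec_free are
names invented for this example. The result of realloc is first kept in a temporary
pointer, because on failure realloc returns a null pointer and leaves the old area
untouched.

```c
#include <stdlib.h>

/* A growable array of int. Start with all members set to zero / NULL. */
struct int_vec {
    int *data;          /* allocated area, or NULL */
    size_t len;         /* number of elements in use */
    size_t cap;         /* number of elements allocated */
};

/* Appends value, doubling the allocated area when it is full.
   Returns 0 on success and -1 if memory could not be obtained. */
int vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        /* realloc with a null pointer behaves like malloc */
        int *p = realloc(v->data, new_cap * sizeof *p);

        if (p == NULL)
            return -1;  /* v->data is still valid and unchanged */
        v->data = p;    /* the area may have moved */
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Releases the area and resets the structure. */
void vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```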

4.1.4 alloca function:

There is one more function, alloca, which has the same calling sequence
as malloc.
void *alloca(size_t size);
Its functioning is also similar to that of malloc, but instead of allocating memory
from the heap, it allocates from the stack frame of the calling function. The advantage
is that we do not have to free the space at the end, which, anyway, is often forgotten
by programmers, leading to memory leaks. The allocated space is
automatically released when the calling function returns. The disadvantage is that
on some systems it is not possible to change the size of the stack frame after the
function has been called. So, some systems do not support the alloca function.

4.2 The Environment Variables:

We have briefly discussed the concept of environment strings in a previous
section. We look into its various details and manifestations here.

The environment strings are of the form

name of the variable=value of the variable

The Unix kernel itself normally makes no use of these variables; it only passes them
on to each new program. It is for the applications to make use of them in a
manner that is useful to them. The most common users of the environment variables
are the shell programs.

ANSI C defines a function that helps us to fetch values from the environment.

char *getenv(const char *name);

This returns a pointer to the value associated with the name of the environment
variable being accessed, and a null pointer if it is not found.

The following table gives a list of the commonly used environment variables and
their descriptions.

Variable    Description
HOME        Home directory
LANG        Name of locale
LOGNAME     Login name
PATH        List of directories to search for executable files
TERM        Terminal type
TZ          Time zone information

However, it is not enough if we are only able to get the value of an environment
variable. We may like to change an existing variable, add a new variable or remove an
existing variable.

These operations can be done by the following functions.

4.2.1 putenv, setenv and unsetenv functions:

The typical formats are

int putenv(char *str);
int setenv(const char *name, const char *value, int rewrite);

Both of them return 0 if successful, otherwise they return a nonzero value.

putenv: takes a string of the form name=value and puts it in the environment
list. If the name already exists, the previous definition of the name is first
removed.

setenv: sets the name to the value specified. If the name already exists, then two
cases arise. (a) If the rewrite argument is nonzero, the existing definition for the name
is replaced by the new value. (b) If, however, the rewrite argument is 0, the existing
definition is not disturbed, the new value is ignored and no error is reported.
The third function is of the type

int unsetenv(const char *name);

It removes any existing definition of name. If no such definition exists, this is not
treated as an error. (Some older systems declare unsetenv as returning void.)
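The sketch below assumes the POSIX versions of these functions, in which setenv and
unsetenv both return an int and unsetenv succeeds even when the name is not defined.
The wrapper set_default is a hypothetical name. It uses a rewrite argument of 0, so an
existing definition is left alone.

```c
#include <stdlib.h>
#include <string.h>

/* Gives `name` the value `value` only if it has no definition yet.
   Returns 1 if the environment now holds `value` for the name,
   0 if an existing, different value was kept, and -1 on error. */
int set_default(const char *name, const char *value)
{
    const char *now;

    if (setenv(name, value, 0) != 0)    /* rewrite == 0: keep existing */
        return -1;
    now = getenv(name);
    if (now == NULL)
        return -1;
    return strcmp(now, value) == 0 ? 1 : 0;
}
```

Calling setenv with a nonzero rewrite argument afterwards replaces the value, and
unsetenv removes the definition again.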
At the implementation level, it is interesting to see how these functions operate.
The environment list and the environment strings are normally stored at the top of the
address space of the process, above the stack. Deleting a string is the easiest from the
implementation point of view: one needs to just find the pointer in the environment list
and move all the subsequent pointers up by one. But adding a string
or modifying a string is more difficult. The space at the top of the stack cannot be
expanded any further, since there is nothing above it to expand into and everything
below it is in use by the stack. So the following steps are taken.

1. If an existing name is being modified

a) If the size of the new value is smaller than or equal to the size of the existing
value, we can simply copy the new string over the existing string.

b) If the new value is longer than the existing value, then problems arise. We must
call malloc or some other suitable function to get enough space for the new string,
copy the new string to this area, and then replace the old pointer in the environment
list with a pointer to this area.

2. If a new name is being added, the process becomes still more complicated. We
first call malloc to allocate space for the name=value string and copy the
string to this area. Then, if this is the first time that we have added a new string to
the list, we call malloc to obtain space for a new list of pointers, copy the old list to
this area and store a pointer to the name=value string at the end of it. If we have
already done so previously, the list is already in the heap, so we call realloc to make
room for just one more pointer.

Of course, we should keep a null pointer at the end of the list, and make environ
point to the new list. If this description appears a bit confusing, look at the following
figure, which is expected to clear the doubts.

[Figure: the environ pointer points to a list of pointers, numbered 1 to 5 and ending
with X, each of which points to a name=value string.]

Suppose we want to include a new entry (3a) between 3 and 4; what do we do?

Suppose we want to give a new value to an existing name. If the number of locations
needed by the new value is more than what is already allocated, what do we do?
These were the questions answered in the earlier description.

4.3 Finding the limits of a process


getrlimit and setrlimit functions

Every system has its own limitations on the system resources. It has to work
within these limits to perform its objectives. In turn, the system will have to impose
limits on the resources allocatable to the processes that work under it. The processes
in turn will have to work within the limits specified to them. However, for smooth
functioning, it becomes desirable that the processes should be able to find out the limits
of the resources allocated to them and, in extreme cases, also get them changed.
These operations can be done by using the getrlimit and setrlimit functions. Their
typical format is as follows:

int getrlimit(int resource, struct rlimit *rlptr);
int setrlimit(int resource, const struct rlimit *rlptr);

Both of them return 0 if successful, else they return a nonzero value.

It may be noted that each call to these functions works with one resource at a
time, and a pointer to the following structure is passed.

struct rlimit {
    rlim_t rlim_cur;    /* current (soft) limit of the resource */
    rlim_t rlim_max;    /* maximum value for rlim_cur (hard limit) */
};
The following constraints hold good regarding the changing of these limits.

a) Only a superuser process can raise the upper limit (the maximum limit, also
called the hard limit). It may be noted that raising the upper limit would put pressure
on the overall resource position of the system and hence needs to be undertaken only
by the superuser.
b) Any process can lower its maximum limit to a value greater than or equal to
the value of the current limit. Lowering the value is a decision of
the individual user, and it may allow the process to get serviced quickly in certain
situations. However, for an ordinary user such lowering of the limit is irreversible.
c) The current limit (also called the soft limit) can be changed by any process to any
value up to the maximum value allocated for the process.

An infinite value can be specified by RLIM_INFINITY.
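As a sketch of rule (c), the function below changes only the current (soft) limit on
the number of open files and leaves the maximum limit as it is. The name
set_soft_nofile is hypothetical, and RLIMIT_NOFILE is assumed to be available, as it
is on POSIX systems.

```c
#include <sys/resource.h>

/* Sets the current (soft) limit on open files to n without touching the
   maximum (hard) limit. Returns 0 on success and -1 on failure. */
int set_soft_nofile(rlim_t n)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    /* the current limit may not exceed the maximum limit */
    if (rl.rlim_max != RLIM_INFINITY && n > rl.rlim_max)
        return -1;
    rl.rlim_cur = n;
    return setrlimit(RLIMIT_NOFILE, &rl) == 0 ? 0 : -1;
}
```

Reading the structure first with getrlimit ensures that rlim_max is passed back
unchanged, so the call does not lower the maximum limit by accident.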


Some of the resources that can be limited in the above manner are listed below.

Resource          Description
RLIMIT_CORE       The maximum size (in bytes) of a core file. If the
                  limit is set to 0, the core file is not created.
RLIMIT_CPU        The maximum amount of CPU time allocated, in
                  seconds. When the current limit is exceeded, the
                  SIGXCPU signal is sent to the process.
RLIMIT_DATA       The maximum size (in bytes) of the data segment. This
                  includes the areas of initialized data, uninitialised
                  data and heap.
RLIMIT_FSIZE      The maximum size (in bytes) of a file that can be
                  created. When an attempt is made to write beyond the
                  current limit, the signal SIGXFSZ is sent to the process.
RLIMIT_MEMLOCK    The maximum amount of address space that can be
                  locked in memory.
RLIMIT_NOFILE     The maximum number of open files per process at
                  any given instant. Changing this limit affects the
                  value returned by the function sysconf for the
                  argument _SC_OPEN_MAX.
RLIMIT_NPROC      The maximum number of child processes per real
                  user id. Changing this limit affects the value
                  returned by sysconf for the argument _SC_CHILD_MAX.
RLIMIT_RSS        Maximum resident set size (RSS). If the
                  availability of physical memory is not sufficient, the
                  kernel takes away memory from processes that
                  exceed their RSS. (On the other hand, if there is
                  sufficient memory available, say, because the
                  other processes are not utilizing their full value,
                  the system may ignore this excess.)
RLIMIT_STACK      The maximum size of the stack.
RLIMIT_VMEM       The maximum size of the mapped address space.
                  It affects the mmap function.

A resource limit not only affects the calling process but is also
inherited by the children of the process. This is again logical, since the creation
of new processes should not upset the overall resource balance of the system.

To gain an insight into the actual working of these resource names and the
functions they perform, we write a simple program that takes the names of
resources and prints their limits for the process.
We shall first see the program (ref: Text book 1) and then the details. The text
book's err_sys helper is replaced here by perror, and not every system defines all of
the resource names used.

#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <stdlib.h>

/* the doit macro expands into pr_limits("name", name) */
#define doit(name) pr_limits(#name, name)

static void pr_limits(const char *, int);

int
main(void)
{
    doit(RLIMIT_CORE);
    doit(RLIMIT_CPU);
    doit(RLIMIT_DATA);
    doit(RLIMIT_FSIZE);
    doit(RLIMIT_MEMLOCK);
    doit(RLIMIT_NOFILE);
    doit(RLIMIT_NPROC);
    doit(RLIMIT_RSS);
    doit(RLIMIT_STACK);
    exit(0);
}

static void
pr_limits(const char *name, int resource)
{
    struct rlimit limit;

    if (getrlimit(resource, &limit) < 0) {
        perror(name);               /* getrlimit error for this resource */
        exit(1);
    }
    printf("%-20s ", name);
    if (limit.rlim_cur == RLIM_INFINITY)
        printf("  infinite ");
    else
        printf("%10ld ", (long) limit.rlim_cur);
    if (limit.rlim_max == RLIM_INFINITY)
        printf("  infinite\n");
    else
        printf("%10ld\n", (long) limit.rlim_max);
}

Now a brief description of what we have done.

To begin with, we have defined the doit(name) macro, which expands into a call to
pr_limits. The # operator turns the resource name into a string, so pr_limits
receives both the name of the resource as a string and the corresponding resource
number. All that the pr_limits function does is to call getrlimit and use a couple of if
statements to check the current and maximum limits of the resource and print the
same. If a value is infinity (unlimited), it prints infinite against the resource limit, else it
prints the corresponding limit as a long integer.

4.4 Unix process control:

Having seen several of the Unix commands in the previous unit and also how a
single process can be controlled, we are now in a position to study the concept of
process control in Unix. Some of the controls that we look into are creation of new
processes, execution of programs and termination of processes.

Before looking into the actual control mechanism, we shall see how Unix keeps
track of the various processes. After all, when one is talking of controlling a number of
processes, the basic thing is to be able to identify one process from another. Unix does
this in a straightforward manner, by assigning a unique identifier, hereafter called the
process ID, to each process. The process ID is a non negative integer.

Normally process ID 0 is used for a scheduler process called the swapper.
Process ID 1 is the init process, which is invoked by the kernel at the end of the
bootstrap procedure. Process ID 2 is normally the page daemon.

Unix also provides functions for getting the ID of a process and other related
information. The following list gives some of the information obtainable by these
functions.

Function                   Information
pid_t getpid(void)         Process id of the calling process
pid_t getppid(void)        Parent process id of the calling process
uid_t getuid(void)         Real user id of the calling process
uid_t geteuid(void)        Effective user id of the calling process
gid_t getgid(void)         Real group id of the calling process
gid_t getegid(void)        Effective group id of the calling process

4.5 The fork function:

Now we see one of the most important functions in Unix process control. This
function, called the fork function, allows an existing process to create a new process.
In fact, this is the only way in which a new process can be created by the Unix kernel.

The prototype of the function is

pid_t fork(void);

It returns 0 in the child, the process id of the child in the parent, and -1 if unsuccessful.

The new process created is called the child process and the process which created
it is called the parent process.

(You may note: since each child can have only one parent, it gets a return value
of 0 on creation. It can always get the id of its parent by calling getppid. On the other
hand, a single parent can have a number of children, hence the parent is given the
process id of the child, so that it can distinguish between its children.)
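The three possible return values of fork can be seen in the following sketch. The
function run_child is a hypothetical name. It also uses waitpid, one of the wait
functions listed in the contents of this block, so that the parent can collect the
exit status of the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a child that terminates at once with exit status `code`.
   The parent waits for it and returns the status it collected,
   or -1 if something went wrong. */
int run_child(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;          /* fork was unsuccessful */
    if (pid == 0)
        _exit(code);        /* in the child, fork returned 0 */

    /* in the parent, pid holds the process id of the child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit here, so that it does not flush standard I/O
buffers that it inherited from the parent.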

After the creation, both the parent and the child continue executing with the
instruction that follows the call to fork. The child is a copy of the parent and hence
gets a copy of the parent's data space, heap and stack.

(Again note: it is only a copy, not the actual space of the parent. Otherwise, there
would be chaos, with both the parent and the children doing all sorts of modifications on
the parent's space.)

Of course, many recent implementations perform a smart optimization here, called
copy-on-write. Instead of copying the complete space every time a child is created, they
allow the parent and the child to share the space on a read only basis. Only when either
of them starts writing into a region does it get a physical copy of that region for itself.
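That the child works on its own copy can be verified with a sketch such as the one
below. The variable counter and the function fork_and_modify are hypothetical names.
The child changes the variable and reports its value through its exit status (again
collected with waitpid), while the parent looks at its own copy afterwards.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 10;    /* lives in the data space of the process */

/* Forks a child that adds 5 to counter. Stores the value seen by the
   child and the value seen by the parent. Returns 0 on success, -1 on
   failure. */
int fork_and_modify(int *child_value, int *parent_value)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter += 5;       /* changes only the child's copy */
        _exit(counter);     /* report the child's value as exit status */
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *child_value = WEXITSTATUS(status);
    *parent_value = counter;    /* the parent's copy is untouched */
    return 0;
}
```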

Once a child is created, it is expected to compete for resources just like the parent,
and whether the parent or the child gets to execute first depends on the scheduling
algorithm, resource demands and a lot of other factors. To synchronize the activities of
the parent and the child, it is always necessary to introduce sufficient checks and
balances.

This is more important when standard I/O streams are shared between
processes. If the stream is line buffered (look at a previous block) we may still get the
output correctly, since each newline character flushes the buffer. But if it is fully
buffered, any data still sitting in the buffer at the time of the fork is copied into the
child along with the rest of the data space, and will be written once by each process.
Sufficient care about the order of writing and flushing is therefore needed, if one is
not to either lose data or get duplications of the same.

4.5.1 FILE sharing between the parent and the child

We have said briefly in the previous section that the parent and child share the
same data space. As a corollary, it is also a fact that the parent and the child processes
share the same open files. In fact since all descriptors of the parent are also duplicated
in child process, by default a child will be sharing the same files as of the parent.
However, one or two issues need to be resolved.

Since both the parent and the child are allowed to operate on the same set of
files, it is imperative that they share the same file offset. This is taken care of
automatically: the duplicated descriptors refer to the same open file entries, so a
read or (more importantly) a write by the child advances the offset seen by the parent
and vice versa. As long as they take turns, they do not overwrite each other.

Continuing in the same line, if they (parent and the child) were to write to the
same descriptor, their outputs will be intermixed, unless both of them synchronize their
operations.

There are two ways in which this can be handled after a fork.

(a) The parent waits while the child completes its operation. After the
termination of the child, the offsets of the shared descriptors will have been
updated accordingly for the parent to use them.
(b) After the fork, the parent and the child each close the descriptors they do
not need and work only on their own descriptors. In this way neither of them
interferes with the descriptors of the other.

In addition to open files and descriptors, the following properties of the parent are
“inherited” by the child.

a) Real user id, real group id
b) Effective user id, effective group id
c) Supplementary group ids
d) Process group id
e) Session id
f) Controlling terminal
g) Current working directory, root directory
h) File mode creation mask
i) Environment, resource limits and a host of other features

The following are the differences between the parent and the child processes:

a) The return value from fork is different.
b) They have different process ids.
c) They have different parent process ids (naturally!).
d) The child’s time values for various operations are set to 0 at the beginning.
e) File locks set by the parent are not inherited by the child.
f) Pending alarms are cleared for the child.
g) The set of pending signals for the child is empty.

Though we have discussed so many features of fork, we have not discussed why
forking is needed in the first place.

Forking is needed,

a) So that the parent and child can each execute portions of the same code
independently. For example, when a server gets a request for a service from
one of its clients, all that it does is to fork a new process and allow it to
handle the request, while the server itself waits for the next request.
b) When a process wants to execute a different program (as a shell does). It
forks a child, and the child then calls exec to run the new program while the
parent carries on with its own task.

However, one also needs to be careful in creating new processes with fork. If
there are already too many processes in the system, or if the present real user id
has exceeded its limit on the number of simultaneous processes, fork fails and
returns -1. The return value of fork should therefore always be checked.

4.6 vfork function:

The function vfork has the same calling sequence and return values as fork.

vfork is intended to create a new process whose purpose is to execute a new
program (whereas after fork the child goes on running the same program as the
parent). Though vfork creates a new process, it does not hand over a copy of the
parent’s address space. Instead, the child runs in the address space of the parent
until it calls exec or exit. After an exec it executes the new program and hence
does not need the parent’s address space any more.

A major difference with respect to fork is that, while in the case of fork there
is no way of knowing whether the parent or the child gets executed first, in the
case of vfork the child always runs first and the parent is suspended until the
child calls exec or exit. The parent resumes after this.

Let us write a simple program to see how vfork differs from fork.

#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Minimal stand-in for the err_sys helper: print the error and quit. */
static void err_sys(const char *msg) { perror(msg); exit(1); }

int glob = 6;               /* the external variable is initialized */

int
main(void)
{
    int var;                /* automatic variable on the stack */
    pid_t pid;

    var = 88;
    printf("before vfork\n");
    if ((pid = vfork()) < 0)
        err_sys("vfork error");
    else if (pid == 0) {    /* if it is a child */
        glob++;             /* modify the parent's variables */
        var++;
        _exit(0);           /* child terminates without flushing stdio */
    }
    /* parent */
    printf("pid = %d, glob = %d, var = %d\n", (int)getpid(), glob, var);
    exit(0);
}

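The program is straightforward enough to need little explanation. The only point to
note is that the changes made by the child to the parent’s variables are visible to
the parent. This, of course, is to be expected, since the parent and the child operate
in the same address space. (Strictly speaking, POSIX leaves the effect of modifying
data in a vfork child undefined, so the example shows typical behaviour rather than
something a program should rely on.)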

4.7 Exit functions:

We repeat what we have seen in the earlier section: a process can terminate
either normally or abnormally.

It can terminate normally by

a) executing a return from the main function
b) calling the exit function
c) calling the _exit function

It can terminate abnormally by

a) calling abort
b) receiving a signal that terminates it

In all cases of termination, the kernel acts similarly. It closes all the open
descriptors of the process, releases the memory earmarked for the process and
cleans up the rest of the process state.

The parent also needs to be informed about the termination of its child process,
and more importantly how it terminated, since the parent may have to take suitable
action based on this information. The exit and _exit functions take an exit status as
an argument. In the case of abnormal termination, the kernel generates a “termination
status” to indicate the type of abnormal termination and the reason for the same. The
parent can obtain the termination status from either the wait or the waitpid function
(refer to the next section).

Now, what happens if the parent process has terminated before the child
process? In such a case, where does the information indicated in the above paragraph
go? Unix has an answer. The kernel starts everything off with an “init” process.
This can be thought of as some sort of a “mother of all processes”, in the
sense that it creates the first processes, which in turn create their children and so
on. Whenever a process terminates, the kernel checks whether the terminating process
is the parent of any process that is still around. If so, init itself becomes the parent
of each such process (by changing the parent process id of the child to 1; see the
earlier discussion on process ids).

In this way, there can be no process without a parent, since init does not terminate
while there are processes in the system.

Unfortunately, there is a problem even when the parent is alive when the child
terminates. Preliminary concepts of scheduling tell us that only one process can be
active at any instant of time (in single processor systems). In such a case, when the
parent becomes active after a child has terminated and starts looking for its child
process, it does not help if the child has simply vanished together with its status.
So the kernel has to maintain some minimal information about the terminated process:
its process id, its termination status, the amount of CPU time used etc. A process that
has terminated, but whose parent has not yet collected this information, is called a
“zombie”. Zombies are maintained till such time as the data is picked up by their
parents.

4.8 wait and waitpid functions:

A process that calls wait or waitpid can

a) block (if all its children are still running)
b) return immediately with the termination status of a child (if a child has
already terminated)
c) return immediately with an error (if it has no child processes)

But why should a process call the wait or waitpid function to begin with? It can
call it at any time of its own accord, or in response to the SIGCHLD signal from the
kernel. In either case, its purpose is to collect the termination information of its
children.

Going back to the previous section, where we left off, a terminated child process,
whether the termination is normal or abnormal, makes the Kernel send a SIGCHLD
signal to the parent. The parent, in response to this signal, may execute a wait function.

It is clear that if the parent calls wait after a SIGCHLD signal has been sent by the
kernel, wait can return immediately with the termination status of the child in question,
since the data is already available. On the other hand, if wait is called without any
such signal, it may block until such time as a child terminates.

The typical formats of wait and waitpid are as follows:

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *statloc);
pid_t waitpid(pid_t pid, int *statloc, int options);

Both of them return the process id if successful and -1 on error. waitpid can also
return 0 (under the non-blocking option described later).

The wait function, as indicated above, may block the calling process until such
time as a child process terminates. But waitpid has an option to prevent such blocking.
Also, waitpid does not simply wait for any child process to terminate; it can be specific
about the process whose termination it is waiting for. wait, on the other hand, returns
the moment any child process terminates, giving its status details.

Now we shall briefly see the arguments of the two functions. statloc is a pointer to
an integer. Unless it is a null pointer, the termination status of the terminated (child)
process is stored in this location. If the parent process does not care about the actual
status of the terminated process (but is only interested in finding whether the process
has terminated or not), this field is set to null.

The pid argument in waitpid can have different values, and its interpretation and
the action taken change accordingly.

a) pid == -1 waits for the termination of any child process. In
effect, waitpid becomes similar to wait.
b) pid > 0 waits for the child whose process id equals the value of pid.
c) pid == 0 waits for any child whose process group id equals
that of the calling process.
d) pid < -1 waits for any child whose process group id equals the
“absolute” value of pid.
To sum up, waitpid is a more sophisticated version of wait in that

a) It provides a non-blocking version of wait (through the WNOHANG option).
This becomes important on several occasions, where we just want to check
the status of a child but are not interested in waiting for its
termination to occur.
b) waitpid allows us to wait for the termination of one particular process.
This is more meaningful, because in most cases we need the status of a
specific process in order to continue, not the status of whichever child
happens to terminate first.
c) It supports job control, which helps in controlling the
sequence in which the processes work and terminate.

4.8.1 Before we continue, we look at a very useful and curiously simple function.
Especially when job control is involved, the sequence in which the processes work
and interact becomes exceedingly important. Suppose a process needs some input
from another process. Then it is essential that the second process finishes first.
What happens if the second process gets delayed while the first has almost completed,
except, of course, for the portion which depends on the second process? One way to
make the first process wait is the sleep function. We simply call sleep() to make up
for the imbalance in the timings.

4.9 wait3 and wait4 functions:

These can be considered extensions of wait and waitpid. They take an additional
argument that allows the kernel to return a summary of the resources used by the
terminated process (and all its child processes).

The typical format is

pid_t wait3(int *statloc, int options, struct rusage *rusage);
pid_t wait4(pid_t pid, int *statloc, int options, struct rusage *rusage);

Both return the process id if successful, else return -1.

4.10 Race conditions:

Race conditions arise from the sharing of resources. If there is no sharing of
resources, there is little possibility of race conditions. A race condition occurs
when multiple processes are trying to operate on a shared resource and the final
outcome depends on the order in which the processes run. The fork operation, if
used without care, is a common source of race conditions.

Suppose there is a logical sequence in the program that depends, for its result,
on whether the child or the parent runs first after fork. Normally, no such
prediction is possible, since it depends on the scheduling algorithm and the conditions
existing at the time when the scheduler has to choose the process to run.

One way of overcoming the problem is to use sleep, described in the previous
section. Suppose we want the child to complete before the parent. If that happens
anyway, there is no problem. Otherwise, the parent must be made to wait for the child.
On the other hand, if the child must not proceed until the parent has done its part, the
child can be put to sleep. But for how long? That is a problem that can never be decided
beforehand.

One simple way to overcome this problem is to use polling. Suppose a child has to
wait until its parent has terminated. Instead of simply putting the process to sleep
once, we use the following code

while (getppid() != 1)
    sleep(1);

Until the parent terminates (after which init, with process id 1, becomes its
parent), the child sleeps. But it has to be woken up once every second (sleep(1)) to
check whether the test condition has been met. This can be wasteful in many situations.

To overcome these difficulties, signals and other forms of interprocess
communication can be used, which we shall see in the later sections.

4.11 exec functions:

One use of the fork function, as we have seen earlier, is to create a new (child)
process which then causes another program to execute by calling one of the exec
functions. When exec is called, the program running in the process is completely
replaced by the new program, which starts executing at its main function. The process id
does not change, because exec does not create a new process; it simply replaces the
current program with a new one.

Though we refer to exec as one function, there are as many as six exec
functions.

They are of the following types:

int execl(const char *pathname, const char *arg0, ... /* (char *) 0 */);
int execv(const char *pathname, char *const argv[]);
int execle(const char *pathname, const char *arg0, ...
           /* (char *) 0, char *const envp[] */);
int execve(const char *pathname, char *const argv[], char *const envp[]);
int execlp(const char *filename, const char *arg0, ... /* (char *) 0 */);
int execvp(const char *filename, char *const argv[]);

All of them return -1 on error, and do not return at all if successful.

Now, how do they differ from one another?

The first difference is that execl, execv, execle and execve take a path name as
the argument, while execlp and execvp take a file name. If the file name contains a
slash it is taken as a path name; otherwise the executable file is searched for in the
directories given by the PATH environment variable (a list of directories, separated by
colons).

The second difference is regarding the argument list. It may be noted that the
functions execl, execlp and execle require the command line arguments to be specified
as a list of separate arguments (arg0, arg1, arg2 etc.), terminated by a null pointer.
But execv, execvp and execve allow the user to build an array of pointers to the
respective arguments and pass this array as the argument, like argv[]. In fact, the
student can note that the letter l in the name stands for list (execl, execle, execlp)
while v stands for vector (execv, execvp and execve).

Now the two functions whose names end with e (namely execle and execve)
allow the user to pass a pointer to an array of pointers which in turn point to the
environment strings. The others use the environ variable in the calling process to make
a copy of the existing environment for the new program. Normally, then, the environment
of the calling process is passed on unchanged (except in certain special cases, which
we do not deal with here).

To conclude, we make the following (repeated) observations regarding the exec
functions:

i) When the name ends with e, the function takes a pointer to an array of
pointers to the environment strings.
ii) When the name ends with p, the function takes a file name as the argument.
It then uses the PATH environment variable to find the executable
file.
iii) When the name contains l, the function takes a list of arguments, to be
indicated individually; when it contains v instead, an array
(or a vector) of arguments is passed.

When an exec operation takes place, there are several properties that the new
program inherits from the calling process. One of them, obviously, is the process id.

The others in the list are as follows:

a) Parent process id
b) Real user id and real group id
c) Supplementary group ids
d) Process group id
e) Session id
f) Controlling terminal
g) Time left until the alarm clock goes off
h) Current working directory
i) Root directory
j) File mode creation mask
k) File locks
l) Pending signals
m) Resource limits etc.

4.12 User Ids and Group Ids

The setuid and setgid functions.

The real user id and the effective user id can be set by the setuid function.
Similarly, the group ids (both real and effective) can be set using the setgid
function.

Typical formats are

int setuid(uid_t uid);
int setgid(gid_t gid);

Both of them return 0 if successful, else return -1. Obviously changing user ids is a
serious business and hence strict restrictions need to be applied.

The following are some of the rules as to who can change the ids:

a) If the process has super user privileges, the setuid function sets the
real user id, the effective user id and the saved set-user-id to the value
indicated by uid.
b) If the process does not enjoy super user privileges, but uid equals
either the real user id or the saved set-user-id, then only the effective
user id is set to uid. The real user id and the saved set-user-id
are not touched.
c) If the process does not have super user privileges and the value
of uid is equal to neither the saved set-user-id nor the real user id, then
an error is returned.
d) Only a super user process can change the real user id. The real user
id is set when a user logs in and is not changed for the duration of the
login. Since login is a super user process, the login process can set all
the three user ids by calling setuid.
e) The effective user id is set by the exec functions only if the set-user-id
bit is set for the program file. Otherwise, the exec function leaves
the effective user id at its current value.
f) The saved set-user-id is copied from the effective user id by exec. This
copy is made after exec has stored the effective user id from the
file’s user id.

4.12.1 setreuid and setregid functions:

These functions are for swapping the real and the effective ids: setreuid swaps
the real user id and the effective user id, and setregid does the same for the group ids.

Typical format is

int setreuid(uid_t ruid, uid_t euid);
int setregid(gid_t rgid, gid_t egid);

Both of them return 0 on successful operation and -1 on error.

An unprivileged user can always swap the real user id and the effective
user id. This lets a set-user-id program switch to the user’s normal permissions and
switch back again later.

4.13 System Function:

This function lets a program execute a system command from inside the program.
A typical format is

int system(const char *cmdstring);

The cmdstring (the command string) is handed over to the shell for execution, and
the actual operation depends on what the command stands for.

The following bit of program would help us get a clear idea of the operation of the
system function.

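#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal stand-ins for the helpers err_sys and pr_exit used in this text. */
static void err_sys(const char *msg) { perror(msg); exit(1); }

static void
pr_exit(int status)         /* describe a termination status */
{
    if (WIFEXITED(status))
        printf("normal termination, exit status = %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("abnormal termination, signal number = %d\n", WTERMSIG(status));
}

int
main(void)
{
    int status;

    if ((status = system("date")) < 0)
        err_sys("system error");
    pr_exit(status);
    if ((status = system("nosuchcommand")) < 0)
        err_sys("system error");
    pr_exit(status);
    if ((status = system("who; exit 44")) < 0)
        err_sys("system error");
    pr_exit(status);
    exit(0);
}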

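The program calls system three times: with a valid command, with a command that
does not exist, and with a command string that ends with an explicit exit status. After
each call it either reports an error, if system itself failed, or prints the termination
status that was returned.

The function pr_exit is a function that uses the appropriate macros (WIFEXITED,
WEXITSTATUS and so on) to print a description of the termination status.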

4.14 Process accounting:

Once we talk of resource sharing, it also means some way of maintaining who has
used how much of which resource: in short, process accounting. This is an
optional facility that needs to be enabled for its operation. Once enabled, the kernel
writes an accounting record for each process that terminates. These are normally 32
bytes of binary data which give some of the important details about the process, like
the amount of CPU time used, user id, group id, elapsed time etc.

A typical accounting structure that holds an accounting record can be of the
following format:

struct acct
{
    char   ac_flag;     /* flag */
    char   ac_stat;     /* termination status */
    uid_t  ac_uid;      /* real user id */
    gid_t  ac_gid;      /* real group id */
    dev_t  ac_tty;      /* controlling terminal */
    time_t ac_btime;    /* starting time */
    comp_t ac_utime;    /* user CPU time (clock ticks) */
    comp_t ac_stime;    /* system CPU time (clock ticks) */
    comp_t ac_etime;    /* elapsed time (clock ticks) */
    comp_t ac_mem;      /* memory usage (average) */
    comp_t ac_io;       /* bytes transferred during read and write */
    comp_t ac_rw;       /* blocks read or written */
    char   ac_comm[8];  /* command name */
};

All the fields are self-explanatory except the first one, namely the flag, which
needs some explanation.

The ac_flag member of the structure records certain events during the execution
of the process:

ac_flag    Description
AFORK      Process is the result of fork, but did not call exec
ASU        Process used super user privileges
ACOMPAT    Process used compatibility mode
ACORE      Process dumped core
AXSIG      Process was killed by a signal

Whenever a new process is created, the data required for accounting are
initialised by the kernel and are updated as long as the process is alive. Once the
process terminates, the corresponding accounting record is written to the accounting
file. Since a record is written only when its process terminates, the records in
the accounting file are in the order of termination, not in the order of creation.

One other point we should note is that the accounting is for processes,
not programs. That means when a process executes one program, which execs a second,
which execs a third and so on, all that we get is a single accounting record. So, the
accounting records need some processing before they can be directly used to debit or
charge costs for the use of system resources.

Now the question arises: how does one use this information for the accounting
purposes of a particular program? One sequence of operations can be as follows:

a) Become a super user and enable accounting with the accton command.
b) Run the program for which accounting is needed. Depending on the
number of processes that it creates, we get that many
accounting records in the accounting file.
c) Become a super user again and disable the accounting.
d) Another program can be run (at the super user level) to pick the
required fields from the accounting file and print them. Once the fields of
the accounting record are known, it is easy to write a program that
does it.

4.15 user identification:

Any process is identified by its real and effective user ids and its real and
effective group ids. Suppose we want to know the login name of the user running the
program. We can call getpwuid(getuid()). But if a user has multiple login names, each
with the same user id, this cannot tell us which name was used to log in. The
getlogin function gives a method to fetch that login name.

Typical format is

char *getlogin(void);

Returns a pointer to a string giving the login name, or NULL on error.

4.16 Process times

Most systems deal with three measures of time: real time (or clock time), user
CPU time and system CPU time. Any process can call the times function to obtain these
time values for itself and for its terminated children.

The typical format is

clock_t times(struct tms *buf);

Returns the elapsed real time in clock ticks if OK, else returns -1 on error.

The structure tms, pointed to by buf, is of the following format:

struct tms {
    clock_t tms_utime;   /* user CPU time */
    clock_t tms_stime;   /* system CPU time */
    clock_t tms_cutime;  /* user CPU time for the terminated children */
    clock_t tms_cstime;  /* system CPU time for the terminated children */
};

Two points need to be noted at this stage

1. The times in the fields of the above structure are all in clock
ticks. They can be converted to seconds using the number of clock
ticks per second, the _SC_CLK_TCK value returned by sysconf.
2. The value returned by times is not an absolute time; it is measured from
an arbitrary point in the past. In fact, hardly anybody is interested in
absolute values here. Most often, we are interested in the elapsed time,
the duration between two events, which is obtained by simply subtracting
the earlier clock value from the later one.

Block Summary

This is an important block, wherein we discussed a number of vital issues.

We learnt about the concept of a process, which can be roughly described as the
executing unit of a program. We learnt about the system and process environment,
their usefulness, and how to view the environment variables as well as to set them.

Then the concept of a parent process creating a child process using fork
was discussed. Several aspects of the sharing of files and resources
between the parent and the child processes, as also the results of the early
termination of either the parent or the child, were discussed. We also learnt
how to make the processes wait and studied the different methods of calling the wait
function.

Then we were introduced to the concept of a race condition, wherein several
processes operate on a shared resource and the final outcome depends on the order in
which they run. We have also seen how sleeping and polling can be used to impose an
order, and why signals are a better answer.

We also talked about the user ids and the methods of setting or getting them, and
about the concepts of process accounting, user identification and process times.

Review Questions

1. What is the alloca function? How does it differ from malloc?
2. What are the commonly used environment variables?
3. Which function helps us to fetch the environment values?
4. Which functions help us to set the environment values?
5. Which functions help us to get and set the limits of a process?
6. What does the fork function do?
7. Which two functions help us to make a process wait?
8. What is a race condition?
9. What are the three measures of time? Which function allows us to get the time?

Answers

1. alloca is used for allocating memory of a suitable size, by calling it in the form
void *alloca(size_t size);
But instead of allocating from the heap as malloc does, it allocates memory from the
stack frame of the calling function, so the memory is released automatically when that
function returns.

2. The commonly used environment variables are

HOME – home directory
LANG – name of the locale
LOGNAME – login name
PATH – list of directories searched for executable files
TERM – terminal type
TZ – time zone etc.

3. The function getenv is used. Its typical format is

char *getenv(const char *name);

4. The function setenv is used. Its typical format is

int setenv(const char *name, const char *value, int rewrite);

5. The functions getrlimit and setrlimit are used.

Their typical formats are
int getrlimit(int resource, struct rlimit *rlptr);
and int setrlimit(int resource, const struct rlimit *rlptr);
respectively.

6. It creates a new (child) process, which is a copy of the calling (parent) process.

7. The functions wait and waitpid will be useful. Their formats are
pid_t wait(int *statloc); and
pid_t waitpid(pid_t pid, int *statloc, int options);
respectively.

8. A race condition occurs when two or more processes operate on a shared resource
and the final outcome depends on the order in which the processes run. (This should
not be confused with a deadlock, in which each process waits for a resource held by
the other, so that the wait becomes infinitely long.)

9. The three measures are real time, user CPU time and system CPU time.
The function clock_t times(struct tms *buf) fills in the buffer with the CPU time
values and returns the elapsed real time.

BLOCK - V
Block Introduction

This block is about signals. Signaling is a very important aspect of synchronizing
processes. Since every process has to share the system resources and also
share data and information, it becomes essential that there is a signaling mechanism
to convey messages.

We start with the basic concepts of signaling and what a process can do on
receiving a signal: it can ignore the signal, catch it, or allow the default action for
the signal to happen. Then we learn some of the commonly encountered signals; this list
is not comprehensive but only indicative in nature.

We then look at some functions that operate with signals. We also look into the
concept of a reentrant function, which is of vital importance if we have to operate with
signals.

We also look into the concept of signal sets and functions to operate on signal
sets. We finally have an encounter with a few job control signals.

Contents

5.1 Introduction
5.2 Basic Concepts of signals
5.3 Some of the commonly used signals
5.4 We now look at some of the functions of unix that operate with signals
5.5 The concept of Reentrant functions
5.6 Reliable signal terminology
5.7 Raise and Kill functions
5.8 Alarm and pause functions
5.9 signal sets
5.10 Sigpending function
5.11 Sigaction function
5.12 Sigsuspend function
5.13 Abort function
5.14 Sleep function
5.15 Job control signals

Signals
5.1 Introduction:

In this block, we study the signals used by the Unix system for its various
operations. They are of utmost importance in process synchronization, handling of
asynchronous events and other such operations.

Signals can be considered as software interrupts. Their function is to make the
presently running program, or other processes in their various states, take note of
some important event that has taken place, like a user typing a key to stop a program
or some program getting terminated.

One basic problem with signals is that they should not get lost, but must be
registered by the programs and processes which are supposed to take notice of them.
What we see here is essentially a list of signals, their description, their intended
effects and side effects, if any. The student can keep in mind that it is not necessary
to remember all of them, though that may be desirable. What is more important is to note
that such signals exist and that, in specific cases, one should be able to use them and,
more importantly, use them correctly.

5.2 Basic Concepts of signals

As we have noted, signals are software interrupts that are expected to signal
some situations. They all have names. To set them apart, all the names begin with the
characters SIG.

To assist in system implementation, all these names are defined as positive
constants (called signal numbers) in the header <signal.h>, which must be included by
any program that works with signals.

Having seen what a signal stands for, we now see under what conditions signals
get generated. Actually a large number of conditions and combinations of conditions
generate the signals. We list only a few of them below.

a) Hardware exceptions: These are special conditions arising during the
execution of a program which make further computation impossible or
prone to errors, like an invalid memory reference, division by 0 and so on.
Normally, the system hardware has the capability to detect such erroneous
situations and notify the kernel. The kernel, on being notified (normally in the
form of some bit being set or reset), generates the corresponding signal for the
process that was running when the condition occurred.
b) Terminal generated signals: These signals are generated when users
press certain terminal keys. For example, when a process is in execution, if the
DELETE key (Control-C on many systems) is pressed, the terminal generates an
interrupt signal (SIGINT). This mechanism is normally useful in stopping a runaway
program.
c) The kill(1) command: Sometimes it is desirable to send signals to other
processes. Suppose a background process is in a runaway condition. One way of
stopping it is to use the kill(1) command, which sends a signal that terminates
the background process.
d) The kill(2) function: Similarly, the kill(2) function allows a process to send
any signal to another process or process group. Obviously there are limitations
as to which process can send a signal to which other process. As a rule of thumb,
one has to be the owner of the process that is receiving the signal. Alternatively,
in superuser mode, any signal can be sent to any process.
e) Software conditions: There are certain situations wherein a condition arising
during the execution of the program needs to be made known to the process, but no
hardware-generated exception is available. Situations like an alarm set
by the process going off, or a process writing to a pipe after the reader has
terminated, need special attention. In such cases, software signals
are generated.

It may be noted that signals are, by definition, asynchronous events. No one can
predict beforehand when a signal will go off. (If such events could be predicted,
signals would not be needed: the process could simply check a particular bit or memory
location at the right moment.) Because the events are unpredictable, signals are needed
so that the process need not keep checking whether the event has happened; instead, the
event, when it happens, sends a signal to the process.

So far, we have tried to understand the need for and the nature of signals. Now,
once a signal is generated, what should happen? The process tells the kernel what to do
when a particular signal occurs. Technically, this is called the “disposition of the
signal”. Some authors also call it the “action associated with the signal”.

Normally, three types of actions are possible.


a) Ignore the signal: Most signals can simply be ignored. You may ask why a signal
that is ignored is generated in the first place. The answer is that the kernel
generates signals to report certain situations without knowing whether the
receiving process cares about them. It is for the process to decide whether a
signal needs to be attended to or can be ignored, based on the prevailing
situation.
However, two signals, SIGKILL and SIGSTOP, can be neither ignored nor caught.
We see shortly that these are the signals to kill or stop a process, and the
intention is to ensure that the action takes place at all costs.

b) Catch the signal: This is the action part of the signaling process. The process
asks the kernel to call a particular function whenever the signal occurs. In
that function, the programmer specifies what action is to be taken.
c) Allow the default action to take place: Every signal has a default action. If the
process neither ignores the signal nor catches it, the default action takes
place. It may be noted that in most cases the default action is to terminate
the process.

5.3 Some of the commonly used signals

Now we make a list of some of the commonly used signals and a brief
description of each of them. Though the student may not be able to remember all of
them, it may be pointed out that experienced programmers are normally expected to
know that such signals exist.
SIGABRT     Generated by calling the abort function. The process terminates
            abnormally.
SIGALRM     Generated when a timer set by the alarm function expires. Also
            generated when an interval timer set by the setitimer(2) function
            expires.
SIGBUS      Generated when an implementation-defined hardware fault occurs.
SIGCHLD     Whenever a process stops or terminates, SIGCHLD is sent to its
            parent. By default the signal is ignored. The parent can catch this
            signal if it wants to be notified of changes in the child's status.
            The normal action in the signal-catching function is to call one of
            the wait functions to fetch the child's process ID and termination
            status.
SIGCONT     Sent to a stopped process to make it continue. The default action is
            to continue the process if it is stopped; otherwise the signal is
            ignored.
SIGEMT      Raised when there is an implementation-defined hardware fault.
SIGFPE      Raised on an arithmetic exception, such as division by 0 or
            floating-point overflow.
SIGHUP      Raised when a disconnect is detected by the terminal interface; it
            is sent to the controlling process associated with the controlling
            terminal. It is also generated when the controlling process
            terminates; in this case the signal is sent to each process in the
            foreground process group. By convention, this signal is also used to
            make daemon processes reread their configuration files.
SIGILL      Raised when a process has executed an illegal hardware instruction.
SIGINFO     Raised by typing the status key. It normally causes status
            information on the processes in the foreground process group to be
            displayed on the terminal.
SIGIO       Raised to signify an asynchronous I/O event.
SIGIOT      Raised to indicate an implementation-defined hardware fault.
SIGKILL     Gives the system administrator a sure way to kill any process. It
            cannot be caught or ignored.
SIGPIPE     Raised if a process writes to a pipe whose reader has terminated.
            Also raised if a process writes to a socket whose reader has
            terminated.
SIGPROF     Raised when a profiling interval timer set by the setitimer(2)
            function expires.
SIGPWR      An interesting signal, available in SVR4, but system dependent. It
            is useful on a system connected through a UPS. When power fails, the
            UPS takes over the supply of power and the system can be notified.
            At this point, since the power supply continues, there is nothing
            that the system needs to do. If the battery of the UPS gets too low,
            the software is notified again, and this second notification is done
            by SIGPWR. The signal tells the init process to start the shutdown
            procedure immediately.
SIGQUIT     Raised when the terminal quit key is typed. It is sent to all
            processes in the foreground process group, asking the processes to
            terminate and also to generate a core file.
SIGSEGV     Raised when a process has made an invalid memory reference.
SIGSTOP     Raised to stop a process. It cannot be caught or ignored.
SIGSYS      Raised to signal an invalid system call.
SIGTERM     The termination signal sent by default by the kill(1) command.
SIGTRAP     Raised to indicate an implementation-defined hardware fault.
SIGTSTP     Raised when the terminal driver encounters the suspend key being
            pressed. It is sent to all processes in the foreground process
            group.
SIGTTIN     Raised by the terminal driver when a process in a background
            process group tries to read from its controlling terminal.
SIGTTOU     Raised by the terminal driver when a process in a background
            process group tries to write to its controlling terminal.
SIGURG      Raised to notify a process that an urgent condition has occurred.
SIGUSR1     User-defined signals for use in application programs.
SIGUSR2
SIGXCPU     Raised to indicate that a process has exceeded its current (soft)
            CPU time limit.
SIGXFSZ     Raised when a process exceeds its current (soft) file size limit.
SIGVTALRM   Raised when a virtual interval timer set by the setitimer(2)
            function expires.

5.4 The signal function

We now look at some of the Unix functions that operate on signals. The first of
them is the signal function.
Its typical format is
void (*signal (int signo, void (*func)(int)))(int);
Returns the previous disposition of the signal if ok, SIG_ERR on error.
Looking at the arguments: signo is the name of the signal, and the value of func can
be (i) the constant SIG_IGN, (ii) the constant SIG_DFL or (iii) the address of the
function to be called when the signal is raised.

If we specify SIG_IGN for func, the signal is ignored. (Of course, the signals
SIGKILL and SIGSTOP cannot be ignored.)

If we specify SIG_DFL for func, we are setting the action associated with the
signal to its default action.

If we specify the address of a function, then when the signal is raised it is caught
by calling that function. Some authors call the function a “signal handler” or a
“signal-catching function”.

Before we proceed, we would like to know the default actions of the signals we
have studied earlier. Fortunately, we need not make an exhaustive list again. Most of
them simply specify terminate as the default action. We can simply list those that have
a different default action.

The following signals have ignore as the default action
SIGCHLD, SIGPWR, SIGURG
The following signals have stop process as the default action
SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU
SIGCONT continues a stopped process and is otherwise ignored.

All others have terminate as the default action.


To make ourselves familiar with the concepts so far, we look at a simple program
that catches either of the two user-defined signals (Ref. Text Book 1). The functions
err_sys and err_dump are error-reporting helpers that are assumed to be defined
elsewhere, as in the text book.
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void sig_usr(int);    /* one handler for both the signals */

int
main(void)
{
    if (signal(SIGUSR1, sig_usr) == SIG_ERR)
        err_sys("can't catch SIGUSR1");
    if (signal(SIGUSR2, sig_usr) == SIG_ERR)
        err_sys("can't catch SIGUSR2");
    pause();    /* wait until a signal is caught */
    return 0;
}

static void
sig_usr(int signo)    /* argument is the signal number */
{
    if (signo == SIGUSR1)
        printf("received SIGUSR1\n");
    else if (signo == SIGUSR2)
        printf("received SIGUSR2\n");
    else
        err_dump("received signal %d\n", signo);
    return;
}
5.5 The concept of Reentrant functions
We started the discussion of signals by saying that a signal can be studied as a
software interrupt. This means that when a signal arrives while a process is
executing, the process is temporarily interrupted to handle the signal (by running
the signal handler function). If the signal handling does not end in termination of
the process, we come back to the code that was executing when the signal was
raised and continue from the place where we left off.

This looks simple enough, provided we can mark the position where we abandoned
the normal flow to execute the signal handler. That is easily done using the system
stack. But the problem lies elsewhere. We cannot always resume the interrupted
code correctly if it was in the middle of certain specific operations when we
interrupted it. For example, if it was allocating memory or updating a static data
structure, and the signal handler uses the same function or the same data, we may
not be able to continue exactly where we left off, because the handler has changed
that data in between.

A function that can safely be interrupted and then resumed from where we left
off, even if the signal handler has called the same function in the meantime, is said
to be a reentrant function, i.e. the results of other executions do not affect the
present execution.

Most systems provide a list of reentrant functions that they guarantee to satisfy
this property. The programmer will have to be extra careful when calling
non-reentrant functions from a signal handler.

5.6. Reliable signal terminology

We define some of the terminology that we will be using in our further
discussions.

A signal is said to be “generated” for a process when the event that causes the
signal occurs. When the signal is generated, the kernel usually sets a flag to
indicate the same.

The signal is said to be “delivered” when the action for the signal is taken.
During the time between the generation of the signal and its delivery, the signal
is said to be “pending”.

A process can “block” a signal. If a signal that is blocked is generated for a
process, and the action for that signal is either the default action or to catch it,
the signal remains pending until the process either unblocks the signal or changes
the action so that the signal is ignored.

This mechanism lets the process decide what to do with the signal not at the
moment the signal is generated but just before it is delivered, so the process can
still change the action in between.

The sigpending function can be called by a process to determine which signals
are blocked and pending for the process.

If a blocked signal is generated more than once before it is unblocked, a system
that delivers it as many times is said to queue the signals. Most Unix systems,
however, do not queue ordinary signals; the signal is delivered just once.

Each process is associated with a signal mask, which normally takes the form of
a set of bits. There is one bit for each possible signal. If the bit is on, the
corresponding signal is currently blocked. To examine or change the current signal
mask, the process can use the sigprocmask function.

The typical format of the function is
int sigprocmask (int how, const sigset_t *set, sigset_t *oset);
Returns 0 if successful, else returns -1.

If, at the time of calling, oset is not a null pointer, the current signal mask of
the process is returned through oset.

If set is not a null pointer, then the how argument indicates how the current signal
mask is modified. The different values of how and their significance are listed below.
how            Description
SIG_BLOCK      set contains the additional signals that need to be blocked.
               The new signal mask is the union of the current signal mask
               and set.
SIG_UNBLOCK    set contains the signals to be unblocked. The new signal mask
               is the intersection of the current signal mask and the
               complement of set.
SIG_SETMASK    The new signal mask is the value pointed to by set.

If set is a null pointer, the signal mask of the process is not changed and the
value of how is ignored.

If the call to sigprocmask unblocks any signals that are pending, at least one of
these signals is delivered to the process before sigprocmask returns.

5.7 Raise and Kill functions:

The kill function sends a signal to either a process or a group of processes. The
raise function allows a process to send a signal to itself.

The typical formats are
int kill (pid_t pid, int signo);
int raise (int signo);
Both return 0 if successful, else return -1.

For the kill function, the pid argument is interpreted as follows.

pid > 0      The signal is sent to the process whose process ID is pid.

pid == 0     The signal is sent to all processes whose process group ID equals
             the process group ID of the sender and for which the sender has
             permission to send the signal.

pid < -1     The signal is sent to all processes whose process group ID equals
             the absolute value of pid and for which the sender has permission
             to send the signal.

pid == -1    POSIX.1 originally left this case unspecified. On current systems
             the signal is sent to every process for which the sender has
             permission to send the signal.

For signo, signal 0 is the null signal, which is used to check whether a process
still exists. No signal is actually sent, but the normal error checking is performed.
If we send the null signal to a process that does not exist, kill returns -1 with
errno set to ESRCH. If kill returns 0, we say the process exists.

Unfortunately, there is a big catch here. Unix recycles process IDs. What does this
mean? When a process terminates, its ID may later be allotted to some other process.
If you send the null signal to find out whether the earlier process still exists, you
may get an indication that the process exists, but it may not be the process that you
are expecting; it may be a new process with the same ID.

5.8 Alarm and pause functions:

The alarm function is used to set a timer that will expire at a specified time in
the future. When the timer expires, SIGALRM is generated. If we do not catch the
signal, the default action is to terminate the process.

The typical format is as below
unsigned int alarm (unsigned int seconds);
Returns 0 or the number of seconds left in the previous alarm.

The seconds value is the number of clock seconds after which the signal is
generated. At that instant the kernel generates SIGALRM. But the process may actually
get control to handle the signal somewhat later, because the signal may be blocked at
that moment and because there can be scheduling delays.

Also, only one alarm can be set per process. If an alarm is already set for the
process, the alarm function resets it to the new value and returns the time remaining
for the previous alarm in seconds. A seconds value of 0 cancels the previous alarm
without setting a new one.

Since the default action of SIGALRM is to terminate the process, most processes
that use an alarm catch the signal and then decide whether to terminate or not.
The typical format of the pause function is as follows
int pause (void);
Returns -1 with errno set to EINTR; otherwise it does not return.

The pause function suspends the calling process until a signal is caught.

Using a combination of pause and alarm, we can simulate the sleep function. Let us
call this function sleep_new.

#include <signal.h>
#include <unistd.h>

static void
sig_alrm(int signo)
{
    return;    /* nothing to do; returning wakes up the pause */
}

unsigned int
sleep_new(unsigned int nsecs)    /* takes the delay in seconds */
{
    if (signal(SIGALRM, sig_alrm) == SIG_ERR)
        return (nsecs);
    alarm(nsecs);         /* start the timer */
    pause();              /* wait till a signal is caught */
    return (alarm(0));    /* turn off the timer, return the unslept time */
}
This simple implementation, though it illustrates the use of alarm, pause and
signals, is imperfect.

The student is advised to think about why it is imperfect, based on the discussions
so far.

5.9 Signal sets.

In one of the previous sections (on the concept of blocking, to be exact) we talked
about signal sets, though we did not specify the actual form that these sets take. We
now deal with them in more detail.

So far we have been analyzing the effects of single signals, one at a time. But
several signals often have to be handled together, for example when a group of signals
is to be blocked during the same piece of code. In such cases, we talk of sets of
signals: a signal set. Note that we used a signal set to arrange and rearrange masks
in sigprocmask.
The following functions manipulate signal sets.
1) int sigemptyset (sigset_t *set);
2) int sigfillset (sigset_t *set);
3) int sigaddset (sigset_t *set, int signo);
4) int sigdelset (sigset_t *set, int signo);

All of them return 0 if ok, -1 on error.

5) int sigismember (const sigset_t *set, int signo);
Returns 1 if true, 0 if false, -1 on error.
The function sigemptyset initializes the signal set pointed to by set so that all
signals are excluded.
The function sigfillset initializes the signal set pointed to by set so that all
signals are included. Before using a signal set for any operation, the application
should call either of these two functions to initialize the set.
The sigaddset and sigdelset functions respectively add to and delete from the set
pointed to by set the signal whose number appears in signo. Note that each call adds
or deletes one signal only.

5.10 Sigpending function

Often we are interested in knowing the set of signals that are blocked from
delivery and are currently pending for the calling process. The sigpending function
does this job.

The typical format is
int sigpending (sigset_t *set);
Returns 0 if ok, -1 if error.
In the same spirit, we can write a small function that prints the signals in the
current signal mask of a process, i.e. all its blocked signals, as below. The function
err_sys is an error-reporting helper that is assumed to be defined elsewhere.
#include <errno.h>
#include <signal.h>
#include <stdio.h>

void
pr_mask(const char *str)
{
    sigset_t sigset;
    int errno_save;

    errno_save = errno;    /* preserve the caller's errno */
    if (sigprocmask(0, NULL, &sigset) < 0)
        err_sys("sigprocmask error");
    printf("%s", str);
    if (sigismember(&sigset, SIGINT)) printf(" SIGINT");
    if (sigismember(&sigset, SIGQUIT)) printf(" SIGQUIT");
    if (sigismember(&sigset, SIGUSR1)) printf(" SIGUSR1");
    if (sigismember(&sigset, SIGALRM)) printf(" SIGALRM");
    /* Print the corresponding signal names.
       This list can go on, to check as many signals as you need. */
    printf("\n");
    errno = errno_save;
}

5.11 Sigaction function

This function is used to examine, to modify, or to both examine and modify the
action associated with a particular signal. The function sigaction supersedes the
function signal, which we have seen earlier in this block.

The typical format is
int sigaction (int signo, const struct sigaction *act, struct sigaction *oact);
Returns 0 if ok, -1 on error.

The argument signo is the signal whose action we are examining or modifying. If the
act pointer is not null, the action is modified. If oact is not null, the system
returns the previous action for the signal through it.
The structure sigaction is of the following form.
struct sigaction {
    void (*sa_handler)(int);    /* address of the signal handler; */
                                /* it can also be SIG_IGN or SIG_DFL */
    sigset_t sa_mask;           /* additional signals to block while */
                                /* the handler runs */
    int sa_flags;               /* signal options */
};
Once sigaction sets up the action for a particular signal, it remains installed
until it is explicitly changed by calling sigaction again.
Some of the sa_flags options are listed below along with their descriptions.
Option flag      Description
SA_NOCLDSTOP     If signo is SIGCHLD, do not generate the signal when a child
                 process stops.
SA_RESTART       System calls interrupted by this signal are automatically
                 restarted.
SA_ONSTACK       If an alternate stack has been declared, this signal is
                 delivered to the process on the alternate stack.

This is not the complete set of options. The student may refer to standard
literature to get a complete list of options.

5.12 Sigsuspend function.

Often we would like not to receive signals while certain critical sections of the
code are being executed. One way to do this is to use the signal mask to block and
unblock signals: just before entering the critical section we block the signals, and
we unblock them after we leave the section. However, consider the case where we
unblock the signal and then call pause, waiting for the previously blocked signal to
occur. If the signal we are waiting for happens to arrive between the unblocking and
the call to pause, it is handled before pause is called; pause may then wait forever,
and in effect we lose the signal.

To avoid this, we need a way to both reset the signal mask and put the process to
sleep in a single atomic operation. (Note that an atomic operation is one that cannot
be interrupted midway. This means that the unmasking and the pause are completed in
one stroke, avoiding the problem we have just discussed.) The sigsuspend function
provides this.
int sigsuspend (const sigset_t *sigmask);
Returns -1 with errno set to EINTR.
The signal mask of the process is set to the value pointed to by sigmask, and the
process is suspended until a signal is caught. When the signal handler returns,
sigsuspend returns and the signal mask is restored to its value before the call.

5.13 Abort function


The format of the abort function is as follows
void abort (void);
This function never returns.
This function sends the SIGABRT signal to the calling process, and the process
normally should not ignore this signal.
However, the process can catch SIGABRT. This facility is provided to allow the
process to perform any cleanup that it wants to before it terminates.

5.14 Sleep function


We have used, and also briefly simulated, the sleep function in one of the previous
sections. However, sleep is a function that needs to be studied in the context of
signals.
The typical format of sleep is

unsigned int sleep (unsigned int seconds);
Returns 0 or the number of unslept seconds.
This function, when called, causes the calling process to be suspended until one
of two conditions is met.
a) The amount of real time specified by seconds has elapsed.
b) A signal is caught by the process and the signal handler returns.

However, the difficulty is the same as we have seen with the alarm function, i.e.
there may be a time lag between the moment the sleep ends and the moment the process
actually resumes execution.
A word about the values returned by sleep. On a normal return, it returns 0.
However, if it returns early because a signal was caught (case (b) above), it
returns the number of seconds that were still to elapse for the normal return.

5.15 Job control signals.


There are six signals that can be considered to be job control signals.

SIGCHLD    Child process has terminated or stopped.
SIGCONT    Continue the process, if stopped.
SIGSTOP    Stop the process (cannot be caught or ignored).
SIGTSTP    Interactive stop signal.
SIGTTIN    A member of a background process group tries to read from the
           controlling terminal.
SIGTTOU    A member of a background process group tries to write to the
           controlling terminal.

Typically, a job control signal is one that can be used for controlling the
sequence of process execution.

Since we have already seen these signals, we will not discuss them in any detail.
The student is advised to note that by suitably sending these signals, we can stop and
resume processes independently of the normal scheduling and so get the desired
sequence of operations. However, one should be careful about the interactions between
the signals to ensure proper sequencing.

One such interaction is that when any of the four stop signals (SIGTSTP, SIGSTOP,
SIGTTIN or SIGTTOU) is generated for a process, any pending SIGCONT signal for that
process is discarded.

Similarly, when SIGCONT is generated for a process, any pending stop signals for
that process are discarded.

Block Summary

We have briefly discussed the importance of signals in the synchronization of
processes. Signaling is necessary to ensure proper sequencing of processes and sharing
of resources, and also to take care of conditions like hardware exceptions,
terminal-generated signals, killing of processes etc. We have seen a list of commonly
used signals and the fact that a process can ignore a signal, catch it or allow the
default action to take place. It was also established that reentrant functions are
essential for the smooth operation of signal handlers.

We have looked into the details of several functions like raise, kill, alarm, pause
etc. Finally we discussed the concept of signal sets and five functions to manipulate
such signal sets. Apart from this, we studied the sigpending, sigaction, sigsuspend,
abort and sleep functions. We closed the discussion with job control signals.

Review Questions

1. What are the situations in which signaling becomes necessary?
2. What are the actions that a process can take on encountering a signal?
3. What is a reentrant function?
4. Give the format of the kill and raise functions.
5. What is the purpose of the alarm and pause functions?
6. What is a signal set?
7. What are the operations on signal sets?
8. Give the format of the sleep function.
9. Name the six job control signals.
10. Which signal does the abort function send?

Answers

1. Signaling becomes necessary in situations like process synchronization, resource
sharing, hardware exceptions, terminal-generated signals, killing of processes etc.

2. There are 3 types of actions possible: ignore the signal, catch the signal and
allow the default action.

3. A function which can be interrupted and later continued from where we had left it,
without affecting the correctness of the results, is called a reentrant function.

4. The formats are
int kill (pid_t pid, int signo);
and int raise (int signo); respectively.

5. The alarm function is used to specify a future time at which the SIGALRM signal
should be generated. The pause function suspends the process until a signal is caught.

6. It is a collection of signals, so that the arrangement of masks becomes easy.

7. To initialize the set as empty or as full, to add a signal to it, to delete a
signal from it, and to check whether a given signal is a member.

8. unsigned int sleep (unsigned int seconds);

9. SIGCHLD, SIGCONT, SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU.

10. The SIGABRT signal.

BLOCK VI

BLOCK INTRODUCTION

In this block we study interactions between processes. In the previous chapters
we have seen that there are relationships between different processes. Every process
has a parent process, but need not necessarily have child processes. Whenever a child
process terminates, the parent is notified. Also, the parent can obtain the exit status
details of its children.
In this block, we look at process groups in more detail, in order to study the
relationships between the processes and also the relationship between the login shell
(which gets invoked when we log in) and the processes that start from the login shell.

CONTENTS
6.1 The concept of terminal login
6.2 Network logins
6.3 Process groups
6.4 Sessions
6.5 Controlling terminal
6.6 Tcgetpgrp and tcsetpgrp functions
6.7 Job control
6.8 Review questions and answers

6.1 The concept of terminal login
What happens when a user logs in from a terminal? Though there are slight
differences between one version of Unix and another, we look at the general scheme,
pointing out the specialties of particular versions when necessary.
Normally the system administrator will have created a file, which we can call the
login file. In this file, each line corresponds to one terminal device of the system.
(Note that it need not always be an I/O terminal; other devices are also included.)
The line specifies the name of the device, the parameters to be exchanged for logging
in, and so on. When the system is started, the kernel creates the first process, init
(which has process ID 1), and init is responsible for the interaction with the various
users as and when they log in at different terminals. init reads the login file cited
above and, for every terminal on which logins are allowed, does a fork followed by an
exec of a program that handles the terminal (let us call it getty; tty stands for
terminal device). The entire process can be shown as in the following figure.

bootstrap
    |
  init     (process ID 1)
    |      fork, once per terminal
  init
    |      exec; each child process executes getty
  getty

The processes described so far all work with superuser privileges and have a real
user ID of 0.
getty calls the open function for the terminal device. This opens the terminal for
reading and writing. Once the device is open (depending on the type of the device, the
actual mode of opening and the delay before it becomes open differ), file descriptors
0, 1 and 2 are set to the device. (The student is advised to look back at the concept
of file descriptors to be sure what these descriptors are about.) getty then outputs a
login prompt (or something similar) and waits for the user to type in his user name.
Once the name is typed, getty invokes the login program. init invokes getty with an
empty environment, and getty creates the environment for login, with the name of the
terminal and other details. At this stage, the status of the processes appears as
below.

bootstrap
    |
  init     (reads the tty file, forks once per terminal,
    |       creates an empty environment)
    |      fork
  init
    |      exec
  getty    (opens the terminal device as file descriptors 0, 1, 2;
    |       reads the user name, sets the initial environment)
    |      exec
  login

Now login takes over. Since it is working with superuser privileges, it can call
the function getpwnam to fetch the password entry for the user. It prompts the user
for his password and reads the password typed in by him. It then encrypts the typed
password and compares it with the encrypted entry in the password file. (The student
will do well to remember that passwords in Unix are stored in encrypted form in the
password file, so whatever is typed at the terminal has to be encrypted with the
same algorithm before it can be compared.) If the password fails to match (after
several trials, the number of which can be set), login calls exit with an argument of
1. This termination is noticed by init, which will do another fork and exec of getty,
so that the user can try the login process again.
If, on the other hand, the login has been successful, then login changes to the
home directory of the user (chdir). The ownership of the terminal device is changed
using chown, so that the user who has successfully logged in becomes the owner and
group owner. The access permissions of the terminal device are changed so that the
user can read and write it and the group can write it. The group IDs are set, and the
home directory, user name and path are incorporated into the environment. The user who
has logged in will now be able to work under his own user ID. At this stage the
arrangement of processes appears as follows.

bootstrap
    |
  init     (process ID 1)
    |      fork
  init
    |      exec
  getty
    |
  login
    |
  login shell
    |      file descriptors 0, 1 and 2
  terminal device driver
    |      RS232 connection
  user at terminal

6.3 Network Logins

In the present day scenario, networks are taking over from individual systems. In
the case of the terminal login described above, where in a central unit is catering to the
needs of several terminals, init will have full information about the terminal devices
enabled for logins and uses the getty for the devices. In the case of a network,
however, the login is done through the ethernet drivers which interface to the kernel and
no prior information about how many such logins can occur is available. Hence, unlike
the case we have discussed previously, wherein the process creates a fork of each
terminal and waits for the login to come in, it is more sensible to act as and when a
network connection request arrives.
At startup, the init process executes a shell script when the system is ready for
the multiuser operations. This starts a daemon process called inetd. This daemon

132
keeps waiting for any TCP/IP connection request to arrive and once such a request
arrives, it does a fork and exec (as in the previous case).
However, in a network instead of login the user is supposed to telnet to get
himself into the network. Normally, he does this by a typing in the command "telnet
hostnode".
On the remote host, inetd then starts the telnet server process, called telnetd (of
course, the usual process of checking the password, exiting if the password does not
match etc. has to be gone through; at this stage, we presume that the passwords
match). The telnetd process opens a pseudoterminal device and then splits into two
processes, one to take care of the network connection and communication with the
client and the second to run login and, later, the login shell. The pseudoterminal
connects these two processes, and so connects the client to the login shell. The
file descriptors 0, 1 and 2 of the login shell are attached to the pseudoterminal.
Then login performs operations like changing to the home directory, setting the group
ids and user id and setting up the initial environment.
The arrangement of the processes can be briefly given as below.

init (process id 1)
    |  executes a shell script when the system comes up for multiuser operation
inetd  <---  TCP connection request from the telnet client
    |  fork after the connection request
inetd
    |  exec
telnetd
    |  exec
login shell
    |  file descriptors 0, 1 and 2
pseudoterminal
    |  network connection
user at a terminal

6.4 Process groups

Every process has a process id and it will also have a process group id. A
process group is a collection of one or more processes, with its own unique process
group id. Process group ids are positive integers and can be stored in the data type
pid_t, just like process ids.
The function getpgrp returns the process group id of the calling process. The
typical format is
pid_t getpgrp(void);
It returns the process group id of the calling process.
Further, each process group may have a process group leader, which is identified by
having its process id equal to the process group id of the group.
A process group leader can create a process group and create processes in the group.
It is not necessary that the life of the process group leader should equal or exceed
the life of the group itself. A process group survives as long as even one of the
processes in the group survives, so the group leader can terminate before the group
does. The last process of the group may either terminate or enter some other process
group; in either case the group itself ceases to exist.
The function setpgid helps a process either to join an existing process group or
to create a new process group.
The typical format is
int setpgid (pid_t pid, pid_t pgid);
It returns 0 if successful and -1 on error. This sets the process group id of the
process pid to pgid, i.e. it moves the process with the id pid to the group with the
group id pgid. If pid and pgid are equal, then the process pid creates a new group and
itself becomes the group leader.
A process can set the process group id of either itself or one of its children.
However, once the child has called one of the exec functions, the parent cannot change
the process group id of the child.
If pid=0 then the process id of the caller is used. If pgid=0 then pid is used as the
group id.
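
The use of getpgrp and setpgid can be sketched as below. This is only a minimal
illustration, assuming a POSIX system; the function name child_becomes_group_leader
is ours and does not come from any library. A child is forked and calls
setpgid(0, 0), which by the rules above makes it the leader of a new process group.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that puts itself into a new process group.
 * Returns 1 if the child ended up as the leader of its own group
 * (process group id == process id), 0 if not, -1 if fork or waitpid failed.
 * The function name is hypothetical, chosen for this illustration. */
int child_becomes_group_leader(void)
{
    pid_t child = fork();
    int status;

    if (child < 0)
        return -1;

    if (child == 0) {
        /* pid = 0: act on the caller; pgid = 0: use the caller's pid as group id */
        if (setpgid(0, 0) < 0)
            _exit(2);
        _exit(getpgrp() == getpid() ? 0 : 1);
    }

    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The child reports its result through its exit status, so the parent needs nothing
more than waitpid to learn whether the group was created.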

6.5 Sessions
In this section, we look into a very important concept, the concept of sessions. A
session is a collection of one or more process groups.
A process can establish a new session by calling the setsid function.
The typical format is
pid_t setsid (void);
It returns the process group id if successful and -1 on error. If the calling
process is not a process group leader, setsid creates a new session; if it is
already a process group leader, the call fails.
When a new session is created, three things happen:
a) The calling process becomes the session leader of the new session. The
process will be the only process of the new session.
b) The process becomes the process group leader of a new process group.
c) The process loses its association with its controlling terminal.
Now what is a controlling terminal? We look into it in the next section.
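
A small sketch of setsid in use is given below, assuming a POSIX system. The
function name child_starts_new_session is hypothetical. The call is made in a
forked child because a newly forked child is never a process group leader, so
setsid is allowed to succeed there.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child and let it call setsid().
 * Returns 1 if the child became the leader of a new session and of a new
 * process group (both ids equal to its process id), 0 if not, -1 on failure.
 * The function name is hypothetical, chosen for this illustration. */
int child_starts_new_session(void)
{
    pid_t child = fork();
    int status;

    if (child < 0)
        return -1;

    if (child == 0) {
        pid_t sid = setsid();          /* new session, new group, no terminal */
        if (sid == (pid_t)-1)
            _exit(2);
        _exit(sid == getpid() && getpgrp() == getpid() ? 0 : 1);
    }

    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```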

6.6 Controlling terminal

The concept of a controlling terminal was raised in the previous section. In this
section we discuss what a controlling terminal is.
A controlling terminal is normally the terminal device on which one logs in. A session
can have a single controlling terminal. The session leader establishes the connection
with the controlling terminal on behalf of the session. Such a session leader is called
the controlling process. The process groups within a session can be divided into a
single foreground process group and one or more background process groups.
When the terminal's interrupt key is pressed, the interrupt signal (SIGINT) is sent to
all processes in the foreground process group.
Normally, once a user logs in and so creates a session, his terminal automatically
becomes the controlling terminal.
In case any other program or process wants to communicate with the controlling
terminal, it has to open the file /dev/tty and use it for communication.
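
As an illustration of the last point, the sketch below writes a message to the
controlling terminal through /dev/tty, which works even when the standard output
of the program has been redirected to a file. The function name
write_to_controlling_terminal is ours; a POSIX system is assumed.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write msg to the controlling terminal.
 * Returns the number of bytes written, or -1 if the process has no
 * controlling terminal or the write fails.
 * The function name is hypothetical, chosen for this illustration. */
int write_to_controlling_terminal(const char *msg)
{
    int fd = open("/dev/tty", O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;                 /* no controlling terminal */
    n = write(fd, msg, strlen(msg));
    close(fd);
    return (int)n;
}
```

A process that has lost its controlling terminal (for example after setsid) gets
-1 here, because the open of /dev/tty fails.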

6.7 tcgetpgrp and tcsetpgrp functions

These functions are used to find out, and to tell the terminal device driver, which
process group is the foreground process group.
The general format is
pid_t tcgetpgrp(int filedes);
This returns the process group id of the foreground process group if OK, and -1 on
error.
The other function is
int tcsetpgrp(int filedes, pid_t pgrpid);
This returns 0 if successful and -1 on error.
The function tcgetpgrp returns the process group id of the foreground process
group for the terminal open on filedes.
tcsetpgrp can be used to set the foreground process group id to pgrpid (if the
process has a controlling terminal). The pgrpid value should be the process group id of
a process group in the same session. filedes must refer to the controlling terminal of
the session.
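
A typical use of tcgetpgrp is for a process to check whether it is running in the
foreground. A minimal sketch follows, assuming a POSIX system; the function name
in_foreground_group is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the caller is in the foreground process group of the terminal
 * open on fd, 0 if it is in a background process group, and -1 if tcgetpgrp
 * fails (for example because fd is not the caller's controlling terminal).
 * The function name is hypothetical, chosen for this illustration. */
int in_foreground_group(int fd)
{
    pid_t fg = tcgetpgrp(fd);

    if (fg == (pid_t)-1)
        return -1;
    return fg == getpgrp();
}
```

Calling in_foreground_group(STDIN_FILENO) from a program started with & at the end
of the command line would normally be expected to give 0 under a job control shell.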

6.8 Job control

The term job control has been encountered previously also. It normally means
the capability to start multiple jobs (groups of processes) from a single terminal and
to control which job can access the terminal and which jobs run in the background.
Using job control essentially means that we are using a shell that supports job
control, together with a terminal driver and signals that support it.

When a background job is started, the shell assigns a job identifier and prints one
or more process ids. The interaction with the terminal driver becomes important
because the suspend key, if entered from the terminal, affects the foreground job. This
key, when pressed, makes the driver send the SIGTSTP signal to all processes in the
foreground process group, which stops them. The background jobs, however, remain
unaffected.
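
The effect of SIGTSTP can be observed without a terminal, as in the sketch below. A
child sends the signal to itself, as the driver would do for the suspend key, and
the parent watches it stop and then resumes it. SIGCONT, the standard signal that
resumes a stopped process, is not discussed in the text above; the function name
stop_and_resume_child is hypothetical, and a POSIX system is assumed.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sends itself SIGTSTP.
 * Returns 1 if the parent saw the child stop and resumed it with SIGCONT,
 * 0 if the child was never stopped, -1 on failure.
 * A result of 0 is possible: the system discards SIGTSTP for a process whose
 * process group is orphaned, e.g. when it is not run from a job control shell.
 * The function name is hypothetical, chosen for this illustration. */
int stop_and_resume_child(void)
{
    pid_t child = fork();
    int status, stopped = 0;

    if (child < 0)
        return -1;

    if (child == 0) {
        signal(SIGTSTP, SIG_DFL);  /* make sure the default action (stop) applies */
        raise(SIGTSTP);
        _exit(0);
    }

    /* WUNTRACED makes waitpid report a stopped child as well as a dead one */
    if (waitpid(child, &status, WUNTRACED) < 0)
        return -1;
    if (WIFSTOPPED(status)) {
        stopped = 1;
        kill(child, SIGCONT);      /* let the child run again */
        if (waitpid(child, &status, 0) < 0)
            return -1;
    }
    return stopped;
}
```

This is essentially what a job control shell does when it reports a job as
"Stopped" and later continues it.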

There is yet another condition that the terminal driver must be able to handle. There
is one foreground job and there may be one or more background jobs. If a character is
entered from the terminal, which of these jobs receives that input? The foreground job
is normally expected to receive the input. If a background job needs input, it
may try to read from the terminal. The terminal driver detects this and sends a
SIGTTIN signal to the background job. The background job stops and, if needed, the user
can bring it to the foreground. It can then read its input from the controlling terminal.

Similarly, a background job may try to write output to the terminal. Depending on the
circumstances, the user may allow or disallow this. There are, however, conflicting
opinions about the need for job control. With the advent of windowing
systems it can be claimed that job control is not needed and that the effort and
resources spent on it can be used for better purposes. Others feel it can
supplement the facilities provided by the windowing system.

Block Summary

In this block, we looked into the concept of a terminal login. We understood the
various steps that the system has to undertake for a successful login. We also
discussed the difference between a terminal login and a network login. The concept of
process groups, and how to set and get the process group id, were also discussed.

Then the concept of sessions was discussed, as was the idea of a controlling
terminal. We familiarised ourselves with the functions that indicate the foreground
process group to the driver. The final idea discussed was the concept of foreground
and background jobs and how we can control them.

Review Questions

1. What is the difference between network login and terminal login?

2. What is a process group?

3. Which function gives the process group id of the calling process?

4. What is a session?

5. Which function establishes a new session? Give its format.

6. What is a controlling terminal in the context of a session?

7. What does the function tcgetpgrp do?

8. What does the function tcsetpgrp do?

Answers

1. In a terminal login, we use the simple login process, whereas in a network
login we use telnet.
2. A process group is a collection of one or more processes, with its own
process group id.
3. The function is pid_t getpgrp(void), which returns the process
group id of the calling process.
4. A session is a collection of one or more process groups.
5. The function is setsid; its format is pid_t setsid (void).
6. A controlling terminal in the context of a session is the terminal with
which the session leader establishes the connection on behalf of the session.
7. It returns the process group id of the foreground process group.
8. It sets the foreground process group id.

Unit – III
INTRODUCTION
This unit consists of two blocks. In the first block, we are introduced to the
details of terminal I/O operations. We get acquainted with the two major modes of I/O
operations – the canonical and non canonical modes. Then we get started with the
normal I/O functions, how to set and control the I/O parameters etc. The next concept
is how to interact with slower devices, especially when a number of such devices are
there – either by polling or by use of semaphores etc.

We also see what a deadlock is, which incidentally is a result of the concept of
record locking. We also learn to distinguish between advisory and mandatory
locks. Then we get ourselves introduced to streams – the details of stream messages.
We also discuss daemons – which are background processes that
do several sundry and accounting jobs.

In the second block, we look into the concept of inter-process communication.
We start with the concept of pipes, which can be viewed as a connecting channel
between two processes through the kernel. It is a half duplex connection. We study the
various functions to open and close pipes and also how we can get a full duplex pipe
out of the structure.

We also study FIFOs – which help us connect even processes that are not from
the same ancestor – which is not possible with pipes. The other concept we
study is message queues.

Then we move on to semaphores – which help processes to share
resources by giving suitable indications. We study the various functions that operate
on semaphores. We also study the concepts of shared memory, stream pipes and client
server operations.

BLOCK - VII
Block Introduction

In this block, we shall be looking into the details of terminal I/O operations.
The block starts with an introduction to the various terms used in connection with
I/O operations. We start with the two major modes of I/O operations – the canonical and
non canonical modes. Then we see the normal I/O functions, the functions for getting
and setting terminal attributes, the baud rate and line control functions and the
concept of terminal identification.

Then the concept of non blocking I/O, which facilitates interaction with slow
devices, is introduced. The other key concept is record locking, which allows multiple
processes to use the same file simultaneously; we see how to set or adjust locks and
other details.

The next concept is the deadlock, which of course is a product of record
locking. Deadlocks occur when two or more processes have locked their own resources
and are waiting for resources held by the others. We also see the concept of implied
inheritance and how locks are released. A subtopic is the concept of advisory and
mandatory locks.

Then we move on to streams – the details of stream messages and the functions that
help us to operate with streams for reading and writing. The other concept is I/O
multiplexing – how to choose and operate with one out of a list of several devices.

Finally, we look at the concept of daemons: why we need them, how to create
them etc.

Contents:

7.1 Introduction
7.2 Getting and setting attributes
7.3 Baud rate functions
7.4 Line control functions
7.5 Terminal identification
7.6 The canonical mode
7.7 Non canonical mode
7.8 Termcap, terminfo and curses
7.9 Non blocking I/O
7.10 Record locking
7.11 Concept of deadlock
7.12 Streams
7.13 I/O multiplexing
7.14 Select function
7.15 Poll function
7.16 Introduction to daemons and their characterization
7.17 How to write daemons and why?
Review Questions and Answers

BLOCK VII

Terminal I/O Operations

7.1 Introduction: Handling of terminal I/O is at best a messy operation. A lot of
factors need to be taken into account and even then the result can at best be only
satisfactory. Before going into the details of I/O operations, we start by noting the
two major modes of terminal input.
a) Canonical mode input processing
b) Non canonical mode input processing.

In the canonical mode terminal input mechanism, the input is processed as lines.
The terminal driver returns at most one line per read request.

In the noncanonical mode input processing, the input characters are passed on
as such, and are never assembled into lines.

The default operation is canonical.

We can think of a terminal device as being controlled by the terminal
driver. Each terminal device has an input queue and an output queue.

    next character read          next character written
    by the process               by the process
           ^                            |
           |        if echo is          v
      Input queue ----------------> Output queue
      (max input)     enabled
           ^                            |
           |                            v
    next character read          next character to
    from device                  transmit to device

The above logical representation gives the input and output queues for a terminal
device. There are several points to consider.

a) There is a link between the input and output queues only if echo is
enabled.
b) The maximum size of the input queue may be predefined. Once this size is
exceeded, what happens to the next incoming input is
implementation dependent. One simple mechanism could be to raise
an alarm in some form, for example by ringing the terminal bell.
c) There is also a limit on the maximum number of bytes that a
canonical input line can contain.
d) The output queue is also limited in size. However, there need not be
any overflow condition here because, if the buffers get filled up, the
kernel can just put the writing process to sleep.
e) There is a tcflush function that helps to flush either the input queue or
the output queue.

Most Unix systems implement the canonical processing using a module called
the terminal line discipline. This is shown below:

        User process
             |
    read and write functions
             |
    +-------- Kernel --------+
    | Terminal line          |
    | discipline             |
    |        |               |
    | Terminal device driver |
    +------------------------+
             |
          Device

When the user process wants to have I/O with a device, it calls the
appropriate read and write functions. These functions, in turn, call the terminal line
discipline, which interacts with the terminal device driver to reach the
actual device. Note that the actual devices can be physical devices with varying
physical properties that need to be tuned properly before the I/O operations can
start. Hence the need for the terminal line discipline.

All the characteristics of the terminal device (those that can be checked and
changed) are contained in a structure called termios.

struct termios {
    tcflag_t c_iflag;     /* input flags */
    tcflag_t c_oflag;     /* output flags */
    tcflag_t c_cflag;     /* control flags */
    tcflag_t c_lflag;     /* local flags */
    cc_t     c_cc[NCCS];  /* control characters */
};

The data type tcflag_t can hold a variety of flag values. The nature of each of
the flag fields is evident from the comments. Each field is a set of bits, and each
bit or group of bits switches one terminal option on or off. The actual list of options
is too long to be given here; it is enough to note that most of the terminal settings
one may need to check or change are controlled through these flags.

We also list out the normal functions used in I/O operations and their method of
usage in brief.

Function       Description
tcgetattr      Fetches the attributes and returns them in a termios structure
tcsetattr      Sets the attributes as given in the termios structure
cfgetispeed    Get the input speed (baud rate)
cfgetospeed    Get the output speed (baud rate)
cfsetispeed    Set the input speed (baud rate)
cfsetospeed    Set the output speed (baud rate)
tcdrain        Wait for all output to be transmitted
tcflow         Suspend or restart transmission or reception
tcflush        Flush pending input and/or output
tcsendbreak    Send a BREAK character
tcgetpgrp      Get the foreground process group id
tcsetpgrp      Set the foreground process group id

Some of these functions, we study in more detail.

7.2 Getting and setting attributes:

tcgetattr and tcsetattr

Typical formats are

int tcgetattr (int filedes, struct termios *termptr);
int tcsetattr (int filedes, int opt, const struct termios *termptr);
Both return 0 if OK and -1 on error.

The filedes argument refers to the terminal device. The get function returns the
current terminal attributes in the structure pointed to by termptr. The set function
sets the attributes from that structure; the opt argument (TCSANOW, TCSADRAIN or
TCSAFLUSH) tells it when the new attributes are to take effect.
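
The usual pattern is to fetch the attributes, change only the bits of interest and
write the structure back. The sketch below switches echo off in this way. The
function names clear_echo and echo_off are ours; ECHO is the standard flag in the
local flags field.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <termios.h>

/* Clear the ECHO bit in an attribute set that was fetched with tcgetattr.
 * The function name is hypothetical, chosen for this illustration. */
void clear_echo(struct termios *t)
{
    t->c_lflag &= ~(tcflag_t)ECHO;
}

/* Switch echo off on the terminal open on fd. The old attributes are stored
 * in *saved so that the caller can restore them later with
 * tcsetattr(fd, TCSANOW, saved).
 * Returns 0 if OK, -1 on error (for example when fd is not a terminal). */
int echo_off(int fd, struct termios *saved)
{
    struct termios t;

    if (tcgetattr(fd, saved) < 0)
        return -1;
    t = *saved;                 /* structure copy; *saved stays untouched */
    clear_echo(&t);
    return tcsetattr(fd, TCSAFLUSH, &t);
}
```

Keeping a copy of the original attributes matters because the terminal settings
outlive the program; a program that changes them is expected to put them back.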

7.3 Baud rate functions:

Baud rate, as we know, is the speed at which data is transferred to or from the I/O
device. The term is historical; today it is used to mean bits per second. A number of
functions are available to manipulate the baud rates of the connected I/O devices. We
shall examine a few of them.

Typical formats are

speed_t cfgetispeed (const struct termios *termptr);
speed_t cfgetospeed (const struct termios *termptr);
Both of them return the baud rate value.

int cfsetispeed (struct termios *termptr, speed_t speed);
int cfsetospeed (struct termios *termptr, speed_t speed);
Both return 0 if OK and –1 on error.

However, note that the set functions only change the values stored in the termios
structure. The device itself is not affected until tcsetattr is called with that
structure, so a successful return from a set function does not mean that the device
has accepted the new speed.

It is good practice to call tcgetattr after tcsetattr to check whether the settings
have actually taken effect before going ahead with the next steps.

Similarly, the termios structure should first be filled in with tcgetattr before
attempting to set the new values in it.
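
A short sketch of the speed functions follows. The helper name set_both_speeds is
hypothetical; B9600, used in the usage note, is one of the standard speed constants
from termios.h. The helper touches only the termios structure, so nothing happens
to any device until the caller passes the structure to tcsetattr.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <termios.h>

/* Store the same speed for input and output in *t.
 * Only the structure is changed; the device is untouched until the caller
 * passes *t to tcsetattr. Returns 0 if OK, -1 on error.
 * The function name is hypothetical, chosen for this illustration. */
int set_both_speeds(struct termios *t, speed_t speed)
{
    if (cfsetispeed(t, speed) < 0)
        return -1;
    return cfsetospeed(t, speed);
}
```

A caller would typically do tcgetattr(fd, &t), then set_both_speeds(&t, B9600),
then tcsetattr(fd, TCSANOW, &t), checking the return value at each step.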

7.4 Line Control functions:

Some functions are available which provide the line control capability to the
terminal devices.

The typical formats are

int tcdrain (int filedes);
int tcflow (int filedes, int action);
int tcflush (int filedes, int queue);
int tcsendbreak (int filedes, int duration);
All of them return 0 for successful operation and –1 for error.

Now a brief description of each of them:

a) The tcdrain function waits for all output to be transmitted.
b) The tcflow function gives us control over the flow of input and output.
The actual action to be performed is given by the action argument.
The action can be any one of the following values:
TCOOFF   Suspends the output.
TCOON    Restarts the previously suspended output.
TCIOFF   Transmits the STOP character, which has the effect of
         asking the terminal device to stop transmission.
TCION    Transmits the START character, which has the
         effect of asking the terminal device to restart
         transmission.

c) The tcflush function lets us flush the input buffer or the output buffer,
i.e. we may want to start afresh by discarding all the data lying
in the input buffer that is yet to be read by the programs, or by throwing away all
output data that has been written by the programs but not yet
transmitted to the device. The reason why we may want to do so is not
important here; tcflush accomplishes it when the
suitable value is passed in the queue argument.
The queue argument can take any one of the following values.

TCIFLUSH    To flush the input queue
TCOFLUSH    To flush the output queue
TCIOFLUSH   To flush both input and output queues.

d) The tcsendbreak function transmits a continuous stream of 0 bits for
a length of time controlled by the duration argument. The actual
time of transmission is not simply the value specified. A value of 0 for duration
transmits for between 0.25 and 0.5 seconds; for any other value the length of the
transmission is implementation dependent.
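
As a small illustration of two of these calls, the sketch below discards any input
that was typed ahead and then waits until all output already written has been
transmitted, which a program might do just before asking an important question.
The function name discard_input_and_drain is ours.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <termios.h>

/* Throw away input on fd that has been received but not yet read, then wait
 * until everything already written to fd has been transmitted.
 * Returns 0 if OK, -1 on error (for example if fd is not a terminal).
 * The function name is hypothetical, chosen for this illustration. */
int discard_input_and_drain(int fd)
{
    if (tcflush(fd, TCIFLUSH) < 0)
        return -1;
    return tcdrain(fd);
}
```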

7.5 Terminal identification:

The typical format is

char *ctermid (char *ptr);

This function gives the name of the controlling terminal. If ptr is non null, it
must point to an array of at least L_ctermid bytes; the name of the controlling
terminal is stored in that array and ptr is returned. If ptr is null, the name is
stored in a static array and a pointer to that array is returned.

There are two other similar functions

int isatty (int filedes);
This returns 1 (true) if the device referred to by filedes is a terminal device, and
returns 0 (false) if it is not.

The other function is

char *ttyname (int filedes);
This returns a pointer to the path name of the terminal open on filedes, or NULL
on error.
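
The three functions can be combined as in the sketch below. The function names
terminal_name and print_controlling_terminal, and the strings returned for the
failure cases, are our own choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <stdio.h>
#include <unistd.h>

/* Returns the path name of the terminal open on fd. If fd does not refer to
 * a terminal the string "not a terminal" is returned, and if the name cannot
 * be found the string "unknown terminal" is returned.
 * The function name is hypothetical, chosen for this illustration. */
const char *terminal_name(int fd)
{
    const char *name;

    if (!isatty(fd))
        return "not a terminal";
    name = ttyname(fd);
    return name != NULL ? name : "unknown terminal";
}

/* Print the name of the controlling terminal as reported by ctermid. */
void print_controlling_terminal(void)
{
    char name[L_ctermid];       /* L_ctermid comes from stdio.h */

    printf("controlling terminal: %s\n", ctermid(name));
}
```

Programs often use isatty(STDOUT_FILENO) in this way to decide whether they are
talking to a user or writing into a file or pipe.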

7.6 The canonical mode:

Having seen some of the functions, we would like to look at the canonical mode as
a whole.

Implementing and operating in canonical mode is fairly straightforward. Once a
read is issued, the terminal driver returns the input when a complete line has been
entered.

Several conditions cause the read to return:

i) The read returns when the requested number of bytes have been read, even if
this is less than a complete line. The next read continues from where the previous
one stopped.
ii) The read returns when a line delimiter is encountered.
iii) The read returns if a signal is caught and the function is not restarted
automatically.

To make the concept a bit more clear, we write a small function getpass, that
reads a password from the user at the terminal (Ref the text book (i) for the program)

#include <signal.h>
#include <stdio.h>
#include <termios.h>

#define MAX_PASS_LEN 8    /* maximum number of characters in the password */

char *getpass(const char *prompt)
{
    static char buf[MAX_PASS_LEN + 1];    /* room for the null byte at the end */
    char *ptr;
    sigset_t sig, sigsave;
    struct termios term, termsave;
    FILE *fp;
    int c;

    if ((fp = fopen(ctermid(NULL), "r+")) == NULL)
        return NULL;                      /* open the controlling terminal */
    setbuf(fp, NULL);                     /* no buffering on the stream */

    sigemptyset(&sig);                    /* create an empty signal set */
    /* block SIGINT and SIGTSTP, save the old signal mask */
    sigaddset(&sig, SIGINT);
    sigaddset(&sig, SIGTSTP);
    sigprocmask(SIG_BLOCK, &sig, &sigsave);

    tcgetattr(fileno(fp), &termsave);     /* save tty state */
    term = termsave;                      /* structure copy */
    /* clear the echo flags so that the password is not shown on the terminal */
    term.c_lflag &= ~(ECHO | ECHOE | ECHOK | ECHONL);
    tcsetattr(fileno(fp), TCSAFLUSH, &term);

    fputs(prompt, fp);
    ptr = buf;                            /* point to the buffer */
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (ptr < &buf[MAX_PASS_LEN])
            *ptr++ = c;                   /* store the character while there is room */
    }
    *ptr = 0;                             /* null terminate */
    putc('\n', fp);                       /* echo a new line ourselves */

    tcsetattr(fileno(fp), TCSAFLUSH, &termsave);    /* restore tty state */
    sigprocmask(SIG_SETMASK, &sigsave, NULL);       /* restore the signal mask */
    fclose(fp);
    return buf;
}

The comments made at appropriate places in the program are self explanatory.

7.7 Non canonical mode:

This mode, as we have seen, does not look at the input data in terms
of lines. This mode is selected by turning off the flag ICANON in the c_lflag field
of the termios structure. Apart from the fact that the data is not processed in terms of
lines, the other difference with respect to the canonical mode is that certain special
characters that indicate erase, EOF, EOL etc. lose their meaning and hence are not
processed.

In the canonical mode, all that the system has to do is to check when the line is
complete and return it. But in the noncanonical mode the system has to decide when to
return the data to the reading process.

Normally either a prespecified amount of data is returned, or data is returned after a
prespecified duration of time, irrespective of the amount of data accumulated. There
are two variables in the c_cc array in the termios structure – MIN and TIME (indexed
by VMIN and VTIME). These two can be used to set the operation in the noncanonical
mode. TIME specifies the number of tenths of a second to wait for data to arrive. MIN
indicates the minimum number of bytes to wait for before a read returns. These two, as
we can see, can be set independently, but their operations are inter related. Let us
look at the combinations.

a) MIN = 0, TIME = 0
If any data is available, read returns it immediately. If no data is
available, read returns 0 immediately.
b) MIN = 0, TIME > 0
The non zero TIME specifies a read timer that is started when read is called. Read
returns as soon as a byte is received, or when the timer expires. If no data
arrives within that period, it returns 0.
c) MIN > 0, TIME = 0
The read returns only when the specified minimum number of bytes
have been received. This may become dangerous, as in some cases it may
mean waiting for very long periods.
d) MIN > 0, TIME > 0
Here TIME is a timer between bytes, started only when the first byte is received.
If MIN bytes are received before the timer expires, read returns those
bytes. If the timer expires before MIN bytes are received, whatever bytes are
available are returned.
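
Selecting the noncanonical mode with a given MIN and TIME can be sketched as
follows. The function name make_noncanonical is hypothetical; VMIN and VTIME are
the standard indexes of MIN and TIME in the c_cc array. The function only prepares
the structure, and the caller is assumed to fetch it with tcgetattr beforehand and
apply it with tcsetattr afterwards.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <termios.h>

/* Change *t (as filled in by tcgetattr) so that it describes noncanonical
 * input with the given MIN and TIME values. All other flags, including the
 * echo flags, are left as they were.
 * The function name is hypothetical, chosen for this illustration. */
void make_noncanonical(struct termios *t, cc_t min, cc_t time_tenths)
{
    t->c_lflag &= ~(tcflag_t)ICANON;   /* input is no longer assembled into lines */
    t->c_cc[VMIN] = min;               /* MIN: bytes to wait for */
    t->c_cc[VTIME] = time_tenths;      /* TIME: timer in tenths of a second */
}
```

For example, make_noncanonical(&t, 1, 0) gives case (c) above with MIN = 1, in which
each read returns as soon as a single key has been pressed.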

7.8 Termcap, terminfo and curses:

These are facilities which allow a program to manipulate the terminal
(moving the cursor, clearing the screen and so on) without depending on the
features of one particular type of terminal.

Termcap stands for "terminal capability". It refers to a text file containing
descriptions of various terminals – the features that they support and how to make
the terminal perform them – together with a set of routines that read this file. It
was originally developed to support the vi editor.

However, termcap has a serious drawback. As more and more terminals are
added, scanning the termcap file for the details of a particular terminal itself becomes
slow. To overcome this problem, terminfo was created, which contains a compiled
version of the textual descriptions, so that a terminal can be located much faster.

While both termcap and terminfo describe how to perform the various
terminal operations, neither of them performs them on its own. For this there is the
"curses" library, which contains several functions to perform the various operations.

7.9 Non blocking I/O:

In one of the previous sections, we discussed blocking of I/O. The "slow"
system calls are those that can block for long periods (for ever, for that matter).

For example, reading from a device such as a terminal or a pipe when no data is
present, or reading and writing files on which record locks are held, can result in
the I/O operation being blocked indefinitely.

Non blocking I/O allows us to issue an I/O operation and ensure that, if the
operation cannot be completed for whatever reason, control returns immediately,
with an error indicating that the operation would have blocked.

Now, how do we specify non blocking I/O for a given descriptor?

a) If we are using open to get the descriptor, we can specify the O_NONBLOCK
flag.
b) If the descriptor is already open, we can call fcntl to switch on the O_NONBLOCK
flag.
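
Method (b) can be sketched as below. The function name set_nonblocking is ours.
The existing file status flags are fetched with F_GETFL first, so that switching
on O_NONBLOCK does not disturb any other flag.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <fcntl.h>

/* Switch on O_NONBLOCK for an already open descriptor, preserving the other
 * file status flags. Returns 0 if OK, -1 on error.
 * The function name is hypothetical, chosen for this illustration. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

After this call, a read on a pipe or terminal that has no data waiting returns -1
at once, with errno set to EAGAIN, instead of putting the process to sleep.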

7.10 Record locking:

This is a very important I/O concept on which a lot of effort and time is spent. The
problem is very simple. We know Unix allows sharing of files. More than one process
(or user) can access the same file (or copies of the same file in some cases)
simultaneously. What happens if two (or three) of them try to modify the
contents of a file? Or if one of them is writing into it and another is trying to read
from the file? Since we can never be sure about the sequence in which the operations
are done (whether reading precedes writing or comes after it), the output
that the processes produce becomes indeterminate.

One simple way to avoid this problem is to ensure that when one of the
processes is writing into the file, no other process can access the file (in some cases
reading can be allowed, but that again depends on the application). All that the Unix
system does is to lock the file (or the record concerned, so that other records of the
same file continue to be accessible).

We can define record locking as the mechanism wherein a given process can
prevent other processes from modifying a region of the file. What is locked in most
cases is a "range" of bytes in a file.

7.10.1. Some functions for record locking

The typical format of the call is
int fcntl (int filedes, int cmd, ... /* struct flock *flockptr */);
For record locking, cmd can be any one of F_GETLK, F_SETLK or F_SETLKW.

The last argument flockptr is a pointer to an flock structure.

The flock structure itself can look like this
struct flock {
    short l_type;    /* F_RDLCK, F_WRLCK, or F_UNLCK */
    off_t l_start;   /* offset in bytes, relative to l_whence */
    short l_whence;  /* SEEK_SET, SEEK_CUR or SEEK_END */
    off_t l_len;     /* length in bytes */
    pid_t l_pid;     /* returned with F_GETLK */
};

The first member describes the type of lock wanted – a
shared read lock (F_RDLCK), an exclusive write lock (F_WRLCK) or unlocking of the
region (F_UNLCK).

The starting byte offset of the region being locked or unlocked
is given by l_start and l_whence.

The size of the region that is being locked or unlocked is given by l_len.

Now, a few words about shared and exclusive locks. Common sense tells us that
no harm is done if several processes share a read lock over a region – i.e. they can
all read the bytes of the region independently – but only one process at a time should
have an exclusive write lock, which allows it to write into the region.

Further, when a process holds a write lock on a region of a file, no other process
can hold a read lock on the same region, so that nothing is read while writing is in
progress.

Needless to say, to obtain a read lock the descriptor must be open for reading, and
to obtain a write lock it must be open for writing.

Now, we are in a position to define three different commands (for the cmd
argument) for the fcntl function.

a) F_GETLK : to check whether the region described by the structure pointed to by
flockptr is already locked by some other process in a way that would prevent our
lock. If such a lock exists, information about it is returned by
overwriting the structure pointed to by flockptr with the details
of the existing lock. If there is no lock that would prevent
our lock from being created, the structure is left
unchanged except that l_type is set to F_UNLCK.
b) F_SETLK : set the lock described by flockptr. If the lock cannot be granted
because another process holds a conflicting lock, fcntl returns immediately with an
error (errno is set to EACCES or EAGAIN).
c) F_SETLKW : This is the same as F_SETLK except for the W – meaning
wait. If the requested lock cannot be granted for the reason that
another process has already locked the region, the calling process is
put to sleep. It wakes up either when the lock becomes available or when the wait
is interrupted by a signal.

Note that testing for a lock with F_GETLK and then trying to obtain it with F_SETLK
or F_SETLKW are two different operations – together they are not an atomic
operation, so there is no guarantee that some other process will not obtain the lock
in the intermediate time lag.

When setting or releasing locks, the system combines or splits adjacent areas as
required.
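
Since filling in the flock structure for every call is tedious, the fcntl call is
usually wrapped in small helper functions. A minimal sketch is given below; the
names lock_region and lock_holder are hypothetical and not part of any library.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for the POSIX interfaces */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Apply one locking command to a region of fd.
 * cmd is F_SETLK or F_SETLKW; type is F_RDLCK, F_WRLCK or F_UNLCK.
 * Returns 0 if OK, -1 on error.
 * The function name is hypothetical, chosen for this illustration. */
int lock_region(int fd, int cmd, short type, off_t start, short whence, off_t len)
{
    struct flock lock = {0};

    lock.l_type = type;
    lock.l_start = start;
    lock.l_whence = whence;
    lock.l_len = len;           /* a length of 0 means "up to the end of the file" */
    return fcntl(fd, cmd, &lock) < 0 ? -1 : 0;
}

/* Ask with F_GETLK whether another process holds a lock that would stop us
 * from placing a lock of the given type on the region.
 * Returns the id of that process, 0 if nothing stands in the way, -1 on error. */
pid_t lock_holder(int fd, short type, off_t start, short whence, off_t len)
{
    struct flock lock = {0};

    lock.l_type = type;
    lock.l_start = start;
    lock.l_whence = whence;
    lock.l_len = len;
    if (fcntl(fd, F_GETLK, &lock) < 0)
        return -1;
    return lock.l_type == F_UNLCK ? 0 : lock.l_pid;
}
```

For example, lock_region(fd, F_SETLKW, F_WRLCK, 0, SEEK_SET, 0) waits for a write
lock on the whole file. Note that lock_holder never reports the locks of the calling
process itself, because a process's own locks do not block its own requests.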

7.11 Concept of deadlock:

A deadlock is said to occur when two processes are each waiting for a resource
that is held (and locked) by the other. To elaborate, suppose process 1 needs resource
B, which is not presently available. It also needs resource A, which is available, and
process 1 locks it. However, it can proceed only if resource B also becomes available.

Similarly, let process 2 need resource A, which is held by process 1 as above.
It already has with it the resource B, which it has locked.

Now process 1 does not proceed with its operation until resource B becomes
available. Resource B will never become available, because it is with process 2, which
will release it only when resource A is made available to process 2. But this cannot
happen because process 1 will not release A till process 2 releases B.
So, for an indefinite time, A and B are held locked by process 1 and process 2
respectively.

Now, if a deadlock is detected, the kernel has to choose one process whose
request fails with an error, so that it can give up the resource it holds. For example,
in the above case, process 2 may be made to release the resource B so that process 1
can complete its job. Once process 1 releases A and B, process 2 can complete. However,
the main difficulty in tackling deadlocks lies not so much in breaking the deadlock as
in detecting whether a deadlock has occurred and, if so, why it has occurred.

7.11.1 Implied inheritance and release of locks:

We shall look into certain other implications of record locking. As we have seen,
locks are associated with a process and a file. When a process terminates, all its locks
are released. Likewise, when a descriptor is closed, any locks held by that process on the
file referenced by the descriptor are also released.

Suppose we have executed the following steps:

    fd1 = open(path1, ...);
    read_lock(fd1, ...);
    fd2 = dup(fd1);          /* duplicate fd1 into fd2 */
    close(fd2);

After close(fd2), not only is nothing left locked through fd2, but the lock that was
obtained through fd1 is also released, because both descriptors refer to the same file.

We can look at another example:

    fd1 = open(path1, ...);
    read_lock(fd1, ...);
    fd2 = open(path1, ...);
    close(fd2);

Here too, closing fd2 releases the lock obtained through fd1, since fd2 refers to the
same file.
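
The first sequence above can be tried out with the sketch below. It is only an
illustration under stated assumptions: the helper names whole_file, locked_by_other and
lock_survives_dup_close are made up for this example, the file at path is assumed to
exist already, and fcntl is used directly in place of the read_lock helper. A forked
child does the checking, because F_GETLK never reports locks held by the calling
process itself.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill in a struct flock that covers the whole file. */
static void whole_file(struct flock *fl, short type)
{
    fl->l_type = type;
    fl->l_whence = SEEK_SET;
    fl->l_start = 0;
    fl->l_len = 0;                /* 0 means "up to end of file" */
}

/* Fork a child and let it ask, with F_GETLK, whether some other process
 * holds a lock that would block a write lock on path.
 * Returns 1 if such a lock exists, 0 if not, -1 on error. */
static int locked_by_other(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {               /* child: it owns no locks of its own */
        struct flock fl;
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        whole_file(&fl, F_WRLCK);
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}

/* Lock the file through fd1, duplicate fd1, close the duplicate, and report
 * whether the lock is still held: 1 = still held, 0 = released, -1 = error. */
int lock_survives_dup_close(const char *path)
{
    struct flock fl;
    int fd1, fd2, held;

    if ((fd1 = open(path, O_RDWR)) < 0)
        return -1;
    whole_file(&fl, F_RDLCK);
    if (fcntl(fd1, F_SETLK, &fl) < 0 || locked_by_other(path) != 1) {
        close(fd1);               /* could not set or observe the lock */
        return -1;
    }
    if ((fd2 = dup(fd1)) < 0) {
        close(fd1);
        return -1;
    }
    close(fd2);                   /* this releases the lock taken through fd1 */
    held = locked_by_other(path);
    close(fd1);
    return held;
}
```

On a system with POSIX record-locking semantics, lock_survives_dup_close is expected to
return 0, i.e. the lock is gone once the duplicate descriptor is closed.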

The second aspect to note is that locks are not inherited by the child
created by a fork. If a process has secured a lock and then calls fork, the child is
considered to be an entirely different process and hence has no claim to any of the
locks held by its parent.

In fact this follows from the definition of locks and from common sense. Look at the basic
logic behind locking: more than one process should not be able to write to
or read from the locked sections. If the children of the parent process were allowed to
inherit the locks, then over a period of time more than one process would be able to
operate on the locked regions, thereby defeating the purpose of locking.

However, one exception is that locks may be inherited across an exec, because
in that case there is still effectively only one process accessing the records.

7.11.2 Advisory and Mandatory locks:

Roughly, mandatory locking is locking that is enforced by the Kernel.
When such locking is in effect, every calling process
has its open, read and write actions verified by the Kernel to ensure that it is not
violating the locking restrictions. If a process tries to read from or write to a region locked
by another process, the access may be blocked.

On the other hand, suppose we have several routines that are likely to
access, say, a database frequently. If we know that only these
functions will access the database, then the processes may be pooled to form
what can be called a set of “cooperating processes”. Every time a read or write is
done, the Kernel does not check the validity of the action. Instead, the routines
themselves are designed to use the locks consistently, so that they do
not overreach their limits. Such locking can be termed advisory.

The advantages of advisory locking are obvious. But it is to be borne in mind
that any outside (“rogue”) process that is not a part of the pool
may read or write anywhere it likes.

However, it may be noted that even mandatory record locking can be
circumvented by intelligent programmers. Also, malicious users (who may themselves
be genuine users) can simply lock publicly readable files, so that they prevent anyone else
from making use of the files.

7.12 Streams

Streams are provided as a general way to interface communication drivers into the
Kernel. A stream provides a full duplex path between the user process and a device
driver. Streams can also be used with pseudo device drivers. A simple stream can be
visualized as below.

    User process
         |
    Stream head (system call interface)
         |
    Device driver (or pseudo device driver)

Beneath the stream head, any number of processing modules can be included into
the stream. In fact we can define downstream and upstream sides for a stream. The
data that we write into a stream goes downstream, whereas the data read by the device
driver is sent upstream.

Stream modules are similar to device drivers in the sense that they execute as a
part of the Kernel and are normally link edited into the Kernel when the Kernel is built.

We can access a stream with the normal functions open, close, read, write and ioctl.
In addition, several versions of Unix contribute their own set of functions to operate
on streams. We see some of these functions as we go along.

7.12.1. Stream messages:

All input and output operations on streams are based on messages. (In fact
all stream devices are character special files.) The stream head and the user process exchange
messages using read, write, ioctl, getmsg, getpmsg, putmsg and putpmsg.

Any message between the user process and the stream head consists of the
following:

i) A message type
ii) (Optional) Control information.
iii) (Optional) data.

Of course, different types of messages can accommodate different types of
information, which we see in detail.

The control information and data are each specified by a structure of the following
type:

    struct strbuf {
        int   maxlen;    /* size of buffer */
        int   len;       /* number of bytes currently in buffer */
        char *buf;       /* pointer to buffer */
    };

The size of the buffer and the current number of bytes in the buffer need to be set so that
the Kernel does not overflow the buffer.

A zero length buffer is also acceptable, and a len field of –1 indicates that there
is no control or data information.

7.12.2 We now look into some of the functions that will be helpful to us in operating
on streams.

A stream message is written into the stream using either the putmsg or the putpmsg
function.
Typical formats are

    int putmsg(int filedes, const struct strbuf *ctlptr,
               const struct strbuf *dataptr, int flag);
and
    int putpmsg(int filedes, const struct strbuf *ctlptr,
                const struct strbuf *dataptr, int band, int flag);

Both of them return 0 if successful, -1 if not.

The first three arguments, namely the file descriptor, the pointer to the control buffer
and the pointer to the data buffer, are fairly evident.

The only difference between putmsg and putpmsg is that the latter allows us to
specify a priority band through the argument band.
Every message has a queuing priority. The three priority
levels are:
i) high priority messages (highest priority)
ii) priority band messages (medium priority)
iii) ordinary messages (lowest priority).

Ordinary messages (lowest priority) have a band value of 0.
Priority band messages have band values from 1 to 255; the higher the band number,
the higher the priority. The messages in the input queue are arranged in the order
of their priorities.

7.12.3 Reading data from a stream:

Just as we use write, putmsg and putpmsg for
writing into a stream, we use read, getmsg and getpmsg to read data from a stream.

Typical formats are given below:

    int getmsg(int filedes, struct strbuf *ctlptr, struct strbuf *dataptr, int *flagptr);
and
    int getpmsg(int filedes, struct strbuf *ctlptr, struct strbuf *dataptr, int *bandptr,
                int *flagptr);

Both return a non-negative value for success and –1 on error.

The details of these two functions are similar to those of putmsg and putpmsg. Only one
difference is to be noticed. Instead of band and flag values, we pass the pointers bandptr and
flagptr. Needless to say, the integers they point to are to be set properly before the functions
are called.

By setting the integer pointed to by flagptr to 0, we get the next message waiting
in the queue of the stream.

If we want to get only high priority messages, we should set the integer pointed to
by flagptr to RS_HIPRI.

7.13 I/O multiplexing:

When we want to read from one descriptor and write into another, we can use
blocking I/O in a loop.

    while ((n = read(STDIN_FILENO, buf, BUFSIZ)) > 0)
        if (write(STDOUT_FILENO, buf, n) != n)
            err_sys("write error");

In this case, we simply block on the read over and over, until there is no more
input.

Suppose we have to read from two descriptors. We cannot simply use blocking
reads, since while we are blocked on one descriptor, the other descriptor may
provide data.

One method to take care of such a situation is to call fork so that the two
processes can take care of the two I/O operations (each taking one half of the
operation).

In such a case, each of the two processes can block on one of the descriptors.
However, there may be a small problem during termination.

If the child terminates first, the parent is notified about it and can
also terminate. On the other hand, if the parent terminates first, the child is not
told about it automatically. In such cases, it is desirable that the parent generates
a suitable signal to enable the child also to stop.

There is a second way of I/O multiplexing. In this case, we avoid blocking. We
set both descriptors to non blocking and issue a read on the first descriptor. If data is
available, it is processed. If no data is available, since the descriptor is non blocking, the call
returns immediately. The read can now be issued on the second descriptor. After some time gap,
we can again try reading the first descriptor and subsequently the second descriptor
and so on. This type of operation is called polling. Note that there may be more than
two descriptors also, and we keep asking each of them, 1, 2, 3 . . . in the same order, for data
and then allow a time gap before repeating the process again. This method is
useful when we get data from the descriptors most of the time. But
imagine a case in which there is a high probability that our polling will not result in data being
read, as each descriptor sends data only sparingly. Then we will be wasting a
lot of CPU time in polling.
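
One polling pass of this kind can be sketched as below. This is a minimal
illustration, not a complete program: the function names set_nonblocking and poll_once
are made up for this example, and the caller is assumed to sleep for a while between
passes.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Turn on O_NONBLOCK for fd while preserving its other status flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One polling pass over count non-blocking descriptors, tried in order.
 * Reads at most size bytes from the first descriptor that has data (or is
 * at end of file) and stores the byte count in *nread.
 * Returns the index of that descriptor, -1 if no descriptor had data,
 * or -2 on a real read error. */
int poll_once(const int fds[], int count, char *buf, size_t size, ssize_t *nread)
{
    int i;

    for (i = 0; i < count; i++) {
        ssize_t n = read(fds[i], buf, size);
        if (n >= 0) {
            *nread = n;
            return i;
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -2;            /* something other than "no data yet" */
    }
    return -1;                    /* nothing available on this pass */
}
```

A caller would typically loop, calling poll_once and then sleeping for a short period
whenever it returns -1. Those wasted passes are exactly the CPU cost described above.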

A third method is asynchronous I/O. In this case, the process
neither goes round searching for data as in polling, nor blocks
on a descriptor. Instead, when a descriptor has data available for I/O, the Kernel
notifies the process with a signal. One problem that may arise is that when the process
receives the signal, it knows that data is available on one of the descriptors, but it does not
know on which. So there should either be a mechanism by which a descriptor can
identify itself, or the process should do a polling once a data ready signal arrives.
The first option, of the descriptor identifying itself, would make programming (at the
system level) a bit more complex, while the second option can be time consuming when
a large number of descriptors need to be checked to ascertain which of them is ready
with the data. With these principles in mind, we look at some of the functions that are
useful in I/O multiplexing.

7.14 Select function:

Typical format is

    int select(int maxfdp1, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
               struct timeval *tvptr);

It returns a count of ready descriptors on successful operation,
0 on time out and -1 on error.

The arguments readfds, writefds and exceptfds are pointers to descriptor
sets. They specify which descriptors we are interested in for a
readable condition, which for a writable condition and which for an
exception condition. Each descriptor set is stored in an fd_set data type.
We can allocate a variable of this type, assign a variable of this type
to another variable of the same type,
or use the macros given below to operate on the sets.

The normally used macros are

    FD_ZERO(fd_set *fdset);             /* clear all bits in fdset */
    FD_SET(int fd, fd_set *fdset);      /* turn on the bit for fd */
    FD_CLR(int fd, fd_set *fdset);      /* turn off the bit for fd */
    FD_ISSET(int fd, fd_set *fdset);    /* test the bit for fd */

The argument maxfdp1 stands for “max fd plus 1”. We take the number of the
highest descriptor we are interested in, add 1 to it and store it in the first argument.
(Since descriptor numbers start from 0, if n is the largest numbered
descriptor we are interested in, then the descriptors 0, 1, 2, . . . n make up n+1 descriptors
to be examined. That is the reason why we add 1 to the highest descriptor number.)

Now, let us look at the last argument. This specifies how long we want to wait.

    struct timeval {
        long tv_sec;     /* seconds */
        long tv_usec;    /* and microseconds */
    };

So we specify the number of seconds and microseconds we are willing to wait before
select returns.

There are three possible cases for this argument:

    a) tvptr == NULL: wait indefinitely. A return is made only when one of
       the specified descriptors is ready or when a signal is caught.
    b) tvptr->tv_sec == 0 and tvptr->tv_usec == 0: both seconds and
       microseconds are zero, indicating that once all the descriptors
       are tested, a return is made immediately. This means there is
       no blocking in the select function.
    c) One or both of the fields are given nonzero values: wait for the
       specified period. A return is made when one of the specified descriptors
       is ready or when the time expires.

The select function returns either a positive value indicating the number of
descriptors that are ready, or 0 indicating that no descriptor is ready, or –1 to indicate
that an error has occurred.

    a) If a descriptor in readfds is ready, then we can read data from it
       without blocking.
    b) If a descriptor in writefds is ready, then we can write data into it
       without blocking.
    c) A descriptor in the exception set, if ready, indicates that an exception is
       pending. It may mean the arrival of out of band data or a similar
       condition, which needs to be cleared.
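
The pieces above fit together as in the following sketch, which waits on two
descriptors at once and reads from whichever becomes ready. The function name
read_from_either and its return conventions are assumptions made for this
illustration. Only select, the FD_ macros and struct timeval are the system's own.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd_a or fd_b to become readable,
 * then read at most size bytes from the ready descriptor.
 * Returns the number of bytes read (0 means end of file) and stores the
 * descriptor that was read in *which; returns -1 on time out, -2 on error. */
long read_from_either(int fd_a, int fd_b, char *buf, size_t size,
                      long timeout_ms, int *which)
{
    fd_set readfds;
    struct timeval tv;
    int maxfdp1 = (fd_a > fd_b ? fd_a : fd_b) + 1;   /* highest fd plus 1 */
    int n;
    ssize_t got;

    FD_ZERO(&readfds);            /* start with an empty set */
    FD_SET(fd_a, &readfds);       /* we are interested in reading from both */
    FD_SET(fd_b, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(maxfdp1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -2;
    if (n == 0)
        return -1;                /* time expired, nothing ready */

    /* select has left only the ready descriptors turned on in readfds */
    *which = FD_ISSET(fd_a, &readfds) ? fd_a : fd_b;
    got = read(*which, buf, size);
    return got < 0 ? -2 : (long)got;
}
```

Note that select modifies the sets it is given, so the sets (and, on some systems, the
timeval) have to be set up afresh before every call, as is done here.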

7.15 Poll function:

This can be considered to be similar to select in the sense that it allows the
programmer to poll the descriptors, but its interface is slightly different.

Typical format is

    int poll(struct pollfd fdarray[], unsigned long nfds, int timeout);

It returns a count of ready descriptors, 0 on time out and –1 on error.

Instead of building a set of descriptors for each condition (read, write or exception)
as in the case of select, poll operates on an array of structures, each element
specifying the conditions of interest for one descriptor.

The typical structure of pollfd is as follows:

    struct pollfd {
        int   fd;         /* file descriptor number */
        short events;     /* events to be checked for */
        short revents;    /* events that have occurred on fd */
    };

The number of elements in the array that have to be polled is given in nfds.

The events member of each element of the array is to be suitably set by the caller.
The Kernel sets the revents member on return.

The argument timeout specifies how long we want to wait. It can be
    a) timeout == INFTIM: wait indefinitely (until a descriptor is ready or a signal is caught)
    b) timeout == 0: no waiting
    c) timeout > 0: the waiting time, specified in milliseconds.

These cases are similar to those of the select function.
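
As a minimal sketch of the interface, the function below builds the array of pollfd
structures for a few descriptors and asks which of them are readable. The name
poll_readable and the limit MAX_FDS are assumptions made for this example; POLLIN is
the standard event flag for “data available to read”.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <poll.h>

#define MAX_FDS 16                /* arbitrary limit chosen for this sketch */

/* Poll count descriptors for readability, waiting at most timeout_ms
 * milliseconds. On return, ready[i] is 1 if fds[i] has data to read and
 * 0 otherwise. Returns the value returned by poll (the number of
 * descriptors with events, 0 on time out), or -1 on error. */
int poll_readable(const int fds[], int count, int ready[], int timeout_ms)
{
    struct pollfd fdarray[MAX_FDS];
    int i, n;

    if (count < 0 || count > MAX_FDS)
        return -1;
    for (i = 0; i < count; i++) {
        fdarray[i].fd = fds[i];
        fdarray[i].events = POLLIN;    /* the condition we ask about */
        fdarray[i].revents = 0;        /* filled in by the Kernel */
    }
    n = poll(fdarray, (nfds_t)count, timeout_ms);
    if (n < 0)
        return -1;
    for (i = 0; i < count; i++)
        ready[i] = (fdarray[i].revents & POLLIN) != 0;
    return n;
}
```

Unlike select, poll does not overwrite what the caller asked for: the request stays in
events and the answer comes back in revents, so the same array can be reused across
calls.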

7.16 Introduction to daemons and their characterization:

Daemons are processes that live for long periods of time. They normally start
when a system is bootstrapped, continue throughout the operation of the system and
terminate only during shut down. Normally they are designed to run in the
background. Their main reason for existence is to perform a lot of housekeeping,
accounting and other day-to-day jobs so that the other processes can work comfortably.

Further, daemons typically run with super user privileges (user id = 0), and none of them has
a controlling terminal, meaning that all of them run in the background. The parent of
all these daemons is the init process.

7.17 How to write daemons and why?

We have already noted that daemons are written and executed to perform
several housekeeping activities in the background. However, since there will be a
number of such daemons, all working in the background, it becomes essential to ensure
that there are no unwanted side effects. By side effects we mean interactions
between the daemons, or between a daemon and another process, that have not been
foreseen fully. Otherwise, we may end up tackling situations that are not only
undesirable, but are also quite difficult to trace in the first place.

Now, what are the ground rules that are expected to be followed while coding
daemons?

a) Call fork and let the parent exit. If the daemon was started from a shell,
   this makes the shell think that the command is done. It also ensures that
   the child, which continues as the daemon, is not a process group leader,
   which is a prerequisite for the next step.
b) Call setsid to create a new session.
c) Change the current working directory to the root directory.
d) Set the file mode creation mask to 0. This ensures full flexibility of
   operation to the daemon process while it goes about creating files.
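
The four rules above translate almost line for line into code. The function below is
a minimal sketch of them. The name daemon_init is chosen for this illustration, and a
production daemon would usually do more (for example, close or redirect its inherited
descriptors).

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Turn the calling program into a daemon, following rules (a) to (d).
 * Returns 0 in the daemon process and -1 on failure. The process that
 * called daemon_init does not return from it on success: it exits. */
int daemon_init(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);             /* (a) parent exits; the child carries on */

    if (setsid() < 0)         /* (b) new session, no controlling terminal */
        return -1;
    if (chdir("/") < 0)       /* (c) do not keep any file system busy */
        return -1;
    umask(0);                 /* (d) clear the file mode creation mask */
    return 0;
}
```

A program would call daemon_init once, near the start of main, and then enter its
service loop. Since the daemon has no controlling terminal from this point on, any
messages have to go to a log facility rather than to standard error.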

Since the daemon does not have a controlling terminal, error messages from the
daemon are difficult to handle. Hence, normally a central error logging facility is
required. Most Unix systems provide a log device driver to facilitate this.

Block Summary

We began the discussion with the concept of canonical and non canonical modes
of input processing, also noting that most Unix systems use canonical mode. We
noted that the characteristics of a terminal device are contained in the structure termios,
which can be accessed. We discussed how to get and set these attributes, as well as the baud rate
functions and the line control functions.

We also saw how, in the non canonical mode, we can set the minimum amount of data and
the time for input.

The next major topic was record locking to ensure data integrity, but we also saw
how this can lead to deadlocks. Of course we saw some functions to help us in record
locking. We also saw the difference between advisory and mandatory locks.

The next concept was about streams and the functions that help us to operate
on streams.

The next concept was I/O multiplexing, with the study of the functions select and poll
and the need for polling.

The last item was about daemons, which are background processes that do
sundry jobs. We looked into the need for them and certain ground rules as to how to write
them.

Review Questions

1. What are the two modes of terminal I/O operations?
2. Give the structure of termios.
3. What functions are available to set and get the terminal attributes? What is
their format?
4. Name the line control functions.
5. What facility is available to set the amount of data processed in the non
canonical mode?
6. Why is record locking needed?
7. Distinguish between advisory and mandatory locks.
8. What does the putmsg function do?
9. Name the two functions available for reading data from a stream.
10. What does the poll function do?

Answers

1. They are canonical mode and non canonical mode.

2. The structure is

    struct termios {
        tcflag_t c_iflag;       /* input flags */
        tcflag_t c_oflag;       /* output flags */
        tcflag_t c_cflag;       /* control flags */
        tcflag_t c_lflag;       /* local flags */
        cc_t     c_cc[NCCS];    /* control characters */
    };

3. int tcgetattr(int filedes, struct termios *termptr) to get the attributes
   int tcsetattr(int filedes, int opt, const struct termios *termptr) to set the
   attributes

4. They are tcdrain, tcflow, tcflush and tcsendbreak.

5. The two values MIN and TIME, which specify the minimum number of bytes
and the time in units of 0.1 second respectively. The read returns when the required
number of bytes has arrived or the time expires, whichever is earlier.

6. It is needed to ensure that multiple read and write operations by different
processes on a single region of a file do not produce jumbled data.

7. Advisory locking is self imposed by the cooperating processes. Mandatory locking is
enforced by the Kernel, which checks every access against the locking restrictions.

8. The putmsg function helps to put a message into a stream.

9. They are getmsg and getpmsg.

10. It helps to poll the I/O descriptors and returns the number of ready descriptors.

Block – VIII
Block Introduction

In this block, we deal with the concepts of interprocess communication. Communication
between different processes can be taken up in a number of ways. One important mechanism is
pipes. A pipe can be thought of as a half duplex connection between two processes,
through which data can pass. Pipes can be created between processes, and to
obtain the full duplex communication that one would often need, two such pipes
are created between a pair of processes and the appropriate ends are closed. This
concept, along with the relevant functions, will be studied in the block.

The other concept we study is that of coprocesses, which can be thought of as an
alternative to pipes. We also study FIFOs, which help us overcome some of the
restrictions posed by pipes. They also help to transfer messages, especially between
clients and the server.

Then we get an idea of semaphores, which are indicators of the availability
or otherwise of resources. We see how to implement and operate semaphores and also
the various functions for the same. We then move on to the concept of shared memory
and how to operate on it by setting suitable limits for the individual processes. We also
see the concept of stream pipes and the basic client server operations.

Contents

8.1 Introduction
8.2 Pipes
8.3 Popen and Pclose function
8.4 Concept of Coprocesses
8.5 FIFOs
8.6 Message Queues
8.7 Semaphores
8.8 The concept of shared memory
8.9 Client Server Properties
8.10 Stream pipes
8.11 Passing file descriptors

Inter Process communication

8.1 Introduction: Processes very often need to share data and
information. So far, we have seen one way of data exchange between processes:
by explicitly passing open files across a fork or an exec, or through the file system.
However, there are other techniques available that facilitate communication across
processes. Programmers term these IPC, interprocess communication. We see some
of them in the next few sections.

8.2 Pipes:

The concept of pipes is a very important and time tested method of IPC in Unix
systems. Pipes, however, have two limitations:

a) Data flows in only one direction, i.e. they are half duplex.
b) They can be used only between processes that have a common ancestor,
   i.e. the two processes must share a common fork parent at some level.

Now, to begin with, what is a pipe and how do we create one?

A pipe can be viewed as a connection between two processes, through which
data can pass. But the fact is that this connection has to pass through the Kernel. So,
we look at the pipe pictorially as below:

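[Figure: a user process holding both ends of a pipe, fd[0] (read end) and fd[1]
(write end); the pipe itself lies in the Kernel.]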

A pipe can be created by calling the pipe function.

The typical format is

    int pipe(int filedes[2]);

It returns 0 if successful, -1 on error.

Note that two file descriptors are returned through the array filedes[].
The first of these, filedes[0], is open for reading and the second,
filedes[1], is open for writing, i.e. whatever is written into filedes[1] can be read from
filedes[0]. This is also shown pictorially in the above figure.

Normally a call to pipe is followed by a call to fork, thereby creating an IPC channel
between the parent and the child.

Once a pipe is created, the direction of data transfer is to be reflected in the pipe.
If the parent wants to send data to the child, the parent closes the read end of the pipe
(fd[0]) and the child closes the write end (fd[1]). Similarly, if the child wants to send data
to the parent, the child closes fd[0] and the parent closes fd[1].

[Figure: after the fork, the parent and the child each hold fd[0] and fd[1] of the
same pipe, which lies in the Kernel.]

Note that by closing fd[0] in one process and fd[1] in the other, we can ensure a
unidirectional data transfer.

The same concept of sending data from the parent to the child over a pipe is
illustrated in the following program:

    #include <stdlib.h>
    #include <unistd.h>

    /* MAXLINE and err_sys() are the line size constant and the error
       routine used throughout these examples */

    int
    main(void)
    {
        int   n, fd[2];
        pid_t pid;
        char  line[MAXLINE];

        if (pipe(fd) < 0)
            err_sys("pipe error");
        if ((pid = fork()) < 0)
            err_sys("fork error");
        else if (pid > 0) {             /* parent */
            close(fd[0]);
            write(fd[1], "sample closing\n", 15);
        } else {                        /* child */
            close(fd[1]);
            n = read(fd[0], line, MAXLINE);
            write(STDOUT_FILENO, line, n);
        }
        exit(0);
    }

8.3 Popen and Pclose function:

Creating a pipe, forking a child, closing the unused ends of the
pipe, executing a shell to run a command and waiting for the command to
terminate: this is the normal sequence of operations when a process wants to
run another program and exchange data with it.

The Unix standard library provides two functions to do all this work. They are
the popen and pclose functions.

Typical formats are

    FILE *popen(const char *cmdstring, const char *type);

It returns a file pointer if successful, else a NULL pointer.

Similarly, the other function is

    int pclose(FILE *fp);

It returns the termination status of cmdstring, or –1 on error.

The function popen does a fork, executes the cmdstring using exec
and returns a file pointer. If the type argument is "r", the file pointer is connected to the
standard output of cmdstring.

    Parent                    Child (cmdstring)
    fp    <---------------    stdout

On the other hand, if the type is "w", the file pointer gets connected to the standard
input of cmdstring.

The pclose function closes the standard I/O stream, waits for the command to
terminate and returns the termination status.
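
For the "r" case, a minimal sketch is given below. It runs a command and collects the
first line of its standard output. The function name first_line_of is made up for this
illustration, and outsize is assumed to be at least 1.

```c
#define _POSIX_C_SOURCE 200809L   /* popen and pclose are POSIX, not ISO C */
#include <stdio.h>
#include <string.h>

/* Run cmdstring through popen in "r" mode and copy the first line of its
 * standard output (without the trailing newline) into out.
 * Returns the termination status reported by pclose, or -1 on error. */
int first_line_of(const char *cmdstring, char *out, size_t outsize)
{
    FILE *fp = popen(cmdstring, "r");   /* fp is connected to the command's stdout */

    if (fp == NULL)
        return -1;
    if (fgets(out, (int)outsize, fp) == NULL)
        out[0] = '\0';                  /* the command produced no output */
    else
        out[strcspn(out, "\n")] = '\0'; /* strip the newline, if any */
    return pclose(fp);                  /* waits for the command to terminate */
}
```

For example, first_line_of("echo hello", line, sizeof line) leaves "hello" in line and
returns 0, the termination status of a successful command.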

Just to familiarize ourselves with the concepts, we write a small program, which
copies a file to a pager program (Ref. Text No:1).

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    #define PAGER "${PAGER:-more}"
    /* use the environment variable PAGER if it is set, else use more */

    int
    main(int argc, char *argv[])
    {
        char  line[MAXLINE];
        FILE *fpin, *fpout;

        if (argc != 2)
            err_quit("usage: a.out <pathname>");
        if ((fpin = fopen(argv[1], "r")) == NULL)
            err_sys("can't open %s", argv[1]);
        if ((fpout = popen(PAGER, "w")) == NULL)
            err_sys("popen error");
        /* copy argv[1] to pager */
        while (fgets(line, MAXLINE, fpin) != NULL) {
            if (fputs(line, fpout) == EOF)
                err_sys("fputs error to pipe");
        }
        if (ferror(fpin))
            err_sys("fgets error");
        if (pclose(fpout) == -1)
            err_sys("pclose error");
        exit(0);
    }

8.4 Concept of Coprocesses:

Unix has the concept of a filter. A filter can be described, in the simplest case, as a
program that reads from the standard input and writes to the standard output.
Filters are normally connected in shell pipelines. If, however, the same program
generates the filter's input and reads its output, then the filter can be termed a coprocess.

A coprocess runs in the background from a shell, and its standard input and
standard output are connected to another program by means of pipes.

Looking at it another way, while popen gives us a one way pipe, either to the standard
input of another process or from its standard output, a coprocess can be viewed
as a two way pipe, or rather two one way pipes in different directions between the
same pair of processes: one to its standard input and one from its standard
output.

The concept can be viewed as per the following sketch:

    Parent                              Child (coprocess)
    fd1[1]   -------- pipe1 ------->    stdin
    fd2[0]   <------- pipe2 --------    stdout

The following is a simple coprocess that reads two numbers
from its standard input and computes their sum to display on the standard output.

    int
    main(void)
    {
        int  n, int1, int2;
        char line[MAXLINE];

        /* read at most MAXLINE - 1 bytes so that there is room for the null */
        while ((n = read(STDIN_FILENO, line, MAXLINE - 1)) > 0) {
            line[n] = 0;                /* terminate with null */
            if (sscanf(line, "%d%d", &int1, &int2) == 2) {
                sprintf(line, "%d\n", int1 + int2);
                n = strlen(line);
                if (write(STDOUT_FILENO, line, n) != n)
                    err_sys("write error");
            } else {
                if (write(STDOUT_FILENO, "invalid args\n", 13) != 13)
                    err_sys("write error");
            }
        }
        exit(0);
    }

The above program acts as a filter to add two numbers. We assume it is compiled
into an executable called add2.

We now write one more program that uses the add2 coprocess, after reading the
two numbers from the standard input. The value returned by the coprocess is written to the
standard output.

    #include <signal.h>

    static void sig_pipe(int);      /* signal handler */

    int
    main(void)
    {
        int   n, fd1[2], fd2[2];
        pid_t pid;
        char  line[MAXLINE];

        if (signal(SIGPIPE, sig_pipe) == SIG_ERR)
            err_sys("signal error");
        if (pipe(fd1) < 0 || pipe(fd2) < 0)
            err_sys("pipe error");
        if ((pid = fork()) < 0)
            err_sys("fork error");
        else if (pid > 0) {                     /* parent */
            close(fd1[0]);                      /* parent writes on fd1[1] */
            close(fd2[1]);                      /* and reads from fd2[0] */
            while (fgets(line, MAXLINE, stdin) != NULL) {
                n = strlen(line);
                if (write(fd1[1], line, n) != n)
                    err_sys("write error to pipe");
                if ((n = read(fd2[0], line, MAXLINE - 1)) < 0)
                    err_sys("read error from pipe");
                if (n == 0) {
                    err_msg("child closed pipe");
                    break;
                }
                line[n] = 0;                    /* terminate with null */
                if (fputs(line, stdout) == EOF)
                    err_sys("fputs error");
            }
            if (ferror(stdin))
                err_sys("fgets error on stdin");
            exit(0);
        } else {                                /* child */
            close(fd1[1]);
            close(fd2[0]);
            if (fd1[0] != STDIN_FILENO) {
                if (dup2(fd1[0], STDIN_FILENO) != STDIN_FILENO)
                    err_sys("dup2 error to stdin");
                close(fd1[0]);
            }
            if (fd2[1] != STDOUT_FILENO) {
                if (dup2(fd2[1], STDOUT_FILENO) != STDOUT_FILENO)
                    err_sys("dup2 error to stdout");
                close(fd2[1]);
            }
            if (execl("./add2", "add2", (char *) 0) < 0)
                err_sys("execl error");
        }
    }

    static void
    sig_pipe(int signo)
    {
        printf("SIGPIPE caught\n");
        exit(1);
    }

8.5 FIFOS:

FIFOs are also sometimes called named pipes. Pipes, as we have seen so far, can be
used only between related processes that have a common ancestor.
With FIFOs, however, unrelated processes can also exchange data.

Creating a FIFO is similar to creating a file.

The typical format is

    int mkfifo(const char *pathname, mode_t mode);

It returns 0 if successful, -1 on error.

The specifications for the argument mode in the mkfifo function are similar to those of the
open function seen in the first unit. Also, the rules for the user and group
ownership of a new FIFO are the same as those described earlier.

Once a FIFO is created using mkfifo, we can operate on it in a way similar to
normal file operations. We may open it using the open function and use the normal I/O
functions like read, write, close etc. as we do with files.

However, there are one or two concepts that we have to bear in mind while
dealing with FIFOs, regarding the effect of the non blocking (O_NONBLOCK) flag.

a) In the normal case, when O_NONBLOCK is not specified, an open of a FIFO
   for read only blocks until some
   other process opens the same FIFO for writing. Similarly, an open for write
   only blocks until some other process opens the FIFO for reading.
b) Suppose O_NONBLOCK is specified. An open for read only
   does not block, but returns immediately. But an open for write
   only returns an error if no process has the FIFO open for reading.
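
Rule (b) above can be observed with a small probe such as the following sketch. The
function name probe_fifo_nonblock is made up for this illustration. It creates a FIFO
at the given (assumed unused) path, tries both kinds of non blocking open while no
other process has the FIFO open, and then removes the FIFO again.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a FIFO at path and try non blocking opens on it with no peer.
 * *write_errno receives the errno of the write only open (0 if the open
 * unexpectedly succeeded); *read_opened is set to 1 if the read only open
 * returned a valid descriptor. Returns 0, or -1 if mkfifo failed. */
int probe_fifo_nonblock(const char *path, int *write_errno, int *read_opened)
{
    int wfd, rfd;

    if (mkfifo(path, 0600) < 0)
        return -1;

    /* write only, no reader yet: expected to fail rather than block */
    wfd = open(path, O_WRONLY | O_NONBLOCK);
    *write_errno = (wfd < 0) ? errno : 0;
    if (wfd >= 0)
        close(wfd);

    /* read only, no writer yet: returns immediately with a descriptor */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    *read_opened = (rfd >= 0);
    if (rfd >= 0)
        close(rfd);

    unlink(path);                 /* remove the FIFO from the file system */
    return 0;
}
```

On POSIX systems the write only open in this situation fails with the error ENXIO,
while the read only open succeeds at once.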

Also, it is common for several processes to write into the same FIFO, i.e.
more than one process may open a given FIFO for writing. Obviously this may lead to
intermixed data in the FIFO. To avoid it, a write into a FIFO is made an
atomic operation. This ensures that data from different writers is not interleaved. But
a single write cannot be allowed to be arbitrarily large. Hence the
maximum amount of data that can be atomically written into a FIFO is also specified
(by the constant PIPE_BUF).
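
The limit on atomic writes can also be queried at run time for a given descriptor, as
in the sketch below. The function names atomic_write_limit and fits_atomic_write are
made up for this illustration; fpathconf with _PC_PIPE_BUF is the standard POSIX way
of asking.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations under strict compilers */
#include <stddef.h>
#include <unistd.h>

/* Maximum number of bytes that one write to the pipe or FIFO open on fd
 * is guaranteed to transfer atomically, or -1 if it cannot be determined. */
long atomic_write_limit(int fd)
{
    return fpathconf(fd, _PC_PIPE_BUF);
}

/* Returns 1 if a single write of nbytes to fd is guaranteed not to be
 * interleaved with data from other writers, 0 otherwise. */
int fits_atomic_write(int fd, size_t nbytes)
{
    long limit = atomic_write_limit(fd);

    return limit > 0 && nbytes <= (size_t)limit;
}
```

Writers that share a FIFO would therefore keep each message no larger than this limit.
POSIX requires the limit to be at least 512 bytes.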

FIFOs are used by shell commands to pass data from one shell pipeline to
another, bypassing the need for creating intermediate temporary files. They also
become useful in the client server scenario.

8.6 Message Queues:

As the name indicates, message queues are lists of messages. Message queues are
normally stored as linked lists of messages in the Kernel. Each queue is identified by
a message queue identifier. For simplicity, we may refer to a message queue as simply
a “queue” and a message queue identifier as simply a “queue id” (in this section only).

New queues can be created or an existing queue may be opened, and new
messages can be added to the end of the queue. Every individual message consists of
a type field, a length field and the actual data. Though it is called a queue, it is not
always necessary that we fetch the messages in a first-in-first-out manner. Messages
can also be fetched based on their message type.
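
Fetching by message type can be sketched as follows. This is an illustration under
stated assumptions, not a definitive implementation: the names demo_msg, send_text and
fetch_by_type_demo are made up for the example, and the System V calls msgget, msgsnd,
msgrcv and msgctl are assumed to be available on the system.

```c
#define _XOPEN_SOURCE 700         /* expose the System V IPC declarations */
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define TEXT_MAX 64               /* arbitrary text size for this sketch */

struct demo_msg {
    long mtype;                   /* message type, must be greater than 0 */
    char mtext[TEXT_MAX];         /* message data */
};

/* Add one text message of the given type to the end of queue qid. */
static int send_text(int qid, long type, const char *text)
{
    struct demo_msg m;

    m.mtype = type;
    strncpy(m.mtext, text, TEXT_MAX - 1);
    m.mtext[TEXT_MAX - 1] = '\0';
    return msgsnd(qid, &m, strlen(m.mtext) + 1, 0);
}

/* Queue "first" with type 1 and then "second" with type 2, and fetch the
 * type 2 message ahead of the older type 1 message. The fetched text is
 * copied into out. Returns 0 on success, -1 on error. */
int fetch_by_type_demo(char *out, size_t outsize)
{
    struct demo_msg m;
    int rc = -1;
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);   /* create a new queue */

    if (qid < 0 || outsize == 0)
        return -1;
    if (send_text(qid, 1, "first") == 0 &&
        send_text(qid, 2, "second") == 0 &&
        msgrcv(qid, &m, TEXT_MAX, 2, IPC_NOWAIT) >= 0) {   /* ask for type 2 */
        strncpy(out, m.mtext, outsize - 1);
        out[outsize - 1] = '\0';
        rc = 0;
    }
    msgctl(qid, IPC_RMID, NULL);  /* remove the queue in every case */
    return rc;
}
```

Passing 0 instead of 2 as the type argument of msgrcv would return the oldest message
on the queue, which is the plain first-in-first-out behaviour.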

Before we start looking into the details and the functions operating on
message queues, we look at the data structure, called msqid_ds, that is associated with
each message queue.

The following is the msqid_ds structure:

    struct msqid_ds {
        struct ipc_perm msg_perm;     /* message permissions */
        struct msg     *msg_first;    /* pointer to the first message on the queue */
        struct msg     *msg_last;     /* pointer to the last message on the queue */
        ulong           msg_cbytes;   /* current no. of bytes in queue */
        ulong           msg_qnum;     /* no. of messages in queue */
        ulong           msg_qbytes;   /* max no. of bytes in queue */
        pid_t           msg_lspid;    /* pid of last msgsnd() */
        pid_t           msg_lrpid;    /* pid of last msgrcv() */
        time_t          msg_stime;    /* last msgsnd() time */
        time_t          msg_rtime;    /* last msgrcv() time */
        time_t          msg_ctime;    /* last change time */
    };

Now we briefly look at some of these fields.

The first field, msg_perm, is a structure that defines the inter process communication
(IPC) permissions of the queue, i.e. its ownership and the types of access permitted.

msg_first and msg_last point to the locations in the kernel where the first and the last
messages of the queue are stored.

A typical permission structure looks as follows:

struct ipc_perm {
    uid_t uid;      /* owner's effective user id */
    gid_t gid;      /* owner's effective group id */
    uid_t cuid;     /* creator's effective user id */
    gid_t cgid;     /* creator's effective group id */
    mode_t mode;    /* access modes */
    ulong seq;      /* slot usage sequence number */
    key_t key;      /* key */
};

All the fields in the IPC permission structure are self-evident.

The remaining fields of the message queue structure become clear as we proceed with
the various functions.

Further, message queues, even though they are linked list structures, cannot grow to
any size. The system imposes certain limits on them, which of course can be modified
or reset. The following are some of the limits.

Name      Description
MSGMAX    The size, in bytes, of the largest message that can be sent
MSGMNB    The maximum size, in bytes, of one particular queue
MSGMNI    The maximum number of message queues
MSGTQL    The maximum number of messages

Normally the largest message size is set to 2048 bytes and the maximum number of
message queues to 50.

Now we look at the various functions that operate on message queues.

The function msgget is used, as indicated earlier, either to open an existing message
queue or to create a new message queue.

The typical format is

int msgget (key_t key, int flag);
It returns the message queue id if successful, otherwise returns -1.

Each message queue is associated with a key. If a new queue is being created, the user
has to specify the key. If an existing queue is being opened, the key specified must be
the same as the one that was specified when the queue was created. The kernel converts
the key into an identifier that is uniquely associated with the queue.

To be sure that we create a new queue rather than open an existing one, we specify a
flag with both the IPC_CREAT and IPC_EXCL bits set. The call then fails if a queue with
that key already exists.

The other function that we examine is the msgctl function. It performs various
operations on the queue.

The typical format is

int msgctl (int msqid, int cmd, struct msqid_ds *buf);
It returns 0 if successful, -1 on error.

Look at the arguments:

msqid is the id of the queue.

The cmd argument specifies the command to be performed on the queue. The normal
operations are as follows:

IPC_STAT   Fetch the msqid_ds structure for the queue and store it in the
           structure pointed to by buf.
IPC_SET    Set the fields msg_perm.uid, msg_perm.gid, msg_perm.mode and
           msg_qbytes from the structure pointed to by buf. This can be done
           only if the effective user id of the process doing the operation
           equals the value of msg_perm.uid or msg_perm.cuid, or by a process
           with super user privileges. Only the super user is permitted to
           increase the value of msg_qbytes.
IPC_RMID   Remove the message queue, along with any data in it, from the
           system. The removal is immediate, so any process still using the
           queue gets an error on its next operation on the queue. This
           command can be executed only by a process whose effective user id
           equals msg_perm.cuid or msg_perm.uid, or by one with super user
           privileges.
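The calls described so far can be combined as in the following minimal sketch. It assumes a Linux-like system with System V IPC; the function name inspect_new_queue is hypothetical, and the key IPC_PRIVATE is used only so that the example never collides with an existing queue.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Create a private queue, read its msqid_ds with IPC_STAT and then remove
 * the queue with IPC_RMID. Returns 0 on success, -1 on error. */
int inspect_new_queue(struct msqid_ds *info)
{
    int qid, rc;

    assert(info != NULL);
    /* 0600: read and write permission for the owner only */
    qid = msgget(IPC_PRIVATE, IPC_CREAT | IPC_EXCL | 0600);
    if (qid < 0)
        return -1;
    rc = msgctl(qid, IPC_STAT, info);       /* copy the kernel structure */
    if (msgctl(qid, IPC_RMID, NULL) < 0)    /* always clean up the queue */
        rc = -1;
    return rc;
}
```

After a successful call, info->msg_qnum is 0 (the queue was empty) and info->msg_qbytes holds the per-queue byte limit of the system.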

To place data into the message queue, we use the function msgsnd.

The typical format is

int msgsnd (int msqid, const void *ptr, size_t nbytes, int flag);
The function returns 0 if ok, otherwise returns -1.

Now the arguments.

msqid is the id of the message queue in question. ptr points to a long integer that
contains the message type, immediately followed by the message data. nbytes is the
length of the message data. The structure to hold a message can be something as
follows:

struct mesg {
    long mtype;         /* type of the message */
    char mtext[MAX];    /* message text, with size MAX */
};

ptr then points to this structure.

The flag can be specified as IPC_NOWAIT, which makes the call non-blocking (comparable
to the non-blocking flag of file I/O operations).

Messages can be retrieved from the queue by the msgrcv function.

The typical format is

int msgrcv (int msqid, void *ptr, size_t nbytes, long type, int flag);
It returns the size of the data portion if successful, -1 on error.

ptr points to a structure just like in msgsnd. nbytes indicates the size of the data
buffer for the message data. If the message received is longer than the buffer size,
the message is truncated to fit the buffer, provided the MSG_NOERROR flag is set. If
this flag is not set, a message longer than the buffer size results in an error and the
message stays on the queue.

The type argument lets us specify which message we want:

type = 0   The first message on the queue is returned.
type > 0   The first message whose message type equals type is returned.
type < 0   The first message whose type is the lowest value less than or equal
           to the absolute value of type is returned.

Note that type = 0 means a first-in-first-out operation. A non-zero type can be used
to produce a priority based queue.

We can specify a flag value of IPC_NOWAIT to make the call non-blocking (refer to the
description of msgsnd above).
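The effect of the type argument can be tried out with the following minimal sketch. It assumes a Linux-like system with System V IPC; the names send_text, first_type_received and MAXTEXT are hypothetical. Three messages of types 3, 1 and 2 are queued in that order, and one message is then fetched with the given type selector.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAXTEXT 64

struct mesg {
    long mtype;             /* type of the message, must be > 0 */
    char mtext[MAXTEXT];    /* message text */
};

static int send_text(int qid, long type, const char *text)
{
    struct mesg m;

    assert(text != NULL);
    m.mtype = type;
    strncpy(m.mtext, text, MAXTEXT - 1);
    m.mtext[MAXTEXT - 1] = '\0';
    return msgsnd(qid, &m, strlen(m.mtext) + 1, IPC_NOWAIT);
}

/* Queue messages of types 3, 1, 2 (in that order) and return the type of
 * the message that msgrcv selects; -1 on error or if nothing matches. */
long first_type_received(long selector)
{
    struct mesg m;
    long result = -1;
    int qid = msgget(IPC_PRIVATE, 0600);

    if (qid < 0)
        return -1;
    if (send_text(qid, 3, "three") == 0 &&
        send_text(qid, 1, "one") == 0 &&
        send_text(qid, 2, "two") == 0 &&
        msgrcv(qid, &m, MAXTEXT, selector, IPC_NOWAIT) >= 0)
        result = m.mtype;
    msgctl(qid, IPC_RMID, NULL);    /* remove the queue again */
    return result;
}
```

With a selector of 0 the oldest message (type 3) comes back, with 2 the message of type 2, and with -3 the message of type 1, since 1 is the lowest type not exceeding 3.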

8.7 Semaphores:

A semaphore is a counter used to provide access to a shared data object for multiple
processes. To obtain a shared resource, a process follows these steps.

a) Check the value of the semaphore, which indicates whether the resource is
   available.
b) If the semaphore has a positive value, the process can make use of the
   resource. The process decrements the value of the semaphore by 1. This
   indicates that one more unit of the resource controlled by the semaphore has
   been committed.
c) If the value of the semaphore is 0, the resource is not available and hence
   the process goes to sleep till the semaphore value becomes greater than 0.
   Then it goes back to step (a).
d) When a process has finished with the shared resource and returns it, the
   semaphore value is increased by 1.

When the value is increased, any processes sleeping on the semaphore are awakened.

Common sense tells us that checking the semaphore value and decrementing it
(indicating allocation to a process) must be a single atomic operation.

8.7.1. Implementation of semaphore structures:

The kernel maintains a semid_ds structure for each semaphore set.

The data structure can be described as follows:

struct semid_ds {
    struct ipc_perm sem_perm;   /* permissions */
    struct sem *sem_base;       /* pointer to the first semaphore in the set */
    ushort sem_nsems;           /* no. of semaphores in the set */
    time_t sem_otime;           /* last semop ( ) time */
    time_t sem_ctime;           /* last change time */
};

sem_base points to the memory in the kernel where the semaphores are stored. It points
to an array of sem structures, one for each member of the set.

struct sem {
    ushort semval;      /* semaphore value, always >= 0 */
    pid_t sempid;       /* pid for last operation */
    ushort semncnt;     /* no. of processes awaiting semval > currval */
    ushort semzcnt;     /* no. of processes awaiting semval = 0 */
};

Now, we shall look at a few functions that operate on semaphores. The first function
is used to obtain a semaphore id.

The typical format is

int semget (key_t key, int nsems, int flag);
It returns the semaphore id on success, otherwise returns -1.

This can be used either to get an existing semaphore set or to create a new one.

If a new set is being created, then apart from other initializations, sem_otime is set
to 0, sem_ctime is set to the current time and sem_nsems is set to nsems, the number of
semaphores in the set.

If we are referencing an existing set of semaphores, we can specify nsems as 0.

8.7.2. The function semctl:

The next function we see is semctl, which can be thought of as a catch-all function for
all sorts of operations on semaphores.

The typical format is

int semctl (int semid, int semnum, int cmd, union semun arg);

The argument semid specifies the id of the semaphore set on which the operations are to
be done.

semnum specifies the particular member of the semaphore set on which the operation is to
be performed. The valid values of semnum are from 0 to nsems-1.

cmd indicates the command to be performed on the set identified by semid or, for some
commands, on its member semnum. There are ten commands; each of them is described
briefly here.
IPC_STAT   Fetch the semid_ds structure for this set and store it in the
           structure pointed to by arg.buf.
IPC_SET    Set the following fields from the structure pointed to by arg.buf:
             i)   sem_perm.uid
             ii)  sem_perm.gid
             iii) sem_perm.mode
           This command can be executed only if the effective user id of the
           calling process equals sem_perm.cuid or sem_perm.uid, or by a
           process with super user privileges.
IPC_RMID   Remove the semaphore set from the system. As we have seen earlier,
           this removal is immediate and any process still using the semaphore
           gets an error on its next attempt to access the semaphore. This
           command, again, can be executed only if the effective user id of the
           calling process equals sem_perm.cuid or sem_perm.uid, or by a
           process with super user privileges.
GETVAL     Return the value of semval for the member semnum.
SETVAL     Set the value of semval for the member semnum. The value to which it
           is to be set is given by arg.val.
GETPID     Return the value of sempid for the member semnum.
GETNCNT    Return the value of semncnt for the member semnum.
GETZCNT    Return the value of semzcnt for the member semnum.
GETALL     Fetch all the semaphore values in the set. These values are stored
           in the array pointed to by arg.array.
SETALL     Set all the semaphore values in the set to the values in the array
           pointed to by arg.array.

The final argument of the function is the actual union, named semun.

union semun {
    int val;                /* for SETVAL */
    struct semid_ds *buf;   /* for IPC_STAT and IPC_SET */
    ushort *array;          /* for GETALL and SETALL */
};
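The use of the three members of the union can be illustrated by the following minimal sketch. It assumes Linux, where the calling program has to define union semun itself (some other systems already define it in sys/sem.h); the names semctl_roundtrip and NSEMS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: Linux, where the caller must define this union. */
union semun {
    int val;                 /* for SETVAL */
    struct semid_ds *buf;    /* for IPC_STAT and IPC_SET */
    unsigned short *array;   /* for GETALL and SETALL */
};

#define NSEMS 3

/* Create a set of NSEMS semaphores, initialise all of them with SETALL,
 * overwrite member 1 with SETVAL and read the set back with GETALL.
 * Returns 0 on success, -1 on error. */
int semctl_roundtrip(unsigned short out[NSEMS])
{
    unsigned short init[NSEMS] = { 5, 6, 7 };
    union semun arg;
    int semid, rc = -1;

    assert(out != NULL);
    semid = semget(IPC_PRIVATE, NSEMS, 0600);
    if (semid < 0)
        return -1;

    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) == 0) {   /* semnum is ignored */
        arg.val = 9;
        if (semctl(semid, 1, SETVAL, arg) == 0) {
            arg.array = out;
            rc = semctl(semid, 0, GETALL, arg);
        }
    }
    semctl(semid, 0, IPC_RMID);     /* remove the set again */
    return rc;
}
```

On success the array out contains 5, 9 and 7: the values given with SETALL, except for member 1, which was overwritten with SETVAL.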

8.7.3. The function semop:

The next function we shall be looking at is semop, which atomically performs a number
of operations on a semaphore set.

The typical format is

int semop (int semid, struct sembuf semoparray[ ], size_t nops);
It returns 0 if successful, -1 on error.

semid is the semaphore id.

semoparray is a pointer to an array of the operations to be done on the semaphore set.
Each element of the array is of the following type.

struct sembuf {
    ushort sem_num;     /* member no. in the set (0, 1, 2, ..., nsems-1) */
    short sem_op;       /* operation (negative, 0 or positive) */
    short sem_flg;      /* IPC_NOWAIT, SEM_UNDO */
};

nops specifies the number of operations in the array.

The operation to be done on each member of the set is specified by the respective
sem_op value. As has been indicated, this value can be negative, 0 or positive.

a) When sem_op is positive, it indicates that resources are being returned by
   the process. The value of sem_op is added to the current value of the
   semaphore we are operating upon. If the undo flag (SEM_UNDO) is set, sem_op
   is also subtracted from the semaphore's adjustment value for the process.
b) If sem_op is negative, it means the process wants to obtain the resources
   that the semaphore controls. In such a case, a number of alternative
   situations are possible. If the current value of the semaphore is greater
   than or equal to the absolute value of sem_op (which indicates that the
   required resources are available), the absolute value of sem_op is
   subtracted from the semaphore's value. The resulting value never falls below
   0. If the undo flag has been specified, the absolute value of sem_op is also
   added to the semaphore's adjustment value for this process. On the other
   hand, if the semaphore's value is less than the absolute value of sem_op, it
   means the process is requesting more resources than are actually available.
   In such a case:
   i)  If IPC_NOWAIT is specified, control returns with an error.
   ii) If IPC_NOWAIT is not specified, the semncnt for the semaphore is
       incremented and the calling process is put to sleep (suspended) pending
       one of the following occurrences:
       1. The semaphore value becomes greater than or equal to the absolute
          value of sem_op. This happens when certain other processes have
          released their resources. Then the value of semncnt for the semaphore
          is decremented and the absolute value of sem_op is subtracted from the
          new semaphore value. If the undo flag is specified, the absolute value
          of sem_op is also added to the semaphore's adjustment value.
       2. The semaphore is removed from the system. In this case the process
          gets an error.
       3. A signal is caught by the process and the signal handler returns.
          Then the value of semncnt for the semaphore is decremented and the
          function returns an error.
c) If sem_op is 0, it means that we want to wait until the semaphore's value
   becomes 0. If the current value of the semaphore is already 0, the function
   returns immediately. If it is not zero, then:
   i)  If IPC_NOWAIT is specified, an error is returned.
   ii) If IPC_NOWAIT is not specified, the semzcnt for the semaphore is
       incremented and the calling process is suspended until one of the
       following things occurs:
       1. The semaphore's value becomes 0. Then the value of semzcnt for the
          semaphore is decremented.
       2. The semaphore is removed from the system. Then the process gets an
          error.
       3. A signal is caught by the process and the signal handler returns.
          Then the value of semzcnt for the semaphore is decremented and the
          function returns an error.
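The cases of a positive and a negative sem_op can be seen at work in the following minimal sketch, which treats one semaphore as a counter of free resource units. It assumes Linux (where the program defines union semun itself); all pool_ names are hypothetical. IPC_NOWAIT is used throughout so that a request that cannot be satisfied returns an error instead of suspending the caller.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: Linux, where the caller must define this union. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a one-member set whose value is the number of free units. */
int pool_create(int units)
{
    union semun arg;
    int semid;

    assert(units >= 0);
    semid = semget(IPC_PRIVATE, 1, 0600);
    if (semid < 0)
        return -1;
    arg.val = units;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

static int pool_op(int semid, short delta)
{
    struct sembuf op;

    op.sem_num = 0;             /* the only member of the set */
    op.sem_op = delta;          /* negative: obtain, positive: return */
    op.sem_flg = IPC_NOWAIT;    /* fail instead of going to sleep */
    return semop(semid, &op, 1);
}

int pool_acquire(int semid, short units) { return pool_op(semid, (short)-units); }
int pool_release(int semid, short units) { return pool_op(semid, units); }
int pool_available(int semid) { return semctl(semid, 0, GETVAL); }
int pool_destroy(int semid) { return semctl(semid, 0, IPC_RMID); }
```

With a pool of 2 units, acquiring 1 unit leaves 1 free; a further request for 2 units then fails and leaves the value unchanged, and releasing 1 unit brings the value back to 2.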

8.7.4. What to do if a process holding system resources terminates?

It becomes a problem if a process suddenly terminates, for whatever reason, while it
still holds resources that are accounted for in a semaphore. How do we account for the
resources held by the process? When we specify the SEM_UNDO flag for a semaphore
operation and allocate resources, the kernel remembers how many resources are allocated
from each particular semaphore to the process (indicated by the absolute value of
sem_op). Once the process terminates, voluntarily or involuntarily, the kernel checks
whether the process has any outstanding semaphore adjustments and suitably adjusts the
respective semaphores.

If the value of a semaphore is set using semctl, with either the SETVAL or the SETALL
command, the adjustment value for that semaphore in all processes is set to 0.
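The effect of the undo flag can be observed with the following minimal sketch, in which a child process obtains the only unit of a semaphore and terminates without returning it. It assumes Linux (where the program defines union semun itself); the function name value_after_child_exit is hypothetical.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>

/* Assumption: Linux, where the caller must define this union. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* A child obtains the single unit of a semaphore and exits without
 * returning it. Returns the semaphore value that the parent sees after
 * the child has terminated, or -1 on error. */
int value_after_child_exit(int use_undo)
{
    union semun arg;
    struct sembuf op;
    pid_t pid;
    int semid, value = -1;

    assert(use_undo == 0 || use_undo == 1);
    semid = semget(IPC_PRIVATE, 1, 0600);
    if (semid < 0)
        return -1;
    arg.val = 1;                        /* one unit available */
    if (semctl(semid, 0, SETVAL, arg) == 0 && (pid = fork()) >= 0) {
        if (pid == 0) {                 /* child */
            op.sem_num = 0;
            op.sem_op = -1;             /* obtain the unit */
            op.sem_flg = use_undo ? SEM_UNDO : 0;
            _exit(semop(semid, &op, 1) == 0 ? 0 : 1);
        }
        if (waitpid(pid, NULL, 0) == pid)
            value = semctl(semid, 0, GETVAL);
    }
    semctl(semid, 0, IPC_RMID);         /* remove the set again */
    return value;
}
```

With the undo flag the parent finds the value restored to 1; without it the value stays at 0 and the unit is lost until some process sets the semaphore again.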

8.8 The concept of shared memory:

The basic principle behind shared memory is that two or more processes are allowed to
share a given region of memory. This makes the operation faster because data need not
be copied between the client and the server, for example. The most important concern
is synchronizing access to the region amongst the different processes: while data is
being put into the shared region by one process, no other process should be able to
access the data in the same region, and vice versa. Two mechanisms are commonly used
to synchronize the operations: semaphores and record locking.

We look into the details in the subsequent paragraphs.

The kernel maintains the information about each shared memory segment in the following
structure type:

struct shmid_ds {
    struct ipc_perm shm_perm;   /* security permissions */
    struct anon_map *shm_amp;   /* pointer in kernel */
    int shm_segsz;              /* size of the segment in bytes */
    ushort shm_lkcnt;           /* no. of times segment is being locked */
    pid_t shm_lpid;             /* pid of the last shmop ( ) */
    pid_t shm_cpid;             /* pid of the creator */
    ulong shm_nattch;           /* no. of current attaches */
    ulong shm_cnattch;          /* used only for shminfo */
    time_t shm_atime;           /* last attach time */
    time_t shm_dtime;           /* last detach time */
    time_t shm_ctime;           /* last change time */
};

All the fields are self-explanatory.

Also, certain limits are set on shared memory:

Name      Description
SHMMAX    Maximum size in bytes of a shared memory segment
SHMMIN    Minimum size in bytes of a shared memory segment
SHMMNI    Maximum no. of shared memory segments, system wide
SHMSEG    Maximum no. of shared memory segments, per process

These limits can be changed to tune the various parameters of memory sharing.

Now, we are in a position to look into several of the memory sharing functions.

8.8.1. The function shmget:

The typical format is

int shmget (key_t key, int size, int flag);
If successful, it returns the shared memory id, -1 on error.

When a new shared memory segment is created, the following members of the shmid_ds
structure are initialized.

i)   The ipc_perm structure is initialized. The method of initializing this
     field is the same as we have seen on several previous occasions.
ii)  shm_lpid, shm_nattch, shm_atime and shm_dtime are initialized to 0.
iii) shm_ctime is set to the current time.

size indicates the size of the shared memory segment in bytes. If we are referencing
an existing segment, size can be specified as zero. If a new segment is being created,
the size should be specified suitably.

8.8.2 The shmctl function:

The shmctl function performs various shared memory operations.

The typical format is

int shmctl (int shmid, int cmd, struct shmid_ds *buf);
It returns 0 if successful, otherwise returns -1.

The cmd argument specifies one of the following commands to be performed. The
operation is done on the segment specified by shmid.

IPC_STAT     Fetch the shmid_ds structure for this segment and store it in the
             structure pointed to by buf.
IPC_SET      Set the following fields, from the structure pointed to by buf, in
             the structure associated with this segment: shm_perm.uid,
             shm_perm.gid and shm_perm.mode. The command can be executed only if
             the calling process has its effective user id equal to
             shm_perm.cuid or shm_perm.uid, or by a process with super user
             privileges.
IPC_RMID     Remove the shared memory segment from the system. Note that the
             shm_nattch field in the shmid_ds structure is an attachment count
             for the shared memory segment. When a segment is to be removed, its
             identifier is removed immediately, so that shmat can no longer
             attach the segment. The segment itself is actually removed only
             after the last process using it either terminates or detaches from
             it. This command can be executed only by a process whose effective
             user id equals shm_perm.cuid or shm_perm.uid, or by a process with
             super user privileges.
SHM_LOCK     Lock the shared memory segment in memory. This can be executed only
             by the super user.
SHM_UNLOCK   Unlock the shared memory segment. Again, this can be executed only
             by the super user.

8.8.3 The shmat function:

This function attaches a shared memory segment to the address space of the calling
process.

The typical format is

void *shmat (int shmid, void *addr, int flag);

The function returns a pointer to the shared memory segment if successful, otherwise
-1.

The address at which the segment gets attached to the process depends on the addr
argument and also on whether the SHM_RND bit is specified in the flag field.

a) If addr is 0, the segment is attached at the first available address
   selected by the kernel.
b) If addr is non zero and SHM_RND is not specified, the segment is attached at
   the address given by addr.
c) If addr is non zero and SHM_RND is specified, the segment gets attached at
   the address given by (addr - (addr mod SHMLBA)). SHMLBA stands for “low
   boundary address multiple”.

It is advisable to specify addr as 0 and let the system choose the address.

8.8.4 The shmdt function:

The function detaches the shared memory segment, once we have completed the operations
on it.

The typical format is

int shmdt (void *addr);
It returns 0 if successful, otherwise -1.

It may be noted that the function detaches the memory segment but does not remove the
identifier and the associated data structures. These remain in existence until the
segment is removed with the IPC_RMID command of shmctl.
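The four shared memory functions can be seen together in the following minimal sketch, in which a parent and its child share one integer. It assumes a Linux-like system; the function name shared_increment is hypothetical. For brevity, the parent simply waits for the child to terminate instead of synchronizing with a semaphore or a record lock.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

/* Store start in a shared segment, let a child add 1 to it and return the
 * value that the parent then reads back; -1 on error. */
int shared_increment(int start)
{
    int shmid, result = -1;
    int *counter;
    pid_t pid;

    assert(start >= 0);
    shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    counter = shmat(shmid, NULL, 0);    /* addr 0: the kernel picks */
    if (counter == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *counter = start;

    pid = fork();
    if (pid == 0) {                     /* child inherits the attachment */
        *counter += 1;
        shmdt(counter);
        _exit(0);
    }
    /* waiting for the child is the only synchronization in this sketch */
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        result = *counter;
    shmdt(counter);                     /* detach ...                     */
    shmctl(shmid, IPC_RMID, NULL);      /* ... and then remove the segment */
    return result;
}
```

Called with 41, the function returns 42: the value was changed by the child directly in the shared segment and never copied between the two processes.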

8.9 Client Server Properties:

Several properties of client-server operation are affected by the type of IPC used.

The simplest way of having a client-server operation is to have the client fork and
exec the desired server. Before the fork, the pipe function can be used to create two
one-way pipes, as needed for two-way data transfer and as seen earlier. The server can
be a set-user-id program. Also, the server can look at its real user id and hence
determine the client's identity.

This arrangement helps one to build an “open server”. It can open files for the client,
instead of the client calling the open function. In this type of client-server
architecture, since the server is a child of the client, it can only pass the contents
of the file back to the parent. The child cannot pass a file descriptor back to the
parent, though the parent can pass such a descriptor to the child.

The next type of server is a daemon process that uses some form of IPC with its
clients. In this case, either FIFOs or message queues are used for message transfers.
When message queues are used, several possibilities regarding their operation arise.

a) A single queue can be used for passing messages between the server and all
   its clients. The clients include their process id in each request, and the
   server uses it (for example as the message type of the response) to return
   the response to the right client.
b) Each client has its own message queue. Before sending its first request,
   each client creates its own message queue with the key IPC_PRIVATE. The
   server also has its own queue, with a key known to each of its clients. The
   client sends its first request to the server's well-known queue, and this
   request contains the message queue id of the client's queue. The server
   sends its first response to the client's queue, and all future requests and
   responses are exchanged on this queue.
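Arrangement (a) can be sketched as follows. This is only a minimal illustration under our own assumptions: all requests are sent with message type 1, the response is sent with the client's process id as its type, and the “service” consists of adding 2 to a number. The names packet, payload, client_send, server_handle_one and client_receive are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define REQUEST_TYPE 1L     /* assumption: every request has type 1 */

struct payload { pid_t client; int value; };
struct packet  { long mtype; struct payload body; };

/* Client: send a request carrying our process id. */
int client_send(int qid, int value)
{
    struct packet p;

    p.mtype = REQUEST_TYPE;
    p.body.client = getpid();
    p.body.value = value;
    return msgsnd(qid, &p, sizeof p.body, 0);
}

/* Server: take one request and answer it, using the client's process id
 * as the type of the response. */
int server_handle_one(int qid)
{
    struct packet p;

    if (msgrcv(qid, &p, sizeof p.body, REQUEST_TYPE, 0) < 0)
        return -1;
    p.mtype = (long)p.body.client;  /* address the reply to the client */
    p.body.value += 2;              /* the "service" of this sketch */
    return msgsnd(qid, &p, sizeof p.body, 0);
}

/* Client: fetch only the response whose type is our own process id. */
int client_receive(int qid, int *value)
{
    struct packet p;

    assert(value != NULL);
    if (msgrcv(qid, &p, sizeof p.body, (long)getpid(), 0) < 0)
        return -1;
    *value = p.body.value;
    return 0;
}
```

Since every client asks msgrcv only for the type equal to its own process id, the responses for different clients can share the one queue without being mixed up. The scheme relies on no client having process id 1, which would clash with the request type.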

The problem with the second arrangement is that the server has to go on reading
messages from multiple queues, and neither select nor poll works with message queues.

The basic problem with the shared memory concept is that only a single message can be
in a shared memory segment at a time. On the other hand, it is possible for each client
to have its own shared memory segment with the server. But this poses the additional
problem that the server should be able to identify the client accurately.

8.10 Stream pipes:

One problem with the pipes we have seen so far is that they are unidirectional. So, to
have a full duplex connection, we had to create two such unidirectional pipes and close
the complementary ends, as we have seen earlier.

One other way of doing it is to use a bi-directional pipe, called the stream pipe. Its
structure is given below:
[Figure: two user processes, each holding the descriptors fd[0] and fd[1], connected
by a single stream pipe]

To see how a single stream pipe can operate, we rewrite the program that we wrote using
coprocesses in section 8.4.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>

/* err_sys, err_msg and MAXLINE are the helper definitions used in the
   earlier programs; s_pipe is described in section 8.10.1 */

static void sig_pipe(int);      /* our signal handler */

int
main(void)
{
    int     n, fd[2];
    pid_t   pid;
    char    line[MAXLINE];

    if (signal(SIGPIPE, sig_pipe) == SIG_ERR)
        err_sys("signal error");
    if (s_pipe(fd) < 0)         /* only needs a single stream pipe */
        err_sys("pipe error");
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid > 0) {         /* parent */
        close(fd[1]);
        while (fgets(line, MAXLINE, stdin) != NULL) {
            n = strlen(line);
            if (write(fd[0], line, n) != n)
                err_sys("write error to pipe");
            /* leave room for the terminating null byte */
            if ((n = read(fd[0], line, MAXLINE - 1)) < 0)
                err_sys("read error from pipe");
            if (n == 0) {
                err_msg("child closed the pipe");
                break;
            }
            line[n] = 0;        /* null terminate */
            if (fputs(line, stdout) == EOF)
                err_sys("fputs error");
        }
        if (ferror(stdin))
            err_sys("fgets error on stdin");
        exit(0);
    } else {                    /* child */
        close(fd[0]);
        if (fd[1] != STDIN_FILENO) {
            if (dup2(fd[1], STDIN_FILENO) != STDIN_FILENO)
                err_sys("dup2 error to stdin");
        }
        if (fd[1] != STDOUT_FILENO) {
            if (dup2(fd[1], STDOUT_FILENO) != STDOUT_FILENO)
                err_sys("dup2 error to stdout");
        }
        if (execl("./add2", "add2", (char *) 0) < 0)
            err_sys("execl error");
    }
    return 0;
}

static void
sig_pipe(int signo)
{
    printf("SIGPIPE caught\n");
    exit(1);
}

Since each end of the stream pipe is full duplex, the parent in the above program uses
only fd[0] for both reading and writing, and the child duplicates fd[1] onto both stdin
and stdout (standard input and standard output).

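[Figure: the parent's fd[0] is connected through the stream pipe to the child's fd[1],
which serves as both stdin and stdout of the child]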

8.10.1 The s_pipe function:

This function, which creates the stream pipe, is similar to the pipe function. So, we
do not go into great detail, except to state that on systems where pipes are already
full duplex (such as SVR4) the s_pipe function just calls the standard pipe function to
create a full duplex pipe.

The typical format is

int s_pipe (int fd[2]);
It returns 0 if successful and -1 on error. On success the two file descriptors are
returned in fd[0] and fd[1].
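On systems where the pipe function creates only one-way pipes (Linux is assumed here), the same interface can be provided in another way. The following is a minimal sketch that uses a pair of connected UNIX domain sockets; this is a substitute technique for such systems, not the pipe-based version described above.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sketch of s_pipe for systems whose pipes are one-way: a pair of
 * connected UNIX domain stream sockets, each end full duplex.
 * Returns 0 if successful, -1 on error. */
int s_pipe(int fd[2])
{
    assert(fd != NULL);
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fd);
}
```

Data written on fd[0] can be read from fd[1] and data written on fd[1] can be read from fd[0], which is the behaviour a stream pipe program expects.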

8.11 Passing file descriptors:

The ability to pass an open file descriptor between processes is a desirable feature,
as it helps in designing different types of applications. For example, one process
(typically a server) can do everything needed to open a file, and the calling process
just needs to accept the descriptor passed on to it by the server and do its I/O using
the same. The details of opening the file or the device are all hidden from the calling
process (let us call it a client).

When an open file descriptor is passed from one process to another, the passing process
(server) and the receiving process (client) share the same file table entry. In simple
terms, it just means passing a pointer to an open file table entry from one process to
another. The pointer is assigned to the first available file descriptor in the
receiving process.
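[Figure: the process table entries of the sending process (1) and the receiving process
(2). The fd pointer in each of them refers to the same file table entry (file status,
current offset, v-node pointer), which points to the v-node table entry (v-node
information, i-node information, current file size).]

The file descriptor is being passed from process (1) to process (2).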

8.11.1 Functions for sending and receiving file descriptors:

The following functions can be used to send and receive file descriptors.

Their typical formats are

int send_fd (int spipefd, int filedes);
int send_err (int spipefd, int status, const char *errmsg);
Both return 0 if successful, otherwise -1.

When a process wants to pass a file descriptor, it calls either send_fd or send_err.

send_fd sends the descriptor filedes across the stream pipe spipefd. send_err sends
errmsg across the stream pipe spipefd, followed by the status byte. The value of status
must be in the range -1 through -255.

The receiver receives the descriptor by calling the function recv_fd.

The typical format is

int recv_fd (int spipefd, ssize_t (*userfunc) (int, const void *, size_t));
It returns the file descriptor if successful, else a number less than 0.

If an error message was sent by the server, the client's userfunc is called to process
the message. The first argument to userfunc is the constant STDERR_FILENO, followed by
a pointer to the error message and its length.
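How a descriptor can actually travel between processes may be sketched as follows. This is a simplified illustration and not the send_fd and recv_fd functions described above: it assumes that the two processes are connected by a UNIX domain socket pair, it uses the SCM_RIGHTS control message of sendmsg and recvmsg, and it omits the status byte and the error message handling. The names send_fd_sketch and recv_fd_sketch are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Buffer for one control message that carries a single descriptor. */
union fd_control {
    struct cmsghdr align;                   /* forces correct alignment */
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd_to_send across the socket sock. Returns 0 if ok, -1 on error. */
int send_fd_sketch(int sock, int fd_to_send)
{
    struct msghdr msg;
    struct iovec iov;
    union fd_control ctl;
    struct cmsghdr *cmsg;
    char byte = 0;                  /* at least one data byte must be sent */

    assert(fd_to_send >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;   /* the payload is a file descriptor */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new descriptor, -1 on error. */
int recv_fd_sketch(int sock)
{
    struct msghdr msg;
    struct iovec iov;
    union fd_control ctl;
    struct cmsghdr *cmsg;
    char byte;
    int fd;

    memset(&msg, 0, sizeof msg);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;      /* first available descriptor in the receiver */
}
```

The descriptor number seen by the receiver is normally different from the one that was sent, but both refer to the same file table entry, so they share the current file offset.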

Block Summary

We began with the basic idea of interprocess communication. We introduced ourselves to
the concept of pipes, which can be viewed as a connecting channel between two processes
that goes through the kernel. We also noted that a pipe call is normally followed by a
call to fork. We studied the functions to open and close pipes to other programs: the
popen and pclose functions.

Then we moved on to coprocesses and wrote a program that utilised the concept of
coprocesses. The next topic was FIFOs, which can be looked upon as pipes between
processes that do not share a common ancestor in fork operations. We also saw the
concept of message queues, each of which is a list of messages, as the name suggests.
We studied the msgget and msgctl functions.

We moved on to semaphores, a mechanism to provide access to a shared data object for
multiple processes. We studied the implementation of the semaphore structures and the
functions semctl, semop etc.

Then we studied the concept of shared memory and the functions shmget and shmctl, as
also the functions shmat and shmdt. We also briefly studied stream pipes and the s_pipe
function. The last concept that we studied was the passing of file descriptors and the
functions required for the same.

Review Questions:

1. What is a pipe? What are its limitations?
2. What are the typical formats for the popen and pclose functions?
3. What is a filter?
4. How does a FIFO differ from a pipe?
5. What is a message queue? How does it store and identify messages?
6. What is a semaphore?
7. What is the need for shared memory? What is the concept behind it?
8. Name the function used to create a shared memory segment.
9. What is a stream pipe?
10. Name the functions used to send and receive file descriptors.

Answers

1. A pipe is a connection between two processes through which data can pass. Its
limitations are that data can pass in only one direction and that it can be used only
by processes which have a common ancestor.

2. The typical formats are

FILE *popen (const char *cmdstring, const char *type);
and
int pclose (FILE *fp);

3. A filter can be described as a program that reads from the standard input and writes
to the standard output.

4. A FIFO helps to exchange data even between unrelated processes.

5. A message queue is a list of messages. Message queues are normally stored as linked
lists of messages in the kernel. Each queue is identified by a message queue
identifier.

6. A semaphore is a counter used to provide access to a shared data object for multiple
processes.

7. The concept behind shared memory is to allow two or more processes to share a given
region of memory, after suitable synchronisation is done. This makes the operations
faster.

8. The typical format of the function is

int shmget (key_t key, int size, int flag);

9. A stream pipe can be viewed as a bidirectional pipe.

10. The functions are send_fd and recv_fd respectively.

References:

1. W. Richard Stevens: Advanced Programming in the UNIX Environment, Addison-Wesley.
(The course closely follows the pattern of the book.)

2. Terrence Chan: Unix Programming Using C++, PHI.
