! \0
This declares msg to be an array of char just large enough to hold the string
literal "Help!".
The compiler ensures that enough space for the entire string literal is allocated.
Arrays are aggregate types; that is, they are built on top of simpler types.
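As a small illustration of the sizing rule above, the following sketch checks that an array initialized from "Help!" gets six elements: five characters plus the terminating '\0'. The function name msg_size is hypothetical and is used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* msg_size is a hypothetical helper used only for this illustration */
size_t msg_size (void)
{
    char msg[] = "Help!";          /* sized by the initializer */

    assert (msg[5] == '\0');       /* the terminator is stored too */
    return sizeof msg;             /* 5 characters + '\0' */
}
```

Calling msg_size () should therefore yield 6 on any conforming compiler.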
x[8];
num[100];
buffer[80];
num2[ 10*200 + 30 ];
/*illegal*/
/*illegal*/
return sign * n;
/* foo[4]==0 */
/* compile-time error */
Character arrays
/* READ-ONLY */
For now we will think of the program residing within one source
file
Later, we will discuss how to split it up into two or more source
files
The program, if written as one source file, will look something
like this
#include's
#define's
function prototype declarations for main
main () { ... }
external variables for push and pop
void push (double f) { ... }
double pop (void) { ... }
int getop (char s[]) { ... }
routines called by getop
Start
digit
digit
digit
.
Any character that is not a
' ', '\t', digit, or .
return NUMBER;
contents
main()
push(), pop() and their variables
getop()
getch(), ungetch()
getop.c
#include <stdio.h>
#include <ctype.h>
#include "calc.h"
getch.c
#include <stdio.h>
/* need calc.h
* declarations for
* consistency checking
*/
#include "calc.h"
#define BUFSIZE 100 /* not BUFSIZ, which <stdio.h> already defines */
static char buf[BUFSIZE];
static int bufp = 0;
int getch (void)
{...}
void ungetch (int c){...}
.
.
.
Compiler
Machine
memory
model
Programmer's
model
p:
...
c:
...
...
int x;
...
p = &x;
const int x;
...
p = &x;
int a[10];
...
p = &a[1];
ILLEGAL
register int x;
...
p = &x;
p = &(x+y);
p = &33;
ip
ip = &z[0];
The syntax of a variable declaration in C has been designed to mimic the
syntax of the expressions in which the variable might occur
DECLARATION
double a[10];
double *ap = a;
double atof (const char *);
USAGE
for (i = 0; i < 10; i++)
a[i] = 0;
*ap = 10;
void foo (void) {
double x = atof ("35.7");
}
The pointer declaration gives the name of the object and the type of thing to
which it points
In general, pointers are constrained to point to the objects for which they are
declared
Pointers to void may point to anything but they may not be dereferenced (more
later)
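To make the remark about pointers to void concrete, here is a minimal sketch. It assumes the caller knows that the void pointer really refers to a double; the names read_double and demo_void_pointer are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* read_double is a hypothetical helper: vp must point to a double */
double read_double (const void *vp)
{
    const double *dp;

    assert (vp != NULL);
    dp = vp;          /* void * converts to any object pointer type */
    return *dp;       /* dereference the typed pointer, never vp itself */
}

/* hypothetical caller */
double demo_void_pointer (void)
{
    double x = 35.7;
    void *vp = &x;    /* a void pointer may point to any object */

    return read_double (vp);
}
```

Writing *vp directly would be rejected by the compiler, because void has no size and no value.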
Unary operators * and & have the same precedence as other unary operators
higher than binary operators
lower than [] or () operators
These precedence rules also apply within the declarators
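The effect of these precedence rules inside declarators can be sketched as follows. The function declarator_demo and its variables are hypothetical; the point is only that [] binds more tightly than * unless parentheses say otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical demonstration of declarator precedence */
int declarator_demo (void)
{
    int a[3] = { 10, 20, 30 };
    int x = 1, y = 2, z = 3;
    int *ap[3];        /* [] binds first: array of 3 pointers to int */
    int (*pa)[3];      /* () forces * first: pointer to array of 3 int */

    ap[0] = &x;
    ap[1] = &y;
    ap[2] = &z;
    pa = &a;

    assert (sizeof ap == 3 * sizeof (int *));
    /* *ap[1] parses as *(ap[1]); (*pa)[1] needs the parentheses */
    return *ap[1] + (*pa)[1];
}
```

Here *ap[1] is y and (*pa)[1] is a[1], so the function returns 2 + 20.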
p1 = &a[0];
a[0] = 100;
printf ("p1 points to %d\n", *p1); /*100*/
printf ("p2 points to %d\n", *p2); /*10*/
/*WRONG*/
void swap (int x, int y)
{
int tmp = x;
x = y;
y = tmp;
}
void foo (void)
{
int a = 10, b = 20;
swap (a, b);
printf ("a=%d, b=%d\n", a, b);
}
[Figure: activation frames for foo and swap. The values a = 10 and b = 20 are copied into x and y, swap exchanges the copies, and after swap exits there is no change to a or b!]
This implementation of swap fails because it does not take into account that the
parameters are copies of the arguments
A correct implementation needs pointers to the objects being swapped to be
passed in
We now reimplement swap by passing in pointers to the objects to be swapped
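A minimal sketch of the pointer version follows. The three assignments mirror the steps tmp = *x, *x = *y, *y = tmp; the caller swap_demo is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* swap the ints that x and y point to */
void swap (int *x, int *y)
{
    int tmp;

    assert (x != NULL && y != NULL);
    tmp = *x;
    *x = *y;
    *y = tmp;
}

/* hypothetical caller: returns 1 if a and b were exchanged */
int swap_demo (void)
{
    int a = 10, b = 20;

    swap (&a, &b);     /* pass addresses, not copies */
    return a == 20 && b == 10;
}
```

Because swap receives the addresses of a and b, the assignments through *x and *y change the caller's objects rather than local copies.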
[Figure: activation frames for the pointer version of swap, showing tmp = *x, then *x = *y, then *y = tmp; when swap exits, a and b have been exchanged]
The standard library function scanf() may be used to read in input from the
terminal
Like printf(), it expects a format string followed by a series of arguments
In many ways, its format string resembles that of printf
However, because scanf expects a pointer to each object it fills in, there are some
areas where it is distinctly different
format    object
%d        int
%u        unsigned int
%hd       short
%ld       long
%f        float
%lf       double
%Lf       long double
%c        character
%s        char array
example
void foo (void)
{
    int i;
    scanf("%d", &i);
}
void bar (void)
{
    int x, *ip = &x;
    scanf("%d", ip);
}
The format strings for the functions printf and scanf are similar in many ways
However, one may observe that scanf is much pickier about its arguments
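The difference can be sketched with sscanf, which accepts the same formats as scanf but reads from a string; that substitution is an assumption made here so the example needs no terminal input. The names parse_pair and scanf_demo are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * parse_pair is a hypothetical helper. It uses sscanf, which takes
 * the same formats as scanf but reads from a string.
 */
int parse_pair (const char *text, float *f, double *d)
{
    assert (text != NULL && f != NULL && d != NULL);
    /* %f needs a float *, %lf needs a double *; mixing them is an error */
    return sscanf (text, "%f %lf", f, d);
}

/* hypothetical caller: returns 1 if both values were read correctly */
int scanf_demo (void)
{
    float f;
    double d;

    if (parse_pair ("1.5 2.25", &f, &d) != 2)
        return 0;
    /* printf accepts either value with %f: a float is promoted to double */
    printf ("%f %f\n", f, d);
    return f == 1.5f && d == 2.25;
}
```

The return value of sscanf (and scanf) is the number of items successfully converted, which is worth checking before the results are used.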
format
%d
int
%hd
short
%ld
%f
long
float
%lf
double
%Lf
%g
long double
floating point
%g
object
float
allowable types
for printf
char
short
int
char
short
int
long
float
double
float
double
long double
float
double
N/A
/*trouble*/
/*trouble*/
/*correct*/
Most compilers do not check format strings of printf and scanf style
functions to see that they are consistent with the arguments that the user provides
Some compilers, recently, have started doing this
GNU C -- version 2.0 and up -- does check them and it helps prevent a lot of
bugs
The Unisys A-Series C compiler does this type of checking as well
Missing arguments to either of these can wreak havoc
It is therefore critical that you understand exactly what you are doing with either
printf or scanf
x:
y:
x:
y:
x:
y:
INTERPRETATION
pointer to int
int
pointer to int
pointer to int
pointer to int
int
Explain why the last declaration of the table is equivalent to the first declaration
REMARK
long a[8];
[Figure: elements a[0] through a[7], with a+1, a+2, and a+7 pointing at a[1], a[2], and a[7]]
[Figure: the same elements, with pa, pa+1, pa+2, and pa+7 marking successive positions in the array]
KEY QUESTION
What type of object does pa point to?
The answer the compiler gives to this question determines how pointer arithmetic
on pa is carried out
*pa, *(pa + 0), and pa[0] all denote the same object
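The following sketch illustrates both points for a pointer to long. It assumes pa has been set to point at the first element of a; the function name pointer_arith_demo and the element values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical demonstration using an array of eight longs */
int pointer_arith_demo (void)
{
    long a[8] = { 0, 10, 20, 30, 40, 50, 60, 70 };
    long *pa = a;                 /* same as &a[0] */

    /* three spellings of the same object */
    assert (*pa == *(pa + 0) && *pa == pa[0]);

    /* pa + i points i elements (not i bytes) past pa */
    assert (pa + 2 == &a[2]);
    assert ((char *) (pa + 1) - (char *) pa == (ptrdiff_t) sizeof (long));

    return (int) *(pa + 7);       /* a[7] */
}
```

Because pa points to long, adding 1 to it advances the address by sizeof (long) bytes.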
[Figure: memory diagrams of array a (at address 104) and the pointers p and q, traced through the following statements]
p = &a[2];
q = &a[3];
a[2] = 62;
a[3] = *p;
p += 2;
EFFECT
Take whatever ip points at
Add 1 to it
Assign the result to y

This is equivalent to writing
(*ip) += 1
which in turn is equivalent to
*ip = *ip + 1
Thus, it increments whatever ip is pointing at by 1

++ * ip
    This is equivalent to writing ++ ( * ip )
    Thus, it increments whatever ip is pointing at by 1 and returns the
    incremented result

* ip ++
    This is equivalent to writing * ( ip ++ ) because * and ++ are both unary
    operators of equal precedence; these operators associate right to left.
    Thus, the expression returns the value in *ip and performs a post
    increment of the pointer ip

* ++ ip
    This is equivalent to writing * ( ++ ip ) because * and ++ are both unary
    operators of equal precedence.
    Thus, the expression increments the pointer ip and then dereferences it,
    returning the value stored at the new position

( * ip ) ++
    The return value of the expression is the original value of *ip; it
    performs a post increment of what ip points at
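The four expressions can be checked against a small array. This is only a sketch; the function increment_demo and the array z with its values are hypothetical.

```c
#include <assert.h>

/* hypothetical demonstration; returns 1 if every check holds */
int increment_demo (void)
{
    int z[3] = { 10, 20, 30 };
    int *ip = &z[0];
    int y;

    y = ++*ip;         /* increments z[0]; y == 11, ip unchanged */
    assert (y == 11 && z[0] == 11 && ip == &z[0]);

    y = *ip++;         /* y == z[0] == 11, then ip points at z[1] */
    assert (y == 11 && ip == &z[1]);

    y = *++ip;         /* ip moves to z[2] first; y == 30 */
    assert (y == 30 && ip == &z[2]);

    y = (*ip)++;       /* y == 30 (old value); z[2] becomes 31 */
    assert (y == 30 && z[2] == 31 && ip == &z[2]);

    return 1;
}
```

Only the two forms without parentheses around *ip move the pointer; the other two change the object it points at.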
+=
ip
(pre)++
lvalue <-- *ip (dereference ip)
*
ip
*
tmp <-- ip
tmp2 <-- ip + 1
schedule store of tmp2 to ip
return tmp expression result
(post)++
ip
*
tmp <-- ip + 1
schedule store of tmp to ip
return tmp expression result
(pre)++
ip
REAL X;
EBCDIC ARRAY C[0:0];
C pointer example
double x;
char c;
double a[100];
char b[80];
POINTER PA;
POINTER PB;
double *pa;
char *pb;
PA := POINTER(A[0],48);
PB := B[10];
pa = a;
pb = &b[10];
A[0] := 10.0;
REPLACE B[10] BY "f";
a[0] = 10.0;
b[10] = 'f';
X := REAL(PA, 6);
REPLACE C[0] BY PB;
% PA becomes a character pointer!
PA := PA + 30;
PB := PB-5; % PB := B[5];
REPLACE PA BY 100.0;
REPLACE PB BY "A";
x = *pa;
c = *pb;
pa = pa + 5;
pb = pb - 5;
*pa = 100.0;
*pb = 'A';
DECLARATION      INTERPRETATION
char             character
char *           pointer to character
const char       constant character
const char *
char const *
char * const
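As a hedged illustration of how the position of const changes what may be modified, consider the sketch below. The function const_demo and its variables are hypothetical; the commented-out lines are the ones a compiler would reject.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical demonstration of where const may appear */
int const_demo (void)
{
    char buf[] = "ab";
    char other[] = "cd";
    const char *p = buf;     /* pointer to constant character */
    char const *q = buf;     /* same type as p, spelled differently */
    char * const r = buf;    /* constant pointer to character */

    p = other;               /* legal: p itself may change */
    /* *p = 'x';                illegal: *p is read-only through p */

    *r = 'x';                /* legal: the character may change */
    /* r = other;               illegal: r itself is constant */

    assert (q == buf);
    return *p == 'c' && *q == 'x';
}
```

Reading a declaration from the name outward helps: r is a const pointer to char, while p is a pointer to const char.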
/**
**
**
**
**
**/
char
void
{
K&R claim that the declarations char s[] and char *s are equivalent when
used to declare formal parameters to a function
If one ignores the const-ness of these declarations, then this is true
Consider the following example
/* &a[10] */
/* legal still */
ip++;
/*NOT LEGAL*/
a = ip;
/*NOT LEGAL*/
a++;
Array names cannot be modified
Pointers, however, may point to any position within the array
Thus the following declarations have the following semantics
DECLARATION      INTERPRETATION
char s[]         declares that s is an array of char
char * const s   declares s to be a constant pointer to char
char *s          declares that s is a pointer to char
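The difference between an array name and a pointer, and the way a char s[] parameter behaves like a pointer, can be sketched as follows. The names second_char and array_vs_pointer_demo are hypothetical.

```c
#include <assert.h>

/* in a parameter list, char s[] is treated as char *s */
char second_char (char s[])
{
    s++;                       /* legal: s is really a pointer here */
    return *s;
}

/* hypothetical caller: returns 1 if every check holds */
int array_vs_pointer_demo (void)
{
    char a[10] = "hello";
    char *ip = a;

    ip++;                      /* legal: a pointer may move within a */
    /* a++;                       NOT LEGAL: array names cannot be modified */
    /* a = ip;                    NOT LEGAL */

    assert (sizeof a == 10);   /* a is a real array in this scope */
    return second_char (a) == 'e' && ip == &a[1];
}
```

Inside second_char the increment is allowed because the parameter is a copy of the address of a[0], not the array itself.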
strlen("hello world\n");
read-only string
literal
call strlen
n
"hello world\n"
/**
 ** Header: <string.h>
 ** Name:   strcpy
 ** Copies string 'ct' over to string 's',
 ** including '\0'
 ** Input string s is the return value
 **/
char *strcpy (char *s, const char *ct)
{
    /* need to save return value */
    char *result = s;
    while (*s++ = *ct++)
        continue;
    return result;
}
s[i] = '\0';
return s;
lvalue4 = *lvalue2
return lvalue4
*
lvalue2 = ct
tmp2 = ct+1
schedule store of
tmp2 to ct
return lvalue2
lvalue1 = s
tmp1 = s+1
schedule store of tmp1 to s
return lvalue1
(post)++
(post)++
ct
The object to which ct points is copied to the object to which s points
After copying *ct to *s, pointers s and ct are incremented
The value copied from *ct to *s is the value of the expression
This value (when it becomes '\0') terminates the while loop
One should notice that strcpy does not allocate space for the target string
It assumes that the caller passes in a destination buffer large enough to hold the copy
#include <stdio.h>
#include <string.h>
void foo (void)
{
char *s;
char buf[100];
/* this will probably cause a crash */
s = strcpy (s, "Hello world\n");
printf ("s = '%s'\n", s);
/* correct usage */
strcpy (buf, "Hello world\n");
printf ("%s", buf);
/* also correct */
s = buf + strlen (buf);
strcpy (s, "Hello again (second line)\n");
printf ("%s", s);
}
return s1;
Notice that strncpy always writes exactly n characters, regardless of how long the
string to be copied is
The function strncpy should be used when you are not sure how many
characters are in the source string and you are concerned about overrunning your
buffer
#include <string.h>
/**
** Assume that the string src is valid
**/
void foo (const char *src)
{
char buf[100], buf2[10], *s1, *s2;
/* always safe */
strncpy (buf, src, 100);
/* might not print */
printf ("%s", buf);
/* might cause a crash */
strcpy (buf, src);
s1 = buf;
strcpy (s1, "Hello world.\n");
s2 = buf + strlen ("Hello ");
printf ("%s", s2);
/*works fine*/
}
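The "might not print" case arises when the source has at least as many characters as the count passed to strncpy, so that no terminating '\0' is written. One common way to guard against this is sketched below; the helper copy_bounded and its demo are hypothetical and not part of <string.h>.

```c
#include <assert.h>
#include <string.h>

/*
 * copy_bounded is a hypothetical helper: it copies at most size - 1
 * characters of src into dst and always null terminates dst.
 */
char *copy_bounded (char *dst, const char *src, size_t size)
{
    assert (dst != NULL && src != NULL && size > 0);
    strncpy (dst, src, size - 1);  /* may leave dst unterminated ... */
    dst[size - 1] = '\0';          /* ... so terminate it explicitly */
    return dst;
}

/* hypothetical caller: returns 1 if the copy was truncated as expected */
int copy_bounded_demo (void)
{
    char buf2[10];

    copy_bounded (buf2, "Hello world.\n", sizeof buf2);
    return strcmp (buf2, "Hello wor") == 0;
}
```

The result is silently truncated, so this approach suits cases where losing the tail of the string is acceptable.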
return s1;
It should be noted that we can implement strcat using the strlen function
#include <string.h>
char *strcat (char *s1, const char *s2)
{
    char *s = s1 + strlen (s1);
    /* copy s2 starting at the end of s1 */
    while (*s++ = *s2++)
        continue;
    return s1;
}
We can also implement strcat using both strlen and strcpy
#include <string.h>
char *strcat (char *s1, const char *s2)
{
    /* copy s2 starting at the end of s1 */
    strcpy (s1 + strlen (s1), s2);
    return s1;
}
Many C implementations inline the code for strlen, strcpy, and strcat
This last implementation may be just as efficient as the first two
Function strncpy is required to pad the destination with null characters when the
source string is shorter than n
this makes it always write exactly n characters
Function strncat always appends a null character to the result
this ensures that strings produced by strncat can always be printed
it is different from strncpy in that strncpy does not append the
null character when the source string has n or more characters
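These three behaviours can be observed directly. The sketch below fills a buffer with 'x' first so that the bytes strncpy writes (or leaves alone) are visible; the function strn_demo and its buffers are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* hypothetical demonstration; returns 1 if every check holds */
int strn_demo (void)
{
    char a[8];
    char b[8] = "ab";

    /* short source: strncpy pads the rest of the n bytes with '\0' */
    memset (a, 'x', sizeof a);
    strncpy (a, "hi", 5);
    assert (a[2] == '\0' && a[4] == '\0' && a[5] == 'x');

    /* long source: exactly n bytes are copied and no '\0' is added */
    memset (a, 'x', sizeof a);
    strncpy (a, "Hello", 4);
    assert (memcmp (a, "Hellxxxx", 8) == 0);

    /* strncat appends at most n characters and then always a '\0' */
    strncat (b, "cdefgh", 3);
    return strcmp (b, "abcde") == 0;
}
```

Note that strncat may therefore write n + 1 bytes, which the destination must have room for.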
/*
* null terminate the result
*/
*s = '\0';
(buf,
("s1:
("s2:
("s3:
tmp5 = lvalue1
schedule store
lvalue1 to c1
return tmp5
tmp6 = lvalue2
schedule store
lvalue2 to c2
return tmp6
==
=
lvalue1 = *tmp1
return lvalue1
c1
lvalue2 = *tmp3
return lvalue2
c2
tmp1 = s1
tmp2 = s1 + 1
schedule store
tmp2 to s1
++ (post) return tmp1
s1
tmp3 = s2
tmp4 = s2 + 1
schedule store
tmp4 to s2
++ (post) return tmp3
s2
#include <ctype.h>
#define upper(c) (islower(c) ? toupper(c) : (c))
int strcasecmp (const char *s1, const char *s2)
{
unsigned char c1, c2;
for ( ;; ) {
/*infinite loop*/
c1 = *s1++;
c2 = *s2++;
/*
* The expression c1 = upper(*s1++)
* will not work. The expression
* c1 = toupper(*s1++) will also probably
* not work, even if toupper left
* non-lower case letter alone. Explain.
*/
c1 = upper (c1);
c2 = upper (c2);
One uses this function to locate the first occurrence (the one having the lowest
subscript) of a character in a null-terminated string
A search failure returns a null pointer
9-58
Using strchr
#include <stdio.h>
#include <string.h>
#include <assert.h>
void foo (void)
{
const char *s = "Hello, out, there.\n";
char *t;
printf ("s: %s", s);
t = strchr (s, ',');
assert (t != NULL);
printf ("t: %s", t); /* ", out, there.\n" */
t = strchr (t + 1, ',');
assert (t != NULL);
printf ("t: %s", t); /* ", there.\n" */
assert (strchr (t + 1, ',') == NULL);
}
If two pointers point to different elements in the same array, then subtracting
those two pointers gives the number of objects (of that pointer's type)
separating them (see K&R, p. 206), i.e. the distance between the two pointers
measured in units which equal the size of the pointer's base type
int a[100], *p1 = a, *p2 = &a[20];
int offset = p2 - p1;
Pointers p1 and p2 point to the 0th and 20th array elements respectively
Thus, offset is set to 20, the number of elements between positions 0 and 20
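The same subtraction can be wrapped in a small function. The sketch uses ptrdiff_t from <stddef.h>, the type the language defines for the result of subtracting two pointers; the names elem_distance and distance_demo are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical helper: number of elements from first up to last */
ptrdiff_t elem_distance (const int *first, const int *last)
{
    assert (first != NULL && last != NULL);
    return last - first;       /* counted in ints, not bytes */
}

/* hypothetical caller mirroring p1 = a and p2 = &a[20] */
ptrdiff_t distance_demo (void)
{
    int a[100];

    return elem_distance (a, &a[20]);
}
```

Both pointers must refer to elements of the same array (or one past its end) for the subtraction to be meaningful.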
Implementation of strcspn
#include <string.h>
size_t
strcspn(const char *s1, const char *s2)
{
const char *sc1, *sc2;
/*
*
*
*
*/
for
}
/* terminating nulls match */
return (sc1 - s1);
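Since the loop body did not survive in the listing above, here is a minimal sketch of one way such a function could be written, using the same nested-loop style as the other string routines in these notes. It is named my_strcspn (a hypothetical name) to avoid clashing with the library function, and it is not claimed to be the original implementation.

```c
#include <assert.h>
#include <stddef.h>

/* my_strcspn is a hypothetical name chosen to avoid clashing with <string.h> */
size_t my_strcspn (const char *s1, const char *s2)
{
    const char *sc1, *sc2;

    assert (s1 != NULL && s2 != NULL);
    for (sc1 = s1; *sc1 != '\0'; sc1++) {
        /* stop at the first s1 character that also occurs in s2 */
        for (sc2 = s2; *sc2 != '\0'; sc2++) {
            if (*sc1 == *sc2) {
                return (size_t) (sc1 - s1);
            }
        }
    }
    /* terminating nulls match */
    return (size_t) (sc1 - s1);
}
```

The result is the length of the initial segment of s1 that contains no character from s2, computed by pointer subtraction.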
Functions strchr and strcspn are useful when parsing null terminated
strings
Let us consider implementing a function which returns a pointer to the first
whitespace character within a string
#include <string.h>
/**
** Returns a pointer to the first whitespace
** character (either ' ', '\t', or '\n') in s
**
** A NULL return value implies that no
** whitespace exists within the string
**/
const char *
find_whitespace (const char *s)
{
size_t index;
index = strcspn (s, " \t\n");
if (s[index] == '\0') {
/* no whitespace */
return NULL;
}
return s + index;
}
Implementation of strpbrk
#include <string.h>
char *
strpbrk(const char *s1, const char *s2)
{
const char *sc1, *sc2;
for (sc1=s1; *sc1 != '\0'; sc1++) {
/* check each sc2 char to see if in sc1 */
for (sc2=s2; *sc2 != '\0'; sc2++) {
/* Is this sc2 char in sc1? */
if (*sc1 == *sc2) {
/* oops: lost our const-ness */
return (char *)sc1;
}
}
}
}
/* substring not found */
return NULL;
/* t points to the token "a" */
t = strtok (NULL, ",");  /* t points to the token "??b" */
t = strtok (NULL, "#,"); /* t points to the token "c" */
t = strtok (NULL, "?");  /* t is a null pointer */
What strtok does
The strtok function is an intricate function designed to help users parse a
null-terminated string into tokens
The user must specify a set of separators (e.g., whitespace)
Sequences of one or more separators occur between tokens
The strtok function conceptually stores where the pointer was last in a
static variable
Also, note that strtok writes into the search string s1 which you pass to it
If you don't want this to happen, you must first copy the string to temporary
storage
Also, strtok is not reentrant, i.e., it can only be used to parse one string at any
given time
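The advice about copying the string first can be sketched as follows. The helper count_tokens is hypothetical, and the fixed 128-byte buffer is an assumption made to keep the example short; longer strings are rejected rather than truncated.

```c
#include <assert.h>
#include <string.h>

/*
 * count_tokens is a hypothetical helper. It tokenizes a private copy
 * so that the caller's string is left untouched. Strings that do not
 * fit in the local buffer are rejected with -1.
 */
int count_tokens (const char *s, const char *separators)
{
    char copy[128];
    char *t;
    int n = 0;

    assert (s != NULL && separators != NULL);
    if (strlen (s) >= sizeof copy)
        return -1;
    strcpy (copy, s);                  /* strtok will write into copy */

    for (t = strtok (copy, separators); t != NULL;
         t = strtok (NULL, separators))
        n++;                           /* NULL continues the same string */
    return n;
}
```

Because strtok keeps its position in hidden static state, count_tokens must not be called while another strtok scan is still in progress.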
/*comment*/