
CHAPTER 9

Introduction to pointers and arrays; the <string.h> library

Reference: Brooks, Chapter 6


Arrays

An array is an ordered collection of contiguous objects, all of the same type.


The declaration char line[80] declares an array of 80 characters.
The first element of this array is at position 0
The last element is at position 79
All arrays in C are zero-based, that is, they start at position 0.
To access a particular element of this array, one uses the subscript operator:
line[10] is the 11th character of array line.
char msg[] = "Help!";

[Figure: msg in memory: 'H' 'e' 'l' 'p' '!' '\0']

This declares msg to be an array of char just large enough to hold the string
literal "Help!", including its terminating '\0'.
The compiler ensures that enough space for the entire string literal is allocated.
Arrays are aggregate types, that is, they are built on top of simpler types.
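
As a small illustration of the two declarations above, the following sketch reports the sizes the compiler chooses. The function names line_length and msg_size are made up for this example and are not part of the notes.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements in char line[80]. */
size_t line_length(void)
{
    char line[80];

    return sizeof line / sizeof line[0];
}

/* Hypothetical helper: size the compiler picks for msg, counting the '\0'. */
size_t msg_size(void)
{
    char msg[] = "Help!";

    return sizeof msg;
}
```

The second function counts six bytes because the string literal carries a terminating '\0' after its five visible characters.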

Copyright (c) 1999 by Robert C. Carden IV, Ph.D.


9/27/2015



Array declarations
Before we can use an array in a C program we must first declare it
The general syntax for an array declaration is as follows:
<type-specifier> array_name [ <size> ]

The <type-specifier> describes the type of each array element


The array_name is an identifier specifying the name of the array
The <size> in the square brackets is the number of elements in the array
This number must be a constant integer expression

The array is a contiguous block of <type-specifier> objects


The first object is array_name[0]
The last object is array_name[ <size> - 1 ]
Examples of array declarations are
int    x[8];
double num[100];
char   buffer[80];
float  num2[ 10*200 + 30 ];

#define ARRAY_SIZE 100
/* enum { ARRAY_SIZE = 100 }; */
double num3[ARRAY_SIZE];
double num4[ARRAY_SIZE*30];
Examples of illegal array declarations (in ANSI C89; C99 later allowed
variable-length arrays inside functions) are
int i = 100;
float numbers[i];          /*illegal*/
const int j = 200;
double xxx[3*j + 10];      /*illegal*/



Example
#include <ctype.h>
/**
** atoi: convert s to integer
**    -- handles negative numbers
**/
int atoi (const char s[])
{
    int i, n, sign = 1;

    for (i = 0; isspace (s[i]); i++)
        continue;                       /* skip white space */
    if (s[i] == '-' || s[i] == '+')
        /* determine sign and then skip it */
        sign = s[i++] == '-' ? -1 : 1;
    for (n = 0; isdigit (s[i]); i++)
        n = 10 * n + (s[i] - '0');
    return sign * n;
}



Array Initialization
An array may be initialized by following its declaration with a list of initializers
enclosed in braces and separated by commas
const int month_days[] = { 31, 28, 31, 30, 31, 30,
                           31, 31, 30, 31, 30, 31 };

When the size of the array is omitted (as in this example)...
the compiler will compute the length of the array
it does so by counting the number of initializers
there are 12 in this example
If there are fewer initializers for an array than the number specified...
the missing elements will be set to 0 for external, static, and automatic
variables alike
It is an error to have too many initializers
int foo[5]={1,2,3,4};      /* foo[4]==0 */
int bar[3]={1,2,3,4};      /* compile-time error */

In C89 there is no way to specify repetition of an initializer
Likewise, there is no way in C89 to initialize an element in the middle of an array
without supplying all of the preceding values as well (C99 later added designated
initializers for this purpose)
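
The initialization rules above can be checked with a short sketch. The helper names last_of_partial and month_count are assumptions made for this illustration only.

```c
#include <stddef.h>

/* Hypothetical helper: returns the element the initializer list left out. */
int last_of_partial(void)
{
    int foo[5] = {1, 2, 3, 4};    /* automatic array, four initializers */

    return foo[4];                /* the missing element is set to 0 */
}

/* Hypothetical helper: element count the compiler computes for month_days. */
int month_count(void)
{
    static const int month_days[] = { 31, 28, 31, 30, 31, 30,
                                      31, 31, 30, 31, 30, 31 };

    return (int) (sizeof month_days / sizeof month_days[0]);
}
```

Dividing the size of the whole array by the size of one element is the usual way to recover a length that the compiler computed from the initializer list.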

Character arrays

Character arrays are a special case...


a string literal may be used as an initializer
char pattern[] = "Hello!";
is a shorthand for
char pattern[] =
{'H', 'e', 'l', 'l', 'o', '!', '\0'};

this is not the same as...
const char *pattern = "Hello!";    /* READ-ONLY */


Example -- A reverse Polish calculator
Reference: K&R, Chapter 4 (4.3 - 4.5)

We wish to implement a calculator program that provides the
operators +, -, *, and /.
We will use reverse Polish notation because it is easier to
implement
The following table gives infix operations and the corresponding
reverse Polish expression
infix expression            reverse Polish expression
(1 - 2) * (4 + 5)           1 2 - 4 5 + *
2 - 3 - 4 * 5               2 3 - 4 5 * -
2 * 3 - 4 * 5 - 6 * 7       2 3 * 4 5 * - 6 7 * -
(1 - 3)/(4 * (7 + 2))       1 3 - 4 7 2 + * /

In reverse Polish notation, parentheses are never needed as
long as we know the number of operands each operator
requires
To implement, each operand is pushed onto a stack
When we encounter an operator...
the appropriate number of operands are popped from the
stack
the operation is performed
the result is pushed back onto the stack
In the first reverse Polish string in the table...
push 1, push 2
pop 2 and pop 1; subtract (result is -1); push -1
push 4, push 5
pop 5 and pop 4; add (result is 9); push 9
pop 9 and pop -1; multiply (result is -9); push -9
When the end of line is encountered, we pop the top of stack
and print the result
In this case the result would be -9



Example -- A reverse Polish calculator (2)

The following pseudocode describes the structure and
functionality of the algorithm that we will implement
while ( next operator or operand is not end-of-file indicator )
    if ( number )
        push it
    else if ( operator )
        pop operands
        do operation
        push result
    else if ( newline )
        pop and print top of stack
    else
        error

The operations to push and pop a stack are trivial


We will implement them as a separate module because we wish
to include error checking



Example -- A reverse Polish calculator (3)

For now we will think of the program residing within one source
file
Later, we will discuss how to split it up into two or more source
files
The program, if written as one source file, will look something
like this
#include's
#define's
function prototype declarations for main
main () { ... }
external variables for push and pop
void push (double f) { ... }
double pop (void) { ... }
int getop (char s[]) { ... }
routines called by getop

Includes and function declarations for main program driver


#include <stdio.h>
#include <stdlib.h> /* ansi: for atof() */
/* Constants for later use */
/* max size of operand or operator */
#define MAXOP 100
/* signal that a number was found */
#define NUMBER '0'
/* Forward declarations */
int getop (char s[]);
void push (double f);
double pop (void);


Example -- A reverse Polish calculator (4) -- main program
/* reverse Polish calculator */
int main (void)
{
    int type;
    double op2;
    char s[MAXOP];

    while ((type = getop (s)) != EOF) {
        switch (type) {
        case NUMBER:
            push (atof (s));
            break;
        case '+':
            push (pop () + pop ());
            break;
        case '*':
            push (pop () * pop ());
            break;
        case '-':
            /* not push(pop() - pop()) */
            /* or  push(-pop() + pop()) */
            /* operand order matters, and the order in which the two */
            /* calls to pop are evaluated is not specified */
            op2 = pop ();
            push (pop () - op2);
            break;
        case '/':
            op2 = pop ();
            if (op2 != 0)
                push (pop () / op2);
            else
                printf ("error: zero divisor\n");
            break;
        case '\n':
            printf ("\t%.8g\n", pop ());
            break;
        default:
            printf ("error: ");
            printf ("unknown command %s\n", s);
            break;
        }
    }
    return 0;
}



Stack manipulating functions
/* maximum depth of val stack */
#define MAXVAL 100
/* global: next free stack position */
static int sp = 0;
/* global: value stack */
static double val[MAXVAL];
/**
** push: push f onto value stack
**/
void push (double f)
{
    if (sp < MAXVAL)
        val[sp++] = f;
    else {
        printf ("error: stack full -- ");
        printf ("cannot push %g\n", f);
    }
}
/**
** pop: pop and return top value from stack
**/
double pop (void)
{
    if (sp > 0)
        return val[--sp];
    else
        printf ("error: stack empty\n");
    /* only on error would we get this far */
    return 0.0;
}



Operator and operand evaluator

A variable is external if it is declared outside of functions


Functions push and pop share data
However, it is better for main not to know about this data
Thus, we declare the stack and stack index to be static
Consider now the function that fetches the next operator or
operand -- getop
The basic task is as follows
skip blanks and tabs
if the next character is not a digit or decimal point, return it
otherwise, collect a string of digits, including possibly a
decimal point, and return NUMBER (signal to indicate that a
number was detected)

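[Figure: state diagram for getop. From Start, ' ' or '\t' is skipped; a digit
begins a run of digits, optionally followed by . and more digits; any character
that is not a ' ', '\t', digit, or . is returned as is]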



Operator and operand evaluator (2)
#include <stdio.h>      /* for EOF */
#include <ctype.h>
int getch (void);
void ungetch (int);
/**
** getop: get next operator or numeric operand
**    -- if it is an operator, return its char value
**    -- if it is a number, return NUMBER
**    -- in both cases, copy input string into s[]
**/
int getop (char s[])
{
    int i, c;

    /* skip blanks and tabs only; isspace would also swallow the */
    /* '\n' that main needs in order to print the result */
    while ((s[0] = c = getch ()) == ' ' || c == '\t')
        continue;
    s[1] = '\0';
    if (!isdigit (c) && c != '.')
        return c;       /* not a number--maybe '\n' or +, etc.*/
    i = 0;
    if (isdigit (c))    /*collect integer part*/
        /* while (isdigit (s[++i] = c = getch ())) */
        do s[++i] = c = getch ();
        while (isdigit (c));
    if (c == '.')       /*collect fractional part*/
        while (s[++i] = c = getch (), isdigit (c))
            continue;
    s[i] = '\0';        /*overwrite last character*/
    if (c != EOF)
        /*return last character to input stream*/
        ungetch (c);
    return NUMBER;
}



Operator and operand evaluator (3)

In the previous example, two routines were used


getch and ungetch
Often, when we are reading in input, we need to look ahead one
or more characters to determine whether what we have read so
far matches what we are looking for
If we read too much, then we need to return the characters that
we did not use
This particular program always reads one character too many in
the case that it is recognizing a number
We handle the necessity of returning a character to the input
stream by implementing a pair of cooperating functions
getch and ungetch
Function getch fetches the next input character either from a
buffer or from the input stream
Function ungetch pushes the character into the buffer so that
a future call to getch will fetch it



Operator and operand evaluator (4)
#include <stdio.h>
/* named BUFSIZE because <stdio.h> already defines BUFSIZ */
#define BUFSIZE 100
/* buffer for ungetch */
static char buf[BUFSIZE];
/* next free position in buf */
static int bufp = 0;
/**
** getch: get a (possibly pushed back) character
**/
int getch (void)
{
    return bufp > 0 ? buf[--bufp] : getchar ();
}
/**
** ungetch: push character back on input
**/
void ungetch (int c)
{
    if (bufp >= BUFSIZE)
        printf ("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}

These two functions share bufp and buf


Function ungetch is actually more general than the standard
library function ungetc because it provides more than one
character pushback
Why don't these functions handle a pushed back EOF correctly?
How would you fix it?
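
One possible answer to the question above, offered as a sketch rather than the intended solution: a pushed back EOF is stored in a char, so it either changes value (where char is unsigned) or becomes indistinguishable from a real character (where char is signed). Keeping the pushback buffer as an array of int avoids both problems. The names getch_int, ungetch_int, pushback, and PUSHBACK_MAX are hypothetical.

```c
#include <stdio.h>

#define PUSHBACK_MAX 100                /* hypothetical name and size */

static int pushback[PUSHBACK_MAX];      /* int, so EOF is stored unchanged */
static int pushback_count = 0;          /* next free position in pushback */

/* Get a (possibly pushed back) character, or EOF. */
int getch_int(void)
{
    return pushback_count > 0 ? pushback[--pushback_count] : getchar();
}

/* Push a character (or EOF) back on input. */
void ungetch_int(int c)
{
    if (pushback_count >= PUSHBACK_MAX)
        printf("ungetch_int: too many characters\n");
    else
        pushback[pushback_count++] = c;
}
```

Values come back in the reverse of the order in which they were pushed, exactly as in the char version.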



Header files (1)

Let us now consider splitting the reverse Polish calculator
program into several source files
The following files will contain the following functions
file        contents
main.c      main()
stack.c     push(), pop() and their variables
getop.c     getop()
getch.c     getch(), ungetch()

The definitions and declarations are shared among files
We wish to centralize these declarations as much as possible
To do this, we will put all common declarations in a header file
which will be included by files as necessary
This brings up an important programming paradigm:
Whenever possible, avoid having more than one version of the truth



Header files (2)
calc.h
#define NUMBER '0'
void push (double f);
double pop (void);
int getop (char s[]);
int getch (void);
void ungetch (int);

main.c
#include <stdio.h>
#include <stdlib.h>
#include "calc.h"
#define MAXOP 100
int main (void) {...}

getop.c
#include <stdio.h>
#include <ctype.h>
#include "calc.h"
int getop (char s[]) {...}

stack.c
#include <stdio.h>
#include "calc.h"
#define MAXVAL 100
static int sp = 0;
static double val[MAXVAL];
void push (double f) {...}
double pop (void) {...}

getch.c
#include <stdio.h>
/* need calc.h declarations for consistency checking */
#include "calc.h"
#define BUFSIZE 100     /* not BUFSIZ, which <stdio.h> already defines */
static char buf[BUFSIZE];
static int bufp = 0;
int getch (void) {...}
void ungetch (int c) {...}



Arrays and pointers
Reference: Brooks, Chapter 6 (6.5 - 6.6); Kelley & Pohl, Chapter 6

C uses a flat, linear memory model


this does not imply that the target architecture must use a flat, linear memory
model
it simply implies that the compiler must make it look that way

[Figure: the compiler maps the programmer's model onto the machine memory model]



Typical situation

Any byte can be represented as a char
A pair of one byte cells may be represented by a short
Four adjacent bytes may be represented as a long
A pointer is a group of cells (usually 2 or 4 when these notes were written;
8 on typical 64-bit machines) that can hold an address

[Figure: memory cells, with pointer p holding the address of c]

The unary operator & gives the address of an object


p = &c;

Assigns the address of c to the variable p


Variable p points to c
Can apply & operator only to objects in memory
variables
array elements
Cannot apply & operator to objects which do not reside in main memory
expressions
constants
register variables
LEGAL

int x;
...
p = &x;
const int x;
...
p = &x;
int a[10];
...
p = &a[1];

ILLEGAL
register int x;
...
p = &x;
p = &(x+y);
p = &33;



Unary operator *

The unary operator * is the indirection (or dereferencing) operator
When applied to a pointer, it accesses the object to which that pointer points

void foo (void)
{
    int x = 1, y = 2, z[10];
    int *ip;        /* ip is a pointer to int */

    ip = &x;        /* ip now points to x */
    y = *ip;        /* y is now 1 */
    *ip = 0;        /* x is now 0 */
    ip = &z[0];     /* ip now points to z[0] */
}



Unary operator * (2)

The syntax of the declaration for a variable in C has been designed to mimic the
syntax of the expressions in which it might occur
DECLARATION                     USAGE
double a[10];                   for (i = 0; i < 10; i++)
                                    a[i] = 0;
double *ap = a;                 *ap = 10;
double atof (const char *);     void foo (void) {
                                    double x = atof ("35.7");
                                }

The pointer declaration gives the name of the object and the type of thing to
which it points
In general, pointers are constrained to point to the objects for which they are
declared
Pointers to void may point to anything but they may not be dereferenced (more
later)
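
As a brief sketch of the last point, a pointer to void must be converted to a pointer to a real type before the object can be reached. The helper name read_long is hypothetical, and the caller is assumed to pass the address of a long.

```c
/* Hypothetical helper: reads a long through a generic pointer.
   Assumes p really does point to a long. */
long read_long(const void *p)
{
    /* Writing *p here would not compile: void has no size or value.
       Convert to the real type first, then dereference. */
    const long *lp = p;

    return *lp;
}
```

The conversion from const void * to const long * needs no cast in C, but the responsibility for getting the type right rests entirely with the programmer.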



More on pointers and their syntax

Unary operators * and & have the same precedence as other unary operators
higher than binary operators
lower than [] or () operators
These precedence rules also apply within the declarators

Pointers may be assigned to each other

void foo (void)
{
    int *p1, *p2, x = 0, a[10];

    p1 = &x;
    p2 = p1;
    *p1 = 10;
    printf ("p2 points to %d\n", *p2); /*10*/

    p1 = &a[0];
    a[0] = 100;
    printf ("p1 points to %d\n", *p1); /*100*/
    printf ("p2 points to %d\n", *p2); /*10*/
}



Pointers and function arguments

C passes arguments to functions by value, i.e. each parameter receives a copy
of its argument
The following is a wrong implementation of a function to swap two integers

/*WRONG*/
void swap (int x, int y)
{
int tmp = x;
x = y;
y = tmp;
}
void foo (void)
{
int a = 10, b = 20;
swap (a, b);
printf ("a=%d, b=%d\n", a, b);
}
[Figure: activation frames for the call swap (a, b). On entry, x and y hold
copies of a = 10 and b = 20. When swap completes, x = 20, y = 10, and tmp = 10,
but these are swap's local copies. After swap exits, a = 10 and b = 20:
no change to a or b!]

This implementation of swap fails because its parameters are copies of the
arguments; exchanging the copies leaves a and b untouched
A correct implementation needs pointers to the objects being swapped to be
passed in
We now reimplement swap by passing in pointers to the objects to be swapped

void swap (int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}
void foo (void)
{
    int a = 10, b = 20;
    swap (&a, &b);
}

Activation frames of call to swap
[Figure: activation frames for the call swap (&a, &b), where x points to a and
y points to b. Initially a = 10 and b = 20. After tmp = *x, tmp = 10. After
*x = *y, a = 20. After *y = tmp, b = 10. On exit from swap, a = 20 and b = 10:
success]
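
The same technique lets a function hand back more than one result. The sketch below is an illustration only; the name div_rem and its parameters are assumptions, and the caller is assumed to pass a nonzero divisor and two valid pointers.

```c
/* Hypothetical helper: returns quotient and remainder through pointers.
   Assumes den != 0 and that quot and rem point to valid ints. */
void div_rem(int num, int den, int *quot, int *rem)
{
    *quot = num / den;      /* written into the caller's object */
    *rem  = num % den;
}
```

A caller would write div_rem (17, 5, &q, &r), passing the addresses of its own variables just as foo passes &a and &b to swap.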



Application -- the scanf function

The standard library function scanf() may be used to read in input from the
terminal
Like printf(), it expects a format string followed by a series of arguments
In many ways, the format is identical to that of printf
However, because scanf expects a pointer to each object it fills in, there are some
areas where it is distinctly different
format      required variable type      object
%d          int *                       int
%u          unsigned int *              unsigned int
%hd         short *                     short
%ld         long *                      long
%f          float *                     float
%lf         double *                    double
%Lf         long double *               long double
%c          char *                      character
%s          char *                      char array

example
void foo (void)
{
    int i;

    scanf("%d", &i);    /* format string specifies that &i is int * */
                        /* scanf copies the integer over to *(&i), i.e. i */
}



Application -- the scanf function (2)
void fubar (void)
{
    int *ip;

    scanf("%d", ip);    /* format string specifies that ip is int * */
    /*
     * Program will probably crash when scanf tries to write output into *ip.
     * The problem is that pointer ip is uninitialized when passed to scanf.
     */
}

The problem encountered in this example is that an invalid pointer is passed to
scanf
Because ip was never initialized to point to anything valid, it is currently
"pointing into space"
Programmers must make certain that pointers always point to something
valid before they are used

void bar()
{
    int x, *ip = &x;

    scanf("%d", ip);    /* specifies that ip is int * */
    /* Legal -- x will contain the new value because this is what ip is pointing to */
}
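
A defensive pattern that follows from this, shown as a sketch: give the scanf family a pointer to a real object and check how many conversions succeeded. The example uses sscanf (the string-reading sibling of scanf) so that it does not depend on terminal input; the name parse_int is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse one int from text.
   Returns 1 and stores the value in *out on success; returns 0 and
   leaves *out alone on failure. Assumes out points to a valid int. */
int parse_int(const char *text, int *out)
{
    int value;                          /* a real object for sscanf to fill */

    if (sscanf(text, "%d", &value) != 1)
        return 0;                       /* no conversion took place */
    *out = value;
    return 1;
}
```

The return value of sscanf is the number of items converted, so comparing it with the number of conversions requested detects bad input before the result is used.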



Printf versus scanf

The format strings for functions printf and scanf are similar in many ways
However, one may observe that scanf is much pickier about its arguments

format  object          allowable types for printf  allowable types for scanf
%d      int             char, short, int            int *
%hd     short           char, short, int            short *
%ld     long            long                        long *
%f      float           float, double               float *
%lf     double          float, double               double *
%Lf     long double     long double                 long double *
%g      floating point  float, double               N/A
%g      float           N/A                         float *

Why this subtle difference in parameter requirements?
printf and scanf both take a variable number of additional arguments
this is because the additional arguments (the ones specified by the format) fall
under the ellipsis of the function prototype, so they undergo the default
argument promotions
The printf function takes advantage of the fact that
char and short both promote to int
float promotes to double
However, scanf cannot take advantage of this because a pointer remains
unchanged
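
The promotion of float to double under an ellipsis can be observed directly with <stdarg.h>. The following is a minimal sketch; the function name sum_values is hypothetical, and callers are assumed to pass exactly count floating-point arguments.

```c
#include <stdarg.h>

/* Hypothetical variadic function: adds up count floating-point arguments.
   Assumes every additional argument is a float or a double. */
double sum_values(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);    /* a float argument arrives as double */
    va_end(ap);
    return total;
}
```

Because a float passed under the ellipsis has already been promoted, the function fetches double in every case, which is the same reason printf can use one conversion for both types.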



Printf versus scanf (2)
void foo()
{
    int x;

    scanf("%d", &x);
    /* Specifies that &x is int *
     * Legal -- x will contain new value */
    printf("x = %d\n", x);
    /* Specifies that x is int -- okay */

    scanf("%hd", &x);       /*trouble*/
    /* Specifies that &x is short *
     * Will work if shorts and ints are the same size
     * Will fail on a machine where shorts are different in size from ints,
     * e.g. 16 and 32 bits respectively, as is the case on an Apollo DN3000 */
    printf("x = %hd\n", x);
    /* Specifies that x is short
     * printf expects to find an int because it knows that its parameter,
     * even if it is short, will promote to int.
     * The input is then treated (converted) as if it were a short */
}



Printf versus scanf (3)

Pointers get passed in without change

void foo2 (void)
{
    float f; double d;

    scanf("%f", &f);        /* scanf expects float * */
    printf("f = %g\n", f);  /* printf expects double */
                            /* The float argument gets promoted */
    scanf("%g", &d);        /*trouble*/
    /* scanf expects float * but gets a double * argument
     * On most machines, doubles are larger than floats so this will
     * probably not work */
    printf("d = %g\n", d);  /* printf expects double */
    scanf("%lf", &d);       /*correct*/
                            /* scanf expects double * */
}
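
To make the matching rule concrete, here is a sketch in which each conversion is paired with a pointer of exactly the required type. It uses sscanf so that it can be run without terminal input; the name parse_pair is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads a short and a double from text.
   Returns 1 if both conversions succeeded, 0 otherwise. */
int parse_pair(const char *text, short *count, double *value)
{
    /* %hd pairs with short *, %lf pairs with double * */
    return sscanf(text, "%hd %lf", count, value) == 2;
}
```

Swapping either conversion for %d or %f would compile on many systems yet write the wrong number of bytes through the pointer.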



Printf versus scanf (4)

When these notes were written, most compilers did not check the format strings of
printf and scanf style functions for consistency with the arguments that the user provides
Some compilers had started doing this
GNU C -- version 2.0 and up -- does check them (its -Wformat warning) and it
helps prevent a lot of bugs
the Unisys A-Series C compiler does this type of checking as well
Missing arguments to either of these can wreak havoc
It is therefore critical that you understand exactly what you are doing with either
printf or scanf

Pointer declaration nuances


DECLARATION             VARIABLE    INTERPRETATION
int *x, y;              x:          pointer to int
                        y:          int
int *x, *y;             x:          pointer to int
                        y:          pointer to int
#define pInt int *      x:          pointer to int
pInt x, y;              y:          int

Explain why the last declaration of the table is equivalent to the first declaration
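
A sketch of the difference between the macro and a typedef may help with this exercise. The typedef name IntPtr and both function names are hypothetical; pInt is the macro from the table.

```c
#define pInt int *          /* textual substitution only */
typedef int *IntPtr;        /* hypothetical typedef: a real type name */

/* With the macro, only the first declarator becomes a pointer. */
int macro_pitfall(void)
{
    pInt p, q;              /* expands to: int *p, q;  so q is a plain int */

    q = 5;
    p = &q;
    return *p;
}

/* With the typedef, every declarator has the pointer type. */
int sum_through_pointers(int a, int b)
{
    IntPtr x = &a, y = &b;  /* both x and y are pointers to int */

    return *x + *y;
}
```

The preprocessor pastes text before the compiler parses the declaration, so the * binds to the first name only; a typedef is applied to each declarator in turn.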



Pointers and arrays

There is a close relationship between pointers and arrays in C


With many operations, arrays and pointers may be used interchangeably
Consider the following function
#include <string.h>
void reverse (char s[])
{
    int i, j;

    for (i = 0, j = (int) strlen (s) - 1; i < j; i++, j--) {
        register const char c = s[i];
        s[i] = s[j];
        s[j] = c;
    }
}

It may also be written as

#include <string.h>
void reverse (char *s)
{
    int i, j;

    for (i = 0, j = (int) strlen (s) - 1; i < j; i++, j--) {
        register const char c = s[i];
        s[i] = s[j];
        s[j] = c;
    }
}



Pointer arithmetic defined
DEFINITION
Let x be a pointer to some type t. Then for any integer i,
*(x + i)    is equivalent to    x[i]
&x[i]       is equivalent to    x + i

REMARK
This defines the subscripting operator [] as well as pointer arithmetic

long a[8];
[Figure: a[0] through a[7] in adjacent cells; a+1 points to a[1], a+2 to a[2],
and a+7 to a[7]]

a: in most expressions, behaves like a constant pointer to long (i.e. long * const)
a[1] == *(a+1)
a+1 = expression giving pointer to the next
element of the array
The C compiler must generate code so that pointer
arithmetic will work
a[i] == *(a+i)
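
The definition can be exercised with a short sketch. The helper names element_at and step_in_bytes are hypothetical; the second one measures how far x + 1 lies from x in bytes.

```c
#include <stddef.h>

/* Hypothetical helper: fetch element i using pointer arithmetic only.
   Assumes x points into an array with more than i elements. */
long element_at(const long *x, int i)
{
    return *(x + i);            /* the same object as x[i] */
}

/* Hypothetical helper: how many bytes x + 1 lies beyond x. */
size_t step_in_bytes(const long *x)
{
    return (size_t) ((const char *) (x + 1) - (const char *) x);
}
```

Adding 1 to a pointer to long advances it by sizeof (long) bytes, which is the code the compiler must generate for pointer arithmetic to work.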



Relationship between pointers and arrays
long a[8], *pa = a, *pa0 = &a[0];
[Figure: a[0] through a[7] in adjacent cells; pa points to a[0], pa+1 to a[1],
pa+2 to a[2], and pa+7 to a[7]]

The expression x = *pa is equivalent to writing x = a[0]
The expression pa+i yields a pointer i positions forward of pa; pa itself is
unchanged
The new pointer value depends on what the compiler perceives the base type of
the pointer to be
Adding 1 to a pointer pa implies that pa+1 points to the next object in the
array
Adding i to a pointer pa implies that pa+i points to the ith object beyond pa
in the array
These operations all work regardless of what type of array pa points to

KEY QUESTION
What type of object does pa point to?

The answer the compiler gives to this question determines how pointer arithmetic
works on it
*pa, *(pa + 0), and pa[0] are all equivalent



Pointers versus variables

A variable, in reality, is simply a reference to some memory location


A pointer variable, in reality, is a reference to some memory location containing
yet another reference to some other memory location
long x, a[7], *p, *q;
p = &x;
q = &a[4];
*p = 37;
*q = 94;

[Figure: assuming 4-byte longs, x is at address 100 and holds 37; a starts at
address 104, so a[4] is at address 120 and holds 94; p holds 100 and q holds 120]

p = &a[2];
q = &a[3];
a[2] = 62;
a[3] = *p;

[Figure: a[2] (address 112) and a[3] (address 116) both hold 62; p holds 112
and q holds 116]

p += 2;

[Figure: p now holds 120, the address of a[4]; q still holds 116]


Pointer usage
STATEMENT       EFFECT
y = *ip + 1     Take whatever ip points at
                Add 1 to it
                Assign the result to y

*ip += 1        This is equivalent to writing
                    (*ip) += 1
                which in turn is equivalent to
                    *ip = *ip + 1
                Thus, it increments whatever ip is
                pointing at by 1

++ * ip         This is equivalent to writing
                    ++ ( * ip )
                Thus, it increments whatever ip is
                pointing at by 1 and returns the
                incremented result

* ip ++         This is equivalent to writing
                    * ( ip ++ )
                because * and ++ are both unary
                operators of equal precedence; these
                operators associate right to left.
                Thus, the expression returns the value
                in *ip and performs a post increment of
                the pointer ip

* ++ ip         This is equivalent to writing
                    * ( ++ ip )
                because * and ++ are both unary
                operators of equal precedence.
                Thus, the expression increments the
                pointer ip and then dereferences and
                returns that new value

( * ip ) ++     The return value of the expression is
                the original value of *ip; it performs a
                post increment of what ip points at
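
The four increment forms in the table can be compared side by side. This is an illustrative sketch; the function name pointer_usage_demo and the two-element array it works on are assumptions.

```c
/* Hypothetical demo: applies each increment form to a fresh copy of
   {10, 20} and records the value of the expression in results[]. */
void pointer_usage_demo(int results[4])
{
    int data[2];
    int *ip;

    data[0] = 10; data[1] = 20; ip = data;
    results[0] = *ip++;         /* value of data[0]; ip then moves to data[1] */

    data[0] = 10; data[1] = 20; ip = data;
    results[1] = *++ip;         /* ip moves first; value of data[1] */

    data[0] = 10; data[1] = 20; ip = data;
    results[2] = (*ip)++;       /* old value of data[0]; data[0] becomes 11 */

    data[0] = 10; data[1] = 20; ip = data;
    results[3] = ++*ip;         /* data[0] becomes 11; new value is returned */
}
```

Only the first two forms move the pointer; the last two leave ip alone and change the int it points at.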



Pointer usage (2)

The following is the parse tree for the expression * ip += 1
(the root is listed first; each operand is indented under its operator)

+=              tmp <-- lvalue + 1
                schedule store of tmp to lvalue
                return tmp as expression result
    *           lvalue <-- *ip (dereference ip)
        ip
    1

The following is the parse tree for the expression ++ * ip

(pre)++         tmp <-- lvalue + 1
                schedule store of tmp to lvalue
                return tmp as expression result
    *           lvalue <-- *ip (dereference ip)
        ip



Pointer usage (3)

The following is the parse tree for the expression * ip ++
(the root is listed first; each operand is indented under its operator)

*               lvalue <-- *tmp (dereference tmp)
                return lvalue as expression result
    (post)++    tmp <-- ip
                tmp2 <-- ip + 1
                schedule store of tmp2 to ip
                return tmp as expression result
        ip

The following is the parse tree for the expression * ++ ip

*               lvalue <-- *tmp (dereference tmp)
                return lvalue as expression result
    (pre)++     tmp <-- ip + 1
                schedule store of tmp to ip
                return tmp as expression result
        ip



Algol pointers versus C pointers

Algol and C have radically different concepts of what pointers represent


In Algol, a pointer is a reference to an element within an array
one establishes the reference by assigning to the pointer the desired element
within the array
one references the array element by simply referencing the pointer
in this sense, the pointer and the array element are aliases
In C, a pointer is an actual address
one establishes the reference by taking the address of the object
one then references the object by explicitly dereferencing the pointer object

Algol pointer example               C pointer example
REAL X;                             double x;
EBCDIC ARRAY C[0:0];                char c;

REAL ARRAY A[0:99];                 double a[100];
EBCDIC ARRAY B[0:79];               char b[80];

POINTER PA;                         double *pa;
POINTER PB;                         char *pb;

PA := POINTER(A[0],48);             pa = a;
PB := B[10];                        pb = &b[10];

A[0] := 10.0;                       a[0] = 10.0;
REPLACE B[10] BY "f";               b[10] = 'f';

X := REAL(PA, 6);                   x = *pa;
REPLACE C[0] BY PB;                 c = *pb;
% PA becomes a character pointer!
PA := PA + 30;                      pa = pa + 5;
PB := PB-5; % PB := B[5];           pb = pb - 5;
REPLACE PA BY 100.0;                *pa = 100.0;
REPLACE PB BY "A";                  *pb = 'A';



Arrays, pointers, and const-ness
Reference: Rojiani, Chapter 11

We have seen how to declare pointers so far


Recall that the const qualifier specifies that the object is read-only
The const qualifier may be applied to pointers to specify that either
the object to which this pointer points is read-only
the pointer itself is read-only
The following table illustrates different ways in which a pointer to char may be
defined and what it means

DECLARATION             INTERPRETATION
char                    character
char *                  pointer to character
const char              constant character
const char *            pointer to constant character
char const *            pointer to constant character
char * const            constant pointer to character
const char * const      constant pointer to constant character
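
Two rows of the table are easy to confuse, so here is a sketch that uses one of each. The function names count_char and fill are hypothetical.

```c
#include <stddef.h>

/* text is a pointer to constant char: the pointer may move,
   but the characters may not be written through it. */
size_t count_char(const char *text, char ch)
{
    size_t n = 0;

    for (; *text != '\0'; text++)       /* changing the pointer is allowed */
        if (*text == ch)
            n++;
    /* *text = 'x';   would be rejected by the compiler */
    return n;
}

/* buf is a constant pointer to char: the characters may be written,
   but the pointer itself may not be changed. */
void fill(char * const buf, size_t len, char ch)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = ch;                    /* writing through buf is allowed */
    /* buf++;         would be rejected by the compiler */
}
```

Reading the declaration from the name outward helps: const to the left of the * protects the characters, const to the right of the * protects the pointer.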



Named constants

Pointers to objects are different from the objects themselves


If a pointer is constant, then it can only point to the object to which it is
initialized
If a pointer points to constant objects...
the pointer may be changed
however, none of the objects to which the pointer points may be changed
through that pointer

const int model = 90;


const int v[] = { 1, 2, 3, 4 };
int xyz = 100;
void foo (void)
{
int *model_p;
const int *c_model_p;
/* error: model is constant */
model = 200;
model++;
/* warning: const-ness of model is lost */
model_p = &model;
/* problem with above statement */
*model_p = 32; /*legal by itself*/
/* legal */
c_model_p = &model;
c_model_p = &xyz;
c_model_p = model_p;
/* warning: const-ness of model_p is lost */
model_p = c_model_p;
/* error: c_model_p points to const int */
*c_model_p = 192;
/* error: v[] is an array of const int */
v[0] = 96; v[1] = 34;
}



Constant function arguments

Users often declare formal parameters to functions to be const


This is particularly common when declaring pointer parameters
Specifying an argument as const simply asserts that the function will not
change that argument internally

/**
** The first argument to string copy is the
** string being copied to. The second argument,
** though, is used for reference. None of its
** characters will be modified.
**/
char *strcpy (char *, const char *);
void foo (char *s)
{
    /*
     * String literal is read-only. Thus, it
     * makes sense to assign the pointer to it
     * to a pointer to constant char
     */
    const char *hello = "Hello world";
    /*
     * Declaring an array x makes x behave like a value
     * of type char * const, i.e. a constant pointer
     * to char. The characters may be changed,
     * but the pointer itself can never be
     * changed.
     */
    char x[100];

    /* legal: parameters match exactly */
    strcpy (x, hello);
    strcpy (x, s);      /* legal */
    /* const-ness of first argument is being lost
     * by this call and should generate a warning.
     */
    strcpy (hello, "Goodbye");
}



Array names versus constant pointers

K&R claim that the declarations char s[] and char *s are equivalent when
used to declare formal parameters to a function
If one ignores the const-ness of these declarations, then this is true
Consider the following example

void foo (void)
{
    int a[100];
    int *ip;

    ip = a;         /* legal */
    ip = a + 10;    /* also legal: &a[10] */
    ip++;           /* legal still */
    a = ip;         /*NOT LEGAL*/
    a++;            /*NOT LEGAL*/
}
Array names cannot be modified
Pointers, however, may point to any position within the array
Thus the following declarations have the following semantics

DECLARATION         INTERPRETATION
char s[]            declares that s is an array of char
char * const s      declares s to be a constant pointer to char
char *s             declares that s is a pointer to char



The strlen function from <string.h>
/**
** Standard library function from <string.h> to
** calculate the length of a string
** (the standard version returns size_t, not int)
**/
#include <stddef.h>
size_t strlen (const char *s)
{
    size_t n;

    for (n = 0; *s != '\0'; s++)
        n++;
    return n;
}

The pointer s is incremented, i.e. it is changed so that it points to the next
element in the character array
Function strlen modifies its own local copy of the pointer
It does not change the contents of the string

strlen("hello world\n");

[Figure: the call passes a pointer to the read-only string literal
"hello world\n"; strlen has its own copy of s and a local counter n]

Incrementing s changes strlen's local copy, pushing it to point to the next
character

Implementation of strcpy()

A number of useful functions are available in the header <string.h>
These functions perform operations on C strings
Care must be taken to ensure that valid strings (i.e., null-terminated arrays
of char) are passed to these functions
We will show how to implement a few of these functions to give a flavor of them

/**
** Header: <string.h>
** Name:   strcpy
** Copies string 'ct' over to string 's',
** including '\0'
** Input string s is the return value
**/
char *strcpy (char *s, const char *ct)
{
    /* need to save return value */
    char *result = s;
    while (*s++ = *ct++)
        continue;
    return result;
}

The code fragment while (*s++ = *ct++); is typical C style
It may be rewritten as follows

char *strcpy (char *s, const char *ct)
{
    int i;
    for (i = 0; ct[i] != '\0'; i++)
        s[i] = ct[i];
    s[i] = '\0';    /* add null terminator */
    return s;
}

Implementation of strcpy() -- 2

Consider the code fragment while (*s++ = *ct++);
The expression *s++ = *ct++ does all of the work

[Figure: parse tree for the expression *s++ = *ct++, read from the leaves up]

(post)++ applied to s:
    lvalue1 = s
    tmp1 = s+1
    schedule store of tmp1 to s
    return lvalue1
(post)++ applied to ct:
    lvalue2 = ct
    tmp2 = ct+1
    schedule store of tmp2 to ct
    return lvalue2
* applied on the s side:
    lvalue3 = *lvalue1
    return lvalue3
* applied on the ct side:
    lvalue4 = *lvalue2
    return lvalue4
= at the root:
    tmp3 = lvalue4
    schedule store of tmp3 to lvalue3
    return tmp3 as value of expression

The object to which ct points is copied over to the object to which s points
After copying *ct to *s, pointers s and ct are incremented
The value copied from *ct to *s is the value of the expression
This value (when it becomes '\0') terminates the while loop
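
The same steps can be spelled out one statement at a time. This is only a sketch for comparison with the one-line loop; the name copy_expanded and the temporary variable value are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name. Has the same effect as
 *     while (*s++ = *ct++) continue;
 * with each step of the expression written separately. */
char *copy_expanded(char *s, const char *ct)
{
    char *result = s;
    char value;
    assert(s != NULL && ct != NULL);
    do {
        value = *ct;    /* fetch the object ct points to */
        *s = value;     /* store it where s points */
        s++;            /* the two post-increments */
        ct++;
    } while (value != '\0');    /* the copied value controls the loop */
    return result;
}
```

Because the test happens after the store, the terminating '\0' is copied before the loop ends, just as in the concise version.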

Using strcpy()

One should notice that strcpy does not allocate space for the target string
It assumes that the caller passes in a pointer to storage large enough to hold
the copied string

#include <stdio.h>
#include <string.h>
void foo (void)
{
    char *s;
    char buf[100];
    /* this will probably cause a crash */
    s = strcpy (s, "Hello world\n");
    printf ("s = '%s'\n", s);
    /* correct usage */
    strcpy (buf, "Hello world\n");
    printf (buf);
    /* also correct */
    s = buf + strlen (buf);
    strcpy (s, "Hello again (second line)\n");
    printf ("%s", s);
    /* print out buf which we have built up */
    printf ("**************\n");
    printf ("%s", buf);
}

Function printf expects as its first parameter a format string
int printf(const char *format, ...);

An array of characters can serve as that format string

The strncpy function
Description in the ANSI standard
Synopsis
#include <string.h>
char *
strncpy(char *s1, const char *s2, size_t n);
Description
The strncpy function copies not more than n characters (characters that follow a
null character are not copied) from the array pointed to by s2 to the array pointed
to by s1. If copying takes place between objects that overlap, the behavior is
undefined.
If the array pointed to by s2 is a string that is shorter than n characters, null
characters are appended to the copy in the array pointed to by s1, until n characters
in all have been written.
Returns
The strncpy function returns the value of s1.
Implementation of strncpy
#include <string.h>
char *
strncpy (char *s1, const char *s2, size_t n)
{
    char *s = s1;
    for (; n > 0 && *s2 != '\0'; n--)
        *s++ = *s2++;
    while (n-- > 0)
        *s++ = '\0';
    return s1;
}

The strncpy function (2)

Notice that strncpy always executes n steps, regardless of how long the string
to be copied is
The function strncpy should be used when you are not sure how many
characters are in the source string and you are concerned about overrunning
your buffer

#include <stdio.h>
#include <string.h>
/**
** Assume that the string src is valid
**/
void foo (const char *src)
{
    char buf[100], buf2[10], *s1, *s2;
    /* always safe: writes at most 100 characters */
    strncpy (buf, src, 100);
    /* might not print: buf may lack a null terminator */
    printf ("%s", buf);
    /* might cause a crash */
    strcpy (buf, src);
    s1 = buf;
    strcpy (s1, "Hello world.\n");
    s2 = buf + strlen ("Hello ");
    printf ("%s", s2);      /*works fine*/
    strncpy (buf2, s1, 10);
    printf ("%s", buf2);    /*trouble: buf2 has no null terminator*/
}
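
One common way to avoid the trouble with buf2 is to copy one character fewer than the buffer holds and store the terminator by hand. The helper below is a minimal sketch; the name copy_bounded and its parameter order are assumptions, not part of <string.h>.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies at most dstsize-1 characters of src
 * into dst and always null terminates. dstsize must be at least 1. */
char *copy_bounded(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);
    strncpy(dst, src, dstsize - 1);    /* may leave dst unterminated ... */
    dst[dstsize - 1] = '\0';           /* ... so terminate it explicitly */
    return dst;
}
```

With a 10-character buffer, copy_bounded (buf2, sizeof buf2, s1) keeps the first nine characters of the source, and the result can be printed safely.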

The strcat function
Description in the ANSI standard
Synopsis
#include <string.h>
char *strcat (char *s1, const char *s2);
Description
The strcat function appends a copy of the string pointed to by s2 (including the
terminating null character) to the end of the string pointed to by s1. The initial
character of s2 overwrites the null character at the end of s1. If copying takes place
between objects that overlap, the behavior is undefined.
Returns
The strcat function returns the value of s1.
Implementation of strcat
#include <string.h>
char *strcat (char *s1, const char *s2)
{
    char *s = s1;
    /* position pointer s at the end */
    while (*s)
        s++;
    /* copy s2 starting at the end of s1 */
    while (*s++ = *s2++)
        continue;
    return s1;
}

The strcat function (2)

It should be noted that we can implement strcat using the strlen function

#include <string.h>
char *strcat (char *s1, const char *s2)
{
    char *s = s1 + strlen (s1);
    /* copy s2 starting at the end of s1 */
    while (*s++ = *s2++)
        continue;
    return s1;
}

We can also implement strcat using both strlen and strcpy

#include <string.h>
char *strcat (char *s1, const char *s2)
{
    /* copy s2 starting at the end of s1 */
    strcpy (s1 + strlen (s1), s2);
    return s1;
}

Many C implementations inline the code for strlen, strcpy, and strcat
This last implementation may be just as efficient as the first two

The strncat function
Description in the ANSI standard
Synopsis
#include <string.h>
char *
strncat(char *s1, const char *s2, size_t n);
Description
The strncat function appends not more than n characters (a null character and
the characters that follow are not appended) from the array pointed to by s2 to the
end of the string pointed to by s1. The initial character of s2 overwrites the null
character at the end of s1. A terminating null character is always appended to the
result. If copying takes place between objects that overlap, the behavior is
undefined.
Returns
The strncat function returns the value of s1.
Contrast between strncat and strncpy

Function strncpy is required to pad the destination with null characters when
the source string is shorter than n
    this makes it always write exactly n characters
Function strncat always appends a null character to the result
    this ensures that strings produced by strncat can always be printed
    it is different from strncpy in that strncpy may not always append the
    null character to its result
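
The contrast can be observed directly. The helper below is a sketch for experimenting; is_terminated is a made-up name, and it relies only on memchr from <string.h>.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if a null character occurs
 * somewhere within the first n bytes of buf. */
int is_terminated(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}
```

After strncpy (a, "abcdef", 4) into a 4-character array, is_terminated (a, 4) reports 0 because all four positions hold letters. After strncat appends three characters to "ab" in an 8-character array, the result is the printable string "abcde".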

Implementation of strncat
#include <string.h>
char *
strncat (char *s1, const char *s2, size_t n)
{
    char *s = s1;
    /*
    * position pointer s at the end
    */
    while (*s)
        s++;
    /*
    * copy s2 starting at the end of s1
    * copy no more than n characters
    */
    for (; n > 0 && *s2 != '\0'; n--)
        *s++ = *s2++;
    /*
    * null terminate the result
    */
    *s = '\0';
    /* return original pointer */
    return s1;
}

Example using strcat
#include <stdio.h>
#include <string.h>
#define LINELEN 80
void foo (void)
{
    char buf[LINELEN];
    char *s1, *s2, *s3;
    /*
    * trouble: buf is not initialized
    * explain why this may cause a crash
    */
    strcat (buf, "Hello world.\n");
    /* s1, s2 and s3 represent lines 1, 2 and 3 */
    s1 = buf;
    /* build up a string to print */
    strcpy (buf, "This is the first line\n");
    s2 = buf + strlen (buf);
    strcat (buf, "This is the second line\n");
    s3 = s2 + strlen (s2);
    strcat (buf, "This is the third line\n");
    printf ("s1: %s", s1);
    printf ("s2: %s", s2);
    printf ("s3: %s", s3);

    /* faster way of building string */
    strcpy (s1 = buf, "This is the first line\n");
    strcpy (s1 += strlen(s1),
            "This is the second line\n");
    strcpy (s1 += strlen(s1),
            "This is the third line\n");
}

The strcmp function
Description in the ANSI standard
The sign of a nonzero value returned by comparison functions memcmp,
strcmp, and strncmp is determined by the sign of the difference between the values
of the first pair of characters (both interpreted as unsigned char) that differ in
the objects being compared.
Synopsis
#include <string.h>
int strcmp(const char *s1, const char *s2);
Description
The strcmp function compares the string pointed to by s1 to the string pointed
to by s2.
Returns
The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the
string pointed to by s2.
Implementation of strcmp
#include <string.h>
int strcmp (const char *s1, const char *s2)
{
unsigned char c1, c2;
while ((c1 = *s1++) == (c2 = *s2++))
if (c1 == '\0')
return 0;
return c1 < c2 ? -1 : 1;
}
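
Because strcmp reports an ordering through the sign of its result, it fits the comparison callback that qsort from <stdlib.h> expects. The sketch below sorts an array of strings; the names compare_words and sort_words are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array elements; each
 * element here is itself a const char *. */
static int compare_words(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;
    return strcmp(*left, *right);
}

/* Hypothetical helper: sorts count strings into strcmp order. */
void sort_words(const char *words[], size_t count)
{
    assert(words != NULL);
    qsort(words, count, sizeof words[0], compare_words);
}
```

Only the sign of the strcmp result matters to qsort, so any implementation that follows the standard's description works here.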

Concise expressions

Let us consider the while loop:


while ((c1 = *s1++) == (c2 = *s2++))
if (c1 == '\0')
return 0;

The parse tree for the while expression is as follows

[Figure: parse tree for (c1 = *s1++) == (c2 = *s2++), read from the leaves up]

++ (post) applied to s1:
    tmp1 = s1
    tmp2 = s1 + 1
    schedule store tmp2 to s1
    return tmp1
++ (post) applied to s2:
    tmp3 = s2
    tmp4 = s2 + 1
    schedule store tmp4 to s2
    return tmp3
* applied to tmp1:
    lvalue1 = *tmp1
    return lvalue1
* applied to tmp3:
    lvalue2 = *tmp3
    return lvalue2
= with left operand c1:
    tmp5 = lvalue1
    schedule store lvalue1 to c1
    return tmp5
= with left operand c2:
    tmp6 = lvalue2
    schedule store lvalue2 to c2
    return tmp6
== at the root:
    tmp7 = (tmp5 == tmp6)
    return tmp7

The strncmp function
Description in the ANSI standard
The sign of a nonzero value returned by comparison functions memcmp,
strcmp, and strncmp is determined by the sign of the difference between the values
of the first pair of characters (both interpreted as unsigned char) that differ in
the objects being compared.
Synopsis
#include <string.h>
int strncmp(const char *s1, const char *s2,
size_t n);
Description
The strncmp function compares not more than n characters (characters that
follow a null character are not compared) from the array pointed to by s1 to the
array pointed to by s2.
Returns
The strncmp function returns an integer greater than, equal to, or less than zero,
accordingly as the possibly null-terminated array pointed to by s1 is greater than,
equal to, or less than the possibly null-terminated array pointed to by s2.

Implementation of strncmp
#include <string.h>
int
strncmp(const char *s1, const char *s2, size_t n)
{
unsigned char c1, c2;
for ( ; n>0; n--, s1++, s2++) {
/*
* Scan through n characters
* Either string may be null-terminated
* It is possible that neither string
* will be null-terminated.
*/
if ((c1 = *s1) != (c2 = *s2)) {
/* strings are different here */
return c1 < c2 ? -1 : 1;
} else if (c1 == '\0') {
/*
* Both strings are null-terminated
* and they both end here -- they
* are both exactly the same
*/
return 0;
}
}
/* the first n characters are identical */
return 0;
}
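
A frequent use of strncmp is testing whether one string begins with another. The helper below is a minimal sketch; has_prefix is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if s begins with prefix.
 * Only the first strlen(prefix) characters are compared. */
int has_prefix(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

If s is shorter than prefix, the comparison stops at the null character in s, which differs from the corresponding prefix character, so the result is 0.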

Comparing strings while ignoring their case

Sometimes it is desirable to compare two strings while ignoring case
One approach is to take the two input strings, convert them both to upper case,
and then use strcmp (or strncmp) to compare them
A more efficient approach is to convert each character to a common case as the
strings are compared and compare those
    the second approach is much faster if the strings differ early on
    the first approach would require the copying of two potentially long strings
This function is not part of <string.h>, i.e. you must implement it yourself if
you want to use it

#include <ctype.h>
#define upper(c) (islower(c) ? toupper(c) : (c))
int strcasecmp (const char *s1, const char *s2)
{
    unsigned char c1, c2;
    for ( ;; ) {    /*infinite loop*/
        c1 = *s1++;
        c2 = *s2++;
        /*
        * The expression c1 = upper(*s1++)
        * will not work. The expression
        * c1 = toupper(*s1++) will also probably
        * not work, even if toupper left
        * non-lower-case letters alone. Explain.
        */
        c1 = upper (c1);
        c2 = upper (c2);
        /* now compare c1 and c2 */
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        else if (c1 == '\0')
            return 0;
    }
}

The NULL pointer

C defines a special value for an invalid pointer
The pointer is NULL (and is defined by including a header such as <stddef.h>,
<stdio.h>, or <string.h>)
Another way of specifying NULL is to use the constant 0
Consequently, a NULL pointer is also considered to be FALSE when it is tested
in a condition
char *p = 0;
char *s = NULL;

The strchr function

One uses this function to locate the first occurrence (the one having the lowest
subscript) of a character in a null-terminated string
A search failure returns a null pointer

Description in the ANSI standard


Synopsis
#include <string.h>
char *strchr (const char *s, int c);
Description
The strchr function locates the first occurrence of c (converted to a char) in the
string pointed to by s. The terminating null character is considered to be part of the
string.
Returns
The strchr function returns a pointer to the located character, or a null pointer if
the character does not occur in the string.

Implementation of strchr
#include <string.h>
char *
strchr (const char *s, int c)
{
const char ch = c;
char c2;
for ( ; (c2 = *s) != ch; s++ ) {
if (c2 == '\0') {
/* no ch in s */
return NULL;
}
}
/* oops: we just lost our const-ness */
return (char *) s;
}

Using strchr
#include <stdio.h>
#include <string.h>
#include <assert.h>
void foo (void)
{
const char *s = "Hello, out, there.\n";
char *t;
printf ("s: %s", s);
t = strchr (s, ',');
assert (t != NULL);
printf ("t: %s", t); /* ", out, there.\n" */
t = strchr (t + 1, ',');
assert (t != NULL);
printf ("t: %s", t); /* ", there.\n" */
assert (strchr (t + 1, ',') == NULL);
}
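
The pattern of calling strchr again from just past the previous match generalizes to a loop. The sketch below counts occurrences of a character; count_char is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts how many times c occurs in s.
 * c must not be '\0', since strchr treats the terminator as
 * part of the string and the loop would step past it. */
size_t count_char(const char *s, char c)
{
    size_t count = 0;
    assert(s != NULL && c != '\0');
    while ((s = strchr(s, c)) != NULL) {
        count++;
        s++;    /* resume the search just past the match */
    }
    return count;
}
```

For the string used in foo above, count_char (s, ',') finds the same two commas that the assertions check one at a time.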

The strrchr function
Description in the ANSI standard
Synopsis
#include <string.h>
char *strrchr (const char *s, int c);
Description
The strrchr function locates the last occurrence of c (converted to a char) in the
string pointed to by s. The terminating null character is considered to be part of the
string.
Returns
The strrchr function returns a pointer to the located character, or a null pointer if
the character does not occur in the string.
Implementation of strrchr
#include <string.h>
char *
strrchr(const char *s, int c)
{
    const char ch = c;
    const char *sc;
    for (sc = NULL; ; s++) {
        /* we want to find the last one */
        if (*s == ch)
            sc = s;
        if (*s == '\0')
            return (char *)sc;
    }
}
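
A typical use of strrchr is picking the last component out of a path. The helper below is a sketch that assumes '/' is the separator; base_name is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the part of path that
 * follows the last '/', or path itself if there is no '/'. */
const char *base_name(const char *path)
{
    const char *slash;
    assert(path != NULL);
    slash = strrchr(path, '/');
    return (slash != NULL) ? slash + 1 : path;
}
```

The result points into the original string, so no copying is needed. A path that ends in '/' yields an empty string.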

Pointer subtraction

If two pointers point to different elements in the same array, then subtracting
those two pointers gives the number of objects (of that pointer's type)
separating them (see K&R, p. 206), i.e. the distance between the two pointers
measured in units which equal the size of the pointer's base type
int a[100], *p1 = a, *p2 = &a[20];
int offset = p2 - p1;

Pointers p1 and p2 point to the 0th and 20th array elements respectively
Thus, offset is set to be 20, the number of elements between position 0 and
20
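
Pointer subtraction is what turns a pointer returned by strchr back into a subscript. The sketch below assumes the type ptrdiff_t from <stddef.h>, which is the type the standard gives to the difference of two pointers; index_of is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the subscript of the first
 * occurrence of c in s, or -1 if c does not occur. */
ptrdiff_t index_of(const char *s, char c)
{
    const char *hit;
    assert(s != NULL);
    hit = strchr(s, c);
    return (hit != NULL) ? hit - s : -1;
}
```

Both pointers refer to elements of the same array, which is the condition under which the subtraction is meaningful.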

The strcspn function


Description in the ANSI standard
Synopsis
#include <string.h>
size_t
strcspn (const char *s1, const char *s2);
Description
The strcspn function computes the length of the maximum initial segment of the
string pointed to by s1 which consists entirely of characters not from the string
pointed to by s2.
Returns
The strcspn function returns the length of the segment.

What strcspn does

You can think of strcspn as a companion to strchr
    it matches any set of characters instead of just one
However, strcspn returns an index into the string instead of a pointer to the
element
If it does not find a match, it returns the index of the terminating null character as
opposed to the NULL pointer

Implementation of strcspn
#include <string.h>
size_t
strcspn(const char *s1, const char *s2)
{
    const char *sc1, *sc2;
    /*
    * As soon as we find a character in s2 which
    * is in s1, we have found what we are
    * looking for.
    */
    for (sc1=s1; *sc1 != '\0'; sc1++) {
        for (sc2=s2; *sc2 != '\0'; sc2++) {
            if (*sc1 == *sc2) {
                /* we found one */
                return (sc1 - s1);
            }
        }
    }
    /* terminating nulls match */
    return (sc1 - s1);
}

Using strcspn

Functions strchr and strcspn are useful when parsing null terminated
strings
Let us consider implementing a function which returns a pointer to the first
whitespace character within a string

#include <string.h>
/**
** Returns a pointer to the first whitespace
** character (either ' ', '\t', or '\n') in s
**
** A NULL return value implies that no
** whitespace exists within the string
**/
const char *
find_whitespace (const char *s)
{
    size_t index;
    index = strcspn (s, " \t\n");
    if (s[index] == '\0') {
        /* no whitespace */
        return NULL;
    }
    /* s[index] must be a whitespace character */
    return (s + index);
}
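
Because strcspn returns the index of the terminating null character when nothing matches, its result can be used as a subscript without a separate check. A common idiom, sketched below, removes a trailing newline from a line of input; chomp is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cuts line off at its first newline.
 * If there is no newline, the store lands on the existing
 * terminator and the string is unchanged. */
char *chomp(char *line)
{
    assert(line != NULL);
    line[strcspn(line, "\n")] = '\0';
    return line;
}
```

The argument must be a modifiable array, not a string literal.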

The strpbrk function
Description in the ANSI standard
Synopsis
#include <string.h>
char *
strpbrk(const char *s1, const char *s2);
Description
The strpbrk function locates the first occurrence in the string pointed to by s1 of
any character from the string pointed to by s2.
Returns
The strpbrk function returns a pointer to the character, or a null pointer if no
character from s2 occurs in s1.

What strpbrk does

You can think of strpbrk as a companion to strchr
    it matches any set of characters instead of just one
It is also similar to strcspn
However, strcspn returns an index into the array while strpbrk returns a
pointer

Implementation of strpbrk
#include <string.h>
char *
strpbrk(const char *s1, const char *s2)
{
    const char *sc1, *sc2;
    for (sc1=s1; *sc1 != '\0'; sc1++) {
        /* check each s2 char against the current s1 char */
        for (sc2=s2; *sc2 != '\0'; sc2++) {
            /* Is this s1 char in s2? */
            if (*sc1 == *sc2) {
                /* oops: lost our const-ness */
                return (char *)sc1;
            }
        }
    }
    /* no character from s2 occurs in s1 */
    return NULL;
}

The strspn function
Description in the ANSI standard
Synopsis
#include <string.h>
size_t
strspn(const char *s1, const char *s2);
Description
The strspn function computes the length of the maximum initial segment of the
string pointed to by s1 which consists entirely of characters from the string pointed
to by s2.
Returns
The strspn function returns the length of the segment.
What strspn does

You can think of strspn as the complement to strcspn
    it searches for a character that matches none of the elements in a set of
    characters instead of any one of them
Like strcspn, strspn returns an index into the string
If it does not find a match, it returns the index of the terminating null character
Conceptually, the call to strspn(s, "abc") finds the longest possible span
of the characters from the set "abc"
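
One use of this span is validating that a string consists only of permitted characters: if the span reaches the terminating null character, every character was in the set. The helper below is a sketch; is_all_digits is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if s is non-empty and
 * contains only the decimal digits 0 through 9. */
int is_all_digits(const char *s)
{
    assert(s != NULL);
    return *s != '\0' && s[strspn(s, "0123456789")] == '\0';
}
```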

Implementation of strspn
#include <string.h>
size_t
strspn(const char *s1, const char *s2)
{
    const char *sc1, *sc2;
    for (sc1=s1; *sc1; sc1++) {
        /*
        * check to see if *sc1 is in s2
        */
        for (sc2=s2; ; sc2++) {
            if (*sc2 == '\0') {
                /*
                * None of the characters in
                * s2 matches *sc1, so this is
                * where the span of characters
                * ends
                */
                return (sc1 - s1);
            } else if (*sc1 == *sc2) {
                /* we found a match -- stop */
                break;
            }
            /* else keep searching */
        }
    }
    /*
    * All characters in s1 are in the character
    * set pointed to by s2.
    */
    return (sc1 - s1);
}

Using strspn and strcspn
#include <string.h>
static const char *whitespace = " \t\n";
/**
** Starting at position i of the string s,
** return an index which skips over all
** whitespace. Return an index to the null
** character if there is nothing but whitespace.
**/
size_t
skip_spaces (const char *s, size_t i)
{
size_t i2;
/* compute index in string 's+i' */
i2 = strspn (s + i, whitespace);
/*
* i2 is the offset relative to s+i
* we need to return an offset relative to s
*/
return i2 + i;
}
/**
** Starting at position i of the string s,
** return an index identifying the next
** whitespace character or null terminator.
**/
size_t
find_spaces(const char *s, size_t i)
{
size_t i2;
/* scan until we find whitespace */
i2 = strcspn(s+i, whitespace);
return i2 + i;
}
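
Alternating the two scans walks a string word by word. The sketch below counts words with the same whitespace set; it calls strspn and strcspn directly so that it stands alone, and count_words is a made-up name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the whitespace-separated words in s. */
size_t count_words(const char *s)
{
    static const char *whitespace = " \t\n";
    size_t i = 0, words = 0;
    assert(s != NULL);
    for (;;) {
        i += strspn(s + i, whitespace);     /* same step as skip_spaces */
        if (s[i] == '\0')
            return words;                   /* nothing but whitespace left */
        words++;
        i += strcspn(s + i, whitespace);    /* same step as find_spaces */
    }
}
```

Unlike strtok, this approach never writes into the string, so it also works on string literals.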

The strstr function
Description in the ANSI standard
Synopsis
#include <string.h>
char *
strstr (const char *s1, const char *s2);
Description
The strstr function locates the first occurrence in the string pointed to by s1 of
the sequence of characters (excluding the terminating null character) in the string
pointed to by s2.
Returns
The strstr function returns a pointer to the located string, or a null pointer if the
string is not found. If s2 points to a string with zero length, the function returns s1.
What strstr does

Writing strstr(s1, s2) locates the first occurrence of the substring s2 in
the string s1
A successful search returns a pointer to the beginning of the substring within s1
A NULL return value implies failure to find the substring s2 in s1
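
Since a successful search returns a pointer into s1, the search can be resumed from just past that point. The sketch below counts occurrences of a substring, including overlapping ones; count_substr is a made-up name, and it assumes s2 is not empty.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of s2 within s1.
 * s2 must be non-empty, because strstr returns s1 itself for an
 * empty substring. */
size_t count_substr(const char *s1, const char *s2)
{
    size_t count = 0;
    assert(s1 != NULL && s2 != NULL && *s2 != '\0');
    while ((s1 = strstr(s1, s2)) != NULL) {
        count++;
        s1++;    /* step one character so overlapping matches count */
    }
    return count;
}
```

To count only non-overlapping matches, one would advance by strlen (s2) instead of by one.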

Implementation of strstr
#include <string.h>
char *
strstr (const char *s1, const char *s2)
{
    if (*s2 == '\0') {
        /* desired substring is empty */
        /* oops: we lost our const-ness */
        return (char *) s1;
    }
    /*
    * Scan for the first character of s2 within
    * s1 and then start comparing the remaining
    * characters. If the substring does not
    * exist there, push the pointer forward one
    * and scan again.
    */
    for ( ; (s1 = strchr (s1, *s2)) != NULL; s1++) {
        const char *sc1 = s1, *sc2 = s2;
        /* compare the rest of s2 */
        for (;;) {
            if (*++sc2 == '\0') {
                /* we have matched all of s2 */
                /* oops: lost our const-ness */
                return (char *)s1;
            } else if (*++sc1 != *sc2) {
                /* strings differ here */
                break;
            }
        }
    }
    /* substring not found */
    return NULL;
}

The strtok function
Description in the ANSI standard
Synopsis
#include <string.h>
char *
strtok (char *s1, const char *s2);
Description
A sequence of calls to the strtok function breaks the string pointed to by s1 into
a sequence of tokens, each of which is delimited by a character from the string
pointed to by s2. The first call in the sequence has s1 as its first argument, and is
followed by calls with a null pointer as their first argument. The separator string
pointed to by s2 may be different from call to call.
The first call in the sequence searches the string pointed to by s1 for the first
character that is not contained in the current separator string pointed to by s2. If no
such character is found, then there are no tokens in the string pointed to by s1 and
the strtok function returns a null pointer. If such a character is found, it is the start
of the first token.
The strtok function then searches from there for a character that is contained in
the current separator string. If no such character is found, the current token extends
to the end of the string pointed to by s1, and subsequent searches for a token will
return a null pointer. If such a character is found, it is overwritten by a null
character, which terminates the current token. The strtok function saves a pointer
to the following character, from which the next search will start.
Each subsequent call, with a null pointer as the value of the first argument, starts
searching from the saved pointer and behaves as described above.
The implementation shall behave as if no library function calls the strtok
function.

The strtok function -- description in the ANSI standard (continued)
Returns
The strtok function returns a pointer to the first character of a token, or a null
pointer if there is no token
Example
#include <string.h>
static char str[] = "?a???b,,,#c";
char *t;
t = strtok (str, "?");   /* t points to the token "a" */
t = strtok (NULL, ",");  /* t points to the token "??b" */
t = strtok (NULL, "#,"); /* t points to the token "c" */
t = strtok (NULL, "?");  /* t is a null pointer */
What strtok does

The strtok function is an intricate function designed to help users parse a
null-terminated string into tokens
The user must specify a set of separators (e.g., whitespace)
Sequences of one or more separators occur between tokens
The strtok function conceptually stores where the pointer was last in a
static variable
Also note that strtok writes into the search string s1 which you pass to it
If you don't want this to happen, you must copy the string over to temporary
storage
Also, strtok is not reentrant, i.e., it can only be used to parse one string at any
given time
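
The usual calling pattern is one call with the string followed by calls with a null pointer until strtok returns NULL. The sketch below collects whitespace-separated tokens into an array; split_words and its parameters are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: stores up to max token pointers in out and
 * returns how many were stored. line is modified in place, because
 * strtok overwrites separators with null characters. */
size_t split_words(char *line, char *out[], size_t max)
{
    size_t n = 0;
    char *tok;
    assert(line != NULL && out != NULL);
    for (tok = strtok(line, " \t\n");
         tok != NULL && n < max;
         tok = strtok(NULL, " \t\n"))
        out[n++] = tok;
    return n;
}
```

The pointers stored in out refer to positions inside line, so line must remain alive and unmodified while the tokens are in use.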

Using strtok
#include <string.h>
extern int
strcasecmp(const char *s1, const char *s2);
/**
** Given the string line (from /etc/hosts),
** search it to see if hostname is listed.
** If it is the hostname, return 1 and copy
** the corresponding internet number over.
** Otherwise, return 0.
**
** 192.59.3.56   mva15a mva15a.mv_eng.unisys.COM
**/
int hostline (char *line,
              const char *hostname, char *internet)
{
    const char *whitespace = " \t\n";
    char *tok, *tok2;
    if (line[0] == '#') return 0;    /*comment*/
    if ((tok = strtok (line, whitespace)) == NULL)
        return 0; /* blank line */
    while ((tok2 = strtok (NULL, whitespace)) != NULL) {
        /* many names follow internet number */
        if (strcasecmp (tok2, hostname) == 0) {
            /* this is it */
            /* first token == internet number */
            strcpy (internet, tok);
            /* this is our host */
            return 1;
        }
    }
    /* sorry, we didn't find it */
    return 0;
}

Implementation of strtok
#include <string.h>
char *
strtok (char *s1, const char *s2)
{
    char *begin, *end;
    static char *save = "";    /* for safety */
    begin = (s1 != NULL) ? s1 : save;
    /* skip over all characters in s2 */
    begin += strspn (begin, s2);
    if (*begin == '\0') {
        /* no token */
        save = "";    /* for safety */
        return NULL;
    }
    /*
    * scan until we see a character in s2
    * (strpbrk would return NULL, not a pointer to
    * the null character, when no separator remains)
    */
    end = begin + strcspn (begin, s2);
    if (*end != '\0') {
        /* this marks the end of the token */
        *end++ = '\0';
    }
    /* save position for later call */
    save = end;
    /* return beginning of token */
    return begin;
}
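
The static variable is what prevents strtok from parsing two strings at once. A variant that keeps the saved position in storage supplied by the caller avoids that limit. The sketch below is one way to write it; next_token is a made-up name (POSIX systems offer a similar function called strtok_r).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reentrant variant of strtok: the position for the
 * next search is kept in *save instead of in a static variable. */
char *next_token(char *s1, const char *s2, char **save)
{
    char *begin, *end;
    assert(s2 != NULL && save != NULL);
    begin = (s1 != NULL) ? s1 : *save;
    if (begin == NULL)
        return NULL;                    /* no string has been started */
    begin += strspn(begin, s2);         /* skip leading separators */
    if (*begin == '\0') {
        *save = begin;
        return NULL;                    /* no token */
    }
    end = begin + strcspn(begin, s2);   /* scan to the next separator */
    if (*end != '\0')
        *end++ = '\0';                  /* terminate the token */
    *save = end;
    return begin;
}
```

Each string being parsed gets its own char * variable, initialized to NULL, whose address is passed on every call.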
