
http://ee.hawaii.edu/~tep/EE160/Book/chap7/chap7.html
7 Arrays

A programmer is concerned with developing and implementing algorithms for a
variety of tasks. As tasks become more complex, algorithm development is facilitated
by structuring or organizing data in specialized ways. There is no best data structure
for all tasks; suitable data structures must be selected for the specific task. Some data
structures are provided by programming languages; others must be derived by the
programmer from available data types and structures.
So far we have used integer, floating point and character data types as well as pointers
to them. These data types are called base or scalar data types. Such base data types
may be used to derive data structures which are organized groupings of instances of
these types. The C language provides some widely used compound or derived data
types together with mechanisms which allow the programmer to define variables of
these types and access the data stored within them.
The first such type we will discuss is called an array. Many tasks require storing and
processing a list of data items. For example, we may need to store a list of exam
scores and to process it in numerous ways: find the maximum and minimum, average
the scores, sort the scores in descending order, search for a specific score, etc. Data
items in simple lists are usually of the same scalar type; for example a list of exam
scores consists of all integer type items. We naturally think of a list as a data structure
that should be referenced as a unit. C provides a derived data type that stores such a
list of objects where each object is of the same data type - the array.
In this chapter, we will discuss arrays: how they are declared and how the data in
them is accessed. We will discuss the relationship between arrays and pointers and how arrays
are passed as arguments in function calls. We will present several example programs
using arrays, including a revision of our "payroll" task from previous chapters. One
important use of arrays is to hold strings of characters. We will introduce strings in
this chapter and show how they are stored in C; however, since strings are important
in handling non-numeric data, we will discuss string processing at length in Chapter

7.1.1 Declaring Arrays


Let us consider the task of reading and printing a list of exam scores.
LIST0: Read and store a list of exam scores and then print it.
Since we are required to store the entire list of scores before printing it, we will use an
array to hold the data. Successive elements of the list will be stored in successive
elements of the array. We will use a counter to indicate the next available position in
the array. Such a counter is called an index into the array. Here is an algorithm for our
task:
initialize the index to the beginning of the array
while there are more data items
    read a score and store in array at the current index
    increment index
set another counter, count = index (the number of items in the array)
traverse the array: for each index from the beginning up to, but not including, count
    print the array element at index
The algorithm reads exam scores and stores them in successive
elements of an array. Once the list is stored in an array, the
algorithm traverses the array, i.e. accesses successive elements,
and prints them. A count of items read in is kept and the traversal
continues until that count is reached.

We can implement the above algorithm in a C program as shown in Figure 7.1.


Before explaining this code, here is a sample session generated by executing this
program:
***List of Exam Scores***

Type scores, EOF to quit
67
75
82
69
^D

***Exam Scores***

67
75
82
69
Referring to the code in Figure 7.1, the program first declares an
array, exam_scores[MAX], of type integer. This declaration allocates a
contiguous block of memory for objects of integer type as shown in Figure 7.2.
The macro, MAX, in square brackets gives the size of the array, i.e. the
number of elements this compound data structure is to contain. The name of
the array, exam_scores, refers to the entire collection of MAX integer cells.
Individual objects in the array may be accessed by specifying the name of the
array and the index, or element number, of the object; a process
called indexing. In C, the elements in the array are numbered from 0 to MAX -
1. So, the elements of the array are referred to
as exam_scores[0], exam_scores[1], ...,exam_scores[MAX - 1], where the index
of each element is placed in square brackets. These index specifiers are
sometimes called subscripts, analogous to the subscript in the mathematical expression a_i.

These indexed or subscripted array expressions are the names of each object
in the array and may be used just like any other variable name.
In the code, the while loop reads a score into the variable, n, places it in
the array by assigning it to exam_scores[index], and increments index. The
loop is terminated either when index reaches MAX (indicating a full array) or
when scanf() returns EOF, indicating the end of the data. We could have also read
each data item directly into exam_scores[index] by writing scanf() as follows:
scanf("%d", &exam_scores[index])
We choose to separate reading an item and storing it in the array because the
use of the increment operator, ++, for index is clearer if reading and
storing of data items are separated.

Once the data items are read and stored in the array, a count of items read
is stored in the variable count. The list is then printed using a for loop.
The array is traversed from element 0 to element count - 1, printing each
element in turn.
From the above example, we have seen how we can declare a variable to be of
the compound data type, array, how data can be stored in the elements of the
array, and subsequently accessed. More formally, the syntax for an array
declaration is:
<type-specifier> <identifier>[<size>];
where the <type-specifier> may be any scalar or derived data type; and
the <size> must evaluate, at compile time, to an unsigned integer. Such a
declaration allocates a contiguous block of memory for objects of the
specified type. The data type for each object in the block is specified by
the <type-specifier>, and the number of objects in the block is given by
<size> as seen in Figure 7.2. As stated above, the index values for all
arrays in C must start with 0 and end with the highest index, which is one
less than the size of the array. The subscripting expression with the syntax:

<identifier>[<expression>]

is the name of one element object and may be used like any other variable
name. The subscript, <expression> must evaluate, at run time, to an integer.
Examples include:

int a[10];
float b[20];
char s[100];
int i = 0;

a[3] = 13;
a[5] = 8 * a[3];
b[6] = 10.0;
printf("The value of b[6] is %f\n", b[6]);
scanf("%c", &s[7]);
s[i] = s[i+1];
Through the remainder of this chapter, we will use the following symbolic
constants for many of our examples:
/* File: araydef.h */
#define MAX 20
#define SIZE 100
In programming with arrays, we frequently need to initialize the elements.
Here is a loop that traverses an array and initializes the array elements to
zero:

int i, ex[MAX];

for (i = 0; i < MAX; i++)
    ex[i] = 0;
The loop assigns zero to ex[i] until i becomes MAX, at which point it
terminates and the array elements are all initialized to zero. One precaution
to programmers using arrays is that C does not check if the index used as a
subscript is within the size of the declared array, leaving such checks as
the programmer's responsibility. Failure to do so results in undefined behavior:
the program may silently corrupt other data or crash.

7.1.2 Character Strings as Arrays


Our next task is to store and print non-numeric text data, i.e. a sequence of characters
which are called strings. A string is a list (or sequence) of characters stored
contiguously with a marker to indicate the end of the string. Let us consider the task:
STRING0: Read and store a string of characters and print it out.
Since the characters of a string are stored contiguously, we can easily implement a
string by using an array of characters if we keep track of the number of elements
stored in the array. However, common operations on strings include breaking them up
into parts (called substrings), joining them together to create new strings, replacing
parts of them with other strings, etc. There must be some way of detecting the size of
a current valid string stored in an array of characters.
In C, a string of characters is stored in successive elements of a character array and
terminated by the NULL character. For example, the string "Hello" is stored in a
character array, msg[], as follows:
char msg[SIZE];

msg[0] = 'H';
msg[1] = 'e';
msg[2] = 'l';
msg[3] = 'l';
msg[4] = 'o';
msg[5] = '\0';

The NULL character is written using the escape sequence '\0'. The
ASCII value of the NULL character is 0. NULL is also defined as a macro
in stdio.h; on implementations where it expands to plain 0, programs can
use the symbol, NULL, in expressions if the header file is included.
Standard C, however, only guarantees that NULL is a null pointer constant
(it may expand to ((void *) 0)), so '\0' is the portable way to write the
character. The remaining elements in the array after
the NULL may have any garbage values. When the string is retrieved,
it will be retrieved starting at index 0 and succeeding characters are
obtained by incrementing the index until the first NULL character is
reached, signaling the end of the string. Figure 7.3 shows a string as
it is stored in memory. Note, string constants, such as
"Hello"
are automatically terminated by NULL by the compiler.

Given this implementation of strings in C, the algorithm to implement our task is now
easily written. We will assume that a string input is a sequence of characters
terminated by a newline character. (The newline character is not part of the string).
Here is the algorithm:
initialize index to zero
while not a newline character
    read and store a character in the array at the next index
    increment the index value
terminate the string of characters in the array with a NULL char.
initialize index to zero
traverse the array until a NULL character is reached
    print the array character at index
    increment the index value

The program implementation has:


• a loop to read string characters until a newline is reached;
• a statement to terminate the string with a NULL;
• and a loop to print out the string.
The code is shown in Figure 7.4 and a sample session from the
program is shown below.
Sample Session:
***Character Strings***

Type characters terminated by a RETURN or ENTER
Hello
Hello
The first while loop reads a character into ch and checks if it is a newline,
which is discarded and the loop terminated. Otherwise, the character is stored
in msg[i] and the array index, i, incremented. When the loop terminates,
a NULL character is appended to the string of characters. In this program, we
have assumed that the size of msg[] is large enough to store the string.
Since a line on a terminal is 80 characters wide and since we have defined
SIZE to be 100, this seems a safe assumption.
The next while loop in the program traverses the string and prints each
character until a NULL character is reached. Note, we do not need to keep a
count of the number of characters stored in the array in this program since
the first NULL character encountered indicates the end of the string. In our
program, when the first NULL is reached we terminate the string output with a
newline.
The assignment expression in the above program:
msg[i] = '\0';
can also be written as:

msg[i] = NULL;
or:

msg[i] = 0;
In the first case, the character whose ASCII value is 0 is assigned to msg[i];
whereas in the other cases, a zero value is assigned to msg[i]. The above
assignment expressions have the same effect wherever NULL expands to plain 0.
The first expression makes it clear that a null character is assigned to
msg[i] and is portable; the second uses a symbolic constant, but a compiler
whose NULL is a pointer constant such as ((void *) 0) will warn about it.

To accommodate the terminating NULL character, the size of an array that
houses a string must be at least one greater than the expected maximum size
of the string. Since different strings may be stored in an array at different
times, the first NULL character in the array delimits a valid string. The
importance of the NULL character to signal the end of a valid string is
obvious. If there were no NULL character inserted after the valid string, the
loop traversal would continue to print values interpreted as characters,
possibly beyond the array boundary, until it fortuitously found a NULL ('\0')
character.
The second while loop may also be written:
while (msg[i] != '\0')
    putchar(msg[i++]);
and the while condition further simplified as:

while (msg[i])
    putchar(msg[i++]);
If msg[i] is any character with a non-zero ASCII value, the while expression
evaluates to True. If msg[i] is the NULL character, its value is zero and
thus False. The last form of the while condition is the more common usage.
While we have used the increment operator in the putchar() argument, it may
also be used separately for clarity:

while (msg[i]) {
    putchar(msg[i]);
    i++;
}
It is possible for a string to be empty; that is, a string may have no
characters in it. An empty string is a character array with
the NULL character in the zeroth index position, msg[0].
7.2 Passing Arrays to Functions
We have now seen two examples of the use of arrays - to hold numeric data such as
test scores, and to hold character strings. We have also seen two methods for
determining how many cells of an array hold useful information - storing a count in a
separate variable, and marking the end of the data with a special character. In both
cases, the details of array processing can easily obscure the actual logic of a program -
processing a set of scores or a character string. It is often best to treat an array as
an abstract data type with a set of allowed operations on the array which are
performed by functional modules. Let us return to our exam score example to read
and store scores in an array and then print them, except that we now wish to use
functions to read and print the array.
LIST1: Read an array and print a list of scores using functional modules.
The algorithm is very similar to our previous task, except that the details of reading
and printing the array are hidden by functions. The function, read_intaray(), reads
scores and stores them, returning the number of scores read. The
function, print_intaray(), prints the contents of the array. The refined algorithm
for main() can be written as:
print title, etc.
n = read_intaray(exam_scores, MAX);
print_intaray(exam_scores, n);
Notice we have passed an array, exam_scores, and a constant, MAX (specifying the
maximum size of the proposed list), to read_intaray() and expect it to return the
number of scores placed in the array. Similarly, when we print the array
using print_intaray(), we give it the array to be printed and a count of elements it
contains. We saw in an earlier chapter that in order for a called function to access objects in
the calling function (such as to store elements in an array) we must use indirect
access, i.e. pointers. So, read_intaray() must indirectly access the array, exam_scores,
in main(). One unique feature of C is that array access is always indirect; thus making
it particularly easy for a called function to indirectly access elements of an array and
store or retrieve values. As we will see in later sections, array access by index value is
interpreted as an indirect access, so we may simply use array indexing as indirect
access.
We are now ready to implement the algorithm for main() using functions to read data
into the array and to print the array. The code is shown in Figure 7.5.
The function calls in main() pass the name of the array, exam_scores, as an argument
because the name of an array in an expression evaluates to a pointer to the array. In
other words, the expression, exam_scores, is a pointer to (the first element of) the
array, exam_scores[]. Its type is, therefore, int *, and a called function uses this
pointer (passed as an argument) to indirectly access the elements of the array. As seen
in the Figure, for both functions, the headers and the prototypes show the first formal
parameter as an integer array without specifying the size. In C, this syntax is
interpreted as a pointer variable; so scores is declared as an int * variable. We will
soon discuss how arrays are accessed in C; for now, we will assume that these
pointers may be used to indirectly access the arrays.
The second formal parameter in both functions is lim which specifies the maximum
number of items. For read_intaray(), this may be considered the maximum number
of scores that can be read so that it does not read more items than the size of the array
allows (MAX). The function returns the actual number of items read which is saved in
the variable, n, in main(). For the function, print_intaray(), lim represents the fact
that it must not print more than n items. Again, since arrays in C are accessed
indirectly, these functions are able to access the array which is defined and allocated
in main(). A sample session for this implementation of the task would be identical to
the one shown earlier.
Similarly, we can modify the program, string.c, to use functions to read and print
strings. The task and the algorithm are the same as defined for STRING0 in the last
section, except that the program is terminated when an empty string is read. The code
is shown in Figure 7.6.
The driver calls read_str() and print_str() repeatedly until an empty string is read
(detected when s[0] is zero, i.e. NULL). The argument passed
to read_str() and print_str() is str, a pointer to (the first element of) a character
array, i.e. a char *. The function, read_str(), reads characters until a newline is read
and indirectly stores the characters into the string, s. The function, print_str(), prints
characters from the string, s until NULL is reached and terminates the output with a
newline. Notice we have declared the formal parameter, s as a char *, rather than as
an array: char s[]. As we will see in the next section, C treats the two declarations
exactly the same.

7.9 Exercises
With the following declaration:
int *p, x[10];
char *t, s[100];
Explain each of the following expressions. If there is an error, explain why it is an
error.
1.
   1. x
   2. x + i
   3. *(x + i)
   4. x++;

2.
   1. p = x;
   2. *p
   3. p++;
   4. p++;
   5. p--;
   6. --p;

3.
   1. p = x + 5;
   2. *p;
   3. --p;
   4. p*;

4. scanf("%s", s);
   Input: Hello, Hello.
5. printf("%s\n", s);
6. scanf("%s", t);
7. t = s;
8. scanf("%s", t);

Check the following problems; find and correct errors, if any. What will be the
output in each case?
9.  main()
    {   int i, x[10] = { 1, 2, 3, 4};

        for (i = 0; i < 10; i++) {
            printf("%d\n", *x);
            x++;
        }
    }

10. main()
    {   int i, *ptr, x[10] = { 1, 2, 3, 4};

        for (i = 0; i < 10; i++) {
            printf("%d\n", *ptr);
            ptr++;
        }
    }

11. main()
    {   int i, x[10] = { 1, 2, 3, 4};

        for (i = 0; i < 10; i++)
            printf("%d\n", (x + i));
    }

12. main()
    {   int i, x[10] = { 1, 2, 3, 4};

        for (i = 0; i < 10; i++)
            printf("%d\n", *(x + i));
    }

13. main()
    {   int i, *ptr, x[10] = {1, 2, 3, 4};

        ptr = x;
        for (i = 0; i < 10; i++) {
            printf("%d\n", *ptr);
            ptr++;
        }
    }

14. main()
    {   int i, *ptr, x[10] = {1, 2, 3, 4};

        ptr = x;
        for (i = 0; i < 10; i++) {
            printf("%d\n", ptr);
            ptr++;
        }
    }
15. main()
    {   char x[10];

        x = "Hawaii;
        printf("%s\n", x);
    }

16. main()
    {   char *ptr;

        ptr = "Hawaii";
        printf("%s\n", ptr);
    }

17. main()
    {   char *ptr, x[10] = "Hawaii";

        for (i = 0; i < 10; i++)
            printf("%d %d %d\n", x + i, *(x + i), x[i]);
    }

18. main()
    {   char x[10];

        scanf("%s", x);
        printf("%s\n", x);
    }

    The Input is:

        Good Day to You

19. main()
    {   char *ptr;

        scanf("%s", ptr);
        printf("%s\n", ptr);
    }

    The Input is:

        Good Day to You

20. Here is the data stored in an array

        char s[100];

        Hawaii\0Manoa\0

    What will be printed out by the following loop?

        i = 0;
        while (s[i]) {
            putchar(s[i]);
            i++;
        }
tep@wiliki.eng.hawaii.edu
Wed Aug 17 08:56:22 HST 1994
http://en.wikipedia.org/wiki/C_syntax

C syntax
From Wikipedia, the free encyclopedia

The syntax of the C programming language is a set of rules that specifies whether
the sequence of characters in a file is conforming C source code. The rules specify how
the character sequences are to be chunked into tokens (the lexical grammar), the
permissible sequences of these tokens and some of the meaning to be attributed to
these permissible token sequences (additional meaning is assigned by the semantics of
the language).

C syntax makes use of the maximal munch principle.

Data structures

Primitive data types


The C language represents numbers in three forms: integral, real and complex. This
distinction reflects similar distinctions in the instruction set architecture of most central
processing units. Integral data types store numbers in the set of integers,
while real and complex numbers represent numbers (or pair of numbers) in the set
of real numbers in floating point form.
All C integer types have signed and unsigned variants. If signed or unsigned is
not specified explicitly, in most circumstances signed is assumed. However, for
historic reasons plain char is a type distinct from
both signed char and unsigned char. It may be a signed type or an unsigned type,
depending on the compiler and the character set (C guarantees that members of the C
basic character set have positive values). Also, bit field types specified as plain int may
be signed or unsigned, depending on the compiler.

Integral types
The integral types come in different sizes, with varying amounts of memory usage and
range of representable numbers. Modifiers are used to designate the
size: short, long and long long[1]. The character type, whose specifier is char,
represents the smallest addressable storage unit, which is most often an 8-bit byte (its
size must be at least 8 bits; it may be larger). The standard
header limits.h defines the minimum and maximum values of the integral primitive data
types, amongst other limits.

The following table provides a list of the integral types and their common storage sizes.
The first listed number of bits is also the minimum required by ISO C. The last column is
the equivalent exact-width C99 types from the stdint.h header.

Common definitions of integral types

Implicit specifier(s)    Explicit specifier        Number of bits   Unambiguous type
signed char              same                      8                int8_t
unsigned char            same                      8                uint8_t
char                     same                      8                None (see note)
short                    signed short int          16               int16_t
unsigned short           unsigned short int        16               uint16_t
int                      signed int                16 or 32         int16_t or int32_t
unsigned                 unsigned int              16 or 32         uint16_t or uint32_t
long                     signed long int           32 or 64         int32_t or int64_t
unsigned long            unsigned long int         32 or 64         uint32_t or uint64_t
long long[1]             signed long long int      64               int64_t
unsigned long long[1]    unsigned long long int    64               uint64_t

Note: char is distinct from both signed char and unsigned char, but is guaranteed to have the
same representation as one of them.
The size and limits of the plain int type (without the short, long,
or long long modifiers) vary much more than the other integral types among C
implementations. The Single UNIX Specification specifies that the int type must be at
least 32 bits, but the ISO C standard only requires 16 bits. Refer to limits.h for
guaranteed constraints on these data types. On most existing implementations, two of
the five integral types have the same bit widths.

Integral type literal constants may be represented in one of two ways, by an integer
type number, or by a single character surrounded by single quotes. Integers may be
represented in three bases: decimal (48 or -293), octal with a "0" prefix (0177),
or hexadecimal with a "0x" prefix (0x3FE). A character in single quotes ('F'), called a
"character constant," represents the value of that character in the execution character
set (often ASCII). In C, character constants have type int (in C++, they have
type char).

Enumerated type
The enumerated type in C, specified with the enum keyword, and often just called an
"enum," is a type designed to represent values across a series of named constants.
Each of the enumerated constants has type int. Each enum type itself is compatible
with char or a signed or unsigned integer type, but each implementation defines its
own rules for choosing a type.
Some compilers warn if an object with enumerated type is assigned a value that is not
one of its constants. However, such an object can be assigned any values in the range
of their compatible type, and enum constants can be used anywhere an integer is
expected. For this reason, enum values are often used in place of the
preprocessor #define directives to create a series of named constants.

An enumerated type is declared with the enum specifier, an optional name for the enum,
a list of one or more constants contained within curly braces and separated by commas,
and an optional list of variable names. Subsequent references to a specific enumerated
type use the enum keyword and the name of the enum. By default, the first constant in
an enumeration is assigned value zero, and each subsequent value is incremented by
one over the previous constant. Specific values may also be assigned to constants in
the declaration, and any subsequent constants without specific values will be given
incremented values from that point onward.

For example, consider the following declaration:


enum colors { RED, GREEN, BLUE = 5, YELLOW } paint_color;
This declares the enum colors type; the int constants RED (whose value is
zero), GREEN (whose value is one greater than RED, one), BLUE (whose value is the
given value, five), and YELLOW (whose value is one greater than BLUE, six); and
the enum colors variable paint_color. The constants may be used outside of the
context of the enum, and values other than the constants may be assigned
to paint_color, or any other variable of type enum colors.

Floating point types


The floating-point form is used to represent numbers with a fractional component. They
do not however represent most rational numbers exactly; they are a close
approximation instead. There are three types of real values, denoted by their specifier:
single-precision (specifier float), double-precision (double) and double-extended-
precision (long double). Each of these may represent values in a different form, often
one of the IEEE floating point formats.

Floating-point constants may be written in decimal notation, e.g. 1.23. Scientific notation
may be used by adding e or E followed by a decimal exponent, e.g. 1.23e2 (which has
the value 123). Either a decimal point or an exponent is required (otherwise, the number
is an integer constant). Hexadecimal floating-point constants follow similar rules except
that they must be prefixed by 0x and use p to specify a binary exponent, e.g. 0xAp-2
(which has the value 2.5, since 10 * 2^-2 = 10 / 4). Both decimal and hexadecimal
floating-point constants may be suffixed by f or F to indicate a constant of type float,
by l or L to indicate type long double, or left unsuffixed for a double constant.

The standard header file float.h defines the minimum and maximum values of the
floating-point types float, double, and long double. It also defines other limits that
are relevant to the processing of floating-point numbers.

Storage duration specifiers


Every object has a storage class, which may be automatic, static, or allocated.
Variables declared within a block by default have automatic storage, as do those
explicitly declared with the auto[2] or register storage class specifiers.
The auto and register specifiers may only be used within functions and function
argument declarations; as such, the auto specifier is always redundant. Objects
declared outside of all blocks and those explicitly declared with the static storage
class specifier have static storage duration.

Objects with automatic storage are local to the block in which they were declared and
are discarded when the block is exited. Additionally, objects declared with
the register storage class may be given higher priority by the compiler for access
to registers; although they may not actually be stored in registers, objects with this
storage class may not be used with the address-of (&) unary operator. Objects with
static storage persist upon exit from the block in which they were declared. In this way,
the same object can be accessed by a function across multiple calls. Objects with
allocated storage duration are created and destroyed explicitly with malloc, free, and
related functions.
The extern storage class specifier indicates that the storage for an object has been
defined elsewhere. When used inside a block, it indicates that the storage has been
defined by a declaration outside of that block. When used outside of all blocks, it
indicates that the storage has been defined outside of the file. The extern storage
class specifier is redundant when used on a function declaration. It indicates that the
declared function has been defined outside of the file.

Type qualifiers
Objects can be qualified to indicate special properties of the data they contain.
The const type qualifier indicates that the value of an object should not change once it
has been initialized. Attempting to modify an object qualified with const yields
undefined behavior, so some C implementations store them in read-only segments of
memory. The volatile type qualifier indicates that the value of an object may be
changed externally without any action by the program (see volatile variable); it may be
completely ignored by the compiler.

Pointers
In declarations the asterisk modifier (*) specifies a pointer type. For example, where the
specifier int would refer to the integer type, the specifier int * refers to the type
"pointer to integer". Pointer values associate two pieces of information: a memory
address and a data type. The following line of code declares a pointer-to-integer
variable called ptr:
int *ptr;
Referencing
When a non-static pointer is declared, it has an unspecified value associated with it.
The address associated with such a pointer must be changed by assignment prior to
using it. In the following example, ptr is set so that it points to the data associated with
the variable a:
int *ptr;
int a;

ptr = &a;
In order to accomplish this, the "address-of" operator (unary &) is used. It produces the
memory location of the data object that follows.

Dereferencing
The pointed-to data can be accessed through a pointer value. In the following example,
the integer variable b is set to the value of integer variable a, which is 10:
int *p;
int a, b;

a = 10;
p = &a;
b = *p;
In order to accomplish that task, the dereference operator (unary *) is used. It returns
the data to which its operand—which must be of pointer type—points. Thus, the
expression *p denotes the same value as a.
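The address-of and dereference operators are typically used together. The following sketch, with the hypothetical name swap_ints, reads and writes two variables purely through pointers to them:

```c
/* Hypothetical helper: exchanges two ints through pointers to them. */
void swap_ints(int *first, int *second)
{
    int saved = *first;   /* read the value that first points to */
    *first = *second;     /* write through first */
    *second = saved;      /* write through second */
}
```

A caller would pass addresses, for example swap_ints(&a, &b), after which a and b hold each other's former values.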
Arrays
Array definition
Arrays are used in C to represent structures of consecutive elements of the same type.
The definition of a (fixed-size) array has the following syntax:
int array[100];
which defines an array named array to hold 100 values of the primitive type int. If
declared within a function, the array dimension may also be a non-constant expression,
in which case memory for the specified number of elements will be allocated. In most
contexts in later use, a mention of the variable array is converted to a pointer to the
first item in the array. The sizeof operator is an exception: sizeof array yields the
size of the entire array (that is, 100 times the size of an int). Another exception is
the & (address-of) operator, which yields a pointer to the entire array
(e.g. int (*ptr_to_array)[100] = &array;).
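The sizeof exception can be seen in a short sketch. The macro ARRAY_LENGTH and both function names are assumptions made for illustration; the second function shows that once an array has been converted to a pointer, sizeof reports only the pointer's size.

```c
#include <stddef.h>

/* Illustrative macro: element count of a true array (not of a pointer). */
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof (a)[0])

size_t length_of_local_array(void)
{
    int array[100];
    return ARRAY_LENGTH(array);   /* sizeof sees the whole array */
}

size_t size_seen_by_callee(int *decayed)
{
    return sizeof decayed;        /* size of a pointer, not of the array */
}
```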

Accessing elements
The primary facility for accessing the values of the elements of an array is the array
subscript operator. To access the i-indexed element of array, the syntax would
be array[i], which refers to the value stored in that array element.

Array subscript numbering begins at 0. The largest allowed array subscript is therefore
equal to the number of elements in the array minus 1. To illustrate this, consider an
array a declared as having 10 elements; the first element would be a[0] and the last
element would be a[9]. C provides no facility for automatic bounds checking for array
usage. Though logically the last subscript in an array of 10 elements would be 9,
subscripts 10, 11, and so forth could accidentally be specified, with undefined results.

Due to array↔pointer interchangeability, the address of each array element can be
expressed in equivalent pointer arithmetic. The following table illustrates both
methods for the existing array:

Array subscripts vs. pointer arithmetic

Element position       1          2              3              n
Array subscript        array[0]   array[1]       array[2]       array[n-1]
Dereferenced pointer   *array     *(array + 1)   *(array + 2)   *(array + n-1)

Similarly, since the expression a[i] is semantically equivalent to *(a+i), which in turn
is equivalent to *(i+a), the expression can also be written as i[a] (although this form
is rarely used).
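To make the equivalence concrete, here is a small sketch with two hypothetical functions that compute the same sum, one with subscripts and one with pointer arithmetic:

```c
#include <stddef.h>

/* Walks the array with the subscript operator. */
int sum_by_subscript(const int *array, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += array[i];
    return total;
}

/* Walks the same elements by advancing a pointer. */
int sum_by_pointer(const int *array, size_t n)
{
    int total = 0;
    for (const int *p = array; p != array + n; p++)
        total += *p;
    return total;
}
```

The pointer array + n points one past the last element; it may be compared against but must not be dereferenced.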
Dynamic arrays
A constant value is required for the dimension in a declaration of a static array. A
desired feature is the ability to set the length of an array dynamically at run-time instead:
int n = ...;
int a[n];
a[3] = 10;
This behavior can be simulated with the help of the C standard library.
The malloc function provides a simple method for allocating memory. It takes one
parameter: the amount of memory to allocate in bytes. Upon successful
allocation, malloc returns a generic (void *) pointer value, pointing to the beginning of
the allocated space. The pointer value returned is converted to an appropriate type
implicitly by assignment. If the allocation could not be completed, malloc returns a null
pointer. The following segment is therefore similar in function to the above desired
declaration:
#include <stdlib.h> /* declares malloc */

int *a;
a = malloc(n * sizeof(int));
a[3] = 10;
The result is a "pointer to int" variable (a) that points to the first
of n contiguous int objects; due to array↔pointer equivalence this can be used in
place of an actual array name, as shown in the last line. The advantage in using
this dynamic allocation is that the amount of memory that is allocated to it can be limited
to what is actually needed at run time, and this can be changed as needed (using the
standard library function realloc).

When the dynamically-allocated memory is no longer needed, it should be released
back to the run-time system. This is done with a call to the free function. It takes a
single parameter: a pointer to previously allocated memory. This is the value that was
returned by a previous call to malloc. It is considered good practice to then set the
pointer variable to NULL so that further attempts to access the memory to which it points
will fail. If this is not done, the variable becomes a dangling pointer, and such errors in
the code (or manipulations by an attacker) might be very hard to detect and lead to
obscure and potentially dangerous malfunction caused by memory corruption.
free(a);
a = NULL;
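The fragments above omit error handling. A slightly fuller sketch, assuming the hypothetical helper names make_sequence and resize_sequence, checks the results of malloc and realloc before using them:

```c
#include <stdlib.h>

/* Hypothetical helper: returns n ints set to 0..n-1, or NULL on failure. */
int *make_sequence(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}

/* Hypothetical helper: resizes *a to new_n (nonzero) elements.
   Returns 0 on success. On failure the original block is left intact
   and is still owned by the caller. */
int resize_sequence(int **a, size_t new_n)
{
    int *grown = realloc(*a, new_n * sizeof **a);
    if (grown == NULL)
        return -1;
    *a = grown;
    return 0;
}
```

Assigning the result of realloc to a temporary first avoids losing the only pointer to the original block if the call fails.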
The C99 standard also supports variable-length arrays (VLAs) within block scope. Storage
for such an array is sized from an integer expression evaluated at run time when the
declaration is reached, and is released at the end of the enclosing block.
float read_and_process(int sz)
{
    float vals[sz]; // VLA, size determined at runtime

    for (int i = 0; i < sz; i++)
        vals[i] = read_value();
    return process(vals, sz);
}
Multidimensional arrays
In addition, C supports arrays of multiple dimensions, which are stored in row-major
order. Technically, C multidimensional arrays are just one-dimensional arrays whose
elements are arrays. The syntax for declaring multidimensional arrays is as follows:
int array2d[ROWS][COLUMNS];
(where ROWS and COLUMNS are constants); this defines a two-dimensional array.
Reading the subscripts from left to right, array2d is an array of length ROWS, each
element of which is an array of COLUMNS ints.

To access an integer element in this multidimensional array, one would use
array2d[4][3]
Again, reading from left to right, this accesses the 5th row, 4th element in that row
(array2d[4] is an array, which we are then subscripting with [3] to access the
fourth integer).

Higher-dimensional arrays can be declared in a similar manner.

A multidimensional array should not be confused with an array of references to arrays
(also known as Iliffe vectors or sometimes an array of arrays). The former is always
rectangular (all subarrays must be the same size), and occupies a contiguous region of
memory. The latter is a one-dimensional array of pointers, each of which may point to
the first element of a subarray in a different place in memory, and the sub-arrays do not
have to be the same size. The latter can be created by multiple use of malloc.
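As a sketch of the second kind of structure, the hypothetical functions below build and release a triangular table in which row r holds r + 1 integers. Each row is a separate allocation, so the rows need not be contiguous or equal in length.

```c
#include <stdlib.h>

/* Hypothetical helper: builds a table where row r has r + 1 zeroed ints.
   Returns NULL if any allocation fails. */
int **make_triangle(size_t rows)
{
    int **table = malloc(rows * sizeof *table);
    if (table == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++) {
        table[r] = calloc(r + 1, sizeof **table);
        if (table[r] == NULL) {
            while (r > 0)            /* release the rows already built */
                free(table[--r]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Frees every row, then the array of row pointers. */
void free_triangle(int **table, size_t rows)
{
    for (size_t r = 0; r < rows; r++)
        free(table[r]);
    free(table);
}
```

Elements are still written as table[r][c], but here the first subscript selects a pointer and the second indexes the memory it points to.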

Strings
In C, string constants (literals) are surrounded by double quotes ("),
e.g. "Hello world!" and are compiled to an array of the specified char values with
an additional null terminating character (0-valued) code to mark the end of the string.

String literals may not contain embedded newlines; this proscription somewhat
simplifies parsing of the language. To include a newline in a string, the backslash
escape \n may be used, as below.

There are several standard library functions for operating with string data (not
necessarily constant) organized as an array of char using this null-terminated format;
see below.

C's string-literal syntax has been very influential, and has made its way into many other
languages, such as C++, Perl, Python, PHP, Java, JavaScript, C#, and Ruby. Nowadays,
almost all new languages adopt or build upon C-style string syntax. Languages that lack
this syntax tend to precede C.

Backslash escapes
If you wish to include a double quote inside the string, that can be done by escaping it
with a backslash (\), for example, "This string contains \"double
quotes\".". To insert a literal backslash, one must double it, e.g. "A backslash
looks like this: \\".

Backslashes may be used to enter control characters, etc., into a string:

Escape   Meaning
\\       Literal backslash
\"       Double quote
\'       Single quote
\n       Newline (line feed)
\r       Carriage return
\b       Backspace
\t       Horizontal tab
\f       Form feed
\a       Alert (bell)
\v       Vertical tab
\?       Question mark (used to escape trigraphs)
\nnn     Character with octal value nnn
\xhh     Character with hexadecimal value hh

The use of other backslash escapes is not defined by the C standard, although compiler
vendors often provide additional escape codes as language extensions.

String literal concatenation
Adjacent string literals are concatenated at compile time; this allows long strings to be
split over multiple lines, and also allows string literals resulting from C
preprocessor defines and macros to be appended to strings at compile time:
printf(__FILE__ ": %d: Hello "
"world\n", __LINE__);
will expand to
printf("helloworld.c" ": %d: Hello "
"world\n", 10);
which is syntactically equivalent to
printf("helloworld.c: %d: Hello world\n", 10);
Character constants
Individual character constants are represented by single-quotes, e.g. 'A', and have
type int (in C++ char). The difference is that "A" represents a pointer to the first
element of a null-terminated array, whereas 'A' directly represents the code value (65
if ASCII is used). The same backslash-escapes are supported as for strings, except that
(of course) " can validly be used as a character without being escaped, whereas ' must
now be escaped. A character constant cannot be empty (i.e. '' is invalid syntax),
although a string may be (it still has the null terminating character). Multi-character
constants (e.g. 'xy') are valid, although rarely useful — they let one store several
characters in an integer (e.g. 4 ASCII characters can fit in a 32-bit integer, 8 in a 64-bit
one). Since the order in which the characters are packed into one int is not specified,
portable use of multi-character constants is difficult.

Wide character strings
Since type char is usually 8 bits wide, a single char value typically can represent at
most 256 distinct character codes, not nearly enough for all the characters in use
worldwide. To provide better support for international characters, the first C standard
(C89) introduced wide characters (encoded in type wchar_t) and wide character
strings, which are written as L"Hello world!"

Wide characters are most commonly either 2 bytes (using a 2-byte encoding such
as UTF-16) or 4 bytes (usually UTF-32), but Standard C does not specify the width
for wchar_t, leaving the choice to the implementor. Microsoft Windows generally uses
UTF-16, thus the above string would be 26 bytes long for a Microsoft compiler;
the Unix world prefers UTF-32, thus compilers such as GCC would generate a 52-byte
string. A 2-byte wide wchar_t suffers the same limitation as char, in that certain
characters (those outside the BMP) cannot be represented in a single wchar_t but
must be represented using surrogate pairs.

The original C standard specified only minimal functions for operating with wide
character strings; in 1995 the standard was modified to include much more extensive
support, comparable to that for char strings. The relevant functions are mostly named
after their char equivalents, with the addition of a "w" or the replacement of "str" with
"wcs"; they are specified in <wchar.h>, with <wctype.h> containing wide-character
classification and mapping functions.
Variable width strings
A common alternative to wchar_t is to use a variable-width encoding, whereby a
logical character may extend over multiple positions of the string. Variable-width strings
may be encoded into literals verbatim, at the risk of confusing the compiler, or using
numerical backslash escapes (e.g. "\xc3\xa9" for "é" in UTF-8). The UTF-8 encoding
was specifically designed (under Plan 9) for compatibility with the standard library string
functions; supporting features of the encoding include a lack of embedded nulls, no
valid interpretations for subsequences, and trivial resynchronisation. Encodings lacking
these features are likely to prove incompatible with the standard library functions;
encoding-aware string functions are often used in such cases.

Library functions
Strings, both constant and variable, may be manipulated without using the standard
library. However, the library contains many useful functions for working with null-
terminated strings. It is the programmer's responsibility to ensure that enough storage
has been allocated to hold the resulting strings.

The most commonly used string functions are:

• strcat(dest, source) - appends the string source to the end of string dest
• strchr(s, c) - finds the first instance of character c in string s and returns a
pointer to it or a null pointer if c is not found
• strcmp(a, b) - compares strings a and b (lexicographical ordering); returns
negative if a is less than b, 0 if equal, positive if greater
• strcpy(dest, source) - copies the string source onto the string dest
• strlen(st) - returns the length of string st, not counting the terminating null
• strncat(dest, source, n) - appends a maximum of n characters from the
string source to the end of string dest and always null terminates the result, so
up to n+1 characters may be written
• strncmp(a, b, n) - compares a maximum of n characters from
strings a and b (lexicographical ordering); returns negative if a is less than b, 0 if
equal, positive if greater
• strrchr(s, c) - finds the last instance of character c in string s and returns a
pointer to it or a null pointer if c is not found
Other standard string functions include:
• strcoll(s1, s2) - compares two strings according to a locale-specific collating
sequence
• strcspn(s1, s2) - returns the index of the first character in s1 that matches any
character in s2
• strerror(errno) - returns a string with an error message corresponding to the
code in errno
• strncpy(dest, source, n) - copies n characters from the string source onto
the string dest, substituting null bytes once past the end of source; does not null
terminate if max length is reached
• strpbrk(s1, s2) - returns a pointer to the first character in s1 that matches any
character in s2 or a null pointer if not found
• strspn(s1, s2) - returns the index of the first character in s1 that matches no
character in s2
• strstr(st, subst) - returns a pointer to the first occurrence of the
string subst in st or a null pointer if no such substring exists
• strtok(s1, s2) - returns a pointer to a token within s1 delimited by the
characters in s2
• strxfrm(s1, s2, n) - transforms s2 onto s1, such that s1 used
with strcmp gives the same results as s2 used with strcoll
There is a similar set of functions for handling wide character strings.
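Because strncpy does not null terminate when the source is too long, callers usually add the terminator themselves. The following is a minimal sketch of that pattern; the name copy_bounded and its return convention are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical helper: copies src into dest (capacity dest_size bytes),
   truncating if needed and always null-terminating.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return 0;
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';   /* strncpy does not terminate on truncation */
    return strlen(src) < dest_size;
}
```

With a 6-byte buffer, copying "Hello world!" leaves "Hello" in the buffer and reports truncation.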

Structures and unions
Structures
Structures in C are defined as data containers consisting of a sequence of named
members of various types. They are similar to records in other programming languages.
The members of a structure are stored in consecutive locations in memory, although the
compiler is allowed to insert padding between or after members (but not before the first
member) for efficiency. The size of a structure is equal to the sum of the sizes of its
members, plus the size of the padding.

Unions
Unions in C are related to structures and are defined as objects that may hold (at
different times) objects of different types and sizes. They are analogous to variant
records in other programming languages. Unlike structures, the components of a union
all refer to the same location in memory. In this way, a union can be used at various
times to hold different types of objects, without the need to create a separate object for
each new type. The size of a union is at least the size of its largest member; the
compiler may add trailing padding for alignment.
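The size rules can be checked with sizeof and offsetof. In this sketch the types record and either and the function record_padding_bytes are hypothetical; the amount of padding reported depends on the implementation.

```c
#include <stddef.h>

struct record { char tag; int count; };            /* padding is likely after tag */
union  either { char tag; int count; double d; };  /* all members start at offset 0 */

/* Bytes of padding the compiler added to struct record. */
size_t record_padding_bytes(void)
{
    return sizeof(struct record) - (sizeof(char) + sizeof(int));
}
```

On many common platforms with a 4-byte int this reports 3, but the C standard does not fix the value.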

Declaration
Structures are declared with the struct keyword and unions are declared with
the union keyword. The specifier keyword is followed by an optional identifier name,
which is used to identify the form of the structure or union. The identifier is followed by
the declaration of the structure or union's body: a list of member declarations, contained
within curly braces, with each declaration terminated by a semicolon. Finally, the
declaration concludes with an optional list of identifier names, which are declared as
instances of the structure or union.
For example, the following statement declares a structure named s that contains three
members; it will also declare an instance of the structure known as t:
struct s
{
int x;
float y;
char *z;
} t;
And the following statement will declare a similar union named u and an instance of it
named n:
union u
{
int x;
float y;
char *z;
} n;
Once a structure or union body has been declared and given a name, it can be
considered a new data type using the specifier struct or union, as appropriate, and
the name. For example, the following statement, given the above structure declaration,
declares a new instance of the structure s named r:
struct s r;
It is also common to use the typedef specifier to eliminate the need for
the struct or union keyword in later references to the structure. The first identifier
after the body of the structure is taken as the new name for the structure type. For
example, the following statement will declare a new type known as s_type that will
contain some structure:
typedef struct {…} s_type;
Future statements can then use the specifier s_type (instead of the
expanded struct … specifier) to refer to the structure.

Accessing members
Members are accessed using the name of the instance of a structure or union, a period
(.), and the name of the member. For example, given the declaration of t from above,
the member known as y (of type float) can be accessed using the following syntax:
t.y
Structures are commonly accessed through pointers. Consider the following example
that defines a pointer to t, known as ptr_to_t:
struct s *ptr_to_t = &t;
Member y of t can then be accessed by dereferencing ptr_to_t and using the result
as the left operand:
(*ptr_to_t).y
Which is identical to the simpler t.y above as long as ptr_to_t points to t. Because
this operation is common, C provides an abbreviated syntax for accessing a member
directly from a pointer. With this syntax, the name of the instance is replaced with the
name of the pointer and the period is replaced with the character sequence ->. Thus,
the following method of accessing y is identical to the previous two:
ptr_to_t->y
Members of unions are accessed in the same way.

Initialization
A structure can be initialized in its declarations using an initializer list, similar to arrays. If
a structure is not initialized, the values of its members are undefined until assigned. The
components of the initializer list must agree, in type and number, with the components
of the structure itself.
The following statement will initialize a new instance of the structure s from above
known as pi:
struct s pi = { 3, 3.1415, "Pi" };
Designated initializers allow members to be initialized by name. The following
initialization is equivalent to the previous one.
struct s pi = { .z = "Pi", .x = 3, .y = 3.1415 };
Members may be initialized in any order, and those that are not explicitly mentioned are
set to zero.

Any one member of a union may be initialized using designated initializers.
union u value = { .y = 3.1415 };
In C89, a union could only be initialized with a value of the type of its first member. That
is, the union u from above can only be initialized with a value of type int.
union u value = { 3 };
Assignment
Assigning values to individual members of structures and unions is syntactically
identical to assigning values to any other object. The only difference is that the lvalue of
the assignment is the name of the member, as accessed by the syntax mentioned
above.

A structure can also be assigned as a unit to another structure of the same type.
Structures (and pointers to structures) may also be used as function parameter and
return types.
For example, the following statement assigns the value of 74 (the ASCII code point for
the letter 'J') to the member named x in the structure t, from above:
t.x = 74;
And the same assignment, using ptr_to_t in place of t, would look like:
ptr_to_t->x = 74;
Assignment with members of unions is identical, except that each new assignment
changes the current type of the union, and the previous type and value are lost.

Other operations
According to the C standard, the only legal operations that can be performed on a
structure are copying it, assigning to it as a unit (or initializing it), taking its address with
the address-of (&) unary operator, and accessing its members. Unions have the same
restrictions. One of the operations implicitly forbidden is comparison: structures and
unions cannot be compared using C's standard comparison facilities (==, >, <, etc.).
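Since == cannot be applied to whole structures, equality is usually written member by member. A minimal sketch, with a hypothetical struct point and helper point_equal:

```c
struct point { int x; int y; };

/* Hypothetical helper: member-by-member equality for struct point. */
int point_equal(const struct point *a, const struct point *b)
{
    return a->x == b->x && a->y == b->y;
}
```

Comparing the raw bytes with memcmp is not a reliable substitute in general, because padding bytes inside a structure may hold different values in otherwise equal objects.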
Bit fields
C also provides a special type of structure member known as a bit field, which is an
integer with an explicitly specified number of bits. A bit field is declared as a structure
member of type int, signed int, unsigned int, or _Bool, following the member
name by a colon (:) and the number of bits it should occupy. The total number of bits in
a single bit field must not exceed the total number of bits in its declared type.

As a special exception to the usual C syntax rules, it is implementation-defined whether
a bit field declared as type int, without specifying signed or unsigned, is signed or
unsigned. Thus, it is recommended to explicitly specify signed or unsigned on all
bit-field members for portability.

Empty entries consisting of just a colon followed by a number of bits are also allowed;
these indicate padding.

The members of bit fields do not have addresses, and as such cannot be used with the
address-of (&) unary operator. The sizeof operator may not be applied to bit fields.

The following declaration declares a new structure type known as f and an instance of
it known as g. Comments provide a description of each of the members:
struct f
{
    unsigned int flag : 1;  /* a bit flag: can either be on (1) or off (0) */
    signed int   num  : 4;  /* a signed 4-bit field; range -7...7 or -8...7 */
    signed int        : 3;  /* 3 bits of padding to round out 8 bits */
} g;
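The limited width of a bit field is visible on assignment. In this sketch (the struct flags and the helper store_level are hypothetical), a value too large for an unsigned 3-bit field is reduced modulo 8:

```c
struct flags
{
    unsigned int ready : 1;   /* holds 0 or 1 */
    unsigned int level : 3;   /* holds 0..7 */
};

/* Hypothetical helper: stores `value` in the 3-bit field and returns what
   the field actually holds afterwards. */
unsigned int store_level(struct flags *f, unsigned int value)
{
    f->level = value;         /* reduced modulo 8 for an unsigned 3-bit field */
    return f->level;
}
```

The wraparound is well defined only because the field is unsigned; storing an out-of-range value in a signed bit field gives an implementation-defined result.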
Incomplete types
The body of a struct or union declaration, or a typedef thereof, may be omitted,
yielding an incomplete type. Such a type may not be instantiated (its size is not known),
nor may its members be accessed (they, too, are unknown); however, the derived
pointer type may be used (but not dereferenced).

Incomplete types are used to implement recursive structures; the body of the type
declaration may be deferred to later in the translation unit:
typedef struct Bert Bert;
typedef struct Wilma Wilma;
struct Bert
{
    Wilma *wilma;
};

struct Wilma
{
    Bert *bert;
};
Incomplete types are also used for data hiding; the incomplete type is defined in
a header file, and the body only within the relevant source file.
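The data-hiding use can be sketched in a single listing. The Counter type, its functions, and the header name counter.h are all hypothetical; in a real project the first part would live in the header and the second in one source file.

```c
#include <stdlib.h>

/* --- what a header such as a hypothetical counter.h would expose --- */
typedef struct Counter Counter;          /* incomplete type */
Counter *counter_create(void);
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* --- what only the implementation file would contain --- */
struct Counter { int value; };           /* the body completes the type */

Counter *counter_create(void)
{
    Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = 0;
    return c;
}

void counter_increment(Counter *c)   { c->value++; }
int  counter_value(const Counter *c) { return c->value; }
void counter_destroy(Counter *c)     { free(c); }
```

Code that sees only the header part can hold and pass Counter pointers but cannot read or write the value member directly.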

Operators

Main article: Operators in C and C++

Control structures

C is a free-form language.

Bracing style varies from programmer to programmer and can be the subject of debate.
See Indent style for more details.

Compound statements
In the items in this section, any <statement> can be replaced with a compound
statement. Compound statements have the form:
{
<optional-declaration-list>
<optional-statement-list>
}
and are used as the body of a function or anywhere that a single statement is expected.
The declaration-list declares variables to be used in that scope, and the statement-list
is the set of actions to be performed. Braces define their own scope, and variables defined
inside those braces will be automatically deallocated at the closing brace.
Declarations and statements can be freely intermixed within a compound statement (as
in C++).

Selection statements
C has two types of selection statements: the if statement and the switch statement.

The if statement is in the form:
if (<expression>)
    <statement1>
else
    <statement2>
In the if statement, if the <expression> in parentheses is nonzero (true), control
passes to <statement1>. If the else clause is present and the <expression> is zero
(false), control will pass to <statement2>. The "else <statement2>" part is optional, and if
absent, a false <expression> will simply result in skipping over the <statement1>.
An else always matches the nearest previous unmatched if; braces may be used to
override this when necessary, or for clarity.
The switch statement causes control to be transferred to one of several statements
depending on the value of an expression, which must have integral type. The
substatement controlled by a switch is typically compound. Any statement within the
substatement may be labeled with one or more case labels, which consist of the
keyword case followed by a constant expression and then a colon (:). The syntax is as
follows:
switch (<expression>)
{
    case <label1> :
        <statements 1>
    case <label2> :
        <statements 2>
        break;
    default :
        <statements 3>
}
No two of the case constants associated with the same switch may have the same
value. There may be at most one default label associated with a switch - if none of
the case labels are equal to the expression in the parentheses following switch,
control passes to the default label, or if there is no default label, execution
resumes just beyond the entire construct. Switches may be nested;
a case or default label is associated with the innermost switch that contains it.
Switch statements can "fall through", that is, when one case section has completed its
execution, statements will continue to be executed downward until a break; statement
is encountered. Fall-through is useful in some circumstances, but is usually not desired.
In the preceding example, if <label2> is reached, the statements <statements 2> are
executed and nothing more inside the braces. However if <label1> is reached, both
<statements 1> and <statements 2> are executed since there is no break to separate
the two case statements.
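A case where fall-through is intended can be sketched as follows. The function privileges_for and its meaning of "levels" are invented for illustration: each level also grants everything below it, so control deliberately runs from one case into the next.

```c
/* Hypothetical helper: counts the privileges a level grants. */
int privileges_for(int level)
{
    int count = 0;
    switch (level)
    {
    case 3:
        count++;      /* admin privilege */
        /* fall through */
    case 2:
        count++;      /* write privilege */
        /* fall through */
    case 1:
        count++;      /* read privilege */
        break;
    default:
        break;        /* unknown level: no privileges */
    }
    return count;
}
```

Marking each intentional fall-through with a comment is a common convention, since an omitted break is otherwise hard to tell apart from a mistake.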

Iteration statements
C has three forms of iteration statement:
do
    <statement>
while ( <expression> ) ;

while ( <expression> )
    <statement>

for ( <expression> ; <expression> ; <expression> )
    <statement>
In the while and do statements, the substatement is executed repeatedly so long as
the value of the expression remains nonzero (true). With while, the test, including all
side effects from the expression, occurs before each execution of the statement;
with do, the test follows each iteration. Thus, a do statement always executes its
substatement at least once, whereas while may not execute the substatement at all.

If all three expressions are present in a for, the statement
for (e1; e2; e3)
    s;
is equivalent to
e1;
while (e2)
{
    s;
    e3;
}
except for the behavior of a continue; statement (which in the for loop jumps
to e3 instead of e2).
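The difference matters in practice. In the sketch below (sum_odd_below is a hypothetical name), continue skips even numbers but still reaches the increment; written as the equivalent while loop with i++ at the end of the body, the same continue would skip the increment and loop forever.

```c
/* Hypothetical helper: sums the odd integers in [0, limit). */
int sum_odd_below(int limit)
{
    int total = 0;
    for (int i = 0; i < limit; i++)
    {
        if (i % 2 == 0)
            continue;     /* jumps to i++, then re-tests i < limit */
        total += i;
    }
    return total;
}
```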

Any of the three expressions in the for loop may be omitted. A missing second
expression makes the while test always nonzero, creating a potentially infinite loop.

Since C99, the first expression may take the form of a declaration, typically including an
initializer, such as
for (int i=0; i< limit; i++){
...
}
The declaration's scope is limited to the extent of the for loop.

Jump statements
Jump statements transfer control unconditionally. There are four types of jump
statements in C: goto, continue, break, and return.

The goto statement looks like this:
goto <identifier> ;
The identifier must be a label (followed by a colon) located in the current function.
Control transfers to the labeled statement.
A continue statement may appear only within an iteration statement and causes
control to pass to the loop-continuation portion of the innermost enclosing iteration
statement. That is, within each of the statements
while (expression)
{
    /* ... */
    cont: ;
}

do
{
    /* ... */
    cont: ;
} while (expression);

for (expr1; expr2; expr3) {
    /* ... */
    cont: ;
}
a continue not contained within a nested iteration statement is the same
as goto cont.

The break statement is used to end a for loop, while loop, do loop,
or switch statement. Control passes to the statement following the terminated
statement.
A function returns to its caller by the return statement. When return is followed by
an expression, the value is returned to the caller as the value of the function.
Encountering the end of the function is equivalent to a return with no expression. In
that case, if the function is declared as returning a value and the caller tries to use the
returned value, the result is undefined.

Storing the address of a label
GCC extends the C language with a unary && operator that returns the address of a
label. This address can be stored in a variable of type void * and may be used later in a
goto instruction. For example, the following prints "hi " in an infinite loop:
void *ptr = &&J1;

J1: printf("hi ");
goto *ptr;
This feature can be used to implement a jump table.
Functions

Syntax
A C function definition consists of a return type (void if no value is returned), a unique
name, a list of parameters in parentheses, and various statements. A function with non-
void return type should include at least one return statement.
<return-type> functionName( <parameter-list> )
{
<statements>
return <expression of type return-type>;
}
where <parameter-list> is a comma-separated list of parameter
declarations, each item in the list being a data type followed by an identifier: data-type
variable, data-type variable,.... If there are no parameters, the parameter list may be left
empty or optionally be specified with the single word void. It is possible to define a
function as taking a variable number of parameters by providing an ellipsis (...) as the
last parameter instead of a data type and variable name. A commonly used function that
does this is the standard library function printf, which has the declaration:
int printf (const char*, ...);
Manipulation of these parameters can be done by using the routines in the standard
library header <stdarg.h>.
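As a minimal sketch of the <stdarg.h> macros, the hypothetical function sum_ints below takes a count followed by that many int arguments. The callee has no way to discover the number or types of the extra arguments on its own, so the caller must communicate them, here through the first parameter.

```c
#include <stdarg.h>

/* Hypothetical helper: adds the `count` int arguments that follow it. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);            /* begin after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);   /* fetch the next argument as an int */
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) adds the three trailing values; passing fewer arguments than count claims is undefined behavior.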

Function Pointers
A pointer to a function can be declared as follows:
<return-type> (*functionName)(<parameter-list>);
The following program shows use of a function pointer for selecting between addition
and subtraction:
#include <stdio.h>

int (*operation)(int x, int y);

int add(int x, int y)
{
    return x + y;
}

int subtract(int x, int y)
{
    return x - y;
}

int main(int argc, char* args[])
{
    int foo = 1, bar = 1;

    operation = add;
    printf("%d + %d = %d\n", foo, bar, operation(foo, bar));
    operation = subtract;
    printf("%d - %d = %d\n", foo, bar, operation(foo, bar));
    return 0;
}
Global structure
After preprocessing, at the highest level a C program consists of a sequence of
declarations at file scope. These may be partitioned into several separate source files,
which may be compiled separately; the resulting object modules are then linked along
with implementation-provided run-time support modules to produce an executable
image.

The declarations introduce functions, variables and types. C functions are akin to the
subroutines of Fortran or the procedures of Pascal.

A definition is a special type of declaration. A variable definition sets aside storage and
possibly initializes it, a function definition provides its body.

An implementation of C providing all of the standard library functions is called a hosted
implementation. Programs written for hosted implementations are required to define a
special function called main, which is the first function called when execution of the
program begins.
Hosted implementations start program execution by invoking the main function, which
must be defined following one of these prototypes:
int main() {...}
int main(void) {...}
int main(int argc, char *argv[]) {...}
(int main(int argc, char **argv) is also allowed.) The first two definitions are
equivalent (and both are compatible with C++). Which one is used is largely a matter of
individual preference (the current C standard contains two examples of main() and two
of main(void), but the draft C++ standard uses main()). The return value of main
(which should be int) serves as the termination status returned to the host environment.
The C standard defines return values 0 and EXIT_SUCCESS as indicating success
and EXIT_FAILURE as indicating failure. (EXIT_SUCCESS and EXIT_FAILURE are
defined in <stdlib.h>.) Other return values have implementation-defined meanings;
for example, under Linux a program killed by a signal is typically reported by the shell
with a status equal to the numerical value of the signal plus 128.
A minimal C program would consist only of an empty main routine:
int main(){}
The main function will usually call other functions to help it perform its job.

Some implementations are not hosted, usually because they are not intended to be
used with an operating system. Such implementations are called free-standing in the C
standard. A free-standing implementation is free to specify how it handles program
startup; in particular it need not require a program to define a main function.

Functions may be written by the programmer or provided by existing libraries. Interfaces
for the latter are usually declared by including header files (with the #include
preprocessing directive), and the library objects are linked into the final
executable image. Certain library functions, such as printf, are defined by the C
standard; these are referred to as the standard library functions.

A function may return a value to its caller (usually another C function, or the hosting
environment for the function main). The printf function mentioned above returns the
number of characters printed, but this value is often ignored.
Argument passing
In C, arguments are passed to functions by value while other languages may pass
variables by reference. This means that the receiving function gets copies of the values
and has no direct way of altering the original variables. For a function to alter a variable
passed from another function, the caller must pass its address (a pointer to it), which
can then be dereferenced in the receiving function (see Pointers for more info):
void incInt(int *y)
{
    (*y)++;      // increase the value of 'x', in main, by one
}

int main(void)
{
    int x = 0;
    incInt(&x);  // pass the address of 'x' (a pointer to it)
    return 0;
}
The function scanf works the same way:
int x;
scanf("%d", &x);
To let a function modify a pointer that belongs to the caller, pass a pointer
to that pointer (that is, the pointer's address):
#include <stdio.h>
#include <stdlib.h>

void setInt(int **p, int n)
{
    // allocate a memory area, saving the pointer in the
    // location pointed to by the parameter "p"
    *p = malloc(sizeof(int));
    if (*p == NULL)
    {
        perror("malloc");
        exit(EXIT_FAILURE);
    }

    // dereference the given pointer that has been assigned an address
    // of dynamically allocated memory and set the int to the
    // value of n (42)
    **p = n;
}

int main(void)
{
    int *p;          // create a pointer to an integer
    setInt(&p, 42);  // pass the address of 'p'
    free(p);
    return 0;
}
The parameter int **p is a pointer to a pointer; in this case it holds the address of
the pointer p declared in main.

Array parameters
Function parameters of array type may at first glance appear to be an exception to C's
pass-by-value rule. The following program will print 2, not 1:
#include <stdio.h>

void setArray(int array[], int index, int value)
{
    array[index] = value;
}

int main(void)
{
    int a[1] = {1};
    setArray(a, 0, 2);
    printf ("a[0]=%d\n", a[0]);
    return 0;
}
However, there is a different reason for this behavior. In fact, a function parameter
declared with an array type is treated almost exactly like one declared to be a pointer.
That is, the preceding declaration of setArray is equivalent to the following:
void setArray(int *array, int index, int value)
At the same time, C rules for the use of arrays in expressions cause the value of a in
the call to setArray to be converted to a pointer to the first element of array a. Thus, in
fact this is still an example of pass-by-value, with the caveat that it is the address of the
first element of the array being passed by value, not the contents of the array.
Miscellaneous

Reserved keywords
The following words are reserved, and may not be used as identifiers:
auto        double        int         switch
_Bool       else          long        typedef
break       enum          register    union
case        extern        restrict    unsigned
char        float         return      void
_Complex    for           short       volatile
const       goto          signed      while
continue    if            sizeof
default     _Imaginary    static
do          inline        struct

Implementations may reserve other keywords, such as asm, although implementations
typically provide non-standard keywords that begin with one or two underscores.

Case sensitivity
C identifiers are case sensitive (e.g., foo, FOO, and Foo are the names of different
objects). Some linkers may map external identifiers to a single case, although this is
uncommon with modern linkers.

Comments
Text starting with /* is treated as a comment and ignored. The comment ends at the
next */; it can occur within expressions, and can span multiple lines. Accidental
omission of the comment terminator is problematic in that the next comment's properly
constructed comment terminator will be used to terminate the initial comment, and all
code in between the comments will be considered as a comment. C-style comments do
not "nest".
C++ style line comments (part of standard C since C99) start with // and extend to the end of the line:
// this line will be ignored by the compiler

/* these lines
will be ignored
by the compiler */
x = *p/*q; /* note: this comment starts after the 'p' */
Command-line arguments
The parameters given on a command line are passed to a C program through the two
parameters of main: the count of the command-line arguments in argc and the
individual arguments as character strings in the pointer array argv. So the command

myFilt p1 p2 p3

results in something like

m y F i l t \0 p 1 \0 p 2 \0 p 3 \0

argv[0] argv[1] argv[2] argv[3]

(Note: While individual strings are contiguous arrays of char, there is no guarantee that
the strings are stored as a contiguous group.)
The name of the program, argv[0], may be useful when printing diagnostic messages
or for making one binary serve multiple purposes. The individual values of the
parameters may be accessed with argv[1], argv[2], and argv[3], as shown in the
following program:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;

    printf ("argc\t= %d\n", argc);
    for (i = 0; i < argc; i++)
        printf ("argv[%i]\t= %s\n", i, argv[i]);
    return 0;
}
Evaluation order
In any reasonably complex expression, there arises a choice as to the order in which to
evaluate the parts of the expression: (1+1)+(3+3) may be evaluated in the
order (1+1)+(3+3), (2)+(3+3), (2)+(6), 8 or in the
order (1+1)+(3+3), (1+1)+(6), (2)+(6), 8. Formally, a conforming C compiler
may evaluate expressions in any order between sequence points. Sequence points are
defined by:

• Statement ends at semicolons.
• The sequencing operator: a comma. However, commas that delimit function
arguments are not sequence points.
• The short-circuit operators: logical and (&&) and logical or (||).
• The ternary operator (?:): This operator evaluates its first sub-expression first, and
then its second or third (never both of them) based on the value of the first.
• Entry to and exit from a function call (but not between evaluations of the arguments).

Expressions before a sequence point are always evaluated before those after a
sequence point. In the case of short-circuit evaluation, the second expression may not
be evaluated depending on the result of the first expression. For example, in the
expression (a() || b()), if the first argument evaluates to nonzero (true), the result
of the entire expression will also be true, so b() is not evaluated.

The arguments to a function call may be evaluated in any order, as long as they are all
evaluated by the time the function call takes place. The following expression, for
example, has undefined behavior:
printf("%s %s\n", argv[i = 0], argv[++i]);
Undefined behavior
An aspect of the C standard (not unique to C) is that the behavior of certain code is said
to be "undefined". In practice, this means that the program produced from this code can
do anything, from working as the programmer intended, to crashing every time it is run.
For example, the following code produces undefined behavior, because the variable b is
modified more than once with no intervening sequence point:
#include <stdio.h>

int main(void)
{
    int a, b = 1;

    a = b++ + b++;
    printf("%d\n", a);
    return 0;
}
Because there is no sequence point between the modifications of b in b++ + b++, it is
possible to perform the evaluation steps in more than one order, resulting in an
ambiguous statement. This can be fixed by rewriting the code to insert a sequence
point:
a = b++;
a += b++;
• This page was last modified on 15 November 2010 at 02:30.
• Text is available under the Creative Commons Attribution-ShareAlike License;
additional terms may apply. See Terms of Use for details.