
Importance of C language? The importance of the C language is that C programs are typically very time-efficient compared with programs written in many other languages.

Moreover, it is a middle-level language; i.e., we can write a C program that performs the task of a program written in a low-level language as well as that of a program written in a high-level language.

A token is the basic and smallest unit of a program. There are 6 types of tokens in C. They are:
1) Keywords
2) Identifiers
3) Constants
4) Strings
5) Special symbols
6) Operators

Tokens are the indivisible elements of a program: anything that you can't put whitespace inside. Example:

printf("Hello world %d",variable);

Tokens (7 total): printf ( "Hello world %d" , variable ) ;

In a C source program, the basic element recognized by the compiler is the "token." A token is source-program text that the compiler does not break down into component elements.

Syntax
token: keyword, identifier, constant, string-literal, operator, punctuator

The keywords, identifiers, constants, string literals, and operators described in this section are examples of tokens. Punctuation characters such as brackets ([ ]), braces ({ }), parentheses ( ( ) ), and commas (,) are also tokens.

C tokens
The tokens of a language are the basic building blocks which can be put together to construct programs. A token can be a reserved word (such as int or while), an identifier (such as b or sum), a constant (such as 25 or the string "Alice in Wonderland"), a delimiter (such as { or ;) or an operator (such as + or =). For example, consider the following portion of a program:

main() {int a, b, sum;a = 14;b = 25;sum = a + b;printf("%d + %d = %d\n", a, b, sum); }

Starting from the beginning, we can list the tokens in order:

main - identifier
( - left parenthesis, delimiter
) - right parenthesis, delimiter
{ - left brace, delimiter
int - reserved word
a - identifier
, - comma, delimiter
b - identifier
, - comma, delimiter
sum - identifier
; - semicolon, delimiter
a - identifier
= - equals sign, operator
14 - constant
; - semicolon, delimiter

and so on. Thus we can think of a program as a stream of tokens, which is precisely how the compiler views it. As far as the compiler is concerned, the above could have been written:

main()
{
    int a, b, sum;
    a = 14;
    b = 25;
    sum = a + b;
    printf("%d + %d = %d\n", a, b, sum);
}

The order of the tokens is exactly the same; to the compiler, it is the same program. To the computer, only the order of the tokens is important. However, layout and spacing are important to make the program more readable to human beings.

C Keywords
Keywords are the words whose meaning has already been explained to the C compiler (or, in a broad sense, to the computer). The keywords cannot be used as variable names, because doing so would be an attempt to assign a new meaning to the keyword, which is not allowed. Some C compilers allow you to construct variable names that exactly resemble the keywords. However, it is safer not to mix up the variable names and the keywords. The keywords are also called reserved words. There are 32 keywords in ANSI (C89) C; later standards add a few more. Figure 1.5 gives a list of these keywords for your ready reference. A detailed discussion of each of these keywords will be taken up in later chapters wherever their use is relevant.

auto break case char const continue default do

double else enum extern float for goto if

int long register return short signed sizeof static

struct switch typedef union unsigned void volatile while

"Identifiers" or "symbols" are the names you supply for variables, types, functions, and labels in your program. Identifier names must differ in spelling and case from any keywords. You cannot use keywords (either C or Microsoft) as identifiers; they are reserved for special use. You create an identifier by specifying it in the declaration of a variable, type, or function. In this example, result is an identifier for an integer variable, and main and printf are identifier names for functions.

#include <stdio.h>

int main()
{
    int result = 0;    /* initialized so that the test below is well defined */

    if ( result != 0 )
        printf( "Bad file handle\n" );
    return 0;
}

Once declared, you can use the identifier in later program statements to refer to the associated value.

Constants, Variables and Keywords


The letters, digits and special symbols, when properly combined, form constants, variables and keywords. Let us see what constants and variables are in C. A constant is an entity that doesn't change, whereas a variable is an entity that may change. In any program we typically do lots of calculations. The results of these calculations are stored in the computer's memory. Like human memory, the computer's memory consists of millions of cells. The calculated values are stored in these memory cells. To make the retrieval and usage of these values easy, these memory cells (also called memory locations) are given names. Since the value stored in each location may change, the names given to these locations are called variable names. Consider the assignments x = 3 followed by x = 5. Here 3 is first stored in a memory location and the name x is given to it. Then a new value 5 is assigned to the same memory location x. This overwrites the earlier value 3, since a memory location can hold only one value at a time.

Escape Sequences
In an escape sequence, the backslash causes an escape from the normal interpretation of a string, so that the next character is recognized as one having a special meaning.

The following example shows usage of \n and a new escape sequence \t, called tab. A \t moves the cursor to the next tab stop. An 80-column screen usually has 10 tab stops. In other words, the screen is divided into 10 zones of 8 columns each. Printing a tab takes the cursor to the beginning of the next printing zone. For example, if the cursor is positioned in column 5, then printing a tab takes it to column 8.

#include <stdio.h>

int main( )
{
    printf ( "You\tmust\tbe\tcrazy\nto\thate\tthis\tbook" ) ;
    return 0 ;
}

And here's the output...

          1         2         3         4
01234567890123456789012345678901234567890
You     must    be      crazy
to      hate    this    book

The \n character causes a new line to begin following crazy. The tab and newline are probably the most commonly used escape sequences, but there are others as well. Figure 11.4 shows a complete list of these escape sequences.

Esc. Seq.   Purpose         Esc. Seq.   Purpose
\n          New line        \t          Tab
\b          Backspace       \r          Carriage return
\f          Form feed       \a          Alert
\'          Single quote    \"          Double quote
\\          Backslash

The first few of these escape sequences are more or less self-explanatory. \b moves the cursor one position to the left of its current position. \r takes the cursor to the beginning of the line in which it is currently placed. \a alerts the user by sounding the speaker inside the computer. Form feed advances the computer stationery attached to the printer to the top of the next page. Characters that are ordinarily used as delimiters (the single quote, the double quote, and the backslash) can be printed by preceding them with a backslash.

Preprocessor directives

Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). The preprocessor is executed before the actual compilation of code begins; therefore the preprocessor digests all these directives before any code is generated by the statements.

These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).

Macro definitions (#define, #undef)
To define preprocessor macros we can use #define. Its format is:

#define identifier replacement

When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C; it simply replaces any occurrence of identifier by replacement.

#define TABLE_SIZE 100
int table1[TABLE_SIZE];
int table2[TABLE_SIZE];

After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:

int table1[100];
int table2[100];

Source file inclusion (#include)
This directive has also been used in the earlier examples. When the preprocessor finds an #include directive, it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:

#include "file"
#include <file>

Integer and Float Conversions
In order to effectively develop C programs, it is necessary to understand the rules that are used for the implicit conversion of floating point and integer values in C. These are mentioned below. Note them carefully.

(a) An arithmetic operation between an integer and an integer always yields an integer result.
(b) An operation between a real and a real always yields a real result.
(c) An operation between an integer and a real always yields a real result. In this operation the integer is first promoted to a real and then the operation is performed. Hence the result is real.

A few practical examples, shown in the following figure, put the issue beyond doubt.

Operation    Result    Operation    Result
5 / 2        2         2 / 5        0
5.0 / 2      2.5       2.0 / 5      0.4
5 / 2.0      2.5       2 / 5.0      0.4
5.0 / 2.0    2.5       2.0 / 5.0    0.4

Character Sets
When you write a program, you express C source files as text lines containing characters from the source character set. When a program executes in the target environment, it uses characters from the target character set. These character sets are related, but need not have the same encoding or all the same members.

Every character set contains a distinct code value for each character in the basic C character set. A character set can also contain additional characters with other code values. For example:

The character constant 'x' becomes the value of the code for the character corresponding to x in the target character set.
The string literal "xyz" becomes a sequence of character constants stored in successive bytes of memory, followed by a byte containing the value zero: {'x', 'y', 'z', '\0'}

A string literal is one way to specify a null-terminated string, an array of zero or more bytes followed by a byte containing the value zero.

Visible graphic characters in the basic C character set:

Form          Members
letter        A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
              a b c d e f g h i j k l m n o p q r s t u v w x y z
digit         0 1 2 3 4 5 6 7 8 9
underscore    _
punctuation   ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ { | } ~

Additional graphic characters in the basic C character set:

Character   Meaning
space       leave blank space
BEL         signal an alert (BELl)
BS          go back one position (BackSpace)
FF          go to top of page (Form Feed)
NL          go to start of next line (NewLine)
CR          go to start of this line (Carriage Return)
HT          go to next Horizontal Tab stop
VT          go to next Vertical Tab stop

The code value zero is reserved for the null character, which is always in the target character set. Code values for the basic C character set are positive when stored in an object of type char. Code values for the digits are contiguous, with increasing value. For example, '0' + 5 equals '5'. Code values for any two letters are not necessarily contiguous.

Character Sets and Locales
An implementation can support multiple locales, each with a different character set. A locale summarizes conventions peculiar to a given culture, such as how to format dates or how to sort names. To change locales and, therefore, target character sets while the program is running, use the function setlocale. The translator encodes character constants and string literals for the "C" locale, which is the locale in effect at program startup.

ASCII Characters for MPE Users

The ASCII character set defines 128 characters (0 to 127 decimal, 0 to 7F hexadecimal, and 0 to 177 octal). This character set is a subset of many other character sets with 256 characters, including the ANSI character set of MS Windows, the Roman-8 character set of HP systems, the IBM PC Extended Character Set of DOS, and the ISO Latin-1 character set used by Web browsers. They are not the same as the EBCDIC character set used on IBM mainframes.

The Control Characters
The first 32 values are non-printing control characters, such as Return and Line feed. You generate these characters on the keyboard by holding down the Control key while you strike another key. For example, Bell is value 7, Control plus G, often shown in documents as ^G. Notice that 7 is 64 less than the value of G (71); the Control key subtracts 64 from the value of the keys that it modifies.

Control Characters

Char  Oct  Dec  Hex  Control-Key  Control Action
NUL   0    0    0    ^@           Null character
SOH   1    1    1    ^A           Start of heading, = console interrupt
STX   2    2    2    ^B           Start of text, maintenance mode on HP console
ETX   3    3    3    ^C           End of text
EOT   4    4    4    ^D           End of transmission, not the same as ETB
ENQ   5    5    5    ^E           Enquiry, goes with ACK; old HP flow control
ACK   6    6    6    ^F           Acknowledge, clears ENQ logon hand
BEL   7    7    7    ^G           Bell, rings the bell...
BS    10   8    8    ^H           Backspace, works on HP terminals/computers
HT    11   9    9    ^I           Horizontal tab, move to next tab stop
LF    12   10   a    ^J           Line Feed
VT    13   11   b    ^K           Vertical tab
FF    14   12   c    ^L           Form Feed, page eject
CR    15   13   d    ^M           Carriage Return
SO    16   14   e    ^N           Shift Out, alternate character set
SI    17   15   f    ^O           Shift In, resume default character set
DLE   20   16   10   ^P           Data link escape
DC1   21   17   11   ^Q           XON, with XOFF to pause listings; "okay to send"
DC2   22   18   12   ^R           Device control 2, block-mode flow control
DC3   23   19   13   ^S           XOFF, with XON is TERM=18 flow control
DC4   24   20   14   ^T           Device control 4
NAK   25   21   15   ^U           Negative acknowledge
SYN   26   22   16   ^V           Synchronous idle
ETB   27   23   17   ^W           End transmission block, not the same as EOT
CAN   30   24   18   ^X           Cancel line, MPE echoes !!!
EM    31   25   19   ^Y           End of medium, Control-Y interrupt
SUB   32   26   1a   ^Z           Substitute
ESC   33   27   1b   ^[           Escape, next character is not echoed
FS    34   28   1c   ^\           File separator
GS    35   29   1d   ^]           Group separator
RS    36   30   1e   ^^           Record separator, block-mode terminator
US    37   31   1f   ^_           Unit separator

Printing Characters

Char  Octal  Dec  Hex  Description
SP    40     32   20   Space
!     41     33   21   Exclamation mark
"     42     34   22   Quotation mark (&quot; in HTML)
#     43     35   23   Cross hatch (number sign)
$     44     36   24   Dollar sign
%     45     37   25   Percent sign
&     46     38   26   Ampersand
'     47     39   27   Closing single quote (apostrophe)
(     50     40   28   Opening parentheses
)     51     41   29   Closing parentheses
*     52     42   2a   Asterisk (star, multiply)
+     53     43   2b   Plus
,     54     44   2c   Comma
-     55     45   2d   Hyphen, dash, minus
.     56     46   2e   Period
/     57     47   2f   Slant (forward slash, divide)
0     60     48   30   Zero
1     61     49   31   One
2     62     50   32   Two
3     63     51   33   Three
4     64     52   34   Four
5     65     53   35   Five
6     66     54   36   Six
7     67     55   37   Seven
8     70     56   38   Eight
9     71     57   39   Nine
:     72     58   3a   Colon
;     73     59   3b   Semicolon
<     74     60   3c   Less than sign (&lt; in HTML)
=     75     61   3d   Equals sign
>     76     62   3e   Greater than sign (&gt; in HTML)
?     77     63   3f   Question mark
@     100    64   40   At-sign
A     101    65   41   Uppercase A
B     102    66   42   Uppercase B
C     103    67   43   Uppercase C
D     104    68   44   Uppercase D
E     105    69   45   Uppercase E
F     106    70   46   Uppercase F
G     107    71   47   Uppercase G
H     110    72   48   Uppercase H
I     111    73   49   Uppercase I
J     112    74   4a   Uppercase J
K     113    75   4b   Uppercase K
L     114    76   4c   Uppercase L
M     115    77   4d   Uppercase M
N     116    78   4e   Uppercase N
O     117    79   4f   Uppercase O
P     120    80   50   Uppercase P
Q     121    81   51   Uppercase Q
R     122    82   52   Uppercase R
S     123    83   53   Uppercase S
T     124    84   54   Uppercase T
U     125    85   55   Uppercase U
V     126    86   56   Uppercase V
W     127    87   57   Uppercase W
X     130    88   58   Uppercase X
Y     131    89   59   Uppercase Y
Z     132    90   5a   Uppercase Z
[     133    91   5b   Opening square bracket
\     134    92   5c   Reverse slant (Backslash)
]     135    93   5d   Closing square bracket
^     136    94   5e   Caret (Circumflex)
_     137    95   5f   Underscore
`     140    96   60   Opening single quote
a     141    97   61   Lowercase a
b     142    98   62   Lowercase b
c     143    99   63   Lowercase c
d     144    100  64   Lowercase d
e     145    101  65   Lowercase e
f     146    102  66   Lowercase f
g     147    103  67   Lowercase g
h     150    104  68   Lowercase h
i     151    105  69   Lowercase i
j     152    106  6a   Lowercase j
k     153    107  6b   Lowercase k
l     154    108  6c   Lowercase l
m     155    109  6d   Lowercase m
n     156    110  6e   Lowercase n
o     157    111  6f   Lowercase o
p     160    112  70   Lowercase p
q     161    113  71   Lowercase q
r     162    114  72   Lowercase r
s     163    115  73   Lowercase s
t     164    116  74   Lowercase t
u     165    117  75   Lowercase u
v     166    118  76   Lowercase v
w     167    119  77   Lowercase w
x     170    120  78   Lowercase x
y     171    121  79   Lowercase y
z     172    122  7a   Lowercase z
{     173    123  7b   Opening curly brace
|     174    124  7c   Vertical line
}     175    125  7d   Closing curly brace
~     176    126  7e   Tilde (approximate)
DEL   177    127  7f   Delete (rubout), cross-hatch box

A character denotes any letter, digit or symbol used to represent information. The following are the valid letters, digits and special symbols permitted in C:

Numerals: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Alphabets: a, b, ... z and A, B, ... Z
Arithmetic Operations: +, -, *, /, % (Mod)
Special Characters: ( = \ ) ! { $ & } ? | _ [ . ^ / ] , ~ * < : ` % > ; # @ and the blank (space)

CONSTANTS, VARIABLES AND KEYWORDS
A constant is an entity that does not change, but a variable, as the name suggests, may change. We do a number of calculations in a computer, and the computed values are stored in memory. In order to retrieve and reuse those values, the computer's memory locations are given names. Since the value stored in each location may change, the names given to these locations are called variable names.

Constants: There are mainly three types of constants, namely integer, real and character constants.

Integer Constants: Integer constants are whole numbers. E.g. 25, 35, -25, -46

On a 16-bit compiler an integer is allotted 2 bytes (16 bits) of memory; many current compilers allot 4 bytes. The 16th bit is the sign bit (if 0 the value is positive, if 1 the value is negative). The largest value is therefore the one in which the remaining 15 bits, with place values 2^14 down to 2^0, are all 1:

16384*1 + 8192*1 + 4096*1 + 2048*1 + 1024*1 + 512*1 + 256*1 + 128*1 + 64*1 + 32*1 + 16*1 + 8*1 + 4*1 + 2*1 + 1*1 = 32767

So 32767 is the largest value a 2-byte integer can hold. On the usual two's-complement machines the smallest value is -32768.

(i) Decimal Integer Constant: uses the digits 0 to 9. E.g.: 49, 58, -62 (40000 is not allowed as a 2-byte integer because it is greater than 32767)

(ii) Octal Integer Constant: uses the digits 0 to 7. Add 0 before the value. E.g.: 045, 056, 067

(iii) Hexadecimal Integer Constant: uses the digits 0 to 9 and the letters A to F. Add 0x before the value. E.g.: 0x42, 0x56, 0x67

REAL CONSTANTS: The real or floating point constants are written in two forms, namely the fractional form and the exponential form. A real constant in fractional form:
Must have at least one digit
Must have a decimal point
Could have a positive or negative sign (default sign is positive)
Must not have commas or spaces within it
Is allotted 4 bytes in memory
Ex: +867.9, -26.9876, 654.0

In exponential form, the real constant is represented in two parts. The part lying before the e is the mantissa, and the one following the e is the exponent. A real constant in exponential form must follow these rules:
The mantissa part and the exponential part should be separated by the letter e
The mantissa may have a positive or negative sign (default sign is positive)
The exponent must have at least one digit
The exponent must be a positive or negative integer (default sign is positive)
The range of real constants in exponential form is -3.4e38 to +3.4e38
Ex: +3.2e-4, 4.1e8, -0.2e+4, -3.2e-4

CHARACTER CONSTANTS
A character constant is a letter, a single digit or a single special symbol enclosed within single inverted commas. The maximum length of a character constant is 1 character. It is allotted 1 byte of memory.
Ex: 'B', 'l', '#'

Types of C Variables
Variable names are names given to locations in memory. These locations can contain integer, real or character constants. An integer variable can hold only an integer constant, a real variable can hold only a real constant and a character variable can hold only a character constant.

Rules for Constructing Variable Names
A variable name is any combination of 1 to 31 letters, digits or underscores. Some compilers allow variable names whose length could be up to 247 characters.
The first character in the variable name must be a letter (or an underscore).
No commas or blanks are allowed within a variable name.
No special symbol other than an underscore (as in net_sal) can be used in a variable name.
Ex.: si_int e_hra pod_e_81

The C compiler makes it compulsory for the user to declare the type of any variable name that he wishes to use in a program. This type declaration is done at the beginning of the program. Following are examples of type declaration statements:

Ex.: int si, e_hra ;
     float bas_sal ;
     char code ;

Since the maximum allowable length of a variable name is 31 characters, an enormous number of variable names can be constructed using the above-mentioned rules. It is good practice to exploit this enormous choice by using meaningful variable names.

C Operators/Expressions
Operators are used with operands to build expressions. For example, the following is an expression containing two operands and one operator.

4 + 5
The following list of operators is probably not complete, but it does highlight the common operators and a few of the outrageous ones. C contains the following operator groups.

Arithmetic
Assignment
Logical/relational
Bitwise
Odds and ends!
Operator precedence table.

The order (precedence) in which operators are evaluated is given in the precedence table at the end of this section.

Arithmetic
+     Addition
-     Subtraction
/     Division
*     Multiplication
%     Modulo
--    Decrement (post and pre)
++    Increment (post and pre)

Assignment
These all perform an arithmetic operation on the lvalue and assign the result to the lvalue. So what does this mean in English? Here is an example:

counter = counter + 1;
can be reduced to

counter += 1;
Here is the full set.

=      Assign
*=     Multiply
/=     Divide
%=     Modulus
+=     Add
-=     Subtract
<<=    Left shift
>>=    Right shift
&=     Bitwise AND
^=     Bitwise exclusive OR (XOR)
|=     Bitwise inclusive OR

Logical/Relational
==    Equal to
!=    Not equal to
>     Greater than
<     Less than
>=    Greater than or equal to
<=    Less than or equal to
&&    Logical AND
||    Logical OR
!     Logical NOT

Bitwise
&     AND (Binary operator)
|     Inclusive OR
^     Exclusive OR
<<    Shift left
>>    Shift right
~     One's complement

Odds and ends!


sizeof()    Size of objects and data types. strlen may also be of interest.
&           Address of (Unary operator)
*           Pointer (Unary operator)
? :         Conditional expressions
,           Series (comma) operator

Precedence of C Operators
Category          Operator                              Associativity
Postfix           () [] -> . ++ --                      Left to right
Unary             + - ! ~ ++ -- (type) * & sizeof       Right to left
Multiplicative    * / %                                 Left to right
Additive          + -                                   Left to right
Shift             << >>                                 Left to right
Relational        < <= > >=                             Left to right
Equality          == !=                                 Left to right
Bitwise AND       &                                     Left to right
Bitwise XOR       ^                                     Left to right
Bitwise OR        |                                     Left to right
Logical AND       &&                                    Left to right
Logical OR        ||                                    Left to right
Conditional       ?:                                    Right to left
Assignment        = += -= *= /= %= >>= <<= &= ^= |=     Right to left
Comma             ,                                     Left to right

Note 1: Parentheses are also used to group sub-expressions to force a different precedence; such parenthetical expressions can be nested and are evaluated from inner to outer.

Note 2: Postfix increment/decrement have high precedence, but the actual increment or decrement of the operand is delayed (to be accomplished sometime before the statement completes execution). So in the statement y = x * z++; the current value of z is used to evaluate the expression (i.e., z++ evaluates to z), and z is incremented only after its old value has been used.
