
Characters and tokens

Characters are the basic building blocks of a C program, equivalent to letters in the English language. They include every printable character on the standard English-language keyboard except `, $ and @.

Examples of characters:
- Numeric digits: 0 - 9
- Lowercase/uppercase letters: a - z and A - Z
- Space (blank)
- Special characters: , . ; ? / ( ) [ ] { } * & % ^ < > etc.

A token is a language element that can be used to form higher-level language constructs. It is equivalent to a word in the English language.

Several types of tokens can be used to build higher-level C language constructs such as expressions and statements. There are 6 kinds of tokens in C:
- Reserved words (keywords)
- Identifiers
- Constants
- String literals
- Punctuators
- Operators

Reserved Words
Keywords that identify language entities such as statements, data types, language attributes, etc.
- Have special meaning to the compiler and cannot be used as identifiers in our program.
- Must be typed in lowercase.
- Examples: const, double, int, void, while, for, else (etc.)
- Note that main is the name of a function, so it is an identifier and not a reserved word.
- Displayed in blue in MS Visual C++.

Identifiers
Words used to represent certain program entities (program variables, function names, etc.).

Example:
    int my_name;
my_name is an identifier used as a program variable.
    void CalculateTotal(int value)
CalculateTotal is an identifier used as a function name.

Rules for Constructing Identifiers

1. Identifiers can consist of the capital letters A to Z, the lowercase letters a to z, the digits 0 to 9 and the underscore character _.
2. The first character must be a letter or an underscore.
3. There is virtually no length limitation. However, in many implementations of the C language, the compiler recognizes only a limited number of leading characters as significant (the C89 standard guarantees 31 for internal identifiers, C99 guarantees 63).
4. There can be no embedded blanks.
5. Reserved words cannot be used as identifiers.
6. Identifiers are case sensitive. Therefore, Tax and tax are different identifiers.


Examples of legal identifiers:
- Student_age, Item10, counter, number_of_character

Examples of illegal identifiers:
- Student age (embedded blank)
- continue (continue is a reserved word)
- 10thItem (the first character is a digit)
- Principal+interest (contains the operator character +)

Recommendations for Constructing Identifiers

1. Avoid excessively short and cryptic names such as x or wt. Instead, use more readable and descriptive names such as student_major and down_payment.
2. Use underscores or capital letters to separate words in identifiers that consist of two or more words. For example, student_major or studentMajor is much easier to read than studentmajor.

Constants
Entities that appear in the program code as fixed values. There are 4 types of constants:
- Integer constants
- Floating-point constants
- Character constants
- String literals

Integer Constant
- Positive or negative whole numbers with no fractional part.
- Optional + or - sign before the digits.
- Can be written in decimal (base 10), octal (base 8) or hexadecimal (base 16).
- Hexadecimal is very useful when dealing with binary numbers, since each hexadecimal digit corresponds to exactly four bits.

Example:
    const int MAX_NUM = 10;
    const int MIN_NUM = -90;
    const int Hexadecimal_Number = 0xf87;

Rules for Decimal Integer Constants
1. Decimal integer constants must begin with a nonzero decimal digit, the only exception being 0, and can contain the decimal digits 0 through 9. An integer that begins with 0 is treated as an octal constant.
2. If the sign is missing in an integer constant, the compiler assumes a positive value.
3. Commas are not allowed in integer constants. Therefore, 1,500 is illegal; it should be 1500.

Examples of legal integer constants: 15, 0, +250 and 7550

Examples of illegal integer constants:
- 0179 (the leading zero makes it an octal constant, and 9 is not an octal digit)
- 1F8 (contains the letter F without the 0x prefix)
- 1,700 (contains a comma)

Floating Point Constant


- Positive or negative decimal numbers with an integer part (optional), a decimal point, and a fractional part (optional).
  Examples: 2.0, 2., 0.2, .2, 0., 0.0, .0
- Can be written in conventional or scientific notation:
  20.35 is equivalent to 0.2035E+2 (0.2035 x 10^2)
  0.0023 is equivalent to 0.23e-2 (0.23 x 10^-2)
- E or e stands for exponent.
- In scientific notation, the decimal point may be omitted. Example: -8.0 can be rewritten as -8e0.



- C supports 3 floating-point types: float (typically 4 bytes), double (typically 8 bytes) and long double (implementation-dependent; commonly 8, 12 or 16 bytes).
- By default, a floating-point constant has type double.
- The suffix f (F) or l (L) specifies float or long double respectively.

Example:
    const float balance = 0.125f;
    const float interest = 6.8e-2F;
    const long double PI = 3.1416L;
    const long double planet_distance = 2.1632E+30l;

Character constants
A character enclosed in single quotation marks.

Example:
    const char letter = 'n';
    const char number = '1';
    printf("%c", 'S');
Output would be: S

How to write a single quotation mark? ''' is ambiguous, so use the escape character backslash (\): '\''

String Literals
A sequence of any number of characters surrounded by double quotation marks.

Example:
    "Human Revolution"

How to write a double quotation mark inside a string? """ is ambiguous, so use the escape character.

Example:
    printf("He shouted, \"Run!\"");
Output: He shouted, "Run!"

The escape character along with the character that follows it is called an escape sequence.

Escape Sequence   Name               Meaning
\a                Alert              Sounds a beep
\b                Backspace          Backs up 1 character
\f                Formfeed           Starts a new screen or page
\n                New line           Moves to beginning of next line
\r                Carriage return    Moves to beginning of current line
\t                Horizontal tab     Moves to next tab position
\v                Vertical tab       Moves down a fixed amount
\\                Back slash         Prints a back slash
\'                Single quotation   Prints a single quotation
\"                Double quotation   Prints a double quotation
\?                Question mark      Prints a question mark

Punctuators (separators)
Symbols used to separate different parts of the C program. These punctuators include:
    [ ] ( ) { } , ; : * #

Usage example:
    int main(void)
    {
        int num = 10;
        printf("%i", num);
        return 0;
    }

Operators
Tokens that result in some kind of computation or action when applied to variables or other elements in an expression. Examples of operators:
    * + = - /

Usage example:
result = total1 + total2;

Operators
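Bitwise operators are important when dealing with low-level processing (image processing, etc.). The examples use unsigned char and cast results back to unsigned char where needed, because C promotes char operands to int before applying these operators.

- Complement ( ~ ): reverses every bit in the data.
    unsigned char num = 0xF5;                   // 1111 0101
    printf("%X", (unsigned char)~num);          // output is A (0000 1010)

- Left shift ( << ): shifts the bits to the left and fills the vacated positions with ZERO.
    unsigned char num = 0xF5;                   // 1111 0101
    printf("%X", (unsigned char)(num << 2));    // output is D4 (1101 0100)
  Each shift by one position is equivalent to multiplying the operand by 2, as long as no set bits are shifted out.

- Right shift ( >> ): shifts the bits to the right and fills the vacated positions with ZERO (for unsigned operands).
    unsigned char num = 0xF5;                   // 1111 0101
    printf("%X", num >> 2);                     // output is 3D (0011 1101)
  Each shift by one position is equivalent to dividing the operand by 2 and discarding the remainder.

- Bitwise AND ( & ): ANDs every bit of the first operand with the corresponding bit of the second.
    unsigned char num = 0xF5, mask = 0x0F;      // 1111 0101, 0000 1111
    printf("%X", num & mask);                   // output is 5 (0000 0101)
  Useful for masking to CLEAR (set to zero) specific bits.

- Bitwise OR ( | ): ORs every bit of the first operand with the corresponding bit of the second.
    unsigned char num = 0xF5, mask = 0x0F;      // 1111 0101, 0000 1111
    printf("%X", num | mask);                   // output is FF (1111 1111)
  Useful for masking to SET (set to 1) specific bits.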

End...
