STRINGS (chap 8)
You've printed literals, or string constants. Now we'll learn to store string values. C++ has a
string class. (Whatever that is; we'll see in chapter 10. Think of it as a type like int or double.)
This class allows one to declare objects (which we will treat like variables of data type string) and
provides methods (or member functions) for manipulating strings. To use strings, you must
#include<string>. The string header file includes the prototypes for the string functions,
just as cmath contains the prototypes for mathematical functions, and iostream contains the
prototypes for I/O functions.
Lots of details and what-ifs. Even I can't do them all in class. Read the text.
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string str1, str2;
    str1 = "Hello";
    str2 = "Harry";
    cout << str1 << ", " << str2 << endl;
    return 0;
}
This will print "Hello, Harry" and go to a new line.
You can
initialize a string in the declaration
assign the value of one string to another,
send a string to cout
read in a string from cin (you don't need to type the quotes)
But cin stops reading as soon as a whitespace (space, tab, or newline) character is
encountered. This is quite limiting if you want to read in someone's full name or a sentence.
The general form is getline(input_stream, string_variable, delimiting_character);
getline() reads characters (including whitespace characters) from the input_stream into the
specified string_variable until the delimiting_character is encountered. The third
parameter is optional. If it is omitted, the default is the newline character.
string name;
getline(cin,name);
cout << name << endl;
reads until it encounters the newline (i.e. the user hits "Enter"). The \n is read but not
stored in name.
The optional third parameter can be any character used as a delimiter. getline() reads until it
encounters the delimiter, which is not stored in the string.
getline(cin,first,',');
getline(cin,second,'D');
Thus, if the line typed in was "Hello, John Doe" followed by a newline character, first would get
the value "Hello" and second would get " John ", including the space before 'J' and the space after
'n'. Neither string would include the comma, because that was the delimiter.
LAB Strings1.doc
ignore()
Warning, warning, warning: Not in the book, but necessary: if you try to use getline() after >>, it
will read whatever is left on the line, even if it is just a newline character. You have to use the
function ignore() (a stream member function, available through <iostream>) to skip the rest of the
line and go to the beginning of the next line.
cin.ignore(numchars,char);
ignore() skips the number of characters specified or all of the characters up to and including the
character specified by the second parameter, whichever comes first.
For example,
cin.ignore(80, '\n');
Use infile.ignore(100, '\n'); if you are reading from infile rather than the
keyboard.
String operations in C++ are performed using some of the same operators that are used for
arithmetic operations. The symbols are interpreted slightly differently in some cases. We've already
seen the use of the = operator for assignment.
Concatenation
It is possible to join two or more strings together using the + operator. Rather than addition, this
use of the + operator joins one string to the end of another; this operation is called
concatenation.
string str1, str2, str3;
str1 = "yesterday";
str2 = "morning";
str3 = str1 + " " + str2;
cout << str3 << endl;
String str3 has the value "yesterday morning", and that value is sent to cout.
Notes:
It is necessary to concatenate a space between the two words to produce a space in the
resulting string.
At least one of the operands must be a string variable; two literals cannot be joined with +.
The use of an operator (like +) for two different actions is called operator overloading.
[not responsible this term]
The += operator appends to the end of an existing string:
str += 's';
Comparing Strings
You can compare two strings using the standard relational operators (<, <=, >, >=, ==, and !=).
Two strings are the same (equal to each other) if they have the same number of characters and if
each corresponding character matches: e.g., "cat" is the same as "cat" but not the same as "act",
"Cat" or even as "cat " (with a space at the end).
string str1 = "cat", str2 = "dog";
if (str1 == str2)
    cout << "alike" << endl;
else
    cout << "different" << endl;
This will, of course, print "different" because "cat" is not the same string as "dog".
Strings can be compared to determine whether one is greater than or less than the other.
String functions
As I said, the string class is a pre-defined class. (Whatever that is; we'll see in chapter 10. Think
of it as a type like int or double.)
Many pre-defined classes in C++ have functions associated with them, called methods or
member functions of the class. Member functions define the operations that can be performed on
[?by] objects of the class. We've already used several member functions of other classes: in
Chapter 3, we introduced the iostream member functions cout.precision() and cout.width(). These
member functions work on cout, which is an object of the ostream class (declared in <iostream>). We also introduced the
fstream member functions open() and close() which work on files, objects of the fstream class.
A string variable is an object of the string class. Among its most useful member functions are
length(), substr(), find(), insert(), and erase().
The length of a string is the number of characters currently stored in the string. There are two
member functions that will return this value: length() and size(). (They do the same thing.)
Example:
string city = "Queens";
cout << city.length(); // will print 6, the number of characters stored in city
Notes:
We are using the names of the strings--state and city--to indicate what string object
[variable] the member function should act upon. [to whom the function belongs?]
It is safest to use string::size_type when referring to string sizes or to positions in a string.
LAB See censor.doc EXERCISE: Write a program to censor 4-letter words. It reads strings
consisting of words and prints "censored" each time a 4-letter word is read. Otherwise it echoes
the word. [Uses while(cin) and CTRL-Z. Unfortunately doesn't necessarily use string::size_type,
unless I insist.]
The characters in a string have numbered positions, starting at 0, just like the values in an array.
We can refer to the position (or index) of a character or a group of characters in a string.
Remember, we start counting at 0. For example, here is the string "house" together with the
position (or index) of each character.
character   h  o  u  s  e
position    0  1  2  3  4
We can use the position to access the value stored at that location.
If a function were to return the position of the letter 'u' in this string, it would return 2; similarly,
the string "use" appears starting in position 2 of this string. The letter 'x' is not found. A function
that tries to find something which is not present in a string returns string::npos. We can't use 0
to indicate that a value is not found, because 0 would mean position 0 of the string. The
value npos is the largest value that string::size_type can hold, which is greater than
the largest possible character position. (Notice that the string above has 5 characters, but the
largest position is 4.) The exact value of npos is machine dependent and irrelevant; what matters
is that it represents a value which cannot be an index into the string.
find takes two parameters: the string to find, and the starting position. It searches till the end of
the string. find returns the position at which the item is found or string::npos if the item is not
found. Reminder: find returns a value of type string::size_type, so declare the
variable to hold the result that way.
See str_find.cpp
Removing characters from a string can be done using erase(). This member function takes
two parameters, the starting position and the number of characters to remove.
Syntax: source.erase(start_pos, numchars);
The erase() function removes the characters and closes up the string. If the second parameter
is omitted, the function erases characters from the starting position to the end of the string.
See strex2.cpp
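insert() inserts one string into another. It takes two parameters: the position at which to insert
and the string to insert.
Syntax: source.insert(start_pos, newstr);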
See strex2.cpp
substr() extracts part of a string, leaving the original string unchanged. It takes two
parameters of type string::size_type: the first is the position to start extracting, and the
second is the number of characters to extract. It returns a substring consisting of the extracted
characters. If you omit the second parameter, substr() will extract characters from the starting
position to the end of the string.
SEE strex2.cpp
LAB String_functions.doc
Sending a string to a function works the same way as sending any other type. The type of the
formal parameter must be string. If the (? Actual parameter) string is to be changed inside the
function, it must be sent as a reference parameter, string&. string may be the return type of a
function.
Problem 8: A direct mail advertising agency has decided to personalize its sweepstakes offers. It
has prepared a basic sample letter. They would like to have a computer program change the letter
so that it can address each customer individually. As a first test case, the company would like a
program to make one set of changes to the text. It would like to replace all occurrences of
1. NAME by Jones
2. TITLE by Ms.
3. ADDRESS by Baltimore
Write a C++ program that will read in the text line by line, make the changes outlined in the list,
and display the revised text.
Pseudocode:
while there is a line of the letter to read
    read in a line of the original letter
    replace the old strings in the line by the new ones
We can get started from this pseudocode pretty quickly. The amount of data to input is large and
repetitive so we will use files for input and output. As you can see, we process the letter line by
line; that's a standard way of processing large amounts of data, especially when the data values are
being read in from a file.
What do we need to do? To start, we need a function to read in and print lines of the letter. main
can do that. Since the lines of text contain whitespace characters, we must use getline().
The change() function will accept one parameter: the line to change. It will read in the
changes and call another function to make the actual changes.
Pseudocode for change()
Let's develop the pseudocode for change(). The string line is sent to change() as a parameter. In
order to make all the replacements in each line of the text, the function change() will have to read
in the entire set of old and newstr values for each line of the text. This suggests a loop that will
read in sets of data values until there are no more.
[Would be nicer, but maybe more complicated, to read once into an array instead of opening and
closing each time.]
Notes:
We will assume that the changes come in pairs.
The values we are reading in (such as 421 Main St.) can contain spaces, so we must use
getline() to read in the values.
change() will read the entire changesfile for each line, that is, every time it is called. So
the file will be opened and closed in change().
changesfile.clear(); // clear the EOF flag
In C++, a status flag is set when we read to the end of a file; it keeps us from reading from that
file again. Once we've read to the end of the file, closing the file and re-opening it does not
change the status flag for that file, and the program will still consider that it is at end-of-file.
Calling clear() resets the status flag so that we can read again from the beginning of the file.
(Since C++11, a successful open() resets the flags itself, but calling clear() is harmless and keeps
the code working with older compilers.)
[Ziegler didn't do this! How did his work?] [Also to continue after error]
change() will call change_one_line() to make the changes.
change_one_line() receives three parameters: line, oldstr, and newstr. It will try to
find oldstr in line and will replace each occurrence of oldstr in line by newstr.
This is the most complicated of the functions. Good that
we isolated it. Modular top-down programming.
Note:
pos = line.find(oldstr, pos + newstr.length());
We don't want to find oldstr again if oldstr is part of newstr, e.g. "cause" and "because". That would
create an infinite loop.
LAB and HW: Do the tracing exercises in the text. Maybe I can pick some.
SKIP SECTION 6: ENRICHMENT: DATA TYPE char, character I/O functions get() and
put(), whose headers are in <iostream> and functions which allow us to check and change the
values of variables of type char (or individual characters in a string) whose headers are in
<cctype>.
Character-Oriented I/O
We have discussed token-oriented input, performed when you use the operator >> on streams
and file streams. Token-oriented input reads a token, an item (generally) separated from the next
token by whitespace characters. We have also discussed line-oriented input, performed by
getline(), which reads an entire line of input. C++ has a third kind of I/O, character-oriented
I/O, which reads in or prints out one character at a time. Character-oriented I/O is performed by
the functions get() and put().
The general form of a call to get() to read a character from an input stream infile into a variable
ch:
ch = infile.get();
If you were to read this string character by character with the >> operator and count the
characters, you would read in 16 characters. If you were to read it character by character with
get(), you would read in 19 characters, counting the spaces. (And if you were to read it with
getline(), you'd read the whitespace characters, but you'd read the whole string at once; and you
couldn't count the characters as they are being read in.)
EXAMPLE 8-14: This loop will read characters from cin until the user signals the end of data
input (<Ctrl>-z in Windows, <Ctrl>-d in Unix) and count the number of characters read in. If the
user enters "this one" (without quotation marks), count will get the value 8. (See Chapter 7,
Section 9, for reading to the end of input.)
char let;
int count = 0;
let = cin.get();
while (cin) {
count ++;
let = cin.get();
}
cout << " characters entered: " << count << endl;
EXAMPLE 8-15a: Suppose we want to continue processing in a program if the user enters 'y',
but terminate if the user enters 'n'. (This is an example of the user-response method introduced in
Chapter 6.) The code below reads in a single character, assigns it to the variable answer, and
continues processing if answer equals 'y':
char answer;
do {
    // action of the loop goes here
    cout << "Do you want to continue? (y/n) > ";
    answer = cin.get();
} while (answer == 'y');
If the person using the program types in 'y' (and then <Enter>), answer gets the value 'y'. The
while condition is true, and the loop repeats.
If the person using the program types in the letter 'n' (and then <Enter>), answer gets the value
'n'. The while condition is false, and the loop terminates.
If the user enters a response like 'M' or '?' or even 'Y', the loop also terminates.
Unfortunately, this loop does not work correctly. The next subsection explains the
problem.
Buffering of Input
The function get() may appear not to work properly in some circumstances, as in Example
8-15a. The loop may continue several times or terminate unexpectedly, seemingly without regard
for what the user enters.
This happens because pressing <Enter> causes the entire line to be stored in an input area
called a buffer. An input function like get() first tries to retrieve data values from the input buffer.
If the user enters 'y', the buffer contains two characters: 'y' and the newline character, ('\n') caused
by pressing <Enter>. First, get() reads the 'y', causing the loop to repeat. However, on the next
call, get() doesn't wait for the user to enter a new value; instead, get() reads '\n' from the input
buffer; since '\n' is not 'y', the loop stops.
You may think that changing the loop condition will make the loop wait for an 'n' to stop:
while (answer != 'n');
However, this also won't work. Each user response of 'y' causes the loop to execute twice, once
for 'y' and once for the newline character. Example 8-15b shows a simple solution to this
problem.
EXAMPLE 8-15b:
To eliminate the newline character, place an extra get() in the loop (to read the newline
character) and ignore the value read by this extra call:
char answer, trash;
do {
    // action of the loop goes here
    cout << "Do you want to continue? (y/n) > ";
    answer = cin.get();
    trash = cin.get();   // reads and discards the newline
} while (answer == 'y');
The response is read into answer, and the newline character is read into trash, which is ignored.
(Because of this, the line trash = cin.get() can be simplified to cin.get().)
The put() function takes one parameter--the char value to print (in this example, ch)--and sends it
to the specified stream, in this case cout or outfile.
Example 8-16 modifies Example 8-14 to display each character as it is read in.
EXAMPLE 8-16:
This loop will read characters from cin until the user signals the end of data input (<Ctrl>-z
in Windows, <Ctrl>-d in Unix) and count the number of characters read in. It also sends a copy
of each character to cout.
char let;
int count = 0;
let = cin.get();
while (cin) {
count++;
cout.put(let);
let = cin.get();
}
cout << "the number of characters entered is " << count <<
endl;
Suppose you type in these characters, terminated by pressing <Ctrl>-z. (Don't press <Enter>
before <Ctrl>-z, or the newline character will be counted!)
abc def
abc def
abc def
the number of characters entered is 7
Notice that the echo of all the characters typed into cin (the first line) appears before all the
characters sent to cout (the second line).
It is possible to test the value of a specific character by comparing it with others in a specific
range. Example 8-17 shows how to write a function to detect a lowercase alphabetic character
(although, as we'll see shortly, this function already exists in C++).
EXAMPLE 8-17:
Here is a possible definition for a function islower(), which receives a character ch as a parameter
and returns true if ch is a lowercase alphabetic character or false if it is not:
/* returns true if ch is a lowercase letter, false if anything else */
bool islower(char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return true;
    else
        return false;
}
We can write similar functions for the other tasks described above, but some of them are
rather complicated (the char values representing punctuation, for example, have ASCII codes
which are not adjacent). To make the task simpler, C++ has functions to perform these checks.
Some of these functions are isalpha(), isdigit(), isalnum(), isspace(), ispunct(), islower(), and
isupper(). Each function takes a single character as its sole parameter and returns true or false.
To use them #include<cctype>.
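Function     Checks whether the character is
isalpha()    an alphabetic character
isdigit()    a digit
isalnum()    a letter or a digit
isspace()    a whitespace character (space, tab, or newline)
ispunct()    punctuation
islower()    a lowercase letter
isupper()    an uppercase letter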
The following example uses four function calls to determine the type (alphabetic, digit, space, or
punctuation) of each character read in and count how many there are of each type.
char ch;
int alpha = 0, digit = 0, space = 0, punct = 0;
ch = cin.get();
while (cin) {
    if (isalpha(ch))
        alpha++;
    else if (isdigit(ch))
        digit++;
    else if (isspace(ch))
        space++;
    else if (ispunct(ch))
        punct++;
    ch = cin.get();
}
cout << "The number of alphabetic characters is " << alpha << endl;
cout << "The number of digits 0-9 is " << digit << endl;
cout << "The number of spaces entered is " << space << endl;
cout << "The number of pieces of punctuation " << punct << endl;
The functions toupper() and tolower() convert a letter to uppercase or to lowercase, respectively.
If sent anything else as a parameter, either function returns the value unchanged.
Without using toupper() or tolower(), here is what we have to do to allow the user to enter either
'y' or 'Y' to continue executing the loop:
char ans;
do {
    ...
    cout << "Do you want to continue? (y/n) > ";
    ans = cin.get();
    cin.get();
} while (ans == 'y' || ans == 'Y');
With toupper(), the loop condition can be written more simply:
} while (toupper(ans) == 'Y');
In this case, if the user enters 'y', 'y' is sent to toupper(), which returns 'Y' for the comparison, and
the result is true. If the user enters 'Y', the character that was entered is returned and the result is
also true. If the user enters anything else, the result of the comparison is false.
The call to toupper() in Example 8-19b doesn't change the value of ans. To change
the value of the character sent to toupper() or tolower(), you must assign the return value
to a variable.
EXAMPLES:
string str;
str = "3BlinDMICe";
for (string::size_type i = 0; i < str.length(); i++)
    str[i] = tolower(str[i]);
TRACE
Keith said to know how to sort them. They are treated like any other array except [?]
SEE sortstrings.cpp Assumes I already did sorting, else write myself a note to do this after
BubbleSort
SUMMARY
String Basics
2. The declaration of a string specifies the name of an object of class string; the name of the
object in this example is str.
string str;
3. A string can be given a value in the declaration or in an assignment statement, as shown below:
string fruit, dessert = "pie";
fruit = "apple";
dessert = fruit;
4. A string can be sent to cout or to a file stream using the insertion operator (<<).
5. A value can be read from cin or from a file stream into a string variable. However, since the
extraction operator (>>) stops when it finds a whitespace character, this does not permit reading
in strings that contain blanks or tabs.
6. C++ has a stream input function, getline(), which allows a program to read a line of input
from cin or from a file stream. This function reads past whitespace characters, which allows it to
read in a string like "New York". It generally takes two parameters, the name of a stream and the
name of the string into which to read the input value. It reads until it encounters a newline
character (or until it reads a character which may be specified as a third parameter).
7. A string can be concatenated (joined to) the end of another string by using the + or +=
operator.
8. A string variable can be compared with another string variable or with a literal string, using the
standard relational operators. Strings are compared character by character using their ASCII
codes, which are assigned in increasing alphabetic order, with all capitals less than all lowercase
letters. (See the Appendix for a complete list of ASCII codes, and see Chapter 2, Section 5, for
more discussion of data type char.) If the first characters in the two strings are the same, the next
characters are compared, and so on, until a difference is found. At that point, the string whose
character is closest to the beginning of the alphabet is considered "less" than the other string.
9. Each position in a string has a number, starting at position 0. Individual positions in a string
can be accessed or changed using those numbers, just like the positions in an array.
10. Most other operations on strings are performed through member functions of the string
class. The string member functions have their prototypes in the header file string. A program which uses
strings must #include <string>.
11. String member functions are called by appending the name of the function to the name of the
string. If the string is named str and the function is called func(), this would be the format for a
call to the function (assuming the function does not return a value): str.func();
12. Two member functions, length() and size(), do the same thing. Each returns an unsigned
integer representing the number of characters in the string; this is known as the size or length of
the string. See 14. below.
13. SKIP: A string of length 0 is called an empty or null string. It can be produced by using the
string member function clear(). Here is an example:
name.clear();
After the call to clear(), there will be nothing in name. An empty string can also be produced by
assigning a value that consists of two quotation marks placed side by side (""), as shown below:
name = "";
14. The string class includes a data type: string::size_type. This is an unsigned integer type. The
member functions size() and length() return a value which has type string::size_type. (An
unsigned integer can never be negative and can hold a value larger than what can be stored as an
int.)
15. The string class has a member constant, string::npos, which is the value returned by a
function that is unable to find an item in a string. The value npos is the largest value that
string::size_type can hold; it is greater than the largest possible character position.
16. The function find() tries to find one string in another. In its simplest form, find() takes two
parameters: the first parameter is the string to find, and the second is the starting position. The
search continues to the end of the string. The member function returns the position in which the
item is found within the source string, or it returns string::npos if the item is not found.
17. SKIP: The function replace() allows replacing part of a string by another string.
18. The function erase() allows removing characters from a string. This function generally takes
two parameters, the starting position and the number of characters to remove. The erase()
function removes the characters and closes up the string.
19. The function insert() permits inserting one string (or part of a string) into another string. The
simplest form of the member function insert() takes two parameters: the first is the position into
which to insert characters, the second is the string to insert.
20. The substr() function extracts part of a string, leaving the original string unchanged. The
function takes two parameters: the first is the position from which to start extracting, and the
second is the number of characters to extract. The function returns a substring consisting of the
extracted characters. If you omit the second parameter, substr() will extract characters from the
starting position and continuing to the end of the string.
21. When you send a string as a parameter to a function, the type of the formal parameter must be
string. If the string is to be changed inside the function, it must be sent as a reference parameter,
using the & operator.
23. Typically, processing strings involves working with large chunks of text. The text is stored in a
file. The standard method of reading and processing text from a file is to read a line, process the
line, and write the line out to a file, repeating this until the entire file has been processed.
The char Data Type, Character I/O Functions, and Functions from cctype
24. The char data type has 256 possible values, including all the letters, digits, and symbols on
the keyboard. Each character of a string is a member of the char data type. The char data type
is a subset of type int, and each character has a numerical equivalent--called its ASCII code--in
the range from 0 to 255.
25. C++ has two stream functions dedicated to character I/O; they allow input or output to be
done one character at a time. The functions for character-oriented I/O are get() and put().
26. The get() function is useful because the extraction operator (>>) skips over whitespace
characters (tab, space, and newline) when reading from a stream. The get() function reads
whitespace characters.
27. The get() function reads in one character at a time. A call to get() returns the next character
in the input stream. A call to get() to read from cin looks like this:
char ch;
ch = cin.get();
28. The put() function displays a character to a stream; it does the same thing as printing a
character with the insertion operator (<<). Here is a call to put() to send a character to cout:
char let = 'B';
cout.put(let);
29. You can use the get() function to read in one or more lines of text, perhaps counting the
number of characters read in. Here is an example which reads and prints characters in a loop,
testing the success of input to detect the end of the set of data:
char ch;
int numstrings = 0;
ch = cin.get();
while (cin) {
    cout.put(ch);
    numstrings++;
    ch = cin.get();
}
cout << numstrings << " characters were read in" << endl;
30. The header file cctype contains the prototypes for a number of functions dedicated to
testing and manipulating characters. A program which uses these functions should contain the
following line:
#include<cctype>
31. Among the functions in cctype are isalpha(), isdigit(), isspace(), ispunct(), islower(),
isupper(), and isalnum(). These functions accept a character as a parameter and classify the
character as, respectively, an alphabetic character, a digit, a space, punctuation, lowercase,
uppercase, or an alphanumeric character (a letter or a digit). Each function returns true to
indicate success or false if not. The form of a call to one of these functions is illustrated by this
call to ispunct():
char ch;
ch = cin.get();
if (ispunct(ch))
    cout << "the character " << ch << " is punctuation" << endl;
32. Two other useful functions from cctype are toupper() and tolower(). These allow
conversion of a letter from lowercase to uppercase and vice versa.
Arrays of Strings
34. You can send one element of a string array to a function or to a string member function. Here
is an example that sends each of the first five strings from the names array (from paragraph 33) to
the member function length():
35. If the string is to be changed in the function, it must be sent as a reference parameter.
36. A parameter which is a string array is always a reference parameter and does not need the &
operator, but does need the [] operator, in both the function header and prototype.
37. SKIP To access any single string in an array of strings requires one subscript; to access a
character in that string requires two subscripts. An element of a string array can be used in the
same way as a single string. Here are some examples: