John Lamertina (Deitel Java 5.0 Chp 14, 19, 29) April 2007
Content
Reading and Writing Data Files (chp 14)
String Tokenizer to Parse Data (chp 29)
Comma Separated Value (CSV) Files: an exercise which applies:
    Multi-dimensional arrays (chp 7)
    Exception Handling (chp 13)
    Files (chp 14)
    ArrayList Collection (chp 19)
    Tokenizer (chp 29)
Data Hierarchy
Field: a group of characters or bytes that conveys meaning
Record: a group of related fields
File: a group of related records
Record key: identifies a record as belonging to a particular person or entity; used for easy retrieval of specific records
Sequential file: a file in which records are stored in order by the record-key field
End-of-file: a Java program processing a stream of bytes receives an indication from the operating system when the program reaches the end of the stream
Reading & Writing Files 4
Java opens a file by creating an object and associating a stream with it
Standard streams (each stream can be redirected):
    System.in: standard input stream object, can be redirected with method setIn
    System.out: standard output stream object, can be redirected with method setOut
    System.err: standard error stream object, can be redirected with method setErr
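The redirection described above can be sketched as follows. This is only an illustration, not part of the original slides: the class name RedirectDemo and the helper captureStandardOutput are hypothetical, and the lambda syntax assumes Java 8 or later (the slides target Java 5.0). The sketch temporarily points System.out at an in-memory buffer with System.setOut and then restores the original stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class; the names are illustrative only.
class RedirectDemo {
    // Runs the task while System.out is redirected to an in-memory buffer,
    // then restores the original stream and returns what was printed.
    static String captureStandardOutput(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream redirected = new PrintStream(buffer, true);
        System.setOut(redirected);          // redirect standard output
        try {
            task.run();
        } finally {
            redirected.flush();
            System.setOut(original);        // always restore the original stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = captureStandardOutput(() -> System.out.print("hello"));
        System.out.println("Captured: " + captured);
    }
}
```

System.setIn and System.setErr follow the same pattern with an InputStream and a PrintStream respectively.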
java.io classes
FileInputStream and FileOutputStream: byte-based I/O
FileReader and FileWriter: character-based I/O
ObjectInputStream and ObjectOutputStream: used for input and output of objects or variables of primitive data types
File: useful for obtaining information about files and directories
File Class
Formatter object can be used to write data sequentially to a text file
Pass name of file to Formatter constructor
If file does not exist, it will be created
If file already exists, its contents are truncated (discarded)
Use method format to write formatted text to file
Use method close to close the Formatter object (if method is not called, OS normally closes file when program exits)
Example: see figure 14.7 (p 686-7)
Possible Exceptions
SecurityException: occurs when opening file using Formatter object, if user does not have permission to write data to file
FileNotFoundException: occurs when opening file using Formatter object, if file cannot be found and new file cannot be created
NoSuchElementException: occurs when invalid input is read in by a Scanner object (wrong format, or no more input available)
FormatterClosedException: occurs when an attempt is made to write to a file using an already closed Formatter object
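A minimal sketch of writing a text file with a Formatter while handling the exceptions listed above. It is not the textbook's Figure 14.7: the class name FormatterWriteDemo, the method writeSampleRecords, the file name clients.txt, and the sample records are all assumptions made for illustration.

```java
import java.io.FileNotFoundException;
import java.util.Formatter;
import java.util.FormatterClosedException;

// Hypothetical demo class; names and sample data are illustrative only.
class FormatterWriteDemo {
    // Writes two sample records to fileName; returns true on success.
    static boolean writeSampleRecords(String fileName) {
        Formatter output;
        try {
            output = new Formatter(fileName);   // creates or truncates the file
        } catch (SecurityException e) {
            System.err.println("No write permission for " + fileName);
            return false;
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open or create " + fileName);
            return false;
        }
        try {
            output.format("%d %s %s%n", 100, "Bob", "Jones");
            output.format("%d %s %s%n", 200, "Sue", "Smith");
            return true;
        } catch (FormatterClosedException e) {
            System.err.println("Formatter was already closed");
            return false;
        } finally {
            output.close();                     // flushes and releases the file
        }
    }

    public static void main(String[] args) {
        boolean ok = writeSampleRecords("clients.txt");
        System.out.println(ok ? "Wrote clients.txt" : "Write failed");
    }
}
```

Returning a boolean keeps the sketch short; a real application might instead report the error to the caller or retry with a different file name.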
Scanner object can be used to read data sequentially from a text file
Pass File object representing file to be read to Scanner constructor
FileNotFoundException occurs if file cannot be found
Data read from file using same methods as for keyboard input: nextInt, nextDouble, next, etc.
IllegalStateException occurs if attempt is made to read from closed Scanner object
Example: see Figure 14.11 (p 690-1)
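The reading steps above might look like the following sketch. It is not the textbook's Figure 14.11: the class ScannerReadDemo, its two helper methods, and the file name numbers.txt are hypothetical. Splitting the work into a method that takes a Scanner makes it easy to see that the same calls work for keyboard, string, or file input.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; names are illustrative only.
class ScannerReadDemo {
    // Reads whitespace-separated integers until the next token is not an integer.
    static List<Integer> readIntegers(Scanner input) {
        List<Integer> values = new ArrayList<Integer>();
        while (input.hasNextInt()) {
            values.add(input.nextInt());
        }
        return values;
    }

    // Opens the named file and reads its integers; returns an empty list
    // if the file cannot be found.
    static List<Integer> readIntegersFromFile(String fileName) {
        Scanner input;
        try {
            input = new Scanner(new File(fileName));
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open " + fileName);
            return new ArrayList<Integer>();
        }
        try {
            return readIntegers(input);
        } finally {
            input.close();   // reading after this point would throw IllegalStateException
        }
    }

    public static void main(String[] args) {
        System.out.println(readIntegers(new Scanner("10 20 30")));
        System.out.println(readIntegersFromFile("numbers.txt"));
    }
}
```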
Tokenization breaks a statement, sentence, or line of data into individual pieces
Tokens are the individual pieces:
    Words from a sentence
    Keywords, identifiers, operators from a Java statement
    Individual data items or fields of a record (that were separated by white space, tab, new line, comma, or other delimiter)
String Tokenizer 11
String Classes
Class java.lang.String
Class java.lang.StringBuffer
Class java.util.StringTokenizer
StringTokenizer
Example 29.18
    import java.util.Scanner;
    import java.util.StringTokenizer;

    public class TokenTest {
        public static void main (String[] args) {
            Scanner scan = new Scanner(System.in);
            System.out.println("Enter a sentence to tokenize and press Enter:");
            String sentence = scan.nextLine();
            // default delimiter is " \t\n\r\f"
            String delimiter = " ,\n";
            StringTokenizer tokens = new StringTokenizer(sentence, delimiter);
            System.out.printf("Number of elements: %d\n", tokens.countTokens());
            System.out.println("The tokens are:");
            while (tokens.hasMoreTokens())
                System.out.println(tokens.nextToken());
        }
    }
(Refer to p 1378)
CSV File Format Rules
1. Each record is one line
2. Fields are separated by comma delimiters
3. Leading and trailing white space in a field is ignored unless the field is enclosed in double quotes
4. First record in a CSV may be a header of field names. A CSV application needs some boolean indication of whether first record is a header.
5. Empty fields are indicated by consecutive comma delimiters. Thus every record should have the same number of delimiters
6. Fields with embedded commas must be enclosed in double quotes
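One way to honor the field rules above (comma delimiters, trimming of unquoted fields, empty fields, quoted fields with embedded commas) is a character-by-character scan of each record. The sketch below is an illustration under stated assumptions, not the exercise's intended solution: the class CsvLineParser and method parseLine are hypothetical names, and the sketch does not handle escaped quotes inside a quoted field or records that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser sketch; does NOT handle escaped quotes ("") inside a field.
class CsvLineParser {
    // Splits one record into fields according to the field rules above.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;   // true while inside a double-quoted field
        boolean wasQuoted = false;  // true if the current field used quotes
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (!inQuotes) {
                    current.setLength(0);   // drop white space before the opening quote
                }
                inQuotes = !inQuotes;
                wasQuoted = true;
            } else if (inQuotes) {
                current.append(c);          // commas and spaces are kept inside quotes
            } else if (c == ',') {
                fields.add(finish(current, wasQuoted));
                current.setLength(0);
                wasQuoted = false;
            } else if (!wasQuoted) {
                current.append(c);          // text after a closing quote is ignored
            }
        }
        fields.add(finish(current, wasQuoted));   // last field has no trailing comma
        return fields;
    }

    // Unquoted fields are trimmed; quoted fields are kept exactly as written.
    private static String finish(StringBuilder field, boolean wasQuoted) {
        return wasQuoted ? field.toString() : field.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(parseLine("123, John ,, Doe").size());              // prints 4
        System.out.println(parseLine("456, \"King, the Gorilla\", Kong").size()); // prints 3
    }
}
```

Header handling (rule 4) is left to the caller, which can simply treat the first parsed record as field names when a boolean flag says so.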
StringTokenizer with a comma delimiter will read most CSV files, but does not account for empty fields or a quoted field with embedded commas:
Empty fields in a CSV file are indicated by consecutive commas. Example: 123, John ,, Doe (Middle Name field is blank)
Fields with embedded commas are enclosed in quotes. Example: 456, "King, the Gorilla", Kong
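The empty-field limitation can be seen by counting fields two ways. In this small sketch (the class EmptyFieldDemo and its methods are hypothetical names), StringTokenizer treats consecutive commas as a single separator and so drops the blank middle-name field, while String.split with a negative limit keeps it. Neither approach understands quoted fields.

```java
import java.util.StringTokenizer;

// Hypothetical demo class; names are illustrative only.
class EmptyFieldDemo {
    // StringTokenizer skips the empty token between consecutive delimiters.
    static int countWithTokenizer(String record) {
        return new StringTokenizer(record, ",").countTokens();
    }

    // split with limit -1 keeps empty fields, including trailing ones.
    static int countWithSplit(String record) {
        return record.split(",", -1).length;
    }

    public static void main(String[] args) {
        String record = "123, John ,, Doe";
        System.out.println(countWithTokenizer(record)); // prints 3
        System.out.println(countWithSplit(record));     // prints 4
    }
}
```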
Comma Separated Values 17
Exercise Part 1
Develop and test classes to read and write CSV data files, satisfying the first four CSV File Format Rules (listed on a previous slide). Your completed classes must:
    Handle the usual possible file exceptions
    Read CSV-formatted data from one or more files into a single array
    Print the data array
    Write data from the array to a single file in CSV format
Multi-dimensional Arrays
Java implements multi-dimensional arrays as arrays of 1-dimensional arrays. Rows can actually have different numbers of columns. Example:
    int b[][];
    b = new int[ 2 ][ ];    // create 2 rows
    b[ 0 ] = new int[ 5 ];  // create 5 columns for row 0
    b[ 1 ] = new int[ 3 ];  // create 3 columns for row 1
(Refer to p 311-315)
    int b[][] = new int[ 10 ][ 20 ];
    int size1 = b.length;       // number of rows
    int size2 = b[ i ].length;  // number of cols for i-th row
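Because each row carries its own length, a loop over a ragged array should ask every row for its size rather than assume a fixed column count. The sketch below uses a hypothetical class JaggedArrayDemo and helper maxColumns (not from the slides) to find the longest row, the kind of calculation a CSV reader needs when records have different numbers of fields.

```java
// Hypothetical demo class; names are illustrative only.
class JaggedArrayDemo {
    // Returns the largest number of columns found in any row.
    static int maxColumns(int[][] table) {
        int max = 0;
        for (int[] row : table) {
            if (row != null && row.length > max) {
                max = row.length;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[][] b = new int[2][];   // create 2 rows, columns not yet allocated
        b[0] = new int[5];          // row 0 has 5 columns
        b[1] = new int[3];          // row 1 has 3 columns
        System.out.println(b.length);       // prints 2
        System.out.println(maxColumns(b));  // prints 5
    }
}
```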
TestFile1.csv
987, Thomas ,Jefferson,7 Estate Ave.,Loretto, PA, 15940
413, Martha,Washington,1600 Penna Ave,Washington, DC,20002
123, Martin , Martina ,777 Williams Ct.,Smallville, PA,15990
990, Shelby, Roosevelt,15 Jackson Pl,NYC,NY, 12345
TestFile2.csv
ID, FName, LName, StreetAddress, City, State, Zip
123, John ,Dozer,120 Main st.,Loretto, PA, 15940
107, Jane,Washington,220 Hobokin Ave.,Philadelphia, PA,0911
123, William , Adams ,120 Jefferson St.,Johnstown, PA,15904
451, Brenda, Bronson,127 Terrace Road,Barrows,AK, 99789
729, Brainfield,Blanktowm, PA, 16600
Exercise Part 2
Develop an application that uses your CSV reader and writer classes
Read the test files (or create your own test files) and perform data validity checks by displaying an appropriate error message and the offending record(s):
    If any fields are missing
    If extra fields are found
    If any records have duplicate IDs
    If any record has an invalid zip code (i.e. not exactly 5 digits)
Write all records to a single CSV file (i.e. concatenate the multiple test files in a single file)
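Two of the validity checks above could be approached as in the following sketch. It is one possible starting point rather than the expected solution: the class RecordChecks, its method names, and the assumption that records arrive as a String[][] with the ID in a known column are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; names and data layout are assumptions.
class RecordChecks {
    // A zip code is valid when, after trimming, it is exactly 5 digits.
    static boolean isValidZip(String zip) {
        return zip != null && zip.trim().matches("\\d{5}");
    }

    // Returns each ID that appears in more than one record, in first-seen order.
    static List<String> duplicateIds(String[][] data, int idColumn) {
        Set<String> seen = new HashSet<String>();
        List<String> duplicates = new ArrayList<String>();
        for (String[] record : data) {
            if (record == null || record.length <= idColumn) {
                continue;   // a missing ID field is a separate check
            }
            String id = record[idColumn].trim();
            if (!seen.add(id) && !duplicates.contains(id)) {
                duplicates.add(id);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String[][] data = {
            {"123", " John ", "Dozer"},
            {"107", " Jane", "Washington"},
            {"123", " William ", " Adams "}
        };
        System.out.println(duplicateIds(data, 0));   // prints [123]
        System.out.println(isValidZip("0911"));      // prints false
        System.out.println(isValidZip(" 15940"));    // prints true
    }
}
```

Missing and extra fields can be detected separately by comparing each row's length with the number of fields in the header record.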
Hints 1.a
CSVFile
    boolean hasHeaderRow;
    String fileName;
    Scanner input;
    List<String> records;
    String data[][];
    int numRecords;
    int maxNumFields;
    + CSVFile(String fileName)
    + CSVFile(boolean headerRow, String fileName)
    + boolean getHasHeaderRow()
    + String getFileName()
    + int getNumRecords()
    + int getMaxNumFields()
    + void getData(String a[][])
    + void openFile()
    + void readRecords()
    + void parseFields()
    + void printData()
Hints 1.b
    import java.io.File;
    import java.util.Scanner;
    import java.io.FileNotFoundException;
    import java.lang.IllegalStateException;
    import java.util.NoSuchElementException;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.StringTokenizer;
Hints 1.c
    public void openFile() {
        try {
            input = new Scanner(new File(fileName));
        }
        catch (FileNotFoundException fileNotFound) {
    ...
    public void readRecords() {
        // Read all lines (records) from the file into an ArrayList
        records = new ArrayList<String>();
        try {
            while (input.hasNext())
                records.add( input.nextLine() );
    ...
Hints 1.d
    public void parseFields() {
        String delimiter = ",\n";
        // Create two-dimensional array to hold data (see Deitel, p 313-315)
        int rows = records.size();        // #rows for array = #lines in file
        data = new String[rows][];        // create the rows for the array
        int row = 0;
        for (String record : records) {
            StringTokenizer tokens = new StringTokenizer(record, delimiter);
            int cols = tokens.countTokens();
            data[row] = new String[cols]; // create columns for current row
            int col = 0;
            while (tokens.hasMoreTokens()) {
                data[row][col] = tokens.nextToken();
                col++;
            }
            row++;                        // advance to the next record
        }
    }
Hints 1.e
    public static void main (String[] args) {
        CSVFile file1 = new CSVFile(true, "TestFile1.csv");
        file1.openFile();
        file1.readRecords();
        file1.parseFields();
        file1.printData();
        // Copy the parsed data into an array sized from the file's dimensions
        String fileData[][] = new String[file1.getNumRecords()][file1.getMaxNumFields()];
        file1.getData(fileData);
    }
CSV Libraries
http://ostermiller.org/utils/CSV.html
http://opencsv.sourceforge.net/