
LexisNexis Concordance 2007

Creating Databases
Importing a Delimited ASCII Text (DAT) File

Document Overview
- Before You Begin
- Creating a New Database File
- Configuring Fields for Your Data
- Importing Your Data
- Additional Resources


Concordance 2007 Quick Help


Concordance is a registered trademark and FYI is a trademark of Applied Discovery, Inc. LexisNexis and the Knowledge Burst logo are registered trademarks of Reed Elsevier Properties Inc., used under license. Other products or services may be trademarks or registered trademarks of their respective companies.

Copyright 2007 Concordance. All rights reserved.


Before You Begin


Delimited ASCII text files store two-dimensional arrays of data by separating the values in each row with specific delimiter characters. Most database and spreadsheet programs can read or save data in a delimited format. Delimited text files may have extensions such as .DAT, .ASC, .CSV, or even .TXT, as long as the file is structured properly with text qualifiers, field delimiters, and line breaks.

For many Concordance databases, the delivered files also include optical character recognition (OCR) text and scanned document images. A DAT file often accompanies the OCR text and image files and contains the metadata for each document.

The procedure outlined in this document describes how to import a delimited ASCII text (.DAT) file.

You will need:
- Concordance
- Text editor program (TextPad, UltraEdit, or similar)
- Delimited ASCII text (DAT) file
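The role of text qualifiers, field delimiters, and line breaks can be illustrated with a short Python sketch. This guide does not name the delimiter characters, so the three values below (ASCII 20, 254, and 174) are assumptions chosen for illustration; check your own DAT file before relying on them. The function name `parse_record` is hypothetical.

```python
# Assumed delimiter characters (not stated in this guide; verify in your file).
FIELD_DELIM = chr(20)    # separates one field value from the next
QUALIFIER = chr(254)     # wraps each field value
NEWLINE = chr(174)       # stands in for a line break inside a field value


def parse_record(line: str) -> list[str]:
    """Split one record into field values and strip the text qualifiers."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD_DELIM):
        # Remove one qualifier from each end when both are present.
        if len(raw) >= 2 and raw.startswith(QUALIFIER) and raw.endswith(QUALIFIER):
            raw = raw[1:-1]
        # Restore line breaks that were encoded inside the field.
        values.append(raw.replace(NEWLINE, "\n"))
    return values


# Build a two-field sample record, ended by a real line break.
sample = (
    QUALIFIER + "DOC0001" + QUALIFIER
    + FIELD_DELIM
    + QUALIFIER + "Line one" + NEWLINE + "Line two" + QUALIFIER
    + "\n"
)
record = parse_record(sample)
print(record)
```

Each physical line is one record, and the qualifier lets a value contain characters that would otherwise look like a field boundary.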

Creating a New Database File


1 Open Concordance.
2 In the File menu, select New.

Figure 1: Concordance Menu File

3 In the Create database from template dialog (see figure 2), select the Blank database type.

Figure 2: Create database from template General tab


4 Click OK. When prompted, choose a file name and directory (you may store your database locally or on a network drive).
NOTE You must have full access to the directory.
5 Click Open to save the database and begin creating and customizing your fields.

Configuring Fields for Your Data


Selecting the Blank database template creates an empty database containing no fields. It is the best choice when you are creating a custom structure for a delimited ASCII text (.DAT) file.

Plan your database structure


Open your DAT file with a text editor. Note the following:
- Delimiters used in the file (text qualifier, field, and new line delimiters)
- Field headers (the first line will usually contain the field headers)
- Type, format, and length of data
- Date values: 8 digits maximum, in any order with slashes, or in the universal true date format without slashes
- Field(s) database users need to search and sort
- Field (if any) to be linked to an image
- OCR content (if any) to be imported
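For a large file, the planning checks above can be partly automated. The following Python sketch reads the header line and reports the longest value found in each column, which helps when deciding between a Text field (up to 60 characters per Table 1) and a Paragraph field. The delimiter characters, the helper names, and the sample data are all assumptions made for illustration.

```python
FIELD_DELIM = chr(20)   # assumed field delimiter; confirm against your file
QUALIFIER = chr(254)    # assumed text qualifier; confirm against your file
TEXT_MAX = 60           # Text field capacity listed in Table 1


def split_line(line):
    """Split one line into values, removing one qualifier from each end."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD_DELIM):
        if len(raw) >= 2 and raw[0] == QUALIFIER and raw[-1] == QUALIFIER:
            raw = raw[1:-1]
        values.append(raw)
    return values


def longest_values(lines):
    """Map each header name to the length of its longest value."""
    rows = [split_line(line) for line in lines if line.strip("\r\n")]
    headers, records = rows[0], rows[1:]
    longest = dict.fromkeys(headers, 0)
    for record in records:
        for name, value in zip(headers, record):
            longest[name] = max(longest[name], len(value))
    return longest


def make_line(values):
    """Build a sample line (illustration only)."""
    return FIELD_DELIM.join(QUALIFIER + v + QUALIFIER for v in values) + "\n"


# Hypothetical sample: a header line followed by two records.
sample = [
    make_line(["BEGDOC", "DOCDATE", "TITLE"]),
    make_line(["DOC0001", "01/15/2007", "Memo"]),
    make_line(["DOC0002", "02/03/2007", "x" * 75]),
]
longest = longest_values(sample)
for name, length in longest.items():
    hint = "Text" if length <= TEXT_MAX else "Paragraph"
    print(f"{name}: longest value {length} characters, fits {hint}")
```

To survey a real file, pass an open file object instead of the sample list, using an encoding that matches the file.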

Tip - While you have the DAT file open, scroll to the bottom of the file and ensure that the last record (the last line) has a new line delimiter (created by pressing Enter on your keyboard) at the end of the record. Without the final return, the last record will not be imported into your database.

Immediately upon creating a blank database, the New field dialog will open, prompting you to begin creating and configuring your fields.

1 Type the name of your first field in the Name field (see figure 3).
NOTE Field names do not need to match field headers specified in the DAT file. They may be up to 12 characters long and may be entered in upper or lower case letters. All characters will be converted to upper case by the system. Names must begin with a letter and may contain only alphanumeric characters and the underscore.
2 Select the field type in the Type drop-down, and select the appropriate attributes for the field.
- Types and Attributes - To successfully import your DAT file, you must create fields to match the data type and size of your data. Refer to Tables 1 and 2 below for information about Field Types and Attributes.
- Field Order - Create your fields in the order in which you will want to view them in Table and Browse views. Use the Insert and Delete buttons (similar to Paste and Cut, respectively, in MS Office products) to arrange fields into the desired order as necessary.
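The field-name rules and the final-return check described above are easy to test before you start typing names into the dialog. This is a minimal Python sketch; the function names `normalize_field_name` and `ends_with_newline` are hypothetical, and the regular expression simply restates the rules in the NOTE.

```python
import re

# Rules from the NOTE: up to 12 characters, begins with a letter,
# only letters, digits, and the underscore.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,11}")


def normalize_field_name(name: str) -> str:
    """Return the name in upper case, or raise ValueError if it breaks a rule."""
    if not FIELD_NAME.fullmatch(name):
        raise ValueError(f"invalid field name: {name!r}")
    return name.upper()


def ends_with_newline(data: bytes) -> bool:
    """True if the file content ends with a line break."""
    return data.endswith((b"\n", b"\r"))


print(normalize_field_name("Doc_Date"))     # DOC_DATE
print(ends_with_newline(b"last record"))    # False
```

To check a real file, read it in binary mode and pass the bytes to `ends_with_newline`.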


3 Click New to confirm your choices and to create the next field.
NOTE If you accidentally click OK instead of New, the New field definition dialog will close. To access this panel again, select Modify in the File menu.

Figure 3: New field definition dialog

Field Types

Type: Text*
Capacity: 1-60 alpha or numeric characters, keyed by default
Notes: Use for numeric values that are not used mathematically (e.g., phone numbers, social security numbers, and other serial numbers). Note - If you intend to sort records based on this field, zero fill any numeric values stored in it to ensure they sort correctly.

Type: Numeric*
Capacity: 1-20 digits long (including the decimal place, negative sign, and all digits following the decimal place), keyed by default
Notes: Display options available: Currency, Commas, Zero filled, Plain.

Type: Date*
Capacity: MMDDYYYY, YYYYMMDD, or DDMMYYYY; 8 bytes in length
Notes: The date format selected here will control how the data appears after it is imported into the database. It does not need to match the date format in the DAT file.

Type: Paragraph
Capacity: 12,000,000 characters (12 MB), indexed by default
Notes: Most flexible and variable in size; not ideal for sorting or searching by comparison. Supports rich text formatting.

*Fixed-length field

Table 1: Field Types
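The zero-fill note for Text fields is easiest to see with a small example. The Python sketch below assumes that text values are compared character by character, which is what the note implies; it does not reproduce Concordance's own sort routine.

```python
# Numeric-looking values stored as text.
ids = ["10", "2", "1", "33"]

# Character-by-character sorting puts "10" before "2".
print(sorted(ids))       # ['1', '10', '2', '33']

# Zero fill every value to the same width before loading.
width = max(len(value) for value in ids)
padded = [value.zfill(width) for value in ids]

# The padded values now sort in numeric order.
print(sorted(padded))    # ['01', '02', '10', '33']
```

Pick a width that covers the largest value you expect, not only the largest value in the current load.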


Field Attributes

Attribute*: Key
Use: Most commonly applied to fixed-length fields; however, it may be applied to any field (including paragraph fields) to make relational searches faster.
Notes: Keying a field creates a .KEY file. As KEY files grow in size, their efficiency decreases, which may slow relational searches. All keyed fields will appear in the default table view.

Attribute*: Image
Use: Used to link Concordance with an image viewer; it indicates which field contains the image name or alias.
Notes: Select only one field per database as an Image field. Identifying multiple fields in a database as an image field will interfere with the linkage between Concordance and the viewer.

Attribute*: Indexed
Use: Enables full text searching.
Notes: Places every word in the field into a dictionary file (.NDX and .DCT) for fast retrieval.

Attribute*: System
Use: Special field that is hidden, with no read or write access for end users.
Notes: System fields should never be indexed, added, deleted, or modified by users. Concordance will create these fields for replication and synchronization information.

Attribute*: Accession
Use: Unique serial numbers internally assigned to each record, managed entirely by Concordance.
Notes: Accession numbers may not be edited or modified. Helpful as a load order identifier. Note - As records are edited, exported, or removed, gaps in numbering may occur.

Attribute*: Optical Character Recognition (OCR) Indexing
Use: Will not index text that is not contained in a defined dictionary.
Notes: Not recommended for any fields. Causes increased indexing times, limits the indexing to Webster's dictionary, and includes only English words. Use Synonyms instead.

*Not every Attribute is available for every field type

Table 2: Field Attributes

4 Repeat steps 1 through 3 as necessary to create a structure to match your DAT file.
5 When you have completed creating all your fields, click OK. Your database structure is ready for the data import.

Additional Considerations
Embedded Punctuation
Embedded punctuation is provided so that hyphenated words, dates, decimal numbers, and contractions are not split into two or more words. You may add or delete punctuation as needed. By default, Concordance includes the . , / characters as embedded punctuation for all fields.
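The effect of embedded punctuation can be sketched with a simple tokenizer. The Python below is an illustration only and is not how Concordance indexes text: it assumes a word is a run of letters and digits, and that an embedded punctuation character is kept only when it sits between two such runs. The function name `tokenize` is hypothetical.

```python
import re

EMBEDDED = ".,/"  # the default characters listed in this section


def tokenize(text: str, embedded: str = EMBEDDED) -> list[str]:
    """Split text into words, keeping embedded punctuation found between
    two alphanumeric characters."""
    if not embedded:
        return re.findall(r"[A-Za-z0-9]+", text)
    inner = re.escape(embedded)
    pattern = rf"[A-Za-z0-9]+(?:[{inner}][A-Za-z0-9]+)*"
    return re.findall(pattern, text)


sentence = "Paid 1,250.75 on 12/25/2007."
kept = tokenize(sentence)
split_up = tokenize(sentence, embedded="")
print(kept)       # ['Paid', '1,250.75', 'on', '12/25/2007']
print(split_up)   # ['Paid', '1', '250', '75', 'on', '12', '25', '2007']
```

With the three default characters, the amount and the date each stay whole; with none, each is broken into separate words.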

If you will be importing OCR


Create your OCR fields now, in addition to the fields for your DAT file import. As a best practice, create at least two OCR fields labeled with ascending numbers (example: OCR1 and OCR2). When you use ReadOCR.cpl to import your OCR text, the CPL automatically overflows text beyond the first 12 million characters from the first OCR field into the next sequentially named OCR field.


NOTE If you do not create a second sequentially named OCR field, you run the risk of losing overflow data. You will not receive an error on the import if your content exceeds the 12 million character limit.
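The overflow risk can be pictured with a short Python sketch. It is not the logic of ReadOCR.cpl; it only assumes that text fills the named fields in order up to the field limit and that anything left over is dropped silently, as the NOTE describes. The function name `spread_text` is hypothetical, and the demonstration uses a limit of 10 characters so that it runs instantly.

```python
LIMIT = 12_000_000  # paragraph field capacity from Table 1


def spread_text(text, field_names, limit=LIMIT):
    """Fill the named fields in order; return (fields, characters_lost)."""
    fields = {}
    for position, name in enumerate(field_names):
        fields[name] = text[position * limit:(position + 1) * limit]
    lost = max(0, len(text) - limit * len(field_names))
    return fields, lost


text = "abcdefghijklmnopqrstuvwxy"  # 25 characters

two_fields, lost_two = spread_text(text, ["OCR1", "OCR2"], limit=10)
one_field, lost_one = spread_text(text, ["OCR1"], limit=10)

print(two_fields, lost_two)   # {'OCR1': 'abcdefghij', 'OCR2': 'klmnopqrst'} 5
print(one_field, lost_one)    # {'OCR1': 'abcdefghij'} 15
```

Even two fields lose text when a document is long enough, which is why the guidance is to create at least two.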

Importing Your Data


1 In the Documents menu, select Import then Delimited text.

Figure 4: Documents Menu Import> Delimited text

2 Select the Import/Overlay Wizard in the Import method dialog, and then click OK.

Figure 5: Import Method

3 Accept the default Load option for your initial import of data, and then click Next.

Figure 6: Import Wizard dialog Load Method


4 Select the delimited format that matches the one used in your DAT file, and then click Next.
NOTE The Import Wizard defaults to the standard Concordance delimiters, but you may also select Comma Delimited (CSV), Tab Delimited, or choose the Custom format and specify your unique ASCII character delimiters in the drop-down menu shown in figure 7.

Figure 7: Import Wizard dialog Format

5 In the Date format window, select a date format that matches the dates in your DAT file, and then click Next.
NOTE The date format selected here will not affect how dates display in Table and Browse views. That preference was set when the date field was created in the New field definition dialog.
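Choosing the wrong date format at this step can change the meaning of a value without producing an error. The Python sketch below shows the idea using the three orders listed in Table 1. It illustrates the ambiguity only and makes no claim about how the Import Wizard parses dates; the function name `parse_dat_date` is hypothetical.

```python
from datetime import date, datetime

# The three orders listed in Table 1, written as strptime patterns.
PATTERNS = {
    "MMDDYYYY": "%m%d%Y",
    "YYYYMMDD": "%Y%m%d",
    "DDMMYYYY": "%d%m%Y",
}


def parse_dat_date(value: str, order: str) -> date:
    """Read an 8-digit date (with or without slashes) in the given order."""
    digits = value.replace("/", "")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected 8 digits, got {value!r}")
    return datetime.strptime(digits, PATTERNS[order]).date()


as_month_first = parse_dat_date("03/04/2007", "MMDDYYYY")
as_day_first = parse_dat_date("03/04/2007", "DDMMYYYY")
print(as_month_first)   # 2007-03-04
print(as_day_first)     # 2007-04-03
```

The same eight digits produce two different valid dates, so confirm the order against a value whose day is greater than 12.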

Figure 8: Import Wizard Date format


6 By default, all of the fields you created will appear in the Selected Fields box. Make sure the order of the fields matches the order in your DAT file.

Figure 9: Import Wizard Fields

If you need to change the order of the fields:
- Move all the Selected Fields to the Available fields list by clicking the << button, then add them back in the proper order one by one using the > button.
Or
- Click on a field to reorder and use the Up and Down buttons as needed to correct the order.

NOTE If the first line of the DAT file contains the field headers, select the Skip first line checkbox so that the header line is not imported as data and the imported values line up with the fields in the Selected Fields window.
7 Click Next to confirm the Selected Fields and their order.
8 Click Browse to navigate to and select your DAT file (delimited ASCII), and then click Next.
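Why field order and the Skip first line checkbox matter can be shown with a positional mapping. The Python sketch below assumes, as the steps above suggest, that values are assigned to the selected fields strictly by position; the function name `load_rows` and the sample rows are hypothetical.

```python
def load_rows(rows, selected_fields, skip_first_line):
    """Assign each row's values to the selected fields by position."""
    data_rows = rows[1:] if skip_first_line else rows
    return [dict(zip(selected_fields, row)) for row in data_rows]


# A header line followed by one record, already split into values.
rows = [
    ["BEGDOC", "DOCDATE"],
    ["DOC0001", "01152007"],
]

correct = load_rows(rows, ["BEGDOC", "DOCDATE"], skip_first_line=True)
header_loaded = load_rows(rows, ["BEGDOC", "DOCDATE"], skip_first_line=False)
swapped = load_rows(rows, ["DOCDATE", "BEGDOC"], skip_first_line=True)

print(correct)         # one record, values in the right fields
print(header_loaded)   # two records, the first holding the header text
print(swapped)         # one record, values in the wrong fields
```

Neither mistake raises an error in this sketch, which is why checking the order before you continue is worthwhile.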


Figure 10: Import Wizard Open

9 Confirm the location of your DAT file in the File field, and then click Finish to import your data.

Figure 11: Import Wizard Finish

10 When the import is complete, the dialog will close. Select the Browse view to verify that your data import was successful. If you are not linking to images or loading OCR, you are ready to index your database and get started searching, tagging, and working with your records.
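One quick way to support the verification in step 10 is to count the records in the DAT file and compare that number with the record count shown in Concordance. The Python sketch below assumes each record occupies exactly one physical line and that the first line holds field headers; the function name `expected_record_count` is hypothetical.

```python
def expected_record_count(text, has_header=True):
    """Count non-empty lines, leaving out the header line if there is one."""
    lines = [line for line in text.split("\n") if line.strip("\r")]
    if has_header and lines:
        return len(lines) - 1
    return len(lines)


sample = "HEADER\r\nrecord one\r\nrecord two\r\n"
count = expected_record_count(sample)
print(count)   # 2
```

If the two numbers differ by one, check whether the last line of the file ends with a return and whether Skip first line was selected.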


Additional Resources
General Product Information
http://law.lexisnexis.com/concordance

Concordance Technical Support
Phone: 866-495-2397
Email: concordancesupport@lexisnexis.com

Concordance Training
Phone: 425-463-3503
Email: concordancetraining@lexisnexis.com
