
Recognize text in scanned documents

http://help.adobe.com/en_US/Acrobat/8.0/Standard/WS2A3DD1FA-CFA...

Adobe Acrobat 8 Standard





You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF. OCR runs with header/footer/Bates number on image PDF files.

1. Open the scanned PDF.
2. Choose Document > OCR Text Recognition > Recognize Text Using OCR.
3. In the Recognize Text dialog box, select an option under Pages.
4. (Optional) Click Edit to open the Recognize Text - Settings dialog box, and select the options you want to use.

Recognize Text - Settings

Optical Character Recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. If you do not apply OCR when you create a PDF by scanning a paper document, you can apply it to the PDF later, provided that the scanner resolution was set to 72 ppi or higher.

Primary OCR Language
Specifies the language for the OCR engine to use to identify the characters.

PDF Output Style
Determines the type of PDF to be produced. All options require an input resolution of 72 ppi or higher (recommended). All formats apply OCR and font and page recognition to the text images and convert them to normal text.

  Searchable Image
  Ensures that text is searchable and selectable. This option keeps the original image, deskews it as needed, and places an invisible text layer over it. The selection for Downsample Images in this same dialog box determines whether the image is downsampled and to what extent.

  Searchable Image (Exact)
  Ensures that text is searchable and selectable. This option keeps the original image and places an invisible text layer over it. Recommended for cases requiring maximum fidelity to the original image.

  Formatted Text & Graphics
  Reconstructs the original page using recognized text, fonts, and graphic elements. The accuracy of the results depends on the scanning resolution and other factors. You may need to review and correct the OCR text in the new PDF page after scanning.

Note: The Formatted Text & Graphics option is available for only some languages.
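The 72 ppi input requirement can be checked before running OCR if you know a scanned page's pixel dimensions and its physical paper size. The following is a minimal Python sketch of that arithmetic, not part of Acrobat; the function names and the example page size are assumptions made for illustration.

```python
# Minimum input resolution stated in the text above.
MIN_OCR_PPI = 72


def effective_ppi(pixel_width, pixel_height, page_width_in, page_height_in):
    """Return the lower of the horizontal and vertical pixels-per-inch values."""
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def meets_ocr_minimum(ppi):
    """True if the resolution reaches the 72 ppi minimum."""
    return ppi >= MIN_OCR_PPI


# Hypothetical example: an 8.5 x 11 inch page scanned to 2550 x 3300 pixels.
ppi = effective_ppi(2550, 3300, 8.5, 11.0)
print(ppi, meets_ocr_minimum(ppi))
```

Taking the lower of the two axis values is a conservative choice for scans whose horizontal and vertical resolutions differ.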

Black-and-white scanning at 300 ppi produces the best text for conversion. At 150 ppi, OCR accuracy is slightly lower, and more font-recognition errors occur. For text printed on colored paper, try increasing the brightness and contrast by about 10%. If your scanner has color-filtering capability, consider using a filter or lamp that drops out the background color.

Downsample Images
Decreases the number of pixels in color, grayscale, and monochrome images after OCR is complete. Choose the degree of downsampling that you want to apply. Higher-numbered options do less downsampling, producing higher-resolution PDFs.
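To get a feel for how much Downsample Images reduces the pixel count, the arithmetic can be sketched in Python. This is an illustration only: the text does not describe Acrobat's resampling method or list its option values, so the function name and the target resolutions below are assumptions.

```python
def downsampled_size(pixel_width, pixel_height, source_ppi, target_ppi):
    """Return the (width, height) in pixels after downsampling to target_ppi.

    Assumes uniform scaling on both axes. Images already at or below the
    target resolution are returned unchanged (no upsampling).
    """
    if source_ppi <= 0 or target_ppi <= 0:
        raise ValueError("resolutions must be positive")
    if target_ppi >= source_ppi:
        return pixel_width, pixel_height
    scale = target_ppi / source_ppi
    return (max(1, round(pixel_width * scale)),
            max(1, round(pixel_height * scale)))


# Hypothetical example: a 2550 x 3300 pixel page scanned at 300 ppi,
# downsampled to 150 ppi.
new_w, new_h = downsampled_size(2550, 3300, 300, 150)
print(new_w, new_h, (new_w * new_h) / (2550 * 3300))
```

Because both axes shrink, halving the resolution leaves one quarter of the original pixels, which is why a higher target resolution produces a noticeably larger PDF.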

Content from Using Adobe Acrobat 8 Standard.


1/27/2000 2:46 AM
