
PRODUCT INFORMATION

ABBYY FineReader Engine 8.0


with Extended Platform Support
supporting development on Linux, BSD, and Mac OS X (Intel) platforms

Document Recognition and PDF Conversion


FineReader Engine provides powerful OCR, PDF conversion, and barcode recognition technologies. Key benefits of the FineReader technology include:
- Highly accurate optical character recognition (OCR). Based on ABBYY's latest recognition technologies, the 8.0 platform delivers 30% higher accuracy than previous versions.
- Advanced, accurate PDF conversion. Uses internal information (annotations, metadata, content streams, etc.) to analyse files and determine, on a block-by-block basis, when to extract text layers and when to apply full recognition. Includes Adobe PDF Library technology to ensure integrity and compatibility. (Available on Linux and Mac OS X only.)
- Multiple PDF output options. Supports output to all types of PDF file formats, the ability to add metadata to PDF output, and advanced PDF security settings and encryption.
- Barcode recognition. Quickly finds and extracts barcodes at any angle. Supports more than 16 types of 1D barcodes and 2D PDF 417 barcodes.
- High accuracy on multilingual document processing. FineReader supports up to 189 OCR languages and maintains higher accuracy on multilingual documents than any other OCR software*.
- Asian language support. A special module reads Chinese, Japanese and Korean, and their combinations with English (Linux only).
- Old European font and language support. Supports Fraktur, Schwabacher and the majority of Gothic fonts, for materials printed between the 17th and 19th centuries in English, French, German, Italian and Spanish.
- Multiplatform support. Tested to work with operating environments such as Red Hat Linux, Fedora Core, SUSE Linux Enterprise, FreeBSD, and Mac OS X (Intel).
- Field level/zonal recognition. Provides recognition at the field or zone level, designed to recognise key information from snippets of text. Ideal for data extraction, keyword indexing, and keyword classification. Also supports OCR-A, OCR-B and MICR (E13B).

ABBYY FineReader Engine 8.0 is a Software Development Kit (SDK) for integrating ABBYY's core document recognition and conversion technologies into external applications. The latest enhancements to the ABBYY Engine increase support for developers working on systems such as Linux and BSD, and Mac OS X running on Intel CPUs.


Flexible, easy development


FineReader Engine provides an efficient and flexible means for integrating ABBYY technology into a variety of applications:
- Easy development. Offers easy access to core technologies and its COM API. Offers a command line interface (CLI) for rapid integration, where a single command line call can be used to set up a complete task. Offers programming samples which can be used as a starting point for development.
- Flexible architecture. Can be used to build applications of any scale or complexity, from simple client applications to server-based, distributed projects. New functionality can be easily added over time.
- Cost-effective, modular licensing scheme. A modular architecture and pricing model allows developers to choose only the functions and features they require.

* in accordance with internal tests

ABBYY DEFINING RECOGNITION

SDK Functions and Features


FineReader Engine 8.0 with Extended Platform Support provides a comprehensive list of features and functionality. A summary of functions includes:

Input from Multiple Sources

- Disk and memory
- Supported image formats: BMP, PCX, DCX, JPEG, JPEG 2000, PNG, TIFF, GIF
- Digital cameras
- PDF (Linux and Mac OS X only)

Recognition

- Multilingual OCR with 189 languages (Chinese, Japanese and Korean available on Linux only)
- Multiple text types
  - Normal, Matrix, Typewriter
  - OCR-A, OCR-B, MICR
  - Recognition of typewritten characters
  - Fraktur/Gothic
  - Mixed text type processing with autodetection at word level
- Barcode recognition
  - More than 16 1D barcodes, 2D PDF 417
  - Fast barcode extraction
  - Damaged barcode extraction
  - Extraction at any angle
- 3 Document Analysis (DA) modes
  - Normal for full page conversion tasks
  - DA for full text indexing
  - DA for invoice pre-processing
- Fast mode recognition for OCR and barcode recognition
- Core parameter tuning for customising processing speed by switching on/off certain pre-processing, document analysis and recognition algorithms
- User language support
- Field level/zonal recognition
  - Data extraction from zones/fields with underlined fields, boxes, or data which does not fit within the field border
  - Definition of field content by setting alphabets and dictionaries
  - Detection of in-field spacing, accurately recognising fields where spaces are allowed
  - Dictionaries which contain word combinations with spaces
  - Intelligent processing of blocks with intersecting parts and lines
  - Text block despeckle, with the ability to specify the size of white or black garbage
  - Recognition tuning by influencing the recognition, based on multiple word-level and character-level hypotheses
- Throughput management
  - Ability to adjust the speed/accuracy balance according to:
    - 3 content processing modes (thorough, balanced and fast)
    - ready-to-load profiles that reduce the time spent choosing proper parameters
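As a rough illustration of how an alphabet and a dictionary can constrain the result for a single field, the sketch below picks the best of several word-level hypotheses. It is not the FineReader Engine API: the function name `pick_field_value`, the hypothesis format and the confidence values are all hypothetical and exist only for this example.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical format: (candidate text, confidence between 0 and 1).
Hypothesis = Tuple[str, float]


def pick_field_value(
    hypotheses: Iterable[Hypothesis],
    alphabet: Set[str],
    dictionary: Optional[Set[str]] = None,
) -> Optional[str]:
    """Return the most confident hypothesis allowed by the field definition.

    A candidate is allowed when every character is in `alphabet` and, if a
    dictionary is given, the whole candidate (spaces included) is an entry.
    """
    allowed = [
        (text, conf)
        for text, conf in hypotheses
        if all(ch in alphabet for ch in text)
        and (dictionary is None or text in dictionary)
    ]
    if not allowed:
        return None  # nothing fits the field definition
    return max(allowed, key=lambda h: h[1])[0]


# A numeric field: the candidate containing the letter "O" is rejected.
digits = set("0123456789")
print(pick_field_value([("1O24", 0.91), ("1024", 0.88)], digits))  # prints 1024

# A dictionary field whose entries may contain spaces.
letters = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ ")
cities = {"NEW YORK", "NEWARK"}
print(pick_field_value([("NEW Y0RK", 0.80), ("NEW YORK", 0.75)], letters, cities))  # prints NEW YORK
```

In both calls the more confident candidate is discarded because it violates the field definition, which is the point of restricting a field's content up front.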

Image Preprocessing
- Built-in adaptive binarisation and texture filtering
- Auto-detection of page orientation
- Auto-detection of text blocks, tables, and pictures
- Auto-detection of vertical text in table cells
- Manual block zoning (adding, removing and editing blocks)
- Digital Camera OCR
  - Special image preprocessing algorithms to correct wrong exposure
  - Differentiation between document images captured from digital cameras or scanners
  - Straightening text lines to correct camera lens distortions

Export/Output
- Rich Text Export (RTF)
- TXT: 8-bit, Unicode
- HTML: 8-bit, Unicode; HTML versions 3.2 and 4.0
- XLS
- Word XML
- Extended character information via API and XML, such as formatting and word/character variants
- PDF
  - Security and encryption settings, print restrictions
  - Tagged PDF format
  - Recreation of hyperlinks within a PDF file
  - 4 modes: Text and image, Text over image, Text under image, and Image only
  - Multilingual PDF file support
  - Replacement of uncertain words with their corresponding images
  - PDF/A support
- Ability to set document-related properties in RTF, HTML and XLS formats
- Multiple levels of text format retention, including columns, tables, frames, fonts, font size, paragraph styles, borders, etc.
- Full picture and text color retention

ABBYY FineReader Engine Daemon / CLI


ABBYY FineReader Engine technology with Extended Platform Support is also available to work as a Daemon, a Command Line Interface (CLI)-controlled background service available 24/7. The FineReader Engine Daemon offers:
- Fast creation of powerful and accurate OCR services
- Easy setup
- Easy-to-use CLI-based control
- Easy integration in existing architectures
BSD Daemon Copyright 1988 by Marshall Kirk McKusick. All Rights Reserved.

ABBYY Licensing Policy


ABBYY FineReader Engine with Extended Platform Support is sold via a flexible, modular licensing policy that allows developers to select the best combination of tools and pricing options for their project. Licensing is offered as:

- Developer Licences (SDK). Provide the right to develop and test applications integrating FineReader technology. This licence is needed to use the API of the SDK. Special licences are required for Asian language recognition and for recognition of Fraktur fonts.
- Runtime Licences (RTLs). Grant the right to distribute applications with FineReader Engine functions incorporated. RTLs differ by functionality and by pages processed per month. The Runtime Licence provides access to core recognition technologies.
- Add-on Modules for Runtime Licences. RTLs can be enhanced by adding one or more of the following add-on modules: PDF opening, PDF export, Word XML export, Extended Character Info (accessible via API or native XML), CJK (Chinese, Japanese, Korean) OCR, 2D barcode recognition, document analysis for invoices.
- Software Maintenance and Upgrade Assurance (SMUA). ABBYY offers contracts for ongoing support, maintenance and assurance for future product upgrades. SMUA contracts are valid for a select time period.
- Trial Licences. ABBYY offers a time-limited, fully functional version of ABBYY FineReader Engine 8.0 with Extended Platform Support for free evaluation, so that prospective customers can test it in real working conditions without any limitation of functionality. To obtain an evaluation copy, please contact your ABBYY sales representative.


Supported Languages

189 Recognition Languages

- 37 Main Languages, with dictionary support: Armenian (Eastern, Western, Grabar), Bulgarian, Bashkir, Catalan, Croatian, Czech, Danish, Dutch (Netherlands and Belgium), English, Estonian, Finnish, French, German (new and old spelling), Greek, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian (Nynorsk and Bokmal), Polish, Portuguese (Portugal and Brazil), Romanian, Russian, Slovak, Spanish, Swedish, Slovenian, Tatar, Turkish and Ukrainian
- 4 East Asian Languages: Chinese (Traditional, Simplified), Japanese, and Korean
- 5 FineReader XIX Languages, for recognition of old European documents printed in the 17th-19th centuries: English, French, German, Italian and Spanish
- 132 Additional Languages with Latin, Cyrillic or Greek characters, for example: Albanian, Azerbaijani, Bashkir, Basque, Belarusian, Breton, Bugotu, Buryat, Cebuano, Chamorro, Chechen, Eskimo, Fijian, Frisian, Friulian, Gagauz, Galician, Ganda, German (Luxemburg), Icelandic, Indonesian, Irish, Kabardian, Kirghiz, Koryak, Kurdish, Latin, Macedonian, Malagasy, Malay, Maltese, Mansy, Mari, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nenets, Nivkh, Nogay, Nyanja, Ossetian, Papiamento, Rhaeto-Romanic, Romany, Rundi, Russian (old spelling), Rwanda, Sami, Scottish Gaelic, Serbian, Slovenian, Somali, Sorbian, Sunda, Tabasaran, Tahitian, Tongan, Tswana, Tun, Turkmen, Tuvinian, Uzbek, Welsh, Zulu
- 4 Artificial Languages: Esperanto, Interlingua, Ido and Occidental
- 6 Programming Languages: Basic, C/C++, COBOL, Fortran, Java and Pascal
- Simple chemical formulas
- Digits
- Tools for creating user-defined languages

A list of all languages can be found on www.ABBYY.com

Supported Operating Systems

FineReader Engine 8.0 with Extended Platform Support has been tested with the following distributions:

Linux & BSD (designed for Linux kernel 2.6.9 and above, with gcc 3.4.3 and glibc 2.3.2)

- Red Hat Enterprise Linux ES 4
- Fedora Core 4.0
- SUSE Linux 10.0
- White Box Enterprise Linux 4 Respin 1
- Debian GNU/Linux 3.1
- FreeBSD(R) 6.1 with gcc 3.4.4

Mac OS X (Intel)

- Mac OS X 10.4.6 with gcc 4.0.1

In addition to the list of supported operating systems, porting to other configurations and operating systems may be available upon request. Please contact your local ABBYY sales representative for further details.

Barcode Types
- 1D: Check Code 39, Check Interleaved 25, Code 128, Code 39, EAN 13, EAN 8, Interleaved 25, CODABAR (without checksum), UCC Code 128, Code 2 of 5 (Industrial, IATA, Matrix), Code 93, UPC-A, UPC-E and Postnet
- 2D: PDF 417
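Several of the 1D symbologies listed above carry a check digit that an integrating application may want to verify after extraction. The sketch below shows the standard EAN-13 check digit calculation in plain Python. It is independent of FineReader Engine and makes no claim about how the engine validates barcodes internally; the function names are hypothetical.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for the first 12 digits of a code."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Return True if `code` is 13 digits and its last digit is the check digit."""
    return (
        len(code) == 13
        and code.isascii()
        and code.isdigit()
        and ean13_check_digit(code[:12]) == int(code[12])
    )


print(ean13_check_digit("400638133393"))  # prints 1
print(is_valid_ean13("4006381333931"))    # prints True
print(is_valid_ean13("4006381333932"))    # prints False
```

A check like this is a cheap way to reject a misread value before it reaches a downstream system, for example when a damaged barcode yields a plausible but wrong digit.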

Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. The FreeBSD logo is a trademark of The FreeBSD Foundation and is used by ABBYY with the permission of The FreeBSD Foundation. MacOS is a trademark of Apple Computer, Inc. registered in the U.S. and other countries. Adobe, the Adobe logo, the PDF logo and Adobe PDF Library are either registered trademarks or trademarks of Adobe Systems, Incorporated in the United States and/or other countries. ABBYY, the ABBYY logo, ABBYY FineReader and ABBYY FineReader Engine are either registered trademarks or trademarks of ABBYY Software, Ltd.

ABBYY Europe GmbH, D-80339 Munich, Tel: +49 89 511159-0, Fax: +49 89 511159-59, sales_eu@abbyy.com, www.ABBYY.com
