Ligature Hebrew OCR 5.0
Ligature-OCR incorporates numerous specialized features including: support
for Hebrew, English and all Western European languages, an editor with
pop-up verifier and multilingual spell checkers (Latin only), trainable
mode, batch processing, image rotation, 20° auto deskewing, manual
or automatic page analysis, and deferred processing.
In addition to outstanding accuracy and a robust feature set, Ligature-OCR
offers a modular design, providing the user and the developer with a great
deal of flexibility. Ligature-OCR consists of several standalone modules
that can be operated separately and are integrated under one simple shell.
This modular approach enables the user to repeat certain operations without
having to rescan the document.
Scanning Module - Input
Ligature-OCR supports the leading image file formats and desktop scanners,
including TWAIN and HP AccuPage, as well as their optional automatic document
feeders. Ligature-OCR supports full control of scanner brightness and
contrast; it also enables the user to customize the size of the page to
be scanned, allowing for non-standard page sizes. This option is especially
useful for skipping running headings and other unwanted page areas. Ligature-OCR also
includes 400 dpi scanning capability.
Page Decomposition Module - Analyze
Ligature-OCR features automatic page analysis that identifies columns
and tables, distinguishes between text and graphics, and ignores noise
areas.
The user can choose among the following four options:
1) Full Automatic Decomposition - Ligature-OCR selects the text areas
and defines their proper reading sequence prior to the recognition stage.
2) Force One Column - This option is designed for the reading of tabular
information.
3) Columns - This option is used for pages consisting of two or more columns
(e.g., newspapers).
4) Predefined Zones - Ligature-OCR offers graphic tools with which the
user can mark specific areas of the page for reading, deletion and inversion.
These areas can be stored for multiple template reading, especially useful
with pages of similar layout.
Recognition Module - OCR
Ligature-OCR features a unique and powerful engine for Omnifont reading
that is based on Stochastic Algorithms and Neural Networks. Ligature-OCR
reads Hebrew, English and all Western European character sets (including
most monetary symbols), without user intervention. Ligature-OCR also has
features for handling mixed Hebrew and English text in the same document
as well as kerned pairs, smeared characters and broken letters. This advanced
OCR engine is also designed for recognizing 200 dpi faxes with a special
option for degraded text and dot matrix/draft printing.
Users interested in reading documents with special typefaces or non-Latin
characters can further benefit from the Customize option. This option
enables the user to fully train Ligature-OCR to recognize non-standard
and highly stylized fonts such as Rashi, Gothic, or Greek. The Customize
option can also be used to train Ligature-OCR on Windows-supported languages
that are not listed in the Ligature Omnifont language list.
Ligature-OCR saves the time typically required for post-OCR reformatting
by retaining margins, indents, paragraph rulers, centering and justification,
tabulation, line spacing, point size, bold, and underline. Ligature-OCR
also features a Pop-Up Verifier in its Editor Window which allows for
easy proofing of documents. By simply clicking on any questionable character
or word, the Pop-Up Verifier enables the user to compare the recognized
text to the original scanned image, which appears simultaneously in the
background of the screen. Once saved in the file format of choice, the
scanned text can be edited with the full power of the user-selected application.
Batch Processing
Ligature-OCR supports two modes of deferred processing: (i) scanning documents
and saving them as image files for subsequent OCR, and (ii) scanning and
reading documents, while saving them as native Ligature-OCR files for
subsequent proofreading with the Pop-Up Verifier. Ligature-OCR also supports
Continuous Reading, a feature which enables the user to scan a group of
documents either manually or by use of an Automatic Document Feeder, perform
Analysis and OCR and then save and name the files automatically.
Ligature-OCR provides several options for simplifying the naming of files
while using the batch processing mode, for example, when scanning two-sided
documents. The customization of page size improves efficiency by hiding
running headers or footers. In Append mode, numerous pages can be saved
under one file name. Batch processing also supports the ability to predefine
reading zones.
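As an illustration of the two-sided naming problem mentioned above, the
following Python sketch shows one way sequential file names could be assigned
when all front sides are scanned first and the back sides are then scanned in
reverse order through a document feeder. This is not Ligature-OCR's own naming
scheme, which is not described here; the function name, the file name pattern
and the assumed scan order are hypothetical and chosen only for the example.

```python
def duplex_page_names(sheet_count, prefix="page", ext="tif"):
    """Return file names in scan order for a two-pass duplex scan.

    Assumed workflow (hypothetical, not taken from Ligature-OCR):
    pass 1 scans the front sides of sheets 1..N, then the stack is
    flipped and pass 2 scans the back sides of sheets N..1.
    """
    if sheet_count < 0:
        raise ValueError("sheet_count must be non-negative")
    # Front sides are the odd logical pages: 1, 3, 5, ...
    fronts = [2 * sheet - 1 for sheet in range(1, sheet_count + 1)]
    # Back sides arrive in reverse order: 2N, 2N-2, ..., 2
    backs = [2 * sheet for sheet in range(sheet_count, 0, -1)]
    # Zero-padded numbers keep the files in reading order when sorted by name.
    return [f"{prefix}{number:03d}.{ext}" for number in fronts + backs]


if __name__ == "__main__":
    for scan_index, name in enumerate(duplex_page_names(3), start=1):
        print(f"scan {scan_index} -> {name}")
```

Because the logical page number is embedded in each name, sorting the saved
files by name restores the reading order regardless of the order in which the
sides were scanned.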
Key Features
- Setting OCR Options for each Text Box
- Page Number Display in Append Mode
- Improved Recognition of 200 dpi Faxes
- Reading of Low Resolution TIFF or PCX files (100 x 200 dpi)
- Additional Output Format: MS Word for Windows 2000
- Numbers Only Recognition
- Improved Attribute Retention
- User Dictionary (Latin only)
- Selecting File Format within "Save As..." Dialog Box
- Improved Handling of Complex Tables
New Features and Enhancements in Version 5.0
- Hebrew Speller
- Automatic recognition of mixed Hebrew/English text
- Auto Flip - enables recognition of a page regardless of scanning orientation
- Improved File Management System
- Improved Batch processing
- Improved retention of table format
- Improved Latin recognition