Readiris Pro 11.0.3 — A Powerful OCR Application for the Macintosh
reviewed by Harry {doc} Babad
I.R.I.S.
10 rue du Bosquet, B-1348 Louvain la Neuve, Belgium
Phone: +32-(0)10-45 13 64; Fax: +32-(0)10-45 34 43; 1-561-921-0847 (USA)
http://www.irisusa.com/support/readiris/index.html
http://www.irisusa.com/products/readiris/mac/index.html
Released: September 20, 2005.
Price: $129 USD and CND, €152 Euro. 30-day demo version. 17 MB download installed.
Requirements: G3 Macintosh, Mac OS X 10.3 or later, including Tiger. A Windows version, Readiris Pro 10, is available.
Supported Scanners: Readiris supports all TWAIN-compliant scanners. In other words, every scanner for which a TWAIN module version 1.7 is available is supported!
Detailed manual and de-installer provided.
Internationally Applicable: Recognizes over 118 languages and a variety of alphabets.
Audience: Anyone who needs to translate scanned, graphic-formatted text accurately into editable words.
Strengths: Readiris Pro is flexible, powerful and accurate OCR software for Mac OS X. It accurately and rapidly transforms your paper documents, PDFs and image files into editable information ready for use. With it you can rapidly retype your paper or graphic-format PDF documents while maintaining page layouts in the recognized text. The software interface is simple, straightforward (Mac-like) and easy to use, and although it depends on input quality, the recognition quality is excellent.
Weakness: The last update of the vendor's Web page appears to have been in February 2004. Although the pages do contain information on this newer, enhanced version for the Macintosh, it was buried several levels deeper than the PC-Windows information.
Product and company names and logos in this review may be registered trademarks of their respective companies.
General Users
Publisher’s Highlights
Readiris Pro 11, the most advanced OCR software for Mac, quickly and easily transforms your paper documents into electronic files you can edit in your favorite application. Readiris Pro 11 not only retypes the text but also reproduces the layout of your original documents to perfection. Columns of text, titles, fonts, bullets, tables, graphics, etc. are well recreated. With the PDF capabilities of Readiris Pro 11 you will be able to transform the information locked in PDF files into editable text and, if desired, turn your documents into PDF files with an optimized file size. Archiving and sharing information become easier than ever! Extremely powerful, Readiris Pro almost exactly recreates the original format of your documents and replaces columns of text, tables, and graphics in the output file.
Introduction
I have several uses for optical character recognition (OCR) software. I have often scanned (copied) recipes and articles from magazines to my desktop, which generated graphic-form PDFs that I'd like to edit. I also have a scanner capable of sheet feed and have gradually been converting some of my technical documents to electronic format to reduce the amount of paper I need to store in my all too small office.
In the past, I've been a devotee of OmniPage X, but despite promises to the contrary, ScanSoft has neglected the Macintosh version of the product. The current 2003-2004 version of OmniPage X is completely broken in Mac OS X Tiger. Although a free copy of Readiris 7 came with my HP ScanJet 8250 software, I ignored it. My first look at the software, in haste, did not impress me. Despite its quirky interface, it should have. However, at the time I was still using first Jaguar and then Panther as my OS, so I could live with the somewhat crash-prone OmniPage Pro X, which I'd gotten used to.
[Figure: PDF of Scanned Recipe]
[Figure: Readiris Scan Capture]
With my adoption of Tiger, OmniPage X became completely broken, so I bought (upgraded to) Readiris 9. I explored version 9 and actively used it until getting the version 11 upgrade, which I now review. This new version 11, with a simple yet powerful Macintosh-friendly interface, really zips along. This power holds despite the acquisition process being driven by the flaky HP software required by my scanner. [More about that later.] I have been using version 11 extensively for the last two weeks, upgrading to version 11.0.3 a few days ago. The only problem I've had with the product occurred when I tried to acquire an additional image, using the latest version of the HP Scan Pro software, without saving the original Readiris document. The HP software then froze, requiring me to REBOOT and reacquire the new image. Yes, the HP ScanJet software v. 6.1.3 does not accommodate a forced quit. Shame on HP!
Although Readiris batches multipage documents in a variety of formats, I did not make extensive use of this feature. The product performs OCR accurately and reproduces an original with dual columns plus footnotes very well as RTF output. Its dual-layered image-plus-text format also works well, although I'm not yet comfortable with that feature. The images below illustrate my direct scan of Barrio's Shrimp, Crab, and Bacon & Saffron Bisque, as well as a simple scan to PDF of the original magazine page.
More Thoughts About OCR and Scanning with Readiris
In order to scan directly to OCR software, two criteria need to be met. The software must interact well with your scanner, and it must be easy to select the portions of the scanned image you need (either as text, graphics or both) so that these page portions are recognized for future formatting and editing. Readiris does both of these quite well.
Indeed, Readiris not only detects the various blocks on a page (called page decomposition), but also logically sorts them. By default, the identified zones are sorted top down and left to right, which suits documents laid out in columns. The block ID numbers indicate the sort order and zone type. These zones can be reordered, and unnecessary zones, or noise and borders mistakenly identified as zones, can be ignored. You can also choose to select and order zones manually using the "windowing" tools. Read the manual for the details.
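To make the idea of zone sorting concrete, here is a minimal sketch of one way such an ordering could be computed. It is not Readiris's actual algorithm, which the vendor does not document in anything I have seen. The `Zone` class, the `reading_order` function and the 50-pixel column tolerance are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A hypothetical page zone: its type and the position of its top-left corner."""
    kind: str   # e.g. "text", "graphic" or "table"
    left: int   # horizontal position in pixels
    top: int    # vertical position in pixels


def reading_order(zones, column_tolerance=50):
    """Order zones column by column (left to right), top down within each column.

    Zones whose left edges lie within `column_tolerance` pixels of the first
    zone in a column are treated as belonging to that column (an assumption).
    """
    columns = []  # list of (column_left, [zones in that column])
    for zone in sorted(zones, key=lambda z: z.left):
        if columns and zone.left - columns[-1][0] <= column_tolerance:
            columns[-1][1].append(zone)
        else:
            columns.append((zone.left, [zone]))

    ordered = []
    for _, members in columns:
        ordered.extend(sorted(members, key=lambda z: z.top))
    return ordered


# A two-column page: a picture above a text block on the left, two text blocks on the right.
page = [
    Zone("text", 320, 100),
    Zone("text", 40, 400),
    Zone("graphic", 45, 90),
    Zone("text", 322, 500),
]
order = [(z.kind, z.left, z.top) for z in reading_order(page)]
```

In this sketch the position of a zone in the returned list plays the role of the block ID number; reordering zones by hand would simply mean rearranging that list.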
I was generally impressed with the accuracy achieved by Readiris. Although 99.6% isn't perfect even for good-quality originals (4 typos out of 1000 letters), it's much better than I can do on a keyboard. The mistakes are obvious to my spell checker. The combination of my HP scanner and Readiris 11 was able to cope with a wide range of documents, admittedly in good condition, arranged in various layouts and font styles, some of which contained colored graphic images. Unlike OmniPage Pro X, the software did not balk at black type on blue paper.
The acquisition process is simple: scan your document, recognize it with Readiris Pro 11 Mac, and send it automatically to your favorite application. I tend to use RTF (Rich Text Format) as output, but Adobe Acrobat PDF and HTML are also available. (I did not test conversion to HTML.) Indeed, the process is so easy and straightforward that I am finally catching up on my recipe clippings files after several years of collecting more paper than I've captured.
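The arithmetic behind that accuracy figure is easy to check. The short sketch below is my own illustration; the `expected_errors` function and the 2,500-character page length are hypothetical and not taken from the product documentation.

```python
def expected_errors(accuracy, characters):
    """Return the expected number of misread characters, rounded to a whole number."""
    return round((1.0 - accuracy) * characters)


# 99.6% accuracy over 1000 letters, the figure quoted in the review.
typos_per_thousand = expected_errors(0.996, 1000)

# The same accuracy over a hypothetical 2,500-character page.
typos_per_page = expected_errors(0.996, 2500)
```

At the same accuracy the error count grows in proportion to the length of the text, which is why a spell-checker pass after recognition is worth the minute it takes.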
I.R.I.S. has reworked the Spartan user interface of versions 7 and 9 for this release. Two sets of three icons down a left-hand pane now represent the Scanning and Recognition steps. Clicking on any of these icons calls up the relevant options for that step, either as a small pop-up list or within a dialogue window.
A single column of buttons down the right-hand side provides mark-up and viewing tools for dealing with the scans shown in the central preview pane. Page scan thumbnails are shown to the left of the preview pane, and individual page information is given underneath. This means you may never need to use the program menus.
The huge window for specifying the text formatting and export options is a bit challenging, especially since your choice of export format enables and disables options without explanation. Read the manual, and it will all become clear and easy to deal with. It's not possible to save custom options as reusable selections, however, so if you want to switch between different text formats, you must reset the options manually every time. You can, however, create scan style templates, which can be saved and reused.
The Software - Its Features
Readiris Pro 11 from I.R.I.S. takes over from the basic and limited optical character recognition (OCR) programs that are bundled with most desktop scanners. After it recognizes your document, which takes just a few seconds, you have an editable copy of it. According to the publisher, Readiris 11 was specifically designed for version 10.4 of the Mac OS X operating system ("Tiger"). Its brushed-metal "look and feel" is used across the board (although the user can disable it if he prefers the white "Aqua" look). I've recounted some of its features below. For additional information on this feature-rich product, check out the vendor's website.
Fully Configurable User Interface Redesigned in Mac OS X Style - Easily enable actions using one button to acquire a document and another to launch recognition upon completion of settings. The new Readiris settings bar offers automatic functions to improve OCR, indicate the document language, enable or disable interactive learning, and define the output format.
Excellent and Clearly Written Manual (both paper and PDF) - The manual focuses exceptionally well on the more traditional aspects of using the software, but it was too succinct for me to grasp the enhanced features in a single read, without practice. Capturing tables fell into that category, but after a try or two, it too became easy.
Text Formatting - When processing a document you can select how you want the recognition result to appear in your new document:
"Create Body Text": you get a continuous, running text. The user does all formatting, if any, afterwards.
"Retain Word and Paragraph": the font type (serif, sans serif, proportional, fixed, normal, condensed), size and typestyle (bold, italic, underlined, superscript, subscript) are maintained across the recognition; tabs and the alignment of each block are recreated, and tables are recaptured correctly.
"Recreate Source Document": recreates a facsimile copy of the original document, with text blocks, tables, graphics, and bulleted and numbered lists recreated in the same place and the word and paragraph formatting maintained across the recognition. You get a true copy of your source document, albeit as a compact and editable text file. It's no longer a scanned image of your document.
Efficient Batch OCR and Batch Scanning Treatment –
Batch OCR executes recognition on all pre-scanned images in a specific folder while recognized documents receive a file name corresponding to the image file.
You can insert blank pages between documents to separate them, so that each is recognized and saved to a different output file.
Handwriting Recognition - For the first time, Readiris 11 offers the recognition of hand printing (uppercase "block letters"), a unique feature based on the company's ICR (Intelligent Character Recognition) engine. You can attend a meeting and take handwritten notes, scan them afterwards, convert them into editable text with Readiris and distribute that report promptly to your colleagues. (Well, you do have to respect a specific writing style.) Hand-printing recognition is limited to numerals, uppercase and separated letters (A-Z) and some punctuation symbols (comma, dot and hyphen). I found this feature unusable, but that's due to my inability to block print, not the software itself. I'm also a failure at entering text directly into PDAs.
Locked PDF Conversion - When you convert PDF files and multipage (TIFF) image files, you can now select the appropriate page range. If your objective is, say, to capture just a chapter of a lengthy PDF publication, it doesn’t make any sense to load the entire book into Readiris. Indicate the proper range to save lots of time!
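The batch behavior described above (output files named after their image files, and blank pages acting as document separators) can be sketched in a few lines. This is only an illustration of the idea, not how Readiris implements it. The `output_name` and `split_documents` functions, the `.rtf` default and the notion that a page is "blank" when it contains only whitespace are all assumptions of mine.

```python
from pathlib import Path


def output_name(image_path, extension=".rtf"):
    """Derive an output file name from the scanned image's file name (assumed scheme)."""
    return Path(image_path).with_suffix(extension).name


def split_documents(pages, is_blank=lambda page: page.strip() == ""):
    """Split a stream of recognized pages into documents at each blank separator page."""
    documents, current = [], []
    for page in pages:
        if is_blank(page):
            # A blank page closes the current document, if there is one.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


name = output_name("scans/recipe01.tif")
batch = split_documents(["memo page 1", "memo page 2", "", "recipe page 1"])
```

Here each page is represented by its recognized text, so a separator page is simply an empty string; a real scanner workflow would have to decide blankness from the image itself.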
More Advanced Readiris Features
Bar Code Reading — define a “bar code zone” around any bar code printed on a document and the data contained will be automatically re-typed.
Multiple Language Capability - Accurate recognition of up to 118 languages based on three alphabets (Latin, Greek and Cyrillic), including East and West European languages, Baltic and Cyrillic (Russian) languages, Greek and Turkish. Hebrew, Japanese, Chinese and Arabic modules are also available.
Powerful Image Adjustment Techniques — A new version of the binarization routine gives you extra control over image adjustment: whatever the background color, there’s always a way to separate the foreground (the text) from the background by adjusting the brightness and contrast.
Graphics Support — It offers rotate, deskew, contrast-adjustment, and despeckle tools for cleaning imperfect scans and digital photos.
Enhanced Page and Font Analysis — The page analysis was also refined, and is now more efficient at separating text zones from graphic areas of a scanned document. Note that Readiris now recognizes inverted drop letters, for instance! In addition, you can automatically ignore text zones that occur on the page borders: some document scanners tend to generate black borders around the actual document: Readiris 11 sees to it that this “noise” doesn’t get picked up by the page analysis.
Working with Colored Backgrounds — Readiris Pro 11 maintains the colors of the text and of the background. Scan text where the titles are in blue and they’ll be blue in the recognized document. Text that’s placed in a yellow frame will show up in a yellow frame in the output document. (Just disable those features if you don’t like them…)
Accurate Color Output Files - Colored text, backgrounds and graphics are reproduced with high accuracy, preserving the look and feel of the originals. This isn't a feature I needed, but its effects were visually obvious.
Maintaining Links — When web site URLs occur in a scanned document, these are now recreated in PDF output as visible links. Click a link in a PDF file and you’ll surf to the mentioned web site…
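For readers curious about what "binarization" under Powerful Image Adjustment Techniques means in practice, the sketch below shows the general idea: adjust brightness and contrast, then classify each pixel as text or background against a threshold. It is a generic textbook approach, not the routine Readiris uses. The `adjust` and `binarize` functions, the threshold of 128 and the sample pixel values are all hypothetical.

```python
def adjust(pixel, brightness=0, contrast=1.0):
    """Apply a simple contrast stretch around mid-gray, then a brightness offset."""
    value = (pixel - 128) * contrast + 128 + brightness
    return max(0, min(255, int(round(value))))


def binarize(gray_rows, threshold=128, brightness=0, contrast=1.0):
    """Return 1 for foreground (dark text) and 0 for background, pixel by pixel."""
    return [
        [1 if adjust(p, brightness, contrast) < threshold else 0 for p in row]
        for row in gray_rows
    ]


# Dark text on a light background separates cleanly with the defaults.
sample = [[30, 200, 210], [190, 40, 205]]
mask = binarize(sample)

# Dark text (60) on a darker, colored background (110): the defaults treat
# everything as text, while raising the brightness separates the two.
dim = [[60, 110, 110]]
all_text = binarize(dim)
separated = binarize(dim, brightness=40)
```

The second example mirrors the black-type-on-blue-paper case mentioned earlier in the review: the page is readable once the background is pushed above the threshold.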
A Few Gripes
Updater Versions - After some searching, guided by a MacUpdate search, I found updaters to the program. However, the absence of a readme file explaining the changes in each updater left me feeling vulnerable.
Version 11.0.4 – Although a Version Tracker search indicated there was a version 11.0.4, I could not find that updater.
Language Localizations - Although the text recognition process supports 118 languages, I was unable to save disk space by eliminating the ones I would never use. Such a feature is available in most of the other multilingual packages I use (e.g., OS X or MS Office). Although I could open (not launch) the Readiris application by control-clicking on it to view its contents, it was not obvious which files to delete, so I took the coward's way out.
Automatic Zone Detection Identification - Zone detection was not always accurate on complex pages but seemed to improve as I updated from version 11.0.1 to 11.0.3. Of course, I could always get the results I needed by manually selecting the location and order of the zones in a scanned image.
Conclusions
According to Chris Breen of Macworld, "accurate optical character recognition (OCR) is difficult to achieve. An OCR program must not only decipher text printed in different fonts, sizes, and alphabets, and convert it to editable text, but also distinguish between text, graphics, and tables." (Macworld, March 2004.) Readiris 11 meets these criteria.
I agree with Charles W. Moore, Applelinks Contributing Editor [November 16, 2005] that “Readiris Pro 11 productively converts volumes of documents and images into editable text in numerous applications. Used with flatbed scanners, multiple-function "all-in-one" devices and digital cameras, this OCR software is an intelligent document-to-knowledge tool with a graphical interface that complements the recognized Macintosh look and feel. Readiris Pro 11 and Readiris Pro 11 Corporate Edition include unique features, such as hand-printing recognition and indexing bar code scanning. Thanks to continuous research and development combined with innovative technology, I.R.I.S. software is now the leading OCR application on the Mac platform.”
With the exception of hard-to-find updaters and a bit of a problem with my HP scanner software, using Readiris 11 has become as second nature to me now as MS Word. Oops, MS Word crashes a lot more and was much harder to learn, even at a simple level.
This software has evolved very nicely since version 7. I am pleased to be able to give it 4.5 macCs with respect to the needs of most general users, but for you folks who have more sophisticated OCR needs, a bit more evolution of those features is needed. Therefore, although it's a great product that I now use almost daily, I give it only 4.0 macCs on that score.
I did not review the corporate version of Readiris Pro, which may have more of the enhanced features that more sophisticated and demanding professional users might need.
§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§




