ABBYY Scanning Tools Seek Out PDF Files To Convert

While solution providers that sell document-scanning solutions to small businesses often depend on one software package to do it all, it is not always beneficial to run every document through a single tool. Because single-package scanning solutions essentially operate like black boxes, users are forced to start from scratch if problems occur during conversion.

CRN Test Center engineers instead recommend splitting the process between two PCs so that scanning and conversion can run independently. Both tasks tax PC resources heavily, so separating them not only improves performance but also avoids the delays that errors in large batch jobs sometimes cause.
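
One way to picture the two-PC split is a shared folder: the scanning PC drops files into it, and the conversion PC works through whatever is waiting. The following is a minimal sketch under that assumption, not a description of how the ABBYY products operate. The folder layout, the `process_pending` name and the `convert` hook are all hypothetical; `convert` stands in for whatever conversion step is actually used.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Tuple


def process_pending(
    inbox: Path,
    done: Path,
    failed: Path,
    convert: Callable[[Path], None],
) -> Tuple[List[str], List[str]]:
    """Convert every PDF waiting in `inbox`, isolating failures.

    `convert` is a hypothetical hook for the real conversion step.
    A file that raises is moved to `failed` so the rest of the batch
    keeps running instead of starting over.
    """
    done.mkdir(parents=True, exist_ok=True)
    failed.mkdir(parents=True, exist_ok=True)
    converted: List[str] = []
    rejected: List[str] = []
    for pdf in sorted(inbox.glob("*.pdf")):
        try:
            convert(pdf)
        except Exception:
            # Park the problem file for later inspection.
            shutil.move(str(pdf), str(failed / pdf.name))
            rejected.append(pdf.name)
        else:
            shutil.move(str(pdf), str(done / pdf.name))
            converted.append(pdf.name)
    return converted, rejected
```

Calling `process_pending` on a schedule from the conversion PC would let scanning continue uninterrupted, and a single bad file ends up in the `failed` folder rather than halting the whole job.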

ABBYY document-scanning software provides a simple solution for businesses that cannot afford a batch-scanning tool. After examining ABBYY PDF Transformer and ABBYY ScanTo Office, CRN Test Center engineers found that by combining the two products, solution providers can create an automated scanning and conversion solution that will work on most document forms and produce most file formats used in offices. Since both products are priced at $49.99, most of the solution providers' revenue must come from professional services.

Like most conversion tools, PDF Transformer will use most of the available memory and CPU capacity to convert a file. Tests performed on two PCs and a server produced similar results: PDF Transformer slowed each system considerably and made it difficult for engineers to start other processes. Conversion with PDF Transformer also took longer than with other conversion tools engineers tested last year. A 950-Kbyte PDF file took about 1 minute and 30 seconds to convert into text.
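
To put that timing in planning terms, the single measurement can be turned into a rough throughput figure. The sketch below assumes conversion time scales linearly with file size, which the tests described here do not establish; the function names are illustrative only.

```python
def throughput_kb_per_s(size_kb: float, seconds: float) -> float:
    """Observed conversion rate in Kbytes per second."""
    return size_kb / seconds


def estimate_batch_seconds(file_count: int, avg_size_kb: float,
                           rate_kb_per_s: float) -> float:
    """Rough batch duration, ASSUMING time scales linearly with size."""
    return (file_count * avg_size_kb) / rate_kb_per_s


# The one data point reported: 950 Kbytes in 1 minute 30 seconds.
observed_rate = throughput_kb_per_s(950, 90)
```

At that rate, which works out to roughly 10.6 Kbytes per second, a batch of 100 similar files would take about 9,000 seconds, or 2.5 hours, on a single machine.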

PDF Transformer uses a four-step algorithm that proved extremely accurate for PDF files that did not contain graphics with text layers. The software uses ABBYY's OCR technology to recognize a document's layout. The software can also maintain formats of PDF files that use table layouts with white grid lines, free-flowing columns, bulleted lists, different text formats and font sizes, and complex HTML formats.

Once a document has been scanned, PDF Transformer reconstructs it based on user-specified formats. The software can also transfer images in PDF files accurately, even vector-based 3-D images. In addition to Office formats, PDF Transformer can convert PDF files into HTML and XML stylesheets.

However, several PDFs that contained graphics with text layers were converted improperly every time they were scanned. In most cases the distortions were difficult to fix right away, so engineers do not recommend using this tool to convert PDFs whose graphics contain text layers.

When processing password-protected PDF files, PDF Transformer checks for owner passwords. Because user-level passwords usually do not grant permission to extract content, PDF Transformer can process only files with owner passwords. In batch-conversion jobs this feature might not work at all, since the password check is interactive and scripts might have to handle several password-protected files during a single run. Customers should be aware of this limitation.
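
One possible workaround is to screen a batch before it starts and set aside any files that are likely to trigger a password prompt. The sketch below is a heuristic only, and an assumption rather than anything PDF Transformer provides: it looks for the `/Encrypt` key that encrypted PDFs carry in their trailer, using nothing but the Python standard library. It cannot tell user passwords from owner passwords, and it could misfire if those bytes happen to appear elsewhere in a file. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def looks_encrypted(pdf_path: Path) -> bool:
    """Heuristic: encrypted PDFs reference an /Encrypt dictionary.

    This is a raw byte search, not a real PDF parse, so treat the
    result as a hint rather than a guarantee.
    """
    return b"/Encrypt" in pdf_path.read_bytes()


def partition_batch(folder: Path) -> Dict[str, List[str]]:
    """Split a folder's PDFs into files safe to batch and files to review."""
    result: Dict[str, List[str]] = {"batch": [], "needs_password": []}
    for pdf in sorted(folder.glob("*.pdf")):
        key = "needs_password" if looks_encrypted(pdf) else "batch"
        result[key].append(pdf.name)
    return result
```

Files listed under `needs_password` can then be handled interactively, while the rest go through the unattended job.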

As a stand-alone application, PDF Transformer uses an interactive wizard that requires user input. However, PDF Transformer can also work as a plug-in for Microsoft Office applications, so solution providers can use the Visual Basic for Applications (VBA) capabilities in Word or Excel to develop batch scripts. Solution providers can also tie PDF Transformer functionality into VBA for Outlook to extract and convert e-mail PDF attachments automatically.
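
The attachment-extraction half of that workflow can be illustrated without Outlook at all. The sketch below swaps in Python's standard `email` package in place of Outlook VBA and works on a raw message, such as a saved .eml file; it makes no calls into PDF Transformer, whose scripting interface is not covered here. The `pdf_attachments` name is hypothetical.

```python
from email import message_from_bytes, policy
from typing import List, Tuple


def pdf_attachments(raw_message: bytes) -> List[Tuple[str, bytes]]:
    """Return (filename, content) for each PDF attached to a raw e-mail."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found: List[Tuple[str, bytes]] = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        is_pdf = (
            part.get_content_type() == "application/pdf"
            or name.lower().endswith(".pdf")
        )
        if is_pdf:
            # decode=True undoes the transfer encoding (usually base64).
            found.append((name, part.get_payload(decode=True)))
    return found
```

Each returned payload could be written to the folder that the conversion step reads from, so attachments join the same queue as scanned documents.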

ScanTo Office also works as a plug-in for some Microsoft Office tools, so solution providers can add OCR scanning into their conversion scripts. Since the software also uses ABBYY's OCR technology, it performed as accurately as PDF Transformer when it analyzed scanned images.

During various tests, engineers noticed several distortions around the edges of scanned forms. Because ScanTo Office cannot maintain format when there are slight column shifts within the same scanned report or form, only documents with uniform text size and spacing are converted without errors.

When documents have not been scanned precisely, engineers recommend using text formats as output, provided formatting is not a concern.

ScanTo Office was less accurate than anticipated, given that it uses the same OCR algorithms as PDF Transformer. However, since ABBYY provides a free trial version, solution providers can try it out before recommending it to customers.

PRODUCT SNAPSHOT
PRODUCT: ABBYY PDF Transformer, ABBYY ScanTo Office
COMPANY: ABBYY
Fremont, Calif.
(510) 226-6717
www.abbyy.com
DISTRIBUTORS: Ingram Micro, Navarre
PRICE: $49.99 each