How to Use the ScanJet IIc Scanner with OmniPage Pro Software

 
 

 

Introduction
OmniPage Pro is a program that allows you to make use of documents such as reports or magazine articles without having to retype the entire piece of work.  This type of technology is referred to as Optical Character Recognition, or OCR.  This software will permit you to take scanned documents and image files and use them in your favorite applications with editable (notice this does not say edible) text.

Continue on in this section to find out more about:
 

What is Optical Character Recognition (OCR)?
OCR is the process of converting an image file into computer-editable text.  This means an electronic picture of text, such as a scanned document or fax file, can be recognized as editable text by your computer.  OmniPage Pro, in addition to performing OCR, can also retain graphics, text formatting, and page formatting in the files that it reads.  There are four basic steps in OmniPage Pro OCR:

The OmniPage Pro Desktop
OmniPage Pro's desktop displays the pages of a document in its thumbnail viewer, image viewer, and text viewer.  You can use buttons in the Standard, AutoOCR, and Zone toolbars to perform various tasks on the document.

AutoOCR Toolbar
This toolbar contains buttons that can activate each step of the OCR process.
Set commands in the AutoOCR toolbar for the operations you wish to perform, using the drop-down menus.

Standard Toolbar
The Standard toolbar contains buttons and drop-down lists for performing various tasks.

Zone Toolbar
This toolbar contains buttons that allow you to draw and define zones on a page image.
Zones are borders created around areas of a page image to identify what will be recognized as text or retained as a graphic during OCR.  Zones play a big part in determining OCR results.  You can create zones automatically, manually, or with a template.

Options Dialog Box
You can select settings for OmniPage Pro in the Options dialog box.  To open it, click the Options button or choose Options... in the Tools menu.

Getting Online Help
After installing OmniPage Pro, you can use its online help system to get information on features and procedures.

Help Menu
Use commands in the Help menu to open topics that provide information on features and procedures.
 

Context-Sensitive Help
You can get on-the-spot information about a particular OmniPage Pro command, toolbar button, or dialog box option in the following ways.

Product Support
For the fastest and easiest way to get help, please look for solutions in this manual or in the online help.  For troubleshooting tips, see General Troubleshooting Solutions.  If you need additional help, product support and information are available to registered users through the services listed in this table.
 
Service                               How to Contact
World Wide Web home page              http://www.caere.com
Download Service (BBS)                (408) 395-1631
Automated Fax Response Service        (408) 354-8471
Telephone Support in North America    (408) 395-8319
For international telephone numbers, please refer to the Caere Product Support insert in your OmniPage Pro package.
Please have the following information ready for the best service if you call Caere Product Support:

 

Processing Documents
This section describes how to work with documents in OmniPage Pro, along with each step in the OCR Process.
 

Ways to Process Documents
OCR is the process of turning an image into computer-editable text so you do not have to retype the text manually.  The basic steps of this process were stated earlier and are repeated below:

Using the OCR Wizard
The OCR Wizard guides you through the entire OCR process by asking you questions about your document and selecting the appropriate settings for you.

To process your document using the OCR Wizard:
 

Automatic Processing
Use the AUTO button to process a new document from start to finish or finish processing an open document.

To process your document automatically:
 

Each page of the document is processed and finished in order according to the selected commands.  If page images in an open document already have zones, OmniPage Pro will skip zoning for those pages and continue with the selected OCR and export operations.
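
The page-by-page behavior described above can be pictured as a simple loop.  The following is a minimal Python sketch under stated assumptions, not OmniPage Pro's actual implementation: the Page class and the auto_zone, recognize, and export callables are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for one page image in a document."""
    number: int
    zones: list = field(default_factory=list)  # empty list means "not yet zoned"
    text: str = ""


def auto_process(pages, auto_zone, recognize, export):
    """Process pages in order, zoning only the pages that have no zones yet."""
    log = []
    for page in pages:
        if not page.zones:
            page.zones = auto_zone(page)   # hypothetical zoning step
            log.append((page.number, "zoned"))
        else:
            log.append((page.number, "zoning skipped"))
        page.text = recognize(page)        # hypothetical OCR step
        export(page)                       # hypothetical export step
    return log


if __name__ == "__main__":
    pages = [Page(1), Page(2, zones=["existing zone"])]
    exported = []
    log = auto_process(
        pages,
        auto_zone=lambda page: ["auto zone"],
        recognize=lambda page: f"text of page {page.number}",
        export=exported.append,
    )
    print(log)
```

In this sketch, a page that already carries zones goes straight to the recognition and export steps, which mirrors the skip described in the paragraph above.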

Performing Multiple Tasks at Once
OmniPage Pro takes advantage of your computer's ability to handle more than one process at a time.  You can simultaneously scan, create zones, recognize, and edit documents. You do not have to wait for any process to complete before moving on to the next task.  For example, if you scan a multiple-page document, you can draw zones on an image as soon as the first page is scanned and you can edit recognized text as soon as it appears in the text viewer. These tasks can be done at the same time other pages are being scanned and recognized.

Starting the OCR Process Outside OmniPage Pro
You can start the OCR process outside OmniPage Pro in a variety of ways.  For example, you can use the OCR Aware feature to initiate OCR from another application and paste recognized text into an open document.  See Using OCR in Other Applications for more information.

Bringing Document Images into OmniPage Pro
You can bring document images into OmniPage Pro by:
 

Scanning Pages
You can scan paper documents to convert them to electronic images in OmniPage Pro.  If a document is already open, scanned pages are inserted as new pages.  To scan in OmniPage Pro, you must install the Scan Manager and select your default scanner.

To scan pages into OmniPage Pro:
 

Pages are scanned in order and combined into one working document.

Loading Image Files
An image file is an electronic picture of text, such as a scanned paper document or an electronic fax, that is saved in an image file format such as PCX or TIFF.  You can load image files into OmniPage Pro.  If a document is already open, loaded image files are inserted as new pages.

To load image files into OmniPage Pro:
 

Loading Exchange Faxes
You can load fax images into OmniPage Pro from Microsoft Exchange or Outlook if you have the Microsoft Fax component installed with those applications.  Please see Microsoft documentation for information on configuring these applications.  If a document is already open, loaded faxes are inserted as new pages.

To load Exchange faxes into OmniPage Pro:
 

Exchange faxes are loaded in the order selected and combined into one working document.

Creating Zones for OCR
Page images are displayed in OmniPage Pro's image viewer where zones are created before OCR.  Zones are borders that identify areas of an image that will be recognized as text or retained as graphics. Any part of an image not enclosed by a zone is ignored during OCR.

Creating Zones Automatically
OmniPage Pro can analyze a page and create zones automatically for you.  It uses the selected setting in the Zone button to determine the text flow on a page and breaks it into ordered zones.
To create zones automatically:
 

OmniPage Pro automatically draws zones on the current page in the image viewer.  Each zone has a number indicating its order and a letter indicating its zone properties.  Make sure zones are identified correctly before performing OCR.  For example, if you want to retain an area as a graphic, that area should be identified as a Graphic zone type.  See  Changing Zone Properties  for more information.

Performing OCR on a Document
Performing OCR converts an image to editable text.  This is also referred to as recognizing text.

To perform OCR:
 

Checking OCR Results
After performing OCR, recognized text appears in the text viewer where you can check for errors.  Error checking starts automatically if you chose OCR and Check as the OCR process command.  OmniPage Pro marks suspected errors in green and inserts a red "reject" character for any character it cannot recognize.  To turn off these color markers, choose Show Markers in the View menu.

To check and correct errors:
 

Verifying Text
After performing OCR, you can compare recognized text against the original image to verify that the text was recognized correctly.

To verify text against its original image:
 

Checking OCR Results in Microsoft Word
You can check for OCR errors directly in Microsoft Word 7 or Microsoft Word 97 if you have those versions installed on your computer.

To check and correct errors in Microsoft Word:
 

To verify text against its original image in Microsoft Word:

Using OCR in Other Applications
You can use OmniPage Pro's OCR Aware feature to use OCR in other applications.  For example, you can scan, recognize, and paste text directly into a word-processing document without ever leaving the application.  You can use OCR Aware with 32-bit (and some 16-bit) applications that have been registered with OmniPage Pro. An application must be installed on your computer in order to use it with OCR Aware. See OCR Aware Settings for more information on registering applications with OCR Aware.

To use OCR Aware in an application:
 

Working with Documents
Use OmniPage Pro's thumbnail, image, and text viewers to look at and work with pages in the current document.  This section describes the following procedures:

Saving a Document as You Work
Click the Save button in the Standard toolbar or choose Save in the File menu to save changes to the current document as you work.  The first time a document is saved, the Save As dialog box appears.  See  Saving a Document  for more information.  If a document has been saved as an OmniPage Document (*.met), all the changes you make in the open document are saved.  If a document has been saved as a text-based file type, only the text changes are saved out to that file.  For example, suppose you save the current document as a text file called Memo.txt but continue to work with the recognized text in OmniPage Pro.  Whenever you click the Save button, changes in the recognized text will overwrite the Memo.txt file.

Resizing a Page View
You can resize a page displayed in the image viewer or text viewer to enlarge or reduce the view.

To resize a page view:
 

Changing Pages
The thumbnail viewer, image viewer, and text viewer all display the same page in a document.  You can change pages in a document in the following ways:

Reordering Pages
You can reorder pages in a document by dragging their thumbnails to different positions in the thumbnail viewer.

Deleting Pages
If you delete a page from a document in OmniPage Pro, the thumbnail, original image, and recognized text for that page are all deleted.

To permanently delete pages:
 

Printing a Document
You can print the current document's original page images or recognized text.

To print a document:
 

Closing a Document
Choose Close in the File menu to close a document.  You are prompted to save your document if you have not saved it or have modified it since the last save.  Save a document as an OmniPage Document (*.met) if you want to reopen it in OmniPage Pro later.

Closing OmniPage Pro
Choose Exit in the File menu to close OmniPage Pro.  You are prompted to save the current document if you have not saved it or have modified it since the last save.

Exporting Documents
You can export a document to other applications by:
 

Saving a Document
You can save recognized text and original images to disk in a variety of file types.

To save recognized text:
 

To save original images:

Copying a Document to the Clipboard
You can copy every page of a recognized document to the Clipboard and then paste the text directly into another application.

To copy a document to the Clipboard:
 

Sending a Document as a Mail Attachment
You can send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed.

To send a document as a mail attachment:
 

OmniPage Pro Settings

Setting AutoOCR Toolbar Commands
The AutoOCR toolbar buttons allow you to take a document through each step of the OCR process.  Every toolbar button has different process commands that can be set for the operations you want to perform.  OmniPage Pro can go through all steps automatically, or you can start each step individually.  You can set AutoOCR toolbar commands in two locations:

AUTO Button Commands
Use the AUTO button to process a document from start to finish. The AUTO button's drop-down list contains the AutoOCR and OCR Wizard commands.

AutoOCR
Select AutoOCR to finish processing a new or open document according to the selected process commands.  See Automatic Processing for more information.

OCR Wizard
Select OCR Wizard to have the OCR Wizard guide you through the entire OCR process.

Image Button Commands
Use the Image button to bring a document image into OmniPage Pro's image viewer.  The Image button's drop-down list contains the Load Image, Load Exchange Fax, and Scan Image commands.

Load Image
Select Load Image to load existing image files such as TIFF or PCX files.

Load Exchange Fax
Select Load Exchange Fax to load faxes from Microsoft Exchange or Outlook.  This command only appears in the drop-down list if you have the full Microsoft Fax application installed.

Scan Image
Select Scan Image to scan paper documents in your scanner.  This command only appears in the drop-down list if you have installed the Caere Scan Manager and have selected your default scanner.

Zone Button Commands
Use the Zone button to automatically create zones on document images. Zones are boxes that specify what will be recognized as text or retained as graphics on an image. The Zone button's drop-down list contains the Single-Column Pages, Multiple-Column Pages, Tables, Mixed Pages and HP AccuPage commands and the names of any zone templates you have created.  See  Creating Zones for OCR  for more information.

Single-Column Pages
Select Single-Column Pages to have OmniPage Pro automatically draw and order zones on single-column document images such as letters or memos.

Multiple-Column Pages
Select Multiple-Column Pages to have OmniPage Pro automatically draw and order zones on multiple-column document images such as magazine or newspaper articles.

Tables
Select Tables to have OmniPage Pro automatically draw and order zones on table format document images such as spreadsheets, or any page that contains a table.

Mixed Pages
Select Mixed Pages if your document contains multiple pages with a variety of page layouts. OmniPage Pro will automatically draw and order zones on each page.

HP AccuPage
If you use a scanner that supports HP AccuPage, you can select HP AccuPage as the auto zoning option for scanned pages.

Zone Templates
Select a zone template to create zones on document images using that template.  See  Creating Zone Templates  for more information.

OCR Button Commands
Use the OCR button to perform the selected OCR operation on document images. The OCR button's drop-down list contains the Perform OCR, OCR and Check, Train OCR, and Defer OCR commands.

Perform OCR
Select Perform OCR to recognize text on document images. During OCR, OmniPage Pro analyzes the image and identifies characters to produce editable text.  See  Performing OCR on a Document  for more information.

OCR and Check
Select OCR and Check to recognize text on document images and automatically start checking for errors after OCR.  See  Checking OCR Results  for more information.

Train OCR
Select Train OCR to teach OmniPage Pro how to recognize special characters. These pre-recognized characters are saved in a training file, which OmniPage Pro can use to compare with the characters in document images during OCR.  See  Training OCR for Special Characters  for more information.

Defer OCR
Select Defer OCR to delay text recognition during automatic processing. OmniPage Pro will process your document up to the point of OCR and then ask if you want to schedule the document to be finished later.  See  Scheduling OCR  for more information.

Export Button Commands
Use the Export button to export recognized text and retained graphics to other applications. The Export button's drop-down list contains the Save As, Send Mail, Copy to Clipboard, and Defer Export commands.

Save As
Select Save As to save a recognized document to disk in a specified file format.  See  Saving a Document  for more information.

Send Mail
Select Send Mail to send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed.  See  Sending a Document as a Mail Attachment  for more information.

Copy to Clipboard
Select Copy to Clipboard to place a copy of a recognized document on the Clipboard.  See  Copying a Document to the Clipboard  for more information.

Defer Export
Select Defer Export if you do not want to export your document right after automatic processing. OmniPage Pro will process your document up to the point of export and then stop.

Selecting OmniPage Pro Settings
Click the Options button or choose Options... in the Tools menu to open the Options dialog box. This is the central location for OmniPage Pro settings.

Accuracy Settings
Click the Accuracy tab to select settings that affect OCR accuracy the most.

Scanner Settings
Click the Scanner tab to select settings for scanning pages.

Page Format Settings
Click the Page Format tab to select settings that determine how the formatting of a page is handled during OCR.

Language Settings
Click the Language tab to select language settings for your document.

OCR Aware Settings
Click the OCR Aware tab to select settings for the OCR Aware feature.  OCR Aware allows you to initiate OCR from another application.  See  Using OCR in Other Applications  for more information.

To register an application with OCR Aware:
 

Process Settings
Click the Process tab to set commands and settings for each step of OCR.

Microsoft Word Settings
Click the Microsoft Word tab to select settings for performing check recognition directly in Microsoft Word.  See Checking OCR Results in Microsoft Word for more information.

Settings Guidelines
The settings you select in OmniPage Pro can greatly affect OCR results.  Make sure that settings are appropriate for your document before you begin processing.  You may have to experiment with different settings to get the results you want.  Answer the following questions to get settings recommendations for your documents.
 

Magazine and newspaper pages
Memos and letters
Spreadsheets and tables
Legal documents
Mixed formats or not sure

Poor or not sure: Degraded copies, colored or shaded backgrounds or text, run-together or broken text characters.
Good: Clear, well-formed, black text characters on a clean, white background.

Minimal: You plan to keep one font and one font size only.
Some: You want to keep font characteristics and paragraph formatting.
As much as possible

Yes: You are going to keep graphics such as logos and photos during OCR processing.
No: You have decided to ignore graphics such as logos and photos during OCR processing.

One Language
More Than One Language

Yes
No

Customizing OCR
OmniPage Pro has many features that allow you to customize the way your documents are handled during OCR.  This section describes how to use these features.  Please continue reading for information on these topics:

Adjusting Page Images Before OCR
You can rotate and straighten page images in OmniPage Pro's image viewer before zoning and OCR take place. This is recommended to improve OCR accuracy on pages that are not oriented correctly.

To rotate a page image:
 

To straighten a page image:

Customizing Zones
Zones are borders created around areas of a page image to identify what will be recognized as text or retained as a graphic during OCR.  Zones play a big part in determining OCR results.  You can create zones automatically, manually, or with a template.  Topics in this section describe how you can customize zones including:

Zone Toolbar
The Zone toolbar contains buttons for drawing and modifying zones.

Drawing Zones Manually
You can draw zones manually on a page image using buttons in the Zone toolbar. Rectangular zones are the most common, but you can also draw irregular-shaped zones.

To draw rectangular zones:
 

To draw irregular-shaped zones:

Modifying Zones
You can modify zones by moving, resizing, reordering, extending, subtracting, connecting, or dividing them.

To move zones:
 

To resize zones:

To reorder zones:

To extend an area of a zone:

To subtract an area of a zone:

To connect two or more zones:

To divide a zone:

Deleting Zones
You can delete the current zones if you want to create new zones.  You can also delete individual zones that you do not want to process during OCR.  Any part of a page image not enclosed by a zone is ignored during OCR.

To delete zones:
 

Changing Zone Properties
You can set certain properties for zones to customize how each zone will be treated during OCR.  The Zone Properties dialog box contains settings for zone type and zone content.

Zone Type
Every zone on a page has a zone type setting.  You can select the following zone types:

Zone Content
All text zones on a page also have a zone content setting.  This specifies the characters OmniPage Pro looks for within a zone during OCR.  You can select Alphanumeric or Numeric as the zone content setting.  The letter A appears within an alphanumeric zone and the letter N appears within a numeric zone.  For example, if a particular zone only contains numbers and mathematical signs, you can specify the contents of that zone to be Numeric.  OmniPage Pro will only look for numeric characters in that zone during recognition.
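
One way to picture these properties is as a small record attached to each zone.  The following Python sketch is only an illustration and not part of OmniPage Pro: the Zone class, its field names, and the "Text" and "Graphic" type values are assumptions, while the A and N letters come from the description above.

```python
from dataclasses import dataclass

# Letters shown inside text zones, as described in the text above.
CONTENT_LETTERS = {"Alphanumeric": "A", "Numeric": "N"}


@dataclass
class Zone:
    """Hypothetical model of one zone on a page image."""
    order: int                      # position in the reading order
    zone_type: str                  # e.g. "Text" or "Graphic" (illustrative values)
    content: str = "Alphanumeric"   # only meaningful for text zones


def content_letter(zone):
    """Return "A" or "N" for a text zone, or None for other zone types."""
    if zone.zone_type != "Text":
        return None
    return CONTENT_LETTERS[zone.content]


def reading_order(zones):
    """Return the zones sorted by their order number."""
    return sorted(zones, key=lambda zone: zone.order)


if __name__ == "__main__":
    zones = [
        Zone(order=2, zone_type="Text", content="Numeric"),
        Zone(order=1, zone_type="Text"),
        Zone(order=3, zone_type="Graphic"),
    ]
    for zone in reading_order(zones):
        print(zone.order, zone.zone_type, content_letter(zone))
```

The sketch keeps the content setting separate from the zone type because, as stated above, only text zones have a content setting.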

To change the properties of a zone:
 

Creating Zone Templates
You can use zone templates to create zones on a page image.  A zone template contains zone attributes including size, shape, position, order, type, and content.  Zone templates are useful if you frequently process documents that have the same layouts and similar content.

To create a zone template:
 

To create zones with a template:

Specifying Fonts
You can retain the font characteristics in your document during OCR if you select an Output Format option other than Remove formatting in the Page Format section of the Options dialog box.  OmniPage Pro automatically maps detected font types to specified fonts.  To map fonts, OmniPage Pro analyzes text and categorizes it as one of these font types:

To customize the font mapping for font types:

Training OCR for Special Characters
A training file is a set of pre-recognized text characters that OmniPage Pro compares with characters on a page image during OCR. You can create a training file for special characters that might normally be difficult to recognize such as the copyright symbol © or the registered trademark symbol ®.

To create a training file:
 

To edit a training file:

Creating User Dictionaries
A user dictionary is used when you perform OCR and check for errors afterward. You can select a user dictionary in the Language section of the Options dialog box.

To customize a user dictionary:
 

Saving Settings Files
You can save OmniPage Pro settings to a file.  A settings file is useful for quickly loading particular settings that you need for certain documents.

To save settings to a file:
 

To load a settings file:

Scheduling OCR
You can schedule OCR to take place on one or more OmniPage Documents, supported image files, and pages in your scanner.  This processing can take place while you are away from your computer as long as OmniPage Pro is still running.  Scheduled documents are opened at the specified time, unfinished pages are recognized, and the documents are saved in a preselected format and location.

Topics in this section include:
 

Scheduling Individual Documents
You can schedule individual documents from different folders. Scheduled documents are recognized at the specified time and then saved in the designated output folder.

To schedule individual documents:
 

Scheduling Documents from an Input Folder
You can set up OmniPage Pro to automatically schedule documents from a specified input folder.  Scheduled documents are recognized at the specified time and then saved in the designated output folder.

To schedule documents from an input folder:
 

Modifying Output Options for Documents
All newly scheduled documents have the same default output folder and file format assigned to them.  The default output file name uses the original file name and the extension of the output file format.  You can modify all of these output options for any scheduled document.
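
The default naming rule can be illustrated with a short Python sketch.  It assumes that the output format's extension replaces the original extension (the text above does not say whether it is replaced or appended), and the function name and example paths are hypothetical.

```python
from pathlib import Path


def default_output_path(source_file, output_folder, output_extension):
    """Build a default output path from the original file name.

    Assumption: the extension of the output format (for example ".rtf",
    including the leading dot) replaces the original file's extension.
    """
    source = Path(source_file)
    return Path(output_folder) / source.with_suffix(output_extension).name


if __name__ == "__main__":
    # Hypothetical input file, output folder, and output format.
    print(default_output_path("scans/report.tif", "output", ".rtf"))
```

Under this assumption, a scheduled file named report.tif that is saved as Rich Text Format would be written to the output folder as report.rtf.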

To modify the output options for an individual document:
 


Technical Information
This section provides troubleshooting and other technical information about using OmniPage Pro.  Please also read the Release Notes and Scanner Setup Notes that came in your OmniPage Pro package.  These contain the latest information on OmniPage Pro and its supported scanners.  Please continue reading for information on these topics:

General Troubleshooting Solutions
Although OmniPage Pro is designed to be easy to use, problems sometimes occur.  Many of the onscreen error messages contain self-explanatory descriptions of what to do--check connections, close other applications to free up memory, and so on.  Sometimes that is all the troubleshooting help you need.

Topics in this section include:
 

Solutions to Try First
Try these possible solutions if you experience problems using OmniPage Pro:

Testing OmniPage Pro
Restarting Windows 95 in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system.  This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether.  See Windows online help for more information.

To test OmniPage Pro in safe mode (Windows 95):
 

To test OmniPage Pro in VGA mode (Windows NT):

Low Memory Problems
OmniPage Pro may run poorly under low memory conditions.  Signs of low memory include various error messages, slow operation, and frequent hard drive access.  Try these solutions for low memory conditions:

Low Disk Space Problems
Problems may occur if your system runs low on free disk space.  Try these solutions for low disk space problems:

Using Visioneer Scanners with OmniPage Pro
During installation, OmniPage Pro automatically integrates with your Visioneer PaperPort software.  However, you cannot scan directly into OmniPage Pro if you use a Visioneer scanner or if your scanner is set up to work with PaperPort software (such as the HP ScanJet 5s).  Instead, scan pages into PaperPort and then drag the page images onto the OmniPage Pro icon at the bottom of the PaperPort Desktop.  The page images will be loaded into OmniPage Pro.  See OmniPage Pro's online help for more information.

Supported File Formats
OmniPage Pro can open these file formats:
 

OmniPage Pro can save original images to these file formats:

OmniPage Pro can save recognized text to these file formats:
 
Ami Professional 2.0, 3.0, 3.1
ANSI
ANSI Standard
ANSI Stripped
ASCII
ASCII Standard
ASCII Stripped
dBase III, III+, IV
DisplayWrite (DCA/RFT)
Excel 3.0, 4.0, 5.0, 6.0, 7.0, 97
FrameMaker
HTML**
Lotus 123
Microsoft PowerPoint (*.rtf)
Microsoft Publisher
OmniPage Document (*.met)
PageMaker (MS Word)
Quattro Pro 4.0
Quattro Pro for Windows 4.0
Rich Text Format
Text Only
Ventura Publisher (MS Word)
Windows Write 3.x
Word for DOS 5.0, 5.5
Word for Windows 2.0, 6.0, 7.0, 97
Wordpad
WordPerfect 5.0, 5.1, 6.0, 6.1
WordPerfect for Windows 5.1, 5.2, 6.0, 6.1
WordPro 96, 97
WordStar for Windows 1.x, 2.0
XyWrite IIIPlus, IV
**When saving to HTML, all graphics are saved as separate image files using JPEG format.

Scanner Setup Issues
This section contains information on scanner setup and solutions for scanning problems you may encounter.  Topics in this section include:
 

Scanner Drivers Supplied by the Manufacturer
Many scanners are shipped with one or more scanner drivers.  This is software that allows your computer to communicate with your scanner.  Some scanners do not require drivers and other scanners require more than one driver.  Refer to your scanner documentation for information about installing any required scanner drivers.  Make sure that your scanner and the appropriate scanner drivers supplied by the manufacturer are properly installed and configured before installing OmniPage Pro.

Scanner Drivers Supplied by Caere
OmniPage Pro is shipped with special scanner drivers that allow it to communicate with supported scanners.  These scanner driver files are installed on your computer when you install the Caere Scan Manager.  These drivers often work in conjunction with the drivers from your scanner manufacturer.  In order to use your scanner with OmniPage Pro, you must select the appropriate scanner in the Caere Scan Manager.  See Setting Up Your Scanner with OmniPage Pro for more information.

Problems Connecting OmniPage Pro to Your Scanner
Try these solutions if you experience a problem between OmniPage Pro and your scanner or if you receive a scanner error message when you launch OmniPage Pro.
 

Missing Scan Image Command
The Scan Image command does not appear in the Image button's drop-down list in the following cases:

Scanner Message on Launch
The first time you launch OmniPage Pro after installing or changing your current scanner in the Caere Scan Manager, you may get this message: This scanner's configuration is set using the system-level driver.  If it asks for no more information, click OK in the dialog box.  You may also have the option to select the following:

System Crash Occurs While Scanning
Try these solutions if a crash occurs during a scan:

Scanner Not Listed in Supported Scanners List Box
Try these solutions if your scanner is not listed in the Scan Manager Supported Scanners list box:

Scanning Tips
OCR results will be poor if an image is not scanned properly.  Remember the following tips when you scan:

OCR Problems
This section contains information and solutions for possible OCR problems.  Topics in this section include:

System Crash During OCR
Try these solutions if a crash occurs during OCR or if processing takes a very long time:

Text Does Not Get Recognized Properly
Try these solutions if any part of the original document is not converted to text properly during OCR:

Problems With Fax Recognition
Try these solutions to improve OCR accuracy on fax images:

Uninstalling the Software
Sometimes uninstalling and then reinstalling OmniPage Pro and the Caere Scan Manager will solve a problem.  OmniPage Pro's Uninstall program will not remove any files saved to the OmniPage Install directory or subdirectories, in addition to the following files:

To uninstall OmniPage Pro:

To uninstall the Caere Scan Manager: