Selecting the right document imaging system can be a daunting task. There are many aspects to be considered to ensure the chosen product fits your organization's needs. Based on the experience we have gained while creating document imaging solutions for our customers, we've assembled a list of things to look for, including some essentials and some nice extras.

Evaluating Your Needs

When deciding on a document imaging system, there are a number of questions to consider.
  • How many documents will the system store? Consider both the number of existing documents and the number of documents added annually. This information determines how much storage space is needed, the hardware configuration and the cost of the system.
  • How many users will be using the system at the same time? This determines preliminary software costs and server size.
  • What departments will be using the system and will the public have access? This determines what specific features and levels of security will be needed.
  • What serious problems must absolutely be solved, and what issues should be addressed to make life easier, reduce costs or improve productivity? This determines which functions will be system requirements and which might be optional. It also helps determine whether plug-ins or customizations will be needed.
  • Do you want a turnkey solution or a customized one? This determines the amount of consulting, installation, training, configuration and support that is needed.
  • What type of network is currently used — NT, Server 2000, Novell, or other — and will this platform remain the same in the foreseeable future? This determines network constraints, system configuration and workstation upgrades.
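As a rough illustration of the first question, the sketch below turns document counts into a storage estimate. The function name, the sample office and the 50 KB-per-page figure are assumptions chosen for illustration (the page size is roughly in line with the figure of about 12,000 pages per CD quoted under Storage, assuming a 650 MB disc); substitute your own measurements.

```python
def estimate_storage_gb(existing_pages, pages_per_year, years, kb_per_page=50):
    """Rough storage estimate in gigabytes over a planning horizon.

    kb_per_page is an assumed average size for one scanned page.
    """
    total_pages = existing_pages + pages_per_year * years
    total_kb = total_pages * kb_per_page
    return total_kb / (1024 * 1024)


# Hypothetical office: 500,000 pages on file, 100,000 added per year, 5-year horizon.
estimate = estimate_storage_gb(500_000, 100_000, 5)
print(f"Estimated storage: {estimate:.1f} GB")
```

Running the same calculation with a few different page sizes and growth rates gives a range rather than a single number, which is usually more useful when budgeting.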
What things should be considered when selecting an imaging system?

Scanning

First, a document imaging scanner must have an Automatic Document Feeder (ADF). This device allows a stack of paper to be placed in a tray and automatically fed through one page at a time, much like a fax machine. Scanners without an ADF were designed for graphics and require you to place each page manually. Second, don't skimp on the scanner. It's nice to have the fastest scanner available, but it's more important to match the scanner to your budget and to the size and volume of paper you have to scan.

Most scanners can handle letter-size paper (8 1/2" x 11") and smaller. Some scanners can handle paper up to 11" x 17". There are even scanners that can support E-sized (34" x 44") drawings, but the larger the paper size, the more expensive the scanner.

Speed is another consideration in choosing a document scanner. Document imaging scanners can handle between 10 and 100 pages per minute. If you need to scan 3,000 pages per day, it's worth buying a fast scanner. However, the faster the scanner, the higher the price.

Sometimes it is cheaper to buy two 20-page-per-minute scanners than one 40-page-per-minute scanner. If you choose this option, you will need a document imaging system that can support multiple scan stations.

If the scanning job is very large, it may be more realistic and economical to have an outside service bureau scan most of the documents. If you choose this option, make sure you have selected an imaging package that allows easy synchronization of the service bureau pages with those scanned in-house. That way, you will be able to add the scanned documents from the service bureau into your database without interrupting or re-indexing the work you have done in your office. This capability is often referred to as "portable volumes."

Similarly, if you only have a few large-size documents, it may be more cost-effective to have a service bureau scan these images. Another alternative is to use a photocopier to reduce the large document to a smaller one that can be scanned in-house.
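The speed trade-off discussed above comes down to simple arithmetic. The sketch below estimates daily feed time for the 3,000-page workload mentioned earlier; the function name is hypothetical, and the calculation ignores document preparation, paper jams and indexing time, so treat the results as a lower bound.

```python
def scan_hours_per_day(pages_per_day, pages_per_minute, stations=1):
    """Hours of pure feed time per day, assuming all stations run in parallel."""
    return pages_per_day / (pages_per_minute * stations) / 60


# Compare one slow scanner, two mid-range scanners and one fast scanner.
for ppm, stations in [(10, 1), (20, 2), (40, 1)]:
    hours = scan_hours_per_day(3000, ppm, stations)
    print(f"{stations} x {ppm} ppm: {hours:.2f} hours per day")
```

Two 20-page-per-minute stations and one 40-page-per-minute scanner give the same feed time on paper, so the choice between them rests on price, staffing and whether the imaging software supports multiple scan stations.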

A good imaging system will let you choose from a wide range of scanners and will be flexible enough to bring in documents from outside sources.

Storage

Whether you are the keeper of the public record or just storing office correspondence, a solid storage system is a must. Even electronic images need a place to reside, and for the purposes of imaging, this place needs to be long-term, expandable and reliable.

There are many different storage media available for imaging. Each one has its own strengths and weaknesses. For an imaging system, a good storage system must accommodate changing technologies, growing document volumes and the test of time. Selecting the right medium depends on your needs and your budget.

In short, there are five storage options.
  1. Magnetic Media
  2. Magneto-Optical Storage
  3. Compact Disks
  4. DVDs
  5. WORM
Magnetic Media - With fast response times and dramatic drops in hard drive prices, magnetic media such as hard drives or RAID (Redundant Array of Inexpensive Disks) systems are becoming a popular choice for storing document images. These devices are relatively inexpensive and can be linked together to store large numbers of documents. In addition, magnetic media provide the fastest response time (the time it takes to store and retrieve a document). The problem with magnetic media is that, while inexpensive, they still cost more than optical media, and their moving parts are subject to mechanical failure. That's why computer personnel regularly schedule backups of hard drives: if data is erased or damaged, it can be restored from backup.

Magneto-Optical Storage - With the drops in hard drive prices, the attractions of magneto-optical storage are quickly fading. Magneto-optical (MO) disks are reliable and can store large amounts of data. In addition, MO disks can be placed in a jukebox that can hold over a hundred disks at a time. However, MO technology is slower and more expensive than large hard drives, and the media are fragile: as with magnetic drives, the information is written on a spinning platter, which can be erased or damaged. This type of damage would require restoration from backup.

Compact Disks - Most people are familiar with CDs from music and data disks available at many retail outlets. CDs offer a safe and reliable medium that can provide long-term storage for images, in some cases up to 100 years. Disks can also be stored in jukeboxes that can hold 500 CDs at a time. Furthermore, CDs do not require any specialized hardware or software to retrieve information. The drawback with CDs is their limited storage capacity: a standard CD can only hold around 12,000 pages of documents. CD jukeboxes and towers make it more convenient to store large numbers of documents on large numbers of CDs.
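To see what the capacity limit means in practice, the sketch below uses the two figures quoted above (about 12,000 pages per CD and 500 CDs per jukebox) to estimate how many disks and jukeboxes a collection would need. The function names and the one-million-page example are illustrative assumptions.

```python
import math

PAGES_PER_CD = 12_000    # approximate figure quoted in the text
CDS_PER_JUKEBOX = 500    # jukebox capacity quoted in the text


def cds_needed(total_pages):
    """Number of CDs required, rounding up to whole disks."""
    return math.ceil(total_pages / PAGES_PER_CD)


def jukeboxes_needed(total_pages):
    """Number of 500-disk jukeboxes required to keep every CD online."""
    return math.ceil(cds_needed(total_pages) / CDS_PER_JUKEBOX)


# Hypothetical collection of one million pages.
pages = 1_000_000
print(cds_needed(pages), "CDs in", jukeboxes_needed(pages), "jukebox(es)")
```

At these figures a single full jukebox holds about six million pages, so the inconvenience of CDs is mostly the number of disks to burn and manage rather than the total capacity.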

DVDs - Visually similar to CDs, these disks offer the same storage capacity as an MO disk without using moving parts in the media or requiring special software for decoding. With the life expectancy of CDs, DVD represents the best long-term option for reliable document imaging storage. The drawback to this medium is its high cost. Currently, affordable DVD recorders do not exist, but all indications are that the industry will make DVD available for mass usage within twelve months. When that happens, DVDs will probably make MO disks outdated.

WORM - The final storage medium is WORM (Write Once Read Many). This format is not readily available and requires specialized hardware and software to operate. Because of the limited number of companies that provide materials and support for WORM technology, it is not highly recommended.

The most important thing to remember is that a good document imaging system must be able to use any media format currently available - as well as those on the horizon - to provide long-term document storage.

Indexing

When paper documents are received in an office, they must be organized to be useful. Documents are labeled, sorted, stapled, placed in folders and filed in a cabinet. Without these steps, nothing could be found in a busy workplace. Electronic documents are no different. A document imaging system must have a comprehensive indexing system that organizes documents for future use.

There are three different ways to index (organize) electronic documents.
  1. Indexing words inside the document
  2. Storing documents in folders
  3. Assigning index fields to a document
Indexing Words Inside the Document - Traditionally, keyword indexing has been used to make the information within a document available. Assigning keywords from the document itself allows users to store and find pages later. Unfortunately, it can take a lot of time for qualified people to read documents and assign keywords manually. Document imaging systems can eliminate the need for manual keyword indexing by providing automatic full-text indexing. To do this, the software must have the capability to perform Optical Character Recognition (OCR). This process reads a scanned page and converts it into computer-readable text. Once the page has been read, the imaging software can automatically index every word, tracking the location of each word and phrase within every document. This dramatically reduces indexing costs while providing improved searching capabilities.
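The idea of tracking the location of each word can be sketched as a small positional index. This is an illustrative toy, not how any particular imaging product works: it assumes the OCR step has already produced plain text, and the function names and sample documents are hypothetical.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(documents):
    """Map each word to a list of (document_id, word_position) pairs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(WORD.findall(text.lower())):
            index[word].append((doc_id, position))
    return index


def find_phrase(index, phrase):
    """Return IDs of documents that contain the words of `phrase` consecutively."""
    words = WORD.findall(phrase.lower())
    if not words:
        return set()
    hits = set()
    for doc_id, start in index.get(words[0], []):
        # Every later word must appear at the next position in the same document.
        if all((doc_id, start + offset) in index.get(word, [])
               for offset, word in enumerate(words[1:], 1)):
            hits.add(doc_id)
    return hits


# Hypothetical OCR output keyed by document ID.
ocr_text = {
    "memo-1": "Zoning permit approved by council",
    "memo-2": "Council approved the zoning budget",
}
index = build_index(ocr_text)
print(find_phrase(index, "zoning permit"))
```

Because positions are stored along with the words, the same index answers both single-word and phrase searches without anyone having chosen keywords in advance.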

Storing Documents in Folders - Along with keyword or full-text indexing, an imaging system must have a visual method of filing documents. In any office, files are normally found by looking in a particular folder in a particular drawer in a particular file cabinet. An imaging system must have the ability to duplicate this filing system. A flexible folder structure eases the transition from paper filing to electronic filing and makes imaging systems more successful.

Assigning Index Fields to a Document - The final method of organizing documents is through index fields or templates. An imaging system must use a robust index field structure to accommodate large volumes of documents. Generally, these structures are based on a database that maintains these index fields. Whatever the data storage design, it is important that it be non-proprietary and expandable. Proprietary systems put the user at the mercy of a vendor who can alter service, costs or functionality without the customer's consent.

Because of the need to integrate imaging systems with other applications, these databases must use industry-standard languages and tools such as SQL-compliant databases. Systems that do not provide commercially available databases lock users into technologies and systems that may not keep pace with advancements in the computer industry.
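As an illustration of index fields kept in a SQL database, the sketch below uses SQLite through Python's standard `sqlite3` module. The table layout, column names and sample records are assumptions made for the example; a real imaging product will define its own schema.

```python
import sqlite3


def create_schema(conn):
    """Create hypothetical tables for documents and their index fields."""
    conn.executescript("""
        CREATE TABLE documents (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            folder TEXT NOT NULL
        );
        CREATE TABLE index_fields (
            document_id INTEGER NOT NULL REFERENCES documents(id),
            field_name  TEXT NOT NULL,
            field_value TEXT NOT NULL
        );
        CREATE INDEX idx_field_lookup ON index_fields (field_name, field_value);
    """)


def add_document(conn, name, folder, fields):
    """Store a document record plus one row per index field."""
    doc_id = conn.execute(
        "INSERT INTO documents (name, folder) VALUES (?, ?)", (name, folder)
    ).lastrowid
    conn.executemany(
        "INSERT INTO index_fields VALUES (?, ?, ?)",
        [(doc_id, key, value) for key, value in fields.items()],
    )
    return doc_id


def find_by_field(conn, field_name, field_value):
    """Return the names of documents whose index field matches exactly."""
    rows = conn.execute(
        """SELECT d.name FROM documents d
           JOIN index_fields f ON f.document_id = d.id
           WHERE f.field_name = ? AND f.field_value = ?
           ORDER BY d.name""",
        (field_name, field_value),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
add_document(conn, "Permit 1042", "Planning/Permits", {"applicant": "Acme Corp", "year": "1999"})
add_document(conn, "Invoice 77", "Finance/Invoices", {"vendor": "Acme Corp", "year": "1999"})
print(find_by_field(conn, "year", "1999"))
```

Because the fields live in ordinary tables, any other application that speaks SQL can query or join against them, which is the integration benefit described above.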

Whatever combination of indexing methods is used, it's important that they be easy to use and understand, both for the people who need to retrieve the documents and for the people who need to file them.

Retrieval

Once documents have been entered and indexed within an imaging system, rapid retrieval is a must. Users need to be able to use common-sense tools to find any document within the system based on the most logical method. In some cases this means searching by text; in others, it means using the document folder or index field information. Whatever the method, document retrieval must be simple and user-friendly.

Retrieval is where a powerful indexing system pays off. Users who are familiar with a document's text should be able to use that information to find what they want. Some systems can only find pages based on keywords assigned to the page. This method is not always helpful because the person who selected the keywords is often not the person searching for the document. To be truly useful, a document imaging system must be able to use full-text retrieval.

Similarly, using the document name and folder view to find a document is also nice, but not always the best method. Once an imaging system contains thousands or millions of pages, folder trees become more complicated and document names become less unique. To assist searches, an imaging system needs to combine different criteria into one comprehensive search.

The same is true for index field information. A full-featured imaging system will have user-definable template fields. Index field searches will allow a user to comb through millions of records in seconds to find the document necessary. Having the flexibility to combine template searches along with text and document names offers users the greatest control of their documents. A good imaging system makes retrieval of relevant documents fast, easy and efficient.
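Combining criteria can be pictured as applying every supplied filter to each document and keeping only those that pass all of them. The sketch below is a simplified in-memory illustration; the `Document` class, the `search` function and the sample records are hypothetical, and a production system would run such a query against its database and full-text index rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    folder: str
    text: str
    fields: dict = field(default_factory=dict)


def search(documents, text=None, folder=None, name=None, fields=None):
    """Return documents that match every criterion that was supplied."""
    results = []
    for doc in documents:
        if text and text.lower() not in doc.text.lower():
            continue
        if folder and not doc.folder.startswith(folder):
            continue
        if name and name.lower() not in doc.name.lower():
            continue
        if fields and any(doc.fields.get(k) != v for k, v in fields.items()):
            continue
        results.append(doc)
    return results


docs = [
    Document("Permit 1042", "Planning/Permits", "Zoning permit approved", {"year": "1999"}),
    Document("Permit 1043", "Planning/Permits", "Zoning permit denied", {"year": "2000"}),
    Document("Memo 12", "Planning/Memos", "Zoning discussion notes", {"year": "1999"}),
]
matches = search(docs, text="zoning", folder="Planning/Permits", fields={"year": "1999"})
print([doc.name for doc in matches])
```

Each criterion alone returns two or three documents here; only the combination narrows the result to one, which is the point of a comprehensive search.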

Access

The final component of a document imaging system is access. In today's computer environment, users work in different locations, with different equipment and different access rights. A full-featured imaging system must permit access to those users who need it, without compromising security. To create this access, a system must have two fundamental features:
  1. Broad availability
  2. Comprehensive security
Broad availability - An imaging system must offer different ways of accessing images. The most common method is through the user's desktop. Every document imaging system must provide a client-based user interface that enables the scanning, indexing and retrieval of documents. Without this basic interface, the system cannot function. To provide broad availability and access flexibility, imaging systems now must meet the requirements of offices with diverse uses and remote locations. Document imaging is no longer an "in-the-office" process. Many users require portability to exchange imaging information with other colleagues or to work off-site. An imaging system that does not offer this flexibility limits not only the usefulness of the system but also the abilities of the user.

In addition, sharing documents through the Internet or an intranet allows system administrators to deploy imaging systems across their entire network and to the public. Having browser-based document access removes the final limitations that can plague an imaging system. Users can search, retrieve and view documents with the simplicity of a web browser from any desktop on any platform at any location.

A broad level of access to document imaging is a must to save limited financial resources, intellectual capital and network bandwidth.

Comprehensive security - The ability to provide imaging to a larger group means stronger control must be placed on user access. A comprehensive security system must allow the system administrator to control what users can or cannot do as well as what they can or cannot see. The system must control access to folders, documents and even redacted pages and text in a simple and complete manner. The ability to deploy imaging to a wide variety of users requires a robust security system combined with an elegant user interface.
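One way to picture controlling both what users can do and what they can see is an access control list attached to folders. The sketch below is a simplified assumption, not a description of any specific product: the right names, group names and the rule that a folder inherits from its nearest ancestor with an entry are all illustrative choices.

```python
from enum import Flag, auto


class Right(Flag):
    VIEW = auto()
    SCAN = auto()
    INDEX = auto()
    VIEW_REDACTED = auto()  # may see text hidden behind redactions


NO_RIGHTS = Right(0)

# Hypothetical ACL: folder path -> {group name: rights granted}
ACL = {
    "Records": {"staff": Right.VIEW, "public": Right.VIEW},
    "Records/HR": {"hr": Right.VIEW | Right.INDEX | Right.VIEW_REDACTED},
}


def rights_for(acl, groups, folder):
    """Rights for a user's groups, taken from the nearest folder with an ACL entry."""
    path = folder.rstrip("/")
    while True:
        if path in acl:
            granted = NO_RIGHTS
            for group in groups:
                granted |= acl[path].get(group, NO_RIGHTS)
            return granted
        if "/" not in path:
            return NO_RIGHTS  # no entry anywhere up the tree: deny by default
        path = path.rsplit("/", 1)[0]


print(Right.VIEW in rights_for(ACL, ["public"], "Records/Permits"))
print(Right.VIEW in rights_for(ACL, ["public"], "Records/HR/Payroll"))
```

In this sketch an entry on a subfolder replaces the entry inherited from its parent, so the HR folder is closed to everyone except the groups listed on it. Denying access when no entry is found is the conservative default.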

A good access system will make document imaging available to everyone, whether they are in an office or at a remote location, all without compromising system security.