Challenges and Complexity of Document Archiving
The Proliferation of Data
As the volume of digital data grows, companies and organizations have far more information to keep track of. Emails and their attachments, incoming postal mail, and faxes are common examples of the documents they handle, and these documents include forms, invoices, checks, and contracts. Archiving a document should preserve the look and feel of the original, including margins and pagination. For that reason, Word and other Microsoft formats, whose rendering depends on platform settings and viewer preferences, are not suitable for archiving. JPEG and TIFF are image formats that can preserve the look of the original document, but they lose much of its functionality, such as text searching and rich metadata.
To find a newspaper or journal article from many years ago, one would often go to the library and pull the appropriate microfiche. The film faithfully reproduced the newspaper page, but retrieval was very time-consuming and the medium offered little support for search queries. There was also no effective way to attach metadata, i.e., data about the document, to the microfiche film.