PDF as a Records Management Document Solution
The evolution of PDF into the most widely used format for document workflows is significant, but it does not, in and of itself, demonstrate that PDF can be used reliably and legally in regulated areas, including records management (RM). In particular, corporations and government agencies have strict guidelines for RM and document retention. These guidelines, embodied in ARMA International’s Generally Accepted Recordkeeping Principles, require, among other things, that documents be reproducible in case of legal discovery or regulatory inquiry.
The PDF format has the necessary attributes to be used reliably as the document format of choice in the most demanding corporate and government RM applications.
For RM and archiving, records need to be authentic, reliable, complete and unaltered (integrity), and usable.
It can be shown that PDF directly supports each of these RM objectives.
With respect to authenticity, how can we prove that a record is what it claims to be? How do we show that it was created by a particular person at a particular time? One effective method is the use of metadata. Metadata is essential in long-term preservation because it lets users attach identifying information to a document, such as author, date, subject, and keywords. Embedding metadata also makes records more portable, because information about a document is kept both in the records database and in the document itself. In a managed document workflow environment, the creation, receipt, and transmission of records can be controlled so that each record creator is authorized and identified. The relevant metadata, such as who created the document and when, can be programmatically inserted into the PDF file. Digital signatures can also be used to verify that the document is unaltered.
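As an illustration of programmatic metadata insertion, the sketch below assembles the kind of document information entries (author, title, creation date, and so on) that a workflow system might write into a PDF. The function names pdf_date and build_info_entries are hypothetical, and the example assumes the PDF date string layout D:YYYYMMDDHHmmSSOHH'mm'. Writing the entries into an actual file would be done by whatever PDF library the workflow uses, which is not shown here.

```python
from datetime import datetime, timezone, timedelta


def pdf_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as a PDF-style date string."""
    if moment.tzinfo is None:
        # A record timestamp without a time zone is ambiguous, so reject it.
        raise ValueError("record timestamps should carry a time zone")
    total_minutes = int(moment.utcoffset().total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"D:{moment:%Y%m%d%H%M%S}{sign}{hours:02d}'{minutes:02d}'"


def build_info_entries(author, title, subject, keywords, created):
    """Collect document-level metadata under the usual Info dictionary keys."""
    return {
        "/Author": author,
        "/Title": title,
        "/Subject": subject,
        "/Keywords": ", ".join(keywords),
        "/CreationDate": pdf_date(created),
    }
```

In a managed workflow, the author value would come from the authenticated user account and the timestamp from the system clock, rather than from anything the record creator types in.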
Reliability is also a key RM objective. A record must be a faithful rendition of the event or transaction that it represents. To ensure this, the document should be created in a timely manner by someone with the relevant knowledge, or by an automated process that is used on a regular basis to generate such records. Using the PDF specification, an automated system can be designed and integrated into the document workflow to generate PDFs directly from the source application.
Another important RM objective is to maintain the integrity of the record by keeping it complete and unaltered. To accomplish this, a record needs to be protected against unauthorized alteration. This includes the ability to monitor any authorized annotations, additions, or deletions. A PDF file can be secured with password protection and encryption. This limits access to the document, even when it is sent as a routine email attachment, to those who are authorized. Access can be controlled and limited with respect to viewing, editing, and printing of PDF content. PDF also supports digital signatures, which can be applied so that no further modifications to a document are allowed once it is signed.
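A simple complement to these mechanisms is a fixity check: store a cryptographic digest of the file when the record is captured, then recompute it later to detect alteration. The following is a minimal sketch using SHA-256. It is not a PDF digital signature, the function names are hypothetical, and a bare digest says nothing about who produced the record.

```python
import hashlib
import hmac


def record_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a record's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, stored_digest: str) -> bool:
    """Compare the current digest with the one stored at capture time."""
    return hmac.compare_digest(record_digest(data), stored_digest)
```

The stored digest needs to be kept somewhere the record itself cannot overwrite, such as the RM database, or the check proves nothing.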
Usability is also an important RM property. A record must be easy to locate, retrieve, and render, and its contents must be easy to understand. There are at least two ways in which PDF can be considered very usable. The first is consistent full-text search and retrieval: all PDF documents are either text searchable or can be made so in a straightforward way. If the PDF is “digitally born,” it is searchable from its creation. If it is an image PDF, i.e., a captured paper document, it can be made searchable through Optical Character Recognition (OCR) and converted into the PDF image plus hidden text format. In this way, both electronic and image PDFs are supported by the full-text search engines of many well-known software vendors, such as Verity and Hummingbird. The second is metadata: PDF support for metadata, including encapsulated XML metadata, helps ensure that PDF files will remain usable with record-quality metadata well into the future.
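To make the idea of encapsulated XML metadata concrete, the sketch below builds a small XMP-style packet that carries a title, creators, and keywords as Dublin Core properties. It is an illustration under assumptions: the function name build_xmp is hypothetical, the xpacket wrapper that normally surrounds such a packet is omitted, and embedding the result in a PDF is left to a PDF library.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the RDF-based metadata packet.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{NS[prefix]}}}{tag}"


def build_xmp(title, creators, keywords):
    """Serialize title, creators, and keywords as an XML metadata packet."""
    root = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(root, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # The title is a language alternative with a default entry.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # Creators are an ordered list.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in creators:
        ET.SubElement(seq, q("rdf", "li")).text = name

    # Keywords are an unordered bag.
    bag = ET.SubElement(ET.SubElement(desc, q("dc", "subject")), q("rdf", "Bag"))
    for word in keywords:
        ET.SubElement(bag, q("rdf", "li")).text = word

    return ET.tostring(root, encoding="unicode")
```

Because the packet is ordinary XML, a records system can read it back with any XML parser, even without software that understands the rest of the PDF file.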
In summary, the PDF format supports authenticity, reliability, integrity, and usability, and so has the attributes needed to serve as the document format of choice in the most demanding corporate and government RM applications.