What is PDF?
PDF stands for Portable Document Format. Adobe Systems originally released the format in 1993 as a proprietary format for representing and sharing documents. In 2008, PDF was released as an open standard, published by the International Organization for Standardization as ISO 32000-1:2008. The way PDF describes page content is heavily based on the PostScript page description language.
PDF is a container document format, meaning that a given PDF file can include many different types of objects such as text, images, hyperlinks, and occasionally even video. The PDF format is platform-independent, which means that the only thing needed to open a PDF file is a PDF reader, regardless of the type of computer or operating system it is opened on. The free Adobe Acrobat Reader (known for a time as Adobe Reader) is installed on a great many computers and is available to download. Because of its portability, PDF has become the preferred format for printable documents on the web and is widely used for circulating documents in general.
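One way to see this platform independence in practice: every PDF file begins with a short header such as %PDF-1.7, and software on any operating system can identify the file from those bytes alone. The following is a minimal sketch in Python; the helper name pdf_version and the sample bytes are illustrative assumptions, not part of any PDF library.

```python
import re

# Hypothetical helper for illustration; not taken from any PDF library.
_HEADER = re.compile(rb"%PDF-(\d+)\.(\d+)")

def pdf_version(data: bytes):
    """Return (major, minor) if the data begins with a PDF header, else None."""
    match = _HEADER.match(data)  # match() anchors at the first byte
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Made-up sample bytes shaped like the start of a PDF file.
sample = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(pdf_version(sample))         # (1, 7)
print(pdf_version(b"plain text"))  # None
```

A real reader does far more than this, but the check itself depends only on the file's bytes, not on the machine reading them.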
There are several subsets of PDF. The main ones are PDF/X (ISO 15930, developed by ISO TC 130), which is intended for graphics exchange and consequently contains extra printing requirements; PDF/A (ISO 19005), which is for long-term digital archiving and requires files to be self-contained (for example, fonts must be embedded) rather than reliant on resources outside the file; PDF/E (ISO 24517), for the exchange of engineering workflow documents; and PDF/UA (ISO 14289), which provides standards for accessibility by individuals using assistive technology. Of these, the PDF/A format is the most widespread, since more and more attention is being focused on digital archiving and the ideal of the 'paperless office'.
Unlike editable text files such as Microsoft Word documents or .txt files, PDF files are not always searchable. A PDF created by scanning paper stores each page as an image, so its text is not machine-readable and cannot be searched without further processing. OCR stands for optical character recognition, and refers to software that recognizes text in images and converts it to a machine-readable encoding such as ASCII or Unicode. One of the primary applications of OCR software is making PDF files text-searchable. To create a searchable PDF, the OCR software recognizes the text in the page images of the original PDF. It then adds to the PDF file an invisible layer of searchable text that lines up with the visible text. This enables the user to use the search function in their PDF reader and quickly locate information without having to look through the whole document.
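The invisible text layer can be illustrated at the level of PDF page content. PDF defines text rendering mode 3, set with the operator "3 Tr", in which text is neither filled nor stroked, so it can be selected and searched but not seen. The sketch below, in Python, builds such a layer from a list of recognized words. The function names, the (text, x, y, size) tuple layout, and the font name /F1 are assumptions made for this example; real OCR software produces richer output and must also handle fonts and non-ASCII text.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

def invisible_text_layer(words, font="/F1"):
    """Build page-content operators for an invisible, searchable text layer.

    `words` is a list of (text, x, y, size) tuples in PDF points. This shape
    is an assumption standing in for the output of a real OCR engine.
    """
    ops = ["BT", "3 Tr"]  # begin text; rendering mode 3 = invisible
    for text, x, y, size in words:
        ops.append(f"{font} {size} Tf")       # select font and size
        ops.append(f"1 0 0 1 {x} {y} Tm")     # absolute position on the page
        ops.append(f"({escape_pdf_string(text)}) Tj")  # show the word
    ops.append("ET")  # end text
    return "\n".join(ops)

# Made-up OCR results for two words on one line of a scanned page.
ocr_words = [("Invoice", 72, 720, 12), ("(draft)", 130, 720, 12)]
layer = invisible_text_layer(ocr_words)
print(layer)
```

Placing each word at the coordinates where it appears in the scanned image is what makes the highlighted search results line up with the visible text.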
PDF files, although portable in one sense, can be very large depending on their length and content. Large PDF files take a long time to load from websites, are difficult to email, and take up a significant amount of storage space. To combat this problem, PDF compression software was developed. Using PDF compression software on a PDF file reduces its file size, and optimizes it for viewing and transfer.
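As a small illustration of one technique involved, PDF can store its internal data streams compressed with the FlateDecode filter, which uses the same algorithm as zlib. The Python sketch below compresses some sample page content and wraps it as a PDF stream; the helper name flate_stream and the sample content are assumptions made for this example. Dedicated PDF compression software generally goes further than this, for instance by re-encoding images, so this covers only lossless stream compression.

```python
import zlib

def flate_stream(data: bytes) -> bytes:
    """Wrap data as a PDF stream compressed with the FlateDecode filter.

    Hypothetical helper for illustration; it emits only the stream
    dictionary and body, not a complete PDF object or file.
    """
    body = zlib.compress(data, 9)  # level 9 = strongest compression
    header = b"<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(body)
    return header + body + b"\nendstream"

# Made-up, highly repetitive page content so the effect is easy to see.
content = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n" * 200
compressed = zlib.compress(content, 9)

print(len(content), "bytes before compression")
print(len(compressed), "bytes after compression")

# Lossless: decompressing returns exactly the original bytes.
restored = zlib.decompress(compressed)
```

The savings depend heavily on the content: repetitive text compresses very well, while images that are already compressed shrink little with this method.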
CVISION Technologies, Inc. offers the leading software for PDF compression, OCR, and PDF/A archiving.