PDF Compression, Optimization and Search
The basic concept of file compression is to take a large file and make it smaller. There are many time- and money-saving reasons to compress files. In particular, the advantages of PDF file compression include saving storage space, reducing web hosting fees, transmitting and receiving files faster, and incorporating efficient OCR for fast, easy search and retrieval of files.
What is Document Compression?

The key challenge in both document and image compression is to compress files to their minimum size without sacrificing image quality. If an image is compressed with absolutely no change to the original image bitmap, the compression is called lossless.

When relying on a compression method in which data bits can change, such as a perceptually lossless method, it is important to validate that the method does not degrade the original data in any way. In fact, a good compression method should enhance, not degrade, the input data. Examples of perceptually lossless compression include MP3, for digital audio, and JBIG2, for digital scanned files. Both of these methods, if implemented correctly, can enhance the original digital data. Of course, improper implementation of any perceptually lossless compression standard can degrade the signal, so it is important to understand the distinctions between the different commercially available implementations of the same compression standard.

One reasonable test of whether a compression system is enhancing or degrading the data is to run both the original data and the compressed data through a recognition system, such as OCR (optical character recognition), and validate that recognition rates for the compressed data are at least as high as those for the original input data.

Lossy compression is any compression system that degrades the input data such that either humans notice a perceptual difference or machine recognition systems exhibit a statistically significant difference. Such lossy compression methods are generally not recommended for corporate document storage and retrieval, or for retention of digital image records.
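The lossless definition and the recognition-based test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's validation procedure: zlib stands in for a lossless codec, the bitmap is made-up sample data, the OCR text strings are assumed to come from an external OCR engine that is not shown, and the character-similarity ratio from difflib is only a rough stand-in for a real recognition rate. The `tolerance` parameter is a simple threshold, not a statistical significance test.

```python
import difflib
import zlib


def is_lossless(original: bytes, compress, decompress) -> bool:
    """Return True if a compress/decompress round trip reproduces the input bit for bit."""
    return decompress(compress(original)) == original


def recognition_rate(ground_truth: str, recognized: str) -> float:
    """Character-level similarity in [0, 1] between reference text and OCR output.

    Assumption: this ratio is a rough proxy for an OCR recognition rate.
    """
    return difflib.SequenceMatcher(None, ground_truth, recognized).ratio()


def compression_degrades(ground_truth: str, ocr_original: str,
                         ocr_compressed: str, tolerance: float = 0.0) -> bool:
    """Return True if OCR on the compressed image scores worse than OCR on the
    original image by more than `tolerance` (a hypothetical threshold)."""
    before = recognition_rate(ground_truth, ocr_original)
    after = recognition_rate(ground_truth, ocr_compressed)
    return after + tolerance < before


# Made-up 1-bit-per-pixel bitmap: 64 rows of a repeating 8-byte pattern.
bitmap = bytes([0b11110000, 0b00001111] * 4) * 64

# zlib is used here only as a convenient example of a lossless codec.
print("round trip is lossless:", is_lossless(bitmap, zlib.compress, zlib.decompress))
print("compressed is smaller:", len(zlib.compress(bitmap)) < len(bitmap))

# Hypothetical OCR outputs for the same page before and after compression.
truth = "invoice 1234"
print("degraded:", compression_degrades(truth, "invoice 1234", "invoice l234"))
```

In practice the two OCR strings would be produced by running the same recognition engine on the original scan and on the decompressed output of the codec under evaluation, over enough pages that the comparison is meaningful.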