
PDF Compression, Optimization, and Search


What is Document Compression?

The basic concept of file compression is to take a large file and make it smaller. There are many time- and money-saving reasons to compress files. For PDF files in particular, the advantages include reduced storage space, lower web hosting fees, and faster transmission and receipt of files. When compression is combined with efficient OCR, files also become fast and easy to search and retrieve.

The key challenge in both document and image compression is to reduce files to their minimum size without sacrificing image quality. If the decompressed image is bit-for-bit identical to the original image bitmap, the method is called lossless compression.
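The lossless property can be checked directly by compressing, decompressing, and comparing bytes. The sketch below uses Python's general-purpose zlib module purely as an illustration; it is not the codec used in any PDF product, and the toy bitmap and the function name are assumptions made for the example.

```python
import zlib

def is_lossless_round_trip(original: bytes) -> bool:
    """Compress, decompress, and check the result is byte-identical."""
    compressed = zlib.compress(original, 9)
    restored = zlib.decompress(compressed)
    return restored == original

# Toy 64x64 bitmap, one byte per pixel: white (255) with a black (0) bar.
bitmap = bytes(0 if 20 <= row < 24 else 255
               for row in range(64) for _ in range(64))

compressed_size = len(zlib.compress(bitmap, 9))
print(len(bitmap), compressed_size, is_lossless_round_trip(bitmap))
```

The file gets smaller, yet every pixel of the original can be recovered exactly, which is what distinguishes lossless compression from the other forms discussed here.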

To achieve compression of an order of magnitude or more, producing files 10x or even 100x smaller than the original, we need to consider other forms of compression. Most of the file size for compressed digital images and video is used to code noise and digitization artifacts, not the important symbolic information. Consequently, it is essential to identify which bytes correspond to "noise" and which correspond to "signal". When this signal/noise classification is done accurately, the resulting compressed file should appear identical to the original file. This compression method is referred to as perceptually lossless.
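To see how much of a compressed file can go to coding noise, the following sketch builds a small bitonal page, adds random speckle, and compares compressed sizes before and after a very simple despeckle pass. Everything here is an assumption made for illustration: the page layout, the 2% noise level, the isolated-pixel filter, and the use of zlib as a stand-in codec. Real signal/noise classification, as in a JBIG2 encoder, is far more sophisticated.

```python
import random
import zlib

WIDTH, HEIGHT = 128, 128

def make_clean_page():
    """Bitonal page (1 = ink, 0 = paper) with solid horizontal strokes."""
    return [[1 if row % 16 in (4, 5, 6) and 16 <= col < 112 else 0
             for col in range(WIDTH)] for row in range(HEIGHT)]

def add_speckle(page, fraction, seed=0):
    """Flip a random fraction of pixels to imitate scanner noise."""
    rng = random.Random(seed)
    noisy = [row[:] for row in page]
    for _ in range(int(fraction * WIDTH * HEIGHT)):
        r, c = rng.randrange(HEIGHT), rng.randrange(WIDTH)
        noisy[r][c] ^= 1
    return noisy

def despeckle(page):
    """Flip any pixel that disagrees with all of its in-bounds 4-neighbours."""
    out = [row[:] for row in page]
    for r in range(HEIGHT):
        for c in range(WIDTH):
            neighbours = [page[rr][cc]
                          for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= rr < HEIGHT and 0 <= cc < WIDTH]
            if all(n != page[r][c] for n in neighbours):
                out[r][c] = neighbours[0]
    return out

def compressed_size(page):
    """Pack pixels 8 per byte and return the zlib-compressed length."""
    bits = [p for row in page for p in row]
    packed = bytes(sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
                   for k in range(0, len(bits), 8))
    return len(zlib.compress(packed, 9))

clean = make_clean_page()
noisy = add_speckle(clean, fraction=0.02)
cleaned = despeckle(noisy)
sizes = {name: compressed_size(page) for name, page in
         (("clean", clean), ("noisy", noisy), ("despeckled", cleaned))}
print(sizes)
```

The noisy page compresses far worse than the clean one even though the strokes, the "signal", are the same, and removing isolated speckles recovers much of the difference. This filter is crude: it leaves noise that touches a stroke edge untouched, which is one reason accurate classification matters.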

When relying on a compression method, such as a perceptually lossless one, in which data bits can change, it is important to validate that the method does not degrade the original data in any way. In fact, a good compression method should enhance, not degrade, the input data. Examples of such compression forms include MP3 for digital audio and JBIG2 for scanned documents. Both of these methods, if implemented correctly, can enhance the original digital data.

Of course, improper implementation of any perceptually lossless compression standard can degrade the signal, so it's important to understand the distinctions between the commercially available implementations of the same compression standard. One reasonable test of whether a compression system enhances or degrades its input is to run both the original data and the compressed data through a recognition system, such as OCR (optical character recognition), and confirm that recognition rates for the compressed data are at least as high as those for the original input data.
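A minimal sketch of such a validation test follows. It assumes the OCR engine has already been run separately on the original and on the compressed file, and that a hand-verified ground-truth transcript is available. The sample strings, the function names, and the character-level scoring based on difflib are all illustrative assumptions, not part of any particular OCR product.

```python
from difflib import SequenceMatcher

def recognition_rate(ground_truth: str, ocr_text: str) -> float:
    """Fraction of ground-truth characters matched, in order, by the OCR text."""
    if not ground_truth:
        return 1.0
    matcher = SequenceMatcher(None, ground_truth, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ground_truth)

def compression_preserves_recognition(truth, ocr_original, ocr_compressed,
                                      tolerance=0.0):
    """True if the compressed file is recognized at least as well as the original."""
    return (recognition_rate(truth, ocr_compressed)
            >= recognition_rate(truth, ocr_original) - tolerance)

# Hypothetical OCR outputs for one line of a scanned invoice.
truth = "Invoice 20471 due on receipt"
ocr_from_original = "Invoice 2O471 due on receipt"    # letter O misread for zero
ocr_from_compressed = "Invoice 20471 due on receipt"

rate_original = recognition_rate(truth, ocr_from_original)
rate_compressed = recognition_rate(truth, ocr_from_compressed)
passed = compression_preserves_recognition(
    truth, ocr_from_original, ocr_from_compressed)
print(rate_original, rate_compressed, passed)
```

In practice the comparison would be run over a large, representative sample of pages, so that any difference in recognition rates can be judged statistically rather than from a single line.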

Lossy compression is any compression system that degrades the input data such that either humans notice a perceptual difference or machine recognition systems exhibit a statistically significant difference. Such lossy compression methods are generally not recommended for corporate document storage and retrieval, or for retention of digital image records.


Use Compression to Save Time and Money

With more companies hosting and sharing documents online and working in distributed database environments, recent advances in compression technology have become very relevant. It's hard to ignore the value of reducing a company's storage requirements by 90%. Web hosting fees, as well as the costs of archiving and storing data, are usually lowered significantly by compressing files. Reducing the transmission time of a file by a factor of 10 makes sharing documents more feasible and efficient.
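The arithmetic behind these figures is simple: a compression ratio of 10 leaves one tenth of the data, which is a 90% saving, and transfer time scales with file size. The sketch below works through it; the 50 MB file and the 10 Mbit/s link are made-up example values, and the calculation ignores protocol overhead and latency.

```python
def storage_saved_fraction(compression_ratio: float) -> float:
    """Fraction of storage saved when files shrink by the given ratio."""
    return 1.0 - 1.0 / compression_ratio

def transfer_seconds(size_megabytes: float, link_megabits_per_second: float) -> float:
    """Idealized transfer time: 8 bits per byte, no overhead or latency."""
    return size_megabytes * 8 / link_megabits_per_second

original_mb = 50.0   # example scanned document
ratio = 10.0         # example compression ratio
link_mbps = 10.0     # example network link

compressed_mb = original_mb / ratio
saved = storage_saved_fraction(ratio)
time_original = transfer_seconds(original_mb, link_mbps)
time_compressed = transfer_seconds(compressed_mb, link_mbps)
print(f"saved {saved:.0%}; {time_original:.0f} s before, {time_compressed:.0f} s after")
```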


PDF - The Best Format to Compress, Web Optimize, and Search your Files

There are many reasons why conversion to compressed, Web-optimized, searchable PDF format will yield a greater return on investment than alternative file formats. PDF is more universally viewable than TIFF, JPEG, or any other image format; it can be compressed automatically; and it can be made seamlessly text-searchable for immediate file retrieval.

Of course, many companies already have their bitonal files in TIFF format and their color images in JPEG. One solution is to use a software package that performs three functions: conversion, compression, and OCR. These programs allow users to input TIFF, JPEG, and many other image formats and easily convert them to compressed, Web-optimized, searchable PDF files. The end result is small, universally viewable, Web-ready files.

 


 
 
   
 


Copyright (c) 1998-2007 CVISION Technologies, Inc.
CVISION, CVista, CBatch, and the CVISION logo are registered
trademarks of CVISION Technologies, Inc.

 