A PDF file can be made smaller by compressing the data it contains, which saves storage space and shortens transmission time. In practice, most of the reduction typically comes from compressing images, since they tend to account for the bulk of a document. Because PDF documents can become very large, the format is designed to store its content in compressed form. Compression can be either lossy (some information is permanently lost) or lossless (all information can be restored), depending on the user's requirements.
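The difference between lossless and lossy compression can be illustrated with a small Python sketch. It is a toy demonstration and not how any PDF tool works internally: the function names are hypothetical, zlib stands in for a lossless method, and a simple rounding step stands in for a lossy one.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress with zlib (Deflate) and decompress again."""
    return zlib.decompress(zlib.compress(data))


def lossy_quantize(samples: bytes, step: int = 16) -> bytes:
    """Toy lossy step: round each 8-bit sample down to a multiple of `step`.

    Assumption for illustration only; real lossy codecs such as JPEG
    are far more sophisticated, but they also discard detail for good.
    """
    return bytes((s // step) * step for s in samples)


# 256 distinct 8-bit sample values, e.g. a grayscale ramp.
original = bytes(range(256))

restored = lossless_round_trip(original)   # identical to the original
quantized = lossy_quantize(original)       # only 16 distinct values remain
```

After the lossless round trip every byte is back in place, whereas the quantized data has collapsed 256 distinct values into 16 and the original values cannot be recovered from it.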
PDF (Portable Document Format) is one of the most common file formats for storing text-based and image-based information digitally. A PDF file consists mainly of objects, which are used to represent text, numbers, names, and images. Like PostScript, PDF is a page description language, but it is simplified and has restricted functionality, which gives it a more compact and better organized data structure. It supports efficient compression algorithms that can reduce the file size to about half that of an equivalent PostScript file. The compression methods supported by PDF include LZW (Lempel-Ziv-Welch), Flate (the algorithm used by ZIP, available since PDF 1.2), JPEG, JPEG2000 (since PDF 1.5), CCITT (the facsimile standard, Group 3 or 4), JBIG2 (since PDF 1.4), and RLE (Run Length Encoding). These compression filters produce binary data. If a 7-bit ASCII representation is required, the binary data can be further encoded as ASCII base-85.
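As a rough sketch of how a Flate filter and an ASCII base-85 filter can be chained, the Python standard library offers zlib (the same Deflate format used by Flate) and base64.a85encode. The helper names and the sample content stream below are assumptions made for illustration, and the code does not write a complete PDF stream object.

```python
import base64
import zlib


def flate_then_ascii85(data: bytes) -> bytes:
    """Compress with Flate, then encode the binary result as ASCII base-85."""
    compressed = zlib.compress(data, 9)
    # "~>" is the end-of-data marker used for ASCII base-85 data in PDF.
    return base64.a85encode(compressed) + b"~>"


def ascii85_then_inflate(encoded: bytes) -> bytes:
    """Reverse the two steps: decode base-85, then decompress."""
    if encoded.endswith(b"~>"):
        encoded = encoded[:-2]
    return zlib.decompress(base64.a85decode(encoded))


# Hypothetical page content: the same text-drawing operators repeated.
content = b"BT /F1 12 Tf (Hello) Tj ET\n" * 50

binary = zlib.compress(content, 9)
encoded = flate_then_ascii85(content)
```

The base-85 step expands the compressed data by roughly 25 percent (five ASCII characters for every four bytes), so it is worth applying only when the file really has to be 7-bit clean.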
These algorithms fall into two distinct categories: lossless and lossy. Lossless compression does not change the content of a file: if the file is compressed and then decompressed, the result is identical to the original. The following algorithms are lossless: CCITT Group 3 and 4, Flate (ZIP), LZW, and RLE. Lossy compression achieves better compression ratios by selectively discarding some of the information in the file. Such compression is used for images or sound files, but not for text or program data. JPEG is the typical lossy filter, while JPEG2000 and JBIG2 support both lossless and lossy modes. For grayscale or color images, Flate is usually the best lossless option; it works well on images with large areas of a single color or repeating patterns, such as screenshots, simple images created with paint programs, and black-and-white images that contain repeating patterns.
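Run Length Encoding is the simplest of the lossless methods and shows why flat areas compress so well. The sketch below follows the byte layout of PDF's run-length filter as commonly described: a length byte from 0 to 127 means "copy the next length + 1 bytes literally", a length byte from 129 to 255 means "repeat the next byte 257 - length times", and 128 marks the end of the data. The function names are hypothetical, and the encoder is a straightforward greedy one, not an optimal one.

```python
def rle_encode(data: bytes) -> bytes:
    """Greedy run-length encoder using the length-byte scheme described above."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (at most 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)      # 129..255: repeat the next byte
            out.append(data[i])
            i += run
        else:
            # Collect literal bytes until a run starts or 128 are gathered.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i] == data[i + 1])):
                i += 1
            out.append(i - start - 1)  # 0..127: copy the next bytes
            out += data[start:i]
    out.append(128)                    # end-of-data marker
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Decoder for the output of rle_encode."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        length = encoded[i]
        i += 1
        if length == 128:
            break
        if length < 128:
            out += encoded[i:i + length + 1]
            i += length + 1
        else:
            out += bytes([encoded[i]]) * (257 - length)
            i += 1
    return bytes(out)


# One scan line: 100 black pixels, 100 white pixels, then three distinct values.
row = b"\x00" * 100 + b"\xff" * 100 + b"\x01\x02\x03"
packed = rle_encode(row)
```

Here the 203-byte scan line shrinks to 9 bytes and decodes back to exactly the same bytes. Data without repeated values gains nothing from this scheme and can even grow slightly, because every literal block carries its own length byte.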