PDF is a widely used format for storing documents. It is common in the publishing industry and is also used on the web, for example to send files over the internet. Its main advantages are compact storage and font compatibility: fonts can be embedded in the file, so a document looks the same on every system. A PDF can hold text, vector graphics, and images. Even though the format is reasonably space-efficient, files can still become very large, particularly when they contain many images. In such cases, PDF compression can be used.
Compressing a PDF file means encoding its data so that it takes up less space. Compression works by replacing the data in the file with an encoded form that is smaller in size. When the file is read, this encoded data is decompressed, that is, decoded to recover the data as it was before compression. PDF compression comes in two types: lossless and lossy. With lossless techniques, no data is lost and the original is recovered exactly. However, the compression ratio is limited, and the file cannot be shrunk beyond a certain point. Lossy compression achieves a higher compression ratio but cannot guarantee that the original data is recovered exactly.
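The difference between the two types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard zlib module (the deflate method behind the Flate filter that PDF streams commonly use) for the lossless case, and a simple bit-dropping step as a stand-in for lossy compression. The function names and the sample data are made up for the example and do not come from any PDF tool.

```python
import zlib


def compress_lossless(data: bytes) -> bytes:
    """Encode data with deflate; every byte can be recovered."""
    return zlib.compress(data, 9)


def decompress_lossless(blob: bytes) -> bytes:
    """Decode deflate data back to the exact original bytes."""
    return zlib.decompress(blob)


def quantize_lossy(pixels: bytes) -> bytes:
    """Hypothetical lossy step: keep only the top 4 bits of each byte.

    The discarded low bits cannot be restored afterwards.
    """
    return bytes(b & 0xF0 for b in pixels)


# Made-up sample: a repetitive text-like stream and some fake pixel values.
sample_text = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n" * 200
sample_pixels = bytes(range(256)) * 16

packed = compress_lossless(sample_text)
restored = decompress_lossless(packed)

print("original bytes:  ", len(sample_text))
print("compressed bytes:", len(packed))
print("lossless round trip exact:", restored == sample_text)

quantized = quantize_lossy(sample_pixels)
print("lossy result equals original:", quantized == sample_pixels)
```

The lossless round trip returns exactly the input, while the quantized pixel data has fewer distinct values and can no longer be turned back into the original.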
Compressing a PDF file mainly involves removing redundancies in the file. Redundancies include unneeded blank space, repeated parts, and the like. They are eliminated by replacing each repeated part with a much shorter code, such as a single character. This is the primary step in compressing a PDF file, and it can reduce the file size by about 50%, although the actual saving depends on the content. The file is then analyzed for patterns in its data. These patterns can be used to encode the data more compactly, which compresses the PDF file further. The receiver uses decoding algorithms that replace the encoded data with the original data. Compression adds some overhead in time and processor power before transmission. This cost is offset by the benefits of compression, which saves bandwidth and time during transmission. PDF compression is performed by software tools that take a PDF file as input and produce a compressed PDF file that can be used for transmission.
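The redundancy-removal step can be illustrated with run-length encoding, one of the simplest ways to replace a repeated part with a shorter code. This is a minimal sketch, assuming plain text as input; it is not how any particular PDF compressor works internally, and the names rle_encode and rle_decode are chosen for this example only.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(text: str) -> List[Tuple[str, int]]:
    """Collapse each run of identical characters into (character, count)."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


def rle_decode(pairs: List[Tuple[str, int]]) -> str:
    """Expand (character, count) pairs back into the original text."""
    return "".join(ch * count for ch, count in pairs)


# Made-up sample with long runs of blank space and blank lines.
original = "Title" + " " * 12 + "Page 1" + "\n" * 4 + "Body"

encoded = rle_encode(original)
decoded = rle_decode(encoded)

print("characters in original:", len(original))
print("pairs after encoding:  ", len(encoded))
print("decoded matches original:", decoded == original)
```

The run of twelve spaces becomes the single pair (" ", 12), and the receiver rebuilds the original text by expanding each pair. Text with few repeated characters gains little from this scheme, which is one reason the saving varies from file to file.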