OCR (optical character recognition) converts mechanically or electronically typewritten text, as well as handwritten text, into a machine-editable form. Early systems had to be trained to read a specific font, whereas the intelligent systems in wide use today achieve a high degree of recognition accuracy. OCR also involves digital image processing.
File reduction is the process of reducing the size of image files for transmission, processing and storage, and it is achieved with compression techniques. Compression can be done with both standard and non-standard techniques, and standard techniques have some advantages over non-standard ones. The attributes of the original should be considered when selecting a compression technique: some file reduction techniques are meant for compressing pictures, while others are meant for compressing text. Both the level of compression and the technique used may affect the quality of the data.
There are two types of file reduction techniques.
1. Lossless compression reduces the storage space that an image file needs without any loss of data. When lossless compression is performed on an image, the decompressed image is identical to the image before compression. Lossless compression is mainly used with bitonal images.
2. Lossy compression also reduces the storage space required by an image file, but it does so by discarding information. Redundant information is discarded, along with information that the human eye cannot perceive. When the compressed image is decompressed, the resulting image differs from the original image.
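The difference between the two techniques can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the "image" is a toy sequence of 8-bit grayscale values, the lossless step uses the standard library's zlib (DEFLATE), and the lossy step is a simple quantization chosen for clarity, not the method any particular image format uses. The function names are hypothetical.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress with zlib (DEFLATE), then decompress again."""
    compressed = zlib.compress(pixels, 9)
    return zlib.decompress(compressed)


def lossy_quantize(pixels: bytes, step: int = 16) -> bytes:
    """Discard fine detail by rounding each value down to a multiple of step.

    This is an illustrative stand-in for a lossy codec: the discarded
    low-order detail cannot be recovered afterwards.
    """
    return bytes((p // step) * step for p in pixels)


# Toy 8-bit grayscale "image": two flat regions plus a slightly varying one.
original = bytes([0] * 500 + [255] * 500 + [100, 101, 102, 103] * 25)

restored = lossless_round_trip(original)   # identical to the original
quantized = lossy_quantize(original)       # same size, but values changed

print(restored == original)    # lossless: nothing was lost
print(quantized == original)   # lossy: the data differs from the original
```

In this sketch the lossless round trip returns exactly the original bytes, while the quantized version keeps the same number of values but no longer matches the original, which mirrors the distinction drawn in the list above.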
An OCR file splitter is a program that can be used for file reduction. It monitors a file or folder for the arrival of a multi-page document of TIFF images. When such a document arrives, it is split into a number of smaller multi-page documents. The split is made either after a fixed number of pages or according to the content of the file.
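The two splitting strategies can be sketched in Python as follows. This is a simplified illustration, not the implementation of any actual splitter program: pages are represented as items in a plain list rather than real TIFF frames, folder monitoring is left out, and the names split_by_page_count, split_by_content and starts_new_part are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

Page = TypeVar("Page")


def split_by_page_count(pages: Sequence[Page], pages_per_part: int) -> List[List[Page]]:
    """Split a document into parts holding at most pages_per_part pages each."""
    if pages_per_part < 1:
        raise ValueError("pages_per_part must be at least 1")
    return [
        list(pages[i:i + pages_per_part])
        for i in range(0, len(pages), pages_per_part)
    ]


def split_by_content(
    pages: Sequence[Page],
    starts_new_part: Callable[[Page], bool],
) -> List[List[Page]]:
    """Start a new part whenever starts_new_part(page) is true.

    The predicate is an assumption standing in for whatever content
    check a real splitter applies, such as detecting a cover page.
    """
    parts: List[List[Page]] = []
    for page in pages:
        if not parts or starts_new_part(page):
            parts.append([])
        parts[-1].append(page)
    return parts


# Fixed number of pages: seven pages in parts of three.
print(split_by_page_count(list(range(7)), 3))

# Content based: a page whose text starts with "cover" opens a new part.
document = ["cover A", "text", "cover B", "text", "text"]
print(split_by_content(document, lambda page: page.startswith("cover")))
```

With real TIFF files, the list of pages would be replaced by the frames read from the document, and each resulting part would be written out as its own multi-page file.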