Scanner Distortion Resolved
To appreciate how JBIG2 can enhance image quality, it helps to understand the distortions introduced by the scanning process. When used properly, JBIG2 can rectify many of these distortions and make the image appear closer to the original printed document.
The scanning process transforms a document from the continuous space of the printed page to the quantized space of a digital image. Strictly speaking, the level of precision of the characters on a printed page is limited by the molecules of the page. However, from the perspective of the human visual system, the page is perfectly smooth.
With a scanned document, the precision of an image is limited by the DPI. At higher resolutions (300 DPI and above) this does not cause visible artifacts, but it still presents a problem. Even a perfect sensor with a well-behaved monotonic response to the image will often split a single font on the printed page into many different discrete shapes in the scanned image, an effect called fragmentation. The scanner can’t identify each font and create the ideal digital representation of it. It is forced to digitize each character on its own, and slight differences in how each instance of a character is positioned relative to the scanner grid will often result in different digital representations in the scanned image.
Consider the variations in these original-to-scanned examples of the letter “a”. Take an idealized “a”, as on the left in the first two examples. A scanner samples that “a” on its own pixel grid, so the scanned result looks “boxy” rather than smooth: each pixel is a grid box, and in a bitonal scan each box must be either entirely white or entirely black; there can be no “partial” decision. As a consequence, the scanned image loses precision in a process known as quantization. What is relevant for image compression is that the resulting bitmaps are highly sensitive to the precise placement of the scanning grid relative to the image.
Scanner Distortion Examples
In the first two examples, you see how a slight change in the positioning of the scanner grid causes the same model to result in different bitmaps. The third example shows the resulting bitmaps side by side.
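The grid sensitivity described above can be reproduced in a few lines. The following is a minimal sketch, not part of JBIG2 or of any scanner's firmware: a filled disk stands in for the idealized character, and the hypothetical scan function marks a pixel black when at least half of its sub-samples fall inside the shape. The names disk, scan, and render, the 8x8 grid, and the 0.4-pixel shift are all assumptions chosen for illustration.

```python
def disk(x, y):
    """Idealized 'printed' shape: a filled disk of radius 2.6 centered at (4, 4)."""
    return (x - 4.0) ** 2 + (y - 4.0) ** 2 <= 2.6 ** 2


def scan(shape, size, offset_x=0.0, offset_y=0.0, samples=4, threshold=0.5):
    """Quantize a continuous shape onto a size x size bitonal grid.

    offset_x and offset_y shift the scanner grid relative to the page, in pixels.
    """
    bitmap = []
    for row in range(size):
        line = []
        for col in range(size):
            covered = 0
            # Estimate how much of this pixel the shape covers by sub-sampling it.
            for sy in range(samples):
                for sx in range(samples):
                    x = col + (sx + 0.5) / samples + offset_x
                    y = row + (sy + 0.5) / samples + offset_y
                    if shape(x, y):
                        covered += 1
            # Bitonal decision: each pixel is black (1) or white (0), never partial.
            line.append(1 if covered / samples ** 2 >= threshold else 0)
        bitmap.append(tuple(line))
    return tuple(bitmap)


def render(bitmap):
    """Draw a bitmap as text, '#' for black and '.' for white."""
    return "\n".join("".join("#" if p else "." for p in row) for row in bitmap)


aligned = scan(disk, 8)                 # grid aligned with the shape
shifted = scan(disk, 8, offset_x=0.4)   # same shape, grid shifted by 0.4 pixel
differing = sum(a != b for ra, rb in zip(aligned, shifted) for a, b in zip(ra, rb))

print(render(aligned))
print()
print(render(shifted))
print("pixels that differ:", differing)
```

The shape passed to scan is identical in both calls; only the grid offset changes, yet the two bitmaps are not the same.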
Furthermore, a scanner’s sensitivity to black is not perfectly uniform across the image. This can cause the same font to appear slightly thicker in some parts of the page and slightly thinner in others. To compound the problem, many scanners pick up visual noise that did not appear in the original document, some of which may become attached to characters.
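The thickening and thinning effect can be illustrated with a one-dimensional cross-section of a stroke. This is a sketch under stated assumptions rather than a model of any real scanner: the stroke position (1.75 to 4.25 pixels), the two threshold values, and the helper names stroke_coverage and binarize are hypothetical, with a lower threshold standing in for a region of the page where the scanner is more sensitive to black.

```python
def stroke_coverage(left, right, n_pixels):
    """Fraction of each unit-wide pixel covered by a stroke spanning [left, right]."""
    return [max(0.0, min(right, i + 1) - max(left, i)) for i in range(n_pixels)]


def binarize(coverage, threshold):
    """Bitonal decision per pixel: black (1) if coverage reaches the threshold."""
    return [1 if c >= threshold else 0 for c in coverage]


# A stroke 2.5 pixels wide that only partly covers the pixels at its edges.
coverage = stroke_coverage(1.75, 4.25, 6)

sensitive = binarize(coverage, 0.2)     # region more sensitive to black (assumed)
insensitive = binarize(coverage, 0.5)   # region less sensitive to black (assumed)

print("coverage:   ", coverage)
print("sensitive:  ", sensitive, "width", sum(sensitive))
print("insensitive:", insensitive, "width", sum(insensitive))
```

The same stroke comes out four pixels wide under the lower threshold and two pixels wide under the higher one.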
This fragmentation of a font during digitization into many distinct characters has several drawbacks. The characters representing a given font and character code all appeared the same in the original document, and there is no benefit in their looking different in the scanned image. Some of these characters may even appear awkward and ill-formed. From a compression perspective, there is a further problem: every difference between those characters needs to be encoded, resulting in much larger files. To help overcome these problems, JBIG2 allows pattern matching and substitution.
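The general idea behind pattern matching and substitution can be sketched as follows. This is a toy illustration, not the JBIG2 encoding procedure or any particular encoder's matching rule: it assumes equal-sized bitmaps, uses a plain pixel-difference count as the similarity measure, and the function names and the max_diff tolerance are hypothetical.

```python
def hamming(a, b):
    """Number of pixels that differ between two equal-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_and_substitute(glyphs, max_diff):
    """Replace each glyph with the first stored representative within max_diff pixels.

    Returns the dictionary of representatives and, for each input glyph,
    the index of the representative that stands in for it.
    """
    dictionary = []
    indices = []
    for glyph in glyphs:
        for i, representative in enumerate(dictionary):
            if hamming(glyph, representative) <= max_diff:
                indices.append(i)  # close enough: reuse the stored bitmap
                break
        else:
            dictionary.append(glyph)  # no match: store as a new representative
            indices.append(len(dictionary) - 1)
    return dictionary, indices


# Two scans of the same character that differ by one pixel, plus a different one.
A1 = ("0110", "1001", "1111", "1001")
A2 = ("0110", "1001", "1111", "1011")
C = ("1111", "1000", "1000", "1111")

dictionary, indices = match_and_substitute([A1, C, A2, A1], max_diff=2)
print("representatives stored:", len(dictionary))
print("glyph -> representative:", indices)
```

Four scanned characters are stored as two bitmaps plus a list of indices, so the one-pixel difference between A1 and A2 no longer needs to be encoded. The tolerance is the sensitive design choice in a scheme like this: if it is set too loosely, genuinely different characters could be merged.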