Scanner Distortions Resolved

To appreciate how JBIG2 can enhance image quality, it helps to understand the distortions introduced by the scanning process. When used properly, JBIG2 can rectify many of these distortions and make the image appear closer to the original printed document.

The scanning process transforms a document from the continuous space of the printed page to the quantized space of a digital image. Strictly speaking, the level of precision of the characters on a printed page is limited by the molecules of the page. However, from the perspective of the human visual system, the page is perfectly smooth.

With a scanned document, the precision of an image is limited by the scanning resolution (DPI). At higher resolutions (300 dpi and above) this typically causes no visible artifacts, but it still presents a problem. Even a perfect sensor with a well-behaved monotonic response to the image will often render a single character of a font as many different discrete shapes in the scanned image, an effect referred to here as fragmentation. The scanner can't identify each font and create the ideal digital representation of it. It is forced to digitize each character on its own, and slight differences in how each character is positioned in relation to the scanner grid will often result in different digital representations in the scanned image.

Consider the variations in these original-to-scanned examples of the letter "a". Take an idealized "a", as on the left in the first two examples. The scanner takes that "a" and aligns its pixels to the gridlines the scanner uses. The scan looks "boxy" rather than smooth because each pixel corresponds to one box of the grid, and in a bitonal scan each box must be either white or black; there can be no "partial" decision. As a consequence, the scanned image loses precision in a process known as quantization.

What is relevant for image compression is that the resulting bitmaps are highly sensitive to the precise placement of the scanning grid relative to the image. The first two examples show how a slight change in the positioning of the scanner grid causes the same model to result in different bitmaps. The third example shows the resulting bitmaps side by side.

Figure 7. This shows a scanner's grid overlay on an idealized "a" and the resulting scan. Note the boxy appearance of the bitonal scan.

Figure 8. This is a second example of a scanner's grid overlay on the same letter "a", but at a slightly different (smaller) font size, and its resulting bitonal scan.

 

Figure 9. Same letter, two scans. There are plenty of variations in the results, which makes accuracy and verifiability crucial elements in JBIG2 compression.
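The grid-alignment effect shown in the figures can be reproduced with a small simulation. The following is a minimal sketch, not a model of any real scanner: the function names `scan` and `ring`, the 8x8 grid, the offsets, and the rule that a pixel turns black when at least half of its area is covered by ink are all assumptions chosen for illustration. A ring stands in for the bowl of an idealized "a".

```python
def scan(inside, width, height, dx=0.0, dy=0.0, samples=4):
    """Bitonal scan of an ideal shape.

    `inside(x, y)` says whether a point of the page is ink. `dx` and `dy`
    shift the scanner grid relative to the page, in pixel units. Coverage of
    each pixel is estimated from a samples x samples lattice of points, and
    the pixel turns black when at least half of them land on ink (an assumed
    threshold).
    """
    rows = []
    for py in range(height):
        row = ""
        for px in range(width):
            covered = 0
            for sy in range(samples):
                for sx in range(samples):
                    x = px + (sx + 0.5) / samples + dx
                    y = py + (sy + 0.5) / samples + dy
                    if inside(x, y):
                        covered += 1
            row += "#" if covered * 2 >= samples * samples else "."
        rows.append(row)
    return rows


def ring(x, y):
    """Idealized glyph: a ring centred at (4, 4) with radii 1.5 and 3."""
    r2 = (x - 4.0) ** 2 + (y - 4.0) ** 2
    return 1.5 ** 2 <= r2 <= 3.0 ** 2


aligned = scan(ring, 8, 8)                    # grid lined up with the glyph
shifted = scan(ring, 8, 8, dx=0.4, dy=0.3)    # same glyph, grid moved by a fraction of a pixel

# Count the pixels on which the two scans of the same shape disagree.
differing = sum(a != b for ra, rb in zip(aligned, shifted) for a, b in zip(ra, rb))

for left, right in zip(aligned, shifted):
    print(left, " ", right)
print("pixels that differ:", differing)
```

Both bitmaps come from the same continuous shape; only the sub-pixel position of the grid changed, yet the two scans are not identical.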

A scanner's sensitivity to black is also not perfectly uniform across the image. This can cause the same font to appear slightly thicker in some parts of the page and slightly thinner in others. To compound the problem, many scanners pick up visual noise that did not appear in the original document, some of which may become attached to characters.
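The thickening and thinning can be illustrated by thresholding one cross-section of a stroke at two different sensitivities. This is only a sketch: the darkness values in `profile`, the two thresholds, and the helper `stroke_width` are hypothetical and do not describe any particular scanner.

```python
# Hypothetical darkness (0 = white, 1 = black) sampled across one pen stroke.
profile = [0.0, 0.2, 0.6, 0.9, 1.0, 0.9, 0.6, 0.2, 0.0]


def stroke_width(profile, threshold):
    """Number of pixels recorded as black when the sensor's cut-off is `threshold`."""
    return sum(value >= threshold for value in profile)


sensitive_region = stroke_width(profile, 0.5)    # region that is more sensitive to black
insensitive_region = stroke_width(profile, 0.7)  # region that is less sensitive to black
print(sensitive_region, insensitive_region)
```

The same stroke comes out wider where the scanner is more sensitive to black and narrower where it is less sensitive, so identical characters can end up with different bitmaps depending on where they sit on the page.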

This fragmentation of a font during digitization into many distinct characters has several drawbacks. The characters representing a given font and character code all appeared the same in the original document, and there is no benefit in their looking different in the scanned image. Some of these characters may even appear awkward and ill-formed. Compression suffers as well: every difference between those characters needs to be encoded, resulting in much larger files. To help overcome these problems, JBIG2 allows pattern matching and substitution.
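The general idea of pattern matching and substitution can be sketched as follows. This is an illustrative toy, not the matching procedure of any actual JBIG2 encoder: the pixel-difference measure, the greedy first-match strategy, the `max_diff` threshold, and the names `hamming` and `build_dictionary` are all assumptions.

```python
def hamming(a, b):
    """Number of pixels that differ between two bitmaps of the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def build_dictionary(glyphs, max_diff):
    """Greedy matching: reuse the first stored symbol within `max_diff` pixels.

    Returns the list of representative bitmaps and, for every input glyph,
    the index of the representative that stands in for it.
    """
    symbols = []
    indices = []
    for glyph in glyphs:
        for i, symbol in enumerate(symbols):
            same_size = len(symbol) == len(glyph) and len(symbol[0]) == len(glyph[0])
            if same_size and hamming(symbol, glyph) <= max_diff:
                indices.append(i)  # substitute the stored symbol for this glyph
                break
        else:
            symbols.append(glyph)  # no close match: store as a new symbol
            indices.append(len(symbols) - 1)
    return symbols, indices


# Three hypothetical 4x4 scans: two renderings of one character and one of another.
first = [".##.", "#..#", "####", "#..#"]
second = [".##.", "#..#", "####", "#.##"]   # differs from `first` by one pixel
other = ["###.", "#..#", "###.", "####"]    # differs from `first` by four pixels

symbols, indices = build_dictionary([first, second, other], max_diff=1)
print(len(symbols), indices)
```

With this threshold the two near-identical scans share one stored bitmap, so only two symbols need to be encoded for three characters. A threshold that is too loose would also merge genuinely different characters, which is why the accuracy of the matching step matters.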

 

 