Pattern Matching & Substitution
Pattern matching & substitution (PM&S) is perhaps the most powerful technique available within JBIG2. It enables the better JBIG2 implementations to achieve superior compression results even as they improve image quality. However, in the hands of a lesser JBIG2 implementation, PM&S can severely distort the image and lose information.
The premise behind PM&S is quite simple. If distinct characters on a scanned page are really different instances of the same character in the same font in the original document, you can improve image quality and drastically reduce file size by replacing each of them with a single shared bitmap in the compressed file.
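To make the premise concrete, here is a minimal sketch of the idea in Python. It is not how any particular JBIG2 encoder works: the pixel-count (Hamming) comparison, the greedy first-match strategy, the `max_diff` threshold, and the tiny 3x3 bitmaps are all assumptions chosen for illustration. Real encoders use far more careful matching criteria.

```python
def hamming(a, b):
    """Count differing pixels between two equal-sized bitmaps (lists of row strings)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_and_substitute(symbols, max_diff):
    """Greedy sketch of PM&S.

    Returns (dictionary, refs): `dictionary` holds one representative bitmap
    per group, and refs[i] is the dictionary index that stands in for symbols[i].
    """
    dictionary = []
    refs = []
    for bmp in symbols:
        for idx, rep in enumerate(dictionary):
            same_size = len(bmp) == len(rep) and len(bmp[0]) == len(rep[0])
            if same_size and hamming(bmp, rep) <= max_diff:
                refs.append(idx)  # substitute the existing representative
                break
        else:
            # No representative was close enough: start a new dictionary entry.
            dictionary.append(bmp)
            refs.append(len(dictionary) - 1)
    return dictionary, refs


# Hypothetical 3x3 "scanned" symbols: two noisy copies of one shape and a different shape.
h_clean = ["#..", "##.", "#.#"]
h_noisy = ["#..", "###", "#.#"]  # one pixel differs from h_clean
other = ["###", "#.#", "###"]    # five pixels differ from h_clean
symbols = [h_clean, h_noisy, other, h_clean]

strict_dict, strict_refs = match_and_substitute(symbols, max_diff=1)
loose_dict, loose_refs = match_and_substitute(symbols, max_diff=5)
```

With the strict threshold, the two noisy copies share one dictionary entry and the different shape keeps its own. With the loose threshold, all four symbols collapse into a single entry, so the different shape is replaced by the wrong bitmap. That is the kind of distortion a poorly tuned matcher can introduce.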
Pattern Matching and Substitution Sample
Consider the figure below. The left box contains all the instances of a lowercase “h” in a standard document. The right box contains the single bitmap which replaces all of them in the compressed file. As you can see, the instance of the font used by the JBIG2 encoder looks much nicer than many of the characters in the scanned document.
The cost of maintaining those 184 distinct instances of a lowercase “h” in the original document is very high. Each of them needs to be sent to the decoder, even though many of them are not particularly attractive and can even detract from the image quality. Merging them all into a single font allows a much smaller symbol dictionary, which can drastically reduce the file size.
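A rough back-of-the-envelope calculation shows the scale of the saving. The sketch below counts raw, uncompressed dictionary bits only. The 20x30 pixel glyph size is a hypothetical figure, not taken from the document, and a real JBIG2 encoder entropy-codes its bitmaps, so actual sizes would differ. Only the count of 184 instances comes from the text.

```python
def dictionary_bits(num_bitmaps, width, height):
    """Raw size of a symbol dictionary at 1 bit per pixel, with no entropy coding."""
    return num_bitmaps * width * height


INSTANCES = 184         # distinct lowercase "h" bitmaps, from the text
WIDTH, HEIGHT = 20, 30  # hypothetical glyph size in pixels

unmerged = dictionary_bits(INSTANCES, WIDTH, HEIGHT)  # every instance stored
merged = dictionary_bits(1, WIDTH, HEIGHT)            # one shared bitmap
ratio = unmerged // merged

print(unmerged, merged, ratio)
```

Under these assumptions the dictionary for this one letter shrinks by a factor equal to the number of instances merged. Each occurrence on the page still needs its position and a reference to its dictionary entry in either case, so the saving applies to the bitmap data rather than to the whole file.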