OCR & Novel Fonts
In classical OCR, recognition systems were trained on a very specific set of fonts. If the fonts in a document deviated from that set in any material way, recognition rates fell off accordingly. Today's systems are much more robust and can handle the myriad novel fonts used in publishing and available on the Web. What becomes more relevant for modern OCR systems is adaptability. If the shapes of characters in a new font are fairly unpredictable, what can be relied upon? It would be nice if, at the very least, topological properties such as the Euler number (the number of connected components minus the number of holes) were preserved. But often even this property is not invariant, either because novel fonts modify basic character topology or because scanning noise introduces or eliminates holes.
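To make the fragility of topology concrete, here is a minimal sketch that computes the Euler number of a tiny binary glyph. The function names (`to_grid`, `count_components`, `euler_number`), the `#`/`.` bitmap format, and the connectivity convention (8-connected foreground, 4-connected background) are assumptions made for this illustration, not part of any particular OCR system.

```python
from collections import deque

# Neighbor offsets: 4-connectivity for background, 8-connectivity for foreground.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def to_grid(rows):
    """Convert text rows ('#' = ink, anything else = paper) to a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def count_components(grid, value, neighbors):
    """Count connected regions of cells equal to `value` using breadth-first search."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def euler_number(glyph):
    """Euler number = connected components minus holes."""
    width = len(glyph[0])
    # Pad with a background border so all outside background forms one region.
    padded = ([[0] * (width + 2)]
              + [[0] + list(row) + [0] for row in glyph]
              + [[0] * (width + 2)])
    objects = count_components(padded, 1, N8)
    holes = count_components(padded, 0, N4) - 1  # subtract the outer background
    return objects - holes


RING = to_grid(["####", "#..#", "####"])                      # like "O": one hole
TWO_HOLES = to_grid(["####", "#..#", "####", "#..#", "####"])  # like "B": two holes
BROKEN_RING = to_grid(["####", "#..#", "##.#"])                # ring with a gap

print(euler_number(RING), euler_number(TWO_HOLES), euler_number(BROKEN_RING))
```

In this sketch a single missing ink pixel is enough to open the ring, so the broken ring no longer has the same Euler number as the intact one. That is the kind of change scanning noise can cause.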
As a result, "shape-free" OCR has become more prevalent in recent OCR technology. These algorithms seek the appropriate mapping between the symbols learned from a font and the symbol alphabet, relying heavily on order statistics rather than on glyph shape. Among the methods used are numbered strings, which make use of word structure to limit or uniquely identify the correct mapping. The longer the document being analyzed, the more informative the document statistics (such as k-tuples) will be.
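One way to picture the numbered-string idea is the sketch below. It assumes that glyphs have already been grouped into clusters with arbitrary integer IDs and that a word lexicon is available. Each word is rewritten as the order in which its distinct symbols first appear, and only lexicon words with the same pattern remain as candidates. The names `numbered_string`, `build_index`, and `candidate_words`, the cluster IDs, and the tiny lexicon are all hypothetical.

```python
from collections import defaultdict


def numbered_string(symbols):
    """Replace each symbol by the rank of its first occurrence, e.g. 'that' -> (1, 2, 3, 1)."""
    first_seen = {}
    pattern = []
    for symbol in symbols:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        pattern.append(first_seen[symbol])
    return tuple(pattern)


def build_index(lexicon):
    """Group lexicon words by their numbered-string pattern."""
    index = defaultdict(list)
    for word in lexicon:
        index[numbered_string(word)].append(word)
    return index


def candidate_words(cluster_ids, index):
    """Return lexicon words whose pattern matches a word of glyph cluster IDs."""
    return index.get(numbered_string(cluster_ids), [])


# Hypothetical lexicon; a real system would use a large dictionary.
LEXICON = ["letter", "better", "hello", "people", "that"]
INDEX = build_index(LEXICON)

# Cluster IDs are arbitrary labels from glyph clustering; no shape is used.
print(candidate_words([12, 4, 31, 12, 8, 4], INDEX))  # pattern (1, 2, 3, 1, 4, 2)
print(candidate_words([7, 3, 9, 9, 3, 5], INDEX))     # pattern (1, 2, 3, 3, 2, 4)
```

With this lexicon the first word is identified uniquely as "people", which fixes the mapping for four clusters at once. The second pattern only narrows the choice to "letter" or "better". More words from a longer document add constraints that help resolve such ties.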
It would seem that shape-only OCR systems are somewhat limited in applicability. Such systems attempt to solve the OCR puzzle strictly from the shape of a component image. This approach can also be called context-free, since no neighboring context is required to determine the correct ASCII mapping. Similarly, OCR methods that are highly statistical can be thought of as context-sensitive, as they first compute order statistics, or k-tuples, and only then infer the ASCII mapping. A combination of context-free and context-sensitive methods, incorporating the geometric and topological properties of each component in conjunction with shape-free statistical methods, is probably the approach most likely to yield accurate OCR results.
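As a rough sketch of how the two kinds of evidence might be combined, the example below takes candidate words proposed by a context-sensitive step and prunes them with a simple topological check on each glyph. The `EXPECTED_HOLES` table is an assumption for a plain sans-serif font (other fonts may differ, for instance a two-story "g"). The function names and the `tolerance` parameter are likewise hypothetical.

```python
# Assumed hole counts for lowercase letters in a plain sans-serif font.
# Letters not listed are assumed to have no holes.
EXPECTED_HOLES = {"a": 1, "b": 1, "d": 1, "e": 1, "g": 1, "o": 1, "p": 1, "q": 1}


def is_consistent(word, glyph_holes, tolerance=0):
    """Check a candidate word against the hole count observed for each glyph.

    `tolerance` is the number of mismatching positions allowed, since
    scanning noise can introduce or eliminate holes.
    """
    if len(word) != len(glyph_holes):
        return False
    mismatches = sum(
        1 for letter, holes in zip(word, glyph_holes)
        if EXPECTED_HOLES.get(letter, 0) != holes
    )
    return mismatches <= tolerance


def prune_candidates(candidates, glyph_holes, tolerance=0):
    """Keep only the candidates that agree with the observed topology."""
    return [word for word in candidates if is_consistent(word, glyph_holes, tolerance)]


# Candidates that share one repetition pattern (context-sensitive evidence).
candidates = ["letter", "better", "bottom"]
# Holes measured on the six glyph images (context-free evidence).
observed_holes = [0, 1, 0, 0, 1, 0]

print(prune_candidates(candidates, observed_holes))
print(prune_candidates(candidates, observed_holes, tolerance=1))
```

With a strict match only "letter" survives. Allowing one mismatch keeps all three candidates. This is the trade-off a combined system has to tune: topology is useful evidence but, as noted above, not an invariant.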