Dictionary Lookup & OCR
In comparing and rating OCR systems, recognition rate is usually the most significant feature. Recognition rates can be improved often by using dictionary lookups to aid in the recognition process. Essentially, when a dictionary is used to help in the OCR process, the system will bias its recognition in favor of words that are in the dictionary. Of course, there are many strings in a document that might not be in any dictionary, like an invoice number. So even if the system is using dictionary lookup, it must allow for strings that are not in the dictionary.
When using dictionary-aided OCR, there are different functions to determine which word in the dictionary is closest to a given string. A very typical distance metric that is used to measure word distance is edit distance. The edit distance is defined recursively, and can be implemented efficiently in a dynamic programming (DP) framework.
The algorithm
A commonly-used bottom-up dynamic programming algorithm for computing the Edit distance involves the use of an (n + 1) × (m + 1) matrix, where n and m are the lengths of the two strings. Here is pseudocode for a function EditDistance that takes two strings, s of length m, and t of length n, and computes the Edit distance between them:
int EditDistance(char s[1..m], char t[1..n])
// d is a table with m+1 rows and n+1 columns
declare int d[0..m, 0..n]
for i from 0 to m
d[i, 0] := i
for j from 1 to n
d[0, j] := j
for i from 1 to m
for j from 1 to n
if s[i] = t[j] then cost := 0
else cost := 1
d[i, j] := minimum(
d[i-1, j] + 1, // deletion
d[i, j-1] + 1, // insertion
d[i-1, j-1] + cost // substitution
)
return d[m, n]
Intuitively, we are interested in understanding how close a given string A is to a known dictionary word B. One way to measure the distance between any two strings is to compute the number of keystrokes or edits required to transform string A into string B. If this value can be computed efficiently for every string B in a given dictionary, then we can determine the dictionary word B that is closest to some string A. If this word is sufficiently close to the desired string, we assume the correct ASCII mapping for this string is B.
Dictionary-based OCR is usually advantageous and yields overall better recognition rates. Among the drawbacks: the system tends to run slower, and sometimes technical terms are "wrapped" into dictionary words.
Click here to read next topic: Rating an OCR System
Return to Table of Content




