OCR Spell Checking Method
Optical Character Recognition (OCR) software recognizes text in image files and converts it into editable Word or plain-text documents. Many errors can occur during OCR if the quality of the document is poor or if the software is not good enough. The OCR spell checking method corrects the spelling mistakes that appear after OCR is performed, which helps achieve a high percentage of recognition accuracy.
Recognition of Errors
OCR software cannot produce 100% accurate recognition, so its output always needs some proofreading. As mentioned earlier, some OCR software uses feature extraction or pattern matching to recognize text characters, while other software uses the OCR spell checking method. After the OCR software is first run on the files, OCR spell checking is performed to resolve the characters that were not recognized correctly. The spell checking software tries to match words containing unrecognized letters against similarly spelled words in its dictionary. Suppose the character recognition software has output "tne". After consulting its dictionary, the OCR spell checking software will replace the incorrect letter "n" with the correct letter "h". Such software typically includes a dictionary to which the user can add new words. In this way, the errors introduced by OCR programs can be minimized.
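The dictionary lookup described above can be sketched in a few lines of Python. This is only a minimal illustration and not how any particular OCR product works: the tiny word list and the names DICTIONARY, add_word and correct_word are assumptions made for the example, and it handles only the simplest case, a single wrong letter in a word of the correct length.

```python
import string

# Hypothetical toy dictionary; real OCR packages ship far larger word lists.
DICTIONARY = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}


def add_word(word):
    """Add a user-supplied word to the dictionary (stored in lower case)."""
    DICTIONARY.add(word.lower())


def correct_word(word):
    """Return a corrected form of word, or word itself if no fix is clear.

    A known word is returned unchanged. Otherwise each letter is replaced
    in turn by every letter of the alphabet, and the result is accepted
    only if exactly one dictionary word is produced. The corrected word
    is returned in lower case.
    """
    lower = word.lower()
    if lower in DICTIONARY:
        return word

    candidates = set()
    for i in range(len(lower)):
        for letter in string.ascii_lowercase:
            candidate = lower[:i] + letter + lower[i + 1:]
            if candidate in DICTIONARY:
                candidates.add(candidate)

    # Leave ambiguous or unmatched words alone so a person can review them.
    if len(candidates) == 1:
        return candidates.pop()
    return word


print(correct_word("tne"))
```

With this word list, "tne" has a single one-letter neighbour in the dictionary, so it is corrected to "the". Returning the word unchanged when there are several candidates is a deliberate choice in this sketch, since an automatic guess could replace one error with another.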
OCR spelling errors differ from the spelling errors that occur in word processing software. OCR spelling mistakes result from the recognition problems of the machine or software, while the latter are typographical errors. Similarities in the shapes of letters, as well as the font and font size, can cause many problems for OCR spell checking. Indentation, justification and word breaks can also be mangled by OCR. A document might look fine at first, but when it is maximized the words turn out to be misplaced. Some types of spell checking software simply underline the unknown words, which makes manual checking and proofreading easier. The difficulty in using them is that the entire document has to be reviewed to find the spelling errors, and the resulting corrections must be performed or approved manually.
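The underline-only style of checking can be sketched as follows. Again this is an illustration under stated assumptions rather than the behaviour of a specific product: KNOWN_WORDS, flag_unknown_words and underline are hypothetical names, and the word list is deliberately tiny. The code only marks unknown words and leaves every correction to the reviewer.

```python
import re

# Hypothetical word list used only for this illustration.
KNOWN_WORDS = {"the", "cat", "sat", "on", "mat"}


def flag_unknown_words(text, known_words=KNOWN_WORDS):
    """Return (start_position, word) pairs for words not in known_words."""
    flagged = []
    for match in re.finditer(r"[A-Za-z]+", text):
        if match.group().lower() not in known_words:
            flagged.append((match.start(), match.group()))
    return flagged


def underline(text, flagged):
    """Build a line of '^' marks positioned under each flagged word."""
    marks = [" "] * len(text)
    for start, word in flagged:
        for i in range(start, start + len(word)):
            marks[i] = "^"
    return "".join(marks).rstrip()


line = "The cat sat on tne mat"
flagged = flag_unknown_words(line)
print(line)
print(underline(line, flagged))
```

Here only "tne" is marked. Every flagged word still has to be checked and corrected by a person, which is the manual effort described above.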