Disoriented Images

Feb 18
2010

It’s impossible to avoid: scanning documents upside down is going to happen in all medium- to high-volume scanning scenarios. Fortunately, the technology exists to rotate images to the proper orientation very accurately. Let’s take a detailed look at how image auto-rotation and linear distortion correction work.

When documents are scanned in batch on a document scanner, it’s not uncommon for pages to be flipped the wrong way. It’s also not uncommon for pages to have a vertical shift from the top of the document to the bottom. To leverage the best data capture technology, or even for long-term storage of a document, the pages need to be right-side up and free of skew or distortion.

Image correction happens in two phases. The first is during the scan or just after it. These algorithms work on the image alone to determine the proper direction and eliminate skew. They are very fast and are sometimes built into the scanner driver itself. Image-based auto-rotate is not as accurate as contextual auto-rotate, but its speed makes up for it. The risk is also low, because an inaccuracy usually just means that a page that should have been rotated was left as it was. Very rarely will image-based auto-rotate mis-rotate an image, i.e. turn an image that was already in the correct orientation upside down. Once rotation is correct, the distortions can be checked. The time when a page is still in image format is the ONLY opportunity to correct linear distortion. The algorithms that perform this function work at the pixel level to determine the base alignment of the document vertically and horizontally, find the portions of the document that do not match that base alignment, and shift them into place.
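To make the pixel-level idea concrete, here is a minimal sketch of one common image-only approach, a projection profile search. It is not necessarily what any particular scanner driver does. It assumes the page is a small grid of 0/1 pixels and that skew can be modeled as a vertical shift that grows across the page. The function names and the candidate slopes are hypothetical, chosen only for illustration.

```python
def row_profile_score(pixels, slope):
    """Score a candidate slope: undo the shift, then sum squared row totals.

    When text lines sit flat, ink piles up in a few rows and the score is high.
    """
    height = len(pixels)
    totals = [0] * height
    for y, row in enumerate(pixels):
        for x, ink in enumerate(row):
            if ink:
                new_y = y - round(slope * x)  # shift this pixel back
                if 0 <= new_y < height:
                    totals[new_y] += 1
    return sum(t * t for t in totals)


def estimate_skew_slope(pixels, candidates):
    """Return the candidate slope whose correction gives the sharpest profile."""
    return max(candidates, key=lambda s: row_profile_score(pixels, s))


def make_skewed_page(width, height, slope, line_rows):
    """Build a toy test page: solid horizontal lines drifting by `slope`."""
    page = [[0] * width for _ in range(height)]
    for row in line_rows:
        for x in range(width):
            y = row + round(slope * x)
            if 0 <= y < height:
                page[y][x] = 1
    return page


page = make_skewed_page(40, 30, 0.1, [8, 16])
slope = estimate_skew_slope(page, [-0.2, -0.1, 0.0, 0.1, 0.2])
print(slope)
```

On the toy page, the search recovers the slope that was used to build it. A real implementation would work on scanned bitmaps, try much finer steps, and resample the image once the slope is known.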

Phase two of document correction is contextual auto-rotate. Using a full-page OCR read at several orientations, the software determines the orientation at which the quality of the read is best. This is the most accurate way to rotate a document. The only risky documents are those with little text or with text at various angles. In these cases, the software chooses the orientation of the MOST readable text.
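The try-each-orientation logic can be sketched in a few lines. This is only an illustration, assuming the page is a grid of values and that some OCR engine can report a read-quality score for it. The `read_quality` callable is a hypothetical stand-in for that engine, not a real library call.

```python
def rotate_90_clockwise(grid):
    """Return a copy of the grid turned a quarter turn clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def best_orientation(grid, read_quality):
    """Try 0, 90, 180 and 270 degrees and keep the best-scoring read.

    `read_quality` is a hypothetical callable standing in for a full-page
    OCR pass; it takes a grid and returns a number (higher is better).
    Returns (clockwise angle applied, corrected grid).
    """
    best_angle, best_score, best_grid = 0, None, grid
    current = grid
    for angle in (0, 90, 180, 270):
        score = read_quality(current)
        if best_score is None or score > best_score:
            best_angle, best_score, best_grid = angle, score, current
        current = rotate_90_clockwise(current)
    return best_angle, best_grid
```

Because ties go to the first orientation tried, a page with no readable text at all is left as scanned, which matches the low-risk behavior described above.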

Auto-rotate and deskew are a must when scanning documents for the purpose of OCR and data capture. The technologies are very accurate and are sometimes used on their own, purely for accurate document scanning and image storage.

Chris Riley – About

Find much more about document technologies at www.cvisiontech.com.