What is OCR
OCR stands for Optical Character Recognition, a technology that converts a scanned image of a hard copy document into an electronic document containing machine-readable text. Programs built on this technology are called OCR engines. A scanner can save its output in various formats, for instance as a PDF file, but a scanned PDF holds only a picture of each page. When that file is processed by an OCR engine, the output is a fully text-searchable PDF document. Depending on the settings, the output may contain the recognized text alone or the original page image together with the recognized text. Either way, it is an electronic document that can be accessed by anyone from any part of the world if it is published on the World Wide Web.
How to Process a PDF Image with OCR
When a paper document is scanned with PDF chosen as the output format, you get a PDF image. This PDF image can easily be converted into a searchable electronic document using OCR. For this you need an OCR program. OCR processing is easy. First, install the OCR program and set it up to accept input files. Feed the PDF image into it. Experiment with different settings until you are satisfied with the output. Usually, processing at 300 dpi or higher yields good results. If you are processing multiple images, consider running the OCR program in batch conversion mode. This is usually a good idea if you don't need different settings for each page. If you do, manual processing is the right option for you.
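As a rough illustration of batch conversion, here is a minimal Python sketch. It is only one possible approach, not a description of any particular OCR product. The names OcrSettings, batch_ocr and tesseract_engine are hypothetical and were made up for this example. The sketch assumes each scanned PDF image is a separate file, that one default setting (300 dpi) suits most files, and that a few files may need their own settings. The optional tesseract_engine function further assumes that the third-party packages pdf2image and pytesseract, together with the Tesseract and Poppler programs they rely on, are installed.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class OcrSettings:
    """Hypothetical settings bundle used only in this example."""
    dpi: int = 300          # 300 dpi or higher usually gives good results
    language: str = "eng"   # language code handed to the OCR engine


# Any function that takes a PDF path plus settings and returns recognized text.
OcrEngine = Callable[[Path, OcrSettings], str]


def batch_ocr(
    pdf_files: Iterable[str],
    engine: OcrEngine,
    default: OcrSettings = OcrSettings(),
    overrides: Optional[Dict[str, OcrSettings]] = None,
) -> Dict[str, str]:
    """Run the engine over every file, using per-file settings where given."""
    overrides = overrides or {}
    results: Dict[str, str] = {}
    for name in pdf_files:
        pdf = Path(name)
        # Files listed in overrides get their own settings; the rest share the default.
        settings = overrides.get(pdf.name, default)
        results[pdf.name] = engine(pdf, settings)
    return results


def tesseract_engine(pdf: Path, settings: OcrSettings) -> str:
    """One possible engine (an assumption): pdf2image plus pytesseract."""
    # Imported here so the rest of the sketch works without these packages.
    from pdf2image import convert_from_path
    import pytesseract

    # Turn each PDF page into an image at the requested resolution.
    images = convert_from_path(str(pdf), dpi=settings.dpi)
    # Recognize the text on each page and join the pages together.
    pages = [pytesseract.image_to_string(img, lang=settings.language) for img in images]
    return "\n".join(pages)
```

With those packages installed, a call such as batch_ocr(["scan1.pdf", "scan2.pdf"], tesseract_engine) would process both files at the default 300 dpi, while passing overrides={"scan2.pdf": OcrSettings(dpi=600)} would give the second file its own setting. The file names here are placeholders. If nearly every file ends up needing an override, that is a sign that manual processing is the better fit.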
Where to Find a Good OCR Program
OCR programs are available on many websites, either for sale or for free. Many of these websites are owned by software companies that market their products online. Free OCR programs may be either freeware or shareware. While freeware is absolutely free, shareware can be used for free only for a limited period of time. So pick and choose from among the hundreds of choices you have.