
OCR, MRC & JPEG2000

One of the novel aspects of JPEG2000 is that, although the spec does not prescribe an algorithm for segmenting color images, it does support segmentation-based coding. In fact, the best color compression rates are obtained by analyzing, understanding, and reversing the page layout process. For JPEG2000 segmentation-based coding it therefore becomes important to separate foreground from background, and more specifically, text regions from non-text regions.

Many OCR systems can "tolerate" missing text regions that are not aligned horizontally or vertically. Color compression using Mixed Raster Content (MRC) coding, or coding based on JPEG2000 Part 6, is much less forgiving. The basis of MRC coding is separating the high-frequency signal information from the low-frequency information. Usually, the high-frequency information in an image is text-related. Of course, line art, edge structures, and other objects may also degrade when kept at low resolution. But text regions must be recognized and lifted into their own layer for MRC-based compression to be non-degrading. This requires finding all text regions, regardless of skew, rotation, etc.
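The layer separation described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as rows of 0-255 integers and uses a plain darkness threshold to decide what counts as text; that rule, and the names block_mean, mrc_split, and mrc_merge, are illustrative assumptions and not CVISION's segmentation method. The mask stays at full resolution, while the foreground and background layers are kept at reduced resolution.

```python
def block_mean(image, mask, want, scale, default):
    """Downsample by averaging, per block, only pixels whose mask bit equals `want`."""
    h, w = len(image), len(image[0])
    out = []
    for by in range(0, h, scale):
        row = []
        for bx in range(0, w, scale):
            vals = [image[y][x]
                    for y in range(by, min(by + scale, h))
                    for x in range(bx, min(bx + scale, w))
                    if mask[y][x] == want]
            # Blocks with no matching pixels fall back to a default value.
            row.append(round(sum(vals) / len(vals)) if vals else default)
        out.append(row)
    return out


def mrc_split(image, threshold=128, scale=2):
    """Split an image into a full-resolution mask and two low-resolution layers."""
    # Assumed rule: pixels darker than `threshold` are treated as text.
    mask = [[1 if px < threshold else 0 for px in row] for row in image]
    foreground = block_mean(image, mask, 1, scale, default=0)
    background = block_mean(image, mask, 0, scale, default=255)
    return mask, foreground, background


def mrc_merge(mask, foreground, background, scale=2):
    """Rebuild the image: the mask selects, per pixel, which layer to read from."""
    return [[(foreground if m else background)[y // scale][x // scale]
             for x, m in enumerate(row)]
            for y, row in enumerate(mask)]


# A 4x4 page with one dark vertical stroke on a light background.
image = [
    [250, 250, 10, 250],
    [250, 250, 10, 250],
    [240, 240, 10, 240],
    [240, 240, 10, 240],
]
mask, foreground, background = mrc_split(image)
restored = mrc_merge(mask, foreground, background)
```

In this toy case the stroke survives because every stroke pixel is in the mask. A text pixel missing from the mask would instead be averaged into the low-resolution background, which is the kind of degradation the paragraph above warns about.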

Accurate detection of all text regions also helps OCR accuracy. A system that supports MRC-based or JPEG2000 compression coding (like CVISION) therefore benefits from improved OCR as well. The detected text regions can be fed directly to the OCR engine to ensure that no image text is left unsearchable. Accurate OCR and perceptually lossless compression for color images both rely on robust, precise page segmentation of foreground text from the background image. This segmentation leads to the best color compression rates and the most reliable OCR.
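As a rough illustration of handing detected text regions to an OCR engine, the sketch below groups the 1-bits of a binary text mask into 8-connected blobs and returns one bounding box per blob. The function name text_boxes and the mask layout are assumptions made for this example; a production segmenter would also merge characters into lines and handle rotation explicitly.

```python
from collections import deque


def text_boxes(mask):
    """Return inclusive (left, top, right, bottom) boxes, one per 8-connected blob of 1s."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited text pixel.
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                # Visit all neighbours, diagonals included, clipped to the image.
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Two blobs: an upright one at top-left and a diagonal one at the right.
mask = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
boxes = text_boxes(mask)
```

Each box could then be cropped from the page and passed to whatever OCR engine is in use. Because diagonal neighbours count as connected, the skewed blob on the right is kept as a single region.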


Copyright (c) 1998-2007 CVISION Technologies, Inc.
CVISION, CVista, CBatch, and the CVISION logo are registered
trademarks of CVISION Technologies, Inc.

 