CVISION Technologies

Document Imaging, Information, and Tech Support

Archive for the 'JBIG2 Compression' Category

PDF Compression

September 23rd, 2008 by Chris

Question: We need a PDF compression solution. A colleague recommended your compression product, PdfCompressor. Our organization deals with scanned images that can be either color or black & white. The scanners output either PDF or TIFF files. Do you offer a trial of the compression product?

Answer: Yes, we do offer a trial of PdfCompressor; the link is below. We accept both PDFs and TIFFs as input for compression. You said you have black & white files as well as color files. For B&W files, we can compress to as little as 10% of the original file size; for color documents, to as little as 1% of the original size.

Here is the link to a free trial for PdfCompressor:

www.cvisiontech.com/download_main.html

Category: CVISION PdfCompressor, Compress File, Document Compression, Evaluation PdfCompressor, JBIG2 Compression, PDF Compression, PDF Reduce, Shrink PDF, Tiff Compression, compress TIFF

OCR & JBIG2

June 5th, 2008 by Chris

There is a clear correlation between OCR and the new ITU bitonal JBIG2 standard. In particular, an important aspect of JBIG2 is font learning. The previous CCITT Group 4 (TIFF G4) image specification had no notion of fonts or font learning, whereas font learning is a very important part of the JBIG2 compression specs and one of the main reasons that JBIG2 compression rates can be as high as 10:1 relative to TIFF G4 compression.

Of course, font learning is important for OCR performance as well. When a font is “learned,” it imposes constraints on all the connected components that map to that font character. Three aspects of JBIG2 are relevant here: font models, global models, and composite models. Each is useful not only for compression but also for effective OCR rates. Font models, assuming a perfect font matcher, impose intra-page node constraints but do not impose any constraints between nodes on different pages. Global models impose inter-page constraints on nodes linked to the same global font model. Composite models impose n-gram constraints between groups of n consecutive nodes.
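To make the idea of font learning more concrete, here is a minimal Python sketch. It is not the PdfCompressor algorithm (which is proprietary) and not something prescribed by the JBIG2 spec; it simply assumes that each connected component is a tiny bitonal bitmap and that a component "maps to" a font model when it differs from that model by at most a few pixels. The names `learn_font_models`, `hamming`, and `max_diff` are made up for this illustration.

```python
from typing import List, Tuple

# A tiny bitonal bitmap: one string of '0'/'1' characters per pixel row.
Glyph = Tuple[str, ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def learn_font_models(glyphs: List[Glyph], max_diff: int = 1):
    """Greedy matching (an assumption of this sketch): each glyph joins the
    first model of the same size within max_diff pixels; otherwise it becomes
    a new model. Returns (models, assignments)."""
    models: List[Glyph] = []
    assignments: List[int] = []
    for g in glyphs:
        for idx, m in enumerate(models):
            same_size = len(m) == len(g) and len(m[0]) == len(g[0])
            if same_size and hamming(m, g) <= max_diff:
                assignments.append(idx)
                break
        else:
            # No existing model was close enough, so learn a new one.
            models.append(g)
            assignments.append(len(models) - 1)
    return models, assignments


# Four connected components from a hypothetical page.
page = [
    ("010", "111", "010"),  # a "plus" shape
    ("111", "101", "111"),  # a "box" shape
    ("010", "111", "011"),  # a noisy "plus" (one pixel differs)
    ("111", "101", "111"),  # another "box"
]
models, assignments = learn_font_models(page)
print(len(models), assignments)
```

Storing two models plus four small references instead of four full bitmaps is where the compression gain comes from, and the same assignments are the constraints an OCR engine could reuse: every component assigned to one model should receive the same character label.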

Most OCR engines perform recognition one page at a time, so there is no constraint satisfaction across different pages of the same document. JBIG2 compression lets a system see multiple inter-page constraints at the same time. Using model-based propagation across these constraints, the OCR process can be sped up considerably.

If you are interested in learning more about PdfCompressor with OCR and testing our free 30-day trial, see:
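As a rough illustration of model-based propagation, the Python sketch below assumes every node (connected component) already carries a global model id, and that the OCR engine returned a label for some nodes but not others. The data layout and the `propagate` function are hypothetical, not part of any CVISION API.

```python
from collections import Counter, defaultdict

# (page number, global model id, OCR label or None when no label was returned)
nodes = [
    (1, 7, "e"), (1, 7, "e"), (1, 9, "t"),
    (2, 7, None), (2, 9, "t"),
    (3, 7, "e"), (3, 9, None),
]


def propagate(nodes):
    """Fill in missing labels from other nodes that share a global model,
    regardless of which page those nodes are on."""
    votes = defaultdict(Counter)
    for _page, model, label in nodes:
        if label is not None:
            votes[model][label] += 1
    # Most frequent label seen for each global model.
    best = {model: counts.most_common(1)[0][0] for model, counts in votes.items()}
    return [
        (page, model, label if label is not None else best.get(model))
        for page, model, label in nodes
    ]


resolved = propagate(nodes)
print(resolved)
```

The unlabeled node on page 2 inherits "e" from pages 1 and 3, and the unlabeled node on page 3 inherits "t", without running recognition on those nodes again.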
http://www.cvisiontech.com/pdf_compressor_31.html

Category: All, JBIG2 Compression, OCR

Document Compression: Lossy vs. Lossless Compression

May 30th, 2008 by Chris

Question: What is the difference between lossy and lossless compression? How can I be sure that the converted documents look the same as the originals?

Answer: Lossless compression does not change any of the original pixel values as captured by the scanner or MFP device. A lossless color JPEG file is easily 10x-100x larger than a standard JPEG. In fact, no hospital we’re aware of stores its brain MRIs and CAT scans using a lossless format (such as lossless JPEG). For all real applications, certainly in the greyscale and color domains, some modification of original pixel values is accepted. The general rule is that compression and dpi reduction are allowed separately, or in conjunction, as long as the output image appears identical to the original. This condition, “appears identical,” is therefore key. Clearly, appearing identical is also a function of the device, application, practitioner, and other factors.

With perceptually lossless compression, pixel values are allowed to change provided the output image looks like the input image. With perceptually lossless compression, there should be no loss in readability. With effective perceptually lossless compression, recognition rates after compression should be identical to recognition rates before compression.
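The distinction can be sketched in a few lines of Python. Strictly lossless means every pixel value is unchanged. For "perceptually lossless" there is no single agreed test, so this sketch uses PSNR (peak signal-to-noise ratio) purely as a crude stand-in; the pixel values are invented sample data, and PSNR is an assumption of this example, not the measure PdfCompressor or ICert uses.

```python
import math
from typing import List


def is_lossless(original: List[int], decoded: List[int]) -> bool:
    """Strictly lossless: every pixel value is unchanged."""
    return original == decoded


def psnr(original: List[int], decoded: List[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit greyscale pixels before and after compression.
original = [12, 200, 34, 255, 0, 128]
decoded = [12, 201, 34, 254, 0, 128]

print(is_lossless(original, decoded))  # two pixels changed by one level each
print(round(psnr(original, decoded), 1), "dB")
```

Here the output is not lossless, yet the changes are a single grey level on two pixels, which no viewer is likely to notice. Whether a given PSNR is "good enough" still depends on the device, application, and practitioner, as noted above.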

Checking the documents manually is still the best way to be sure there are no differences between the original files and the converted ones. CVISION ICert is an automated accuracy checking system to verify that each output file accurately corresponds to its input file.

Category: All, Document Compression, File Compression, JBIG2 Compression

JBIG2 and PDF

May 21st, 2008 by Chris

Question: Does bitonal compression of scanned documents to JBIG2 format make sense on its own, or should such conversion be done as part of a general conversion to PDF format?

Answer: JBIG2 is a new ITU-approved, international standard for compression of scanned black and white files. The effectiveness of JBIG2 compression versus the previous ITU TIFF G4 standard is very much dependent on the JBIG2 compression software used. The quality of the scanned document is also a function of the JBIG2 software used, since the decompression specs for JBIG2 are open but the individual JBIG2 compression algorithms used are proprietary.

Using the right JBIG2 compression software can result in compression rates where the JBIG2 files are 5x-10x smaller than TIFF G4 and G4 PDF, with no loss of image quality.

Although JBIG2 is an ITU-approved format, it is still very new to the industry. The assumption that a typical client or system user has a pre-installed JBIG2 viewer is probably false. The advantages of using JBIG2-compressed PDF as the document format are severalfold. First, PDF fully supports JBIG2, so the compression advantages of JBIG2 can be fully utilized within the PDF specs. Second, Adobe Reader 5.0 and up can handle JBIG2-compressed files, so your user base most likely has a JBIG2-capable PDF reader pre-installed on their computers. Third, adding OCR searchability to your JBIG2-compressed file is very easy within the PDF specs using a hidden text layer. And finally, for multipage files that need to be web-hosted and viewed remotely, JBIG2 files that are made to fit the PDF specs (i.e., given a PDF wrapper) can take full advantage of the web-optimization feature supported by PDF and Adobe Reader, which means that large multipage files will open and display quickly on the Web.
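Inside a PDF, a JBIG2-compressed image is a stream whose dictionary names the `JBIG2Decode` filter, in the same way that a G4 image names `CCITTFaxDecode`. The Python sketch below is a quick heuristic for checking whether a file uses JBIG2: it only searches the raw bytes for the filter name, so it can miss unusual encodings of the name and is no substitute for a real PDF parser. The sample dictionaries are made-up fragments for illustration.

```python
def uses_jbig2(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw bytes mention the JBIG2Decode filter name."""
    return b"/JBIG2Decode" in pdf_bytes


# Hypothetical image stream dictionaries, not taken from a real file.
jbig2_sample = (
    b"<< /Type /XObject /Subtype /Image /Width 2550 /Height 3300 "
    b"/BitsPerComponent 1 /ColorSpace /DeviceGray /Filter /JBIG2Decode >>"
)
g4_sample = (
    b"<< /Type /XObject /Subtype /Image /Width 2550 /Height 3300 "
    b"/BitsPerComponent 1 /ColorSpace /DeviceGray /Filter /CCITTFaxDecode >>"
)

print(uses_jbig2(jbig2_sample), uses_jbig2(g4_sample))
```

For a file on disk, the same check would be `uses_jbig2(pathlib.Path("scan.pdf").read_bytes())`, where `scan.pdf` stands in for your own document.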

So, in short, there are serious advantages in converting scanned documents into JBIG2 format. But having decided to convert a database to JBIG2, there are additional features available and more file control when the files are converted to JBIG2-compressed PDF format.

Category: All, JBIG2 Compression, JBIG2 and PDF

Reduce Scanned Documents

May 6th, 2008 by Chris

Question: My office has started to scan our paper documents. We would like a program to reduce the file size of the documents. How can we test your software?

Answer: We offer a free 30-day evaluation of PdfCompressor. We can make typical black & white scans up to 10x smaller, and color scans up to 100x smaller.

Below is the link to download:

http://www.cvisiontech.com/download_main.html

Category: All, CVISION PdfCompressor, Compress File, Document Compression, JBIG2 Compression, PDF Compression, PDF Conversion, PDF OCR, Scanned Documents, Shrink PDF

Compressing PDF Files

April 1st, 2008 by Chris

Question: I want to compress my PDF files. What are your compression rates for typical bitonal captured files? I am looking to compress both PDF files and TIFFs.

Answer: For bitonal (black & white) scanned files, we can compress up to 10X smaller than the original. We do offer a 30-day trial so you can test the results yourself. I have attached the link below.

http://www.cvisiontech.com/download_main.html

Category: All, Compress File, Document Compression, Evaluation PdfCompressor, File Compression, JBIG2 Compression, PDF Compression

Batch compress PDF

February 15th, 2008 by Chris

Question: We scan about 1000 images daily; storage size is becoming more of an issue. Considering we deal with such a large volume of scanned files, we need a program to batch compress these PDFs. Do you offer a program to batch compress PDF files?

Answer: Yes, PdfCompressor Professional edition is designed for corporate batch compression and OCR. If you are interested in testing the product for 30 days, I have attached a download link below:

http://www.cvisiontech.com/pdfpro40_download.html

Category: All, Batch PDF OCR, CVISION PdfCompressor, Compress File, Document Compression, Evaluation PdfCompressor, JBIG2 Compression, PDF Compression

Decrease TIFF file size

December 13th, 2007 by Chris

Question: Does PdfCompressor decrease the file sizes of TIFF files as well as PDFs? I want to compress TIFF images, and then convert the TIFFs to PDF files. Also, will I need a special plug-in to view the compressed PDFs?

Answer: Yes, you can convert TIFF images into compressed PDFs with PdfCompressor. These PDFs are 100% compatible with Adobe Reader, which is free and installed on almost all computers today. The compressed PDFs are viewed just like any other PDF you encounter.

If you are interested in the free trial of PdfCompressor, I have attached the link below:

http://www.cvisiontech.com/download_main.html

Category: Adobe PDF Conversion, All, Compress File, Convert PDF, Document Compression, JBIG2 Compression, Tiff, Tiff Compression

Scanned Document Compression

November 1st, 2007 by Chris

Question: I need to reduce/compress the size of my scanned documents. Currently, they are PDFs; I want to keep them as PDFs, just make them smaller.

Answer: PdfCompressor takes scanned PDFs as input, and converts them to compressed PDFs. (We also take TIFFs, JPEGs, and other formats, and convert them to PDFs.) We can compress the size of scanned black & white documents up to 10X smaller, and color up to 100X smaller. PdfCompressor also makes the scanned files text-searchable with OCR.

If you are interested in evaluating PdfCompressor for free, click here:

http://www.cvisiontech.com/download_main.html

Category: All, CVISION PdfCompressor, Compress File, Document Compression, Evaluation PdfCompressor, File Compression, JBIG2 Compression, PDF Compression, PDF Document Conversion, PDF OCR

JBIG2 Compression and Document OCR

February 26th, 2007 by Chris

Question: Are JBIG2 file compression and document OCR completely separate problems? Should these processes be done together or separately? Does JBIG2 conversion prior to OCR lower the recognition rates?

Answer: JBIG2 and OCR are related problems. A basic element of JBIG2 compression is bottom-up font learning. This font learning is used for compression but can just as easily be used to cross-check the font mappings returned by the OCR engine. An effective JBIG2 compression algorithm can therefore be used to improve OCR recognition rates; see http://www.cvisiontech.com/pdf_compressor_31.html.

For example, global models in JBIG2, which are an effective compression tool, can also be used to propagate correct OCR mappings throughout the document.

In general, these processes should not be tightly linked: each process, JBIG2 and OCR, needs to be able to run without the other. One important reason for process separability is speed: JBIG2 tends to run at 3-5 pages/sec, while OCR can take 5 seconds per page. Another reason is that achieving good OCR rates requires the right language dictionary. JBIG2 compression is language independent and should not rely on any language-specific resources.

So the JBIG2 and OCR problems are certainly closely interrelated. Having said that, there are many reasons (including speed) to solve them separately and then combine the results. There should be an integration phase in which a higher-level module is aware of both the JBIG2 and OCR results and is able to combine them to achieve improved OCR (and maybe also JBIG2) results.
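To put the speed gap in numbers, here is a quick Python calculation using the rates quoted above. The batch size of 1000 pages is just an example figure.

```python
pages = 1000  # hypothetical batch size

# JBIG2 compression at 3-5 pages per second.
jbig2_slow_secs = pages / 3.0
jbig2_fast_secs = pages / 5.0

# OCR at roughly 5 seconds per page.
ocr_secs = pages * 5.0

print(f"JBIG2: {jbig2_fast_secs / 60:.1f} to {jbig2_slow_secs / 60:.1f} minutes")
print(f"OCR:   {ocr_secs / 60:.1f} minutes")
print(f"OCR is about {ocr_secs / jbig2_slow_secs:.0f}x to "
      f"{ocr_secs / jbig2_fast_secs:.0f}x slower")
```

At these rates the compression pass finishes in a few minutes while OCR takes well over an hour, which is why a pipeline that forces the two to run in lockstep would hold compression back to OCR speed.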

There are problems inherent in propagating OCR results across a document. If there are any OCR errors these can also propagate across the document with negative consequences. Obviously, it is important in any such fusion of OCR and JBIG2 results to make sure this kind of error propagation does not occur.
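One simple way to guard against this kind of error propagation is to propagate a label only when the nodes sharing a model strongly agree on it. The Python sketch below is one possible rule, not a description of how PdfCompressor does it; `consensus_label` and its thresholds `min_votes` and `min_agreement` are assumptions chosen for illustration.

```python
from collections import Counter
from typing import List, Optional


def consensus_label(labels: List[str], min_votes: int = 3,
                    min_agreement: float = 0.8) -> Optional[str]:
    """Return a label that is safe to propagate, or None if the OCR results
    for this model are too few or too inconsistent to trust."""
    if len(labels) < min_votes:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None


# OCR labels returned for nodes that share one global model (invented data).
print(consensus_label(["e"] * 9 + ["c"]))      # strong agreement
print(consensus_label(["e", "c", "e", "c"]))   # split vote, do not propagate
print(consensus_label(["e", "e"]))             # too few votes, do not propagate
```

When the function returns None, the nodes for that model keep whatever the OCR engine reported individually, so a single misread cannot be copied across the whole document.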

Reliable JBIG2 compression, done with precision, should not result in any degradation of the document. As such, JBIG2 conversion prior to OCR should not lower recognition rates. There are, however, JBIG2 compression implementations that are clearly lossy and degrading in nature. If one of these degrading JBIG2 methods is run prior to OCR, a drop in recognition rates can usually be expected.

Category: All, Document Compression, JBIG2 Compression, JBIG2 and PDF, OCR, OCR Download, OCR Software, Optical Character Recognition