CVISION Technologies

Document Imaging, Information, and Tech Support

Archive for January, 2007

PDF Conversion, PDF Compression, and OCR Recognition

January 31st, 2007 by Chris

CVISION has resellers all over the world. Our Italian reseller, IDM Consulting Srl, recently received a question regarding PdfCompressor:

Question: I need to reduce the size of my scanned documents, convert the images to PDF, and make the text searchable with an OCR engine. Does PdfCompressor support these needs?

Answer: PdfCompressor makes your documents fully searchable thanks to an accurate OCR engine. CVISION PdfCompressor 3.1 makes PDF files web-friendly. PdfCompressor 3.1 compresses black-and-white scanned documents by a factor of typically 5-10x, and color scanned files by a factor of 10-100x. Our recognition process, unique in its accuracy, ensures the best possible recognition even on heavily compressed documents.

It is the only complete software package offering image-processing features such as advanced PDF compression, OCR, Bates stamping, PDF settings options, web optimization, and encryption.

Try the 30-day trial version yourself, available at http://www.cvisiontech.com/download_main.html

If you would like to contact our local reseller in ITALY:

IDM Consulting Srl
corso Appio Claudio 5
10143 - TORINO (TO)
mailto:pghivarello@idmconsulting.it

Category: All, International Resellers, OCR Software | No Comments »

Document Compression

January 29th, 2007 by Chris

Question: What document formats does PdfCompressor accept as input?

Answer: PdfCompressor accepts 11 input formats in total, including PDF, TIFF, and JPEG, and converts them into compressed, web-optimized, text-searchable PDF files with an OCR text layer.

For a free trial of PdfCompressor, click here:

http://www.cvisiontech.com/download_main.html

Category: All, CVISION PdfCompressor, File Compression, PDF Compression | No Comments »

PDF Images

January 27th, 2007 by Chris

Question: Recently, I have scanned many color documents. When I view them on the computer, the color quality is good. However, once I print them, they come out blurry. Is that supposed to happen?

Answer: No, that is not supposed to happen. PdfCompressor’s specialty is compression of scanned color documents. We can compress scanned color documents by up to 100x while maintaining file quality. From the same file, PdfCompressor generates an intelligent scanned image document with an embedded OCR layer and full text searchability. The most likely reason for the blurriness is that PdfCompressor uses state-of-the-art technology, whereas most printers do not have the latest technology. The remedy is to keep your printer updated with the newest technology.

If you are interested in trying us out, follow the link below.

http://www.cvisiontech.com/download_main.html

Category: All, Create PDF, PDF OCR | No Comments »

Adobe PDF

January 26th, 2007 by Chris

Question: I am looking for info regarding Adobe PDF files.

Answer: We have a resource you can view:

http://www.cvisiontech.com/pdf/adobe-pdf/adobe-pdf-software.html

Category: Adobe PDF Conversion | No Comments »

Real-time Remote Database Backups

January 25th, 2007 by Chris

Question: We need to keep real-time remote backups of our company’s database system. We also want to archive our RM documents for long-term retention. Are these two distinct problems, or are they closely coupled and in need of an integrated solution?

Answer: These two problems, remote backups and document archiving, can certainly be decoupled. If possible, however, it seems more efficient to find an integrated solution. Once a document is attached to a database, modifying it takes effort. For example, in a records management (RM) database system, modifications to a document are strictly controlled. All such modifications, e.g., adding OCR to scanned files, must be done through API calls to the RM database, so that an audit log of all modifications is kept and available on demand. As a consequence, it is much easier to modify a document before it is entered into the database system and becomes a document of record.

The advantage of converting a document early in the workflow into whatever format is intended for long-term archiving is not just that it simplifies attaching the file to the database. The leading format for document archiving is PDF, together with its archival variant, PDF/A. PDF supports long-term archiving (PDF/A), web-optimization for web-based hosting, and compression for fast data transfer and efficient storage. It is generally more efficient to convert corporate documents once, before archiving. If the chosen format supports compression, it is possible to greatly reduce the time required for offsite backups, as well as the web-hosting costs, which are typically based on the size of the hosted data.

For example, PDF is probably the most common format used for document RM and archiving. For scanned documents, PDF also supports JBIG2 and MRC compression, with file size reductions of up to 10x for bitonal files and up to 100x for color files. If conversion to PDF is done at the time of capture, or at least prior to database attachment, then compression, web-optimization, text search, and metadata insertion can all be done once, while generating the PDF file; see, for example, http://www.cvisiontech.com/pdf_compressor_31.html. This PDF-based compression allows faster real-time backups and cheaper web-hosting, and obviates the need to modify the document later for the purpose of archiving.
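To make the backup-time argument concrete, here is a rough back-of-the-envelope sketch in Python. The archive size, link speed, and function name are illustrative assumptions, not measurements. The 100x ratio is the upper bound quoted above for color files, so real results will often be lower.

```python
def transfer_seconds(size_mb: float, link_mbit_per_s: float,
                     compression_ratio: float = 1.0) -> float:
    """Estimate the time to send a file over a link, ignoring protocol overhead.

    size_mb is in megabytes (10^6 bytes); link speed is in megabits per second.
    """
    if compression_ratio <= 0 or link_mbit_per_s <= 0:
        raise ValueError("ratio and link speed must be positive")
    compressed_mb = size_mb / compression_ratio
    return compressed_mb * 8 / link_mbit_per_s  # 8 bits per byte


# Hypothetical example: 1,000 MB of scanned color files over a 10 Mbit/s link.
uncompressed = transfer_seconds(1000, 10)        # no compression
best_case = transfer_seconds(1000, 10, 100)      # assumed best-case 100x
print(f"uncompressed: {uncompressed:.0f} s, at 100x: {best_case:.0f} s")
```

Under these assumptions the same backup drops from roughly 13 minutes to a few seconds, which is why compressing once, before archiving, pays off on every later transfer.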

Category: All, Remote Backups | No Comments »

Shrinking a PDF document

January 24th, 2007 by Chris

Question: I want to shrink my PDF documents. I understand PdfCompressor can reduce the size of scanned documents. What sort of compression results can I expect by using PdfCompressor?

Answer: It depends on the files. We are able to achieve up to 100x compression for color documents and up to 10x compression for black & white documents, although sometimes the results are somewhat lower. It is best to simply test the software for free and see what happens. We offer a free 30-day trial; I have attached a link below. Or, if you would prefer, you can send your files to support@cvisiontech.com, and I will take a look at them and see what results we can get.

Here is the link for a free download:

http://www.cvisiontech.com/download_main.html

Category: All, CVISION PdfCompressor, Compress File, Document Compression, File Compression, JBIG2 Compression, Reduce PDF File Size, Shrink PDF, Tiff Compression | No Comments »

Advantages of PDF Files

January 23rd, 2007 by Chris

Question: Several clients send my company TIFF, JPEG, and PDF files all the time, and the company’s scanner output is TIFF. Would it be advantageous for the company to convert the TIFF and JPEG files into PDF files, so that we have a single file format?

Answer: It would be advantageous for your company to convert TIFFs and JPEGs into PDFs. PDFs are readily viewable on any PC with Adobe’s free reader. The nature of the PDF format allows documents to be printed and viewed uniformly. Furthermore, PDFs can be equipped with an OCR layer to make your documents text-searchable: OCR converts image documents into text-searchable PDF files. Searchable files created by OCR are far more manageable, and users work with them more efficiently.

If you are interested in converting TIFF files and JPEG files into compressed, OCRed PDF files, you can click the link below.

http://www.cvisiontech.com/download_main.html

Category: All, Batch PDF OCR, Convert PDF, OCR, OCR PDF, OCR Software, PDF Compression, PDF OCR, PDF Search, Tiff Compression | No Comments »

Handling Large Documents on the Web

January 22nd, 2007 by Chris

Question: What are the major issues with respect to handling very large files on the Web? Is one file format preferable to another?

Answer: The major issues for handling large files on the Web would seem to be: (i) compression, (ii) web-optimization, (iii) search, and (iv) chunking. We’ll briefly review each of these issues.

Compression: Just because a file has a great many pages does not mean that the file size must also be large. Compare, for example, a 1,000-page electronic file with a 1,000-page scanned color TIFF file. The scanned file can easily be 100x larger than the electronic file (e.g., 1 GB vs. 10 MB). Compression can therefore be a key factor in making sure your documents are amenable to web-hosting, particularly when dealing with scanned image documents. Compression can yield reductions of up to 10x for black and white image documents and up to 100x for color image documents. See, for example, http://www.cvisiontech.com/pdf_compressor_31.html.

Web-optimization: If a file is large, it’s unlikely that someone specifically wants to view the first page of the document. More than likely, they want to get to some page in the middle of the document. Web-optimization is a feature that allows a document viewer to display any page in an arbitrarily large document in roughly constant time (e.g., 1-2 seconds) by requesting that the server jump to the byte boundary where that page starts. This allows for efficient web browsing of a file. The PDF format has native support for web-optimization.
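Jumping to a byte boundary is typically done with an HTTP byte-range request: the viewer asks the server for only the bytes of the page it needs. The Python sketch below builds such a `Range` header from a table of page start offsets. The offset table and the function name are hypothetical, and the layout is simplified; in a real web-optimized PDF the viewer derives page locations from information stored in the file itself.

```python
def range_header_for_page(page_offsets: list[int], file_size: int,
                          page: int) -> dict[str, str]:
    """Return an HTTP Range header covering one page (0-based index).

    page_offsets[i] is the assumed byte offset at which page i starts.
    """
    if not 0 <= page < len(page_offsets):
        raise IndexError("page out of range")
    start = page_offsets[page]
    # A page ends where the next one begins, or at end of file for the last page.
    end = page_offsets[page + 1] if page + 1 < len(page_offsets) else file_size
    return {"Range": f"bytes={start}-{end - 1}"}  # HTTP byte ranges are inclusive


# Hypothetical three-page file of 150,000 bytes.
offsets = [0, 40_000, 95_000]
print(range_header_for_page(offsets, 150_000, 1))
```

The cost of fetching a page depends on that page's size, not on where it sits in the file, which is what makes viewing time roughly constant.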

Search: The larger a file is, the more likely you’ll need text search capability. If a file is only one or two pages, then perhaps you can find what you’re looking for simply by perusing the document. If a file is large, say over 30 pages, then it is very difficult to find what you’re looking for without text search capability. Although most electronic files are already searchable, some are not (e.g., vector graphics). For scanned files without OCR (unsearchable), finding what you want in the file is akin to finding a needle in a haystack. So make sure all your large web-hosted files are searchable. For scanned documents, this means running the files through an OCR process.

Chunking: Another problem with large files in a Web-based environment, even if they’re web-optimized, is that just opening the file in your viewer (e.g., Adobe Reader) may tie up all of your computer’s memory, especially when file sizes run into the hundreds of megabytes. Even in a web-optimized viewer, the file being viewed will continue to stream and consume available machine RAM. One solution to this problem is chunking, meaning that a very large file is divided into subfiles, none of which exceeds a maximum byte size. For example, if we select 50 MB as a reasonable chunking size, then a very large PDF file would be chunked so that no single PDF subfile exceeds 50 MB. Now the total memory consumption on a document search is bounded.
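The chunking rule can be sketched as a simple greedy grouping of consecutive pages. The Python below is a minimal illustration, assuming the per-page byte sizes are already known; the function name and the sample sizes are made up, and a real splitter would also have to account for the overhead each PDF subfile adds.

```python
def chunk_pages(page_sizes: list[int], max_chunk_bytes: int) -> list[list[int]]:
    """Group consecutive page indices so that no group exceeds the byte limit."""
    chunks: list[list[int]] = []
    current: list[int] = []
    current_bytes = 0
    for index, size in enumerate(page_sizes):
        if size > max_chunk_bytes:
            raise ValueError(f"page {index} alone exceeds the chunk limit")
        # Close the current chunk if adding this page would overflow it.
        if current and current_bytes + size > max_chunk_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks


MB = 1_000_000
sizes = [20 * MB, 25 * MB, 10 * MB, 30 * MB, 15 * MB]  # hypothetical page sizes
print(chunk_pages(sizes, 50 * MB))  # [[0, 1], [2, 3], [4]]
```

Keeping pages in order means each subfile is a contiguous page range, so a reader can still be pointed to the right chunk from a page number.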

Adobe PDF is recommended for web-hosted document databases. The PDF format has native support for 3 of the 4 features we listed as desirable when web-hosting files, namely, compression, web-optimization, and searching (i.e., a hidden text layer). As such, there is very little “engineering” required on the IT side when implementing a web-hosted database that is already in PDF format.

Category: All, Compress File, Document Compression, Web Optimization | No Comments »

OCR for Italian

January 20th, 2007 by Chris

Question: Does your OCR engine support Italian?

Answer: CVISION’s OCR supports over 60 different languages, including Italian. For a list of OCR supported languages, click here: http://www.cvisiontech.com/language_list.html

Category: All, OCR Languages | No Comments »

PDF Conversion: PDF/A vs. Document Security

January 19th, 2007 by Chris

Question: How can I convert company documents to PDF/A when I’m also concerned about file security and encryption?

Answer: There is an inherent conflict between a document being open & accessible and also being secure. The focus of the PDF/A specs is accessibility, not security. That works great at the library level, but not necessarily for an investment bank.

Sensitive company documents can always be kept unencrypted, in an open PDF format, with security enforced at the company database level. In other words, only users with the proper database security in the company could view, print, or edit a given document.

Of course, enforcing security for PDF files at the database level has its drawbacks. Sending a file across the Internet makes it vulnerable to being “sniffed” or read by a 3rd party. What if it’s necessary at certain times to web-host the document and make it viewable to people outside the company? What if you need to email the document reliably to a 3rd party?

One of the advantages to using PDF for conversion & archiving in the first place is the format’s view, print, and edit protection features. But these security features all require encryption and must be disabled for a document to satisfy the PDF/A requirements. So it seems that satisfying the PDF/A specs requires disabling some of PDF’s finest features, at least with respect to security. For many companies, this is not always a winning proposition and should be considered carefully before implementation.
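Because PDF/A does not allow encryption, an encrypted file can be ruled out before any fuller PDF/A validation. An encrypted PDF declares an `/Encrypt` entry in its trailer, so a rough pre-check can simply look for that key. The Python sketch below is a heuristic under that assumption, not a PDF/A validator: it scans raw bytes rather than parsing the file, so it can report false positives, and the function names and sample trailers are illustrative.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw file contains an /Encrypt key."""
    return b"/Encrypt" in pdf_bytes


def could_be_pdfa(pdf_bytes: bytes) -> bool:
    """An encrypted file cannot be PDF/A; an unencrypted one still needs real validation."""
    return not looks_encrypted(pdf_bytes)


# Hypothetical trailer fragments, for illustration only.
open_trailer = b"trailer\n<< /Size 12 /Root 1 0 R >>\n"
protected_trailer = b"trailer\n<< /Size 12 /Root 1 0 R /Encrypt 11 0 R >>\n"

print(could_be_pdfa(open_trailer))       # True
print(could_be_pdfa(protected_trailer))  # False
```

A check like this could sit at the intake step of an archiving workflow, routing protected files to a separate, database-secured store instead of the PDF/A archive.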

Category: All, Convert PDF, PDF Conversion, PDF/A | No Comments »