Performance considerations for PDF documents
This document explains how the properties of the toolkit can affect the speed at which PDF documents are processed. The PDF extension allows you to read barcodes from PDF documents.
Version 7.4.1 or later…
Version 7.4.1 introduced a faster and more accurate method of processing image-only PDF documents (e.g., a PDF document generated by scanning). If the PdfImageOnly property is set to true, the toolkit assumes that all PDF documents are image-only and extracts the embedded images rather than rendering the document into a bitmap. Extracting images from a PDF document is considerably faster than rendering and also ensures that the images are processed at their original resolution. When PdfImageOnly is set to true, the PdfDpi and PdfBpp properties have no effect for an image-only PDF document.
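The choice between the two processing paths can be modelled with a small sketch. The `PdfSettings` class and the `processing_path` function are hypothetical illustrations and are not part of the toolkit's API; only the property names PdfImageOnly, PdfDpi and PdfBpp, and the defaults of 300 dpi and 8 bits per pixel, come from this document. The sketch also assumes version 7.4.1 or later.

```python
from dataclasses import dataclass


@dataclass
class PdfSettings:
    """Hypothetical stand-in for the toolkit's PDF properties (not the real API)."""
    PdfImageOnly: bool
    PdfDpi: int = 300  # default resolution quoted in this document
    PdfBpp: int = 8    # default color depth quoted in this document


def processing_path(settings: PdfSettings) -> str:
    """Describe how a page would be turned into an image before scanning."""
    if settings.PdfImageOnly:
        # Embedded images are used at their original resolution,
        # so PdfDpi and PdfBpp are not consulted on this path.
        return "extract embedded images at original resolution"
    # Otherwise the page is rendered into a bitmap using PdfDpi and PdfBpp.
    return f"render page at {settings.PdfDpi} dpi, {settings.PdfBpp} bpp"


print(processing_path(PdfSettings(PdfImageOnly=True, PdfDpi=200)))
print(processing_path(PdfSettings(PdfImageOnly=False, PdfDpi=200)))
```

In this model, changing PdfDpi only alters the result when PdfImageOnly is false, which mirrors the behavior described above.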
For pre-7.4.1 versions or if PdfImageOnly is set to false…
PDF documents are rendered into a bitmap before they are scanned for barcodes, and the resolution and color depth of the bitmap affect the speed of scanning. The following table shows the time taken (in ms) to run ScanBarCode on the same PDF document, varying the values for PdfDpi and PdfBpp.
Time (ms) | PdfDpi=100 | PdfDpi=200 | PdfDpi=300 | PdfDpi=400 |
PdfBpp=1 | 540 | 570 | 650 | 720 |
PdfBpp=8 | 530 | 590 | 690 | 820 |
PdfBpp=24 | 580 | 670 | 810 | 1002 |
There is little advantage in loading PDF documents in 24-bit format unless you want to split color documents, because the toolkit immediately converts the bitmap to grayscale before scanning for barcodes. Since there is little speed difference between 1-bit and 8-bit color depth, the recommendation is to use the default of 8 bits per pixel.
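To get a feel for how much data each combination of settings produces, the following sketch estimates the uncompressed size of a rendered page. It assumes a US Letter page (8.5 x 11 inches), which this document does not specify, and it ignores row padding and any internal details of the toolkit's renderer. The `bitmap_bytes` helper is illustrative only.

```python
def bitmap_bytes(width_in: float, height_in: float, dpi: int, bpp: int) -> int:
    """Approximate uncompressed bitmap size in bytes (row padding ignored)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bpp
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed page size: US Letter, 8.5 x 11 inches.
for dpi in (100, 200, 300, 400):
    sizes = [bitmap_bytes(8.5, 11, dpi, bpp) for bpp in (1, 8, 24)]
    print(f"{dpi} dpi:", ", ".join(f"{s / 1e6:.1f} MB" for s in sizes))
```

Under these assumptions a 24-bit bitmap holds three times the data of an 8-bit one at the same resolution, even though the extra color information is discarded once the bitmap is converted to grayscale.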
The resolution has a clear effect on speed, but it also has a considerable effect on the ability of the toolkit to read the barcode. The default value of 300 dpi will work for most documents, but if you know that your documents were scanned at 200 dpi, you could make a small gain in performance by setting PdfDpi to 200.
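The size of that gain can be read directly from the timing table. The sketch below copies the table values and computes the saving for a given change in resolution. The `saving` helper is illustrative, and the figures apply only to the single test document that was measured.

```python
# Timings in ms copied from the table: TIMINGS[bpp][dpi]
TIMINGS = {
    1: {100: 540, 200: 570, 300: 650, 400: 720},
    8: {100: 530, 200: 590, 300: 690, 400: 820},
    24: {100: 580, 200: 670, 300: 810, 400: 1002},
}


def saving(bpp: int, from_dpi: int, to_dpi: int) -> tuple[int, float]:
    """Return (ms saved, percentage saved) when changing resolution."""
    before = TIMINGS[bpp][from_dpi]
    after = TIMINGS[bpp][to_dpi]
    saved = before - after
    return saved, 100.0 * saved / before


ms, pct = saving(8, 300, 200)
print(f"8 bpp, 300 -> 200 dpi: {ms} ms saved ({pct:.1f}%)")
```

For the measured document, dropping from 300 dpi to 200 dpi at 8 bits per pixel saves 100 ms, or roughly 14.5% of the scan time.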
This raises the question of why the toolkit needs to be told these values: why can't it derive them from the source PDF document? That would be ideal, but at the moment we have no way of querying this information from the source PDF document, which is why PdfBpp and PdfDpi are available as public properties.