Is Scan-to-PDF/A Coming to America?
While the overall market for PDF/A may still be in its nascent stages, it seems that one niche, directly related to document imaging, has already hit its stride. That’s the market for capturing document images and saving them as PDF/A files, especially in Europe. According to Carsten Heiermann, managing director of Germany-based image compression specialist LuraTech, his company has at least 30 large clients taking advantage of the PDF/A functionality introduced into the LuraDocument product line this spring.
“You can’t believe what a big movement PDF/A is in Germany,” said Heiermann. “We’ve held several classes on the topic and they are always overbooked. In a room for 70, we end up with more than 100 people that want to come. It’s like leading the thirsty to water. At the CeBIT show, one DMS guru called PDF/A the hottest trend of the year.”
PDF/A is the ISO document archiving standard (no. 19005-1:2005) that was finalized last October. Impetus for the standard came from organizations like AIIM, NARA (the National Archives and Records Administration), the U.S. Courts, and Harvard.
LuraTech’s success with PDF/A has come in a number of industries. “We’ve seen great interest in the engineering niche,” said Heiermann. “Companies constructing things like airplanes and buildings have to save their documentation for the life of the objects and sometimes beyond. This can last more than 100 years. We’ve also seen interest in the insurance and financial services industries. We have a bank customer scanning 15-to-30-page loan documents in color and saving them as PDF/A files. They are also saving color PDF/A files of their checks. We are also doing some land records business.”
It’s worth noting that LuraTech’s specialty is a niche within the document capture niche. It involves capturing and compressing color documents that include text. This type of document enables LuraTech to take full advantage of its MRC (mixed raster content) technology, which separates document images into layers and compresses text with bi-tonal methods and graphics and/or backgrounds with color compression. MRC is designed to create greatly reduced file sizes compared to traditional, straight JPEG compression.
“Anybody can create a PDF/A file using straight JPEG or Group 4 compression,” said Heiermann. “That’s not really our business. We excel in applying techniques like MRC and JBIG2.”
Last year, we ran an article questioning whether JBIG2 was being recommended for PDF/A files, even though, as the standard is based on the PDF 1.4 spec, it is clearly allowed. Heiermann did not indicate that JBIG2 has been an issue so far for LuraTech. Further, he said the layering methods LuraTech applies in its MRC technology are clearly allowed within PDF/A. “We use a black-and-white mask that clearly differentiates between the foreground and background,” he said. “What is not allowed is a soft mask, in which you have color gradients between your foreground and your background.”
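To make the layering idea concrete, here is a minimal sketch of an MRC-style split with a hard (black-and-white) mask. It is an illustration only and is not LuraTech's method: the function names, the luminance threshold of 128, and the tiny sample page are all assumptions made for this example. Real MRC products use far more sophisticated segmentation and then compress each layer with a suitable codec, a step this sketch leaves out.

```python
def luminance(rgb):
    """Approximate perceived brightness of an (R, G, B) pixel, 0-255."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def split_mrc_layers(image, threshold=128):
    """Split an image (rows of RGB tuples) into mask, foreground, background.

    The mask is strictly bi-tonal: 1 marks a dark "text" pixel, 0 marks
    everything else. The threshold is a hypothetical default, not a value
    taken from any product or standard.
    """
    mask, foreground, background = [], [], []
    for row in image:
        mask_row = [1 if luminance(px) < threshold else 0 for px in row]
        mask.append(mask_row)
        # Foreground keeps only pixels selected by the mask.
        foreground.append([px if m else None for px, m in zip(row, mask_row)])
        # Background keeps only pixels the mask did not select.
        background.append([None if m else px for px, m in zip(row, mask_row)])
    return mask, foreground, background


def is_hard_mask(mask):
    """True if every mask value is exactly 0 or 1 (no in-between values)."""
    return all(value in (0, 1) for row in mask for value in row)


# A made-up 3x2 "page": dark text pixels on a light, colored background.
page = [
    [(250, 245, 230), (20, 20, 20), (250, 245, 230)],
    [(250, 245, 230), (30, 10, 10), (240, 200, 120)],
]
mask, fg, bg = split_mrc_layers(page)
print(mask)
print(is_hard_mask(mask))
```

In this sketch the mask would be the candidate for bi-tonal compression and the background for color compression. A soft mask, by contrast, would hold fractional values between 0 and 1, which is the case Heiermann describes as not allowed.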
The second version of PDF/A, which is under development now, is being built around PDF spec 1.6, which incorporates JPEG 2000 compression—which is not included in 1.4. “Currently, we use standard JPEG compression in PDF/A,” said Heiermann. “We will incorporate JPEG 2000 when the next version is approved, but it will not make that much difference in file size. We’ve found that on average, in our current files, 80% of the size is created by the bi-tonal parts.”
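A quick back-of-the-envelope calculation shows why a better color codec moves the total so little when 80% of a file is bi-tonal data. The 80% share comes from Heiermann's remark above; the 25% and 50% savings on the color portion are hypothetical figures chosen for illustration, not measured JPEG 2000 results, and the sketch assumes the bi-tonal parts are left unchanged.

```python
def remaining_size(bitonal_share, color_reduction):
    """Return the new file size as a fraction of the original size.

    bitonal_share:   fraction of the file taken up by bi-tonal data
    color_reduction: fraction by which the color portion shrinks
                     (hypothetical; depends on codec and content)
    """
    color_share = 1.0 - bitonal_share
    return bitonal_share + color_share * (1.0 - color_reduction)


for reduction in (0.25, 0.50):
    remaining = remaining_size(0.80, reduction)
    print(f"color part {reduction:.0%} smaller -> file is {remaining:.0%} of original")
```

Under these assumptions, even halving the color portion trims the overall file by only about 10%.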
Current LuraTech customers utilizing PDF/A include Shell Oil’s German operations, the state bank Helaba, and a large European manufacturer in the transportation industry. “We are just starting to roll out our PDF/A technology to our service bureau customers as well,” he said.
It is Heiermann’s view that the creation of image-based PDF/A files has taken off faster than the conversion of digital files to PDF/A, because it’s a much simpler process. “In Germany, we’ve seen a lot of projects where digital file conversion to PDF/A has been planned, but we have not seen the install base we have seen scanning to PDF/A,” he told DIR. “That’s because it’s fairly straightforward to take a scanned image of almost any format and convert it to a PDF/A. You don’t have to deal with issues like making sure you have the correct license to embed a font in your PDF/A file.”
Standardizing the standard
LuraTech has gained a better understanding of electronic file conversion through its work with partner PDF Tools AG and the PDF/A Center of Competence. Last year, LuraTech and PDF Tools co-founded the Center of Competence as a forum for vendors of PDF/A products. “The PDF/A standard is not that long, it’s only 65 pages if you download it,” Heiermann said. “But, if you print all the information it references, you can fill 10 thick books. The bottom line is that the rule set for creating PDF/A files is very complex.
“One goal of the Center of Competence is to keep different vendors from interpreting these rules in different ways and creating non-compatible PDF/A files. Once customers start realizing that all PDF/A files aren’t meeting the same requirements, it’s going to be bad for the market.”
One aim of the Center of Competence is to create a test suite that will enable vendors to check their products for compliance. “We are encouraging vendors to submit information on their support cases, as well as sample files,” said Heiermann. “We have formed a technical committee to work on a test suite to enable vendors to run test files and compare their results with valid PDF/A files. Our goal is to put all our knowledge about compliance with the standard on one table to ensure the market will be successful.”
Currently, there are six companies involved in the Center of Competence, mainly with European interests. According to Heiermann, the organization is looking to expand its membership and has extended invitations to several leading PDF software developers worldwide.
LuraTech itself will be jumping more heavily into the PDF/A market this fall when it releases a product for converting electronic files to PDF/A. The product will be at least partially based on technology licensed from PDF Tools. “For the first time, we will be branching out from the raster market and moving into new territory,” he said. “We will offer a product for converting files from programs like Office and CAD into PDF/A files. We will also be offering a validation application.”
For more information: http://www.luratech.com; c.heiermann@luratech.com; m.mckinney@luratech.com (U.S. contact: Mark McKinney)
http://www.aiim.org/standards.asp?ID=25013
http://www.apagoinc.com/prod_home.php?prod_id=29
http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml
http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=38920&scopelist=PROGRAMME