PDF_Blog

Mark Gavin's PDFblog

This blog contains items I find interesting or useful related to Portable Document Format (PDF). 

Tools for Creating Acrobat Forms

Otherwise known as AcroForms, Acrobat form technology was first introduced in PDF version 1.2 and has been around for more than ten years.  In addition to Adobe Acrobat, third parties have released products to create Acrobat forms.

Following is a list of tools to create AcroForms:

Adobe Acrobat Professional

Nuance PDF Converter Professional Versions 4 or 5

FoxIt Reader Form Designer

Amgraf OneForm Designer Plus

The Acrobat Professional package includes tools to create documents…

Free Software

With the opening of the new Appligent Online Store, Appligent has released two more free products: APSaveAs and APConductor.

APSaveAs is a tool for cleaning up PDF files.  Its primary function is to perform a garbage-collected save on a PDF file.  It will also correct many types of malformed and corrupt PDF files.

APConductor is a standalone SOAP server.  It can be used with Appligent applications; in addition, it can be used to turn any CLI based application into a SOAP…

Forms Data Format

A Forms Data Format (FDF) file is a text file that contains a list of form field names and their values.  Acrobat Forms, or AcroForms, were introduced in PDF Version 1.2.  To allow for the import and export of data from AcroForms, Adobe developed the Forms Data Format.  The documentation for the Forms Data Format is located in the PDF Reference in the chapter on "Interactive Features" under the section "Interactive Forms".
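As a rough illustration, the following Python sketch writes out a minimal FDF file carrying a couple of field names and values.  The field names and values are made up for the example, the helper names are hypothetical, and the sketch assumes plain ASCII text; the PDF Reference remains the authority on the full format.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def build_fdf(fields: dict[str, str]) -> str:
    """Return the text of a simple FDF file holding the given field values."""
    # Each field is a small dictionary: /T is the field name, /V its value.
    entries = "\n".join(
        f"<< /T ({escape_pdf_string(name)}) /V ({escape_pdf_string(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{entries}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


# Hypothetical field names, used only for illustration.
fdf_text = build_fdf({"FirstName": "Jane", "Notes": "Call (after 5pm)"})
print(fdf_text)
```

Parentheses and backslashes are escaped because they delimit PDF literal strings; an unescaped, unbalanced parenthesis in a value would end the string early.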

There are two kinds of FDF files:

• Classic - supplies data to fill out…

Presenting Data and Information

Last week, while in Boston for the AIIM Conference, I used the Monday before the conference to attend a one-day course taught by Edward Tufte on "Presenting Data and Information".  The course focuses on effectively presenting and communicating information.

The course is given in various locations around the country throughout the year.  I've known about the course for the past several years, but until last week the scheduling didn't work out to make it convenient for me to attend.

I found the…

Acrobat 8 Crash FreeText Annotation

The following simple PDF document contains a single FreeText annotation.  The FreeText annotation is displayed correctly under both Acrobat 7 and 8.  However, using the mouse to click on the annotation under Acrobat 8 causes Acrobat 8 to crash.  FreeTextCrash.pdf

Below is a screen shot of the FreeText annotation labeled "COV".

freetext_screeshot

The crash occurs in Acrobat 8 on both Windows and Macintosh.

Clicking on the annotation under Acrobat 7 selects the annotation as expected.

HP Smart Document Scan

Recently we received a couple of malformed PDF files produced by the HP Smart Document Scan software.  It appears that the HP Smart Document Scan software is only included with the HP Scanjet 7800 scanner and the HP Scanjet 8350 & 8380 scanners.

The version number of the PDF files produced is PDF 1.0.  The first problem we found is located in /Name objects which contain '#xx' hex values.  The use of # hex values was not part of PDF 1.0;  # hex values in Name objects were introduced in…
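To make the '#xx' notation concrete, here is a small Python sketch that spots and decodes such escapes in a name token.  The function names are hypothetical, and the sketch assumes the name has already been pulled out of the file as text; it is not a PDF parser.

```python
import re

# Matches one '#xx' escape: a number sign followed by two hex digits.
HEX_ESCAPE = re.compile(r"#([0-9A-Fa-f]{2})")


def has_hex_escape(raw_name: str) -> bool:
    """Report whether a name token contains at least one '#xx' escape."""
    return HEX_ESCAPE.search(raw_name) is not None


def decode_pdf_name(raw_name: str) -> str:
    """Decode '#xx' escapes in a name token; the leading '/' is optional."""
    body = raw_name[1:] if raw_name.startswith("/") else raw_name
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), body)


print(has_hex_escape("/Lime#20Green"))   # True
print(decode_pdf_name("/Lime#20Green"))  # Lime Green
```

A checker along these lines could flag a file whose header claims PDF 1.0 but whose names use the escape.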

Linearization

Linearization is a variant on the PDF file layout as described previously.  Linearization is also called "Fast Web View".  Linearization shuffles the contents of the PDF file to place all of the information needed to display the first page near the beginning of the file.

pastedgraphic-5_textmedium

This allows the user to see the first page while the remainder of the file is still downloading from the web.  
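A quick way to see whether a file claims to be linearized is to look for the linearization dictionary, which sits near the very start of the file.  The Python sketch below is an assumption-laden shortcut: the function name is hypothetical, and it only looks for the /Linearized key in the first 1024 bytes rather than parsing the dictionary.

```python
def looks_linearized(data: bytes) -> bool:
    """Return True if the start of the file carries a /Linearized key.

    Only the first 1024 bytes are examined, since the linearization
    dictionary is expected to appear within that window.
    """
    return b"/Linearized" in data[:1024]


# Two made-up file prefixes, used only for illustration.
linearized_head = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
plain_head = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(looks_linearized(linearized_head))  # True
print(looks_linearized(plain_head))       # False
```

This only detects the marker; it does not verify that the rest of the file still honors the linearized layout after later saves.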

Incremental saves on a linearized file can actually break linearization, but Acrobat still reports…

Jim King's Presentations

Jim King, Principal Scientist at Adobe Systems, has a personal web site which contains a collection of his public presentations.  These presentations include PDF Tutorials, Color Management, Color Science, XML/PDF Tutorial and High Resolution Rendering.  Several of the presentations are annotated with speaker's notes.  I would encourage everyone to check it out.  The URL to the presentations is as follows: http://home.comcast.net/~jk05/presentations/

ISO-PDF

The first meeting of the Portable Document Format (PDF) Reference Committee will be held in Silver Spring, MD on July 16 and 17, 2007.  The meeting location and agenda can be found on the AIIM web site using the above link.  In addition, the same web page also contains a link to the draft of the document submitted to AIIM by Adobe.

The official name of the proposed standard is expected to be ISO 32000.  The draft document submitted to AIIM by Adobe is 768 pages.  The PDF Reference 1.7 is 1310…

PDF Basic File Layout

A Typical PDF File

For the most part, the basic layout of a PDF file can be fairly simple.  A PDF file consists of four primary sections as illustrated below:

image

The PDF file "Header" is just one or two lines starting with %PDF.  The "Body" is a collection of objects which include the page contents, fonts, annotations, etc.  The "xref Table", or cross reference table, is a collection of pointers to locate the individual objects contained in the "Body".  The "Trailer" contains the pointer to the…
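The four sections can be seen by assembling a tiny file by hand.  The Python sketch below builds a one-page blank PDF; the choice of objects, the %PDF-1.4 header and the function name are assumptions made for the example, not something taken from a particular tool.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page blank PDF from its four sections."""
    # 1. Header: a single line starting with %PDF.
    header = b"%PDF-1.4\n"

    # 2. Body: three indirect objects (catalog, page tree, one empty page).
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n",
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Resources << >> "
        b"/MediaBox [0 0 612 792] >>\nendobj\n",
    ]
    body = b""
    offsets = []
    for obj in objects:
        offsets.append(len(header) + len(body))  # byte offset of this object
        body += obj

    # 3. xref table: one 20-byte entry per object, plus the free entry 0.
    xref_offset = len(header) + len(body)
    xref = b"xref\n" + f"0 {len(objects) + 1}\n".encode("ascii")
    xref += b"0000000000 65535 f \n"
    for offset in offsets:
        xref += f"{offset:010d} 00000 n \n".encode("ascii")

    # 4. Trailer: names the root object and records where the xref starts.
    trailer = (
        b"trailer\n"
        + f"<< /Size {len(objects) + 1} /Root 1 0 R >>\n".encode("ascii")
        + b"startxref\n"
        + f"{xref_offset}\n".encode("ascii")
        + b"%%EOF\n"
    )
    return header + body + xref + trailer


pdf_bytes = build_minimal_pdf()
print(pdf_bytes.decode("ascii"))
```

Because the xref entries are byte offsets, they have to be computed from the actual lengths of the header and the preceding objects; editing the body by hand without updating them is a common way to end up with a malformed file.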

Adobe Bates Numbering?

We received an email from one of our customers, an attorney who uses Bates numbering on a regular basis.  Following is one of the sentences from this customer's email:

"I wouldn't have thought it possible, but Adobe has managed to implement its Bates-stamping in a manner which makes it virtually useless [or at least highly impractical for use by] attorneys, the primary users of Bates-stamp utilities." 

When I saw this I decided to take a look at Acrobat Bates Numbering. 

I really don't use…

Acrobat 8 Text Shifting

Following is a collection of screen shots taken using a single PDF file displayed under Acrobat 4 through Acrobat 8.

Acrobat 4

Acrobat_4

Acrobat 5

Acrobat_5

Acrobat 6

Acrobat_6

Acrobat 7

Acrobat_7

Acrobat 8

Acrobat_8

Following is a PDF file which demonstrates the text shifting problem:

Acrobat_8_Text_Shift.pdf

This particular drawing error is caused by passing a large negative character spacing in a text array when the text is of zero length. 

132.96 741.6 TD -0.06048 Tc [()-4800()] TJ -0.32976 Tc (A) Tj
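For reference, a number inside a TJ array is expressed in thousandths of a unit of text space and is subtracted from the current position, so a negative number moves the following text to the right in horizontal writing.  The Python sketch below works out the intended shift for the -4800 above; the font size of 12 is purely an assumption, since the Tf operator is not part of the excerpt, and the function name is hypothetical.

```python
def tj_shift(adjustment: float, font_size: float,
             horizontal_scale: float = 1.0) -> float:
    """Horizontal shift, in text-space units, caused by a TJ array number.

    The adjustment is in thousandths of a text-space unit and is
    subtracted from the current position, hence the sign flip.
    """
    return -adjustment / 1000.0 * font_size * horizontal_scale


# -4800 comes from the content stream above; 12.0 is an assumed font size.
shift = tj_shift(-4800, font_size=12.0)
print(round(shift, 3))
```

With those assumptions the next glyph should land roughly 57.6 units to the right, which gives a baseline to compare against what each Acrobat version actually draws.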

PDF - The Missing References

The Adobe PDF Reference is similar to the Adobe PostScript Language Reference in that both can be compared to a dictionary.  A dictionary is a document which contains all of the words that can be used in a language, but it doesn't teach you how to combine those words into a good, well-structured book.

PDF is based on PostScript.  The documentation for PostScript was released as a set of three volumes.

PostScript Language Reference - Red Book

PostScript Language Program Design - Green Book

PDF Version Numbers

I find that there is a general misunderstanding about the nature of Portable Document Format (PDF) version numbers.

Version 1.0 of the PDF file format was released by Adobe in 1993. Over the past fourteen years PDF has been updated seven times.  The current version of PDF is 1.7. These changes to the PDF version number represent additions to the file format.
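The version number in question is the one written in the first line of the file.  A minimal Python sketch for reading it follows; the function name is hypothetical, and the sketch assumes a single-digit major and minor number and only looks at the header (from PDF 1.4 onward a /Version entry in the document catalog can override it, which this does not check).

```python
import re
from typing import Optional, Tuple


def pdf_header_version(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # Look near the start of the file; some files carry junk before the header.
    match = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(pdf_header_version(b"%PDF-1.7\n..."))  # (1, 7)
print(pdf_header_version(b"not a pdf"))      # None
```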

All of the "older" stuff in PDF works exactly the same way it did.  None of the basic PDF text drawing primitives have changed.  PDF 1.0 is…

Acrobat XML Tags for Bates Numbering

Adobe has released a technical note describing the additional XML data that Acrobat 8 adds to each page of a PDF file when the file is Bates numbered.

Bates Numbering in PDF documents (PDF, 123K)

Here is what the XML looks like:

  <Bates start="1" ndigits="6" prefix="ADBE" suffix="DRAFT"/>

The above XML is added to each page of the PDF file and will produce a Bates number on each page; for example, ADBE000001DRAFT.
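The relationship between the attributes and the printed number can be sketched in a few lines of Python.  This is only a reading of the example above: the function name is hypothetical, and the idea that the number simply increases by one per page is an assumption, not something stated in the technical note excerpt.

```python
import xml.etree.ElementTree as ET


def bates_label(xml_fragment: str, page_index: int = 0) -> str:
    """Build the Bates label for a page from a <Bates .../> element.

    page_index is zero-based; adding it to 'start' is an assumption
    about how the number advances from page to page.
    """
    node = ET.fromstring(xml_fragment)
    number = int(node.get("start")) + page_index
    width = int(node.get("ndigits"))
    prefix = node.get("prefix", "")
    suffix = node.get("suffix", "")
    # Zero-pad the number to 'ndigits' characters.
    return f"{prefix}{number:0{width}d}{suffix}"


bates_xml = '<Bates start="1" ndigits="6" prefix="ADBE" suffix="DRAFT"/>'
print(bates_label(bates_xml))  # ADBE000001DRAFT
```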

So, instead of simply correctly numbering each and every page;…

The Limits of Resolution

I’ve been developing an application to generate Fresnel Zone Plates and ran into an interesting problem.  A Zone Plate is similar to a lens in its ability to focus light.  It differs from a lens by using diffraction instead of refraction.

The problem I encountered is that Acrobat creates many significant drawing artifacts when it renders this PDF drawing to the screen.  In the above screen capture, only the rings centered on the center of the graphic are real.  All other rings, centered off…

Copyright 2008 by Appligent, Inc.