April 29, 2013

Extracting & Analyzing PDF Form Data

Filed under: Uncategorized — frameworker @ 9:28 pm

PDF Form Export is an OS X app that extracts field names and values from PDF forms snd puts them into a text file.

To use the app, just drop a PDF Form on it – or “Open…” the form from it. A PDZ file is created in the directory containing the form. The PDZ file contains one line per field – the field names and values are tab-separated – like this:

  • NAME  \T  Some Guy
  • ADDRESS  \T  1234 Somewhere St.
  • etc.

You can then paste the exported data into a spreadsheet or manipulate it with a script. My Benford’s Analysis program, which I wrote as a component of a tax-return-audit-risk-identification suite, is an example of doing this.

When businesses provide forms to customers to be filled out and returned, this utility simplifies data extraction avoiding tedious and error-prone recopying.

The program is written in Objective-C with Cocoa and PDF Kit. PDF Kit makes it straightforward to iterate through a PDF’s form fields extracting their names and values.


Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

Blog at

%d bloggers like this: