April 29, 2013

Extracting & Analyzing PDF Form Data

Filed under: Uncategorized — frameworker @ 9:28 pm

PDF Form Export is an OS X app that extracts field names and values from PDF forms and puts them into a text file.

To use the app, just drop a PDF Form on it – or “Open…” the form from it. A PDZ file is created in the directory containing the form. The PDZ file contains one line per field – the field names and values are tab-separated – like this:

  • NAME  \T  Some Guy
  • ADDRESS  \T  1234 Somewhere St.
  • etc.

You can then paste the exported data into a spreadsheet or manipulate it with a script. My Benford’s Analysis program, which I wrote as a component of a tax-return-audit-risk-identification suite, is an example of doing this.
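As a rough illustration of the "manipulate it with a script" route, here is a minimal sketch in plain JavaScript (not the app's Objective-C). It assumes each line holds exactly one name, a single tab, and one value, and that values contain no tabs or newlines; `parsePdz` is a made-up name.

```javascript
// Parse PDZ text: one field per line, name and value separated by a tab.
// Assumption: values contain no embedded tabs or newlines.
function parsePdz(text) {
    const fields = new Map();
    for (const line of text.split(/\r?\n/)) {
        if (line.trim() === "") continue;       // skip blank lines
        const tab = line.indexOf("\t");
        if (tab === -1) continue;               // not a name/value line
        const name = line.slice(0, tab).trim();
        const value = line.slice(tab + 1).trim();
        fields.set(name, value);
    }
    return fields;
}

const sample = "NAME\tSome Guy\nADDRESS\t1234 Somewhere St.\n";
const fields = parsePdz(sample);
console.log(fields.get("ADDRESS"));             // 1234 Somewhere St.
```

A Map keeps the fields in file order, which is handy if the result is written back out as spreadsheet rows.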

When businesses provide forms to customers to be filled out and returned, this utility simplifies data extraction and avoids tedious, error-prone recopying.

The program is written in Objective-C with Cocoa and PDF Kit. PDF Kit makes it straightforward to iterate through a PDF’s form fields extracting their names and values.

October 14, 2011

Thank you, Steve

Filed under: Uncategorized — frameworker @ 6:30 pm

Steve changed my life completely in a good way. First, in the 1980s, Macintosh became a vehicle that infused my – languishing – career with new purpose. And then again, when I lost my bearings in the last decade, OS X was a nurturing place to come home to. I fervently hope that Steve felt the deep gratitude of the developer community for having begotten this fertile ground for our achievement. Namasté, Steve.

April 20, 2010

Cocoa to Cappuccino – Spatially Formatting Text Fields

Filed under: Uncategorized — frameworker @ 6:18 am


I’m using pdf images as the background of electronic forms. The purpose of doing this is to make the electronic form feel just like the familiar paper one. It’s user friendly. Also the electronic forms can perform calculations automatically and accurately.


One of the cases that inevitably has to be handled is when a string has to be entered with its characters equally spaced in a sequence of contiguous boxes, one character per box.


Cocoa has this nifty NSString method, drawAtPoint, that lets you do this when the field is not active. (But when it is active you can enter the string – unformatted – in the text field, which exactly covers the boxes in the background, so it all feels quite natural.)

- (void)drawRect:(NSRect)rect
{
    if ((![self focus]) && ([self format] == kWideFormat))
    {
        // drawWideString calculates the bounds for each character
        // and calls a routine to draw it using drawAtPoint.
        [self drawWideString: [self stringValue]];
        [super drawRect:rect];
    }
}


Unfortunately, there is no -[CPString drawAtPoint] method in Cappuccino, so this approach can’t be used.

I must confess that this “deficiency” left me with some confusion about how to proceed to implement the WIDE string behavior in Cappuccino.

I wondered if there was a way to do it using Canvas or CSS, or if Cappuccino text support for this kind of thing might be “just around the corner.”

And the sledgehammer approach of creating a sequence of single character text fields to display the inactive text, to be swapped-out with a regular text field while editing, seemed inelegant.

But a recent conversation with @saikatc at #shdh37 convinced me that the “buffered” approach was, in fact, a reasonable way to do this. And as it is with so many things in life, once the approach was determined, it started happening.


The one CPTextField subclass I use throughout the forms project overrides becomeFirstResponder / resignFirstResponder to bracket the active field with controlTextDidBeginEditing / controlTextDidEndEditing calls, something that neither Cocoa nor Cappuccino does, but which is critical to knowing when to swap the static text array in and out with the regular text field. Having this mechanism obviates the need for a special CPTextField subclass for these WIDE fields!

The Begin/End Editing messages are passed to the Text Field’s delegate, a widget subclass, which sends setFocusedDisplayFormat/setUnfocusedDisplayFormat messages to the object being activated/inactivated.

It’s important to note that the approach used here takes advantage of the fact that the text field covers the char-array. If this were not the case, it would be necessary to buffer the char-array’s stringValues, so they wouldn’t be displayed, while the text field was active. That would be confusing.

So to avail ourselves of this pattern, we create a WIDE Widget subclass which switches the display of the static array on and off.

Here’s the code for that class.

PDQWideWidget descends from the concrete PDQTextWidget class and adds an array for the equally spaced characters.

// PDQWideWidget.j

@import "PDQTextWidget.j"

@implementation PDQWideWidget : PDQTextWidget
{
	CPMutableArray chars @accessors; // The array of equally spaced characters
}

Init calls super, then creates the array.

- (id) init
{
	self = [super init];

	if (self)
	{
		// Do any initialization here!
		chars = [];
	}

	return self;
}

makeView is overridden to create the array of one-char text fields. The order in which the constituent fields are created is crucial. The real CPTextField is created last, so it will be on top and receive mouse events. And the widget is put into “unfocused” mode.

- (void)makeView:(CPView)itsSuperview
{
	// Build the one-char fields first so the real text field ends up on top.
	[self buildChars:itsSuperview];

	[super makeView:itsSuperview];

	[self setUnfocusedDisplayFormat];
}

buildChars builds the array of one-char text fields using the maxLen parameter to determine how many to create that will cover the widgetRect.

- (void)buildChars:(CPView)itsSuperview
{
	var frameRect = [self widgetRect];

	var width   = frameRect.size.width;
	var height  = frameRect.size.height;
	var x	    = frameRect.origin.x;
	var y	    = frameRect.origin.y;

	var cellWidth = width/[self maxLen];

	for (var index = 0; index < [self maxLen]; index++)
	{
		var left = x + cellWidth * index;
		var charFrame = CGRectMake(left, y, cellWidth, height); // x,y,w,h

		var newCharField = [[CPTextField alloc] initWithFrame:charFrame];

		[self initCharField: newCharField];
		[newCharField setStringValue: @""];
		[newCharField setDelegate:self];
		[itsSuperview addSubview:newCharField];
		[chars addObject: newCharField];
	}
}

Each charField must be initialized to (among other things) not accept events nor become a responder.

- (void) initCharField:(CPTextField)aCharField
{
	[aCharField setBordered:NO];
	[aCharField setBezeled:NO];
	[aCharField setEditable:NO];
	[aCharField setEnabled:NO];
	[aCharField setSelectable:NO];
	[aCharField setAlignment: CPCenterTextAlignment];
	[aCharField setBackgroundColor:[CPColor clearColor]];
	[aCharField setDrawsBackground:YES];
	[aCharField setVerticalAlignment:CPCenterVerticalTextAlignment];
	[aCharField setFont:[PDQAbstractWidget getFont]];
}

When the text field is focused, compose and set its stringValue from the char-array. The TextField will be displayed over the char-array, masking it.

- (void)setFocusedDisplayFormat
{
	// Build the stringValue from the char-array

	var itsStringValue = @"";

	for (var index = 0; index < [chars count]; index++)
	{
		var aCharField = [chars objectAtIndex: index];
		var aChar = [aCharField stringValue];
		itsStringValue += aChar;
	}

	[[self attachedControl] setStringValue:itsStringValue];
}

When the text field loses focus, or is first created, unpack stringValue into the char-array, then clear stringValue. The equally spaced chars will be displayed, but the empty TextField covering them will not.

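- (void)setUnfocusedDisplayFormat
{
	// Set the chars from stringValue and then clear it.
	var itsStringValue = [[self attachedControl] stringValue];
	var count = [itsStringValue length];

	// Visit every char field, not just the first "count" of them, so that
	// cells left over from a longer previous value are blanked and a string
	// longer than the array cannot index past its end.
	for (var index = 0; index < [chars count]; index++)
	{
		var aCharField = [chars objectAtIndex: index];

		if (index < count)
			[aCharField setStringValue: [itsStringValue characterAtIndex:index]];
		else
			[aCharField setStringValue: @""];
	}

	[[self attachedControl] setStringValue:@""];
}

@end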
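Stripped of the Cappuccino view machinery, the swap comes down to three small operations: carving the field's frame into equal cells, spreading a string over the cells, and joining the cells back into a string. The sketch below shows them in plain JavaScript (which Objective-J accepts as-is). The function names and the plain-object frames are illustrative assumptions, not part of PDQWideWidget.

```javascript
// Split a field's frame into maxLen equal-width cell frames (the geometry of buildChars).
function cellFrames(frame, maxLen) {
    const cellWidth = frame.width / maxLen;
    const frames = [];
    for (let i = 0; i < maxLen; i++) {
        frames.push({ x: frame.x + cellWidth * i, y: frame.y, width: cellWidth, height: frame.height });
    }
    return frames;
}

// Unfocused: spread the string over the cells, one character each.
// Cells past the end of the string are blank; extra characters are dropped.
function unpackToCells(value, maxLen) {
    const cells = [];
    for (let i = 0; i < maxLen; i++) {
        cells.push(i < value.length ? value.charAt(i) : "");
    }
    return cells;
}

// Focused: rebuild the editable string from the cells.
function packFromCells(cells) {
    return cells.join("");
}

console.log(unpackToCells("AB", 4));                // [ 'A', 'B', '', '' ]
console.log(packFromCells(["A", "B", "", ""]));     // AB
```

Always producing exactly maxLen cells means a shorter new value cannot leave stale characters behind in the trailing boxes.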



March 20, 2010


Filed under: Uncategorized — frameworker @ 7:52 pm

I’d been testing PDQForms and hadn’t seen any performance problems, but then I saw a noticeable recalculation delay when certain fields were changed in a particular form.

After a moment of doubt whether my approach was simply wrong, I sucked it up and asked myself “What would Mike Ash do?”

So I jumped into the debugger, and after tracing the flow of execution – aided by liberal logging of intermediate results – I realized that I was seeing a cascading dependency problem.

I was adding a notifier for each cell reference in a formula, so when it had more than one reference to the same cell, I was creating duplicate notifiers. And if that cell was referenced more than once in another cell’s formula, there would be duplicated recalculations. This is what I was seeing*. And the problem could become arbitrarily worse than this, since there could be an indefinite coupling of such formulae. Ouch!

* Formula A, of cell a, has n references to cell b, whose formula B contains m references to cell c. So when cell c changes, formula B would be recalculated m times and formula A would be recalculated m * n times.

The solution to this cascading dependency problem was to allow only one notifier for any cell reference in a formula, even if the cell was referenced more than once in that formula.
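To make the m * n arithmetic concrete, here is a small simulation in plain JavaScript. It is only a model of the notification cascade, not PDQForms code: `countRecalcs` and the `refs` table are invented for the example, and it assumes the dependency graph has no cycles.

```javascript
// Count how many times each formula is recalculated when `changed` changes.
// refs maps a cell to the list of cell references in its formula (repeats allowed).
// Assumption: the dependency graph is acyclic.
function countRecalcs(refs, changed, coalesce) {
    const counts = {};
    function notify(source) {
        for (const cell of Object.keys(refs)) {
            let observers = refs[cell].filter(ref => ref === source).length;
            if (coalesce) observers = Math.min(observers, 1);   // one notifier per referenced cell
            for (let i = 0; i < observers; i++) {
                counts[cell] = (counts[cell] || 0) + 1;         // recalculate this cell...
                notify(cell);                                   // ...then tell its dependents
            }
        }
    }
    notify(changed);
    return counts;
}

// Formula A has n = 3 references to b; formula B has m = 2 references to c.
const refs = { a: ["b", "b", "b"], b: ["c", "c"], c: [] };
console.log(countRecalcs(refs, "c", false));    // b: 2, a: 6  (m and m * n)
console.log(countRecalcs(refs, "c", true));     // b: 1, a: 1
```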


When a PDQ document is opened, its form widgets are “internalized.” One step in this process is, for widgets that have a formula, to add observers to cells referenced by that formula. This stack-trace depicts the widget internalization pattern.

-[PDQDocument windowControllerDidLoadNib:]
Code added here is executed when the windowController has loaded the document’s window.

  -[PDQDocument internalizeWidgets]
  This contains the widgets’ internalization logic.

    -[PDQDocument observeReferencedCells]
    This document method calls observeReferencedCells for each widget.

      -[PDQAbstractWidget observeReferencedCells]
      Adds notifications to observe each cell referenced by this widget.

        -[NSString coalesceObservers]
        Constructs the observer list and removes duplicates, before adding notifications.


Here is the add/coalesceObservers code associated with the widget internalization pattern.

observeReferencedCells creates an array of the referenced cell IDs. Then it finds the object referenced by each ID and adds an observer to it.

- (void) observeReferencedCells
{
    if ([self hasExpression])
    {
        NSMutableArray * observers = [[self expression] coalesceObservers];

        int index;
        for (index = 0; index < [observers count]; index++) // An empty array simply skips the loop.
        {
            NSString * theToken =  [observers objectAtIndex:index];
            // iterate the document's widgets (a global variable)

            PDQAbstractWidget *referencedCell = [self findWidgetWithID:theToken];

            // Skip unresolved IDs: observing a nil object would subscribe
            // this widget to changes from every cell.
            if (referencedCell != nil)
            {
                [self addObserverToReferencedCell:referencedCell];
            }
        }
    }
}

coalesceObservers constructs the observer list, avoiding duplicates, by copying one instance of each cellRef token into coalescedObservers before notifications are added.

- (NSMutableArray *) coalesceObservers
{
    NSMutableArray * theTokens = [self createTokensForExpression];
    NSMutableArray * coalescedObservers = [NSMutableArray array];

    int index = 0;
    while (index < [theTokens count])
    {
        NSString * theToken = [theTokens objectAtIndex:index];

        if ([theToken tokenType] == eCellRefToken)
        {
            [coalescedObservers addObject:theToken];
            // Remove all occurrences of theToken from theTokens.
            // The next token slides into this index, so don't advance.
            [theTokens removeObject:theToken];
        }
        else
        {
            // Not a cell reference: step over it so the loop terminates.
            index++;
        }
    }

    return coalescedObservers;
}
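The same coalescing can be done in a single pass with a set of already-seen tokens, which avoids mutating the token list while walking it. The plain JavaScript sketch below shows that variant. The cell-reference pattern ("L" plus digits and an optional lowercase letter, as in L222 or L106e) is an assumption based on the cell IDs that appear elsewhere on this blog, and `isCellRef` is a hypothetical stand-in for the tokenType check.

```javascript
// Assumption: a cell reference looks like "L222" or "L106e".
function isCellRef(token) {
    return /^L\d+[a-z]?$/.test(token);
}

// Keep the first occurrence of each cell-reference token, in order.
function coalesceObservers(tokens) {
    const seen = new Set();
    const observers = [];
    for (const token of tokens) {
        if (isCellRef(token) && !seen.has(token)) {
            seen.add(token);
            observers.push(token);
        }
    }
    return observers;
}

console.log(coalesceObservers(["L1", "+", "L2", "*", "L1"]));   // [ 'L1', 'L2' ]
```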

addObserverToReferencedCell registers the dependent cell with the NSNotificationCenter as an observer of theReferencedCell. When theReferencedCell changes value, it tells the NSNotificationCenter by posting a PDQReferencedCellChanged notification.

The NSNotificationCenter then sends the observer a PDQReferencedCellChanged message with an object reference to the cell that changed.

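- (void) addObserverToReferencedCell:(PDQAbstractWidget *)theReferencedCell
{
    NSNotificationCenter* nc = [NSNotificationCenter defaultCenter];

    // The selector names the method to call when the notification arrives.
    [nc addObserver:self
           selector:@selector(handleReferencedCellChanged:)
               name:@"PDQReferencedCellChanged"
             object:theReferencedCell];
}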

handleReferencedCellChanged is the “action procedure” being set by addObserverToReferencedCell.

// Update the cell's value since a cell it depends on has changed.
- (void) handleReferencedCellChanged:(NSNotification *)notification
{
    [self recalculate];
    // Now say "changed" to tell the cells that depend on me to update also.
    NSNotificationCenter* nc = [NSNotificationCenter defaultCenter];
    [nc postNotificationName:@"PDQReferencedCellChanged" object:self];
}

February 8, 2010

Cocoa to Cappuccino – Thinking About Strings

Filed under: Uncategorized — frameworker @ 11:52 pm


I found working with strings in Cappuccino to be more logical than in Cocoa, but I experienced some uncertainty about what methods and functions were available in JavaScript. So I thought this example and writeup might be useful to others, who are coming to Cappuccino from Cocoa, as many of us are.

One of the main differences between Cocoa and Cappuccino when working with strings, is that Cappuccino lacks a scanner class. So Cappuccino applications must handle scanning by themselves. I thought it would be useful and illustrative to implement a Category that performs scanning, similar to Cocoa’s NSScanner. And since JavaScript strings are toll-free bridged to Cappuccino’s CPString class that was not too difficult.

I’ve constructed my API as a Category, rather than as a Subclass, so it can be used with all CPStrings. It corresponds roughly to NSScanner, but doesn’t mirror it.

I’ve maintained a Cocoa-like style. This may change as time goes by, but for now it feels more readable, especially since I’m going back and forth between Cappuccino and Cocoa.

And I haven’t addressed performance issues in writing this. The code is adequate for my purposes. But it would be interesting to profile and optimize it somewhere down the line.

IMPLEMENTATION NOTE: I’m not currently supporting “skip characters.” That’s because the next scan position is determined by the length of the previously scanned string. If you skip characters there isn’t a simple way (that I could think of) to communicate this to the next use of the scanner. Also all scans are case sensitive. An extended scanning category could add methods to support these. But for now, I’m avoiding that to keep things simple.

If you’re not familiar with NSScanner, it’s useful to note that the term “scan” means scanning from a particular starting point. That is, if you’re scanning for a particular string and it isn’t at the starting location of the string being scanned, then an empty string will be returned.

I’ll summarize the API here, but it will be much more instructive to view the source, which I’ve wrapped in a test program called cappscanner.


scanString – scans SELF, returning theString if a match is found.

    -(CPString)scanString:(CPString)theString startingAt:(int)startIndex

scanUpToString – scans SELF until a given string is encountered, accumulating characters into a string that’s returned. Scans to the end of SELF if stopString is not found.

- (CPString) scanUpToString:(CPString)stopString startingAt:(int)startIndex

scanUpToCharactersFromSet – scans SELF until a stopChar is encountered, accumulating characters into a string that’s returned. Scans to the end of SELF if no stopChars are found.

-(CPString)scanUpToCharactersFromSet:(CPString)stopChars startIndex:(int)index

scanCharactersFromSet – scans SELF as long as charsToScan are encountered, accumulating characters into a string that’s returned. Returns an empty string if no charsToScan are found.

-(CPString)scanCharactersFromSet:(CPString)charsToScan startIndex:(int)index
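Because CPStrings are JavaScript strings underneath, the behavior of these four methods can be sketched as plain functions. This is not the cappscanner source, just one way the descriptions above could be realized. The function names mirror the selectors, but the argument order is an assumption.

```javascript
// Return `target` if `source` has it exactly at `start`, else "".
function scanString(source, target, start) {
    return source.startsWith(target, start) ? target : "";
}

// Accumulate characters from `start` until `stop` is found (or to the end).
function scanUpToString(source, stop, start) {
    const end = source.indexOf(stop, start);
    return end === -1 ? source.slice(start) : source.slice(start, end);
}

// Accumulate characters from `start` until one of `stopChars` is found.
function scanUpToCharactersFromSet(source, stopChars, start) {
    let end = start;
    while (end < source.length && !stopChars.includes(source.charAt(end))) end++;
    return source.slice(start, end);
}

// Accumulate characters from `start` while they belong to `chars`.
function scanCharactersFromSet(source, chars, start) {
    let end = start;
    while (end < source.length && chars.includes(source.charAt(end))) end++;
    return source.slice(start, end);
}

console.log(scanString("=IF(L1;2;3)", "=IF(", 0));      // =IF(
console.log(scanUpToString("=IF(L1;2;3)", ";", 4));     // L1
```

Each function returns the scanned text, so a caller can advance its own position by the length of what came back, which is the convention described in the implementation note above.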


stringByReplacingString – replaces “target” with “replacement”, where “target” is a substring of SELF.

- (CPString)stringByReplacingString:(CPString)target withString:(CPString)replacement

setCharacterAtIndex – replaces the character at “index” in SELF.

-(CPString)setCharacterAtIndex:(unsigned)index theChar:(unichar)character

filterString – returns a copy of SELF filtering out the specified characters.


stripPrefix – returns a copy of SELF without thePrefix. Does nothing if SELF doesn’t have thePrefix.


stripSuffix – returns a copy of SELF without theSuffix. Does nothing if SELF doesn’t have theSuffix.


dropCharacters – drops numCharsToDrop characters from the end of SELF. Does nothing if numCharsToDrop > [string length]. Returns an empty string if numCharsToDrop == [string length].



decimalTail – returns the tail of a decimal string.


formatNodecString – replaces the period in a formatted decimal string with a single space character.


formatNosepString – removes commas and the period from a formatted decimal string.


removeSurroundingParentheses – strips any leading or trailing spaces too. N.B. Won’t remove an odd parenthesis on one end!


parseScript – parses the if, then and else components of a script string. Notice how the scanner walks down the script using the cumulative offset of previously scanned components. This is illustrative of a repetitive scanning pattern.


rectFromAnnot – converts the RECT string from a pdf annotation (e.g. RECT [432.97 580.92 441.86 589.95]) into a CGRect.


scanRect – scans the RECT string found in pdf annotations. Returns an array of strings for the left, bottom, right and top coordinates. Note that scanRect also employs a repetitive scanning pattern.


tokensSeparatedByCharactersFromSet – breaks the input string into an array of substrings.
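The "repetitive scanning pattern" mentioned for parseScript and scanRect is easiest to see in code: each scan starts at the running offset left by the previous one. The plain JavaScript sketch below follows the descriptions above rather than the actual category source. It assumes the three script components contain no ";" of their own, and that the rectangle keeps the PDF's bottom-left origin (no flipping).

```javascript
// Split "=IF(if;then;else)" into its three components.
function parseScript(script) {
    const prefix = "=IF(";
    if (!script.startsWith(prefix) || !script.endsWith(")")) return null;

    const body = script.slice(0, -1);       // drop the closing parenthesis
    let offset = prefix.length;             // running scan position
    const parts = [];
    for (let i = 0; i < 2; i++) {
        const end = body.indexOf(";", offset);
        if (end === -1) return null;
        parts.push(body.slice(offset, end));
        offset = end + 1;                   // step past the component and its ";"
    }
    parts.push(body.slice(offset));
    return { scriptIf: parts[0], scriptThen: parts[1], scriptElse: parts[2] };
}

// Scan "RECT [left bottom right top]" into four coordinate strings.
function scanRect(annot) {
    const open = annot.indexOf("[");
    const close = annot.indexOf("]", open + 1);
    if (open === -1 || close === -1) return null;
    const parts = annot.slice(open + 1, close).trim().split(/\s+/);
    return parts.length === 4 ? parts : null;
}

// Convert the four coordinates into an origin-plus-size rectangle.
function rectFromAnnot(annot) {
    const parts = scanRect(annot);
    if (parts === null) return null;
    const [left, bottom, right, top] = parts.map(Number);
    return { x: left, y: bottom, width: right - left, height: top - bottom };
}

console.log(parseScript("=IF(L222<12000;3500*L106e;0)").scriptThen);    // 3500*L106e
console.log(scanRect("RECT [432.97 580.92 441.86 589.95]")[0]);         // 432.97
```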



When you double-click the index.html file and then click the “Perform Scan Tests” button in the Scan Tests window, a Cappuccino Run Log Window will appear that contains these statements:

  Performing scan tests.

  Testing Scanning Methods.

  Scan if-then-else script.

  script is =IF(L222<12000;3500*L106e;0)
  scriptIf = L222<12000
  scriptThen = 3500*L106e
  scriptElse = 0

  Scan pdf style Rect.

  Build CGRect.

  left = 432.97
  bottom = 580.92
  right = 441.86
  top = 589.95

  Testing Utility Methods.

  Test stringByReplacingString

  string A plus string B
  string A + string B

  Test setCharacterAtIndex


  Test stripPrefix

  Mr. Coffee

  Test stripSuffix

  String Jr.

  Test dropCharacters


  Testing Test Related Methods.

  Test decimalTail


  Test formatNodecString

  1,099 87

  Test formatNosepString


  Scan tests complete.

“It works”🙂

My thanks to the Cappuccino Community, and especially to the Core Team.

September 11, 2009

How to constrain window size in Cappuccino?

Filed under: Uncategorized — frameworker @ 8:13 pm

I’m displaying an image in a CPWindow and wanted to keep the window from getting bigger than the image it’s displaying. I tried setting the window’s maximum size in applicationDidFinishLaunching. But that didn’t accomplish what I wanted; I’m still able to make the window arbitrarily large.

However, there is a change in behavior when setMaxSize is called.  The scroll bars persist at the maxSize position until both x and y are greater than maxSize.

With maxSize set, as you grow the window vertically, the horizontal scroll bar is fixed at the maxSize position and there is white space between the scroll bar and the bottom of the window.  Also, the vertical scroll bar is depicted as an empty, shaded, rect.


Once you make the window bigger than both maxWidth and maxHeight both scroll bars disappear and the window looks just like it would in the case where maxSize isn’t set.

In the case when maxSize isn’t set, as you grow the window vertically, the horizontal scroll bar is displayed at the window’s bottom edge with white space separating the bottom of the image and the scroll bar.  And if you make the window wider than the image’s width, the horizontal scroll bar disappears.


How do you constrain the window’s size? Do I need to do something in addition to setMaxSize? Is a different approach required to do this?

Here’s my AppController.j source code.

@import <Foundation/CPObject.j>
@import <Foundation/CPString.j>

@implementation AppController : CPObject
{
}

- (void)applicationDidFinishLaunching:(CPNotification)aNotification
{
    var kImageWidth  = 612.0,
        kImageHeight = 792.0,
        kMargin      = 16.0;

    // Create a window to take up the full screen
    var theWindow = [[CPWindow alloc] initWithContentRect:CGRectMakeZero() styleMask:CPBorderlessBridgeWindowMask],
        contentView = [theWindow contentView],
        bounds = [contentView bounds];

    // Create a scrollView to contain the pdf image
    var scrollView = [[CPScrollView alloc] initWithFrame:CGRectMake(0, 0, CGRectGetWidth(bounds), CGRectGetHeight(bounds))];
    [scrollView setAutoresizingMask: CPViewHeightSizable | CPViewWidthSizable];
    [scrollView setAutohidesScrollers:YES];

    // Create the image.
    var theImage = [[CPImage alloc] initWithContentsOfFile:[[CPBundle mainBundle] pathForResource:@"testImage.pdf"]];

    // Create the image view.
    var imageView = [[CPImageView alloc] initWithFrame:CGRectMake(0.0, 0.0, kImageWidth, kImageHeight)];

    // Put the image in the imageView.
    [imageView setImage: theImage];

    // We don't want the image to resize.
    // (Could have [imageView setAutoresizingMask:CPViewNotSizable | CPViewNotSizable] instead.)
    [imageView setImageScaling: CPScaleNone];

    // Put the imageView in the scroller.
    [scrollView setDocumentView:imageView];

    // Put the scroller in the window's view hierarchy.
    [contentView addSubview:scrollView];

    var maxWidth   = kImageWidth  + kMargin,
        maxHeight  = kImageHeight + kMargin,
        theMaxSize = CPMakeSize(maxWidth, maxHeight);

    [theWindow setMaxSize: theMaxSize];

    // Bring the window forward to display it.
    [theWindow orderFront:self];
}

@end


June 29, 2009

Inside PDQForms

Filed under: Uncategorized — frameworker @ 9:31 pm


PDQForms is tax preparation software for State and Federal forms. It integrates spreadsheet logic into pdf forms using PDFKit to combine form field information with labels and expressions. The result is a spreadsheet-like application layer over the form.

Creating a PDQ Form

The idea was to automate everything that could be.

The first step is to extract information for all the form’s fields, or “Annotations” in PDF lingo.

We need the type of Annotation: TEXT or CHECKBOX, and its PAGE, RECT and MAXLEN values.

This is accomplished with a PDFKit based program called PDQ Annotation Editor, which outputs a text file.

Next we must define expressions that contain the logic for each of the form’s fields. This is just like programming a spreadsheet.

Adding this information to the Annotation meta-data completes our task.

We now have a text file containing the information PDQForms will need to automate the form.

We then “bake” this back into the original PDF file.

And the form can now be opened with PDQForms!

June 23, 2009


Filed under: Uncategorized — frameworker @ 5:26 am

Since some tax forms use tables as well as formulae, I had to implement a tax table lookup algorithm. Things were complicated by the fact that these tables were not always available in sorted format.

I first tried using the unsorted tax tables. This required doing a comparison of each table entry until the proper tax bracket was found. But tax lookup tends to occur whenever you change any numeric field on the form, since this usually affects taxable income. So, doing a comparison for each entry, while easy, was way too slow! Suddenly I was seeing a totally unacceptable delay.

The solution was to pre-sort the tables, shifting the performance burden completely out of the user’s work flow, and then to use an efficient, recursive binary search algorithm to do the table lookup in PDQForms.


Copy the table from the tax handbook into a text file and clean out any patches of non-table data. Fortunately, the topology of non-table data is amenable to doing this, and also, the table data consists of an integral number of tuples on each line.

Finally, filter out commas, and prepend TUPLESIZE to the table.

The table we’re creating will have the same name as the form it applies to, but with a suffix that’s specified in the call to createTupleTable. Here it’s “tax”.

    NSString * theTableContents = [self createTupleTable: theTableData named: @"tax"];


1. Put the elements (NSStrings) of each tuple into an array (NSArray)

2. Put these tuples into an array so they can be sorted.

3. Sort the array of tuples, from smallest to largest, using “sortedArrayUsingFunction” and passing it the comparison function “sortByValue”:

    NSArray *sortedTuples = [tuples sortedArrayUsingFunction: sortByValue context: 0];

    // Orders tuples by their first item, the lower bound of the income bracket.
    NSInteger sortByValue(id tuple1, id tuple2, void *context)
    {
        double value1 = [[tuple1 objectAtIndex: 0] doubleValue];
        double value2 = [[tuple2 objectAtIndex: 0] doubleValue];
        if      (value1 > value2) return NSOrderedDescending;
        else if (value1 < value2) return NSOrderedAscending;
        else                      return NSOrderedSame;
    }


4. Convert the sorted tuples back into a single NSString.

5. Finally, “bake” the table data into the pdq file using PDFKit (10.5).
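Steps 1 through 4 can be compressed into a few lines once the table has been split into items. The plain JavaScript sketch below is an illustration, not the createTupleTable implementation: `sortTupleTable` is a made-up name, and it assumes the items arrive as a flat array of numeric strings whose length is a multiple of the tuple size.

```javascript
// Group a flat list of table items into tuples of tupleSize, sort the tuples
// by their first item (the bracket's lower bound), and flatten them again.
function sortTupleTable(items, tupleSize) {
    if (items.length % tupleSize !== 0) {
        throw new Error("table does not contain a whole number of tuples");
    }
    const tuples = [];
    for (let i = 0; i < items.length; i += tupleSize) {
        tuples.push(items.slice(i, i + tupleSize));
    }
    tuples.sort((t1, t2) => Number(t1[0]) - Number(t2[0]));
    return tuples.flat();
}

// Two tuples of (low, high, tax), given out of order.
console.log(sortTupleTable(["50", "100", "5", "0", "50", "2"], 3));
// [ '0', '50', '2', '50', '100', '5' ]
```

Comparing with Number matters here: sorting the strings directly would put "100" before "50".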


This stack-trace depicts the tax calculation mechanism:

[PDQAbstractWidget evaluateExpression: theTableLookup]
Table lookups are processed as a special case by evaluateExpression.
evaluateExpression calls doTableLookup.

  [PDQAbstractWidget doTableLookup: theTableLookup]
  doTableLookup is analogous to evaluateFunction, but for tableLookups.
  doTableLookup packages the call's parameters and calls "execute."
  Note that quoted parameters are passed in as literal strings.

    [NSString+PDQFunctionAdditions execute: parameterArray]
    execute dispatches the call to taxTableLookup.

      [NSString+PDQFunctionAdditions taxtablelookup: parameterArray]
      taxTableLookup unpacks the parameters, finds the taxBracket
      and uses it to determine the tax.

        [NSArray+PDQTableAdditions taxBracket]
        taxBracket is the recursive binary search algorithm,
        initially called from taxTableLookup,
        that does the "heavy lifting."
        taxBracket calls the helper routine taxInTuple.

          [NSArray+PDQTableAdditions taxInTuple]
          taxInTuple tests to see if taxable income falls
          within a bracket, to its left, or to its right.
          If income isn't in the current tax bracket,
          taxInTuple's return value is then used
          to seed recursive calls to taxBracket.

taxBracket and taxInTuple are shown below.

- (int) taxBracket: (double) income 
        tupleWidth: (int) tupleSize 
      startingWith: (int) firstTuple 
     andEndingWith: (int) lastTuple
{
    int tupleIndex = -1;

    // Base case: the range is empty, so income is not in the table.
    if (firstTuple > lastTuple)
    {
        return tupleIndex;
    }

    int comparison;

    int middleTuple = firstTuple+(lastTuple-firstTuple)/2;

    comparison = [self taxInTuple: income 
                          atIndex: middleTuple 
                       tupleWidth: tupleSize];
    if (comparison == 0)
    {
        tupleIndex = middleTuple;
    }
    if (comparison == 1)
    {
        tupleIndex = [self taxBracket: income 
                           tupleWidth: tupleSize 
                         startingWith: middleTuple+1 
                        andEndingWith: lastTuple];
    }
    if (comparison == -1)
    {
        tupleIndex = [self taxBracket: income 
                           tupleWidth: tupleSize 
                         startingWith: firstTuple 
                        andEndingWith: middleTuple-1];
    }

    return tupleIndex;
}

// Tuples are laid out end to end as one long array.
// The first two items of a tuple are its income bracket.
// The trailing items are the tax for each filing status
// in that tuple's income bracket.
- (int) taxInTuple: (double) income
           atIndex: (int) tupleIndex
        tupleWidth: (int) tupleSize
{
    int leftIndex = tupleIndex*tupleSize;
    int rightIndex = leftIndex + 1;

    NSString * leftItem = [self objectAtIndex: leftIndex];
    NSString * rightItem = [self objectAtIndex: rightIndex];

    double leftValue = [leftItem doubleValue];
    double rightValue = [rightItem doubleValue];

    // Test if intervals overlap:
    // IF NO use <= for right value
    // IF YES use < for right value.

    // CA brackets don't overlap
    // they're [x,y] [y+1,z].
    // So when income is exactly "y"
    // we want to match the ONLY bracket containing "y",
    // not the higher of two brackets!
    // One even, one odd means the intervals shouldn't "overlap."
    if ((int)leftValue%2 != (int)rightValue%2)
    {
        if ((income >= leftValue) && (income <= rightValue))
        {
            return 0;
        }
    }
    else
    // IRS brackets are [x,y] [y,z] so when income is exactly "y"
    // we want to match the higher of the two brackets!
    {
        if ((income >= leftValue) && (income < rightValue))
        {
            return 0;
        }
    }

    if (income < leftValue)
    {
        return -1;
    }
    else
    {
        return 1;
    }
}
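For comparison, here is the same lookup written iteratively in plain JavaScript. It is a sketch of the technique, not a port of the category: the table is assumed to be a flat, pre-sorted array of numeric strings, and the caller says whether the upper bound is inclusive instead of inferring that from the parity of the bracket bounds as the code above does.

```javascript
// Compare income with the bracket stored in tuple `index`:
// 0 = inside, -1 = income is below the bracket, 1 = income is above it.
// Assumption: inclusiveUpper is supplied by the caller.
function taxInTuple(table, tupleSize, index, income, inclusiveUpper) {
    const low = Number(table[index * tupleSize]);
    const high = Number(table[index * tupleSize + 1]);
    if (income < low) return -1;
    const inside = inclusiveUpper ? income <= high : income < high;
    return inside ? 0 : 1;
}

// Binary search for the tuple whose bracket contains income; -1 if none does.
function taxBracket(table, tupleSize, income, inclusiveUpper) {
    let first = 0;
    let last = table.length / tupleSize - 1;
    while (first <= last) {
        const middle = first + Math.floor((last - first) / 2);
        const comparison = taxInTuple(table, tupleSize, middle, income, inclusiveUpper);
        if (comparison === 0) return middle;
        if (comparison === 1) first = middle + 1;
        else last = middle - 1;
    }
    return -1;
}

// Brackets that share an endpoint: [0,50) [50,100) [100,150), each with one tax value.
const sharedEndpoints = ["0", "50", "1", "50", "100", "5", "100", "150", "9"];
console.log(taxBracket(sharedEndpoints, 3, 50, false));     // 1 (the higher bracket)
console.log(taxBracket(sharedEndpoints, 3, 150, false));    // -1 (not in the table)
```

The loop ends when the search range is empty, which plays the same role as a base case in the recursive form.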


Filed under: Uncategorized — frameworker @ 4:58 am


The PDQForms Debug target contains a Debug menu with commands that are enabled if a PDQDocument is open.

  Read test file…
  Save test file…

The “Save test file…” command journals edited forms into test files. The “Read test file…” command causes the test file to be read back into the current form, and verifies that values of the calculated fields are correct. This allows for rapid regression testing of forms after making changes to the code base.


Test files have the same names as the forms they “exercise,” but with the suffix “test”.

Test files are kept in the same folder as their corresponding “pdq” file.

The test file consists of FIELD_NAME, STRING_VALUE pairs, one pair per line.


Saving the test file writes out the form’s FIELD_NAME, STRING_VALUE data.

Reading the file back into the current form, populates each FIELD_NAME with its STRING_VALUE and causes that field’s dependents to update.

But if a STRING_VALUE read from the test file begins with an “=”, then it is an expected result and it will be compared to the calculated value of the form’s field, not stuffed into the form.


The test command logs discrepancies in calculated values:

“Unexpected value for calculated widget: ‘WidgetID’ shown: ‘itsValue’ expected: ‘itsExpectedValue’”

If there are no discrepancies, the test command logs the message:

“All calculated widgets have expected values :-)”
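The read-back logic is small enough to sketch in a few lines of plain JavaScript. This is an illustration under several assumptions, not the Debug menu's code: the name and value on each line are taken to be tab-separated, the leading "=" is treated purely as a marker and stripped before comparing, and the `form` object with `setValue` and `getValue` is a hypothetical stand-in for the open document.

```javascript
// Apply a test file to a form and collect discrepancy messages.
// Assumptions: tab-separated pairs; form.setValue / form.getValue are invented names.
function runTestFile(text, form) {
    const expected = [];
    for (const line of text.split(/\r?\n/)) {
        const tab = line.indexOf("\t");
        if (tab === -1) continue;                    // skip blank or malformed lines
        const name = line.slice(0, tab);
        const value = line.slice(tab + 1);
        if (value.startsWith("=")) {
            expected.push([name, value.slice(1)]);   // expected result: check it later
        } else {
            form.setValue(name, value);              // input: stuff it into the form
        }
    }
    const messages = [];
    for (const [name, want] of expected) {
        const shown = form.getValue(name);
        if (shown !== want) {
            messages.push(`Unexpected value for calculated widget: '${name}' shown: '${shown}' expected: '${want}'`);
        }
    }
    return messages;
}
```

Checking the expected values only after every input has been loaded means the order of lines in the test file does not affect the outcome. An empty result corresponds to the "All calculated widgets have expected values" message.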


Since the accuracy of calculations in forms is paramount, this simple but powerful approach solves an important problem.


Filed under: Uncategorized — frameworker @ 4:50 am

I realize this document is pretty dry, but I needed to document this implementation to facilitate discussion with others in the quest to improve it. Without going into why I took the approach I did, I will say that I did become very proficient in using strings with Objective C🙂



Expressions are written in infix notation, just as you’d expect. They consist of OPERATORS, OPERANDS, SEPARATORS and FUNCTION NAMES.


== <= or == or => != or && || !


+ - * % (modulus) ^ (exponentiation)

B. OPERANDS may be cell references or numbers. Numbers are evaluated as double-precision floating point.

C. SEPARATORS include ( ) , ;

The “=” character is prepended to all expressions, just like in VisiCalc.

D. All other tokens, those which are not OPERATORS, OPERANDS or SEPARATORS, are FUNCTION NAMES.

FUNCTION NAMES that end with “TABLE” are a special case. They perform Table Lookups for forms that use Tax Tables.
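The classification rules above amount to a short decision chain, sketched here in plain JavaScript. Several details are assumptions rather than facts about PDQForms: the operator set adds "<", ">", "<=", ">=" and "/" to the operators listed, the cell-reference pattern is guessed from IDs such as L222 and L106e, and the function names in the example are invented.

```javascript
// Assumption: the operator set is the listed operators plus the usual comparisons and "/".
const OPERATORS = new Set(["==", "!=", "<", "<=", ">", ">=", "&&", "||", "!", "+", "-", "*", "/", "%", "^"]);
const SEPARATORS = new Set(["(", ")", ",", ";"]);

function classifyToken(token) {
    if (OPERATORS.has(token)) return "operator";
    if (SEPARATORS.has(token)) return "separator";
    if (token === "IF") return "reserved";
    if (token.trim() !== "" && !isNaN(Number(token))) return "operand";    // a number
    if (/^L\d+[a-z]?$/.test(token)) return "operand";                       // a cell reference (assumed shape)
    // Everything else is a function name; names ending in TABLE are table lookups.
    return token.endsWith("TABLE") ? "table lookup" : "function";
}

console.log(classifyToken("L106e"));        // operand
console.log(classifyToken("TAXTABLE"));     // table lookup
```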


Operands, within formulae, may be cell references or numbers. Numbers are evaluated as double-precision floating point.


Boolean Formulae, expressions that resolve to YES (1) or NO (0), are used in the scriptIf component of Conditional Expressions. They aren’t used elsewhere at this time, but they could be. Any positive number could be interpreted as YES, but we currently require “1”.


Functions are snippets of code that get dispatched interpretively.

Function arguments may be formulae or can, themselves, be functions.

Quotes are used to transmit function (and table lookup) arguments as literals.

*Describe how a function gets added to PDQForms and the function dispatch mechanism.*
*Show the recursive function parsing routine.*


Table Lookups are used to find income tax, for example, for Federal filers whose taxable income is less than $100,000.

See the blog post on TABLE LOOKUPS IN PDQFORMS for a more detailed description of how they’re implemented.


Conditional Expressions are of the form =IF(scriptIf;scriptThen;scriptElse)

(“IF” is a reserved word)

ScriptIf, a Boolean Formula, resolves to a value of zero or one. If scriptIf resolves to 1, scriptThen is executed; otherwise scriptElse is executed. ScriptThen and scriptElse can be Table Lookups, Functions or Formulae.

Conditional Expressions may not be “nested.”


Expressions may not contain functions. For now, if an expression needs an embedded function, an invisible widget, that calls the function, can be referenced from the expression.
