February 8, 2010

Cocoa to Cappuccino – Thinking About Strings

Filed under: Uncategorized — frameworker @ 11:52 pm


I found working with strings in Cappuccino to be more logical than in Cocoa, but I was unsure which methods and functions were available in JavaScript. So I thought this example and writeup might be useful to others who are coming to Cappuccino from Cocoa, as many of us are.

One of the main differences between Cocoa and Cappuccino when working with strings is that Cappuccino lacks a scanner class, so Cappuccino applications must handle scanning themselves. I thought it would be useful and illustrative to implement a category that performs scanning, similar to Cocoa’s NSScanner. And since JavaScript strings are toll-free bridged to Cappuccino’s CPString class, that was not too difficult.

I’ve constructed my API as a category, rather than as a subclass, so it can be used with all CPStrings. It corresponds roughly to NSScanner, but doesn’t mirror it.

I’ve maintained a Cocoa-like style. This may change as time goes by, but for now it feels more readable, especially since I’m going back and forth between Cappuccino and Cocoa.

And I haven’t addressed performance issues in writing this. The code is adequate for my purposes. But it would be interesting to profile and optimize it somewhere down the line.

IMPLEMENTATION NOTE: I’m not currently supporting “skip characters.” That’s because the next scan position is determined by the length of the previously scanned string. If you skip characters, there isn’t a simple way (that I could think of) to communicate this to the next use of the scanner. Also, all scans are case sensitive. An extended scanning category could add methods to support these. But for now, I’m avoiding that to keep things simple.

If you’re not familiar with NSScanner, it’s useful to note that the term “scan” means scanning from a particular starting point. That is, if you’re scanning for a particular string and it isn’t at the starting location of the string being scanned, then an empty string will be returned.
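To make the "anchored at the starting point" idea concrete, here is a minimal plain-JavaScript sketch. It is not the cappscanner source; the function name and argument order are assumptions chosen to echo the category method described below.

```javascript
// Hypothetical helper, not the author's category method.
// Returns `target` only if it occurs in `source` exactly at `startIndex`;
// otherwise returns an empty string.
function scanString(source, target, startIndex) {
    var candidate = source.slice(startIndex, startIndex + target.length);
    return candidate === target ? target : "";
}

console.log(scanString("RECT [432.97", "RECT", 0)); // "RECT"
console.log(scanString("RECT [432.97", "[", 0));    // "" (the bracket is not at index 0)
console.log(scanString("RECT [432.97", "[", 5));    // "["
```

The second call returns an empty string even though the bracket appears later in the input, which is the behavior that distinguishes a scan from a search.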

I’ll summarize the API here, but it will be much more instructive to view the source, which I’ve wrapped in a test program called cappscanner.


scanString – scans SELF, returning theString if a match is found at startIndex. Returns an empty string otherwise.

- (CPString)scanString:(CPString)theString startingAt:(int)startIndex

scanUpToString – scans SELF until a given string is encountered, accumulating characters into a string that’s returned. Scans to the end of SELF if stopString is not found.

- (CPString)scanUpToString:(CPString)stopString startingAt:(int)startIndex
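One way this could look in plain JavaScript is sketched below. The function is a hypothetical stand-in, not the category's implementation, and it takes the scanned string as an explicit first argument instead of using SELF.

```javascript
// Hypothetical sketch: accumulate characters from `startIndex` until
// `stopString` is found; if it never appears, return the rest of the string.
function scanUpToString(source, stopString, startIndex) {
    var stopAt = source.indexOf(stopString, startIndex);
    return stopAt === -1
        ? source.slice(startIndex)
        : source.slice(startIndex, stopAt);
}

console.log(scanUpToString("L222<12000;3500*L106e;0)", ";", 0)); // "L222<12000"
console.log(scanUpToString("no stop here", ";", 3));             // "stop here"
```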

scanUpToCharactersFromSet – scans SELF until a stopChar is encountered, accumulating characters into a string that’s returned. Scans to the end of SELF if no stopChars are found.

- (CPString)scanUpToCharactersFromSet:(CPString)stopChars startIndex:(int)index

scanCharactersFromSet – scans SELF as long as charsToScan are encountered, accumulating characters into a string that’s returned. Returns an empty string if no charsToScan are found.

- (CPString)scanCharactersFromSet:(CPString)charsToScan startIndex:(int)index
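The two set-based scans are mirror images of each other, which a short plain-JavaScript sketch may make clearer. Here the "set" is simply a string of characters, as in the signatures above; the function bodies are my own assumption of how such scans could be written, not the cappscanner source.

```javascript
// Hypothetical sketch: scan until any character in `stopChars` is hit.
function scanUpToCharactersFromSet(source, stopChars, startIndex) {
    var i = startIndex;
    while (i < source.length && stopChars.indexOf(source.charAt(i)) === -1) {
        i++;
    }
    return source.slice(startIndex, i);
}

// Hypothetical sketch: scan for as long as characters belong to `charsToScan`.
function scanCharactersFromSet(source, charsToScan, startIndex) {
    var i = startIndex;
    while (i < source.length && charsToScan.indexOf(source.charAt(i)) !== -1) {
        i++;
    }
    return source.slice(startIndex, i);
}

console.log(scanUpToCharactersFromSet("432.97 580.92", " ]", 0));     // "432.97"
console.log(scanCharactersFromSet("432.97 580.92", "0123456789.", 0)); // "432.97"
console.log(scanCharactersFromSet("RECT", "0123456789.", 0));          // ""
```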


stringByReplacingString – replaces “target” with “replacement”, where “target” is a substring of SELF.

- (CPString)stringByReplacingString:(CPString)target withString:(CPString)replacement
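A plain-JavaScript sketch of a replacement helper follows. Whether the category replaces only the first occurrence or every occurrence isn't stated, so replacing every occurrence here is an assumption, as is the function name.

```javascript
// Hypothetical sketch: replace every occurrence of `target` (assumption:
// the original may replace only the first one).
function stringByReplacingString(source, target, replacement) {
    if (target === "") {
        return source; // nothing sensible to replace
    }
    return source.split(target).join(replacement);
}

console.log(stringByReplacingString("string A plus string B", "plus", "+"));
// "string A + string B"
```

Using split and join avoids having to escape regular-expression metacharacters in the target.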

setCharacterAtIndex – replaces the character at “index” in SELF.

- (CPString)setCharacterAtIndex:(unsigned)index theChar:(unichar)character
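JavaScript strings are immutable, so a helper like this has to build and return a new string. The sketch below is one way to do that; the out-of-range behavior (returning the input unchanged) is my assumption, not something the API summary specifies.

```javascript
// Hypothetical sketch: return a copy of `source` with the character at
// `index` swapped for `character`.
function setCharacterAtIndex(source, index, character) {
    if (index < 0 || index >= source.length) {
        return source; // assumption: out-of-range indexes leave the string alone
    }
    return source.slice(0, index) + character + source.slice(index + 1);
}

console.log(setCharacterAtIndex("Cocoa", 2, "k")); // "Cokoa"
```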

filterString – returns a copy of SELF filtering out the specified characters.


stripPrefix – returns a copy of SELF without thePrefix. Does nothing if SELF doesn’t have thePrefix.


stripSuffix – returns a copy of SELF without theSuffix. Does nothing if SELF doesn’t have theSuffix.
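Both stripping helpers reduce to an anchored comparison followed by a slice. A plain-JavaScript sketch, with hypothetical function names and example inputs of my own choosing:

```javascript
// Hypothetical sketch: drop `thePrefix` only if `source` starts with it.
function stripPrefix(source, thePrefix) {
    return source.indexOf(thePrefix) === 0
        ? source.slice(thePrefix.length)
        : source;
}

// Hypothetical sketch: drop `theSuffix` only if `source` ends with it.
function stripSuffix(source, theSuffix) {
    var cut = source.length - theSuffix.length;
    return cut >= 0 && source.lastIndexOf(theSuffix) === cut
        ? source.slice(0, cut)
        : source;
}

console.log(stripPrefix("Mr. Coffee", "Mr. "));  // "Coffee"
console.log(stripSuffix("String Jr.", " Jr."));  // "String"
console.log(stripPrefix("Coffee", "Mr. "));      // "Coffee" (unchanged)
```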


dropCharacters – drops numCharsToDrop from the end of SELF. Does nothing if numCharsToDrop > [string length]. Returns an empty string if numCharsToDrop == [string length].
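The two edge cases described here are easy to check with a plain-JavaScript sketch. The function below is a hypothetical equivalent, not the category method itself.

```javascript
// Hypothetical sketch: drop `numCharsToDrop` characters from the end.
function dropCharacters(source, numCharsToDrop) {
    if (numCharsToDrop > source.length) {
        return source; // asking for too many characters is a no-op
    }
    return source.slice(0, source.length - numCharsToDrop);
}

console.log(dropCharacters("1,099.87", 3)); // "1,099"
console.log(dropCharacters("abc", 3));      // "" (everything dropped)
console.log(dropCharacters("abc", 4));      // "abc" (unchanged)
```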



decimalTail – returns the tail of a decimal string.


formatNodecString – replaces the period in a formatted decimal string with a single space character.


formatNosepString – removes commas and the period from a formatted decimal string.
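The two formatting helpers can be sketched in plain JavaScript as follows. The input "1,099.87" is an assumption inferred from the "1,099 87" line in the run log further down; the function bodies are mine, not the cappscanner source.

```javascript
// Hypothetical sketch: swap the (first) period for a single space.
function formatNodecString(decimalString) {
    return decimalString.replace(".", " ");
}

// Hypothetical sketch: remove every comma and period.
function formatNosepString(decimalString) {
    return decimalString.replace(/[,.]/g, "");
}

console.log(formatNodecString("1,099.87")); // "1,099 87"
console.log(formatNosepString("1,099.87")); // "109987"
```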


removeSurroundingParentheses – removes a pair of parentheses surrounding SELF, and strips any leading or trailing spaces too. N.B. Won’t remove an unmatched parenthesis on one end!
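A plain-JavaScript sketch of this behavior is shown below. It assumes the spaces are stripped outside the parentheses only, and it uses trim(), which removes all whitespace and not just spaces; both points are my assumptions.

```javascript
// Hypothetical sketch: trim the string, then remove the parentheses only
// when both an opening and a closing one are present.
function removeSurroundingParentheses(source) {
    var trimmed = source.trim();
    var wrapped = trimmed.length >= 2 &&
        trimmed.charAt(0) === "(" &&
        trimmed.charAt(trimmed.length - 1) === ")";
    return wrapped ? trimmed.slice(1, -1) : trimmed;
}

console.log(removeSurroundingParentheses(" (1,099.87) ")); // "1,099.87"
console.log(removeSurroundingParentheses("(unbalanced"));  // "(unbalanced"
```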


parseScript – parses the if, then and else components of a script string. Notice how the scanner walks down the script using the cumulative offset of previously scanned components. This is illustrative of a repetitive scanning pattern.
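The cumulative-offset pattern can be sketched in plain JavaScript like this. The script string comes from the run log below; the helper functions and the returned object shape are hypothetical, and the "=IF(" prefix and ";" separators are assumptions read off that one example.

```javascript
// Hypothetical helpers, repeated here so the sketch stands alone.
function scanString(source, target, startIndex) {
    var candidate = source.slice(startIndex, startIndex + target.length);
    return candidate === target ? target : "";
}

function scanUpToString(source, stopString, startIndex) {
    var stopAt = source.indexOf(stopString, startIndex);
    return stopAt === -1 ? source.slice(startIndex) : source.slice(startIndex, stopAt);
}

// Walk down the script, advancing `offset` by the length of each scanned
// component plus one for the separator that follows it.
function parseScript(script) {
    var offset = 0;
    var prefix = scanString(script, "=IF(", offset);
    if (prefix === "") {
        return null; // not an if-then-else script
    }
    offset += prefix.length;

    var scriptIf = scanUpToString(script, ";", offset);
    offset += scriptIf.length + 1;

    var scriptThen = scanUpToString(script, ";", offset);
    offset += scriptThen.length + 1;

    var scriptElse = scanUpToString(script, ")", offset);
    return { scriptIf: scriptIf, scriptThen: scriptThen, scriptElse: scriptElse };
}

var parsed = parseScript("=IF(L222<12000;3500*L106e;0)");
console.log(parsed.scriptIf);   // "L222<12000"
console.log(parsed.scriptThen); // "3500*L106e"
console.log(parsed.scriptElse); // "0"
```

Because there is no skip-character support, each step has to account for the separator itself by adding one to the offset.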


rectFromAnnot – converts the RECT string from a PDF annotation (e.g. RECT [432.97 580.92 441.86 589.95]) into a CGRect.


scanRect – scans the RECT string found in PDF annotations. Returns an array of strings for the left, bottom, right and top coordinates. Note that scanRect also employs a repetitive scanning pattern.
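Here is a plain-JavaScript sketch of how the RECT string could be scanned and turned into a rectangle. The helpers are hypothetical, and the conversion (origin at left/bottom, width as right minus left, height as top minus bottom) is my assumption; the real category may handle coordinates differently. In Cappuccino the last step would presumably go through CGRectMake rather than an object literal.

```javascript
// Hypothetical helpers, repeated here so the sketch stands alone.
function scanUpToString(source, stopString, startIndex) {
    var stopAt = source.indexOf(stopString, startIndex);
    return stopAt === -1 ? source.slice(startIndex) : source.slice(startIndex, stopAt);
}

function scanUpToCharactersFromSet(source, stopChars, startIndex) {
    var i = startIndex;
    while (i < source.length && stopChars.indexOf(source.charAt(i)) === -1) {
        i++;
    }
    return source.slice(startIndex, i);
}

// Returns [left, bottom, right, top] as strings.
function scanRect(annot) {
    var offset = scanUpToString(annot, "[", 0).length + 1; // step past "["
    var coords = [];
    for (var i = 0; i < 4; i++) {
        var coord = scanUpToCharactersFromSet(annot, " ]", offset);
        coords.push(coord);
        offset += coord.length + 1; // step past the space or "]"
    }
    return coords;
}

// Builds a plain object with an origin and a size (assumed conversion).
function rectFromAnnot(annot) {
    var c = scanRect(annot).map(Number);
    return {
        origin: { x: c[0], y: c[1] },
        size: { width: c[2] - c[0], height: c[3] - c[1] }
    };
}

console.log(scanRect("RECT [432.97 580.92 441.86 589.95]"));
// [ "432.97", "580.92", "441.86", "589.95" ]
console.log(rectFromAnnot("RECT [432.97 580.92 441.86 589.95]").origin);
```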


tokensSeparatedByCharactersFromSet – breaks the input string into an array of substrings.
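A tokenizer along these lines can be sketched in plain JavaScript as follows. Dropping empty tokens between adjacent separators is my assumption; the category may well keep them.

```javascript
// Hypothetical sketch: split `source` wherever a character from
// `separators` appears, skipping empty tokens (assumption).
function tokensSeparatedByCharactersFromSet(source, separators) {
    var tokens = [];
    var current = "";
    for (var i = 0; i < source.length; i++) {
        var ch = source.charAt(i);
        if (separators.indexOf(ch) === -1) {
            current += ch;
        } else if (current !== "") {
            tokens.push(current);
            current = "";
        }
    }
    if (current !== "") {
        tokens.push(current);
    }
    return tokens;
}

console.log(tokensSeparatedByCharactersFromSet("432.97 580.92 441.86 589.95", " "));
// [ "432.97", "580.92", "441.86", "589.95" ]
console.log(tokensSeparatedByCharactersFromSet("L222<12000;3500*L106e;0", ";"));
// [ "L222<12000", "3500*L106e", "0" ]
```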



When you double-click the index.html file and then click the “Perform Scan Tests” button in the Scan Tests window, a Cappuccino Run Log Window will appear that contains these statements:

  Performing scan tests.

  Testing Scanning Methods.

  Scan if-then-else script.

  script is =IF(L222<12000;3500*L106e;0)
  scriptIf = L222<12000
  scriptThen = 3500*L106e
  scriptElse = 0

  Scan pdf style Rect.

  Build CGRect.

  left = 432.97
  bottom = 580.92
  right = 441.86
  top = 589.95

  Testing Utility Methods.

  Test stringByReplacingString

  string A plus string B
  string A + string B

  Test setCharacterAtIndex


  Test stripPrefix

  Mr. Coffee

  Test stripSuffix

  String Jr.

  Test dropCharacters


  Testing Test Related Methods.

  Test decimalTail


  Test formatNodecString

  1,099 87

  Test formatNosepString


  Scan tests complete.

“It works” 🙂

My thanks to the Cappuccino Community, and especially to the Core Team.

