Gel Tracker was a program used to “track and extract” DNA data from “sequenced gels.” It was my full-time project from December 1996 until June 1998. I wrote it in C++ using MacApp.
Gel Tracker drew lines, interactively, down the center of each lane on a gel to identify the data slices to be extracted for sequencing. It also used an auto-tracking algorithm that I wrote, which radically improved tracking accuracy and efficiency. I also implemented a panel display to visualize chromatograms for tracking validation.
Milestones Leading to Automated DNA Sequencing
This is not a comprehensive history of the “Genomic universe,” just a rough timeline to show where Gel Tracking fit into the scheme of things in DNA sequencing.
1953 – The molecular structure of the DNA molecule is discovered. (Watson & Crick)
1977 – DNA sequencing by electrophoresis (Frederick Sanger)
1983 – PCR enables rapid amplification of DNA (Kary Mullis)
1984 – Four-color dye method of sequencing developed (Tim Hunkapiller)
Automated DNA Sequencers
1986 – ABI 370A
1990 – ABI 373A
1995 – ABI 377
1996 – ABI 377 (96 lanes)
Tracking a Gel
The early sequencers used slab gels. The “sequenced gel” looked like a series of multi-colored ribbons running down a glass plate. It was necessary to “track the gel,” that is, to define the locus of each lane, so that the underlying data could be extracted in the next step of the bioinformatic data stream.
When you look at a Gel Image, you see a four-color image: Red, Green, Yellow and Blue. While each point in a gel image has four intensity values, one for each dye, only the most intense color is displayed at each pixel to enhance the visualization of DNA sequence data.
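The “most intense color wins” rule can be sketched in a few lines of C++. This is only an illustration, not Gel Tracker's actual code: the names Dye, DyeSample, GelImage and DisplayedDyeAt are hypothetical, as are the row-major storage and the choice to break ties in favor of the first dye.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; the original source is not shown here.
enum class Dye : std::size_t { Red = 0, Green = 1, Yellow = 2, Blue = 3 };

// One point in the gel image: an intensity for each of the four dyes.
using DyeSample = std::array<std::uint16_t, 4>;

// Return the dye with the highest intensity; ties go to the lowest index
// (an assumption, since the article does not say how ties were resolved).
Dye DominantDye(const DyeSample& sample) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < sample.size(); ++i) {
        if (sample[i] > sample[best]) best = i;
    }
    return static_cast<Dye>(best);
}

// Assumed layout: samples stored row-major, width * height entries.
struct GelImage {
    std::size_t width;
    std::size_t height;
    std::vector<DyeSample> samples;
};

// The color drawn at pixel (x, y) is the dominant dye of that sample.
Dye DisplayedDyeAt(const GelImage& gel, std::size_t x, std::size_t y) {
    assert(x < gel.width && y < gel.height);
    assert(gel.samples.size() == gel.width * gel.height);
    return DominantDye(gel.samples[y * gel.width + x]);
}
```

Drawing a gel view then amounts to calling DisplayedDyeAt for every visible pixel and painting the corresponding color.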
Gel Tracker window containing a tracked 96-lane gel:
You can see comb markers to the left and lane markers above the gel. The lanes are depicted as vertical white lines with selection handles at points where the lanes and combs intersect. (The handles turn red if they are “selected.”)
As “Moore’s Law” played out in sequencing, lane geometry became increasingly tight. With the advent of the 96-lane sequencer, lane separation was no longer always obvious. So it became necessary to augment user tracking with algorithms to largely automate the tracking process.
The 96-lane gel has a gap between two sets of 48 lanes. If a gel was good*, and a lot of them were, you could track the outside lanes of each 48-lane group, then interpolate each of the two interior lane sets and auto-track them. Sometimes auto-tracking a second time would improve the results slightly, where the auto-tracking algorithm had been a bit “conservative” for the given data.
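As a rough sketch of the interpolation step, assume each lane is stored as one x-coordinate per comb row and that interior lanes are spaced evenly between the two tracked outer lanes. Both assumptions, and the names Lane and InterpolateLanes, are illustrative only and are not taken from Gel Tracker itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a lane is one x-coordinate per comb row.
using Lane = std::vector<double>;

// Build `laneCount` lanes, including both outer lanes, by spacing the
// interior lanes evenly between the tracked left and right lanes.
std::vector<Lane> InterpolateLanes(const Lane& left, const Lane& right,
                                   std::size_t laneCount) {
    assert(laneCount >= 2);
    assert(left.size() == right.size());
    std::vector<Lane> lanes(laneCount, Lane(left.size()));
    for (std::size_t i = 0; i < laneCount; ++i) {
        // t runs from 0 (left outer lane) to 1 (right outer lane).
        const double t =
            static_cast<double>(i) / static_cast<double>(laneCount - 1);
        for (std::size_t row = 0; row < left.size(); ++row) {
            lanes[i][row] = left[row] + t * (right[row] - left[row]);
        }
    }
    return lanes;
}
```

For a 48-lane group, laneCount would be 48. The interpolated lanes are only starting positions, which auto-tracking then refines.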
Selection was just a matter of dragging a selection rectangle over the lane markers you wanted to include. You could, additionally, Command-click to invert the selection state of individual lanes or Shift-click to affect multiple lanes.
* High fluorescence saturation, reactions completed for practically all lanes (i.e., no “dropouts”), uniform lane spacing.
Visualizing DNA Data for Extraction
The key to optimizing gel tracking was to auto-track good gels but to have an interface that visualized the gel for manual tracking as a fall-back for bad gels. Adding the chromatogram panel accomplished that by allowing users to interactively observe a slice of the gel as it was being tracked. By noticing the amplitude of the chromatogram and watching its “Chrom value,” the user could find the optimal path for any lane.
The chromatogram panel allows you to visualize a slice of the gel when a single lane is selected. It’s implemented with a synchronous scroller so that it always matches the selected lane as the gel is zoomed and scrolled.
In 1987, as a Christmas present to his wife, who was a geologist, Larry Tesler, who had invented MacApp, added Synchronous Scrolling to MacApp for visualizing scientific information. Here, ten years later, I was using it for just that, in biochemistry.
Visualizing chromatogram data conferred many benefits:
• Made it easier to track difficult parts of a gel.
• Provided interactive feedback to shorten the “learning-curve” for new users.
• Helped users develop confidence in auto-tracking and understand other Gel Tracker features.
• Was essential for testing Gel Tracker, especially in conjunction with the “Chrom value” field.
Tracking detail with chromatogram:
Gel Tracker Architecture
Gel Tracker was implemented with MacApp, which provided general mechanisms for user interaction: selection, dragging, tracking feedback and one level of undo/redo. I loved MacApp!
MacApp’s View architecture perfectly supported Gel Tracking and the addition of USynchScroller, used to visualize “data slices,” was “icing on the cake”!
The adjustable lane map overlaid the gel view. When the lanes were retracked either manually or automatically, the affected extent of the gel view was invalidated and redrawn, using an offscreen bitmap, prior to updating the lane map.
Scrolling and Zooming
Scrolling and zooming performance was quite good, minimizing flicker, because of the offscreen bitmap. But to further reduce flicker, I kept a grey scroller underneath the gel. Only visible during invalidation, it made the perceived flicker practically vanish!
In the figure, “Tracking detail with chromatogram,” you can see the zoom controls between the ACGT buttons and the Sample Name text field. The gel is shown fully zoomed-in, so the horizontal and vertical magnification controls are dimmed.
When zooming, lane map positioning is conserved by numerically scaling the lane map. It’s also necessary to compensate for the fact that the zoomed image representation may be “downsampled,” a slight wrinkle.
Downsampling is a technique to visualize a reduced gel image. Rather than taking every nth vertical pixel, swaths of contiguous pixels are taken – still in proportion to the zoom factor – to maintain the characteristic appearance of DNA data.
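The difference between “every nth pixel” and “swaths” can be illustrated by computing which source rows survive a reduction. This sketch rests on assumptions: the swath length is a free parameter here (the article does not say how it was chosen), and SwathRows is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the indices of the source rows kept when reducing an image
// vertically by `factor`. Instead of keeping 1 row out of every `factor`,
// keep `swath` contiguous rows out of every `factor * swath`, so the
// overall reduction ratio is the same but short runs of rows stay intact.
std::vector<std::size_t> SwathRows(std::size_t sourceRows,
                                   std::size_t factor,
                                   std::size_t swath) {
    assert(factor >= 1 && swath >= 1);
    std::vector<std::size_t> kept;
    const std::size_t period = factor * swath;
    for (std::size_t start = 0; start < sourceRows; start += period) {
        for (std::size_t r = start; r < start + swath && r < sourceRows; ++r) {
            kept.push_back(r);
        }
    }
    return kept;
}
```

With 12 source rows and a factor of 3, a swath of 1 keeps rows 0, 3, 6, 9 (plain every-nth sampling), while a swath of 2 keeps rows 0, 1, 6, 7. Both keep four rows, but the second preserves adjacent rows together.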
Auto-tracking was performed, first, by shifting each tracking segment left or right, then by adjusting its slope and, finally, by “annealing” the endpoints of adjoining segments to maximize the lane’s image value.
The lane’s image value is the sum of intensity values for each pixel under each of its segments.
Importantly, the first two stages of auto-tracking are constrained by “edge detection,” which triggers when the number of color changes between a segment’s original position and an alternate position exceeds a preset value.
When this occurs, it is assumed that a lane boundary has been reached and auto-tracking is halted for that stage.
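The first stage (the left/right shift) might look something like the sketch below. It is a simplified reconstruction under stated assumptions, not the original code: segments are treated as purely vertical, “color changes” is interpreted as the number of rows whose displayed dye differs between the original and the candidate column, and all names (Pixel, Gel, Segment, ImageValue, ColorChanges, ShiftSegment) are hypothetical. The slope and “annealing” stages are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

struct Pixel {
    int intensity;  // intensity of the displayed (most intense) dye
    int dye;        // index of that dye, 0..3
};

struct Gel {
    std::size_t width;
    std::size_t height;
    std::vector<Pixel> pixels;  // assumed row-major layout
    const Pixel& At(std::size_t x, std::size_t y) const {
        assert(x < width && y < height);
        return pixels[y * width + x];
    }
};

// Simplified tracking segment: rows [top, bottom) at a single column x.
struct Segment { std::size_t x, top, bottom; };

// Image value: the sum of the intensities under the segment.
long ImageValue(const Gel& gel, const Segment& s) {
    long sum = 0;
    for (std::size_t y = s.top; y < s.bottom; ++y) sum += gel.At(s.x, y).intensity;
    return sum;
}

// Assumed edge measure: rows whose displayed dye differs at column altX.
std::size_t ColorChanges(const Gel& gel, const Segment& s, std::size_t altX) {
    std::size_t changes = 0;
    for (std::size_t y = s.top; y < s.bottom; ++y) {
        if (gel.At(s.x, y).dye != gel.At(altX, y).dye) ++changes;
    }
    return changes;
}

// Try shifts of up to maxShift pixels left, then right, keeping the position
// with the highest image value. A direction is abandoned as soon as the
// color-change count exceeds maxColorChanges (a presumed lane boundary).
Segment ShiftSegment(const Gel& gel, Segment s, std::size_t maxShift,
                     std::size_t maxColorChanges) {
    Segment best = s;
    long bestValue = ImageValue(gel, s);
    for (int dir : {-1, +1}) {
        for (std::size_t step = 1; step <= maxShift; ++step) {
            if (dir < 0 && step > s.x) break;  // ran off the left edge
            const std::size_t x = dir < 0 ? s.x - step : s.x + step;
            if (x >= gel.width) break;         // ran off the right edge
            if (ColorChanges(gel, s, x) > maxColorChanges) break;
            const Segment candidate{x, s.top, s.bottom};
            const long value = ImageValue(gel, candidate);
            if (value > bestValue) { bestValue = value; best = candidate; }
        }
    }
    return best;
}
```

The edge check is what keeps a segment from wandering into a brighter neighboring lane: a higher image value alone would otherwise pull it across the boundary.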
In honor of Tim Hunkapiller, who, besides being a world-class biochemist, was a big fan of The Rocky Horror Picture Show, the first phase of auto-tracking was named “The Time-Warp,” after the song from that movie containing the lyric: “just a jump to the left, then a jump to the right.”
Early in the project, I encountered a really weird toolbox “bug” that drove me nuts. In my “jihad” to defeat it, I was fortunate to find one of the last existing copies of Programming QuickDraw (which had been published six years earlier, in 1991) in Stacey’s Bookstore in Palo Alto. When I still couldn’t figure out how to solve the problem, I called the author, David Surovell, who graciously spoke with me about it! In the end, the solution involved “QuickDraw Voodoo.” But David’s assistance in ruling everything else out helped me keep my sanity and find the eventual solution!