Notes on “Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior”

September 21, 2008

Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior, Andrew J. Ko and Brad A. Myers

Summary: The authors built “Whyline”, a tool that lets a user browse backward through the relevant code to explain a program’s behaviour. To do this, they record a trace of the program’s execution and combine it with an analysis of the source code. Whyline filters the trace to identify which code is directly relevant to user-visible behaviour.
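To make the “browse backward through a trace” idea concrete, here is a minimal sketch of my own. It is not Whyline’s implementation: the `Event` record, the `why` function and the toy trace are all hypothetical, and it follows only data dependencies (which earlier step produced each value that was read), ignoring control flow and everything else a real tool would need.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One recorded execution step (hypothetical trace format, not Whyline's)."""
    index: int          # position of this step in the trace
    line: str           # source line that ran
    writes: str         # variable assigned by this step
    reads: tuple = ()   # variables this step read


def why(trace, target_index):
    """Return the events the target event depends on, oldest first."""
    last_write = {}  # variable name -> index of the latest event that wrote it
    deps = {}        # event index -> indices of the events whose values it read
    for ev in trace:
        deps[ev.index] = [last_write[v] for v in ev.reads if v in last_write]
        last_write[ev.writes] = ev.index

    # Walk backward from the target over the recorded dependencies.
    relevant, stack = set(), [target_index]
    while stack:
        i = stack.pop()
        if i not in relevant:
            relevant.add(i)
            stack.extend(deps[i])

    by_index = {ev.index: ev for ev in trace}
    return [by_index[i] for i in sorted(relevant)]


# Toy trace: `debug` is assigned but never contributes to `label`.
trace = [
    Event(0, "width = 10", "width"),
    Event(1, "height = 5", "height"),
    Event(2, "debug = True", "debug"),
    Event(3, "area = width * height", "area", ("width", "height")),
    Event(4, "label = str(area)", "label", ("area",)),
]

# "Why does `label` have this value?" -> only the steps that fed into it.
explanation = [ev.line for ev in why(trace, 4)]
```

Asking about the last event returns the two assignments, the multiplication and the conversion, and leaves out the unrelated `debug` assignment. That filtering step is the part of the idea the summary above describes.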

The good: Whyline seems like a very cool tool, and the approach they describe looks genuinely helpful. I’d love to see it generalized to new programming environments.

The bad: The authors dedicate part of the paper to showing a productivity improvement due to Whyline. To do this, they compare the task times of some programmers using Whyline against those of experts working without it. They conclude that novices with Whyline are faster than experts without it.

The first problem is that they tested Whyline with only nine people. The second problem is that they don’t give nearly enough information to determine whether the methodology was sound. The third problem is that the data for the expert “comparison group” comes from a different study. The fourth problem is that these experts were interrupted every three minutes.

From the paper: “We compared these people’s task performance with that of 18 self-described expert Java developers from a prior study, who used Eclipse 2.1 (in that study, participants were interrupted about every three minutes, but this time were removed from our analyses here).”

Whyline seems to be a cool tool, but the “scientific” justification for it is a sham.
