Exploration Through Example

Example-driven development, Agile testing, context-driven testing, Agile programming, Ruby, and other things of interest to Brian Marick

Tue, 27 Jan 2004

A paper to write

I'm sitting in on a sociology of science seminar at the University of Illinois. It's about Andrew Pickering's The Mangle of Practice: Time, Agency, and Science. The idea is that participants will write a paper that may be included in an edited volume of follow-ons to Pickering's book.

I'll be using this blog category to summarize Pickering's ideas as a way of getting them clear in my mind. The basic idea is one of emergence, that novelty and change arise from collisions. A scientific fact might arise from the collision of a theory of the world, a theory of instrumentation, and the brute fact of what the instrument does when you use it. A mathematical fact might arise from bashing together algebra and geometry in an attempt to achieve certain goals.

What attracts me to Pickering's work is what attracts me to the agile methods: an emphasis on practice, on doing things in the world; the idea that the end result is unpredictable and emergent; the idea that everything is up for grabs, up for revision, including the goals you started with; and a preference for the boundaries of fields over their purest forms.

The paper I'm thinking of writing is "The Mangling of Programming Practice from Smalltalk-80 to Extreme Programming". I think it's fairly well-known that the agile methodologists were disproportionately involved in Smalltalk and patterns way back when. What was the trajectory from Kent Beck's and Ward Cunningham's early days at Tektronix to the development of XP as it is today? It's a historical oddity that Smalltalk found a niche in the IT/insurance/business world. What was the effect of bashing the untidiness and illogicality of that world against the elegance of "all objects, all the time"? We see CRC cards being described in 1989. Today, the 3x5 card is practically a totem of XP. So are colocation, big visible charts, and so forth. Here we see an ever-increasing recognition of how the material world interacts with what is so easy to think of as a purely conceptual world. What's going on? Etc.

Does Pickering's account of change shed any light on how we got from Xerox PARC to where we are today? And, more important to my mind, does it give us any ideas about where we really are and how we might proceed?

Posted at 21:03 in category /mangle [permalink] [top]

Types of bugs

Jonathan Kohl writes about the superbugs that remain after test-driven design. That inspired this ramble.

Long ago, I was much influenced by mutation testing, an idea developed by DeMillo, Hamlet, Howden, Lipton, and others. (Jester is an independent reinvention of the idea.) "Strong" mutation testing goes like this:

  1. Suppose you created a huge number of "mutants" of a program. A mutant is created by changing one token in the original program. For example, you might change a < to a <= or you might replace a use of variable i with a use of variable j. These one-token changes are the mutant transforms.

  2. Run your existing test suite against the original and the mutants. A mutant is killed if it gives a different answer from the original. The test that kills the mutant has the power to discover the bug in the mutant. (Assuming that the original is correct; alternatively, the test might discover that the original is wrong in a way that the mutant corrects.) A small sketch of a mutant and a test that kills it appears after this list.

  3. Add tests until you kill all the mutants (but see note 2 below). What do you now know? You know that the program is free of all bugs that can be caused by your set of mutant transforms.

  4. So? What about the other bugs? Early mutation work made two explicit assumptions:

    • The competent programmer hypothesis: Most bugs are such one-token bugs.

    • The coupling hypothesis: A test suite adequate to catch all one-token bugs will be very very good at catching the remainder.

    Given those hypotheses, a mutation-adequate test suite is generally adequate. That is, it will catch (almost) all bugs in the program.
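
To make the procedure concrete, here's a minimal Ruby sketch. The function, the mutant transform, and the test are invented for illustration; they aren't taken from any particular mutation tool:

    # Original code under test (an invented example).
    def discount(total)
      total > 100 ? 10 : 0
    end

    # One possible mutant transform: the > becomes >=, so the mutant is
    #   total >= 100 ? 10 : 0

    require 'test/unit'

    class DiscountTest < Test::Unit::TestCase
      # This test passes against the original but fails against the mutant
      # (which gives a discount for a total of exactly 100), so running the
      # suite against both versions kills that mutant.
      def test_no_discount_exactly_at_the_boundary
        assert_equal 0, discount(100)
      end
    end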

I quickly grew disillusioned with mutation testing per se.

  1. The number of mutants is enormous (it's often O(N²) in the size of the code under test). Most are easy to kill, but some aren't. So mutation testing is a lot of work. Worse, in my experiments, I didn't get the feeling that those last hard-to-kill mutants told me anything profound about my program.

  2. Some mutants are equivalent to the original program. Compare

        def max(a, b)
          if a > b
            a
          else
            b
          end
        end

    to

        def max(a, b)
          if a >= b
            a
          else
            b
          end
        end

    No test can distinguish these programs, so you have to spend time figuring out whether an as-yet-unkilled mutant is unkillable. When you do - which isn't necessarily easy - it's a let-down.
  3. I don't believe the competent programmer hypothesis because we know that faults of omission are a huge percentage of the faults in fielded products.

  4. I find it hard to believe that mutation-adequate tests are adequate to catch enough faults of omission, so I don't buy the coupling hypothesis either.

Nevertheless, I was - and am - enamored of the idea that after-the-fact testing should be driven by examining the kind of bugs that occur in real programs, then figuring out specific "test ideas" or "test requirements" that would suffice to catch such bugs. Back then (and sadly still sometimes today), test design was often considered a matter of looking at a graph of a program and finding ways to traverse paths through that graph. The connection to actual bugs was very hand-wavey.

It was my belief that programmers are not endlessly creative about the kinds of bugs they make. Instead, they code up the same kinds of bugs that they, and others, have coded up before. Knowledge of those bug categories leads to testing rules of thumb. For example, we know to test boundaries largely because programmers so often use > when they should use >=.

It was my hope back then that, by studying bugs, we could come up with concise catalogs of test ideas that would be powerful at finding many likely bugs. I published a book along those lines.

(I was by no means the only person thinking in this vein: Kaner, Falk, and Nguyen's Testing Computer Software had an appendix with a similar slant. And Kaner has students at Florida Tech extending that work.)
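
To give the flavor of such a test idea, here's one worked through in Ruby. The rule of thumb ("for any collection argument, try the empty collection") is a typical catalog-style entry; the function and tests are invented for this post, not drawn from the book or the appendix:

    # Applying a catalog-style test idea: "for any collection argument,
    # try the empty collection." The average function is an invented example.
    def average(numbers)
      numbers.inject(0) { |sum, n| sum + n } / numbers.size.to_f
    end

    require 'test/unit'

    class AverageTest < Test::Unit::TestCase
      def test_typical_case
        assert_equal 2.0, average([1, 2, 3])
      end

      def test_empty_collection
        # The catalog-driven case: 0 / 0.0 is NaN, so this assertion fails,
        # exposing behavior the programmer probably never thought about.
        assert_equal 0.0, average([])
      end
    end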

The test ideas that were novel in my book were based on bugs I found in C programs. For some time, I've thought that different programming technologies probably shift the distribution of bugs. Java programs have many fewer for loops walking arrays than C programs do, so there'll likely be fewer off-by-one bugs and less need for boundary testing. Languages with blocks/closures/lambdas encourage richer built-in collection functions, so are likely to have even fewer implementation bugs associated with collections. Etc.
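
As a sketch of that last claim (both versions are invented for illustration): an index-walking loop leaves room for the classic off-by-one, while the block-based version has no index to get wrong.

    # C-style index walking, transliterated into Ruby. Should the comparison
    # be < or <= ? That question is where off-by-one bugs live; as written,
    # with <=, it reads one element past the intended range.
    def sum_of_first_indexed(values, count)
      total = 0
      i = 0
      while i <= count
        total += values[i]
        i += 1
      end
      total
    end

    # Block/collection style: no index, so no off-by-one to make.
    def sum_of_first(values, count)
      values.take(count).inject(0) { |sum, v| sum + v }
    end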

As I've gotten more involved in test-driven design, it seems to me that micro-process will probably also have a big effect. Jonathan's note increases my suspicion. So now I'm thinking of things we might do at the Calgary XP/Agile Universe conference, which looks set to be a hub of agile testing activity.

  • Test-first programmers should learn to write better tests as they get feedback from missed bugs. Yet we don't have the sorts of catalogs that we have for refactorings, patterns, or code smells. Nor do we have a folklore of bug prevention. What can we do in Calgary to kick-start things?

  • How should people collaborate to reduce the bugs that slip past code reviews? Jonathan is pushing hard to understand tester-programmer collaboration. Since he'll be at Calgary, maybe we should do something - have programmers adopt testers and vice versa? - so that everyone can accelerate their learning.

It's too late to submit formal proposals to XP/AU, but there's lots of scope for informal activities.

Posted at 21:03 in category /agile [permalink] [top]

About Brian Marick
I consult mainly on Agile software development, with a special focus on how testing fits in.

Contact me here: marick@exampler.com.

 
