Exploration Through Example

Example-driven development, Agile testing, context-driven testing, Agile programming, Ruby, and other things of interest to Brian Marick
Fri, 19 Dec 2003

Agile Project Management with Scrum

I'm reading a prepublication copy of Ken Schwaber's Agile Project Management with Scrum. It rocks. Stories of real projects, just like I like. Watch for it.
## Posted at 17:57 in category /agile
Open source automated test tools written in Java

Carlos E. Perez has a nice list of tools.
## Posted at 10:08 in category /misc
Joe Bergin's Do the Right Thing pattern. (Warning: humor.)
## Posted at 10:02 in category /misc
[Updated to get Alan Green's name right.]

Charles Miller talks about decorating programmer tests to check for things like resource leaks. Alan Green chimes in. I met Charles Simonyi briefly on Tuesday and, weirdly enough, he had the same idea (with an aspect-ish flavor). A clever idea, plus it's a trigger for me to go meta.

On the airplane Saturday, I highlighted this sentence from Latour and Woolgar's Laboratory Life: The Construction of Scientific Facts: "... the solidity of this object... was constituted by the steady accumulation of techniques" (p. 127 of the 2nd edition, 1986).

The object in question was Thyrotropin Releasing Factor (TRF), which is something you have inside you. Latour and Woolgar tell the story of how TRF went from being a name given to something hypothetically present in one kind of unpurified glop to, specifically, Pyro-Glu-His-Pro-NH2. Now, you might say that TRF was always a solid object - that Pyro-Glu-His-Pro-NH2 always existed, whether or not any person knew of it. Fine. But with respect to people, TRF didn't exist until the end of an enormous and time-consuming effort that yielded both a formula and, eventually, a Nobel prize. After that effort, other researchers could depend on that structure with unreflective certainty, and manufacturers could manufacture TRF in bulk rather than extracting minute quantities from slaughtered sheep. Latour and Woolgar say that the sheer mass of diverse techniques applied - mass spectrometry, skilled technicians performing bioassays, computer algorithms, techniques for writing persuasive papers, and so forth - made TRF into a solid object people can use for their purposes.

I don't particularly care about TRF. I read this kind of stuff to give me ideas about what I do. And part of what I do is help construct facts. You see, people make things in the world. Some of those things are very concrete: bridges. Some of them are very abstract: democracy. Both kinds of things have power in the world, and both are solid: the idea of democracy is as resistant to attack as the Golden Gate Bridge is to weather and tides. I can interpret Latour and Woolgar as implying that democracy is not just an idea; rather, it's built from the techniques used to implement it. Robert's Rules of Order make parliamentary democracy. They're not merely one set of adornments around its unchanging essence.

Similarly, object-oriented programming is not just programming with languages that provide inheritance, polymorphism, and encapsulation. It's built from how people use those languages - from design patterns, from CRC cards, from old talk about finding objects by underlining the nouns in a problem description, from even older practices of using OO languages for simulation, from the push toward ever-more-rapid iteration and feedback, from allowing exceptions to be stashable objects rather than transient stack transfers, and so forth. All these things - some of which don't seem in any way a part of any abstract essence of OO - nevertheless make up the solid notion we share.

What postings like Charles's and Alan's signify to me is that programmer testing is rapidly solidifying into a new fact in the world. For the first time, it's approaching the reality of the Golden Gate Bridge. Weird, huh? And hope-making.
## Posted at 06:45 in category /testing
Wed, 17 Dec 2003

Suppose I'm using programmer-style test-first design to create a class. I want four things.
That's all pretty easy because I'm one person creating both tests and code, plus I know conventional testing.

Now consider acceptance testing, the creation of what I quixotically call customer-facing checked examples. A programmer will want all the same things (plus more conversation with business experts). This is harder, it seems to me, because more people are involved. Suppose testers are creating these acceptance tests (not by themselves, but taking on the primary role, at least until the roles start blending). They have a tough job. They want to get those first tests out fast. They want to keep pumping them out as needed. They have to be both considerate of the time and attention of the business experts and also careful to keep them in charge. They must create tests of the "right size" to keep the programmers making steady, fast progress, but they also have to end up with a stream of tests that somehow makes sense as a bug-preventing suite at the end of the story.

There must be a real craft to this. It's not like conventional test writing. It's more like sequencing exploratory testing, I think, but still different. Fortunately, it looks like there's a reasonable chance I'll be helping teams get better at this over the next year, and I still plan to make a point of visiting other teams that have their act together. Mail me if you're on one, or if you have good stories to tell.
## Posted at 16:18 in category /agile
Sun, 14 Dec 2003

Walk through the walls of knowledge guilds

A (the) Viridian research principle:

The boundaries that separate art, science, medicine, literature, computation, engineering, and design and craft generally are not divinely ordained. The most galling of these boundaries are socially generated entities meant to protect the power-interests of knowledge guilds. This is not to say that all research techniques are identical, or that their results are all equally valid under all circumstances: quantum physics isn't opera. But there exists a sensibility that can serenely ignore intellectual turf war, and comprehend both physics and opera. You won't be able to swing a grant or sing an aria by knocking politely at the stage door. They won't seat you at the head of the table and slaughter the fatted calf. But you can take photographs, plant listening devices and leave. If you choose, you can step outside the boundaries history makes for you. You can walk through walls.
## Posted at 16:50 in category /misc
Fri, 05 Dec 2003

Martin Fowler writes (quoted in its entirety):
Acceptance tests are things to point at while having a conversation. The speakers will be testers, programmers, and business experts. The neatest surprise with FIT, to my mind, is the way you can turn a web page of tests into a narrative. I've mislaid my good example of this, but here's a snippet of a page about an example Ward often uses: a bond trading rule that all months should be treated as having 30 days. (Note: I don't know if the following is correct - I was just playing around with FIT, not writing a real app.)
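(Something like this, say - the column headings and values are made up, and I haven't verified the bond math:)

  start date    end date      days()
  2003-01-15    2003-02-15    30
  2003-01-15    2003-04-15    90
  2003-02-28    2003-03-28    30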
The page is organized as a set of special cases, each with a brief description followed by some checked examples. I really like this style: it's an easy-to-write mini-tutorial for the programmer and reader. However, when it comes to tests that are sequences (action fixtures), I find that the tabular format grates. The table, instead of being a frame around what I read, is something that intrudes on my reading. It may be that I'm too used to code.

Another difficulty is variables: such useful beasts. Think of constructing a circular data structure. Yes, you can write a parseable language to do that (Common Lisp had one, as I recall), but to my mind it's simpler just to create substructures, give them names, and use the names to lash things together. Or consider stashing results in a variable and then doing a set of checks. Or comparing two results to each other, where the issue isn't exact values but the relationships between values. You could invent variables in the FIT language, but you're starting to get into notational problems that scripting languages have already solved. That way lies madness (after inventing loops and subroutines and...).

And yet, there is the niceness of HTML format. I've been toying with the idea of a free-form action-ish fixture that looked like this:
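(Roughly - in Ruby, with every method name invented just for this sketch:)

  # A free-form script the fixture reads line by line.
  # new_bond, buy_on, days_held, etc. are made-up names.
  bond = new_bond
  bond.buy_on  "2003-01-15"
  bond.sell_on "2003-04-15"
  assert_equal bond.days_held, 90   # every month counts as 30 days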
Running the test might yield this:
Here, I'm following RoleModel's lead in using Ruby as a somewhat customer-readable language. (<rant>And I also fixed the backwards argument order of the xUnit asserts.</rant>) This format also allows for a mixture of scripting and tabular styles. I have some tests that look like this:
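(Again roughly like this, with invented job-control names:)

  start_job "compile"
  start_job "backup"
      assert_equal jobs_running, 2
  pause_job "backup"
      assert_equal jobs_running, 1
      assert job("backup").paused?
  stop_job "compile"
  stop_job "backup"
      assert_equal jobs_running, 0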
You'll notice that I indent the asserts. That's because the tests (especially the longer ones) do a lot of starting, stopping, and pausing of various interacting jobs. It's too hard to see what's going on if the asserts aren't visually distinguished from the actions. Nevertheless, it's still pretty ugly. I showed it to a customer-type (my wife, who knows no programming languages). She understood it fine after I explained it, but it didn't thrill her. This might be better:
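(Something like this, picturing it as an HTML table - actions in the left column, checks in the right, all names still invented:)

  start_job "compile"  |
  start_job "backup"   | assert_equal jobs_running, 2
  pause_job "backup"   | assert_equal jobs_running, 1
  stop_job "compile"   |
  stop_job "backup"    | assert_equal jobs_running, 0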
The fixture synthesizes all the cells into one Ruby method. Is this better? (I'm not a particularly gifted visual designer, as anyone who's looked at my main website can tell.)
A final thought. I've read a couple of papers on intentional programming. (Can't find any good links.) It didn't click for me. I think it was the examples. One example had a giant equation1 embedded in some C code. The thing is, I found the mixture of C and mathematical notation jarring. Is the equation really clearer - to a programmer immersed in the app - than the equivalent C code?
But when we're talking about tests, we have two people who have very different cognitive styles, goals, and backgrounds reading one text. I can envision a common underlying test representation that can be switched between two modes. One is the "discuss the test with the customer" mode. The other is the "see the test as executable code" mode. Perhaps Intentional Software will take up Martin's challenge. (If so, my consulting fees are quite modest...)

1 The equation shown isn't the one used in the paper, not unless there's been an amazing coincidence.
## Posted at 16:41 in category /testing
Tue, 02 Dec 2003

And now, a break for silliness.
## Posted at 17:58 in category /junk
Mon, 01 Dec 2003

Debugging, thinking, logging: how much of each?

Charles Miller quotes two authors on opposite sides of the debate over whether programmer tests eliminate the need for a debugger. I'm on the "I hardly ever use a debugger" side, but perhaps that's only because programmers perk up when I say I don't even know how to invoke the debugger in my favorite language (Ruby). Since it's my job to make programmers perk up, it's not in my interest to be a debugging fiend. The debate made me think of a third approach, or perhaps a complement to the other two, that doesn't get the press it deserves.

Let's step into the Wayback Machine... It's 1984, the height of the AI boom. Expert systems are all the rage. The company I worked for hatched the idea that what builders of flight simulators (the kind that go inside big domes filled with hydraulics) really wanted was... Lisp. (I love Lisp, but this was not the savviest marketing decision.) They cast around for someone who knew Lisp. I'd played around with it for a week. That qualified me to be technical lead. Dan was a quite good C programmer and happy on strange, out-of-the-way projects. Sylvia was a half-time graduate student who knew Fortran and Prolog. In the end, we produced the best Lisp in the world, if by "best" you mean "quality of the final product divided by how much the team started out knowing about Lisp". Actually, it wasn't bad. I'm pleased with what we accomplished.

It wasn't really that momentous an accomplishment, though. There was a free Lisp-in-Lisp implementation from Carnegie-Mellon. It ran on a machine called the Perq, whose main feature was user-programmable microcode. CMU had microcoded up a Lisp-machine-like bytecode instruction set, and their compiler produced bytecodes for it to execute. So we got a good start by coding up an interpreter (virtual machine) for the same instruction set. We just used C instead of microcode. I did the infrastructure (garbage collector, etc.) and Sylvia did most of the bytecodes.

I now get to the point... Early on, I decided on a slogan for my code: "no bug should be hard to find the second time". Whenever a bug was hard to find, I wrote whatever debug support code would have made that kind of bug easy to find. Over time, the system turned into something that was eminently debuggable. Snap your fingers, and it told you what was wrong. The things I did were very situated: they depended on the bug. But one thing I did was add a lot of logging. By letting the bugs drive where I put in logging statements, I avoided cluttering up the code too much. I remain a big fan of logging, and I'm distressed that the logging you see is so often so useless to anyone but the original author.

Resources:
The use of logging makes debuggers less necessary. Instead of single-stepping to figure out how on earth the program got to a point, you look at the log. If the logging is well-placed, and you have decent logging levels, you don't get mired in detail.

Having said that, it doesn't seem that logging is that useful to me in programmer tests. I don't need to know how the tests got somewhere. It's more useful in acceptance tests, where more is happening before the point of failure. Still, I rarely find myself looking at the log. It's most useful when trying to diagnose a bug not found by an automated test. Such bugs could be found by users or by exploratory testing. (Because exploratory testing is rather free-form, the log can help remind you of what you did when it's time to replicate a bug.)

One logging tip for large systems: I had a great time once doing exploratory testing of a big Java system that had decent logging. I'd dink around with the GUI, but have the scrolling log open in another window. Every so often, something interesting would flash by in the log: "Look! The main event loop just swallowed a NullPointerException!" That would reveal to me that I'd tickled something that had no grossly obvious effect on the external interface. It was then my job to figure out how to make it have a grossly obvious effect.
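(For concreteness, a minimal sketch of what I mean, using Ruby's standard Logger - the account objects and method names are invented, and where the statements go is the bug-driven part:)

  require 'logger'

  # One shared, leveled logger: crank it to DEBUG while chasing a bug,
  # leave it at INFO so routine runs don't mire you in detail.
  LOG = Logger.new($stdout)
  LOG.level = Logger::INFO

  def transfer(from, to, amount)
    LOG.info("transfer") { "#{amount} from #{from.name} to #{to.name}" }
    from.withdraw(amount)
    to.deposit(amount)
  rescue => e
    # Added after this failure was hard to find the first time:
    # the second time, the log says where and why it went wrong.
    LOG.error("transfer") { "failed: #{e.message}" }
    raise
  end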
## Posted at 21:18 in category /misc
Fri, 28 Nov 2003

I'm looking for really exemplary programmer tests (unit tests, whatever) to discuss in a class. The tests should be for a substantial product, something that's been put to real use. (It could be a whole product or some useful package.) I favor Java examples, though I'll also take C#, C++, C, Ruby, or Python. The example should be open source, or I should get special permission to use it. I'd really like to be able to run the tests, not just read them. If you know of such a thing, tell me. Thanks.
## Posted at 16:49 in category /testing
Thu, 27 Nov 2003

Coding standards (and a little on metaphors)

Somewhere around 1983, I shot my mouth off one time too many and found myself appointed QA Manager for a startup. I'm sure I would have been ineffectual no matter what - I had neither the technical credibility nor the personal skills for the job. The moment I realized I was doomed was probably in the middle of a rambunctious company-wide argument about a coding standard. I still have bad dreams about where to put the curly braces in C code.

Bill Caputo has a posting on coding standards. What I'll remember from it is a slogan I just made up: Coding standards are about the alignment of teams, not the consistency of code. Where were you when I needed you, Bill? I also quite like Bill's earlier posting about consistency.

My thoughts on consistency and completeness are moving in an odd direction, it seems. For example, I'm fond of Lakoff and Johnson's thesis that reasoning is metaphorical. So I think that our understanding of Understanding is freighted with the metaphor UNDERSTANDING IS SEEING. That changes the way we look for (ahem) understanding.

Some time ago, I started wondering why I have such a visceral sense of whether a system of thought is complete and consistent. Some of them simply seem whole, and that feeling is important to me. Why? Lakoff and Johnson say, "We are physical beings, bounded and set off from the rest of the world by the surface of our skins... Each of us is a container, with a bounding surface and an in-out orientation." (p. 29) Quite a lot of reasoning is based on metaphors of the form X IS A CONTAINER, and it seems like I'm using the CONCEPTUAL SYSTEM IS A CONTAINER metaphor. And I think others are, too. But why should a conceptual system be a container? Why should it have an inside and an outside? So I'm actively on the lookout for systems that are partial, fuzzy, inconsistent - but nevertheless useful.
## Posted at 12:23 in category /misc
I know her premise, but I too often forget it in the heat of the moment. Maybe if I write it 500 times, it will become a habit. I'll start with one time:
## Posted at 10:12 in category /misc
Wed, 26 Nov 2003

I believe in social trends. The trend these days, certainly in the USA, is to cry 'Havoc,' and let slip the dogs of war. You see it in politics, you hear it on the radio, and - I believe - it's increasingly common in the software world. (Not that it was unknown before, mind you - I think net.flame was created in 1982.) Too many people's ideas are being wrongly discounted because of who they are, who they associate with, who they have sympathy for, what other ideas they have, or other trivia.

I don't think I can persuade anyone to sip from the half-full glass if they prefer to smash it because it's half empty. But I have been on the lookout for tools that can help those inclined to take that sip. (Yeesh, that's self-righteous. Sorry.) One I found is based on Richard Rorty's notion of a "final vocabulary". I invite you to read my essay on it.
## Posted at 11:45 in category /misc
Mon, 24 Nov 2003

Here are some thoughts about my topics and exercises for the Master of Fine Arts in Software trial run. They are tentative.

The topics are driven by a position I take. It is that requirements do not represent the problem to be solved or the desires of the customers. Further, designs are not a refinement of the requirements, for practical purposes. Neither is code. And tests don't represent anything either. Rather, all artifacts (including conversation with domain experts) are better viewed as "triggers" that cause someone to do something (with varying degrees of success). Representation and refinement don't enter into it (except in the sense that we tell stories about them). So both requirements and system-level tests are ways of provoking programmers to do something satisfying. And code is something that, when later programmers have to modify it, triggers them to do it in a more useful or less useful way.

In practical terms, I am thinking of covering these topics:
## Posted at 08:05 in category /misc
Fri, 14 Nov 2003

In praise of Hiroshi Nakamura, and the Ruby community

I'm writing an article about doing exploratory testing of APIs using Ruby. The article shows how to test the Google web services API, so I'm using Hiroshi Nakamura's SOAP4R package. I found and reported a bug in that package. 23 minutes later, he replied with a patch. A reviewer of the article had a question about proxy servers. Less than 20 minutes later, he had a response. Other emails to Hiroshi have led to similarly pleasant results.

Hiroshi is emblematic of the best spirit of the Ruby community. One thing I noticed right off the bat when learning Ruby is how helpful the Ruby community is. I hope we can keep it up. (I'm feeling really regretful that I didn't make it to RubyConf this year.)
## Posted at 19:36 in category /misc
Thu, 13 Nov 2003

Christian Sepulveda on coaching

Christian Sepulveda writes what I want to call a mission statement for agile coaching, except that "mission statement" is too often a synonym for "vacuous", which this is not.
Were I starting as a coach, I'd hand Christian's essay out to the team and say, "Here. Tell me when I don't live up to this."

Christian follows the mission statement with some practical guidelines. He talks about identifying and understanding stakeholders. At PLoP, Jeff Patton walked us through a two-hour example of doing just that, using a variant of Constantine/Lockwood-style usage-centered design. Shameless plug: Jeff has written an article on that topic for STQE magazine. It'll be out in the January issue.
## Posted at 11:02 in category /agile
Wed, 12 Nov 2003

New blogger Jonathan Kohl has a different way of explaining my four categories of agile testing. He uses a tree instead of a 2-way matrix. I think I like his approach better than mine. It provides an appealing sequence for presenting the ideas. The root of the tree sets the stage. The first two branches emphasize the importance of engaging with two worlds:
Then you introduce another distinction: between testing as support for other people during their work and testing as a critique of something those people have completed. You can dive into the details according to the interests of the audience: "which of these do you want to talk about?" I may give this tree a try some time, although I'm still fond of "business-facing product critique". Rolls so elegantly off the tongue, don't you think?

Calgary, Canada - where Jonathan lives - is, by the way, a real hotbed of agile testing activity. It also happens to be where XP Agile Universe will be next year. Let's hope a lot of agile testers come.
## Posted at 12:48 in category /agile
Sat, 08 Nov 2003
I sent my first email in 1975 (maybe 1976), and I think I'm well past the knee in the exponential curve. If I've been tardy about responding to your email, now you know why. Alack, alas: I'm a helpless victim of the laws of nature.
## Posted at 16:24 in category /misc
Thu, 06 Nov 2003

I find the following quote (from Laurent Bossavit, part of a longer post) pretty evocative, though I can't quite say yet what it evokes.
Blogs on testing from Microsoft (via Tim van Tongeren): Sara Ford, Josh Ledgard, and Joe Bork.
## Posted at 07:43 in category /misc
Wed, 05 Nov 2003

Alan Francis and Christian Sepulveda comment on my note about Agile Alliance public relations.

Via Jason Kottke, Esther Derby has a wonderful set of guidelines for learning. I much like Esther's addition: "make the most generous possible interpretation".

Ben Hyde has a discussion of what happens as firms get ahead of their customers: "The point of this chart is that over time a firm's product offerings begin to outstrip the ability of the customers to absorb all the functionality the product offers."
## Posted at 08:09 in category /misc
Sun, 02 Nov 2003

A new role within the Agile Alliance

Completely to my surprise, I was elected vice-chair of the Agile Alliance nonprofit at the recent board meeting. I think this marks the definitive tipping point: I'm no longer a testing person interested in agile methods. I'm an agile methods person interested in testing.

I'm an odd duck, in that I am attracted to the revolutionary and the iconoclastic, yet I also want to be - and be seen as - reasonable and sensible. My first main task will be one that combines the two desires.

The agile methods are, I think, well-established among the enthusiast and visionary segments of the technical communities. Left to themselves, the agile programmers, testers, and line managers would keep successfully pushing agile methods into the mainstream. Where the agile methods are not established is among the ranks of the CIOs, CFOs, and CEOs. That's sad, because one of the things that first struck me about the agile methods was the fervor with which rank techies seemed to care about things like ROI and making the "gold owners" truly happy. That was certainly something I'd not seen before. And yet the CxOs don't know that. The message they hear from the heavyweight competitors to the agile methods is that agile projects are havens for mad hackers who can't be trusted. Or they hear the message that no kind of development really works, so you might as well get the inevitable dissatisfaction for 1/5 the salary cost by going offshore.

So the technical communities are not being left to themselves. We have to counter those messages. The agile methods need better PR directed at the executive suite. I'm the chair of a committee within the Agile Alliance board. Within a month, we are to deliver a proposed approach and budget for better PR. If you have ideas to suggest, please send them to me. Lord knows that speaking to CxOs is not my strong suit.
## Posted at 11:25 in category /agile
Tue, 21 Oct 2003

I downloaded a copy of IntelliJ IDEA a few days before Ward Cunningham and I were scheduled to do a presentation on test-first programming at PNSQC. I must say that it's the only IDE that hasn't made me run screaming back to Emacs.

It seems, however, kind of flaky on the Mac. The first time I launched it, it wouldn't let me dismiss one of the floating windows, but it's worked fine all later times. It's crashed a few times, but without losing anything. And, literally just before our presentation, we wanted to enlarge the font. I'm still not quite sure how we did it, but we changed the font instead. The result: which menu do you think has the command that takes you to a declaration? And when we pulled down a menu, we got something with this information content: [a row of unreadable glyphs].

Fortunately, we had presence of mind, Emacs, and we remembered that the little green arrow on the right meant "run the FIT tests". So we edited code in Emacs, ran tests with IDEA, and a good time was had by all. And we later had Andy Tinkham to help Ward get the JUnit tests working in Eclipse while I pontificated to distract the audience. Not the most auspicious day for IDEA, but I'd still use it for Java coding if I could get some employer to buy it for me. At the current price, with the Mac flakiness, it doesn't quite cut it for this independent consultant. (It was not too hard to figure out how to correct the problem - just not in the heat of the moment, with maybe 100 people watching. So I have a usable copy for the next few days.)

(We were also using Word to write FIT tables. Word crashed during the presentation. Moreover, we stripped off all Word's toolbars to remove clutter. Now, when I get home, I discover that I cannot add back the Drawing toolbar. When I do, Word just hangs. I guess I have nothing better to do than reinstall.)

(Seems harsh to be more snarkish to Microsoft than to JetBrains, makers of IDEA. But I bet JetBrains would trade some goodwill for monopoly positioning and a huge cash hoard, some of which could be usefully spent on Word for the Macintosh, which is - on my machine - markedly buggier than the Windows version. Please do better, Redmond guys.)
## Posted at 09:13 in category /misc
Mon, 06 Oct 2003

If you subscribed to this blog's RSS feed but never got any updates, it's because I used a bad relative link. It caused you to only get updates to category "blog". This is the first one ever. Change your subscription to this: http://www.exampler.com/old-blog/index.rss

If you have been getting updates, don't change anything. Sorry.
## Posted at 19:59 in category /blog
Sun, 05 Oct 2003

Agile testing directions: Postscript
The last part (hurray!) of a series

Thus ends my essay on where agile testing is going and should go. I want to reemphasize that I fully expect I'll look back on it in five years and think "How naïve". That's always been the case in the past. Why should the future be different? I like being wrong, as long as the wrongness is a step along a productive path. I feel that way about this essay. I feel good about the direction my work will now take me. I hope this flood of words is also useful to others.
## Posted at 09:19 in category /agile
Via Keith Ray, something from Jim Little about continuous learning. A practical approach that I've not seen before.
## Posted at 09:19 in category /misc
Sat, 04 Oct 2003

Agile testing directions: Testers on agile projects
Part 7 of a series

Should there be testers on agile projects? First: what's the alternative? It is to have non-specialists (programmers, business experts, technical writers, etc.) perform the activities I've identified in this series: helping to create guiding examples and producing product critiques. Or, symmetrically, it's to have testers who do programming, business analysis, technical writing, etc. It's to consider "testing" as only one set of skills that needs to be available, in sufficient quantity, somewhere in the team, to service all the tasks that require those skills. Why would non-specialists be a bad idea? Here are some possible reasons:
Argument

Let me address minimum required skills and comparative advantage first. These arguments seem to me strongest in the case of technology-facing product critiques like security testing or usability testing. On a substantial project, I can certainly see the ongoing presence of a specialist security tester. On smaller projects, I can see the occasional presence of a specialist security tester. (The project could probably not justify continual presence.)

As for the exploratory testers that I'm relying on for business-facing product critiques, I'm not sure. So many of the bugs that exploratory testers (and most other testers) find are ones that programmers could prevent if they properly internalized the frequent experience of seeing those bugs. (Exploratory testers - all testers - get good in large part because they pay attention to patterns in the bugs they see.) A good way to internalize bugs is to involve the programmers in not just fixing but also in finding them. And there'll be fewer of the bugs around if the testers are writing some of the code. So this argues against specialist testers. Put it another way: I don't think that there's any reason most people cannot have the minimum required exploratory testing skills. And the argument from comparative advantage doesn't apply if mowing your lawn is good basketball practice.

That doesn't say that there won't be specialist exploratory testers who get a team up to speed and sometimes visit for check-ups and to teach new skills. It'd be no different from hiring Bill Wake to do that for refactoring skills, or Esther Derby to do that for retrospectives. But those people aren't "on the team".

I think the same reasoning applies to the left side of the matrix - technology-facing checked examples (unit tests) and business-facing checked examples (customer tests). I teach this stuff to testers. Programmers can do it. Business experts can do it, though few probably have the opportunity to reach the minimum skill level. But that's why business-facing examples are created by a team, not tossed over the wall to one. In fact, team communication is so important that it ought to swamp any of the effects of comparative advantage. (After all, comparative advantage applies just as well to programming skills, and agile projects already make a bet that the comparative advantage of having GUI experts who do only GUIs and database experts who do only databases isn't sufficient.)

Now let's look at innate aptitude. When Jeff Patton showed a group of us an example of usage-centered design, one of the exercises was to create roles for a hypothetical conference paper review system. I was the one who created roles like "reluctant paper reviewer", "overworked conference chair", and "procrastinating author". Someone remarked, "You can tell Brian's a tester". We all had a good chuckle at the way I gravitated to the pessimistic cases. But the thing is - that's learned behavior. I did it because I was consciously looking for people who would treat the system differently than developers would likely hope (and because I have experience with such systems in all those roles). My hunch is that I'm by nature no more critical than average, but I've learned to become an adequate tester. I think the average programmer can, as well. Certainly the programmers I've met haven't been notable for being panglossian, for thinking other people's software is the best in this best of all possible worlds.
But it's true that an attack-dog mentality usually applies to other people's software. It's your own that provokes the conflict of emotional interest. I once had Elisabeth Hendrickson doing some exploratory testing on an app of mine. I was feeling pretty cocky going in - I was sure my technology-facing and business-facing examples were thorough. Of course, she quickly found a serious bug. Not only was I shocked, I also reacted in a defensive way that's familiar to testers. (Not harmfully, I don't think, because we were both aware of it and talked about it.) And I've later done some exploratory testing of part of the app while under a deadline, realized that I'd done a weak coding job on an "unimportant" part of the user interface, then felt reluctant to push the GUI hard because I really didn't want to have to fix bugs right then.

So this is a real problem. I have hopes that we can reduce it with practices. For example, just as pair programming tends to keep people honest about doing their refactoring, it can help keep people honest about pushing the code hard in exploratory testing. Reluctance to refactor under schedule pressure - leading to accumulating design debt - isn't a problem that will ever go away, but teams have to learn to cope. Perhaps the same is true of emotional conflict of interest.

Related to emotional conflict of interest is the problem of useful ignorance. Imagine it's iteration five. A combined tester/programmer/whatever has been working with the product from the beginning. When exploring it, she's developed habits. If there are two ways to do something, she always chooses one. When she uses the product, she doesn't make many conceptual mistakes, because she knows how the product's supposed to work. Her team's been writing lots of guiding examples - and as they do that, they've been building implicit models of what their "ideal user" is like, and they have increasing trouble imagining other kinds of users.

This is a tough one to get around. Role playing can help. Elisabeth Hendrickson teaches testers to (sometimes) assume extreme personae when testing. What would happen if Bugs Bunny used the product? He's a devious troublemaker, always probing for weakness, always flouting authority. How about Charlie Chaplin in Modern Times: naïve, unprepared, pressured to work ever faster? Another technique that might help is Hans Buwalda's soap opera testing. It's my hope that such techniques will help, especially when combined with pairing (where each person drives her partner to fits of creativity) in a bullpen setting (where the resulting party atmosphere will spur people on). But I can't help but think that artificial ignorance is no substitute for the real thing.

Staffing

So. Should there be testers on an agile project? Well, it depends. But here's what I would like to see, were I responsible for staffing a really good agile team working on an important product. Think of this as my default approach, the prejudice I would bring to a situation.
Are there testers on this team, once it jells? Who cares? - there will be good testing, even though it will be increasingly hard to point at any activity and say, "That. That there. That's testing and nothing but." Disclaimers
## Posted at 12:20 in category /agile
Thu, 02 Oct 2003

Master of Fine Arts in Software

Richard Gabriel has been pushing the idea of a Master of Fine Arts in Software for some time. It now looks as if the University of Illinois is seriously considering the idea of offering such a degree (which they would prefer to call "Master of Software Arts"). There is likely to be a trial run in early January. If that goes well, the next step is a full-fledged University program.

A software MFA would be patterned after a non-residential MFA. Twice a year, students would come to campus for about ten days. They'd do a lot of work. They'd get a professor with whom they'd work closely-but-remotely for the next six months. Repeat several times. An MFA will differ from conventional software studies in several ways:
Gabriel says: "The way that the program works is for each student to spend a lot of time explicitly thinking about the craft elements in any piece of software, design, or user interface... It is this explicit attention to matters of craft that matures each student into an excellent software practitioner."

Ralph Johnson and I are the local organizers, also two of the instructors. We're looking for students. This first trial run, we're especially looking for people with lots of experience and reputation. If those people get value, everyone can. And if they say they got value, the future program will get lots of students. Who wouldn't want to attend a program that people like Ward Cunningham or Dave Thomas or Andy Hunt or Martin Fowler or Mike Clark or Michael Feathers or Eric Evans or Paul Graham said was worth their time?

I've set up a Yahoo Groups mailing list called software-mfa. You can subscribe by sending mail to software-mfa-subscribe@yahoogroups.com.

As you can see, my efforts to become a hermit and not get involved in organizing things are proceeding well. But this is a chance to make an important difference.
## Posted at 08:35 in category /misc
Wed, 01 Oct 2003

At Ralph Johnson's suggestion, I wrote a position paper for a National Science Foundation workshop on the science of design. ("The workshop objective is to help the NSF better define the field and its major open problems and to prioritize important research issues.") On the principle that no good deed should go unblogged, here's a PDF copy of my position paper, "A Social Science of Design". My position is that one Science of Design should be "a science of people doing design... more akin to anthropology or social studies of scientific practice than to physics... A successful research program would win the Jolt software productivity award as well as help someone gain tenure."

The last time I was at a "how should government fund software research" workshop, I had absolutely no effect with a similarly iconoclastic and populist proposal. If I'm accepted this time, we'll see if my presentation and argumentation skills have improved in the last two decades. (They can't have gotten worse.)
## Posted at 08:13 in category /misc
Sat, 27 Sep 2003

Champaign-Urbana Agile study group

Some of us have been talking about creating an Agile methods study group for the C-U area. We want to get started. To that end, I'll be giving a preview performance of a talk on agile testing I'll be giving at the Pacific Northwest Software Quality Conference. It's titled "A Year in Agile Testing", but it might be better titled "Where I Think Testing in Agile Projects is Going". Readers of my "agile testing directions" series will know what to expect. (Scroll down from here for an abstract.) The talk is sized to fit a keynote slot, so it's about 1.5 hours long (including question time). Afterwards, we'll take off to a restaurant and set up the next study group session.
When: Thursday, October 2, 5:30 PM

Motorola wants to know roughly how many people are coming, so let me know if you are. (Yes, I know I was going to become a hermit.)
## Posted at 17:33 in category /agile
Thu, 25 Sep 2003

Agile testing directions: technology-facing product critiques
Part 6 of a series

As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm finishing the right side of the matrix with product critiques that face technology more than the business.

I've described exploratory testing as my tool of choice for business-facing product critiques. But while it may find security problems, performance problems, bugs that normally occur under load, usability problems (like suitability for color-blind people), and the like, I wouldn't count on it. Moreover, these "ilities" or non-functional or para-functional requirements tend to be hard to specify with examples. So it seems that preventing or finding these bugs has been left out of the story so far. Fortunately, there's one quadrant of the matrix left. (How convenient...)

The key thing, I think, about such non-functional bugs is that finding them is a highly technical matter. You don't just casually pick up security knowledge. Performance testing is a black art. Usability isn't a technical subject in the sense of "requiring you to know a lot about computers", but it does require you to know a lot about people. (Mark Pilgrim's Dive Into Accessibility is a quick and pleasant introduction that hints at how much richness there is.)

Despite what I often say about agile projects favoring generalists, these are areas where it seems to me they need specialists. If security is important to your project, it would be good to have visits from security experts, people who have experience with security in many domains. (That is, security knowledge is more important than domain knowledge.) These people can teach the team how to build in security as well as test whether it has in fact been built in.

(It's interesting: my impression is that these fields have less of a separation between the "design" and "critique" roles than does straight functional product development. It seems that Jakob Nielsen writes about both usability design and usability testing. The same seems true of security people like Gary McGraw and maybe Bruce Schneier, although James Whittaker seems to concentrate on testing. I wonder if my impression is valid? It seems less true of performance testers, though the really hot-shot performance testers I know seem to me perfectly capable of designing high-performance systems.)

So it seems to me that Agility brings nothing new to the party. These specialties exist, they're developed to varying degrees, they deserve further development, and their specialists have things well in hand. It may be a failure of imagination, but I think they should continue on as they are.
It seems I've finished my series about future directions for Agile testing. But there remains one last installment: do I think, in the end, that there should be testers in Agile projects? It's a hot issue, so I should address it.
## Posted at 16:06 in category /agile
Wed, 24 Sep 2003

Agile testing directions: business-facing product critiques
Part 5 of a series

As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm starting to write about the right side of the matrix: product critiques.

Using business-facing examples to design products is all well and good, but what about when the examples are wrong? For wrong some surely will be. The business expert will forget some things that real users will need. Or the business expert will express needs wrongly, so that programmers faithfully implement the wrong thing. Those wrongnesses, when remembered or noticed, might be considered bugs, or might be considered feature requests. The boundary between the two has always been fuzzy. I'll just call them 'issues'. How are issues brought to the team's attention?
These feedback loops are tighter than in conventional projects because agile projects like short iterations. But they're not ideal. The business experts may well be too close to the project to see it with fresh and unbiased eyes. Users often do not report problems with the software they get. When they do, the reports are inexpert and hard to act upon. And the feedback loop is still less frequent than an agile project would like. People who want instant feedback on a one-line code change will be disappointed waiting three months to hear from users.

For that reason, it seems useful to have some additional form of product critique - one that notices what the users would, only sooner. The critiquers have a resource that the people creating before-the-fact examples do not: a new iteration of the actual working software. When you're describing something that doesn't exist yet, you're mentally manipulating an abstraction, an artifact of your imagination. Getting your hands on the product activates a different type of perception and judgment. You notice things when test-driving a car that you do not notice when poring over its specs. Manipulation is different than cogitation.

So it seems to me that business-facing product critiques should be heavy on manipulation, on trying to approach the actual experience of different types of users. That seems to me a domain of exploratory testing in the style of James Bach, Cem Kaner, Elisabeth Hendrickson, and others. (I have collected some links on exploratory testing, but the best expositions can be found among James Bach's articles.) Going forward, I can see us trying out at least five kinds of exploratory testing:
For each of these, we should explore the question of when the tester should be someone from outside the team, someone who swoops in on the product to test it. That has the advantage that the swooping tester is more free of bias and preconceptions, but the disadvantage that she is likely to spend much time learning the basics. That will skew the type of issues found.

When I first started talking about exploratory testing on agile projects, over a year ago, I had the notion that it would involve both finding bugs and also revealing bold new ideas for the product. One session would find both kinds of issues. For a time, I called it "exploratory learning" to emphasize this expanded role. I've since tentatively concluded that the two goals don't go together well. Finding bugs is just too seductive - thinking about feature ideas gets lost in the flow of exploratory testing. Some happens, but not enough. So I'm thinking there needs to be a separate feature brainstorming activity. I have no particularly good ideas now about how to do that. "More research is needed."
## Posted at 14:01 in category /agile
Tue, 23 Sep 2003

I don't pair program much. I'm an independent consultant, I live at least 500 miles (1000 kilometers) from almost all my clients, and I can't be on-site for more than a quarter of the time. (This was easier to pull off during the Bubble.) So it's hard to get the opportunity to pair. Most usually when I pair, one or the other of us knows the program very well. But once, when I was pairing with Jeremy Stell-Smith, neither of us knew the program that well. And I got an interesting feeling: I didn't feel confident that I really had a solid handle on the change we were making, and I didn't feel confident that Jeremy did either, but I did feel confident - or more confident - that the combination of Jeremy, me, and the tests did. It was a weird and somewhat unsettling feeling.

That reminds me now of something Ken Schwaber said in Scrum Master training - that one of the hardest things for a Scrum Master to do is to sit back, wait, and trust that the team can solve the problem. It's trust not in a single person, but in a system composed of people, techniques, and rules.

All this came to mind when I read a speech by Brian Eno describing what he calls "generative music". I don't think it's too facile to say that his composition style is to conventional composition as agile methods are to conventional software development. (Well, maybe it is facile, but perhaps good ideas can result from the comparison.) They both involve setting up a system, letting it rip, observing the results without attempting to control the process, and tweaking in response. There is, again, a loss of control that I like intellectually but still sometimes find unsettling. Here's that other Brian:
Sounds cool, right? But then there's this, where he demos a composition. Remember, he only puts in rules and starting conditions, then lets the thing generate on its own:
You, dear reader, may not have ever done a live demo. But if you have, I bet Eno's experience hits home: "Observe this!... um, it usually works... (Gut clenches)" Surely agile projects run into this problem on a slower time scale: "We're going to self-organize, be generative, we're a complex adaptive system, just watch... um, it usually works. (Gut clenches)"

Agile development involves bets. (The XP slogan "You aren't going to need it" should really be stated "On average, you'll need it seldom enough that the best bet is that you won't need it".) Sometimes the bet doesn't pay off. I believe that, over the course of most decent-sized projects, it will. But surely there will be single iterations that collapse into silence. I don't think enough is said about how to cope with that.
## Posted at 20:14 in category /agile
Sat, 20 Sep 2003

The philosopher Ian Hacking on definitions, in a section called "Don't First Define, Ask for the Point" in his The Social Construction of What? (pp. 5-6):

... Take 'exploitation'. In a recent book about it, Alan Wertheimer does a splendid job of seeking out necessary and sufficient conditions for the truth of statements of the form 'A exploits B'. He does not quite succeed, because the point of saying that middle-class couples exploit surrogate mothers, or that colleges exploit their basketball stars on scholarships... is to raise consciousness. The point is less to describe the relation between colleges and stars than to change how we see those relations. This relies not on necessary and sufficient conditions for claims about exploitation, but on fruitful analogies and new perspectives.

In that light, consider definitions of "agile methods", "agile testing", "exploratory testing", "testability", and the like: what's the point of making the definition? What change is the maker trying to encourage?

The Poppendiecks on the construction analogy and lean construction (via Chris Morris):
Malcolm Nicolson and Cathleen McLaughlin, in "Social constructionism and medical sociology: a study of the vascular theory of multiple sclerosis" (Sociology of Health and Illness, Vol. 10 No. 3, 1988, p. 257 [footnote 15]): "In order for technical knowledge to [be given credit] it has to be able to move people as well as things."

Laurent Bossavit on models: "But just because diagrams and models have abstraction in common isn't enough to call diagrams models."
## Posted at 09:58 in category /misc
Mon, 15 Sep 2003

Christian Sepulveda on "What qualifies as an agile process?": "I feel these guidelines offer a different perspective than the elements of the manifesto. For example, communication and collaboration are desirable because they promote discovery and provide feedback. As I consider the experiences I would characterize as 'agile', I am better able to articulate their 'agility' in terms of these guidelines."

Martin Fowler on application boundaries: "I don't think applications are going away for the same reasons why application boundaries are so hard to draw. Essentially applications are social constructions."

Michael Hamman has comments on my earlier post about "the reader in the code". His description of how musicians make commentaries on musical scores is fascinating. I want to see it sometime, mine it for ideas. Michael also elaborates on his earlier post on breakdowns. I'll have to ponder that for a while.

Finally, amidst today's posturing and self-righteous certainty - the conversion of real events into mere fodder for argument - a reminder of the reflexive unity of people in the moment.
## Posted at 10:57 in category /misc
Sat, 13 Sep 2003

This year's Pattern Languages of Programs conference is over. (I was program chair.) Thanks to everyone who attended and made it work. For me, the highlight of the conference was when Jeff Patton led some of us through a two-hour example of his variant of Constantine/Lockwood-style usage-centered design. I also liked Linda Rising's demonstration of a Norm Kerth-style retrospective (applied to PLoP itself). At Richard Gabriel's suggestion, we brought in Linda Elkin, an experienced teacher of poets. She and Richard (also a poet, as well as a Distinguished Engineer at Sun) taught us a great deal about clarity in writing. Finally, Bob Hanmer and Cameron Smith stepped up to keep the games tradition of PLoP alive. As you can see, PLoP is no ordinary conference.

Thus ends my summer of altruistically organizing (or under-organizing) people. I need a break. So I'm going to do what I do when I need a break: write code. I'll be spending half my non-travelling time building a new app. Since I'm now an expert on usage-centered design (that's a, I say, that's a joke, son), I'm going to start with that. But wait... we've been talking about starting a local Agile study group, and I think that's really important, and I bet it wouldn't be that much work...

P.S. The postings on agile testing directions will continue. Just don't expect me to organize a conference on the topic.
## Posted at 09:23 in category /misc
Fri, 05 Sep 2003

The Marquis de Sade and project planning

I dabble in science studies (a difficult field to define, so I won't try) partly because it causes me to read weird stuff. Last year, I read "Sade, the Mechanization of the Libertine Body, and the Crisis of Reason", by Marcel Henaff.1 Here's a quote about Sade's obsession with numerical descriptions of body parts and orgies: "It is as if excessive precision was supposed to compensate for the rather obvious lack of verisimilitude of the narrated actions."

The same could be said of most project plans. Affix this quote to the nearest PERT chart.

1 In Technology and the Politics of Knowledge, Feenberg & Hannay eds., 1995.
## Posted at 14:32 in category /misc
Agile testing directions: business-facing team support
Part 4 of a series

As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm writing about how we can use business-facing examples to support the work of the whole team (not just the programmers)1.

I look to project examples for three things: provoking the programmers to write the right code, improving conversations between the technology experts and the business experts, and helping the business experts more quickly realize the possibilities inherent in the product. Let me take them in turn.
One of my two (maybe three) focuses next year will be these business-facing examples. I've allocated US$15K for visits to shops that use them well. If you know of such a shop, please contact me. After these visits (and after paid consulting visits and after practicing on my own), I want to be able to tell stories:
Only when we have a collection of such stories will the practice of using business-facing examples be as well understood, be as routine, as is the practice of technology-facing examples (aka test-driven design).

1 I originally called this quadrant "business-facing programmer support". It now seems to me that the scope is wider - the whole team - so I changed the name.

2 I confess I've only read parts of Eric's book, in manuscript. The final copy is in my queue. I think I've got his message right, though.
## Posted at 14:04 in category /agile
Fri, 29 Aug 2003

Christian Sepulveda writes about comments in code: "Not all comments are bad. But they are generally deodorant; they cover up mistakes in the code. Each time a comment is written to explain what the code is doing, the code should be re-written to be more clean and self explanatory."

That reminded me of the last time someone mentioned to me that some code needed comments. That was when that someone and I were looking at Ward Cunningham's FIT code. The code made sense to me, but it didn't to them. You could say that's just because I've seen a lot more code, but I think that's not saying enough. My experience makes me a particular kind of code reader, one who's primed to get some of the idioms and ideas Ward used. I knew how to read between the lines.

Let me expand on that with a different example. Here's some C code:

  int fact(int n) {   // caller must ensure n >= 0
      int result = 1;
      for (int i = 2; i <= n; i++)
          result *= i;
      return result;
  }

I think a C programmer would find that an unsurprising and unobjectionable implementation. Suppose now that I transliterate it into Lisp:

  (defun fact (n)   ; caller must ensure (>= n 0)
    (let ((result 1))
      (do ((i 2 (+ i 1)))
          ((> i n))
        (setq result (* result i)))
      result))

This code, I claim, would have a different meaning to a Lisp programmer. When reading it, questions would flood her mind. Why isn't the obvious recursive form used? Is it for efficiency? If so, why aren't the types of the variables declared? Am I looking at this because a C programmer wrote the code, someone who doesn't "get" Lisp?

A Lisp programmer who cared about efficiency would likely use an optional argument to pass along intermediate results. That would look something like this:

  (defun fact (n &optional (result-so-far 1))   ; caller must ensure (>= n 0)
    (if (<= n 1)
        result-so-far
        (fact (- n 1) (* result-so-far n))))

(I left out variable declarations.) Unless my 18-year-old memories of reading Lisp do me wrong, I'd read that function like this:

  (defun fact (n &optional (result-so-far 1))
"OK, looks like an accumulator argument. This is probably going to be recursive..." (if (<= n 1) result-so-far
"Uh-huh. Base case of the recursion." (fact (- n 1)
"OK. Tail-recursive. So she wants the compiler to turn the recursion into a loop. Either speed is important, or stack depth is important, or she's being gratuitously clever. Let's read on." With that as preface, let me both agree and disagree with Christian. I do believe that code with comments should often be written to be more self-explanatory. But code can only ever be self-explanatory with respect to an expected reader. Now, that in itself is kind of boringly obvious. What's obvious to you mightn't be obvious to me if we've had different experiences. And the obvious consequences aren't that exciting either: The more diverse your audience, the more likely you'll need comments. Teams will naturally converge on a particular "canonical reader", but perhaps that process could be accelerated if people were mindful of it. We could do more with the idea. The line by line analysis I gave above was inspired by the literary critic Stanley Fish. He has a style of criticism called "affective stylistics". In it, you read something (typically a poem) word by word, asking what effect each word (and punctuation mark, and line break...) will have on the canonical reader's evolving interpretation of the poem. To Fish, the meaning of the poem is that evolution. I don't buy this style of criticism, not as a total solution, but it's awfully entertaining and I have this notion that people practiced in it might notice interesting things about code. Affective stylistics is part of a whole branch of literary criticism (for all I know, horribly dated now) called "reader-response criticism". There are many different approaches under that umbrella. I've wanted for a few years to study it seriously, apply it to code, and see what happened. But, really, it seems unlikely I'll ever get the time. If there's any graduate student out there who, like me at one time, has one foot in the English department and one in the Computer Science department, maybe you'll give it a try. (Good luck with your advisor...) And maybe this is something that could fit under the auspices of Dick Gabriel's Master of Fine Arts in Software, if that ever gets established. Recommended reading:
## Posted at 15:31 in category /misc
[permalink]
[top]
Thu, 28 Aug 2003

Michael Feathers on "Stunting a framework":

    The next time you are tempted to write and distribute a framework, run a little experiment. Imagine the smallest useful set of classes you can create. Not a framework, just a small seed, a seedwork. Design it so that it is easy to refactor. Code it and then stop. Can you explain it to someone in an hour? Good. Now, can you let it go? Can you really let it go?

Christian Sepulveda on "Testers and XP: Maybe we are asking the wrong question":

    ... there are other agile practices that address these other concerns and work in harmony with XP. Scrum is the best example. Scrum is about project management, not coding. When I am asked about the role of project managers in XP, I suggest Scrum.

I like Christian's idea of finding a style of testing that's agile-programmer-compatible in the way that Scrum is a style of management that's agile-programmer-compatible. It feels like a different way of looking at the issue, and I like that feeling.

Michael Hamman talks of Heidegger and Winograd & Flores in "Breakdown, not a problem":

    Because sometimes our flow needs to be broken - we need to be awoken from our "circumspective" slumber. This notion underlies many of the great tragedies, both in literature and in real life. We are going along in life when something unexpected, perhaps even terrible, occurs. Our whole life is thrown into relief - all of the things, the people, and qualities that we once took for granted suddenly become meaningful and important to us. Our very conscious attitude toward life shifts dramatically. Something to think about: what sort of breakdowns would improve your work?

I'm wondering - help me out, Michael - how to fit my talk of test maintenance (within this post) into the terminology of Heidegger and Winograd & Flores. The typical response to an unexpectedly broken test is to fix it in place to match the new behavior. Can I call that circumspective? I prefer a different response, one that makes distinctions: is this one of those tests that should be fixed in place, or is it one that should be moved, or one that should be deleted? Is that attending to a breakdown? And should we expect that a habit of attending to that kind of breakdown would lead to (first) an explicit and (over time) a tacit team understanding of why you do things with tests? And does that mean that handling broken tests would turn back into the fast "flow" of circumspective behavior?

Cem Kaner's "Software customer's bill of rights".

Greg Vaughn comments on my unit testing essay. Terse and clear: a better intro to the idea than I've managed to write. (Note how he puts an example - a story - front and center. That's good writing technique.)
## Posted at 09:39 in category /misc
[permalink]
[top]
Wed, 27 Aug 2003

Agile testing directions: technology-facing programmer support
Part 3 of a series

As an aid to conversation and thought, I've been breaking one topic, "testing in agile projects," into four distinct topics. Today I'm writing about how we can use technology-facing examples to support programming.

One thing that fits here is test-driven development, as covered in Kent Beck's book of the same name, David Astels's more recent book, and forthcoming books by Phlip, J.B. Rainsberger, and who knows who else. I think that test-driven development (what I would now call example-driven development) is on solid ground. It's not a mainstream technique, but it seems to be progressing nicely toward that. To use Geoffrey Moore's term, I think it's well on its way to crossing the chasm. (Note: in this posting, when I talk of examples, I mean examples of how coders will use the thing-under-development. In XP terms, unit tests. In my terms, technology-facing examples.)

Put another way, example-driven development has moved from being what Thomas Kuhn called "revolutionary science" to what he called "normal science". In a normal science, people expand the range of applicability of a particular approach. So we now have people applying EDD (sic) to GUIs, figuring out how it works with legacy code, discussing good ways to use mock objects, having long discussions about techniques for handling private methods, and so forth. Normal science is not the romantic side of science; it's merely where ideas turn into impact on the world. So I'm glad to see we're there with EDD. But normality also means that my ideas for what I want to work on or see others work on... well, they're not very momentous.
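For a taste of that normal science, here's the shape of the mock-object discussions. A minimal, hand-rolled sketch in Ruby - Report, MockPrinter, and the printer protocol are all invented for the illustration, not taken from any of the books above:

    require 'test/unit'

    # A hand-rolled mock: it stands in for a real printer and records
    # what the code under test asks of it.
    class MockPrinter
      attr_reader :lines
      def initialize
        @lines = []
      end
      def print_line(text)
        @lines << text
      end
    end

    class Report
      def initialize(accounts)
        @accounts = accounts
      end
      def print_on(printer)
        @accounts.each { |a| printer.print_line(a.to_s) }
      end
    end

    class ReportExampleTests < Test::Unit::TestCase
      def test_report_prints_one_line_per_account
        printer = MockPrinter.new
        Report.new(['checking', 'savings']).print_on(printer)
        assert_equal(2, printer.lines.length)
      end
    end

The normal-science arguments are about when a stand-in like MockPrinter clarifies an example and when it just insulates the example from reality.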
I said above that test-driven development is "one thing that fits" today's topic. What else fits? I don't know. And is EDD the best fit? (Might there be a revolution in the offing?) I don't know that either - I'll rely on iconoclasts to figure that out. I'm very interested in listening to them.
## Posted at 15:22 in category /agile
[permalink]
[top]
Sat, 23 Aug 2003

Jim Weirich writes on stripping out the text from a program, leaving only the "line noise" (punctuation). He notes:

    What I find interesting is the amount of semantic information that still comes through the "line noise". For example, the "#<>" sequence in the C++ code is obviously an include statement for something in the standard library and the "<<" are output statements using "cout".

That reminds me of something that Ward Cunningham described. He called them "signature surveys". It's a really sweet idea, even more impressive when you see him demo it.
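Jim's experiment is nearly a one-liner in Ruby. This sketch is my guess at it - the regular expressions and the sample outputs are approximate:

    # Strip words and whitespace; keep the punctuation ("line noise").
    def line_noise(source)
      source.gsub(/[\w\s]+/, '')
    end

    # Ward's signature survey is in the same spirit, but keeps only the
    # characters that show a file's shape - braces and semicolons, roughly.
    def signature(source)
      source.gsub(/[^{};]+/, '')
    end

    # For the C factorial function earlier on this page, line_noise
    # yields something like (){//>==;(=;<=;++)*=;;} and signature
    # yields something like {;;;;;}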
## Posted at 11:23 in category /misc
[permalink]
[top]
Fri, 22 Aug 2003

Agile testing directions: tests and examples
Part 2 of a series

'It all depends on what you mean by home.'

In my first posting, I drew this matrix:

    [a four-quadrant matrix: rows business-facing / technology-facing;
     columns "support programming" (left) / "critique the product" (right)]
Consider the left to right division. Some testing on agile projects, I say, is done to critique a product; other testing, to support programming. But the meaning and the connotations of the word "testing" differ wildly in the two cases.

When it comes to supporting programming, tests are mainly about preparing and reassuring. You write a test to help you clarify your thinking about a problem. You use it as an illustrative example of the way the code ought to behave. It is, fortunately, an example that actively checks the code, which is reassuring. These tests also find bugs, but that is a secondary purpose.

On the other side of the division, tests are about uncovering prior mistakes and omissions. The primary meaning is about bugs. There are secondary meanings, but that primary meaning is very primary. (Many testers, especially the best ones, have their identities wrapped up in the connotations of those words.)

I want to try an experiment. What if we stopped using the words "testing" and "tests" for what happens in the left side of the matrix? What if we called them "checked examples" instead? Imagine two XP programmers sitting down to code. They'll start by constructing an incisive example of what the code needs to do next. They'll check that it doesn't do it yet. (If it does, something's surely peculiar.) They'll make the code do it. They'll check that the example is now true, and that all the other examples remain good examples of what the code does. Then they'll move on to an example of the next thing the code should do.

Is there a point to that switch, or is it just a meaningless textual substitution? Well, you do experiments to find these things out. Try using "example" occasionally, often enough that it stops sounding completely weird. Now: Does it change your perspective at all when you sit down to code? Does it make a difference to walk up to a customer and ask for an example rather than a test? Add on some adjectives: what do motivating, telling, or insightful examples look like, and how are they different from powerful tests? ("Powerful" being the typical adjective-of-praise attached to a test.) Is it easier to see what a tester does on an XP project when everyone else is making examples, when no one else is making tests?

Credit: Ward Cunningham added the adjective "checked". I was originally calling them either "guiding" or "coaching" examples.
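The experiment is cheap to run in code, too. A sketch in Ruby's Test::Unit - the Name class is invented for the illustration:

    require 'test/unit'

    # Not "a test of Name#to_s" but "a checked example of how names are
    # displayed": written first, watched to fail, then made true.
    class NameExamples < Test::Unit::TestCase
      def test_example_names_display_surname_first
        name = Name.new('Brian', 'Marick')
        assert_equal('Marick, Brian', name.to_s)
      end
    end

    class Name
      def initialize(first, last)
        @first, @last = first, last
      end
      def to_s
        "#{@last}, #{@first}"
      end
    end

Nothing in the tool changes; only the reading does. The method illustrates a fact about names, and happens to check it.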
## Posted at 14:38 in category /agile
[permalink]
[top]
Thu, 21 Aug 2003

... But I think these arguments, while valid, have missed another vital reason for direct developer-customer interaction - enjoyment...

... they had never before realized that physical space could have such a subtle impact on human behavior...

... An idiom, in natural language, is a ready-made expression with a specific meaning that must be learned, and can't necessarily be deduced from the terms of the expression. This meaning transposes easily to programming languages and to software in general...

... It's a mirror, made of wood...
## Posted at 16:01 in category /misc
[permalink]
[top]
At XP Agile Universe, two people - perhaps more - told me that I'm not doing enough to aid the development of Agile testing as a discipline, as a stable and widely understood bundle of skills. I spend too much time saying I don't know where Agile testing will be in five years, not enough pointing in some direction and saying "But let's see if maybe we can find it over there". They're probably right. So this is the start of a series of notes in which I'll do just that. I'm going to start by restating a pair of distinctions that I think are getting to be fairly common. If you hear someone talking about tests in Agile projects, it's useful to ask if those tests are business facing or technology facing. A business-facing test is one you could describe to a business expert in terms that would (or should) interest her. If you were talking on the phone and wanted to describe what questions the test answers, you would use words drawn from the business domain: "If you withdraw more money than you have in your account, does the system automatically extend you a loan for the difference?"
A technology-facing test is one you describe with words drawn from the domain of the programmers: "Different browsers implement Javascript differently, so we test whether our product works with the most important ones." Or: "..." (These categories have fuzzy boundaries, as so many do. For example, the choice of which browser configurations to test is in part a business decision.)

It's also useful to ask people who talk about tests whether they want the tests to support programming or critique the product. By "support programming", I mean that the programmers use them as an integral part of the act of programming. For example, some programmers write a test to tell them what code to write next. By writing that code, they change some of the behavior of the program. Running the test after the change reassures them that they changed what they wanted. Running all the other tests reassures them that they didn't change behavior they intended to leave alone.

Tests that critique the product are not focused on the act of programming. Instead, they look at a finished product with the intent of discovering inadequacies.

Put those two distinctions together and you get this matrix:

    [a four-quadrant matrix: rows business-facing / technology-facing;
     columns "support programming" (left) / "critique the product" (right)]
In future postings, I'll talk about each quadrant of the matrix. What's my best guess about how it should evolve?
## Posted at 14:44 in category /agile
[permalink]
[top]
Mon, 04 Aug 2003

Here's something I wrote for the members' newsletter of the Agile Alliance.
When I sent that text to Hans, he had some interesting comments that need to be written down somewhere public. So the above is by way of introduction, and here's what Hans had to say:
P.S.: Ken Schwaber (the newsletter editor) was kind enough to let me publish my blurb here even though the newsletter isn't out yet. Join the Agile Alliance and you can read it again! I quote Hans's email with permission.
## Posted at 14:21 in category /testing
[permalink]
[top]
Jonathan Kohl writes:
P.S. I'm told that Cem Kaner coined the phrase "change detector" at Agile Fusion. The idea has been widespread, but as the patterns people know, a catchy phrase matters a lot. Here we have one, one that I at least don't remember hearing before.
## Posted at 06:17 in category /agile
[permalink]
[top]
Sat, 02 Aug 2003I've been nominated for a seat on the board of the Agile Alliance. That brings back memories of the time I ran for student council in middle school. (Non-US readers: "middle school" is for the early teenage years.) A nerd is someone who could not possibly be elected to student council in any US school outside the Bronx High School of Science - a true nerd is someone so clueless that he doesn't even realize how hopeless it is to try. Let's just say I didn't get enough votes. Nevertheless, as I thought about the nomination I was surprised by my reaction. At the end of a summer of Altruistic Organizational Deeds that left me feeling I never want to do anything like that again, I find myself... wanting to do something like that again. The "why do I want to serve" part of the position statement below is heartfelt.
## Posted at 12:58 in category /agile
[permalink]
[top]
Fri, 01 Aug 2003Putting the object back into OOD Jim Coplien writes about teaching - and doing - object-oriented design. I don't quite get it, but I think I would like to. A taste: In the object-oriented analysis course we typified the solution component as the class structure, and the customer problem component as the Use Cases. CRC cards are the place where these two worlds come together--where struggles with the solution offer insight into the problem itself. I'd like to see an example of this style of design narrated from beginning to end. In the meantime, the article might well be of interest to people who favor prototypes over classes, prefer verbs to nouns, or are suspicious that categories really "carve nature at its joints".
## Posted at 15:12 in category /misc
[permalink]
[top]
Wed, 30 Jul 2003In early September, 2001, I was embroiled in a mailing list debate about agile methods with someone I'll call X. Here's a note I wrote on September 12, 2001:
One of my tactics in life is to publicly proclaim virtues that I then feel obliged to live up to. Lots of debates about Agility looming ahead - this public posting will force me to treat my debating opponents with charity.
## Posted at 16:28 in category /misc
[permalink]
[top]
Tue, 29 Jul 2003Evangelizing test-driven development Mike Clark has a nice posting on evangelizing test-driven development. I can vouch for his style. I taught a "testing for programmers" course from around 1994 to around 1999. I opened the course by telling how, back when I was a programmer, everyone thought I was a better programmer than my talents justified. The reason was that my tests caught so many bugs before I ever checked in. So other people didn't see those bugs. So they thought that I didn't make many.
## Posted at 14:44 in category /agile
[permalink]
[top]
Sat, 26 Jul 2003

Here are some random links that started synapses firing (but to no real effect, yet):

Martin Fowler on multiple canonical models:

    One of the interesting consequences of a messaging based approach to integration is that there is no longer a need for a single conceptual model...

Martin is speaking of technical models, but I hear an echo of James Bach's diverse half measures (PDF): "use a diversity of methods, because no single heuristic always works." Any model is a heuristic, a bet that it's very often useful to think about a system in a particular way.

Greg Vaughn on a source of resistance to agile methods:

    Agile development requires a large amount of humility. We have to trust that practices such as TDD (Test Driven Development) might lead to better software than what we could develop purely via our own creative processes. And if it doesn't then the problem might be us rather than the method. To someone whose self-image is largely centered on their own intelligence, this hits mighty close to home and evokes emotional defenses.

Laurent Bossavit on exaptation:

    In the context of software, an exaptation consists of people finding a valuable use for the software by exploiting some implementation-level behaviour which is entirely accidental and was never anticipated as a requirement.

Exaptations are interesting because I think they have to do with more than managing agreements - they're part of the process of discovering requirements as the product is being built. We have a knack for turning anything we do into an expressive medium. As a beginning driver, I was surprised to find that it was possible to blink a turn light contemptuously, or aggressively... Source code does allow one an infinite range of nuances in a restricted domain of expression: the description of problems we expect a computer to solve for us.
## Posted at 12:26 in category /misc
[permalink]
[top]
Fri, 25 Jul 2003

At Agile Fusion, the team I wasn't on built some "change detectors" for Dave Thomas's weblog. If I understand correctly, they stored snapshots of sample weblogs (the HTML, I suppose). When programmers changed something, they could check what the change detectors detected. (Did what was supposed to change actually change? Did what was supposed to stay the same stay the same?) I can't say more because I didn't really pay any attention to them. Christian Sepulveda has a blog entry that finishes by paying attention to them. He writes:
    I have started using the term "change detection" when describing automated testing. It has allowed me to "convert" a few developers to embrace (or at least really try) the technique. It has also been a good way to explain the value of it to management.

This terminology switch made me think. It's now a commonplace that test-driven design isn't about finding bugs. Instead, it's a way of thinking about your design and interface that encourages smooth, steady progress. And that thinking produces long-term artifacts (be they xUnit tests or FIT tests or whatever) that aren't really tests either - they're change detectors. They too exist to produce smooth, steady progress: make a little change to the code, ask the change detectors to give you confidence you affected only the behavior you intended to affect, make the next little change...

So? For the past year or so, I've been consciously trying to facilitate a convergence of two fields, agile development and testing. So I ask questions like these:
Today - a depressing day - these questions remind me of this one (from Hacknot): "What can you brush your teeth with, sit on, and telephone people with? Answer: a toothbrush, a chair, and a telephone." The implication is that straining after convergence leads to ridiculous and impotent monstrosities. As it becomes clear how different are the assumptions and goals and values that the two communities attach to the word "test", I must ask if I'm straining after such a convergence. I don't believe so, not yet. Today's tentative convergence seems to work for me. I hope it works for people like me. But it's worth worrying about: will it work for enough people?
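Back to the mechanics for a moment: a change detector of the kind the team built is simple enough to sketch. A minimal Ruby version, assuming a snapshot file captured when the output was known to be good, and a stand-in method for whatever actually generates the page:

    require 'test/unit'

    class FrontPageChangeDetector < Test::Unit::TestCase
      # Stand-in for the real page generator; in the Agile Fusion case
      # this would be the weblog software rendering a sample weblog.
      def produce_front_page
        File.read('sample_output/front_page.html')
      end

      def test_front_page_is_unchanged
        expected = File.read('snapshots/front_page.html')
        assert_equal(expected, produce_front_page,
                     'Front page changed. Intended? Then update the snapshot.')
      end
    end

A failure isn't a verdict; it's a question: did what was supposed to change actually change, and did everything else stay put?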
## Posted at 19:53 in category /agile
[permalink]
[top]
In recent weeks, I've been reading and hearing about code coverage. It may be coming back into vogue. (It was last in vogue in the early 90's.) It may be time to publicize my How to Misuse Code Coverage (PDF). Code coverage tools measure how thoroughly tests exercise programs. I believe they are misused more often than they're used well. This paper describes common misuses in detail, then argues for a particular cautious approach to the use of coverage.
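One misuse is easy to show in miniature. In this invented example, the tests execute every line - a coverage tool would report 100% - while saying nothing about the boundary bug:

    require 'test/unit'

    # Invented spec: orders *over* $100 get a 10% discount.
    def discounted_total(total)
      return total * 0.9 if total >= 100   # bug: should be total > 100
      total
    end

    class DiscountTests < Test::Unit::TestCase
      def test_discounting
        assert_equal(180.0, discounted_total(200))   # discount branch
        assert_equal(50, discounted_total(50))       # no-discount branch
        # Every line above has now executed - 100% coverage - yet nothing
        # probes the boundary, where discounted_total(100) wrongly
        # grants a discount.
      end
    end

Treating the 100% as evidence of thorough testing is exactly the misuse: coverage tells you what the tests didn't exercise, not that what they exercised was tested well.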
## Posted at 07:06 in category /testing
[permalink]
[top]
Wed, 16 Jul 2003Earlier, I whined an old Lisper's whine about how unjust it is that XML, with all its noise characters, is seen as a human-readable language, whereas Lisp, with many fewer, is seen as weird. It appears that some other old Lispers are trying to make a silk purse out of XML. I recently met Jay Conne, Director of Business Development at ClearMethods. They've developed a language called Water that makes me think "Lisp! In XML Syntax!". That's an oversimplification, of course. For example, it's also got prototype-style inheritance. I only had the chance to read the first bits of Jay's copy of the Water book (on the way to and from the Wonderland T station in Boston), so I haven't gotten to the really good parts. What I've seen has stroked my biases, but is also different enough to awaken the novelty vampire within. When I get time, I'm going to download the runtime and IDE, mess around. Can the neat bits overcome the XMLishness? Is my bias against embedding code in HTML (JSP, eruby) wrong, and is this the right way to do it?
## Posted at 07:02 in category /misc
[permalink]
[top]
Tue, 15 Jul 2003I'm on vacation near Boston, so naturally I decided to take Ken Schwaber's ScrumMaster training course. (In my own defense, tomorrow is the water park.) What's a ScrumMaster? The closest analogue in conventional projects is the manager, but the ScrumMaster has very different goals:
The thing that most appeals to me about Scrum is the way the ScrumMaster is totally devoted to the success of the development team. There are three people I would unhesitatingly accept as my manager. Ken is one. Johanna Rothman is another. My wife Dawn is the third. In any case, I recommend the course, even if you - like me - doubt you'll ever be a ScrumMaster on a Scrum project. (I am not a person I'd unhesitatingly accept as my manager.) It's important to know about the different agile approaches, to do some compare and contrast. Ken reminded me of two more additions to my list of Things Agilists Want to be True.
## Posted at 18:24 in category /context_driven_testing
[permalink]
[top]
Fri, 04 Jul 2003

Fighting the last war: test automation

Round about 1985, I wrote a one-page document titled "Everything Brian Marick knows about software development". It was a bunch of bullet points. One of them read something like this: "A test that's not automated, or at least exactly repeatable manually, might as well not exist." In the next decade, I changed my mind. That was largely due to a long email conversation with Cem Kaner, back before I'd ever met him. In the late 90's, I became one of the "anti-automation" crowd. That, despite putting a slide titled "I'm Not Against Automation!" in almost every talk about test automation I gave. Our crowd, roughly the same people as the context-driven crowd, made two main points:
My contribution to this debate was a paper titled "When should a test be automated?" In it, I attempt to list the forces pushing toward automation and those pushing away. If you understand the forces, you can find a balance between automated tests and what I didn't yet think of as exploratory testing. You can balance cost against benefit.

Many of us in the "anti-automation" camp reacted to XP's glorification of the automated acceptance test with a sinking "oh no, not again" feeling and a general girding for battle. But I think that's a mistake, for two reasons.

First, in an XP project, more automation is appropriate. XP teams are committed to reducing the cost of automation. They also use tests for things beyond finding bugs: thinking concretely about what a program should do, supporting smooth change, etc. Those increase the benefit of testing. So the balance point is further in the direction of complete automation. That, I think, the anti-automation crowd accepts. What bugs them is that the XP crowd doesn't accept the need for exploratory testing.

Oh, but they do. I've had two chances to introduce XPers to exploratory testing. In both cases, they were enthused. Because XP and other agile methods are full of exploration, it felt right to them. I'm immensely confident in generalizing from those people to XP as a whole. As we show XPers exploratory testing, they'll embrace it.

Now, they'll likely use it differently. Sure, they'll be happy it finds bugs. But more important to XP people, I bet, will be the way it increases their understanding of the code and its possibilities, and of the domain and its quirks, and of the users and their desires. Automated tests are a way to decide how to move forward in the short term (this task, this iteration) and a way to make it so that such movement is almost always truly forward. Exploratory tests are a way to map out the territory in the longer term (next iteration and beyond).

So I declare that we in the anti-automation testing crowd needn't fight that last war again. This is a different war. It's not even a war. It's closer to what I call "a heated agreement". Time to move on.
## Posted at 14:06 in category /agile
[permalink]
[top]
Mon, 30 Jun 2003

The noble premise of context-driven testing is that the tester's actions should be tailored to a particular project. The premise of the Agile Alliance is that certain projects have enough in common that they deserve a common name: "agile". It follows that those common themes should drive the actions of context-driven testers on agile projects. But how to describe that commonality is a vexing question. The Agile Manifesto is an early attempt, still the definitive one. But, in my own thinking, I find myself coming back to different themes, ones more related to personality and style than values and principles.

Now, it's presumptuous of me to define Agility: although I was one of the authors of the Manifesto, I've always thought of myself as something of an outsider with a good nose for spotting a trend. So when I make pronouncements about Agility, I look for approving nods from those who I think get it more than I do. In recent weeks, I've gotten them from Michael Feathers, Christian Sepulveda, Jeremy Stell-Smith, Ward Cunningham, and Lisa Crispin. Made bold by them, I present a partial and tentative list of what I'm calling Things Agilists Want to be True. I'm calling it that to avoid arguments about whether they are or are not true. Truth is irrelevant to whether those beliefs are part of the agile context.
Of what use is this list? Well, I'm going to use it to remind me to think about my habits. Suppose I'm a specialist tester on an agile team. Being a specialist is comfortable to me - it's been my job many times - but I have to remember it cuts a bit across the grain of an agile project. I'll have to think more about earning - and giving - trust, about offering help outside my specialty, about taking care that my bug reports don't disrupt the smooth steady flow of improvement. Otherwise, I'll be wrong for my context. My hunch is that many testers will find the team dynamics of an agile project their biggest challenge.
## Posted at 17:21 in category /context_driven_testing
[permalink]
[top]
Along with Ward Cunningham, Bob Martin, Micah Martin, and Rob Mee, I'm hosting FIT Fest at XP Agile Universe. It's all about test-first customer-facing tests using Ward's FIT framework. Do join us. One of the ideas of the event is that we'll give people a chance to use FIT to build solutions to commonly-posed problems. I collected a few at Ward's FIT tutorial at Agile Development Conference. For example, one was expressing tests that drive GUIs. Feel free to send me more.
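If you haven't seen FIT: customers express tests as HTML tables, and small fixture classes wire the tables to the code. The arithmetic example from Ward's documentation looks roughly like this (rendered as plain text; exact rows may differ):

    eg.Division
    numerator | denominator | quotient()
    1000      | 10          | 100
    -100      | 5           | -20

FIT feeds each row's inputs to the fixture and colors the quotient() cell green or red according to whether the computed value matches the cell.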
## Posted at 09:53 in category /testing
[permalink]
[top]
Christian Sepulveda's oscillating testers New blogger (and great guy) Christian Sepulveda has an interesting idea for how to do testing on agile projects. I think he's finding a way of being inclusive (though I don't know if that was part of his specific intent). Some testers are most comfortable being vigorous finders of mistakes. And those testers often emphasize coverage (in the broad sense: use of a well-thought-through testing model of the system/domain/failure space/etc.). I, in contrast, want to extend test-first programming to the whole product. That makes testing about design, only indirectly about finding mistakes. I emphasize incremental progress and "just in time" testing. I accept the increased risk of gaps in coverage that comes with that (and concede that I'm making it harder to know how often and how far and in what ways a tester needs to step back and think comprehensively about the problem). But suppose the team has testers of the first sort? Suppose it needs testers of the first sort? How should they be integrated into the work? Agile projects have some tricky differences from conventional projects - an emphasis on steady forward "flow" that batches of bugs can disrupt, a greater dependence on trust between members of different interest groups (most notably between programmers and customers, but also between programmers and testers), a programmer-centricity that a skeptic would think of as coddling, and so forth. I see in Christian's proposal ideas for integrating testers of the first sort while maintaining what's different about agile projects.
## Posted at 08:29 in category /agile
[permalink]
[top]
Thu, 26 Jun 2003

Agile Development Conference - Day 1

I'm in Salt Lake City, at Agile Development Conference. So far, so fun. I got to wear an odd smoking jacket when pressed into service as a nobleman in a reading of a scene from Shakespeare's The Tempest. And my trusty PowerBook saved the day when the hotel's DVD player couldn't handle the DVD Jerry Weinberg was using in his keynote.

On the technical side, I enjoyed a Technical Exchange about customer collaboration. It was interesting how rapidly people zeroed in on the need for a "bridge role" or "business coach" to mediate/translate between the business world and the program world. Alistair Cockburn pointed out that the common notion of "Customer" mushes together four potentially distinct roles: usage expert, domain expert, product owner ("goal donor"), and executive sponsor ("gold owner"). Alistair shares my interest in how personality affects methodology. He wondered what sort of personality a business coach needs. Here's a tentative and partial answer. Testers often fancy themselves in a bridge role, using knowledge of the business and users to find bugs. So Bret Pettichord's paper, Testers Think Differently, is relevant. It talks about personality differences between testers and programmers. Three of them, it seems to me, fit for the bridge role. Here they are, somewhat distorted:
I attended another technical exchange on extending Continuous Integration to the whole enterprise. We mainly looked at difficulties. Jeff McKenna said something that sparked something like an idea. He said that some architectures are simply wrong for continuous integration. That made me think of particular architectures and the processes of integrating them as being like the systems that Charles Perrow describes as subject to "normal accidents" (in his book of the same title). Perrow and his followers describe characteristics of systems that make them prone to accidents. Can those characteristics, or something like them, be used to describe architectures that can't be continuously integrated? Would knowing about them help us avoid those architectures? (Here's a short writeup of Perrow's ideas.) Sadly, merging Analogy Fest into Open Space didn't work. It sort of fizzled away. Only two of the papers are going to be discussed. My apologies to the authors who went to the effort of writing their analogies.
## Posted at 06:39 in category /agile
[permalink]
[top]
Fri, 20 Jun 2003At Agile Fusion, I flashed on something about context-driven testing. James Bach said I should write it down. In the ideal, adding context-driven testing to a project means that the tester observes the context and designs a testing strategy that matches it (while recognizing that the strategy will change as understanding increases). Reality is less tidy, less rational. First, any particular strategist comes with a bundle of preferences, a set of experiences, and a bag of tricks. The natural first impulse is to do this project like a prior successful one. This project's context has an influence, to be sure, but does it really drive the strategy? Often not, I suspect. The test - perhaps - of context-driven-ness is how readily the strategist recognizes that what appears to be external context is the projection of internal biases. This is especially tricky because internal biases take on external reality. To be trite, the observer affects the observed. The most important part of context is the people (principle 3). The strategist changes people's goals, activities, assumptions, and beliefs. So early choices shape the context, I suspect often in self-reinforcing ways. This argues for rather a lot of humility on the part of the strategist. On the other hand, things have to get done. One cannot spend time in an agony of undirected self-doubt. So, an assignment for context-driven testers: tell stories about how you misjudged the context, then recovered. And about how you shaped a project wrongly. My sense is that the effect described here, though hardly insightful, is under-discussed.
## Posted at 05:19 in category /context_driven_testing
[permalink]
[top]
Sun, 15 Jun 2003Agile Fusion: mid-course correction We've decided to drop the veterinary exam project. It didn't seem likely to meet the learning goals for the group. But it led me to some tentative ideas:
## Posted at 05:59 in category /agile
[permalink]
[top]
Sat, 14 Jun 2003Where's the Agile Fusion play-by-play? We'll soon start the third full day of Agile Fusion. Where's the blogging I promised? Well, I find myself not good at all at producing a running summary of the action. Fortunately, Andy Tinkham is. And I find myself too close to things, and too involved, to produce grand generalizations and bold extrapolations. Those may have to wait until it's over. Sorry.
## Posted at 04:59 in category /agile
[permalink]
[top]
Mon, 09 Jun 2003Involving the user in agile projects Canonical XP has a customer role that speaks with a single voice. That role has the hard job of integrating the voices of perhaps a bazillion users, clumped into overlapping interest groups. But how are those voices integrated? Charles Miller has an interesting post on dealing with bug reports in open source projects. There's something about it that makes me think it's relevant. There's an air of balancing forces: of treating the users with respect, of learning from them, but also of protecting the development team from being thrashed by them. Not directly applicable, I think, but worth pondering.
## Posted at 08:48 in category /agile
[permalink]
[top]
Normal accidents and pair programming Keith Ray contrasts code reviews and pair programming. I'm reminded of an editorial I wrote several years ago. I'd just read Charles Perrow's book Normal Accidents. It's about accidents in complex systems. It describes characteristics of those systems that bias them toward failure. In the essay, I applied his ideas to pair programming and inspections, suggesting that pair programming is less likely to suffer normal accidents. Note: the second figure got messed up in production. It should be just like the first, but with clouds that wholly or partially obscure the boxes and lines behind them. I should produce a fixed copy, but I lack time.
## Posted at 08:37 in category /agile
[permalink]
[top]
Sun, 08 Jun 2003Johanna Rothman writes on one-on-one meetings. So does Esther Derby. I haven't spent much time as a manager - a couple of years, maybe three. I'm not all that good at it. In that time, I remember being deeply thanked for just two things. For forcing one team of college freshouts to learn Emacs. (Hi, Kendall! Hi, Randy!) And for having one-on-one meetings. I kick myself for not recommending Johanna and Esther's free teleclass on one-on-ones before it happened. (It was last week.) Maybe they'll have another.
## Posted at 16:43 in category /misc
[permalink]
[top]
Sat, 07 Jun 2003I really like this idea of Chad Fowler's.
## Posted at 18:00 in category /misc
[permalink]
[top]
Octopus eyes and agile testing The story I've heard about octopus eyes goes like this: human eyes have blood vessels and nerves on the top, the side toward the lens. Octopus eyes have them on the bottom. The octopus design seems obviously better (though that's disputed): why put crud between the light receptors and the light source? One point of the octopus eye story is that chance decisions linger. At some point, some ancestral critter "chose" a particular orientation for some particular type of cell, and here we are, so long later, stuck with crud in our vision and a blind spot where the nerves punch through to the brain. Many in the testing world are scornful of test-driven design. "That's not real unit testing," they say. "Those people are ignorant of the testing literature." And some in the programming world are apologetic, saying things like "Of course, our tests would seem horribly inadequate to a real tester." The assumption is that programmers should learn what the testers already know. As a tester, it's in my interest to agree. But what crud is in my vision because of chance decisions by my intellectual ancestors? It may have been a wonderful thing that Ward Cunningham, Kent Beck, and the other people who invented test-driven design had not read my book on programmer testing. Perhaps ignorance let them put the blood vessels on the bottom. Today, agile testing is a hot topic. People are looking to augment programmer testing with equally successful customer-facing testing, and automated testing with exploratory testing. I'm pretty tightly tied to that effort. Does the above mean that I should say, "Go thou, programmers, and invent testing anew. I have nothing to offer you but intellectual corruption from a dead past." Well, maybe. But I'm not gonna. What I will do, however, is hope that this summer's events make me discard some cherished beliefs. I'm going to be listening carefully to, and working closely with, people out of the testing mainstream. Not to teach, but to learn.
## Posted at 17:52 in category /testing
[permalink]
[top]
Fri, 06 Jun 2003Analogy Fest papers have been posted Here.
## Posted at 14:54 in category /analogyfest
[permalink]
[top]
Thu, 05 Jun 2003

Laurent has a post on configuration management. He writes:

    ... if you ask any experienced developer about SCM, or version control, she will tell you that even if you work by yourself and are interested in none of these things, you would be foolish, verging on suicidal, to start a project without availing yourself of a version control system. It needn't be anything fancy, CVS or SourceSafe will do; but you have to have one.

I agree with this, though I do not go as far as Andy, who apparently puts everything under CM. But what Laurent made me realize is that I use configuration management very differently for personal work than for paid work. In paid work, I delight in grovelling through CVS logs. I'll do a [...] In my own work, I use change control quite differently. I dutifully put in change comments, but I never look at them. All I use CM for is a way to backtrack when I realize I'm caught in a rathole. (And even then, I probably don't use it enough.)

I wonder why the difference? The facile reason is that I know my own code, so I don't need to use indirect means like change logs to start understanding it. But I suspect there's something more going on. I wish I could watch Andy as he uses change control. I bet I'd learn something.
## Posted at 20:49 in category /misc
[permalink]
[top]
Rephrasing my message on tests vs. requirements

Earlier, I wrote about an observation in one of Bob Dalgleish's posts. Another one that struck me was this:

    On the other hand, a "well written" Requirements Document will [...] make statements about the intended audience that will be impossible to capture in a Test Plan except by inference.

We test-first enthusiasts are sometimes guilty of making it seem that tests do everything that a requirements document does, and that the process of creating tests accomplishes everything that requirements elicitation does. I think we should stop. Nowadays, I try to be explicit by saying that the goal of test-driven design is to provoke particular programmers to write a program that pleases their customer. That happens also to be a goal of requirements documents. They're different means to the same end.

Suppose both tools accomplish that end. You probably don't want to pay for both. Which should you use? Well, the tests actually check that the program does something useful. The requirements don't. They just sit there on paper. So you should pay for tests and not for requirements documents.

Now suppose that tests do not completely achieve the goal. For example, a programmer reading them might not understand the "why" behind them, and that might be a problem. So you'll have to supplement them. Personally, I'd lean toward conversational solutions. Have the original programmer collaborate in the test writing. Make sure programmers work together so that latecomers learn the lore from the old hands. And so forth. But if that wouldn't work, we might want a document explaining the motivation. And that might look something like part of a requirements document.

Even if tests do a perfect job of provoking the programmers, that's not the only goal in a project. Perhaps a requirements document is a better way to achieve some other goal. In that case, you'd keep on creating them. Except I can't believe you'd keep making them the same way. By analogy, take the car: it's a means of transportation. In US culture, it's also a way for teenagers to signal status to each other. If there were no need for transportation, would you invent cars just for signalling status? No. There've got to be cheaper ways. Like clothes, maybe. And if requirements are no longer about instructing programmers, there've got to be cheaper ways to achieve their remaining goals.

So my stance is simultaneously less and more bold. I don't say that tests replace requirement documents. I'm saying that tests are so important and so central that everything else has to adjust itself to fit.
## Posted at 10:06 in category /testing
[permalink]
[top]
Wed, 04 Jun 2003Laurent has a timely post about hiring. He compares the recruiting process to Big Up Front Design, where you have to get it right because there's resistance to changing the decision once the person is hired. It's timely because I'm actually considering taking a Real Job. I wasn't looking for one, but it dropped in my lap, and there's a chance that I can do some spectacular work. But I only want the job if I really can do spectacular work. I don't want to do just OK work. So I have proposed a "trial marriage" that I believe is essentially like Laurent's solution in his post. I'll work for a while as a non-employee, doing the kind of job we'd expect me to do as a full-time employee. After some period, we'll look at what I've accomplished and see if it looks like I can do the calibre of job we both want. If not, the company will be more likely to cut me loose than they would if they'd hired me. And I'll be more likely to cut myself loose. (My past two jobs have taught me that I stick with companies well after I should leave.) We'll see if they go for it. Because this is a note about hiring, and that's one of her schticks, I want to announce that I finally put Johanna Rothman on my blogroll. I wanted to do it in a way that returned the grief, but I haven't come up with a good one. So this will have to do.
## Posted at 16:03 in category /misc
[permalink]
[top]
Bob Dalgleish has comments on test planning as requirements specification. He comes at things from a different perspective than I do, but I was struck by a couple of his points. The first is from his last paragraph:

    In fact, the worst that can happen, substituting a Test Plan for a Requirements Document, is that the resulting product is over-specified, with too much detail and decisions made too early. A Test Plan or a Requirements Document needs to be at the correct level of abstraction, but it is much easier to do this with the [Requirements Document] than the [Test Plan].

(Here, "Test Plan" means a document that describes inputs and expected results, and also allocates effort among testing tasks.)

Testers have long been aware of a tension between abstraction and concreteness. Take GUI testing. Your tests have to be concrete enough to catch bugs, but somehow abstracted enough that they're not broken by irrelevant changes to the UI. Now that testing is taking on a new role, that of helping people understand the problem to be solved, the tension recurs in a new and interesting way. We're now in the realm of what Kenneth Iverson called "notation as a tool of thought". As the mathematician A.N. Whitehead put it:

    By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race.

The quote reminds me of the GUI testing problem: the purpose of notation is to keep you from having to worry about irrelevant details. You worry about the advanced problem - such as ways to express the needs and wants of the users in a checkable way - not about whether the menu items stay in the same order. And by not thinking at all about menu items, you can pay more attention to the users. (There's a sketch of such a notation layer at the end of this post.)

And yet... In test-first design, we're sometimes after something more: notation that helps you discover details that turn out to be very relevant. Notation as a tool for provoking Eureka! moments. It's my hunch that the community of people who care about test-driven design at the product level are soon going to zero in on notation. How do you write tests that harness insight-provoking concreteness without being so specific that they discourage useful exploration of options? And how do tests do that while satisfying all the other goals we heap on them? Like being resistant to GUI changes. Like being so easy to change that they do not slow down the project. Like finding bugs. (I've written elsewhere on abstraction and concreteness in tests.)

The second point will come later.
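The notation-layer sketch promised above. Every name here is invented; the point is only the division of knowledge:

    # Stands in for the project's real GUI driver; it just records the
    # concrete gestures it's asked to make.
    class RecordingDriver
      attr_reader :gestures
      def initialize
        @gestures = []
      end
      def method_missing(gesture, *args)
        @gestures << [gesture, *args]
      end
    end

    # The notation layer: tests say "withdraw 150"; only this class
    # knows about menu items and field names.
    class BankGui
      def initialize(driver)
        @driver = driver
      end
      def withdraw(amount)
        @driver.choose_menu('Transactions', 'Withdraw...')
        @driver.type_into('amount', amount)
        @driver.press('OK')
      end
    end

    gui = BankGui.new(RecordingDriver.new)
    gui.withdraw(150)   # the test reads as intent, not as menu clicks

When the menu moves, only BankGui changes; the tests keep reading at the level of user intent.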
## Posted at 14:26 in category /testing
[permalink]
[top]
A clever graphic from Frank Patrick (via Alan Francis).
## Posted at 13:37 in category /misc
[permalink]
[top]
Tue, 03 Jun 2003Bret writes why he favors Ruby for testing training and testing tasks. Full disclosure: I had something to do with Bret's enthusiasm for Ruby.
## Posted at 20:50 in category /ruby
[permalink]
[top]
Agile Fusion is an event that Rob Mee and I have organized. We're bringing together context-driven testers and agile programmers to build product together and learn from each other. We'll be doing it next week, starting the 11th, at James Bach's very nice tech center in Front Royal, Virginia. I'll be blogging it, as will others, I bet. The main purpose of this note is to lay out what we'll be doing, to set a context for the blogging. But first, a side note:
Here's a version of the original invitation I sent out.

Catch phrase: learning by doing, then talking

Deliverable: a better idea of XP/agile testing in the heads of the participants, who can then take that back to their company, their clients, and other events (notably the XP/AU XPFest). People will be prepared to do more than they could before, and they'll be ready to take the next step in learning.

Mechanism: We will do two parallel XP projects with very short iterations. One will work with legacy code (Dave Thomas's RubLog) and the other will start from scratch (implementing a prototype replacement for the software used by the American College of Veterinary Internal Medicine for the large animal patient management section of the examination for board certification). The software will be written in Ruby. We should have enough Ruby experts to help out those who are novices, but people should learn a little Ruby before coming. The projects will differ from run-of-the-mill XP projects in that there are three novel testing components.
The XP projects will last from 9-3 each day. 3-5 will be spent on a mini-retrospective about what happened that day. People will talk about what they did, what lessons they drew from it, and suggest things we might do the next day. Those two hours might lead to testers asking for exercises or lectures about some aspect of XP. They might lead to programmers asking for exercises or lectures about some aspect of testing. Those could happen in the evening. Or, if people don't want that, they can happen during the day.

Commentary: That's my starting agenda. I'll be disappointed if it's not modified pretty radically over the week. I will push, however, for keeping the focus on learning through building product. Leaving aside those specific goals, it is fundamental to my conception of this workshop that we all intend to push outside our comfort zone. As I've said ad nauseam, I don't believe "agile testing" will be a straightforward extension of either context-driven testing or XP-style test-driven development. Rather, I should say I hope it won't be.
## Posted at 14:20 in category /agile
[permalink]
[top]
Wed, 28 May 2003Jupiter, too. It's been colorized and enhanced, but still highly cool.
## Posted at 13:41 in category /misc
[permalink]
[top]
Links: Fowler, Vaughn, and Miller

A convalescing son watching a DVD on the iMac + an interruptible father nearby with a TiBook = a posting about some links, without the commentary they deserve.

Martin Fowler has a bliki, which is good news. His What Is Failure is a short and sweet summary of what's wrong with all those "28% of all projects fail" quotes. I'll leave the punch line to Martin, but he's given me a useful phrase.

Greg Vaughn has an essay on static vs. dynamic typing. I admire people whose attitude toward "either/or" debates is "some of both, please".

Charles Miller draws an analogy between quality and security. "Defense in depth" is a useful phrase. In testing, it's usually applied to a series of finer-grained tests (unit, integration, etc.). Do they provide defense in depth? Maybe. But perhaps not if the mechanisms by which different levels of tests are designed do not differ enough. (I've never met anyone who can articulate different design criteria for integration testing. Inputs for integration tests seem to be picked either the same way as for unit tests or the same way as for whole-product tests. That, I think, makes them weaker than if one thought about integration tests differently.) My own style these days is to lean heavily on test-first design, both at the "unit" level (where the programmer thinks of herself as the user) and at the whole-product level (where one hopes there's a business expert acting as a proxy for the user). But defense in depth argues for a different type of thinking. Hence, exploratory testing.
## Posted at 13:06 in category /misc
[permalink]
[top]
Tue, 20 May 2003I'm on the program committee for the Agile Development Conference. One of my duties - or, in this case, a pleasure - has been to shepherd some experience report papers. I want to single out two. After the conference, the authors have agreed to send copies to people who request them. (But I think you should go to the conference and hear them in person.) As an anti-spam measure, I've intermixed "marick" and "testing" in their email addresses. Christian Sepulveda (csepulvmarick@testingatdesigntime.com) writes about his experience doing remote agile development. He is the remote team lead for a group in a different city. He talks about why they went that route, how they've made it work, and problems they've encountered. Jeff Patton (JPattonMarick@testingtomax.com) writes about how you can vary scope in fixed-scope, fixed-cost contractual development. The desire to fix every vertex of the cost-scope-quality triangle is due to distrust. How can you start from a distrustful position and move to a trusting one that allows tradeoffs? Jeff provides specific techniques.
## Posted at 09:24 in category /agile
[permalink]
[top]
Ward's FIT framework has gotten some press from Jon Udell. He proposes that tests be used to detect Windows rot as configurations change. The article kind of muddles together several issues, but it has some nice links. I want to mention that FIT and acceptance testing will be featured parts of XP Fest at XP/Agile Universe. XP Fest is being organized by Ward, Rob Mee, Adam Williams, and me, so I want to plug it. Come join us.
## Posted at 09:09 in category /testing
[permalink]
[top]
Wed, 14 May 2003I'm at STAR East, sitting with Bret. We're preparing for a birds-of-a-feather session on weblogs about testing. I'm showing him how I use Blosxom. Yesterday, Bret and I taught a tutorial on scripting for testers. Our goal was to teach non-programming testers the kind of programming that helps testers. I thought it was a disaster. Bret thought it went well. Fortunately, the participants generally agreed with Bret. It went well enough that we'll be doing it again at STAR West and at PNSQC. (The PNSQC version will be not quite as geared to beginners.) We'll also be teaching a two- or three-day version through SQE next year. (And we'll be teaching it in-house, too, I hope.) So I guess I'm a happy guy. Today, I gave another talk about agile testing. (And just now, Mark Taylor came up to say he liked the talk. So that went well too.) In this talk, I spent half the time talking about problems with requirements documents, how agile projects get along without them, and how I think testers should help. Then we spent the second half brainstorming tests for a feature I want to add to my time tracking application. A good time was had by all. I wore funny hats.
## Posted at 14:44 in category /testing
[permalink]
[top]
Sat, 10 May 2003

Lisp with angle brackets, and static typing

As an old Lisper, I find XML depressing. Many of the people who I'm certain would dismiss Lisp as unreadable seem to be happy reading XML. Even though structurally it's the same, just with more noise characters. So I'm with Dave. Being somewhat sensitive to being an old fuddy-duddy who doesn't do XML much, I try not to bash it. But here's a nice note comparing XML configuration to Scheme/Lisp.

The same author's take on the current "static typing vs. testing" debate is also interesting. (See also postings from Bob Martin and Tim Bray.) He makes two arguments. One is that a type failure points you more directly at the problem than does a test failure. I don't find that too compelling. You need to write your tests (and your code) so that problems announce themselves clearly. My hunch is that such effort has all sorts of ancillary benefits, so I'm inclined to call this disadvantage a net win. (In the same way that testing pushing your system toward being less coupled is a net win.) (His comments on testing call to mind Glenn Vanderburg's on assertions vs. tests. More on this someday, perhaps.)

His other argument is that people and tools can read explicit type documentation and derive useful information from it. I'm not sure of the argument about people. The Smalltalk convention is to use the type as the name of the argument. So a method's signature might be something like

    createKeyNamed: aString in: aDictionary

Is that less clear than this?

    Key create(String name, Dictionary where);

Only in that the return value is explicit, I think. (Keyword arguments are also a big win here, as are the descriptive names that seem - in my experience - more common in dynamically typed languages.) On the other hand, given that people are frail beasts, giving them two opportunities to say something useful - in both the variable name and type name - might help.

It's the other argument that seems more compelling. Programmer's apprentices (tools) mostly work with static information, and statically typed languages have more of it there. I do most of my coding in Emacs, so I can't make a head-to-head comparison between, say, the Smalltalk refactoring browser and Java tools like Eclipse or IDEA. I lack the experience to evaluate the argument. Does anyone out there have experience with equivalent tools in both types of languages? What do you think?
## Posted at 10:32 in category /misc
[permalink]
[top]
Thu, 08 May 2003
Mike Clark writes about learning tests. They're tests he writes to figure out an API. He's made me think about one of my habits. I tend to use the interpreter for learning. Where Mike describes writing a test to see what the return value of a method is, I would do this:
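Something like this irb session, poking at methods just to see what comes back (the particular calls are only for illustration):

    irb(main):001:0> "2003-05-08".split('-')
    => ["2003", "05", "08"]
    irb(main):002:0> Time.local(2003, 5, 8).strftime("%I:%M %p")
    => "12:00 AM"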
What does this gain?
What does it lose? (Leaving aside that you can't do it in Java.)
I've done roughly what Mike does, but only (I think) when I'm trying to learn a package preparatory to changing it. That's a different dynamic. I'm acting mostly as a maintainer who wants a safety net of tests, not just as a user of the API. So I think I'll adopt Mike's style next time I want to understand an API. Since Mike's doing some Ruby, maybe he'll try my style. I bet there's a blend that's better than either: a way of rapidly cycling between typing at the interpreter and typing tests in a file.
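If I were blending the two styles, the cycle might run: poke at the API in irb, then freeze down anything worth keeping as a learning test. A sketch, assuming Test::Unit and reusing the illustrative call from above:

    require 'test/unit'

    # One irb discovery, frozen into a test I can rerun later.
    class LearningTest < Test::Unit::TestCase
      def test_strftime_renders_midnight_as_12_AM
        assert_equal("12:00 AM",
                     Time.local(2003, 5, 8).strftime("%I:%M %p"))
      end
    end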
## Posted at 15:35 in category /testing
[permalink]
[top]
Sat, 03 May 2003
Someone on the testdrivendevelopment mailing list asked a question that boils down to this. Suppose you have code that looks like this:

    class Mumble
      def do_something(...)
        ...
        ... Time.now ...   # Time.now gives current time.
        ...
      end
    end
You want tests whose results don't depend on what time you run them. The conventional answer is that time should be a variable, not (in effect) a global. Either pass it in:

    def do_something(..., now = Time.now)
      ...
      ... now ...
      ...
    end

Or, if multiple methods depend on time, add a method that changes the object's current time:

    class Mumble
      def set_time(time)
        @now = time
      end

      def now
        @now || Time.now   # if no time has been set, use the system time.
      end
    end

Or, if you're squeamish about giving clients the ability to change time, make the time-changing machinery visible to tests alone (through a testing-only subclass, say).
I recently found myself doing something different. In Ruby, all classes are "open": you can add methods to any of them at any time. Rather than adding a time-setting method to my own objects, I added one to Time itself:

    class << Time   # change Time's own methods
      alias_method :original_now, :now   # save the old method.

      def set(time)
        @now = time
      end

      def advance(seconds)
        @now += seconds
      end

      def now
        @now || original_now
      end

      def use_system_time
        @now = nil
      end
    end

My tests look like this:

    def test_active_time_extends_over_day
      job 'oldie'
      Time.set(Time.local(2002, 'feb', 28, 14, 30))
      start 'oldie'
      ...
      assert_equal("'oldie', with 24.00 hours from 02002/02/28 2:30 PM.",
                   result)
    end

This is more convenient than running around telling a bunch of objects or methods what time they should be thinking it is. Instead, I tell everything what time it should be thinking it is. It's worked out rather well. I've not encountered a case where, say, I want two different objects in the same test to have different notions of "now". That's not to say I don't use the more traditional ways for things like telling objects what servers they're talking to. But, in the case of time, there's already a global notion of "now" defined by your language. By introducing object- or method-specific notions of "now", you're violating Occam's Razor (in the "do not multiply entities unnecessarily" sense). What do I mean by that?
Consider servers. You can simply declare there is no global variable that an object can query to find out which server to use. If an object wants to know, it has to be passed the server or an object it can query for the server. You cannot similarly declare that there is no global "now": the language already hands one to every object, via Time.now. It simply seems less error-prone to have a single source of time information that all code must use. Then, for testing, we treat the assembled product or any piece of it as a brain in a vat, unable to tell whether it's interacting with the real world's time or a simulation. More generally, we need to be able to control all the product's perceptions. For this, it seems we need language/substrate support of the kind Ruby allows. I believe this was also once a property of Community.com/Combex's Java Vat (hence the name), but I'm not sure whether that's still true.
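One housekeeping note about the trick above: the faked time is global, so each test should give the real clock back when it's done. A minimal sketch, assuming Test::Unit and the Time patch shown earlier (the test class and its contents are just for illustration):

    require 'test/unit'

    class BorrowedTimeTest < Test::Unit::TestCase
      def teardown
        Time.use_system_time   # never let one test's "now" leak into the next
      end

      def test_clock_stays_where_I_put_it
        Time.set(Time.local(2002, 'feb', 28, 14, 30))
        assert_equal(2002, Time.now.year)
      end
    end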
## Posted at 10:43 in category /ruby
[permalink]
[top]
Fri, 02 May 2003
Those of us who know Pragmatic Dave Thomas have often wondered, "Who is he, really?" Thanks to a May 5th New Yorker article (unfortunately not on line), I now know that he's actually a Slovenian Lacanian-Marxist philosopher named Slavoj Zizek. Evidence, you ask?
## Posted at 13:49 in category /junk
[permalink]
[top]
Thu, 01 May 2003
Keith Ray has an interesting set of ideas and links about rewarding people in teams.
## Posted at 12:26 in category /agile
[permalink]
[top]
Wed, 30 Apr 2003
I've posted a short STQE essay of mine here. It contains a line I'm rather proud of: Every article on methodology implicitly begins "Let's talk about me." I hope that whets your appetite. To dampen it, a warning: the last few paragraphs talk about certification and best practices. I promise that's the last on that topic for a while.
## Posted at 11:29 in category /misc
[permalink]
[top]
Tue, 29 Apr 2003
When I was about 12, I would stay up late Saturday nights to listen to the local community college station's alternative show. "What is this Frank Zappa creature?" They say that the Golden Age of science fiction is 12; for me it was also the Golden Age of Rock. When I was a freshman in college, I became a huge Prokofiev fan. I actually signed up for two years of Russian just so I could read Nestyev's biography in the original. (That proved to be a mistake...) In recent years, I've drifted away from music, but some of my enthusiasm came back in the last few months. I've listened obsessively to mostly old-fogey music: the Clash's London Calling, Patti Smith, Springsteen (especially Nebraska), and Shostakovich's 8th. So Apple's new iTunes music store came at a vulnerable moment. It's not possible, maybe, for a 43-year-old without any particular musical talent or training to recapture that feeling that music matters, but I have to say I feel close to that tonight. Being able to reach out to a world of songs that mattered to me, click on one, and have it... Good job, Apple. Good idea. Fine execution. But where are the Stones?
## Posted at 20:52 in category /mac
[permalink]
[top]
Ferment: Roles, names, and skills
Mike Clark writes "I'm on a mission to help tear down the barriers we've so methodically built up between QA and developers. [...] So starting right now I'm making a resolution to not use the label 'QA' or 'tester'." He also recommends the use of tools like FIT that everyone can understand. Greg Vaughn has a response: "I'm not convinced abolishing the terms 'QA' and 'tester' is the right approach." He comments on different skills in a way that reminds me of Bret's paper "Testers and Developers Think Differently". I think there's a sea change going on. In many companies, the time of unthinkingly "throwing it over the wall to the testing team" is coming to an end. That was an expression of cultural assumptions.
I want to throw out those assumptions. I've long believed in tight working relationships between testers and programmers. And yet... whatever we do also has to accommodate the exceptions to our desires. We will learn a lot in the next few years about balancing these forces (groupthink vs. teamwork in agile projects, etc.). It's an important conversation, and I look forward to experience reports from Mike and Greg and others.
## Posted at 08:52 in category /testing
[permalink]
[top]
Mon, 28 Apr 2003
I was talking to Mark Miller last week. He's interested in capability security, I'm interested in context-driven testing and agile methods, but we're both interested in persuading people to change their minds. Mark quoted a common summary of Kuhn: "Science advances one dead scientist at a time." The old guard dies off, and the new guard takes their place. Quite true, in the main, but some in the old guard have to change their minds for the new guard to get started. Some people have to switch from being Newtonians to being relativists, or from being Ptolemaists to being Copernicans. Why and how does that happen?

The philosopher Imre Lakatos's methodology of scientific research programmes talks about why and how scientists should change their minds. He extracts rules for rationality from a somewhat idealized history of great scientific advances. I'm going to do violence to his intentions by not caring (here) about rationality. Instead, I'm going to build off my reaction to Lakatos, which is "Those sure are nice rules. They capture a lot of what I mean when I say, 'Those people who call themselves the Fooists are really going places, and I want to go along.'" If other software people respond to the same cues as I do, then we who want people to go along with us might benefit in two ways from careful study of Lakatos's rules:
But note well that I'm not claiming a complete solution to the problem of persuasion. So here goes. For skimmability, I'll indent the explanation of Lakatos differently from some of my speculations about various software movements. At the heart of any research programme is a "hard core of two, three, four or maximum five postulates. Consider Newton's theory: its hard core is made up of three laws of dynamics plus his law of gravitation." (Motterlini 1999, p. 103)
A set of postulates starts out inadequate. The common (Popperian) wisdom about science is that scientists posit hypotheses, specify refutable predictions that follow, and replace the hypotheses when those predictions fail. That's not in fact what happens. Counterexamples are shelved to be dealt with later. For example, Newton's theory did not correctly predict the observed movement of the moon, but he did not discard it. When Le Verrier discovered the motion of Mercury's perihelion was faster than predicted by Newton, people shrugged and waited for Einstein to explain it. Research programmes can proceed despite their obvious falsity. Rutherford's model of the atom (mostly empty space, electrons orbiting a nucleus) violated Maxwell's equations, which were believed to be rock solid. They were certainly much more compelling than the new evidence Rutherford's model was intended to explain. But Rutherford's programme essentially said, "We'll figure out how to reconcile with Maxwell later." (The solution was quantized orbits - the "Bohr atom".)
But surely factual evidence counts for something. Lakatos says it does, in two ways. First: while "theories grow in a sea of anomalies, and counterexamples are merrily ignored" (Motterlini, p. 99), the same is not true of dramatic ("novel") confirmations. What convinced scientists of Newton's theory of gravitation? According to Lakatos, it was Edmund Halley's successful prediction (to within a minute) of the return date of the comet that now bears his name. What "tipped" scientific opinion toward Einstein's theory of general relativity? The famous experiment in which the bending of light was observed during a solar eclipse. (Interestingly, according to Overbye's Einstein in Love, the observational evidence was actually shaky.)
The second way that factual evidence counts is in the way proponents respond to it. Lakatos pictures the core postulates as being surrounded by a protective belt of auxiliary hypotheses that are used to handle telling counterexamples. Newton provides an important example of the right kind of protective belt. He was informed that observations of the moon refuted his theory. To protect it, he invented a new theory of refraction that, together with his laws, did predict the moon's movement. (Sort of - it was never right, because the moon's center of mass isn't at the center of the sphere, which throws off the calculations.) His theory of optics not only corrects refuting observations to make them match the theory. It is also a theory of its own, makes new predictions, and had some of the new predictions confirmed. Strictly, the observations that refuted Newton's theory of gravitation served to surprisingly confirm his theory of optics. He knew that there were refuting observations, but he didn't have the values, so he couldn't "work backwards" from them to a theory of optics that served no purpose other than to protect his hard core. It's that "working backwards" - fitting the protection to the specifics - that distinguishes what Lakatos calls "ad hoc" protective hypotheses from the good kind. "[Some proposed counterexample] was never discussed before, but now you can account for this case too by introducing an ad hoc auxiliary hypothesis. Everything can be explained in this way, but there is never any prediction of a corroborated novel fact." (ibid, p. 101)
Research programmes, even ones as successful as Newton's, eventually degenerate. A programme "is degenerating if ... (1) it does not lead to stunning new predictions (at least occasionally...); (2) if all its bold predictions are falsified; and (3) if it does not grow in steps which follow the spirit of the programme." (ibid, p. 106) (This last seems to mean avoiding the ad hoc additions mentioned above. But I think there's also a less pin-down-able sense. It would not be in the spirit of an agile method to extend itself by adding more and more examples of written process documentation.)
P.S. The citations are from Lakatos's "Lectures on Scientific Method", in For and Against Method, Matteo Motterlini (ed.). I should be citing Lakatos's Philosophical Papers, Vol. 1: The Methodology of Scientific Research Programmes. But when I wrote a long email on this same topic, some years ago, I'd just happened to pick up Motterlini.
## Posted at 08:10 in category /misc
[permalink]
[top]
Fri, 25 Apr 2003
Interview with the search engine
The venerable game of picking on programs that try to talk like people is done rather well in this interview with the Ask Jeeves search engine. Some Emacs users may not know about M-x psychoanalyze-pinhead, which plays the same game by making Zippy the Pinhead a patient of the Eliza-style doctor:

    Those aren't WINOS--that's my JUGGLER, my AERIALIST, my SWORD SWALLOWER, and my LATEX NOVELTY SUPPLIER!!
        I don't understand.
    I OWN six pink HIPPOS!!
        Maybe your plans have something to do with this.
    The LOGARITHM of an ISOSCELES TRIANGLE is TUESDAY WELD!!
        Is it because of your life that you are going through all this?
    This is a NO-FRILLS flight -- hold th' CANADIAN BACON!!
        Are you afraid of that?

A little of this goes a long way...
## Posted at 09:32 in category /junk
[permalink]
[top]
Thu, 24 Apr 2003
Two book references. No, three. Laurent quotes from Winograd and Flores's Understanding Computers and Cognition and also links to a discussion page. I read Winograd and Flores while sitting in on an odd course that Simon Kaplan taught at Illinois, and it quite excited me, to the point where I hunted down Flores's dissertation. Some influences linger - for example, their discussion of Heidegger's notion of "ready-to-hand" tools helped me think about what I call "just what I needed (almost) bugs". (Kaplan's course was where I first read the wonderful Proofs and Refutations, Imre Lakatos's book on the philosophy of mathematics. It's written as a stage play about the history of Euler's conjecture that, for polyhedra, V-E+F=2.)

I'm reminded of one last book. To write this blog entry, I had to create a new category for it, misc. That reminded me again of how much I dislike hierarchies as a way of representing/organizing my very non-hierarchical brain. I love Blosxom, my blogging tool, and I think it's a cute hack the way it uses the file system hierarchy to organize, but that cute hack is in fact an example of a "just what I needed (almost)" bug. (Well, maybe it's not a bug, given how much code it saves.) Categories are in fact a tool of limited usefulness. Plato said we should "carve nature at its joints" (like a good butcher does), but lots of nature doesn't have any joints. This point is put nicely in George Lakoff's Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. I found it pretty convincing. I wish I had a blogging tool that let me tag entries with multiple keywords (or went beyond that), but was as easy to install, use, and control as Blosxom.
## Posted at 08:50 in category /misc
[permalink]
[top]
Tue, 22 Apr 2003
If we must have certification, here's a better model
Earlier, I linked to Cem Kaner's skill-based final exam for a testing course. My wife is "board certified" through the American College of Veterinary Internal Medicine (ACVIM). That means she's an expert practitioner. I said I'd describe how such certification works. Here goes. (More details here.) I've highlighted the parts that jump out at me. If we must have certification, can it include things like these? (Note: all this happens after you're already a veterinarian and have completed an internship.)

You must work (practice medicine and also take some courses) under the direct supervision of two board-certified "diplomates" (like my wife) for a portion of a three-year residency. You must also spend some time (not much) shadowing other specialists. For example, you might read radiographs (x-rays) with radiologists. (The software equivalent might be testers spending some time working with programmers.)

You must have one publication accepted to a journal. It could be a research report ("I did this experiment..."), a case report ("I treated this interesting case..."), or a retrospective study ("At U Illinois, we've fixed 543 LDAs over the past 15 years and...")

You must submit three case reports to an ACVIM review committee. Two out of the three must be accepted. Each case report describes your management of a challenging case. The writeups have to be your own work, without help from your supervisors. They will certainly include discussion of things you did but shouldn't have, and discussion of why you didn't do particular things. Dawn, who sat on the review committee one year, says that any writeup that didn't have such discussion would be "unbelievable". No case that demonstrates expertise goes smoothly. (In food animal medicine, the reasons for not doing things are often economic: a farmer is only willing to spend X dollars on an animal, and you have to spend those dollars wisely as the case progresses and you learn more. Sound familiar? But case reports also include descriptions of flat-out mistakes, how you recovered from them, and what you now realize you should have done. That's only realistic.)

The written exam is taken in two parts (in different years). There's a general exam, multiple choice, that takes a day. Then there's a more specific exam (concentrating on your sub-specialty, like food animal medicine) that takes two days. The two-day exam has a multiple-choice part. It also has an essay question part. ("Compare and contrast these two diseases...", "Explain the physiology behind coughing.", "Here are some toxic plants. Explain what they do...") There's also a patient management part that works in an interesting way. You are given a set of symptoms. You pick tests you'd run. You get back the results of those tests. You can run more tests or try some treatments. At each exam step, you try something and get results. This continues until you can't think of anything more to do. You never know if you got to the answer by the optimal route, or even if you've gotten a complete answer (since the animal may have more than one problem).

This is serious stuff. Testing certifications do not stack up very favorably, in my opinion.
## Posted at 07:06 in category /testing
[permalink]
[top]
Mon, 21 Apr 2003
Cem Kaner (ex-Silicon-Valley tester and consultant, author, leading light in the context-driven testing school, lawyer, and now professor of computer science) has a new blog. His first posting gives his final exam for a software testing course. It's important because Cem is in the vanguard of a debate in testing: what would certification in testing mean, were it not to be completely bogus?

Most certification schemes aim to evaluate knowledge. The degenerate form of such a scheme would have students sit through a day of lecture, then immediately take a half-hour exam where they demonstrate that they heard what was said. Lather, rinse, repeat. Sit through enough days, and you become a Certified Tester. The problem with such a scheme is that the connection between knowledge and skill is tenuous and indirect. Knowledge is what you know. Skill is what you can do. Employers generally want to hire skill, but have no skill at evaluating it. They make do with evaluating knowledge, or with accepting a certification of knowledge. Certifiers prefer to certify knowledge because it's much simpler than certifying skill. (Caveat emptor: many oversimplifications in the previous paragraph.)

Cem's exam is skill-based evaluation. I rather like the way veterinarians are certified to be skilled at Internal Medicine. (I assume practitioners of single-species medicine [human doctors] are certified the same way, but I don't know.) I'll describe it in a later post.
## Posted at 08:19 in category /testing
[permalink]
[top]
Fri, 18 Apr 2003
Two testing principles, illustrated
I used to have the pleasure of working with Mark Miller, capability security guy. One of his charming quirks was that, when someone found a bug in his code, he'd send email to everyone describing how he'd gone wrong and what he'd learned by fixing the bug. He also bought lunch for the bug finder. Here's a bug that escaped my tests. I found it through use yesterday. What did it teach me about my testing? It's in a time-tracking program. Here's the sequence of command-line events.
At this point, I should have 'misc' paused, 'plop' running (accumulating time), and 'stqe' should never have been started. As I later discovered, both 'misc' and 'plop' were running. That's one of the worst bugs that could happen in a timeclock program. No double-billing at Testing Foundations!

The cause of the bug was that I incorrectly believed that I could use the 'stop' command to undo most of what 'start' does. That's true, but it also does something extra. As it proceeds, the 'stop' command calls out to all paused jobs and says, "Hey! I'm stopping the running job. Anyone want to volunteer to start up in its place?" The background job, 'misc', volunteered.

Why didn't my tests prevent me from making this mistake? One of my testing principles is that tests should set and check the background. (Here, 'background' isn't supposed to mean 'background job' - it's supposed to evoke the background of a painting or a photograph - that part that's not the focus of your attention. Specifically, tests should check that what was supposed to be left alone was in fact left alone, not only that what was supposed to be changed was changed.) Examining my undo test suite, I found that I had some tests of that sort. I had a test that made sure that an unstarted job was left alone. I had a test that checked that a paused non-background job was left paused. As far as I can tell, I only missed the one crucial idea for a test. I bet I wasn't following the background principle consciously enough. When coding in a test-first way, I sometimes find it hard to sit back, breathe deeply, and do a global case analysis of the problem.

Of course, testing should be robust in the face of human frailty. That's the root of another of my principles: use visible reminders. There are certain test ideas that have high value. If they're easy to forget (which you know by forgetting them), they should be written down. A reminder catalog isn't useful unless it's seen, so it should be highly visible (as an information radiator). When I look up from my screen, I should see test ideas like "check what happens to the background job" in front of me, instead of a big poster for Gene Wolfe's The Book of the New Sun, a "decsystem10" bumper sticker from 1981, a Claude Monet print, an extra cover from a Patti Smith album, and an oil painting of galaxies.
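To make the background principle concrete, here's the shape of the test I was missing. It's a sketch only, in the command vocabulary of the story above; the exact command signatures and the assertion helpers are my inventions for this example:

    def test_stopping_does_not_resume_the_background_job
      job 'misc'                  # the background job
      job 'plop'
      start 'misc'
      start 'plop'                # 'misc' is now paused behind 'plop'
      stop 'plop'                 # the undo-by-stopping that bit me
      assert_not_running 'plop'   # the focus of the test...
      assert_paused 'misc'        # ...and the background check I forgot
    end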
## Posted at 09:25 in category /testing
[permalink]
[top]
Thu, 17 Apr 2003
In the latest installment of Dave and Andy's interview, Andy says:

As an anecdote, on one project Dave and I were implementing legal requirements in code. These were actual laws that had been passed. We got about three quarters of the way through it and realized that the law as stated was wrong. It was inconsistent. If you read through the law one way, it said something had to happen. But in the next paragraph it contradicted itself. Nobody had caught it up to that point, because nobody had tried to do a detailed implementation enacting this legal requirement. This mere mechanical act of coding actually catches bugs in the real world.

This is also something we testers say: the mechanical act of being specific about inputs and expected results makes you think about the requirements in a different way. Things jump out at you. The effect is especially noticeable when you're doing exploratory testing, usually a tactile act of interacting with something semi-physical. I've also found that the same is true of writing user documentation. If you're trying to do it well, you constantly feel obliged to explain why the product makes sense. Sometimes you realize it doesn't... But even without Eureka moments, the focus on "why" gives you yet another perspective. The art of it all is to marshal lots of perspectives to give you a rich picture of what you're after. Perhaps ironically, the perspective I've always found least illuminating is the conventional rigorous-natural-language-with-lots-of-section-headers-and-traceability requirements document. But perhaps I've never seen them done well.
## Posted at 09:00 in category /testing
[permalink]
[top]
Fri, 11 Apr 2003
I haven't posted for the last two weeks because I've been head down coding. I find my English skills disappear when I'm compulsively coding. (That includes writing comprehensible comments, alas.) However, this coding has led me to reflect on a couple of threads bouncing around. (I can still think while coding, though only in the shower.)

Andy and Dave have an interview that I'd describe as "appending 'But Syd' to YAGNI". YAGNI means "you aren't going to need it". To that, I characterize the PragGuys as saying "But Sometimes You Do". Now, I've always seen YAGNI as a playing of the odds more than a Law of Nature. The bet is that you will - on balance - save time if you do only what's needed immediately. Sure, you lose time undoing work to add generality. But there's less of that time than the time you'd lose adding generality that turns out to be unneeded or wrong.

However... There's this program of mine. I knew all along that I would add an OSX Cocoa GUI. And I suspected I'd want an HTML interface too. But I didn't know Cocoa, and I wanted the program working for my own use, so I started with a command-line interface. (I live in Emacs anyway, so having the program be only an F8 away is convenient - better than a GUI, really.) As for laying down infrastructure for the other two interfaces... YAGNI. Time has passed, and I fer sure need those two interfaces, because Bret and I will be giving a tutorial on Scripting for Testers in May. (In part, this advances our campaign to make Ruby the tester's favorite language.) Students are going to test my program through the command line (which is the Ruby interpreter), the browser interface, and a "web services" interface (Distributed Ruby). And I have to be able to show them the GUI. So my development sequence was this:
But I could have pursued an alternate sequence:
I believe that the second sequence would have led to my being farther along today and would not have delayed the first 'release' by much. I also have a hunch that the design would have been cleaner. What does my experience have to do with Dave and Andy's article? Christopher Alexander (the architect who gave us software folk the word "patterns") talks about "design sequences". Here's what Ron Goldman had to say about them in a posting to the patterns-discussion mailing list (patterns-discussion-request@cs.uiuc.edu):
My hunch is that Andy and Dave have experience that's led them to internalize particular useful design sequences for particular types of software. So they know when it makes sense to start by using metadata, and they don't need to wait for change requests to push them toward that conclusion. Such experience, in the hands of proven craftsmen, trumps the business-value-first approach that makes sense when you don't have proven sequences. Unfortunately, Ron's discussion petered out, but I'd like to see it get restarted somewhere, because I think there's something there. I'd also like to see a workshop somewhen that starts with a fairly fixed set of requirements, then has different teams implement them in different orders, then gets everyone together to reflect on how the different "trajectories" influenced the final product. I'd also like to speculate on the interplay between Alexander's notion, where sequences seem to be a property of experts, and Dave's recent posting on karate, where things-like-sequences are more the province of novices. However, I've run out of steam for this posting.
## Posted at 07:05 in category /agile
[permalink]
[top]
Thu, 10 Apr 2003
In my original announcement of Analogy Fest, I said it would be about new analogies, not questioning boring old ones like construction and engineering. Still true. But here are two links about the construction analogy.

Bret describes how the contractor building his new office assumes that the cost of change is low: "I've changed my mind plenty of times, and it hasn't cost me a cent. My contractor assumes that changes will be made. My architect, and many software people who are keen to make analogies, presume that it is cheap to make changes on paper, but expensive to make them with the physical structure..."

On Kuro5hin, there's a longer essay: "Over the next several paragraphs we will examine how the analogy is broken (disfunctional) and why it is dangerous to use the analogy to guide our efforts to make software better." (Thanks to Paul Carvalho for the second link.)
## Posted at 12:06 in category /analogyfest
[permalink]
[top]
Wed, 26 Mar 2003
A glimpse of Cem Kaner's upcoming talk at the Pacific Northwest Software Quality Conference:
He continues: Testing practices should be changing. No--strike that "should." Practices are changing. For those who don't pay attention to testing because it's an intellectual backwater, be aware: there's a dust-up in progress. With luck, and sympathetic attention from "outsiders", things will be better in five years.
## Posted at 20:20 in category /context_driven_testing
[permalink]
[top]
Mon, 24 Mar 2003
How do you get to Carnegie Hall?
Chad Fowler has written a piece inspired by PragDave's essay on "artifacting". Chad writes: there is one particular aspect of practicing "art" which seems conspicuously missing from the every day work of the programmer. As a musician, I might spend a significant portion of my practice schedule playing things that nobody would want to listen to. This might include scales, arpeggios, extended tones (good for improving control on wind instruments, for example), and various technical patterns.

This idea of practice has been a theme of Dick Gabriel's in recent years. Here's an abstract for a talk he gave at XP/Agile Universe 2001. The talk was titled "Triggers & Practice: How Extremes in Writing Relate to Creativity and Learning". The thrust of the talk is that it is possible to teach creative activities through an MFA process and to get better by practicing, but computer science and software engineering education on one hand and software practices on the other do not begin to match up to the discipline the arts demonstrate. Get to work.

A final link: at an OOPSLA workshop on constructing software to outlive its creators, PragDave brought up the idea that we should be more ready to throw out our work and rewrite it. That ties in to a story Dick tells of the poet James Dickey. Here it is, from Dick's book Writers' Workshops and the Work of Making Things: "Dickey had numerous methods of revision. His favorite was redrafting, in which he would create new drafts of a poem until there were literally hundreds from which he could choose... Dickey viewed his process as one of experimentation. In the end, when he had hundreds of versions and drafts of a piece, he was able to choose from them. Surely some of them were better than others, and if he chose one of those, he was better off than he could have been."

PragDave inspired me to start throwing out and rewriting code, but I've done little of it yet. Unlike Dickey, I keep tweaking and tweaking and tweaking the same piece. There's never enough time to do it over, but there's always enough time to do it right...
## Posted at 15:13 in category /links
[permalink]
[top]
Two interesting reads this morning
Laurent on the principle of legitimate curiosity. How can you use questions about encapsulation to drive thoughts about what the real problems are?

PragDave on artifacting, our tendency to think too much in terms of nouns, too little in terms of verbs. That's also a hobbyhorse of mine. (Hence my slogan: "No slogan without a verb!") Reading Dave's piece, I realized part of the reason I like verbs. We think of actual things in the world - rocks, trees, etc. - as standing alone. They are what they are, independent of things around them. That property we see in concrete objects is carried along into language, where we apply it to very abstract nouns like "requirements" and "quality". Those are not actual things, and they do not stand alone in any meaningful sense, but the conventions of language let us treat them as if they were and did. In contrast, verbs customarily don't stand alone. In sentences, they take subjects and objects. Contrast two conversations.
Now, in the second conversation, Betty could have skipped the followup question. And, in the first, Betty could have said, "What does it mean to work on 'quality'?" My point, though, is that verbs pull you to ask the next question, to look for subjects and objects and connections, whereas nouns make it perfectly easy to stop asking. Since one of our problems is that we stop conversations too soon (including our interior conversations), consciously using a lot of verbs helps us. But now I must head back to the taxes... Why is it that a crash in a tax program is so much more disturbing than a crash in the operating system?
## Posted at 08:00 in category /links
[permalink]
[top]
Fri, 14 Mar 2003
Usability bugs and exploratory testing
Laurent Bossavit tells of a usability bug his wife encountered. It seems a library checkin system produced an annoying penalty screen, with sound, for each overdue book. When Laurent's wife checked in thirty books, the penalty beeps drove the librarian mad. Laurent asks "Could a tester have seen this one coming?" An interesting question. Suppose I were doing automated product-level testing. I would certainly have a test in which I checked in multiple overdue books. (I'd probably toss in some not-overdue books as well.) That is, I would not at all be surprised to start typing a test like this:

    def test_big_checkin_with_fines
      # add code to create some books
      @terminal.scan(three_days_overdue_book)
      assert_overdue_notice(@terminal)
      @terminal.acknowledge
      assert_checked_in(three_days_overdue_book)
      @terminal.scan(within_grace_period_book)
      ...

At about that time, I'd get annoyed because I could see a bunch of redundant typing coming up. So I'd extract a utility:

    def assert_overdue_checkin(book)
      @terminal.scan(book)
      assert_overdue_notice(@terminal)
      @terminal.acknowledge
      assert_checked_in(book)
    end

    def test_big_checkin_with_fines
      # add code to create some books
      assert_overdue_checkin(three_days_overdue_book)
      assert_overdue_checkin(within_grace_period_book)
      assert_overdue_checkin(ten_days_overdue_book)
      @terminal.checkin(actual_on_time_book)
      assert_checked_in(actual_on_time_book)
      # ... check that the fines were calculated correctly ...
    end

More intention-revealing, though there is room for improvement. Much more pleasant to type. In fact, I've factored out the annoyance that, in a more audible form, is the bug (using Bret's definition of a bug as "something that bugs someone"). It's my belief that the act of automating would make me pretty unlikely to stop and say, "You know... this repetition is really annoying." I further believe that those sorts of annoyances tend to skate past end-of-iteration demonstrations and the trials that product managers or agile-style Customers make. It's only the real users who use things enough to have their nerves rubbed raw. Real users... and manual testers. Manual testers are the only people on most teams who use the product like the real users and anywhere near as much as the real users. That's one of the reasons why, when I talk about agile testing, I swim a bit against the current and hold out a role for manual exploratory testers.

P.S. "Nail-Tinted Glasses" (couldn't determine the real name) points out that interface design people ought to have the experience to avoid the problem. Agreed. They, like testers, should learn about bugs from experience. But, when they slip up, who notices?
## Posted at 16:55 in category /testing
[permalink]
[top]
Thu, 13 Mar 2003
The conduit metaphor (continued)
My earlier note looking at test-driven design in terms of the conduit metaphor has gotten more response than I expected. Tim Van Tongeren writes:
Glenn Vanderburg wrote two blog entries riffing on it (here and here). Here are a couple of quotes to give you a flavor:
I find the point about abstraction interesting. It's close to one I often make. I refer to myself as a "recovering abstractionist", so I tend to be dogged about insisting on concreteness and examples. (There are people who roll their eyes when I get on the topic.) And yet... It seems to me that tests aren't entirely concrete. They're somewhere usefully between concrete and abstract. I don't know how to talk about this, exactly - maybe the point is too abstract. But if, as I said earlier, part of the purpose of tests is to provoke a programmer to write the right program, part of the craft of test-writing is to pitch the test at the right level of abstraction (or, if you don't believe in levels of abstraction, use the right words) to help the programmer think the right thoughts and take the right actions.
## Posted at 20:14 in category /agile
[permalink]
[top]
Andy and Dave have put more of their IEEE Software articles up on Pragmatic Programmer.com. I quite like The Art of Enbugging (pdf) with its metaphor of "shy code". If, as I believe, the decade-or-more-long swing away from testers who can code is reversing, we have to confront the fact that way too many testers-who-can-code code badly. Testers should read Dave and Andy's book, The Pragmatic Programmer. In return, programmers should read Cem, James, and Bret's book, Lessons Learned in Software Testing.
## Posted at 14:12 in category /testing
[permalink]
[top]
Tue, 11 Mar 2003
Which spam filter are you?
## Posted at 15:11 in category /junk
[permalink]
[top]
Requirements and the conduit metaphor
In an influential paper, Michael Reddy argues that English speakers (at least) have a folk theory of communication that he calls the "conduit metaphor". (Sorry, couldn't find any good links online, but you can try this and this.) The theory goes like this. Communication is about transferring thoughts (likened to objects) from one mind to another. The first person puts his ideas into words, sends those words to the second, who then extracts the ideas out of the words. This metaphor infects our speech: we try to get ideas across, we pack more thought into fewer words, we complain that a remark carries little meaning.
Why bring it up? Well, I have difficulty putting across (ahem) the idea that tests can serve as well as requirements to drive development. Part of the reason may be that requirements fit the conduit metaphor, and tests do not. What is a requirements document? It is a description of a problem and the constraints on its solution, boiled down to its essence, complete, stripped of ambiguity - made into the smallest, most easily transmissable package. Send that across the conduit and, by gum, you've sent the idea in its optimal form.

Let me contrast the conduit metaphor to something I've heard Dick Gabriel say of poetry: that a poem is a program that runs in the reader's mind. Let me extrapolate from that (perhaps offhand) comment. What happens when the program runs isn't predictable. It depends on its inputs, whatever's already there in the mind. Extrapolating further, an effective communication is a program that provokes desirable responses from the recipient. It need not contain the information; we could alternately describe it as generating it. On this view, a suite of tests is a communication meant to provoke the programmer into writing the right program.

To do that, the tests needn't necessarily describe the right program in any logical sense. That addresses the most common objection I hear to pure test-driven development (one where there's no additional up-front documentation), which is some variant of "No number of white swans proves the proposition 'all swans are white'." That's to say, what prevents the programmer from writing a program that passes exactly and only the tests given, but fails on all other inputs? The answer is that what prevents it is not the "content" of the communication, but rather the context in which the communication takes place:
That seems unsatisfying: throw a logically incomplete set of test cases into the mix and hope that the programmer reacts correctly. Why not just send over all the information in a form so perfect that the only possible reaction is the correct one? Well, we've spent a zillion years trying to write unambiguous requirements, requirements that cause a programmer to make the same decisions the requirements writer would have made. It's Just Too Hard to be proposed as a universal practice. Pragmatically, I think many of us will do better to improve our skill at writing and annotating tests.

"The conduit metaphor: A case of frame conflict in our language about language", Michael J. Reddy, in Metaphor and Thought, Andrew Ortony (ed.)
## Posted at 15:11 in category /agile
[permalink]
[top]
Sun, 09 Mar 2003
What would Plato have thought of test-driven design?
Running through conventional software development is a strain of Platonic idealism, a notion that the "upstream" documents are more important, more true, more real than the code. The mundane final product is like the shadows on the wall in Plato's allegory of the cave, so software development is the process of adding progressively more detail that's progressively less essential. As I understand it, test-driven design is swimming against that current. For example, there's a discussion of creating a board game test-first on the test-driven development group. William Caputo writes:
I'm not a huge fan of the rhetoric of emergence, but I want to note the shift in perspective here. William does not appear to be concerned with capturing the essence of "Square", with finding the right model for a physical game in software, but rather with allowing some action to be performed with something named (for the moment) "Square". From this perspective, it does not matter awfully whether any single person has correctly apprehended, nor correctly expressed, the essence of the problem or the solution. We can even remain agnostic about whether such an essence exists, whether there are a set of requirements that we could ever "capture". Next: The Gathering Storm
## Posted at 14:07 in category /agile
[permalink]
[top]
Fri, 07 Mar 2003
Ken Schwaber and I will be hosting a rather odd event at the Agile Development Conference. It's called Analogy Fest. Here's the idea.

Software people often use analogies to other fields as a way of thinking about their work. The most tired one is construction: building software is like building buildings, therefore it makes sense to talk of architects, and architecture, and models that function like blueprints, and careful division of labor, and so forth. Another popular one is that software engineering ought to be like real engineering. Therefore, we should strive for a type of software development that looks to the numerically-oriented sciences for guidance and lauds a "measure twice, cut once" style of work. Those analogies are so pervasive that they often go without question. At Analogy Fest, we're going to leave them unquestioned. But what we're going to do is add more analogies. Each attendee will bring a new analogy, serious or fanciful, for discussion. The analogies will be explored in some detail, but our goal is not to assert that they're true. Our goal is that they serve as a trigger for inspiration, a way for a bunch of bright people to have a focused brainstorm that leads to unexpected results. I hope some people show up! Again, the link is here.
## Posted at 13:58 in category /analogyfest
[permalink]
[top]
Sun, 02 Mar 2003
Pattern Languages of Programs 2003: fear and trembling
I'm the program chair for the 10th PLoP conference. The call for papers and proposals is out today. We've changed things from previous years: much more emphasis on focus groups, stricter standards for paper acceptance, extra assistance for novice authors, and - my pet project - lots of activities in which people learn by doing. I want this to be the "roll up your sleeves" PLoP. I'm nervous that no one will submit proposals for the new categories.
## Posted at 11:01 in category /patterns
[permalink]
[top]
Sat, 01 Mar 2003
Mac OS X tool: Sciral Consistency
Sciral Consistency is a tool for reminding you to do things semi-regularly. For example, I want to check my Junk mailbox every three to nine days to see if my spam filter filtered out anything important. Every one to twenty days, I want to do something that improves my computing environment. Consistency reminds me to do those kinds of things through a display that looks like this:

That display is always visible on my left-most screen (of three). Each row is a task. You can see parts of some of the task names on the left. Each column is a day. Every so often, I'll glance at the display, focusing on today's column (the one with the brown heading). If I see red cells, I know I'm overdue for that task. Green cells are tasks I might do today or put off until later. Yellow cells mark my last chance: tomorrow, the cell will be red. Blue means that it's not time to do the task again yet. The display of the past lets me know how well I'm keeping up with the things I want to do. As you can see, I slipped recently (a long consulting trip and some rush projects just after it). It's really quite simple and surprisingly effective. Well worth $25.
## Posted at 18:34 in category /mac
[permalink]
[top]
Fri, 28 Feb 2003
Does test-driven development lead to inadequate tests?
In the test-driven development group, John Roth is unnecessarily modest:
Here's my response, which I decided to preserve.
## Posted at 14:05 in category /testing
[permalink]
[top]
Wed, 26 Feb 2003
More on command lines and tests
A faithful correspondent notes that my entry on command lines as refactored tests is incoherent. Let me explain the structure of the app. It's named Timeclock. It's a client-server app. There are two user interfaces, the GUI and the command line. Command line commands are just Ruby functions, and the command line is normally used through the Ruby interpreter. Now let's add the tests to the picture: one set sits atop the command line, another atop the server behind it.

The top set tests the command line. Among other things, they check that return values and exceptions from the server show up sensibly at the command line. To make the server's own tests pleasant to write, I need utility methods. However, my observation was that a goodly chunk of the test utility methods would look a lot like the command line commands. The command line commands can already be used in unit tests (namely their own), so why not use them as utility routines when testing the server?
## Posted at 09:34 in category /testing
[permalink]
[top]
Tue, 25 Feb 2003
Command lines as refactored tests
I did another test-first demo, this time in Ralph Johnson's software engineering course. I'm chagrined to say that in a 1.5 hour class, I ended up producing only this code in two test-then-code iterations:

    def move(seconds, from_id, to_id)
      shorten(from_id, seconds)
      shorten(to_id, -seconds)
      [records.record_with_id(from_id), records.record_with_id(to_id)]
    end

We were just starting to get into the swing of things when I looked at the clock and saw I had five minutes to go. Argh! Part of the slowness was due to popping up for meta-discussions about test-first (and, I confess, digressions like the implications of test-first for the choice between statically and dynamically typed languages), and part of it is that I had to explain the application under test. But, as I was writing this note, I found myself chastising myself for not being dutiful enough about refactoring common test code. You see, the test for 'move' needed records to work on, and creating one looked like this:

    first_record = @session.add_record(FinishedRecord.new(first_time, 1.hour, first_job))
    # do the move, check the results.

But to do that, my test had to create (and I had to explain) a Time and a Job to be part of the FinishedRecord. So, I found myself telling myself, I should have long since factored out an easier, more explainable way to create records.

But wait... it's common for me to test new internal methods through the command-line interface. If I'd done that, my test would have looked like this:

    job 'first job'
    job 'second job'
    add_record '12:45 pm', 1.hour, 'first job'
    add_record '1 pm', 2.hours, 'second job'
    # do the move and check the results.
    ...

(Because my command line is Ruby, I can use the same unit testing tools as I do when testing internal objects.) In other words, the command-line 'move' test is almost the easy-to-use-and-explain utility method I berated myself for not having written. It just lives at a different layer, one with a pretty straightforward translation to the internal objects. I deviated from my common practice because I didn't want to have to explain the tradeoffs between testing methods directly and testing them indirectly. I wanted to launch into showing students some tests. Instead, I found myself explaining too much and slowing myself down. I don't want to make any general claims for this technique. But for this application and this person, the command line serves as a handy collection of test utilities. Because the command line is Ruby, I can also "dive beneath the surface" and call methods directly when I need to. I love command lines.
## Posted at 15:34 in category /testing
[permalink]
[top]
Fri, 21 Feb 2003
Generative and elaborative tests
In the test-driven development group, John Arrizza writes:
Bill Wake chimed in: I have the same feeling - I call the first set "generative" as they drive through to generate the design, and the second set "elaborative" as they play out the theme and variations. I think naming these two (of N) types of tests is a useful thing to do, so I intend to talk about generative and elaborative tests from now on.
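To make the distinction concrete, suppose I'm test-driving a little Dollars class (my example, not Bill's). The first test below is generative - writing it is what forces Dollars and its + into existence. The second is elaborative - the design doesn't change, but the theme gets played out:

    require 'test/unit'

    class Dollars
      attr_reader :amount
      def initialize(amount)
        @amount = amount
      end
      def +(other)
        Dollars.new(amount + other.amount)
      end
      def ==(other)
        other.is_a?(Dollars) && amount == other.amount
      end
    end

    class DollarsTest < Test::Unit::TestCase
      # Generative: this test drove the class and its + into existence.
      def test_addition
        assert_equal(Dollars.new(5), Dollars.new(2) + Dollars.new(3))
      end

      # Elaborative: the design already exists; I'm exploring its edges.
      def test_addition_of_negative_amounts
        assert_equal(Dollars.new(-1), Dollars.new(2) + Dollars.new(-3))
      end
    end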
## Posted at 13:37 in category /testing
[permalink]
[top]
Some tools for Windows switchers
I switched from Windows to Mac OS X last summer. It seems I know a lot of people who have switched too (PragDave), will switch as soon as their 17 inch TiBook arrives (Cem Kaner), or won't be able to resist forever (Mike Clark). Back in my Unix days, I used to be quite the tinkerer. I fiddled with my environment to make it just so. When I went to Windows, I stopped doing that. I just submitted. On OS X, I'm back to spending time making myself more efficient. Here are some things I've done that other switchers might like. Let me know what you've done.
## Posted at 08:08 in category /mac
[permalink]
[top]
Thu, 20 Feb 2003
Don't confuse more exact with better
Ron Jeffries has a quote file. I'm in it for "Don't confuse more exact with better". I don't remember the conversation where I said that, but I sure do believe it. Here's an example of exactness that's worse, as recounted (in PDF) by James Bach:
I wasn't the designer James mentions, but I was at the lunch where the topic came up. I'm sorry to say that I got as caught up as the designer in the quest to satisfy - not question - the requirement. Exactness has that magic power. That's one of the things that makes me worry about the generally good tendency in agile projects toward test automation. I know that agility is all about conversation. But people choose what to converse about. I believe that creating tests provokes many useful conversations as you strive to make ideas testable. But I also believe that having tests tends to foreclose future conversations. Wise people - like James - try to break free of the exactness of tests.
## Posted at 13:17 in category /testing
[permalink]
[top]
Wed, 19 Feb 2003
On the test-driven development group, I asked if people had experience switching from a process where programmers implemented all the unit tests after the code to one where they implemented all the unit tests before any of the code. That replaces Big Design Up Front with Big Test Up Front (sort of). I speculated it might be a reasonable intermediate step in organizational change. C. Keith Ray writes the following:
Darach Ennis writes:
## Posted at 16:55 in category /testing
[permalink]
[top]
Fixtures as controllers - credit where it's due
Earlier, I speculated that FIT fixtures might be controllers, as in model-view-controller. I learn that Rob Mee speculated the same speculation in exactly the talk I saw him give. I wish I could say that great minds think alike, but it's actually that my mind forgets where it learned something.
## Posted at 16:16 in category /testing
[permalink]
[top]
Tue, 18 Feb 2003
Is it useful to think of FIT fixtures as controllers?
This morning, while waiting for the alarm to go off, I was thinking about Rob Mee's style of developing test-first with FIT. He starts with a test, puts code to implement it in the FIT fixture, then over time refactors it out into the application proper. (I think I've seen Ron Jeffries and Chet Hendrickson do something similar.)

I've been doing a little Cocoa programming recently. Cocoa, derived from NextStep, uses what it calls a model-view-controller architecture. (It differs from the original Smalltalk-80 MVC, and I know the whole issue of MVC is confused.) In Cocoa, controllers are classes that exist to decouple the views (which handle both user output and input) from the application model. Put another way, they serve to translate from the user-visible domain to the internal domain and back. In a way, it seems that FIT fixtures exist to translate from the user-conceptual domain ("what I want this program to do for me") to the programmer-conceptual domain ("what the guts look like"). I wonder if there's any intellectual leverage to be had by imagining what we'd do differently if we thought of fixtures as controllers? What grabs my attention is that maybe we don't think of fixtures as incidental code, just there to make the tests work, but as a part of the architecture in their own right. Maybe interesting serendipity could arise?
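Here's a rough Ruby sketch of what "fixture as controller" suggests to me, ignoring FIT's real machinery entirely - every name below is invented for the example. The fixture owns the translation between the customer's vocabulary and the model's, just as a Cocoa controller stands between view and model:

    # The model speaks programmer-ese: job names and seconds.
    class Timeclock
      def initialize
        @seconds = Hash.new(0)
      end
      def record(job, seconds)
        @seconds[job] += seconds
      end
      def seconds_for(job)
        @seconds[job]
      end
    end

    # The "fixture" speaks customer-ese ("1:30" means an hour and a half)
    # and, controller-like, translates in both directions.
    class WorkedHoursFixture
      def initialize(model)
        @model = model
      end
      def worked(job, duration)   # duration like "1:30"
        hours, minutes = duration.split(':').collect { |s| s.to_i }
        @model.record(job, hours * 3600 + minutes * 60)
      end
      def total(job)              # translate back into customer-ese
        seconds = @model.seconds_for(job)
        format("%d:%02d", seconds / 3600, (seconds % 3600) / 60)
      end
    end

    fixture = WorkedHoursFixture.new(Timeclock.new)
    fixture.worked('misc', '1:30')
    fixture.worked('misc', '0:45')
    fixture.total('misc')         # => "2:15"

If fixtures really are architecture, then a class like WorkedHoursFixture isn't throwaway test glue; it's the same translation layer a GUI would need.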
## Posted at 07:21 in category /testing
[permalink]
[top]
Mon, 17 Feb 2003
Which OS are You?
## Posted at 12:52 in category /junk
[permalink]
[top]
A test for programming group health
Every programming group should have at least three people who think this feature of Ruby is cool:

    > 1 + 1
    2
    > class Fixnum
    >   def +(other)
    >     self - other
    >   end
    > end
    > 1+1
    0

It's also possible that every group should have at least three people who don't. A healthy group is one in which both types try to respect each other.
## Posted at 06:46 in category /ruby
[permalink]
[top]
Sun, 16 Feb 2003
Test-first up close and personal
One thing I like to do (sometimes for pay, sometimes for fun) is to sit down with people and do some test-first programming. When I do it for pay, I try to have three people (including me) sit down around a terminal for three hours while we add some code test-first to their app. That works pretty well - I see the light go on in about half the people's eyes. I did it for fun last week at Florida Tech. That really drove home, I think, what's right about the above approach:
It would be better to pair than triple, but there isn't usually enough of me to go around. I've found that more than a triple doesn't work - there's not enough physical space near the screen, so one person drifts to the edge of the group and doesn't get involved.
## Posted at 18:45 in category /testing
[permalink]
[top]
Fri, 14 Feb 2003
Demo test-first by starting in the middle
Yesterday, I did a long demo of test-first programming in Cem Kaner's Software Testing 2 course at Florida Tech. I had a great time. I think the students had a somewhat less great time, though still OK. Here's the thing I did that I probably won't do again.

My style of doing test-first programming is a touch odd. Suppose I'm given a task. (In this case, the task was changing the way a time-recording program allows the user to "lie" about when she started, stopped, or paused a task.) I begin by writing one or a couple of tests from the user's perspective. (In this case, I wrote a test that showed a simplified day's worth of tasks. At one point, the hypothetical user started working on a task, but forgot to tell the program, so she later tells it, "Pretend I started working on task X half an hour ago." Then she forgot to stop the clock at the end of the day, so the next day begins by having her tell the program, "Pretend I stopped the day at 5:15 yesterday.") As I write one test, I get ideas for others, which I usually just note in a list. After finishing the test, I'll run it and make some efforts to get it to pass. But I quickly get to a point where I choose to switch gears. (In this case, test failures walked me through the steps of creating the new command, but when I got to the point where the command had to do something, there was no obvious small step to take.)

When I get stuck, I switch gears to more conventional fine-grained programmer tests. (I said, "OK, now it's time to create a new object to manage the desired time. What's a good name for its class?... We really want only one instance. Will client code want to create it in one place and hand it to each relevant command object? Or should command objects fetch a singleton? How about just using class methods?" - all questions posed by writing the first test for the new class.) That all went well, from my point of view. The problem is that it didn't convey well the tight test-code loop that is essential to test-driven design. We spent a lot of time on that first user-level test, and it took us - I think - too long to get to what I bet the students considered the real code. So next time I do such a thing (at Ralph Johnson's software engineering course), I think I'll start after I've already defined my first whack at the user experience (through the acceptance-ish test). I'll start in the middle of the task, rather than at the beginning.
## Posted at 09:42 in category /testing
[permalink]
[top]
Pat McGee and I were talking about whether you can use test-first design to grow systems with good security. In the past, I've had some contact with an alternative approach to security: capability security. I wondered whether it would be easier to grow a secure system on top of this better foundation (hoping that Pat would go off and try it). That led us to a hypothesis: if you find it hard to build something using test-first development, you're fundamentally looking at the wrong problem (or building on the wrong foundations). Anyone have examples that are less speculative? (The common difficulty starting with test-first against a huge monolithic chunk of legacy code? Testing GUI systems without a clean separation between presentation and model?) See this links page for more about capability security.
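For concreteness, here's a toy illustration of the capability idea (my own simplification, not from the links page): authority is carried only by object references, so handing a component this wrapper grants it reading and nothing else.

# A capability, object-style: the reference *is* the permission.
class ReadOnlyFile
  def initialize(path)
    @path = path
  end

  def read
    File.read(@path)
  end

  # No write or delete methods. A holder of this object has exactly
  # this much authority over the file - no ambient way to get more.
end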
## Posted at 09:09 in category /testing
[permalink]
[top]
Wed, 12 Feb 2003Is management more interesting than testing? Today I'm at an editorial planning meeting for STQE magazine. The other technical editor, Esther Derby, and I have an ongoing gentle rivalry. We pretend that she only does squishy, people-oriented management articles, and I only do people-ignoring, technology-focused articles (mainly on testing and tools). We were listing possible article topics and authors on the whiteboard, and I was inspired to say, "You know, the squishy management articles are more interesting than the testing articles." She made me promise to put that on my blog. Here it is. However, I hope what this represents is a failure of imagination - or inspiration - on my part. I moan too much that mainstream testing has gone stagnant, that not much new is happening except in the intersection of agile methods and testing (where a lot is happening). I must prove myself wrong this editorial year.
## Posted at 14:08 in category /testing
[permalink]
[top]
Fri, 07 Feb 2003Simplicity and complexity, programmers and testers Glenn Vanderburg writes a well-deserved paean to Ward Cunningham:
That made me think. Simplicity and the related notion, elegance, are typically considered virtues in software development, as they are in mathematics. They're virtues often honored in the breach, true. But they are nevertheless known virtues. The makers of grotesquely bloated specifications and APIs have at least heard some professor talking about elegance, or they've read Tony Hoare's Turing Award lecture ("There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies."), or they've heard of Ritchie's rule that good OS design requires the discipline, whenever you add a system call, to take another one out. (I'm dating myself terribly with these examples - maybe I should talk about POJOs.) But, while simplicity is part of the culture of programming, it's not part of the culture of testing. In fact, testers seem to revel in complexity. They believe "the devil is in the details" and see part of their job as finding oversimplification. They especially look for faults of omission, which Bob Glass called "code not complicated enough for the problem". Whatever agile testing will be, it will mean bringing those two cultures into closer communication and cooperation. Right now, they operate independently enough that the clash of values can mostly be ignored. Bret's paper, Testers and Developers Think Differently, is probably relevant here.
## Posted at 09:18 in category /agile
[permalink]
[top]
Thu, 06 Feb 2003Personality and process: a story Esther Derby passes along this in response to my riff on process and personality (which you'll find embedded in the story below).
## Posted at 19:37 in category /agile
[permalink]
[top]
How micro should a test-first micro-iteration be? Canonical test-driven design uses very small micro-iterations. You write a test that fails, then you write code that makes it pass, then you maybe refactor, then you repeat. I tried it that way for about 2000 lines of Java code. It didn't stick. The experience has caused me to write fewer tests before writing code than I once did, but not always just one. Sometimes it's one. Sometimes it's more. Why not? Why not do it the right way? It's because I don't believe there is a right way. I wrote the following in an upcoming STQE editorial:
(There are similarities between this position and Kent Beck's oft-quoted "All methodologies are based on fear". But I think more than fear matters.) My approach to test-driven design balances two forces: my desire for interaction and exploration, and my feeling that I'm building a whole thing that I need to grasp in some sort of entirety. I get pleasure from both, so I devise a way to do both. Like most people, I operate mostly on autopilot. But I am often conscious, when having just written a test, of deciding whether to write another test or to switch to code.
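For anyone who hasn't watched the canonical loop up close, a single micro-iteration looks roughly like this (a toy example of my own, not from the editorial):

require 'test/unit'

# Step 1: write a test that fails. (Money doesn't exist the first
# time you run this.)
class MoneyTest < Test::Unit::TestCase
  def test_sums_amounts
    assert_equal(5, Money.new(2).plus(Money.new(3)).amount)
  end
end

# Step 2: write just enough code to make it pass. Step 3 (refactor)
# and step 4 (repeat) follow - or, in my variant, another test or two
# come first.
class Money
  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  def plus(other)
    Money.new(@amount + other.amount)
  end
end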
## Posted at 13:46 in category /agile
[permalink]
[top]
Wed, 05 Feb 2003Ideas for a course on programmer testing I've been contacted by a big company that would like a lot of programmers taught programmer testing. My job would be to design the course and hand it off to an internal trainer. I think they were thinking more of conventional programmer testing (write the code design and test plan, write the code, write the tests). I'm talking to them about test-first programming. Here are my talking points for a phone conference tomorrow. There are two main issues:
I write about the first by way of background, then get concrete when talking about the second.
== Sticky unit testing
There seem to be three main (non-exclusive) ways to make unit testing stick:
Implications for teaching:
=== An efficient and effective course
My experience is that unit-testing training should emphasize practice over concepts. That is, learning to test is more like learning to ride a bike than learning an algorithm for finding square roots. You have to do it, not just hear about it. Within the past year, I've shifted my programmer testing training toward practice. I give small lectures, but spend most of my time sitting with groups of two or three programmers, taking up to half a day to add code test-first to their application. I'd say about half the people "get it". That means they want to try it because they believe it would help, and they believe they could ride the bike - perhaps they'd be a little wobbly, but they're ready to proceed. A 50% success rate seems low, but my goal is to have a core of opinion leaders doing test-first. As they become successful, the practice will spread. I am no doubt prejudiced, but the other 50% are generally not the kind of people who spread new practices anyway. But at =Anonycorp= we want to train a lot of people, and we want to train them fast - no time for understanding to radiate out from a core group of opinion leaders. Here's how I envision a class. It would be three days long. It would be no larger than 15 people (10 would be better). There would be two or three "alumni" returning from previous classes to help out in exercises. (The alumni would be different for each class.) There would be ordinary lectures on two topics:
These would be followed by a test-first demo, where the instructor adds a feature test-first to some code. But the bulk of the class would be exercises in which groups of 2-4 people do realistic work. That work would be of two kinds:
Both types of exercises would be lengthy (hours) and repeated once. After each exercise, there would be a class discussion of what they did. Groups would mix between exercises. The class would end with a discussion of what people expect next. What hurdles will they have to overcome? How do they plan to work on them? We'll be explicit that our goal is not just teaching techniques, but team building. When they have questions about testing after the class, they should feel free to talk to other students. (It's probably best if most of a class is from a single team or product area, so talking will come naturally.) After about a month, there'll be a two-hour "class reunion", facilitated by the instructor, where people can take stock (and the instructor can learn how to improve the course). Moreover, during the class the instructor will have kept an eye out for natural enthusiasts and opinion leaders. After the class, those people will be cultivated. The instructor will wander by to see how things are going - and won't forget to ask whether they'll help out as alumni in later classes.
== Notes
## Posted at 11:05 in category /agile
[permalink]
[top]
Tue, 04 Feb 2003Pragmatic Dave Thomas has a nice little bit on how pilots learn from mistakes: they read accident reports. In our field, we should build on the bug report. I once worked with someone who took great glee in dissecting bugs in his code. Not only would he buy lunch for anyone who found a bug, he'd also write long, loving notes analyzing the bug: what it was, why he made it, and so on. We need more people like him. I note in passing that I edit the "Bug Report" column for STQE Magazine. I'm always looking for authors. Here's an example of a kind of bug I learned about. I call it the "just what I needed (almost)" bug (PDF).
## Posted at 09:31 in category /bugs
[permalink]
[top]
Bret points us to an interesting debate between Alan Cooper and Kent Beck. It's clear they're coming from fundamentally different positions.
They have different worldviews: they disagree, at a fundamental level, about what kind of thing software is. That put me in mind of a position paper I wrote recently, called "Agile Methods, the Emersonian Worldview, and the Dance of Agency", that tries to characterize the agile worldview. At the workshop where I presented it, one of the attendees said that he'd described my paper to his wife, a physician, and she said something like "if he really believes that, my diagnosis would be clinical schizophrenia." I still haven't decided whether to take that as a compliment or an insult...
## Posted at 08:45 in category /agile
[permalink]
[top]
Mon, 03 Feb 2003A successful workshop on test automation Bret Pettichord and others organized a successful workshop on test automation in Austin in late January. I'd characterize it as one of a series of good interactions between the XP folk and the context-driven testing folk. This blog is a direct result of that. There will be more exciting results later.
## Posted at 20:35 in category /agile
[permalink]
[top]
Context-driven testing and agile development I'm a member of a school of testing called "context-driven". Part of what I want to do with this blog is talk through what context-driven testing is, what agile testing might be, and how and whether they intersect. I've never been enormously fond of the context-driven manifesto. It somehow seems to miss the essence, and - worse - it's too easy for anyone to agree to, even people who seem to me entirely not of the school. (Note: I had a hand of some sort in writing that manifesto, so I'm not blaming anyone else for its failure to thrill me.) In a talk I gave at SD East, I listed things about agility that resonate with me. Agility emphasizes:
Then I listed things about context-driven testing that distinguish it from conventional testing:
Pretty close parallels, it seems to me.
## Posted at 20:35 in category /context_driven_testing
[permalink]
[top]