Archive for the 'TDD' Category

Three proposed workshops in Europe

I’ll be in Europe definitely between 9 April (Bristol) and 20 April (Kiev) and likely either before or after those dates. I have some ideas for workshops that we might have while I’m there, provided we can scare up someone to do the local arrangements (venue, mainly). I can work on finding participants, though help would be greatly appreciated.

Here are the ideas:

“Shell and Guts” combinations of object-oriented and functional programming

There’s been much talk about a strategy of having functional cores, nuggets, or guts embedded within an object-oriented shell. Here’s a picture from my recent book:

At SCNA, Gary Bernhardt described a different-but-similar approach. (Very neat - incorporating actor-ish communication between shells.)

This workshop would be about exploring the practicalities of such an approach. (Credit to Angela Harms, who proposed a workshop like this to me and Mike Feathers.)

Next steps in unit testing tools

This would bring together tool implementors, power users, and iconoclastic outsiders to make and critique proposals for moving beyond the state-of-the-practice of unit testing tools. Bold proposals might include: use QuickCheck-style generative/random-ish testing instead of or in addition to conventional TDD, or claiming that the next step in TDD is incorporating or replacing the sort of bottom-up tinkering in the interpreter that’s common in (for example) the Lisp tradition.

Community building for language and tool implementors

Although I released my first open-source tool in 1992, I’ve always been lousy at building communities around those tools. Other people have succeeded. This workshop would be a collection of people who’ve succeeded and who’ve failed, sharing and refining tips, tricks, viewpoints, and styles.


Of the various meeting formats I’ve experienced, one of the most productive has been the LAWST format that I learned from Cem Kaner and Brian Lawrence. Some of the LAWST workshops produced concise, group-approved lists of statements that had substantial influence going forward. These three workshops seem like the kind that should produce such statements.

The LAWST format works best if it has a facilitator who has no stake in the topic at hand. (Such facilitators might well follow the style of How to Make Meetings Work or Facilitator’s Guide to Participatory Decision-Making.) I’d hope the local arrangements person could help find such.

If you’re interested in making one of these workshops happen, please contact me.

Outgrowing rhetoric-heavy business-facing tests

Tools like Fit/FitNesse, Cucumber, and my old Omnigraffle tests allow you to write what I call “rhetoric-heavy” business-facing tests. All business-facing tests should be written using nouns and verbs from the language of the business; rhetoric-heavy tests are specialized to situations where people from the business should be able to read (and, much more rarely, write) tests easily. Instead of a sentence like the following, friendly to a programming language interpreter:


… a test can use a line like this:

I have previously specified default payment options.

(The examples are from a description of Kookaburra, which looks to be a nice Gem for organizing the implementation of Cucumber tests.)

You could argue that rhetoric-heavy tests are useful even if no one from the business ever reads them. That argument might go something like this: “Perhaps it’s true that the programmers write the tests based on whiteboard conversations with product owners, and it’s certainly true that transcribing whiteboard examples into executable tests is easier if you go directly to code, but some of the value of testing is that it takes one out of a code-centric mindset to a what-would-the-user-expect mindset. In the latter mindset, you’re more likely to have a Eureka! moment and realize some important example had been overlooked. So even programmers should write Cucumber tests.”

I’m skeptical. I think even Cucumber scenarios or Fit tables are too constraining and formal to encourage Aha! moments the way that a whiteboard and conversation do. Therefore, I’m going to assume for this post that you wouldn’t do rhetoric-heavy tests unless someone from the business actually wants to read them.

Now: when would they most want to read them? In the beginning of the project, I think. That’s when the business representative is most likely to be skeptical of the team’s ability or desire to deliver business value. That’s when the programmers are most likely to misunderstand the business representative (because they haven’t learned the required domain knowledge). And it’s when the business representative is worst at explaining domain knowledge in ways that head off programmer misunderstandings. So that’s when it’d be most useful for the business representative to pore over the tests and say, “Wait! This test is wrong!”

Fine. But what about six months later? In my limited experience, the need to read tests will have declined greatly. (All of the reasons in the previous paragraph will have much less force.) If the business representative is still reading tests, it’s much more likely to be pro forma or a matter of habit: once-necessary scaffolding that hasn’t yet been taken down.

Because process is so hard to remove, I suspect that everyone involved will only tardily (or never) take the step of saying “We don’t need cucumber tests. We don’t need to write the fixturing code that adapts the rhetorically friendly test steps to actual test APIs called by actual test code.” People will keep writing this:

… when they could be writing something like this:

That latter (or something even more unit-test-like) makes tests into actual code that fits more nicely into the programmer’s usual flow.

How often is my suspicion true? How often do people keep doing rhetoric-heavy tests long after the rhetoric has ceased being useful?

TDD Workflow (Sinatra / Haml / jQuery) Part 1


This is a draft. Worth continuing the series? Let me know.

Critter4Us is a webapp used to schedule teaching animals at the University of Illinois Vet School. Its original version was a Ruby/Sinatra application with a Cappuccino front end. Cappuccino lets you write desktop-like webapps using a framework modeled after Apple’s Cocoa. I chose it for two reasons: it made it easy to test front-end code headlessly (which was harder back then than it is now), and it let me reuse my RubyCocoa experience.

Earlier this year, I decided it was time for another bout of experimentation. I decided to switch from Cappuccino to jQuery, Coffeescript, Haml because I thought they were technologies I should know and because they’d force me to develop a new TDD workflow. I’d never gotten comfortable with the testing—or for that matter, any of the design—of “traditional” front-end webapp code and the parts of the backend from the controller layer up.

I now think I’ve reached a plateau at which I’m temporarily comfortable, so this is a good time to report. Other people might find the approach and the tooling useful. And other people might explain places where my inexperience has led me astray.

Top-down design in “functional classic” programming

While waiting for my product owner to get back to me, I was going through open browser tabs. I read this from Swizec Teller:

The problem is one of turning a list of values, say, [A, B, C, D] into a list of pairs with itself, like so [[A,B], [A,C], [A,D], [B, C], [B,D], [C,D]].

He had problems coming up with a good solution. “I can do that!” I said, launched a Clojure REPL, and started typing the whole function out. I quickly got bogged down.

This, I think, is a problem with a common functional approach. Even with this kind of problem—the sort of list manipulation that functional programming lives for—there’s a temptation to build a single function bottom up. “I need to create the tails of the sequence,” I thought, “because I know roughly how I’ll use them.”

For me (and let’s not get into all that again), it usually works better to go top down, mainly because it lets me think of discrete, meaningful functions, give them names, and write facts about how they relate to one another. So that’s what I did.

First, what am I trying to do? Not create all the pairs, but only ones in which an element is combined with another element further down the list. Like this:

As I often do when doing list-manipulation problems, I lay things out to visually emphasize the “shape” of the solution. That helps me see more clearly what has to be done to create that solution. There’s one set of pairs, each headed by the first element of the list, then another set, each headed by the second element. That is, I can say the above fact is true provided two other facts are true:

It’s easy to see how I get the heads—just map down the list (maybe I have to worry about the last element, maybe not—I’ll worry about it if it comes up). What are those heads combined with? Just the successive tails of the original list: [2 3], [3]. I’ll assume there’s a function that does that for me. That gives me this entire fact-about-the-world-of-this-program:

Given all that, downward-pairs is easy enough to write:

It’s important to me that I have reached a stable point here. When building up a complicated function from snippets in a REPL, I more often create the wrong snippets than when I define what those snippets need to do via a function that uses them. (I’m not saying that I never backtrack: I might find a subfunction too hard to write, or I may have carved up the world in a way that leads to gross inefficiency, etc. I’m saying that I seem to end up with correct answers faster this way.)

Now I just have to solve two simpler problems. Tails I’ve done before, and here’s a solution I like:

For those who don’t know Clojure well, this produces this sequence of calls to drop:

This does roughly the same work as an iteration would do because of how laziness is implemented.

That leaves me headed-pairs, which is a pretty straightforward map:

This seems like a reasonable solution, doesn’t strike me as being terribly inefficient (given laziness), has a readability that I like, doubtless took me less time than developing it bottom-up would have, and comes with tests (including an end-to-end test that I won’t bother showing).

The whole solution is here.

UPDATE: Yes, I could have used do or even gotten explicit with the sequence-m monad, but that wouldn’t have addressed the original poster’s point, which I took to be how to think about functional problems.

How mocks can cut down on test maintenance

After around 11 months of not working on it, I needed to make a change to Critter4us, an app I wrote for the University of Illinois vet school. The change was simple. When I tried to push it to Heroku, though, I discovered that my Ruby gems were too out of date. So, I ended up upgrading from Ruby 1.8 to 1.9, to Sinatra 1.3 from a Sinatra less than 1.0, to a modern version of Rack, etc. etc. In essence, I replaced all the turtles upon which my code-world was resting. There were some backwards-compatibility problems.

One incompatibility was that I was sending an incorrectly formatted URI to fetch some JSON data. The old version of Rack accepted it, but the new one rejected it. The easy fix was to split what had been a single `timeslice` parameter up into multiple parameters. [Update: I later did something more sensible, but it doesn’t affect the point of this post.] “Crap!”, I thought. “I’m going to have to convert who knows how much test data.” But I was pleased when I looked at the first test and saw this:

The key point here is that neither the format of the URI parameters nor the resulting timeslice object is given in its real form. Instead, they’re represented by strings that basically name their type. (In my Clojure testing framework, Midje, I refer to these things as “metaconstants“.)

The only responsibility this code has toward timeslices is to pass them to another object. That object, the `internalizer`, has the responsibility for understanding formats. The test (and code) change is trivial:

The test is even (and appropriately) less specific than before. It says only that the GET parameters (a hash) will contain some key/value pairs of use to the internalizer. It’s up to the internalizer to know which those are and do the right thing.

The point here is that the size of the test change is in keeping with the size of the code change. It is unaffected by the nature of the change to the data—which is as it should be.

This application is the one where I finally made the important decision to use mocks heavily in the Freeman and Pryce “London” style and—most importantly—to not fall into the trap of thinking “Mocks are stupid!” when I ran into problems. Instead, I said “I’m stupid!” and, working on that assumption, figured out what I was doing wrong.

I made that decision halfway through writing the app. One of the happy results of the mocking that followed was that a vast amount of test setup code devoted to constructing complete data structures went away. No more “fixtures” or “object mothers” or “test factories.”

Top-down TDD for Clojure

I’ve recorded a 43-minute screencast. It’s a decent example of my current style of TDD for Clojure. It emphasizes:

There are two sources for the video:


Alert watcher Mark Derricutt notices that I got east and west mixed up. I, um, I… I did that deliberately (no, really!) to illustrate an important point about TDD. TDD helps you do what you intended. It is not particularly good at helping you intend the right thing. Indeed, the flow experience it can induce probably makes you worse at noticing you’re proceeding smoothly in the wrong direction.

For example, I had a niggling feeling throughout that I was getting the orientation wrong. That, plus general principle, made me shy away from just copying the answers from the tests into the rotation maps—instead, I recreated them. All for naught: the incorrect mental image I used when I calculated the first left turn had captured my mind.

This getting stuck in a rut is one of the reasons why you still need testing with a fresh mind (either someone else’s or your own, after a rest).

It’s interesting to speculate whether this bug would ever become visible. If, just after finishing the kata, I’d wired up a UI and chosen four ship images, I’d probably again get east and west flipped, meaning all would look well to the user. It’d be like using y as the variable representing horizontal position and x for vertical position: functionally correct if you do it consistently, but death to any sane mind reading the code.

If I chose ship images later, I’d be more likely to get them right (that is, wrong given the bug). I do at times know east from west.

Mailing list for Midje

I’ve created a mailing list for Midje, my Clojure TDD framework.

Combining related expectations into a single one (mocking, midje)

Here’s a simplified example of a Ruby test:

  def test_queuing
    during {
    }.behold! {
      @mailPayload.receives(:attachment_path) { “attachment path” }
      @allContacts.receives(:selectedContact) { “contact” }
                                               :source_path => “attachment path”,
                                               :recipient => “contact”)

I find this annoying. It obscures what the test is about because the “definition” of the attachment path is far away from its use. If you were using your native tongue to describe what’s happening, you would not say:

First, let’s say that there’s a thing called “the attachment path”. You get it by asking the Mail Payload for its attachment path. There’s also a “contact”. It’s the contact selected in the list of All Contacts.

When the “send mail” button is clicked, a notification is generated. Its “source path” is the attachment path and the “recipient” is the contact.

(Well, you might if you were so completely corrupted by years of mathematics training that its definition-theorem-proof style seems natural.) If I were listening to you, I’d prefer to hear something like this:

When the “send mail” button is clicked, a notification is generated. (Key fact comes first.) Its “source path” is the Mail Payload’s attachment path and the “recipient” is the contact currently selected in All Contacts. (Definitions and uses in the same place.)

In code, that’d look more like this:

  def test_queuing
    during {
    }.behold! {
                                               :source_path => @mailPayload.attachment_path,
                                               :recipient => @allContacts.selectedContact)

That emphasizes an important fact: that I don’t really care what the attachment path or selected contact are, because whatever they produce gets stuffed into the recipient. Indeed, the real selected contact isn’t anything like a string. In the first example, I can use “contact” to represent it because the code treats it as any-old opaque object, and a string is easy for me to type. (It’d probably be better to use symbols for this, I just now realize, because they’re less likely to be taken to be an example of the actual object.)

In OO Land, you could make an argument that my annoyance is a sign I’m Doing It Wrong: I’ve given responsibilities to the wrong objects. I think that’s certainly so in some cases, though (probably) not this one.

But the same thing happens in Functional Land, where the responsibility argument is harder to make because dumb objects are more natural. Here’s an example that uses my Midje TDD library for Clojure:

(fact "unless overridden, each procedure can be used with each animal"
  (all-procedures-no-exclusions) => { ...procedure-name... [] }
    (procedures) => [ …procedure… ]
    (procedure-names […procedure… ]) => [ …procedure-name…]))

This is a little terser, but I have the same pattern of two expectation-statements to express one idea — in this case, the idea that the program wants to reach out and grab a list of all the names of all the procedures.

I’m thinking of allowing this kind of fact in Midje:

(fact "unless overridden, each procedure can be used with each animal"
  (all-procedures-no-exclusions) => { ...procedure-name... [] }
    (procedure-names (procedures)) => [ … procedure-name … ]))

You’d get an expectation failure for line 4 if either (1) procedures was never called or (2) procedure-names didn’t use what procedures produced.

Comments? (from residents of either land) Who’s done things like this before? How has it worked out? Gotchas? (Replies might go in the Midje mailing list.)

Release 0.1.1 of Midje, a Clojure mocking tool

The devoted reader will recall that I earlier sketched a “little language” for mocking Clojure functions. I’ve finished implementing a usable–perhaps even pleasant–subset of it. I invite you to take a look. In keeping with my whole “work with ease” thing, I’ve made it dirt simple to try it out: one click and two shell commands. You don’t even have to have Clojure installed.

TDD in Clojure, part 3 (one wafer-thin function; conclusions)

Part 1
Part 2

All that remains is to add the locations of bordering cells to a given sequence of locations. A small wrinkle is that a border location may be next to more than one of the input locations, so duplicates need to be prevented. I could write the test this way:

I don’t like that. When the test fails, it’ll likely be hard to discover how the expected and actual values differ. And I bet that a failure would more likely be due to a typo in the expected value than to an actual bug. The test is just awkward.

I like this version better:

It is a clearer statement of the relationship between two concepts: a location’s neighborhood and the border of a set of locations. More pragmatically, it’s less typing. (When I first started coding in this style, it surprised me how much test setup clutter went away. I had much less need for “factories” or “fixtures” or “object mothers“.)

Given either test, the code is straightforward:


No matter how satisfying the individually-tested pieces, the whole has to work, which is why everything ends by running at least one end-to-end test. A test like the one we started with:

Because I’m making up my test notation as I go, I’ve been running all the tests manually in the REPL. Now I can run the whole file:

If you look carefully, you’ll see that the test would fail because the locations are in the wrong order. But that’s a quick fix:

Done. Ship it!

Development order - Test-down or REPL-up?

It’s often said that the Lisps are bottom-up languages, that you test out expressions in the REPL, discover good functions, and compose them into programs. A lot of people do work that way. A lot of people who use TDD to write object-oriented code also work that way: when implementing a new feature, they start at low-level objects, add whatever new code the top-level feature seems to demand of them, then use those augmented objects in the testing of next-higher-level objects.

For I guess about a year now, I’ve been experimenting with being strictly top-down in some projects. I find that leads to less churn. Too often, when I go bottom-up, I end up discovering that those low-level changes are not in fact what the feature needs, so I have to revisit and redo what I did.

I get less disrupted by rework when I go from the top down (or from user interface in). It’s not that I don’t blunder—we saw one of those in the previous installment—but those blunders seem easier to recover from.

That’s not to say that I don’t use the REPL. We saw that a little bit in this program, when I was writing neighbors. It’s perfectly sensible to do even more in the REPL. I think of the REPL as a handy tool for what XP calls a spike solution and the Pragmatic Programmers call tracer bullets. When I’m uncertain what to do next, the REPL is a tool to let me try out possibilities. So I might be stuck in a certain function and go to the REPL to see how it feels to build up parts of what might lie under it. After I’m more confident, I can continue on with the original function, test-driven, reusing REPL snippets when they seem useful.

I don’t claim everyone should work that way. I do claim it’s a valid style that you’d be wise to try.


I’m something of an obsessive about test notation, and I’ve been endlessly fiddling with a Clojure mock notation. I implemented its first version as a facade on top of clojure.contrib.mock. As I experimented, though, I found that keeping my facade up to date with my notational variants slowed me down too much, so I put the code aside until the notation settled down.

I’m pretty happy with what you’ve seen here. Are you? If so, I may start on another mock package. I’ve got the most important parts: a name and sketch of a logo. (”Midje” and someone flying safely between the sun [of abstraction without examples] and the sea [of overwhelming detail].)

Tests and code together

You can see the completed program here. It mixes up tests and code. I’ve tried that on-and-off over the years and always reverted to separate test files. This time, it’s seemed to work better. I probably want an Emacs keystroke that lets me hide all tests, though. I’d also want alternate definitions of the test macros so that I can compile them out of the production system

What next?

I’ll write a web app in this style, using Compojure.