Exploration Through Example

Example-driven development, Agile testing, context-driven testing, Agile programming, Ruby, and other things of interest to Brian Marick

191.8 ⇒ 167.2 ⇒ 186.2 183.6 184.0 183.2 184.6

Sat, 12 Aug 2006

Over the hump

I have finished the review draft of Scripting for Testers. I am going on holiday.

## Posted at 16:29 in category /testing [permalink] [top]

Thu, 15 Jun 2006

Wireframe style for tests

Earlier, I wrote about sentence style tests for rendered pages. Based partly on conversations about that entry with Steve Freeman and partly on bashing against reality, I've changed the style of those tests.

Since they are about the part of the app that the user sees and since I'd like them to be readable by the product director, I found myself asking where they would come from in a business-facing-test-first world and how the product director would therefore think about them. I imagined that, sometime early on, someone makes a sketch, paper prototype, or a wireframe diagram. So I came to think that this test ought to be a textual, automatically-checkable wireframe diagram. Like this:

def test_structure_without_audits_or_visits wireframe_looks_like { page.has_two_columns.and_all_forms_use_the_session_id. and_all_links_use_the_session_id_except_help page.title.has_id(:patient_display_page). and_includes(:current_animal). and_includes(:current_client) page.main_text.has_no_list_named(:visits). has_no_list_named(:audits). has_a_form_with_action(:want_add_visit_form). has_a_form_with_action(:want_add_audit_form). has_a_help_popup_named(:patient_display_page). and_no_other_actions }.given_that { a_user_is_logged_in an_animal_has_been_selected the_animal_has_no_visits animal_treatments_have_never_been_audited there_is_help_for_page(:patient_display_page) } end

One interesting thing is that I put the setup for the test after the checking code. That's because the page layout seems more important.

How well does that test describe this page? (The sidebar is described in tests of its own.)

I'll let you be the judge.

## Posted at 11:45 in category /testing [permalink] [top]

Fri, 09 Jun 2006

Sentence style for tests

Steve Freeman and Nat Pryce will have a paper titled "Evolving an Embedded Domain-Specific Language in Java" at OOPSLA. It's about the evolution of jMock from the first version to the current one, which is something of a domain-specific language for testing. It's a good paper.

I've been doing some work recently on an old renderer-presenter project, and I was inspired by the paper to rip out my old tests of a rendered page and replace them with tests in their style. Here's the result. It first has flexmock sentence descriptions of how the renderer uses the presenter. Then come other sentence descriptions of the important parts of the page structure.

def test_patient_display_page_minimal_structure # No audits, no visits, no "add audit" button certain_calls_never_vary # ... but these do: @presenter.should_receive(:patient_visits).once.and_return([]) @presenter.should_receive(:patient_audits).once.and_return([]) @presenter.should_receive(:possible_audit_button).once. and_return(:missing_audit_button.marker) page.should_have_normal_structure page.title.should_have_id(:patient_display_page). and_should_include(:current_animal). and_should_include(:current_client) page.main_text.should_have_no_list_named(:visits). should_have_no_list_named(:audits). should_have_a_form_with_action(:want_add_visit_form). should_have_no_form_with_action(:want_add_audit_form). and_should_have_a_help_popup_named(:patient_display_page) end

I rather like that, today at least. It's much more understandable than my previous tests. After only a few months, I had to go digging to figure them out, but I doubt I'll have to do that for these. Moreover, I think these tests would be more-or-less directly transcribable from a wireframe diagram or sketch of a page on a whiteboard. They're also, with a little practice, reviewable by the product director.

(I'm still very much up in the air about how much automated testing how close to the GUI we should do, but this has nudged my balance toward more automated tests.)

I also remain fond of workflow tests in this style:

def test_approvals_can_happen_without_logging_in @sanjay_student.logs_in @sanjay_student.adds_a_patient (@dr_dawn.is_sent_an_approval_url) # without logging in... @dr_dawn.goes_to(her.approval_url) @dr_dawn.approves_new_case (@dr_dawn.should_be_on_beginning_page) # no place else to go for now. # can't show further approvals because not logged in. # this will change in some later iteration. end

These workflow tests can be derived from interaction design work as easily as Fit tests are. They're less readable than Fit tests, but not impossibly code-like. These workflow tests are end-to-end. They go through HTTP (using my own browser object, rather than Watir or Selenium), into the renderer/presenter layer, down into the business logic, and through Lafcadio into MySQL.

And, finally, I also am starting to write RubyFIT tests in the style that I've heard Jim Shore call "business-facing unit tests":

Adding a Case

Students can add cases on their own, but they must be approved by the clinician.

When students add a case, clinicians get sent email with a clickable link. When approved, they're taken to the approval page. Their sidebar also shows that same approval URL.

AddCaseFixture

user type allowed?() approval needed?()

clinician yes no

student yes yes

admin no -

I feel reasonably comfortable with the way this project is test-driven from (what would be in reality) business-facing sketches on the whiteboard down to individual lines of code. Well, except for the Javascript. That wasn't test-driven.

## Posted at 13:18 in category /testing [permalink] [top]

Thu, 25 May 2006

Notes toward integration testing (1)

Any time you write code that sits on top of a third party library, your code will hide some of its behavior, reveal some, and transform some. What are the testing and cost implications?

By "cost implications," I mean this: suppose subsystem USER is 1000 lines of code that makes heavy use of library LIB, and NEW is 1000 lines that doesn't (except for the language's class library, VM, and the operating system). I think we all wish that USER and NEW would cost the same (even though USER presumably delivers much more). However, even if we presume LIB is bug free, we have to test the interactions. How much? Enough so that an equal-cost USER would be 1100 lines of unentangled code? 1500? 2000? It is conceivable that the cost to test interactions might exceed the benefit of using LIB, especially since it's unlikely we're making use of all of its features.

More likely, though, we'll under-test. That's especially true because I've never met anyone with a good handle on what we're testing for. Tell me about a piece of fresh code, and I can rattle off things to worry about: boundary conditions, plausible omissions, special values like nil or zero. I'm much worse at that when it comes to integrated code, and I think I'm far from alone.

The result of uncertain testing is a broken promise. Given test-driven design, bug reports should fall into two categories:

Something that was omitted from any of the driving tests. Most of those can be fairly classified as new or changed requirements. They can be estimated and scheduled in the normal way (presuming they're not so simple to fix that you just do it right away). Such are more like new features than what most people mean by "bug," and seeing them shouldn't be cause for surprise or disappointment.
A real bug. Everyone agrees that, given the tests driving the code, this previously untried example should have worked. But it doesn't. That's a surprise and a disappointment.

The TDD promise is that there should be few type 2 real bugs. But if we don't know how to test the integration of LIB and USER, there will be many of what I call fizzbin bugs: ones where the programmer fixing them discovers that, oh!, when you use LIB on Tuesday, you have to use it slightly differently.

Since fizzbin bugs look the same to the product director or user, greater reuse can lead to a product that feels shaky. It seems to me I've seen this effect in projects that make heavy use of complex frameworks that the programmers don't know well. Everyone's testing as best they can, but end-of-iteration use reveals all kinds of annoyances.

I (at least) need a better way to think about this problems. More later, if I think of anything worth writing.

## Posted at 07:54 in category /testing [permalink] [top]

Wed, 22 Feb 2006

Table of contents for the "GUI testing tarpit" series

My "working your way out of the GUI testing tarpit" series really ought to be put into a single paper with the rough transitions smoothed over. Until that happens, if ever, what I've got will have to serve. Here's the table of contents.

Three ways of writing the same test: click-specific procedural, abstracted procedural, and declarative. The first two are usually inefficient solutions to their problems. Life is better if you get rid of as many procedural tests as possible. That's what this series is about.
A declarative test might require a lot of test-specific work behind the scenes. To avoid that, build an engine that can deduce paths through the app.
Trying to convert a whole test suite at once is failure-prone. Therefore, convert the suite one failure at a time.
Capturing abstract UI actions behind the scenes doesn't provide much speedup, but it allows a dandy programming, debugging, and testing tool that lets you get to any page in one step.
If you have your tests avoid the network, you'll discover that many tests boil down into assertions about the structure and contents of a single page. There's no reason those can't be fast, targeted, robust unit tests.
But if most tests are about single pages, how do you prevent changes from introducing dead links? The renderer can check links without clicking on them, at unit-test time.
Not everything can be turned into a fast, network-avoiding unit test. Workflow tests remain GUI tests, but they should clearly focus on workflow and not test things better tested elsewhere. Such tests can be an integral part of the design of application flow.

## Posted at 09:35 in category /testing [permalink] [top]

Mon, 20 Feb 2006

End-to-end tests and the fear of truthiness

The American Dialect Society voted "truthiness" the 2005 word of the year. It "refers to the quality of preferring concepts or facts one wishes to be true, rather than concepts or facts known to be true." For me, 2006 is turning into the year of replacing end-to-end tests with unit tests. One risk to face is that unit tests can play into truthiness. This picture illustrates the problem:

Everything seems fine here. The tests all pass. What the picture doesn't show is that the Widget tests require the strings from the Wadget to be in ascending alphabetical order. The fake Wadget dutifully does that. The Wadget tests don't express that requirement, so the real Wadget isn't coded to satisfy it. The strings come back in any old order.

Truthiness would be wishing that unit tests add up to a working system. But the truth is that those two units would add up to a system like this:

We know that those sorts of mismatches happen in real life. So we should fear unit tests.

More tests are a proper response to fear. Hence the desire to wrap the entire chain above in an end-to-end test that 'sees what the user sees'. However, such tests tend to be slow, fragile, etc. So I want to replace them with smaller tests or other methods that are fast, robust, etc., thus reducing the need for end-to-end tests to a bare minimum.

Two such methods are:

Value Objects can be the authority over data format. Suppose neither the fake nor real Wadget returned an array of strings but rather an InventoryNames object. Before, both the real and fake Wadget were supposed to know about order. Now only one object needs to. The requirement on Wadget turns into two requirements: the first that it use InventoryNames, and the second that InventoryNames always yields names in the right order.
I earlier described one way for any code that generates <form action="want_new_case_form"...> to check at runtime whether there's any code corresponding to want_new_case_form. Because of that, a unit(ish) test that generates a page automatically checks link validity.

I expect there are a host of other tricks to learn (but I'm not at this moment aware of places where they're written down). What's seems to me key is to take the strategy of "something could go wrong somewhere, so here's a kind of test with a chance of stumbling over some wrongness" and replace it with (1) a host of tactics of the form "this could go wrong in places like that, so here's a specific kind of test or coding practice highly likely to prevent such bugs" and (2) a much more limited set of general tests (including especially manual exploratory testing).

P.S. I don't like the word "truthiness." It seems statements should have truthiness, not people. A question for you hepcats out there who are down with the happening slang: which is more copacetic, "that's a truthy statement" or "that's a truthish statement"?

## Posted at 18:11 in category /testing [permalink] [top]

Sat, 18 Feb 2006

Model-Renderer-Presenter: MVP for web apps?

A client and I were talking over how Model-View-Presenter would work for web applications. The sequence diagram to the right (click on it to get a bigger version in a new window) describes a possible interpretation. Since the part that corresponds to a View just converts values into HTML text, I'm going to call it the Renderer instead. The Renderer can be either a template language (Velocity, Plone's ZPT, Sails's Viento) or—my bias—an XML builder like Ruby's Builder.

I did a little Model-Renderer-Presenter spike this week and feel pretty happy with it. I'm wondering who else uses something like what I explain below and what implications it's had for testing. Mail me if you have pointers.

(Prior work: Mike Mason just wrote about MVP on ASP.NET. I understand from Adam Williams that Rails does something similar, albeit using mixins. So far handling the Rails book hasn't caused me to learn it. I may actually have to work through it.)

Here's the communication pattern from the sequence diagram:

After the Action gets some HTTP and does whatever it does to the Model, it creates the appropriate Presenter (there is one for each page) and asks it for the HTML for the page.
The Presenter asks the Renderer for the HTML for the page. The Renderer is the authority for the structure of the page and for any static content (the title, etc.) The Presenter is the authority for any content that depends on the state of the Model.
When the Renderer needs to "fill in a blank", it asks the Presenter. In this case, suppose it's asking for a quantity (like a billing amount).
The Presenter gets that information from the Model.
The Renderer can also ask the Presenter to make a decision. In this case, suppose it asks the Presenter whether it should display an edit button. I decided that the Presenter should either give it back an empty string or the HTML for the button. That works as follows:
First, the Presenter asks the Model for any data it needs to make the decision. Suppose that it decides the button should be displayed. But it's not an authority over what HTML should look like, so it...
... asks the Renderer for that button's HTML. As part of rendering that button, the Renderer needs to fill in the name of the Action the button should invoke. It could just fill in a constant value, but I want the program to check—at the time the page is displayed—whether that constant value actually corresponds to a real action. That way, any bad links will be detected whenever the page is rendered, not when the link is followed. Since there are unit tests that render each page, there will be no need for slow browser or HTTP tests to find bad links. Therefore...
... the Renderer asks the Presenter for the name to fill in.
The Presenter is the dispenser of Action names. Before giving the name to the Renderer the Presenter asks the Action (layer) whether the name is valid. The Action will blow up if not. (It would make as much sense—maybe more—to have the Action be the authority over its name but this happened to be most convenient for the program I started the spike with.)

What good is this? Classical Model-View-Presenter is about making the View a thin holder of whatever controls the windowing system provides. It does little besides route messages from the window system to the Presenter and vice versa. That lets you mock out the View so that Presenter tests don't have to interact with the real controls, which are usually a pain.

There's no call for that in a web app. The Renderer doesn't interact with a windowing framework; it just builds HTML, which is easy to work with. However, the separation does give us four objects (Action, Model, Renderer, and Presenter) that:

can be created through test-driven design,
can be tested independently of each other,
and can be tested in a way that doesn't require many end-to-end tests to give confidence that almost all of the plausible bugs have been avoided.

The second picture gives a hint of the kinds of checks and tests that make sense here. (Click for the larger version. Safari users note that sometimes the JPG renders as garbage for me. A Shift-Reload has always fixed it.)

More later, unless I find that someone else has already described this in detail.

## Posted at 10:31 in category /testing [permalink] [top]

Thu, 16 Feb 2006

Programmers as testers, again

Over at the Agile-Testing list, there's another outbreak of a popular question: are testers needed on Agile projects? To weary oldtimers, that debate is something like the flu: perennial, sneakily different each time it appears so that you can't resolve it once and be done with it, something you just have to live with.

After skimming the latest set of messages on the topic, I returned to editing a magazine article and then I had a thought that might just possibly add something.

Editors are supposed to represent readers (and others), just as testers are supposed to represent users (and others). To an even greater extent than testers, editors do exactly what the the represented people do: they read the article. And yet, you can't take J. Random Reader and expect her to be a good editor. Why not?

It seems to me that as readers we're trained to make allowances for writers. We're so good at tolerating weak reasoning, shaky construction, and muddled language that a given reader will notice only a fraction of the problems in a manuscript. A good editor will notice most of them. How?

Some of it is what "do we need testers?" discussions obsessively circle: perspective. Editors didn't write the manuscript (usually...), so their view of what it says is not as clouded by knowledge of what it should have said. Editors also do not have their ego involved in the product.

But that perspective is shared by any old reader. What makes editors special is, first, technique. I put those techniques into two rough categories:

Model-building techniques. Esther Derby has described cutting a manuscript into pieces and rearranging them on the floor into something like an affinity map. She created a content model she could use to explore structure. In testing, James Bach and Michael Bolton teach many model-building techniques under the umbrella name Rapid Testing. Programmers who learned from them would do better (modulo perspective effects).
Something I don't know how to name. Call them attentiveness techniques. It sometimes happens that I have a niggling feeling of something wrong with an article. It's easy to ignore those. But if you have techniques to explore them, you're more likely to. For example, sometimes I find it useful to use the literary technique of deconstruction to figure out an article's implicit assumptions or contradictions.

It seems to me the exploratory testing world would be the place to look for something that could be adapted to programmers. I'm somewhat disconnected from the leading edge of that world these days, so I don't know how explictly this topic has been taken up. (If you're looking for something to read, Mike Kelly has quite a list of books that have influenced exploratory testers.)

But there's something else that editors and testers have that programmers don't have: leisure. When I'm acting as a pure reader, I intend to get through it and out the other side quickly. As an editor, there's no guilt if I linger. There's guilt if I don't. One problem that Agile projects have is a lack of slack time, down time, bench time. There's velocity to maintain—improve—and the end of the iteration looms. Agile projects are learning projects, true, but the learning is in the context of producing small chunks of business value. There's no leisure for the focus to drift from that. (I'm using "leisure" rather than "permission" because so much of the pressure is self-generated.)

My hunch is that perspective is less important than technique and leisure for producing good products. If the testing and programming roles are to move closer together (which I would like to see), the real wizards of testing technique need to collaborate with programmers to adapt the techniques to a programmer's life. (I tried to do that a few years ago. It was a disaster, cost me two friendships. Someone else's turn.) And projects need some way to introduce leisure. (Gold cards?)

## Posted at 06:56 in category /testing [permalink] [top]

Wed, 15 Feb 2006

Continuous integration and testing conference

A message from Paul Julius and Jeffrey Fredrick about a conference:

Jeffrey Fredrick and Paul Julius are cohosting an event that will focus on [continuous integration and testing]. The event will use Open Spaces to structure conversation, understanding and innovation.

What: Open Space event discussing all aspects of CI and Testing, together
Where: Chicago, IL
When: April 7 & 8, 2006
Who: Everyone interested in CI and Testing
Cost: Free
Info: http://www.citconf.com

We'll be inviting people for all manner of projects and places. In fact, feel free to pass this invitation along to anyone that you think will be interested.

For us to finalize the details of time and place we need to get a feel for how many people are likely to attend. If you are interested in attending please join the CITCON mailing list at:
http://groups.yahoo.com/group/citcon
and post an introductory message. In your message it would be useful if you could indicate any topics of special interest and also how likely you are to attend.

## Posted at 07:53 in category /testing [permalink] [top]

Sat, 28 Jan 2006

Working your way out of the automated GUI testing tarpit (part 7)

part 1, part 2, part 3, part 4 part 5, part 6

Where do we stand?

I've prototyped a strategy for gradual transformation of slow, fragile, hard-to-read tests into fast tests that use no unnecessary words and are therefore less fragile.
Every page has tests of its layout. They may be comprehensive or not, depending on the needs of the project.
Sometimes, parts of pages are generated dynamically according to "presentation business rules". For example, a rule might govern whether a particular button appeears. The page tests can (should) include tests of each of those rules.
The page tests are not mocked up. That is, the page renderer is not invoked with an artificially constructed app state. Instead, the app state is constructed through a series of actions on the app.
Nevertheless, the tests are declarative in the sense that they do not describe how a user of the app would navigate to the page in question. Instead, the test figures navigation out for itself.
As the app grows a presentation layer, the page tests can run at unit test speeds by calling directly into it.
There are no tests that check for dead links by following them. Instead, the output-generator and input-handler cooperate so that any attempt to generate a bad link will lead to an immediate failure. Therefore, dead links can be discovered by rendering all parts of all pages. The page tests do that, so they are also link-checking tests.

I want to end this series by closing one important gap. We know that links go somewhere, but we don't know that they go to the right place, the place where the user can continue her task.

We could test that each link destination is as expected. But if following links is all about doing tasks, good link tests follow links along a path that demonstrates how a user would do her work. They are workflow tests or use-case tests. They are, in fact, the kind of design tests that Jeff Patton and I thought would be a communication tool between user experience designers and programmers. (At this point, you should wonder about hammers and nails.)

Here's a workflow test that shows a doctor entering a new case into the system.

def test_normal_new_patient_workflow
    @dr_dawn.logs_in
                (@dr_dawn.should_be_on_the_main_page)
    @dr_dawn.adds_a_case
                (@dr_dawn.should_be_on_the_case_display_page)
    @dr_dawn.adds_a_visit
              (@dr_dawn.should_be_back_on_the_case_display_page)
end

I've written that with unusual messages, formatted oddly. Why?

Unlike this test, I think my final declarative tests really are unit tests. According to my definition, unit tests are ones written in the language of the implementation rather than the language of the business. My declarative tests are about what appears on a page and when, not about cases and cows and audits. They're unit tests, so I don't mind that they look geeky.

Workflow tests, however, are quintessential business-facing tests: they're all about asserting that the app allows a doctor to perform a key business task. So I'm trying to write them such that, punctuational peculiarities aside, they're sentences someone calling the support desk might speak. I do that not so much because I expect a user to look at them as because I want my perspective while writing them to be outward-focused. That way, I'll stumble across more design omissions. Sending all messages to an object representing a a persona (dr_dawn) also helps me look outward from the code.

Similarly, I'm using layout to emphasize what's most important. That's what the user can do and what, having done that, she can now do next. The actual checks that the action has landed her on the right page are less important—parenthetical—so I place them to the side. (Note also the nod to behavior-driven design.)

The methods that move around (like adds_a_case) talk to the browser in the same way that the earlier abstracted procedural tests do, and the parenthetical comments turn into assertions:

   def should_be_on_the_main_page
      assert_page_title_matches(/Cases Available to Dmorin/)
   end

As you can see, I don't check much about the page. I leave that to the declarative page tests.

That's it. I believe I have a strategy for transforming a tarpit of UI tests into (1) a small number of workflow tests that still go through the UI and (2) a larger number of unit tests of everything else.

Thanks for reading this far (supposing anyone has).

What's missing?

The tests I was transforming didn't do any checking of pure business logic, but in real life they probably would. They could be rewritten in the same way, though I'd prefer to have at least some such tests go below the presentation layer.

There are no browser compatibility tests. If the compatibility testing strategy is to run all the UI tests against different browsers, the transformation I advocate might well weaken it.

There are no tests of the Back button. Should they be part of workflow tests? Specialized? I don't know enough about how a well-behaved program deals with Back to speculate just now. (Hat tip to Seaside here (PDF).)

Can you do all this?

The transformation into unit tests depends on there being one place that receives HTTP requests (Webrick). Since Webrick is initialized in a single place, it's easy to find all the places that need to be changed to add a test-support feature. The same was true on the outgoing side, since there was a single renderer to make XHTML. So this isn't the holy grail—a test improvement strategy that can work with any old product code. Legacy desktop applications that have GUI code scattered everywhere are still going to be a mess.

See the code for complete details.

## Posted at 07:56 in category /testing [permalink] [top]

About Brian Marick

I consult mainly on Agile software development, with a special focus on how testing fits in.

Contact me here: marick@exampler.com.

Syndication

Agile Testing Directions

Introduction
Tests and examples
Technology-facing programmer support
Business-facing team support
Business-facing product critiques
Technology-facing product critiques
Testers on agile projects
Postscript

Permalink to this list

Working your way out of the automated GUI testing tarpit

Permalink to this list

Design-Driven Test-Driven Design

Creating a test
Making it (barely) run
Views and presenters appear
Hooking up the real GUI