Send me bugs that are caught in end-to-end testing
For some time now, I’ve been skeptical of the ROI of end-to-end automated tests and of the value of automating the kind of business-facing examples that drive development.
I’ve walked the walk. The Critter4Us app in use at the University of Illinois vet school does not have these kinds of tests. On another app, where I’m doing contract programming, I make heavy use of Growing Object-Oriented Software-style tests, but I don’t have any that are larger than unit-sized.
What I’ve discovered with Critter4Us is this: if I do what I consider good TDD, run through end-to-end tests by hand, and follow up with some not-wonderful exploratory testing*, I do not get bugs that escape to production but would have been caught by full end-to-end tests. (* It’s not-wonderful exploratory testing because I’m a not-wonderful exploratory tester.)
I have written some partial end-to-end tests that exercise the route through the server from HTTP Request to HTTP Response* (sketched below). Even those are probably not justifiable if the question you care about is “Do they find bugs manual testing would have found, only faster?” However, I write them for two reasons. First, they make me feel better about the pacing of my work, and my own ease-of-work is important to me. Second, I believe a lot of progress in Agile has come from people wanting something, being so naïve they didn’t realize it was impossible for them to have it, and so changing their context to make it possible. So I’m edging toward writing end-to-end tests as a way to force myself to figure out how to make them cost-effective for me.
(* These are very partial end-to-end tests because most of the code lives in the browser front end.)
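To make “from HTTP Request to HTTP Response” concrete, here’s a minimal sketch of that style of test. It’s illustrative only: Python and Flask’s test client are stand-ins for whatever web stack you use, and the route and data are made up, not Critter4Us code.

```python
# Illustrative sketch: a "partial end-to-end" test that drives the server from
# HTTP Request to HTTP Response, with no browser in the loop. The app, route,
# and data below are hypothetical stand-ins, not Critter4Us code.
from flask import Flask, jsonify, request

app = Flask(__name__)
reservations = []  # stands in for the real persistence layer


@app.route("/reservations", methods=["POST"])
def add_reservation():
    reservations.append(request.get_json())
    return jsonify(id=len(reservations)), 201


def test_posting_a_reservation_returns_its_id():
    client = app.test_client()
    response = client.post(
        "/reservations", json={"animal": "betsy", "date": "2010-06-04"}
    )
    assert response.status_code == 201
    assert response.get_json()["id"] == 1
```

The point is the shape: the test hands the server a request and checks the response it hands back, exercising routing and application code without a browser. Everything in front of the server (the browser front end, in Critter4Us’s case) goes untouched, which is why I call these very partial.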
However, these apps—while “real”—are relatively small, and I do see occasional tweets saying “Having those automated end-to-end tests really saved our butts today!” I’d like to examine some of those bugs in detail so that I can (preferably) discover what kinds of bugs make end-to-end tests worthwhile, and thus what specific kinds of end-to-end tests are worth writing, or (less exciting) figure out what unit-style tests were missing.
So email me if you have a juicy bug. But please be aware that “in detail” likely means NDA-level detail and possibly a fair amount of email back-and-forth. And I will want to describe the bugs and systems (in sanitized form) to a worldwide audience.
June 4th, 2010 at 3:48 pm
I think “how big is your app?” and “how often do you release?”, along with maybe “how often do your features regress?”, are important questions.
At Socialtext, we released software to production for something like 48 out of 52 two-week iterations. We’re feature-bound, not time-bound now, so those numbers cover roughly Feb 2008–Feb 2010.
That means we ran our browser-driving end-to-end test suite at least 52 times; since we ran it once per browser each iteration, it was more like 200 runs.
Each iteration, I’d say we caught between five and fifteen regression bugs with the suite. Keep in mind that a test run takes maybe 10 minutes to set up, and then you have to come back in 3 hours to evaluate the results. Each browser’s evaluation might take 1-5 hours, depending on how many failures you had, how easy they were to reproduce, how much bug investigation you had to do, whether you had to fix any tests, and so on.
So let’s see. That’s about 10 hours of test maintenance per iteration, for maybe ten bugs. How many of those were P1s? How many could we have just waited for customers to tell us about first? What could we have done with those ten hours?
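Spelling out that arithmetic as a quick sketch (the browser count and the midpoint figures below are rough guesses, not exact records):

```python
# Rough per-iteration cost of running and evaluating the end-to-end suite.
# The browser count and midpoint figures are guesses, not exact records.
setup_hours_per_run = 10 / 60      # ~10 minutes to kick off each run
eval_hours_per_browser = 2.5       # midpoint of the 1-5 hour evaluation range
browsers = 4                       # ~200 runs over ~52 iterations suggests ~4

maintenance_hours = browsers * (setup_hours_per_run + eval_hours_per_browser)
bugs_caught = 10                   # midpoint of 5-15 regressions per iteration

print(f"~{maintenance_hours:.0f} hours per iteration")                 # ~11
print(f"~{maintenance_hours / bugs_caught:.1f} hours per bug caught")  # ~1.1
```

Call it roughly an hour of run-and-evaluate time per regression caught, before asking which of those regressions actually mattered.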
PLUS there’s the development cost of the suite.
All in all, I’d say we had net positive returns from our test suite, but nothing like the huge, multi-hundred-percent numbers the vendors would tell you about.