Interview with Brian Marick on How to do Good Testing: Part 2

by Mark Johnson
The Software QA Quarterly (now Software Testing and Quality Engineering)

Last quarter, in the first part of our interview with Brian Marick, we focused on software test design using test requirements and the evaluation of testing effectiveness using test coverage measurement.

In the final part of our interview we will talk about:

Brian's new book, The Craft of Software Testing
Testing object-oriented software: what is similar and what is different
The development life cycle, increasing developer testing, and how software testing can actually improve time to market
Brian's vision of the future for software testing

Q: Before we talk further about testing, I see that you have a new book out since the first part of this interview was written. Is your title a takeoff on Glenford Myers' book The Art of Software Testing?

Yes, it is partly to say that since the late 1970s testing has moved from an art to a craft, and also an homage to Myers' book, which I think is still the best introductory book on the what and why of software testing. So I called my book The Craft of Software Testing. Some people object to the word "craft," thinking I'm suggesting being sloppy and ad-hoc or somehow unscientific. I'm not. A craft is a disciplined profession where knowledge of the type published in refereed journals isn't yet sufficient.

I wrote this book because I really felt there was a need for one that provided a cookbook, step-by-step process for software testing. Whenever you read a book, then sit down and try to apply it to your problem, there's always that annoying period where you have to fill in the gaps the author left unexplained. Sometimes the gaps are huge; the "obvious details" are often the hardest problem. I wanted to give all the details. Of course, I didn't entirely succeed, and of course the way you really learn a craft is to watch someone else do something, then do it yourself and have them comment on your attempts. So even my book is no replacement for being able to work with and learn from experienced professionals.

You could say the book is a very detailed expansion of the types of things we have covered in this interview.

Q: OK. So far we haven't talked about testing Object-Oriented Software. Do you see it as different than testing other software?

I'd like to preface this by saying that everything we have talked about up to now I have personally used on "big software" and I know it works. My ideas on Object-Oriented software make sense, but I have only tried them on small programs. I believe that they are right, but I'm not as confident as if I had years of experience using them. Think of them as having been inspected but not tested.

There are three issues in testing object-oriented software. First, how do you test a base class? Second, how do you test external code that uses a base class? Finally, how do you deal with inheritance and dynamic binding?

Without inheritance, object-oriented programming is object-based programming: good old abstract data types. As an example of an abstract data type, take a hash table. It has some data structure for storage, like an array, and a set of member functions to operate on the storage. There are member functions to add elements, delete elements, and so on.

Now, the code in those "member functions" is just code, and you need to test it the same way you test other code. For example, a member function calls other functions. Some of them are other member functions, some of them are external to the class. No difference. Whenever code calls a function, you can derive test requirements to check whether the code calls it correctly and whether it handles all possible results.

There is one slightly funny thing in the code: that sort-of-global variable, the array the code maintains. But that's not really any different than an array passed to the code as an argument. You test it in the usual way: arrays are collections, and you find test requirements for them in a test requirement catalog.
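A minimal sketch of testing the class's storage as a collection. The toy HashTable class and the four requirement names are assumptions made for illustration; they are not entries from any published catalog.

```python
class HashTable:
    """Toy hash table: an array of buckets plus member functions."""

    def __init__(self, size=8):
        self._buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self._buckets)

    def add(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # replace an existing entry
                return
        bucket.append((key, value))

    def delete(self, key):
        bucket = self._buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False

    def count(self):
        return sum(len(b) for b in self._buckets)


# Hypothetical catalog entry for "collection": each requirement names a
# situation that at least one test should exercise.
COLLECTION_REQUIREMENTS = [
    "empty",
    "exactly one element",
    "more than one element",
    "duplicate element",
]


def test_against_collection_requirements():
    """Exercise the table once per requirement; return what was covered."""
    covered = set()

    table = HashTable()
    assert table.delete("a") is False      # delete from an empty table
    covered.add("empty")

    table.add("a", 1)
    assert table.delete("a") is True       # delete the only element
    assert table.count() == 0
    covered.add("exactly one element")

    table.add("a", 1)
    table.add("b", 2)
    assert table.delete("a") is True       # delete one of several
    assert table.count() == 1
    covered.add("more than one element")

    table.add("b", 3)                      # add a key that already exists
    assert table.count() == 1
    covered.add("duplicate element")

    return covered
```

Comparing the returned set with COLLECTION_REQUIREMENTS shows whether any cataloged requirement was left without a test.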

That seems too simple: just testing classes like any other code. What about the relationships between the member functions? What about the state machine the class implements? Relationships are largely captured by test requirements for the shared data and called member functions, secondarily by having more complex tests that exercise sequences of member functions. Those sequences are partially driven by the state machine.

Now look what we have: test requirements for the object as a data type, and also for the member functions that control the object. We've used them, along with other requirements, in testing the class code itself. But these requirements are also useful for testing external code that uses the class. If some code uses a hash table, it should be tested against the hash-table-as-collection requirements, against the requirements for the member functions it calls, and perhaps against the hash-table-as-state-machine requirements.

What we really should have is a catalog entry telling anyone using a hash table how to test their code to see if it misuses the hash table. These cataloged test requirements are a form of reusability: reuse of test requirements. I hope that as vendors provide more and more class libraries, they will also provide catalogs of test requirements for their objects. This will be a good thing: it will save you from having to think of all these test requirements yourself, because you can just look them up. The vendor will tell you how to test your code for likely misuses.

When you get into object-oriented software, you also get inheritance. The hope is that you won't have to retest everything that is inherited. Using the hash table example again, say you create a class that inherits 80% of its functions from the original hash table, and adds 20% of its own functions, either by adding new functions or by overriding hash table functions. The hope is that you would only have to test the 20% of added or changed code. Unfortunately, this generally doesn't work. The problem is that if you override member function 'A', it may be used by member function 'B' of hash table, which has not been changed. Because something that 'B' uses has changed, 'B' now needs to be retested, along with 'A'.
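The 'A' and 'B' problem can be sketched in a few lines. Everything here is hypothetical: index_for plays the role of 'A', add plays the role of 'B', and the derived class contains a deliberate bug so the effect is visible.

```python
class HashTable:
    def __init__(self, size=8):
        self._buckets = [[] for _ in range(size)]

    def index_for(self, key):            # member function 'A'
        return sum(ord(c) for c in key) % len(self._buckets)

    def add(self, key, value):           # member function 'B', calls 'A'
        self._buckets[self.index_for(key)].append((key, value))

    def count(self):
        return sum(len(b) for b in self._buckets)


class CaseInsensitiveTable(HashTable):
    def index_for(self, key):            # overrides 'A' only
        # Deliberate bug: the result is not reduced modulo the table size.
        return sum(ord(c) for c in key.lower())


def add_works(table_class):
    """A test written for the unchanged member function 'B' (add)."""
    table = table_class()
    try:
        table.add("Key", 1)
    except IndexError:
        return False
    return table.count() == 1
```

The inherited add was not edited, yet add_works passes for HashTable and fails for CaseInsensitiveTable, which is why the test for 'B' has to be rerun against the derived class.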

You want to minimize the amount of testing you have to do. My approach is this: you have the class catalog for the base hash table class, with its test requirements. And you have the new or changed functions in the derived class. Now you need to analyze the differences between the base class and the derived class and use the differences to populate the derived class's class catalog. I have a long procedure for doing this analysis, which considers all sorts of special cases, so I won't try to go into it here. What you end up with is a parallel inheritance structure. On one side, there is the base class and the derived class. On the other side, there is the base class catalog and the derived class catalog. The derived class catalog inherits some test requirements from the base class catalog. It also contains new test requirements based on the differences between the derived class and the base class. The new test requirements are what are used in testing the derived class by itself. The entire test catalog for the derived class is used when testing external code that uses the derived class.
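One way to picture the parallel inheritance structure is to let the catalogs themselves form a class hierarchy. This is only a sketch under assumptions: the class names and requirement strings are invented, and for simplicity the derived catalog inherits every base requirement, whereas the analysis described above may drop or change some of them.

```python
class HashTableCatalog:
    """Hypothetical test requirements for the base class."""
    requirements = {
        "add": ["key not present", "key already present"],
        "delete": ["key present", "key not present"],
    }

    @classmethod
    def new_requirements(cls):
        """Requirements this catalog adds itself: used when testing
        the class on its own."""
        return dict(vars(cls).get("requirements", {}))

    @classmethod
    def all_requirements(cls):
        """Inherited plus new requirements: used when testing external
        code that uses the class."""
        merged = {}
        for klass in reversed(cls.__mro__):   # base catalogs first
            for name, reqs in vars(klass).get("requirements", {}).items():
                merged.setdefault(name, [])
                merged[name] += [r for r in reqs if r not in merged[name]]
        return merged


class CaseInsensitiveTableCatalog(HashTableCatalog):
    """Requirements derived from the differences in the derived class."""
    requirements = {
        "add": ["keys differing only in case"],
        "normalize": ["mixed-case key"],
    }
```

Here CaseInsensitiveTableCatalog.new_requirements() returns only the two entries written for the derived class, while all_requirements() also includes the inherited "add" and "delete" requirements.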

If your object-oriented design is done well, for example following Bertrand Meyer's book on object-oriented design, so that you have a good inheritance structure, most of your tests from the base class can be used for testing the derived class, and only a small amount of re-testing of the derived class is required. If your inheritance structure is bad, for example if you have inheritance of implementation, where you are grabbing code from the base class instead of having inheritance of specification, then you will have to do a lot of extra testing. So, the price of using inheritance poorly is having to retest all of the inherited code.

Finally there is the issue of dynamic binding. The issue here is that if you have a variable in your code that is a hash table, you don't know if it is the base hash table or the derived hash table. This means you have to test it with both cases. There is yet another kind of test requirements catalog that I use to keep track of this.
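A small sketch of the dynamic binding issue, with hypothetical classes and a hypothetical piece of external code. The same test input is run once for each class the variable might be bound to.

```python
class HashTable:
    def __init__(self):
        self._items = {}          # toy storage, kept simple on purpose

    def normalize(self, key):
        return key

    def add(self, key, value):
        self._items[self.normalize(key)] = value

    def count(self):
        return len(self._items)


class CaseInsensitiveTable(HashTable):
    def normalize(self, key):
        return key.lower()


def count_distinct_words(words, table):
    """External code: it cannot tell which kind of table it was given."""
    for word in words:
        table.add(word, True)
    return table.count()


def run_against_every_binding(words):
    """Run the same input with the base table and with the derived table."""
    return {
        cls.__name__: count_distinct_words(words, cls())
        for cls in (HashTable, CaseInsensitiveTable)
    }
```

For the input ["Apple", "apple", "pear"] the external code returns 3 with the base table and 2 with the derived one, so a test that only ever used the base table would not reveal how the code behaves when it is handed the derived class.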

You can see that object-oriented programming requires considerably more bookkeeping during testing. I don't think there is a way to get around this extra bookkeeping. I should also point out that there are lots of other theories right now on how to test object-oriented programs.

Q: How do you see testing as fitting into the overall development life cycle?

I believe that treating test requirements as separate from actually creating the tests is a powerful idea. This is because the test requirements can be created well before the actual tests. When you have the first deliverable from your project, say a prototype, a user's manual, or a specification of some sort, you can create test requirements from it. At this point you can't implement any tests; in fact, you might not even be able to design the inputs for any tests because you might not have the complete product specification yet. But you can be writing down test requirements, and asking the question, "Will this design handle the test requirements I'm coming up with?" This means that the test requirements can actually give you something to test your specification against.

So, you want to have the system test personnel involved from the beginning, defining the system test requirements. They can give you a lot of feedback on whether the specified system will actually work. And this means the system testers will have time to plan how they will automate the tests for this product. Automation is really essential to getting a good long-term job done.

Then the product is passed on to a group of developers who will design and implement it. The design might be informal or rigorous. As they are doing their designs, they can use the system level test requirements to evaluate their subsystem designs, by asking the question "This is how the system level tests will exercise my subsystem. Will it work?" They can also be creating their own subsystem level test requirements for use in reviews or inspections before they get to coding. And then as they do the coding they can identify still more test requirements for use in subsystem testing.

It may be the developers who do the subsystem level testing using the test requirements they captured during design and coding. Or, it could actually be more cost effective to pass the test requirements identified by the developers to the system testers. This is because the system testers are much better at producing repeatable test suites than are developers. And it is sometimes easier to create a repeatable test using the system level interface than by taking a module of code and testing it in isolation. So, to reduce costs, you still do the test definition at the code level, but you actually implement the tests at the system level. However, this may not be effective for very low level code that is hidden way down in the system and communicates with hardware, for example. You need to see if tests for this low level code can effectively be written at the system level. If not, you will need to test this low level code at the subsystem interface level.

There are a couple more things the developers and system testers should cooperate on. Once the tests are implemented, they should sit down together and review the coverage. Together they will be able to find more test design mistakes than they would individually. The other thing they should do together is to look at customer bug reports. Customer-found bugs do a very good job of telling you where your testing is weak. The only problem with them is that they come a little bit late. But they can be used for improving the testing for the next release.

Q: I have found that some developers don't focus much on testing their work. How would you recommend going about getting developers to do good testing?

There are matters of corporate culture that have a very real effect on something like this. So, a general prescription may work well for some cultures but not work well for all cultures. There are some things that you can do to build a strong developer testing climate in your organization. You need to think about the things that you can accomplish given your organization's culture, and your developer base, and then tailor your approach to that. An error that a lot of people make in quality assurance is that we tend to be evangelical. Like anyone else, we tend to think that our work is the most important part of the effort. So we tend to go to developers and want to convert them immediately into replicas of ourselves. What happens then, of course, is you try to make large and drastic changes. This will then either succeed entirely, or fail entirely. Unfortunately, it is much more likely to fail. So the single most important thing you can do is to realize that you have to make this change gradually. For the next release of your product, you want a bit more developer testing. And over time you want to continue to improve it. Basically, you want to spend a moderate amount of money looking for a moderate improvement, instead of betting everything on one roll of the dice.

The other thing you need to do is to consider the typical personality of developers. Developers tend to be fairly skeptical and critical. They will not take it on faith that what you are talking about is the right thing to do. You have to convince them, and in their own terms.

So the way that I approach improving developer testing is to start them doing things that will seem most appealing to them, but may not be the most valuable overall. For example, I believe that improving test design is the real key to improving testing effectiveness. But it takes time before you can get people to where they want to do test design. Instead I start people out with coverage. Have them do whatever testing they currently do, and give them a coverage tool to measure the results of their tests. They will see a lot of unexercised code, so they will write tests to exercise that code. They will not write the best possible tests to exercise that code. But it is a start. It gives them two things: they get instant gratification because they know right away what their tests are doing. And they get some idea of where to stop, because they have gotten to a nice number. For instance, you could say, "Using the debugger, get 100% branch coverage." With manual testing using the debugger, this is usually not a hard thing to do. This means you have gotten 100% coverage, but with a set of tests that will never be run again.

Next you want to introduce test implementation tools and make these tests repeatable. When you do this, you want to make it possible for the developers to create repeatable tests with very little work. For example, to repeatably test a GUI based product, you would want to get a GUI testing tool. There are two styles of GUI testing. One is the capture/replay method and the other is programmatic, where you write your tests as if they were programs, with commands which cause mouse button presses, movement, etc. Of these two styles, the method of writing test programs has much better long term maintainability. But from the standpoint of getting people eased into creating tests, the capture/replay method provides quick satisfaction. The results from capture/replay will not be as good as they could be, but you will end up with a result, which is the starting point. Over the course of time, you can work on demonstrating the benefits of the programmatic method and get people changed over to it.

Once you have done the up-front work of getting developers started doing more testing, measuring their coverage and creating repeatable tests, other forces will make them want to improve their testing. They will be getting bug reports back from system test and customers and wondering how they could have caught them. They will want to know how they can get higher coverage faster. Now you can start talking about test design as a method to catch more bugs up front, and to create higher coverage with less effort. So you can start teaching test design techniques, tailored to their expectations. They will want something that they can go and apply rapidly, not a long drawn out class on everything they could do. As their test design becomes more sophisticated, you can add more advanced techniques.

Overall there are two key things to support this process. First, you need to be willing to spend money on tools. Software development is incredibly tool poor, compared to hardware engineering. Essential tools are test coverage tools and test automation tools, both test drivers and entire test suite managers. You will also need local gurus who know the tools well to help people use them. The second area is to provide training. I remember being given a copy of The Art of Software Testing and being told to "Go test!" It took me years to figure out how to really do testing. I have heard of a study from HP where they looked at two groups of engineers. One group had been trained in the use of a design tool, and others had been simply given the tool and the manuals. After a year, most of the people who had been trained were still using the tool. Very few of the people who had not been trained were still using the tool. And, of all the people who were continuing to use the tool, the trained people were using most of its capacity, while the untrained people were using a very limited subset of its features.

Q: I have heard you say that testing can improve time to market. What do you mean?

This is back to the discussion of the testing process, where you create your test requirements in advance. If you are creating test requirements early in the development cycle, you will also be finding a lot of defects early in the process. It is much cheaper and faster to find problems in the earlier phases. Fixing requirement or design problems after the code is written is expensive, creates lots of problems, and takes lots of time.

The other thing that can improve time to market is automating testing. For the first release, automating testing will actually slow things down because time will have to go into creating the infrastructure. But, once this is in place, releases 2, 3, 4, etc. will go much faster. There are places where the test cycles are very long because they have a whole bunch of people doing manual testing. If they took the time to automate their tests, they could greatly reduce this, because in the future they could use computers to run the tests instead of people.

Q: Where do you see software testing going?

There are really two types of testing. One is based on the product, testing of the code, basically the kind of testing we do today. I think this type of testing can be made quite straightforward, a fairly rote procedure. It should not be a big deal, just something you need to do, that you know how to do, that takes some time, but you get it done and then you move on to other things.

Then there is the other type of testing that I think is considerably more interesting. It is asking "Is this the right product? Does it do the right thing for the customer?" This is sometimes done at the system level. It is a form of testing the specification against the requirements. Unfortunately, the requirements are often implicit rather than explicit. This is the direction that I want to see testing move.

Q: When you said that testing based on the product is a fairly rote procedure, do you mean it could be made fully automated?

There are certain aspects of creating test requirements that could be done automatically. You could run the source of your program through a tool that would look at structures and techniques used and select test requirements from test catalogs. If your design was captured in a stylized form, it could be run through a tool that would generate certain test requirements. There are some tools that do this sort of thing for both code and designs.

The problem is that these tools cannot create as good a test design as even a moderately trained, moderately interested human being. So you still need to go beyond these tools. Omissions, forgotten special cases, and mistaken assumptions still need to be discovered. An automated tool would do a poor job of finding these because it is basing its test requirements on the code or on the stylized design. So I see a need to continue to have a human being involved. As part of my structured approach to test design, I have a stage where I put the code aside for a few days. Then I come back to it and try to be inspired about what things I might have missed, what new test requirements are needed. The additional test requirements I find will be a minority of the total. But this minority is very important.

The creation of the actual tests is not yet very automatable. It is not very hard to generate a whole bunch of inputs, but creating their matching expected outputs, so you know if the program passed or failed, is generally hard. There are tools that will generate input test data based on a specification-like language, but they do not also generate the expected results. Therefore a tool that creates 10,000 tests is not very useful, if you then have to go through and manually figure out the expected outcome for each one.
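The asymmetry between inputs and expected outputs can be sketched as follows. The function under test, the generator, and the check are all hypothetical. Sorting is chosen only because it happens to have cheap general properties to check; most programs do not, which is the difficulty described above.

```python
import random
from collections import Counter


def sort_under_test(values):
    """Stand-in for the program being tested."""
    return sorted(values)


def generate_inputs(count, seed=0):
    """Generating inputs is the easy part: a few lines give any number."""
    rng = random.Random(seed)
    return [
        [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]
        for _ in range(count)
    ]


def check_without_expected_output(values):
    """No hand-computed expected output is available for a generated
    input, so only general properties of the result can be checked."""
    result = sort_under_test(values)
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_elements = Counter(result) == Counter(values)
    return ordered and same_elements
```

Calling generate_inputs(10000) costs almost nothing, but none of those inputs comes with an expected result; someone still has to supply either the outputs or a check like the one above.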

Q: Where do you hope to spend your time in the future?

Testing is about understanding the sources of human errors and how they manifest themselves. How they manifest themselves in code is quite straightforward. We should get those out of the way. How they manifest themselves in defining a product is the really interesting thing to me. This is where I want to move my investigations of testing.

Brian's book is:

The Craft of Software Testing
ISBN 0-13-177411-5
553 pages, $47
Prentice Hall, 1995
