Notions of “Completeness”

For the past ten years or so, part of my job as a test manager has been interviewing testers. In that time, I’ve interviewed hundreds of testers using an “audition” format.

An audition is a one-hour simulated project, where the tester applying for a job comes in and demonstrates their exploratory testing ability on a piece of software. As I observe, I take notes on a whiteboard about what I see.

The application they test is an old VB app that takes three numbers, separated by commas, that the user enters into a text field, and tells them what kind of triangle those numbers would form. When the “Check” button is pressed, a read-only output field reports one of five results: Equilateral, Isosceles, Scalene, Invalid!, or Not a Triangle.
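To make the five outputs concrete, here is a minimal Python sketch of a classifier with the same interface. It is not the VB app's code. The function name `classify_triangle` and the rule for when to report "Invalid!" versus "Not a Triangle" are assumptions, since the description above does not say where the original draws that line.

```python
import math


def classify_triangle(text: str) -> str:
    """Classify three comma-separated side lengths (hypothetical sketch)."""
    parts = text.split(",")
    # Assumption: anything other than exactly three fields is "Invalid!".
    if len(parts) != 3:
        return "Invalid!"
    try:
        sides = [float(p) for p in parts]
    except ValueError:
        # Assumption: non-numeric fields such as "5%" are "Invalid!".
        return "Invalid!"
    if not all(math.isfinite(s) for s in sides):
        return "Invalid!"

    x, y, z = sorted(sides)
    # Assumption: numeric input that cannot form a triangle
    # (a non-positive side, or a violated triangle inequality)
    # is reported as "Not a Triangle".
    if x <= 0 or x + y <= z:
        return "Not a Triangle"
    if x == z:
        return "Equilateral"
    if x == y or y == z:
        return "Isosceles"
    return "Scalene"
```

Under these assumptions, an input with percent signs in every field would come back as "Invalid!" rather than raising an error, which is what a tester might reasonably expect from the app.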

It’s a deceptively simple program, and over the years, I have seen and recorded thousands of tests by candidates. Today, I saw a test idea I could have sworn I had seen many times before, but it resulted in an error I know I have never seen.

The error was a run-time error ‘13’ (type mismatch). It was revealed with a simple test: 5%, 17000%, 88.999999999%.

It could be emergent behavior, but I doubt it. Nothing has changed in my program and I suspect nothing new or meaningful is occurring with my machine state. My best guess is that no one has ever run this kind of a test.

I’ve seen people enter all kinds of symbols (like percent signs) and numbers, and combinations thereof, but surprisingly, nothing that combined three types (integers, decimals, and percent signs) in this way.
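One way to see how a combination like this can go unexplored is to enumerate the formats each field could take. The sketch below is a hypothetical generator, not something used in the auditions; the four format names and the sample values are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical ways to render a single number in one field.
FORMATS = {
    "integer": lambda n: str(int(n)),
    "decimal": lambda n: f"{n:.3f}",
    "integer_percent": lambda n: f"{int(n)}%",
    "decimal_percent": lambda n: f"{n:.3f}%",
}


def generate_inputs(values=(5.0, 17000.0, 88.999)):
    """Yield one input string per combination of field formats."""
    for names in product(FORMATS, repeat=len(values)):
        fields = (FORMATS[name](v) for name, v in zip(names, values))
        yield ", ".join(fields)


if __name__ == "__main__":
    inputs = list(generate_inputs())
    print(len(inputs), "combinations, for example:", inputs[-1])
```

Even with only four formats per field, three fields yield 4 × 4 × 4 = 64 input strings, and that is before varying the values themselves, the spacing, or the separators.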

That simultaneously amazes me and makes me shrug. It amazes me because you’d think that someone would have stumbled onto this bug before now, either via a different test or this very same test. But I shrug because I know that not every test (or combination of tests) conceived by humans or computers can be performed, so “completeness” will always be a subjective notion.

So how do we report completeness when there may be a latent (and important) bug hiding despite all of our best efforts to find it?

One of the exercises I use toward the end of the audition is to ask the candidate to tell me how complete their testing is on a scale of 0 to 100 (100 being “completely tested”).

I have done about 400 of these auditions over the years and I have seen a lot of interesting tests, comments, and behaviors from testers. I’ve heard all kinds of answers to the 0-to-100 question.

It’s meant to be a trap. If they give me a number, I can push back with a series of tough questions as to why they gave me that number and not another. Some answers are better than others.

Here are a few I find acceptable enough that I would likely not push back on the candidate too hard:

1) The “Tests Passed” response — “I started with X test ideas in my head and I ran Y. Z of them passed, and assuming ‘passing’ is an indication that I have a little bit of coverage in that area, I answered in terms of that notion.”

2) “Risk List” response — “I have a story for all of the meaningful risks I identified at the start of this audition.”

3) Confirmatory — “The product met X expectations under these conditions. Anything more is superfluous because I treated this like a trade show demo.”

4) Trap Avoidance — “I can’t answer that without more context.”

5) Zero — “There are an infinite number of tests. I have run 57 that I can identify. Fifty-seven divided by infinity is a number that’s so close to zero, I chose to say zero.”

6) 100 (based on time) — “Assuming 100 means ‘100 percent’, I have completed all the testing I had time for.”

7) Terminology Check-in — “What do you mean by 0 or 100? 0 or 100 what? Percent? Test cases? Bugs found?”

8) Good Enough — “In the time you gave me, I found bugs I believe to be of value for the stakeholders. The story of my testing does not have any major problems and would not likely provoke critical questions I can’t answer. The benefits of my work outweigh any problems with it. More testing may cost more than it is worth at this point, so we may have reached an acceptable level of testing. Do you agree?”

9) Pushback — The candidate asks me “Why are you asking the question?” or “Why is it important to distill it in this way?” or “What will be done with this information?”

10) Any number — The candidate gives me a number, can explain to my satisfaction how they came up with it, and defends it well enough to convince me they would not lose credibility or respect with other members of the team.