Automatic Testing - Programmers Are Still The Problem
Written by Alex Armstrong   
Friday, 16 June 2017

White box test generator tools can help you find bugs automatically by generating combinations of inputs that are likely to give the wrong answer. However, there is a problem: how do you know when you have the wrong answer?

A recent study by David Honfi and Zoltan Micskei of Budapest University of Technology and Economics puts automatic testing under the spotlight.

White-box test generator tools rely only on the code under test to select test inputs, and capture the implementation’s output as assertions. If there is a fault in the implementation, it could get encoded in the generated tests. Tool evaluations usually measure fault-detection capability using the number of such fault-encoding tests. However, these faults are only detected, if the developer can recognize that the encoded behavior is faulty.
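To make the point concrete, here is a minimal Python sketch. The study itself used IntelliTest on C# projects, so everything here is an assumption made for illustration: `is_leap_year` is a deliberately buggy function and `generate_test` is a hypothetical stand-in for a generator, not the API of any real tool. It runs the code and records whatever comes back as the expected value, so the fault ends up inside the assertion.

```python
def is_leap_year(year):
    """Buggy on purpose: ignores the 100/400 rule of the Gregorian calendar."""
    return year % 4 == 0


def generate_test(func, arg):
    """Hypothetical stand-in for a white-box test generator.

    It runs the implementation once and captures the observed output as
    the expected value, as if that output were known to be correct.
    """
    observed = func(arg)

    def generated_test():
        assert func(arg) == observed

    generated_test.expected = observed
    return generated_test


# The "generator" picks 1900 as an input and records the wrong answer.
test_1900 = generate_test(is_leap_year, 1900)
test_1900()                # passes, although 1900 was not a leap year
print(test_1900.expected)  # prints True
```

The generated test passes, so the fault only surfaces if a human reads the assertion and realises that `True` is the wrong expectation for 1900.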

They carried out a study using 54 grad students, two open source projects and the IntelliTest tool, which is still better known as Pex. The tool generated test cases and the subjects were asked to identify when the output was indicative of a fault. The results were somewhat surprising.

The results showed that participants incorrectly classified a large number of both fault-encoding and correct tests (with median misclassification rate 33% and 25% respectively). Thus the real fault-detection capability of test generators could be much lower than typically reported, and we suggest to take this human factor into account when evaluating generated white-box tests.

You can see the confusion matrix for the NBitcoin project's tests:

[Figure: confusion matrix for the NBitcoin tests]

FP=False Positive, FN=False Negative, TP=True Positive, TN=True Negative
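As a rough sketch of how such a matrix is tallied, the Python below counts the four outcomes from pairs of (test really encodes a fault, participant flagged it as faulty). Two assumptions are mine rather than the study's: "positive" is taken to mean that the participant flagged the test as fault-encoding, and the rates are simple aggregates rather than the per-participant medians quoted above. The sample answers are made up for illustration.

```python
from collections import Counter


def confusion_counts(records):
    """Count TP, FP, TN, FN from (actually_faulty, flagged_faulty) pairs.

    Assumption for this sketch: "positive" means the participant flagged
    the test as fault-encoding.
    """
    counts = Counter()
    for actually_faulty, flagged_faulty in records:
        if flagged_faulty:
            counts["TP" if actually_faulty else "FP"] += 1
        else:
            counts["FN" if actually_faulty else "TN"] += 1
    return counts


def misclassification_rates(counts):
    """Return (rate for fault-encoding tests, rate for correct tests)."""
    faulty_total = counts["TP"] + counts["FN"]
    correct_total = counts["TN"] + counts["FP"]
    faulty_rate = counts["FN"] / faulty_total if faulty_total else 0.0
    correct_rate = counts["FP"] / correct_total if correct_total else 0.0
    return faulty_rate, correct_rate


# Made-up answers for illustration, not data from the study.
answers = [
    (True, True), (True, False), (True, True),
    (False, False), (False, True), (False, False), (False, False),
]
counts = confusion_counts(answers)
print(dict(counts))
print(misclassification_rates(counts))
```

Under this convention a false negative is the costly case: a fault-encoding test that a participant accepted as correct.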

What this means is that, in practice, white box testing is subject to programmer error in its own right. By examining videos of the subjects trying to interpret the tests, the researchers note that they tended to use debugging methods to further explore the code and clarify the meaning of the test. An "exit" survey also indicates the sorts of difficulties the subjects felt they had with the task:

  • “Deciding whether a test is OK or wrong when it tests an unspecified case, e.g. comparing with null, or equality of null.”

  • “Distinguishing between the variables was difficult, e.g between assetMoney, assetMoney1, assetMoney2.”

  • “Tests should compare less with null and objects with themselves.”

  • “I think that some assertions are useless, and not asserting ’real problems’, just some technical details.”

  • “Generated test cases doesn’t separated into Arrange, Act, Assert and should create more private methods for these concerns.”

  • “Generate comments into tests describing what happening.”
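The survey comments above can be illustrated with a small Python sketch. The `Money` class and both tests are hypothetical and are not taken from the study's projects. The first test imitates the shape the subjects complained about, with numbered variable names, a null check and a self-comparison; the second checks the same call but uses descriptive names and an Arrange, Act, Assert layout.

```python
class Money:
    """Hypothetical value type, loosely inspired by the names quoted above."""

    def __init__(self, amount):
        self.amount = amount

    def __eq__(self, other):
        return isinstance(other, Money) and self.amount == other.amount

    def add(self, other):
        return Money(self.amount + other.amount)


def test_generated_style():
    # Imitation of generated output: numbered names and weak assertions.
    money = Money(5)
    money1 = Money(7)
    money2 = money.add(money1)
    assert money2 is not None
    assert money2 == money2


def test_add_sums_both_amounts():
    # Arrange
    price = Money(5)
    shipping = Money(7)
    # Act
    total = price.add(shipping)
    # Assert
    assert total == Money(12)


test_generated_style()
test_add_sums_both_amounts()
```

Both tests pass, but only the second states an expectation that a reader can check against what `add` is supposed to do.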

The key finding is:

The implication of the results is that the actual fault-finding capabilities of the test generator tools could be much lower than reported in technology-focused experiments.

Of course, it would be better if the programmer could be taken out of the loop, but this would require the white box tester to work out what the result of the test should be, and that would require a lot of AI.

More Information

Classifying the Correctness of Generated White-Box Tests: An Exploratory Study

Related Articles

Code Digger Finds The Values That Break Your Code

Code Hunt - New Coding Game From Microsoft Research

Debugging and the Experimental Method       

Last Updated ( Friday, 16 June 2017 )