What’s wrong with traditional unit testing?
Unit tests, which verify the behavior of individual building blocks of code, are expensive. They are time-consuming for developers to write, they can be resource-intensive to maintain, and a critical mass of them is needed before they add much value. But skipping them is expensive too: a major bug that slips through to production can cost the business customers, revenue and reputation.
Traditionally, there’s been no way around this expense, so development teams have accepted a trade-off: under pressure to deliver new feature code, developers typically write unit tests only for the business logic they think is most critical. This saves time but risks missing any number of less obvious but still critical test cases.
A changing approach to unit testing, powered by AI, has made it easier than ever for development teams to get more of the tests they need. Unit regression tests are a new category of tests created by Diffblue Cover, a tool developed by the team of world-leading experts in software verification at the University of Oxford spin-out Diffblue.
By describing the historical behavior of your code, these unit regression tests track both intended and unintended changes in that behavior over time. Their strength lies in their numbers and in the speed at which AI creates them: hundreds of times faster than people could write the equivalents. But what are they, how else do they differ from traditional unit tests, and what benefits do they offer?
What are unit regression tests?
Traditional functional regression testing and unit testing aren’t topics that are typically discussed together, but they ultimately aim to achieve the same goal in two different ways, with varying levels of success.
Regression tests (typically automated functional tests) aim to verify that functionality stays consistent from release to release, with investment usually focused on the most important use cases. They typically run late in the pipeline, as part of final verification of the program, and are usually end-to-end functional tests that require a realistic test environment, including dependencies such as databases and APIs. For these reasons, traditional regression tests are typically slow and expensive and, because they treat the program as a black box, ineffective at helping find where unintended behavior was introduced.
Traditional unit tests, on the other hand, run as early as possible in the development cycle and are designed to pinpoint errors in a single module. They have to run quickly so that developers can use them as part of the code-build-test-repeat cycle without losing productivity. A full set of dependencies (such as databases and APIs) isn’t practical for fast tests that run often, so dependencies are mocked: they are stubbed out with code that returns test data, for example simulated records in place of an actual database query.
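To make the idea of stubbing out a dependency concrete, here is a minimal sketch in plain Java. Every name in it (CustomerRepository, GreetingService, StubCustomerRepository, StubDemo) is hypothetical and chosen only for illustration, and the stub is written by hand rather than with a mocking library.

```java
import java.util.Map;
import java.util.Optional;

// Small demo entry point: exercises the unit under test with the stub.
class StubDemo {
    public static void main(String[] args) {
        GreetingService service = new GreetingService(new StubCustomerRepository());
        System.out.println(service.greet(1)); // known customer in the stub
        System.out.println(service.greet(2)); // customer the stub does not know
    }
}

// Hypothetical dependency: in production this would query a real database.
interface CustomerRepository {
    Optional<String> findNameById(int id);
}

// Hypothetical unit under test: it depends only on the interface,
// so a test can hand it a stub instead of a database-backed implementation.
class GreetingService {
    private final CustomerRepository repository;

    GreetingService(CustomerRepository repository) {
        this.repository = repository;
    }

    String greet(int customerId) {
        return repository.findNameById(customerId)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, guest!");
    }
}

// Stub: returns canned test data from memory instead of making a database query.
class StubCustomerRepository implements CustomerRepository {
    private final Map<Integer, String> names = Map.of(1, "Ada");

    @Override
    public Optional<String> findNameById(int id) {
        return Optional.ofNullable(names.get(id));
    }
}
```

Because the service receives its dependency through the constructor, the same class can run against the real repository in production and against the in-memory stub in a fast unit test.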
Upon closer inspection, it becomes clear that one of the main purposes of unit tests is also to find regressions. Once a unit test has been authored and the code committed, the test provides a lasting benchmark against which future commits can be judged. Developers run unit tests periodically to see whether something that previously worked has broken or changed. This only starts to deliver value, however, once a critical mass of code is exercised by a matching critical mass of unit tests. How much of the code the tests exercise is measured as code coverage, expressed as a percentage.
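As a rough illustration of the coverage figure, the sketch below computes it by counting lines, which is only one common way to measure coverage (branches or instructions are others). The Coverage class and the line counts are made up for the example.

```java
// Small demo entry point with made-up numbers.
class CoverageDemo {
    public static void main(String[] args) {
        // 150 of 200 executable lines were run by at least one unit test.
        System.out.println(Coverage.percent(150, 200)); // prints 75.0
    }
}

// Hypothetical helper: coverage as the share of lines exercised by the tests.
final class Coverage {
    private Coverage() {
    }

    static double percent(int coveredLines, int totalLines) {
        if (totalLines <= 0) {
            throw new IllegalArgumentException("totalLines must be positive");
        }
        if (coveredLines < 0 || coveredLines > totalLines) {
            throw new IllegalArgumentException("coveredLines must be between 0 and totalLines");
        }
        return 100.0 * coveredLines / totalLines;
    }
}
```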
A small number of targeted unit tests can prevent some key issues, typically in high-risk business logic that the developer has deemed valuable enough to test. But unit tests provide no protection outside the code they cover, which is why most organizations fail to see regression benefits from their unit tests unless they invest significant human resources in revisiting existing code to add tests.
Unit regression tests that are written and maintained automatically by Diffblue Cover, on the other hand, exist in volume by default. This is why they allow developers to find changes in the behavior of their code quickly and efficiently, even in edge and corner cases. Having a wide array of unit tests reduces risk and the associated cost, so developers can have more confidence that the changes they make won’t break the pipeline.
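The general idea of pinning down existing behavior and then checking later versions against it can be sketched by hand as follows. This is only an illustration of the concept under simple assumptions; it does not show how Diffblue Cover works internally or what its generated tests look like, and the Pricing, BehaviorSnapshot and RegressionDemo names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Small demo entry point: record a baseline, then check a changed version against it.
class RegressionDemo {
    public static void main(String[] args) {
        int[] inputs = {0, 1, 99, 100, 101}; // includes the boundary value 100
        Map<Integer, Integer> baseline = BehaviorSnapshot.record(Pricing::discountV1, inputs);
        // Only the boundary input behaves differently after the change.
        System.out.println(BehaviorSnapshot.changedInputs(baseline, Pricing::discountV2)); // prints [100]
    }
}

// Hypothetical code under test, in two versions.
class Pricing {
    // Original behavior: 10% discount for order totals of 100 or more.
    static int discountV1(int orderTotal) {
        return orderTotal >= 100 ? orderTotal / 10 : 0;
    }

    // Later change with an accidental off-by-one: '>' instead of '>='.
    static int discountV2(int orderTotal) {
        return orderTotal > 100 ? orderTotal / 10 : 0;
    }
}

// Records what the code currently does and reports where a later version differs.
class BehaviorSnapshot {
    static Map<Integer, Integer> record(IntUnaryOperator unit, int... inputs) {
        Map<Integer, Integer> observed = new LinkedHashMap<>();
        for (int input : inputs) {
            observed.put(input, unit.applyAsInt(input));
        }
        return observed;
    }

    static List<Integer> changedInputs(Map<Integer, Integer> baseline, IntUnaryOperator unit) {
        List<Integer> changed = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : baseline.entrySet()) {
            if (unit.applyAsInt(entry.getKey()) != entry.getValue()) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

The check itself does not know whether a difference is a bug or an intended change; it only reports that behavior moved, which is why a broad set of recorded inputs, including boundary values, matters.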
To try Diffblue Cover yourself, sign up for a free trial.