
Move beyond coverage-based testing #530

@MicahGale

I started thinking about this recently:

Why are we finding a lot of bugs in MontePy despite having around 98% code coverage?

This is a broad and complex issue, but while digging into it I came across the concept of "pseudo-tested methods" (the original research targets Java).
The authors provide a tool for finding these methods, but it is only implemented for Java.

The authors also wrote a full article on this topic that I should read at some point (doi: 10.1145/2896941.2896944).

Not having read it yet won't stop me from drawing conclusions from the abstract, though:

Automated tests play an important role in software evolution because they can rapidly detect faults introduced during changes. In practice, code-coverage metrics are often used as criteria to evaluate the effectiveness of test suites with focus on regression faults. However, code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults.

Our goal was to evaluate the validity of code coverage as a measure for test effectiveness. To do so, we conducted an empirical study in which we applied an extreme mutation testing approach to analyze the tests of open-source projects written in Java. We assessed the ratio of pseudo-tested methods (those tested in a way such that faults would not be detected) to all covered methods and judged their impact on the software project. The results show that the ratio of pseudo-tested methods is acceptable for unit tests but not for system tests (that execute large portions of the whole system). Therefore, we conclude that the coverage metric is only a valid effectiveness indicator for unit tests.
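
To make the term concrete, here is a minimal, self-contained illustration (not real MontePy code) of a pseudo-tested function under pytest. The first test covers every line of the function, so coverage looks perfect, but it would still pass if the body were replaced with `return None`:

```python
def make_lattice(size):
    """Hypothetical helper, standing in for any MontePy function."""
    return [[0] * size for _ in range(size)]


def test_make_lattice_runs():
    make_lattice(3)  # 100% covered, but pseudo-tested: no assertion at all


def test_make_lattice_shape():
    lattice = make_lattice(3)  # genuinely tested: a stubbed body would fail
    assert len(lattice) == 3
    assert all(len(row) == 3 for row in lattice)
```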

So some actionable steps for the time being:

  1. Exclude tests/test_integration, etc., from the coverage reports, so the headline number reflects unit tests only (per the paper's conclusion above).
  2. Limit the scope of coverage for specific test packages to specific source code; e.g., tests/test_syntax_parsing should not contribute to the MCNP_Problem coverage (a pytest-cov sketch follows this list).
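
A sketch of what both steps could look like with pytest-cov. All paths and module names here are guesses at the layout, not MontePy's actual structure:

```shell
# Step 1: measure coverage from unit tests only (directory names illustrative)
pytest tests/ --ignore=tests/test_integration --cov=montepy --cov-report=term

# Step 2: scope each test package to the code it is meant to exercise,
# e.g. syntax-parsing tests only count toward the parser's coverage
# (montepy.input_parser is a placeholder module path)
pytest tests/test_syntax_parsing --cov=montepy.input_parser --cov-report=term
```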

Wishlist:

  1. Have an automated tool to detect pseudo-tested functions.
  2. Detect when a function's return value is never actually checked by any test (the sketch below roughly covers both).
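
I'm not aware of a drop-in Python equivalent of the paper's Java tool, but a rough in-process "extreme mutation" harness is easy to sketch. Everything here is hypothetical: the target module, the tests/ path, and the assumption that the suite tolerates pytest.main() being invoked repeatedly in one process. Tests that bound a function via `from x import f` before the patch is applied will not see the stub.

```python
# extreme_mutation.py -- rough sketch, not a production tool.
import inspect

import pytest

import montepy.utilities as target_module  # hypothetical module under analysis


def suite_passes() -> bool:
    """Run the test suite in-process; True means every test passed."""
    return pytest.main(["-q", "tests/"]) == 0


def find_pseudo_tested(module) -> list[str]:
    """Stub out each public function and see whether any test notices."""
    pseudo_tested = []
    for name, func in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_") or func.__module__ != module.__name__:
            continue  # skip private helpers and re-exported imports

        def stub(*args, **kwargs):  # the "extreme mutant": drop the body
            return None

        setattr(module, name, stub)
        try:
            if suite_passes():  # still green => faults here go undetected
                pseudo_tested.append(name)
        finally:
            setattr(module, name, func)  # always restore the original
    return pseudo_tested


if __name__ == "__main__":
    for name in find_pseudo_tested(target_module):
        print(f"pseudo-tested: {target_module.__name__}.{name}")
```

Because the stub returns None, this doubles as a crude check for wishlist item 2: any function whose return value no test ever inspects will show up as pseudo-tested. For conventional, finer-grained mutation testing in Python, existing tools like mutmut or Cosmic Ray may already get most of the way there.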
