Tezt: fix coverage with [--job-count] through [--bisect-sigterm]
Context
Coverage data for unit tests executed through Tezt in the CI is currently not collected, due to an interaction between Tezt's --job-count and bisect_ppx's instrumentation. In short, Tezt's worker processes are killed with SIGTERM before they get a chance to write the collected coverage data.
Details
With the alcotezt project, unit tests are now linked with and executed through the Tezt main entrypoint (tezt/tests/main.exe).
The linked test libraries are instrumented with bisect_ppx, so coverage data is collected.
However, this does not work when tezt is launched with --job-count, as is done in the CI.
The reason is that bisect_ppx works by adding an at_exit hook to the instrumented binary.
This hook writes collected coverage data when the process terminates.
When --job-count is used with tezt, tezt forks a number of child processes that do the actual test execution.
It is in these processes that coverage data is collected when unit tests are executed (in the case of integration tests, it works slightly differently).
However, once there are no more tests to execute, Tezt kills the child processes by signalling them with SIGTERM.
By default, an OCaml program's at_exit hooks are not executed in this case, because the default action for SIGTERM terminates the process immediately. Consequently, the coverage data is not written.
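The behaviour described above can be reproduced outside of Tezt with a minimal sketch. It assumes a POSIX system and the unix library; the function name child_ran_at_exit and the pipe-based check are hypothetical and only for illustration. This is not bisect_ppx's or Tezt's actual code. A forked child registers an at_exit hook that writes to a pipe, and the parent then sends SIGTERM and checks whether the hook wrote anything.

```ocaml
(* Sketch only: requires the unix library (e.g. ocamlfind ocamlopt -package unix). *)

(* Fork a child that registers an at_exit hook, kill it with SIGTERM, and
   report whether the hook ran. The hook stands in for a coverage writer. *)
let child_ran_at_exit ~install_handler =
  let r, w = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      Unix.close r;
      (* Stand-in for "write collected coverage data". *)
      at_exit (fun () -> ignore (Unix.write_substring w "flushed" 0 7));
      if install_handler then
        (* Turning SIGTERM into a normal exit makes at_exit hooks run. *)
        Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ -> exit 0));
      (* Tell the parent that the hooks are in place. *)
      ignore (Unix.write_substring w "R" 0 1);
      while true do
        Unix.sleepf 0.1
      done;
      exit 0
  | pid ->
      Unix.close w;
      let buf = Bytes.create 16 in
      (* Wait for the readiness byte before signalling the child. *)
      ignore (Unix.read r buf 0 1);
      Unix.kill pid Sys.sigterm;
      ignore (Unix.waitpid [] pid);
      (* A read of 0 bytes means end of file: the hook never wrote anything. *)
      let n = Unix.read r buf 0 16 in
      Unix.close r;
      n > 0

let () =
  let without_handler = child_ran_at_exit ~install_handler:false in
  let with_handler = child_ran_at_exit ~install_handler:true in
  Printf.printf "at_exit ran without handler: %b\n%!" without_handler;
  Printf.printf "at_exit ran with handler: %b\n%!" with_handler
```

On a POSIX system this should report false for the first case and true for the second, which mirrors the coverage loss seen with --job-count.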
Proposed solution
Bisect_ppx contains a feature which installs a sigterm handler in instrumented processes which writes coverage data when signalled. This MR:
- updates manifest so that a target can specify whether this feature should be enabled
- activates this feature on all tezt entrypoints (tezt/tests/main.exe and all the test binaries corresponding to unit tests, e.g. src/lib_crypto_dal/test/main.exe).
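As a rough illustration of the first point, here is a sketch of how a per-target setting could be turned into a dune instrumentation stanza. The type bisect_ppx, its constructors, and the function instrumentation_stanza are hypothetical names chosen for this example; they are not taken from the manifest. The stanza text assumes that dune's instrumentation field accepts backend arguments.

```ocaml
(* Hypothetical per-target setting; not the manifest's real API. *)
type bisect_ppx = No | Yes | With_sigterm

(* Render the dune instrumentation field for a target, if any.
   Assumes dune accepts flags after the backend name. *)
let instrumentation_stanza = function
  | No -> None
  | Yes -> Some "(instrumentation (backend bisect_ppx))"
  | With_sigterm ->
      Some "(instrumentation (backend bisect_ppx --bisect-sigterm))"

let () =
  List.iter
    (fun (target, mode) ->
      match instrumentation_stanza mode with
      | None -> Printf.printf "%s: not instrumented\n" target
      | Some stanza -> Printf.printf "%s: %s\n" target stanza)
    [
      (* Tezt entrypoints get the SIGTERM handler. *)
      ("tezt/tests/main.exe", With_sigterm);
      ("src/lib_crypto_dal/test/main.exe", With_sigterm);
      (* Illustrative placeholder for an ordinary instrumented library. *)
      ("some_library", Yes);
    ]
```

Keeping the choice per target means that only binaries that are actually killed with SIGTERM by Tezt get the handler.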
Discussion
Note that bisect's SIGTERM handler can also be installed by setting the environment variable BISECT_SIGTERM. However, this has the side effect of installing it in all sub-processes launched by Tezt, not just its worker processes. This means that a SIGTERM handler would be installed in each invocation of e.g. octez-client. As this may interfere with existing signal handlers, I've opted for the approach of passing --bisect-sigterm to the instrumentation backend.
Manually testing the MR
Before this MR:
Run the unit tests for lib_clic and report on coverage for the file src/lib_clic/tezos_clic.ml.
$ make coverage-clean && ./scripts/with_coverage.sh dune exec src/lib_clic/test/main.exe && bisect-ppx-report summary _coverage_output/*.coverage --per-file | grep tezos_clic.ml
Running dune exec src/lib_clic/test/main.exe with:
BISECT_FILE=/home/arvid/dev/nomadic-labs/tezos/master/_coverage_output/
DUNE_INSTRUMENT_WITH=bisect_ppx
-------------------------------
Done: 23% (203/847, 644 left) (jobs: 0)[14:52:55.524] [SUCCESS] (1/2) dispatch
[14:52:55.524] [SUCCESS] (2/2) auto-completion-parameters
20.49 % 216/1054 src/lib_clic/tezos_clic.ml
Coverage is 20.49%.
Now do the same thing with -j 2 (-j is short for --job-count):
$ make coverage-clean && ./scripts/with_coverage.sh dune exec src/lib_clic/test/main.exe -- -j 2 && bisect-ppx-report summary _coverage_output/*.coverage --per-file | grep tezos_clic.ml
Running dune exec src/lib_clic/test/main.exe -- -j 2 with:
BISECT_FILE=/home/arvid/dev/nomadic-labs/tezos/master/_coverage_output/
DUNE_INSTRUMENT_WITH=bisect_ppx
-------------------------------
Done: 23% (203/847, 644 left) (jobs: 0)[14:53:05.687] [SUCCESS] (2/2) auto-completion-parameters
[14:53:05.688] [SUCCESS] (1/2) dispatch
1.23 % 13/1054 src/lib_clic/tezos_clic.ml
Notice that coverage is now only 1.23%.
Try the same thing with the patches of this MR applied, and you'll have 20.49% with or without -j 2.
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR