Bazel Tests Talk (Ulf Adams)

Bazel Test Model

“From Bazel’s perspective a test is an executable binary that it runs, and it looks at the exit code, and if the exit code is zero then the test passes, and if it is not zero then the test fails.”

Optional features for running tests in Bazel are designed such that Bazel will set an environment variable, and the test runner binary can read that environment variable to do something interesting.
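As a concrete illustration of this pattern, here is a minimal Python sketch: the runner probes for an env variable and degrades gracefully when it is absent. (TEST_TMPDIR, a scratch directory Bazel provides, is used here only as an example variable; it is not one of the features discussed below.)

```python
import os

# The general contract: Bazel sets an environment variable, and a test
# runner that understands it opts in; one that doesn't simply ignores it.
tmp_dir = os.environ.get("TEST_TMPDIR")  # scratch directory Bazel provides
if tmp_dir is None:
    tmp_dir = "/tmp"  # not running under Bazel; fall back gracefully
```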

Flaky Tests

There are two approaches to handling flaky tests. You can either clamp down, file a bug, and insist that flaky tests get fixed urgently, or you can mark tests as flaky and rerun them a few times if they fail.

Nine Bazel Test Features

Premature exit (Doesn’t work?)

It is possible for a test executable to exit prematurely with an exit code of zero, which would look like success to Bazel when it is actually a failure. The solution: at test startup, create a file at the path given by the env variable TEST_PREMATURE_EXIT_FILE, and delete it when the test is done. Bazel should check whether the file still exists when the test executable exits, but right now it does not. This is a bug; the feature doesn’t work.
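A minimal sketch of the protocol in Python; run_all_tests is a hypothetical stand-in for the real test framework's entry point:

```python
import os

def run_all_tests():
    """Hypothetical stand-in for the real test framework's entry point."""
    print("running tests...")

exit_file = os.environ.get("TEST_PREMATURE_EXIT_FILE")
if exit_file:
    open(exit_file, "w").close()  # create the marker before any test runs

run_all_tests()

if exit_file:
    # Only reached if the tests actually ran to completion; on a premature
    # exit the file survives, which Bazel is supposed to treat as a failure.
    os.remove(exit_file)
```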

Unstructured logs

Bazel captures test output, so you can log or print to stdout and stderr and it will show up in the terminal.

Structured split logs ???

Bazel sets an env variable (TEST_LOGSPLITTER_OUTPUT_FILE) to an “absolute path to a private file in a writable directory (used to write Logsplitter protobuffer log)”. Adams was not able to find the protobuf definition for this, so it may not have been open-sourced, but he thinks it was used to write the output from each test to a separate file. This does not currently work.

Structured test.xml files

A test.xml file is a structured representation of the test cases in a test executable and their individual results (error messages, stack traces).
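The talk notes don't name the env variable, but Bazel points tests at the destination via XML_OUTPUT_FILE. A rough sketch of emitting a minimal JUnit-style file (real code would XML-escape names and messages):

```python
import os

def write_test_xml(results):
    """Write a minimal JUnit-style test.xml; results maps test name -> error message or None."""
    path = os.environ.get("XML_OUTPUT_FILE")
    if not path:
        return
    failures = sum(1 for err in results.values() if err)
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<testsuites><testsuite name="my_test" tests="{len(results)}" failures="{failures}">']
    for name, err in results.items():
        if err:
            lines.append(f'<testcase name="{name}"><failure message="{err}"/></testcase>')
        else:
            lines.append(f'<testcase name="{name}"/>')
    lines.append("</testsuite></testsuites>")
    with open(path, "w") as f:
        f.write("\n".join(lines))

write_test_xml({"test_add": None, "test_div": "division by zero"})
```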

Test filters

A regular expression to select a subset of test cases to run within a test executable. Pass --test_filter and Bazel will put the value into the env variable TESTBRIDGE_TEST_FILTER, which the test runner can read to select specific test cases.
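A sketch of how a runner might apply the filter, treating the value as a regular expression as described above:

```python
import os
import re

def select_tests(all_tests):
    """Return the subset of test names matching TESTBRIDGE_TEST_FILTER, or all of them."""
    pattern = os.environ.get("TESTBRIDGE_TEST_FILTER")
    if not pattern:
        return all_tests  # no filter set: run everything
    regex = re.compile(pattern)
    return [name for name in all_tests if regex.search(name)]

print(select_tests(["test_parse", "test_render", "test_parse_errors"]))
```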

Undeclared outputs

Allows a test to generate output files. To use it, write files to the directory indicated by TEST_UNDECLARED_OUTPUTS_DIR.
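For example (report.html is just a made-up output; Bazel collects whatever lands in this directory):

```python
import os

out_dir = os.environ.get("TEST_UNDECLARED_OUTPUTS_DIR")
if out_dir:
    # Files written here are picked up by Bazel after the test finishes.
    with open(os.path.join(out_dir, "report.html"), "w") as f:
        f.write("<html><body>hypothetical test report</body></html>")
```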

Test warnings

Allows a test to output a warning. To use it, write to the file at TEST_WARNINGS_OUTPUT_FILE, but unfortunately the file format is undocumented, and also none of the Bazel UIs currently surface the warnings.
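Since the format is undocumented, the following is only a guess: one plain-text warning per line, appended to the file.

```python
import os

warnings_file = os.environ.get("TEST_WARNINGS_OUTPUT_FILE")
if warnings_file:
    with open(warnings_file, "a") as f:
        # Assumed format: one warning per line (not documented anywhere).
        f.write("test relied on a deprecated fixture\n")
```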

Infrastructure failures

Allows a test to indicate that a failure is due to a lower-level component, not the test itself; for example, the database has not been started, or an expected service is unavailable. To use it, write to the file at TEST_INFRASTRUCTURE_FAILURE_FILE, with one line naming the failed component and another line giving a human-readable error message. Be aware that Bazel posts the file to the BEP but otherwise ignores it, so this may not be useful yet.
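A sketch following the two-line format described above (the component name and message are invented):

```python
import os

infra_file = os.environ.get("TEST_INFRASTRUCTURE_FAILURE_FILE")
if infra_file:
    with open(infra_file, "w") as f:
        f.write("test-database\n")  # first line: name of the failed component
        f.write("could not connect to the test database\n")  # second line: human-readable message
```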

Runfiles

A directory with relevant input files for the currently running test. Use it by putting the desired files in the data attribute of the test. Be aware that Bazel creates a symlink for every runfile for every test, so the runfile tree can become big enough to cause performance issues. Know also that Bazel currently builds the runfile tree before it checks the cache to see whether a test needs to run at all, which slows down even fully cached test runs. Bazel sets TEST_UNUSED_RUNFILES_LOG_FILE, which contains a list of files that went unused in your test run. (It is not clear to me from the talk whether Bazel generates this itself, which would require watching the file system for file accesses, or whether the test runner / framework is expected to generate it somehow.) And yet again, Bazel posts this file to the BEP but otherwise does not use it. One could, however, use this file to prune unneeded runfiles.
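A sketch of reading a runfile directly via TEST_SRCDIR (the workspace name and file path here are hypothetical; in practice a language-specific runfiles library handles the path layout):

```python
import os

# TEST_SRCDIR points at the runfiles tree; paths start with the workspace name.
srcdir = os.environ["TEST_SRCDIR"]
config_path = os.path.join(srcdir, "my_workspace", "testdata", "config.json")  # hypothetical runfile
with open(config_path) as f:
    config = f.read()
```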

Test sharding

A way to run the same test executable multiple times, setting TEST_SHARD_INDEX and TEST_TOTAL_SHARDS to different values each time. The idea is that each shard runs a different subset of the test cases. To implement this, have the test framework deterministically collect all test cases into a list, then select which to run with test_list_index % TEST_TOTAL_SHARDS == TEST_SHARD_INDEX.
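A sketch of the modulo scheme; touching TEST_SHARD_STATUS_FILE (not mentioned in the talk notes, but documented in Bazel's test encyclopedia) tells Bazel that the runner actually supports sharding:

```python
import os

def tests_for_this_shard(all_tests):
    """Deterministically pick this shard's slice of the test list."""
    total = int(os.environ.get("TEST_TOTAL_SHARDS", "1"))
    index = int(os.environ.get("TEST_SHARD_INDEX", "0"))
    status_file = os.environ.get("TEST_SHARD_STATUS_FILE")
    if status_file:
        open(status_file, "w").close()  # signal that sharding is supported
    # Sort first so every shard sees the same ordering.
    return [t for i, t in enumerate(sorted(all_tests)) if i % total == index]

print(tests_for_this_shard(["test_a", "test_b", "test_c"]))
```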

Runs per test + random seeds

Runs the same test executable multiple times, setting TEST_RUN_NUMBER. This lets you run a test multiple times (without caching) to see if it is flaky. Use it by setting --runs_per_test={number} or --runs_per_test={regex}@{number}. This feature also sets TEST_RANDOM_SEED to a random number.
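For example, a runner could seed its RNG from TEST_RANDOM_SEED so that each repetition is randomized but still reproducible:

```python
import os
import random

seed = int(os.environ.get("TEST_RANDOM_SEED", "0"))
random.seed(seed)  # randomized per run, but reproducible given the seed
run_number = os.environ.get("TEST_RUN_NUMBER", "1")  # which repetition this is
print(f"run {run_number} with seed {seed}")
```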

Coverage

Bazel has code coverage support, with a bunch of caveats. Adams recommends explicitly setting an instrumentation filter in your .bazelrc, e.g. build --instrumentation_filter=//java/,//resultstore/.

Flaky test attempts

Retries failing tests “after they fail”. To use it, pass --flaky_test_attempts={number} or --flaky_test_attempts={regex}@{number}.