# Running tests

## Unit-tests

The unit tests are defined using the [googletest] and [rapidcheck] frameworks.

[googletest]: https://google.github.io/googletest/
[rapidcheck]: https://github.com/emil-e/rapidcheck

### Source and header layout

> An example of some files, demonstrating much of what is described below
>
> ```
> …
> ├── src
> │   ├── libexpr
> │   │   ├── …
> │   │   ├── value
> │   │   │   ├── context.cc
> │   │   │   └── context.hh
> │ … …
> ├── tests
> │   …
> │   └── unit
> │       ├── libcmd
> │       │   └── args.cc
> │       ├── libexpr
> │       │   ├── …
> │       │   └── value
> │       │       ├── context.cc
> │       │       └── print.cc
> │       ├── libexpr-support
> │       │   └── tests
> │       │       ├── libexpr.hh
> │       │       └── value
> │       │           ├── context.cc
> │       │           └── context.hh
> │       ├── libstore
> │       │   ├── common-protocol.cc
> │       │   ├── data
> │       │   │   ├── libstore
> │       │   │   │   ├── common-protocol
> │       │   │   │   │   ├── content-address.bin
> │       │   │   │   │   ├── drv-output.bin
> … … … … … …
> ```

The unit tests for each Lix library (`liblixexpr`, `liblixstore`, etc.) live in a directory `tests/unit/${library_shortname}`, named after the library's source directory (`src/${library_shortname}`).

Test data lives in `tests/unit/LIBNAME/data/LIBNAME`, with one subdirectory per library, named the same as the directory where the code goes. For example, `liblixstore` code is in `src/libstore`, and its test data is in `tests/unit/libstore/data/libstore`. The path to the unit test data directory is passed to the unit test executable with the environment variable `_NIX_TEST_UNIT_DATA`.

### Running tests

You can run the whole test suite with `just test` (see the justfile for the exact meson invocation). To run a single test, use `just test --suite installcheck functional-init`, where `installcheck` is the name of the test suite and `functional-init` is the name of the test. To get a list of tests, use `meson test -C build --list` (or `just test --list` for short).
For `installcheck` specifically, first run `just install` before running the test suite (this is due to meson limitations that don't let us declare a dependency on installation before running the tests).

Finer-grained filtering within a test suite is also possible, using the [`--gtest_filter`](https://google.github.io/googletest/advanced.html#running-a-subset-of-the-tests) command-line option to a test suite executable, or the `GTEST_FILTER` environment variable.

### Unit test support libraries

There are headers and code which are used not just to test the library in question, but also downstream libraries. For example, we do [property testing] with the [rapidcheck] library. This requires writing `Arbitrary` "instances", which describe how to generate values of a given type for the sake of running property tests. Because types contain other types, `Arbitrary` "instances" for some type are useful not just for testing that type, but also for any other type that contains it. Downstream types frequently contain upstream types, so it is very important that we share `Arbitrary` instances, so that downstream libraries' property tests can also use them.

[property testing]: https://en.wikipedia.org/wiki/Property_testing

It is important that these testing libraries don't contain any actual tests themselves. On some platforms they would be run as part of every test executable that uses them, which is redundant. On other platforms they wouldn't be run at all.

### Characterization testing

See [below](#characterization-testing-1) for a broader discussion of characterization testing.

Like the functional characterization tests, the unit tests use `_NIX_TEST_ACCEPT=1`. For example:

```shell-session
$ _NIX_TEST_ACCEPT=1 just test --suite check libstore-unit-tests
...
../tests/unit/libstore/common-protocol.cc:27: Skipped
Cannot read golden master because another test is also updating it
../tests/unit/libstore/common-protocol.cc:62: Skipped
Updating golden master
../tests/unit/libstore/common-protocol.cc:27: Skipped
Cannot read golden master because another test is also updating it
../tests/unit/libstore/common-protocol.cc:62: Skipped
Updating golden master
...
```

will regenerate the "golden master" expected result for the `liblixstore` characterization tests. The characterization tests will mark themselves "skipped", since they regenerated the expected result instead of actually testing anything.

## Functional tests

The functional tests reside under the `tests/functional` directory and are listed in `tests/functional/meson.build`. Each test is a bash script.

### Running the whole test suite
FIXME(meson): this section is wrong for meson and commented out accordingly. See "Running Tests" above, and ask the Lix team if you need further clarification.
### Debugging failing functional tests

When a functional test fails, it usually does so somewhere in the middle of the script. To figure out what's wrong, it is convenient to run the test regularly up to the failing `nix` command, and then run that command with a debugger like GDB.

For example, if the script looks like:

```bash
foo
nix blah blub
bar
```

edit it like so:

```diff
 foo
-nix blah blub
+gdb --args nix blah blub
 bar
```
FIXME(meson): the command here is incorrect for meson and this whole functionality may need rebuilding.
Then, running the test with `./mk/debug-test.sh` will drop you into GDB once the script reaches that point:

```shell-session
$ ./mk/debug-test.sh tests/functional/${testName}.sh
...
+ gdb --args nix blah blub
GNU gdb (GDB) 12.1
...
(gdb)
```

One can debug the Nix invocation in all the usual ways. For example, enter `run` to start the Nix invocation.

### Characterization testing

Occasionally, Lix utilizes a technique called [Characterization Testing](https://en.wikipedia.org/wiki/Characterization_test) as part of the functional tests. This technique includes the exact output/behavior of a former version of Nix in a test, in order to check that Lix continues to produce the same behavior going forward.

For example, this technique is used for the language tests, to check both the printed final value if evaluation was successful, and any errors and warnings encountered.

It is frequently useful to regenerate the expected output. To do that, rerun the failed test(s) with `_NIX_TEST_ACCEPT=1`. For example:

```bash
_NIX_TEST_ACCEPT=1 just test --suite installcheck -v functional-lang
```

An interesting situation to document is the case when these tests are "overfitted". The language tests are, again, an example of this. The expected successful output of evaluation is supposed to be highly stable – we do not intend to make breaking changes to (the stable parts of) the Nix language. However, the errors and warnings during evaluation (successful or not) are not stable in this way. We are free to change how they are displayed at any time.

It may be surprising that we would test non-normative behavior like diagnostic outputs. Diagnostic outputs are indeed not a stable interface, but they are still important to users. By recording the expected output, the test suite guards against accidental changes and ensures that the *result* (not just the code that implements it) of the diagnostic code paths is under code review. Regressions are caught, and improvements always show up in code review.
To ensure that characterization testing doesn't make it harder to intentionally change these interfaces, there must always be an easy way to regenerate the expected output, as we do with `_NIX_TEST_ACCEPT=1`.

## Integration tests

The integration tests are defined in the Nix flake under the `hydraJobs.tests` attribute. These tests include everything that needs to interact with external services or run Lix in a non-trivial distributed setup. Because these tests are expensive and require more than what the standard github-actions setup provides, they only run on the master branch.

You can run them manually with `nix build .#hydraJobs.tests.{testName}` or `nix-build -A hydraJobs.tests.{testName}`.
Installer tests section is outdated and commented out, see https://git.lix.systems/lix-project/lix/issues/33