Expanded test suite

* Lang now verifies errors and parse output
* Some new miscellaneous tests
* Easy way to update the tests
* Document workflow in manual
* Use `!` not `~` as separator char for sed

  It is confusing to use `~` when we are talking about paths and home directories!
* Test test suite itself (`test/lang-test/infra.sh`)

Additionally, run shellcheck on `tests/lang.sh` to help ensure it is correct, now that it is more complex.

Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
author: Mathnerd314 <mathnerd314.gph+hs@gmail.com> 2015-09-04 14:23:08 -0600
committer: John Ericson <John.Ericson@Obsidian.Systems> 2023-07-11 21:43:09 -0400
commit: c70484454f52a3bba994ac0c155b4a6f35a5013c (patch)
tree: 024cda6d4e26ffc6396bed4da9e172eeba2001d3 /doc/manual/src/contributing/testing.md
parent: c2c8187118c4bf2961aa175ff8667e1402a767c7 (diff)
1 files changed, 25 insertions, 0 deletions
diff --git a/doc/manual/src/contributing/testing.md b/doc/manual/src/contributing/testing.md
index e5f80a928..a5253997d 100644
--- a/doc/manual/src/contributing/testing.md
+++ b/doc/manual/src/contributing/testing.md
@@ -86,6 +86,31 @@ GNU gdb (GDB) 12.1
 One can debug the Nix invocation in all the usual ways.
 For example, enter `run` to start the Nix invocation.
 
+### Characterization testing
+
+Occasionally, Nix uses a technique called [Characterization Testing](https://en.wikipedia.org/wiki/Characterization_test) as part of the functional tests.
+This technique records the exact output or behavior of an earlier version of Nix in a test, in order to check that Nix continues to produce the same behavior going forward.
+
+For example, this technique is used for the language tests, to check both the printed final value if evaluation was successful, and any errors and warnings encountered.
+
+It is frequently useful to regenerate the expected output.
+To do that, rerun the failed test with `_NIX_TEST_ACCEPT=1`.
+(At least, this is the convention we've used for `tests/lang.sh`.
+If we add more characterization testing we should always strive to be consistent.)
+
+An interesting situation to document is the case when these tests are "overfitted".
+The language tests are, again, an example of this.
+The expected successful output of evaluation is supposed to be highly stable – we do not intend to make breaking changes to (the stable parts of) the Nix language.
+However, the errors and warnings during evaluation (successful or not) are not stable in this way.
+We are free to change how they are displayed at any time.
+
+It may be surprising that we would test non-normative behavior like diagnostic outputs.
+Diagnostic outputs are indeed not a stable interface, but they still are important to users.
+By recording the expected output, the test suite guards against accidental changes and ensures that the *result* of the diagnostic code paths (not just the code that implements them) is under code review.
+Regressions are caught, and improvements always show up in code review.
+
+To ensure that characterization testing doesn't make it harder to intentionally change these interfaces, there must always be an easy way to regenerate the expected output, as we do with `_NIX_TEST_ACCEPT=1`.
+
 ## Integration tests
 
 The integration tests are defined in the Nix flake under the `hydraJobs.tests` attribute.
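
As an illustration of the characterization-testing workflow that this patch documents, here is a minimal sketch in Python. It is not how `tests/lang.sh` is implemented: the function names, the return values, and the file name `eval-okay-example.exp` are hypothetical, and only the `_NIX_TEST_ACCEPT=1` convention is taken from the text above. The idea is that a test compares actual output against a recorded file, and an explicit accept mode rewrites that file so the change shows up in code review.

```python
import os
import tempfile
from pathlib import Path


def check_output(actual: str, expected_file: Path, accept: bool) -> str:
    """Compare `actual` with the recorded expectation.

    Returns "pass", "fail", or "accepted" (the expectation was rewritten).
    """
    if expected_file.exists() and expected_file.read_text() == actual:
        return "pass"
    if accept:
        # Accept mode: record the new behavior so the diff of the
        # expectation file is visible to reviewers.
        expected_file.write_text(actual)
        return "accepted"
    return "fail"


def accept_requested() -> bool:
    # Hypothetical helper mirroring the `_NIX_TEST_ACCEPT=1` convention.
    return os.environ.get("_NIX_TEST_ACCEPT") == "1"


def demo() -> list[str]:
    # Walk through the four interesting cases using a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        expected = Path(tmp) / "eval-okay-example.exp"  # hypothetical name
        expected.write_text("old diagnostic\n")
        return [
            check_output("old diagnostic\n", expected, accept=False),
            check_output("new diagnostic\n", expected, accept=False),
            check_output("new diagnostic\n", expected, accept=True),
            check_output("new diagnostic\n", expected, accept=False),
        ]


if __name__ == "__main__":
    print(demo())
```

In a real harness one would presumably pass `accept=accept_requested()`, so that a failing test is rerun with the environment variable set only when the change in output is intentional.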