From 447212fa65a80180150b265411924cc638a2c52c Mon Sep 17 00:00:00 2001
From: sugar
Date: Tue, 20 Aug 2024 00:21:59 +0200
Subject: libexpr: Replace regex engine with boost::regex

This avoids C++'s standard library regexes, which aren't the same
across platforms, and have many other issues, like using stack
so much that they stack overflow when processing a lot of data.

To avoid backwards and forward compatibility issues, regexes
are processed using a function converting libstdc++ regexes into
Boost regexes, escaping characters that Boost needs to have
escaped, and rejecting features that Boost has and libstdc++
doesn't.

Related context:

- Original failed attempt to use `boost::regex` in CppNix, failed due to
  boost icu dependency being large (disabling ICU is no longer necessary
  because linking ICU requires using a different header file,
  `boost/regex/icu.hpp`): https://github.com/NixOS/nix/pull/3826

- An attempt to use PCRE, rejected due to providing less backwards
  compatibility with `std::regex` than `boost::regex`:
  https://github.com/NixOS/nix/pull/7336

- Second attempt to use `boost::regex`, failed due to `}` regex failing
  to compile (dealt with by writing a wrapper that parses a regular
  expression and escapes `}` characters):
  https://github.com/NixOS/nix/pull/7762

Closes #34. Closes #476.
Change-Id: Ieb0eb9e270a93e4c7eed412ba4f9f96cb00a5fa4
---
 doc/manual/change-authors.yml     |  4 ++++
 doc/manual/rl-next/boost-regex.md | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)
 create mode 100644 doc/manual/rl-next/boost-regex.md

(limited to 'doc')

diff --git a/doc/manual/change-authors.yml b/doc/manual/change-authors.yml
index e18abada1..d9303a747 100644
--- a/doc/manual/change-authors.yml
+++ b/doc/manual/change-authors.yml
@@ -129,6 +129,10 @@ roberth:
   display_name: Robert Hensing
   github: roberth
 
+sugar:
+  forgejo: sugar
+  github: sugar700
+
 thufschmitt:
   display_name: Théophane Hufschmitt
   github: thufschmitt
diff --git a/doc/manual/rl-next/boost-regex.md b/doc/manual/rl-next/boost-regex.md
new file mode 100644
index 000000000..c541434d0
--- /dev/null
+++ b/doc/manual/rl-next/boost-regex.md
@@ -0,0 +1,37 @@
+---
+synopsis: Replace regex engine with boost::regex
+issues: [fj#34, fj#476]
+cls: [1821]
+category: Fixes
+credits: [sugar]
+---
+
+Previously, the C++ standard regex expression library was used, the
+behaviour of which varied depending on the platform. This has been
+replaced with the Boost regex library, which works identically across
+platforms.
+
+The visible behaviour of the regex functions doesn't change. While
+the new library has more features, Lix will reject regular expressions
+using them.
+
+This also fixes regex matching reporting stack overflow when matching
+on too much data.
+
+Before:
+
+    nix-repl> builtins.match ".*" (
+      builtins.concatStringsSep "" (
+        builtins.genList (_: "a") 1000000
+      )
+    )
+    error: stack overflow (possible infinite recursion)
+
+After:
+
+    nix-repl> builtins.match ".*" (
+      builtins.concatStringsSep "" (
+        builtins.genList (_: "a") 1000000
+      )
+    )
+    [ ]
-- 
cgit v1.2.3
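
Note on the conversion step: the commit message describes a function that rewrites libstdc++-style patterns for Boost and, among other things, escapes `}` characters. The sketch below illustrates only that one idea and is not the code from this change. The name `escapeStrayCloseBraces` is hypothetical, the bracket-expression handling assumes POSIX extended syntax (backslash is literal inside `[...]`), and interval contents such as `{2,3}` are not validated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the patch): returns a copy of `pattern`
// in which every `}` that does not close a `{...}` interval is escaped.
std::string escapeStrayCloseBraces(const std::string & pattern)
{
    std::string out;
    const std::size_t n = pattern.size();
    bool inInterval = false; // true between `{` and its closing `}`
    std::size_t i = 0;

    while (i < n) {
        const char c = pattern[i];

        if (c == '\\' && i + 1 < n) {
            // Already-escaped character: copy both bytes untouched.
            out += c;
            out += pattern[i + 1];
            i += 2;
        } else if (c == '[') {
            // Bracket expression: copy verbatim up to its closing `]`.
            std::size_t j = i + 1;
            if (j < n && pattern[j] == '^') ++j;
            if (j < n && pattern[j] == ']') ++j; // a leading `]` is a literal
            while (j < n && pattern[j] != ']') {
                const bool nested = pattern[j] == '[' && j + 1 < n
                    && (pattern[j + 1] == ':' || pattern[j + 1] == '.'
                        || pattern[j + 1] == '=');
                if (nested) {
                    // Skip over `[:class:]`, `[.coll.]` or `[=equiv=]`.
                    const char kind = pattern[j + 1];
                    std::size_t k = j + 2;
                    while (k + 1 < n
                           && !(pattern[k] == kind && pattern[k + 1] == ']'))
                        ++k;
                    j = (k + 1 < n) ? k + 2 : n;
                } else {
                    ++j;
                }
            }
            if (j < n) ++j; // include the closing `]`
            out.append(pattern, i, j - i);
            i = j;
        } else if (c == '{') {
            inInterval = true;
            out += c;
            ++i;
        } else if (c == '}') {
            if (inInterval) {
                inInterval = false; // this `}` closes an interval
                out += c;
            } else {
                out += "\\}";       // stray `}`: escape it
            }
            ++i;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}
```

With this sketch, a pattern such as `a{2}}` would be passed on as `a{2}\}`, while `[}]` and `a{2,3}` are left unchanged. Rejecting Boost-only features, which the commit message also mentions, is out of scope here.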