aboutsummaryrefslogtreecommitdiff
path: root/src/libexpr/parser.y
AgeCommit message (Collapse)Author
2022-05-06trying debugThrowBen Burdette
2022-04-29Merge remote-tracking branch 'origin/master' into coerce-stringGuillaume Maudoux
2022-04-28Merge branch 'master' into debug-merge-masterBen Burdette
2022-04-25rename SymbolIdx -> Symbol, Symbol -> SymbolStrpennae
after #6218 `Symbol` no longer confers a uniqueness invariant on the string it wraps, it is now possible to create multiple symbols that compare equal but whose string contents have different addresses. this guarantee is now only provided by `SymbolIdx`, leaving `Symbol` only as a string wrapper that knows about the intricacies of how symbols need to be formatted for output. this change renames `SymbolIdx` to `Symbol` to restore the previous semantics of `Symbol` to that name. we also keep the wrapper type and rename it to `SymbolStr` instead of returning plain strings from lookups into the symbol table because symbols are formatted for output in many places. theoretically we do not need `SymbolStr`, only a function that formats a string for output as a symbol, but having to wrap every symbol that appears in a message into eg `formatSymbol()` is error-prone and inconvient.
2022-04-21store Symbols in a table as well, like positionspennae
this slightly increases the amount of memory used for any given symbol, but this increase is more than made up for if the symbol is referenced more than once in the EvalState that holds it. on average every symbol should be referenced at least twice (once to introduce a binding, once to use it), so we expect no increase in memory on average. symbol tables are limited to 2³² entries like position tables, and similar arguments apply to why overflow is not likely: 2³² symbols would require as many string instances (at 24 bytes each) and map entries (at 24 bytes or more each, assuming that the map holds on average at most one item per bucket as the docs say). a full symbol table would require at least 192GB of memory just for symbols, which is well out of reach. (an ofborg eval of nixpks today creates less than a million symbols!)
2022-04-21don't use Symbol in Pos to represent a pathpennae
PosTable deduplicates origin information, so using symbols for paths is no longer necessary. moving away from path Symbols also reduces the usage of symbols for things that are not keys in attribute sets, which will become important in the future when we turn symbols into indices as well.
2022-04-21replace most Pos objects/ptrs with indexes into a position tablepennae
Pos objects are somewhat wasteful as they duplicate the origin file name and input type for each object. on files that produce more than one Pos when parsed this a sizeable waste of memory (one pointer per Pos). the same goes for ptr<Pos> on 64 bit machines: parsing enough source to require 8 bytes to locate a position would need at least 8GB of input and 64GB of expression memory. it's not likely that we'll hit that any time soon, so we can use a uint32_t index to locate positions instead.
2022-04-21remove Symbol::emptypennae
the only use of this function is to determine whether a lambda has a non-set formal, but this use is arguably better served by Symbol::set and using a non-Symbol instead of an empty symbol in the parser when no such formal is present.
2022-04-08minor cleanupBen Burdette
2022-04-08move throw to preverve Error type; turn off debugger for tryEvalBen Burdette
2022-04-07Merge remote-tracking branch 'upstream/master' into upstream-mergeBen Burdette
2022-03-18Refactor to use more traces and less string manipulationsGuillaume Maudoux
2022-03-14more debug_throw coverage of EvalErrorsBen Burdette
2022-03-04Add error context for most basic coercionsGuillaume Maudoux
2022-02-25Remove std::string alias (for real this time)Eelco Dolstra
Also use std::string_view in a few more places.
2022-02-21Remove std::vector aliasEelco Dolstra
2022-02-04Merge branch 'master' into debug-stepBen Burdette
2022-02-02Merge branch 'parser-improvements' of https://github.com/pennae/nixEelco Dolstra
2022-01-27return string_views from forceString*pennae
once a string has been forced we already have dynamic storage allocated for it, so we can easily reuse that storage instead of copying.
2022-01-27convert a for more utilities to string_viewpennae
2022-01-25Fix parsing of variable names that are a suffix of '__curPos'regnat
Follow-up from #5969 Fix #5982
2022-01-24Fix parsing of variable names that are a prefix of '__curPos'Eelco Dolstra
Fixes $ nix-instantiate --parse -E 'x: with x; _' (x: (with x; __curPos))
2022-01-19defer formals duplicate check for incresed efficiency all roundpennae
if we defer the duplicate argument check for lambda formals we can use more efficient data structures for the formals set, and we can get rid of the duplication of formals names to boot. instead of a list of formals we've seen and a set of names we'll keep a vector instead and run a sort+dupcheck step before moving the parsed formals into a newly created lambda. this improves performance on search and rebuild by ~1%, pure parsing gains more (about 4%). this does reorder lambda arguments in the xml output, but the output is still stable. this shouldn't be a problem since argument order is not semantically important anyway. before nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.550 s ± 0.060 s [User: 6.470 s, System: 1.664 s] Range (min … max): 8.435 s … 8.666 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 346.7 ms ± 2.1 ms [User: 312.4 ms, System: 34.2 ms] Range (min … max): 343.8 ms … 353.4 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.720 s ± 0.031 s [User: 2.415 s, System: 0.231 s] Range (min … max): 2.662 s … 2.780 s 20 runs after nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.462 s ± 0.063 s [User: 6.398 s, System: 1.661 s] Range (min … max): 8.339 s … 8.542 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 329.1 ms ± 1.4 ms [User: 296.8 ms, System: 32.3 ms] Range (min … max): 326.1 ms … 330.8 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.687 s ± 0.035 s [User: 2.392 s, System: 0.228 s] Range (min … max): 2.626 s … 2.754 s 20 runs
2022-01-19don't use Symbols for stringspennae
string expressions by and large do not need the benefits a Symbol gives us, instead they pollute the symbol table and cause unnecessary overhead for almost all strings. the one place we can think of that benefits from them (attrpaths with expressions) extracts the benefit in the parser, which we'll have to touch anyway when changing ExprString to hold strings. this gives a sizeable improvement on of 3-5% on all benchmarks we've run. before nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.844 s ± 0.045 s [User: 6.750 s, System: 1.663 s] Range (min … max): 8.758 s … 8.922 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 367.4 ms ± 3.3 ms [User: 332.3 ms, System: 35.2 ms] Range (min … max): 364.0 ms … 375.2 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.810 s ± 0.030 s [User: 2.517 s, System: 0.225 s] Range (min … max): 2.742 s … 2.854 s 20 runs after nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.533 s ± 0.068 s [User: 6.485 s, System: 1.642 s] Range (min … max): 8.404 s … 8.657 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 347.6 ms ± 3.1 ms [User: 313.1 ms, System: 34.5 ms] Range (min … max): 343.3 ms … 354.6 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.709 s ± 0.032 s [User: 2.414 s, System: 0.232 s] Range (min … max): 2.655 s … 2.788 s 20 runs
2022-01-19remove ExprIndStrpennae
it can be replaced with StringToken if we add another bit if information to StringToken, namely whether this string should take part in indentation scanning or not. since all escaping terminates indentation scanning we need to set this bit only for the non-escaped IND_STRING rule. this improves performance by about 1%. before nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.880 s ± 0.048 s [User: 6.809 s, System: 1.643 s] Range (min … max): 8.781 s … 8.993 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 375.0 ms ± 2.2 ms [User: 339.8 ms, System: 35.2 ms] Range (min … max): 371.5 ms … 379.3 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.831 s ± 0.040 s [User: 2.536 s, System: 0.225 s] Range (min … max): 2.769 s … 2.912 s 20 runs after nix search --no-eval-cache --offline ../nixpkgs hello Time (mean ± σ): 8.832 s ± 0.048 s [User: 6.757 s, System: 1.657 s] Range (min … max): 8.743 s … 8.921 s 20 runs nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 367.4 ms ± 3.2 ms [User: 332.7 ms, System: 34.7 ms] Range (min … max): 364.6 ms … 374.6 ms 20 runs nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.810 s ± 0.030 s [User: 2.517 s, System: 0.225 s] Range (min … max): 2.742 s … 2.854 s 20 runs
2022-01-13avoid copies of parser input datapennae
when given a string yacc will copy the entire input to a newly allocated location so that it can add a second terminating NUL byte. since the parser is a very internal thing to EvalState we can ensure that having two terminating NUL bytes is always possible without copying, and have the parser itself merely check that the expected NULs are present. # before Benchmark 1: nix search --offline nixpkgs hello Time (mean ± σ): 572.4 ms ± 2.3 ms [User: 563.4 ms, System: 8.6 ms] Range (min … max): 566.9 ms … 579.1 ms 50 runs Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 381.7 ms ± 1.0 ms [User: 348.3 ms, System: 33.1 ms] Range (min … max): 380.2 ms … 387.7 ms 50 runs Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.936 s ± 0.005 s [User: 2.715 s, System: 0.221 s] Range (min … max): 2.923 s … 2.946 s 50 runs # after Benchmark 1: nix search --offline nixpkgs hello Time (mean ± σ): 571.7 ms ± 2.4 ms [User: 563.3 ms, System: 8.0 ms] Range (min … max): 566.7 ms … 579.7 ms 50 runs Benchmark 2: nix eval -f ../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix Time (mean ± σ): 376.6 ms ± 1.0 ms [User: 345.8 ms, System: 30.5 ms] Range (min … max): 374.5 ms … 379.1 ms 50 runs Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.922 s ± 0.006 s [User: 2.707 s, System: 0.215 s] Range (min … max): 2.906 s … 2.934 s 50 runs
2022-01-13don't strdup tokens in the lexerpennae
every stringy token the lexer returns is turned into a Symbol and not used further, so we don't have to strdup. using a string_view is sufficient, but due to limitations of the current parser we have to use a POD type that holds the same information. gives ~2% on system build, 6% on search, 8% on parsing alone # before Benchmark 1: nix search --offline nixpkgs hello Time (mean ± σ): 610.6 ms ± 2.4 ms [User: 602.5 ms, System: 7.8 ms] Range (min … max): 606.6 ms … 617.3 ms 50 runs Benchmark 2: nix eval -f hackage-packages.nix Time (mean ± σ): 430.1 ms ± 1.4 ms [User: 393.1 ms, System: 36.7 ms] Range (min … max): 428.2 ms … 434.2 ms 50 runs Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 3.032 s ± 0.005 s [User: 2.808 s, System: 0.223 s] Range (min … max): 3.023 s … 3.041 s 50 runs # after Benchmark 1: nix search --offline nixpkgs hello Time (mean ± σ): 574.7 ms ± 2.8 ms [User: 566.3 ms, System: 8.0 ms] Range (min … max): 569.2 ms … 580.7 ms 50 runs Benchmark 2: nix eval -f hackage-packages.nix Time (mean ± σ): 394.4 ms ± 0.8 ms [User: 361.8 ms, System: 32.3 ms] Range (min … max): 392.7 ms … 395.7 ms 50 runs Benchmark 3: nix eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system' Time (mean ± σ): 2.976 s ± 0.005 s [User: 2.757 s, System: 0.218 s] Range (min … max): 2.966 s … 2.990 s 50 runs
2022-01-03Merge branch 'master' into debug-exploratory-PRBen Burdette
2021-12-27remove debug codeBen Burdette
2021-12-20:d errorBen Burdette
2021-12-13Merge branch 'better-interpolation-error-location' of ↵Eelco Dolstra
https://github.com/greedy/nix
2021-11-25Merge branch 'master' into debug-mergeBen Burdette
2021-11-17Parse '(f x) y' the same as 'f x y'Eelco Dolstra
(cherry picked from commit 5253cb4b68ad248f37b27849c0ebf3614e4f2777)
2021-11-07Remove unused "<let-body>" symbolAndreas Rammhold
The requirement for the symbol has been removed since at least 7d47498.
2021-11-04Optimize primop callsEelco Dolstra
We now parse function applications as a vector of arguments rather than as a chain of binary applications, e.g. 'substring 1 2 "foo"' is parsed as ExprCall { .fun = <substring>, .args = [ <1>, <2>, <"foo"> ] } rather than ExprApp (ExprApp (ExprApp <substring> <1>) <2>) <"foo"> This allows primops to be called immediately (if enough arguments are supplied) without having to allocate intermediate tPrimOpApp values. On $ nix-instantiate --dry-run '<nixpkgs/nixos/release-combined.nix>' -A nixos.tests.simple.x86_64-linux this gives a substantial performance improvement: user CPU time: median = 0.9209 mean = 0.9218 stddev = 0.0073 min = 0.9086 max = 0.9340 [rejected, p=0.00000, Δ=-0.21433±0.00677] elapsed time: median = 1.0585 mean = 1.0584 stddev = 0.0024 min = 1.0523 max = 1.0623 [rejected, p=0.00000, Δ=-0.20594±0.00236] because it reduces the number of tPrimOpApp allocations from 551990 to 42534 (i.e. only small minority of primop calls are partially applied) which in turn reduces time spent in the garbage collector.
2021-11-04StaticEnv: Use std::vector instead of std::mapEelco Dolstra
2021-10-27Remove redundant 'warning:'Eelco Dolstra
2021-10-26Make experimental-features a proper typeregnat
Rather than having them plain strings scattered through the whole codebase, create an enum containing all the known experimental features. This means that - Nix can now `warn` when an unkwown experimental feature is passed (making it much nicer to spot typos and spot deprecated features) - It’s now easy to remove a feature altogether (once the feature isn’t experimental anymore or is dropped) by just removing the field for the enum and letting the compiler point us to all the now invalid usages of it.
2021-10-22more code cleanupBen Burdette
2021-10-22remove more debug codeBen Burdette
2021-10-11comment out debugsBen Burdette
2021-10-06libexpr: remove matchAttrs boolean from ExprLambdaAndreas Rammhold
The boolean is only used to determine if the formals are set to a non-null pointer in all our cases. We can get rid of that allocation and instead just compare the pointer value with NULL. Saving up to sizeof(bool) + platform specific alignment per ExprLambda instace. Probably not a lot of memory but perhaps a few kilobyte with nixpkgs? This also gets rid of a potential issue with dereferencing formals based on the value of the boolean that didn't have to be aligned with the formals pointer but was in all our cases.
2021-09-22Better eval error locations for interpolation and +Geoff Reedy
Previously, type or coercion errors for string interpolation, path interpolation, and plus expressions were always reported at the beginning of the outer expression. This leads to confusing evaluation error messages making it hard to accurately diagnose and then fix the error. For example, errors were reported as follows. ``` cannot coerce an integer to a string 1| let foo = 7; in "bar" + foo | ^ cannot add a string to an integer 1| let foo = "bar"; in 4 + foo | ^ cannot coerce an integer to a string 1| let foo = 7; in "x${foo}" | ^ ``` This commit changes the ExprConcatStrings expression vector to store a sequence of expressions *and* their expansion locations so that error locations can be reported accurately. For interpolation, the error is reported at the beginning of the entire `${foo}`, not at the beginning of `foo` because I thought this was slightly clearer. The previous errors are now reported as: ``` cannot coerce an integer to a string 1| let foo = 7; in "bar" + foo | ^ cannot add a string to an integer 1| let foo = "bar"; in 4 + foo | ^ cannot coerce an integer to a string 1| let foo = 7; in "x${foo}" | ^ ``` The error is reported at this kind of precise location even for multi-line indented strings. This probably helps with at least some of the cases mentioned in #561
2021-09-22more debug stuffBen Burdette
2021-09-15add cout debuggingBen Burdette
2021-09-14shared_ptr for StaticEnvBen Burdette
2021-08-31fix parse of `/${foo}`. was `// + foo`Taeer Bar-Yam
I don't think this changes the way any program would behave, but it's a cleaner internal representation.
2021-08-06add antiquotations to pathsTaeer Bar-Yam
2021-04-13libexpr: misc improvements for proper error positionMaximilian Bosch
When working on some more complex Nix code, there are sometimes rather unhelpful or misleading error messages, especially if coerce-errors are thrown. This patch is a first steps towards improving that. I'm happy to file more changes after that, but I'd like to gather some feedback first. To summarize, this patch does the following things: * Attrsets (a.k.a. `Bindings` in `libexpr`) now have a `Pos`. This is helpful e.g. to identify which attribute-set in `listToAttrs` is invalid. * The `Value`-struct has a new method named `determinePos` which tries to guess the position of a value and falls back to a default if that's not possible. This can be used to provide better messages if a coercion fails. * The new `determinePos`-API is used by `builtins.concatMap` now. With that change, Nix shows the exact position in the error where a wrong value was returned by the lambda. To make sure it's still obvious that `concatMap` is the problem, another stack-frame was added. * The changes described above can be added to every other `primop`, but first I'd like to get some feedback about the overall approach.
2021-01-21Improve error formattingEelco Dolstra
Changes: * The divider lines are gone. These were in practice a bit confusing, in particular with --show-trace or --keep-going, since then there were multiple lines, suggesting a start/end which wasn't the case. * Instead, multi-line error messages are now indented to align with the prefix (e.g. "error: "). * The 'description' field is gone since we weren't really using it. * 'hint' is renamed to 'msg' since it really wasn't a hint. * The error is now printed *before* the location info. * The 'name' field is no longer printed since most of the time it wasn't very useful since it was just the name of the exception (like EvalError). Ideally in the future this would be a unique, easily googleable error ID (like rustc). * "trace:" is now just "…". This assumes error contexts start with something like "while doing X". Example before: error: --- AssertionError ---------------------------------------------------------------------------------------- nix at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix 6| 7| x = assert false; 1; | ^ 8| assertion 'false' failed ----------------------------------------------------- show-trace ----------------------------------------------------- trace: while evaluating the attribute 'x' of the derivation 'hello-2.10' at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix 191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) { 192| name = "${attrs.pname}-${attrs.version}"; | ^ 193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) { Example after: error: assertion 'false' failed at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix 6| 7| x = assert false; 1; | ^ 8| … while evaluating the attribute 'x' of the derivation 'hello-2.10' at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix 191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) { 192| name = "${attrs.pname}-${attrs.version}"; | ^ 193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {