lix -

Age	Commit message (Collapse)	Author
2022-04-21	ca: add sqlite index on `RealisationsRefs(realisationReference)`	Sergei Trofimovich
	Without the change any CA deletion triggers linear scan on large RealisationsRefs table: sqlite>.eqp full sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 ); QUERY PLAN \|--SCAN RealisationsRefs `--LIST SUBQUERY 1 `--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?) With the change it gets turned into a lookup: sqlite> CREATE INDEX IndexRealisationsRefsRealisationReference on RealisationsRefs(realisationReference); sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 ); QUERY PLAN \|--SEARCH RealisationsRefs USING INDEX IndexRealisationsRefsRealisationReference (realisationReference=?) `--LIST SUBQUERY 1 `--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
2022-04-21	Make sure to delete all the realisation refs	regnat
	Deleting just one will only work in the test cases where I didn’t bother creating too many of them :p
2022-04-21	Fix the gc with indirect self-references via the realisations	regnat
	If the derivation `foo` depends on `bar`, and they both have the same output path (because they are CA derivations), then this output path will depend both on the realisation of `foo` and of `bar`, which themselves depend on each other. This confuses SQLite which isn’t able to automatically solve this diamond dependency scheme. Help it by adding a trigger to delete all the references between the relevant realisations. Fix #5320
2021-11-10	ca-specific-schema.sql: add index on RealisationsRefs(referrer) and (outputPath)	Sergei Trofimovich
	For a typical desktop system (~2K packages) we can easily get 100K entries in RealisationsRefs. Without indices query for RealisationsRefs requires linear scan. RealisationsRefs(referrer) -------------------------- Inefficiency is seen as a 100% CPU load of nix-daemon for the following scenario: $ nix edit -f . bash # add unused environment variable, like FOO="1" # populate RealisationsRefs, build fresh system $ nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }' $ nix edit -f . bash # add unused environment variable, like FOO="2" $ time nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }' In this case `bash `will be rebuilt a few times and then rest of CPU time is spent on scanning RealisationsRefs table (about 5 CPU-minutes on my machine). Before the change: $ time nix build -f nixos system ... # step 4 above real 34m3,613s user 0m5,232s sys 0m0,758s Of all this time about 29.5 minutes are taken by nix-daemon's CPU time. After the change: $ time nix build -f nixos system ... # step 4 above real 4m50,061s user 0m5,038s sys 0m0,677s Of all this time about 1 minute is taken by nix-daemon's CPU time. Most of the time is spent polling for non-existent realisations on cache-nixos.org. Realisations(outputPath) ------------------------ After running CA system for two weeks I got ~1M entries in Realisations table. `nix-collect-garbage` became very slow (seemingly 100 path deletions per second). It happens due to a slow cascading delete from Realisations triggered by deletion from ValidPaths. The fix is to add an index on primary key from ValidPaths(id) that triggers cascading deletions. Before the change: $ time nix-collect-garbage -d --max-freed 100G <interrupted before finish, took too long> real 23m32.411s user 17m49.679s sys 4m50.609s Most of time was spent in re-scanning Realisations table on each path deletion. After the change: $ time nix-collect-garbage -d --max-freed 100G real 8m43.226s user 6m16.317s sys 1m40.188s Time is spent scanning sqlite indices and in kernel when unlinking directories.
2021-05-26	Store the realisation deps on the local store	regnat

2021-03-15	Add some logic for signing realisations	regnat
	Not exposed anywhere, but built realisations are now signed (and this should be forwarded when copy-ing them around)
2020-12-11	Rework the db schema for derivation outputs	regnat
	Add a new table for tracking the derivation output mappings. We used to hijack the `DerivationOutputs` table for that, but (despite its name), it isn't a really good fit: - Its entries depend on the drv being a valid path, making it play badly with garbage collection and preventing us to copy a drv output without copying the whole drv closure too; - It dosen't guaranty that the output path exists; By using a different table, we can experiment with a different schema better suited for tracking the output mappings of CA derivations. (incidentally, this also fixes #4138)