author Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com> 2022-08-04 20:49:01 +0200
committer GitHub <noreply@github.com> 2022-08-04 20:49:01 +0200
commit 81e101345fda2a8651c470f08b364a1ca6fa37cf
tree 77f1972839dba73b6541d871cc91bab622c6f99b /doc/manual/src
parent 7d1280bbaf7f4cd142c2259dec620c42bf6f96fd
parent 39d32ac4c63f4aa3784d114b19c0eca83e306ca9
Merge pull request #6420 from nix-community/doc-what-is-nix
Document what Nix *is*
Diffstat (limited to 'doc/manual/src')
-rw-r--r-- doc/manual/src/SUMMARY.md.in | 6
-rw-r--r-- doc/manual/src/architecture/architecture.md | 79
-rw-r--r-- doc/manual/src/architecture/store/fso.md | 69
-rw-r--r-- doc/manual/src/architecture/store/path.md | 105
-rw-r--r-- doc/manual/src/architecture/store/store.md | 151
-rw-r--r-- doc/manual/src/architecture/store/store/build-system-terminology.md | 32
-rw-r--r-- doc/manual/src/architecture/store/store/closure.md | 29
7 files changed, 471 insertions, 0 deletions
diff --git a/doc/manual/src/SUMMARY.md.in b/doc/manual/src/SUMMARY.md.in
index 084c8f442..a47d39f31 100644
--- a/doc/manual/src/SUMMARY.md.in
+++ b/doc/manual/src/SUMMARY.md.in
@@ -59,6 +59,12 @@
@manpages@
- [Files](command-ref/files.md)
- [nix.conf](command-ref/conf-file.md)
+- [Architecture](architecture/architecture.md)
+ - [Store](architecture/store/store.md)
+ - [Closure](architecture/store/store/closure.md)
+ - [Build system terminology](architecture/store/store/build-system-terminology.md)
+ - [Store Path](architecture/store/path.md)
+ - [File System Object](architecture/store/fso.md)
- [Glossary](glossary.md)
- [Contributing](contributing/contributing.md)
- [Hacking](contributing/hacking.md)
diff --git a/doc/manual/src/architecture/architecture.md b/doc/manual/src/architecture/architecture.md
new file mode 100644
index 000000000..41deb07af
--- /dev/null
+++ b/doc/manual/src/architecture/architecture.md
@@ -0,0 +1,79 @@
+# Architecture
+
+*(This chapter is unstable and a work in progress. Incoming links may rot.)*
+
+This chapter describes how Nix works.
+It should help users understand why Nix behaves as it does, and it should help developers understand how to modify Nix and how to write similar tools.
+
+## Overview
+
+Nix consists of [hierarchical layers][layer-architecture].
+
+```
++-----------------------------------------------------------------+
+|                               Nix                               |
+|                  [ command line interface ]-------,             |
+|                               |                   |             |
+|                           evaluates               |             |
+|                               |                manages          |
+|                               V                   |             |
+|                  [ configuration language ]       |             |
+|                               |                   |             |
+| +-----------------------------|-------------------V-----------+ |
+| | store                  evaluates to                         | |
+| |                             |                               | |
+| |             referenced by   V       builds                  | |
+| |  [ build input ] ---> [ build plan ] ---> [ build result ]  | |
+| |                                                             | |
+| +-------------------------------------------------------------+ |
++-----------------------------------------------------------------+
+```
+
+At the top is the [command line interface](../command-ref/command-ref.md), translating from invocations of Nix executables to interactions with the underlying layers.
+
+Below that is the [Nix expression language](../expressions/expression-language.md), a [purely functional][purely-functional-programming] configuration language.
+It is used to compose expressions which ultimately evaluate to self-contained *build plans*, used to derive *build results* from referenced *build inputs*.
+
+The command line and Nix language are what users interact with most.
+
+> **Note**
+> The Nix language itself does not have a notion of *packages* or *configurations*.
+> As far as we are concerned here, the inputs and results of a build plan are just data.
+
+Underlying these is the [Nix store](./store/store.md), a mechanism to keep track of build plans, data, and references between them.
+It can also execute build plans to produce new data.
+
+A build plan is a series of *build tasks*.
+Each build task has a special build input which is used as *build instructions*.
+The result of a build task can be input to another build task.
+
+```
++------------------------------------------------------------------------------------------+
+| store                                                                                    |
+|         ............................................................                     |
+|         : build plan                                               :                     |
+|         :                                                          :                     |
+|  [ build input ]-----instructions-,                                :                     |
+|         :                         |                                :                     |
+|         :                         v                                :                     |
+|  [ build input ]----------->[ build task ]--instructions-,         :                     |
+|         :                                                |         :                     |
+|         :                                                |         :                     |
+|         :                                                v         :                     |
+|         :                                          [ build task ]----->[ build result ]  |
+|  [ build input ]-----instructions-,                      ^         :                     |
+|         :                         |                      |         :                     |
+|         :                         v                      |         :                     |
+|  [ build input ]----------->[ build task ]---------------'         :                     |
+|         :                         ^                                :                     |
+|         :                         |                                :                     |
+|  [ build input ]------------------'                                :                     |
+|         :                                                          :                     |
+|         :                                                          :                     |
+|         :..........................................................:                     |
+|                                                                                          |
++------------------------------------------------------------------------------------------+
+```
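
The build plan above can be read as a dependency graph of build tasks. The following Python sketch is an illustration only: `run_plan`, the task names, and the use of Python callables as build instructions are assumptions made for this example and are not how Nix represents them. It shows how the result of one task becomes available as input to a later task once tasks run in dependency order (requires Python 3.9 or later for `graphlib`).

```python
from graphlib import TopologicalSorter


def run_plan(tasks, inputs):
    """Run toy build tasks in dependency order.

    tasks:  name -> (instructions, list of input names)  (hypothetical shape)
    inputs: name -> data for plain build inputs that need no building
    """
    # Map every task to the names it depends on (plain inputs or other tasks).
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    results = dict(inputs)
    for name in TopologicalSorter(graph).static_order():
        if name in results:  # plain build input, nothing to run
            continue
        instructions, deps = tasks[name]
        # A task's result is a function of data that is already available.
        results[name] = instructions(*(results[d] for d in deps))
    return results


# Two tasks: "app" consumes the result of "lib".
tasks = {
    "lib": (lambda src: src.upper(), ["lib-src"]),
    "app": (lambda src, lib: f"{src}+{lib}", ["app-src", "lib"]),
}
results = run_plan(tasks, {"lib-src": "a", "app-src": "b"})
```

Here `results["app"]` is computed only after `results["lib"]` exists, mirroring the arrow from one build task to the next in the diagram.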
+
+[layer-architecture]: https://en.m.wikipedia.org/wiki/Multitier_architecture#Layers
+[purely-functional-programming]: https://en.m.wikipedia.org/wiki/Purely_functional_programming
diff --git a/doc/manual/src/architecture/store/fso.md b/doc/manual/src/architecture/store/fso.md
new file mode 100644
index 000000000..e0eb69f60
--- /dev/null
+++ b/doc/manual/src/architecture/store/fso.md
@@ -0,0 +1,69 @@
+# File System Object
+
+The Nix store uses a simple file system model for the data it holds in [store objects](store.md#store-object).
+
+Every file system object is one of the following:
+
+ - File: an executable flag, and arbitrary data for contents
+ - Directory: mapping of names to child file system objects
+ - [Symbolic link][symlink]: may point anywhere.
+
+We call a store object's outermost file system object the *root*.
+
+    data FileSystemObject
+      = File { isExecutable :: Bool, contents :: Bytes }
+      | Directory { entries :: Map FileName FileSystemObject }
+      | SymLink { target :: Path }
+
+Examples:
+
+- a directory with contents
+
+      /nix/store/<hash>-hello-2.10
+      ├── bin
+      │   └── hello
+      └── share
+          ├── info
+          │   └── hello.info
+          └── man
+              └── man1
+                  └── hello.1.gz
+
+- a directory with relative symlink and other contents
+
+      /nix/store/<hash>-go-1.16.9
+      ├── bin -> share/go/bin
+      ├── nix-support/
+      └── share/
+
+- a directory with absolute symlink
+
+      /nix/store/d3k...-nodejs
+      └── nix_node -> /nix/store/f20...-nodejs-10.24.
+
+A bare file or symlink can be a root file system object.
+Examples:
+
+    /nix/store/<hash>-hello-2.10.tar.gz
+
+    /nix/store/4j5...-pkg-config-wrapper-0.29.2-doc -> /nix/store/i99...-pkg-config-0.29.2-doc
+
+Symlinks pointing outside of their own root or to a store object without a matching reference are allowed, but might not function as intended.
+Examples:
+
+- an arbitrarily symlinked file may change or not exist at all
+
+      /nix/store/<hash>-foo
+      └── foo -> /home/foo
+
+- if a symlink to a store path was not automatically created by Nix, it may be invalid or get invalidated when the store object is deleted
+
+      /nix/store/<hash>-bar
+      └── bar -> /nix/store/abc...-foo
+
+Nix file system objects do not support [hard links][hardlink]:
+each file system object which is not the root has exactly one parent and one name.
+However, as store objects are immutable, an underlying file system can use hard links for optimization.
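
The file system object model above can also be written down in Python. This is a rough sketch only: the class names follow the data type in the text, while `list_paths` is a hypothetical helper added for illustration. Because a directory maps each name to exactly one child, the structure is a tree, which is what rules out hard links.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass(frozen=True)
class File:
    contents: bytes
    is_executable: bool = False


@dataclass(frozen=True)
class SymLink:
    target: str  # may point anywhere, even outside the store


@dataclass(frozen=True)
class Directory:
    entries: Dict[str, "FileSystemObject"] = field(default_factory=dict)


FileSystemObject = Union[File, SymLink, Directory]


def list_paths(root, prefix=""):
    """Yield the relative path of every object below `root`, depth first."""
    if isinstance(root, Directory):
        for name in sorted(root.entries):
            child = f"{prefix}/{name}" if prefix else name
            yield child
            yield from list_paths(root.entries[name], child)


# A root directory shaped like the first example above (contents shortened).
hello = Directory({
    "bin": Directory({"hello": File(b"\x7fELF", is_executable=True)}),
    "share": Directory({}),
})
paths = list(list_paths(hello))
```

A bare `File` or `SymLink` is also a valid root in this model; `list_paths` then yields nothing because there are no children.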
+
+[symlink]: https://en.m.wikipedia.org/wiki/Symbolic_link
+[hardlink]: https://en.m.wikipedia.org/wiki/Hard_link
diff --git a/doc/manual/src/architecture/store/path.md b/doc/manual/src/architecture/store/path.md
new file mode 100644
index 000000000..663f04f46
--- /dev/null
+++ b/doc/manual/src/architecture/store/path.md
@@ -0,0 +1,105 @@
+# Store Path
+
+Nix implements [references](store.md#reference) to [store objects](store.md#store-object) as *store paths*.
+
+Store paths are pairs of
+
+- a 20-byte [digest](#digest) for identification
+- a symbolic name for people to read.
+
+Example:
+
+- digest: `b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z`
+- name: `firefox-33.1`
+
+It is rendered to a file system path as the concatenation of
+
+ - [store directory](#store-directory)
+ - path-separator (`/`)
+ - [digest](#digest) rendered in a custom variant of [base-32](https://en.m.wikipedia.org/wiki/Base32) (20 arbitrary bytes become 32 ASCII characters)
+ - hyphen (`-`)
+ - name
+
+Example:
+
+    /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1
+    |--------| |------------------------------| |----------|
+    store directory         digest                  name
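
The rendering steps listed above can be sketched in Python. The alphabet and the bit order used by `nix_base32` are assumptions about the Nix implementation and are not taken from this chapter; the function names are hypothetical. Treat the result as an illustration of "20 bytes become 32 characters" rather than as a reference encoder.

```python
# Assumed alphabet: digits and lowercase letters without e, o, t, u.
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(data: bytes) -> str:
    """Encode bytes in 5-bit groups, highest group first (assumed order)."""
    length = (len(data) * 8 - 1) // 5 + 1
    chars = []
    for n in range(length - 1, -1, -1):
        i, j = divmod(n * 5, 8)  # byte index and bit offset of this group
        value = data[i] >> j
        if i + 1 < len(data):
            value |= data[i + 1] << (8 - j)
        chars.append(NIX_BASE32_ALPHABET[value & 0x1F])
    return "".join(chars)


def render_store_path(store_dir: str, digest: bytes, name: str) -> str:
    """Concatenate store directory, '/', encoded digest, '-', and name."""
    if len(digest) != 20:
        raise ValueError("digest must be 20 bytes")
    return f"{store_dir}/{nix_base32(digest)}-{name}"


path = render_store_path("/nix/store", bytes(20), "firefox-33.1")
```

With an all-zero digest the rendered digest part is 32 `0` characters, which makes the length relationship easy to check by eye.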
+
+## Store Directory
+
+Every [store](./store.md) has a store directory.
+
+If the store has a [file system representation](./store.md#files-and-processes), this directory contains the store’s [file system objects](#file-system-object), which can be addressed by [store paths](#store-path).
+
+This means a store path is not just derived from the referenced store object itself, but depends on the store the store object is in.
+
+> **Note**
+> The store directory defaults to `/nix/store`, but is in principle arbitrary.
+
+It is important which store a given store object belongs to:
+Files in the store object can contain store paths, and processes may read these paths.
+Nix can only guarantee [referential integrity](store/closure.md) if store paths do not cross store boundaries.
+
+Therefore one can only copy store objects to a different store if
+
+- the source and target stores' directories match
+
+ or
+
+- the store object in question has no references, that is, contains no store paths.
+
+Otherwise, a store object cannot be copied to a store with a different store directory.
+Instead, it has to be rebuilt, together with all its dependencies.
+It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.
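
The copying rule above reduces to a small predicate. The sketch below uses a hypothetical `can_copy` function and compares store directories as plain strings, which is a simplification.

```python
def can_copy(source_dir: str, target_dir: str, references: set) -> bool:
    """True if the store directories match or the object has no references."""
    return source_dir == target_dir or not references


same_dir = can_copy("/nix/store", "/nix/store", {"/nix/store/abc...-foo"})
other_dir = can_copy("/nix/store", "/opt/store", {"/nix/store/abc...-foo"})
no_refs = can_copy("/nix/store", "/opt/store", set())
```

Only the middle case is rejected: the object contains a store path that would not resolve under the target store directory.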
+
+## Digest
+
+In a [store path](#store-path), the [digest][digest] is the output of a [cryptographic hash function][hash] of either all *inputs* involved in building the referenced store object or its actual *contents*.
+
+Store objects are therefore said to be either [input-addressed](#input-addressing) or [content-addressed](#content-addressing).
+
+> **Historical Note**
+> The 20-byte restriction is because originally digests were [SHA-1][sha-1] hashes.
+> Nix now uses [SHA-256][sha-256], and longer hashes are still reduced to 20 bytes for compatibility.
+
+[digest]: https://en.m.wiktionary.org/wiki/digest#Noun
+[hash]: https://en.m.wikipedia.org/wiki/Cryptographic_hash_function
+[sha-1]: https://en.m.wikipedia.org/wiki/SHA-1
+[sha-256]: https://en.m.wikipedia.org/wiki/SHA-256
+
+### Reference scanning
+
+When a new store object is built, Nix scans its file contents for store paths to construct its set of references.
+
+The special format of a store path's [digest](#digest) allows reliably detecting it among arbitrary data.
+Nix uses the [closure](store.md#closure) of build inputs to derive the list of allowed store paths, to avoid false positives.
+
+This way, scanning files captures run time dependencies without the user having to declare them explicitly.
+Doing it at build time and persisting references in the store object avoids repeating this time-consuming operation.
+
+> **Note**
+> In practice, it is sometimes still necessary for users to declare certain dependencies explicitly, if they are to be preserved in the build result's closure.
+> This depends on the specifics of the software to build and run.
+>
+> For example, Java programs are compressed after compilation, which obfuscates any store paths they may refer to and prevents Nix from automatically detecting them.
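
Reference scanning as described above can be approximated in a few lines of Python. This is a sketch under assumptions: `scan_references` is a hypothetical name, and searching only for the digest part of each candidate path is a choice made for this example. The candidate list plays the role of the closure of build inputs.

```python
def scan_references(contents: bytes, candidates: list) -> set:
    """Return the candidate store paths whose digest occurs in `contents`."""
    found = set()
    for path in candidates:
        # "/nix/store/<digest>-<name>" -> "<digest>"
        digest = path.rsplit("/", 1)[-1].split("-", 1)[0]
        if digest.encode("ascii") in contents:
            found.add(path)
    return found


candidates = [
    "/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1",
    "/nix/store/" + "0" * 32 + "-unused",
]
script = b"#!/bin/sh\nexec /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1/bin/firefox\n"
references = scan_references(script, candidates)
```

If `script` were compressed, the digest bytes would no longer appear verbatim and the scan would find nothing, which is the situation the note about Java programs describes.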
+
+## Input Addressing
+
+Input addressing means that the digest derives from how the store object was produced, namely its build inputs and build plan.
+
+To compute the hash of a store object one needs a deterministic serialisation, i.e., a binary string representation which only changes if the store object changes.
+
+Nix has a custom serialisation format called Nix Archive (NAR).
+
+Store object references of this sort can *not* be validated from the content of the store object.
+Rather, a cryptographic signature has to be used to indicate that someone is vouching for the store object really being produced from a build plan with that digest.
+
+## Content Addressing
+
+Content addressing means that the digest derives from the store object's contents, namely its file system objects and references.
+If one knows content addressing was used, one can recalculate the reference and thus verify the store object.
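
Verification by recalculation can be illustrated with a toy digest. The sketch below is not the way Nix computes store path digests: it hashes only a byte string standing in for the serialised store object, and the XOR folding of SHA-256 down to 20 bytes is an assumption about how longer hashes are reduced. `toy_digest` and `verify` are hypothetical names.

```python
import hashlib


def toy_digest(serialised: bytes) -> bytes:
    """SHA-256 of the serialised object, folded to 20 bytes by XOR (assumed)."""
    full = hashlib.sha256(serialised).digest()
    folded = bytearray(20)
    for i, byte in enumerate(full):
        folded[i % 20] ^= byte
    return bytes(folded)


def verify(serialised: bytes, claimed_digest: bytes) -> bool:
    """Recompute the digest from the contents and compare with the claim."""
    return toy_digest(serialised) == claimed_digest


digest = toy_digest(b"contents of a store object")
```

No signature is needed for this check: anyone holding the contents can recompute the digest, which is the difference from input addressing.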
+
+Content addressing is currently only used for the special cases of source files and "fixed-output derivations", where the contents of a store object are known in advance.
+Content addressing of build results is still an [experimental feature subject to some restrictions](https://github.com/tweag/rfcs/blob/cas-rfc/rfcs/0062-content-addressed-paths.md).
+
diff --git a/doc/manual/src/architecture/store/store.md b/doc/manual/src/architecture/store/store.md
new file mode 100644
index 000000000..08b6701d5
--- /dev/null
+++ b/doc/manual/src/architecture/store/store.md
@@ -0,0 +1,151 @@
+# Store
+
+A Nix store is a collection of *store objects* with references between them.
+It supports operations to manipulate that collection.
+
+The following concept map is a graphical outline of this chapter.
+Arrows indicate suggested reading order.
+
+```
+ ,--------------[ store ]----------------,
+ | | |
+ v v v
+ [ store object ] [ closure ]--, [ operations ]
+ | | | | | |
+ v | | v v |
+ [ files and processes ] | | [ garbage collection ] |
+ / \ | | |
+ v v | v v
+[ file system object ] [ store path ] | [ derivation ]--->[ building ]
+ | ^ | | |
+ v | v v |
+ [ digest ]----' [ reference scanning ]<------------'
+ / \
+ v v
+[ input addressing ] [ content addressing ]
+```
+
+## Store Object
+
+A store object can hold
+
+- arbitrary *data*
+- *references* to other store objects.
+
+Store objects can be build inputs, build results, or build tasks.
+
+Store objects are [immutable][immutable-object]: once created, they do not change until they are deleted.
+
+## Reference
+
+A store object reference is an [opaque][opaque-data-type], [unique identifier][unique-identifier]:
+The only way to obtain references is by adding or building store objects.
+A reference will always point to exactly one store object.
+
+## Operations
+
+A Nix store can *add*, *retrieve*, and *delete* store objects.
+
+                [ data ]
+                    |
+                    V
+    [ store ] ---> add ----> [ store' ]
+                    |
+                    V
+              [ reference ]
+
+<!-- -->
+
+              [ reference ]
+                    |
+                    V
+    [ store ] ---> get
+                    |
+                    V
+            [ store object ]
+
+<!-- -->
+
+              [ reference ]
+                    |
+                    V
+    [ store ] --> delete --> [ store' ]
+
+
+It can *perform builds*, that is, create new store objects by transforming build inputs into build outputs, using instructions from the build tasks.
+
+
+              [ reference ]
+                    |
+                    V
+    [ store ] --> build --(maybe)--> [ store' ]
+                    |
+                    V
+              [ reference ]
+
+
+As it keeps track of references, it can [garbage-collect][garbage-collection] unused store objects.
+
+
+    [ store ] --> collect garbage --> [ store' ]
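
The add, get, and delete operations can be mimicked by a small in-memory class. This is a toy model, not Nix code: `ToyStore` is hypothetical, it mutates itself instead of returning a new store as the diagrams suggest, and the way it derives a reference is arbitrary. The point is that callers only ever receive references from `add` and treat them as opaque.

```python
import hashlib


class ToyStore:
    """In-memory stand-in for a store holding plain data objects."""

    def __init__(self):
        self._objects = {}  # reference -> data

    def add(self, data: bytes) -> str:
        # Any scheme works as long as callers treat the result as opaque.
        reference = hashlib.sha256(data).hexdigest()[:32]
        self._objects[reference] = data
        return reference

    def get(self, reference: str) -> bytes:
        return self._objects[reference]

    def delete(self, reference: str) -> None:
        del self._objects[reference]


store = ToyStore()
ref = store.add(b"some data")
data = store.get(ref)
store.delete(ref)
```

After `delete`, calling `store.get(ref)` raises `KeyError`, since the reference no longer points to a store object.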
+
+## Files and Processes
+
+Nix maps between its store model and the [Unix paradigm][unix-paradigm] of [files and processes][file-descriptor], by encoding immutable store objects and opaque identifiers as file system primitives: files and directories, and paths.
+That allows processes to resolve references contained in files and thus access the contents of store objects.
+
+Store objects are therefore implemented as the pair of
+
+ - a [file system object](fso.md) for data
+ - a set of [store paths](path.md) for references.
+
+[unix-paradigm]: https://en.m.wikipedia.org/wiki/Everything_is_a_file
+[file-descriptor]: https://en.m.wikipedia.org/wiki/File_descriptor
+
+The following diagram shows a radical simplification of how Nix interacts with the operating system:
+It uses files as build inputs, and build outputs are files again.
+On the operating system, files can be run as processes, which in turn operate on files.
+A build function also amounts to an operating system process (not depicted).
+
+```
++-----------------------------------------------------------------+
+|                               Nix                               |
+|                  [ command line interface ]-------,             |
+|                               |                   |             |
+|                           evaluates               |             |
+|                               |                manages          |
+|                               V                   |             |
+|                  [ configuration language ]       |             |
+|                               |                   |             |
+| +-----------------------------|-------------------V-----------+ |
+| | store                  evaluates to                         | |
+| |                             |                               | |
+| |             referenced by   V       builds                  | |
+| |  [ build input ] ---> [ build plan ] ---> [ build result ]  | |
+| |         ^                                        |          | |
+| +---------|----------------------------------------|----------+ |
++-----------|----------------------------------------|------------+
+            |                                        |
+    file system object                          store path
+            |                                        |
++-----------|----------------------------------------|------------+
+| operating system        +------------+             |            |
+|           '------------ |            | <-----------'            |
+|                         |    file    |                          |
+|                     ,-- |            | <-,                      |
+|                     |   +------------+   |                      |
+|          execute as |                    | read, write, execute |
+|                     |   +------------+   |                      |
+|                     '-> |  process   | --'                      |
+|                         +------------+                          |
++-----------------------------------------------------------------+
+```
+
+There exist different types of stores, which all follow this model.
+Examples:
+- store on the local file system
+- remote store accessible via SSH
+- binary cache store accessible via HTTP
+
+To make store objects accessible to processes, stores ultimately have to expose store objects through the file system.
+
diff --git a/doc/manual/src/architecture/store/store/build-system-terminology.md b/doc/manual/src/architecture/store/store/build-system-terminology.md
new file mode 100644
index 000000000..eefbaa630
--- /dev/null
+++ b/doc/manual/src/architecture/store/store/build-system-terminology.md
@@ -0,0 +1,32 @@
+# A [Rosetta stone][rosetta-stone] for build system terminology
+
+The Nix store's design is comparable to other build systems.
+Usage of terms is, for historical reasons, not entirely consistent within the Nix ecosystem, and still subject to slow change.
+
+The following translation table points out similarities and equivalent terms, to help clarify their meaning and inform consistent use in the future.
+
+| generic build system | Nix | [Bazel][bazel] | [Build Systems à la Carte][bsalc] | programming language |
+| -------------------------------- | ---------------- | -------------------------------------------------------------------- | --------------------------------- | ------------------------ |
+| data (build input, build result) | store object | [artifact][bazel-artifact] | value | value |
+| build instructions | builder | ([depends on action type][bazel-actions]) | function | function |
+| build task | derivation | [action][bazel-action] | `Task` | [thunk][thunk] |
+| build plan | derivation graph | [action graph][bazel-action-graph], [build graph][bazel-build-graph] | `Tasks` | [call graph][call-graph] |
+| build | build | build | application of `Build` | evaluation |
+| persistence layer | store | [action cache][bazel-action-cache] | `Store` | heap |
+
+All of these systems share features of [declarative programming][declarative-programming] languages, a key insight first put forward by Eelco Dolstra et al. in [Imposing a Memory Management Discipline on Software Deployment][immdsd] (2004), elaborated in his PhD thesis [The Purely Functional Software Deployment Model][phd-thesis] (2006), and further refined by Andrey Mokhov et al. in [Build Systems à la Carte][bsalc] (2018).
+
+[rosetta-stone]: https://en.m.wikipedia.org/wiki/Rosetta_Stone
+[bazel]: https://bazel.build/start/bazel-intro
+[bazel-artifact]: https://bazel.build/reference/glossary#artifact
+[bazel-actions]: https://docs.bazel.build/versions/main/skylark/lib/actions.html
+[bazel-action]: https://bazel.build/reference/glossary#action
+[bazel-action-graph]: https://bazel.build/reference/glossary#action-graph
+[bazel-build-graph]: https://bazel.build/reference/glossary#build-graph
+[bazel-action-cache]: https://bazel.build/reference/glossary#action-cache
+[thunk]: https://en.m.wikipedia.org/wiki/Thunk
+[call-graph]: https://en.m.wikipedia.org/wiki/Call_graph
+[declarative-programming]: https://en.m.wikipedia.org/wiki/Declarative_programming
+[immdsd]: https://edolstra.github.io/pubs/immdsd-icse2004-final.pdf
+[phd-thesis]: https://edolstra.github.io/pubs/phd-thesis.pdf
+[bsalc]: https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf
diff --git a/doc/manual/src/architecture/store/store/closure.md b/doc/manual/src/architecture/store/store/closure.md
new file mode 100644
index 000000000..065b95ffc
--- /dev/null
+++ b/doc/manual/src/architecture/store/store/closure.md
@@ -0,0 +1,29 @@
+# Closure
+
+Nix stores ensure [referential integrity][referential-integrity]: for each store object in the store, all the store objects it references must also be in the store.
+
+The set of all store objects reachable by following references from a given initial set of store objects is called a *closure*.
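
Computing a closure is a graph reachability problem. The sketch below assumes references are given as a plain dictionary from store object to the objects it refers to; `closure` and the object names are hypothetical.

```python
def closure(references: dict, roots) -> set:
    """All store objects reachable from `roots` by following references.

    references: store object -> iterable of store objects it refers to
    """
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(references.get(obj, ()))
    return seen


references = {
    "app": ["lib", "data"],
    "lib": ["libc"],
    "tool": ["libc"],
}
app_closure = closure(references, ["app"])
```

Here `tool` is not in the closure of `app`, so it could be deleted without breaking `app`, whereas `libc` could not.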
+
+Adding, building, copying and deleting store objects must be done in a way that preserves referential integrity:
+
+- A newly added store object cannot have references, unless it is a build task.
+
+- Build results must only refer to store objects in the closure of the build inputs.
+
+ Building a store object will add appropriate references, according to the build task.
+
+- Store objects being copied must refer to objects already in the destination store.
+
+ Recursive copying must either proceed in dependency order or be atomic.
+
+- We can only safely delete store objects which are not reachable from any reference still in use.
+
+ <!-- more details in section on garbage collection, link to it once it exists -->
+
+[referential-integrity]: https://en.m.wikipedia.org/wiki/Referential_integrity
+[garbage-collection]: https://en.m.wikipedia.org/wiki/Garbage_collection_(computer_science)
+[immutable-object]: https://en.m.wikipedia.org/wiki/Immutable_object
+[opaque-data-type]: https://en.m.wikipedia.org/wiki/Opaque_data_type
+[unique-identifier]: https://en.m.wikipedia.org/wiki/Unique_identifier
+
+