aboutsummaryrefslogtreecommitdiff
path: root/doc/manual/src/architecture/store/store.md
blob: fba2f90fd4eb1de4205d53f72b337f3ef5c19b9c (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
# Store

A Nix store is a collection of *store objects* with references between them.
It supports operations to manipulate that collection.

The following concept map is a graphical outline of this chapter.
Arrows indicate suggested reading order.

```
                      ,----------[ store ]---------,
                      |                            |
                      |                            v
                      |                     [ operations ]
                      |                      /          \
                      v                     v            v
               [ store object ] [ garbage collection ] [ build ]
                      |                 ^                 ^ |
                      v                 |                 | |
           [ files and processes ]      | [ derivation ]--' |
              /            \            |       ^           |
             v              v           |       |           |
[ file system object ]  [ store path ]  '--[ closure ]      |
                  |         ^     \             |           |
                  v         |      v            v           |
             [ digest ]-----'  [ reference scanning ]<------'
              /      \
             v        v
[ input addressing ] [ content addressing ]
```

## Store Object {#store-object}

A store object can hold

- arbitrary *data*
- *references* to other store objects.

Nix makes no distinction if store objects are build inputs, build results, or build tasks.

Store objects are [immutable][immutable-object]: once created, they do not change until they are deleted.

## Reference

A store object reference is an [opaque][opaque-data-type], [unique identifier][unique-identifier]:
The only way to obtain references is by adding or building store objects.
A reference will always point to exactly one store object.

## Operations

A Nix store can *add*, *retrieve*, and *delete* store objects.

                [ data ]
                    |
                    V
    [ store ] ---> add ----> [ store' ]
                    |
                    V
              [ reference ]

<!-- -->

              [ reference ]
                    |
                    V
    [ store ] ---> get
                    |
                    V
             [ store object ]

<!-- -->

              [ reference ]
                    |
                    V
    [ store ] --> delete --> [ store' ]


It can *perform builds*, that is, create new store objects by transforming build inputs into build outputs, using instructions from the build tasks.


              [ reference ]
                    |
                    V
    [ store ] --> build --(maybe)--> [ store' ]
                             |
                             V
                       [ reference ]


As it keeps track of references, it can [garbage-collect][garbage-collection] unused store objects.


    [ store ] --> collect garbage --> [ store' ]


## Closure {#closure}

Nix stores ensure [referential integrity][referential-integrity]: for each store object in the store, all the store objects it references must also be in the store.

The set of all store objects reachable by following references from a given initial set of store objects is called a *closure*.

Adding, building, copying and deleting store objects must be done in a way that preserves referential integrity:

- A newly added store object cannot have references, unless it is a build task.

- Build results must only refer to store objects in the closure of the build inputs.

  Building a store object will add appropriate references, according to the build task.

- Store objects being copied must refer to objects already in the destination store.

  Recursive copying must either proceed in dependency order or be atomic.

- We can only safely delete store objects which are not reachable from any reference still in use.

  <!-- more details in section on garbage collection, link to it once it exists -->

[referential-integrity]: https://en.m.wikipedia.org/wiki/Referential_integrity
[garbage-collection]: https://en.m.wikipedia.org/wiki/Garbage_collection_(computer_science)
[immutable-object]: https://en.m.wikipedia.org/wiki/Immutable_object
[opaque-data-type]: https://en.m.wikipedia.org/wiki/Opaque_data_type
[unique-identifier]: https://en.m.wikipedia.org/wiki/Unique_identifier

## Files and Processes {#files-and-processes}

Nix maps between its store model and the [Unix paradigm][unix-paradigm] of [files and processes][file-descriptor], by encoding immutable store objects and opaque identifiers as file system primitives: files and directories, and paths.
That allows processes to resolve references contained in files and thus access the contents of store objects.

Store objects are therefore implemented as the pair of

  - a [file system object](fso.md) for data
  - a set of [store paths](path.md) for references.

[unix-paradigm]: https://en.m.wikipedia.org/wiki/Everything_is_a_file
[file-descriptor]: https://en.m.wikipedia.org/wiki/File_descriptor

The following diagram shows a radical simplification of how Nix interacts with the operating system:
It uses files as build inputs, and build outputs are files again.
On the operating system, files can be run as processes, which in turn operate on files.
A build function also amounts to an operating system process (not depicted).

```
+-----------------------------------------------------------------+
| Nix                                                             |
|                  [ commmand line interface ]------,             |
|                               |                   |             |
|                           evaluates               |             |
|                               |                manages          |
|                               V                   |             |
|                  [ configuration language  ]      |             |
|                               |                   |             |
| +-----------------------------|-------------------V-----------+ |
| | store                  evaluates to                         | |
| |                             |                               | |
| |             referenced by   V       builds                  | |
| |  [ build input ] ---> [ build plan ] ---> [ build result ]  | |
| |         ^                                        |          | |
| +---------|----------------------------------------|----------+ |
+-----------|----------------------------------------|------------+
            |                                        |
    file system object                          store path
            |                                        |
+-----------|----------------------------------------|------------+
| operating system        +------------+             |            |
|           '------------ |            | <-----------'            |
|                         |    file    |                          |
|                     ,-- |            | <-,                      |
|                     |   +------------+   |                      |
|          execute as |                    | read, write, execute |
|                     |   +------------+   |                      |
|                     '-> |  process   | --'                      |
|                         +------------+                          |
+-----------------------------------------------------------------+
```

There exist different types of stores, which all follow this model.
Examples:
- store on the local file system
- remote store accessible via SSH
- binary cache store accessible via HTTP

To make store objects accessible to processes, stores ultimately have to expose store objects through the file system.

## A [Rosetta stone][rosetta-stone] for build system terminology

The Nix store's design is comparable to other build systems.
Usage of terms is, for historic reasons, not entirely consistent within the Nix ecosystem, and still subject to slow change.

The following translation table points out similarities and equivalent terms, to help clarify their meaning and inform consistent use in the future.

| generic build system             | Nix              | [Bazel][bazel]                                                       | [Build Systems à la Carte][bsalc] | programming language     |
| -------------------------------- | ---------------- | -------------------------------------------------------------------- | --------------------------------- | ------------------------ |
| data (build input, build result) | store object     | [artifact][bazel-artifact]                                           | value                             | value                    |
| build instructions               | builder          | ([depends on action type][bazel-actions])                            | function                          | function                 |
| build task                       | derivation       | [action][bazel-action]                                               | `Task`                            | [thunk][thunk]           |
| build plan                       | derivation graph | [action graph][bazel-action-graph], [build graph][bazel-build-graph] | `Tasks`                           | [call graph][call-graph] |
| build                            | build            | build                                                                | application of `Build`            | evaluation               |
| persistence layer                | store            | [action cache][bazel-action-cache]                                   | `Store`                           | heap                     |

All of these systems share features of [declarative programming][declarative-programming] languages, a key insight first put forward by Eelco Dolstra et al. in [Imposing a Memory Management Discipline on Software Deployment][immdsd] (2004), elaborated in his PhD thesis [The Purely Functional Software Deployment Model][phd-thesis] (2006), and further refined by Andrey Mokhov et al. in [Build Systems à la Carte][bsalc] (2018).

[rosetta-stone]: https://en.m.wikipedia.org/wiki/Rosetta_Stone
[bazel]: https://bazel.build/start/bazel-intro
[bazel-artifact]: https://bazel.build/reference/glossary#artifact
[bazel-actions]: https://docs.bazel.build/versions/main/skylark/lib/actions.html
[bazel-action]: https://bazel.build/reference/glossary#action
[bazel-action-graph]: https://bazel.build/reference/glossary#action-graph
[bazel-build-graph]: https://bazel.build/reference/glossary#build-graph
[bazel-action-cache]: https://bazel.build/reference/glossary#action-cache
[thunk]: https://en.m.wikipedia.org/wiki/Thunk
[call-graph]: https://en.m.wikipedia.org/wiki/Call_graph
[declarative-programming]: https://en.m.wikipedia.org/wiki/Declarative_programming
[immdsd]: https://edolstra.github.io/pubs/immdsd-icse2004-final.pdf
[phd-thesis]: https://edolstra.github.io/pubs/phd-thesis.pdf
[bsalc]: https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf