# Store Path

Nix implements [references](store.md#reference) to [store objects](store.md#store-object) as *store paths*.

A store path is a pair of

- a 20-byte [digest](#digest) for identification
- a symbolic name for people to read.

Example:

- digest: `b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z`
- name:   `firefox-33.1`

A store path is rendered to a file system path as the concatenation of

  - [store directory](#store-directory)
  - path-separator (`/`)
  - [digest](#digest) rendered in a custom variant of [base-32](https://en.m.wikipedia.org/wiki/Base32) (20 arbitrary bytes become 32 ASCII characters)
  - hyphen (`-`)
  - name

Example:

      /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1
      |--------| |------------------------------| |----------|
    store directory            digest                 name
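
The rendering described above can be sketched in Python. The alphabet and bit order of the base-32 variant are not given in this text, so they are assumptions here, and the function names `nix_base32`, `render_store_path` and `split_store_path` are hypothetical.

```python
# Assumed alphabet of the custom base-32 variant (the letters e, o, t, u are omitted).
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(data: bytes) -> str:
    """Encode bytes as 5-bit groups (assumed bit order).

    The bytes are read as one little-endian number and the highest
    5-bit group is emitted first, so 20 bytes become 32 characters.
    """
    length = (len(data) * 8 - 1) // 5 + 1
    chars = []
    for n in range(length - 1, -1, -1):
        byte_index, bit_offset = divmod(n * 5, 8)
        group = data[byte_index] >> bit_offset
        if byte_index + 1 < len(data):
            # The group may straddle two bytes; pull in the next byte's low bits.
            group |= data[byte_index + 1] << (8 - bit_offset)
        chars.append(NIX_BASE32_ALPHABET[group & 0x1F])
    return "".join(chars)


def render_store_path(store_dir: str, digest: bytes, name: str) -> str:
    """Concatenate store directory, '/', rendered digest, '-', and name."""
    return f"{store_dir}/{nix_base32(digest)}-{name}"


def split_store_path(path: str, store_dir: str = "/nix/store") -> tuple[str, str, str]:
    """Split a rendered store path back into (store directory, digest text, name)."""
    prefix = store_dir + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not inside {store_dir!r}")
    digest_text, separator, name = path[len(prefix):].partition("-")
    if len(digest_text) != 32 or not separator or not name:
        raise ValueError(f"{path!r} is not a well-formed store path")
    return store_dir, digest_text, name
```

Splitting relies only on the fixed 32-character width of the rendered digest and the first hyphen, so names that contain hyphens themselves, such as `firefox-33.1`, are preserved.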

## Store Directory {#store-directory}

Every [store](./store.md) has a store directory.

If the store has a [file system representation](./store.md#files-and-processes), this directory contains the store’s [file system objects](#file-system-object), which can be addressed by [store paths](#store-path).

This means a store path is not derived from the referenced store object alone; it also depends on the store directory of the store that holds the object.

> **Note**
> The store directory defaults to `/nix/store`, but is in principle arbitrary.

It is important which store a given store object belongs to:
Files in the store object can contain store paths, and processes may read these paths.
Nix can only guarantee [referential integrity](store/closure.md) if store paths do not cross store boundaries.

Therefore one can only copy store objects to a different store if

- the source and target stores' directories match

  or

- the store object in question has no references, that is, contains no store paths.

One cannot copy a store object to a store with a different store directory.
Instead, it has to be rebuilt, together with all its dependencies.
It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.
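
The copy rule above reduces to a small predicate. This is a minimal sketch: `can_copy` and its parameters are hypothetical names, not part of any Nix interface.

```python
def can_copy(source_store_dir: str, target_store_dir: str, references: set[str]) -> bool:
    """Return True if a store object may be copied between two stores.

    Copying is allowed when both stores use the same store directory,
    or when the object has no references (contains no store paths).
    """
    return source_store_dir == target_store_dir or not references


# Same store directory: references remain valid after copying.
print(can_copy("/nix/store", "/nix/store", {"/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1"}))
# Different store directory and the object has references: it must be rebuilt.
print(can_copy("/nix/store", "/opt/store", {"/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1"}))
```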

## Digest {#digest}

In a [store path](#store-path), the [digest][digest] is the output of a [cryptographic hash function][hash] of either all *inputs* involved in building the referenced store object or its actual *contents*.

Store objects are therefore said to be either [input-addressed](#input-addressing) or [content-addressed](#content-addressing).

> **Historical Note**
> The 20-byte restriction exists because digests were originally [SHA-1][sha-1] hashes.
> Nix now uses [SHA-256][sha-256], but longer hashes are still reduced to 20 bytes for compatibility.

[digest]: https://en.m.wiktionary.org/wiki/digest#Noun
[hash]: https://en.m.wikipedia.org/wiki/Cryptographic_hash_function
[sha-1]: https://en.m.wikipedia.org/wiki/SHA-1
[sha-256]: https://en.m.wikipedia.org/wiki/SHA-256
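
The text does not say how a longer hash is reduced to 20 bytes. One plausible scheme, assumed here rather than taken from the text, is XOR folding: each byte of the long hash is XORed into position `i % 20` of the result. The name `compress_hash` is hypothetical.

```python
import hashlib


def compress_hash(full_hash: bytes, new_size: int = 20) -> bytes:
    """Fold a hash down to new_size bytes by XOR (assumed reduction scheme)."""
    folded = bytearray(new_size)
    for i, byte in enumerate(full_hash):
        folded[i % new_size] ^= byte
    return bytes(folded)


sha256_digest = hashlib.sha256(b"example input").digest()  # 32 bytes
short_digest = compress_hash(sha256_digest)                # 20 bytes
print(len(sha256_digest), len(short_digest))
```

With a 32-byte input, the first 12 output bytes each combine two input bytes, and the remaining 8 are copied unchanged.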

### Reference scanning

When a new store object is built, Nix scans its file contents for store paths to construct its set of references.

The special format of a store path's [digest](#digest) allows reliably detecting it among arbitrary data.
Nix uses the [closure](store.md#closure) of build inputs to derive the list of allowed store paths, to avoid false positives.

This way, scanning files captures run time dependencies without the user having to declare them explicitly.
Doing it at build time and persisting references in the store object avoids repeating this time-consuming operation.
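
A minimal sketch of such a scan, assuming the candidate list is already the closure of the build inputs: only the rendered digest of each candidate is searched for, since it is the part that is unlikely to occur by accident. The function name `scan_references` and the second store path in the usage example are made up for illustration.

```python
def scan_references(contents: bytes, candidates: set[str], store_dir: str = "/nix/store") -> set[str]:
    """Return the candidate store paths whose rendered digest occurs in contents."""
    found = set()
    for path in candidates:
        # Text between the store directory and the first hyphen is the digest.
        digest_text = path[len(store_dir) + 1:].split("-", 1)[0]
        if digest_text.encode("ascii") in contents:
            found.add(path)
    return found


candidates = {
    "/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1",
    "/nix/store/11111111111111111111111111111111-unused-input-1.0",  # made-up path
}
script = b"#!/bin/sh\nexec /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1/bin/firefox\n"
print(scan_references(script, candidates))
```

Restricting the search to known candidates is what keeps the scan free of false positives: a 32-character string that merely looks like a digest is ignored unless it belongs to a build input.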

> **Note**
> In practice, it is sometimes still necessary for users to declare certain dependencies explicitly, if they are to be preserved in the build result's closure.
> This depends on the specifics of the software to build and run.
>
> For example, Java programs are compressed after compilation, which obfuscates any store paths they may refer to and prevents Nix from automatically detecting them.

## Input Addressing {#input-addressing}

Input addressing means that the digest derives from how the store object was produced, namely its build inputs and build plan.

To compute the hash of a store object one needs a deterministic serialisation, i.e., a binary string representation which only changes if the store object changes.

Nix has a custom serialisation format called Nix Archive (NAR).
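
To illustrate what "deterministic" means here, the following sketch serialises a small file tree so that the result depends only on names and contents, not on the order in which entries were created. The format is invented for this example and is not NAR; `serialise` is a hypothetical name.

```python
import hashlib


def serialise(node) -> bytes:
    """Serialise a file tree deterministically (illustrative format, not NAR).

    A node is either bytes (a regular file) or a dict mapping names to
    nodes (a directory). Directory entries are emitted in sorted order
    and every field is length-prefixed, so the encoding is unambiguous.
    """
    if isinstance(node, bytes):
        return b"file:" + str(len(node)).encode() + b":" + node
    if isinstance(node, dict):
        parts = [b"dir:" + str(len(node)).encode() + b":"]
        for name in sorted(node):
            encoded_name = name.encode("utf-8")
            parts.append(str(len(encoded_name)).encode() + b":" + encoded_name)
            parts.append(serialise(node[name]))
        return b"".join(parts)
    raise TypeError(f"unsupported node type: {type(node).__name__}")


tree_a = {"bin": {"hello": b"#!/bin/sh\necho hello\n"}, "README": b"docs"}
tree_b = {"README": b"docs", "bin": {"hello": b"#!/bin/sh\necho hello\n"}}
# Same tree, different insertion order: the hashes agree.
print(hashlib.sha256(serialise(tree_a)).hexdigest() == hashlib.sha256(serialise(tree_b)).hexdigest())
```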

Store object references of this sort *cannot* be validated from the content of the store object.
Rather, a cryptographic signature has to be used to indicate that someone is vouching for the store object really being produced from a build plan with that digest.

## Content Addressing {#content-addressing}

Content addressing means that the digest derives from the store object's contents, namely its file system objects and references.
If one knows content addressing was used, one can recalculate the reference and thus verify the store object.
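
Verification then amounts to recomputing the digest and comparing it with the claimed one. The sketch below assumes a single-file store object and an invented hashing scheme (SHA-256 over length-prefixed contents and sorted references); it is not the scheme Nix uses, and `content_digest` and `verify` are hypothetical names.

```python
import hashlib


def content_digest(contents: bytes, references: list[str]) -> str:
    """Hash the contents and references of a store object (illustrative scheme)."""
    hasher = hashlib.sha256()
    hasher.update(len(contents).to_bytes(8, "big"))
    hasher.update(contents)
    # Sort references so the digest does not depend on their order.
    for reference in sorted(references):
        encoded = reference.encode("utf-8")
        hasher.update(len(encoded).to_bytes(8, "big"))
        hasher.update(encoded)
    return hasher.hexdigest()


def verify(claimed_digest: str, contents: bytes, references: list[str]) -> bool:
    """Recompute the digest and compare it with the claimed one."""
    return content_digest(contents, references) == claimed_digest


refs = ["/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1"]
claimed = content_digest(b"original contents", refs)
print(verify(claimed, b"original contents", refs))   # contents match the digest
print(verify(claimed, b"tampered contents", refs))   # contents do not match
```

No signature or trusted third party is involved in this check, which is the practical difference from input addressing.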

Content addressing is currently only used for the special cases of source files and "fixed-output derivations", where the contents of a store object are known in advance.
Content addressing of build results is still an [experimental feature subject to some restrictions](https://github.com/tweag/rfcs/blob/cas-rfc/rfcs/0062-content-addressed-paths.md).