author | Jade Lovelace <lix@jade.fyi> | 2024-08-07 02:00:50 -0700 |
---|---|---|
committer | Jade Lovelace <lix@jade.fyi> | 2024-08-07 02:52:00 -0700 |
commit | 1437d3df15c1efae3164ae45c3285bd9959def5f (patch) | |
tree | e2eac9bba68e1976d4ce747102a3ee4664a93ce6 /src | |
parent | 529eed74c477eee8567f28379210cd47f0b4e18f (diff) |
darwin: workaround PROC_PIDLISTFDS on processes with no fds
This has been causing various seemingly spurious CI failures, as well as
failures for people running tests on beta builds.
lix> ++(nix-collect-garbage-dry-run.sh:20) nix-store --gc --print-dead
lix> ++(nix-collect-garbage-dry-run.sh:20) wc -l
lix> finding garbage collector roots...
lix> error: Listing pid 87261 file descriptors: Undefined error: 0
There is no real way to write a proper test for this, other than to
start a process like the following:
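#include <unistd.h>

int main(void) {
    // Close every descriptor the process might have inherited
    for (int i = 0; i < 1000; ++i) {
        close(i);
    }
    // Stay alive long enough for the gc to inspect the process
    sleep(10000);
}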
and then let Lix's gc look at it.
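If someone did want to script the manual check, one possible shape is a pair of helpers that fork such a process and later clean it up. This is only a sketch: `spawnFdlessChild` and `reapChild` are hypothetical names that do not exist in Lix, and the child pauses indefinitely instead of sleeping for a fixed time.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Lix): fork a child that closes its
// file descriptors and then blocks, mirroring the reproducer above.
// Returns the child's pid, or -1 if fork() failed.
pid_t spawnFdlessChild()
{
    pid_t pid = fork();
    if (pid == 0) {
        // Child: drop descriptors 0..999, then wait to be killed.
        for (int i = 0; i < 1000; ++i) {
            close(i);
        }
        for (;;) {
            pause();
        }
    }
    return pid;
}

// Hypothetical helper: kill the child and reap it. Returns true if the
// child was terminated by the SIGKILL sent here.
bool reapChild(pid_t pid)
{
    assert(pid > 0);
    if (kill(pid, SIGKILL) != 0) {
        return false;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

On an affected macOS machine, running `nix-store --gc --print-dead` between the two calls would be the point at which the gc looks at the fd-less process. The helpers only set up and tear down the process; they do not check the gc result.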
I have relatively high confidence this *will* fix the problem since I
have manually confirmed the behaviour of the libproc call is
as-unexpected, and it would perfectly explain the observed symptom.
Fixes: https://git.lix.systems/lix-project/lix/issues/446
Change-Id: I67669b98377af17895644b3bafdf42fc33abd076
Diffstat (limited to 'src')
-rw-r--r-- | src/libstore/platform/darwin.cc | 17 |
1 files changed, 16 insertions, 1 deletions
diff --git a/src/libstore/platform/darwin.cc b/src/libstore/platform/darwin.cc
index 1b591fde3..1f7e9be23 100644
--- a/src/libstore/platform/darwin.cc
+++ b/src/libstore/platform/darwin.cc
@@ -56,12 +56,27 @@ void DarwinLocalStore::findPlatformRoots(UncheckedRoots & unchecked)
             while (fdBufSize > fds.size() * sizeof(struct proc_fdinfo)) {
                 // Reserve some extra size so we don't fail too much
                 fds.resize((fdBufSize + fdBufSize / 8) / sizeof(struct proc_fdinfo));
+                errno = 0;
                 fdBufSize = proc_pidinfo(
                     pid, PROC_PIDLISTFDS, 0, fds.data(), fds.size() * sizeof(struct proc_fdinfo)
                 );

+                // errno == 0???! Yes, seriously. This is because macOS has a
+                // broken syscall wrapper for proc_pidinfo that has no way of
+                // dealing with the system call successfully returning 0. It
+                // takes the -1 error result from the errno-setting syscall
+                // wrapper and turns it into a 0 result. But what if the system
+                // call actually returns 0? Then you get an errno of success.
+                //
+                // https://github.com/apple-opensource/xnu/blob/4f43d4276fc6a87f2461a3ab18287e4a2e5a1cc0/libsyscall/wrappers/libproc/libproc.c#L100-L110
+                // https://git.lix.systems/lix-project/lix/issues/446#issuecomment-5483
+                // FB14695751
                 if (fdBufSize <= 0) {
-                    throw SysError("Listing pid %1% file descriptors", pid);
+                    if (errno == 0) {
+                        break;
+                    } else {
+                        throw SysError("Listing pid %1% file descriptors", pid);
+                    }
                 }
             }
             fds.resize(fdBufSize / sizeof(struct proc_fdinfo));
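
The errno handling in the patch can be illustrated without macOS by simulating the wrapper behaviour that the code comment describes. The sketch below is not the libproc implementation: `fakeRawCall`, `fakeWrapper`, `ListResult`, and `classify` are hypothetical names made up for illustration. It assumes only what the comment states, namely that the wrapper turns a -1 error into 0, so a 0 return is ambiguous unless errno was cleared beforehand.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for an errno-setting syscall: returns -1 and
// sets errno on failure, otherwise a byte count that may legitimately be 0.
int fakeRawCall(int bytes, int err)
{
    if (err != 0) {
        errno = err;
        return -1;
    }
    return bytes;
}

// Hypothetical stand-in for a wrapper that folds the -1 error result
// into 0, as the comment in the patch describes for proc_pidinfo.
int fakeWrapper(int bytes, int err)
{
    int ret = fakeRawCall(bytes, err);
    return ret == -1 ? 0 : ret;
}

enum class ListResult { Data, Empty, Failed };

// Same shape as the patched loop body: clear errno, call the wrapper,
// and treat "non-positive result with errno still 0" as an empty list
// instead of an error.
ListResult classify(int bytes, int err)
{
    errno = 0;
    int ret = fakeWrapper(bytes, err);
    if (ret <= 0) {
        return errno == 0 ? ListResult::Empty : ListResult::Failed;
    }
    return ListResult::Data;
}
```

Here `classify(0, 0)` models a process with no open file descriptors and yields `ListResult::Empty`, which corresponds to the `break` in the patch, while `classify(0, ESRCH)` models a real failure and yields `ListResult::Failed`, which corresponds to the `throw SysError(...)` branch. Clearing errno before the call matters because a stale value left over from an earlier call would otherwise make the empty case look like a failure.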