From: sterni <sternenseemann@systemli•org>
To: Vincent Ambo <mail@tazj•in>, depot@tvl.su
Subject: Re: [tvix] string contexts vs. reference scanning
Date: Wed, 11 Jan 2023 12:49:12 +0100
Message-ID: <ecb3ceb6-eb07-7e96-6aa3-28b4b640d445@systemli.org>
In-Reply-To: <CANHrikpS8YXVPUzmqE-yWBse85j3RRjTu0tR4yg9Cqz89CpeNw@mail.gmail.com>
On 1/9/23 23:07, Vincent Ambo wrote:
> Any other input people might have on string contexts is also welcome!
One thing I'd want to see answered is how to handle import from
derivation. In current C++ Nix this is handled in the following way:
import calls coerceToPath on the value it gets passed. coerceToPath
looks at the string context and realises any derivation found within it.
Finally the actual file is retrieved from the store / disk.
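To make the flow above concrete, here is a minimal sketch of import from derivation. It assumes `<nixpkgs>` is available on NIX_PATH; the name `generated` is made up for illustration.

```nix
let
  pkgs = import <nixpkgs> { };

  # A derivation whose single output is a file containing a Nix expression.
  generated = pkgs.writeText "generated.nix" "{ answer = 42; }";
in
  # "${generated}" is the output path as a string whose context points at
  # the derivation. import has to realise (build) that derivation during
  # evaluation before it can read the file from the store.
  (import "${generated}").answer
```

Evaluating this forces a build in the middle of evaluation, which is the behaviour any replacement for string contexts would have to preserve.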
There are also similar occasions where things get realised while
evaluating (interactively?) besides reading / importing, but I don't have
a very good handle on those yet.
> [1]: I'm not actually sure about this. It's possible that all these
> use-cases that exist right now (e.g. string context discarding in TVL's
> :llama: step) actually go away with the Tvix model of starting builds
> immediately, but strongly ordered. Thoughts?
Currently you can work with string context in the following ways:
1. You can discard some using builtins.unsafeDiscardOutputDependency.
This has, as far as I can tell, been added to combat the
oddity of the string context of the `drvPath` attribute.
Apparently, in 2009 disnix ran into the problem that the `drvPath`
of a derivation would cause all of its outputs to be built.
Subsequently, `builtins.unsafeDiscardOutputDependency`
was [introduced].
My _unconfirmed_ theory is that this was a quick and easy
workaround that was implemented without considering the underlying
problem. In my view, there is no reason why `drvPath` should
incur a reference to all outputs of the derivation as well as
the derivation file itself (I think this is thanks to the reference
scanner the store runs after the fact which determines if the
derivation and/or any of its outputs are actually referenced).
I would be interested in any theories as to why `drvPath` behaves
and maybe even should behave that way (maybe useful for recursive
Nix?). In my experience `drvPath` either never enters a derivation or
is closely accompanied by `builtins.unsafeDiscardOutputDependency`.
Maybe when implementing string contexts there was some confusion
about what "=<drv_path>" should mean originally, but this was never
fixed when disnix came around.
2. You can discard all using builtins.unsafeDiscardStringContext.
I think the uses of this builtin fall into two categories:
- To drop wrongfully retained string context. All string
operations retain string context, even though some actually
destroy any reference that was present in the string.
Classic examples would be
`builtins.substring 0 3 ">>> ${pkgs.hello}"` or
`builtins.baseNameOf "${pkgs.hello}/bin/hello"`.
- As an escape hatch from references to the derivations
in question. We use this in //nix/buildkite: We use
derivation paths, so we can skip re-evaluating targets,
but discard any references to those files. Since buildkite
doesn't know about nix-copy-closure(1), it'd be difficult
to copy the required derivation files to an executing machine
even if we had the correct references. Instead we impurely
access the store and re-evaluate the target if the derivation
file is missing.
With reference scanning, wrongfully retained string context should
basically disappear, but so would the escape hatch. I think we'd
need to invent a new mechanism entirely, maybe even as ugly as
an `allowedImpureReferences = [ … ];`.
A third use is described in the item for appendContext.
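The "wrongfully retained" case can be reproduced without nixpkgs. The following is a small sketch; the derivation `drv` is a throwaway that is only instantiated, never built, and its attribute values are arbitrary assumptions.

```nix
let
  # Throwaway derivation; only its store path is needed, it is never built.
  drv = derivation {
    name = "example";
    system = "x86_64-linux";
    builder = "/bin/sh";
  };

  # Only the first three characters survive, so no store path is left in
  # the resulting string ...
  prefix = builtins.substring 0 3 ">>> ${drv}";
in {
  inherit prefix;
  # ... but C++ Nix still carries the context of the interpolated path along.
  retained = builtins.hasContext prefix;
  # The context has to be dropped explicitly.
  discarded = builtins.hasContext (builtins.unsafeDiscardStringContext prefix);
}
```

In C++ Nix, `retained` should come out as `true` and `discarded` as `false`, even though `prefix` is just `">>>"`.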
3. You can check if there's any using builtins.hasContext.
4. You can query it using builtins.getContext.
In C++ Nix 2.3 this is not particularly useful, since you can
only inspect the root of the dependency graph (i.e. outPath
will just give you drvPath as context). In Nix >= 2.6 you can,
however, use this in conjunction with builtins.readFile
to query the references of store paths. This was of course already
possible before, but required writing a reference scanner and/or
derivation parser in Nix. We have a hacky [prototype] of this
in depot.
I think it should be possible to emulate this using reference
scanning as well, i.e. tvix-eval would need to run the reference
scanner on the given string for getContext. This would actually
be pretty nice for doing dependency analysis.
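For reference, this is roughly what `builtins.getContext` reports for the different kinds of context in C++ Nix, including the `drvPath` oddity from item 1. It is a sketch with a throwaway derivation (`drv` and its attribute values are assumptions for illustration); `builtins.attrValues` is used only to hide the store path keys.

```nix
let
  # Throwaway derivation; it is instantiated but never built.
  drv = derivation {
    name = "example";
    system = "x86_64-linux";
    builder = "/bin/sh";
  };

  # The context is an attribute set keyed by store path. attrValues drops
  # the keys (the .drv path), leaving only the kind of reference.
  kinds = s: builtins.attrValues (builtins.getContext s);
in {
  # Reference to one output of the derivation:
  # [ { outputs = [ "out" ]; } ]
  outPath = kinds "${drv}";

  # drvPath refers to the derivation and all of its outputs:
  # [ { allOutputs = true; } ]
  drvPath = kinds drv.drvPath;

  # After discarding the output dependency only the .drv file remains:
  # [ { path = true; } ]
  drvFileOnly = kinds (builtins.unsafeDiscardOutputDependency drv.drvPath);
}
```

A reference-scanning implementation of getContext would have to reconstruct these distinctions from the string contents alone.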
5. You can add it using builtins.appendContext.
The main use case for `builtins.appendContext` I can think of
is to restore string context after it has been discarded via
`builtins.unsafeDiscardStringContext`. This is required due
to technical limitations in C++ Nix that affect some algorithms:
If you, for example, want to use (parts of) input strings
as keys in an attribute set, you need to make sure they have no
string context. A function that has such a step in its algorithm
would then use `builtins.getContext` to store the context,
run the actual algorithm after `builtins.unsafeDiscardStringContext`
is applied and finally return after restoring the string context
using `builtins.appendContext`.
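The store / discard / restore pattern described for appendContext can be sketched as follows. `uniqueStrings` is a hypothetical helper written for this example, not something taken from depot or nixpkgs, and `drv` is a throwaway derivation that is never built.

```nix
let
  drv = derivation {
    name = "example";
    system = "x86_64-linux";
    builder = "/bin/sh";
  };

  inputs = [ "${drv}/bin/a" "${drv}/bin/a" "${drv}/bin/b" ];

  # Hypothetical helper: deduplicate strings by using them as attribute
  # names, which is only allowed for strings without context.
  uniqueStrings = strings:
    let
      # Store each string's context under its context-free form.
      contexts = builtins.listToAttrs (map (s: {
        name = builtins.unsafeDiscardStringContext s;
        value = builtins.getContext s;
      }) strings);
    in
      # attrNames yields the context-free keys; restore the saved context.
      map (key: builtins.appendContext key contexts.${key})
        (builtins.attrNames contexts);
in
  # Two unique strings remain, both with their context restored,
  # so this evaluates to [ true true ].
  map builtins.hasContext (uniqueStrings inputs)
```

Without the `unsafeDiscardStringContext` call, `listToAttrs` would reject the names because they refer to a store path.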
[introduced]: https://github.com/NixOS/nix/commit/437077c39dd7abb44b2ab02cb9c6215d125bef04
[prototype]: https://code.tvl.fyi/tree/nix/dependency-analyzer/default.nix?id=805219a2fad0edac10d046fc5ad5820edb4482ee#n10