TVL depot development (mail to depot@tvl.su)
From: Vincent Ambo <tazjin@tvl.su>
To: Adam Joseph <adam@westernsemico•com>, depot@tvl.su
Cc: Florian Klink <flokli@flokli•de>
Subject: Re: [tvix] string contexts vs. reference scanning
Date: Thu, 16 Mar 2023 12:41:52 +0300
Message-ID: <585fa6e7-f4f1-5b0a-1bef-e46b422fcec5@tvl.su>
In-Reply-To: <CANHrikpS8YXVPUzmqE-yWBse85j3RRjTu0tR4yg9Cqz89CpeNw@mail.gmail.com>

Okay, there's some (for me) new information on string contexts that will 
require some careful handling if we want to proceed with reference scanning.

Let me preface by saying that despite this problem, reference-scanning 
for inputs yields perfectly functional, *equivalent* but not *identical* 
derivations to Nix. We do currently consider it a problem because we 
want to be fully hash-equal with C++ Nix, both to prove that our 
implementation is correct and to make use of Hydra's cache.

Now, for the problem:

There are some real-life cases, for example during nixpkgs 
bootstrapping, where multiple different fixed-output derivations are 
written to produce the same hash.

For example, bootstrap sources that are downloaded early are fetched 
using a special "builder hack", in which the `builder` field of the 
derivation is populated with the magic string `builtin:fetchurl`. Nix 
then performs the fetch itself instead of running an external builder, 
with everything looking like a normal derivation to the user.

These bootstrap sources are later on defined *again*, once `curl` is 
available, to be downloaded using the standard `pkgs.fetchtarball` 
mechanism, but yielding the *same* outputs (as the same files are being 
fetched).

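In our reference-scanning implementation, scanning for the output path 
of such a fixed-output derivation (FOD) yields whichever derivation was 
the *first* to produce that path, and that is the drv that ends up in 
the `inputDrvs` field of the dependent derivation. C++ Nix, which tracks 
this through string contexts, records the derivation that was actually 
referenced, so the two can disagree.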
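
To make the failure mode concrete, here is a minimal sketch of a 
"first one wins" lookup from output path to derivation. It is not 
Tvix's actual code: the `KnownPaths` type, its `register` and `scan` 
methods, and the shortened store paths are all hypothetical and exist 
only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical registry: output path -> derivation first seen producing it.
#[derive(Default)]
struct KnownPaths {
    by_output: HashMap<String, String>,
}

impl KnownPaths {
    /// Record that `drv_path` produces `output_path`. If the output path is
    /// already known, the earlier derivation is kept (first one wins).
    fn register(&mut self, output_path: &str, drv_path: &str) {
        self.by_output
            .entry(output_path.to_string())
            .or_insert_with(|| drv_path.to_string());
    }

    /// Naive reference scan: return the derivations whose output paths
    /// occur anywhere in `haystack`, sorted for a stable result.
    fn scan(&self, haystack: &str) -> Vec<&str> {
        let mut found: Vec<&str> = self
            .by_output
            .iter()
            .filter(|(out, _)| haystack.contains(out.as_str()))
            .map(|(_, drv)| drv.as_str())
            .collect();
        found.sort();
        found
    }
}

fn main() {
    let mut known = KnownPaths::default();

    // Two different FODs (placeholder paths) that produce the same output.
    let out = "/nix/store/aaaa-source.tar.gz";
    known.register(out, "/nix/store/bbbb-source.tar.gz.drv"); // bootstrap fetch
    known.register(out, "/nix/store/cccc-source.tar.gz.drv"); // later curl-based fetch

    // A string that refers to the output, as the later derivation's
    // consumer might. Scanning can only ever report the first drv.
    let env_value = format!("tar xf {out}");
    let input_drvs = known.scan(&env_value);
    assert_eq!(input_drvs, vec!["/nix/store/bbbb-source.tar.gz.drv"]);
    println!("{input_drvs:?}");
}
```

The scanned string carries no information about *which* derivation the 
expression referred to, so the second derivation can never be recovered 
from the output path alone.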

There's an orthogonal problem which made this confusing to understand, 
where C++ Nix has some special logic for how it hashes derivations that 
use fixed-output paths, which we haven't fully replicated yet. This led 
to hash differences which masked this underlying problem (the 
differences still exist, and are a separate issue).

I discussed this with Adam yesterday and he suggested an approach 
similar to `builtins.placeholder`, which would only be in effect for 
fixed-output derivations. I think this is feasible but haven't sketched 
anything yet.

Either way, for me this also raises the thought of whether we should 
decouple Tvix's internal representation of a derivation from that of C++ 
Nix and only "materialise" C++ Nix derivations (and accompanying hashes) 
where needed. Something to think about ...

//V
