Race in .drv import helper initialization with parallel eval #447

@omarjatoi

Describe the bug

Importing many .drv files concurrently can race under parallel-eval, especially through nix eval --json, which deep-forces attributes in parallel.

The issue appears to be in derivationToValue: EvalState::vImportedDrvToDerivation is assigned before imported-drv-to-derivation.nix.gen.hh has finished evaluating. Another worker can observe the non-null root and call forceFunction while the helper is still a thunk.

Observed failure:

error: expected a function but found a thunk: «thunk»

Older builds (specifically 3.17.0) have also crashed in Value::type() from the same path.

Steps To Reproduce

  1. Create repro.nix:
let
  mkDrv = n:
    derivation {
      name = "parallel-drv-import-repro-${toString n}";
      system = builtins.currentSystem;
      builder = "/bin/sh";
      args = [ "-c" "echo ok > $out" ];
    };

  mkEntry = n:
    let
      drvPath = builtins.storePath (mkDrv n).drvPath;
    in {
      name = "x${toString n}";
      value = (import drvPath).outPath;
    };
in
builtins.listToAttrs (map mkEntry (builtins.genList (n: n) 1000))
  2. Confirm the single-core control case succeeds:
NIX_CONFIG=$'extra-experimental-features = parallel-eval\neval-cores = 1' \
  nix eval --json --impure --show-trace --file repro.nix >/dev/null
  3. Run with parallel eval enabled:
NIX_CONFIG=$'extra-experimental-features = parallel-eval\neval-cores = 0' \
  nix eval --json --impure --show-trace --file repro.nix >/dev/null
  4. Observe intermittent failure, often immediately.

Expected behavior

Parallel evaluation should behave like eval-cores = 1; concurrent .drv imports should not observe a partially initialized helper value.

Metadata

Observed on (reproducible on both Darwin and Linux):

nix (Determinate Nix 3.19.1) 2.34.6

Additional context

Likely fix: serialize lazy initialization of vImportedDrvToDerivation, evaluate and validate the helper as a function first, then publish the RootValue. The per-.drv conversion can remain outside the lock.
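
To make the proposed ordering concrete, here is a minimal, self-contained C++ sketch of the "evaluate, validate, then publish" pattern. It is not the actual Nix code or the patch: HelperValue, HelperCache, convertDrv and convertInParallel are hypothetical stand-ins for Value, the vImportedDrvToDerivation root, and derivationToValue, and the "helper" is just a function on integers.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an evaluator value (NOT Nix's real Value type):
// isFunction is false while the value is still an unevaluated thunk.
struct HelperValue {
    bool isFunction;
    std::function<int(int)> fn;
};

// Hypothetical stand-in for the cached helper root.
class HelperCache {
    std::mutex initMutex;
    std::shared_ptr<const HelperValue> published; // guarded by initMutex

public:
    std::atomic<int> evalCount{0}; // how many times the helper was evaluated

    // Returns the helper, initializing it at most once. The pointer is
    // published only after the helper is evaluated and validated, so no
    // caller can observe a thunk.
    std::shared_ptr<const HelperValue>
    get(const std::function<HelperValue()> & evalHelper)
    {
        std::lock_guard<std::mutex> lock(initMutex);
        if (!published) {
            HelperValue v = evalHelper(); // 1. evaluate
            ++evalCount;
            if (!v.isFunction)            // 2. validate
                throw std::runtime_error("expected a function but found a thunk");
            published = std::make_shared<const HelperValue>(std::move(v)); // 3. publish
        }
        return published;
    }
};

// Per-.drv conversion: only the lookup takes the lock; the call itself
// runs outside it.
int convertDrv(HelperCache & cache, int drv)
{
    auto helper = cache.get([] {
        return HelperValue{true, [](int n) { return n + 1; }};
    });
    return helper->fn(drv);
}

// Runs convertDrv from several threads at once and sums the results.
int convertInParallel(HelperCache & cache, int workers)
{
    assert(workers > 0);
    std::vector<int> results(workers, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i)
        threads.emplace_back([&cache, &results, i] {
            results[i] = convertDrv(cache, i);
        });
    for (auto & t : threads)
        t.join();
    int sum = 0;
    for (int r : results)
        sum += r;
    return sum;
}
```

In this sketch every caller takes the mutex on lookup, which is the simplest way to guarantee that the published pointer is never visible before validation. A real fix could use a cheaper fast path (for example std::call_once or an atomic pointer with release/acquire ordering), but that is a design choice outside what this report establishes.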

I have a tested patch that resolves the issue: omarjatoi@0171ee2. I'm happy to open a PR, but wanted to make sure there was a bug report first.

Disclaimer: I worked through this bug with Codex, and the patch above was generated by Codex as well.
