Skip to content

Add logger bindings, fix bindgen-gcc.sh#22

Merged
RossComputerGuy merged 3 commits into
integrationfrom
tristanross/dsp-204
May 7, 2026
Merged

Add logger bindings, fix bindgen-gcc.sh#22
RossComputerGuy merged 3 commits into
integrationfrom
tristanross/dsp-204

Conversation

@RossComputerGuy
Copy link
Copy Markdown
Member

@RossComputerGuy RossComputerGuy commented May 7, 2026

Binds the logger from DeterminateSystems/nix-src#448

Summary by CodeRabbit

  • New Features

    • Added a Rust-backed, FFI-safe logger interface with customizable logging callbacks, activity tracking, and safe lifecycle handling.
  • Improvements

    • Streamlined discovery and formatting of system include paths in the build script for more efficient and concise argument assembly.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a Rust Nix logger module with FFI thunks, vtable, installation, and tests; refactors bindgen-gcc.sh to extract GCC include paths via a single LC_ALL=C awk/printf pipeline and assemble -I flags in one argument.

Changes

Build Script Optimization

Layer / File(s) Summary
Include Path Extraction
bindgen-gcc.sh
GCC system include paths are extracted under LC_ALL=C and parsed with a single awk/printf step to produce a combined -I... argument appended to BINDGEN_EXTRA_CLANG_ARGS.

Nix Logger FFI Binding

Layer / File(s) Summary
Module & Types
nix-bindings-util/src/lib.rs, nix-bindings-util/src/logger.rs
Adds pub mod logger; and public types: Verbosity, ActivityId, ActivityType, ResultType with raw-value conversion and Unknown/Other fallbacks.
Logger Trait Interface
nix-bindings-util/src/logger.rs
Public Logger trait (Send + Sync) with default no-op methods for log messages, activity start/stop, and string result delivery; documents FFI panic-safety.
FFI Thunks & VTABLE
nix-bindings-util/src/logger.rs
Erases trait into boxed userdata; defines extern "C" thunks that decode nullable C strings, convert raw enums, wrap calls in catch_unwind, and expose a static VTABLE including destroy.
Installation & Lifecycle
nix-bindings-util/src/logger.rs
set_logger serializes installs with a mutex, allocates boxed logger userdata, calls raw::set_logger with the VTABLE, and frees the allocation if installation is rejected.
Tests
nix-bindings-util/src/logger.rs
Unit tests confirm logger replacement drops previous logger and validate Verbosity conversions for known raw values.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hop across C and Rust tonight,
Thunks catch unwinds and keep things tight,
Enums and vtables in tidy rows,
Build flags gathered where the compiler goes,
A rabbit cheers: the bindings glow bright!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately covers both main changes: adding logger bindings and fixing bindgen-gcc.sh, matching the commit messages and file modifications.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch tristanross/dsp-204

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@nix-bindings-util/src/logger.rs`:
- Around line 199-257: The FFI thunks thunk_log, thunk_start_activity,
thunk_stop_activity, and thunk_result_string dereference userdata without
checking for null; add a defensive null check at the top of each thunk and
return early if userdata.is_null() to avoid UB, then cast userdata to &LoggerObj
only after the check (i.e., replace let logger = &*(userdata as *const
LoggerObj) with a guarded cast), ensuring none of the logger.* methods are
invoked when userdata is null.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9bb29105-916e-48f2-9efc-c8d750894707

📥 Commits

Reviewing files that changed from the base of the PR and between 1d65037 and 357d533.

⛔ Files ignored due to path filters (1)
  • flake.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • bindgen-gcc.sh
  • nix-bindings-util/src/lib.rs
  • nix-bindings-util/src/logger.rs

Comment on lines +199 to +257
unsafe extern "C" fn thunk_log(userdata: *mut c_void, level: raw::verbosity, msg: *const c_char) {
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
let msg_str = if msg.is_null() {
std::borrow::Cow::Borrowed("")
} else {
CStr::from_ptr(msg).to_string_lossy()
};
logger.log(Verbosity::from_raw(level), msg_str.as_ref());
}));
}

unsafe extern "C" fn thunk_start_activity(
userdata: *mut c_void,
activity_id: raw::activity_id,
level: raw::verbosity,
type_: raw::activity_type,
s: *const c_char,
parent_id: raw::activity_id,
) {
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
let s_str = if s.is_null() {
std::borrow::Cow::Borrowed("")
} else {
CStr::from_ptr(s).to_string_lossy()
};
logger.start_activity(
activity_id,
Verbosity::from_raw(level),
ActivityType::from_raw(type_),
s_str.as_ref(),
parent_id,
);
}));
}

unsafe extern "C" fn thunk_stop_activity(userdata: *mut c_void, activity_id: raw::activity_id) {
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
logger.stop_activity(activity_id);
}));
}

unsafe extern "C" fn thunk_result_string(
userdata: *mut c_void,
activity_id: raw::activity_id,
type_: raw::result_type,
msg: *const c_char,
) {
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
let msg_str = if msg.is_null() {
std::borrow::Cow::Borrowed("")
} else {
CStr::from_ptr(msg).to_string_lossy()
};
logger.result_string(activity_id, ResultType::from_raw(type_), msg_str.as_ref());
}));
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In the Nix C API for nix_set_logger, is callback userdataguaranteed to be non-null forlog, start_activity, stop_activity, and result_string callbacks?

💡 Result:

No, the Nix C API documentation for nix_set_logger does not explicitly guarantee that the callback userdata is non-null for the log, start_activity, stop_activity, or result_string callbacks. The official Nix manual points to generated C API documentation [1][2], but no specific details on nix_set_logger or userdata guarantees were found in accessible sources. An open GitHub issue discusses adding C API logging callbacks, suggesting it may not be fully implemented or documented yet [3]. C++ Logger class methods like log, startActivity, stopActivity, and result exist [4], but C callback semantics, including userdata handling, are unspecified in available results. Without source code or doxygen docs confirming the callback signatures and invariants (e.g., userdata always passed as provided, assuming non-null on set), it cannot be confirmed as guaranteed. The C API is explicitly noted as unstable and in development [1].

Citations:


🏁 Script executed:

find . -name "logger.rs" -path "*/nix-bindings-util/*" | head -20

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 114


🏁 Script executed:

sed -n '199,270p' ./nix-bindings-util/src/logger.rs

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 2410


🏁 Script executed:

sed -n '1,198p' ./nix-bindings-util/src/logger.rs | tail -50

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 1807


🏁 Script executed:

sed -n '270,350p' ./nix-bindings-util/src/logger.rs

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 2546


Guard userdata before dereferencing in all FFI thunks.

userdata is dereferenced unconditionally in thunk_log, thunk_start_activity, thunk_stop_activity, and thunk_result_string without null checks. The thunk_destroy callback guards against null, creating an inconsistency. Since the Nix C API does not explicitly guarantee non-null userdata across FFI callbacks, defensive null checks are required to prevent UB if C code ever invokes these callbacks with null.

Suggested fix
 unsafe extern "C" fn thunk_log(userdata: *mut c_void, level: raw::verbosity, msg: *const c_char) {
+    if userdata.is_null() {
+        return;
+    }
     let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
         let logger = &*(userdata as *const LoggerObj);
         let msg_str = if msg.is_null() {
             std::borrow::Cow::Borrowed("")
         } else {
             CStr::from_ptr(msg).to_string_lossy()
         };
         logger.log(Verbosity::from_raw(level), msg_str.as_ref());
     }));
 }

 unsafe extern "C" fn thunk_start_activity(
     userdata: *mut c_void,
@@
 ) {
+    if userdata.is_null() {
+        return;
+    }
     let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
         let logger = &*(userdata as *const LoggerObj);
@@
 unsafe extern "C" fn thunk_stop_activity(userdata: *mut c_void, activity_id: raw::activity_id) {
+    if userdata.is_null() {
+        return;
+    }
     let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
         let logger = &*(userdata as *const LoggerObj);
         logger.stop_activity(activity_id);
     }));
 }

 unsafe extern "C" fn thunk_result_string(
     userdata: *mut c_void,
@@
 ) {
+    if userdata.is_null() {
+        return;
+    }
     let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
         let logger = &*(userdata as *const LoggerObj);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nix-bindings-util/src/logger.rs` around lines 199 - 257, The FFI thunks
thunk_log, thunk_start_activity, thunk_stop_activity, and thunk_result_string
dereference userdata without checking for null; add a defensive null check at
the top of each thunk and return early if userdata.is_null() to avoid UB, then
cast userdata to &LoggerObj only after the check (i.e., replace let logger =
&*(userdata as *const LoggerObj) with a guarded cast), ensuring none of the
logger.* methods are invoked when userdata is null.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If userdata is null, then we have a serious problem in Nix.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RossComputerGuy That's a fair point. Since userdata is set exclusively by this crate's set_logger to a non-null Box<dyn Logger>, a null value arriving in a callback would mean Nix has corrupted or ignored the pointer — a serious invariant violation. Silently returning early in that case would hide the bug rather than surface it, which is the wrong trade-off. I'll withdraw the suggestion.


✏️ Learnings added
Learnt from: RossComputerGuy
Repo: DeterminateSystems/nix-bindings-rust PR: 22
File: nix-bindings-util/src/logger.rs:199-257
Timestamp: 2026-05-07T16:02:54.533Z
Learning: In `nix-bindings-util/src/logger.rs`, the `userdata` pointer passed to FFI thunks (`thunk_log`, `thunk_start_activity`, `thunk_stop_activity`, `thunk_result_string`) is always set by this crate's own `set_logger` function to a non-null `Box<dyn Logger>`. A null `userdata` in these callbacks would indicate a serious Nix-side bug (pointer corruption or contract violation), so defensive null checks that silently return early are intentionally absent — they would hide critical bugs rather than surface them.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Copy Markdown
Member

@cole-h cole-h May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case should we not just explicitly panic if the userdata is null? Especially if the behavior would result in UB.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, idk. It would be UB on the Nix C/C++ side calling the logger in a way that is broken.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's our responsibility sitting at this FFI boundary to try to prevent UB and instead crash loudly under those circumstances. At least, putting myself in the shoes of a consumer, I'd hope that my libraries would crash when some invariant is not upheld, rather than silently continuing with garbage data and crashing later (making it harder to diagnose where the problem originally came from).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've added asserts.

@RossComputerGuy RossComputerGuy enabled auto-merge May 7, 2026 16:15
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bindgen-gcc.sh`:
- Around line 10-16: The include-path discovery command that populates
include_paths (the LC_ALL... $NIX_CC_UNWRAPPED -v -E ... | awk ...) can succeed
while producing no paths, and $NIX_CC_UNWRAPPED should be quoted to avoid
word-splitting; update the probe to quote "$NIX_CC_UNWRAPPED" and add an
explicit guard after computing include_paths that checks if include_paths is
empty (or contains only whitespace) and exits with a non-zero status and an
error message to stderr if so, preventing silent export of incomplete
BINDGEN_EXTRA_CLANG_ARGS; keep the subsequent include_args and
BINDGEN_EXTRA_CLANG_ARGS assignment unchanged but only run them when
include_paths is non-empty.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb0e1bab-834a-4482-8446-8ba412c589e2

📥 Commits

Reviewing files that changed from the base of the PR and between 357d533 and 959616e.

📒 Files selected for processing (3)
  • bindgen-gcc.sh
  • nix-bindings-util/src/lib.rs
  • nix-bindings-util/src/logger.rs
✅ Files skipped from review due to trivial changes (2)
  • nix-bindings-util/src/lib.rs
  • nix-bindings-util/src/logger.rs

Comment thread bindgen-gcc.sh
Comment on lines +10 to +16
LC_ALL=C $NIX_CC_UNWRAPPED -v -E -x c - </dev/null 2>&1 \
| awk '/#include <...> search starts here:/{flag=1;next} \
/End of search list./{flag=0} \
flag==1 {print $1}'
)
for path in $include_paths; do
echo " - $path" 2>&1
BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS -I$path"
done
include_args=$(printf '%s\n' "$include_paths" | awk 'NF {printf " -I%s", $1; printf " - %s\n", $1 > "/dev/stderr"}')
BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS$include_args"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

fd -name "bindgen-gcc.sh" -type f

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 311


🏁 Script executed:

fd "bindgen-gcc.sh"

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 95


🏁 Script executed:

cat -n bindgen-gcc.sh

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 943


🏁 Script executed:

cat > /tmp/test_pipeline.sh << 'EOF'
#!/bin/bash

# Test 1: Pipeline with failing first command, succeeding last command
echo "=== Test 1: Failed compiler, successful awk ==="
result=$( 
  (exit 1) | awk 'BEGIN { print "awk ran" }' 
)
echo "Pipeline exit code: $?"
echo "Result: '$result'"

# Test 2: Pipeline where first command produces no output
echo "=== Test 2: No output case ==="
result=$(
  (echo "" | grep "nonexistent") | awk '{ print "matched" }'
)
echo "Pipeline exit code: $?"
echo "Result: '$result'"
echo "Is result empty: [ -z '$result' ]"
[ -z "$result" ] && echo "Yes, result is empty"

# Test 3: Unquoted variable expansion
echo "=== Test 3: Unquoted variable with spaces (simulated) ==="
MY_VAR="gcc -v"
# This will try to run just "gcc" as separate command
result=$(
  LC_ALL=C $MY_VAR -E -x c - </dev/null 2>&1 | head -1
) 2>/dev/null || true
echo "Expansion behavior demonstrated"
EOF
bash /tmp/test_pipeline.sh

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 370


🏁 Script executed:

cat > /tmp/test_fix.sh << 'EOF'
#!/bin/bash

# Simulate the proposed fix structure
NIX_CC_UNWRAPPED="/bin/false"  # Simulate non-existent compiler

include_paths=$(
  LC_ALL=C "${NIX_CC_UNWRAPPED:?NIX_CC_UNWRAPPED is not set}" -v -E -x c - </dev/null 2>&1 \
  | awk '/#include <...> search starts here:/{flag=1;next} \
        /End of search list./{flag=0} \
        flag==1 {print $1}'
)

echo "include_paths: '$include_paths'"

# This is the guard the review proposes
if [ -z "$include_paths" ]; then
  echo "Failed to detect GCC system include paths from \$NIX_CC_UNWRAPPED" >&2
  exit 1
fi

echo "Guard passed"
EOF
bash /tmp/test_fix.sh 2>&1 || echo "Script exited with code: $?"

Repository: DeterminateSystems/nix-bindings-rust

Length of output: 190


Fail fast when include-path discovery yields no results

Line 10's pipeline can still appear successful if the compiler probe fails (because awk is the last command), leaving include_paths empty and silently exporting incomplete BINDGEN_EXTRA_CLANG_ARGS. Quote "$NIX_CC_UNWRAPPED" to avoid unintended expansion, and add an explicit empty-result guard to catch this failure case.

🔧 Proposed fix
 include_paths=$(
-  LC_ALL=C $NIX_CC_UNWRAPPED -v -E -x c - </dev/null 2>&1 \
+  LC_ALL=C "${NIX_CC_UNWRAPPED:?NIX_CC_UNWRAPPED is not set}" -v -E -x c - </dev/null 2>&1 \
   | awk '/#include <...> search starts here:/{flag=1;next} \
         /End of search list./{flag=0} \
         flag==1 {print $1}'
 )
+if [ -z "$include_paths" ]; then
+  echo "Failed to detect GCC system include paths from \$NIX_CC_UNWRAPPED" >&2
+  exit 1
+fi
 include_args=$(printf '%s\n' "$include_paths" | awk 'NF {printf " -I%s", $1; printf " - %s\n", $1 > "/dev/stderr"}')
 BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS$include_args"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bindgen-gcc.sh` around lines 10 - 16, The include-path discovery command that
populates include_paths (the LC_ALL... $NIX_CC_UNWRAPPED -v -E ... | awk ...)
can succeed while producing no paths, and $NIX_CC_UNWRAPPED should be quoted to
avoid word-splitting; update the probe to quote "$NIX_CC_UNWRAPPED" and add an
explicit guard after computing include_paths that checks if include_paths is
empty (or contains only whitespace) and exits with a non-zero status and an
error message to stderr if so, preventing silent export of incomplete
BINDGEN_EXTRA_CLANG_ARGS; keep the subsequent include_args and
BINDGEN_EXTRA_CLANG_ARGS assignment unchanged but only run them when
include_paths is non-empty.

@RossComputerGuy RossComputerGuy merged commit 42d4188 into integration May 7, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants