Add logger bindings, fix bindgen-gcc.sh#22
Conversation
📝 WalkthroughWalkthroughAdds a Rust Nix logger module with FFI thunks, vtable, installation, and tests; refactors bindgen-gcc.sh to extract GCC include paths via a single LC_ALL=C awk/printf pipeline and assemble -I flags in one argument. ChangesBuild Script Optimization
Nix Logger FFI Binding
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nix-bindings-util/src/logger.rs`:
- Around line 199-257: The FFI thunks thunk_log, thunk_start_activity,
thunk_stop_activity, and thunk_result_string dereference userdata without
checking for null; add a defensive null check at the top of each thunk and
return early if userdata.is_null() to avoid UB, then cast userdata to &LoggerObj
only after the check (i.e., replace let logger = &*(userdata as *const
LoggerObj) with a guarded cast), ensuring none of the logger.* methods are
invoked when userdata is null.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9bb29105-916e-48f2-9efc-c8d750894707
⛔ Files ignored due to path filters (1)
flake.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
bindgen-gcc.shnix-bindings-util/src/lib.rsnix-bindings-util/src/logger.rs
| unsafe extern "C" fn thunk_log(userdata: *mut c_void, level: raw::verbosity, msg: *const c_char) { | ||
| let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
| let logger = &*(userdata as *const LoggerObj); | ||
| let msg_str = if msg.is_null() { | ||
| std::borrow::Cow::Borrowed("") | ||
| } else { | ||
| CStr::from_ptr(msg).to_string_lossy() | ||
| }; | ||
| logger.log(Verbosity::from_raw(level), msg_str.as_ref()); | ||
| })); | ||
| } | ||
|
|
||
| unsafe extern "C" fn thunk_start_activity( | ||
| userdata: *mut c_void, | ||
| activity_id: raw::activity_id, | ||
| level: raw::verbosity, | ||
| type_: raw::activity_type, | ||
| s: *const c_char, | ||
| parent_id: raw::activity_id, | ||
| ) { | ||
| let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
| let logger = &*(userdata as *const LoggerObj); | ||
| let s_str = if s.is_null() { | ||
| std::borrow::Cow::Borrowed("") | ||
| } else { | ||
| CStr::from_ptr(s).to_string_lossy() | ||
| }; | ||
| logger.start_activity( | ||
| activity_id, | ||
| Verbosity::from_raw(level), | ||
| ActivityType::from_raw(type_), | ||
| s_str.as_ref(), | ||
| parent_id, | ||
| ); | ||
| })); | ||
| } | ||
|
|
||
| unsafe extern "C" fn thunk_stop_activity(userdata: *mut c_void, activity_id: raw::activity_id) { | ||
| let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
| let logger = &*(userdata as *const LoggerObj); | ||
| logger.stop_activity(activity_id); | ||
| })); | ||
| } | ||
|
|
||
| unsafe extern "C" fn thunk_result_string( | ||
| userdata: *mut c_void, | ||
| activity_id: raw::activity_id, | ||
| type_: raw::result_type, | ||
| msg: *const c_char, | ||
| ) { | ||
| let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
| let logger = &*(userdata as *const LoggerObj); | ||
| let msg_str = if msg.is_null() { | ||
| std::borrow::Cow::Borrowed("") | ||
| } else { | ||
| CStr::from_ptr(msg).to_string_lossy() | ||
| }; | ||
| logger.result_string(activity_id, ResultType::from_raw(type_), msg_str.as_ref()); | ||
| })); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In the Nix C API for nix_set_logger, is callback userdataguaranteed to be non-null forlog, start_activity, stop_activity, and result_string callbacks?
💡 Result:
No, the Nix C API documentation for nix_set_logger does not explicitly guarantee that the callback userdata is non-null for the log, start_activity, stop_activity, or result_string callbacks. The official Nix manual points to generated C API documentation [1][2], but no specific details on nix_set_logger or userdata guarantees were found in accessible sources. An open GitHub issue discusses adding C API logging callbacks, suggesting it may not be fully implemented or documented yet [3]. C++ Logger class methods like log, startActivity, stopActivity, and result exist [4], but C callback semantics, including userdata handling, are unspecified in available results. Without source code or doxygen docs confirming the callback signatures and invariants (e.g., userdata always passed as provided, assuming non-null on set), it cannot be confirmed as guaranteed. The C API is explicitly noted as unstable and in development [1].
Citations:
- 1: https://nix.dev/manual/nix/2.24/c-api
- 2: https://nix.dev/manual/nix/2.25/c-api
- 3: C API: Logging callbacks NixOS/nix#11036
- 4: https://squeakybeaver.github.io/lix-api-docs/classnix_1_1Logger.html
🏁 Script executed:
find . -name "logger.rs" -path "*/nix-bindings-util/*" | head -20Repository: DeterminateSystems/nix-bindings-rust
Length of output: 114
🏁 Script executed:
sed -n '199,270p' ./nix-bindings-util/src/logger.rsRepository: DeterminateSystems/nix-bindings-rust
Length of output: 2410
🏁 Script executed:
sed -n '1,198p' ./nix-bindings-util/src/logger.rs | tail -50Repository: DeterminateSystems/nix-bindings-rust
Length of output: 1807
🏁 Script executed:
sed -n '270,350p' ./nix-bindings-util/src/logger.rsRepository: DeterminateSystems/nix-bindings-rust
Length of output: 2546
Guard userdata before dereferencing in all FFI thunks.
userdata is dereferenced unconditionally in thunk_log, thunk_start_activity, thunk_stop_activity, and thunk_result_string without null checks. The thunk_destroy callback guards against null, creating an inconsistency. Since the Nix C API does not explicitly guarantee non-null userdata across FFI callbacks, defensive null checks are required to prevent UB if C code ever invokes these callbacks with null.
Suggested fix
unsafe extern "C" fn thunk_log(userdata: *mut c_void, level: raw::verbosity, msg: *const c_char) {
+ if userdata.is_null() {
+ return;
+ }
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
let msg_str = if msg.is_null() {
std::borrow::Cow::Borrowed("")
} else {
CStr::from_ptr(msg).to_string_lossy()
};
logger.log(Verbosity::from_raw(level), msg_str.as_ref());
}));
}
unsafe extern "C" fn thunk_start_activity(
userdata: *mut c_void,
@@
) {
+ if userdata.is_null() {
+ return;
+ }
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
@@
unsafe extern "C" fn thunk_stop_activity(userdata: *mut c_void, activity_id: raw::activity_id) {
+ if userdata.is_null() {
+ return;
+ }
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);
logger.stop_activity(activity_id);
}));
}
unsafe extern "C" fn thunk_result_string(
userdata: *mut c_void,
@@
) {
+ if userdata.is_null() {
+ return;
+ }
let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let logger = &*(userdata as *const LoggerObj);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nix-bindings-util/src/logger.rs` around lines 199 - 257, The FFI thunks
thunk_log, thunk_start_activity, thunk_stop_activity, and thunk_result_string
dereference userdata without checking for null; add a defensive null check at
the top of each thunk and return early if userdata.is_null() to avoid UB, then
cast userdata to &LoggerObj only after the check (i.e., replace let logger =
&*(userdata as *const LoggerObj) with a guarded cast), ensuring none of the
logger.* methods are invoked when userdata is null.
There was a problem hiding this comment.
If userdata is null, then we have a serious problem in Nix.
There was a problem hiding this comment.
@RossComputerGuy That's a fair point. Since userdata is set exclusively by this crate's set_logger to a non-null Box<dyn Logger>, a null value arriving in a callback would mean Nix has corrupted or ignored the pointer — a serious invariant violation. Silently returning early in that case would hide the bug rather than surface it, which is the wrong trade-off. I'll withdraw the suggestion.
✏️ Learnings added
Learnt from: RossComputerGuy
Repo: DeterminateSystems/nix-bindings-rust PR: 22
File: nix-bindings-util/src/logger.rs:199-257
Timestamp: 2026-05-07T16:02:54.533Z
Learning: In `nix-bindings-util/src/logger.rs`, the `userdata` pointer passed to FFI thunks (`thunk_log`, `thunk_start_activity`, `thunk_stop_activity`, `thunk_result_string`) is always set by this crate's own `set_logger` function to a non-null `Box<dyn Logger>`. A null `userdata` in these callbacks would indicate a serious Nix-side bug (pointer corruption or contract violation), so defensive null checks that silently return early are intentionally absent — they would hide critical bugs rather than surface them.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
In this case should we not just explicitly panic if the userdata is null? Especially if the behavior would result in UB.
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Maybe, idk. It would be UB on the Nix C/C++ side calling the logger in a way that is broken.
There was a problem hiding this comment.
I think it's our responsibility sitting at this FFI boundary to try to prevent UB and instead crash loudly under those circumstances. At least, putting myself in the shoes of a consumer, I'd hope that my libraries would crash when some invariant is not upheld, rather than silently continuing with garbage data and crashing later (making it harder to diagnose where the problem originally came from).
There was a problem hiding this comment.
Ok, I've added asserts.
357d533 to
959616e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bindgen-gcc.sh`:
- Around line 10-16: The include-path discovery command that populates
include_paths (the LC_ALL... $NIX_CC_UNWRAPPED -v -E ... | awk ...) can succeed
while producing no paths, and $NIX_CC_UNWRAPPED should be quoted to avoid
word-splitting; update the probe to quote "$NIX_CC_UNWRAPPED" and add an
explicit guard after computing include_paths that checks if include_paths is
empty (or contains only whitespace) and exits with a non-zero status and an
error message to stderr if so, preventing silent export of incomplete
BINDGEN_EXTRA_CLANG_ARGS; keep the subsequent include_args and
BINDGEN_EXTRA_CLANG_ARGS assignment unchanged but only run them when
include_paths is non-empty.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bb0e1bab-834a-4482-8446-8ba412c589e2
📒 Files selected for processing (3)
bindgen-gcc.shnix-bindings-util/src/lib.rsnix-bindings-util/src/logger.rs
✅ Files skipped from review due to trivial changes (2)
- nix-bindings-util/src/lib.rs
- nix-bindings-util/src/logger.rs
| LC_ALL=C $NIX_CC_UNWRAPPED -v -E -x c - </dev/null 2>&1 \ | ||
| | awk '/#include <...> search starts here:/{flag=1;next} \ | ||
| /End of search list./{flag=0} \ | ||
| flag==1 {print $1}' | ||
| ) | ||
| for path in $include_paths; do | ||
| echo " - $path" 2>&1 | ||
| BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS -I$path" | ||
| done | ||
| include_args=$(printf '%s\n' "$include_paths" | awk 'NF {printf " -I%s", $1; printf " - %s\n", $1 > "/dev/stderr"}') | ||
| BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS$include_args" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -name "bindgen-gcc.sh" -type fRepository: DeterminateSystems/nix-bindings-rust
Length of output: 311
🏁 Script executed:
fd "bindgen-gcc.sh"Repository: DeterminateSystems/nix-bindings-rust
Length of output: 95
🏁 Script executed:
cat -n bindgen-gcc.shRepository: DeterminateSystems/nix-bindings-rust
Length of output: 943
🏁 Script executed:
cat > /tmp/test_pipeline.sh << 'EOF'
#!/bin/bash
# Test 1: Pipeline with failing first command, succeeding last command
echo "=== Test 1: Failed compiler, successful awk ==="
result=$(
(exit 1) | awk 'BEGIN { print "awk ran" }'
)
echo "Pipeline exit code: $?"
echo "Result: '$result'"
# Test 2: Pipeline where first command produces no output
echo "=== Test 2: No output case ==="
result=$(
(echo "" | grep "nonexistent") | awk '{ print "matched" }'
)
echo "Pipeline exit code: $?"
echo "Result: '$result'"
echo "Is result empty: [ -z '$result' ]"
[ -z "$result" ] && echo "Yes, result is empty"
# Test 3: Unquoted variable expansion
echo "=== Test 3: Unquoted variable with spaces (simulated) ==="
MY_VAR="gcc -v"
# This will try to run just "gcc" as separate command
result=$(
LC_ALL=C $MY_VAR -E -x c - </dev/null 2>&1 | head -1
) 2>/dev/null || true
echo "Expansion behavior demonstrated"
EOF
bash /tmp/test_pipeline.shRepository: DeterminateSystems/nix-bindings-rust
Length of output: 370
🏁 Script executed:
cat > /tmp/test_fix.sh << 'EOF'
#!/bin/bash
# Simulate the proposed fix structure
NIX_CC_UNWRAPPED="/bin/false" # Simulate non-existent compiler
include_paths=$(
LC_ALL=C "${NIX_CC_UNWRAPPED:?NIX_CC_UNWRAPPED is not set}" -v -E -x c - </dev/null 2>&1 \
| awk '/#include <...> search starts here:/{flag=1;next} \
/End of search list./{flag=0} \
flag==1 {print $1}'
)
echo "include_paths: '$include_paths'"
# This is the guard the review proposes
if [ -z "$include_paths" ]; then
echo "Failed to detect GCC system include paths from \$NIX_CC_UNWRAPPED" >&2
exit 1
fi
echo "Guard passed"
EOF
bash /tmp/test_fix.sh 2>&1 || echo "Script exited with code: $?"Repository: DeterminateSystems/nix-bindings-rust
Length of output: 190
Fail fast when include-path discovery yields no results
Line 10's pipeline can still appear successful if the compiler probe fails (because awk is the last command), leaving include_paths empty and silently exporting incomplete BINDGEN_EXTRA_CLANG_ARGS. Quote "$NIX_CC_UNWRAPPED" to avoid unintended expansion, and add an explicit empty-result guard to catch this failure case.
🔧 Proposed fix
include_paths=$(
- LC_ALL=C $NIX_CC_UNWRAPPED -v -E -x c - </dev/null 2>&1 \
+ LC_ALL=C "${NIX_CC_UNWRAPPED:?NIX_CC_UNWRAPPED is not set}" -v -E -x c - </dev/null 2>&1 \
| awk '/#include <...> search starts here:/{flag=1;next} \
/End of search list./{flag=0} \
flag==1 {print $1}'
)
+if [ -z "$include_paths" ]; then
+ echo "Failed to detect GCC system include paths from \$NIX_CC_UNWRAPPED" >&2
+ exit 1
+fi
include_args=$(printf '%s\n' "$include_paths" | awk 'NF {printf " -I%s", $1; printf " - %s\n", $1 > "/dev/stderr"}')
BINDGEN_EXTRA_CLANG_ARGS="$BINDGEN_EXTRA_CLANG_ARGS$include_args"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bindgen-gcc.sh` around lines 10 - 16, The include-path discovery command that
populates include_paths (the LC_ALL... $NIX_CC_UNWRAPPED -v -E ... | awk ...)
can succeed while producing no paths, and $NIX_CC_UNWRAPPED should be quoted to
avoid word-splitting; update the probe to quote "$NIX_CC_UNWRAPPED" and add an
explicit guard after computing include_paths that checks if include_paths is
empty (or contains only whitespace) and exits with a non-zero status and an
error message to stderr if so, preventing silent export of incomplete
BINDGEN_EXTRA_CLANG_ARGS; keep the subsequent include_args and
BINDGEN_EXTRA_CLANG_ARGS assignment unchanged but only run them when
include_paths is non-empty.
Binds the logger from DeterminateSystems/nix-src#448
Summary by CodeRabbit
New Features
Improvements