spirv: add LLVM SPIR-V backend crash fuzzer #1
|
@jlebar please, clarify if that aligns with your project goals as well |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f8e915e129
if (!validateIRCorpusModule(*M))
  return 0;
Verify parsed IR before running LLVM passes
When a byte-mutated bitcode file still parses but violates verifier invariants, validateIRCorpusModule() returns true as long as the module is non-empty and the function calling conventions are allowed. The next step runs the O0/O2 optimization and codegen pipelines on that invalid IR, and LLVM passes are allowed to assert or abort on malformed input, so the fuzzer can report false compiler-crash findings that are not SPIR-V backend bugs. Please run the LLVM verifier on the parsed module before optimization and discard verifier failures.
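The gate requested here can be sketched without LLVM as a small model of the control flow. This is not the PR's code: StubModule, shouldRunPipeline, and both predicates are hypothetical stand-ins. In a real harness the IsBroken predicate would presumably wrap llvm::verifyModule(M, &llvm::errs()), which returns true when the module is broken.

```cpp
#include <cassert>

// Hypothetical stand-in for a parsed llvm::Module; only the two facts the
// gate cares about are modeled.
struct StubModule {
  bool NonEmpty;       // what a corpus-level policy check might inspect
  bool VerifierBroken; // what the IR verifier would report
};

// Returns true only when the module may be handed to the O0/O2 pipelines.
// PassesCorpusRules plays the role of validateIRCorpusModule(); IsBroken
// plays the role of the IR verifier (true means "invalid IR").
template <typename ModuleT, typename CorpusCheck, typename VerifierCheck>
bool shouldRunPipeline(const ModuleT &M, CorpusCheck PassesCorpusRules,
                       VerifierCheck IsBroken) {
  if (!PassesCorpusRules(M))
    return false; // rejected by harness policy
  if (IsBroken(M))
    return false; // parses, but violates verifier invariants: discard
  return true;
}

// Usage sketch with the stand-in type.
inline void demoGate() {
  auto NonEmpty = [](const StubModule &M) { return M.NonEmpty; };
  auto Broken = [](const StubModule &M) { return M.VerifierBroken; };
  assert(shouldRunPipeline(StubModule{true, false}, NonEmpty, Broken));
  assert(!shouldRunPipeline(StubModule{true, true}, NonEmpty, Broken));
}
```

The second assertion is the case this comment describes: the input passes the corpus check but fails verification, so it never reaches optimization or codegen.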
return createIRSkeletonModule(Ctx, CPU);
// Force the triple so corpus mutation of target metadata cannot send us to
// a different backend.
Parsed->setTargetTriple(Triple(DefaultTriple));
Reset mutated data layouts before optimization
For parsed corpus inputs, this forces the triple back to SPIR-V but leaves whatever target datalayout bytes survived mutation in the module until emitObject() resets it after the optimization pipeline has already run. If a valid bitcode input carries a stale or corrupted data layout, target-dependent LLVM passes optimize using pointer sizes/address-space rules that do not match the SPIR-V target, which can produce harness-only crashes or misleading findings. Reset the module data layout from the SPIR-V TargetMachine before running the pass pipeline.
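A minimal sketch of the suggested ordering: overwrite both the triple and the data layout from the target machine before any pass runs. FakeModule, FakeTargetMachine, pinTargetBeforeOptimization, and the placeholder layout strings are hypothetical. The member names mirror llvm::Module::setTargetTriple, llvm::Module::setDataLayout, llvm::TargetMachine::getTargetTriple, and llvm::TargetMachine::createDataLayout, but the template has not been compiled against LLVM here, so treat that compatibility as an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for llvm::TargetMachine and llvm::Module. Strings
// replace llvm::Triple and llvm::DataLayout to keep the sketch LLVM-free.
struct FakeTargetMachine {
  std::string Triple;
  std::string Layout;
  const std::string &getTargetTriple() const { return Triple; }
  std::string createDataLayout() const { return Layout; }
};

struct FakeModule {
  std::string Triple;
  std::string Layout;
  void setTargetTriple(const std::string &T) { Triple = T; }
  void setDataLayout(const std::string &DL) { Layout = DL; }
};

// Overwrite both target-dependent fields from the target machine, so nothing
// a mutated input carried can influence target-aware passes.
template <typename ModuleT, typename TargetMachineT>
void pinTargetBeforeOptimization(ModuleT &M, const TargetMachineT &TM) {
  M.setTargetTriple(TM.getTargetTriple());
  M.setDataLayout(TM.createDataLayout());
}

// Usage sketch: a corpus input arrives with a foreign triple and stale layout.
inline void demoPin() {
  FakeTargetMachine TM{"spirv64-unknown-unknown", "layout-from-target"};
  FakeModule M{"x86_64-unknown-linux-gnu", "stale-layout-from-corpus"};
  pinTargetBeforeOptimization(M, TM);
  assert(M.Triple == "spirv64-unknown-unknown");
  assert(M.Layout == "layout-from-target");
}
```

The point of the ordering is that the call sits between parsing and the optimization pipeline, rather than inside the later object-emission step.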
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit f8e915e.
|
Hi, thanks for the PR!
I think my inclination is to say that if you want to build your own fuzzer, I'll happily link to it from the main README? Right now e.g. I have some of my personal capital on the line when I tell AMD and nvidia that I've found real bugs in their compiler, and for this reason I hesitate at the idea of letting strangers contribute to the list of bugs (without verifying them as real myself, which I don't have resources to do rn). Also, the fact that this is a crash-only fuzzer makes it much less interesting to me, personally. |
|
@jlebar thank you for the feedback! As for contributing to the list of bugs, that's completely understandable: it is difficult to keep track of them this way, so I didn't add any new bugs here. As for it being crash-only, it is quite easily extendable. There is a tool called If you don't want to have that functionality here, that is OK; I think we will use them internally as a fuzzer for the SPIR-V backend and maintain them in the fork. |
|
I think yeah, why don't you make a separate project (doesn't even have to be a "fork", you're not sharing anything with this project) and I'll link to it from the main README if you like? If we were sharing resources that'd be one reason to keep things in the same repo, but everything is totally independent. |
Thanks for the suggestion! It makes sense. I created one: https://github.com/aobolensk/FuzzX-spirv Feel free to credit it if you want. |
|
(You probably shouldn't call it FuzzX because SemiAnalysis could reasonably claim trademark on the name?) |
That'd be strange, but ok, renamed |
|
Added a link to your repo in 41c1891. Good luck! |

Note
Low Risk
Adds a self-contained fuzzing subdirectory and demo tooling; no changes to existing fuzzers or production compiler paths in-repo.
Overview
Adds a new spirv/ tree to FuzzX: a crash-only, in-process libFuzzer harness for LLVM’s SPIR-V backend (spirv64), documented as an AMDGPU-style layout without differential GPU execution.
The harness llvm_spirv_crash_fuzzer parses mutated LLVM bitcode (with a spir_kernel skeleton fallback), pins spirv64-unknown-unknown, runs O0 and O2 optimize+codegen under CrashRecoveryContext, routes LLVM fatals to abort, and writes .bc/.ll findings to FUZZX_FINDINGS_DIR. Supporting pieces mirror AMDGPU: CMake (SPIR-V codegen libs, libFuzzer sanitizers), build_instrumented_llvm.sh (SPIRV;X86, assertions, sancov), build_directed_fuzzer.sh, run_directed_fuzzer.sh, and seed_ir_corpus.sh. The root README gains a spirv/ entry; spirv/.gitignore ignores build/runtime/corpus/findings.
The SPIR-V README notes a smoke-run MRI reserved-regs assertion that may not repro under standalone llc (possible in-process/harness interaction).
Reviewed by Cursor Bugbot for commit 2e17d2e. Bugbot is set up for automated code reviews on this repo.