statsig-go: pin purego to v0.8.0 to dodge concurrent-FFI race #14
Merged
Downgrades the purego dependency from v0.9.0 to v0.8.0 (the last
release before upstream PR #282 merged on 2024-10-17). PR #282
introduced a process-wide sync.Pool of *syscall15Args in func.go's
RegisterFunc reflect closure. Under concurrent dispatch from multiple
goroutines, two callers can observe each other's return values. This
surfaces as SIGSEGV in runtime.memmove on non-canonical pointers,
glibc "double free or corruption (out)", nil dereferences of returned
*byte values, and silently swapped feature-flag evaluation results.
The minimal trigger is a function with signature
func(uint64) *byte
called from two or more goroutines simultaneously. Each goroutine can
get back the other goroutine's return pointer. The full discrimination
matrix is in the upstream issue draft; the relevant data points for
this change:
- The minimal purego-only repro mismatches within seconds at HEAD with
workers >= 2 against v0.9.0 / v0.9.1.
- The same repro against v0.8.0 (no `thePool` references in func.go or
syscall_sysv.go) ran for ~153M total dispatches across workers in
{2, 4, 8, 32} with zero mismatches.
- The full statsig-go gate-evaluation workload against v0.8.0 ran for
5 x 30s x 32 workers (~36M gate calls) with zero crashes,
zero corruption messages, ~260k ops/sec sustained — equivalent to
the patched-v0.9.0 approach previously drafted in PR #13.
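For readers who want the shape of the mismatch check without opening the upstream issue draft, here is a minimal sketch of such a harness. It is not the actual repro: `echoFn`, `pureGoEcho`, `countMismatches`, and `table` are hypothetical names, and the function under test is a pure-Go stand-in rather than a purego-registered symbol, so this version will not exhibit the race. It only shows how each worker can tell whether the pointer it got back belongs to its own argument.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// echoFn mirrors the minimal trigger signature, func(uint64) *byte.
// In a real repro this would be a function registered through purego;
// here it is a hypothetical pure-Go stand-in.
type echoFn func(uint64) *byte

// table holds one distinct byte per slot, so a returned pointer
// identifies which argument produced it.
var table [256]byte

func init() {
	for i := range table {
		table[i] = byte(i)
	}
}

// pureGoEcho returns a pointer to the slot that corresponds to id.
func pureGoEcho(id uint64) *byte {
	return &table[id%uint64(len(table))]
}

// countMismatches calls fn from `workers` goroutines, each with its own id,
// and counts calls whose result is nil or points at another id's slot.
func countMismatches(fn echoFn, workers, callsPerWorker int) uint64 {
	var mismatches atomic.Uint64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id uint64) {
			defer wg.Done()
			want := byte(id % uint64(len(table)))
			for i := 0; i < callsPerWorker; i++ {
				p := fn(id)
				if p == nil || *p != want {
					mismatches.Add(1)
				}
			}
		}(uint64(w))
	}
	wg.Wait()
	return mismatches.Load()
}

func main() {
	for _, workers := range []int{2, 4, 8, 32} {
		n := countMismatches(pureGoEcho, workers, 100000)
		fmt.Printf("workers=%d mismatches=%d\n", workers, n)
	}
}
```

Swapping `pureGoEcho` for a function bound via purego is what would turn this into a real discriminator between v0.8.0 and v0.9.x; the worker counts in `main` simply echo the ones quoted above.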
What we give up between v0.8.0 and v0.9.x:
- PR #282 itself (the racy memory-usage optimization).
- PR #328, #361, #408, #413, #431, #391, #403, #436 — struct
argument/return support, new architectures (s390x, ppc64le,
linux/386, linux/arm32). statsig-go's linux/amd64 consumers use
none of this.
- PR #357 — darwin int/string fix. Not relevant for linux deploys.
- PR #319, #318, #343 — `-race` and `fakecgo` fixes. Test infra,
not user-facing.
- Various small bug fixes, none of which match the statsig usage
profile.
Net: v0.8.0 is functionally equivalent to v0.9.x for this binding's
public API surface. The gap exists on paper but is invisible to
consumers.
This change supersedes the previously drafted approaches:
- #12 (binding-side sync.Mutex workaround) — caps throughput at
~83k ops/sec per process due to serialized FFI.
- #13 (vendor purego with the pool revert) — carries ~19k lines
of upstream code in this repo for an 8-line delta.
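To make the throughput cost of the #12 approach concrete, the sketch below shows what a binding-side mutex amounts to. This is an illustration under assumptions, not the code from #12: `ffiFn`, `serialize`, and `peakInFlight` are hypothetical names, and the wrapped function is a pure-Go stand-in for an FFI call. The point is that with the lock in place, no more than one call is ever in flight, regardless of worker count.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// ffiFn is a hypothetical stand-in for an FFI entry point.
type ffiFn func(uint64) uint64

// serialize wraps fn so that at most one call runs at a time.
func serialize(fn ffiFn) ffiFn {
	var mu sync.Mutex
	return func(x uint64) uint64 {
		mu.Lock()
		defer mu.Unlock()
		return fn(x)
	}
}

// peakInFlight drives wrap(inner) from `workers` goroutines and reports the
// highest number of calls observed inside inner at the same moment.
func peakInFlight(wrap func(ffiFn) ffiFn, workers, callsPerWorker int) int64 {
	var cur, peak atomic.Int64
	inner := func(x uint64) uint64 {
		n := cur.Add(1)
		// Record a new peak if this call raised the in-flight count.
		for {
			p := peak.Load()
			if n <= p || peak.CompareAndSwap(p, n) {
				break
			}
		}
		runtime.Gosched() // give other goroutines a chance to enter
		cur.Add(-1)
		return x
	}
	fn := wrap(inner)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id uint64) {
			defer wg.Done()
			for i := 0; i < callsPerWorker; i++ {
				fn(id)
			}
		}(uint64(w))
	}
	wg.Wait()
	return peak.Load()
}

func main() {
	unwrapped := func(f ffiFn) ffiFn { return f }
	fmt.Println("peak in flight, no lock:  ", peakInFlight(unwrapped, 32, 1000))
	fmt.Println("peak in flight, with lock:", peakInFlight(serialize, 32, 1000))
}
```

The lock removes the race by construction, but it also removes all FFI parallelism, which is consistent with the per-process throughput cap reported for #12.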
When upstream lands a real fix for the underlying race, bump this
dependency forward.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cshi-figma
approved these changes
May 20, 2026
Summary
Downgrades the `purego` dependency in `statsig-go/go.mod` from `v0.9.0` to `v0.8.0`, the last release before upstream PR #282 merged (2024-10-17) and introduced a process-wide `sync.Pool` of `*syscall15Args` in `func.go`'s `RegisterFunc` reflect closure. That pool, under concurrent dispatch from multiple goroutines, lets two callers observe each other's return values.
One-line change in `statsig-go/go.mod` (plus the corresponding `go.sum` update). No code changes in the binding itself.
Supersedes both previously drafted approaches:
- #12 (binding-side `sync.Mutex` workaround): caps throughput at ~83k ops/sec per process due to serialized FFI.
- #13 (vendor purego with the pool revert): carries ~19k lines of upstream code in this repo for an 8-line delta.
Why
Symptom in consumers: SIGSEGV in `runtime.memmove` on non-canonical pointers, glibc `double free or corruption (out)`, nil dereferences of returned `*byte` values, and, most insidiously, silently swapped feature-flag evaluation results that pass type checks downstream. All traced to a concurrent-FFI return-value race in purego v0.9.x.
The minimal trigger is a function with signature `func(uint64) *byte` called from two or more goroutines simultaneously. Each goroutine can get back the other goroutine's return pointer. The full discrimination matrix is in the upstream issue draft; the relevant data points for this change:
- The minimal purego-only repro mismatches within seconds at HEAD with workers >= 2 against `v0.9.0` / `v0.9.1`.
- The same repro against `v0.8.0` (no `thePool` references in `func.go` or `syscall_sysv.go`) ran for ~153M total dispatches across workers ∈ {2, 4, 8, 32} with zero mismatches.
- The full statsig-go gate-evaluation workload against `v0.8.0` ran for 5 × 30s × 32 workers (~36M gate calls) with zero crashes, zero corruption messages, and ~260k ops/sec sustained, equivalent to the patched-v0.9.0 approach in #13 (purego: vendor v0.9.0 with concurrent-FFI race revert).
What's actually in v0.8.1 → v0.9.1 that we'd be giving up
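Looking at the release notes between v0.8.0 (Sept 2024) and v0.9.1 (Nov 2025):
- PR #282 itself (the racy memory-usage optimization).
- PR #328, #361, #408, #413, #431, #391, #403, #436: struct argument/return support, new architectures (s390x, ppc64le, linux/386, linux/arm32). `statsig-go`'s linux/amd64 consumers use none of this.
- PR #357: darwin int/string fix. Not relevant for linux deploys.
- PR #319, #318, #343: `-race` and `fakecgo` fixes. Test infra, not user-facing.
- Various small bug fixes, none of which match the statsig usage profile.
Net: for the purego API surface that `statsig-go` actually uses (`purego.Dlopen`, `purego.RegisterLibFunc`, `purego.RTLD_NOW`, `purego.RTLD_GLOBAL`), v0.8.0 is functionally equivalent to v0.9.x. The gap exists on paper but is invisible to consumers on linux/amd64.
Invariants worth calling out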
- [...] `statsig-go`. The four purego APIs used by `statsig_ffi.go` (`Dlopen`, `RegisterLibFunc`, `RTLD_NOW`, `RTLD_GLOBAL`) all exist with identical signatures in v0.8.0.
- Consumers with `github.com/ebitengine/purego` listed as an indirect dependency in their own `go.mod` (with version v0.9.0) will need to change that line to v0.8.0 as well; otherwise MVS (minimal version selection) picks v0.9.0, the highest version required anywhere in the build, and the bug returns. Either drop the indirect and re-tidy after pulling this version of `statsig-go`, or set the indirect to v0.8.0 explicitly.
- When upstream lands a real fix for the underlying race (`ebitengine/purego`), bump this dependency forward and drop any compensating consumer-side `go.mod` lines.
Test plan
- `go build ./statsig-go/...` clean against v0.8.0.
- `go test ./statsig-go/...` once CI runs.
- [...] `statsig-go/v0.19.4-figma2`, validate consumer integration end-to-end by pointing a consumer's `go.mod` at the new tag and updating its purego indirect to v0.8.0.
🤖 Generated with Claude Code