Overview
The miniKanren benchmark suite (make bench-kanren) provides a real-world optimization target for Wile's VM. The Zebra puzzle dominates at ~8.5s, making it the clearest signal for performance work.
Benchmark Results (M4 Mac, feat/minikanren branch)
| Benchmark |
Time |
Hot path |
| Zebra puzzle |
8.5s |
walk + unify over large substitutions |
| Appendo-20 |
7ms |
mplus / bind / closure allocation |
| Relational arith |
1ms |
Goal application depth |
The Zebra puzzle accounts for 99.9% of the suite's runtime. It's a 5-house, 25-variable, 15-constraint problem — small by logic programming standards but expensive due to microKanren's substitution representation.
Why the Zebra Puzzle Is Slow
microKanren uses triangular substitutions: ext-s conses a new (var . value) pair onto an association list in O(1), but walk pays the cost at lookup time.
When you unify (var 0) with (var 1), then (var 1) with 5, the substitution is:
((#(0) . #(1)) (#(1) . 5))
Resolving (var 0) requires chasing #(0) → #(1) → 5 — two alist scans. This is a triangular substitution (Baader & Snyder): bindings point to other variables, forming chains rather than pointing directly to final values.
For the Zebra puzzle:
- The substitution grows to 25+ bindings (one per house attribute slot)
- Every
unify call does 2 walks, each scanning the full alist via assoc with a custom comparator
- Each
membero tries 5 houses, each attempt doing multiple unifications
- 15 constraints with backtracking multiplies this combinatorially
The cost structure: O(n) per walk (alist length) × O(d) chain depth × thousands of walk calls across the search.
Optimization Targets
Scheme-level (microkanren.scm)
- Substitution representation — alist
walk is O(n); a trie or hash map makes it O(log n) or O(1). This is the classic miniKanren optimization (see Byrd's dissertation, §3).
- Idempotent vs triangular — eagerly updating all existing bindings in
ext-s makes walk O(1) but ext-s O(n). Trade-off depends on the walk-to-extend ratio.
VM-level (Go)
- Closure allocation — every goal (
==, disj, conj) and stream suspension (lambda () ...) creates a closure. The Zebra puzzle creates thousands. Reducing closure allocation pressure is a VM optimization.
assoc with custom predicate — walk calls (assoc u s (lambda (a b) (var=? a b))). Each call creates a closure for the comparator. Hoisting or specializing this is a micro-optimization.
procedure? checks in stream operations — mplus and bind both test (procedure? $) on every stream step.
Profiling
make profile-cpu # while running bench-kanren
This would show exactly which VM operations dominate: closure creation, assoc scanning, vector-ref in var=?, or something else.
References
- Benchmark:
examples/logic/kanren/benchmark.scm
- Library:
lib/wile/kanren.{sld,scm}, lib/wile/microkanren.{sld,scm}
- Run:
make bench-kanren
- Byrd, W.E. (2009). "Relational Programming in miniKanren." PhD dissertation, Indiana University. §3 (substitution representations)
- Baader, F. and Snyder, W. (2001). "Unification Theory." Handbook of Automated Reasoning. (triangular vs idempotent substitutions)
Overview
The miniKanren benchmark suite (
make bench-kanren) provides a real-world optimization target for Wile's VM. The Zebra puzzle dominates at ~8.5s, making it the clearest signal for performance work.Benchmark Results (M4 Mac, feat/minikanren branch)
walk+unifyover large substitutionsmplus/bind/ closure allocationThe Zebra puzzle accounts for 99.9% of the suite's runtime. It's a 5-house, 25-variable, 15-constraint problem — small by logic programming standards but expensive due to microKanren's substitution representation.
Why the Zebra Puzzle Is Slow
microKanren uses triangular substitutions:
ext-sconses a new(var . value)pair onto an association list in O(1), butwalkpays the cost at lookup time.When you unify
(var 0)with(var 1), then(var 1)with5, the substitution is:Resolving
(var 0)requires chasing#(0) → #(1) → 5— two alist scans. This is a triangular substitution (Baader & Snyder): bindings point to other variables, forming chains rather than pointing directly to final values.For the Zebra puzzle:
unifycall does 2walks, each scanning the full alist viaassocwith a custom comparatormemberotries 5 houses, each attempt doing multiple unificationsThe cost structure: O(n) per
walk(alist length) × O(d) chain depth × thousands of walk calls across the search.Optimization Targets
Scheme-level (microkanren.scm)
walkis O(n); a trie or hash map makes it O(log n) or O(1). This is the classic miniKanren optimization (see Byrd's dissertation, §3).ext-smakeswalkO(1) butext-sO(n). Trade-off depends on the walk-to-extend ratio.VM-level (Go)
==,disj,conj) and stream suspension (lambda () ...) creates a closure. The Zebra puzzle creates thousands. Reducing closure allocation pressure is a VM optimization.assocwith custom predicate —walkcalls(assoc u s (lambda (a b) (var=? a b))). Each call creates a closure for the comparator. Hoisting or specializing this is a micro-optimization.procedure?checks in stream operations —mplusandbindboth test(procedure? $)on every stream step.Profiling
make profile-cpu # while running bench-kanrenThis would show exactly which VM operations dominate: closure creation,
assocscanning,vector-refinvar=?, or something else.References
examples/logic/kanren/benchmark.scmlib/wile/kanren.{sld,scm},lib/wile/microkanren.{sld,scm}make bench-kanren