From 333ae2d40b138f7d0fa51b772e5c8553e0f3d598 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ullrich=20Sch=C3=A4fer?= Date: Fri, 5 Jun 2026 17:04:32 +0200 Subject: [PATCH 1/3] feat(extraction): add Clojure/ClojureScript language support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Index .clj/.cljs/.cljc/.bb and .edn files: namespaces, defn/def (multi-arity, function-valued, private), defprotocol, defrecord/deftype (fields, methods, implements, implicit ->Ctor fns), defmulti/defmethod, and :require/:import clauses. The maintained grammar (sogaiu/tree-sitter-clojure, vendored ABI-14 wasm — build recipe in src/extraction/wasm/README.md) is purely lexical: `(defn foo [x] ...)` is just a list_lit. So extraction runs entirely through the visitNode full-takeover hook (the Pascal precedent), interpreting list heads. A generic def-macro heuristic also catches library definers (defroutes, deftest, definline, rum/defc, hsx/defc) so e.g. logseq's UI components are indexed. Calls through :as aliases and :refer'd symbols emit `full.ns::name` references that resolve via the existing qualified-name matcher, and requires name-match the target namespace's module node — zero resolver changes. Local-scope tracking (let/loop/for bindings, fn params, letfn, as->/catch) suppresses shadowed names so idiomatic shadowing of a same-file fn never fabricates call edges; quoted/discarded/comment forms are skipped; reader conditionals descend both branches; (.-prop x) reads emit references, (.method x) calls. UIx and helix React components are first-class: defui/defnc produce `component` nodes and the `$` element macro emits calls edges to the composed component (`($ ui/button ...)` → button), gated on `$` resolving to uix.core/helix.core in the require table. pitch-io/uix: 131 components with composition edges. 
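As a rough illustration of the alias and `:refer` resolution described above, the sketch below turns a call-head symbol into a `full.ns::name` reference. It is not the code in src/extraction/languages/clojure.ts: `RequireTable` and `qualifyCallHead` are hypothetical names, and leaving unresolved symbols unchanged is an assumption made for the example.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the patch's API.
interface RequireTable {
  // alias -> full namespace, e.g. from [my.app.db :as db]
  aliases: Map<string, string>;
  // referred symbol -> full namespace, e.g. from [my.app.db :refer [save!]]
  referred: Map<string, string>;
}

function qualifyCallHead(symbol: string, table: RequireTable): string {
  const slash = symbol.indexOf('/');
  // A bare "/" (division) or a leading/trailing slash is not an alias separator.
  if (slash > 0 && slash < symbol.length - 1) {
    const alias = symbol.slice(0, slash);
    const name = symbol.slice(slash + 1);
    const ns = table.aliases.get(alias);
    // Assumption: an unknown alias leaves the symbol as written.
    return ns ? `${ns}::${name}` : symbol;
  }
  const ns = table.referred.get(symbol);
  // Unqualified and not referred: keep the bare name for same-file matching.
  return ns ? `${ns}::${symbol}` : symbol;
}
```

With `db` aliased to `my.app.db`, `db/insert!` becomes `my.app.db::insert!`, which is the shape the extraction tests in this patch expect for alias-resolved call references.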
re-frame's keyword-keyed dispatch connects statically: every reg-* registration with a literal keyword becomes a function node NAMED by its alias-expanded keyword (::subs/items → :my.app.subs/items), and dispatch/dispatch-sync/subscribe/sub sites with a literal event vector emit same-named calls refs the exact-name matcher links. Detection is shape-based because real apps front re-frame with project facades (status-mobile's utils.re-frame covers 512 files with custom registrars); precision is structural — an edge needs both ends to carry the same keyword. The registrar call itself keeps its ordinary call ref, so callers/impact on a facade still see every registration site (status-mobile: 3,654 calls into utils.re-frame's defs). status-mobile: 1,635 registrations, 2,323 keyword edges; codegraph_node :profile/logout returns the handler plus all 13 dispatch sites in one call. .edn files extract in data mode: top-level map keys become property nodes, qualified symbols in values (shadow-cljs :init-fn, integrant handlers, clj-kondo :hooks) emit references to the code they name, and no call edges are ever emitted. Maps with >64 top-level keys are datasets (locale dicts), not config — no nodes (refs still scanned). Validated on ring (84 files / 2.5k edges), logseq (1,312 / 91k incl. 99 .edn), metabase (15,374 / 623k), re-frame todomvc, athens, and status-mobile (2,050 / 1,635 keyword registrations): extraction PASS, node counts stable on re-sync. Agent A/B: logseq's canonical flow 155→4 tool calls, 0 Read/0 Grep, 3.6× faster, 2.5× cheaper; status-mobile's logout flow (n=2) half the calls and Reads at ~1.7× speed. 
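The alias expansion of re-frame keywords mentioned in this message (`::subs/items` to `:my.app.subs/items`) can be sketched roughly as follows. This is an illustration only: `expandKeyword` is a hypothetical helper, and returning `null` for an unknown alias is an assumption rather than the patch's documented behavior.

```typescript
// Hypothetical sketch of auto-resolved keyword expansion; not the patch's code.
function expandKeyword(
  keyword: string,
  currentNs: string,
  aliases: Map<string, string>, // alias -> full namespace from the ns :require table
): string | null {
  // Plain keywords such as :todo/add are already fully named.
  if (!keyword.startsWith('::')) return keyword;

  const body = keyword.slice(2);
  const slash = body.indexOf('/');

  // ::add resolves against the current namespace.
  if (slash === -1) return `:${currentNs}/${body}`;

  // ::subs/items resolves the alias through the require table.
  const ns = aliases.get(body.slice(0, slash));
  // Assumption: without a known alias there is no stable name to emit.
  return ns ? `:${ns}/${body.slice(slash + 1)}` : null;
}
```

Because registrations and dispatch sites both pass through the same expansion, two files that spell a keyword differently (`::subs/items` in one, `:my.app.subs/items` in another) would end up with the same node and reference name, which is what lets an exact-name matcher link them.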
47 extraction tests; coverage matrix entry in docs/design/dynamic-dispatch-coverage-playbook.md. Co-Authored-By: Claude Opus 4.8 --- .claude/skills/agent-eval/corpus.json | 6 + CHANGELOG.md | 4 + README.md | 3 +- __tests__/extraction.test.ts | 691 +++++++++ .../dynamic-dispatch-coverage-playbook.md | 1 + src/extraction/grammars.ts | 23 +- src/extraction/languages/clojure.ts | 1252 +++++++++++++++++ src/extraction/languages/index.ts | 2 + src/extraction/wasm/README.md | 38 + src/extraction/wasm/tree-sitter-clojure.wasm | Bin 0 -> 101206 bytes src/types.ts | 1 + 11 files changed, 2018 insertions(+), 3 deletions(-) create mode 100644 src/extraction/languages/clojure.ts create mode 100644 src/extraction/wasm/README.md create mode 100755 src/extraction/wasm/tree-sitter-clojure.wasm diff --git a/.claude/skills/agent-eval/corpus.json b/.claude/skills/agent-eval/corpus.json index 2cfedac4f..5140a1c67 100644 --- a/.claude/skills/agent-eval/corpus.json +++ b/.claude/skills/agent-eval/corpus.json @@ -94,5 +94,11 @@ { "name": "react-native-segmented-control", "repo": "https://github.com/react-native-segmented-control/segmented-control", "size": "Small", "files": "~25", "question": "How does JSX `` reach the native onChange handler on iOS/Android?" }, { "name": "react-native-screens", "repo": "https://github.com/software-mansion/react-native-screens", "size": "Medium", "files": "~1200", "question": "How does JSX `` reach the native RNSScreenStackView component?" }, { "name": "react-native-skia", "repo": "https://github.com/Shopify/react-native-skia", "size": "Large", "files": "~1000", "question": "How does a `` JSX usage reach the iOS / Android native renderer?" } + ], + "Clojure": [ + { "name": "ring", "repo": "https://github.com/ring-clojure/ring", "size": "Small", "files": "~80", "question": "How does a Ring request flow from the Jetty adapter through the handler to an HTTP response?" 
}, + { "name": "logseq", "repo": "https://github.com/logseq/logseq", "size": "Medium", "files": "~960", "question": "How does editing a block in the editor get persisted to the database and reflected back in the UI?" }, + { "name": "metabase", "repo": "https://github.com/metabase/metabase", "size": "Large", "files": "~3400", "question": "How does a query submitted to the API flow through the query processor middleware to the database driver?" }, + { "name": "status-mobile", "repo": "https://github.com/status-im/status-mobile", "size": "Large", "files": "~2050", "question": "How does tapping the logout option in profile settings end the session and what happens to the app state?" } ] } diff --git a/CHANGELOG.md b/CHANGELOG.md index 54ef5f5aa..db1d2c456 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,6 +11,10 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ### New Features +- CodeGraph now indexes **Clojure and ClojureScript** (`.clj`, `.cljs`, `.cljc`, and Babashka `.bb` files) — namespaces, functions (`defn`, multi-arity, function-valued `def`s), protocols, records and their methods, multimethods, and `:require`/`:import` clauses. Calls through namespace aliases (`(str/upper-case ...)`) and `:refer`ed symbols resolve across files, so callers, callees, impact, and `codegraph_explore` flow tracing all work on Clojure codebases. +- **EDN config files** (`deps.edn`, `bb.edn`, `shadow-cljs.edn`, integrant/system configs) are indexed as data: top-level keys become searchable config entries, and qualified symbols in their values (a shadow-cljs `:init-fn`, an integrant component's handler) link to the code they name. Large EDN datasets (translation dictionaries, fixtures) are deliberately kept out of the graph. 
+- **re-frame apps get connected event flows**: every `reg-event-db`/`reg-event-fx`/`reg-sub` registration becomes a searchable symbol named by its keyword (`:profile/logout`), and `dispatch`/`subscribe` call sites link to it — so callers, impact, and flow tracing follow keyword-keyed dispatch across files. Project facades that wrap re-frame (custom `reg-*` helpers, `sub` aliases) are detected too. +- **UIx and helix components are first-class**: `defui`/`defnc` definitions become component symbols, and `($ button ...)` element composition produces real call edges — so "what renders this component" works in ClojureScript React apps. - `codegraph status --json` now also reports the running CLI `version`, the index directory (`indexPath`), and a `lastIndexed` timestamp (ISO-8601, or null when nothing's indexed yet), so CI and scripts can pin the CLI version and check index freshness from a single command. A matching `CodeGraph.getLastIndexedAt()` library method exposes the same freshness check without shelling out. Thanks @12122J and @eddieran. 
(#329) ### Fixes diff --git a/README.md b/README.md index 250b507af..9ffd457ba 100644 --- a/README.md +++ b/README.md @@ -223,7 +223,7 @@ CodeGraph cuts **tokens, tool calls, and wall-clock time on every repo** — acr | **Full-Text Search** | Find code by name instantly across your entire codebase, powered by FTS5 | | **Impact Analysis** | Trace callers, callees, and the full impact radius of any symbol before making changes | | **Always Fresh** | File watcher uses native OS events (FSEvents/inotify/ReadDirectoryChangesW) with debounced auto-sync — the graph stays current as you code, zero config | -| **20+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Objective-C, Swift, Kotlin, Dart, Lua, Luau, Svelte, Liquid, Pascal/Delphi | +| **20+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Objective-C, Swift, Kotlin, Dart, Lua, Luau, Clojure/ClojureScript, Svelte, Liquid, Pascal/Delphi | | **Framework-aware Routes** | Recognizes web-framework routing files and links URL patterns to their handlers across 14 frameworks | | **Mixed iOS / React Native / Expo** | Closes cross-language flows that static parsing misses: Swift ↔ ObjC bridging, React Native legacy bridge + TurboModules + Fabric view components, native → JS event emitters, Expo Modules | | **100% Local** | No data leaves your machine. No API keys. No external services. 
SQLite database only | @@ -635,6 +635,7 @@ is written): | Pascal / Delphi | `.pas`, `.dpr`, `.dpk`, `.lpr` | Full support (classes, records, interfaces, enums, DFM/FMX form files) | | Lua | `.lua` | Full support (functions, methods with receivers, local variables, `require` imports, call edges) | | Luau | `.luau` | Full support (everything in Lua, plus `type`/`export type` aliases, typed signatures, and Roblox instance-path `require`) | +| Clojure / ClojureScript | `.clj`, `.cljs`, `.cljc`, `.bb`, `.edn` | Full support (namespaces, `defn`/`def`, protocols, records, multimethods, `:require` alias-resolved call edges, reader conditionals, re-frame keyword dispatch, UIx/helix component composition; `.edn` config keys + code references) | ## Troubleshooting diff --git a/__tests__/extraction.test.ts b/__tests__/extraction.test.ts index d29fa11b3..b840d1dd4 100644 --- a/__tests__/extraction.test.ts +++ b/__tests__/extraction.test.ts @@ -4459,3 +4459,694 @@ func (s Stack[T]) Len() int { return len(s.items) } expect(js.nodes.find((n) => n.name === 'handleRequest' && n.kind === 'function')).toBeDefined(); }); }); + +// ============================================================================= +// Clojure / ClojureScript (lexical grammar — extraction via visitNode hook) +// ============================================================================= + +describe('Clojure Extraction', () => { + describe('Language detection', () => { + it('should detect Clojure family files', () => { + expect(detectLanguage('src/my/app/core.clj')).toBe('clojure'); + expect(detectLanguage('src/my/app/views.cljs')).toBe('clojure'); + expect(detectLanguage('src/my/app/util.cljc')).toBe('clojure'); + expect(detectLanguage('tasks.bb')).toBe('clojure'); + }); + + it('should report Clojure as supported', () => { + expect(isLanguageSupported('clojure')).toBe(true); + expect(getSupportedLanguages()).toContain('clojure'); + }); + }); + + describe('Namespace and defs', () => { + it('should extract the 
ns as a module and scope defs under it', () => { + const code = `(ns my.app.core + (:require [clojure.string :as str])) + +(def max-retries 3) + +(defn- helper [x] (str/upper-case x)) + +(defn process-user + "Process a user record." + [user] + (helper user)) +`; + const result = extractFromSource('src/my/app/core.clj', code); + const mod = result.nodes.find((n) => n.kind === 'module'); + expect(mod?.name).toBe('my.app.core'); + + const fn = result.nodes.find((n) => n.name === 'process-user'); + expect(fn?.kind).toBe('function'); + expect(fn?.qualifiedName).toBe('my.app.core::process-user'); + expect(fn?.docstring).toBe('Process a user record.'); + expect(fn?.signature).toBe('[user]'); + expect(fn?.language).toBe('clojure'); + + const helper = result.nodes.find((n) => n.name === 'helper'); + expect(helper?.visibility).toBe('private'); + + const constant = result.nodes.find((n) => n.name === 'max-retries'); + expect(constant?.kind).toBe('constant'); + }); + + it('should extract multi-arity defns with a combined signature', () => { + const code = `(ns m.a) +(defn greet + ([] (greet "world")) + ([who] (str "hi " who))) +`; + const result = extractFromSource('src/m/a.clj', code); + const fn = result.nodes.find((n) => n.name === 'greet'); + expect(fn?.signature).toBe('[] [who]'); + // The zero-arity body calls the one-arity — a self call ref must exist + const selfCall = result.unresolvedReferences.find( + (r) => r.referenceKind === 'calls' && r.referenceName === 'greet' + ); + expect(selfCall).toBeDefined(); + }); + + it('should treat function-valued defs as functions', () => { + const code = `(ns m.b) +(def handler (fn [req] {:status 200})) +(def shortcut #(inc %)) +`; + const result = extractFromSource('src/m/b.clj', code); + expect(result.nodes.find((n) => n.name === 'handler')?.kind).toBe('function'); + expect(result.nodes.find((n) => n.name === 'shortcut')?.kind).toBe('function'); + }); + + it('should extract library def-macros (defroutes, deftest) as named nodes', 
() => { + const code = `(ns m.routes) +(defroutes app-routes + (GET "/" [] home-page)) +(deftest parses-input + (is (= 1 1))) +`; + const result = extractFromSource('src/m/routes.clj', code); + expect(result.nodes.find((n) => n.name === 'app-routes')).toBeDefined(); + expect(result.nodes.find((n) => n.name === 'parses-input')).toBeDefined(); + }); + }); + + describe('Requires and imports', () => { + it('should extract :require entries as import nodes with refs', () => { + const code = `(ns my.app.core + (:require [clojure.string :as str] + [my.app.db :refer [save!]] + my.app.flags) + (:import (java.time Instant))) +`; + const result = extractFromSource('src/my/app/core.clj', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('clojure.string'); + expect(imports).toContain('my.app.db'); + expect(imports).toContain('my.app.flags'); + expect(imports).toContain('java.time.Instant'); + + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'imports' && r.referenceName === 'my.app.db' + ); + expect(ref).toBeDefined(); + }); + + it('should resolve :as aliases in call references to qualified names', () => { + const code = `(ns my.app.core + (:require [my.app.db :as db])) + +(defn save-user [u] (db/insert! u)) +`; + const result = extractFromSource('src/my/app/core.clj', code); + const call = result.unresolvedReferences.find((r) => r.referenceKind === 'calls' && r.referenceName === 'my.app.db::insert!'); + expect(call).toBeDefined(); + }); + + it('should resolve :refer symbols in call references to qualified names', () => { + const code = `(ns my.app.core + (:require [my.app.db :refer [save!]])) + +(defn save-user [u] (save! 
u)) +`; + const result = extractFromSource('src/my/app/core.clj', code); + const call = result.unresolvedReferences.find((r) => r.referenceKind === 'calls' && r.referenceName === 'my.app.db::save!'); + expect(call).toBeDefined(); + }); + }); + + describe('Protocols, records, multimethods', () => { + it('should extract defprotocol with method signatures', () => { + const code = `(ns m.proto) +(defprotocol Storage + (put [this k v]) + (fetch [this k])) +`; + const result = extractFromSource('src/m/proto.clj', code); + const proto = result.nodes.find((n) => n.kind === 'protocol'); + expect(proto?.name).toBe('Storage'); + const put = result.nodes.find((n) => n.name === 'put'); + expect(put?.kind).toBe('method'); + expect(put?.qualifiedName).toBe('m.proto::Storage::put'); + }); + + it('should extract defrecord with fields, methods, implements refs, and ctor fns', () => { + const code = `(ns m.rec) +(defprotocol Storage + (put [this k v])) +(defrecord MemStore [state] + Storage + (put [_ k v] (swap! 
state assoc k v))) +`; + const result = extractFromSource('src/m/rec.clj', code); + const cls = result.nodes.find((n) => n.kind === 'class'); + expect(cls?.name).toBe('MemStore'); + expect(result.nodes.find((n) => n.name === 'state' && n.kind === 'field')).toBeDefined(); + expect(result.nodes.find((n) => n.name === '->MemStore' && n.kind === 'function')).toBeDefined(); + expect(result.nodes.find((n) => n.name === 'map->MemStore')).toBeDefined(); + + const impl = result.unresolvedReferences.find( + (r) => r.referenceKind === 'implements' && r.referenceName === 'Storage' + ); + expect(impl).toBeDefined(); + + const method = result.nodes.find((n) => n.name === 'put' && n.qualifiedName.includes('MemStore')); + expect(method?.kind).toBe('method'); + }); + + it('should extract defmulti/defmethod as same-named functions (overloads)', () => { + const code = `(ns m.multi) +(defmulti render :type) +(defmethod render :button [w] (str w)) +(defmethod render :input [w] (str w)) +`; + const result = extractFromSource('src/m/multi.clj', code); + const renders = result.nodes.filter((n) => n.name === 'render' && n.kind === 'function'); + expect(renders.length).toBe(3); + }); + }); + + describe('Calls and references', () => { + it('should not emit call refs for special forms or core macros', () => { + const code = `(ns m.c) +(defn f [x] + (let [y (inc x)] + (when (pos? 
y) + (->> y (map inc) (filter odd?))))) +`; + const result = extractFromSource('src/m/c.clj', code); + const names = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls').map((r) => r.referenceName); + expect(names).not.toContain('let'); + expect(names).not.toContain('when'); + expect(names).not.toContain('->>'); + expect(names).not.toContain('map'); + }); + + it('should emit a calls ref for a same-file fn passed to a HOF', () => { + const code = `(ns m.hof) +(defn- transform [x] x) +(defn run [xs] (map transform xs)) +`; + const result = extractFromSource('src/m/hof.clj', code); + const hof = result.unresolvedReferences.find( + (r) => r.referenceKind === 'calls' && r.referenceName === 'transform' + ); + expect(hof).toBeDefined(); + }); + + it('should emit instantiates refs for ctor interop', () => { + const code = `(ns m.inst) +(defn make [] (java.util.ArrayList.)) +(defn make2 [] (new StringBuilder)) +`; + const result = extractFromSource('src/m/inst.clj', code); + const kinds = result.unresolvedReferences.filter((r) => r.referenceKind === 'instantiates').map((r) => r.referenceName); + expect(kinds).toContain('java.util.ArrayList'); + expect(kinds).toContain('StringBuilder'); + }); + + it('should not emit calls from quoted, discarded, or rich-comment forms', () => { + const code = `(ns m.q) +(def data '(fetch-thing 1)) +#_(dropped-call 2) +(comment (scratch-call 3)) +`; + const result = extractFromSource('src/m/q.clj', code); + const names = result.unresolvedReferences.map((r) => r.referenceName); + expect(names).not.toContain('fetch-thing'); + expect(names).not.toContain('dropped-call'); + expect(names).not.toContain('scratch-call'); + }); + }); + + describe('Reader conditionals (.cljc)', () => { + it('should extract defs from both branches of #?', () => { + const code = `(ns m.x) +#?(:clj + (defn read-file [p] (slurp p)) + :cljs + (defn write-log [m] (println m))) +`; + const result = extractFromSource('src/m/x.cljc', code); + 
expect(result.nodes.find((n) => n.name === 'read-file')).toBeDefined(); + expect(result.nodes.find((n) => n.name === 'write-log')).toBeDefined(); + }); + + it('should extract requires inside reader conditionals', () => { + const code = `(ns m.y + (:require [m.shared :as shared] + #?(:cljs [m.dom :as dom]))) +`; + const result = extractFromSource('src/m/y.cljc', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('m.shared'); + expect(imports).toContain('m.dom'); + }); + }); +}); + +// ============================================================================= +// EDN data files (.edn — same grammar, data mode: properties + references) +// ============================================================================= + +describe('EDN Extraction', () => { + it('should detect .edn files as clojure', () => { + expect(detectLanguage('deps.edn')).toBe('clojure'); + expect(detectLanguage('resources/system.edn')).toBe('clojure'); + }); + + it('should extract top-level map keys as property nodes', () => { + const code = `{:paths ["src" "resources"] + :deps {org.clojure/clojure {:mvn/version "1.11.1"}} + :aliases {:test {:extra-paths ["test"]}}} +`; + const result = extractFromSource('deps.edn', code); + const props = result.nodes.filter((n) => n.kind === 'property').map((n) => n.name); + expect(props).toContain(':paths'); + expect(props).toContain(':deps'); + expect(props).toContain(':aliases'); + // One level only — nested keys must NOT become nodes + expect(props).not.toContain(':test'); + expect(props).not.toContain(':extra-paths'); + }); + + it('should emit references for qualified symbols in values (shadow-cljs entry points)', () => { + const code = `{:builds + {:app {:target :browser + :modules {:main {:init-fn app.core/init}}}}} +`; + const result = extractFromSource('shadow-cljs.edn', code); + const ref = result.unresolvedReferences.find( + (r) => r.referenceKind === 'references' && 
r.referenceName === 'app.core::init' + ); + expect(ref).toBeDefined(); + }); + + it('should never emit call references from EDN data', () => { + const code = `{:tasks {clean (shell "rm -rf target")} + :fixture [(make-thing 1) (make-thing 2)]} +`; + const result = extractFromSource('bb.edn', code); + const calls = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls'); + expect(calls).toEqual([]); + }); + + it('should extract qualified keyword keys (integrant system maps) with refs', () => { + const code = `{:app/server {:port 8080 :handler app.http/router} + :app/db {:uri "datomic:mem://app"}} +`; + const result = extractFromSource('resources/system.edn', code); + const props = result.nodes.filter((n) => n.kind === 'property').map((n) => n.name); + expect(props).toContain(':app/server'); + expect(props).toContain(':app/db'); + const ref = result.unresolvedReferences.find((r) => r.referenceName === 'app.http::router'); + expect(ref).toBeDefined(); + // the ref hangs off the :app/server property node + const prop = result.nodes.find((n) => n.name === ':app/server'); + expect(ref?.fromNodeId).toBe(prop?.id); + }); +}); + +// ============================================================================= +// Clojure review follow-ups: shadowing precision, interop forms, ns options +// ============================================================================= + +describe('Clojure Extraction (precision)', () => { + describe('Binding-position shadowing (no false calls)', () => { + it('should not emit a calls ref for a let binding name shadowing a same-file fn', () => { + const code = `(ns m.shadow) +(defn- helper [x] x) +(defn run [data] + (let [helper (compute data)] + (str helper))) +`; + const result = extractFromSource('src/m/shadow.clj', code); + // `(compute data)` is walked (init expr), the binding NAME `helper` is not. 
+ const helperRefs = result.unresolvedReferences.filter((r) => r.referenceName === 'helper'); + expect(helperRefs).toEqual([]); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'compute' && r.referenceKind === 'calls') + ).toBeDefined(); + }); + + it('should not emit refs for fn params shadowing a same-file fn', () => { + const code = `(ns m.shadow2) +(defn- transform [x] x) +(defn run [xs] (map (fn [transform] (inc transform)) xs)) +`; + const result = extractFromSource('src/m/shadow2.clj', code); + expect(result.unresolvedReferences.filter((r) => r.referenceName === 'transform')).toEqual([]); + }); + + it('should still walk for/doseq modifier expressions and :let vectors', () => { + const code = `(ns m.fors) +(defn- check [x] x) +(defn run [xs] + (for [x xs + :when (check x) + :let [y (deep-init x)]] + y)) +`; + const result = extractFromSource('src/m/fors.clj', code); + const names = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls').map((r) => r.referenceName); + expect(names).toContain('check'); + expect(names).toContain('deep-init'); + }); + + it('should skip as-> and catch binding names', () => { + const code = `(ns m.asarrow) +(defn- step [x] x) +(defn run [v] + (try + (as-> v step (step step)) + (catch Exception step (str step)))) +`; + const result = extractFromSource('src/m/asarrow.clj', code); + // the body call (step step) is head-position — allowed; the binding + // names themselves (as-> 3rd element, catch 3rd element) emit nothing. 
+ const refs = result.unresolvedReferences.filter((r) => r.referenceName === 'step'); + expect(refs.every((r) => r.referenceKind === 'calls')).toBe(true); + }); + }); + + describe('Interop precision', () => { + it('should emit references (not calls) for .-property access', () => { + const code = `(ns m.dom) +(defn read-value [el] (.-value el)) +(defn fire [el] (.focus el)) +`; + const result = extractFromSource('src/m/dom.cljs', code); + const valueRef = result.unresolvedReferences.find((r) => r.referenceName === 'value'); + expect(valueRef?.referenceKind).toBe('references'); + const focusRef = result.unresolvedReferences.find((r) => r.referenceName === 'focus'); + expect(focusRef?.referenceKind).toBe('calls'); + }); + + it('should extract definline as a function', () => { + const code = `(ns m.inline) +(definline pow2 [x] \`(* ~x ~x)) +`; + const result = extractFromSource('src/m/inline.clj', code); + expect(result.nodes.find((n) => n.name === 'pow2')?.kind).toBe('function'); + }); + }); + + describe('Require/def option coverage', () => { + it('should mark ^:private defs as private', () => { + const code = `(ns m.priv) +(def ^:private secret 42) +(defn ^:private hidden [x] x) +`; + const result = extractFromSource('src/m/priv.clj', code); + expect(result.nodes.find((n) => n.name === 'secret')?.visibility).toBe('private'); + expect(result.nodes.find((n) => n.name === 'hidden')?.visibility).toBe('private'); + }); + + it('should extract string requires (shadow-cljs npm deps)', () => { + const code = `(ns m.npm + (:require ["react" :as react] + ["@mui/material" :as mui])) +(defn use-it [] (react/useState 0)) +`; + const result = extractFromSource('src/m/npm.cljs', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('react'); + expect(imports).toContain('@mui/material'); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'react::useState') + ).toBeDefined(); + }); + + it('should 
expand prefix lists in :require', () => { + const code = `(ns m.prefix + (:require (my.app [db :as db] core))) +(defn save [x] (db/insert! x)) +`; + const result = extractFromSource('src/m/prefix.clj', code); + const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name); + expect(imports).toContain('my.app.db'); + expect(imports).toContain('my.app.core'); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'my.app.db::insert!') + ).toBeDefined(); + }); + + it('should extract calls from letfn bodies without false refs to the local names', () => { + const code = `(ns m.letfn) +(defn- helper [x] x) +(defn run [v] + (letfn [(local-a [x] (helper x)) + (local-b [y] (local-a y))] + (local-b v))) +`; + const result = extractFromSource('src/m/letfn.clj', code); + const names = result.unresolvedReferences.filter((r) => r.referenceKind === 'calls').map((r) => r.referenceName); + expect(names).toContain('helper'); + }); + }); + + describe('Babashka content', () => { + it('should extract symbols from .bb files', () => { + const code = `(ns tasks + (:require [babashka.process :refer [shell]])) +(defn clean [] (shell "rm -rf target")) +`; + const result = extractFromSource('tasks.bb', code); + expect(result.nodes.find((n) => n.name === 'clean')?.kind).toBe('function'); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'babashka.process::shell') + ).toBeDefined(); + }); + }); +}); + +describe('Clojure Extraction (head-position shadowing)', () => { + it('should not emit a calls edge when a let-bound local shadows a same-file fn and is called', () => { + const code = `(ns m.headshadow) +(defn- helper [x] x) +(defn run [data] + (let [helper (make-handler data)] + (helper 1))) +`; + const result = extractFromSource('src/m/headshadow.clj', code); + expect(result.unresolvedReferences.filter((r) => r.referenceName === 'helper')).toEqual([]); + // the init expr is still a real call + expect( + 
result.unresolvedReferences.find((r) => r.referenceName === 'make-handler' && r.referenceKind === 'calls') + ).toBeDefined(); + }); + + it('should still emit calls for the un-shadowed name outside the binding scope', () => { + const code = `(ns m.scopeend) +(defn- helper [x] x) +(defn a [v] (let [helper inc] (helper v))) +(defn b [v] (helper v)) +`; + const result = extractFromSource('src/m/scopeend.clj', code); + const helperCalls = result.unresolvedReferences.filter( + (r) => r.referenceName === 'helper' && r.referenceKind === 'calls' + ); + expect(helperCalls.length).toBe(1); // only the one in `b` + }); +}); + +// ============================================================================= +// re-frame keyword-keyed dispatch (registrations ↔ dispatch/subscribe sites) +// ============================================================================= + +describe('Clojure Extraction (re-frame)', () => { + it('should create function nodes for registrations, named by the keyword', () => { + const code = `(ns my.app.events + (:require [re-frame.core :as rf])) +(rf/reg-event-db :todo/add (fn [db [_ t]] (conj-todo db t))) +(rf/reg-sub :todo/items (fn [db _] (:items db))) +`; + const result = extractFromSource('src/my/app/events.cljs', code); + const add = result.nodes.find((n) => n.name === ':todo/add'); + expect(add?.kind).toBe('function'); + expect(add?.signature).toBe('(reg-event-db :todo/add)'); + expect(result.nodes.find((n) => n.name === ':todo/items')).toBeDefined(); + // handler body calls attribute to the registration node + const call = result.unresolvedReferences.find((r) => r.referenceName === 'conj-todo'); + expect(call?.fromNodeId).toBe(add?.id); + // the registrar itself keeps its ordinary call ref (callers/impact on facades) + const registrar = result.unresolvedReferences.find( + (r) => r.referenceName === 're-frame.core::reg-event-db' && r.referenceKind === 'calls' + ); + expect(registrar).toBeDefined(); + 
expect(registrar?.fromNodeId).not.toBe(add?.id); // attributed to the enclosing scope + }); + + it('should expand :: and ::alias keywords in registrations', () => { + const code = `(ns my.app.events + (:require [re-frame.core :as rf] + [my.app.subs :as subs])) +(rf/reg-event-db ::add (fn [db _] db)) +(rf/reg-sub ::subs/items (fn [db _] db)) +`; + const result = extractFromSource('src/my/app/events.cljs', code); + expect(result.nodes.find((n) => n.name === ':my.app.events/add')).toBeDefined(); + expect(result.nodes.find((n) => n.name === ':my.app.subs/items')).toBeDefined(); + }); + + it('should emit keyword calls refs at dispatch and subscribe sites', () => { + const code = `(ns my.app.views + (:require [re-frame.core :as rf])) +(defn add-button [t] + [:button {:on-click #(rf/dispatch [:todo/add t])}]) +(defn todo-list [] + (let [items @(rf/subscribe [:todo/items])] + items)) +`; + const result = extractFromSource('src/my/app/views.cljs', code); + const dispatchRef = result.unresolvedReferences.find( + (r) => r.referenceName === ':todo/add' && r.referenceKind === 'calls' + ); + expect(dispatchRef).toBeDefined(); + expect( + result.unresolvedReferences.find((r) => r.referenceName === ':todo/items') + ).toBeDefined(); + }); + + it('should support :refer style and dispatch-sync', () => { + const code = `(ns my.app.core + (:require [re-frame.core :refer [reg-event-db dispatch-sync]])) +(reg-event-db :app/init (fn [_ _] {})) +(defn boot [] (dispatch-sync [:app/init])) +`; + const result = extractFromSource('src/my/app/core.cljs', code); + expect(result.nodes.find((n) => n.name === ':app/init')).toBeDefined(); + expect( + result.unresolvedReferences.filter((r) => r.referenceName === ':app/init').length + ).toBeGreaterThanOrEqual(1); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 're-frame.core::reg-event-db') + ).toBeDefined(); + }); + + it('should treat project facades (utils.re-frame style) as re-frame', () => { + // status-mobile fronts 
re-frame with its own ns: custom registrars + // (reg-root-key-sub) and `sub` for subscribe — shape-based detection + // covers them without knowing the facade. + const code = `(ns my.app.views + (:require [utils.re-frame :as rf])) +(rf/reg-root-key-sub :profile/name :profile-name) +(defn header [] (rf/sub [:profile/name])) +(defn save [] (rf/dispatch [:profile/update])) +`; + const result = extractFromSource('src/my/app/views.cljs', code); + expect(result.nodes.find((n) => n.name === ':profile/name')?.kind).toBe('function'); + expect( + result.unresolvedReferences.find((r) => r.referenceName === ':profile/name' && r.referenceKind === 'calls') + ).toBeDefined(); + expect( + result.unresolvedReferences.find((r) => r.referenceName === ':profile/update') + ).toBeDefined(); + }); + + it('should not shape-match reg-* calls without a literal keyword key', () => { + const code = `(ns my.app.other) +(reg-handler handler-map) +(reg-watch "string-key" f) +(reg-event-db dynamic-kw (fn [db _] db)) +`; + const result = extractFromSource('src/my/app/other.clj', code); + expect(result.nodes.filter((n) => n.name.startsWith(':'))).toEqual([]); + }); + + it('should skip variable event vectors (anonymous frontier)', () => { + const code = `(ns my.app.relay + (:require [re-frame.core :as rf])) +(defn relay [evt] (rf/dispatch evt)) +`; + const result = extractFromSource('src/my/app/relay.cljs', code); + const kwRefs = result.unresolvedReferences.filter((r) => r.referenceName.startsWith(':')); + expect(kwRefs).toEqual([]); + }); +}); + +// ============================================================================= +// UIx / helix (ClojureScript React wrappers — defui/defnc + $ composition) +// ============================================================================= + +describe('Clojure Extraction (UIx / helix)', () => { + it('should extract defui as component nodes', () => { + const code = `(ns my.app.ui + (:require [uix.core :refer [defui $]])) +(defui button [{:keys 
[on-click]}] + ($ :button {:on-click on-click})) +`; + const result = extractFromSource('src/my/app/ui.cljs', code); + const btn = result.nodes.find((n) => n.name === 'button'); + expect(btn?.kind).toBe('component'); + expect(btn?.signature).toBe('(defui ...)'); + }); + + it('should emit calls edges for $ component composition (refer style)', () => { + const code = `(ns my.app.views + (:require [uix.core :refer [defui $]] + [my.app.ui :as ui])) +(defui panel [_] ($ :aside)) +(defui toolbar [{:keys [doc]}] + ($ :div + ($ ui/button {:on-click identity}) + ($ panel {}))) +`; + const result = extractFromSource('src/my/app/views.cljs', code); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'my.app.ui::button' && r.referenceKind === 'calls') + ).toBeDefined(); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'panel' && r.referenceKind === 'calls') + ).toBeDefined(); + // DOM tags produce nothing + expect(result.unresolvedReferences.find((r) => r.referenceName === 'div')).toBeUndefined(); + }); + + it('should support aliased uix/$ and helix defnc', () => { + const code = `(ns my.app.hx + (:require [helix.core :as hx :refer [defnc]] + [my.app.widgets :as w])) +(defnc row [props] (hx/$ w/cell {:v 1})) +`; + const result = extractFromSource('src/my/app/hx.cljs', code); + expect(result.nodes.find((n) => n.name === 'row')?.kind).toBe('component'); + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'my.app.widgets::cell' && r.referenceKind === 'calls') + ).toBeDefined(); + }); + + it('should not treat $ from non-uix namespaces as element creation', () => { + const code = `(ns my.app.money + (:require [my.currency :refer [$]])) +(defn price [x] ($ amount x)) +`; + const result = extractFromSource('src/my/app/money.clj', code); + // plain refer'd call, no component edge to `amount` + expect( + result.unresolvedReferences.find((r) => r.referenceName === 'my.currency::$') + ).toBeDefined(); + expect( + 
result.unresolvedReferences.find((r) => r.referenceName === 'amount' && r.referenceKind === 'calls') + ).toBeUndefined(); + }); +}); diff --git a/docs/design/dynamic-dispatch-coverage-playbook.md b/docs/design/dynamic-dispatch-coverage-playbook.md index aa65398e4..aafcb180f 100644 --- a/docs/design/dynamic-dispatch-coverage-playbook.md +++ b/docs/design/dynamic-dispatch-coverage-playbook.md @@ -250,6 +250,7 @@ Status legend: ✅ done+validated · 🔬 hole identified · ⬜ not started. | C/C++ | C++ vtables / inheritance | virtual call → override; general direct dispatch | S + X | ✅ **general dispatch strong** (redis C **29k** cross-file calls / leveldb C++ **1.4k**) + **C++ inheritance extraction fix** (`base_class_clause` was unhandled, so C++ extends edges were missing — leveldb **219→298**) + **cpp-override synthesizer** (base virtual method → subclass override, gated to C++, capped — leveldb 12 precise: `Iterator::Next→MergingIterator`). 🔬 C callback structs (`s->fn()` → 422-way fan-out, too noisy to synthesize) + C++ pure-virtual base methods (`virtual void f()=0;` declarations aren't extracted as nodes, so those overrides can't bridge) | | Dart | Flutter | setState → build; build → child widgets | S + X | ✅ **setState→build synthesizer** (Dart analog of react-render: a State method whose body calls `setState(` → `build`) gated to `.dart` + **foundational Dart method-range fix** — Dart models a method body as a *sibling* of the signature, so method nodes were signature-only (`end==start`); now `endLine` spans the body (required for ALL body analysis: callees, context slices, the synthesizer's body scan). counter `initState→build`, books `build→BookDetail/BookForm`; widget composition already static (compass_app `build→ErrorIndicator/HomeButton`). Controls unchanged (excalidraw 9,290 / django 302 — the range fix only extends sibling-body grammars). 
🔬 MVVM Command/ChangeNotifier dispatch (compass_app — no setState) + `Navigator.push(MaterialPageRoute(builder:))` nav routes | | Lua / Luau | Neovim / Roblox | module dispatch (require→mod, mod.fn); event/callback | — | ✅ **already covered for the dominant flow (measure-first, no code change)** — Neovim is module-heavy (`require('x')` + `x.fn()`), and the general import + name resolution already handles it: telescope.nvim **220 imports + 335 cross-file `mod.fn` calls**, traces end-to-end (`map_entries ← init.lua → get_current_picker (state.lua)`). Luau instance-path `require(game:GetService(...))` handled by the extractor. 🔬 event-callback registration (`vim.keymap.set(…, fn)`, autocmd `callback=`, Roblox `signal:Connect(fn)`) is predominantly INLINE anonymous closures (corpus ~12 inline vs ~2 named) — the anonymous-handler frontier; named handlers too rare to justify a synthesizer | +| Clojure / ClojureScript | namespace dispatch / protocols / re-frame / integrant | `:require` alias → cross-ns call; editor action → persisted state (logseq); API query → QP middleware → driver (metabase) | — (extraction-level) | ✅ **dominant flow covered at the extraction layer, no synthesizer needed** — aliased calls (`(db/insert! x)`) emit `full.ns::name` and resolve via the qualified-name matcher; `:refer`'d and same-file-HOF calls link too; protocol methods bridge through bare-name interop calls (`(.put store ...)`); local-shadowing suppression keeps precision (ring −207 false call edges). Validated S/M/L: ring 84f/2.5k edges · logseq 1,312f/92k (A/B **155→4 tool calls, 0 Read/Grep**, 3.6× faster, 2.5× cheaper) · metabase 15,374f/623k (A/B 24→14 calls, 16→6 Reads, 1.7× faster). ring A/B n=2: with 4–5 calls / 0–1 Read / 37–55s / ~$0.53 vs without 7 calls / 3 Reads / 44s / ~$0.28 — wall-clock parity within run noise, fewer calls+Reads, but ~1.8× cost (the known small-repo explore over-return pattern, not Clojure-specific). 
✅ **re-frame keyword dispatch (extraction-only, no synthesizer)** — keywords are globally unique strings, so each `reg-*` registration becomes a function node NAMED by its alias-expanded keyword (`::subs/items` → `:my.app.subs/items`) and literal `dispatch [:k …]`/`subscribe [:k]` sites emit same-named `calls` refs that the exact-name matcher links. Detection is SHAPE-based (`/^reg-[a-z-]+$/` head + literal kwd first arg; `dispatch`/`dispatch-sync`/`subscribe`/`sub` + kwd-led vector) because real apps front re-frame with project facades — status-mobile's `utils.re-frame` covers 512 files with custom registrars (`reg-root-key-sub`) and `sub`; ns-gating on `re-frame.core` found only 119 edges, shape-based finds **2,323** (1,635 registrations, +911 nodes, 10 kwd collisions). Precision is structural: an edge needs BOTH a registration node and a dispatch ref with the same keyword, so stray shape-matches resolve to nothing. todomvc (re-frame repo) 269 regs / 194 edges · athens 292/552 · status-mobile 1,635/2,323; `codegraph_node :profile/logout` returns the handler + all 13 dispatch sites (views + chained events) in one call. Agent A/B on status-mobile's logout flow (n=2): with 12–14 calls / 3–6 Reads / 72–88s vs without 25–33 calls / 10–15 Reads / ~122s — ~1.7× faster, half the calls+Reads, ~1.9× cost (large-repo explore payloads); residual Reads are the agent chasing fx-chain hops explore didn't lead with. **Explore-side Clojure query support** (validated on a 2k-file cljs monorepo): symbol-token charset widened to the Lisp alphabet (kebab/`?`/`!`, `alias/name`, `:keyword` — previously NO Clojure symbol passed and the flow builder never ran), bare tokens resolve against module last-segments (ns names are how Clojure agents reference subsystems), and ambiguous tokens prefer candidates co-located with anchor dirs (monorepo cross-subsystem noise). Controls held (Alamofire multi-phase, metabase TS). 
**UIx/helix composition**: `defui`/`defnc` → `component` nodes and the `$` element macro → calls edges to the composed component (gated on `$` resolving to uix.core/helix.core in the require table — too short to shape-match); pitch-io/uix repo: 131 components, composition edges across its benchmark suites. 🔬 status-mobile's legacy `rf/defn {:events [:k]}` macro (kwd lives in an attr-map); fx-map keys → `reg-fx` handlers; handler→sub app-db data-flow (which subs recompute on a db write); integrant `ig/init-key` system-key → defmethod; multimethod dispatch-value → defmethod edges; `extend-protocol`/`extend-type` implements edges; reitit route DATA | | Scala | Play / Akka | request → conf/routes → controller action | R + X | ✅ **Play `conf/routes` → controller** — the extensionless `conf/routes` wasn't indexed; added narrow file-walk opt-in (`isPlayRoutesFile`) + a Play resolver parsing `METHOD /path Controller.action(args)` → the action method (computer-database **0→8, 7/8**; starter 0→4, 3/4 — the unresolved are Play's framework `Assets` controller, external). Scala general controller→DAO dispatch already resolves. No-regression: the file-walk change only ADDS Play routes files (excalidraw 9,290 / suite 800 unchanged). 🔬 SIRD programmatic router (`-> /v1 Router` include + `case GET(p"/x")` in code) + Akka actor `receive`/`Behaviors.receiveMessage` message→handler | | Swift × Objective-C | mixed iOS apps | Swift `obj.foo(bar:)` → ObjC `-fooWithBar:`; ObjC `[obj fooWithBar:]` → Swift `@objc func foo(bar:)` | R | ✅ **Swift↔ObjC cross-language bridge** — `frameworks/swift-objc.ts` implements Apple's `@objc` auto-bridging name math (incl. init forms `initWith:`, property getter+setter pairs, `@objc(custom:)` override) and the reverse direction strips Cocoa preposition prefixes (`With`/`For`/`By`/`In`/`On`/`At`/`From`/`To`/`Of`/`As`) to derive Swift base-name candidates. 
Validated on Charts S **28/1 obj→swift / swift→objc**, realm-swift M **36/1185**, wikipedia-ios L **52/983**. Genericname blocklist (`init`, `description`, `count`, …) keeps precision. Confidence 0.6 (name-match's 1.0 wins ties) — bridge only fires when name-match has no result. 🔬 Swift generics over ObjC protocols, Swift extensions on ObjC classes (silently miss; matches Java/Kotlin generics frontier) | | JS × native | React Native legacy bridge | JS `NativeModules.X.fn(...)` → ObjC `RCT_EXPORT_METHOD` / Java/Kotlin `@ReactMethod` | R | ✅ **RN legacy bridge** — `frameworks/react-native.ts` parses `RCT_EXPORT_MODULE` (default-name from `RCT`-prefix-stripped class name) + `RCT_EXPORT_METHOD(selector:(...))` + `RCT_REMAP_METHOD(jsName, selector)` on the ObjC side and `@ReactMethod` + `getName()` literal on Java/Kotlin. AsyncStorage S **8/8 precise** (`setItem`→`legacy_multiSet`, etc.), react-native-firebase L **18 precise after `RCTEventEmitter` built-in blocklist** (initial 78 included 60 `addListener:`/`remove:` false positives — every emitter subclass declares those via `RCT_EXPORT_METHOD`, JS callers route through the `NativeEventEmitter` abstraction not the native method directly). 
🔬 dynamic bridge keys (`NativeModules[someVar]`) — literal-key only | diff --git a/src/extraction/grammars.ts b/src/extraction/grammars.ts index 576845e20..99479867d 100644 --- a/src/extraction/grammars.ts +++ b/src/extraction/grammars.ts @@ -38,6 +38,7 @@ const WASM_GRAMMAR_FILES: Record = { lua: 'tree-sitter-lua.wasm', luau: 'tree-sitter-luau.wasm', objc: 'tree-sitter-objc.wasm', + clojure: 'tree-sitter-clojure.wasm', }; /** @@ -99,6 +100,21 @@ export const EXTENSION_MAP: Record = { '.sc': 'scala', '.lua': 'lua', '.luau': 'luau', + // Clojure family: one language token for all dialects — .cljc files are + // shared between Clojure and ClojureScript, so splitting the dialects into + // separate Language values would break cross-dialect reference resolution + // (matchers gate on language equality). .bb is Babashka (plain Clojure). + '.clj': 'clojure', + '.cljs': 'clojure', + '.cljc': 'clojure', + // .bb is also BitBake (Yocto) — accepted collision: Clojure tooling treats + // .bb as Babashka, BitBake recipes parse as near-empty lexical trees + // (harmless), and Yocto + CodeGraph overlap is negligible. + '.bb': 'clojure', + // EDN config/data files (deps.edn, bb.edn, shadow-cljs.edn, system configs) + // parse with the same grammar but extract in data mode: top-level keys + // become property nodes and qualified symbols become references — no calls. + '.edn': 'clojure', '.m': 'objc', '.mm': 'objc', // XML: file-level tracking; the MyBatis extractor matches `` @@ -184,8 +200,10 @@ export async function loadGrammarsForLanguages(languages: Language[]): Promise; + /** :refer'd symbol → full namespace name */ + refers: Map; + /** + * Lazy index of same-file function/method names for HOF detection in + * handleSymRef. Rebuilt only when ctx.nodes has grown — without it every + * bare symbol pays a linear scan over all nodes extracted so far + * (O(symbols × nodes) on god-files). 
+ */ + fnNames?: { len: number; names: Set<string> }; + /** + * Stack of local-binding frames (let/loop/for vecs, fn/defn params, letfn + * names, as->/catch bindings). A bare symbol matching ANY frame is a local + * — shadowing a same-file fn name in a `let` is idiomatic Clojure, and + * without this both the shadowed usages and shadowed head-position calls + * would emit false `calls` edges to the fn. + */ + locals: Set<string>[]; +} + +function isLocal(name: string, state: NsState): boolean { + for (let i = state.locals.length - 1; i >= 0; i--) { + if (state.locals[i]!.has(name)) return true; + } + return false; +} + +/** + * Collect the names a binding TARGET introduces: a plain symbol, or every + * unqualified symbol inside a destructuring vec/map (`{:keys [a b] :as all}`). + * Keyword markers (`:keys`, `:as`, `:or`, map keys) are skipped; `&` is not a + * name. Over-collects symbols inside `:or` default expressions — conservative + * in the right direction (suppresses references rather than fabricating them). + */ +function collectBindingNames(target: SyntaxNode, source: string, into: Set<string>): void { + if (target.type === 'sym_lit') { + const { ns, name } = symParts(target, source); + if (!ns && name && name !== '&') into.add(name); + return; + } + if (target.type === 'vec_lit' || target.type === 'map_lit' || target.type === 'ns_map_lit') { + for (const child of valueChildren(target)) { + if (child.type === 'kwd_lit') continue; + collectBindingNames(child, source, into); + } + } +} + +// Keyed weakly by the parse Tree object — one entry per in-flight file parse, +// reclaimed when the extractor deletes the tree. +const nsStateByTree = new WeakMap(); + +function getNsState(node: SyntaxNode): NsState { + let state = nsStateByTree.get(node.tree); + if (!state) { + state = { aliases: new Map(), refers: new Map(), locals: [] }; + nsStateByTree.set(node.tree, state); + } + return state; +} + +/** Split a sym_lit into its optional namespace part and name part.
*/ +function symParts(sym: SyntaxNode, source: string): { ns?: string; name: string } { + const nsNode = getChildByField(sym, 'namespace'); + const nameNode = getChildByField(sym, 'name'); + return { + ns: nsNode ? getNodeText(nsNode, source) : undefined, + name: nameNode ? getNodeText(nameNode, source) : getNodeText(sym, source), + }; +} + +/** Named children minus comments/discards — the actual value forms. */ +function valueChildren(node: SyntaxNode): SyntaxNode[] { + return node.namedChildren.filter( + (c): c is SyntaxNode => !!c && c.type !== 'comment' && c.type !== 'dis_expr' + ); +} + +/** Does a sym_lit carry `^:private` (or `{:private true}`) metadata? */ +function hasPrivateMeta(sym: SyntaxNode, source: string): boolean { + for (let i = 0; i < sym.namedChildCount; i++) { + if (sym.fieldNameForNamedChild(i) !== 'meta') continue; + const meta = sym.namedChild(i); + if (meta && /:private\b/.test(getNodeText(meta, source))) return true; + } + return false; +} + +/** Strip surrounding quotes from a str_lit's text. */ +function stringContent(node: SyntaxNode, source: string): string { + return getNodeText(node, source).replace(/^"|"$/g, ''); +} + +const LITERAL_TYPES = new Set([ + 'num_lit', 'str_lit', 'kwd_lit', 'bool_lit', 'nil_lit', 'char_lit', 'regex_lit', +]); + +// Container forms whose children are evaluated — walk through them. +const WALK_THROUGH_TYPES = new Set([ + 'vec_lit', 'map_lit', 'set_lit', 'ns_map_lit', + 'read_cond_lit', 'splicing_read_cond_lit', 'syn_quoting_lit', + 'unquoting_lit', 'unquote_splicing_lit', 'derefing_lit', + 'tagged_or_ctor_lit', 'var_quoting_lit', 'evaling_lit', +]); + +// Special forms + ubiquitous clojure.core macros/functions. These never get a +// `calls` reference: the real target (clojure.core) is never in the project +// graph, so emitting them only risks false edges to same-named project +// symbols. Children are still walked, so calls *inside* them are captured. 
+const CORE_FORMS = new Set([ + // special forms / core macros + 'def', 'if', 'do', 'let', 'let*', 'quote', 'var', 'fn', 'fn*', 'loop', 'loop*', + 'recur', 'throw', 'try', 'catch', 'finally', 'set!', 'new', '.', '..', + 'monitor-enter', 'monitor-exit', 'in-ns', 'import', 'require', 'use', 'refer', + 'when', 'when-not', 'when-let', 'when-some', 'when-first', 'if-let', 'if-not', + 'if-some', 'cond', 'condp', 'case', 'and', 'or', 'not', + '->', '->>', 'as->', 'some->', 'some->>', 'cond->', 'cond->>', 'doto', + 'doseq', 'dotimes', 'for', 'while', 'binding', 'with-open', 'with-redefs', + 'with-local-vars', 'with-bindings', 'with-out-str', 'with-in-str', 'with-meta', + 'delay', 'lazy-seq', 'lazy-cat', 'future', 'promise', 'locking', 'io!', 'sync', + // NOTE: definline is deliberately NOT here — it defines a function, and the + // def-macro heuristic in handleList turns it into a function node. + 'dosync', 'declare', 'assert', 'comment', 'gen-class', 'this-as', + 'goog-define', 'specify', 'specify!', + // ubiquitous core functions + 'map', 'filter', 'remove', 'reduce', 'reduce-kv', 'apply', 'str', 'pr', 'prn', + 'println', 'print', 'printf', 'pr-str', 'prn-str', 'format', 'get', 'get-in', + 'assoc', 'assoc-in', 'update', 'update-in', 'dissoc', 'merge', 'merge-with', + 'conj', 'cons', 'concat', 'into', 'vec', 'vector', 'list', 'hash-map', + 'hash-set', 'set', 'sorted-map', 'sorted-set', 'array-map', 'first', 'second', + 'ffirst', 'rest', 'next', 'nnext', 'nth', 'last', 'butlast', 'take', + 'take-while', 'take-last', 'take-nth', 'drop', 'drop-while', 'drop-last', + 'count', 'empty', 'empty?', 'seq', 'not-empty', 'keys', 'vals', 'contains?', + 'some', 'every?', 'not-any?', 'not-every?', 'filterv', 'mapv', 'keep', + 'keep-indexed', 'map-indexed', 'mapcat', 'partition', 'partition-all', + 'partition-by', 'group-by', 'frequencies', 'sort', 'sort-by', 'reverse', + 'distinct', 'dedupe', 'interleave', 'interpose', 'flatten', 'zipmap', 'range', + 'repeat', 'repeatedly', 'iterate', 
'cycle', 'identity', 'constantly', 'comp', + 'partial', 'juxt', 'complement', 'fnil', 'memoize', 'trampoline', + '=', 'not=', '==', '<', '>', '<=', '>=', '+', '-', '*', '/', '+\'', '-\'', '*\'', + 'quot', 'rem', 'mod', 'inc', 'dec', 'inc\'', 'dec\'', 'max', 'min', 'abs', + 'zero?', 'pos?', 'neg?', 'even?', 'odd?', 'number?', 'string?', 'keyword?', + 'symbol?', 'map?', 'vector?', 'list?', 'set?', 'seq?', 'coll?', 'fn?', 'ifn?', + 'nil?', 'true?', 'false?', 'boolean', 'some?', 'any?', 'instance?', + 'satisfies?', 'isa?', 'type', 'class', 'name', 'namespace', 'keyword', + 'symbol', 'gensym', 'int', 'long', 'double', 'float', 'bigdec', 'bigint', + 'num', 'rand', 'rand-int', 'rand-nth', 'shuffle', 'atom', 'swap!', + 'swap-vals!', 'reset!', 'reset-vals!', 'compare-and-set!', 'add-watch', + 'remove-watch', 'agent', 'send', 'send-off', 'await', 'alter', 'alter-var-root', + 'commute', 'ref', 'ref-set', 'deref', 'intern', 'resolve', 'requiring-resolve', + 'find-var', 'meta', 'vary-meta', 'alter-meta!', 'reduced', 'realized?', 'force', + 'ex-info', 'ex-data', 'ex-message', 'ex-cause', 'slurp', 'spit', 'read-string', + 'get-method', 'methods', 'prefer-method', 'remove-method', 'derive', 'underive', + 'make-hierarchy', 'boolean?', 'char?', 'double?', 'float?', 'int?', 'integer?', + 'nat-int?', 'pos-int?', 'neg-int?', 'rational?', 'ratio?', 'decimal?', 'var?', + 'volatile!', 'vswap!', 'vreset!', 'tap>', 'add-tap', 'remove-tap', 'run!', + 'doall', 'dorun', 'nthnext', 'nthrest', 'split-at', 'split-with', 'subvec', + 'subs', 're-find', 're-matches', 're-seq', 're-pattern', 'peek', 'pop', + 'select-keys', 'update-keys', 'update-vals', 'min-key', 'max-key', 'key', + 'val', 'find', 'line-seq', 'file-seq', 'tree-seq', 'xml-seq', 'compare', + 'hash', 'identical?', 'time', 'identity', 'random-uuid', 'parse-long', + 'parse-double', 'parse-boolean', 'parse-uuid', 'char', 'int-array', + 'long-array', 'object-array', 'to-array', 'into-array', 'aget', 'aset', + 'alength', 'aclone', 'amap', 
'areduce', 'make-array', +]); + +/** Emit one unresolved reference from the current scope. */ +function emitRef( + ctx: ExtractorContext, + node: SyntaxNode, + referenceName: string, + referenceKind: 'calls' | 'references' | 'instantiates' | 'implements' | 'imports' +): void { + const fromNodeId = ctx.nodeStack[ctx.nodeStack.length - 1]; + if (!fromNodeId || !referenceName) return; + ctx.addUnresolvedReference({ + fromNodeId, + referenceName, + referenceKind, + line: node.startPosition.row + 1, + column: node.startPosition.column, + }); +} + +/** + * Reference name for a namespaced symbol: resolve the alias through the + * file's `:require` table to `full.ns::name` (exact qualifiedName match), or + * fall back to interop/foreign-namespace forms. + */ +function qualifiedRefName(ns: string, name: string, state: NsState): string { + const full = state.aliases.get(ns); + if (full) return `${full}::${name}`; + if (ns.includes('.')) return `${ns}::${name}`; // direct fully-qualified usage + if (/^[A-Z]/.test(ns)) return `${ns}.${name}`; // Class/staticMethod interop + return name; // unknown lowercase alias — fall back to bare name matching +} + +/** Walk any evaluated (non-list) form for references; lists go to handleList. */ +function walkForm(node: SyntaxNode, ctx: ExtractorContext): void { + const t = node.type; + if (t === 'list_lit') { + handleList(node, ctx); + } else if (t === 'anon_fn_lit') { + // `#(f x)` IS the call form — the grammar puts head + args directly under + // anon_fn_lit, no inner list_lit. Route through handleList so the call is + // extracted (`%` is 1-char and skipped by handleSymRef; `%1`-style args match no name). + handleList(node, ctx); + } else if (t === 'sym_lit') { + handleSymRef(node, ctx); + } else if (WALK_THROUGH_TYPES.has(t)) { + for (const child of valueChildren(node)) walkForm(child, ctx); + } + // quoting_lit / dis_expr / literals: not evaluated as code — skip +} + +/** + * A symbol in non-head (argument) position.
Emit a reference only in the + * high-precision cases: a namespaced symbol (clearly a var usage), a + * `:refer`'d symbol, or a symbol naming a function already defined in this + * file (the common private-helper-passed-to-HOF case — Clojure is + * define-before-use, so the node already exists). Bare locals never match. + */ +function handleSymRef(sym: SyntaxNode, ctx: ExtractorContext): void { + const state = getNsState(sym); + const { ns, name } = symParts(sym, ctx.source); + if (!name || name.length <= 1) return; + + if (ns) { + emitRef(ctx, sym, qualifiedRefName(ns, name, state), 'references'); + return; + } + if (CORE_FORMS.has(name)) return; + if (isLocal(name, state)) return; // bound by an enclosing let/fn/loop — not a var usage + const referNs = state.refers.get(name); + if (referNs) { + emitRef(ctx, sym, `${referNs}::${name}`, 'references'); + return; + } + // Same-file function passed as a value — a higher-order call. + let cache = state.fnNames; + if (!cache || cache.len !== ctx.nodes.length) { + const names = new Set<string>(); + for (const n of ctx.nodes) { + if (n.kind === 'function' || n.kind === 'method') names.add(n.name); + } + cache = { len: ctx.nodes.length, names }; + state.fnNames = cache; + } + if (cache.names.has(name)) emitRef(ctx, sym, name, 'calls'); +} + +/** + * Core forms whose second element is a binding VECTOR: `[name expr name + * expr ...]`. Binding names introduce locals — they are never usages, and + * shadowing a same-file fn name in a `let` is idiomatic Clojure, so walking + * them through handleSymRef would emit false `calls` edges.
+ */ +const BINDING_FORMS = new Set([ + 'let', 'let*', 'loop', 'loop*', 'binding', 'for', 'doseq', 'dotimes', + 'when-let', 'if-let', 'when-some', 'if-some', 'when-first', + 'with-open', 'with-redefs', 'with-local-vars', +]); + +/** + * Process the pairs of a binding vector into `frame`: even positions are + * binding targets (their names join the frame AFTER their init is walked — + * `let` is sequential), odd positions are init expressions. `for`/`doseq` + * modifiers keep the pairing: `:when expr` / `:while expr` walk the expr, + * `:let [..]` recurses into the nested binding vector. + */ +function processBindingPairs( + vec: SyntaxNode, + ctx: ExtractorContext, + frame: Set<string> +): void { + const kids = valueChildren(vec); + for (let i = 0; i + 1 < kids.length; i += 2) { + const target = kids[i]!; + const init = kids[i + 1]!; + if (target.type === 'kwd_lit') { + if (getNodeText(target, ctx.source) === ':let' && init.type === 'vec_lit') { + processBindingPairs(init, ctx, frame); + } else { + walkForm(init, ctx); // :when / :while expressions + } + continue; + } + walkForm(init, ctx); + collectBindingNames(target, ctx.source, frame); + } +} + +/** + * `(let [name expr ...] body)` and friends — bind the vector's names for the + * duration of the body so shadowed usages don't emit references. + */ +function handleBindingForm(kids: SyntaxNode[], ctx: ExtractorContext, state: NsState): void { + const frame = new Set<string>(); + state.locals.push(frame); + processBindingPairs(kids[1]!, ctx, frame); + for (const kid of kids.slice(2)) walkForm(kid, ctx); + state.locals.pop(); +} + +/** + * `(fn name? [params] body)` / `(fn name? ([a] ...) ([a b] ...))` — the + * optional self-name and the param vectors are bindings: they join a locals + * frame around the bodies instead of being walked as usages.
+ */ +function walkFnForm(kids: SyntaxNode[], ctx: ExtractorContext, state: NsState): void { + const frame = new Set<string>(); + state.locals.push(frame); + let i = 1; + if (kids[i]?.type === 'sym_lit') { + collectBindingNames(kids[i]!, ctx.source, frame); + i++; + } + if (kids[i]?.type === 'vec_lit') { + collectBindingNames(kids[i]!, ctx.source, frame); + for (const form of kids.slice(i + 1)) walkForm(form, ctx); + } else { + for (const arity of kids.slice(i)) { + if (arity.type !== 'list_lit') { + walkForm(arity, ctx); + continue; + } + const aKids = valueChildren(arity); + if (aKids[0]?.type === 'vec_lit') collectBindingNames(aKids[0]!, ctx.source, frame); + for (const form of aKids.slice(aKids[0]?.type === 'vec_lit' ? 1 : 0)) { + walkForm(form, ctx); + } + } + } + state.locals.pop(); +} + +/** Push a frame of param names around a body walk (defn arities, method impls). */ +function walkBodyWithParams( + params: SyntaxNode | undefined, + body: SyntaxNode[], + ctx: ExtractorContext, + state: NsState +): void { + const frame = new Set<string>(); + if (params) collectBindingNames(params, ctx.source, frame); + state.locals.push(frame); + for (const form of body) walkForm(form, ctx); + state.locals.pop(); +} + +// --------------------------------------------------------------------------- +// re-frame keyword-keyed dispatch +// --------------------------------------------------------------------------- +// +// re-frame routes everything through keyword-keyed registries at runtime — +// `(reg-event-db :todo/add handler)` … `(dispatch [:todo/add x])` — so the +// flow has zero static edges. Keywords are globally unique strings, so the +// bridge is extraction-only: each registration becomes a function node NAMED +// by its (alias-expanded) keyword, each literal dispatch/subscribe site emits +// a `calls` reference with the same name, and the existing exact-name matcher +// links them.
+// +// Detection is SHAPE-based, not require-gated: real apps wrap re-frame in a +// project facade (status-mobile's `utils.re-frame` fronts 512 files, with +// custom registrars like `reg-root-key-sub` and `sub` for subscribe), so +// gating on `re-frame.core` in the require table misses most call sites. +// The shape is distinctive — a `reg-*` head with a literal keyword first arg +// is re-frame-family vocabulary — and precision is enforced structurally: an +// edge only materializes when a registration node AND a dispatch ref carry +// the exact same keyword, so a stray shape-match in a non-re-frame app +// resolves to nothing and is dropped. (The node side of a stray match — a +// spurious function node named `:kwd` — is accepted: it is inert without a +// same-keyword dispatch site, and the registrar call itself still gets its +// ordinary call ref, so nothing is lost either way.) + +const RE_FRAME_REG_SHAPE = /^reg-[a-z-]+$/; +const RE_FRAME_DISPATCH_FORMS = new Set(['dispatch', 'dispatch-sync', 'subscribe', 'sub']); + +// --------------------------------------------------------------------------- +// UIx / helix (ClojureScript React wrappers) +// --------------------------------------------------------------------------- +// +// Components are defined with a def-macro (`defui` in UIx, `defnc` in helix) +// and composed with the `$` element macro: `($ ui/button {:on-click f} ...)`. +// `$` is the entire composition mechanism, so the component argument gets a +// real `calls` edge (a render IS a call — same reasoning as the React JSX +// child edges), not just an argument-position reference. Gated on `$` +// actually resolving to uix.core / helix.core in the file's require table — +// `$` is too short a name to shape-match. 
+ +const UIX_CORE_NAMESPACES = new Set(['uix.core', 'helix.core']); +const UIX_COMPONENT_MACROS = new Set(['defui', 'defnc']); + +/** `($ ui/button {...} child)` — emit a calls ref to the component (sym args only; `:div`/`:<>` keywords are DOM tags). */ +function emitUixElementRef(kids: SyntaxNode[], ctx: ExtractorContext, state: NsState): void { + const comp = kids[1]; + if (!comp || comp.type !== 'sym_lit') return; + const { ns, name } = symParts(comp, ctx.source); + if (!name || isLocal(name, state)) return; + if (ns) { + emitRef(ctx, comp, qualifiedRefName(ns, name, state), 'calls'); + return; + } + const referNs = state.refers.get(name); + emitRef(ctx, comp, referNs ? `${referNs}::${name}` : name, 'calls'); +} + +/** + * Expand a keyword literal to its canonical `:full.ns/name` string: + * `:todo/add` as written, `::add` → `:<current-ns>/add`, `::subs/items` → + * `:<aliased-ns>/items`. The `::` auto-resolve marker only exists in + * the raw text — the grammar's ns/name fields don't carry it. + */ +function expandKeyword(kwd: SyntaxNode, ctx: ExtractorContext, state: NsState): string { + const raw = getNodeText(kwd, ctx.source); + const nsNode = getChildByField(kwd, 'namespace'); + const nameNode = getChildByField(kwd, 'name'); + const name = nameNode ? getNodeText(nameNode, ctx.source) : raw.replace(/^:+/, ''); + const ns = nsNode ? getNodeText(nsNode, ctx.source) : undefined; + if (raw.startsWith('::')) { + if (ns) return `:${state.aliases.get(ns) ?? ns}/${name}`; + return state.nsName ? `:${state.nsName}/${name}` : `:${name}`; + } + return ns ? `:${ns}/${name}` : `:${name}`; +} + +/** + * `(reg-event-db :todo/add (fn [db v] ...))` — the registration becomes a + * function node named by the keyword, and the handler body walks under it so + * its calls attribute to the event, not the file.
+ */ +function handleReframeRegistration( + list: SyntaxNode, + kids: SyntaxNode[], + ctx: ExtractorContext, + state: NsState, + regName: string +): void { + const kwd = kids[1]; + if (!kwd || kwd.type !== 'kwd_lit') { + // Dynamic registration key — nothing to name; walk for calls only. + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + const keyword = expandKeyword(kwd, ctx, state); + const regNode = ctx.createNode('function', keyword, list, { + signature: `(${regName} ${getNodeText(kwd, ctx.source)})`, + isExported: true, + }); + if (regNode) { + ctx.pushScope(regNode.id); + for (const kid of kids.slice(2)) walkForm(kid, ctx); + ctx.popScope(); + } else { + for (const kid of kids.slice(2)) walkForm(kid, ctx); + } +} + +/** + * `(dispatch [:todo/add x])` / `(subscribe [:todo/items])` — emit a `calls` + * reference named by the literal event keyword so it links to the + * registration node. Variable event vectors (`(dispatch evt)`) stay + * unlinked — the anonymous frontier. + */ +function emitReframeDispatchRef(kids: SyntaxNode[], ctx: ExtractorContext, state: NsState): void { + const vec = kids[1]; + if (!vec || vec.type !== 'vec_lit') return; + const kwd = valueChildren(vec)[0]; + if (!kwd || kwd.type !== 'kwd_lit') return; + emitRef(ctx, vec, expandKeyword(kwd, ctx, state), 'calls'); +} + +/** Dispatch a list form by its head symbol. */ +function handleList(list: SyntaxNode, ctx: ExtractorContext): void { + const kids = valueChildren(list); + const head = kids[0]; + if (!head) return; + + // `((make-handler) req)` or `(:kwd m)` — no callable name; walk everything. 
+ if (head.type !== 'sym_lit') { + for (const kid of kids) walkForm(kid, ctx); + return; + } + + const state = getNsState(list); + const { ns, name } = symParts(head, ctx.source); + + if (!ns) { + switch (name) { + case 'ns': + handleNs(list, kids, ctx, state); + return; + case 'comment': // rich-comment block — never code that runs + case 'quote': + case 'declare': + return; + case 'defn': + case 'defn-': + case 'defmacro': + handleDefn(list, kids, ctx, name === 'defn-'); + return; + case 'def': + case 'defonce': + handleDef(list, kids, ctx); + return; + case 'defprotocol': + handleProtocol(list, kids, ctx, 'protocol'); + return; + case 'definterface': + handleProtocol(list, kids, ctx, 'interface'); + return; + case 'defrecord': + case 'deftype': + handleRecord(list, kids, ctx, name === 'defrecord'); + return; + case 'defmulti': + handleDefmulti(list, kids, ctx); + return; + case 'defmethod': + handleDefmethod(list, kids, ctx); + return; + case 'reify': + case 'proxy': + case 'extend-protocol': + case 'extend-type': + case 'extend': + case 'specify': + case 'specify!': + handleInlineImpl(kids, ctx); + return; + case 'letfn': + handleLetfn(kids, ctx, state); + return; + case 'new': { + // (new Foo args) + const cls = kids[1]; + if (cls?.type === 'sym_lit') { + emitRef(ctx, list, symParts(cls, ctx.source).name, 'instantiates'); + } + for (const kid of kids.slice(2)) walkForm(kid, ctx); + return; + } + } + + // (.method obj args) — interop / protocol method call by bare name. + // (.-property obj) is a ClojureScript property READ, not a call. + if (name.startsWith('.') && name.length > 1 && name !== '..') { + const isPropertyAccess = name.startsWith('.-'); + emitRef(ctx, list, name.replace(/^\.-?/, ''), isPropertyAccess ? 'references' : 'calls'); + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + // (Foo. args) — constructor call. 
+ if (name.endsWith('.') && name.length > 1) { + emitRef(ctx, list, name.slice(0, -1), 'instantiates'); + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + if (CORE_FORMS.has(name)) { + // Binding forms: names join a locals frame, init exprs + body walked. + if (BINDING_FORMS.has(name) && kids[1]?.type === 'vec_lit') { + handleBindingForm(kids, ctx, state); + return; + } + // fn literals: self-name + param vectors are bindings, not usages. + if (name === 'fn' || name === 'fn*') { + walkFnForm(kids, ctx, state); + return; + } + // (as-> expr name forms...) / (catch ExClass e body) — kids[2] is a + // binding name scoped over the remaining forms. + if (name === 'as->' || name === 'catch') { + if (kids[1]) walkForm(kids[1]!, ctx); + const frame = new Set<string>(); + if (kids[2]) collectBindingNames(kids[2]!, ctx.source, frame); + state.locals.push(frame); + for (const kid of kids.slice(3)) walkForm(kid, ctx); + state.locals.pop(); + return; + } + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + + // Library def-macros: `(defroutes app-routes ...)`, `(deftest x ...)`, + // `(defstate db ...)` — anything def-shaped whose first arg is a symbol + // defines that symbol. Without this, the var never becomes a node and + // every call inside the body attributes to the file instead. + // UIx `defui` / helix `defnc` define React components — kind 'component' + // (same modeling as Svelte/Vue components). + if (/^def[a-z-]*$/.test(name) && name !== 'default' && name !== 'defer') { + const defSym = kids[1]; + if (defSym?.type === 'sym_lit') { + const kind = UIX_COMPONENT_MACROS.has(name) ? 
'component' : 'function'; + const defNode = ctx.createNode(kind, symParts(defSym, ctx.source).name, list, { + signature: `(${name} ...)`, + isExported: !hasPrivateMeta(defSym, ctx.source), + }); + if (defNode) { + ctx.pushScope(defNode.id); + for (const kid of kids.slice(2)) walkForm(kid, ctx); + ctx.popScope(); + return; + } + } + } + + // A locally-bound head — `(let [helper (mk)] (helper 1))` — calls the + // LOCAL, not the same-named var; the target is unknowable statically. + if (isLocal(name, state)) { + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + + // re-frame shapes (see the block comment above RE_FRAME_REG_SHAPE). + if (RE_FRAME_REG_SHAPE.test(name) && kids[1]?.type === 'kwd_lit' && kids.length >= 3) { + // The registrar itself is still called — keep its ordinary call ref so + // "who calls reg-sub" / impact on a project facade sees every + // registration site. + const regReferNs = state.refers.get(name); + emitRef(ctx, list, regReferNs ? `${regReferNs}::${name}` : name, 'calls'); + handleReframeRegistration(list, kids, ctx, state, name); + return; + } + if (RE_FRAME_DISPATCH_FORMS.has(name)) { + emitReframeDispatchRef(kids, ctx, state); + // fall through — the dispatch call itself is still a call + } + + // UIx/helix element macro: `($ button {...})` with `$` :refer'd. + if (name === '$' && UIX_CORE_NAMESPACES.has(state.refers.get('$') ?? '')) { + emitUixElementRef(kids, ctx, state); + for (const kid of kids.slice(kids[1]?.type === 'sym_lit' ? 2 : 1)) walkForm(kid, ctx); + return; + } + + // Plain call. Prefer the :refer'd qualified form when known. + const referNs = state.refers.get(name); + emitRef(ctx, list, referNs ? `${referNs}::${name}` : name, 'calls'); + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + + // re-frame via an alias — `(rf/reg-event-db :k ...)`, `(rf/dispatch [:k x])` + // — including project facades (`utils.re-frame`); see RE_FRAME_REG_SHAPE. 
+ if (RE_FRAME_REG_SHAPE.test(name) && kids[1]?.type === 'kwd_lit' && kids.length >= 3) { + // Keep the ordinary registrar call ref (callers/impact on the facade). + emitRef(ctx, list, qualifiedRefName(ns, name, state), 'calls'); + handleReframeRegistration(list, kids, ctx, state, name); + return; + } + if (RE_FRAME_DISPATCH_FORMS.has(name)) { + emitReframeDispatchRef(kids, ctx, state); + // fall through — the dispatch call itself is still a call + } + + // UIx/helix element macro via alias: `(uix/$ button {...})`. + if (name === '$' && UIX_CORE_NAMESPACES.has(state.aliases.get(ns) ?? ns)) { + emitUixElementRef(kids, ctx, state); + for (const kid of kids.slice(kids[1]?.type === 'sym_lit' ? 2 : 1)) walkForm(kid, ctx); + return; + } + + // Qualified def-macros: `(rum/defc page [args] ...)`, `(m/defstate db ...)` + // — same def-shape heuristic as the unqualified branch. Without this, every + // rum/uix/fulcro component in a ClojureScript app is invisible. + if (/^def[a-z-]*$/.test(name) && kids[1]?.type === 'sym_lit') { + const kind = UIX_COMPONENT_MACROS.has(name) ? 'component' : 'function'; + const defNode = ctx.createNode(kind, symParts(kids[1]!, ctx.source).name, list, { + signature: `(${ns}/${name} ...)`, + isExported: !hasPrivateMeta(kids[1]!, ctx.source), + }); + if (defNode) { + ctx.pushScope(defNode.id); + for (const kid of kids.slice(2)) walkForm(kid, ctx); + ctx.popScope(); + return; + } + } + + // Namespaced head: aliased / fully-qualified / interop call. + emitRef(ctx, list, qualifiedRefName(ns, name, state), 'calls'); + for (const kid of kids.slice(1)) walkForm(kid, ctx); +} + +/** + * `(ns my.app.core (:require ...) (:import ...))` — create the module node, + * scope the rest of the file under it, record alias/refer tables, and create + * import nodes + `imports` refs for required namespaces. 
+ */ +function handleNs( + list: SyntaxNode, + kids: SyntaxNode[], + ctx: ExtractorContext, + state: NsState +): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + const nsName = symParts(nameSym, ctx.source).name; + state.nsName = nsName; + + const docNode = kids[2]?.type === 'str_lit' ? kids[2] : undefined; + const moduleNode = ctx.createNode('module', nsName, list, { + signature: `(ns ${nsName})`, + docstring: docNode ? stringContent(docNode, ctx.source) : undefined, + endLine: list.endPosition.row + 1, + }); + if (!moduleNode) return; + // Deliberately never popped — the whole file's top-level defs live in this + // namespace, giving them qualifiedName `my.app.core::sym` (same pattern as + // the JVM package_header namespace wrapper). A (rare) second `ns` form in + // the same file nests its module inside the first, so later defs carry both + // namespaces in their qualifiedName — accepted: still searchable, and + // multi-ns files are vanishingly rare outside generated code. + ctx.pushScope(moduleNode.id); + + for (const clause of kids.slice(2)) { + if (clause.type !== 'list_lit') continue; + const clauseKids = valueChildren(clause); + const kwd = clauseKids[0]; + if (!kwd || kwd.type !== 'kwd_lit') continue; + const kwdName = getNodeText(kwd, ctx.source).replace(/^:+/, ''); + + if (kwdName === 'require' || kwdName === 'use' || kwdName === 'require-macros') { + for (const entry of clauseKids.slice(1)) parseRequireEntry(entry, '', ctx, state); + } else if (kwdName === 'import') { + for (const entry of clauseKids.slice(1)) parseImportEntry(entry, ctx); + } + } +} + +/** One `:require` entry: `[my.app.db :as db :refer [save!]]`, a bare sym, a prefix list, or a reader conditional. 
*/ +function parseRequireEntry( + entry: SyntaxNode, + prefix: string, + ctx: ExtractorContext, + state: NsState +): void { + if (entry.type === 'read_cond_lit' || entry.type === 'splicing_read_cond_lit') { + for (const child of valueChildren(entry)) { + if (child.type !== 'kwd_lit') parseRequireEntry(child, prefix, ctx, state); + } + return; + } + if (entry.type === 'sym_lit') { + const base = symParts(entry, ctx.source).name; + createRequire(prefix ? `${prefix}.${base}` : base, entry, ctx); + return; + } + if (entry.type !== 'vec_lit' && entry.type !== 'list_lit') return; + + const kids = valueChildren(entry); + const first = kids[0]; + if (!first) return; + let base: string | null = null; + if (first.type === 'sym_lit') base = symParts(first, ctx.source).name; + else if (first.type === 'str_lit') base = stringContent(first, ctx.source); // shadow-cljs npm require + if (!base) return; + const full = prefix ? `${prefix}.${base}` : base; + + // Prefix form: `(my.app [db :as db] core)` — sub-entries are vecs/syms/lists. 
+ const subEntries = kids + .slice(1) + .filter((k) => k.type === 'vec_lit' || k.type === 'list_lit' || k.type === 'sym_lit'); + const hasOptions = kids.some((k) => k.type === 'kwd_lit'); + if (subEntries.length > 0 && !hasOptions) { + for (const sub of subEntries) parseRequireEntry(sub, full, ctx, state); + return; + } + + createRequire(full, entry, ctx); + + for (let i = 1; i < kids.length - 1; i++) { + const k = kids[i]!; + if (k.type !== 'kwd_lit') continue; + const opt = getNodeText(k, ctx.source).replace(/^:+/, ''); + const value = kids[i + 1]; + if (!value) continue; + if ((opt === 'as' || opt === 'as-alias') && value.type === 'sym_lit') { + state.aliases.set(symParts(value, ctx.source).name, full); + } else if (opt === 'refer' && value.type === 'vec_lit') { + for (const refSym of valueChildren(value)) { + if (refSym.type === 'sym_lit') { + state.refers.set(symParts(refSym, ctx.source).name, full); + } + } + } + } +} + +function createRequire(nsName: string, node: SyntaxNode, ctx: ExtractorContext): void { + ctx.createNode('import', nsName, node, { + signature: getNodeText(node, ctx.source).trim(), + }); + emitRef(ctx, node, nsName, 'imports'); +} + +/** One `:import` entry: `(java.time Instant Duration)` or `java.util.Date`. External — import nodes only, no refs. 
*/ +function parseImportEntry(entry: SyntaxNode, ctx: ExtractorContext): void { + if (entry.type === 'sym_lit') { + ctx.createNode('import', symParts(entry, ctx.source).name, entry, { + signature: getNodeText(entry, ctx.source).trim(), + }); + return; + } + if (entry.type !== 'list_lit' && entry.type !== 'vec_lit') return; + const kids = valueChildren(entry); + const pkg = kids[0]; + if (!pkg || pkg.type !== 'sym_lit') return; + const pkgName = symParts(pkg, ctx.source).name; + for (const cls of kids.slice(1)) { + if (cls.type === 'sym_lit') { + ctx.createNode('import', `${pkgName}.${symParts(cls, ctx.source).name}`, cls, { + signature: getNodeText(entry, ctx.source).trim(), + }); + } + } +} + +/** `(defn name docstring? attr-map? [params] body)` or multi-arity `(defn name ([a] ...) ([a b] ...))`. */ +function handleDefn( + list: SyntaxNode, + kids: SyntaxNode[], + ctx: ExtractorContext, + privateForm: boolean +): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') { + for (const kid of kids.slice(1)) walkForm(kid, ctx); + return; + } + const name = symParts(nameSym, ctx.source).name; + const isPrivate = privateForm || hasPrivateMeta(nameSym, ctx.source); + + let docstring: string | undefined; + const arities: { params: SyntaxNode; body: SyntaxNode[] }[] = []; + + let i = 2; + if (kids[i]?.type === 'str_lit') { + docstring = stringContent(kids[i]!, ctx.source); + i++; + } + if (kids[i]?.type === 'map_lit') i++; // attr-map + + if (kids[i]?.type === 'vec_lit') { + arities.push({ params: kids[i]!, body: kids.slice(i + 1) }); + } else { + // Multi-arity: each remaining list is ([params] body...) 
+ for (const arity of kids.slice(i)) { + if (arity.type !== 'list_lit') continue; + const arityKids = valueChildren(arity); + if (arityKids[0]?.type === 'vec_lit') { + arities.push({ params: arityKids[0]!, body: arityKids.slice(1) }); + } + } + } + + const fnNode = ctx.createNode('function', name, list, { + signature: arities.map((a) => getNodeText(a.params, ctx.source)).join(' ') || undefined, + docstring, + visibility: isPrivate ? 'private' : 'public', + isExported: !isPrivate, + }); + if (!fnNode) return; + const state = getNsState(list); + ctx.pushScope(fnNode.id); + for (const a of arities) walkBodyWithParams(a.params, a.body, ctx, state); + ctx.popScope(); +} + +/** `(def name value)` / `(defonce name value)` — var, constant, or function-valued def. */ +function handleDef(list: SyntaxNode, kids: SyntaxNode[], ctx: ExtractorContext): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + const name = symParts(nameSym, ctx.source).name; + const isPrivate = hasPrivateMeta(nameSym, ctx.source); + + let docstring: string | undefined; + let valueIdx = 2; + if (kids.length > 3 && kids[2]?.type === 'str_lit') { + docstring = stringContent(kids[2]!, ctx.source); + valueIdx = 3; + } + const value = kids[valueIdx]; + + // (def handler (fn [req] ...)) / (def handler #(...)) — a function in disguise. + if (value) { + const isFnList = + value.type === 'list_lit' && + (() => { + const h = valueChildren(value)[0]; + if (!h || h.type !== 'sym_lit') return false; + const hn = symParts(h, ctx.source).name; + return hn === 'fn' || hn === 'fn*'; + })(); + if (isFnList || value.type === 'anon_fn_lit') { + const fnNode = ctx.createNode('function', name, list, { + docstring, + visibility: isPrivate ? 'private' : 'public', + isExported: !isPrivate, + }); + if (fnNode) { + ctx.pushScope(fnNode.id); + walkForm(value, ctx); + ctx.popScope(); + } + return; + } + } + + const kind = value && LITERAL_TYPES.has(value.type) ? 
'constant' : 'variable'; + const defNode = ctx.createNode(kind, name, list, { + docstring, + visibility: isPrivate ? 'private' : 'public', + isExported: !isPrivate, + }); + if (defNode && value) { + ctx.pushScope(defNode.id); + walkForm(value, ctx); + ctx.popScope(); + } +} + +/** `(defprotocol Storage (put [this k v]) (fetch [this k]))` / `definterface`. */ +function handleProtocol( + list: SyntaxNode, + kids: SyntaxNode[], + ctx: ExtractorContext, + kind: 'protocol' | 'interface' +): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + const name = symParts(nameSym, ctx.source).name; + const docNode = kids[2]?.type === 'str_lit' ? kids[2] : undefined; + + const protoNode = ctx.createNode(kind, name, list, { + docstring: docNode ? stringContent(docNode, ctx.source) : undefined, + isExported: true, + }); + if (!protoNode) return; + ctx.pushScope(protoNode.id); + for (const sig of kids.slice(2)) { + if (sig.type !== 'list_lit') continue; + const sigKids = valueChildren(sig); + const mSym = sigKids[0]; + if (!mSym || mSym.type !== 'sym_lit') continue; + const params = sigKids + .filter((k) => k.type === 'vec_lit') + .map((k) => getNodeText(k, ctx.source)) + .join(' '); + const mDoc = sigKids.find((k) => k.type === 'str_lit'); + ctx.createNode('method', symParts(mSym, ctx.source).name, sig, { + signature: params || undefined, + docstring: mDoc ? stringContent(mDoc, ctx.source) : undefined, + }); + } + ctx.popScope(); +} + +/** `(defrecord MemStore [state] Storage (put [_ k v] ...))` / `deftype`. */ +function handleRecord( + list: SyntaxNode, + kids: SyntaxNode[], + ctx: ExtractorContext, + isRecord: boolean +): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + const name = symParts(nameSym, ctx.source).name; + const fieldsVec = kids[2]?.type === 'vec_lit' ? kids[2] : undefined; + + const classNode = ctx.createNode('class', name, list, { + signature: fieldsVec ? 
getNodeText(fieldsVec, ctx.source) : undefined, + isExported: true, + }); + if (!classNode) return; + + // defrecord implicitly defines positional + map constructors; creating them + // as function nodes lets `(->MemStore ...)` call sites resolve by name. + if (isRecord) { + ctx.createNode('function', `->${name}`, list, { + signature: fieldsVec ? getNodeText(fieldsVec, ctx.source) : undefined, + isExported: true, + }); + ctx.createNode('function', `map->${name}`, list, { isExported: true }); + } + + ctx.pushScope(classNode.id); + if (fieldsVec) { + for (const f of valueChildren(fieldsVec)) { + if (f.type === 'sym_lit') ctx.createNode('field', symParts(f, ctx.source).name, f); + } + } + for (const member of kids.slice(fieldsVec ? 3 : 2)) { + if (member.type === 'sym_lit') { + // Protocol / interface being implemented + emitRef(ctx, member, symParts(member, ctx.source).name, 'implements'); + } else if (member.type === 'list_lit') { + handleMethodImpl(member, ctx); + } + } + ctx.popScope(); +} + +/** `(put [_ k v] (swap! state assoc k v))` inside defrecord/deftype — a method node + body walk. */ +function handleMethodImpl(impl: SyntaxNode, ctx: ExtractorContext): void { + const kids = valueChildren(impl); + const mSym = kids[0]; + if (!mSym || mSym.type !== 'sym_lit') return; + const paramsVec = kids[1]?.type === 'vec_lit' ? kids[1] : undefined; + const mNode = ctx.createNode('method', symParts(mSym, ctx.source).name, impl, { + signature: paramsVec ? getNodeText(paramsVec, ctx.source) : undefined, + }); + if (!mNode) return; + ctx.pushScope(mNode.id); + walkBodyWithParams(paramsVec, kids.slice(paramsVec ? 2 : 1), ctx, getNsState(impl)); + ctx.popScope(); +} + +/** `(defmulti render :type)` — the dispatch entry point. 
*/ +function handleDefmulti(list: SyntaxNode, kids: SyntaxNode[], ctx: ExtractorContext): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + let docstring: string | undefined; + let dispatchIdx = 2; + if (kids[2]?.type === 'str_lit') { + docstring = stringContent(kids[2]!, ctx.source); + dispatchIdx = 3; + } + const dispatch = kids[dispatchIdx]; + const fnNode = ctx.createNode('function', symParts(nameSym, ctx.source).name, list, { + signature: dispatch ? getNodeText(dispatch, ctx.source) : undefined, + docstring, + isExported: true, + }); + if (fnNode && dispatch) { + ctx.pushScope(fnNode.id); + walkForm(dispatch, ctx); + ctx.popScope(); + } +} + +/** + * `(defmethod render :button [w] ...)` — an implementation of the multimethod + * (same name → overload). For a foreign multimethod — `(defmethod ig/init-key + * ::server ...)` — the node is named by the method name (`init-key`) with the + * dispatch value in the signature: slightly mislabeled (the local file doesn't + * own `init-key`), but deliberately kept — it's exactly what a search for the + * integrant/multimethod key needs to find. + */ +function handleDefmethod(list: SyntaxNode, kids: SyntaxNode[], ctx: ExtractorContext): void { + const nameSym = kids[1]; + if (!nameSym || nameSym.type !== 'sym_lit') return; + const dispatchVal = kids[2]; + const paramsIdx = kids.findIndex((k, idx) => idx >= 3 && k.type === 'vec_lit'); + const params = paramsIdx >= 0 ? getNodeText(kids[paramsIdx]!, ctx.source) : ''; + const fnNode = ctx.createNode('function', symParts(nameSym, ctx.source).name, list, { + signature: `${dispatchVal ? getNodeText(dispatchVal, ctx.source) : ''} ${params}`.trim() || undefined, + }); + if (!fnNode) return; + ctx.pushScope(fnNode.id); + walkBodyWithParams( + paramsIdx >= 0 ? kids[paramsIdx] : undefined, + kids.slice(paramsIdx >= 0 ? 
paramsIdx + 1 : 3), + ctx, + getNsState(list) + ); + ctx.popScope(); +} + +/** + * `reify` / `proxy` / `extend-protocol` / `extend-type` bodies: inline method + * impls `(method [args] body)` are anonymous — no nodes created, but their + * bodies are walked so calls attribute to the enclosing function. Other + * children walk normally. + */ +function handleInlineImpl(kids: SyntaxNode[], ctx: ExtractorContext): void { + for (const kid of kids.slice(1)) { + if (kid.type === 'list_lit') { + const implKids = valueChildren(kid); + if (implKids[0]?.type === 'sym_lit' && implKids[1]?.type === 'vec_lit') { + walkBodyWithParams(implKids[1], implKids.slice(2), ctx, getNsState(kid)); + continue; + } + } + walkForm(kid, ctx); + } +} + +// --------------------------------------------------------------------------- +// EDN data mode (.edn files: deps.edn, bb.edn, shadow-cljs.edn, system configs) +// --------------------------------------------------------------------------- + +/** + * Recursively collect qualified-symbol references (`app.core/init` in a + * shadow-cljs `:main`, integrant component maps, …) from an EDN value. + * Bare symbols are data, not code — only namespaced symbols are precise + * enough to reference. + */ +function scanEdnValueForRefs(node: SyntaxNode, ctx: ExtractorContext): void { + if (node.type === 'sym_lit') { + const { ns, name } = symParts(node, ctx.source); + if (ns) { + // `app.core/init` → app.core::init; single-segment ns (rare in EDN) is + // still emitted — there is no alias table in a data file to consult. + emitRef(ctx, node, `${ns}::${name}`, 'references'); + } + return; + } + for (const child of valueChildren(node)) scanEdnValueForRefs(child, ctx); +} + +/** + * EDN is data: never emit `calls`, never interpret list heads. Top-level map + * keys become `property` nodes (one level only, so multi-megabyte fixture + * files can't explode the graph), and every qualified symbol in their values + * becomes a `references` edge to the code it names. 
+ */ +function handleEdnTopLevel(node: SyntaxNode, ctx: ExtractorContext): void { + if (node.type === 'map_lit' || node.type === 'ns_map_lit') { + const kids = valueChildren(node); + // A config map (deps.edn, shadow-cljs.edn, system.edn) has dozens of keys + // at most. More than 64 keys means a dataset (translation dicts, icon + // tables) — extracting those as property nodes explodes the graph with + // pure data (measured: logseq's locale dicts alone added 40k nodes). + // Skip the nodes entirely but still scan for code references. + if (kids.length / 2 > 64) { + scanEdnValueForRefs(node, ctx); + return; + } + for (let i = 0; i + 1 < kids.length; i += 2) { + const key = kids[i]!; + const value = kids[i + 1]!; + if (key.type !== 'kwd_lit') { + scanEdnValueForRefs(key, ctx); + scanEdnValueForRefs(value, ctx); + continue; + } + const keyText = getNodeText(key, ctx.source); + const valuePreview = getNodeText(value, ctx.source).replace(/\s+/g, ' '); + const prop = ctx.createNode('property', keyText, key, { + signature: valuePreview.length > 80 ? `${valuePreview.slice(0, 77)}...` : valuePreview, + }); + if (prop) { + ctx.pushScope(prop.id); + scanEdnValueForRefs(value, ctx); + ctx.popScope(); + } else { + scanEdnValueForRefs(value, ctx); + } + } + return; + } + // Top-level vector/list/etc. (fixture data) — refs only, no nodes. + scanEdnValueForRefs(node, ctx); +} + +/** `(letfn [(f [x] ...) (g [y] ...)] body)` — local fn bindings, then body. */ +function handleLetfn(kids: SyntaxNode[], ctx: ExtractorContext, state: NsState): void { + const frame = new Set<string>(); + state.locals.push(frame); + const bindings = kids[1]; + if (bindings?.type === 'vec_lit') { + // The local fn NAMES are in scope in every binding body and the letfn + // body (mutual recursion), so collect them all before walking anything. 
+ for (const binding of valueChildren(bindings)) { + if (binding.type !== 'list_lit') continue; + const bSym = valueChildren(binding)[0]; + if (bSym?.type === 'sym_lit') collectBindingNames(bSym, ctx.source, frame); + } + for (const binding of valueChildren(bindings)) { + if (binding.type !== 'list_lit') continue; + const bKids = valueChildren(binding); + const paramsVec = bKids[1]?.type === 'vec_lit' ? bKids[1] : undefined; + walkBodyWithParams(paramsVec, bKids.slice(paramsVec ? 2 : 1), ctx, state); + } + } + for (const form of kids.slice(2)) walkForm(form, ctx); + state.locals.pop(); +} + +export const clojureExtractor: LanguageExtractor = { + // The grammar has no semantic node types — everything routes through the + // visitNode hook below; the core's declarative dispatch never fires. + functionTypes: [], + classTypes: [], + methodTypes: [], + interfaceTypes: [], + structTypes: [], + enumTypes: [], + typeAliasTypes: [], + importTypes: [], + callTypes: [], + variableTypes: [], + nameField: 'name', + bodyField: 'body', + paramsField: 'params', + + visitNode: (node, ctx) => { + const t = node.type; + + // .edn files are pure data — property nodes + references, never calls. + if (ctx.filePath.endsWith('.edn')) { + if (t === 'source') return false; // walk top-level forms + if (t === 'comment' || t === 'dis_expr') return true; + handleEdnTopLevel(node, ctx); + return true; + } + + if (t === 'list_lit') { + handleList(node, ctx); + return true; + } + // Discarded (`#_form`) and quoted data are not code. + if (t === 'dis_expr' || t === 'quoting_lit') return true; + // Everything else (source root, top-level vecs/maps, reader conditionals) + // returns false so the core walks children and this hook sees the lists. 
+ return false; + }, +}; diff --git a/src/extraction/languages/index.ts b/src/extraction/languages/index.ts index 543598b8e..85d11d65f 100644 --- a/src/extraction/languages/index.ts +++ b/src/extraction/languages/index.ts @@ -26,6 +26,7 @@ import { scalaExtractor } from './scala'; import { luaExtractor } from './lua'; import { luauExtractor } from './luau'; import { objcExtractor } from './objc'; +import { clojureExtractor } from './clojure'; export const EXTRACTORS: Partial> = { typescript: typescriptExtractor, @@ -49,4 +50,5 @@ export const EXTRACTORS: Partial> = { lua: luaExtractor, luau: luauExtractor, objc: objcExtractor, + clojure: clojureExtractor, }; diff --git a/src/extraction/wasm/README.md b/src/extraction/wasm/README.md new file mode 100644 index 000000000..16b0831bc --- /dev/null +++ b/src/extraction/wasm/README.md @@ -0,0 +1,38 @@ +# Vendored tree-sitter grammar wasm builds + +Grammars in this directory are vendored because `tree-sitter-wasms` either +doesn't ship them or ships a build with an ABI too old for our `web-tree-sitter` +(old-ABI wasms corrupt the shared WASM heap — see the Lua note in +`../grammars.ts`). Every vendored grammar must be listed in the vendored-path +branch of `loadGrammarsForLanguages` in `../grammars.ts`, and `copy-assets` +(run by `npm run build`) ships `*.wasm` from here into `dist/`. + +**Reproducibility:** each entry below records the exact source commit, +toolchain, and command used to produce the binary. When bumping a grammar, +verify it with `node scripts/add-lang/check-grammar.mjs ` +(ABI print + repeated-parse heap-corruption check) and update its entry. 
+ +## tree-sitter-clojure.wasm + +- **Source:** https://github.com/sogaiu/tree-sitter-clojure + commit `e43eff80d17cf34852dcd92ca5e6986d23a7040f` (master, 2025-08-26) +- **ABI:** 14 +- **Toolchain:** tree-sitter CLI `0.26.9` (`npx --yes tree-sitter-cli`), which + downloads its own wasi-sdk; no Docker/emscripten required +- **Command:** + ```bash + git clone https://github.com/sogaiu/tree-sitter-clojure + cd tree-sitter-clojure && git checkout e43eff80d17cf34852dcd92ca5e6986d23a7040f + npx --yes tree-sitter-cli build --wasm # → tree-sitter-clojure.wasm + ``` +- **Why vendored:** no Clojure grammar in `tree-sitter-wasms`, and upstream + publishes no prebuilt wasm (the npm `tree-sitter-clojure` package is the + unmaintained oakmac grammar at ABI 9, which doesn't load in modern + web-tree-sitter). + +## tree-sitter-pascal.wasm · tree-sitter-scala.wasm · tree-sitter-lua.wasm · tree-sitter-luau.wasm + +Vendored before this README existed; provenance not recorded at the time. +Lua is the upstream ABI-15 build (the `tree-sitter-wasms` Lua is ABI 13 and +fails the heap-corruption check — see `../grammars.ts`). When any of these is +next bumped, record its full recipe here in the format above. 
diff --git a/src/extraction/wasm/tree-sitter-clojure.wasm b/src/extraction/wasm/tree-sitter-clojure.wasm new file mode 100755 index 0000000000000000000000000000000000000000..de50480c4ade97a99dc1d9b1396977585c7745bd GIT binary patch literal 101206 zcmeEvd7Kr+)pk|i83vdhHW3jNW25#;=UWA#w|v_=bTejx4Z7`JNMqNs_ z-BsOJoqEB!TH&vG!_$8|>g>_ybnl@XY?)hDrj_2I!6|y8o`{_~C76hv!aro3>xsb$ zWvLOPFKTe^h;z>y``bVTByB%z*x92`KYQ$m)5Z-uPnqze9z>dJW!QaSE=;s z*<~Ho$c`#KZ;c+=Gf?R}t%qW9jTN^{m+pTvf&%k64fObH!u5_1HXD zbUa}_=3OlwPg{>I*Nevn>+#YJ;_;I8cz1z#ykb4J%#d6*Iu8@$P4UoaDaq8UOi9P$8*y~!*$l|dTchZR$GsIOp-gT$Ll7S zHP+)R6XQPX@lWHi&U*aCD1F3wyl8-}w;uBijwh_gRmS6K>+!P5c7yd;ZE|_ZdTcTt zuUL<_jK@an@xH`#oxRO|7jNix%Vd}2IiTaS-Sl6lr+qls~y^?2WSEU+Hm zo5C%!9uJuqORUF_hV>QJW3%yCZ9S%%BzIbm>9&4YkNb?V@3S8NF&^uz$9j|G5$o}w z@mOy??lm4ySdW{Hj;F214aQ@G_4veiyktGzGaj#4k2{Rgjn?B<+~GL*e(6+*G<1*5e-IvD$iE zX*}+<9^aVKuCX3BnHW!4k7Xvt)7Ik?!`KGv@wM@I$$I?NBzeVp+;2QKT93z!$D7vU zMdPvAdi-LPzHdE#wr$^fEHFtvw;qecqq3>$I0=6$J!$sTNor)}*I#K}HnCdQ1Zh>1 zN=>X*odZ>)s}-HJZq9LeS5t-vRyAqb zta*!VT5j8FyVl#cX}iOY?W=1#blhp@PP=sOx@))Hy6@g&k6!z~rw6{T5C6>(ALt{G zKK8idPa1a0@H5XEdG6@*FTCjDOMW~4cb8r^Vd5k`Sx?cI>#2I0o~~!;nfeMnN6*#s z^ws(reXYJuU$5uu8}tHwqh6?Q(u?%Xda=GmFVRc&GQC`{(6{Q9dX-+SZ_~HyJM^9U z_xdhI;h ztrJH!3Hu*5U|_|d3i>OnpsaOz8vM)d3@Q#^g+VRkFI8<-8IIqmepL<9n@H6|L1p?H z^djLOaIz1M>u~&tLLS7)J9P2@PX10O_Xnh8J(Mgp0EUcAHyVNh?1xYkuv#~pN}W{I z+MrtZty&YL-=%c-W{QsV`dWAke8)s`o)gV=cUXy(oZJYB@l2F{pSA4oRi@ zr4Hp{aIsbfL9jFcw5tg{XP(}@wyL)z`Q;$w!ijgaCvAnRi$|u*`&Zo;gm-gkYWi2* z8-#c9dAHiCI{`dBX}5k=cLm`c>^Wn->R)vS?8DkNtycq>@>O)A5gFmNh>f!=*Q+6v z?oLybU*1*&Iqsy*h;nBTu3?35Q-V7v!HS6^(?NgGg!BKvBzM~=x05C$xZP;#&<`Ac zN>rj!A@ZBh@BuRKi|nZKg!h<;Qk|{?BaHTB@U}L%jkjL=Gk2iq0B9f#i-T~rNyiKy zv^rQF9K1UCb%x}n!76Nv;40JCLMu{|&bz^Vp!U_3>7_)zum{a40(bbxbY=glWfb>E z)X#Tx#)kVZ3zh}FmjwrL4*S#5Z=NZFPS*9gQf^DRjQ^Oku)MrnzX*bn7;4}Zv)&HqHymZxJ#pOmr&^lTxEI# z84pR|LD=@g5m@ISf@v|UA`sb%h+!e#Y#=@+5LX7_%?=`$VWBMp!>hVA2p5uJ<79ON zko5b61Jz3MpGyG$jhd$^0nTB=UlrDXz|)PK1Py3)5H5&TKqx{};u_GawrZudF>p(g 
zNxM-Gf>@|_>9YR7i*v{K0xFDfJ*mMMDh!k(!VGGNOKc5DXow3)6&fN{K$DK!8)wJu zaH;i2BO`s`d;=Z)sPv1(4{E!Rm#Hv(wo9}tVV{Ov5BpZ#9)!zLAG)XCqwcs#|F+O# zyMdbi9RltK(2AF#Bmu=n!$ZWb{i{}?KsY-Hy$YZ7!im+>t=8yC?SIrJK#&FkR3p|l#OhGOjmHx$Qa8+F`Gjy?U@4_ zUa)4XK(Hgq%dAD}wYhRHbJ}{ANPKQ{*`dC!a|bquBkcuAr*$Kj=Ce@VQjDTdW+; zK>Vkp@n=Tj<2)9BW)MzC{El3MMg5FObezYc&j`Y4h`w_q`t(S2oX4V155lR4{%jP@ zG#h8w7%oJc{#Db0@bV%WGop7TCefIu24NREMPt64aG)_?9)z9A3ypb-*O;kqMF<+R zO^0i+jZckvN`N}=8nZ)6jae`vH8tkR1vX}z4K-$)4>jf~gf-fjrv%|H1ezPO(Vn9* z+i2XFCzBFx%#(w#6JN35QAlryvXo^AX~#1kR9QBK3w_g+R{6 z5Xku?5y;oQx(VcHSp-(52{%>4^yjDnX5_ROHy-6v1C2*5=2qky_ypw*s3rHUn#6c! zq2*mVga&!kO2f%e@eip)bLjSK5^jRa5i`AoLVrQDFX`lGoP0wk|G~*W>EtJzd_gDw z#>v0v!qIIcH8b2qYG#RUBGbAi1wXlw)Zxp>S$t}SRQ&}4MnU0+ zax?+xagd?rQ4dw$4|MpRM1UR)3Hnz(5A~dZYxS#m;ARvcPHojoRs~n`4|Vvy%bDwQ zCJXALOdddHGUnmd`#OA=iyY)EvdIpW?KuX-khos)S?&RrBAu!nG zbm(0aJTU-HIj(BKRrmr*uqSu8=xshkmWV~YG@o>L-SrD}duKH7&V$vDjAnG;q?yK* z+!!&A-as_jLL=M|CE2_Uu0N1`(Epl@csR^i{zwC3=-Ui^DMr^Ly(i?KWQc5YrS6EL z(SriC>`Zo(jvSf(aI3t@^d)lQ+}=DxuIwEIcEM!I`z2H-a`A%7~xNMb=1N(iPFRtW>k<>qdEM9Ru|2Jni=OoeI{7q+5*!_2Fy#mP0^i5 z4IUqG|BQ&WRqsan%ps7%%?yGMCm^ifB;~I1S;QoD;oEf?r_g4FvPF%995dZv z|EfQ+anPZJe{y8PKyL|`vG!zOcB4+}30I)~O+mg? 
z4-xRDWCYrh+L!Q3&x@kj(h~!VnRq2*Jx*nb3~>A63De_jK5knkE7P__t!-QMFbYvk z5!pzOkv48#k0F2x#n8)YbU24?Q6dX*hWJ5xBZ{fgf2GXcM6UbtRcgYR^uD!K4-=2}3UkSrbdTDqzw7Yz zLZBc+0`i7d(7BP6Q!<@4h#ra|`h!KJW7)6E&VHi~X|$u#g)*^GS45dWP7x;7NxuBx z@-KuDx{W>L7V&-^uB$7%C!)l^N(1OjF&~I!cdsobW+@% zuqXtt=u{NJ{V@bj2To7aAdt!Ot z6(Rd2Dm0CWq)7wWwwaX3O$u>H|RQVOT9E&Y9jfzq7Xb8uh6Sv2$os|nF_rm znq9xD)nMQ;s&2S`0Lv$Q1cK zkDhpmDCX|@7K@)+#gHnn_a}n?XknUek&7FiIsKXY^;b;VNK35Qj(`gSuF!$u8l;d35q+?|CjR(1h1}DsI;D zeDR??Doma)z!ul)_*2C{SbLZ!kt|br`AOuEst0xWhq^TjIX7u}{#oZmCH5qddnwA% zK1m!>g$old3+Zx<49=ozU7mAb5a4b|HY6@=@K#k8E^+)d8G-@OEl$M1dNEa0+}`4u zH+vH*jNR6WDHF$T867-!Lt3Pxu^TS&@4>u=ej2+yM0Ar{+#n%1gw>1(joluk{AbX} zjYeQ}Uqnro?rW`;Xl`K<(vVudcq1AHZgVm7$GOs|?MBiP8@0{R;f);3!?t^+IBv|H z8M3)}JYHKsdN5wY^Ku<7;7ZUSry&rnrWTPR=J=7OXSwG36ZZ`VR$Wbsaa)rLuVGXj z`*S&e{^gfnD#oNc;~BrXvB9m&A_3c5Zs@@;!ax8NJO(;R5zRg}7*aLQ+*lApH%X!y z-WV8>FQ*~n5SIbbmBp6T#AGQXVx}lshN-zyZPHhuRU=>-Jx_c<4`g7u$-})gWK3j!H#hCqh_b@2wDKnaW!xE5=WXdARP!yo|P6Ju~qT@1Qxi9s%TtU((F z%mXK*EeC@x9y55oV{n$H>j-+*+zpzmgIju&`*9&X@_?Ev*^uZ;aef7#qaU3~8KED& zLWeWi51r@?DYAVUXahN`Uh*6cRZ>*o83okM;H2~*VtQ`6IIT>A({-8_DIs6FE545a z;AtOD3@Yj+9ULNc21SXHI@4y!)zajdgH#ufNuBOUouR|&fLuRPskI4b;1uQ)62K|v z15QB<<-!-HX~N6y(xUMk!3BcRX#|WZ4W~IKrxwH{U0slvn8aP3FnO=VLQJ}Rh)K>U z#w3WAESX7@XAUM^JTz;}=22GQv>)nEI5jVmZ25$TX;6Y|4ATV)kp-#36$*2AlmRG& z`pYeaQ-qM%gEPbemq>|T292Ap#i)@Dap*;31JT!KWV!-#lZIs2eu^V`vgiaQxt~mv z&S-6Ajz2;@pQ^)qSqoKNprkfR7kHBoJ*RdV&7cQploNB|$X!L(JRU9B91yX#3b)4G zSq#CMDLmI+#?PIqQ;er(LqK0RnKPG>-8C|5B;e*IGEc{m*Dl;Oh+QND*kw}2E|WM@ zbQcq)W3UWnsyNJY9nA8c!Z?;JLJx5n7u{X!w3r1es;TOuOcTkI`j82%FuzSA^`7tU z$tIJz?}id%lWUz#C=+86&V?<4Y!VG()U!d>ID@Rwp_G~p!b)k(WwGBz28B5$)X}r# zWu0q2Scjg=Ok=nd=+PFY>F`qW0p-7oQ*L_d{l}!c)=EO68Bx&|XatKjVc>?<;UO}Q z>8S0Z4qwd_5$WUelMXFZRG<=8iDCefQ6W|P-ysH345hy#luR*{CQ@v!0Zj4Tj^evb zA(>%D%2nP$WsWb`;rRM+Y zgMoXwJX1u4o@5k6nMS?HFnv)$OiLcXL`=I{MoeF9n1)iYU@A0}CNdqZ&@raDpdQl> zD)-@2ba+u^x)*LY?m$gR8kPVAu`O)GhI$CWg_JrZkPT(H=fOGHo@4rJG=F+<&z%SR 
zf$o4yk{RVm9NTm7vuLV3Aw4DI$~_(jy2yYVGnfSVvuriLES%*A&eWl-&;=P_VmO=i z6aN#8TB4uzQcY*Q$eBA7Qkv5MM81K}J<4hgiTXYs*R0eb9=}0iD~M|9&a7@W!lgX^ zOd_9i&0EmeX9-yF_SH- zDR--z$-az-&ZUz*0ePdY+d_bA&InxFY{2yu30(6%%XvI9c`8ozN=r2leyI08mw-m7 zURD^@=NXF=)1%#ujk(_ z0Wl)G;2QT4qp1pm^m9#l9OVM@>^TGCaptWf;bgRhMCe9rQkw+Smni0g8R%@bPi@uN;Yh;5<{{+;na9-~Hjh*uC}H!kVyZ$$GmO)} z>a6hW%EYTNXIZyx`dX9VDXJ+5jKj5|L(DjoF$zxCEEvZyR+n*5{ShC=F|eu8EXgts zno;Ia_7a{9G9FZ7GgFgz@xVQqG}g~E#*uuYj7eY|QxF)3Zx7$#z7lE3DB(>*kl~co_faN32NJNZq{ak&>F`~ zMasP$Qp+glW~k@HqPnch^=58 zV-gt06a>bRXR5Jl4rQD%j$y1Wipe<#tIg4~;nwCmzrv2(R&c8Co zk$lNGCSx+r>Bcxrn=uY4FUUBq0h`t$R-5V*`R;bLQ$AZIR@F>!iWJR zzlFxa@8VezpU`V-Bk&qDo(Eu_pB|U=vtY>#lqYpyOpq}HWmu3ITvcH+NL?WiTnJK3 zW-yxB48z0Iwt^W9`7ncF9cG|%zzilcm|;?)87NT>Gu&)j$8e~k5)jyk8E&k@3`j-A zHIRsA1BPfOokTMeU}z?V1<{NSH1bH~CLIb?rkNE}zD6_Cd}=tn4w^F~tk~nISZzO@ zt~R`g;56#m_^}ja2bzaxYAC2S%*47M#KU0NBt+e&6!0TnZ_HY;fkiEuF6Txa-V_^2 znK|_t-bw?;$}^MDOH<}1Gp1W|BgRifD4GuTP|$Bo)u~hk&C?~L*-h-&wFgKgv6$jh z5S?rCDMD*S&IK9_r-l~7()$OeRkEDvRFdz{Pa~~N^YHLgI>Q7xfwuGG1m+a-2cI&* zLR|}F%n_In@Jk?x+*86~lJz^Nb&5e0KpZ0w;uu11;ur-tb-|I7H;&}&-e92O3Zw)^>L zrU*&wei26P&uFA}Z&VVWq?5=boj8Rl`eUgnqyZ(sDWyO@Q z(ai0BXm~QU`-V9yD6+@Y>dEPk=~4hICB@^?7Sca(bKr_QOe!ucFci7K0z;7jNvX(fFW!;z#E$Wgy3-{r0RTqR$qbTJ(<4ZdS2yXJcy0!#8}|j&T4vZ zz5o&R6((ac5n+WdS?OdPqkYB6QjwT`qd7;NH?i0(Cy^!sauQXdMm3sVe^=>7IW1B! 
z(*`2yy2%SnQ%+!-^KndHE5&zAo1Owahe?RXbVI`vsJz537An4%sdMx+2z+PQy_RRk zBkW#NAiF0DZ^Z5iQt^r1<2`ndr`l?IVxu_+yCxR1dmN%OyT^qmRpJ$OjW_etl{g$i zBv&Bhot}-m!V( z{R8oSlWv1}$K;JiFXE1_^PA7>ENkoPps0+rPbyN^aH&)pdy&91bEqbcq-g4y03R zA*7R<2kFEqx-L=37+r%{C2LM1H`0hm6HBJPavz1+XMxJCg%wj@WHj?7VnBGjp1k-_W$q@e;0z}H7ik%(EduBk3&jYeYDs3c}d zCoxMpamvh+!ko;~JI}UhAYhiV z1G6SG)ZdXw%$hV{)mUI%cTxP>8DJ+QD8L>Kx zd3>`7k<8DEiCLqWne8tZ(I&~v_Mmb1dUGkGb#V-&kp9Ay4}6>Afo}opvSQdp1; zrpIi6pZbr0%uV5pX0>TKUd@ur`fDsG^ z7~!lWBbYR(^d>S0Cvy#B_|zETSXzS$X$fb=M7Yt+gxAU_GEWVs zk1wh8&zA5hMG$TZ2*OSIKsccQ;RXW;KO>27lLmyF$RL~^2U86v*b%~IO=n$v8->KT z(MWugO5&TDFnmK_L3CdpqkD2}0W9k!x{YF{yKngGI_ful$qFCwHP_`3OIaC7A-%dO zALus41KorIbQ=tydqfi5CJpE|kwJG~LpQ;W&|SxhS0sv5GossQB)Ul@(alU4x|5c+ zPWMJ9Q(|;uc@Qci)NyyhdMR6@nCU(wJh&jbzq7R)>tze3+Yko2P4PfCp#a?m1L!_I ziEfhybeqVa`w*n2+D)({bj$Ocx@xykNOT*GL^r7H}o+of0*slJvm1AM5>FH zZB|Qc8@q;O5_@?OzweU?__eJDV3gv{((1gpqk#HaaiGC_cg#O>M+Pdnt$%E)O zh=_htN%S*0hJNT{jdgX?mqqFSouK57gOw8fMk~{QKzLv>{hVK&ujip2A3!z+%lpe2 zSndT6EN9(6>_tIfc|TTW=Yd@a-XGf!5%5U1SGXV76cm47j*AX6;rTUJ3#xWVNt3;o z;YdqWFAMi474+hm$(Tqa{V)Ue&`2MoH92wPm#iG8cJj-?;m%-Rqy>fjs`d-_t%I#X zYGrT37op@bgD=VgeD$IXn6G`9gB*|N4wOQoRuacjD-Ymz?Pb1CUQF5tVIXzCa34!* zPga;%AAL^gwHwhbrg+Voa)_q@Grq8vT52Lk66MBXgBJuA??=jD>i${H_hS zA5!%@DJ)k`R!HkhjYd??AyxZ^J+}hghA>o5QvlGtKk%ayF#Mx>o}5ItNdvm69tqw1 z%a;&%4i)_`*?|g1#*1_Sg}lOn3x|+O{2DaGFR7%uYw82@oAmaE-~Wz}b^w>i8$HZ1 zE2R#=Xyp!o`ObTbT}YKSrQj~WR%yx(9A~QZNl6@=g5#oQB7;<7n@A;m5mGPAUyG55 z3TPA(sYWA_N-DWSkVe6zLSN2G>y!H1D5)3A`#?-8E2TU$6&I~pDsI!{hjoR_E$&ZR zxL@rR?oDT?xRf18-7nnBvVzJhq~w_JGUC;^b=a{IpHkLghe~{=S>ub;_&Bo;tEmHN zozo8VB}rm|ic7V?Zh$xpzbbLT7(Hgp7<>Rs!r}XCj6YIi@MS9Ufy^Ics3<$XGJO-S zMd6sr^o?|cd@~*KipKSH99^0I6%Oa-K_dFz!s}7(nQ+Rf@i~=dq{9m;0G8&%bb1#o zDd&s`JFS;bEFYB(9~ZRcV$D>t^D8ud^aCYxd8I1kds|$hvhoI%=|u+lLOPyXnO;oC zQI+W%=!kXrSI`klzW2uAY*F-M?oV-WCU)`IZZI9nfPSJItcNqK^Vk)K%sP)D{U9V=21fzRK-N&h*1!gjINv^f(G)6 zAWUAyABMyvq^W2%L7A9=?T$6WWC}jV5>=RiPkFJ@F11zd3O5PEWRrA8qd{&k$u2l# zOtK5LA~s2X&m4e;)Ht7_$cQz3BFU=^ZBWye*2vL 
z4HVNDvL@UvISyxE$hBc7`FcJYQpa!yA_Lck4mjZ2(1E~7kAiE%P-ike$t;F2>Cmt! z0fWPu6}Bdo1Riv4IK^n9bxV;2Ep%;=A(L-F&Lz4c&E!MZ29pt)vN~9mpIV6*AqS6OUBl!{W5u`-?2|n?yWO)%^ zG$fY<;whp9Nbrw`Ce-SFRojNgSKA^T9G>jkTGk$_C5Q=eJrhLTsv4t}p$&VP1ASK3+^BHeRi;v7@T5?HXoFZC)#Q3pD z#f)!*Wl30_q{GI<_>Ltq-Ymm-BquOV`G9fG#W8+NKE@;I!8j#^3W9u$&-ih+Kw$is zO!Z1K}w7tgUllxcHgQNG92YPF1aKyP7y6Yf`7DSym@$ZC4L`8hYc&S zd`yQGM11=aiLc5K&w5?wa+DC?EIi7hK<9E~KH?+kK|Cci#2@Jsf3z(Z6$=Z-j7Y>E zQ8@9D{D}AnQX>9HpZKPm<8XU@q_zpRV zH|c5MG9K;>1ka2y7eo&N@S`!f&8-m-f)|hQIEXUD zBUJZt7x^PoJlB(tfu}^oGYnd_Ms!gL$TKB(_p)B5b@~9Kes2y-zm8Y{MvuJrQhO;P zQI-$}VJnI8w@i#Z)t)ZKkgAsQLqMEGk6g6uP6#u!>=y2xD^~Y-tZu|Qmti-6!PluU zB0wSHT<)6Ta@R<5aM>l?HAj59*)rUBjmPhd-|WedN zf?7Nbc$_H4P2v=?86F2zzp@|1_Lb>ZDRMhWyCY#M#}Jec+sn@< zK%$#&NOWjjncf|T?YI^q0{5|LT^)=| zKZg5LjH6JXDpN^7;G`qX4%?j_UB0&F-zRp{GDzk zNpC~Yih*lGqpeFnlf?dlWQ7T!ACYWKy)U<_Msm-{T3BC=y)%`&_o@m+g=UPAD{yE# z0J|D`p5B0ZZqBg*r`uI-*^#XY!OeUFL_h@&hg6k^4Jh^|+z-gzmOlVPAOSOJsSBjU zv1^af!Pg#9B3&TjWd^k-65lOQb?2_HjQTwK#gz&<5SG6DRnRWA_-zhEs~wrX5MMUK znfbjD%BcHP^m8x`n@@2+2NSBw%4@XNv=~JJzL2Tt zw_)ghAH!&}mXhCvX-X<;`tzY%EzVKwvE0Hm$DKiw$C6A+%D$`kxg266xE~J02O!~$ z9_{i8m7dh$k~Eh3XjcTd3Rjt%^s*P%=s#OibPft}qOGZZr zH9WPwNt;6Ia^#R(gXGo@`H7~pvNWAhqNY(6I)^FPErkW0t1ZO4RgbmZmpK)U-KE)20$Ny`QD& z-4Zo@oTcf*5;c9ErRmcWHEqe#^kqq!QtB(dzU0zZ$|&*MEL`7|gey=#W@-AqL`^?u zY5J)|O+lG|zo5!WyI*LKrKzk$O^vfOH7rq6lPpbPiJDquX=+xYrdC;+T9&A(O_rwC zC2DGyrD=x}HFe0+R9&K`PFb3EDp6C{EKQwD)YLsoQ@0W|?U|*iM~RwxW@*~1L`}W2 zH0@iWrruea4k%I6!C9L6l&ERz^(wVlm<}ls(_vYf4lPmB;4Do8OVo5kmZrl?)O1Xi zrlU&KbV8P<<4V+&>)BCh*Sn!vm`*AY)9@@!r7p!6<4V*tK1Q)3Y>9Em70fdqO%Z3)9RJ zG0n-+G`mDi*JNp$SE8o*S(>gZQPaXKO$$oYv^Y!CqLMU~tL2!DNvS1e0ne}Orc-Js zwX^E0R%hj~vaChc3`$GPzqBzkHJQ1SJG0W>Ue=oR7c&R5J!W?_=6bfnzB%TDH0FHT zV&4+8LK^wr1?1&w4fajcTI^e^`>@|$J%D{1wGR8X>LKiRRF7caPCbf!d$k_>YV|nw zHR=iMJE;FaTX*#{_C3@u*zbv{CXGq&{jl$*dSSn-It2S#wHo`b$RUcWHT+trHrTgR zJ7C{RJqfMl>S^qosAsWntu|o4y?O!rHtHqp+p3qb-%-7SeLM9U_U+Y1?5ouq*w?5x zvG0&0|KCD?cXcWDJ=A5`@2MtWzgJFN`T2@6B^+%+o~_kp?Axk$v2Ul|$G*M#5c_KN 
zG4?g;Q|$kD`HC_n9DR;FTdOazZ>zRo-%fpneS7r{_SNcJ>}%Bb*w>e@{C3R7^&|3Z zt$xD3t@;`Jc1o2Kt^oULRfc_yYJh$5xVFI@CDkCY;D$hBMzb+ArBsN0Bh>`^#;O_i zO;ii)o2i!AH&?B&Z>g3bZ5Q=9_T5drct}$X{vYhSsVA_nRSnCfwLhr?wTs$M-RWAr z&0E;W&O1BTHpsQH+JW+&vF2XP%d zz{FUMH41(mTv?XC4qE(4+-0{RX}Oab`F43*&(jFqMZJjqe(GR@@eYUH=3F>)(r*9E zIWI`a`A?ji^c>G}{)d>HZ?AXG)kWr9YjR#w@0^b+GUr20&i7>HT&$gE7HQ|hOwPBt zoKtEbYALOoX~gSh?0OlyTBZ^?1hGC+*2gqdha+7xbtLx9bBr%!d~=NJF9w%~+}pOm zbw$Q`nVvRKLtWcAsjNBYNNaeK`{qE6bScg%FhwVqVy6OAOi9@mJ1JEds%u<|c?G5@ zK94D`9Fqzx$5@x*`~p)fbLCi4U^zZ^DLyPP#pf=?rv;@*spC-B%289TRZdDZGj%0j zOYuPi8^K;7w5++JXcH7>j8<$BheiBNTQbV~_ox-hZIQHACH&8}92D%@8 zW;gW!_PeS-U|*|FFfIC@NZUp2ruI=2UE2=SeGb>!f^Y?DolEgRfhitwDIO{?#d??G z(E?LE;Zi(aV2bIk&QC3{&Ohz)c(TAe7P}OS{?}5>ae2%xFppU-#moXzOmHbKEilCf zm*UxiQZ!W|YHPWY(L{DlKLa(ri8_nx`%J{lRo~ZfeLvmQ_hqQ%Qr{np)c02$u9pkK z6{yCpo-{15o=kUnO#NTWW1&N}pdeJui|$X}N3YUFeT@BX>MQK`F?V>`J&a{M%JV39 zT_0Iq*p+dyF5l0n8!6Pk9h^=F~_BtU0{lh&ibzvX#F=`iZ=>OG0UNv zSs^iQ)Qz0z2RZ*)QU0`;v+@lAm#+PNCALxDA5 zl*?mefqDGsQhZ-vil1GIp9)M7xN@k1%CYrEQnsIKmh;Lo!Qr~JAY6f}btw)hFhv8G zqO8CaOvjdy+VB?a}xTkm>*scBEAI;#wiw5*F<`NtJh{y<&i zYUH>AYvgE`$EbqxD5iHgiF=fjxkr&c<_4VSx)#SrKhbM(3x}^+LHGjI%B5&oV2U;_ zMe71nbZ{xE3rsP|)v*Z$)v;2Z0i5Nme@21UFa9~hc$dc|1?JJ!an-p%uFh~NPA@RU zaF^ng0#g)!cXNWvrO!1OK^+JKD4t9C;DKL-TF2w-_rs(BT>|0=po-V~+1*Itd z{)y&iyQm$x|8AGiXUh|sDdq}xR^+-h-|>81fjm!lz2nq6=37E@FDUbT*)s$a_|8wp zui5i?GY#5D9oo9|-M5wXt=!7`uGq@@p54m&F4)TYj^4`p?%vA!`fX)>HAU(xQ}X=D zj`uTTsSf20BC`(LV}(jm%uWe0XPMS>uWLQs6JpLbG4FFR_sWWSHOA^GHOIuv;9OZ2 z$60Bft3_9fM4uyH$IsPd#Ygsi>m%sBYVjtt%soX{!bEkB}oU~sAj|q4C3{X zveEm}@{~R^<5Xs5oVLeX+2ydJooA|03!$f(x*7ZCIqtLt@!)dN*#gLI&e^?J2NR3l z^Da}%8Q&6vFVn}ZEX&}tx>lN4e{`kj9@8ad&Can?Bvyr5%_X@bQj%z&`xtXS7&#F; zKkIo4Ve>v0#WVQfuPbZtj=&o3y-vHFBoeMs(|6LXKo zYRs|jGr9lS<({!|%Gmg%RGG7IJg4TY;{l`N1*an(t5YH#SuZ81&)`_=jIQS*x}vcT zElAh71;x6KV?AVY+z`nz%F|mYdv04sV+E}15u@uFr>jh@#Hyf{>M^btkDBmIjVM!3 za>&!zm#b&7uTW25yg)6gCHNnTx42T~`v@mt!qUg9MVAx^uizsJ&uwp%$Qu1!L2j)j={Zq`yQ59KXYnzLN2S3I8|y!z=dJ8PXuQv2ZuQJqq_8Ttg(@4ekiI 
zi{Tc*JqGt7TxFnC54fY@#=|XzdlK$5xMnG(_J%tFZX(G=hC2)HD!8?9 ze}nr8uBM?l)4h`8Mun3 zN_Bx70Cxr49dOUXRW<`YxO3oE!+ikPp}A7S;1D%^OuYvJyKdkXGHxNY#-N;kN^aF@be4|f;bQ*fK% zet>JyMyXxl4uLxv?jpFkaI4`SgL?z+Yq+p2c!KK#=i%Of`v~qEI9;Pu6S%f;o#FO|I~Z;V+(~d}!(9M39&R$+Ot`sl^Whf4 zErweLw-WAlxVzvoe{`h^a4pm5?WrG3VLfXZzR?q4^hBe_06nF~*L>(|iK`gG+agvq z`nXPzyCS?B{C0=i6K-$F`{39MaSp=qV8p3K+<`bA4tEsXaR@sZVW+^IhV!$~@1BR_ zIK&%|@Copr48NK1y9(|aIQpYKx+aaTMB|7UIKe-<4~-^|eqQ1Fo{@2mF*sePDJ_+g z*6Na3d%EwJrG`S5vbhG4tInT3PGAJ<>sSRP+W)C#-Yrth%86?~3-n8%9{W zV>G@8`hZ{IjfcHePxJ!&VijyJwZA$*9jJPvH|T@!pdGA!tq#HaE48X0#+TWD-Qe|K z_g@bFTL-pI9iSHWUoQXmIiPEl@6gMm^DfQd+RFM$P5<9?rSYDRH?nWO@Vdfm{h)X> z;>e|s{A_!Tp3}I|`sX{+n3%@6TjLM6-xwjpT^Gom;pj+?{;W*w&?r{M7xKN(#*k>X zE5t~H?kpNaz6)B{_+s10_@ccVU(h#2cft5#2i%9W!3d=_-aze%_sg~gKYP0I#U2=6 z^ichgY5+zjgD^@Ng8L78KjKJLt!glC=>YAWFp}8`BN{uVk@3uKF0YOlt59w`AP32< z1JbsJrtP4oo5^hm#xwnp*8pf90Gw?lqABG6hP-qHOT@8)7`s&VI{%XGPJpeuAk-AFgqmG}&J zm2RS&>SnsRZlSl)E%mm#mEKOb*4yhgx~<+p@2K19_PScv=nlH0-bwGQJLz3?XWd12 z)w}9$dNo z&Xj3qy#?ibA7y_VJp5A)*WcjoX_IP>e9zKnBmZ-NqO+zF?*i0RYAW%2cna5A|3_8m zji&6wfxHD6Na@$rG<_7x@`h>;9l}OKJr||?8{R?wyK0Owx6xmz?Z8b3lx|xv^qOi3 zX1-Rdy{e=FreiHsq zLwZ_2$I^4~-%bAozJJxP=vVb?`fqTr^dlkO#>G$3CiT+f7hV2U+KcPF1 z$Od!AHeZg1W+dq|w$7g3Xe!5@41G74{wncx;--+AW5dHUw5jdTL*U5?93J$bm#9Ho z+!^hn3t9oSzCCci@`c_8vF)}RobCsoH{kPkxOd^+bD?;$m2*>S59-$Ow>P$Zao5wE zr8HV>KS+mR8w8)j(Qh21j#nqb4TU>ZovzMQBh|U;e071kNc|SSU^)@;=>EDpYCUa& zs5|OYchsKls2Sb$doEPDGaH|dl=&VS4}kGE1zUxh8sliHnvN~Q(abtH+Pa8lsabU* zvR0d&U^ROFQ{l>9I!AQHEQ}+IXig%c|7)ueqIr30_SUspA@#^|bj|+>N7t$A3aUr* z^KukobYk01>Gde&dbB_-s1wn`I@?+^Jd3G6B1}=>ZxOb-T0dEBabAvE;3;tZ5n<8; ze@n0xf{6O;Wr_8Odj9$&k|%$K*6bB|Iog`7f2CSkCr7L6tVi_hwEi5?eMsRw~kd)ByJ+_2-E0&kEycO`TTTn)PU{T3ZlD_vPiN3Fd<8&k?3q3*?Bb z_CQ`lTaTl4>LF|uaM3a3L+TN1nemamt|g5j6=ugWn_NX&Vhq_5W5|{mL$<^i(uT+5 zT3<}$nla>~YCX0;!9B^0J)<_L7a_fj?N#_}gw9Pkz619@+()>l`2u%mU*jI-2i%+d zf;$*mY>?616891`sx1`}o+;pNt^#*1A&)GZ;9jl;?(JG)YXu(~XSc?&4crcJ?QqZ4 z0e4=VaOcz&cR1a354|V+cf&o%3bP5K6?!J*YvESGU1Y*mAg+xUJ(7kV2&2EfaNk#< 
zdg`dvdg^_#6=SurSZPs-`!L!BQ6`IJM!-Zjw{pw zF^&%4`_f`KIt%lmCu-RQ(TU*bL~wK>I64s=*-*?RyYtK$)?O&fQ6Jq0hz^c%bTIA` z4=IYHTFlmbV>Utb4Qlo`sDgt_2>wF1P~n+Lv)lr23s+R24c?rJF^L*?||q#Ao>o7z5^l~ zid8E)HmnnIh>p|80nrIDL?`HzuoZ)7uo{e~C$b5m?}6xhAo?DNz6T;3s@$3Np`${2 z%aQa)Lrq&dMGr@RG*q9CEmvDRgZrbi^hoqaqx5Kfz8}^UPT` zqVci*XuQ4@TP{TT`s`_1HbFEEh^7J2G$5J=L^c#FZrpk1EZS#J&=XLPCdD|Kq^DrZ zg(%5tx2jv!2rZi+8UaKjfM^5|jQ}DWigixzJaeXy-db&{o(e?MV~D0h+svX6Emcd^ zNYpOc1kp%n7zsopfoLQU*-+);SsyyWmJX5BqggRTv-E6i#c)Jd+EHc`M5BOc6cCL9 zqESF(LzS{;(POk8&Czqf(YzRQ(I$u%0?|SsS_ni7fyjnpWtTh8oMA1RL$pLM0itCwM9X-5 zRScpXF$$ldWfMd*fM^B~%>beqKx9Lei)Vf4NcZs8YErXT#;zhOQO?yxT}7y7+do01 zF=Epgv1yFhG)8PT6i=jc{87!0Uq$Sw_I5m7&o`>Qi?1SU^jcg+?$Zz8>H9cGA9RJ5O*py&R=WaLy8>3b0#>u3SV`&5GiMmV zJF7i``GgAfWDL=hm{)kZC`1>li`Dg7HbHbf5M2*M*8|b@Kx9L)R@9wm&ajHtA(DFZ zYz)z}dIPp%YBpJIjM)Uy7$6!0L}P$x3=r8+tUPw-nX_mQDTrP$h@>99pkKn43sJs$ zG#B$kvyQSxXcA4hg8;X^0~x&?@C0is)g=oTQdp-S1a z=rJ47rWm44dNa0S5z+OCHbJxmh?W4+5+GUvL^f0@dlo%rBYHQ6=v|(%EEdsnEt?=( z4n)g=XgLrq2O=AaHTmv5bC!+BUPa_yB>CA$zItSzbqJ#3pN-fn?T7k9^w}TBIQkfK zxStkfwGLQAaHEz@5ZwrlZUjd+f}V_$&;HyXlGgvZ{t{a*tL1C` zbhkj8AesV1Q-Ej+5KRFh8;WPK?mTlwRB;&4aHjn?mTmbcQTySe$qby(a$kNKhq+gpcq8EVWxd7?zm|aL}P(y zED((aqOm|^L-91tooCKqHHRp`Uxg|Qa&j>RRHJ5hh_(I>!A{tUY5l#ie(4;neWOhforBhY z4qE>?X#MA)_1jRqljzPfXLwuB)$C3|Cm`w^L)1Cwimez#eenLs*;tWBn;<$Hh|UJ0 zvw`SrAhMx&BhH;?&J@yHt91*y0a5oDqV7QtY{eiNrADbiS~fv62#5v&(I6li1VlCz z?;X1H%o(1%IYiR>_cVxPG_q%~7q(o8l13xc)@VM1HbFEHhz0`DKp+|jL^c#p_uYBs zjOvjcjR>Niv6|g8*cV%|HG3N^n;_Z-HG3P>>}^o9w?WOep;%dG;zf_qn%yht1*;tp zWg)>1GH>{XaG1G0FDNLqXFQ^hT;uRcb+-JJFRZSCLHyNanvU`7+bL% z(M&3Bf~X}pY6*^7f}@t;$cAE0*Tjn+qZ|pMLt-2q64YWVh9jy+{mmwb`h%nX;HW=1 z>JN@=DBg;8=b5wYdUR-vqeFwkuocVEYgn~Rn;?1(b4RaX?&vkl9leIRBO8i$Z*y$f z91Vfn;0SERa|o1^1m932;&fUQ`LTH$_*HbK-19JK;Rt-w($aAZUA9=wScJ!W%sQjDXMf}z-o z{;}f&Cw|_M5hG9u@#GGh?Y$d4FRGdKr{r1h5(Tb z#XIQkJad+f$j;NsbuD?GHs5v4&bkVs;^%4O*EKs&D~OWkY4ahnv#x@u_<7nmqSIqm zJ3Tl9Td`IfiaBuF1kq4fZ78fZ6jmDwtJzSc>{;{}wc1(1S-8^Ld0OEpd7d_jqyDPD 
zn|~EV#m>|AcURhx!AKyo^R$8}d7d^2(Ez-oW9MB3QL*#11014J!6+aa9qW%q2j^oe zraz*U@NMxv2yKF>E&8Lj=#Sc>KWdBq$cADCnu!-Z;;SLvng}~{EF2}z94B!^*M2(_ zEQpGqIkq)>Y%mrajf-(KF1QF=F&xdq8u#{iKZrI#)E*qQ2S@F}QG0M?L$Nx~#ETyB zwHe3JCBY>?G(LuCd~hkYVh~+}Z&K7~*#r^2?Nb9pH9%AYL^f2pGwVY~h4l8Gc0w=# zh$h7lO$w%9D+Up*gztbgceDwj4nWiah&ljK2OzSccq(h+MUM*UEuyKxR3MrjLo_{@ ziLDq!^VNLS5${CNCWtx$QAZ%^2t*x$$c8F+W_{?0@0U14vw~SbG&_c9b}$E9F^G6= zKHf{CO%PQ8Q3ViH08s@HNmvE)wr9~JzJ%it%?suMkzErZZ7q3CL{eKDgm;4M3I{<{ z?3#!{u0OgaxCV&qng~IZye1+E(Ga|uXxBIhqGH!X3~`9A3$6nqyCy;qC9jD{LUcIZ znX)S!1W~bTA`W+m<_GhE$gYVHM9FI+k`Vny{YKd}4uYuIH4(pYh!zA3fXL4G3!>!t z{vOa6E?W>K&t>OBWap*@QSo!xaYS}5TM#ABW#>a= zXQl;F@pIW0(V}1xTED%n$vkcHbuDS0cAZ*R&+FPcW#?%Z2aADdN$lFcBv^*6m}~zo zY8SlKE}L-l0iGXyfagaa;Q7%9cz$F<@om2xn?ia!K3Wm103tiPE>=sPT~D&wvFcbi zYc7b2on1fH>mlvzy3C6t&#vd27qN5Yf~fe}b^C;UWv~()*|}`tD0wbBiKEl-eKd{@nT|i`K*M+0x+4UrjPRASMcCK6y6+63ry5nd~um*_i?7ARIo?TBuG(wF~ zcCK6y6+62=!Xa84tOX)FyDo^5XV;SuouSTfv*v=R*xB_n93r{)-xsS#_XQ7NE2bXN zXrznT1W^~T+GZ_^f~0ZU>!JmD2C{v;1O)aAX3ZmklLt2gmk)7|CdXzlh zpHz>it<`tF-;R%-44#D5?CiQAN}gR$Lew5_wcELJ;i%Zz_4cl2KOH;`M0R#v5GBv9 zCn35-U83w|AhIL24Z#aRQ%Ff8wwD6>CdRPqMd+fCm`Ahh;{-Z8;Wm!n|RTqLVAlxTK}gpM4twqV=D#`ar6$}m8MM)y@MX| z9rTd&wX1i~L)uW~&a4j|qZ|pMFJp+l47Ok^7EvQDn;>cgjv9fZM&PIsII^Mm_L_+o zJw_3I6?_G&ePa;G_m92_zQvXcQSR>_VTCicO=c5Bo50Z~aI^^=Z30I&6kj>dv5|)O zRpk2^N8blOV#|dnU(N1@H^*rcMBTtqH*nMq9CZUnHdMJY>qEyVM`E>~Vu*eUe#TZT zq84~Rh&Dmg0*G1wQ41hy0Yo+wUkNwyqQ`7RDkUrHr1h&*fUQ_WKPuS-(T}j&kFeU0 zu-cEXnhnJ_*>h~!h{|G!%2Exm6^rNxe1m{CLG%L<{QyKi0MQRXWJB=<_#9g{qJ}X< z4O5M=6^rO=d>Mf@LG(2ceGNok1JTz&WJB?-`5ap|qA-RiOf|t)ETYZ$-W+X$XfqIP z2BOVCv>AwOsB&l4hmKK1%~H)$nbt4$D7p0~wY7=(;+SpKf~eTmKhgEsEmAFj$li+x zqU3v#Bt(E=zndA^j>$g^u9x~Z#Nj)T8MXVJ{wOYJ~w0DrgQSu#R5=Vr{ z-VX|*;_o2iS6aK;N5)6Vt9|l~kL>ClK~(%|A3Hv3nQDo8WM}+@qvRRCB#!vIPFm*h z1W~awep6gMYL#jQM0QqG5GBuwCLy|9UG8Q(1yQlHqL({Fty8Um$j%%KqU4$5Bt%p3 zWk5R-HXld!3Mz<-ztUQcc1Z02j_fK8;V5~PMiNK#JxROjLJ$?Z zN@KdmksTiiN6F)(d>q+vk02_3d}KLlmud%&>}nt3XmsLgpCpcEs2Oe*k02^`wa*NX 
zBRi@Uj*>^U`8cwpR6$hysMc~Mt>3QD6poTtXeMz)*EPH1QVaUL?+uUFjR z#O~LUSJmf3WLL=xqT*N8#}V0gA%vskcOmj2vhOwsqT=6$h$FJ^9SNf3_m1)*vhNWI zqT=5>iX*bmMx?DJKO4!{*6gzmK~(&+k$78c9jiyl@2}CxI&2m(os>W5s zezVThqr`94B_X;}UFp6>Cy0vuX5E$ERm8pvAtV3fcOmkP{O!99f~fd+A?(P%L#hKf zvQOA$1w!%@_M{aE-SLY=^?SnJUD*`~a$U2}A*E&~KZnd$v+Z+5;i&lMkhW&;l-da# z*=PR3QSvkYB#ye_7Yyq6%pbqF5PQOI-#e0el>FXNzItTeA`*^@fA7fFqfV(#;K)AD z7ON#c&rY)1gX%%|oLUeS`#k$W@5!kB{*ko)HEL-wm!m>YC~bME2WJbq6B*4NXCGD88VX`x}}`h;C9h zxo=|%qGG?Hd6PrbBh>?l>~}^5QSx_2lMvCD%I)_<1yQlz8C~QM?U~vWi0tgTAWEKH zPeOFFy4lT|3!-9Y*Kc-+_DbyqM0Wj=AWB|;l!WLOb&Fe9B#4S#e{_pO)HBr+i0u2c z(q|{XPn*`E3vl)RE9AENrMWQilPE9(SN^2)k=i0ZqtE{@2)b1aCG-#N~QsJ`zUTSUE6 zz0n`puMG*JT~{-{r?5A~2*zN00GlE285gy>29LP7n$ z$n>N`ByG*Umo12r-^)%yM6G|R*@UCw-^;dEJ2-VPII^o)1yS-U)+9vFspsmsiuF0i z(IKfrfXJ?f7DUObp_33jub!{xYUt-3qS{m~5ZM*uf+%^#coL!)@%t9_TQUBkLv(2B zP$06a>IG5qs`?~E^jjPCTUGzELv&c`Fd(w;LI|SdcOjAxy`o;J=erQEI7HIc>`Grj zl)Tb62@$om`mXe~ZEav`AULvL91ujwUmQq6bica4o?jfe-{Z)>nn-_1%wL>$%k z-7L$ISk1oQFNl)g@6U&*zVG*2M1xa(LRZBfyb;3tA8*zXhFy=u!1(J>P$%k?N!UsQK_TA zk$rbv5GB96o`mRO^>97kU4Pi)$i8zdh?3tqPC`T+)%Tra%h55ZW5AJpGgA;HznPhY z=qdG7J>SfH%5ijD>Np^>-#`#V$=^UoLi7-RoveP}KzPU@lBWNu7S!DQEm7f7mgn9+PjDe%Ljq ze%LpsKkS^-ANGd7wTH!@%?s*5&h)4cPJpJs3$r7QD1WUqu%87NBzm^ zk9w5TAN47xKk8LZf7CDX&nhqK8T>7eHbzF4-~UDw^Jiq|F`Lga@hdo9Mn2LDZ|KQa zdGZ`jmfu3o=)2krmtQ8%gn#7aFTWF<3ID_kmtXD8gn#OVpYP=-J!K~TZ@lnbJh_3V z?+7oPeqA%le-khM-#ocNEWXx$KTW^=$tL9^{czU`FCD?qtB(E>ZPyZkWBtfJ-JL|mwsPQ|7K6$+n(Ih)3==`Pxs=F^TNk? 
z^7)>8p2)nX-MJ#ORyrN)$(52H`_j(HF8*0w{1d(SMt1RQz4+gV%zN4y*~LFp!ktXN zfXYpbkMtXdErKO;j=w`$A5UKI$&Y#R<0A8(c7GDt z;c4f|R|p>VrJa#o{ENN#*L(4e?Bb8{!j0_0=X&8rcHxV>a3j0$#a_6PUHC0txRG7> z5-;4yE_}HcZe$nU!QDkzM?OUbvB6_%>d+kzIISFTVj^d?UN~ zExm9fyYT*AxRG7>Ymy)DX=h{?-p@;KWEZ}z7j9%1ev=n&WEW1~HIKFzBfId)UbvB6 z_z*AL$S!=S7j9%1-k-9Mw8sISJj;{Ydisp)^v(0)xA)>3*~P!ci(li#H?oU=ofp4@ z7vIP({(LWfM=!pSUHk=J{0c9=kzM>jUi=}Re7Gn7#*>fm-^echI}+dNGqMYB#Ai;#S9vIEro77~y^F9(WZu)x z$S%B_q<7&)cHu3&a3j0$A33EH@og9S%i#IJ3pcV$@U<6iWEZ~K3pcV0pD5u@p5)0> zJo$1@p6bcdJbAh&&+z0cJ-NFlck|>2J^2PtzR{Bxdh$)4yvUPp_T*bUd5tIU?a9x0 z^1Yti*^{62gd5q0Yr`+2pmr4JTk9?e!J~c?U#ZXd%!+=v z)---E&*W!im*4JQxRGsm***y0DU;sHHoUwS!gtApTiJ$}^+tHdOt_U@I6be3>Nm0r zr{^C#N75VFhL@B6&Y5s4+i*?y3(@i$*@gcK;SDnBwUKRj*@0gEMz-O>&Iqr_q_?sO zSL#bXv-Z*f^$(3A@r~^C_3*-t?851OCYs*JF1#ngD7(`@4L7n2r}h#JH?j+-_R=(>-^eyRMeQTn9~#+)mr?)GA(P(9HatcBZ;MR0m0dWM zf160SkzF{oua=Q;BfD^FU(s+QyKu5!+emsNyKriM(ey^P;oKi}%A~il4Np<|x66cE z*@aX8);bbyWEW2DbNfiRkzF{o&uF-jT{yMRsC|s=!l`{m!;S31seMMnjcmiqsC`81 zyOC{pirQy1y^&pbA1~a+KSp-p#8))j$S$1th=v>4hBu)4&^3eK$~HXJ$4hTy7oPUQjqJjS&uIQew&7f# zyJYgWvJEez_PawS+{!MT#y>ko!j0_0seQDIgd5p~Q~BFR!i^jYulB+ryKw3+Y9irA zcHvZCqV-qgz=j8X5l(-Y$An+D|416HW%4ueO>)eqOL~V#_k*s+89Y|D3DmziHydB6 zpG4+8?GEOcF1mU{Wasg!Ctok=UHD2b{8}&kYEPc)$#X>JJ?*X%+36qW>Az597eDRE zuXyp_@#Nv2{)@fvQ@!w0Jb9QW4;7jBv^&|8dwcqik#P2<)8jn(cuzjTlTQ?x_q037 zlMnLpds@O>ej7ab1yBBqC;wGs-qY@EPwwO8*Hyw@e$+lP^;xM-o=n&GX!uSd^PYAc zMRxk`^zvKlh5yA1|I7=2-IK?A@^3x4xhMbL(|@!lH}v8k=!M@C3)cbhRhDVLR_0XT zh2@kN4%vh&b+@O#XH3@G({E*$exDriBl&UsXnuFa;w!b5q5I)_DmNVr%)RY9c5aP-ih`*JUdetk3 zU#_RTd=B^Ihdue%p1i9kpWw-5o_u9Y4ooRBcz2G4D|IHz6Yv`dzfR)T;fceSAD66&R#BFh zqM8{zwS>}&XR7&$9hZ$%bt_}7W$P$f=>tZh)oZn0Ygeeei5Ov?S=)})9X4UuG&Uu_ z&B?$v4RKq^BAGge8)>_ssnd{1d7@S8Zck%GNfV@H%1d&hF&ZruQ)|OAzh-5bPYID0 zmeXcsSW2=e3rB{P!VDpJE-I!gQjBNXluXo4b3^{z%#h)18MaUpOQg2cW)&s1dA(wF zZ4zP%x~&8>QYo9Rja_Cv&U%;dl>0Kf`V{Lu>0!Mm*y`$C7Z-C@BwW}`for&H-*E=J z-sy&R9ql@9o9)fPogR`3*%CX2-{STjWxe+KaB_P#qC0E+Yx=@HWcqJbbPo&q-w(YX 
zJA8Js%jq3*6BWu=S^orb5cn&4q<#Hj725X}UVXL}I_x_BWMCVw)mtdhH=kqd+w+XY z7ohL$unO{@eonu5JMuinevPK^1`6m)(C49VLx$i-=mv5fd5gYUsf_%4B0 zz%Sq@@ET%X1u^u?;5RS?IrtUq00jzg16%|t*uuucno}n*iXS* zz=8L`-JeI`+d=Ff;05T*;A!wY_#3{f$h`>tVZc7=%=wbV9$IvUD6oY?O$<=vuGEGG z8^V^Ibe=uHxmS52q-?s2?JQ1QqN?PWHzn7Q)QR15)Mw&tCO*lucfnaMX>;yDUyu2Q z<-RqE1e0;!$HM!L?_q~I%6F;F7zN%^=Md*k#F83oKmH$Oe&DBJukWvnPV0n>?Hq!d zkhPQs@whXeNP)f*g@bT#Ccuh|z^20!C)b>-?CR1|UbwY79-zN2NjYl20%W*Ci8~QR zQI>>Z7RBeXXgND8PG@HxUpX7CM3jYbyb{A)j>7a5d$-ed!sJ+XhDwcdvitf6<5~`b zcKQA8j1lO{^^f_33ksDM22%xkL`G_r_Yah{bVbux`rWc_wj1P9$3oivQM6`faK31Z dK~_0kDA(_zccOo)gaS9N$oBWyB3A)zx?3%Yy0riR literal 0 HcmV?d00001 diff --git a/src/types.ts b/src/types.ts index e710e31a1..abd37421f 100644 --- a/src/types.ts +++ b/src/types.ts @@ -88,6 +88,7 @@ export const LANGUAGES = [ 'lua', 'luau', 'objc', + 'clojure', 'yaml', 'twig', 'xml', From 4ccdda6e5b86b7b5b44af38aede5b627a5290268 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ullrich=20Sch=C3=A4fer?= Date: Fri, 5 Jun 2026 20:20:33 +0200 Subject: [PATCH 2/3] fix(explore): resolve Clojure and monorepo query idioms in flow/seed tokens MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three layers, all surfaced by one real-world session on a 2k-file cljs monorepo (the agent's two explores returned wrong-subsystem noise and it fell back to 6 Reads): 1. Token charset: the symbol-token filter only accepted \w identifier chars, so NO Clojure symbol survived — not kebab-case (on-route-change+), not predicates (valid?), not alias-qualified set-state/dashboard or keyword :profile/logout forms. The flow builder and named-seed injection silently never ran on Clojure repos. Widened to the Lisp symbol alphabet with / qualifiers and optional leading colon; clj extensions added to the strip-file-extension list. 2. 
Module tokens: a bare token can name a NAMESPACE by its last segment —
   the Clojure norm ("the deactivate stage" = ns
   app.page.lifecycle.deactivate, whose fns are named per page type).
   Callable-only resolution made those tokens contribute nothing, or
   latch onto an unrelated same-named fn in another subsystem (the only
   function literally named `deactivate` was the SCIM backend's). Exact
   last-segment module matches now inject as file pointers and serve as
   location anchors.

3. Co-location: an ambiguous bare token used to resolve independently to
   its most-substantive def anywhere in the monorepo. The agent's bag of
   names describes ONE flow, so tokens are spatially coherent: ambiguous
   candidates now prefer max path-proximity to the anchor dirs (specific
   tokens' + module matches' locations), taking all ties up to 3 — when
   the per-stage overloads of one name all live beside the anchors
   (lifecycle activate/set-state/deactivate `dashboard`), they are all
   the answer.

4. Colon-less keywords: agents write the re-frame event
   `:app/set-page-state` as `app/set-page-state` (and then grep for it
   when the lookup misses). Namespaced slash tokens without a leading
   colon now resolve to the colon-prefixed registration node, in both
   findAllSymbols and exact matching — gated on the `/` so plain names
   can't be hijacked by same-named unqualified keywords.

All four layers are covered by unit tests on a miniature monorepo
fixture reproducing the failing session's shapes
(__tests__/explore-clojure-tokens.test.ts), including the negative
guards: a bare name is never hijacked by a same-named unqualified
keyword (gated on the `/` in BOTH findAllSymbols and matchesSymbol), and
the co-location pick is exercised on a genuinely ambiguous name with a
bigger-bodied wrong-subsystem decoy.

The failing session's exact query now renders all three lifecycle stage
files with their call inventories in one explore payload instead of
SCIM + playground noise.
Controls intact: Alamofire's god-file multi-phase invariant (Request
spine + Validation.validate + Session in one ~12K payload) and
metabase's TS-side dashboard probe unchanged; full suite green.

Co-Authored-By: Claude Opus 4.8
---
 __tests__/explore-clojure-tokens.test.ts | 178 +++++++++++++++++++++++
 src/mcp/tools.ts                         | 114 +++++++++++++--
 2 files changed, 280 insertions(+), 12 deletions(-)
 create mode 100644 __tests__/explore-clojure-tokens.test.ts

diff --git a/__tests__/explore-clojure-tokens.test.ts b/__tests__/explore-clojure-tokens.test.ts
new file mode 100644
index 000000000..de91777dc
--- /dev/null
+++ b/__tests__/explore-clojure-tokens.test.ts
@@ -0,0 +1,178 @@
+/**
+ * Explore query handling for Clojure / monorepo idioms.
+ *
+ * Covers the four layers of the explore token fix:
+ *   1. Lisp-alphabet symbol tokens (kebab-case, `?`/`!`/`+`, `alias/name`)
+ *      reach the named-seed injection instead of being filtered out.
+ *   2. A bare token naming a NAMESPACE by its last segment resolves to the
+ *      module and pulls its file into the render.
+ *   3. An ambiguous bare token prefers the candidate co-located with the
+ *      anchors (other tokens' locations) over a bigger-bodied def in an
+ *      unrelated subsystem.
+ *   4. A colon-less namespaced keyword (`app/set-page-state`) resolves to the
+ *      re-frame registration node `:app/set-page-state` — without letting a
+ *      bare name be hijacked by a same-named unqualified keyword.
+ */
+
+import { describe, it, expect, beforeAll, beforeEach, afterEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import { initGrammars, loadAllGrammars } from '../src/extraction/grammars';
+
+beforeAll(async () => {
+  await initGrammars();
+  await loadAllGrammars();
+});
+
+function hasSqliteBindings(): boolean {
+  try {
+    const { DatabaseSync } = require('node:sqlite');
+    const db = new DatabaseSync(':memory:');
+    db.close();
+    return true;
+  } catch {
+    return false;
+  }
+}
+const HAS_SQLITE = hasSqliteBindings();
+
+function tmpRoot(): string {
+  return fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-explore-clj-'));
+}
+
+function rmTree(dir: string): void {
+  if (fs.existsSync(dir)) fs.rmSync(dir, { recursive: true, force: true });
+}
+
+/**
+ * A miniature monorepo with the shapes from the real-world failing session:
+ *  - app/page/lifecycle/{activate,set_state}.cljs — per-stage `dashboard` fns
+ *    (the ambiguous name) plus namespace-named stages.
+ *  - app/page/hooks.cljs — a unique kebab fn (`on-route-change+`, the anchor)
+ *    that dispatches the re-frame event.
+ *  - backend/scim.clj — an unrelated subsystem with a LONGER same-named
+ *    `dashboard` fn (the co-location trap).
+ *  - app/core/handlers.cljs — the re-frame registration `:app/set-page-state`.
+ */
+async function buildCljMonorepo(): Promise<string> {
+  const root = tmpRoot();
+  const w = (rel: string, content: string) => {
+    const p = path.join(root, rel);
+    fs.mkdirSync(path.dirname(p), { recursive: true });
+    fs.writeFileSync(p, content);
+  };
+
+  w('frontend/src/app/page/hooks.cljs', `(ns app.page.hooks
+  (:require [re-frame.core :as rf]))
+
+(defn on-route-change+ [route]
+  (rf/dispatch [:app/set-page-state {:route route}]))
+`);
+  w('frontend/src/app/page/lifecycle/activate.cljs', `(ns app.page.lifecycle.activate)
+
+(defn dashboard [ctx]
+  (assoc ctx :activated true))
+`);
+  w('frontend/src/app/page/lifecycle/set_state.cljs', `(ns app.page.lifecycle.set-state)
+
+(defn dashboard [ctx]
+  (assoc ctx :page-state :dashboard))
+`);
+  w('frontend/src/app/core/handlers.cljs', `(ns app.core.handlers
+  (:require [re-frame.core :as rf]))
+
+(rf/reg-event-fx :app/set-page-state
+  (fn [{:keys [db]} [_ state]]
+    {:db (assoc db :page-state state)}))
+
+(rf/reg-sub :dashboard
+  (fn [db _] (:dashboard db)))
+`);
+  w('backend/src/backend/scim.clj', `(ns backend.scim)
+
+(defn dashboard [user opts audit log extra]
+  (let [a (str user) b (str opts) c (str audit) d (str log) e (str extra)
+        f (str a b) g (str c d) h (str e f) i (str g h)]
+    (str a b c d e f g h i)))
+
+(defn unrelated-one [] 1)
+(defn unrelated-two [] 2)
+(defn unrelated-three [] 3)
+(defn unrelated-four [] 4)
+`);
+  // A 4th `dashboard` def so the name is ambiguous (>3 defs) and the
+  // co-location pick actually runs — at <=3 defs ALL of them inject by design.
+  w('backend/src/backend/admin.clj', `(ns backend.admin)
+
+(defn dashboard [stats]
+  (str "admin" stats))
+`);
+  return root;
+}
+
+describe.skipIf(!HAS_SQLITE)('explore — Clojure/monorepo query tokens', () => {
+  let projectRoot: string;
+  let cg: any;
+  let handler: any;
+  let findAllSymbols: (cg: any, s: string) => { nodes: any[]; note: string };
+
+  beforeEach(async () => {
+    projectRoot = await buildCljMonorepo();
+    const CodeGraph = (await import('../src/index')).default;
+    const { ToolHandler } = await import('../src/mcp/tools');
+    cg = CodeGraph.initSync(projectRoot, {
+      config: { include: ['**/*.clj', '**/*.cljs'], exclude: [] },
+    });
+    await cg.indexAll();
+    handler = new ToolHandler(cg);
+    findAllSymbols = (handler as any).findAllSymbols.bind(handler);
+  });
+
+  afterEach(() => {
+    handler?.closeAll();
+    cg?.destroy();
+    rmTree(projectRoot);
+  });
+
+  async function explore(query: string): Promise<string> {
+    const res = await handler.execute('codegraph_explore', { query });
+    return res.content.map((c: any) => c.text ?? '').join('\n');
+  }
+
+  it('kebab-case tokens reach seed injection (named file renders)', async () => {
+    const out = await explore('on-route-change+ set-page-state route');
+    expect(out).toContain('hooks.cljs');
+    expect(out).toContain('on-route-change+');
+  });
+
+  it('a namespace-segment token pulls the module file into the render', async () => {
+    // `set-state` is no function — only the ns app.page.lifecycle.set-state.
+    const out = await explore('on-route-change+ set-state dashboard');
+    expect(out).toContain('set_state.cljs');
+  });
+
+  it('an ambiguous bare token prefers the candidate co-located with anchors', async () => {
+    // `dashboard` defs: two lifecycle stage fns (small) + backend.scim's
+    // (largest body, wrong subsystem). The anchor `on-route-change+` lives in
+    // frontend/src/app/page, so the lifecycle defs must win the render and
+    // the SCIM file must not appear.
+    const out = await explore('on-route-change+ activate set-state dashboard page lifecycle');
+    expect(out).toContain('lifecycle/activate.cljs');
+    expect(out).not.toContain('scim.clj');
+  });
+
+  it('a colon-less namespaced keyword resolves to the registration node', () => {
+    const { nodes } = findAllSymbols(cg, 'app/set-page-state');
+    expect(nodes.length).toBeGreaterThanOrEqual(1);
+    expect(nodes[0].name).toBe(':app/set-page-state');
+  });
+
+  it('a bare name is NOT hijacked by a same-named unqualified keyword', () => {
+    // `:dashboard` (reg-sub) exists AND fns named `dashboard` exist — the
+    // colon fallback must not preempt plain-name resolution for bare tokens.
+    const { nodes } = findAllSymbols(cg, 'dashboard');
+    expect(nodes.length).toBeGreaterThanOrEqual(1);
+    expect(nodes.every((n: any) => n.name === 'dashboard')).toBe(true);
+  });
+});
diff --git a/src/mcp/tools.ts b/src/mcp/tools.ts
index fc184132e..8bece9bd6 100644
--- a/src/mcp/tools.ts
+++ b/src/mcp/tools.ts
@@ -1287,18 +1287,25 @@ export class ToolHandler {
     // names (Class.method / Class::method) — the agent's most precise input,
     // resolved exactly by findAllSymbols. (The old strip mangled Class.method
     // into Class, throwing the method away.)
-    const FILE_EXT = /\.(?:java|kt|kts|ts|tsx|js|jsx|mjs|cjs|cs|py|go|rb|php|swift|rs|cpp|cc|cxx|c|h|hpp|scala|lua|dart|vue|svelte)$/i;
+    const FILE_EXT = /\.(?:java|kt|kts|ts|tsx|js|jsx|mjs|cjs|cs|py|go|rb|php|swift|rs|cpp|cc|cxx|c|h|hpp|scala|lua|dart|vue|svelte|clj|cljs|cljc|bb|edn)$/i;
     const tokens = [...new Set(
       query.split(/[\s,()[\]]+/)
         .map((t) => t.replace(FILE_EXT, '').trim())
-        .filter((t) => t.length >= 3 && /^[A-Za-z_$][\w$]*(?:(?:::|\.)[\w$]+)*$/.test(t))
+        // Symbol charset covers Lisp-family names too: kebab-case
+        // (`on-route-change+`), predicates (`valid?`), and Clojure
+        // alias-qualified `set-state/dashboard` / keyword `:profile/logout`
+        // forms. Without these, NO Clojure symbol passes the filter and the
+        // flow builder silently never runs on Clojure repos. Tokens that
+        // don't resolve to nodes are dropped downstream, so the wider
+        // charset admits no noise by itself.
+        .filter((t) => t.length >= 3 && /^:?[A-Za-z_$][\w$+!?*<>='-]*(?:(?:::|[./]):?[\w$+!?*<>='-]+)*$/.test(t))
     )].slice(0, 16);
     if (tokens.length < 2) return EMPTY;
     // Pool of name SEGMENTS (Class + method from every token) used to
     // disambiguate an ambiguous SIMPLE name: keep a candidate only if its
     // CONTAINER class is itself named in the query.
     const segPool = new Set();
-    for (const t of tokens) for (const s of t.toLowerCase().split(/::|\./)) if (s) segPool.add(s);
+    for (const t of tokens) for (const s of t.toLowerCase().split(/::|[./]/)) if (s) segPool.add(s);
     const named = new Map();
     // Nodes whose token is SPECIFIC — a (near-)unique callable name (<=3 defs in
     // the whole graph). These are safe to SPARE a file on: the agent named THIS
@@ -1635,14 +1642,21 @@ export class ToolHandler {
     // agent explicitly named is in the subgraph and its file is scored.
     const namedSeedIds = new Set();
     {
-      const FILE_EXT = /\.(?:java|kt|kts|ts|tsx|js|jsx|mjs|cjs|cs|py|go|rb|php|swift|rs|cpp|cc|cxx|c|h|hpp|scala|lua|dart|vue|svelte)$/i;
+      const FILE_EXT = /\.(?:java|kt|kts|ts|tsx|js|jsx|mjs|cjs|cs|py|go|rb|php|swift|rs|cpp|cc|cxx|c|h|hpp|scala|lua|dart|vue|svelte|clj|cljs|cljc|bb|edn)$/i;
       const CALLABLE = new Set(['method', 'function', 'component', 'constructor']);
       const isTestPath = (p: string) => /(^|\/)(tests?|specs?|__tests__|testdata|mocks?|fixtures?)\//i.test(p) || /\.(test|spec)\.[a-z]+$/i.test(p);
       const bodyLines = (n: Node) => Math.max(0, (n.endLine ?? n.startLine) - n.startLine);
       const tokens = [...new Set(
         query.split(/[\s,()[\]]+/)
           .map((t) => t.replace(FILE_EXT, '').trim())
-          .filter((t) => t.length >= 3 && /^[A-Za-z_$][\w$]*(?:(?:::|\.)[\w$]+)*$/.test(t))
+          // Symbol charset covers Lisp-family names too: kebab-case
+          // (`on-route-change+`), predicates (`valid?`), and Clojure
+          // alias-qualified `set-state/dashboard` / keyword `:profile/logout`
+          // forms. Without these, NO Clojure symbol passes the filter and the
+          // flow builder silently never runs on Clojure repos. Tokens that
+          // don't resolve to nodes are dropped downstream, so the wider
+          // charset admits no noise by itself.
+          .filter((t) => t.length >= 3 && /^:?[A-Za-z_$][\w$+!?*<>='-]*(?:(?:::|[./]):?[\w$+!?*<>='-]+)*$/.test(t))
       )].slice(0, 16);
       // PascalCase tokens in the query are type/file disambiguators — when the
       // agent writes "DataRequest task validate", the `task`/`validate` it wants
@@ -1655,6 +1669,8 @@ export class ToolHandler {
         const lc = ct.toLowerCase();
         return n.filePath.toLowerCase().includes(lc) || n.qualifiedName.toLowerCase().includes(lc);
       });
+      // PASS 1 — resolve every token's candidate defs (no picking yet).
+      const perToken: { cands: Node[]; mods: Node[] }[] = [];
       for (const t of tokens) {
         // Enumerate ALL defs of a bare token via the direct index, not FTS — a
         // 50+-overload name (tokio `poll`) ranks the wanted def (`Harness::poll`)
@@ -1666,19 +1682,78 @@ export class ToolHandler {
         const cands = raw
           .filter((n) => CALLABLE.has(n.kind) && !isTestPath(n.filePath))
           .sort((a, b) => (bodyLines(b) > 1 ? 1 : 0) - (bodyLines(a) > 1 ? 1 : 0) || bodyLines(b) - bodyLines(a));
-        // A specific name (<=3 defs) injects all its defs. An overloaded name
-        // (`validate` = 10, `request` = 44) would flood the subgraph, so inject
-        // only: the overloads whose file/class the query ALSO names (the agent
-        // told us which one it wants — DataRequest's, not Validation.swift's),
-        // capped; else fall back to the single most-substantive def. This is the
-        // explore-side mirror of codegraph_node's overload disambiguation.
+        // A token can also name a MODULE by its last segment — the Clojure norm
+        // ("the deactivate stage" = ns `app.page.lifecycle.deactivate`, whose
+        // fns are named per page type). Callable-only resolution makes those
+        // tokens contribute nothing, or worse, latch onto an unrelated same-name
+        // fn in another subsystem. A module match is a strong file pointer.
+        const last = t.toLowerCase();
+        const mods = cg
+          .searchNodes(t, { limit: 20, kinds: ['module', 'namespace'] })
+          .map((r) => r.node)
+          .filter(
+            (n) =>
+              !isTestPath(n.filePath) &&
+              lastQualifierPart(n.name).toLowerCase() === last
+          )
+          .slice(0, 3);
+        perToken.push({ cands, mods });
+      }
+      // Anchor directories: where the SPECIFIC tokens' defs live. The agent's
+      // bag of names describes ONE flow, so its tokens are spatially coherent —
+      // when `on-route-change+` (1 def) lives in app/page/, the `deactivate`
+      // the agent means is app/page/lifecycle's, not the SCIM backend's, even
+      // though the latter has the longer body. Without this, each ambiguous
+      // bare token resolved independently to its most-substantive def anywhere
+      // in the monorepo, dragging wrong-subsystem files into the render budget.
+      const anchorDirs: string[][] = [];
+      for (const { cands, mods } of perToken) {
+        if (cands.length >= 1 && cands.length <= 3) {
+          for (const n of cands) anchorDirs.push(n.filePath.toLowerCase().split('/').slice(0, -1));
+        }
+        for (const n of mods) anchorDirs.push(n.filePath.toLowerCase().split('/').slice(0, -1));
+      }
+      const sharedSegs = (a: string[], b: string[]) => {
+        let i = 0;
+        while (i < a.length && i < b.length && a[i] === b[i]) i++;
+        return i;
+      };
+      const anchorProximity = (n: Node) => {
+        const dir = n.filePath.toLowerCase().split('/').slice(0, -1);
+        let best = 0;
+        for (const a of anchorDirs) best = Math.max(best, sharedSegs(dir, a));
+        return best;
+      };
+
+      // PASS 2 — pick per token. A specific name (<=3 defs) injects all its
+      // defs. An overloaded name (`validate` = 10, `request` = 44) would flood
+      // the subgraph, so inject only: the overloads whose file/class the query
+      // ALSO names (the agent told us which one it wants — DataRequest's, not
+      // Validation.swift's); else the candidate co-located with the anchors
+      // (>=2 shared path segments so a bare repo-root match doesn't count);
+      // else the single most-substantive def. This is the explore-side mirror
+      // of codegraph_node's overload disambiguation.
+      for (const { cands, mods } of perToken) {
         let picks: Node[];
         if (cands.length <= 3) {
           picks = cands;
         } else {
           const ctx = cands.filter(inNamedContext);
-          picks = ctx.length > 0 ? ctx.slice(0, 4) : cands.slice(0, 1);
+          if (ctx.length > 0) {
+            picks = ctx.slice(0, 4);
+          } else if (anchorDirs.length > 0) {
+            // All max-proximity co-located candidates (≥2 shared segments so a
+            // bare repo-root match doesn't count), capped — when the per-stage
+            // overloads of one name all live beside the anchors, they are ALL
+            // the answer (Clojure lifecycle stages, C++ per-backend overrides).
+            const ranked = [...cands].sort((a, b) => anchorProximity(b) - anchorProximity(a));
+            const top = anchorProximity(ranked[0]!);
+            picks = top >= 2 ? ranked.filter((n) => anchorProximity(n) === top).slice(0, 3) : cands.slice(0, 1);
+          } else {
+            picks = cands.slice(0, 1);
+          }
         }
+        picks = picks.concat(mods);
         for (const n of picks) {
           if (!subgraph.nodes.has(n.id)) subgraph.nodes.set(n.id, n);
           // Mark as a named seed EVEN IF the FTS gather already had it — being
@@ -2995,6 +3070,12 @@ export class ToolHandler {
   private matchesSymbol(node: Node, symbol: string): boolean {
     // Simple name match
     if (node.name === symbol) return true;
+    // Clojure keyword nodes (re-frame registrations) are named WITH the
+    // leading colon (`:app/set-page-state`), but agents habitually write the
+    // keyword without it. Match the colon-prefixed form too — gated on the
+    // `/` (namespaced-keyword shape) so a bare name like `dashboard` is never
+    // hijacked by a same-named unqualified keyword (`:dashboard`).
+    if (symbol.includes('/') && node.name === ':' + symbol) return true;
     // File basename match (e.g., "product-card" matches "product-card.liquid")
     if (node.kind === 'file' && node.name.replace(/\.[^.]+$/, '') === symbol) return true;
 
@@ -3091,6 +3172,15 @@ export class ToolHandler {
   private findAllSymbols(cg: CodeGraph, symbol: string): { nodes: Node[]; note: string } {
     let results = cg.searchNodes(symbol, { limit: 50 });
 
+    // A colon-less namespaced keyword (`app/set-page-state` for the re-frame
+    // event `:app/set-page-state`) resolves to the registration node
+    // directly. Gated on the `/` so a bare name like `dashboard` can never be
+    // hijacked by a same-named unqualified keyword.
+    if (!symbol.startsWith(':') && symbol.includes('/')) {
+      const kw = cg.getNodesByName(':' + symbol);
+      if (kw.length > 0) return { nodes: kw, note: '' };
+    }
+
     // Mirror the fallback in `findSymbol` for qualified queries — FTS
     // strips colons, so a module-qualified lookup needs a second pass
     // by the bare last part.

From 3b5216390ed698098d71ee1aa20e738ced31716c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ullrich=20Sch=C3=A4fer?=
Date: Fri, 5 Jun 2026 20:20:33 +0200
Subject: [PATCH 3/3] fix(tests): run vitest workers with --liftoff-only (turboshaft Zone OOM)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Every production launch path already pins V8's WASM compilation to the
Liftoff baseline tier (src/extraction/wasm-runtime-flags.ts, #293/#298) —
but vitest workers didn't, and suites that load every grammar in beforeAll
(extraction.test.ts) could hit the turboshaft Zone OOM during background
grammar compilation. Reproduced reliably on main on an arm64 Mac with
Node 24.16: the tinypool worker dies mid-file ("Worker exited
unexpectedly", 1 unhandled error) and ~90 tests at the end of the file
silently never run.

With execArgv: ['--liftoff-only'] on both pools the full suite runs clean:
59/59 files, 0 unhandled errors, and the previously-vanishing tests
execute.

Co-Authored-By: Claude Opus 4.8
---
 vitest.config.ts | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/vitest.config.ts b/vitest.config.ts
index 4a5ad904b..28011bce1 100644
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -19,6 +19,22 @@ export default defineConfig({
      * there, so the variable is a no-op.
      */
     env: { CODEGRAPH_ALLOW_UNSAFE_NODE: '1' },
+    /**
+     * Keep tree-sitter grammar compilation off V8's turboshaft optimizing
+     * tier inside test workers, exactly as every production launch path does
+     * (see src/extraction/wasm-runtime-flags.ts, issues #293/#298). Without
+     * it, suites that load many grammars (extraction.test.ts loads ALL of
+     * them in beforeAll) can abort the worker with the turboshaft Zone OOM —
+     * observed reliably on an arm64 Mac with Node 24: the worker dies mid-
+     * file and the remaining tests silently never run ("Worker exited
+     * unexpectedly", ~90 tests vanish from the count). The flag must be on
+     * the node command line, so it has to go through execArgv — NODE_OPTIONS
+     * disallows it and runtime v8.setFlagsFromString is too late.
+     */
+    poolOptions: {
+      forks: { execArgv: ['--liftoff-only'] },
+      threads: { execArgv: ['--liftoff-only'] },
+    },
     coverage: {
       provider: 'v8',
       reporter: ['text', 'json', 'html'],