fix(java-extractor): strip inner call args from chained method-call callee#443
fix(java-extractor): strip inner call args from chained method-call callee#443tirth8205 wants to merge 2 commits into
Conversation
…allee
For a chained call like `builder().build()`, the method_invocation's
`object` field is itself a method_invocation node, so building the callee
as `${objectNode.text}.${nameNode.text}` produced the malformed callee
"builder().build" (parentheses and inner args embedded in the name).
Guard the qualified-call branch so it only prefixes the receiver when the
object is not itself a method_invocation. Chained calls now yield a clean
method name ("build") for the outer call plus the existing well-formed
entry for the inner call ("builder"). Simple receivers such as
`System.out.println` (object type field_access) are unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
thejesh23
left a comment
There was a problem hiding this comment.
1. Other receiver node types still leak () into the callee.
The guard only excludes method_invocation, but new Foo().bar() (object type object_creation_expression), ((Foo) x).bar() (parenthesized_expression), and (Foo) x.bar() (cast_expression) all have .text containing ()/cast tokens. The PR's own invariant callee.includes("()") === false will be violated by any of these. Consider an allowlist of "simple" receiver types (identifier / field_access / scoped_identifier / this / super) instead of a single-type denylist.
2. super.foo() and explicit-type-args calls aren't covered by tests.
super.foo() (object type super) and obj.<T>foo() / Foo.<T>bar() (type_arguments between object and name) are common in Java but untested here; please add at least a super.bar() assertion so a future grammar/refactor doesn't silently break qualified-super callees.
3. Dropping the receiver hurts call-graph disambiguation.
Replacing builder().build with bare build collapses every fluent terminal into one node, losing the receiver type entirely; downstream consumers can no longer distinguish a.build() vs b.build(). A receiver-typed form like <chain>.build or recursing into the inner method_invocation to extract its name (e.g. builder.build) would preserve more signal. Worth noting since #435's Dart extractor will hit the same chained/cascade shape.
Nit: the new test only asserts presence of "build" and absence of "()"; an explicit assertion that the inner "builder" callee is also emitted would lock in the "inner call still emits its own entry" claim from the PR description.
The chained-call fix used a single-type denylist (`!== method_invocation`), which still leaked `()`/cast tokens into the callee for other complex receivers: `new Foo().bar()` (object_creation_expression) and `((Foo) x).bar()` (parenthesized_expression) both took the qualified branch and produced callees containing `()`, violating the extractor's own "callee is a clean identifier" invariant. Replace the denylist with an allowlist of simple receiver types (identifier / field_access / scoped_identifier / this / super). Only those get the `<receiver>.<name>` qualified form; everything else collapses to the bare method name. Document the intentional receiver-drop and a TODO for receiver-typed callees / call-graph disambiguation (Egonex-AI#435). Tests: lock the inner `builder` entry of a chain, no-`()` for object-creation and parenthesized/cast receivers, and preserved qualified callees for `super.bar()` / `this.bar()`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed all four points; pushed as 78a3a8f. 1. Other receiver types leaking Minor note: 2. 3. Receiver drop / call-graph disambiguation. Left as a deliberate tradeoff for this PR and documented it with a TODO in Nit. The chained-call test now also asserts Core suite green: 697 passed; |
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 78a3a8fd66
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| */ | ||
| private static readonly SIMPLE_RECEIVER_TYPES = new Set([ | ||
| "identifier", | ||
| "field_access", |
There was a problem hiding this comment.
Reject complex field_access receivers
For chained calls that go through a field before the final method, e.g. builder().result.build() or ((Bar) x).result.build(), the final invocation's object is a field_access whose text still includes the complex receiver (builder().result / ((Bar) x).result). Because field_access is allowlisted here, extractMethodInvocationName still returns callees containing () or cast tokens, so the new cleanup only works for direct builder().build() but not common field-in-chain variants.
Useful? React with 👍 / 👎.
|
@tirth8205 Overall PR looks good to me after recent commit , I have just local test on a small JAVA project. |
Problem
JavaExtractor.extractMethodInvocationNamebuilds the callee for a qualified call as${objectNode.text}.${nameNode.text}. When the receiver (theobjectfield) is itself amethod_invocation— i.e. a chained / fluent / builder-style call such asrepository.findAll().forEach(handler)orbuilder().build()—objectNode.textis the full source text of the inner call including its parentheses and arguments.The result is a malformed callee string that embeds
()and argument text, e.g.:builder().build()-> callee"builder().build"repository.findAll().forEach(handler)-> callee"repository.findAll().forEach"A call-graph callee should be a method name (optionally qualified by a simple receiver), never a string containing
(). This pollutes the call graph with un-resolvable, parenthesis-laden callee names for any fluent/builder/stream-style Java code. The existing qualified-call test only passes becauseSystem.out.printlnhas afield_accessobject (System.out), not amethod_invocationobject.Fix
Guard the qualified-call branch so it only prefixes the receiver when the
objectis not itself amethod_invocation:This keeps
System.out.println(object typefield_access) and plain calls unchanged, and turnsbuilder().build()into calleebuild(the innerbuilder()call still emits its own well-formed entry).Testing
extractCallGraphthat parsesbuilder().build();and asserts a callee"build"exists and that no callee contains"()". This test fails before the fix (the outer call's callee is the malformed"builder().build") and passes after.tsc --noEmitonpackages/coreexits 0 and ESLint on both changed files is clean.🤖 Generated with Claude Code