Skip to content

Analytics engine explain api#7

Draft
finnegancarroll wants to merge 115 commits into
finnegancarroll:mainfrom
mch2:analytics-engine-explain-api
Draft

Analytics engine explain api#7
finnegancarroll wants to merge 115 commits into
finnegancarroll:mainfrom
mch2:analytics-engine-explain-api

Conversation

@finnegancarroll
Copy link
Copy Markdown
Owner

Description

[Describe what this change achieves]

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

vinaykpud and others added 30 commits April 24, 2026 14:30
* Added substrait converter for the fragments

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>

* spotless fix

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>

* fixed pr comments

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>

* fix failing test

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>

---------

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
…project#21350)

ArrayList.removeAll(ArrayList) is O(n*m) due to linear contains() checks. Wrap the argument in HashSet for O(1) lookups, reducing the complexity to O(n). This was causing  CPU spikes on the remote_purge thread when metadata file counts grew large.

Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
Signed-off-by: Andriy Redko <drreta@gmail.com>
…#21249)

Signed-off-by: Divya <divyruhil999@gmail.com>
Co-authored-by: DIVYA2 <DIVYA2@ibm.com>
Co-authored-by: Divya <divyruhil999@gmail.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
… env param instead of org.bouncycastle.fips.approved_only (opensearch-project#21366)

Signed-off-by: Craig Perkins <cwperx@amazon.com>
… LuceneTestCase (opensearch-project#21363)

BlockTransferManagerTests was extending LuceneTestCase directly which causes
sysout check failures since the test uses loggers that print to console.
Changed to extend OpenSearchTestCase which already includes
@SuppressSysoutChecks and follows the project convention for all server tests.

Signed-off-by: Mayank Harsh <mayankmh@amazon.com>
Co-authored-by: Mayank Harsh <mayankmh@amazon.com>
…ject#21128)

* Adding CompositeMergeHandler and CompositeMergePolicy

Signed-off-by: Sagar Darji <darsaga@amazon.com>

# Conflicts:
#	sandbox/plugins/composite-engine/src/main/java/org/opensearch/composite/CompositeIndexingExecutionEngine.java

* Addressing comments

Signed-off-by: Sagar Darji <darsaga@amazon.com>

* Split the monolithic CompositeMergeHandler into classes with clear responsibilities:

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Fix up tests

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Addressing commits

Signed-off-by: Sagar Darji <darsaga@amazon.com>

* Integrating the merge flow with the DataFormatAwareEngine

Signed-off-by: Sagar Darji <darsaga@amazon.com>

# Conflicts:
#	server/src/main/java/org/opensearch/index/engine/DataFormatAwareEngine.java
#	server/src/main/java/org/opensearch/index/engine/exec/coord/CatalogSnapshotManager.java
#	server/src/test/java/org/opensearch/index/engine/exec/coord/CatalogSnapshotManagerTests.java

# Conflicts:
#	server/src/main/java/org/opensearch/index/engine/DataFormatAwareEngine.java

* Addressed the comments

Signed-off-by: Sagar Darji <darsaga@amazon.com>

---------

Signed-off-by: Sagar Darji <darsaga@amazon.com>
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
Co-authored-by: Sagar Darji <darsaga@amazon.com>
Co-authored-by: Bukhtawar Khan <bukhtawa@amazon.com>
…1408)

* Fix incorrect defaults in FieldStorageResolver.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* test fixes

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…nsearch-project#21214)

Bumps [com.nimbusds:nimbus-jose-jwt](https://bitbucket.org/connect2id/nimbus-jose-jwt) from 10.8 to 10.9.
- [Changelog](https://bitbucket.org/connect2id/nimbus-jose-jwt/src/master/CHANGELOG.txt)
- [Commits](https://bitbucket.org/connect2id/nimbus-jose-jwt/branches/compare/10.9..10.8)

---
updated-dependencies:
- dependency-name: com.nimbusds:nimbus-jose-jwt
  dependency-version: '10.9'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Craig Perkins <cwperx@amazon.com>
…project#21291)

Bumps com.google.protobuf from 0.9.6 to 0.10.0.

---
updated-dependencies:
- dependency-name: com.google.protobuf
  dependency-version: 0.10.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sandesh Kumar <sandeshkr419@gmail.com>
Co-authored-by: Craig Perkins <cwperx@amazon.com>
…rch-project#21407)

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
* Onboard issue dedupe workflow (OpenSearch)

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

* Add missing ---

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>

---------

Signed-off-by: Peter Zhu <zhujiaxi@amazon.com>
* Update logic in FIPS bootstrap check

Signed-off-by: Terry Quigley <terry.quigley@sas.com>
…in /plugins/repository-hdfs (opensearch-project#21213)

* Bump org.apache.commons:commons-configuration2

Bumps org.apache.commons:commons-configuration2 from 2.13.0 to 2.14.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-configuration2
  dependency-version: 2.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updating SHAs

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Craig Perkins <cwperx@amazon.com>
* update DefaultPlanExecutor to be async.

This change updates DefaultPlanExecutor to accept an actionlistener
from front-end plugins.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* automated code review fixes

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…ses this timeout during search exit when queries timesout (opensearch-project#21316)

Signed-off-by: Navneet Verma <navneev@amazon.com>
* coord-reduce: end-to-end streaming sink + IT

- DatafusionReduceSink: native streaming ExchangeSink via Substrait + Arrow C-Data
- Wire engine phases (PlanForker, FragmentConversionDriver) in DefaultPlanExecutor
- Convertor: stage-input rewrite, partial-agg/final-agg methods
- LocalStageExecution: outputSource wired to downstream; no premature close
- CoordinatorReduceIT: 2-shard parquet-backed composite index smoke test

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* add memtable implementation

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* checkpoint

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* move test-ppl-plugin to a top lvl plugin to be reused across qa and analytics-engine ITs.

This also removes PushDownPlanner, which is no longer used.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* unwanted changes

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* reduce duplication between sink implementations.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* Minor improvements to coordinator reduce

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Fix up

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Fix up

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* Fix up

Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>

* add explicit drain thread and ensure cpu runtime is used for query execution

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* spotless

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix: seed Calcite THREAD_PROVIDERS in DefaultPlanExecutor.executeInternal to prevent NPE on SEARCH-pool thread

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix merge conflict

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Signed-off-by: Bukhtawar Khan <bukhtawa@amazon.com>
Co-authored-by: Bukhtawar Khan <bukhtawa@amazon.com>
Co-authored-by: Bukhtawar Khan <bukhtawar7152@gmail.com>
…h-project#21425)

* support configurable kafka metadata read timeout value

---------

Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>
…arm tiered storage (opensearch-project#21332)

- TieredStorageSearchSlowLog: SearchOperationListener for slow query/fetch logging on warm data
- TieredStoragePerQueryMetricImpl: per-query cache hit/miss, prefetch, read-ahead tracking
- TieredStorageQueryMetricService: singleton service managing metric collectors across threads
- PrefetchStats: completed with StreamInput constructor, writeTo, getters, toXContent
- TierActionMetrics: migration success/rejection/latency metrics

Modified existing files:
- OnDemandPrefetchBlockSnapshotIndexInput: replaced metric TODOs with actual recording calls
- StoredFieldsPrefetch: added recordStoredFieldsPrefetch calls
- IndexScopedSettings: registered 10 slow log index-scoped settings

Signed-off-by: Mayank Harsh <mayankmh@amazon.com>
…on (opensearch-project#21438)

* Unify FilterOperator into ScalarFunction and add BackendPlanAdapter SPI

Signed-off-by: expani <anijainc@amazon.com>

* Added tests based on review comments and add more docs

Signed-off-by: expani <anijainc@amazon.com>

* Added a catch-all for Category and removed unused function

Signed-off-by: expani <anijainc@amazon.com>

* Ensured Annotations are intact after adaption

Signed-off-by: expani <anijainc@amazon.com>

---------

Signed-off-by: expani <anijainc@amazon.com>
this resolves CVE-2026-40542

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>
… and store (opensearch-project#21232)

* Share single FormatChecksumStrategy instance per shard, between engine and directory

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Minor refactoring

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Added Tests for checkum strategy sharing

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Add checksum read-back assertion to shared strategy test

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Fix indexingEngine call in CompositeIndexingExecutionEngineTests for updated API

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Fix compilation after rebase with upstream Lucene engine plugin

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* Updated DataFormatRegistry tests to increase test coverage

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

* updated the code to use supplier for DataFormatDescriptors for lazy initialization

Signed-off-by: Kamal Nayan <askkamal@amazon.com>

---------

Signed-off-by: Kamal Nayan <askkamal@amazon.com>
Co-authored-by: Kamal Nayan <askkamal@amazon.com>
update log4j2 to 2.25.4

Signed-off-by: Ralph Ursprung <Ralph.Ursprung@avaloq.com>
Signed-off-by: Andriy Redko <drreta@gmail.com>
…21079)

* add parquet merge support through a K way merge streaming merge sort

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* add tests

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* add ColumnMapping optimization

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* fix sync_to_disk test

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* refractor change, add ParquetSortConfig class

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* add InvokeIO in RustBridge

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* address comments and refractor changes

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* run spotlessApply

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* do spotlessApply

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

* add IntegTests, CRC in merge and address comments

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>

---------

Signed-off-by: Shailesh-Kumar-Singh <shaileshkumarsingh260@gmail.com>
…ion (opensearch-project#21000)

Previously, we would only track the size of the last file with a given extension.
---------

Signed-off-by: Thy Tran <58045538+ThyTran1402@users.noreply.github.com>
…search-project#21437)

* Transfer Flight batch to a response-owned root before returning

FlightTransportResponse.nextResponse() returned a reference to the shared
root held by Arrow's FlightStream. FlightStream reuses one root across
batches — the next call to next() clears it and re-loads,
and stream.close() releases its vectors.

Transfer the shared root's vectors into a response-owned VectorSchemaRoot
before returning.

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>

* Reset source row count to 0 after transferRoot

TransferPair.transfer() clears each source vector's valueCount, but
VectorSchemaRoot.rowCount is a separate scalar on the root — the per-vector
transfers don't touch it. Left alone, the source is inconsistent: the root
reports N rows while every field vector has valueCount=0.

Reset source.setRowCount(0) after the transfer loop so the source is fully
empty. Tighten testTransferToMovesBuffers to assert the root-level row count,
not just the vector-level value count.

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>

* update test comment

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>

---------

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
…shard-to-partition mapping (opensearch-project#21165)

* Add partition_strategy setting and PartitionAssignment for flexible shard-to-partition mapping

---------

Signed-off-by: Rishab Nahata <rishabnahata07@gmail.com>
…21461)

The three tests in MergeStatsIT kicked off forceMerge() without
awaiting the returned ActionFuture, then read cross-node
MergedSegmentWarmerStats counters in a single shot. Example
failure [here][1].

Await forceMerge, and wrap the stats collection and assertions in
assertBusy so the counter comparison retries until the warmer
push has landed. For testShardStats the result map is now
re-initialised inside the assertBusy body so a partial earlier
attempt cannot corrupt the next comparison.

[1]: https://build.ci.opensearch.org/job/gradle-check/75593/testReport/junit/org.opensearch.merge/MergeStatsIT/testShardStats/

Signed-off-by: Andrew Ross <andrross@amazon.com>
mch2 and others added 30 commits May 7, 2026 22:33
…h-project#21520)

* Adding sweep of MATH scalar functions.

math scalar functions — 32 ITs covering ABS, ACOS, ASIN, ATAN, ATAN2,
CBRT, CEIL, COS, COSH, COT, DEGREES, E, EXP, EXPM1, FLOOR, LN, LOG, LOG10,
LOG2, PI, POWER, RADIANS, RAND, ROUND, SCALAR_MAX, SCALAR_MIN, SIGN, SIN, SINH,
TAN, TRUNCATE — pushed down through analytics-engine → Substrait → DataFusion.

Rebased on upstream/main with opensearch-project#21476 landed (AbstractNameMappingAdapter +
YEAR / CONVERT_TZ / UNIX_TIMESTAMP). Conflicts resolved by union in:
  - ScalarFunction.java (math enum entries)
  - DataFusionAnalyticsBackendPlugin.java (STANDARD_PROJECT_OPS +
    scalarFunctionAdapters map)
  - DataFusionFragmentConvertor.java (ADDITIONAL_SCALAR_SIGS)
  - opensearch_scalar_functions.yaml (cbrt / cot / pi / random / round /
    signum / trunc signatures)

Migrated SCALAR_MAX / SCALAR_MIN / SIGN off the local RewriteOperatorAdapter
onto the shared AbstractNameMappingAdapter from opensearch-project#21476.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix compile

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)

* Extended UT Framework for E2E Query conversion (opensearch-project#21060)

Changes to test the end to end DSL query conversion without OpenSearch cluster.
Currently adds support for match_all and terms with agg for the initial commit.

Signed-off-by: Suresh N S <nssuresh@amazon.com>

* Addressing the comments from the PR revision

Signed-off-by: Suresh N S <nssuresh@amazon.com>

* Removed the README.md file

Signed-off-by: Suresh N S <nssuresh@amazon.com>

* Fixing spotless check failures

Signed-off-by: Suresh N S <nssuresh@amazon.com>

---------

Signed-off-by: Suresh N S <nssuresh@amazon.com>
…ect#21545)

* Support object field types in analytics search route

  - FieldStorageResolver: recurse into `properties` for implicit-object
    fields so nested mappings (e.g. `city.location.latitude`) flatten to
    dotted leaf columns instead of throwing "has no type in mapping".
  - OpenSearchSchemaBuilder: same recursive flattening for the Calcite
    row type so dotted paths appear as addressable columns.
  - ObjectFieldIT: 8 diagnostic tests covering fields/where/stats on
    object paths. 5 active (fields, where); 3 @AwaitsFix (stats) pending
    sql repo fix to CalciteRelNodeVisitor.containsNestedAggregator.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* update tests

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix tests relying on update in sql

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…arch-project#21422)

Adding Lucene Merge support for pluggable dataformat support

Signed-off-by: Sagar Darji <darsaga@amazon.com>
Co-authored-by: Sagar Darji <darsaga@amazon.com>
…ject#21522)

* Add dynamic settings for Search Request

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* version 2

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Need more docs pls

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Add support for concurrency defaults

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Update concurrency config

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Add more tests

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* refactoring

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* fix tests

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Fix the tests

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* fix spotless

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

* Fixes

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>

---------

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>
…pensearch-project#21556)

* Enable PPL datetime scalar functions on the analytics-engine route

Wire 23 PPL datetime functions through PPL → Calcite → Substrait → DataFusion.
Date-part functions (year/quarter/month/day/hour/minute/microsecond/week plus
aliases) route through DatePartAdapters → DataFusion date_part; niladic
functions (now/current_timestamp/current_date/curdate/current_time/curtime)
route through DateTimeAdapters to the matching DataFusion builtin; convert_tz
and unix_timestamp continue to use their existing Rust UDFs.

Functions whose DataFusion semantics diverge from legacy PPL (SECOND double-vs-int,
DAYOFWEEK Sun=0-vs-1, SYSDATE per-row-vs-query-constant, DATE_FORMAT/TIME_FORMAT
MySQL dialect, STRFTIME POSIX dialect), 2-arg overloads with no DataFusion signature
(FROM_UNIXTIME(epoch,fmt), DATETIME(expr,tz)), EXTRACT (isthmus resolves via
scalar-function lookup rather than native extract), and DATE/TIME/MAKETIME (PPL's
Calcite binding returns VARCHAR rather than DATE/TIME) are deliberately left on
the legacy Calcite path until dedicated Rust UDFs or adapters land.

Verified end-to-end via ScalarDateTimeFunctionIT (23/23 passing) with no Calcite
fallback possible — BaseScalarFunctionIT loads only analytics-engine plugins,
TestPPLTransportAction has only one execution path, and any routing failure
re-throws as RuntimeException.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* Trim verbose ABS/SUBSTRING adapter-map comment

Remove test-name citations that rot as the suite evolves; keep the
WHY (sort-pushdown) and the mechanism (isthmus default catalog).

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* Prune unadvertised datetime enum entries

SYSDATE / SECOND / DAYOFWEEK / DATE_FORMAT / TIME_FORMAT / STRFTIME /
MAKETIME / FROM_UNIXTIME / DATE / TIME / DATETIME were never placed
in STANDARD_PROJECT_OPS and have no other references — dead SPI
surface. Removing them keeps the enum aligned with what the backend
actually advertises; the covering comment in STANDARD_PROJECT_OPS
still records why each one is withheld.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

---------

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
… engine reset (opensearch-project#11869) (opensearch-project#21404)

During resetEngineToGlobalCheckpoint on a segment replication replica,
the ReadOnlyEngine delegates (acquireLastIndexCommit, acquireSafeIndexCommit,
getSegmentInfosSnapshot) synchronized on engineMutex.  This creates a
deadlock cycle when close runs concurrently: the close thread holds
engineMutex and waits for writeLock, while the recovery thread holds
readLock (via recoverFromTranslog) and a refresh listener calls a
delegate that waits for engineMutex.

Keep SetOnce instead of AtomicReference as it already provides
thread-safe reads via its internal AtomicReference, and its
write-once guarantee prevents accidental double-set.  The minimal
fix is just removing synchronized(engineMutex) from the delegates.

Signed-off-by: Sean Chittenden <sean.chittenden@crowdstrike.com>
…tDoesNotLimitExcludedRequests test case (opensearch-project#21565)

Signed-off-by: Andriy Redko <drreta@gmail.com>
…local. (opensearch-project#21569)

This is a temporary work-around to publish latest unified jars
built to integrate with analytics-engine to maven local - we are working off of
a feature branch of sql in the short term.  if the step fails, the build will continue
with jars from maven central. This will be removed once changes for
analytics-engine in sql are merged to mainline.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…earch-project#21513)

* [Analytics Engine] Port json_array_length to DataFusion backend

First PPL json_* function wired through PPL → Calcite → Substrait →
DataFusion. Scaffolds the pattern every follow-up UDF reuses: Rust kernel
+ YAML signature + ScalarFunction enum entry + JsonFunctionAdapters
rename + FunctionMappings.s(...) binding + STANDARD_PROJECT_OPS entry.

Rust UDF (rust/src/udf/json_array_length.rs) coerces the input to Utf8,
parses with serde_json, and returns Int32 to match PPL's
INTEGER_FORCE_NULLABLE declaration — returning Int64 would leak through
column-valued calls even though literal args const-fold via a narrowing
CAST. Malformed / non-array / NULL input → NULL, matching legacy
JsonArrayLengthFunctionImpl's NullPolicy.ANY + Gson parity.

ScalarFunction.CAST added to STANDARD_PROJECT_OPS so PPL's implicit CAST
around a UDF call (inserted when the UDF's declared return type differs
from the eval column's inferred type) doesn't fail OpenSearchProjectRule
with "No backend supports scalar function [CAST]". DataFusion handles
CAST natively — no UDF needed.

STANDARD_PROJECT_OPS and scalarFunctionAdapters reshaped to one-entry-
per-line (Map.ofEntries / Set.of) so parallel json_* PRs append without
touching neighbour lines.

Tests:
  * 10 Rust unit tests (flat/nested arrays, non-array, malformed, NULL,
    coerce_types accept/reject, arity guard, scalar-input fast path).
  * JsonFunctionAdaptersTests guards adapter shape + return-type
    preservation (BIGINT vs LOCAL_OP's INTEGER_NULLABLE).
  * ScalarJsonFunctionIT covers happy path, empty array, non-array
    object → NULL, malformed → NULL via /_analytics/ppl.

Parity-checked against legacy SQL plugin
CalcitePPLJsonBuiltinFunctionIT.testJsonArrayLength.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] JSON: introduce jsonpath-rust parser + shared helpers

Lands the parser crate + a small shared helpers module ahead of the per-
function json_* UDFs. Keeping this on its own commit lets reviewers sign
off on the crate choice (jsonpath-rust 0.7) and path-conversion behaviour
before 8 UDF bodies land on top.

  * rust/Cargo.toml: add jsonpath-rust = "0.7".
  * rust/src/udf/json_common.rs:
      - convert_ppl_path: PPL path syntax (`a{i}.b{}`) -> JSONPath (`$.a[i].b[*]`).
        Mirrors JsonUtils.convertToJsonPath in sql/core. Empty string maps
        to "$" to match legacy root semantics.
      - parse: serde_json wrapper returning None on malformed input, the
        contract every json_* UDF will share.
      - check_arity / check_arity_range: plan_err! wrappers for the
        top-of-invoke guards.
  * rust/src/udf/mod.rs: register the module (helpers are crate-private).

Consumers land in follow-up commits on the same PR (opensearch-project#21513); a module-
level #![allow(dead_code)] keeps this commit's cargo check clean.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_keys to DataFusion backend

Adds the second PPL json_* UDF on top of opensearch-project#21476 (json_array_length).
Matches the legacy SQL-plugin contract: object → JSON-array-encoded keys
in insertion order; non-object / malformed / scalar → SQL NULL.

- Rust UDF at rust/src/udf/json_keys.rs with scalar + columnar paths
- Shared rust/src/udf/json_common.rs helpers (parse, arity, Utf8 downcast,
  PPL-path → JSONPath) seeded for later json_* UDFs
- serde_json preserve_order feature to preserve legacy LinkedHashMap ordering
- Java wiring: ScalarFunction.JSON_KEYS, JsonKeysAdapter, Substrait sig,
  YAML signature, plugin project-op + adapter registration
- ScalarJsonFunctionIT parity test for the four legacy fixtures

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_extract to DataFusion backend

Rust UDF at rust/src/udf/json_extract.rs wraps jsonpath-rust: single path →
unquoted scalar or JSON-serialized container; multi-path → JSON array with
literal null slots for misses. < 2 args, malformed doc, malformed path, and
explicit-null matches all collapse to SQL NULL, matching legacy
JsonExtractFunctionImpl's calcite jsonQuery/jsonValue pair.

JsonExtractAdapter renames the PPL call to the Rust UDF name via the variadic
path; routing lives in FunctionMappings.s(...) in DataFusionFragmentConvertor
and the STANDARD_PROJECT_OPS allow-list.

Also fixes a pre-existing transport bug in DatafusionResultStream.getFieldValue:
VarCharVector.getObject returns Arrow Text, which StreamOutput.writeGenericValue
cannot serialize, so string-valued UDF results (json_keys, json_extract) were
dropped when shard results traveled back to the coordinator. Converting
VarCharVector cells to String at the source mirrors ArrowValues.toJavaValue
and unblocks every string-returning UDF.

Parity IT (ScalarJsonFunctionIT) replays four verbatim legacy cases covering
single-path scalar/container match, wildcard multi-match, multi-path with
missing path, and explicit-null resolution.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_delete to DataFusion backend

Mutation UDF #1. Introduces the shared mutation walker that json_set,
json_append, and json_extend will reuse on the same PR.

Rust side (rust/src/udf/json_delete.rs + json_common.rs):
  * `parse_ppl_segments` tokenises PPL paths (a.b{0}.c{}) into Field /
    Index / Wildcard segments without allocating field names.
  * `walk_mut` drives a mutation closure against every terminal match in
    a serde_json::Value; missing intermediate keys and out-of-range
    indices are silent no-ops, matching Jayway's SUPPRESS_EXCEPTIONS
    behaviour that legacy `JsonDeleteFunctionImpl` (→ Calcite
    `JsonFunctions.jsonRemove`) relies on.
  * `json_delete` terminal closure: `shift_remove` on Object (preserves
    insertion order via serde_json's `preserve_order` feature),
    `Vec::remove` on Array-with-Index, `Vec::clear` on Array-with-Wildcard.
    Any-NULL-arg / malformed doc / malformed path → NULL.

The walker is generic enough that json_set / json_append / json_extend
are now pure terminal-closure swaps (set value, push value, extend
array) — no further traversal plumbing needed.

Java side:
  * JSON_DELETE added to `ScalarFunction`, `STANDARD_PROJECT_OPS`, and
    `scalarFunctionAdapters`.
  * `JsonDeleteAdapter` is a plain `AbstractNameMappingAdapter` rename
    (matches the other json_* adapters).
  * Substrait YAML signature uses `variadic: {min: 1}` — same shape as
    json_extract.

Tests:
  * 10 Rust unit tests for json_delete (4 legacy IT fixtures replayed:
    flat-key, nested, missing-path-unchanged, wildcard-array; plus
    any-NULL / malformed / coerce_types / return_type).
  * 4 new walker tests in json_common (tokeniser, flat-delete,
    missing-noop, wildcard-fan-out, index-out-of-range-noop).
  * ScalarJsonFunctionIT gains `testJsonDeleteParityWithLegacy`
    replaying all 4 legacy assertions.

Parity-checked against legacy SQL plugin
`CalcitePPLJsonBuiltinFunctionIT.testJsonDelete*`.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_set to DataFusion backend

Mutation UDF #2. Reuses the walker introduced by #json_delete; this
commit is a pure terminal-closure swap on the Rust side (replace, not
remove) plus the usual 7-file Java/YAML wiring.

Rust side (rust/src/udf/json_set.rs):
  * Terminal closure overwrites only existing keys on Object
    (`map.contains_key` guard), in-range slots on Array-with-Index, and
    every element on Array-with-Wildcard. This is the replace-only
    semantics from legacy `JsonSetFunctionImpl` (→ Calcite
    `JsonFunctions.jsonSet`, which guards `ctx.set` with
    `ctx.read(k) != null`).
  * Variadic arity: (doc, path1, val1, [path2, val2, ...]). Fewer than
    3 args or an odd total (unpaired trailing path) short-circuits to
    NULL, mirroring the "malformed input → NULL" convention the other
    json_* UDFs follow.
  * Values are always stored as `Value::String` because every arg is
    coerced to Utf8 by `coerce_types` — matches the legacy fixture's
    `"b":"3"` (stringified, not numeric).
  * Root-path (`parse_ppl_segments` returns empty) is a no-op to match
    Jayway's behaviour: `ctx.set("$", v)` silently fails because the
    root is indelible and unreplaceable.

Java side:
  * JSON_SET added to `ScalarFunction`, `STANDARD_PROJECT_OPS`, and
    `scalarFunctionAdapters`.
  * `JsonSetAdapter` is a plain `AbstractNameMappingAdapter` rename.
  * Substrait YAML signature uses `variadic: {min: 1}` — same shape as
    json_extract / json_delete.

Tests:
  * 9 Rust unit tests for json_set (3 legacy IT fixtures replayed:
    wildcard-replace, wrong-path-unchanged, partial-wildcard-set; plus
    multi-pair / any-NULL / malformed-doc / malformed-path /
    coerce_types / return_type).
  * ScalarJsonFunctionIT gains `testJsonSetParityWithLegacy` replaying
    all 3 legacy assertions.

Parity-checked against legacy SQL plugin
`CalcitePPLJsonBuiltinFunctionIT.testJsonSet*`.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_append to DataFusion backend

Mutation UDF #3. Another walker reuse: terminal closure pushes the
paired value onto array-valued targets (non-array / missing targets
are silent no-ops).

Rust side (rust/src/udf/json_append.rs):
  * Terminal closure branches: Object+Field → look up field, if it's an
    Array push the stringified value; Array+Index → if the indexed slot
    is an Array, push; Array+Wildcard → push onto every array-valued
    child. Non-array matches are skipped, matching legacy
    `JsonFunctions.jsonInsert` via Jayway's Collection-parent branch
    (`Collection.add`) which is how `JsonAppendFunctionImpl`'s
    `.meaningless_key` suffix trick ultimately expands.
  * Variadic arity (doc, path1, val1, [path2, val2, ...]). Fewer than 3
    args or an odd total (unpaired trailing path) → NULL — the
    malformed-input-to-NULL convention all other json_* UDFs share.
    Matches legacy's `RuntimeException("needs corresponding path and
    values")` observably-as-error via NULL surface.
  * Pre-stringified values: all args are Utf8-coerced at `coerce_types`
    entry, so nested `json_object(...)` / `json_array(...)` arrive here
    already stringified. They are pushed as `Value::String`, which
    reproduces the legacy IT's quoted-JSON-as-element rows without the
    new engine having to implement `json_object`/`json_array` yet
    (they ship in a follow-up PR).

Java side:
  * JSON_APPEND added to `ScalarFunction`, `STANDARD_PROJECT_OPS`, and
    `scalarFunctionAdapters`.
  * `JsonAppendAdapter` is a plain `AbstractNameMappingAdapter` rename.
  * Substrait YAML signature uses `variadic: {min: 1}` — same shape as
    json_extract / json_delete / json_set.

Tests:
  * 12 Rust unit tests for json_append (3 legacy IT fixtures replayed
    with pre-stringified nested JSON: named-array push, nested-path
    push, stringified-object push; plus multi-pair / wildcard-fan-out /
    non-array-noop / missing-path-noop / any-NULL / malformed-doc /
    malformed-path / coerce_types / return_type).
  * ScalarJsonFunctionIT gains `testJsonAppendParityWithLegacy`
    replaying all 3 legacy assertions with literal stringified JSON in
    place of the nested constructor calls the legacy test uses.

Parity-checked against legacy SQL plugin
`CalcitePPLJsonBuiltinFunctionIT.testJsonAppend`.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

* [Analytics Engine] Port json_extend to DataFusion backend

Mutation UDF #4 — last walker reuse. Same push shape as json_append,
but each paired value is first tried as a JSON-array parse: success →
spread the elements; failure → push the whole string as one element
(parity with legacy `JsonExtendFunctionImpl`'s `gson.fromJson(v,
List.class)` try/fall-back).

Rust side (rust/src/udf/json_extend.rs):
  * Helper `spread(raw) -> Vec<Value>`: returns the parsed items when
    `raw` is a JSON array, else `[Value::String(raw)]`. Scalars,
    objects, and malformed JSON all go through the single-push branch.
  * Terminal closure reuses json_append's array-target guards (Object
    field → Array, Array+Index → inner Array, Array+Wildcard → every
    array child). `Vec::extend(items.iter().cloned())` handles the
    spread and the single-push case uniformly.
  * Variadic arity matches every other mutation UDF. Invalid arity /
    any-NULL / malformed-doc / malformed-path → NULL.

Deliberate divergence from legacy: integer-typed spread elements stay
integers (serde_json preserves source type) rather than being widened
to Double as Gson does. Documented in `json.md:555` but not covered by
any legacy IT; we preserve the more useful default and will file a
tracking issue for the wider Gson-compat decision.

Java side:
  * JSON_EXTEND added to `ScalarFunction`, `STANDARD_PROJECT_OPS`, and
    `scalarFunctionAdapters`.
  * `JsonExtendAdapter` is a plain `AbstractNameMappingAdapter` rename.
  * Substrait YAML signature uses `variadic: {min: 1}` — same shape as
    the other variadic json_* UDFs.

Tests:
  * 13 Rust unit tests for json_extend (3 legacy IT fixtures replayed:
    single-push on non-array value, plain-string push, JSON-array
    spread; plus empty-array-value / mixed-type-spread / wildcard-fan
    / non-array-noop / missing-path-noop / any-NULL / malformed-doc /
    malformed-path / coerce_types / return_type).
  * ScalarJsonFunctionIT gains `testJsonExtendParityWithLegacy`
    replaying all 3 legacy assertions with literal stringified JSON
    standing in for the nested constructor calls the legacy test uses.

Parity-checked against legacy SQL plugin
`CalcitePPLJsonBuiltinFunctionIT.testJsonExtend`.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>

---------

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
…rch-project#21574)

Creates sandbox/libs/analytics-api targeting JDK 21, containing only the
interface surface that downstream plugins (e.g. opensearch-sql) need to
compile against:

- org.opensearch.analytics.exec.QueryPlanExecutor (moved from analytics-framework)
- org.opensearch.analytics.schema.OpenSearchSchemaBuilder (moved from analytics-engine)

Motivation:

analytics-framework and analytics-engine use java.lang.foreign (FFM) APIs
that finalize only in JDK 22, so those modules must target JDK 25. Their
published Gradle Module Metadata declares org.gradle.jvm.version=25,
which blocks consumption from projects that target JVM 21 — most notably
opensearch-sql, which uses QueryPlanExecutor and OpenSearchSchemaBuilder
but does not touch FFM types.

Extracting these two interfaces into a JVM-21-targeted module unblocks
those consumers without forcing the whole sandbox to downgrade (FFM is
load-bearing for the engine).

Packages are unchanged (org.opensearch.analytics.{exec,schema}), so
existing imports in sandbox consumers (analytics-framework, analytics-engine,
dsl-query-executor, test-ppl-frontend) continue to resolve transitively
via the new `api project(':sandbox:libs:analytics-api')` edge in
analytics-framework/build.gradle.

thirdPartyAudit is disabled on analytics-api: it is a compile-time API
surface consumed transitively with the full Calcite runtime provided by
analytics-framework or analytics-engine, so there is no standalone runtime
where Calcite's optional references could load.

Verification:
- ./gradlew check -p sandbox -Dsandbox.enabled=true  (full sandbox check: BUILD SUCCESSFUL)
- analytics-api jar bytecode is class version 65 (JDK 21) as expected.
- Published Gradle Module Metadata declares org.gradle.jvm.version=21.
- Downstream sql prototype compiles against the new artifact from mavenLocal.

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
…napshots to mavenLocal (opensearch-project#21578)

* Pin org.opensearch.query:* (unified-query-*) artifacts to mavenLocal

The remote OpenSearch Snapshots maven repo (ci.opensearch.org/ci/dbc/snapshots)
only republishes from sql/main, not from sql/feature/mustang-ppl-integration,
so its 3.7.0.0-SNAPSHOT jars trail the feature branch by however many merges
(currently missing PPL_REX_MAX_MATCH_LIMIT, CALCITE_ENGINE_ENABLED, …). The
sandbox-check workflow's pre-step opensearch-project#21569 publishes feature-branch unified-query
jars to mavenLocal, but Gradle's default SNAPSHOT resolution weighs the remote's
explicit <buildNumber>/<timestamp> metadata higher than mavenLocal's
<localCopy>true>, so the stale remote wins even when mavenLocal has a newer
<lastUpdated>.

Confirmed via dependencyInsight: every consumer was binding
unified-query-api:3.7.0.0-SNAPSHOT:20260507.224009-12 (60kB, 42 classes, no
PPL_REX_MAX_MATCH_LIMIT field reference) instead of the locally-published
3.7.0.0-SNAPSHOT (29kB, 21 classes, has the field). The runtime cluster
inherited that stale class via the test-ppl-frontend plugin bundle, which
is why every IT touching `rex` failed plan-time with `NullPointerException:
Cannot invoke "java.lang.Integer.intValue()" because the return value of
"Settings.getSettingValue(PPL_REX_MAX_MATCH_LIMIT)" is null` once the
unified path tried to read the setting.

Fix: tell the OpenSearch Snapshots remote to refuse `org.opensearch.query`
artifacts via mavenContent { excludeGroup }. Three sites declare the remote:

  * sandbox/build.gradle subprojects { repositories } — applies to every
    sandbox subproject including qa.
  * sandbox/plugins/analytics-backend-datafusion/build.gradle — own
    declaration; left in place for module isolation, filtered identically.
  * sandbox/plugins/test-ppl-frontend/build.gradle — also pin mavenLocal as
    the only source for org.opensearch.query so the bundlePlugin task
    bundles the freshly-published feature-branch jar rather than the stale
    timestamped one Gradle would otherwise pick.

Verified locally: bundled unified-query-api drops 60kB → 29kB, the
UnifiedQueryContext$Builder constant pool now references PPL_REX_MAX_MATCH_LIMIT,
and RexCommandIT goes 0/16 → 16/16 against the same locally-published jars
the CI workflow already produces.

Drop this filter once the SQL feature branch merges to sql/main and the
remote OpenSearch Snapshots repo catches up — at that point every
3.7.0.0-SNAPSHOT publish will carry the rex max-match default and the
mavenLocal preference becomes redundant.

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Wire mavenLocal into sandbox subprojects + bump analytics-engine to 3.7

CI fallout from the prior commit's `excludeGroup 'org.opensearch.query'`
filter on the OpenSearch Snapshots remote: the parent subprojects block
no longer carried mavenLocal, so analytics-engine's testImplementation /
internalClusterTest configurations had no repository at all serving
org.opensearch.query, failing with `Could not find
org.opensearch.query:unified-query-api:3.6.0.0-SNAPSHOT` (and -core / -ppl).

Two pieces:

1. sandbox/build.gradle subprojects { repositories } — also declare
   mavenLocal scoped to the org.opensearch.query group via mavenContent
   { includeGroup }. mavenLocal becomes the authoritative source for
   unified-query SNAPSHOTs (populated by the sandbox-check workflow's
   publishUnifiedQueryPublicationToMavenLocal pre-step) without leaking
   into resolution for any other group.

2. sandbox/plugins/analytics-engine/build.gradle — bump
   sqlUnifiedQueryVersion from 3.6.0.0-SNAPSHOT → 3.7.0.0-SNAPSHOT.
   The 3.6 jars don't exist in mavenLocal (only the 3.7 feature-branch
   build does), so the older pin was the proximate cause of the CI
   resolution failure. Aligning with test-ppl-frontend's already-3.7
   declaration also keeps the unified-query consumers consistent.

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

---------

Signed-off-by: Jialiang Liang <jiallian@amazon.com>
…umn included writes (opensearch-project#21464)

Signed-off-by: Chaitanya KSR <ksrchai@amazon.com>
…ery-*) snapshots to mavenLocal (opensearch-project#21578)" (opensearch-project#21580)

This reverts commit 36809cc.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…multivalue (mv) functions to analytics-engine route (opensearch-project#21554)

* [Analytics Backend / DataFusion] Wire PPL array constructor + array_length / array_slice / array_distinct / mvjoin

Onboards the PPL `array(a, b, …)` constructor and four array-consuming
functions to the analytics-engine route by mapping their Calcite lowering
targets through Substrait to DataFusion's native make_array / array_length /
array_slice / array_distinct / array_to_string.

Same templated shape as the `replace` PR (opensearch-project#21527), with two extensions:

  ScalarFunction enum constants (5)
    + STANDARD_PROJECT_OPS / ARRAY_RETURNING_PROJECT_OPS membership
    + opensearch_array_functions.yaml extension entries
    + ADDITIONAL_SCALAR_SIGS Calcite-op→Substrait-name bridges
    + scalarFunctionAdapters() entries for the 3 functions that need
      operand normalization
    = onboarded to the analytics route.

Capability lookup at OpenSearchProjectRule keys on the call's return type;
for array-returning functions (`array(...)`, `array_slice`, `array_distinct`)
the return type resolves to `SqlTypeName.ARRAY`, which previously hit
`default → null` in `FieldType.fromSqlTypeName` and emptied the viable-backend
list before the registration could match.

  * `FieldType.ARRAY` added to the analytics SPI enum.
  * `SqlTypeName.ARRAY → FieldType.ARRAY` mapping in `fromSqlTypeName`.
  * `ARRAY_RETURNING_PROJECT_OPS` registered against `Set.of(FieldType.ARRAY)`
    only — separate from `STANDARD_PROJECT_OPS` so `FieldType.ARRAY` doesn't
    pollute filter / aggregate capabilities (no meaningful semantics over
    array-typed values there).
  * `ArrowSchemaFromCalcite.toArrowField` recurses into the component type
    to build the matching Arrow `List<inner>` field — without this the result
    schema would have a bare `List` with no element field and the backend's
    Arrow IPC reader would fail to bind result columns.

Substrait's standard catalog has no array_* entries, so isthmus'
`RexExpressionConverter` would fail with "Unable to convert call …" on every
array call. New `opensearch_array_functions.yaml` declares:

  * `make_array(any1, …)` → `list<any1>` (variadic, min: 0).
  * `array_length(list<any1>)` → `i64?`.
  * `array_slice(list<any1>, i64, i64)` → `list<any1>` (with i32 fallback).
  * `array_distinct(list<any1>)` → `list<any1>`.
  * `array_to_string(list<any1>, string)` → `string?` (with varchar fallback).

Loaded via `SimpleExtension.load("/opensearch_array_functions.yaml")` and
merged into the plugin's extension collection in
`DataFusionPlugin.loadSubstraitExtensions()`.

Substrait's call-conversion path (and DataFusion's signature matcher) is
strict about operand types in ways Calcite's PPL lowering doesn't naturally
satisfy. Three adapters bridge the gap:

  * `MakeArrayAdapter` — implements `ScalarFunctionAdapter` directly
    (not `AbstractNameMappingAdapter`). PPL's `ArrayFunctionImpl` infers
    `ARRAY<commonElementType>` for the call's return type but does NOT
    widen the individual operand types. So `array(1, 1.5)` produces a
    RexCall whose operands are `(INTEGER, DECIMAL(2,1))` but whose return
    type is `ARRAY<DOUBLE>`. Substrait's variadic-`any1` consistency
    validator throws an `AssertionError` in that case (not a recoverable
    exception — it fatally exits the search-thread JVM). The adapter
    extracts the call's component type and CASTs each non-matching
    operand to it before emission.
  * `ArrayToStringAdapter` — declares a local `array_to_string` op and
    name-maps `SqlLibraryOperators.ARRAY_JOIN` → it.
  * `ArraySliceAdapter` — passes the `ARRAY_SLICE` call through unchanged
    but coerces the index operands (positions 1, 2, optional 3) to
    `BIGINT`. PPL's parser types positive integer literals as
    `DECIMAL(20,0)`; DataFusion's `array_slice` signature accepts only
    integer indexes and refuses to coerce decimal arguments.

Two third-party dependencies that surfaced as fatal `NoClassDefFoundError`
during execution of array-returning calls:

  * `commons-text` to analytics-engine — Calcite's `SqlFunctions` class
    statically references `org.apache.commons.text.similarity.LevenshteinDistance`.
    Without it, any Calcite RelNode walk that touches `SqlFunctions.<clinit>`
    poisons the search-thread JVM.
  * `jackson-datatype-jsr310` to **arrow-flight-rpc** (the parent plugin
    that bundles `arrow-vector`). `arrow-vector`'s `JsonStringArrayList`
    eagerly registers `JavaTimeModule` on its ObjectMapper in `<clinit>`,
    so any reader of an Arrow `ListVector` (i.e. every array-returning
    DataFusion call flowing through analytics-engine) hits a fatal
    NoClassDefFoundError. The dep belongs on arrow-flight-rpc's classpath
    because that plugin defines arrow-vector's classloader; bundling it
    in analytics-backend-datafusion (the child plugin) is invisible to
    arrow-vector. Marked `compileOnly` here to avoid jar-hell with
    arrow-flight-rpc's `api` dependency.

  * Before: 1/60 (testArrayWithMix only — exercises an error path that
    fails before the ARRAY capability lookup).
  * After:  9/60.
    Newly passing: testArray, testArrayWithString, testArrayLength,
    testMvjoinWithStringArray, testMvjoinWithStringifiedNumbers,
    testMvjoinWithMixedStringValues, testMvjoinWithStringBooleans,
    testMvjoinWithSpecialDelimiters, testMvjoinWithArrayFromRealFields,
    testMvjoinWithMultipleRealFields.

The remaining 51 failures fall into three buckets:

  * 50 — out-of-scope S1+ functions (`mvfind`, `mvzip`, `reduce`, `transform`,
    `forall`, `filter`, `exists`, `ITEM`). These are PPL UDFs without direct
    DataFusion equivalents and need either lambda-substrait wiring or
    custom UDF registration on the Rust side.
  * 5  — `testMvindexRange*` family. PPL's `mvindex(arr, from, to)` lowers
    to `ARRAY_SLICE(arr, from+1, to+1)` (1-based shift) but the lowering
    is missing the +1, so DataFusion's 1-based array_slice returns a
    window shifted by one. Fix belongs in the SQL plugin's PPL→Calcite
    lowering layer.
  * 1  — `testMvindexRangeMixed` JSON formatting mismatch (test code
    expects bare `[a,b,c]` but the response is `\"[\\\"a\\\",\\\"b\\\",\\\"c\\\"]\"`).

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Fix ARRAY_SLICE 0-based-(start, length) → 1-based-(start, end) for DataFusion

Calcite's `SqlLibraryOperators.ARRAY_SLICE` is the Spark / Hive flavor —
0-based start, third arg is the length-of-elements to take. PPL's
`MVIndexFunctionImp.resolveRange` (in the SQL plugin) emits this form,
e.g. `mvindex(arr=[1..5], 1, 3)` → `ARRAY_SLICE(arr, 1, 3)` meaning
"start at 0-based position 1, take 3 elements" → expected `[2, 3, 4]`.

DataFusion's native `array_slice` is the Postgres / Snowflake flavor —
1-based start, third arg is the inclusive end-index. So the same call
`array_slice(arr, 1, 3)` returns elements at 1-based positions 1..3 →
`[1, 2, 3]`. Off-by-one across every `mvindex` range query.

Convert the operands in the adapter rather than the SQL plugin's PPL
lowering, because the lowering's existing semantics are correct for
Calcite's local executor (used by every non-analytics path); the bug is
only in the bridge to DataFusion.

  start' = start + 1
  end'   = start + length    (== start + 1 + (length - 1))

`MVIndexFunctionImp` already normalizes negative indexes to non-negative
0-based positions before invoking ARRAY_SLICE (it uses `arrayLen + idx`),
so the arithmetic above applies uniformly.

Empirically: `mvindex(arr=[1..5], 1, 3)` now returns the correct values
`[2, 3, 4]` (was `[1, 2, 3]`); negative form `mvindex(arr, -3, -1)`
returns `[3, 4, 5]` (was `[2, 3]`); mixed `mvindex(arr, -4, 2)` returns
`[2, 3]` matching the PPL spec.

The 5 `testMvindexRange*` tests still don't pass on the IT, but for an
unrelated reason — array-typed result values are being returned as
JSON-stringified scalars (`"[2,3,4]"`) instead of typed arrays. That's a
response-formatting issue affecting every array-returning test (also
`testArray`, `testArrayWithString`) and lives in a different code path;
it'll be addressed separately.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Wire PPL ITEM (mvindex single-element) → DataFusion array_element

PPL's `mvindex(arr, N)` single-element form lowers (in `MVIndexFunctionImp.resolveSingleElement`)
to Calcite's `SqlStdOperatorTable.ITEM` operator with a 1-based index (already
converted from PPL's 0-based input). DataFusion's native single-element array
accessor is `array_element` (also 1-based), so a name-mapping adapter + yaml
extension entry are sufficient.

Templated shape:

  ScalarFunction.ITEM (SqlKind.ITEM)
    + STANDARD_PROJECT_OPS membership (returns the array's element type, which
      resolves through the existing FieldType.fromSqlTypeName → SUPPORTED_FIELD_TYPES
      capability lookup for non-array element types — array-of-array is rare in
      PPL and not exercised by the current test surface)
    + scalarFunctionAdapters() entry → ArrayElementAdapter
        ↳ rewrites SqlStdOperatorTable.ITEM to a locally-declared SqlFunction
          named "array_element"
        ↳ coerces the index operand to BIGINT (PPL's parser produces DECIMAL
          for positive integer literals; DataFusion's array_element rejects
          DECIMAL indexes, same as array_slice)
    + ADDITIONAL_SCALAR_SIGS bridge for the locally-declared op
    + opensearch_array_functions.yaml extension entry:
        array_element(list<any1>, i64) → any1?

# Pass-rate (CalciteArrayFunctionIT, force-routed)

  * Before this commit: 9/60.
  * After this commit:  12/60.
    Newly passing: testMvindexSingleElementPositive,
    testMvindexSingleElementNegative,
    testMvindexSingleElementNegativeMiddle.

The other 3 tests that hit the ITEM rejection (testMvfindWith*) are
multi-step queries where ITEM is one node in a tree that also includes
unrelated S1+ functions (mvfind/mvzip/etc.); they remain blocked by
the upstream functions, not by ITEM itself.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Engine] Carry array-typed cells through RowResponseCodec without JSON-stringifying

The row-oriented fragment-execution wire format (`FragmentExecutionResponse`,
used when arrow-flight streaming is disabled — every single-node test cluster
today) shipped each cell through OpenSearch's `writeGenericValue` /
`readGenericValue`, which preserves `List` values as `ArrayList<Object>`. On
the coordinator side, `RowResponseCodec.decode` then re-materialized the rows
into a `VectorSchemaRoot` for `Iterable<VectorSchemaRoot>`-style consumers.

Two bugs in that re-materialization were eating array values:

1. `inferArrowType` walked rows for the first non-null cell and matched
   against {Long, Integer, …, CharSequence, byte[], Number}. {@code List}
   wasn't in the chain, so it fell through to {@code break} and the
   fallback {@link ArrowType.Utf8} — every array column became a VARCHAR
   column.
2. `setVectorValue` for {@link VarCharVector} called {@code value.toString()}.
   For a {@code JsonStringArrayList} that returns the JSON form
   {@code "[2,3,4]"}, which then got serialized as a JSON string in the
   final response. Tests like {@code testMvindexRangePositive} saw their
   array result come back as a string `"[2,3,4]"` instead of an array
   `[2, 3, 4]`.

Fix:

* Replace {@code inferArrowType} with {@code inferField} that returns a
  full {@link Field}. For {@code List} cells, build a list field with the
  inner element type inferred from the first non-null element (with a
  fallback that scans later rows in case the first list is empty/all-null).
* Add a {@code ListVector} arm to {@code setVectorValue} that delegates to
  a new {@code writeListValue}. The writer bypasses {@link UnionListWriter}
  entirely — it writes directly to the list's offset / validity buffers and
  to the inner data vector via the inner vector's typed `setSafe`. The
  writer-based API requires per-element `ArrowBuf` allocations for varchar
  elements that are easy to leak or use-after-free; the direct path is
  simpler and avoids both classes of bug.

Plus a separate Arrow gotcha that surfaced once arrays started flowing
through correctly:

* {@code ListVector.getObject} for a {@code VarCharVector} child returns a
  {@code JsonStringArrayList} whose elements are Arrow's {@link Text} class,
  not Java {@link String}. {@code ExprValueUtils.fromObjectValue} doesn't
  recognize {@code Text} and threw "unsupported object class
  org.apache.arrow.vector.util.Text". {@code ArrowValues.toJavaValue} now
  mirrors its top-level VarChar branch for list cells: when a list value
  comes back from a {@code ListVector}, normalize each {@code Text} element
  to a {@link String} before handing the list upward.

  * Before: 12/60 (mvindex range tests still showed expected-vs-actual
    diff because `[2,3,4]` came back as a JSON string, not an array).
  * After:  26/60.

  Newly passing:
    testMvindexRangePositive, testMvindexRangeNegative, testMvindexRangeMixed,
    testMvindexRangeFirstThree, testMvindexRangeLastThree,
    testMvindexRangeSingleElement,
    testMvdedupWithDuplicates, testMvdedupWithAllDuplicates,
    testMvdedupWithNoDuplicates, testMvdedupWithStrings,
    testArrayWithString,
    testSplitWithSemicolonDelimiter, testSplitWithMultiCharDelimiter,
    testSplitWithEmptyDelimiter.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics] Add SHA + LICENSE files for new bundled deps; spotless

The `dependencyLicenses` precommit task scans `licenses/` for a `<jar>.sha1`
sibling per bundled dependency. Two deps added in this PR were missing them:

  * `commons-text-1.11.0` in analytics-engine — needs sha1 + LICENSE +
    NOTICE (no shared `commons-text-*` license files yet in this plugin).
    Apache 2.0; LICENSE and NOTICE extracted from the released jar.
  * `jackson-datatype-jsr310-2.21.3` in arrow-flight-rpc — sha1 only.
    arrow-flight-rpc's `dependencyLicenses` already maps `jackson-.*` to
    the shared `jackson-LICENSE` / `jackson-NOTICE` files via
    `mapping from: /jackson-.*/, to: 'jackson'`, so no new license/notice
    files are needed.

Plus googleJavaFormat reflow on `ArraySliceAdapter` and `DataFusionPlugin`
that spotlessCheck flagged in precommit.

Verified `:plugins:arrow-flight-rpc:precommit`,
`:sandbox:plugins:analytics-engine:precommit`, and
`:sandbox:plugins:analytics-backend-datafusion:precommit` all succeed.

Addresses review feedback on opensearch-project#21554.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Engine] Map BigDecimal cells to FloatingPoint in row-codec inference

{@code RowResponseCodec.scalarArrowType} ordered its instanceof checks
{Long, Integer, Short, Byte, Double, Float, Boolean, CharSequence, byte[],
Number(fallback) → Int(64)}. BigDecimal extends {@link Number} but isn't any
of the typed scalar arms, so it fell through to the {@code Number} fallback
and got encoded as a 64-bit integer column — silently truncating fractional
digits.

This bites PPL flows whose common element type is {@code DECIMAL} (e.g.
{@code array(1, -1.5, 2, 1.0)} — the v2-side {@code ArrayImplementor.internalCast}
explicitly maps the DECIMAL target to BigDecimal cells). The element values
{@code -1.5} and {@code 1.0} round to {@code -1} and {@code 1} when forced
through Int(64), so the array reads back as {@code [1, -1, 2, 1]} instead of
{@code [1, -1.5, 2, 1.0]}.

Promote BigDecimal cells to FloatingPoint(DOUBLE) — same precision the v2
engine uses for decimal-typed PPL results, so behavior matches across both
execution paths. The list writer's {@code Float8Vector} arm already uses
{@code ((Number) element).doubleValue()}, which correctly extracts the
fractional value from a BigDecimal.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Onboard PPL mvzip via custom Rust UDF

PPL `mvzip(left, right [, sep])` element-wise zips two arrays into a list of
strings, joined per pair by a separator (default `,`). DataFusion has no
stdlib equivalent — `array_concat` is end-to-end concatenation, and Substrait's
lambda support is too thin for a transform/zip rewrite — so this onboards a
custom Rust ScalarUDF on the analytics-backend-datafusion plugin's session
context and wires the Java side to route to it.

Templated shape (extends the existing pattern from convert_tz):

  Rust side:
    udf::mvzip::MvzipUdf — Signature::user_defined; coerce_types pins the
      first two args to ListArray and the optional 3rd to Utf8; invoke_with_args
      iterates per row, takes min(len(left), len(right)) elements, stringifies
      each (matching `Objects.toString(elem, "")` for null elements), and
      builds a List<Utf8>. Defensive Null-element-type arm handles the empty
      array case before the SQL-plugin VARCHAR-default kicks in.
    Registered on each session context via udf::register_all alongside
    convert_tz. 7 unit tests cover the basic / custom-sep / truncation /
    null-element / null-array / empty-array / numeric-array shapes.

  Java side:
    ScalarFunction.MVZIP enum entry (SqlKind.OTHER_FUNCTION; resolves through
      identifier-name valueOf("MVZIP") since PPL's MVZipFunctionImpl registers
      under the function name "mvzip").
    MvzipAdapter — locally-declared SqlFunction("mvzip") + ADDITIONAL_SCALAR_SIGS
      bridge so isthmus emits a Substrait scalar function call with the exact
      name the Rust UDF is registered under.
    DataFusionAnalyticsBackendPlugin: ARRAY_RETURNING_PROJECT_OPS membership
      (returns ARRAY<VARCHAR>, registered against FieldType.ARRAY); adapter
      registration in scalarFunctionAdapters().
    opensearch_array_functions.yaml: two impls for arity-2 and arity-3.

  * Before: 28/60.
  * After:  34/60.

  Newly passing — all 5 testMvzip* variants:
    testMvzipBasic, testMvzipWithCustomDelimiter, testMvzipNested,
    testMvzipWithEmptyArray, testMvzipWithBothEmptyArrays.

  (Test count delta is +6 because the test class also exercises mvzip in 1
  other test under a different name, picked up by the same fix.)

This PR's run also picks up the SQL-plugin companion #5421 which defaults
empty `array()` to ARRAY<VARCHAR>. Without that companion the testMvzipWith*EmptyArray
variants would still fail — substrait would reject the input ARRAY<NULL>
type before reaching the UDF. The Rust UDF's Null-element arm exists as a
defensive backstop in case the call ever reaches it with a null-typed list.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Onboard PPL mvfind via custom Rust UDF

PPL `mvfind(arr, regex)` finds the 0-based index of the first array element
matching a regex pattern (Java `Matcher.find` substring-match semantics), or
NULL if no match. DataFusion has no stdlib equivalent, and rewriting in terms
of array_position requires per-element regex evaluation that's only
expressible with substrait lambda support — out of scope here. Onboards a
custom Rust ScalarUDF on the analytics-backend-datafusion plugin's session
context, mirroring the mvzip/convert_tz pattern.

Templated shape:

  Rust side:
    udf::mvfind::MvfindUdf — Signature::user_defined; coerce_types pins arg 0
      to a list type and arg 1 to Utf8; invoke_with_args walks each row and
      finds the first non-null element whose stringified form matches the
      regex via Rust's `regex` crate (`Regex::is_match` is unanchored, same
      as Java's `Matcher.find`). Scalar pattern operands compile once up
      front and surface invalid-regex errors at plan time (mirrors the SQL
      plugin's plan-time `tryCompileLiteralPattern`); column-valued patterns
      compile per row and yield NULL for invalid patterns. Supports list
      element types Utf8 / Int{8,16,32,64} / UInt{8,16,32,64} / Float{32,64}
      / Boolean / Null. 7 unit tests cover the basic-match / no-match /
      null-array / empty-array / null-element / numeric-array / unanchored
      shapes.
    Registered on each session context via udf::register_all alongside
    convert_tz and mvzip.

  Java side:
    ScalarFunction.MVFIND enum entry (SqlKind.OTHER_FUNCTION; resolves
      through identifier-name valueOf("MVFIND") since PPL's
      MVFindFunctionImpl registers under the function name "mvfind").
    MvfindAdapter — locally-declared SqlFunction("mvfind") +
      ADDITIONAL_SCALAR_SIGS bridge so isthmus emits a Substrait scalar
      function call with the exact name the Rust UDF is registered under.
    DataFusionAnalyticsBackendPlugin: STANDARD_PROJECT_OPS membership
      (returns INTEGER, registered against the existing scalar
      SUPPORTED_FIELD_TYPES); adapter registration in
      scalarFunctionAdapters().
    opensearch_array_functions.yaml: arity-2 impl returning `i32?`.

  * Before: 34/60.
  * After:  42/60.

  Newly passing — 8 of 9 testMvfind* variants:
    testMvfindWithMatch, testMvfindWithFirstMatch, testMvfindWithMultipleMatches,
    testMvfindWithNoMatch, testMvfindWithEmptyArray, testMvfindWithNumericArray,
    testMvfindWithCaseInsensitive, testMvfindWithComplexRegex.

  Remaining mvfind failure:
    testMvfindWithDynamicRegex — fails with "Unable to convert call
    CONCAT(string, string)" because the test computes the pattern via
    `concat('ban', '.*')` and substrait can't bind the CONCAT call. This is a
    separate analytics-engine CONCAT type-conversion issue, not mvfind-specific.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Onboard PPL mvappend via custom Rust UDF

PPL `mvappend(arg1, arg2, …)` flattens a mixed list of array and scalar
arguments into one array, dropping null arguments and null elements within
array arguments. DataFusion's `array_concat` is the closest stdlib match but
only accepts arrays (not mixed array+scalar) and preserves nulls — different
semantics. Onboards as a custom Rust ScalarUDF on the analytics-backend-datafusion
plugin's session context, mirroring the mvzip / mvfind pattern.

Templated shape:

  Rust side:
    udf::mvappend::MvappendUdf — Signature::user_defined; per-row walk over
      operands, skipping NULL args and NULL elements inside array args, with
      explicit Arrow type arms for {Int8/16/32/64, UInt8/16/32/64,
      Float32/64, Boolean, Utf8/LargeUtf8/Utf8View}. The string arms output
      List<Utf8> or List<Utf8View> depending on the inferred element type so
      the result schema matches what `return_type` declared (DataFusion's
      execution-time schema check rejects mismatches). Defensive Null
      element-type arm covers the empty-array shape. 6 unit tests.
    Registered on each session context via udf::register_all.

  Java side:
    ScalarFunction.MVAPPEND enum entry (SqlKind.OTHER_FUNCTION; resolves
      through identifier-name valueOf("MVAPPEND")).
    MvappendAdapter — locally-declared SqlFunction("mvappend") +
      ADDITIONAL_SCALAR_SIGS bridge. Casts every scalar operand to the
      call's array component type and every array operand to
      ARRAY<componentType> before substrait emission, so the UDF sees a
      single uniform element type across all positions.
    DataFusionAnalyticsBackendPlugin: ARRAY_RETURNING_PROJECT_OPS membership
      (returns ARRAY<commonType>); adapter registration in
      scalarFunctionAdapters().
    opensearch_array_functions.yaml: variadic min:1 entry with `list<any1?>`
      return type.

  * Before: 0/15.
  * After:  6/15.

  Newly passing:
    testMvappendWithMultipleElements, testMvappendWithSingleElement,
    testMvappendWithArrayFlattening, testMvappendWithStringValues,
    testMvappendWithNestedArrays, testMvappendWithRealFields.

  * 8 tests fail with "Unable to convert the type ANY". Root cause is
    PPL's MVAppendFunctionImpl.updateMostGeneralType using strict
    Object.equals on each pair of operand types, returning Calcite's
    ANY type when any two don't match — including when they only differ
    in nullability tag (a literal 3 is INTEGER NOT NULL but the
    component type of `array(1, 2)` is INTEGER NULLABLE). Substrait
    can't serialize ANY. The fix belongs in the SQL plugin's
    MVAppendFunctionImpl (use typeFactory.leastRestrictive instead of
    Object.equals) and isn't addressed here.
  * testMvappendInWhereClause — uses `where array_length(combined) = 2`
    which the analytics-engine planner rejects with "No backend can
    evaluate filter predicate [EQUALS] on fields [combined:ARRAY]".
    Filter-side capability gap unrelated to mvappend.
  * testMvappendWithComplexExpression — fails substrait conversion on
    a nested mvappend call ("Unable to convert call mvappend(list, …)"),
    likely the same nullability widening pattern flowing through nested
    calls. Same upstream fix applies.

  Unchanged at 43/60 — mvappend isn't exercised there.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Reshape mvappend operands as uniform lists; add Decimal128 element support

Two follow-ons to the initial mvappend onboarding (40b2161), both surfaced
once the SQL companion opensearch-project#5424 (`MVAppendFunctionImpl.leastRestrictive`) let
homogeneous-type calls reach substrait conversion.

# Uniform-list operand reshape

Substrait's variadic-`any1` argument shape requires every operand at the same
variadic position to share a type. PPL's `mvappend(arg, …)` accepts a mix of
bare scalars and arrays, which substrait's signature matcher rejected with
`Unable to convert call mvappend(list<i32?>, i32?, i32?)`.

`MvappendAdapter` now wraps each scalar operand in a singleton
`make_array(scalar)` call (using the locally-declared `MakeArrayAdapter.LOCAL_MAKE_ARRAY_OP`)
so by the time the substrait converter sees the operands they're uniformly
`list<componentType>`. The yaml impl was correspondingly tightened from
`args: [{ value: any1 }] variadic` to `args: [{ value: list<any1?> }] variadic`.

Rust UDF (`udf::mvappend`) keeps its scalar-handling branch intact as a
defensive fallback, but in practice every operand it sees is a list now.

# Decimal128 element type

Calcite's leastRestrictive widening on INT + DECIMAL produces DECIMAL(p, s)
which substrait converts to Decimal128(p, s); the Java adapter casts every
operand's element type to that. The Rust UDF needed an explicit
`DataType::Decimal128(p, s)` branch — Decimal128Builder requires
`.with_precision_and_scale(p, s)` configuration before use, and Decimal128Array
elements are read via the `i128`-valued `value(i)` accessor (not via the
generic `build!` macro).

# Pass-rate (CalciteMVAppendFunctionIT, force-routed, with companion opensearch-project#5424 applied)

  * Before this commit: 6/15 (initial mvappend onboarding).
  * After this commit:  10/15.

  Newly passing:
    testMvappendWithMixedArrayAndScalar (uniform-list reshape),
    testMvappendWithComplexExpression (uniform-list reshape),
    testMvappendWithIntAndDouble (Decimal128 element),
    testMvappendWithNumericArrays (Decimal128 element).

  Remaining 5 failures:
    * testMvappendWithMixedTypes / WithFieldsAndLiterals / WithEmptyArray /
      WithNull — call legitimately widens to ARRAY<ANY> because operands
      contain pairs of types with no common widened type (INT + VARCHAR).
      The Calcite engine handles ANY via Object generic dispatch; substrait
      can't encode it. Out of scope without changing PPL UDF semantics.
    * testMvappendInWhereClause — uses `where array_length(combined) = 2`
      which the analytics-engine planner rejects with "No backend can
      evaluate filter predicate [EQUALS] on fields [combined:ARRAY]".
      Filter-side capability gap unrelated to mvappend.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Register UDFs on FFM-created session contexts

create_session_context (the Rust-side builder behind df_create_session_context)
built a fresh DataFusion SessionContext but never called udf::register_all on
it. Every fragment query routed through df_execute_with_context reused that
handle's ctx via query_executor::execute_with_context, so substrait function
references to mvappend / mvfind / mvzip / convert_tz failed planning with
"This feature is not implemented: Unsupported function name". The matching
register_all call exists in execute_query / local_executor / indexed_executor
— this just brings the FFM session-context path to parity.

Verified: CalciteMVAppendFunctionIT against the analytics-engine route now
passes 10/15 (was 0/15) with the SQL companion opensearch-project#5424 widening fix applied.
The remaining 5 are pre-existing ARRAY<ANY>/UNKNOWN substrait-encoding gaps
(heterogeneous mvappend signatures, empty-array default, filter-on-array
predicate) tracked in this PR's "What's left" section.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [Analytics Backend / DataFusion] Don't let substrait AssertionError kill the cluster

Substrait's plan validators (VariadicParameterConsistencyValidator,
RelOptUtil.eq via Litmus.THROW, etc.) throw AssertionError directly via
explicit `throw new AssertionError(...)` rather than via the `assert`
keyword, so the JVM -da flag doesn't gate them. When a malformed plan
triggers one inside a search-thread call to SubstraitRelVisitor.apply,
the AssertionError propagates uncaught up the analytics-engine fragment
handler stack, OpenSearchUncaughtExceptionHandler classifies it as fatal,
and the entire cluster JVM exits.

Wrap the visitor.apply call in a narrow try/catch that re-raises the
AssertionError as IllegalStateException with the original message and
cause preserved. The analytics-engine error path already buckets
IllegalStateException at the fragment boundary into a normal HTTP 500
response — the cluster stays up and the failure shows in the per-query
report instead.

This came up while diagnosing CalciteMVAppendFunctionIT failures: malformed
ARRAY<ANY> plans were taking down the cluster mid-test instead of producing
per-test failures, masking the underlying substrait conversion error.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* [QA] Add ArrayFunctionIT + MVAppendFunctionIT for analytics-engine REST path

Self-contained QA ITs in sandbox/qa/analytics-engine-rest exercising the
PPL collection functions onboarded in this PR through POST /_analytics/ppl
against a parquet-backed `calcs` dataset, no SQL plugin checkout required.

ArrayFunctionIT (22 tests):
  - array constructor (mixed-numeric BigDecimal → Double promotion + int+string)
  - array_length
  - mvindex range (array_slice — 0-based-(start, length) → 1-based-(start, end))
  - mvindex single (array_element via ITEM rename)
  - mvdedup (array_distinct)
  - mvjoin (array_to_string rename)
  - mvzip (Rust UDF, default + custom delimiter + nested)
  - mvfind (Rust UDF, match / no-match / dynamic regex via concat() Sig bridge)
  - split (returns array)

MVAppendFunctionIT (6 tests):
  - uniform-typed scalar variadic (multiple, single, string)
  - array operands (flattening, nested string arrays)
  - VARCHAR field references via real calcs row

Tests gated on SQL companion opensearch-project#5424 (testMvappendWith{IntAndDouble,
MixedArrayAndScalar, NumericArrays, ComplexExpression}) are intentionally
absent — they fail with "Unable to convert the type ANY" until
MVAppendFunctionImpl's leastRestrictive widening + DECIMAL→DOUBLE
promotion + operand pre-cast is published as unified-query-core. A
top-of-class block lists them with a pointer back to opensearch-project#5424.

Lambda-based functions (transform, mvmap, reduce, forall, exists, filter)
and empty-array operands are absent for the architectural reasons in this
PR's "What's left" section: substrait extension YAML doesn't support
declaring func<…> lambda-typed args, and array() defaults to ARRAY<UNKNOWN>
which substrait can't encode without #5421.

Local verification (per `docs/dev/ppl-analytics-engine-routing.md` SOP):
- :sandbox:qa:analytics-engine-rest:integTest --tests "*ArrayFunctionIT" — 22/22
- :sandbox:qa:analytics-engine-rest:integTest --tests "*MVAppendFunctionIT" — 6/6
- :check -p sandbox — all 718 tasks green

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>
…de for distributed execution (opensearch-project#21457)

* feat(spi): add intermediateFields and finalExpression to AggregateFunction

Extend AggregateFunction enum with two nullable fields that describe
how each function decomposes into partial+final phases for distributed
execution:

  - intermediateFields: List<IntermediateField> — Arrow schema of the
    partial output per field, with per-field reducer (SUM, DC, etc.)
  - finalExpression: BiFunction<RexBuilder, List<RexNode>, RexNode> —
    scalar expression over partial columns for primitive-decomposition
    cases (AVG = sum/count)

Four encoding cases are expressible on the same shape:
  - null / null              → pass-through (SUM, MIN, MAX)
  - [one field, reducer==self], null → engine-native merge (DC)
  - [one field, reducer!=self], null → function-swap at FINAL (COUNT→SUM)
  - [N fields], non-null     → primitive decomp + Project (AVG)

This makes AggregateFunction the single source of truth for per-function
decomposition knowledge. Downstream layers (DAGBuilder,
FragmentConversionDriver, LocalStageScheduler) no longer need any
per-function if-branches — they read the enum via the decomposition
resolver.

Self-reference (APPROX_COUNT_DISTINCT as its own reducer) works around
Java's enum-initializer restriction by using null as a "self" sentinel
in the raw field; the intermediateFields() accessor resolves null back
to the owning constant.

Ref: .kiro/docs/distributed-aggregate-design.md §6
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(planner): add ArrowCalciteTypes utility for bidirectional type mapping

Single source for Arrow↔Calcite type conversion in the planner:
  Int(64) ↔ BIGINT
  Int(32) ↔ INTEGER
  FloatingPoint(DOUBLE) ↔ DOUBLE
  FloatingPoint(SINGLE) ↔ REAL/FLOAT
  Utf8 ↔ VARCHAR(max)
  Binary ↔ VARBINARY(max)
  Bool ↔ BOOLEAN

This utility will be consumed by the AggregateDecompositionResolver to
derive Calcite aggregate-call return types from AggregateFunction.
intermediateFields — replacing scattered calls to Calcite's stock
SqlAggFunction.inferReturnType (which produces inconsistent types for
AVG's sum field and DC's binary sketch between the Calcite view and
DataFusion's actual output).

Uses JDK 21 switch expressions with pattern matching for type
discrimination. Unsupported types throw IllegalArgumentException with
the offending type in the message.

Note on VARCHAR/VARBINARY: we pass Integer.MAX_VALUE to request
unbounded precision; Calcite's type factory clamps to its own internal
max (65536 by default). The tests assert equality with the factory's
reported max, since that's the invariant we actually care about —
'unbounded' VARCHAR/VARBINARY.

Ref: .kiro/docs/distributed-aggregate-design.md §8
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(datafusion-rust): add agg_mode module and disable CombinePartialFinalAggregate

Introduce the Rust physical-layer infrastructure for distributed
partial/final aggregate execution:

  - agg_mode.rs: internal Mode enum (Default | Partial | Final),
    force_aggregate_mode() walks Final(Partial(...)) trees and strips
    one half (preserving RepartitionExec / CoalescePartitionsExec
    between), find_partial_input() drills past repartition nodes to
    locate the inner Partial.
  - physical_optimizer_rules_without_combine(): returns the default
    DataFusion physical optimizer rules with CombinePartialFinalAggregate
    removed by name, so the Final(Partial(...)) pair survives to our
    strip pass.
  - SessionContext (shard) and LocalSession (coordinator) now build
    with the new optimizer rule set.
  - SessionContextHandle gains two fields for the prepared-plan path:
      aggregate_mode: Mode (default Default)
      prepared_plan: Option<Arc<dyn ExecutionPlan>> (default None)
    These will be written by prepare_partial_plan / prepare_final_plan
    entry points in a follow-up commit.

Invariants enforced:
  - Mode is pub(crate) — never exposed across FFI.
  - No new pub extern "C" fn signatures here (that's Task 5).
  - No Java files modified.
  - No mode enum int crosses the boundary.

Five Rust unit tests cover strip-partial, strip-final, past-repartition,
past-coalesce, and combine-rule-absent.

DataFusion API note: default rules are obtained from
PhysicalOptimizer::new().rules; with_physical_optimizer_rules on
SessionStateBuilder fully replaces the optimizer (no additive behavior
needed).

Ref: .kiro/docs/distributed-aggregate-design.md §13
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(planner): add AggregateDecompositionResolver pass

The one place per-function decomposition logic lives. Walks the DAG
after backend resolution and rewrites PARTIAL/FINAL aggregate pairs
driven entirely by the AggregateFunction enum's intermediateFields
and finalExpression.

Four cases handled uniformly, no function-name branches:

  1. Pass-through (intermediateFields == null): SUM, MIN, MAX stay
     unchanged; FINAL's arg is rebound to the next column.
  2. Engine-native merge (one field, reducer == self): DC keeps its
     aggregate call at FINAL; exchange row type on StageInputScan
     carries VARBINARY for the sketch column.
  3. Function-swap (one field, reducer != self): COUNT's FINAL
     becomes SUM(count_col); PARTIAL keeps COUNT.
  4. Primitive decomposition (N fields, finalExpression != null):
     AVG's PARTIAL emits SUM(count)+SUM(sum); FINAL reduces each
     with SUM; LogicalProject wraps FINAL to apply finalExpression
     (sum/count) cast to the original call's return type.

Integration point: DefaultPlanExecutor.executeInternal between
BackendPlanAdapter.adaptAll and FragmentConversionDriver.convertAll.

Calcite constraint noted: Aggregate.typeMatchesInferred asserts each
AggregateCall's type matches SqlAggFunction.inferReturnType(...). We
cannot retype PARTIAL's calls directly. Instead, the exchange
contract (what flows over the wire and what FINAL reads) is carried
by the parent stage's StageInputScan row type, derived from
intermediateFields via ArrowCalciteTypes. DataFusion's compiler
ignores the Substrait-declared PARTIAL row type for well-known
aggregates (it uses its own internal StateFields), so Calcite's
inferred types at PARTIAL are harmless.

Seven unit tests cover: pass-through SUM, function-swap COUNT,
engine-native DC, primitive-decomp AVG, mixed Q10, group-keys flow
through, no-Calcite-inference regression guard.

Ref: .kiro/docs/distributed-aggregate-design.md §7
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(datafusion): FFI prepare_partial_plan / prepare_final_plan / execute_local_prepared_plan

Three named FFI entry points wiring the prepared-plan lifecycle
between Java and Rust. No mode enum crosses the boundary \u2014 Rust
sets Mode::Partial / Mode::Final internally based on which function
Java called.

Rust side:
  - SessionContextHandle gets prepare_partial_plan(): decodes
    Substrait bytes, converts to physical plan, applies
    agg_mode::apply_aggregate_mode(Partial), stores in
    handle.prepared_plan.
  - LocalSession gains a prepared_plan field + prepare_final_plan()
    + execute_prepared() that streams from the stored plan.
  - ffm.rs exposes three new pub extern "C" fn entries following
    the existing convention (i64 return; >=0 success, <0 negated
    error pointer):
      df_prepare_partial_plan(handle_ptr, bytes_ptr, bytes_len)
      df_prepare_final_plan(session_ptr, bytes_ptr, bytes_len)
      df_execute_local_prepared_plan(session_ptr)

Java side:
  - NativeBridge adds three MethodHandle fields + three public
    wrappers (preparePartialPlan, prepareFinalPlan,
    executeLocalPreparedPlan) following the existing FFM pattern
    used by EXECUTE_WITH_CONTEXT.

Invariants enforced:
  - grep 'Mode' ffm.rs: zero matches in any pub extern "C" fn.
  - No aggregate_mode reference on the Java side.
  - agg_mode.rs untouched.
  - Mode stays pub(crate).

Tests:
  - Rust: prepare_partial_plan_sets_mode_and_stores_plan,
    prepare_final_plan_stores_plan (484 total Rust tests pass).
  - Java: NativeBridgePreparedPlanTests validates null-pointer
    handling and confirms MethodHandles resolve against the native
    symbols.

Substrait decoding: uses the existing datafusion-substrait
from_substrait_plan(&state, &plan) pattern already present in
execute_substrait / execute_with_context.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a74.3, \u00a711.3, \u00a713
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(datafusion): wire prepared-plan path into coordinator reduce sink

Completes Task 6 of the distributed-aggregate feature: the reduce
sink now consumes the DataFusionReduceState produced by
FinalAggregateInstructionHandler and drives executeLocalPreparedPlan
against the handler's session instead of re-decoding the fragment
bytes on a fresh session.

  AbstractDatafusionReduceSink
    - New constructor taking nullable DataFusionReduceState. When
      present, the state owns session + senders; otherwise the base
      class creates a session as before.
    - close() skips closing session when state != null so the state's
      close() handles it (avoids double-close on the native side).

  DatafusionReduceSink
    - New 3-arg constructor (ctx, runtimeHandle, preparedState).
    - When state is non-null: reuse state's session + senders (indexed
      back to childStageId via ctx.childInputs() iteration order) and
      call executeLocalPreparedPlan.
    - When state is null: legacy path unchanged (register partitions,
      executeLocalPlan). This path is exercised by non-aggregate
      reduce stages where no FinalAggregate instruction ran.

  DataFusionAnalyticsBackendPlugin.getExchangeSinkProvider
    - Casts backendContext to DataFusionReduceState and passes it
      through. Memtable sink is bypassed when a preparedState is
      present (memtable sink doesn't yet support prepared-plan path).

  DataFusionInstructionHandlerFactory
    - Missing import for PartialAggregateInstructionNode added.

  FilterDelegationForIndexFullConversionTests (mock)
    - Updated to new ExchangeSinkProvider.createSink two-arg signature.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a711
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(datafusion): emit approx_distinct via Substrait extension YAML

Replace the post-emit proto-rewrite (renameExtensionFunctions +
FUNCTION_RENAMES) with a declarative Substrait extension approach:

  - New resource: opensearch_aggregate_functions.yaml
      URN  extension:org.opensearch:aggregate_functions
      declares approx_distinct(any) -> i64

  - DataFusionPlugin.loadSubstraitExtensions now merges the aggregate
    YAML alongside delegation and scalar YAMLs.

  - DataFusionFragmentConvertor:
      - New ADDITIONAL_AGGREGATE_SIGS binding the custom
        APPROX_DISTINCT SqlAggFunction to the extension name
        "approx_distinct".
      - Custom APPROX_DISTINCT operator avoids collision with
        Substrait's default approx_count_distinct mapping (default
        catalog shadows FunctionMappings.Sig entries on the stock
        Calcite operator).
      - New rewriteApproxCountDistinct RelShuttle swaps
        SqlStdOperatorTable.APPROX_COUNT_DISTINCT to APPROX_DISTINCT
        on every AggregateCall before Substrait emission. Type is
        preserved (BIGINT NOT NULL) so Aggregate.typeMatchesInferred
        passes.
      - Switched to the 4-arg AggregateFunctionConverter constructor
        to pass ADDITIONAL_AGGREGATE_SIGS.
      - Deleted FUNCTION_RENAMES map + renameExtensionFunctions proto
        walk + serialize-time rename calls.

  - Test extensions catalog now loads the aggregate YAML alongside
    delegation.

Why the custom operator instead of a direct Sig on the stock function:
Substrait's default catalog declares approx_count_distinct under the
standard URN, so isthmus resolves stock Calcite APPROX_COUNT_DISTINCT
through the default mapping first and our additional Sig is shadowed.
A fresh SqlAggFunction with no prior mapping routes cleanly through
ADDITIONAL_AGGREGATE_SIGS.

All 14 DataFusionFragmentConvertor tests pass (including the now
type-correct testApproxCountDistinctRenamed which asserts both
absence of approx_count_distinct AND presence of approx_distinct in
the emitted extension declarations).

Ref: .kiro/docs/distributed-aggregate-design.md \u00a710
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(spi+execution): complete handler infrastructure for prepared-plan path

Hooks up the last mile of the distributed-aggregate instruction-handler
chain. Completes Task 6 by landing the scaffolding that was on the pf4
tree as uncommitted modifications:

  ExchangeSinkProvider.createSink
    Signature change: (ExchangeSinkContext) -> (ExchangeSinkContext,
    BackendExecutionContext). Sink providers receive the backend-opaque
    state produced by the last instruction handler, so the reduce sink
    can drive the already-prepared plan instead of re-decoding the
    fragment bytes.

  FinalAggregateInstructionHandler
    Real implementation (was TODO-only). Creates the DatafusionLocalSession,
    registers one input partition per child stage via
    NativeBridge.registerPartitionStream, calls
    NativeBridge.prepareFinalPlan, and returns a DataFusionReduceState
    bundling session + runtime + senders for the reduce sink.
    Failure path closes any partially-allocated senders + session before
    rethrowing.

  PartialAggregateInstructionHandler (new class)
    Calls NativeBridge.preparePartialPlan on the already-open session
    created by the preceding ShardScanInstructionHandler. Rust side
    sets Mode::Partial and stores the prepared plan on the handle.

  DataFusionReduceState (new class)
    BackendExecutionContext implementation carrying the local session,
    native runtime handle, and partition-sender list. close() tears
    down senders first, then the session, matching the allocation order
    in FinalAggregateInstructionHandler's failure path.

  LocalStageScheduler
    Hoists backendContext out of the try-finally so it's in scope for
    the provider.createSink(context, backendContext) call. Sink is now
    responsible for holding / closing backendContext on success; the
    scheduler only closes it on instruction-apply or sink-creation
    failure.

With these changes the end-to-end path is now connected:
  ShardScan -> opens SessionContext
  PartialAggregate -> NativeBridge.preparePartialPlan (sets Mode::Partial,
                       stores plan) — executed by DataFusion when the
                       scan driver runs its Substrait execute_with_context
  FinalAggregate -> creates LocalSession, registers senders,
                     NativeBridge.prepareFinalPlan (stores Final-stripped
                     plan)
  Sink receives DataFusionReduceState -> NativeBridge.executeLocalPreparedPlan

Ref: .kiro/docs/distributed-aggregate-design.md \u00a711, \u00a714
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(datafusion): register APPROX_COUNT_DISTINCT aggregate capability

Add APPROX_COUNT_DISTINCT to the DataFusion backend's aggregate
capability set so the planner recognises it as a supported aggregate
on this backend (alongside SUM, SUM0, MIN, MAX, COUNT, AVG).

Switch the capability loop from AggregateCapability.simple(...) to the
3-arg AggregateCapability constructor:
  - The per-type factory helpers (simple, approximate, statistical,
    stateExpanding) assert on AggregateFunction.Type. SUM et al. are
    SIMPLE; APPROX_COUNT_DISTINCT is APPROXIMATE — splitting by type
    would branch the loop.
  - The 3-arg constructor accepts any Type and leaves
    decomposition=null so the AggregateDecompositionResolver falls
    back to the enum's intermediateFields + finalExpression (the
    single source of truth for partial/final decomposition).
  - No per-function if/else anywhere: adding future functions
    (STDDEV_POP, VAR_POP) is a one-line addition to AGG_FUNCTIONS.

No other changes: the existing cartesian product across SUPPORTED_FIELD_TYPES
is unchanged. Calcite's operand-type checker filters nonsensical
combinations (e.g. SUM on VARCHAR) at planning time.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a75.4, \u00a714
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test: extend CoordinatorReduceIT with partial/final aggregate coverage

Adopts pf2's CoordinatorReduceIT test shapes (ported to the REST-based
AnalyticsRestTestCase framework already in sandbox/qa/analytics-engine-rest)
to exercise every branch of the AggregateDecompositionResolver's
four-case rewrite end-to-end:

  testScalarSumAcrossShards      (pass-through)         — pre-existing
  testScalarCountAcrossShards    (function-swap COUNT→SUM)
  testAvgAcrossShards            (primitive decomp + Project)
  testDistinctCountAcrossShards  (engine-native HLL merge)
  testGroupedSumAcrossShards     (group keys flow through)
  testQ10ShapeAcrossShards       (all four families, grouped)

testQ10ShapeAcrossShards is unignored in pf4. pf2 had @Ignore on this
test because its decomposeFinalFragment mishandled parent Project
expressions after decomposition (the scattered per-layer rewrite that
pf4 replaces). pf4's single-pass AggregateDecompositionResolver builds
the Project wrapper in-place from intermediateFields + finalExpression
at the point of decomposition, so Q10 works end-to-end.

Implementation refactors:
  - extracted scalarRows() helper used by each scalar-shape test to
    assert column presence, row count, and non-null cell; tests assert
    only the value semantics.
  - indexing split into constant-value (for SUM/COUNT/AVG predictability)
    and varying-value (needed for DC to have a meaningful cardinality)
    helpers sharing a single bulkAndRefresh path.
  - parquet-backed index creation parameterised by name so DC uses its
    own index (distinct values) without colliding with other tests'
    constant-value data.

Runs via:
  ./gradlew :sandbox:qa:analytics-engine-rest:integTest -Dsandbox.enabled=true

Ref: .kiro/docs/distributed-aggregate-design.md \u00a720 testing matrix
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(datafusion-rust): disable CombinePartialFinalAggregate on IndexedExec SessionContext

Invariant 7 audit turned up one SessionContext construction site
that bypassed physical_optimizer_rules_without_combine() — the
indexed_executor path used for IndexedExec queries. Without the
custom rule set, CombinePartialFinalAggregate can collapse
Final(Partial(...)) pairs, defeating force_aggregate_mode on any
aggregate query routed through this executor.

Aligns with session_context.rs (shard, both callsites) and
local_executor.rs (coordinator reduce), which already configure the
same helper.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a717 invariant 7
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* style: apply spotless formatting

Mechanical run of spotlessApply across analytics-framework,
analytics-engine, and analytics-backend-datafusion. One non-mechanical
fix: replaced the wildcard static import in AggregateFunctionTests
with explicit per-constant imports (spotless can't auto-fix wildcards).

No behavior change. All targeted unit tests remain green.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* scaffold(planner): OpenSearchAggregateReduceRule (wiring deferred)

Introduces the subclass of Calcite's AggregateReduceFunctionsRule that,
when wired into PlannerImpl's HEP marking phase, will decompose AVG /
STDDEV_POP / STDDEV_SAMP / VAR_POP / VAR_SAMP into primitive SUM/COUNT
(+ SUM_SQ for variance) calls wrapped by a scalar Project — replacing
most of AggregateDecompositionResolver's primitive-decomp logic with
Calcite's tested implementation.

Customisations over the stock rule:
  - matches OpenSearchAggregate only (not LogicalAggregate)
  - restricts to AggregateMode.SINGLE so it does not re-fire on
    PARTIAL/FINAL produced by OpenSearchAggregateSplitRule
  - overrides newAggregateRel to rebuild as OpenSearchAggregate,
    preserving mode and viableBackends so downstream marking /
    split rules continue to pattern-match on the reduced inner
    aggregate

Wiring deferred: when this rule fires during marking, Calcite inserts
a CAST around the DIVIDE result (to match AVG's original return type)
and a LogicalProject above the reduced aggregate. OpenSearchProjectRule
then enforces that every RexCall in the Project has a backend declaring
the corresponding ScalarFunction capability — but CAST is not declared
by any backend (including MockDataFusionBackend in test scaffolding).

Before wiring this rule into PlannerImpl we need to decide how CAST /
DIVIDE / TIMES should be treated by the capability system:
  Option 1: treat CAST as an implicit / always-supported operator in
            OpenSearchProjectRule, bypassing scalar-capability lookup
            (CAST is a query-semantics primitive, not a function).
  Option 2: declare CAST / DIVIDE / TIMES on every backend (real + mock)
            that claims any scalar capability.
  Option 3: keep the current hand-rolled primitive-decomp path and
            accept its ~250 LOC as the cost of avoiding the above.

Until that decision is made, this class compiles and is unit-testable
in isolation but is not referenced by PlannerImpl. AggregateDecomposition
Resolver continues to handle AVG primitive-decomp end-to-end.

Ref: .kiro/docs/distributed-aggregate-design.md §7
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(planner): carve out baseline scalar operators from capability enforcement

Introduces a BASELINE_SCALAR_OPS set in OpenSearchProjectRule covering
SQL-execution primitives (arithmetic, CAST, null handling, conditional)
that every viable backend is implicitly assumed to support. These ops
bypass the scalar-capability registry and flow through OpenSearchProject
without backend annotation.

Motivation
----------
Calcite plan-rewrite rules (AggregateReduceFunctionsRule,
ReduceExpressionsRule, future type-coercion rules) routinely introduce
arithmetic / CAST / CASE / null-predicate operators while rewriting
expressions. These are not optional backend features — they are
primitives that every query engine must support to be viable at all.
Modeling them as capability-declared created two failure modes:

  1. Every new backend had to enumerate ~20 operators that are never
     actually optional. Forgetting one produces a plan-time failure
     on queries that don't even reference the forgotten op — they just
     happen to trigger a Calcite rule that emits it.
  2. Any Calcite rule that incidentally emits one of these ops (e.g.
     the CAST around SUM(x)/COUNT(x) that AggregateReduceFunctionsRule
     emits to match AVG's original return type) fails plan-time checks
     with a misleading 'No backend supports scalar function [CAST]'
     error — even though the query semantics are unambiguous and every
     backend executes CAST natively.

Aligns with how other SQL engines model this concern (PostgreSQL,
DuckDB, Presto do not ship capability registries for arithmetic or
CAST — these are execution primitives, not optional features).

Scope
-----
Carved out:
  PLUS, MINUS, MULTIPLY, DIVIDE, UNARY_MINUS, UNARY_PLUS  (arithmetic)
  CAST                                                    (type coercion)
  IS_NULL, IS_NOT_NULL, COALESCE, CASE                    (null / conditional)

Intentionally conservative: comparisons (EQUALS, LESS_THAN, ...) and
logical ops (AND, OR, NOT) typically appear in Filter contexts where
they are already capability-handled by the filter path. Extend the
carve-out only when a specific plan-rewrite rule demonstrably emits a
new operator that every backend already supports.

Recursion: operands of a baseline op are still visited. A baseline op
wrapping a non-baseline function (e.g. CAST(regexp_match(col, 'x')))
still forces the inner call through capability resolution and
annotation.

If a future backend genuinely cannot execute one of these operators
(e.g. Lucene rejecting a CAST between incompatible types), that
becomes a runtime error inside the backend's executor — complementary
to plan-time capability enforcement, not a replacement for it.

Test updates
------------
Ten ProjectRuleTests cases used CAST, PLUS, and similar baseline ops
as representative 'some scalar function' fixtures. The intent of those
tests — capability routing, delegation, annotation depth — is
unchanged; the fixtures are swapped for capability-declared operators
(CEIL, POWER, UPPER) so the tests still exercise capability-registry
behavior. Assertions and expected exceptions are preserved verbatim
against the new fixtures.

Unlocks
-------
This is load-bearing for any future integration of Calcite's plan
rewrites that emit baseline operators. Immediately unblocks
OpenSearchAggregateReduceRule (commit 1847cb0a4bd, currently
scaffold-only) which emits CAST around AVG's sum/count division.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a77
Related: scaffold 1847cb0a4bd (OpenSearchAggregateReduceRule)
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(planner): wire OpenSearchAggregateReduceRule into HEP marking

Activates Calcite's AggregateReduceFunctionsRule (via our
OpenSearchAggregateReduceRule subclass, committed in 1847cb0a4bd) as
part of PlannerImpl's HEP marking phase. AVG is now decomposed into
primitive SUM(x) + COUNT(x) aggregate calls wrapped by a scalar
LogicalProject that computes CAST(sum / count AS avgReturnType) —
before our OpenSearchAggregateSplitRule produces PARTIAL/FINAL pairs.

With this in place, the subsequent split rule operates on the already-
reduced inner aggregate, so PARTIAL and FINAL both carry primitive
calls. Our AggregateDecompositionResolver sees these primitives and
applies its function-swap branch (COUNT at FINAL → SUM over partial
count column) as normal. The previous primitive-decomposition path
in the resolver (buildProjectWrapper, multi-aggregate rewrite) is no
longer exercised for AVG and can be simplified in a follow-up commit.

Scope of reduction (FUNCTIONS_TO_REDUCE)
----------------------------------------
Narrowed to AVG only. Calcite's STDDEV_POP / STDDEV_SAMP / VAR_POP /
VAR_SAMP reductions emit POWER(x, 2) for x², and POWER is not in
OpenSearchProjectRule.BASELINE_SCALAR_OPS — no backend currently
declares POWER as a project capability in our test scaffolding. AVG's
decomposition emits only SUM, COUNT, DIVIDE, and CAST: SUM/COUNT go
through the aggregate capability path; DIVIDE and CAST are baseline
scalars carved out of capability enforcement by commit e07d900352f.

Extending to STDDEV / VAR is a one-line change to FUNCTIONS_TO_REDUCE
once either POWER joins the baseline set or backends declare it.

Calcite aggregate-call deduplication
------------------------------------
When a query mixes AVG(x) with COUNT() or SUM(x) that have identical
arguments, Calcite's reduce rule deduplicates — the user's COUNT()
and SUM(x) are absorbed into AVG's primitive decomposition, and the
Project on top surfaces them via input refs. Per-shard aggregations
strictly decrease; results are semantically equivalent. Q10-mixed
tests updated to assert the deduplicated shape.

Test updates
------------
Three AggregateDecompositionResolverTests cases were rewritten to
match the new plan shape (Calcite's primitive decomposition instead
of the hand-rolled resolver's):

  testPrimitiveDecompAvg: PARTIAL carries [SUM, COUNT] (Calcite's
    order), not [SUM, SUM]; exchange types are integer-family, not
    DOUBLE (pre-reduction intermediateFields override is no longer
    taken). Parent fragment is asserted as Project (works for both
    LogicalProject pre-HEP and OpenSearchProject post-HEP).

  testMixedQ10: asserts Calcite's dedup behavior — 2 PARTIAL primitives
    (not 4) with the Project on top surfacing [status, avg_size, c, s]
    via CAST + input refs.

  testNoCalciteInferReturnType: renamed to testAvgExchangeTypesAreCalcite
    Primitives and repurposed. The old invariant ("sum column must be
    DOUBLE from intermediateFields") is obsolete; Calcite's primitive
    decomposition produces integer-family types that match DataFusion's
    SUM(int) → Int64 emit directly, with no override path.

Resolver code (rewriteDecomposed primitive-decomp branch,
buildProjectWrapper, etc.) remains in place for this commit but is
dead code for AVG. It will be removed in a follow-up commit alongside
any final simplification.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a77
Related: scaffold 1847cb0a4bd, baseline carve-out e07d900352f
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* refactor(planner): remove primitive-decomposition path from resolver

Calcite's AggregateReduceFunctionsRule (wired in 223c632fb60) now
handles AVG / STDDEV / VAR primitive decomposition during HEP marking
— before our resolver ever sees the plan. The hand-rolled multi-field
primitive-decomp branch in rewriteDecomposed is dead code, along with
its supporting machinery.

Removed
-------
  rewriteDecomposed multi-field branch (~20 LOC)  — Calcite replaces
  buildProjectWrapper (~80 LOC)                    — Calcite's reduce
    rule builds the Project (CAST(sum/count AS avgType)) itself
  primitivePartial helper (~30 LOC)                — Calcite emits the
    primitive SUM/COUNT aggregate calls directly
  PendingProject record (~5 LOC)                   — no longer tracks
    per-call Project state since Calcite owns the wrapper
  projectOnTop field on RewriteResult + invocation (~10 LOC)
  Unused imports: LogicalProject, RexBuilder, RexNode

Defensive guard
---------------
The rewriteDecomposed method now throws IllegalStateException if it
encounters an AggregateFunction that declares multi-field intermediate
or scalar-final decomposition. Those cases must be reduced by Calcite's
rule upstream — reaching the resolver with such a call indicates
either (a) the function isn't in OpenSearchAggregateReduceRule's
FUNCTIONS_TO_REDUCE set, or (b) the rule didn't fire for some other
reason. The guard preserves the invariant that only single-field
shapes (pass-through, function-swap COUNT→SUM, engine-native DC)
reach the resolver.

LOC
---
AggregateDecompositionResolver.java: 489 → 362 LOC (-127).
Combined with commits e07d900352f (baseline carve-out, +40 LOC net)
and 223c632fb60 (reduce rule wiring + test updates, +57 LOC net), the
three-commit sequence delivers the decomposition refactor with a net
~-30 LOC reduction and considerably less per-function logic in the
resolver.

Ref: .kiro/docs/distributed-aggregate-design.md \u00a77
Related: 1847cb0a4bd (rule scaffold), e07d900352f (baseline), 223c632fb60 (rule wiring)
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* refactor: remove dead code after Calcite-driven AVG reduction

Followup to commits 1847cb0a4bd → 975fc26339b (Calcite reduce rule
refactor). With AggregateReduceFunctionsRule handling AVG / STDDEV /
VAR decomposition during HEP marking, the enum fields and methods
that only existed to drive hand-rolled primitive decomposition are
dead code. Remove them to prevent future readers from mis-modeling
the enum as the decomposition contract for multi-field cases.

Removed
-------
  AggregateFunction.finalExpression field + accessor
  AggregateFunction.hasScalarFinal() method
  AggregateFunction 4-arg constructor (now only 2/3 args)
  AVG enum entry's finalExpression lambda
  AggregateFunctionTests assertions on the above

  AggregateDecompositionResolver guard: simplified
    'iFields.size() != 1 || fn.hasScalarFinal()' → 'iFields.size() != 1'
    (single-field + scalar-final shape doesn't exist in any enum entry)

  Unused imports: BiFunction, RexBuilder, RexNode, SqlStdOperatorTable,
    FloatingPointPrecision, JavaTypeFactoryImpl

Updated
-------
  OpenSearchAggregateSplitRule javadoc — replaced the pre-refactor TODO
  ("aggregate decomposition is deferred to plan forking") with an
  accurate description of the current split of responsibilities:
    - HEP marking: OpenSearchAggregateReduceRule handles multi-field
      primitive decomposition (AVG / STDDEV / VAR)
    - This split rule: purely structural SINGLE → FINAL+Exchange+PARTIAL
    - AggregateDecompositionResolver: single-field cases only
      (pass-through, function-swap, engine-native merge)

  AggregateFunctionTests: simplified test names and assertions to
  reflect the narrowed enum shape. AVG / STDDEV / VAR entries are
  asserted to declare no intermediateFields — primitive decomposition
  metadata does not belong on the enum when Calcite owns the rewrite.

Kept
----
  AggregateDecomposition interface: unused at runtime but documented
  as the escape hatch for future per-backend overrides. Zero cost to
  keep; non-trivial design decision to remove.

  AggregateCapability.decomposition() field: same rationale.

  Test-runtime Arrow deps on analytics-framework: still needed for
  COUNT / APPROX_COUNT_DISTINCT enum entries' ArrowType instances.

Spotless applied across the 3 modified modules.

Ref: .kiro/docs/distributed-aggregate-design.md §6
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* refactor(spi): remove dead aggregate SPI surface and consolidate resolver helpers

All internal to the analytics planner.

── Remove dead SPI surface ──

Two unused extension points on the analytics SPI had no production
callers and confused the intent of the aggregate decomposition path.
Consolidating on the enum-based source of truth:

1. AggregateFunction.hasBinaryIntermediate()
   Only referenced by its own unit-test assertions. Existed for an
   earlier resolver branch that detected engine-native Binary
   intermediates; the current resolver tests reducer identity directly
   (f.reducer() == this).

2. AggregateDecomposition (SPI interface)
   Published as a per-backend extension point on
   AggregateCapability.decomposition(), but no backend ever
   implemented it and nothing read it back — not the resolver, not
   the split rule. Superseded during actual implementation by
   AggregateFunction.intermediateFields() + AggregateDecompositionResolver,
   which is backend-agnostic and universal. Keeping the interface
   advertised an escape hatch the planner does not honor.

── Consolidate resolver helpers ──

3. Move Calcite-interop to the enum.
   toSqlAggFunction() (AggregateFunction → SqlAggFunction) and
   fromSqlAggFunction(SqlAggFunction) (reverse, with name-then-kind
   fallback) now live on AggregateFunction itself. Previously they
   were private helpers in AggregateDecompositionResolver. The enum
   is now the single place that owns Calcite-identity conversion for
   aggregates.

4. Unify two tree-rewrite helpers.
   replaceTopAggregate and replaceStageInputScan had identical
   identity-based copy-then-swap logic, differing only in target
   type. Replaced by a single replaceFirst(RelNode, RelNode, RelNode).

5. Extract per-call rewrite in AggregateDecompositionResolver.
   The per-aggCall loop in rewriteDecomposed previously mutated four
   parallel collections (partial calls, final calls, exchange types,
   exchange names) with branching intermixed — easy to forget a list
   or desynchronize indices. It now produces one immutable CallRewrite
   record per aggregate call and the outer loop consumes it:
     - rewriteAggCall: classifies + dispatches
     - passThroughRewrite: no decomposition
     - singleFieldRewrite: engine-native merge or function-swap
   Multi-field shapes (should be HEP-reduced upstream) still throw.
   Each branch is independently testable and the four output columns
   stay in lockstep by construction.

6. Single-line comments on private static utilities.

No behaviour change. Build + tests green across analytics-framework,
analytics-engine, and analytics-backend-datafusion. Spotless clean.

If a future backend genuinely needs divergent aggregate decomposition
(e.g. KLL vs HLL sketch for APPROX_COUNT_DISTINCT), a per-backend
AggregateFunctionAdapter SPI can be reintroduced — designed around
that concrete use case rather than this speculative shape.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(datafusion-rust): enter Tokio runtime in df_execute_local_prepared_plan

df_execute_local_prepared_plan calls session.execute_prepared() which invokes
datafusion::physical_plan::execute_stream. That call is synchronous but kicks
off RepartitionExec / stream-channel setup that requires a Tokio reactor —
running it from the JNI-invoked thread (no Tokio context) aborts with
'there is no reactor running, must be called from the context of a Tokio 1.x
runtime'.

Enter the IO runtime's context for the duration of the call via
_guard = mgr.io_runtime.enter() so those operators can register with the
reactor. Matches the pattern already used by df_prepare_final_plan which
wraps its async call in mgr.io_runtime.block_on(...).

Surfaced by multi-shard PPL queries where the final-aggregate plan contains
a RepartitionExec: 13 of 14 previously-failing multi-shard PplClickBenchIT
queries now get past sink creation without this crash.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* feat(exec): extend ArrowSchemaFromCalcite with date / time / timestamp mappings

ArrowSchemaFromCalcite.toArrowType previously threw on any SqlTypeName it
didn't enumerate, including common OpenSearch field types like DATE and
TIMESTAMP. This surfaced whenever a multi-shard query carried a date-typed
column through the exchange row type (e.g. 'min(EventDate)').

Added mappings:
  SMALLINT    → Int(16, signed)
  TINYINT     → Int(8, signed)
  REAL        → FloatingPoint(SINGLE)    (alias for FLOAT)
  DATE        → Date(DAY)
  TIME        → Time(MILLISECOND, 32)
  TIMESTAMP   → Timestamp(MILLISECOND, null)
  TIMESTAMP_WITH_LOCAL_TIME_ZONE → Timestamp(MILLISECOND, null)

Unsupported types still throw IllegalArgumentException with the SqlTypeName
in the message.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(planner): reduce AVG on plain LogicalAggregate in a dedicated HEP phase

OpenSearchAggregateReduceRule previously operated on OpenSearchAggregate and
shared the HEP marking collection with OpenSearchAggregateRule. In that
arrangement the marking rule fires first in BOTTOM_UP traversal, converting
LogicalAggregate → OpenSearchAggregate with per-call AGG_CALL_ANNOTATION
wrappers in aggCall.rexList. When Calcite's AggregateReduceFunctionsRule
then reduces AVG(x), it reads rexList[0].getType() during type inference on
the derived SUM — which carries AVG's original DOUBLE return type, not the
natural BIGINT for a SUM of integer. The stamped DOUBLE type propagates
through the reduced plan and fails typeMatchesInferred downstream
(stripAnnotations, the split rule, or the resolver — all cascades observed).

Clean fix — align with the documented design (§11.1): reduce BEFORE marking,
on a plain LogicalAggregate where aggCall.rexList is empty and Calcite
infers canonical primitive types. Implementation:

 1. OpenSearchAggregateReduceRule: match LogicalAggregate (not
    OpenSearchAggregate). Remove the AggregateMode.SINGLE guard and the
    newAggregateRel override that re-wrapped as OpenSearchAggregate — the
    subsequent marking rule handles that conversion. The rule body is now
    a one-line configuration invoking super.

 2. PlannerImpl: split the single HepPlanner into three chained phases —
    pre-marking (constant folding), aggregate reduction, marking. The
    reduction phase runs its own HepPlanner on the post-pre-marking plan
    so the rule order is enforced by phase boundaries rather than
    BOTTOM_UP rule discovery order.

Result: the plan leaving the reduction phase is Calcite-canonical — clean
LogicalAggregate(SUM, COUNT, ...) + LogicalProject(CAST(SUM)/CAST(COUNT)).
Marking then wraps each with OpenSearch* annotations; Volcano split, the
resolver, and stripAnnotations all see consistent primitive types without
any type-rebuild patches.

Unblocks all grouped-AVG queries (stats avg(x) by ...) on single-shard.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(resolver): align multi-shard FINAL with DataFusion column names + nullability

Two related issues surfaced when PARTIAL/FINAL split fires on multi-shard:

 1. Column-name mismatch: the resolver's fallback for unnamed aggregates
    (e.g. reduced AVG's auto-generated SUM/COUNT) used 'expr$<N>' while
    Calcite — and DataFusion on the Rust side — use '$f<N>'. The PARTIAL's
    Arrow output arrived with '$f2, $f3', the Substrait plan for FINAL
    referenced 'expr$2, expr$3', and DataFusion's schema lookup aborted
    with 'No field named expr$2. Valid fields are …$f2…'.

    Fix: derive exchange column names from the aggregate's own RelDataType
    (agg.getRowType().getFieldList()) — Calcite already assigned the
    canonical names (explicit aggCall.name where present, '$f<N>' where
    not), so reusing them keeps Java-side exchange schema aligned with the
    DataFusion output convention. Removes the hand-rolled 'expr$<N>'
    fallback entirely.

 2. Nullability drift on function-swap: the resolver rewrites COUNT → SUM
    at FINAL for the function-swap case, constructing the new call via
    makeCall(...) with returnType = ArrowCalciteTypes.toCalcite(...), which
    returns NOT-NULL types. Calcite, however, infers SUM over an exchange
    column as nullable (SUM of empty group → null). The declared
    NOT-NULL-vs-inferred-nullable mismatch trips typeMatchesInferred when
    the FINAL OpenSearchAggregate is later copied.

    Fix: in rewriteParentFragment, rebuild each FINAL AggregateCall via the
    (hasEmptyGroup, input, type=null, name) AggregateCall.create variant so
    Calcite re-runs full type inference against the actual FINAL input
    (the rewritten StageInputScan).

 3. Project-above-FINAL RexInputRef rebind: once FINAL's aggCall types
    change, any Project sitting directly above the FINAL (from reduced AVG,
    or a user-written Project) holds RexInputRefs with stale types. The
    plain identity-based replaceFirst copies that Project unchanged and
    Calcite's RexChecker rejects the mismatch.

    Fix: replaceFirstWithRefRebinding — when the immediate parent is a
    Project, walk its projection expressions with a RexShuttle that
    rebinds each RexInputRef to the new FINAL's row-type, CASTing the
    whole expression to the Project's declared field type so the outer
    schema (e.g. AVG's DOUBLE column) stays stable even when the inner
    aggregate now emits primitive BIGINT.

Unblocks AVG queries on multi-shard (Q3, Q4, Q10, Q28, Q33) and stabilises
the COUNT → SUM swap path for Q1, Q2, Q8, Q16 etc when split fires.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): enable multi-node + multi-shard ClickBench IT with auto-discovered queries

Turns on the previously-TODO'd 2-node integTest cluster for analytics-engine-rest
and switches the ClickBench dataset to 2 shards so PARTIAL/FINAL split, Arrow
Flight transport between shards and coordinator, and the aggregate
decomposition resolver all exercise under realistic distributed shape — not
just the single-shard no-split fast path.

PplClickBenchIT:
  - Auto-discovers all 43 PPL queries under resources/datasets/clickbench/ppl/
    instead of the Q1-only hardcoded list, so the tested surface grows as new
    PPL features land without touching this class.
  - SKIP_QUERIES (set literal) lists the 24 queries that currently fail for
    reasons unrelated to the partial/final aggregate feature — each with an
    in-file comment pointing at the root cause bucket:
      * Missing PPL frontend features: Q19, Q40, Q43
      * Malformed query in the dataset (missing 'where' keyword): Q29
      * Multi-shard binary exchange can't serialize LocalDateTime yet:
        Q7, Q24-Q27, Q37-Q42
      * DataFusion Arrow 'project index 0 out of bounds' on
        WHERE + GROUP-BY + aggregate: Q11-Q15, Q20, Q22, Q23, Q31, Q32
  - 19 queries pass across all three cluster variants (1-node-1-shard,
    1-node-2-shards, 2-nodes-2-shards), including all AVG-bearing queries
    (Q3, Q4, Q10, Q28, Q33).

mapping.json: number_of_shards = 2, keeps replicas at 0.
build.gradle: testClusters.integTest gets numberOfNodes = 2; memtable and
  streaming variants already use 2 nodes and are unchanged.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): fix malformed Q29 PPL query; document Substrait MIN(VARCHAR) gap

Q29 in the ClickBench dataset had `| Referer != ''` with no `where` keyword,
tripping the PPL parser before any planning could happen. Added the missing
`where`.

Q29 then surfaces a distinct follow-up: the Substrait isthmus emitter can't
find a binding for `MIN($1)` when the argument is VARCHAR (the `min(Referer)`
call). That's a Substrait aggregate-catalog gap — unrelated to the
partial/final work — so Q29 stays on the SKIP_QUERIES list with a comment
pointing at the MIN-on-strings binding as the remaining blocker.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(datafusion): emit schema-carrying empty batch when native stream has zero batches

When a shard's partial-aggregate plan produces zero record batches (typical
when a WHERE predicate filters out every row on that shard), the Flight
wire-protocol producer never writes a data frame. Arrow Flight puts the
schema in the first data frame — zero frames means zero schema, so the
coordinator's StreamingTableExec receives a stream with no schema and fails
with 'Arrow error: Schema error: project index 0 out of bounds, max field
0' the first time it projects a column by index.

DatafusionResultStream.BatchIterator now tracks two additional bits of
state: whether the native stream has reported EOS (arrayAddr == 0) and
whether we've ever returned a batch. When the native side yields EOS
without having emitted a single batch, we synthesize one zero-row
VectorSchemaRoot from the already-known schema and return it as the final
batch. Flight carries the schema with that frame; the coordinator sees it
and the downstream aggregate merges correctly over zero rows.

Unblocks 10 multi-shard ClickBench queries (Q11, Q12, Q13, Q14, Q15, Q20,
Q22, Q23, Q31, Q32) that were failing with this exact error. Enforces the
§18 invariant #12 documented in the design revisit:
  "Shard emission of a PARTIAL with zero matching rows still delivers the
   schema message."

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): add coordinator-reduce regression for WHERE+GROUP-BY empty-partial case

Adds two CoordinatorReduceIT methods that together cover the empty-partial
contract:

- testWhereGroupByCountMultiShard_reproducer — WHERE filters all rows on
  every shard (all docs have category='', predicate is category != ''), so
  both shards' partial aggregates produce zero batches. Before the Java-
  side schema-preservation fix, this query died with Arrow 'project index
  0 out of bounds, max field 0' in the drain thread. The test asserts a
  non-erroring 200 response; exact shape isn't checked since the point is
  crash-freedom.

- testGroupByCountMultiShard_noWhereControl — same query shape minus the
  WHERE. Every doc has category='' so the result is one group with the
  full count. Acts as the control that isolates the WHERE predicate (and
  the resulting zero-batch partial) as the previous trigger.

PplClickBenchIT SKIP_QUERIES trimmed to the queries that remain failing
after the schema-preservation fix: only the TIMESTAMP/DATE family (Q7,
Q24-Q27, Q37-Q42) and unrelated PPL frontend gaps (Q19, Q29, Q40, Q43).
Previously-skipped WHERE + GROUP-BY queries (Q11, Q12, Q13, Q14, Q15, Q20,
Q22, Q23, Q31, Q32) are now included — 29 ClickBench queries pass on
multi-shard (up from 19).

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* refactor(exec): replace row-oriented fragment wire format with Arrow IPC

The shard-to-coordinator wire format for FragmentExecutionResponse previously
carried each batch as List<Object[]> + List<String> fieldNames, serialized
per-cell via OpenSearch's StreamOutput.writeGenericValue. The path had two
structural problems:

- Lossy round-trip for Arrow types. vector.getObject(i) returns java.time.
  LocalDateTime for TimeStampMilliVector (no TZ), which writeGenericValue
  does not support — every shard emitting a Timestamp column failed with
  'can not write type [class java.time.LocalDateTime]'. Arrow's view-vector
  variants (Utf8View, BinaryView) and dictionary-encoded vectors likewise
  can't round-trip through Object[] inference.

- Per-cell boxing on both sides (O(rows * cols) object allocations + dispatch
  on send, O(rows * cols) allocations + setSafe on receive). Heap-heavy and
  GC-pressure-heavy for larger batches.

Replaces the wire format with Arrow's own IPC stream: MessageSerializer.
serialize(channel, schema) once, then serialize(channel, recordBatch) per
batch, terminated by ArrowStreamWriter.writeEndOfStream. On receive,
ArrowStreamReader handles schema + batch message sequencing and VectorLoader
loads buffers into VSRs. Arrow's library handles every vector type natively
— zero hand-rolled dispatch.

RowResponseCodec.decode copies the reader's reused root into caller-owned
VSRs via makeTransferPair (works for every vector kind including views; the
default VectorSchemaRootAppender rejects view vectors, so we avoid it).
Multi-batch responses concatenate via per-cell copyFromSafe — the only
append primitive in Arrow Java that supports view vectors.

Wire format: byte[] ipcPayload + vint rowCount. rowCount is cached purely
for the existing onFragmentSuccess metric so consumers don't decode just to
count rows.

Deletes RowBatchToArrowConverter — the row-to-Arrow bridge is now Arrow IPC
end-to-end. Updates ArrowSchemaFromCalcite javadoc accordingly.

Net: +224 / -340 lines across the codec. All 39 non-skipped ClickBench PPL
queries pass on multi-shard (was 20 with the row-oriented codec), including
the full TIMESTAMP family (Q7, Q24-Q27, Q37-Q42) that the old Object[] path
structurally could not handle.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(datafusion): coerce DataFusion's Utf8View emission to Utf8 before senderSend

DataFusion's HashAggregateExec internally converts string group keys to
Utf8View for performance (inline short-string optimization, non-configurable
in DataFusion 49+). The coordinator's FINAL plan, by contrast, is decoded
from substrait — substrait has only a generic 'string' type which DataFusion
round-trips to Utf8, and the coordinator's childSchemas (computed from the
Calcite row type via ArrowSchemaFromCalcite) likewise declares Utf8.

With the Arrow IPC wire codec preserving exact Arrow types end-to-end, the
shard's Utf8View batches now reach the coordinator as Utf8View. The Rust
StreamingTable partition was registered with a Utf8 schema, so Utf8View
batches trigger an 'as_primitive::<>()' downcast panic ('byte array') the
first time a cross-batch operator (coalesce / repartition) handles the
string column. Observed on every multi-string-column group-by from Q17
onward.

Previously the Object[] row-path laundered this silently: getObject ->
String -> VarCharVector.setSafe always produced Utf8, regardless of the
shard's actual emit type. Removing the lossy row-path exposed the contract
gap.

Adds a single coercion point at feedToSender(): when batch.getSchema()
differs from the declared childSchemas entry, allocate a new VSR matching
the declared schema and copy per column. Same-type columns use makeTransfer
Pair (zero-copy). Utf8View -> Utf8 uses per-cell byte copy:
ViewVarCharVector.get(i) -> VarCharVector.setSafe(i, bytes). Unknown type
pairs throw with a diagnostic naming source type, target type, and column —
future mismatches surface as clear errors rather than opaque Rust panics.

AbstractDatafusionReduceSink gains a childSchemas map parallel to the
existing childInputs bytes map, populated once in the constructor. Sinks
(single-input feed() and per-child ChildSink) look up the declared schema
by childStageId.

Extends only on observed mismatch — no speculative pair coverage. Zero-copy
fast path when schemas match (numeric-only aggregates, which is the common
case).

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): un-skip TIMESTAMP and string-group-by queries that now pass

With the Arrow IPC wire codec and the Utf8View -> Utf8 coercion at the
Java-to-Rust boundary, the previously-skipped query families now pass on
multi-shard (2 shards, 2 nodes). Trims SKIP_QUERIES from 14 entries down
to the 4 remaining PPL frontend / Substrait library gaps that are unrelated
to distributed execution:

  Q19 - extract(minute from ...) not implemented in the PPL frontend
  Q29 - Substrait library can't bind MIN on VARCHAR inputs
  Q40 - case() else + head N from M - PPL frontend gap
  Q43 - date_format() + head N from M - PPL frontend gap

Unblocked: Q7 + Q24-Q27 + Q37-Q42 (TIMESTAMP / DATE family, 11 queries);
Q17-Q18 + Q31-Q36 (multi-string-column group-by family, 8 queries). Total
39 of 39 non-skipped ClickBench PPL queries pass on multi-shard.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* docs(datafusion): document plan-level alternatives for Utf8View coercer

Adds a TODO on coerceToDeclaredSchema summarizing the plan-level alternatives
we tried and why they're currently blocked:

- declaring Utf8View up-front in childSchemas SIGSEGVs because DataFusion's
  optimizer emits Utf8View across more operators than HashAggregate (filter
  + sort + project queries also hit it), making static prediction in Java
  fragile and engine-version-specific
- Arrow Java 18.3's FFI can import Utf8View natively (BufferImportTypeVisitor
  has visit(Utf8View)); the blocker is predicting the emission, not
  importing it
- three forward paths recorded for future revisit: (a) a DataFusion schema-
  introspection API, (b) substrait view-type extension, (c) a Rust-side
  normalize pass using DataFusion's vectorized CastExpr at PARTIAL root

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* refactor(datafusion): remove APPROX_COUNT_DISTINCT plan rewrite

The prior approach introduced a dummy Calcite SqlAggFunction APPROX_DISTINCT
and a plan shuttle (rewriteApproxCountDistinct) that swapped every
SqlStdOperatorTable.APPROX_COUNT_DISTINCT call for it right before Substrait
emission. The dummy was needed because isthmus' default AGGREGATE_SIGS
binds APPROX_COUNT_DISTINCT to substrait's standard approx_count_distinct
URN, and the resulting entry in FunctionConverter's IdentityHashMap
(keyed by Calcite SqlOperator) shadows any additional Sig entry for the
same operator.

Replaces the workaround with a small OpenSearchAggregateFunctionConverter
subclass that filters APPROX_COUNT_DISTINCT out of the default signature
list via getSigs(). With the default binding gone, a plain
Sig(SqlStdOperatorTable.APPROX_COUNT_DISTINCT, "approx_distinct") in
ADDITIONAL_AGGREGATE_SIGS is the sole matcher and routes directly to the
YAML-declared extension — no operator rewrite, no dummy SqlAggFunction,
no plan shuttle.

Net: -78 lines. No runtime per-function branch in the convertor. Restores
invariant #3 (no ad-hoc 'if function == X' dispatch outside AggregateFunction
and the resolver's enum lookup).

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* style: replace FQN references with imports across touched files

Audits every file touched in this branch's commits and converts
fully-qualified class references to imports. Preserves FQN only where
there's a genuine same-name collision that forces it.

Changes:
- AggregateDecompositionResolver: import Project, RelCollations, RexBuilder,
  RexInputRef, RexNode, RexShuttle (~10 FQN sites -> short names).
- ArrowSchemaFromCalcite + ArrowCalciteTypesTests: import DateUnit, TimeUnit
  (5 FQN sites -> short names).
- AbstractDatafusionReduceSink: import org.apache.arrow.vector.types.pojo.Schema
  (2 FQN sites).
- AnalyticsSearchService: import ArrowStreamWriter (1 FQN site).
- DataFusionFragmentConvertor: import ImmutableList (2 FQN sites).
- DataFusionFragmentConvertorTests: import RexNode (1 FQN site).
- ShardFragmentStageExecutionTests: import ActionResponse (1 FQN site) —
  also fixes stale new FragmentExecutionResponse(List<String>, List<Object[]>)
  calls that I missed updating in the earlier IPC refactor.

Preserves FQN on the two io.substrait.proto.Plan references in
DataFusionFragmentConvertor — that class collides with the already-imported
io.substrait.plan.Plan, so FQN is required.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* docs: strip internal design references from production comments

Comments in production files should not reference internal design-doc
section numbers (§18 #12), branch names (pf2, pf4), or session-specific
debug narrative ("we attempted", "SIGSEGV in Q26"). Those belong in
design docs and commit messages, not in the code.

- DatafusionResultStream: drop '§18 invariant #12' reference from the
  nativeStreamExhausted field comment; the surrounding explanation already
  states what the field is for.
- DatafusionReduceSink.coerceToDeclaredSchema: rewrite the TODO block as a
  forward-looking technical note instead of a session history, keeping only
  the actionable alternatives.
- CoordinatorReduceIT.testQ10ShapeAcrossShards: drop pf2/pf4 branch
  narrative; describe what the test covers structurally.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): remove repro-style narrative from CoordinatorReduceIT

CoordinatorReduceIT's multi-shard WHERE+GROUP-BY tests carried leftover
debug narrative from their original reproducer role — now that the bug
they caught is fixed, the javadoc and naming should describe what the
tests validate, not their origin story.

- Rename testWhereGroupByCountMultiShard_reproducer →
  testGroupByCountMultiShard_allRowsFilteredByWhere
  and testGroupByCountMultiShard_noWhereControl →
  testGroupByCountMultiShard_noWhereClause. Describes shape, not history.
- Rewrite their javadocs to state what behavior is being asserted
  (empty-partial path reporting empty result without erroring), drop the
  'project index 0 out of bounds' stack trace and '@AwaitsFix to keep
  green' commentary.
- Rename the supporting fixtures WHERE_REPRO_INDEX, createWhereReproIndex,
  indexWhereReproDocs → STRING_GROUP_INDEX, createStringGroupIndex,
  indexStringGroupDocs. 'repro' was session-specific vocabulary.
- Class-level javadoc: drop internal Rust function names
  (prepare_partial_plan, force_aggregate_mode, execute_local_prepared_plan)
  from the pipeline diagram; keep the Calcite/Java-side names that describe
  the layer boundaries.
- Remove now-unused AwaitsFix import.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(lucene): update ExchangeSinkProvider lambda to 2-arg form

Pre-existing test file hadn't been updated after the
ExchangeSinkProvider#createSink contract went from (context) to
(context, backendContext) during the handler-infrastructure work.
Lambda now matches the current SPI signature.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(datafusion): update exhausted-stream test for schema-carrying empty batch

DatafusionResultStream.loadNextBatch now synthesizes a zero-row VSR
carrying the declared schema when the native stream produces no batches,
so downstream transports see the column layout on their first data frame.
testNextOnExhaustedStreamThrows was asserting the old contract (hasNext
returns false immediately on empty result). Consume the synthetic batch
first, then assert the iterator is exhausted.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* test(qa): mute DslClickBenchIT queries pending DSL-path investigation

The single DSL query (Q1: sum(GoodEvent)) has been hanging on this branch with
a 60s client-side socket timeout. Root cause is in the DSL aggregation path,
not the distributed-aggregate machinery — orthogonal to the coercer and
partial/final wiring. Skipping the query lets CI run the structural test
(provisioning, discovery) without blocking on the unrelated regression.

Restore 'List.of(1)' once the DSL hang is diagnosed and fixed.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

* fix(planner): strip annotations at every depth for baseline-op roots

BASELINE_SCALAR_OPS (COALESCE, CASE, CAST, arithmetic, IS_NULL, …) bypass
the AnnotatedProjectExpression wrap at the call site but their operands
still go through capability-driven annotation. A project expression like
COALESCE(num0, CEIL(num1)) ends up as

    COALESCE(num0, AnnotatedProjectExpression(CEIL(num1)))

with the outer COALESCE unwrapped and the inner CEIL wrapped.
OpenSearchProject.stripAnnotations only ran the nested-annotation shuttle
when the top-level expression was itself an AnnotatedProjectExpression;
the plain-top-level branch passed the expression through untouched, so
the inner wrapper survived into the substrait converter. Isthmus then
rejected the plan with 'Unable to convert call ANNOTATED_PROJECT_EXPR(…)'.

The strip logic predates the baseline carve-out (commit 196fd424d0a)
and was never updated to account for inner annotations under a baseline
root. The carve-out intentionally recurses into operands precisely so
those inner calls still go through capability resolution — strip must
mirror that recursion.

Run the RexShuttle unconditionally; it already no-ops for subtrees
without annotations. Same behaviour on top-level annotations, now
correctly scrubs nested ones regardless of what root wraps them.

Resolves 'ANNOTATED_PROJECT_EXPR(…)' failures in
FillNullCommandIT.testFillNullWithFunctionOnOtherField (fillnull with
ceil(num1) in num0 → COALESCE root) and
MultisearchCommandIT.testMultisearchEvalCaseProjection (eval case → CASE
root).

Also tags MultisearchCommandIT.testMultisearchCountEvalConditionalCount
with @AwaitsFix pending a separate DataFusion count-accumulator state-type
mismatch (Int32 vs Int64) that only surfaces after the strip fix lets
the plan reach execution. The test body is preserved so restoration is
just removing the annotation once the underlying issue is fixed.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

---------

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
…21589)

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…rch-project#21435)

Add cluster-scope defaults for pluggable dataformat settings 

Signed-off-by: Arpit Bandejiya <abandeji@amazon.com>
…tService (opensearch-project#21581)

* Stub support for streaming-transport request handlers in MockTransportService

Adds Node.wrapStreamTransport hook so MockNode can wrap the streaming
transport before it's shared between TransportService and
StreamTransportService. MockTransportService.addRequestHandlingBehavior
falls back to the streaming-transport's stub registry when the action
isn't on the regular transport — needed for streaming-only handlers
(e.g. when FlightStreamPlugin is loaded).

Demonstrated by a new sandbox/qa/analytics-engine-coordinator IT that
stubs FragmentExecutionAction (streaming-only) end-to-end, plus a
deterministic unit test for a lookahead-reordering bug surfaced via
this infrastructure: the listener was offloading onStreamResponse
bodies to a thread pool, letting the isLast=true task race ahead of
earlier batches and drop them via the SUCCEEDED short-circuit.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix imports

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* spotless

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…-project#21596)

Signed-off-by: Ajay Raj Nelapudi <ajnelapu@amazon.com>
Co-authored-by: Ajay Raj Nelapudi <ajnelapu@amazon.com>
…oute (opensearch-project#21582)

Extends the PPL datetime surface of analytics-backend-datafusion with a
Wave A bundle of 14 functions: strftime, dayofweek / day_of_week,
second / second_of_minute, date, datetime, sysdate, extract,
from_unixtime (1-arg), maketime, makedate, date_format, time_format,
str_to_date. Each function is routed through
DataFusionAnalyticsBackendPlugin's scalar capabilities and wired end-to-end
to Substrait so that force-routed queries on /_plugins/_ppl execute on
DataFusion without any Calcite fallback.

Routing strategy per function:
 - Calcite builtins reused (date, datetime, dayofweek, second, sysdate)
   via name-mapping adapters that preserve PPL's declared return type.
 - Rust UDFs added for the MySQL-flavored behaviors that have no 1:1
   DataFusion builtin (strftime, extract, from_unixtime, maketime,
   makedate, date_format, time_format, str_to_date) with a shared
   mysql_format token table underpinning the *_format family.

End-to-end verification: all 14 functions pass against a force-routed
runTask cluster on /_plugins/_ppl, confirmed via explain
(viableBackends=[datafusion]) and ShardFragmentStageExecution traces.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
…opensearch-project#21594)

* refactor: streaming-only fragment dispatch in analytics-engine

- delete RowResponseCodec, ResponseCodec, FragmentExecutionResponse, RowBatchToArrowConverter
- delete AnalyticsSearchService.executeFragment / collectResponse
- require StreamTransportService at injection (fail-fast); force streaming feature flag in sandbox QA clusters
- ShardFragmentStageExecution: catch outputSink.feed() exceptions so a feed failure fails the stage instead of hanging to QUERY_TIMEOUT
- DatafusionResultStream: synthesise one zero-row schema-bearing batch for empty native streams (Flight requires ≥1 schema frame)
- DatafusionPartitionSender: read-write lock around send/close to prevent sender_close UAF while sender_send is mid-await
- @AwaitsFix 7 Append/AppendPipe IT methods — Utf8View FFI schema-lie bug, tracked separately

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix: wire UAF lock through feedToSender; unmute Append ITs

- DatafusionReduceSink.feedToSender now calls sender.send() instead of
  NativeBridge.senderSend() directly, so the read-write lock added on
  DatafusionPartitionSender actually protects the hot path.
- Drop @AwaitsFix on 7 Append/AppendPipe IT methods. The Utf8View
  schema-lie they hit is fixed by coerceToDeclaredSchema in upstream
  opensearch-project#21457 (merged here); all 7 now pass.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix merge conflicts

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
…h-project#21592)

Add OPENSEARCH_NATIVE_LIB env var (or -PnativeLibOverride project property) so
sandbox tests can reuse a prebuilt libopensearch_native.{dylib,so,dll} from
another worktree, a blessed shared copy, or a CI-provided binary. When set,
buildRustLibrary is skipped via onlyIf and ext.nativeLibPath resolves to the
override path.

All 14 sandbox/plugins build.gradle sites already read
project(':sandbox:libs:dataformat-native').ext.nativeLibPath, so the override
propagates to every consumer without per-plugin changes. ./gradlew run is
unaffected — it already accepts -Dnative.lib.path via tests.jvm.argline and
does not depend on buildRustLibrary.

Without the env var / property, behavior is unchanged: buildRustLibrary runs
(or is UP-TO-DATE) and rust/target/release/libopensearch_native.dylib is used.

Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
Co-authored-by: Sandesh Kumar <sandeshkr419@gmail.com>
…iver (opensearch-project#21618)

* Remove final-agg special case in FragmentConversionDriver

The dedicated FINAL-aggregate branch was forcing an AggregateMode override that
diverged from the standard partial/final path, producing the cnt[count] Int32
vs Int64 mismatch downstream. FragmentConversionDriver now treats aggregate
fragments via the same conversion path as the rest, letting the decomposition
resolver + Calcite's reduce rule carry the types through correctly.

Follow-on adjustments:
- FUNCTIONS_TO_REDUCE expanded to {AVG, STDDEV_POP, VAR_POP}; STDDEV_SAMP/VAR_SAMP
excluded because the CASE-WHEN boolean guard defeats stripAnnotations.
- POWER added to BASELINE_SCALAR_OPS (emitted as final sqrt).
- IT coverage for all four stat aggs in StreamingCoordinatorReduceIT.
- Existing planner tests updated for the new post-reduction pipeline shape.

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>

* fix(qa,planner): enable STDDEV_SAMP/VAR_SAMP, drop partial-agg instruction, test cleanups

Follow-up to Marc Handalian's commit that removed the FINAL-aggregate special case
in FragmentConversionDriver.

FragmentConversionDriver:
  - Drop the PARTIAL-aggregate instruction emission too. The Calcite-layer split
    (OpenSearchAggregateSplitRule + AggregateDecompositionResolver) already produces
    properly decomposed PARTIAL and FINAL fragments; the shard-side mode-forcing via
    NativeBridge.preparePartialPlan → force_aggregate_mode(Partial) was belt-and-
    suspenders. With it removed, every fragment takes the same executeLocalPlan path
    and DataFusion handles its own Final(Partial(...)) pair — correctness preserved
    because every aggregate reaching either stage is associative (SUM/MIN/MAX, COUNT
    function-swapped to SUM, HLL sketch merge, AVG/STDDEV primitive-decomposed).
    The downstream Java and Rust machinery (PartialAggregateInstructionHandler,
    FinalAggregateInstructionHandler, prepareFinalPlan FFI, force_aggregate_mode)
    stays in place — dormant, not dead — ready for re-enablement once upstream
    DataFusion's substrait consumer respects aggregation_phase (see
    .kiro/docs/datafusion-upstream-aggregation-phase.md).

OpenSearchAggregateReduceRule:
  - FUNCTIONS_TO_REDUCE expanded to the full statistical set
    {AVG, STDDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP}.
  - Javadoc updated to reflect the full reduction set and explain why STDDEV_SAMP/
    VAR_SAMP now flow through (Bessel's-correction CASE guard uses comparison
    operators that joined BASELINE_SCALAR_OPS in this commit).

OpenSearchProjectRule:
  - Added the six SQL comparison operators (>, >=, <, <=, =, !=) to
    BASELINE_SCALAR_OPS. They are emitted by Calcite's reduce rule for the SAMP-
    variant Bessel's-correction guard and are SQL-execution primitives every
    backend supports natively — consistent with the existing baseline rationale.

StreamingCoordinatorReduceIT:
  - Add missing semicolon at testAvgAcrossShards (int total = NUM_SHARDS *
    DOCS_PER_SHARD). The standard integTest task excludes this class via a glob,
    so the build failure only surfaced under integTestStreaming.
  - Remove the @AwaitsFix on testStddevSampAcrossShards and testVarSampAcrossShards
    (now enabled by the FUNCTIONS_TO_REDUCE and BASELINE_SCALAR_OPS changes above);
    drop the now-unused AwaitsFix import.

AppendPipeCommandIT:
  - Replace fully-qualified java.util.* references in testAppendPipeSort with
    imported short forms; add HashMap and Set imports. Matches the rest of the file.

Verified locally (after rebuilding the Rust dylib against the Wave A UDF set):
  - :sandbox:plugins:analytics-engine:test — 148 tests, 0 failures
  - :sandbox:qa:analytics-engine-rest:integTest — 304 tests, 4 skipped, 0 failures
  - :sandbox:qa:analytics-engine-rest:integTestStreaming — 7 tests, 0 skipped,
    0 failures (both STDDEV_SAMP and VAR_SAMP tests active and green)

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>

---------

Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Co-authored-by: Marc Handalian <marc.handalian@gmail.com>
Signed-off-by: Marc Handalian <marc.handalian@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.