Skip to content

Use leastRestrictive for mvappend element-type widening#5424

Open
ahkcs wants to merge 3 commits intoopensearch-project:feature/mustang-ppl-integrationfrom
ahkcs:feature/mvappend-leastrestrictive
Open

Use leastRestrictive for mvappend element-type widening#5424
ahkcs wants to merge 3 commits intoopensearch-project:feature/mustang-ppl-integrationfrom
ahkcs:feature/mvappend-leastrestrictive

Conversation

@ahkcs
Copy link
Copy Markdown
Collaborator

@ahkcs ahkcs commented May 8, 2026

Description

MVAppendFunctionImpl.updateMostGeneralType used strict Object.equals to compare each operand's component type against the running "most general" type, falling back to Calcite's ANY on any mismatch. Object.equals returns false for type pairs that differ only in nullability tag — e.g. array(1, 2) synthesizes INTEGER NULLABLE for its component while literal 3 is INTEGER NOT NULL — and for straightforwardly-widenable numerics like INT + DECIMAL.

The Calcite engine's enumerable runtime tolerates ANY because MVAppendImplementor.implement processes elements through Object[] — the declared element type is unused at execution time. The analytics-engine route is stricter: substrait can't serialize ANY, so isthmus throws UnsupportedOperationException: Unable to convert the type ANY during substrait conversion.

Two changes:

  1. updateMostGeneralType widens via RelDataTypeFactory.leastRestrictive — the same routine SqlLibraryOperators.ARRAY uses for its return-type inference. For genuinely incompatible operand types (INT + VARCHAR, …) leastRestrictive returns null; fall back to ANY there to preserve the existing in-process Calcite engine Object[] runtime semantics that the mvappend(1, 'text', 2.5)-style tests rely on. Promote DECIMAL → DOUBLE on the way through: RowResponseCodec maps DECIMAL cells to FloatingPoint(DOUBLE) anyway, and an explicit DECIMAL element type triggers Calcite's element coercion to BigDecimal, which the JSON formatter renders inconsistently across paths.

  2. MVAppendImplementor.implement pre-casts each scalar operand to the call's element Java class via EnumUtils.convert. Without this, Avatica's AbstractCursor.ArrayAccessor dispatches the per-element accessor by the declared SQL type — e.g. DoubleAccessor.getDouble does (Double) value — and would throw a runtime ClassCastException on an Integer cell when the call's element type widens to DOUBLE. Array operands pass through; their element-type alignment is the planner's responsibility.

A previous revision of this PR went straight to leastRestrictive widening and triggered a runtime corruption (mvappend(1, 2.5)[0, 2.5]) for exactly the Avatica-cast reason above. The pre-cast in MVAppendImplementor resolves that without reintroducing the regression.

Test plan

  • ./gradlew :core:test --tests "*MVAppend*" → green.
  • ./gradlew :integ-test:integTest --tests "org.opensearch.sql.calcite.remote.CalciteMVAppendFunctionIT" (Calcite engine path) → 15/15 pass.
  • ./gradlew :integ-test:integTest --tests "org.opensearch.sql.calcite.remote.CalciteArrayFunctionIT"60/60 pass (no regression on the sister IT).

Analytics-engine compatibility

Verified end-to-end against the analytics-engine path per the Mustang + SQL plugin SOP: cluster started with arrow-flight-rpc + opensearch-job-scheduler + analytics-engine + composite-engine + parquet-data-format + analytics-backend-lucene + analytics-backend-datafusion + opensearch-sql-plugin, this PR's SQL plugin republished to maven local on top of feature/ppl-coverage-bundle, the OpenSearch companion opensearch-project/OpenSearch#21554 applied (including its Register UDFs on FFM-created session contexts Rust fix), :integ-test:integTestRemote against the running cluster with tests.analytics.{parquet_indices,force_routing}=true.

CalciteMVAppendFunctionIT (force-routed through analytics-engine)

Status Before After
Passed 0/15 (was: 9/15 with conservative equalSansNullability-only widening) 10/15
Failed 15/15 5/15

Newly passing on the analytics-engine route as a result of this PR:

Test Operand types Why it now passes
testMvappendWithMixedArrayAndScalar ARRAY<INT>, INT, INT nullability-only mismatch → bridged by leastRestrictive
testMvappendWithComplexExpression ARRAY<INT>, ARRAY<INT>, INT nullability-only mismatch → bridged by leastRestrictive
testMvappendWithIntAndDouble INT, DECIMAL widens to DOUBLE via DECIMAL → DOUBLE promotion + operand pre-cast

The other 7 passing tests already passed under the prior equalSansNullability baseline (uniform-type mvappend signatures and pure-array operands).

Remaining 5 failures (all are documented architectural limits, not regressions)

# Test Operand types Why it still fails
1 testMvappendWithMixedTypes INT, VARCHAR, DECIMAL genuinely heterogeneous → ARRAY<ANY> (substrait can't encode)
2 testMvappendWithFieldsAndLiterals INT, VARCHAR, VARCHAR INT + VARCHAR → ARRAY<ANY>
3 testMvappendWithEmptyArray ARRAY<VARCHAR> (#5421 default), INT, INT VARCHAR + INT → ARRAY<ANY>. The empty-array array() is bound to empty_arr via eval; at MVAppend's type-inference site we see the column reference, not the literal, so per-call detection can't reach back through the project chain.
4 testMvappendWithNull VARCHAR, INT, INT VARCHAR + INT → ARRAY<ANY> (nullif(1, 1) is INTEGER-typed, not NULL-typed)
5 testMvappendInWhereClause VARCHAR, VARCHAR analytics-engine OpenSearchFilterRule.resolveViableBackends extracts only the predicate's TOP operator (EQUALS) and checks it against the field's storage type (ARRAY); doesn't walk into nested calls, so array_length(combined) = 2 is rejected as "EQUALS on ARRAY field". Tracked separately as a planner refactor; not specific to mvappend.

Calcite's ANY is a JVM Object[] pass-through with no substrait/Arrow/DataFusion equivalent. Resolving (1)–(4) would require either changing PPL's mvappend contract from heterogeneous-Object[] to a uniform stringified type (breaks user expectations) or shipping Arrow Union arrays through the wire format (multi-PR upstream work). Documented in the corresponding "What's left" section of opensearch-project/OpenSearch#21554.

Companion changes

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

PR Reviewer Guide 🔍

(Review updated until commit 6ea6ff7)

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

The boxedJavaClass method returns Object.class for unhandled SQL types (default case), but EnumUtils.convert at line 127 may fail or produce unexpected results when attempting to convert to Object.class. If elementClass is Object.class, the conversion is skipped (line 124), but if elementType is non-null with an unhandled SqlTypeName (e.g., DATE, TIMESTAMP, BINARY), the code will attempt conversion to Object.class, which may not behave as intended.

private static Class<?> boxedJavaClass(SqlTypeName sqlType) {
  return switch (sqlType) {
    case BOOLEAN -> Boolean.class;
    case TINYINT -> Byte.class;
    case SMALLINT -> Short.class;
    case INTEGER -> Integer.class;
    case BIGINT -> Long.class;
    case FLOAT, REAL -> Float.class;
    case DOUBLE -> Double.class;
    case DECIMAL -> BigDecimal.class;
    case CHAR, VARCHAR -> String.class;
    default -> Object.class;
  };
}
Type Safety

When leastRestrictive returns null (line 97), the code falls back to ANY type but continues to attempt type coercion at line 127 with elementClass derived from ANY. The boxedJavaClass method returns Object.class for ANY, which causes the coercion to be skipped (line 124-125). However, this creates inconsistency: the element type is declared as ANY but individual elements retain their original types without uniform coercion, potentially causing runtime type mismatches in downstream code expecting homogeneous arrays.

RelDataType least = typeFactory.leastRestrictive(java.util.List.of(current, candidate));
if (least == null) {
  return typeFactory.createSqlType(SqlTypeName.ANY);
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

PR Code Suggestions ✨

Latest suggestions up to 6ea6ff7

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Enforce consistent type conversion logic

The condition skips conversion when elementClass is Object.class, but this may lead
to heterogeneous types in the result list. Verify that downstream code handles mixed
types correctly, or consider enforcing type consistency even for Object.class to
prevent runtime casting issues.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [124-128]

-if (opType.getComponentType() != null || elementClass == Object.class) {
+if (opType.getComponentType() != null) {
   coerced.add(op);
 } else {
   coerced.add(EnumUtils.convert(op, elementClass));
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion questions the elementClass == Object.class condition, but the existing logic appears intentional: when the element type is Object, no conversion is needed since any type is acceptable. Removing this check could cause unnecessary conversions and potential issues.

Low
Validate element type is non-null

When elementType is null, the code defaults to Object.class, but this may mask
underlying issues. Consider adding a null-check or logging to ensure elementType is
expected to be null in valid scenarios, preventing silent failures.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [118-119]

-Class<?> elementClass =
-    elementType == null ? Object.class : boxedJavaClass(elementType.getSqlTypeName());
+if (elementType == null) {
+  throw new IllegalStateException("Element type should not be null for array operations");
+}
+Class<?> elementClass = boxedJavaClass(elementType.getSqlTypeName());
Suggestion importance[1-10]: 3

__

Why: While validation could be helpful, the code explicitly handles the null case by defaulting to Object.class, which appears intentional for handling edge cases. The suggestion to throw an exception would break this fallback behavior without clear evidence it's incorrect.

Low
Use mutable list for compatibility

Using java.util.List.of() creates an immutable list. If leastRestrictive internally
modifies the list or expects a mutable collection, this could cause runtime
failures. Consider using Arrays.asList() or explicitly creating a mutable ArrayList
to avoid potential UnsupportedOperationException.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [96]

-RelDataType least = typeFactory.leastRestrictive(java.util.List.of(current, candidate));
+RelDataType least = typeFactory.leastRestrictive(Arrays.asList(current, candidate));
Suggestion importance[1-10]: 2

__

Why: List.of() creates an immutable list, but leastRestrictive is a standard Calcite API method that only reads the list for type comparison. There's no evidence it modifies the list, making this suggestion overly cautious and unlikely to prevent actual issues.

Low

Previous suggestions

Suggestions up to commit 6cf2b65
CategorySuggestion                                                                                                                                    Impact
General
Add null safety for SQL type

The elementType null check may be insufficient. If elementType.getSqlTypeName()
returns null or an unmapped type, boxedJavaClass returns Object.class, but this
could mask type resolution issues. Consider validating the SQL type name explicitly.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [118-119]

-Class<?> elementClass =
-    elementType == null ? Object.class : boxedJavaClass(elementType.getSqlTypeName());
+SqlTypeName sqlTypeName = elementType == null ? null : elementType.getSqlTypeName();
+Class<?> elementClass = (elementType == null || sqlTypeName == null) 
+    ? Object.class 
+    : boxedJavaClass(sqlTypeName);
Suggestion importance[1-10]: 5

__

Why: The suggestion adds defensive null checking for getSqlTypeName(), which could improve robustness. However, the boxedJavaClass method already handles unmapped types by returning Object.class in the default case, so this is a minor defensive improvement rather than fixing a critical bug.

Low
Remove unnecessary fully qualified name

Using the fully qualified java.util.List is unnecessary since List is already
imported. This creates inconsistency with other usages in the file and reduces code
readability.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [96]

-RelDataType least = typeFactory.leastRestrictive(java.util.List.of(current, candidate));
+RelDataType least = typeFactory.leastRestrictive(List.of(current, candidate));
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies that java.util.List is redundant since List is already imported at line 12. However, this is a minor style improvement with minimal impact on functionality.

Low
Suggestions up to commit 5fc778f
CategorySuggestion                                                                                                                                    Impact
General
Use leastRestrictive for type widening

Consider using SqlTypeUtil.leastRestrictive instead of manually checking nullability
equality and creating nullable types. This aligns with the PR title and provides a
more robust type widening mechanism that handles edge cases consistently with
Calcite's type system.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [95-97]

-if (SqlTypeUtil.equalSansNullability(typeFactory, current, candidate)) {
-  return typeFactory.createTypeWithNullability(current, true);
+RelDataType leastRestrictive = SqlTypeUtil.leastRestrictive(typeFactory, List.of(current, candidate));
+if (leastRestrictive != null) {
+  return leastRestrictive;
 }
Suggestion importance[1-10]: 4

__

Why: While SqlTypeUtil.leastRestrictive could provide more robust type widening, the PR's explicit comment explains that the current approach intentionally avoids widening genuinely different types (INT + DOUBLE, INT + VARCHAR) to preserve heterogeneous values. Using leastRestrictive might contradict this design decision by performing unwanted type coercion.

Low
Suggestions up to commit 5ab670b
CategorySuggestion                                                                                                                                    Impact
General
Add null check for candidate parameter

Consider validating that current and candidate are non-null before calling
leastRestrictive. While the method checks current == null earlier, candidate could
theoretically be null, which might cause unexpected behavior in leastRestrictive.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java [91-92]

+if (candidate == null) {
+  return current;
+}
 RelDataType least = typeFactory.leastRestrictive(java.util.List.of(current, candidate));
 return least != null ? least : typeFactory.createSqlType(SqlTypeName.ANY);
Suggestion importance[1-10]: 5

__

Why: The suggestion adds defensive null-checking for candidate, which could improve robustness. However, this is a minor improvement since the method likely receives non-null values from its callers in practice, and the score is moderate because it's primarily a defensive programming practice rather than fixing a critical bug.

Low

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Persistent review updated to latest commit 5fc778f

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Persistent review updated to latest commit 6cf2b65

@ahkcs ahkcs added the enhancement New feature or request label May 9, 2026
ahkcs added 3 commits May 9, 2026 10:34
`MVAppendFunctionImpl.updateMostGeneralType` used strict {@code Object.equals}
to compare each operand's component type against the running "most general"
type, falling back to Calcite's {@code ANY} on any mismatch. That's too
aggressive: {@code Object.equals} returns false for type pairs that differ
only in nullability tag (e.g. {@code array(1, 2)} synthesizes INTEGER NULLABLE
for its component while literal {@code 3} is INTEGER NOT NULL), and for
straightforwardly-widenable numerics like INTEGER + DOUBLE. The PPL UDF result
would then be {@code ARRAY<ANY>}.

The Calcite engine's enumerable runtime tolerates {@code ANY} because
{@code MVAppendImplementor.eval} processes elements through {@code Object} —
the declared element type is unused at execution time. The analytics-engine
route is stricter: substrait can't serialize {@code ANY}, so isthmus throws
{@code UnsupportedOperationException: Unable to convert the type ANY} during
the substrait conversion phase.

Widen with {@link RelDataTypeFactory#leastRestrictive} — the same routine
{@code SqlLibraryOperators.ARRAY} uses for its return-type inference. Falls
back to ANY only when {@code leastRestrictive} returns null (genuinely
incompatible operand types like INT + VARCHAR), preserving the original
behavior on those queries.

# Test plan

* {@code :core:test --tests "*MVAppend*"} — passes (no existing test asserted
  on the {@code ANY} fallback).
* Companion to opensearch-project/OpenSearch#21554 — unblocks 8+ tests in
  {@code CalciteMVAppendFunctionIT} force-routed through the analytics-engine
  path that previously failed with "Unable to convert the type ANY".

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Calling Calcite's leastRestrictive widens mixed numerics like INT + DECIMAL
to a single common numeric type (e.g. DECIMAL(11,1)). The Calcite engine then
casts each operand to that type at codegen — Integer(1) becomes BigDecimal
with scale 1, which renders as 0.1 (or 0 after JSON round-trip), breaking
testMvappendWithIntAndDouble that expects mvappend(1, 2.5) to return [1, 2.5].

The original goal was just to bridge the nullability-tag mismatch that
synthesizes an array's component as INTEGER NULLABLE versus a bare literal's
INTEGER NOT NULL. Limit the widening to that case via equalSansNullability
and keep the ANY fallback for genuinely different types — preserving the
Calcite engine's heterogeneous-Object[] runtime semantics that pre-existing
tests rely on.

Signed-off-by: Kai Huang <huangkaics@gmail.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
The previous nullability-only bridge fixed `array(1, 2) + literal 3` but left
`mvappend(1, 2.5)` falling back to ARRAY<ANY>. ARRAY<ANY> is not
substrait-serializable, so any analytics-engine query through that call
fails at substrait conversion. Aggressive `leastRestrictive` widening was
the obvious next step but earlier triggered a runtime corruption — Integer 1
showed up as 0 in the response — because the Avatica result-set's
ArrayAccessor uses element-type-specific accessors (e.g.
`DoubleAccessor.getDouble` does `(Double) value`), and an Integer cell in a
declared-DOUBLE list triggered a ClassCastException that the error path
masked as `[0, 2.5]`.

Fix the corruption by pre-casting each scalar operand to the call's element
Java class in `MVAppendImplementor` via `EnumUtils.convert`. The result list
is now homogeneously typed at codegen, so Avatica's per-element cast
succeeds. Promote DECIMAL → DOUBLE on the way through `updateMostGeneralType`
because `RowResponseCodec` maps DECIMAL cells to FloatingPoint(DOUBLE)
anyway; an explicit DECIMAL element type triggers Calcite's element coercion
to BigDecimal, which the JSON formatter renders inconsistently across paths.

For genuinely incompatible operand pairs (INT + VARCHAR, …)
`leastRestrictive` returns null and the existing `ANY` fallback stands —
heterogeneous mvappend output stays on the Calcite engine path; only the
analytics-engine route can't emit substrait for those.

Local verification:
- :core:test --tests *MVAppend* — green
- :integ-test:integTest --tests CalciteMVAppendFunctionIT — 15/15
- :integ-test:integTest --tests CalciteArrayFunctionIT — 60/60

Signed-off-by: Kai Huang <huangkaics@gmail.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs force-pushed the feature/mvappend-leastrestrictive branch from 6cf2b65 to 6ea6ff7 Compare May 9, 2026 17:34
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 9, 2026

Persistent review updated to latest commit 6ea6ff7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant