From 94e36d59a8da6c7711c008e2936d84e26f6a62b5 Mon Sep 17 00:00:00 2001 From: Kai Huang Date: Mon, 11 May 2026 16:25:14 -0700 Subject: [PATCH 1/3] Forward PPL_SYNTAX_LEGACY_PREFERRED through UnifiedQueryContext `RestUnifiedQueryAction.applyClusterOverrides` previously only forwarded `PPL_REX_MAX_MATCH_LIMIT` into the per-request `UnifiedQueryContext`. As a result, cluster-side updates to `plugins.ppl.syntax.legacy.preferred` were ignored on the analytics-engine route: `PPLQueryParser` -> `AstBuilder` -> `ArgumentFactory` read the legacy-preferred flag from the unified context's settings map, which never received the override. This caused queries like `top age` / `rare state` with `usenull` defaulting off to behave as if `usenull=true` on the analytics route. Refactor the override builder into a small helper and forward both `PPL_REX_MAX_MATCH_LIMIT` and `PPL_SYNTAX_LEGACY_PREFERRED`. Future keys can be added with a one-liner. Fixes `CalciteTopCommandIT.testTopCommandLegacyFalse` and `CalciteRareCommandIT.testRareCommandLegacyFalse` against the analytics route (`-Dtests.analytics.force_routing=true`). Signed-off-by: Kai Huang --- .../plugin/rest/RestUnifiedQueryAction.java | 26 ++++++++++--------- 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java b/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java index e1bb84778d..95ff51b0b9 100644 --- a/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java +++ b/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java @@ -190,23 +190,25 @@ private UnifiedQueryContext buildContext(QueryType queryType, boolean profiling) * setting(String, Object)} API, keeping {@link UnifiedQueryContext} decoupled from any specific * {@link org.opensearch.sql.common.setting.Settings} implementation. * - *
Currently scoped to {@code plugins.ppl.rex.max_match.limit} — required so the unified path - * honors {@code _cluster/settings} updates for {@code rex max_match} (CalciteRexCommandIT's - * testRexMaxMatchConfigurableLimit). Add keys here if a future PR / IT depends on cluster-side - * fidelity for one of the other planning settings. + *
Add keys here if a future PR / IT depends on cluster-side fidelity for one of the other + * planning settings. */ private UnifiedQueryContext.Builder applyClusterOverrides(UnifiedQueryContext.Builder builder) { - Object rexLimit = - pluginSettings.getSettingValue( - org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT); - if (rexLimit != null) { - builder.setting( - org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT.getKeyValue(), - rexLimit); - } + forwardClusterSetting( + builder, org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT); + forwardClusterSetting( + builder, org.opensearch.sql.common.setting.Settings.Key.PPL_SYNTAX_LEGACY_PREFERRED); return builder; } + private void forwardClusterSetting( + UnifiedQueryContext.Builder builder, org.opensearch.sql.common.setting.Settings.Key key) { + Object value = pluginSettings.getSettingValue(key); + if (value != null) { + builder.setting(key.getKeyValue(), value); + } + } + /** * Extract the source index name by parsing the query and visiting the AST to find the Relation * node. Uses the context's parser which supports both PPL and SQL. From f6b8b98ab9e84bc5dc786a27c5fad92c8d007e76 Mon Sep 17 00:00:00 2001 From: Kai Huang Date: Mon, 11 May 2026 16:25:35 -0700 Subject: [PATCH 2/3] Stabilize rare/top tie-break by appending field columns to ROW_NUMBER order `CalciteRelNodeVisitor.visitRareTopN` lowers `rare`/`top` to a `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY count [DESC])` window. With only the count column in the ORDER BY clause, ties at the same count resolved via the upstream operator's insertion order, which differed between backends (in-process Calcite vs. analytics-engine vs. Lucene pushdown). On the analytics-engine route, `testRareWithGroup` failed because ROW_NUMBER picked NV at count=8 while the test expected AR. Append the rare/top field columns as secondary ASC keys so ties resolve alphabetically and deterministically across backends. 
This matches the behavior of the existing OpenSearch terms-aggregation pushdown, which tie-breaks on `_key:asc`. Update `RareTopPushdownRule` to accept the new shape: 1 or 2 order keys, where the (optional) second key must be the rare/top target field in ASC direction. The pushdown's wire payload is unchanged. Update the matching unit-test expectations in `CalcitePPLRareTopNTest` (11 RelNode/result/SparkSQL strings) and 5 explain YAML fixtures. Fixes `CalciteRareCommandIT.testRareWithGroup` against the analytics route (and removes the same class of tie-break flakiness across other rare/top tests). Signed-off-by: Kai Huang --- .../sql/calcite/CalciteRelNodeVisitor.java | 10 +- .../calcite/explain_nested_agg_top_push.yaml | 2 +- .../calcite/explain_rare_usenull_false.yaml | 2 +- .../calcite/explain_rare_usenull_true.yaml | 4 +- .../calcite/explain_top_usenull_false.yaml | 2 +- .../calcite/explain_top_usenull_true.yaml | 4 +- .../planner/rules/RareTopPushdownRule.java | 30 ++++- .../ppl/calcite/CalcitePPLRareTopNTest.java | 106 +++++++++--------- 8 files changed, 95 insertions(+), 65 deletions(-) diff --git a/core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java b/core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java index 1251f51b13..eae02484c9 100644 --- a/core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java +++ b/core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java @@ -3038,6 +3038,14 @@ public RelNode visitRareTopN(RareTopN node, CalcitePlanContext context) { countField = context.relBuilder.field(countFieldName); } + // Append the rare/top field columns as secondary order keys so ties in the count column + // resolve deterministically. Without this, ROW_NUMBER's tie-break is insertion-order + // dependent and varies between backends (e.g. analytics-engine vs in-process Calcite). 
+ List<RexNode> tieBreakKeys = rexVisitor.analyze(fieldList, context); + List<RexNode> orderKeys = new ArrayList<>(tieBreakKeys.size() + 1); + orderKeys.add(countField); + orderKeys.addAll(tieBreakKeys); + RexNode rowNumberWindowOver = PlanUtils.makeOver( context, @@ -3045,7 +3053,7 @@ public RelNode visitRareTopN(RareTopN node, CalcitePlanContext context) { null, List.of(), partitionKeys, - List.of(countField), + orderKeys, WindowFrame.toCurrentRow()); context.relBuilder.projectPlus( context.relBuilder.alias(rowNumberWindowOver, ROW_NUMBER_COLUMN_FOR_RARE_TOP)); diff --git a/integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml b/integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml index 65f3ed320f..7fedc9a434 100644 --- a/integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml +++ b/integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml @@ -3,7 +3,7 @@ calcite: LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(address.city=[$0], count=[$1]) LogicalFilter(condition=[<=($2, 10)]) - LogicalProject(address.city=[$0], count=[$1], _row_number_rare_top_=[ROW_NUMBER() OVER (ORDER BY $1 DESC)]) + LogicalProject(address.city=[$0], count=[$1], _row_number_rare_top_=[ROW_NUMBER() OVER (ORDER BY $1 DESC, $0)]) LogicalAggregate(group=[{0}], count=[COUNT()]) LogicalProject(address.city=[$3]) LogicalFilter(condition=[IS NOT NULL($3)]) diff --git a/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml b/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml index 9c5157c72b..eb0c361ad8 100644 --- a/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml +++ b/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml @@ -3,7 +3,7 @@ calcite: LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(gender=[$0],
state=[$1], count=[$2]) LogicalFilter(condition=[<=($3, 2)]) - LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)]) + LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)]) LogicalAggregate(group=[{0, 1}], count=[COUNT()]) LogicalProject(gender=[$4], state=[$7]) LogicalFilter(condition=[AND(IS NOT NULL($4), IS NOT NULL($7))]) diff --git a/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_true.yaml b/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_true.yaml index 7bc4aa1e21..f936ad008e 100644 --- a/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_true.yaml +++ b/integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_true.yaml @@ -3,12 +3,12 @@ calcite: LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(gender=[$0], state=[$1], count=[$2]) LogicalFilter(condition=[<=($3, 2)]) - LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)]) + LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)]) LogicalAggregate(group=[{0, 1}], count=[COUNT()]) LogicalProject(gender=[$4], state=[$7]) CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]]) physical: | EnumerableLimit(fetch=[10000]) EnumerableCalc(expr#0..3=[{inputs}], expr#4=[2], expr#5=[<=($t3, $t4)], proj#0..2=[{exprs}], $condition=[$t5]) - EnumerableWindow(window#0=[window(partition {0} order by [2] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]) + EnumerableWindow(window#0=[window(partition {0} order by [2, 1] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]) CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], 
PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT())], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"gender":{"terms":{"field":"gender.keyword","missing_bucket":true,"missing_order":"first","order":"asc"}}},{"state":{"terms":{"field":"state.keyword","missing_bucket":true,"missing_order":"first","order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) \ No newline at end of file diff --git a/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_false.yaml b/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_false.yaml index 21457c6170..bcb8a6c16f 100644 --- a/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_false.yaml +++ b/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_false.yaml @@ -3,7 +3,7 @@ calcite: LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(gender=[$0], state=[$1], count=[$2]) LogicalFilter(condition=[<=($3, 2)]) - LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)]) + LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)]) LogicalAggregate(group=[{0, 1}], count=[COUNT()]) LogicalProject(gender=[$4], state=[$7]) LogicalFilter(condition=[AND(IS NOT NULL($4), IS NOT NULL($7))]) diff --git a/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_true.yaml b/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_true.yaml index 51ffb88340..704b043dce 100644 --- a/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_true.yaml +++ b/integ-test/src/test/resources/expectedOutput/calcite/explain_top_usenull_true.yaml @@ -3,12 +3,12 @@ calcite: 
LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT]) LogicalProject(gender=[$0], state=[$1], count=[$2]) LogicalFilter(condition=[<=($3, 2)]) - LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)]) + LogicalProject(gender=[$0], state=[$1], count=[$2], _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)]) LogicalAggregate(group=[{0, 1}], count=[COUNT()]) LogicalProject(gender=[$4], state=[$7]) CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]]) physical: | EnumerableLimit(fetch=[10000]) EnumerableCalc(expr#0..3=[{inputs}], expr#4=[2], expr#5=[<=($t3, $t4)], proj#0..2=[{exprs}], $condition=[$t5]) - EnumerableWindow(window#0=[window(partition {0} order by [2 DESC] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]) + EnumerableWindow(window#0=[window(partition {0} order by [2 DESC, 1] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])]) CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT())], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"gender":{"terms":{"field":"gender.keyword","missing_bucket":true,"missing_order":"first","order":"asc"}}},{"state":{"terms":{"field":"state.keyword","missing_bucket":true,"missing_order":"first","order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) \ No newline at end of file diff --git a/opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java b/opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java index 5a815e8745..2bee10c130 100644 --- 
a/opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java +++ b/opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/RareTopPushdownRule.java @@ -8,6 +8,7 @@ import java.util.List; import java.util.function.Predicate; import org.apache.calcite.plan.RelOptRuleCall; +import org.apache.calcite.rel.RelFieldCollation; import org.apache.calcite.rel.logical.LogicalFilter; import org.apache.calcite.rel.logical.LogicalProject; import org.apache.calcite.rel.rules.SubstitutionRule; @@ -49,22 +50,39 @@ protected void onMatchImpl(RelOptRuleCall call) { List<Integer> groupIndices = PlanUtils.getSelectColumns(windows.getFirst().partitionKeys); List<String> byList = groupIndices.stream().map(fieldNameList::get).toList(); - if (windows.getFirst().orderKeys.size() != 1) { + // ORDER BY must be either the count column alone, or the count column followed by the + // rare/top target field ASC (the stable tie-break inserted by visitRareTopN). The pushdown + // to the OpenSearch terms aggregation naturally tie-breaks on `_key:asc`, so the latter + // shape lowers to the same wire request as the single-key shape.
+ List<RexFieldCollation> orderKeys = windows.getFirst().orderKeys; + if (orderKeys.isEmpty() || orderKeys.size() > 2) { return; } - RexFieldCollation orderKey = windows.getFirst().orderKeys.getFirst(); - List<Integer> orderIndices = PlanUtils.getSelectColumns(List.of(orderKey.getKey())); - List<String> orderList = orderIndices.stream().map(fieldNameList::get).toList(); + RexFieldCollation primaryOrderKey = orderKeys.getFirst(); + List<Integer> primaryOrderIndices = + PlanUtils.getSelectColumns(List.of(primaryOrderKey.getKey())); + List<String> primaryOrderList = primaryOrderIndices.stream().map(fieldNameList::get).toList(); List<String> targetList = fieldNameList.stream() .filter(Predicate.not(byList::contains)) - .filter(Predicate.not(orderList::contains)) + .filter(Predicate.not(primaryOrderList::contains)) .toList(); if (targetList.size() != 1) { return; } String targetName = targetList.getFirst(); - digest = new RareTopDigest(targetName, byList, number, orderKey.getDirection()); + if (orderKeys.size() == 2) { + RexFieldCollation tieBreakKey = orderKeys.get(1); + if (tieBreakKey.getDirection() != RelFieldCollation.Direction.ASCENDING) { + return; + } + List<Integer> tieBreakIndices = PlanUtils.getSelectColumns(List.of(tieBreakKey.getKey())); + List<String> tieBreakList = tieBreakIndices.stream().map(fieldNameList::get).toList(); + if (tieBreakList.size() != 1 || !tieBreakList.getFirst().equals(targetName)) { + return; + } + } + digest = new RareTopDigest(targetName, byList, number, primaryOrderKey.getDirection()); } catch (Exception e) { return; } diff --git a/ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLRareTopNTest.java b/ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLRareTopNTest.java index 2fcee84931..21aa15c2e6 100644 --- a/ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLRareTopNTest.java +++ b/ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLRareTopNTest.java @@ -27,7 +27,7 @@ public void testRare() { "LogicalProject(JOB=[$0], count=[$1])\n" + " LogicalFilter(condition=[<=($2, 10)])\n"
+ " LogicalProject(JOB=[$0], count=[$1], _row_number_rare_top_=[ROW_NUMBER() OVER" - + " (ORDER BY $1)])\n" + + " (ORDER BY $1, $0)])\n" + " LogicalAggregate(group=[{0}], count=[COUNT()])\n" + " LogicalProject(JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -38,14 +38,14 @@ public void testRare() { + "JOB=PRESIDENT; count=1\n" + "JOB=ANALYST; count=2\n" + "JOB=MANAGER; count=3\n" - + "JOB=SALESMAN; count=4\n" - + "JOB=CLERK; count=4\n"; + + "JOB=CLERK; count=4\n" + + "JOB=SALESMAN; count=4\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `JOB`, `count`\n" + "FROM (SELECT `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (ORDER BY COUNT(*) NULLS" - + " LAST) `_row_number_rare_top_`\n" + + " LAST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -60,7 +60,7 @@ public void testRareBy() { "LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -69,20 +69,20 @@ public void testRareBy() { String expectedResult = "" + "DEPTNO=20; JOB=MANAGER; count=1\n" - + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=ANALYST; count=2\n" - + "DEPTNO=10; JOB=MANAGER; count=1\n" + + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=10; JOB=CLERK; count=1\n" + + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=PRESIDENT; count=1\n" - + "DEPTNO=30; JOB=MANAGER; count=1\n" + "DEPTNO=30; JOB=CLERK; count=1\n" + + "DEPTNO=30; JOB=MANAGER; count=1\n" + "DEPTNO=30; JOB=SALESMAN; count=4\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT 
`DEPTNO`, `JOB`, `count`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -97,7 +97,7 @@ public void testRareDisableShowCount() { "LogicalProject(DEPTNO=[$0], JOB=[$1])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -106,20 +106,20 @@ public void testRareDisableShowCount() { String expectedResult = "" + "DEPTNO=20; JOB=MANAGER\n" - + "DEPTNO=20; JOB=CLERK\n" + "DEPTNO=20; JOB=ANALYST\n" - + "DEPTNO=10; JOB=MANAGER\n" + + "DEPTNO=20; JOB=CLERK\n" + "DEPTNO=10; JOB=CLERK\n" + + "DEPTNO=10; JOB=MANAGER\n" + "DEPTNO=10; JOB=PRESIDENT\n" - + "DEPTNO=30; JOB=MANAGER\n" + "DEPTNO=30; JOB=CLERK\n" + + "DEPTNO=30; JOB=MANAGER\n" + "DEPTNO=30; JOB=SALESMAN\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -134,7 +134,7 @@ public void testRareCountField() { "LogicalProject(DEPTNO=[$0], JOB=[$1], my_cnt=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], my_cnt=[$2]," - + " 
_row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)])\n" + " LogicalAggregate(group=[{0, 1}], my_cnt=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -143,20 +143,20 @@ public void testRareCountField() { String expectedResult = "" + "DEPTNO=20; JOB=MANAGER; my_cnt=1\n" - + "DEPTNO=20; JOB=CLERK; my_cnt=2\n" + "DEPTNO=20; JOB=ANALYST; my_cnt=2\n" - + "DEPTNO=10; JOB=MANAGER; my_cnt=1\n" + + "DEPTNO=20; JOB=CLERK; my_cnt=2\n" + "DEPTNO=10; JOB=CLERK; my_cnt=1\n" + + "DEPTNO=10; JOB=MANAGER; my_cnt=1\n" + "DEPTNO=10; JOB=PRESIDENT; my_cnt=1\n" - + "DEPTNO=30; JOB=MANAGER; my_cnt=1\n" + "DEPTNO=30; JOB=CLERK; my_cnt=1\n" + + "DEPTNO=30; JOB=MANAGER; my_cnt=1\n" + "DEPTNO=30; JOB=SALESMAN; my_cnt=4\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`, `my_cnt`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `my_cnt`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -171,7 +171,7 @@ public void testRareUseNullFalse() { "LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalFilter(condition=[AND(IS NOT NULL($7), IS NOT NULL($2))])\n" @@ -181,20 +181,20 @@ public void testRareUseNullFalse() { String expectedResult = "" + "DEPTNO=20; JOB=MANAGER; count=1\n" - 
+ "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=ANALYST; count=2\n" - + "DEPTNO=10; JOB=MANAGER; count=1\n" + + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=10; JOB=CLERK; count=1\n" + + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=PRESIDENT; count=1\n" - + "DEPTNO=30; JOB=MANAGER; count=1\n" + "DEPTNO=30; JOB=CLERK; count=1\n" + + "DEPTNO=30; JOB=MANAGER; count=1\n" + "DEPTNO=30; JOB=SALESMAN; count=4\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`, `count`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) NULLS LAST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "WHERE `DEPTNO` IS NOT NULL AND `JOB` IS NOT NULL\n" + "GROUP BY `DEPTNO`, `JOB`) `t2`\n" @@ -230,7 +230,7 @@ public void testTop() { "LogicalProject(JOB=[$0], count=[$1])\n" + " LogicalFilter(condition=[<=($2, 10)])\n" + " LogicalProject(JOB=[$0], count=[$1], _row_number_rare_top_=[ROW_NUMBER() OVER" - + " (ORDER BY $1 DESC)])\n" + + " (ORDER BY $1 DESC, $0)])\n" + " LogicalAggregate(group=[{0}], count=[COUNT()])\n" + " LogicalProject(JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -238,8 +238,8 @@ public void testTop() { String expectedResult = "" - + "JOB=SALESMAN; count=4\n" + "JOB=CLERK; count=4\n" + + "JOB=SALESMAN; count=4\n" + "JOB=MANAGER; count=3\n" + "JOB=ANALYST; count=2\n" + "JOB=PRESIDENT; count=1\n"; @@ -248,7 +248,7 @@ public void testTop() { String expectedSparkSql = "SELECT `JOB`, `count`\n" + "FROM (SELECT `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC" - + " NULLS FIRST) `_row_number_rare_top_`\n" + + " NULLS FIRST, `JOB` NULLS LAST) `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -263,7 +263,7 @@ public void testTopBy() { "LogicalProject(DEPTNO=[$0], 
JOB=[$1], count=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -271,21 +271,22 @@ public void testTopBy() { String expectedResult = "" - + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=ANALYST; count=2\n" + + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=MANAGER; count=1\n" - + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=CLERK; count=1\n" + + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=PRESIDENT; count=1\n" + "DEPTNO=30; JOB=SALESMAN; count=4\n" - + "DEPTNO=30; JOB=MANAGER; count=1\n" - + "DEPTNO=30; JOB=CLERK; count=1\n"; + + "DEPTNO=30; JOB=CLERK; count=1\n" + + "DEPTNO=30; JOB=MANAGER; count=1\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`, `count`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST, `JOB` NULLS LAST)" + + " `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -300,7 +301,7 @@ public void testTopDisableShowCount() { "LogicalProject(DEPTNO=[$0], JOB=[$1])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " 
LogicalTableScan(table=[[scott, EMP]])\n"; @@ -308,21 +309,22 @@ public void testTopDisableShowCount() { String expectedResult = "" - + "DEPTNO=20; JOB=CLERK\n" + "DEPTNO=20; JOB=ANALYST\n" + + "DEPTNO=20; JOB=CLERK\n" + "DEPTNO=20; JOB=MANAGER\n" - + "DEPTNO=10; JOB=MANAGER\n" + "DEPTNO=10; JOB=CLERK\n" + + "DEPTNO=10; JOB=MANAGER\n" + "DEPTNO=10; JOB=PRESIDENT\n" + "DEPTNO=30; JOB=SALESMAN\n" - + "DEPTNO=30; JOB=MANAGER\n" - + "DEPTNO=30; JOB=CLERK\n"; + + "DEPTNO=30; JOB=CLERK\n" + + "DEPTNO=30; JOB=MANAGER\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST, `JOB` NULLS LAST)" + + " `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -337,7 +339,7 @@ public void testTopCountField() { "LogicalProject(DEPTNO=[$0], JOB=[$1], my_cnt=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], my_cnt=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)])\n" + " LogicalAggregate(group=[{0, 1}], my_cnt=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; @@ -345,21 +347,22 @@ public void testTopCountField() { String expectedResult = "" - + "DEPTNO=20; JOB=CLERK; my_cnt=2\n" + "DEPTNO=20; JOB=ANALYST; my_cnt=2\n" + + "DEPTNO=20; JOB=CLERK; my_cnt=2\n" + "DEPTNO=20; JOB=MANAGER; my_cnt=1\n" - + "DEPTNO=10; JOB=MANAGER; my_cnt=1\n" + "DEPTNO=10; JOB=CLERK; my_cnt=1\n" + + "DEPTNO=10; JOB=MANAGER; my_cnt=1\n" + "DEPTNO=10; JOB=PRESIDENT; my_cnt=1\n" + "DEPTNO=30; JOB=SALESMAN; my_cnt=4\n" - + "DEPTNO=30; JOB=MANAGER; my_cnt=1\n" - + 
"DEPTNO=30; JOB=CLERK; my_cnt=1\n"; + + "DEPTNO=30; JOB=CLERK; my_cnt=1\n" + + "DEPTNO=30; JOB=MANAGER; my_cnt=1\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`, `my_cnt`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `my_cnt`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST) `_row_number_rare_top_`\n" + + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST, `JOB` NULLS LAST)" + + " `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "GROUP BY `DEPTNO`, `JOB`) `t1`\n" + "WHERE `_row_number_rare_top_` <= 10"; @@ -374,7 +377,7 @@ public void testTopUseNullFalse() { "LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2])\n" + " LogicalFilter(condition=[<=($3, 10)])\n" + " LogicalProject(DEPTNO=[$0], JOB=[$1], count=[$2]," - + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC)])\n" + + " _row_number_rare_top_=[ROW_NUMBER() OVER (PARTITION BY $0 ORDER BY $2 DESC, $1)])\n" + " LogicalAggregate(group=[{0, 1}], count=[COUNT()])\n" + " LogicalProject(DEPTNO=[$7], JOB=[$2])\n" + " LogicalFilter(condition=[AND(IS NOT NULL($7), IS NOT NULL($2))])\n" @@ -383,21 +386,22 @@ public void testTopUseNullFalse() { String expectedResult = "" - + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=ANALYST; count=2\n" + + "DEPTNO=20; JOB=CLERK; count=2\n" + "DEPTNO=20; JOB=MANAGER; count=1\n" - + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=CLERK; count=1\n" + + "DEPTNO=10; JOB=MANAGER; count=1\n" + "DEPTNO=10; JOB=PRESIDENT; count=1\n" + "DEPTNO=30; JOB=SALESMAN; count=4\n" - + "DEPTNO=30; JOB=MANAGER; count=1\n" - + "DEPTNO=30; JOB=CLERK; count=1\n"; + + "DEPTNO=30; JOB=CLERK; count=1\n" + + "DEPTNO=30; JOB=MANAGER; count=1\n"; verifyResult(root, expectedResult); String expectedSparkSql = "SELECT `DEPTNO`, `JOB`, `count`\n" + "FROM (SELECT `DEPTNO`, `JOB`, COUNT(*) `count`, ROW_NUMBER() OVER (PARTITION BY" - + " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST) `_row_number_rare_top_`\n" + 
+ " `DEPTNO` ORDER BY COUNT(*) DESC NULLS FIRST, `JOB` NULLS LAST)" + + " `_row_number_rare_top_`\n" + "FROM `scott`.`EMP`\n" + "WHERE `DEPTNO` IS NOT NULL AND `JOB` IS NOT NULL\n" + "GROUP BY `DEPTNO`, `JOB`) `t2`\n" From 59c9f7dcc6a8a67959bbd41ab07e0aa06f91eddf Mon Sep 17 00:00:00 2001 From: Kai Huang Date: Mon, 11 May 2026 16:43:39 -0700 Subject: [PATCH 3/3] Reorder doctest null rows in rare.md / top.md to match new ASC tie-break The stable tie-break added in the previous commit appends the rare/top field columns as secondary ASC keys to the `ROW_NUMBER` `ORDER BY`. ASC ordering uses NULLS LAST by default, so the existing `usenull=true email` examples in `docs/user/ppl/cmd/rare.md` and `docs/user/ppl/cmd/top.md` now emit `null` last instead of first. Update the doctest expected output blocks accordingly. No behavior change for the non-null rows. Signed-off-by: Kai Huang --- docs/user/ppl/cmd/rare.md | 2 +- docs/user/ppl/cmd/top.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/user/ppl/cmd/rare.md b/docs/user/ppl/cmd/rare.md index c3f28eb1a7..8014e95785 100644 --- a/docs/user/ppl/cmd/rare.md +++ b/docs/user/ppl/cmd/rare.md @@ -152,10 +152,10 @@ fetched rows / total rows = 4/4 +-----------------------+-------+ | email | count | |-----------------------+-------| -| null | 1 | | amberduke@pyrami.com | 1 | | daleadams@boink.com | 1 | | hattiebond@netagy.com | 1 | +| null | 1 | +-----------------------+-------+ ``` diff --git a/docs/user/ppl/cmd/top.md b/docs/user/ppl/cmd/top.md index 4f93c8fd58..56f8ba7481 100644 --- a/docs/user/ppl/cmd/top.md +++ b/docs/user/ppl/cmd/top.md @@ -166,10 +166,10 @@ fetched rows / total rows = 4/4 +-----------------------+-------+ | email | count | |-----------------------+-------| -| null | 1 | | amberduke@pyrami.com | 1 | | daleadams@boink.com | 1 | | hattiebond@netagy.com | 1 | +| null | 1 | +-----------------------+-------+ ```