Streaming offset v1 to v2 migration #29
Open
ericm-db wants to merge 130 commits into streaming-source-identifying-name from
Conversation
ce945a3 to f0ffdc9
…sted runner test action ### What changes were proposed in this pull request? Add `PYSPARK_TEST_TIMEOUT` to hosted runner test action. ### Why are the changes needed? The test could get stuck in hosted runners: https://github.com/apache/spark/actions/runs/22286360532 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Action change. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54431 from gaogaotiantian/add-timeout-hosted-runner. Authored-by: Tian Gao <gaogaotiantian@hotmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
… hanging and enable retries ### What changes were proposed in this pull request? This PR aims to add a generic connection and read timeout parameter timeout=(3.05, 30) to the `jira.client.JIRA()` initializations in the `dev/merge_spark_pr.py` script. ### Why are the changes needed? By default, the `requests` library (used by the jira python package) can hang indefinitely when attempting to establish a connection if the server is unresponsive. Setting a short connection timeout of 3.05 seconds (slightly larger than a standard 3-second TCP packet retransmission window) ensures that the client will quickly fail the connection attempt and properly trigger its internal retry logic, rather than stalling the entire script. The 30-second read timeout allows sufficient time for normal API responses once the connection is successfully established. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual tests. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: `Gemini 3.1 Pro (High)` on `Antigravity` Closes apache#54432 from dongjoon-hyun/SPARK-55643. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
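The `(connect, read)` split that `requests` applies to a tuple timeout can be sketched as follows; `split_timeout` is a hypothetical helper mirroring that normalization, not part of the PR:

```python
def split_timeout(timeout):
    """Normalize a requests-style timeout into (connect, read) seconds.

    A bare number applies to both phases; a 2-tuple sets them separately,
    e.g. (3.05, 30) fails unresponsive connects quickly while still
    allowing slow-but-healthy API responses to complete.
    """
    if isinstance(timeout, tuple):
        connect, read = timeout
    else:
        connect = read = timeout
    return connect, read

assert split_timeout((3.05, 30)) == (3.05, 30)
assert split_timeout(10) == (10, 10)
```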
… separate thread-local capture from execution
### What changes were proposed in this pull request?
Previously, callers had to provide an ExecutorService upfront: thread-local capture and task submission were fused into a single call that immediately returned a CompletableFuture.
Now, captureThreadLocals(sparkSession) captures the current thread's SQL context into a standalone SQLExecutionThreadLocalCaptured object. Callers can then invoke `runWith { body }` on any thread, at any time, using any concurrency primitive — not just ExecutorService.
`withThreadLocalCaptured` is preserved for backward compatibility and now delegates to these two primitives.
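The capture/run split is analogous to Python's `contextvars`: snapshot the caller's context once, then execute a body under that snapshot later, on any thread. A minimal sketch (the names `capture_thread_locals` and `run_with` are illustrative stand-ins, not Spark APIs):

```python
import contextvars
import threading

session = contextvars.ContextVar("session", default=None)

def capture_thread_locals():
    # like captureThreadLocals: snapshot the current thread's context
    return contextvars.copy_context()

def run_with(captured, body):
    # like runWith { body }: evaluate body under the captured snapshot
    return captured.run(body)

session.set("sql-context-42")
captured = capture_thread_locals()

result = {}
t = threading.Thread(target=lambda: result.update(v=run_with(captured, session.get)))
t.start()
t.join()
assert result["v"] == "sql-context-42"  # visible despite running on a new thread
```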
### Why are the changes needed?
Refactoring to make withThreadLocalCaptured easier to use.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
Existing UTs
### Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#54434 from huanliwang-db/huanliwang-db/refactor-sqlthread.
Authored-by: huanliwang-db <huanli.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…Step` traits to `Stable` ### What changes were proposed in this pull request? This PR aims to promote the following traits to `Stable` from Apache Spark 4.2.0: - `KubernetesFeatureConfigStep` - `KubernetesDriverCustomFeatureConfigStep` - `KubernetesExecutorCustomFeatureConfigStep` ### Why are the changes needed? Since Apache Spark 3.3.0, the `Kubernetes*FeatureConfigStep` traits have been serving stably without any modifications for 4 years. We had better promote them to `Stable`. - apache#35345 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: `Gemini 3.1 Pro (High)` on `Antigravity` Closes apache#54439 from dongjoon-hyun/SPARK-55649. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ilureTracker ### What changes were proposed in this pull request? Adds tracking of executor pod creation with the ExecutorFailureTracker. ### Why are the changes needed? If there are unrecoverable pod creation errors then Spark continues to try and create pods instead of failing. An example is where a notebook server is constrained to have a maximum number of pods and the user tries to start a notebook with twice the number of executors as the limit. In this case the user gets an 'Unauthorized' message in the logs but Spark will keep on trying to spin up new pods. By tracking pod creation failures we can stop trying after reaching max executor failures. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? New unit tests added ### Was this patch authored or co-authored using generative AI tooling? Unit tests generated using Claude Code. Closes apache#53840 from parthchandra/k8s-failures. Authored-by: Parth Chandra <parthc@apple.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
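The counting logic can be sketched as a small tracker; this is an illustrative model of the idea, not the actual `ExecutorFailureTracker` API:

```python
class FailureTracker:
    """Stop retrying pod creation after too many failures."""

    def __init__(self, max_failures):
        self.max_failures = max_failures
        self.failures = 0

    def register_failure(self):
        self.failures += 1

    def should_abort(self):
        return self.failures >= self.max_failures

tracker = FailureTracker(max_failures=3)
for _ in range(3):
    assert not tracker.should_abort()
    tracker.register_failure()  # e.g. an 'Unauthorized' pod creation error
assert tracker.should_abort()   # give up instead of spinning up pods forever
```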
…tributes with non-binary-stable collations
### What changes were proposed in this pull request?
* `ConstantPropagation` optimizer rule substitutes attributes with literals derived from equality
predicates (e.g. `c = 'hello'`), then propagates them into other conditions in the same
conjunction. This is unsafe for non-binary-stable collations (e.g. `UTF8_LCASE`) where
equality is non-identity: `c = 'hello'` (case-insensitive) does not imply `c` holds exactly
the bytes `'hello'` - it could also be `'HELLO'`, `'Hello'`, etc.
* Substituting `c → 'hello'` in a second condition like `c = 'HELLO' COLLATE UNICODE` turns it
into the constant expression `'hello' = 'HELLO' COLLATE UNICODE`, which is always `false`,
producing incorrect results.
* Fixed by guarding `safeToReplace` with `isBinaryStable(ar.dataType)` so propagation is skipped
for attributes whose type is not binary-stable.
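The unsafe fold can be reproduced with a tiny model of case-insensitive equality (standing in for `UTF8_LCASE`), where Python's `==` plays the role of a binary-sensitive comparison:

```python
def lcase_eq(a, b):
    # UTF8_LCASE-style equality: non-identity, case-insensitive
    return a.lower() == b.lower()

c = "HELLO"               # a row value satisfying c = 'hello' under UTF8_LCASE
assert lcase_eq(c, "hello")

# Unsafe propagation substitutes the literal bytes 'hello' for c in the
# second (binary-sensitive) condition, constant-folding it to false:
assert ("hello" == "HELLO") is False
# ...even though the un-substituted condition is true for this row:
assert (c == "HELLO") is True
```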
### Why are the changes needed?
Bug fix.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
New unit test.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#54435 from ilicmarkodb/fix_collation_.
Authored-by: ilicmarkodb <marko.ilic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…iases and ResolveInlineTables ### What changes were proposed in this pull request? Replace `AlwaysProcess.fn` with pattern-based pruning in two Analyzer rules: 1. **EliminateSubqueryAliases**: Use `_.containsPattern(SUBQUERY_ALIAS)` - Skips entire plan traversal when no `SubqueryAlias` nodes exist - Common in resolved plans after initial resolution passes 2. **ResolveInlineTables**: Use `_.containsPattern(INLINE_TABLE_EVAL)` - Skips traversal when no `UnresolvedInlineTable` nodes exist - Inline tables are rare; most queries never contain them Also adds `INLINE_TABLE_EVAL` to `UnresolvedInlineTable.nodePatterns`, which was previously only defined on `ResolvedInlineTable`. Without this, the pruning condition for `ResolveInlineTables` could never be satisfied for unresolved inline tables. Both rules previously used `AlwaysProcess.fn`, forcing full tree traversal on every fixedPoint iteration even when no matching nodes existed. TreePatternBits propagation enables O(1) root-level short-circuit. ### Why are the changes needed? Performance optimization: avoids unnecessary full-plan traversals during analysis when the relevant node types are absent. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests: `AnalysisSuite`, `EliminateSubqueryAliasesSuite`, and inline table related tests all pass. ### Was this patch authored or co-authored using generative AI tooling? Yes, GitHub Copilot. Closes apache#54440 from yaooqinn/tree-pattern-pruning-inline-tables. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Kent Yao <kentyao@microsoft.com>
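The O(1) root-level short-circuit can be sketched with explicit pattern-bit propagation; `Node` and `resolve_inline_tables` here are illustrative stand-ins for the TreePatternBits machinery:

```python
class Node:
    def __init__(self, name, patterns=(), children=()):
        self.name = name
        self.children = list(children)
        # union the subtree's patterns upward, like TreePatternBits propagation
        self.pattern_bits = set(patterns)
        for child in self.children:
            self.pattern_bits |= child.pattern_bits

def resolve_inline_tables(node, visited):
    # root-level check: skip the whole traversal when the pattern is absent
    if "INLINE_TABLE_EVAL" not in node.pattern_bits:
        return
    visited.append(node.name)
    for child in node.children:
        resolve_inline_tables(child, visited)

plain = Node("Project", children=[Node("Relation")])
visited = []
resolve_inline_tables(plain, visited)
assert visited == []  # O(1) skip: no traversal at all

with_table = Node("Project", children=[
    Node("UnresolvedInlineTable", patterns=("INLINE_TABLE_EVAL",))])
visited = []
resolve_inline_tables(with_table, visited)
assert visited == ["Project", "UnresolvedInlineTable"]
```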
### What changes were proposed in this pull request? This PR aims to support `NetworkPolicy` for Spark executor pods. ### Why are the changes needed? `NetworkPolicy` is frequently used in the production to isolate Spark applications. - https://kubernetes.io/docs/concepts/services-networking/network-policies/ ### Does this PR introduce _any_ user-facing change? This is a security feature to make Spark K8s executor pods access only from the pods with the same application ID. There are two ways if a user wants to access the executor pods from outside. 1. Use a pod with the same application ID with the target Spark applications. 2. Submit a Spark job with the following configuration. ``` spark.kubernetes.driver.pod.excludedFeatureSteps=org.apache.spark.deploy.k8s.features.NetworkPolicyFeatureStep ``` ### How was this patch tested? Pass the CIs with the newly added test suite. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: `Gemini 3.1 Pro (High)` on `Antigravity` Closes apache#54442 from dongjoon-hyun/SPARK-55653. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…when counts are equal ### What changes were proposed in this pull request? This PR fixes `CountVectorizer` to use a deterministic ordering when selecting the top vocabulary terms. Specifically, when two terms have the same frequency (count), they are now sorted by the term itself (lexicographically) as a tie-breaker. ### Why are the changes needed? Currently, `CountVectorizer` uses `wordCounts.top(...)(Ordering.by(_._2))` to select the vocabulary. This comparison only considers term counts. When multiple terms have the same count, the resulting order in the vocabulary is non-deterministic and depends on the RDD partition processing order or the iteration order of the internal hash maps. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass Github Actions - Added a new test case in `CountVectorizerSuite` that intentionally creates a dataset with tied term counts and asserts a specific, deterministic vocabulary order. ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#54446 from LuciferYang/SPARK-55655. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
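The tie-breaking order can be sketched directly: sort by count descending, then by term ascending, so equal counts always yield the same vocabulary:

```python
word_counts = {"banana": 3, "apple": 3, "cherry": 5, "date": 3}

# count descending, then term ascending as the deterministic tie-breaker
vocab = [term for term, _ in sorted(word_counts.items(),
                                    key=lambda kv: (-kv[1], kv[0]))]

assert vocab == ["cherry", "apple", "banana", "date"]
```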
### What changes were proposed in this pull request? This pull request adds a new ExecuteOutput command to the Spark Connect pipelines protobuf definition. The new command enables clients to directly execute multiple flows writing to an output. ### Why are the changes needed? Required to enable standalone Python MV/ST ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? ### Was this patch authored or co-authored using generative AI tooling? Closes apache#54104 from aditya-nambiar/SPARK-55264. Authored-by: Aditya Nambiar <aditya.nambiar007@gmail.com> Signed-off-by: Herman van Hövell <herman@databricks.com>
### What changes were proposed in this pull request?
`AQEShuffleRead` can have `local` / `coalesced` / `skewed` / `coalesced and skewed` properties when reading shuffle files. When the Physical Plan Tree is complex, it is hard to track this info by correlating each `AQEShuffleRead` node with its details section, e.g. which `AQEShuffleRead` has a local read or skewed partitions. In the skewed SortMergeJoin case below, for example, this helps to understand which SMJ leg has the `AQEShuffleRead` with skew. This addition surfaces these properties at the physical plan tree level. The details section per `AQEShuffleRead` node also shows them, but when the query plan tree is very complex (e.g. composed of 1,000+ physical nodes), correlating that information with the tree is hard.
**Current Physical Plan Tree:**
```
== Physical Plan ==
AdaptiveSparkPlan (24)
+- == Final Plan ==
ResultQueryStage (17), Statistics(sizeInBytes=8.0 EiB)
+- * Project (16)
+- * SortMergeJoin(skew=true) Inner (15)
:- * Sort (7)
: +- AQEShuffleRead (6)
: +- ShuffleQueryStage (5), Statistics(sizeInBytes=15.6 KiB, rowCount=1.00E+3)
: +- Exchange (4)
: +- * Project (3)
: +- * Filter (2)
: +- * Range (1)
+- * Sort (14)
+- AQEShuffleRead (13)
+- ShuffleQueryStage (12), Statistics(sizeInBytes=3.1 KiB, rowCount=200)
+- Exchange (11)
+- * Project (10)
+- * Filter (9)
+- * Range (8)
```
**New Physical Plan Tree:**
```
== Physical Plan ==
AdaptiveSparkPlan (24)
+- == Final Plan ==
ResultQueryStage (17), Statistics(sizeInBytes=8.0 EiB)
+- * Project (16)
+- * SortMergeJoin(skew=true) Inner (15)
:- * Sort (7)
: +- AQEShuffleRead (6), coalesced
: +- ShuffleQueryStage (5), Statistics(sizeInBytes=15.6 KiB, rowCount=1.00E+3)
: +- Exchange (4)
: +- * Project (3)
: +- * Filter (2)
: +- * Range (1)
+- * Sort (14)
+- AQEShuffleRead (13), coalesced and skewed
+- ShuffleQueryStage (12), Statistics(sizeInBytes=3.1 KiB, rowCount=200)
+- Exchange (11)
+- * Project (10)
+- * Filter (9)
+- * Range (8)
```
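The suffix rendering can be sketched as below; the helper name and exact joining logic are assumptions inferred from the sample output above, not the actual Scala implementation:

```python
def aqe_shuffle_read_label(is_local=False, coalesced=False, skewed=False):
    # join whichever properties apply, matching the sample plan output
    props = [name for name, on in (("local", is_local),
                                   ("coalesced", coalesced),
                                   ("skewed", skewed)) if on]
    return "AQEShuffleRead" + (", " + " and ".join(props) if props else "")

assert aqe_shuffle_read_label(coalesced=True) == "AQEShuffleRead, coalesced"
assert (aqe_shuffle_read_label(coalesced=True, skewed=True)
        == "AQEShuffleRead, coalesced and skewed")
assert aqe_shuffle_read_label() == "AQEShuffleRead"
```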
### Why are the changes needed?
When the physical plan tree is complex (e.g. composed of 1,000+ physical nodes), it is hard to correlate this information with `AQEShuffleRead` details.
### Does this PR introduce _any_ user-facing change?
Yes, when the user investigates the physical plan, the new `AQEShuffleRead` properties will be shown in the Physical Plan Tree.
### How was this patch tested?
Added a new UT
### Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#53817 from erenavsarogullari/SPARK-55052.
Authored-by: Eren Avsarogullari <eren@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…mes on SessionCatalog ### What changes were proposed in this pull request? Add a unit test for Iceberg's case of supporting multi-part identifiers in SessionCatalog (for metadata tables). Add a fake metadata table to InMemoryDataSource. ### Why are the changes needed? It can increase Spark coverage to catch issues like: apache#54247 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Ran the added test ### Was this patch authored or co-authored using generative AI tooling? Yes, cursor claude 4.5 opus Closes apache#54411 from szehon-ho/add_test. Authored-by: Szehon Ho <szehon.apache@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request? This PR aims to improve `EventLogFileWriter` to log `stop` operation. ### Why are the changes needed? Apache Spark has been logging the start of event log processing. We had better log the end of event log processing as a pair. **BEFORE** ``` $ bin/run-example -c spark.eventLog.enabled=true -c spark.eventLog.dir=/tmp SparkPi 2>&1 | grep RollingEventLogFilesWriter 26/02/24 08:24:27 INFO RollingEventLogFilesWriter: Logging events to file:/private/tmp/eventlog_v2_local-1771950267185/events_1_local-1771950267185.zstd ``` **AFTER** ``` $ bin/run-example -c spark.eventLog.enabled=true -c spark.eventLog.dir=/tmp SparkPi 2>&1 | grep RollingEventLogFilesWriter 26/02/24 08:24:41 INFO RollingEventLogFilesWriter: Logging events to file:/private/tmp/eventlog_v2_local-1771950279197/events_1_local-1771950279197.zstd 26/02/24 08:24:42 INFO RollingEventLogFilesWriter: Stopping event writer for file:/private/tmp/eventlog_v2_local-1771950279197 ``` ### Does this PR introduce _any_ user-facing change? No behavior change because this is a log. ### How was this patch tested? Manual test. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: `Gemini 3.1 Pro (High)` on `Antigravity` Closes apache#54452 from dongjoon-hyun/SPARK-55659. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
e798736 to 9f02f2f
…y(axis)` with pandas 3 ### What changes were proposed in this pull request? Handles the unexpected keyword argument error for `groupby(axis)` with pandas 3. ### Why are the changes needed? The `axis` argument was removed from `groupby` in pandas 3. ### Does this PR introduce _any_ user-facing change? Yes, it will behave more like pandas 3. ### How was this patch tested? Updated the related tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54436 from ueshin/issues/SPARK-55648/axis. Lead-authored-by: Takuya Ueshin <ueshin@databricks.com> Co-authored-by: Takuya UESHIN <ueshin@databricks.com> Signed-off-by: Takuya Ueshin <ueshin@databricks.com>
9f02f2f to 7bf7e7d
…mimic the CoW mode behavior ### What changes were proposed in this pull request? This is another follow-up of apache#54375. Disconnects the anchor for more cases. ### Why are the changes needed? The anchor can be disconnected in most cases with pandas 3 to mimic the CoW mode behavior. ### Does this PR introduce _any_ user-facing change? Yes, it will behave more like pandas 3. ### How was this patch tested? The existing tests should pass. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54437 from ueshin/issues/SPARK-55296/cow_series. Authored-by: Takuya Ueshin <ueshin@databricks.com> Signed-off-by: Takuya Ueshin <ueshin@databricks.com>
7bf7e7d to 50b06fc
… has 0 columns on classic ### What changes were proposed in this pull request? This PR fixes the row count loss issue when creating a Spark DataFrame from a pandas DataFrame with 0 columns in classic. The issue occurs due to PyArrow limitations when creating RecordBatches or Tables with 0 columns - row count information is lost. ### Why are the changes needed? Before this fix: ```python import pandas as pd from pyspark.sql.types import StructType pdf = pd.DataFrame(index=range(5)) # 5 rows, 0 columns df = spark.createDataFrame(pdf, schema=StructType([])) df.count() # Returns 0 (wrong!) ``` After this fix: ```python df.count() # Returns 5 (correct!) ``` ### Does this PR introduce _any_ user-facing change? Yes. Creating a DataFrame from a pandas DataFrame with 0 columns now correctly preserves the row count in Classic Spark. ### How was this patch tested? Added unit test `test_from_pandas_dataframe_with_zero_columns` in `test_creation.py` that tests both Arrow-enabled and Arrow-disabled paths. ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#54382 from Yicong-Huang/SPARK-55600/fix/pandas-arrow-zero-columns-row-count. Authored-by: Yicong-Huang <17627829+Yicong-Huang@users.noreply.github.com> Signed-off-by: Takuya Ueshin <ueshin@databricks.com>
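The failure mode is that a purely columnar batch with zero columns has no arrays left to carry the length. A minimal sketch of the idea behind the fix, carrying the row count alongside the (empty) column data; the dict-based batch shape here is illustrative, not PyArrow's:

```python
import pandas as pd

pdf = pd.DataFrame(index=range(5))          # 5 rows, 0 columns
assert pdf.shape == (5, 0)

def to_batch(pdf):
    # thread num_rows through explicitly so it survives even with no columns
    return {"num_rows": len(pdf),
            "columns": {c: pdf[c].tolist() for c in pdf.columns}}

batch = to_batch(pdf)
assert batch["num_rows"] == 5               # row count preserved
assert batch["columns"] == {}
```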
I generated some benchmarks for the new implementation and compared against the old implementation. The performance numbers are shown below.
- **Row counts:** 1,000 and 10,000
- **Column counts:** 2, 5, 10, 20, 40, 100
- **Data distribution:** Random uniform distribution over 10 distinct values per column
- **Total tests:** 11 configurations

| Rows | Columns | Old Time | New Time | Speedup | Time Saved | Improvement | Jobs (Old→New) | Jobs Saved |
|---------|---------|----------|----------|----------|------------|-------------|----------------|------------|
| 1,000 | **1** | **0.125s** | **0.188s** | **0.66x** | **-0.063s** | **-50.6%** | **2 → 3** | **-1** |
| 1,000 | 2 | 0.226s | 0.233s | 0.97x | -0.007s | -2.9% | 4 → 3 | 1 |
| 1,000 | 5 | 0.501s | 0.225s | 2.23x | 0.276s | 55.1% | 10 → 3 | 7 |
| 1,000 | 10 | 0.861s | 0.351s | 2.46x | 0.511s | 59.3% | 20 → 3 | 17 |
| 1,000 | 20 | 1.539s | 0.418s | 3.68x | 1.120s | 72.8% | 40 → 3 | 37 |
| 1,000 | 40 | 3.176s | 0.514s | 6.18x | 2.662s | 83.8% | 80 → 3 | 77 |
| 1,000 | 100 | 7.483s | 0.586s | 12.77x | 6.897s | 92.2% | 200 → 3 | 197 |
| 10,000 | **1** | **0.073s** | **0.111s** | **0.66x** | **-0.038s** | **-51.9%** | **2 → 3** | **-1** |
| 10,000 | 5 | 0.362s | 0.194s | 1.87x | 0.168s | 46.5% | 10 → 3 | 7 |
| 10,000 | 10 | 1.446s | 0.257s | 5.61x | 1.188s | 82.2% | 20 → 3 | 17 |
| 10,000 | 20 | 1.424s | 0.382s | 3.72x | 1.041s | 73.1% | 40 → 3 | 37 |
| 10,000 | 40 | 3.171s | 0.521s | 6.09x | 2.650s | 83.6% | 80 → 3 | 77 |
| 10,000 | 100 | 10.953s | 1.163s | 9.41x | 9.789s | 89.4% | 200 → 3 | 197 |

**Aggregate Statistics:**
- Average speedup: 4.33x
- Average improvement: 48.7%
- Average jobs saved: 54.2 per operation
- Maximum speedup: 12.77x (100 columns)
- **Regression case: 0.66x for N=1** (new approach is 50% slower)

### What changes were proposed in this pull request? Fixes describe for string-only dataframes to have a fixed number of jobs rather than one job per column ### Why are the changes needed?
Performance ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI ### Was this patch authored or co-authored using generative AI tooling? Co-authored-by: Claude Sonnet 4.5 Closes apache#54370 from devin-petersohn/devin/describe_strings_oneshot. Authored-by: Devin Petersohn <devin.petersohn@gmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
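The one-shot pattern can be sketched with pandas: aggregate all string columns in a single combined pass instead of issuing one aggregation per column. An illustrative sketch of the idea, not the actual PySpark implementation:

```python
import pandas as pd

df = pd.DataFrame({"a": ["x", "y", "x"], "b": ["q", "q", "r"]})

# one combined aggregation over every column at once (a fixed number of
# "jobs"), rather than looping df[c].agg(...) per column (one per column)
summary = df.agg(["count", "min", "max"])

assert summary.loc["count", "a"] == 3
assert summary.loc["min", "b"] == "q"
assert summary.loc["max", "a"] == "y"
```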
### What changes were proposed in this pull request? Always set `__module__` to be something meaningful for datasource functions and workers. ### Why are the changes needed? The data source profiler depends on the module name of the worker/function. When invoked with the simple worker, the `__module__` would be `__main__`, which is not informative enough for the profilers. We should avoid having them be `__main__`. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Locally simple worker + profiler recognizes it. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54457 from gaogaotiantian/unify-module-name. Authored-by: Tian Gao <gaogaotiantian@hotmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
### What changes were proposed in this pull request? This PR adds Arrow schema type validation for the `pa.RecordBatch` code path in Python data source reads. The fix adds a `pa_schema.equals(first_element.schema)` check after the existing column name validation in `records_to_arrow_batches()`, raising a clear `DATA_SOURCE_RETURN_SCHEMA_MISMATCH` error with the expected and actual Arrow schemas. ### Why are the changes needed? When a Python data source returns a `pa.RecordBatch` with data types that don't match the declared schema, the resulting JVM-side errors are confusing and do not indicate the root cause. For example: - `IllegalArgumentException: not all nodes, buffers and variadicBufferCounts were consumed` from `VectorLoader.load()` - `UnsupportedOperationException: Cannot call the method "getUTF8String" of ArrowColumnVector$ArrowVectorAccessor` These errors give no indication that the issue is a schema type mismatch in the Python data source's `read()` method. ### Does this PR introduce _any_ user-facing change? Yes. Previously, returning a `pa.RecordBatch` with mismatched types from a Python data source would result in cryptic JVM errors. Now it raises a clear `DATA_SOURCE_RETURN_SCHEMA_MISMATCH` error showing the expected and actual Arrow schemas. Example error msg: ``` PySparkRuntimeError: [DATA_SOURCE_RETURN_SCHEMA_MISMATCH] Return schema mismatch in the result from 'read' method. Expected: <expected_schema>, Found: <actual_schema> ``` Example error msg for streaming: ``` [PYTHON_STREAMING_DATA_SOURCE_RUNTIME_ERROR] Failed when Python streaming data source perform planPartitions: PySparkRuntimeError: [DATA_SOURCE_RETURN_SCHEMA_MISMATCH] Return schema mismatch in the result from 'read' method. Expected: <expected_schema>, Found: <actual_schema> ``` ### How was this patch tested? Added a test case in `test_python_datasource.py::test_arrow_batch_data_source`. ### Was this patch authored or co-authored using generative AI tooling? 
No Closes apache#54362 from Yicong-Huang/SPARK-55583/wrap-arrow-error-python-datasource. Lead-authored-by: Yicong-Huang <17627829+Yicong-Huang@users.noreply.github.com> Co-authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
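The added guard amounts to an equality check between the declared and returned schemas, raising a clear error before the batch reaches the JVM. A stdlib sketch over plain tuples (the real check compares `pyarrow` schemas via `.equals`):

```python
class SchemaMismatchError(RuntimeError):
    pass

def check_return_schema(expected, actual):
    # fail fast with the expected/actual pair, instead of a cryptic JVM error
    if expected != actual:
        raise SchemaMismatchError(
            "[DATA_SOURCE_RETURN_SCHEMA_MISMATCH] Return schema mismatch in "
            f"the result from 'read' method. Expected: {expected}, "
            f"Found: {actual}")

declared = (("id", "int64"), ("name", "string"))
returned = (("id", "string"), ("name", "string"))   # wrong type for 'id'

try:
    check_return_schema(declared, returned)
    raised = False
except SchemaMismatchError:
    raised = True
assert raised
check_return_schema(declared, declared)  # matching schemas pass silently
```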
### What changes were proposed in this pull request? Replace the rasterized `spark-logo-77x50px-hd.png` (77×50px) with an SVG version in the Spark Web UI navbar and favicon. The SVG is derived from the official [Apache Spark website logo](https://github.com/apache/spark-website/blob/asf-site/images/spark-logo-rev.svg), with the text fill changed from white (`#FFF`) to the original dark color (`#3C3C3B`) to match the light navbar background. ### Why are the changes needed? - **Resolution independence**: SVG renders crisply on HiDPI/Retina displays without pixelation - **Scalability**: No artifacts when the browser zooms or the layout resizes - **Consistency**: Uses the same vector source as the official Apache Spark website ### Does this PR introduce _any_ user-facing change? The logo looks the same but renders more crisply on high-DPI screens. ### How was this patch tested? - `UIUtilsSuite` (8 tests) passes - Manual verification: logo renders correctly in navbar and browser favicon - before <img width="858" height="536" alt="image" src="https://github.com/user-attachments/assets/2805aa6d-2202-44b4-8303-b09c9e0a2675" /> - after <img width="1238" height="484" alt="image" src="https://github.com/user-attachments/assets/36cd40ff-214b-4e23-94d2-a5bebcdfcc42" /> ### Was this patch authored or co-authored using generative AI tooling? Yes, GitHub Copilot was used. Closes apache#54562 from yaooqinn/SPARK-55780. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Kent Yao <kentyao@microsoft.com>
…ps after Bootstrap 5 upgrade ### What changes were proposed in this pull request? Fix 7 instances of `data-title` that should be `data-bs-title` after the Bootstrap 5 upgrade (SPARK-55753). Bootstrap 5 Tooltip reads `data-bs-title` or `title` attribute, not `data-title` (which was a Bootstrap 4 convention). This caused timeline tooltips to show empty content on hover. ### Why are the changes needed? The BS5 upgrade (SPARK-55753) migrated most `data-*` attributes to `data-bs-*` prefix, but missed `data-title` in timeline HTML string interpolations. Affected tooltips: - **AllJobsPage.scala** (3): Job timeline content, executor added/removed events - **JobPage.scala** (3): Stage timeline content, executor added/removed events - **StagePage.scala** (1): Task assignment timeline content ### Does this PR introduce _any_ user-facing change? Yes — timeline tooltips that were broken (showing empty) after the BS5 upgrade now display their content correctly. ### How was this patch tested? - Verified no remaining `data-title=` (without `data-bs-` prefix) in any Scala/JS/HTML files - This is a trivial string replacement with no logic change ### Was this patch authored or co-authored using generative AI tooling? Yes, GitHub Copilot was used. Closes apache#54561 from yaooqinn/SPARK-55776. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Kent Yao <kentyao@microsoft.com>
…component
### What changes were proposed in this pull request?
Replace the custom progress bar implementation with Bootstrap 5's `progress-stacked` component for proper accessibility and modern CSS.
**Changes:**
- **CSS** (`webui.css`): Remove 21 lines of IE9 vendor-prefixed gradients (`-moz-`, `-webkit-`, `-o-`, `filter:progid`), legacy `box-shadow` border, and `background-repeat`. Replace with clean `background: linear-gradient()` rules targeting `.progress-stacked` selector.
- **Scala** (`UIUtils.scala`): Use BS5 `progress-stacked` wrapper with proper ARIA attributes (`role="progressbar"`, `aria-valuenow/min/max`). Replace inline styles with BS5 utility classes (`position-absolute`, `d-flex`, `align-items-center`, etc.). Extract label text to a `progressLabel` variable for clarity.
- **Tests**: Update `UIUtilsSuite` assertions for new BS5 markup structure. Update `UISeleniumSuite` CSS selector from `.progress` to `.progress-stacked`.
**Before → After (HTML output):**
```html
<!-- Before -->
<div class="progress">
<span style="display:flex;align-items:center;...">3/4</span>
<div class="progress-bar progress-completed" style="width:75%"></div>
<div class="progress-bar progress-started" style="width:25%"></div>
</div>
<!-- After -->
<div class="progress-stacked" title="3/4 (2 running)">
<span class="position-absolute w-100 h-100 d-flex align-items-center justify-content-center">3/4</span>
<div class="progress" role="progressbar" aria-label="Completed" aria-valuenow="75" aria-valuemin="0" aria-valuemax="100" style="width:75%">
<div class="progress-bar progress-completed"></div>
</div>
<div class="progress" role="progressbar" aria-label="Running" aria-valuenow="25" aria-valuemin="0" aria-valuemax="100" style="width:25%">
<div class="progress-bar progress-started"></div>
</div>
</div>
```
### Why are the changes needed?
1. **Accessibility**: Progress bars lacked ARIA attributes, making them invisible to screen readers
2. **Dead code**: IE9 vendor prefixes (`-moz-`, `-webkit-`, `-o-`, `filter:progid`) are no longer needed
3. **BS5 alignment**: Use the standard BS5 Progress component API instead of custom markup
4. **Maintainability**: Inline styles replaced with BS5 utility classes
### Does this PR introduce _any_ user-facing change?
Yes — progress bars now have proper ARIA attributes for accessibility. Visual appearance is unchanged.
### How was this patch tested?
- `UIUtilsSuite` — progress bar markup assertions updated and passing
- `UISeleniumSuite` — CSS selector updated for new structure
- Scalastyle checks passing
### Was this patch authored or co-authored using generative AI tooling?
Yes, GitHub Copilot CLI was used.
Closes apache#54564 from yaooqinn/SPARK-55771.
Authored-by: Kent Yao <kentyao@microsoft.com>
Signed-off-by: Kent Yao <kentyao@microsoft.com>
…p lazy initialization ### What changes were proposed in this pull request? Replace eager tooltip initialization on `DOMContentLoaded` with a single delegated `mouseover` event listener that lazily creates `bootstrap.Tooltip` instances on first hover. Changes: - `initialize-tooltips.js`: Replace `DOMContentLoaded` + `querySelectorAll` eager init with a single delegated `mouseover` listener on `document` - `stagepage.js`: Remove 6 redundant `new bootstrap.Tooltip()` calls - `historypage.js`: Remove 1 redundant tooltip init after DataTable render - `executorspage.js`: Remove 2 redundant tooltip inits after DataTable render - `timeline-view.js`: Use `getOrCreateInstance` for `show()` calls so programmatic tooltip display works without prior initialization ### Why are the changes needed? After the Bootstrap 5 upgrade (SPARK-55753), tooltips are eagerly initialized on `DOMContentLoaded` via `initialize-tooltips.js`, plus scattered `new bootstrap.Tooltip()` re-initialization calls in 5+ JS files after dynamic content renders (DataTables, timeline, etc.). This is fragile and error-prone — dynamically rendered elements (e.g., vis.js timeline items) are not present at `DOMContentLoaded`, so their tooltips silently fail. The delegated listener approach: - **Single source of truth** — no per-page boilerplate or re-init calls - **Handles dynamic content automatically** — DataTables, DAG viz, timeline items all work without explicit re-initialization - **Better performance** — only creates Tooltip for elements actually hovered - **Simpler code** — removed 9 scattered tooltip init calls - **Fixes timeline tooltips** — `getOrCreateInstance` ensures tooltips work on vis.js items that were not present at page load ### Does this PR introduce _any_ user-facing change? No. Tooltips behave identically — they appear on hover with the same content and positioning. ### How was this patch tested? 
- ESLint passes - 77 Scala UI tests pass (`core/testOnly org.apache.spark.ui.*`) - Manual verification via `spark-shell`: all pages (Jobs, Stages, Storage, Executors, SQL) serve correct BS5 tooltip markup and `initialize-tooltips.js` contains the lazy listener ### Was this patch authored or co-authored using generative AI tooling? Yes, GitHub Copilot was used. Closes apache#54560 from yaooqinn/SPARK-55764. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Kent Yao <kentyao@microsoft.com>
### What changes were proposed in this pull request?
Follow-up to my previous PR apache#53447. Found more obscure and strange cases, so I want to add them to the golden files.
### Why are the changes needed?
Better test coverage.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Ran the golden tests.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: claude 2.1.56
Closes apache#54487 from mikhailnik-db/add-even-more-generators-tests.
Authored-by: Mikhail Nikoliukin <mikhail.nikoliukin@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…dation
### What changes were proposed in this pull request?
This PR introduces the foundation of the **Spark Types Framework** - a system for centralizing type-specific operations that are currently scattered across 50+ files using diverse patterns.
**Framework interfaces** (4 files in `sql/api` and `sql/catalyst`):
- `TypeOps` (catalyst) - mandatory server-side trait consolidating physical type representation, literal creation, and external type conversion
- `TypeApiOps` (sql-api) - mandatory client-side trait consolidating string formatting and row encoding
- `TimeTypeOps` + `TimeTypeApiOps` - proof-of-concept implementation for TimeType
All mandatory operations for a type live in a single interface per module (`TypeOps` for catalyst, `TypeApiOps` for sql-api). This makes it clear what a new type must implement - one mandatory interface per module contains everything required. Optional capabilities (e.g., proto serialization, Arrow SerDe, JDBC) will be defined as separate traits in subsequent PRs that can be mixed in incrementally.
**Integration points** (9 existing files modified):
- `PhysicalDataType.scala` - physical type dispatch
- `CatalystTypeConverters.scala` - external/internal type conversion (via `TypeOpsConverter` adapter)
- `ToStringBase.scala` - string formatting
- `RowEncoder.scala` - row encoding
- `literals.scala` - default literal creation (`Literal.default`)
- `EncoderUtils.scala` - encoder Java class mapping
- `CodeGenerator.scala` - codegen Java class mapping
- `SpecificInternalRow.scala` - mutable value creation
- `InternalRow.scala` - row writer dispatch
**Feature flag**: `spark.sql.types.framework.enabled` (defaults to `true` in tests via `Utils.isTesting`, `false` otherwise), configured in `SQLConf.scala` + `SqlApiConf.scala`.
**Factory design:** `TypeOps.apply(dt)` returns `Option[TypeOps]`, serving as both lookup and existence check. The feature flag is checked inside `apply()`, so callers don't need to check it separately. Integration points use `getOrElse` to fall through to legacy handling:
```scala
def someOperation(dt: DataType) =
TypeOps(dt).map(_.someMethod()).getOrElse {
dt match {
// Legacy types (unchanged)
case DateType => ...
case TimestampType => ...
}
}
```
The split across `sql/api` and `sql/catalyst` follows existing Spark module separation - `TypeApiOps` lives in `sql/api` for client-side operations that depend on `AgnosticEncoder`, while `TypeOps` lives in `sql/catalyst` for server-side operations that depend on `InternalRow`, `PhysicalDataType`, etc.
This is the first of several planned PRs. Subsequent PRs will add client-side integrations (Spark Connect proto, Arrow SerDe, JDBC, Python, Thrift) and storage format integrations (Parquet, ORC, CSV, JSON, etc.).
### Why are the changes needed?
Adding a new data type to Spark currently requires modifying **50+ files** with scattered type-specific logic. Each file has its own conventions, and there is no compiler assistance to ensure completeness. Integration points are non-obvious and easy to miss - patterns include `_: TimeType` in Scala pattern matching, `TimeNanoVector` in Arrow SerDe, `.hasTime()`/`.getTime()` in proto fields, `LocalTimeEncoder` in encoder helpers, `java.sql.Types.TIME` in JDBC, `instanceof TimeType` in Java files, and compound matches like `case LongType | ... | _: TimeType =>` that are invisible to naive searches.
The framework centralizes type-specific infrastructure operations in Ops interface classes. When adding a new type with the framework in place, a developer creates two Ops classes (one in `sql/api`, one in `sql/catalyst`) and registers them in the corresponding factory objects. The compiler enforces that all required interface methods are implemented, significantly reducing the risk of missing integration points.
**Concrete example - TimeType:** TimeType has integration points spread across 50+ files using the diverse patterns listed above (physical type mapping, literals, type converters, encoders, formatters, Arrow SerDe, proto conversion, JDBC, Python, Thrift, storage formats). With the framework, these are consolidated into two Ops classes: `TimeTypeOps` (~80 lines) and `TimeTypeApiOps` (~60 lines). A developer adding a new type with similar complexity would create two analogous files instead of touching 50+ files. The framework does not cover type-specific expressions (e.g., `CurrentTime`, `TimeAddInterval`) or SQL parser changes, which are inherently type-specific - it provides the primitives those build on.
This PR covers the core infrastructure integration. Subsequent PRs will add client-side integrations (Spark Connect proto, Arrow SerDe, JDBC, Python, Thrift) and storage format integrations (Parquet, ORC, CSV, JSON, etc.).
### Does this PR introduce _any_ user-facing change?
No. This is an internal refactoring behind a feature flag (`spark.sql.types.framework.enabled`). When the flag is enabled, framework-supported types use centralized Ops dispatch instead of direct pattern matching. Behavior is identical in both paths. The flag defaults to `true` in tests and `false` otherwise.
### How was this patch tested?
The framework is a refactoring of existing dispatch logic - it changes the mechanism but preserves identical behavior. The feature flag is enabled by default in test environments (`Utils.isTesting`), so the entire existing test suite validates the framework code path. No new tests are added in this PR because the framework delegates to the same underlying logic that existing tests already cover.
In subsequent phases, the testing focus will be on:
1. Testing the framework itself (Ops interface contracts, roundtrip correctness, edge cases)
2. Designing a generalized testing mechanism that enforces proper test coverage for each type added through the framework
### Was this patch authored or co-authored using generative AI tooling?
Co-authored with: claude-opus-4-6
Closes apache#54223 from davidm-db/davidm-db/types_framework.
Authored-by: David Milicevic <david.milicevic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
In this PR I propose that we always alias `OuterReference`s
### Why are the changes needed?
These changes are needed for two main reasons: to avoid potential issues with exposing raw outer references and their expression IDs, and to provide compatibility between the fixed-point and single-pass analyzers.
For example, in a query like:
```
table t
|> where exists (
table other
|> extend t.x
|> select * except (a, b))
```
before this change, the output will be:
```
Filter exists#x [x#1]
: +- Project [x#1]
: +- Project [a#3, b#4, outer(x#1)]
: +- SubqueryAlias spark_catalog.default.other
: +- Relation spark_catalog.default.other[a#3,b#4] json
+- PipeOperator
+- SubqueryAlias spark_catalog.default.t
+- Relation spark_catalog.default.t[x#1,y#2] csv
```
In this plan the output of the subquery is exactly the same as the outer reference (`x#1`). This can potentially cause query failures or correctness issues, but at the moment it only presents as a compatibility issue between the fixed-point and single-pass analyzers.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added new test cases + existing tests
### Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#54576 from mihailotim-db/mihailo-timotic_data/fix_outer_ref.
Authored-by: Mihailo Timotic <mihailo.timotic@databricks.com>
Signed-off-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
…ctions [design_sketch.md](https://github.com/user-attachments/files/25671695/design_sketch.md)
### What changes were proposed in this pull request?
- Allow referencing built-in functions with the qualifiers `builtin` or `system.builtin`, and temporary functions as `session` or `system.session`. Functions registered as extensions can be qualified with `system.extension` or `extension`.
- Cleaned up the APIs that resolve functions to prepare for a configurable path
- Register builtin, extension, and session functions with qualified names, so they can co-exist
- Fix a bug that allowed session functions with the same name to co-exist as table and scalar functions
### Why are the changes needed?
This portion of the work allows users to explicitly pick a built-in or temporary function the same way they would pick a persisted function: by fully qualifying it. This increases security. With this we now have a fixed order for function resolution: extension -> builtin -> session -> current schema. In follow-on work we plan to make the priority of function resolution configurable, for example to push temporary functions after built-ins or even after persisted functions. Ultimately we aim for proper SQL Standard PATH support, where a user can add "libraries" of functions to the path.
### Does this PR introduce _any_ user-facing change?
You can now reference built-in functions such as `concat` with `builtin.concat` or `system.builtin.concat`. The same goes for temporary functions, which can be qualified as `session` or `system.session`.
### How was this patch tested?
A new suite, functionQualificationSuite.scala, has been added.
### Was this patch authored or co-authored using generative AI tooling?
Yes: Claude Sonnet
Closes apache#53570 from srielau/search-path.
Lead-authored-by: Serge Rielau <serge@rielau.com> Co-authored-by: Wenchen Fan <wenchen@databricks.com> Co-authored-by: Serge Rielau <srielau@users.noreply.github.com> Signed-off-by: Gengliang Wang <gengliang@apache.org>
### What changes were proposed in this pull request?
This PR proposes to change `ClosureCleaner` to work with Java 22+. The current `ClosureCleaner` doesn't work with Java 22. For example, the following code fails.
```
val x = 100
sc.parallelize(1 to 10).map(v => v + x).collect

java.lang.InternalError: java.lang.IllegalAccessException: final field has no write access: $Lambda/0x00001c0001bae838.arg$1/putField, from class java.lang.Object (module java.base)
  at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newFieldAccessor(MethodHandleAccessorFactory.java:207)
  at java.base/jdk.internal.reflect.ReflectionFactory.newFieldAccessor(ReflectionFactory.java:144)
  at java.base/java.lang.reflect.Field.acquireOverrideFieldAccessor(Field.java:1200)
  at java.base/java.lang.reflect.Field.getOverrideFieldAccessor(Field.java:1169)
  at java.base/java.lang.reflect.Field.set(Field.java:836)
  at org.apache.spark.util.ClosureCleaner$.setFieldAndIgnoreModifiers(ClosureCleaner.scala:563)
  at org.apache.spark.util.ClosureCleaner$.cleanupScalaReplClosure(ClosureCleaner.scala:431)
  at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:256)
  at org.apache.spark.util.SparkClosureCleaner$.clean(SparkClosureCleaner.scala:39)
  at org.apache.spark.SparkContext.clean(SparkContext.scala:2844)
  at org.apache.spark.rdd.RDD.$anonfun$map$1(RDD.scala:425)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:417)
  at org.apache.spark.rdd.RDD.map(RDD.scala:424)
  ... 38 elided
Caused by: java.lang.IllegalAccessException: final field has no write access: $Lambda/0x00001c0001bae838.arg$1/putField, from class java.lang.Object (module java.base)
  at java.base/java.lang.invoke.MemberName.makeAccessException(MemberName.java:889)
  at java.base/java.lang.invoke.MethodHandles$Lookup.unreflectField(MethodHandles.java:3609)
  at java.base/java.lang.invoke.MethodHandles$Lookup.unreflectSetter(MethodHandles.java:3600)
  at java.base/java.lang.invoke.MethodHandleImpl$1.unreflectField(MethodHandleImpl.java:1619)
  at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newFieldAccessor(MethodHandleAccessorFactory.java:185)
  ... 52 more
```
The reason is that, as of Java 22, final fields cannot be modified even via reflection, per [JEP 416](https://openjdk.org/jeps/416). The current `ClosureCleaner` tries to modify a final field `arg$1` with a cloned and cleaned object, so this part fails. At first I considered two solutions:
1. Using the Unsafe API
2. Using the `--enable-final-field-mutation` option which is expected to be introduced by [JEP 500](https://openjdk.org/jeps/500)

But neither of them can resolve the issue, because final fields of hidden classes cannot be modified, and lambdas created internally by the JVM using the `invokedynamic` instruction are hidden classes (let's call such a lambda an "indy lambda"). So this PR resolves the issue by cloning indy lambdas with a cleaned `arg$1` using `LambdaMetafactory` with the impl method from the original lambdas.
### Why are the changes needed?
To make Spark work with Java 22+.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Tested on my customized GA job with Java 22: https://github.com/sarutak/spark/actions/runs/19331525569
All the failed tests are related to `datasketch`, which is a [separate issue](https://issues.apache.org/jira/browse/SPARK-53327).
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#52956 from sarutak/spark-shell-java22.
Authored-by: Kousuke Saruta <sarutak@amazon.co.jp>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
In GitHub Actions, group the sbt build output so it is collapsible.
### Why are the changes needed?
This PR is made in favor of apache#54052. The sbt build outputs thousands of lines that many people do not care about. It lags the webpage and makes it difficult for people to find the part they are interested in. However, some people are interested in the messages, so it might not be the best solution to make `sbt build` completely quiet. By making it collapsible, I think everyone gets what they want.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Check the result of the CI of this PR to see the difference.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#54524 from gaogaotiantian/group-sbt-build-message.
Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
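For context, GitHub Actions folds any log lines between the `::group::` and `::endgroup::` workflow commands into a collapsible section in the log viewer. A minimal sketch of the idea (illustrative only, not the actual change in this PR; the `run_grouped` wrapper and the example command are made up):

```python
import subprocess
import sys

def run_grouped(title: str, cmd: list[str]) -> int:
    """Run a command with its output wrapped in a collapsible log group."""
    # GitHub Actions turns everything between these two workflow
    # commands into a fold-able section in the web log viewer.
    print(f"::group::{title}", flush=True)
    try:
        return subprocess.run(cmd).returncode
    finally:
        # Always close the group, even when the command fails.
        print("::endgroup::", flush=True)

# Example: wrap a noisy build step in one collapsible group.
rc = run_grouped("build", [sys.executable, "-c", "print('lots of build output')"])
```

When the grouped command is sbt itself, its thousands of output lines collapse to a single expandable header in the Actions UI.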
…ECT_RELEASE_SESSION_ON_EXIT is set
### What changes were proposed in this pull request?
This PR adds an `_on_exit` handler to `SparkConnectClient` that is registered with Python's `atexit` module. When enabled via the `SPARK_CONNECT_RELEASE_SESSION_ON_EXIT` environment variable, the client will automatically release its session when the process exits.
### Why are the changes needed?
Currently, when a PySpark Connect client process exits without explicitly calling `spark.stop()`, the session may remain active on the server side, consuming resources unnecessarily. This change provides an opt-in mechanism to automatically release the session during process exit.
### Does this PR introduce _any_ user-facing change?
Yes. Users can now set the environment variable `SPARK_CONNECT_RELEASE_SESSION_ON_EXIT=true` to enable automatic session release when the Python process exits.
### How was this patch tested?
Pass the CIs.
### Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored with claude-4.5-opus-high
Closes apache#54106 from wbo4958/release-on-exit.
Authored-by: Bobby Wang <wbo4958@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
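The opt-in exit hook described above can be pictured roughly as follows. This is a simplified sketch, not the actual `SparkConnectClient` code; `ConnectClientSketch` and `release_session` are hypothetical stand-ins for the real class and its release-session RPC.

```python
import atexit
import os

class ConnectClientSketch:
    """Toy model of a client that opts into releasing its session at exit."""

    def __init__(self) -> None:
        self.session_released = False
        # Opt-in: the hook is registered only when the env var is set.
        if os.environ.get("SPARK_CONNECT_RELEASE_SESSION_ON_EXIT", "").lower() == "true":
            atexit.register(self._on_exit)

    def release_session(self) -> None:
        # Stand-in for the real RPC that releases the server-side session.
        self.session_released = True

    def _on_exit(self) -> None:
        # Best effort: never raise during interpreter shutdown.
        try:
            if not self.session_released:
                self.release_session()
        except Exception:
            pass
```

`atexit` handlers run on normal interpreter shutdown but not on a hard kill (e.g. SIGKILL), which matches the opt-in, best-effort nature of the cleanup.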
### What changes were proposed in this pull request? Fixed all the unnecessary and ambiguous unicode character usage. A set of `ruff` rules are also added to prevent future regressions. ### Why are the changes needed? We should avoid using non-ascii unicode character usage as much as possible. There are few rationales behind it * Sometimes it's just wrong. e.g. `‘index’` vs `'index'` * Some editor (VSCode) will highlight it as a warning and some editor/terminal might not display it well * It's difficult to keep consistency because people don't know how to type that * For docstrings, it could actually be displayed somewhere while users are using it and unicode could cause problems ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? `ruff check` passed. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54410 from gaogaotiantian/fix-ascii. Authored-by: Tian Gao <gaogaotiantian@hotmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
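For illustration (not code from the PR itself): the typical offender is a typographic quote that renders almost identically to the ASCII one but compares unequal. ruff's ambiguous-unicode rules (the RUF001-RUF003 family, covering strings, docstrings, and comments) target exactly this class of character, though the exact rule selection added by the PR is not shown here.

```python
# U+2018/U+2019 "curly" quotes vs the ASCII apostrophe. Both strings
# render as roughly 'index', but they contain different code points.
curly = "\u2018index\u2019"
plain = "'index'"

assert curly != plain          # lookalikes, but not equal
assert len(curly) == len(plain)
assert not curly.isascii()     # the curly variant is non-ASCII
assert plain.isascii()
```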
### What changes were proposed in this pull request? This PR aims to upgrade ORC to 2.3.0 for Apache Spark 4.2.0. (This is a draft during the vote period of ORC 2.3.0 RC0) ### Why are the changes needed? Apache ORC 2.3.0 is the first release which is tested with Java 25. To bring the latest improved and bug fixed version. - https://github.com/apache/orc/milestone/50 - apache/orc#2555 - apache/orc#2557 Note that there is a known Java 25 issue, [JDK-8377811](https://bugs.openjdk.org/browse/JDK-8377811). - ORC-2116 Java 25 G1GC: Optional Evacuations may evacuate pinned objects ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54481 from dongjoon-hyun/SPARK-55685. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…Cache ### What changes were proposed in this pull request? Add `None` to type hint of `getCache` method. ### Why are the changes needed? The `getCache` method can return `None` but the type hint says otherwise. This confuses the type checker. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54582 from gaogaotiantian/get-cache-type-hint. Authored-by: Tian Gao <gaogaotiantian@hotmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
### What changes were proposed in this pull request? Upgrade lz4-java 1.10.4 to bring performance back. ### Why are the changes needed? https://github.com/yawkat/lz4-java/releases/tag/v1.10.4 > These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact. ### Does this PR introduce _any_ user-facing change? Performance regression has gone. ### How was this patch tested? Benchmark reports are updated ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54585 from pan3793/SPARK-55803. Lead-authored-by: Cheng Pan <chengpan@apache.org> Co-authored-by: pan3793 <pan3793@users.noreply.github.com> Signed-off-by: Cheng Pan <chengpan@apache.org>
…ariant ### What changes were proposed in this pull request? Replace the custom `.btn-spark` CSS class with Bootstrap 5's built-in `btn-outline-secondary` class. ### Why are the changes needed? The custom `.btn-spark` class (lines 45-59 in webui.css) defines a legacy gradient background, box-shadow, and hover transition that duplicate what Bootstrap 5 provides natively. Replacing it with `btn-outline-secondary` simplifies the CSS and ensures consistent look with the rest of the BS5-upgraded UI. Part of SPARK-55760 (Spark Web UI Modernization). ### Does this PR introduce _any_ user-facing change? Yes — the "Go" buttons in paged tables and stage page now use Bootstrap 5's outline-secondary style instead of the custom gradient style. The visual difference is minimal. ### How was this patch tested? - Verified no remaining references to `btn-spark` in the codebase - Visual inspection of paged table "Go" buttons ### Was this patch authored or co-authored using generative AI tooling? Yes, GitHub Copilot was used. Closes apache#54573 from yaooqinn/SPARK-55770. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Kent Yao <kentyao@microsoft.com>
…p 5 Collapse API
### What changes were proposed in this pull request?
This PR replaces the custom JavaScript collapse/toggle system (`collapseTable`, `collapseTablePageLoad`, arrow-open/closed classes) with the native Bootstrap 5 Collapse API across the entire Spark Web UI.
### Key Changes
**JavaScript (`webui.js`)**
- Removed custom `collapseTable()` and `collapseTablePageLoad()` functions
- Removed 45+ individual `collapseTablePageLoad()` registration calls and click handler
- Added BS5 event-based localStorage persistence using `shown.bs.collapse` / `hidden.bs.collapse` events
- Added page-load restore handler that reads localStorage and collapses previously-collapsed sections

**JavaScript (`table.js`)**
- Removed `collapseTableAndButton()` function (was used for legacy collapse-with-button pattern)

**Scala HTML generators (21 files)**
- Converted all collapsible sections to use `data-bs-toggle="collapse"` + `data-bs-target="#id"` on trigger elements
- Added `collapse show` classes on content elements (initially visible)
- Added `aria-expanded="true"` for accessibility
- Kept `data-collapse-name` for localStorage key compatibility
### Affected pages
- Environment (7 sections: Runtime, Spark Props, Resource Profiles, Hadoop, System, Metrics, Classpath)
- Jobs (event timeline, active/completed/failed tables)
- Job detail (event timeline, stages tables)
- Stages (pool table, active/pending/completed/failed tables)
- Pool page (stages table)
- Storage (RDD table)
- Executor thread dump (summary, stack trace tables)
- SQL executions (running/completed/failed tables)
- SQL execution detail (physical plan)
- Streaming (timelines, histograms, receivers)
- Streaming Query page (active/completed tables)
- ThriftServer pages (session/operation tables)
- Spark Connect Server pages (session/operation tables)
- Standalone Master page (workers, apps, drivers)
- Standalone Worker page (executors, drivers)
- Standalone Application page (executors)
### Why are the changes needed?
Part of the Bootstrap 5 migration umbrella (SPARK-55760). The custom collapse system predates BS5 and duplicates functionality that BS5 provides natively with better accessibility support (`aria-expanded`, keyboard navigation) and simpler code.
### Does this PR introduce _any_ user-facing change?
No functional change. Collapse/expand behavior and localStorage persistence work identically.
### How was this patch tested?
- Existing unit tests pass (`EnvironmentPageSuite`, `StoragePageSuite`, `ThriftServerPageSuite`, `SparkConnectServerPageSuite`)
- Manual testing of collapse behavior across all affected pages
- Verified localStorage persistence (collapse → reload → stays collapsed)
- Scalastyle checks pass
### Screenshots
**Jobs page** (all sections expanded): <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/jobs.png" width="800">
**Jobs page** (sections collapsed): <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/jobs_collapsed.png" width="800">
**Stages page:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/stages.png" width="800">
**Environment page** (all sections expanded): <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/environment.png" width="800">
**Environment page** (first 3 sections collapsed): <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/environment_collapsed.png" width="800">
**Storage page:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/storage.png" width="800">
**SQL Executions page:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/sql.png" width="800">
**SQL Execution detail:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/sql_detail.png" width="800">
**Job detail:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/job_detail.png" width="800">
**Stage detail:** <img src="https://raw.githubusercontent.com/yaooqinn/spark/SPARK-55786-screenshots/screenshots/SPARK-55773/stage_detail.png" width="800">
Closes apache#54574 from yaooqinn/SPARK-55773.
Authored-by: Kent Yao <kentyao@microsoft.com>
Signed-off-by: Kent Yao <kentyao@microsoft.com>
…n-local return ### What changes were proposed in this pull request? 1. Changed error message to make it more clear and suggest resolution. 2. Do not use non-local return in a closure. This is a follow-up on apache#54254. ### Why are the changes needed? To make error message more informative and avoid a bug-prone pattern. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54583 from fedimser/fix-10. Authored-by: Dmytro Fedoriaka <dmytro.fedoriaka@databricks.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
…stacked progress ### What changes were proposed in this pull request? Fix the height mismatch between `.progress-stacked` container (1.42rem) and its inner `.progress` divs (BS5 default 1rem) by adding `height: 100%` to the inner `.progress` elements. ### Why are the changes needed? After SPARK-55771 introduced BS5 `.progress-stacked`, the inner `.progress` bars use Bootstrap 5's default height (1rem ≈ 14px) which is shorter than the custom stacked container height (1.42rem ≈ 20px). This causes the colored progress bars to not fill the full height of the container, creating a visual misalignment. **Before (misaligned — bar shorter than container):** Inner `.progress` = 14px, container = 20px **After (aligned — bar fills container):** Inner `.progress` = 100% = 20px, container = 20px ### Does this PR introduce _any_ user-facing change? Visual fix only — progress bars now properly fill their container height. ### How was this patch tested? Manual verification with Playwright — confirmed all three elements (`.progress-stacked`, `.progress`, `.progress-bar`) now render at the same height (20px). Closes apache#54587 from yaooqinn/SPARK-55771-followup. Authored-by: Kent Yao <kentyao@microsoft.com> Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request? This pr aims to add a microbenchmark for `Platform`. ### Why are the changes needed? `Platform` is a fundamental utility class that is heavily reliant on the underlying JDK implementation. Adding a microbenchmark allows for better observation of its performance discrepancies across different JDK versions, facilitating timely and targeted optimization. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass Github Actions ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#54570 from LuciferYang/SPARK-55789. Lead-authored-by: yangjie01 <yangjie01@baidu.com> Co-authored-by: LuciferYang <LuciferYang@users.noreply.github.com> Signed-off-by: yangjie01 <yangjie01@baidu.com>
### What changes were proposed in this pull request? We check if an object is an instance of `datetime.date` when it could be `datetime.datetime` - that's unnecessary and pointless because `datetime.datetime` is a subclass of `datetime.date`. It's documented in python docs so it's guaranteed. We can also safely convert a `datetime.datetime` object directly. (We have been doing that for a long time). ### Why are the changes needed? Remove unnecessary and misleading logic. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#54581 from gaogaotiantian/fix-expression-datetime. Authored-by: Tian Gao <gaogaotiantian@hotmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
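The subclass relationship this cleanup relies on is documented in Python's `datetime` module: `datetime.datetime` is a subclass of `datetime.date`, so an isinstance check against `date` also matches `datetime` values.

```python
import datetime

# Guaranteed by the Python docs: datetime is a subclass of date.
assert issubclass(datetime.datetime, datetime.date)

dt = datetime.datetime(2026, 2, 25, 12, 30)
assert isinstance(dt, datetime.date)       # True even though dt carries a time
assert isinstance(dt, datetime.datetime)

# The reverse does not hold, so checking for datetime first (or only)
# is the unambiguous order.
d = datetime.date(2026, 2, 25)
assert not isinstance(d, datetime.datetime)
```

This is why a separate `isinstance(obj, datetime.date)` branch ahead of a `datetime.datetime` conversion is redundant: the date branch would already have caught every datetime value.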
Add an automatic upgrade mechanism for Spark Structured Streaming offset logs, migrating from the V1 (position-based) to the V2 (name-based) format without losing state or requiring checkpoint deletion.
Changes:
- Add config `spark.sql.streaming.offsetLog.v1ToV2.autoUpgrade.enabled`
- Add a `SupportsOffsetLogUpgrade` trait for sources to handle metadata migration
- Implement the upgrade logic in `MicroBatchExecution` with two paths:
  - Positional: V1 to V2 with keys "0", "1", "2"
  - Named: V1 to V2 with actual source names, migrating metadata directories
- `FileStreamSource` implements metadata migration by copying all batches from the old to the new paths
- Add an `OffsetSeq.toOffsetMap()` conversion method
- Add a comprehensive test suite with 7 passing tests

Co-Authored-By: Claude <noreply@anthropic.com>