Feature Request: Direct Spark UI listing without relying on History Server for incomplete applications #11

@mbi-flh

Summary

Currently, spark-web-proxy relies on the Spark History Server's "incomplete applications" feature to display running Spark applications. This approach breaks down when event logs are written to S3-compatible object storage, because in-progress logs are not readable there until they are finalized.

Problem Description

Current Behavior

The proxy successfully detects running Spark applications via the Kubernetes API (visible in logs):
The application 'spark-xxx' was updated: Running at [http://10.233.x.x:4040]

However, these applications do not appear in the UI because the History Server cannot read in-progress event logs from S3.
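To illustrate what a direct listing could build on, here is a minimal sketch in Python. It parses the log line quoted above and derives the driver's own REST endpoint (`/api/v1/applications`, which a running Spark UI serves). The function names are hypothetical and not part of spark-web-proxy; the regular expression only covers the single log format shown in this issue.

```python
import json
import re
import urllib.request

# Matches only the proxy log line quoted in this issue; other formats are not handled.
LOG_PATTERN = re.compile(
    r"The application '(?P<name>[^']+)' was updated: "
    r"(?P<state>\w+) at \[(?P<url>[^\]]+)\]"
)


def parse_proxy_log_line(line):
    """Return (name, state, driver_url), or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("state"), match.group("url")


def applications_endpoint(driver_url):
    """Build the REST URL on the live driver UI that lists its application."""
    return driver_url.rstrip("/") + "/api/v1/applications"


def fetch_running_application(driver_url, timeout=5.0):
    """Query the live driver UI directly, without going through the History Server."""
    url = applications_endpoint(driver_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Since the proxy already knows each driver address from the Kubernetes API, a listing built this way would not depend on event logs at all. Whether this fits the project's architecture is for the maintainers to judge.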

Root Cause

  1. S3 doesn't support partial or appended writes: an object only becomes visible once its upload completes, so event log files (.inprogress) remain at 0 bytes until the job completes or the rolling threshold (10MB at minimum) is reached
  2. Spark enforces a minimum 10MB rolling size: spark.eventLog.rolling.maxFileSize cannot be set below 10MB
  3. Small/short jobs never appear: Applications that generate less than 10MB of event logs stay invisible until completion
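The three points above can be sketched as a simplified model in Python. It is an illustration of the reasoning, not Spark's actual implementation: the function name is hypothetical, the minimum is assumed to be 10 MiB, and the model assumes that data only becomes readable in whole rolled files.

```python
MIB = 1024 * 1024
MIN_ROLLING_SIZE = 10 * MIB  # assumption: the "10MB" minimum taken as 10 MiB


def visible_event_log_bytes(produced_bytes, max_file_size=MIN_ROLLING_SIZE, finished=False):
    """Estimate how many event log bytes the History Server can read from S3.

    Simplified model: only fully rolled files are readable while the job
    runs; everything becomes readable once the job has finished.
    """
    if max_file_size < MIN_ROLLING_SIZE:
        # Mirrors point 2: a rolling size below the minimum is rejected.
        raise ValueError("rolling size below the 10MB minimum is rejected")
    if finished:
        return produced_bytes
    completed_files = produced_bytes // max_file_size
    return completed_files * max_file_size


# A running job that has produced 3 MiB of events exposes nothing yet.
print(visible_event_log_bytes(3 * MIB))
```

Under this model, any job that finishes before producing one full rolled file is invisible for its entire runtime, which matches the behavior described in point 3.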

Environment Details

  • Spark version: 3.5.6
  • Storage: S3-compatible (MinIO)
  • Event log format: eventlog_v2 with rolling enabled
  • History Server: Configured with spark.history.fs.inProgressOptimization.enabled=true

Tested Configurations (all failed)

  • ✅ Enabled inProgressOptimization
  • ✅ Set spark.eventLog.rolling.enabled=true
  • ✅ Tried reducing rolling size (blocked by 10MB minimum)
  • ✅ Tried eventlog_v1 format
  • ✅ Aligned Spark versions (History Server 3.5.6 = Applications 3.5.6)
  • ❌ None of these solve the S3 partial-write limitation

Thank you for this great project! 🙏
