@eemario eemario commented Jan 16, 2026

What is the purpose of the change

This pull request makes the HistoryServer support application archives while ensuring that legacy job archives are still handled properly.

Brief change log

  • Add configuration options for application archive fetching
  • Adapt the archive retention strategy to be compatible with application archives
  • Introduce a new archiveFetcher for application archives (and their associated job archives)
  • Integrate the new archiveFetcher into the HistoryServer

Verifying this change

This change added tests and can be verified as follows:

  • Updated tests for archive retention strategy to include coverage for application archives
  • Added tests that validate the behavior of the HistoryServer when fetching jobs and applications
  • Manually verified the change by running a standalone cluster and HistoryServer, confirming that the archives can be parsed properly and served via REST responses

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)

flinkbot commented Jan 16, 2026

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

<td><h5>historyserver.archive.retained-applications</h5></td>
<td style="word-wrap: break-word;">-1</td>
<td>Integer</td>
<td>The maximum number of applications to retain in each archive directory defined by org.apache.flink.configuration.description.TextElement@ae3540e. This option works together with the TTL (see <code class="highlighter-rouge">historyserver.archive.retained-ttl</code>). Archived entities will be removed if their TTL has expired or the retention count limit has been reached. <br />If set to `-1`(default), there is no limit to the number of archives. If set to <code class="highlighter-rouge">0</code> or less than <code class="highlighter-rouge">-1</code>, HistoryServer will throw an <code class="highlighter-rouge">IllegalConfigurationException</code>. <br />Note, when there are multiple history server instances, two recommended approaches when using this option are: <ul><li>Specify the option in only one HistoryServer instance to avoid errors caused by multiple instances simultaneously cleaning up remote files, </li><li>Or you can keep the value of this configuration consistent across them. </li></ul></td>

@davidradl davidradl Jan 16, 2026

What is org.apache.flink.configuration.description.TextElement@ae3540e - is this supposed to be a link?
What is <br />? This seems to be an end br tag with a space in it. What does this mean?

Contributor Author

This was caused by a mistake in the HistoryServerOptions description. I will fix it and regenerate the docs. Thanks for pointing it out!
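
For context, a string such as `TextElement@ae3540e` has the shape that Java's default `Object.toString()` produces (class name, `@`, hex hash code). It typically shows up when an object without its own `toString()` is formatted as plain text. The thread does not show the actual mistake in HistoryServerOptions, so the following is only a minimal sketch of that general failure mode; `InlineCode` and the key `example.archive.dir` are hypothetical stand-ins, not Flink classes or options.

```java
final class ToStringLeak {

    /** Hypothetical stand-in for a rich-text element that does not override toString(). */
    static final class InlineCode {
        final String value;

        InlineCode(String value) {
            this.value = value;
        }
    }

    /** Formats the element object itself, so the default Object.toString() is used. */
    static String broken(InlineCode element) {
        return String.format("archive directory defined by %s.", element);
    }

    /** Formats the element's text content instead of the object. */
    static String fixed(InlineCode element) {
        return String.format("archive directory defined by %s.", element.value);
    }

    public static void main(String[] args) {
        InlineCode dirs = new InlineCode("example.archive.dir"); // hypothetical key
        System.out.println(broken(dirs)); // contains "InlineCode@" followed by a hex hash
        System.out.println(fixed(dirs));  // contains the key text itself
    }
}
```

If the cause in this PR is similar, the fix is to hand the element to whatever API is responsible for rendering it rather than letting it be converted to a string implicitly.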

"Whether HistoryServer should cleanup jobs that are no longer present in the archive directory defined by %s. ",
code(HISTORY_SERVER_ARCHIVE_DIRS.key()))
.linebreak()
.text(LEGACY_NOTE_MESSAGE)
Contributor

We should deprecate the options that only apply to the legacy case. I assume there is an intention to remove them in the next version change.
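
To illustrate the suggestion, here is a rough sketch of what marking a legacy-only option as deprecated could look like in plain Java. It does not use Flink's configuration classes, and the thread does not say which options would be affected; `HistoryOptionsSketch` and the key `example.legacy-only-option` are hypothetical. The reflection helper just shows that the annotation is visible at runtime, for example to a docs generator.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class HistoryOptionsSketch {

    /** Key taken from the generated docs shown in this PR. */
    static final String RETAINED_APPLICATIONS = "historyserver.archive.retained-applications";

    /** Hypothetical legacy-only key, used purely for illustration. */
    @Deprecated
    static final String LEGACY_ONLY_EXAMPLE = "example.legacy-only-option";

    /** Collects the names of constants annotated with @Deprecated. */
    static List<String> deprecatedFieldNames() {
        List<String> names = new ArrayList<>();
        for (Field field : HistoryOptionsSketch.class.getDeclaredFields()) {
            if (field.isAnnotationPresent(Deprecated.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(deprecatedFieldNames());
    }
}
```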

github-actions bot added the community-reviewed label (PR has been reviewed by the community.) Jan 16, 2026