PMF-TSFM

Process Model Forecasting with Time Series Foundation Models

A systematic evaluation of Time Series Foundation Models (TSFMs) for Process Model Forecasting (PMF): predicting how the directly-follows (DF) relations of a process evolve over time. The repository benchmarks Chronos, Moirai, and TimesFM in zero-shot, LoRA, and full fine-tuning settings on four real-world event logs, reporting MAE and RMSE alongside Entropic Relevance, a process-aware conformance metric.

Try it live: an interactive forecast explorer and the talk slides, both hosted as Hugging Face Spaces.

At a Glance

Zero-shot coverage: 12 TSFM variants across Chronos, Moirai, and TimesFM.
Fine-tuning coverage: LoRA for Chronos-Bolt and Moirai-1.1; full fine-tuning for Chronos-Bolt, Chronos-2, and Moirai-1.1.
Data assets: daily DF-count time series in Parquet and XES logs for Entropic Relevance evaluation.
Orchestration: Hydra-driven Python entry points plus local orchestration scripts and VSC HPC helpers.
Self-host & agents: run zero-shot forecasting on your own log via the Docker image or the headless MCP server.
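
To make "daily DF-count time series" concrete, here is a minimal sketch of how such counts could be derived from an event log. It is not the repository's preprocessing code: the toy log, the function name daily_df_counts, and the choice to attribute each pair to the day of its second event are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Toy event log: (case_id, activity, timestamp). All values are illustrative.
EVENTS = [
    ("c1", "A", "2024-01-01T09:00"),
    ("c1", "B", "2024-01-01T10:00"),
    ("c1", "C", "2024-01-02T08:00"),
    ("c2", "A", "2024-01-01T11:00"),
    ("c2", "B", "2024-01-02T09:30"),
]


def daily_df_counts(events):
    """Count directly-follows pairs per calendar day.

    Assumption: a pair (a, b) is attributed to the day on which b occurs.
    """
    # Group events by case so that pairs never span two cases.
    traces = defaultdict(list)
    for case_id, activity, ts in events:
        traces[case_id].append((datetime.fromisoformat(ts), activity))

    counts = defaultdict(Counter)  # day -> Counter of (a, b) pairs
    for trace in traces.values():
        trace.sort()  # order events within a case by timestamp
        for (_, a), (t_b, b) in zip(trace, trace[1:]):
            counts[t_b.date().isoformat()][(a, b)] += 1
    return {day: dict(pairs) for day, pairs in counts.items()}


series = daily_df_counts(EVENTS)
```

Each DF relation then yields one univariate series of daily counts, which is the kind of input a forecasting model consumes.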

Supported Models

| Family | Variants |
| --- | --- |
| Chronos | Bolt Tiny, Bolt Mini, Bolt Small, Bolt Base, Chronos-2 |
| Moirai | 1.1 Small/Large, 2.0 Small, MoE Base |
| TimesFM | 1.0-200M, 2.0-500M, 2.5-200M |

LoRA experiments in this repo cover chronos/bolt_small, chronos/bolt_base, moirai/1_1_small, and moirai/1_1_large.
Full fine-tuning covers chronos/bolt_small, chronos/bolt_base, chronos/chronos2, moirai/1_1_small, and moirai/1_1_large.

Datasets

Four process mining event logs from the BPI Challenge and healthcare domains:

| Dataset | Description | Cases | DFs |
| --- | --- | --- | --- |
| bpi2017 | Loan application process | 40,229 | 21 |
| bpi2019_1 | Purchase order process (3-way match) | 197,521 | 149 |
| sepsis | Sepsis clinical pathway | 999 | 135 |
| hospital_billing | Hospital billing process | 78,828 | 73 |

The experiment data assets are published on Zenodo. After extraction, the archive is organized as:

data/
├── raw_logs/         # original XES logs from source benchmarks
├── processed_logs/   # processed XES logs used by ER evaluation
├── time_series/      # daily DF-count Parquet files used by inference/training
└── metadata/         # release metadata and preprocessing statistics

See Data Setup for download commands. For the upstream preprocessing workflow and source-log preparation details, see pmf-benchmark.

Installation

Requires Python 3.10+ and uv. The timesfm_v25 extra needs Python 3.11+, because the pinned TimesFM 2.5 package is installed only on Python 3.11 and later.

# Clone and install
git clone https://github.com/YongboYu/pmf-tsfm.git
cd pmf-tsfm
uv sync

# Optional model extras
uv sync --extra timesfm_v25      # TimesFM 2.5
uv sync --extra timesfm_legacy   # TimesFM 1.0 / 2.0

# Optional dev tools
uv sync --group dev

# Optional: activate the uv-managed environment for plain `python -m ...` usage
source .venv/bin/activate

Examples below assume either the .venv is activated or commands are prefixed with uv run. See Tested Environments for the macOS (Apple Silicon / MPS), Linux (NVIDIA GPUs / CUDA), and VSC wICE HPC cluster setups used with this repo.

Create the local config files that are meant to stay machine-specific:

cp .env.example .env
cp configs/local/default.yaml.example configs/local/default.yaml

.env is used for environment variables such as PROJECT_ROOT, WANDB_API_KEY, CUDA_VISIBLE_DEVICES, and TIMESFM_V1_PATH.
configs/local/default.yaml is optional and useful for Hydra-only overrides such as device, training.num_workers, or a local Weights & Biases entity.

Data Setup

Download from the Zenodo record page: https://zenodo.org/records/18327515.

# With zenodo-get
pip install zenodo-get
zenodo_get 10.5281/zenodo.18327515 -o data/

Or download the current archive directly:

wget -O data/pmf_data_v1.1.zip \
  https://zenodo.org/api/records/18327515/files/pmf_data_v1.1.zip/content

Extract the downloaded archive into data/:

unzip -o data/pmf_data_v1.1.zip -d data/

After extraction, the data/ directory should contain raw_logs/, processed_logs/, time_series/, and metadata/, as described in Datasets.

For the reproducible paper workflow, preprocess the Parquet time series once so training and inference share the exact same split boundaries:

python -m pmf_tsfm.data.preprocess --multirun \
  data=bpi2017,bpi2019_1,sepsis,hospital_billing

This writes data/processed/{dataset}/full.parquet, train.parquet, val.parquet, test.parquet, and metadata.json, which are then used by training and inference. See Common Workflows for run examples and HPC for the cluster path.
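
The point of preprocessing once is that every later stage sees identical train/val/test boundaries. A minimal sketch of a chronological (unshuffled) split is shown below. The function name and the 60/20/20 percentages are hypothetical placeholders, not the boundaries this repository uses.

```python
def chronological_split(days, train_pct=60, val_pct=20):
    """Split an ordered sequence of days into train/val/test without shuffling.

    train_pct and val_pct are illustrative assumptions, not the repo's values.
    Integer arithmetic avoids floating-point surprises at the boundaries.
    """
    if train_pct <= 0 or val_pct <= 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must be positive and sum to less than 100")
    n = len(days)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return days[:train_end], days[train_end:val_end], days[val_end:]


days = [f"day{i:02d}" for i in range(10)]
train, val, test = chronological_split(days)
```

Because the split depends only on position in time, storing the resulting files once guarantees that fine-tuning never sees days that inference later treats as test data.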

Quick Start

# 1. Zero-shot inference on one model/dataset pair
python -m pmf_tsfm.inference model=chronos/bolt_small data=bpi2017

# 2. Evaluate that output directory
python -m pmf_tsfm.evaluate \
  results_dir=outputs/zero_shot/bpi2017/chronos_bolt_small

# 3. Evaluate Entropic Relevance on the same predictions
python -m pmf_tsfm.er.evaluate_er model=chronos/bolt_small data=bpi2017

Add logger=wandb or logger=wandb_offline to any Hydra command if you want W&B tracking.

Predictions are written under outputs/{task}/{dataset}/{model}/ and fine-tuned checkpoints / LoRA adapters under results/{task}/{dataset}/{model}/. Both directories are generated per run and git-ignored.
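
The directory layout above can be sketched as a small path helper. This is an illustration only: run_dir is a hypothetical name, and flattening the "/" in a model key to "_" is inferred from the example paths in this README (model=chronos/bolt_small maps to chronos_bolt_small), not taken from the package's code.

```python
from pathlib import PurePosixPath


def run_dir(root, task, dataset, model):
    """Build {root}/{task}/{dataset}/{model} with '/' in the model key flattened to '_'."""
    return PurePosixPath(root) / task / dataset / model.replace("/", "_")


# Predictions live under outputs/, checkpoints and adapters under results/.
predictions = run_dir("outputs", "zero_shot", "bpi2017", "chronos/bolt_small")
adapters = run_dir("results", "lora_tune", "bpi2017", "chronos/bolt_small")
```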

Common Workflows

All experiment entry points are Hydra-based.

Zero-shot inference

# Default run (Chronos Bolt Small on bpi2017)
python -m pmf_tsfm.inference

# Single model + dataset
python -m pmf_tsfm.inference model=chronos/bolt_small data=bpi2017

# Sweep multiple combinations
python -m pmf_tsfm.inference --multirun \
  model=chronos/bolt_small,chronos/bolt_base \
  data=bpi2017,bpi2019_1,sepsis,hospital_billing
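
With --multirun, Hydra's default sweeper launches one job per combination of the comma-separated values, so the sweep above expands to 2 models x 4 datasets = 8 runs. The sketch below only enumerates those combinations to show the expansion. It does not call Hydra, and the order Hydra schedules jobs in may differ.

```python
from itertools import product

models = ["chronos/bolt_small", "chronos/bolt_base"]
datasets = ["bpi2017", "bpi2019_1", "sepsis", "hospital_billing"]

# One override string per job in the Cartesian product of the two lists.
jobs = [f"model={m} data={d}" for m, d in product(models, datasets)]

for job in jobs:
    print(f"python -m pmf_tsfm.inference {job}")
```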

Fine-tuning

# LoRA fine-tuning
python -m pmf_tsfm.train \
  task=lora_tune model=chronos/bolt_small data=bpi2017 lora=chronos

# Full fine-tuning
python -m pmf_tsfm.train \
  task=full_tune model=chronos/bolt_small data=bpi2017

Inference with fine-tuned models

# LoRA-adapted inference
python -m pmf_tsfm.inference model=chronos/bolt_small data=bpi2017 \
  task=lora_tune lora_adapter_path=results/lora_tune/bpi2017/chronos_bolt_small/lora_adapter/best

# Fully fine-tuned inference
python -m pmf_tsfm.inference model=chronos/bolt_small data=bpi2017 \
  task=full_tune checkpoint_path=results/full_tune/bpi2017/chronos_bolt_small/checkpoints/best

Evaluation

# Evaluate all zero-shot outputs
python -m pmf_tsfm.evaluate

# Evaluate all LoRA or full-tune outputs
python -m pmf_tsfm.evaluate task=lora_tune
python -m pmf_tsfm.evaluate task=full_tune

# Evaluate a specific model/dataset directory
python -m pmf_tsfm.evaluate \
  results_dir=outputs/zero_shot/bpi2017/chronos_bolt_small

# Entropic Relevance on one model/dataset pair
python -m pmf_tsfm.er.evaluate_er model=chronos/bolt_small data=bpi2017
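
For reference, MAE and RMSE on a single forecast series can be computed as in the sketch below. The numbers are made up, and how the package aggregates these metrics across DF relations and forecast windows is not shown here, so treat this as the textbook definition rather than the repository's evaluation code.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative daily counts for one DF relation.
actual = [3, 0, 5, 2]
predicted = [2, 1, 5, 4]
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

RMSE penalizes the single large miss on the last day more heavily than MAE does, which is why the two are usually reported together.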

Batch scripts

# All zero-shot combinations
bash scripts/run_inference_all.sh

# All LoRA train + inference runs
bash scripts/run_lora_all.sh

# All full fine-tune + inference runs
bash scripts/run_full_tune_all.sh

# Batch ER evaluation
bash scripts/run_er_all.sh

# Full 10-stage end-to-end pipeline
bash scripts/run_full_pipeline.sh

These are local orchestration scripts: shell helpers for running sequential experiment batches on your workstation or server, without Slurm job submission. The shell scripts source scripts/env.sh, which loads .env and activates .venv automatically when present.

Run on Your Own Data (Self-Host & Agents)

Beyond the research CLI above, the core pipeline ships as two self-host artifacts for running zero-shot DF-relation forecasting on your own process log, with accuracy reported as MAE / RMSE plus Entropic Relevance and with no caps. The input can be a raw .xes/.xes.gz log (auto-converted to the daily DF-relation series) or a prepared DF-relation .parquet file. Both artifacts wrap the same Gradio-free seam, src/pmf_tsfm/api.py (forecast_backtest / forecast_only / list_models). The seam provides a zero-shot holdout backtest that reuses the real cores, so the numbers match the CLI and paper. See the per-artifact READMEs below for setup and design details.
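
As a rough picture of what a holdout backtest does, the sketch below hides the last few points of a series, forecasts them from the remaining context, and scores the forecast. It does not use src/pmf_tsfm/api.py: holdout_backtest and last_value are hypothetical names, and the last-value forecaster is a stand-in for a real TSFM.

```python
def holdout_backtest(series, horizon, forecaster):
    """Hold out the last `horizon` points, forecast them from the rest, return the MAE."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    context, holdout = series[:-horizon], series[-horizon:]
    forecast = forecaster(context, horizon)
    return sum(abs(y - f) for y, f in zip(holdout, forecast)) / horizon


def last_value(context, horizon):
    """Stand-in forecaster: repeat the last observed value (not a TSFM)."""
    return [context[-1]] * horizon


daily_counts = [4, 6, 5, 7, 6, 8, 9]  # illustrative counts for one DF relation
score = holdout_backtest(daily_counts, horizon=2, forecaster=last_value)
```

"Zero-shot" here means the forecaster is never fitted on the log: it only receives the context window at prediction time.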

Docker — self-host CLI (docker/README.md). Build the core image and forecast your own log:
```
docker build -f docker/Dockerfile -t pmf-tsfm .
docker run --rm -v "$PWD/data:/data" -v pmf-cache:/cache \
  pmf-tsfm backtest --input /data/processed_logs/sepsis.xes --model chronos/chronos2
```
The image also runs the full Hydra CLIs (inference, evaluate, evaluate_er, ...). Default models are Chronos + Moirai; TimesFM is an opt-in build (--build-arg INSTALL_TIMESFM=1).

MCP — agent server (mcp/README.md). A headless FastMCP server exposing the same capability as typed MCP tools:

uv sync --extra mcp
python mcp/server.py        # stdio; connect any MCP client or the MCP Inspector

The capped Gradio demo in demo/ remains the hosted visualization Space; these two artifacts are the uncapped, bring-your-own-data path.

Tested Environments

macOS on Apple Silicon: tested with device=mps for local development and lighter runs.
Linux workstation/server with NVIDIA GPUs: tested with device=cuda and the local scripts/env.sh helpers.
VSC wICE cluster: tested with Slurm submission scripts under scripts/hpc/ for NVIDIA H100 GPU jobs.

For macOS with MPS, keep training.num_workers=0. On Linux systems with NVIDIA GPUs and on the HPC cluster, use a higher worker count such as training.num_workers=4.

HPC (VSC wICE cluster)

Slurm submission scripts for the VSC wICE cluster live under scripts/hpc/. Use scripts/hpc/.env.hpc.example as the cluster-specific starting point.

W&B logging on HPC depends on LOGGER:

bash scripts/hpc/submit_pipeline.sh defaults to LOGGER=disabled.
Direct stage scripts such as submit_zero_shot.sh, submit_lora.sh, and submit_full_tune.sh default to LOGGER=wandb.
Use LOGGER=wandb_offline only if you want offline runs, then sync them later with bash scripts/hpc/sync_wandb_offline.sh.

bash scripts/hpc/setup_vsc.sh          # One-time environment setup
bash scripts/hpc/submit_pipeline.sh    # Default: no W&B logging
LOGGER=wandb bash scripts/hpc/submit_pipeline.sh
LOGGER=wandb_offline bash scripts/hpc/submit_pipeline.sh
bash scripts/hpc/submit_zero_shot.sh   # Default: LOGGER=wandb for direct stage runs
bash scripts/hpc/sync_wandb_offline.sh # Only after explicit offline runs

Project Structure

pmf-tsfm/
├── src/pmf_tsfm/       # Python package: model adapters, data modules, evaluation, api.py seam
├── configs/            # Hydra configs for tasks, models, datasets, loggers, paths
├── scripts/            # Local orchestration scripts and HPC helpers
├── docker/             # Self-host image for the core pipeline (see docker/README.md)
├── mcp/                # Headless FastMCP server over the api.py seam (see mcp/README.md)
├── demo/               # Gradio forecast explorer, hosted as a live HF Space (see demo/README.md)
├── tests/              # pytest suite
├── data/               # Zenodo assets plus generated processed splits
├── outputs/            # Generated predictions and evaluation artifacts (git-ignored)
├── results/            # Generated checkpoints and LoRA adapters (git-ignored)
├── notebooks/          # Analysis notebooks
├── manuscript/         # Paper assets
└── slides/             # Slidev talk deck, published as a live HF Space

Citation

@article{yu2025time,
  title={Time Series Foundation Models for Process Model Forecasting},
  author={Yu, Yongbo and Peeperkorn, Jari and De Smedt, Johannes and De Weerdt, Jochen},
  journal={arXiv preprint arXiv:2512.07624},
  year={2025}
}

License

This project is licensed under the MIT License.
