Unofficial PyTorch Reproduction of A Frame Is Worth One Token: Efficient Generative World Modeling with Delta Tokens
This is an unofficial implementation maintained by @StaryMoon. If this repository helps your reading, reproduction, or course project, please consider giving it a star and following my GitHub profile.
- 2026-06-10: Initial public release with model interfaces, configuration, smoke test, and reproduction roadmap.
This repository organizes a PyTorch implementation of A Frame Is Worth One Token: Efficient Generative World Modeling with Delta Tokens, a paper on efficient generative world modeling with compact delta tokens. The codebase follows a standard research-repository layout so that each paper component can be replaced, tested, and extended independently.
Main goals:
- provide a clean PyTorch module layout for the paper;
- keep training, inference, evaluation, and configuration entry points explicit;
- track paper-reported metrics separately from local reproduction logs;
- make it easy for contributors to fill in missing paper-specific details.
DeltaWorld-Unofficial/
├── configs/
│ └── default.yaml
├── scripts/
│ └── smoke_test.py
├── src/deltaworld_unofficial/
│ ├── __init__.py
│ └── model.py
├── README.md
├── requirements.txt
└── pyproject.toml
git clone https://github.com/StaryMoon/DeltaWorld-Unofficial.git
cd DeltaWorld-Unofficial
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
For CUDA-enabled experiments, install the PyTorch build matching your CUDA version from the official PyTorch website before installing the rest of the dependencies.
Run the minimal forward-pass check:
python scripts/smoke_test.py
Expected output:
output: (...)
loss: ...
This confirms that the package import path, model interface, and tensor flow are working.
Create a local data directory:
mkdir -p data checkpoints outputs
Recommended layout:
data/
├── train/
├── val/
└── test/
Dataset-specific converters will be added under scripts/ as the reproduction becomes more complete. Please do not commit private datasets, downloaded checkpoints, or generated outputs.
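The recommended layout can also be created from Python, which may be convenient inside setup scripts. This is a minimal sketch using only the standard library; the function name `create_layout` and its `root` parameter are illustrative assumptions and are not part of the repository.

```python
from pathlib import Path

# Directory names are taken from the layout above.
SPLITS = ("train", "val", "test")
EXTRA_DIRS = ("checkpoints", "outputs")


def create_layout(root: str = ".") -> list[Path]:
    """Create data/{train,val,test}, checkpoints/, and outputs/ under root."""
    base = Path(root)
    targets = [base / "data" / split for split in SPLITS]
    targets += [base / name for name in EXTRA_DIRS]
    for path in targets:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        path.mkdir(parents=True, exist_ok=True)
    return targets
```

Calling `create_layout()` from the repository root should have the same effect as the `mkdir -p` command, plus the three split subdirectories under `data/`.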
Current minimal module usage:
import torch
from deltaworld_unofficial import StarterConfig, UnofficialModel, reconstruction_loss
config = StarterConfig(hidden_dim=128, num_layers=2, num_heads=4)
model = UnofficialModel(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(2, 3, 64, 64)
token_ids = torch.randint(0, config.vocab_size, (2, 8))
target = torch.zeros(2, config.output_dim)
pred = model(x, token_ids=token_ids)
loss = reconstruction_loss(pred, target)
loss.backward()
optimizer.step()
Full training scripts will be added as paper-specific datasets and loss terms are implemented.
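Until a full training script exists, the single step above can be extended into a short loop. The sketch below is self-contained and therefore uses a hypothetical `StandInModel` and a plain MSE loss in place of `UnofficialModel` and `reconstruction_loss`, whose exact behavior is not documented here; the batch shape and learning rate are copied from the snippet above, and everything else is an assumption.

```python
import torch
from torch import nn


class StandInModel(nn.Module):
    """Hypothetical placeholder; the real UnofficialModel may differ."""

    def __init__(self, output_dim: int = 16) -> None:
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, output_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def train_steps(model: nn.Module, num_steps: int = 3, lr: float = 1e-4) -> list[float]:
    """Run a few optimizer steps on random data and return the loss values."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    losses = []
    for _ in range(num_steps):
        x = torch.randn(2, 3, 64, 64)      # random stand-in batch
        optimizer.zero_grad()              # clear gradients from the previous step
        pred = model(x)
        target = torch.zeros_like(pred)    # placeholder target, as in the snippet above
        # MSE is assumed here as a stand-in for reconstruction_loss.
        loss = nn.functional.mse_loss(pred, target)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses
```

The main difference from the one-step snippet is the `optimizer.zero_grad()` call, which matters as soon as more than one step is run because PyTorch accumulates gradients by default.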
import torch
from deltaworld_unofficial import UnofficialModel
model = UnofficialModel().eval()
with torch.no_grad():
    x = torch.randn(1, 3, 64, 64)
    y = model(x)
print(y.shape)
Planned evaluation entry points:
python scripts/smoke_test.py
# future:
# python scripts/evaluate.py --config configs/default.yaml --ckpt checkpoints/model.pt
Metrics and protocols will follow the original paper as closely as possible. Paper-reported values and local reproduction values should be kept in separate columns.
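The planned `scripts/evaluate.py` command line could be parsed roughly as follows. Only the `--config` and `--ckpt` flags come from the command shown above; the defaults, help strings, and the `build_parser` name are assumptions, since the script does not exist yet.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags named in the planned command."""
    parser = argparse.ArgumentParser(description="Evaluate a checkpoint (sketch).")
    parser.add_argument(
        "--config",
        default="configs/default.yaml",  # assumed default: the repository's only config
        help="Path to the YAML configuration file.",
    )
    parser.add_argument(
        "--ckpt",
        required=True,  # assumption: evaluation is meaningless without weights
        help="Path to the model checkpoint to evaluate.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"config={args.config} ckpt={args.ckpt}")
```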
For copyright and license clarity, this repository links to the original paper figures and tables instead of redistributing screenshots copied from the PDF. The table below tracks the paper-reported result locations so readers can quickly compare against future local logs.
| Result Type | Paper Location | Source |
|---|---|---|
| Main quantitative comparison | Main paper tables | Original paper / project page |
| Ablation study | Ablation section | Original paper / project page |
| Qualitative examples | Main paper figures and supplement | Original paper / project page |
| Date | Config | Dataset Split | Metric | Value | Notes |
|---|---|---|---|---|---|
| 2026-06-10 | configs/default.yaml | smoke check | forward pass | ok | package interface validation |
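New rows for the local reproduction log can be generated programmatically so that the column order stays consistent. This is a small sketch using only the standard library; the helper name `format_log_row` and its parameters are hypothetical, and the column order is taken from the table header above.

```python
from datetime import date
from typing import Optional

# Column order copied from the log table header.
COLUMNS = ("Date", "Config", "Dataset Split", "Metric", "Value", "Notes")


def format_log_row(
    config: str,
    split: str,
    metric: str,
    value: object,
    notes: str = "",
    day: Optional[date] = None,
) -> str:
    """Return one markdown table row matching COLUMNS."""
    day = day or date.today()
    cells = (day.isoformat(), config, split, metric, str(value), notes)
    return "| " + " | ".join(cells) + " |"
```

For example, passing the smoke-check values with `day=date(2026, 6, 10)` reproduces the existing row in the table.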
- Package layout and install metadata
- Core PyTorch module interfaces
- Config file and smoke test
- Dataset-specific preprocessing
- Paper-specific losses and heads
- Training script
- Evaluation script
- Model zoo / checkpoints
- Reproduction logs
| Model | Checkpoint | Config | Notes |
|---|---|---|---|
| default | TBA | configs/default.yaml | compact implementation interface |
If you find this repository useful, please cite the original paper:
@article{deltaworldunofficial,
title = {A Frame Is Worth One Token: Efficient Generative World Modeling with Delta Tokens},
author = {Tommie Kerssies and Gabriele Berton and Ju He and Qihang Yu and Wufei Ma and Daan de Geus and Gijs Dubbelman and Liang-Chieh Chen},
year = {2026},
note = {CVPR 2026 Highlight}
}
- Thanks to the authors of A Frame Is Worth One Token: Efficient Generative World Modeling with Delta Tokens for the original research.
- This repository is inspired by standard open-source PyTorch research codebases.
- The implementation is unofficial and all paper names, datasets, and trademarks belong to their respective owners.
This repository is released under the MIT License. The original paper, datasets, official code, and project assets remain governed by their own licenses.