Skip to content

Archerkattri/sam3d-plus-plus

Repository files navigation

SAM 3D Objects + HiCache++

Image-to-3D with a training-free, tree-aware exponential (DMD/Prony) velocity cache on the slat-stage flow matching.

A fork of SAM 3D Objects (Meta) that skips most network evaluations in the slat-stage flow-matching sampler by forecasting the cached velocity with a Dynamic-Mode-Decomposition exponential basis — exact on the feature-ODE solution class where the HiCache polynomial drifts, so it stays lossless at a larger skip interval. Generalised to SAM3D's PyTree (structured) velocities.

base: SAM 3D Objects  arXiv: SAM 3D  arXiv: HiCache  license: SAM  basis: exponential (DMD/Prony)

When to use this repo

These repos are complementary accelerators, not competing solutions — each speeds up a different base generator, and the + / ++ suffix is a method choice, not a rival product. Pick by (1) which base model you run, then (2) which forecast basis you want:

base generator + = HiCache (Hermite) ++ = HiCache++ (DMD)
Hunyuan3D-2.1 hunyuan2.1-plus hunyuan2.1-plus-plus
Hunyuan3D-2 mini hunyuan2-plus hunyuan2-plus-plus
SAM 3D Objects sam3d-plus sam3d-plus-plus
Fast-SAM3D fastsam3d-plus fastsam3d-plus-plus
TRELLIS (v1) faster-trellis faster-trellis-plus-plus
TRELLIS.2-4B (v2) hermit-trellis2 hermit-trellis2-plus-plus
  • + (HiCache / scaled-Hermite): the published polynomial velocity-forecast basis — conservative, reproduces the HiCache paper. Use it to deploy the established method.
  • ++ (HiCache++ / DMD exponential): our Dynamic-Mode-Decomposition basis — the same near-lossless quality at wider skip intervals, where the polynomial diverges. Use it when you push the cache interval for more speed.
  • standalone / model-agnostic: hicache-plus-plus — the forecaster itself, to add DMD caching to your own diffusion/flow model.
  • fast-trellis2 = the TaylorSeer baseline fork (the upstream "Fast" accel) — the v2 reference point, not a HiCache variant.

This repo: sam3d-plus-plusSAM 3D Objects × HiCache++ (DMD) — geometry-lossless to interval-6.


What this is

SAM 3D Objects reconstructs full 3D shape, texture, and layout from a single masked image. The expensive part of inference is the slat-stage flow-matching sampler: an Euler ODE solve that calls the backbone once per step over many steps.

This fork adds HiCache++ to that sampler — training-free, geometry-preserving feature caching. On most solver steps it skips the backbone and forecasts the (CFG-combined) velocity from cached anchors, calling the network only every interval steps. Where HiCache forecasts with a scaled-Hermite polynomial, HiCache++ forecasts with a DMD/Prony exponential basis — the exact function class a diffusion feature trajectory lives in — so quality holds at skip intervals where the polynomial drifts. As in SAM3D the solver state and backbone velocity are torch.utils._pytree structures, so the cache is tree-aware: snapshots are flattened to one vector per step, the propagator is identified once, and the forecast is unflattened back to the tree.

HiCache++ ships alongside HiCache (Hermite) here as the comparison baseline. The forecaster itself is packaged standalone as hicache-plus-plus; the pure-Hermite fork is the sibling sam3d-plus.

Method

A flow-matching sampler integrates dx/dt = v_θ(x, t); across steps the cached velocity F_t evolves under a slowly-varying, near-linear feature-ODE Ḟ = M F, whose exact solution class is a sum of (damped/oscillatory) exponentials Σ_j a_j e^{μ_j t}not polynomials. A polynomial basis (Taylor, Hermite) is only a local truncation of that exponential and diverges as the skip horizon grows, which is what caps a polynomial cache at a modest interval. HiCache++ instead uses Dynamic Mode Decomposition (Schmid 2010), the SVD-regularised generalisation of Prony's method (1795): identify the linear propagator from raw velocity snapshots (F_{t+1} ≈ A F_t), eigendecompose it once, and predict any (fractional) horizon k by eigenvalue powers F_{t+k} ≈ Φ (λ^k ⊙ b). This is exact on the exponential class, so it extends the lossless skip range. Because SAM3D velocities are PyTrees, snapshots are flattened per leaf, DMD'd, and unflattened (accel.py: dmd_*_tree). A ≥4-snapshot floor applies — a real trajectory spends two real DOF per complex pole, so even one oscillatory mode needs rank 3 (3 pairs = 4 snapshots) — and below it (or across a non-uniform window) HiCache++ falls back to the Hermite forecast for warm-up. The hook is native: the Euler solver calls the cache helpers directly, no monkey-patching.

Enable (real API)

HiCache++ lives on the Euler solver of the slat-stage FlowMatching module and is exposed as enable_dmd (the Hermite baseline remains enable_hicache) — see flow_matching/model.py, flow_matching/solver.py, flow_matching/accel.py:

# fm is the slat-stage FlowMatching module inside the SAM 3D Objects pipeline.
# HiCache++ requires the Euler solver (one dynamics_fn eval per step).

fm.enable_dmd(
    interval=6,        # call the backbone every 6th step; forecast the other 5
    history=5,         # sliding window of raw velocity snapshots fed to DMD
    first_enhance=2,   # always run full for the first 2 (warm-up) steps
    end_enhance=None,  # always run full for the final step(s); None = last step
    max_order=2,       # Hermite order used for the <4-snapshot warm-up fallback
    sigma=0.5,         # Hermite scale for that fallback, in (0,1)
)

# baseline for comparison — the Hermite-polynomial forecaster:
# fm.enable_hicache(interval=3, max_order=1, first_enhance=2, sigma=0.5)

# optional, composable: drop the unconditional CFG pass once it aligns
fm.enable_adaptive_guidance(gamma_bar=0.94, warmup=2)

# ... run the normal SAM 3D Objects inference (generate / demo.py) ...

fm.disable_hicache()            # disable_hicache() turns off DMD too (shared slot)
fm.disable_adaptive_guidance()

enable_dmd / enable_hicache are also available directly on the solver (ODESolver.enable_dmd(...), which sets backend="dmd"); the solver resets the per-trajectory cache at the start of every run and only activates it for Euler. A CPU unit test that needs no GPU or model weights — including the DMD exact-on-exponentials and ≥4-snapshot-floor checks — ships in the accel module:

python -m sam3d_objects.model.backbone.generator.flow_matching.accel

Results

On the slat-stage FlowMatching (real SAM 3D Objects weights, F1 vs the uncached baseline), HiCache++ (DMD) is geometry-lossless (F1 = 1.000) out to interval-6 at 1.56× — where HiCache (Hermite) is lossless only to interval-3. The exponential basis is what extends the lossless skip range; both methods stay exactly on the baseline geometry up to their respective ceilings.

config speedup F1 vs baseline
vanilla (uncached) 1.00× 1.000
HiCache (Hermite) i3 1.44× 1.000
DMD i5 1.47× 1.000
HiCache++ (DMD) i6 1.56× 1.000

For the controlled forecast microbenchmark (the exponential basis is ~1e-8 flat in horizon while the polynomial diverges), the Hunyuan3D tables, and the math, see the standalone library hicache-plus-plus.

hicache-pp 1.2.0 alignment (2026-06-10)

Two updates relative to hicache-plus-plus 1.2.0:

  • Hermite comparison arm corrected. The vendored Hermite forecast (the HiCache baseline arm, also the DMD warm-up fallback) evaluated the basis at x = -k; corrected to x = +k (the upstream TaylorSeer distance convention; -k flips every odd-order term). The published numbers above were measured with the as-released code and remain valid as-measured. The DMD arm itself is unaffected by the sign convention.
  • Eigencache not yet vendored. hicache-plus-plus 1.2.0 caches the DMD eigendecomposition per compute window; the DMD fit vendored here still refits on every skipped step. That is forecast-side latency overhead only (quality is identical); the standalone library ships the cached fit, and porting it here is pending.

Attribution

  • SAM 3D Objects © Meta Platforms, Inc. — model, weights, and code under the SAM License (note its restrictions: research/responsible-use terms, a publication-acknowledgement requirement, and trade-control / ITAR / sanctions compliance — not a permissive or unconditionally commercial license). The full upstream README (install, demos, benchmark, citation) is preserved below.
  • HiCache — scaled-Hermite velocity forecasting, arXiv:2508.16984 — the polynomial baseline this fork compares against (reimplemented for PyTree velocities).
  • HiCache++ (this work) — the DMD/Prony exponential forecaster. DMD (Schmid 2010) / Prony (1795) / Matrix-Pencil (Hua–Sarkar 1990) are classical spectral estimation; their application to diffusion feature caching is, to our knowledge, new.
  • Adaptive GuidancearXiv:2312.12487.

The acceleration code added by this fork lives in sam3d_objects/model/backbone/generator/flow_matching/{accel,solver,model}.py.

Citation

If you use this fork, please cite the base model and the acceleration methods it builds on.

SAM 3D Objects (base model):

@article{sam3dteam2025sam3d3dfyimages,
      title={SAM 3D: 3Dfy Anything in Images}, 
      author={SAM 3D Team and Xingyu Chen and Fu-Jen Chu and Pierre Gleize and Kevin J Liang and Alexander Sax and Hao Tang and Weiyao Wang and Michelle Guo and Thibaut Hardin and Xiang Li and Aohan Lin and Jiawei Liu and Ziqi Ma and Anushka Sagar and Bowen Song and Xiaodong Wang and Jianing Yang and Bowen Zhang and Piotr Dollár and Georgia Gkioxari and Matt Feiszli and Jitendra Malik},
      year={2025},
      eprint={2511.16624},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.16624}, 
}

HiCache (scaled-Hermite velocity forecasting — the polynomial baseline):

@misc{hicache2025,
      title={HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial Feature Forecasting},
      eprint={2508.16984},
      archivePrefix={arXiv},
      year={2025}
}

Dynamic Mode Decomposition (the exponential basis behind HiCache++):

@article{schmid2010dmd,
      title={Dynamic mode decomposition of numerical and experimental data},
      author={Schmid, Peter J.},
      journal={Journal of Fluid Mechanics},
      volume={656},
      pages={5--28},
      year={2010}
}

Adaptive Guidance:

@misc{adaptiveguidance2023,
      title={Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models},
      eprint={2312.12487},
      archivePrefix={arXiv},
      year={2023}
}


SAM 3D

SAM 3D Objects is one part of SAM 3D, a pair of models for object and human mesh reconstruction. If you’re looking for SAM 3D Body, click here.

SAM 3D Objects

SAM 3D Team, Xingyu Chen*, Fu-Jen Chu*, Pierre Gleize*, Kevin J Liang*, Alexander Sax*, Hao Tang*, Weiyao Wang*, Michelle Guo, Thibaut Hardin, Xiang Li⚬, Aohan Lin, Jia-Wei Liu, Ziqi Ma⚬, Anushka Sagar, Bowen Song⚬, Xiaodong Wang, Jianing Yang⚬, Bowen Zhang⚬, Piotr Dollár†, Georgia Gkioxari†, Matt Feiszli†§, Jitendra Malik†§

Meta Superintelligence Labs

*Core contributor (Alphabetical, Equal Contribution), ⚬Intern, †Project leads, §Equal Contribution

[Paper] [Code] [Website] [Demo] [Blog] [BibTeX] [Roboflow]

SAM 3D Objects is a foundation model that reconstructs full 3D shape geometry, texture, and layout from a single image, excelling in real-world scenarios with occlusion and clutter by using progressive training and a data engine with human feedback. It outperforms prior 3D generation models in human preference tests on real-world objects and scenes. We released code, weights, online demo, and a new challenging benchmark.


Latest updates

Installation

Follow the setup steps before running the following.

Single or Multi-Object 3D Generation

SAM 3D Objects can convert masked objects in an image, into 3D models with pose, shape, texture, and layout. SAM 3D is designed to be robust in challenging natural images, handling small objects and occlusions, unusual poses, and difficult situations encountered in uncurated natural scenes like this kidsroom:

For a quick start, run python demo.py or use the the following lines of code:

import sys

# import inference code
sys.path.append("notebook")
from inference import Inference, load_image, load_single_mask

# load model
tag = "hf"
config_path = f"checkpoints/{tag}/pipeline.yaml"
inference = Inference(config_path, compile=False)

# load image and mask
image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
mask = load_single_mask("notebook/images/shutterstock_stylish_kidsroom_1640806567", index=14)

# run model
output = inference(image, mask, seed=42)

# export gaussian splat
output["gs"].save_ply(f"splat.ply")

For more details and multi-object reconstruction, please take a look at out two jupyter notebooks:

SAM 3D Body

SAM 3D Body (3DB) is a robust promptable foundation model for single-image 3D human mesh recovery (HMR).

As a way to combine the strengths of both SAM 3D Objects and SAM 3D Body, we provide an example notebook that demonstrates how to combine the results of both models such that they are aligned in the same frame of reference. Check it out here.

License

The SAM 3D Objects model checkpoints and code are licensed under SAM License.

Contributing

See contributing and the code of conduct.

Contributors

The SAM 3D Objects project was made possible with the help of many contributors.

Robbie Adkins, Paris Baptiste, Karen Bergan, Kai Brown, Michelle Chan, Ida Cheng, Khadijat Durojaiye, Patrick Edwards, Daniella Factor, Facundo Figueroa, Rene de la Fuente, Eva Galper, Cem Gokmen, Alex He, Enmanuel Hernandez, Dex Honsa, Leonna Jones, Arpit Kalla, Kris Kitani, Helen Klein, Kei Koyama, Robert Kuo, Vivian Lee, Alex Lende, Jonny Li, Kehan Lyu, Faye Ma, Mallika Malhotra, Sasha Mitts, William Ngan, George Orlin, Peter Park, Don Pinkus, Roman Radle, Nikhila Ravi, Azita Shokrpour, Jasmine Shone, Zayida Suber, Phillip Thomas, Tatum Turner, Joseph Walker, Meng Wang, Claudette Ward, Andrew Westbury, Lea Wilken, Nan Yang, Yael Yungster

Citing SAM 3D Objects

If you use SAM 3D Objects in your research, please use the following BibTeX entry.

@article{sam3dteam2025sam3d3dfyimages,
      title={SAM 3D: 3Dfy Anything in Images}, 
      author={SAM 3D Team and Xingyu Chen and Fu-Jen Chu and Pierre Gleize and Kevin J Liang and Alexander Sax and Hao Tang and Weiyao Wang and Michelle Guo and Thibaut Hardin and Xiang Li and Aohan Lin and Jiawei Liu and Ziqi Ma and Anushka Sagar and Bowen Song and Xiaodong Wang and Jianing Yang and Bowen Zhang and Piotr Dollár and Georgia Gkioxari and Matt Feiszli and Jitendra Malik},
      year={2025},
      eprint={2511.16624},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.16624}, 
}

Weights & data

Model weights and demo/example assets are not committed to this repo — only the acceleration architecture (code + integration). Download the base-model weights from the upstream project, facebookresearch/sam-3d-objects, per its instructions, and point the loader at them (see the code / upstream README). This keeps the repository lightweight and avoids redistributing third-party weights.


Family

Part of the HiCache++ acceleration family.

  • Family hub: hicache-plus-plus — the basis library behind this adapter.
  • Sibling: sam3d-plus — the same base model with the HiCache (scaled-Hermite) polynomial-forecast variant.

About

HiCache++ (DMD) acceleration for SAM 3D Objects — lossless to i6.

Topics

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors