Releases: SimonZeng7108/efficientsam3
EfficientSAM3 v0.4.0 — Stage 3 Fine-Tuned Models & Full PCS Release
What's New
Apologies for the Delay — Fine-Tuned Models Are Here!
We know it's been a while, but good things take time! We're happy to share that the fine-tuned EfficientSAM3 models are finally ready.
This release brings three lightweight image encoders (EV-M, RV-M, and TV-M), trained end-to-end on 5% of the SA-1B data with SACap labels for full Promptable Concept Segmentation (PCS) capabilities.
| Model | Vision Encoder | Text Encoder | Decoder | Other | Total Params | vs ImageSAM3 |
|---|---|---|---|---|---|---|
| EV-M | EfficientViT-B1 (22.2M) | MobileCLIP-S0 (42.5M) | 21.0M | 3.5M | 89.2M | 90% smaller |
| RV-M | RepViT-M1.1 (25.6M) | MobileCLIP-S0 (42.5M) | 21.0M | 3.5M | 92.7M | 89% smaller |
| TV-M | TinyViT-11M (28.3M) | MobileCLIP-S0 (42.5M) | 21.0M | 3.5M | 95.3M | 89% smaller |
Note: The ImageSAM3 baseline used for comparison is Vision (463M) + Text (354M) + Decoder (30.3M) + Other (14.2M) = 861.5M parameters.
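The "vs ImageSAM3" column can be reproduced from the numbers in the table and the note. The sketch below is only an illustration of that arithmetic: the variable and function names are made up here, and it assumes the percentages are the listed totals rounded to the nearest whole percent. The RV-M components sum to 92.6M rather than the listed 92.7M, which is presumably a rounding effect in the per-component figures.

```python
# Parameter counts in millions, copied from the table and note above.
# All names below are illustrative and not part of the EfficientSAM3 code base.
SHARED = {"text": 42.5, "decoder": 21.0, "other": 3.5}
VISION = {"EV-M": 22.2, "RV-M": 25.6, "TV-M": 28.3}
LISTED_TOTAL = {"EV-M": 89.2, "RV-M": 92.7, "TV-M": 95.3}
IMAGE_SAM3_TOTAL = 463 + 354 + 30.3 + 14.2  # 861.5M


def component_sum(model: str) -> float:
    """Add the vision encoder to the components shared by all three models."""
    return VISION[model] + sum(SHARED.values())


def percent_smaller(total: float, baseline: float = IMAGE_SAM3_TOTAL) -> int:
    """Size reduction relative to the baseline, rounded to a whole percent."""
    return round(100 * (1 - total / baseline))


for name, total in LISTED_TOTAL.items():
    print(
        f"{name}: components sum to {component_sum(name):.1f}M, "
        f"listed total {total}M, {percent_smaller(total)}% smaller"
    )
```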
Grab the checkpoints on Hugging Face.
We've also given the README a clean-up — easier to navigate, clearer model descriptions, and better organized for contributors.
Three-Stage Progressive Distillation
EfficientSAM3 delivers a complete distillation pipeline:
- Stage 1: Compact encoder distillation on SA-1B (image) + Recap-DataComp-1B (text)
- Stage 3: End-to-end fine-tuning on SAM3 data for full PCS quality
EfficientSAM3 v0.3.0 — EfficientSAM3.1 & SAM3.1-LiteText Image Model Release
EfficientSAM3 — v3.0 Models & Hugging Face Integration (2026-04-13)
We’re excited to announce the release of the EfficientSAM3.1 and SAM3.1-LiteText models, alongside official Hugging Face integration and advanced data engine support!
What’s new
- New Models Released: EfficientSAM3.1 and SAM3.1-LiteText image models are now available on the `stage1_sam3.1` branch.
- Hugging Face Integration: SAM3-LiteText has been officially merged into HuggingFace Transformers.
- Data Engine Support: Stage 3 data engine support is now live on the `data_engine` branch.

🏆 Community Heroes
A massive thank you to the community members who helped make this release possible:
- @NielsRogge, @yonigozlan, and the Hugging Face team: For integrating SAM3-LiteText into the Hugging Face API.
- @colinlin1982: For finding redundant original SAM3 weights in the efficient models and adding a trim script to reduce the model size. (Note: The new EfficientSAM3.1 models do not need to be trimmed).
- @clcl777: For adding multi-device support.
EfficientSAM3 v0.2.0 — SAM3-LiteText
EfficientSAM3 — SAM3-LiteText (2026-02-18)
We’re releasing SAM3-LiteText, a highly efficient text encoder variant for SAM3.
What’s new
- SAM3-LiteText released! Reduces text encoder parameters by up to 88% with similar performance to the original text encoder.
- Paper Released: arXiv:2602.12173 (PDF)
- Code: Available in the `sam3_litetext` branch.
Get the weights
- Download from Hugging Face: Simon7108528/EfficientSAM3/tree/main/sam3_litetext
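For a scripted download, one option is `snapshot_download` from the `huggingface_hub` package, restricted to the `sam3_litetext` folder. This is a minimal sketch, not the project's documented workflow. It assumes `Simon7108528/EfficientSAM3` is a regular model repository and that the weights sit under `sam3_litetext/`, as the path above suggests. The helper names and the `checkpoints` target directory are hypothetical.

```python
from pathlib import Path

REPO_ID = "Simon7108528/EfficientSAM3"  # taken from the Hugging Face path above
SUBFOLDER = "sam3_litetext"             # folder named in the same path


def litetext_download_kwargs(local_dir: str = "checkpoints") -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    `local_dir` is a hypothetical target directory chosen for this example.
    """
    return {
        "repo_id": REPO_ID,
        "allow_patterns": [f"{SUBFOLDER}/*"],  # fetch only this folder
        "local_dir": str(Path(local_dir)),
    }


def download_litetext(local_dir: str = "checkpoints") -> str:
    """Download the folder and return the local snapshot path (needs network)."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**litetext_download_kwargs(local_dir))
```

Calling `download_litetext()` requires `pip install huggingface_hub` and network access. The files should land under `checkpoints/sam3_litetext/` if the repository layout matches the assumption above.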
Quick start
- Check out the branch: `git checkout sam3_litetext`
- Install: `pip install -e ".[stage1]"`
- Run inference with the new lite text encoders.
EfficientSAM3 v0.1.1 — Stage 1+ Fine-Tuned (ft) Weights
EfficientSAM3 — Stage 1+ Fine-Tuned (ft) Weights (2026-01-11)
We’re releasing/refreshing Stage 1 geometry-prompt fine-tuned (ft) checkpoints for EfficientSAM3.
What’s new
- Image encoder ft weights (trained on 1% SA-1B with geometry-prompt fine-tuning)
- Text encoder ft weights (fine-tuned on SA-Co Gold + Silver text annotations)
- Updated download links in the Model Zoo (Google Drive + Hugging Face)
Get the weights
- See the Model Zoo in the repo `README.md` and the project page: https://simonzeng7108.github.io/efficientsam3/
Quick start
- Install: `pip install -e ".[stage1]"`
- Inference examples are in the repo under `sam3/efficientsam3_examples/`


