An interactive segmentation tool powered by DINOv3 for high-resolution image segmentation with real-time visual feedback.
- Interactive Segmentation: Click and drag to segment objects in real-time
- DINOv3 Powered: Leverages Meta's powerful DINOv3 vision transformer models
- Multiple Upsampling Methods: Choose between AnyUp (high-quality) or bilinear (fast) upsampling
- PyQt5 GUI: Clean, responsive interface for interactive use
- Similarity Visualization: Explore patch-level feature similarity across images
- Flexible Resize Methods: Support for both padding and center-cropping preprocessing
- Python 3.8+
- CUDA-capable GPU (recommended for best performance)
- 8GB+ RAM
- Clone the repository:
  git clone https://github.com/Jordan-Pierce/segment-a-saurus.git
  cd segment-a-saurus
- Install dependencies:
  pip install -r requirements.txt
For GPU acceleration with FAISS:
pip install faiss-gpu
Run the main segmentation tool:
python qtdemo_segmenter.py -f data/ --resolution 448 --upsampler bilinear --segmenter torch
Usage:
- Hover: See low-resolution patch similarity
- Left-click and drag: Add positive prompts for high-resolution segmentation
- Adjust threshold: Use the slider to fine-tune segmentation sensitivity
- Navigate: Use Previous/Next buttons or arrow keys to cycle through images
- Clear/Undo: Remove prompts as needed
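The hover behaviour above relies on mapping a mouse position to a cell on the patch grid. The following is a minimal sketch of that mapping, not the tool's actual code: it assumes a square resized image and a ViT patch size of 16, and `pixel_to_patch` is a hypothetical helper name.

```python
def pixel_to_patch(x, y, resolution=448, patch_size=16):
    """Map a pixel in the resized image to (row, col, flat index) on the patch grid.

    Assumptions (not taken from the repository): the image is square with side
    `resolution`, and the backbone uses `patch_size` x `patch_size` patches.
    """
    if not (0 <= x < resolution and 0 <= y < resolution):
        raise ValueError("coordinate lies outside the resized image")
    grid = resolution // patch_size          # patches per side: 28 for 448 / 16
    row, col = y // patch_size, x // patch_size
    return row, col, row * grid + col        # flat index in row-major order


print(pixel_to_patch(100, 37))  # (2, 6, 62)
```

With these assumptions a 448-pixel image yields a 28 x 28 grid, which is why the hover preview is coarse compared with the upsampled result shown after a click.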
Explore DINOv3 feature similarity:
python qtdemo_visualizer.py -f data/ --resolution 448
Usage:
- Mouse over: Visualize similarity between patches in real-time
- Navigate: Use buttons or arrow keys to switch between images
segment-a-saurus/
├── src/ # Core modules
│ ├── DinoSegmenter.py # Main segmentation engine
│ ├── DinoVisualizer.py # Similarity visualization engine
│ └── engine/ # Additional model components
├── qtdemo_segmenter.py # Interactive segmentation GUI
├── qtdemo_visualizer.py # Similarity visualization GUI
├── examples/ # Example notebooks and scripts
│ ├── anyup.ipynb # AnyUp upsampling example
│ ├── foreground_segmentation.py # Foreground segmentation tutorial
│ └── gradio/ # Web-based demos
├── data/ # Sample images
└── web/ # Browser-based demo
- --folder, -f: Path to image folder (default: data/)
- --resolution: Target resolution for processing (default: 448)
- --upsampler: Upsampling method - anyup (high-quality) or bilinear (fast)
- --segmenter: Segmentation backend - faiss (constant-time) or torch (brute-force)
- --resize-method: Image preprocessing - pad (add padding) or crop (center crop)
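For reference, the options above could be declared with `argparse` roughly as follows. This is an illustrative sketch rather than the parser used in `qtdemo_segmenter.py`: the flag names and the two documented defaults come from the list above, while `build_parser`, the help strings, and the choice to leave the undocumented defaults unset (`None`) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(description="Interactive DINOv3 segmentation demo")
    parser.add_argument("--folder", "-f", default="data/", help="Path to image folder")
    parser.add_argument("--resolution", type=int, default=448,
                        help="Target resolution for processing")
    # Defaults for the options below are not documented here, so they stay None.
    parser.add_argument("--upsampler", choices=["anyup", "bilinear"],
                        help="Upsampling method")
    parser.add_argument("--segmenter", choices=["faiss", "torch"],
                        help="Segmentation backend")
    parser.add_argument("--resize-method", choices=["pad", "crop"],
                        help="Image preprocessing")
    return parser


args = build_parser().parse_args(
    ["-f", "data/", "--resolution", "224", "--upsampler", "bilinear", "--segmenter", "torch"]
)
print(args.folder, args.resolution, args.upsampler, args.segmenter, args.resize_method)
```

Restricting each option with `choices` makes a typo such as `--upsampler bilnear` fail at startup instead of partway through model loading.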
Fast mode (good for quick testing):
python qtdemo_segmenter.py -f data/ --resolution 224 --upsampler bilinear --segmenter torch
High-quality mode (best results, requires more compute):
python qtdemo_segmenter.py -f data/ --resolution 512 --upsampler anyup --segmenter torch
- Feature Extraction: DINOv3 processes images to extract rich semantic features at patch level
- Upsampling: Features are upsampled to high resolution using either:
  - AnyUp: Learning-based upsampling for maximum quality
  - Bilinear: Fast interpolation for real-time performance
- Interactive Prompting: User clicks are converted to feature queries
- Similarity Search: FAISS or PyTorch computes similarity between prompt and all pixels
- Real-time Feedback: Results are visualized with adjustable confidence thresholds
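The prompting, similarity search, and thresholding steps above can be sketched with plain NumPy. This is a simplified illustration under stated assumptions, not the project's implementation: it uses NumPy in place of FAISS or PyTorch, compares features by cosine similarity, combines multiple prompts by taking the maximum similarity, and applies the threshold directly to that score. `segment_from_prompts` is a hypothetical name, and the feature map is synthetic.

```python
import numpy as np


def segment_from_prompts(features, prompts, threshold=0.5):
    """Brute-force similarity segmentation sketch.

    features: (H, W, C) array of per-pixel (already upsampled) features.
    prompts:  list of (row, col) positions the user clicked.
    Returns the (H, W) similarity map and the boolean mask.
    """
    h, w, c = features.shape
    flat = features.reshape(-1, c)
    # Unit-normalize so a dot product equals cosine similarity.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    # Each click becomes a query vector taken from the feature map.
    queries = np.stack([flat[r * w + col] for r, col in prompts])  # (P, C)
    sims = flat @ queries.T                                        # (H*W, P)
    # Assumption: a pixel is kept if it resembles any prompt.
    best = sims.max(axis=1).reshape(h, w)
    return best, best >= threshold


# Synthetic 4x4 feature map: left half "object", right half "background".
features = np.zeros((4, 4, 2))
features[:, :2] = [1.0, 0.0]
features[:, 2:] = [0.0, 1.0]

similarity, mask = segment_from_prompts(features, prompts=[(0, 0)], threshold=0.5)
print(mask.astype(int))
```

Clicking the top-left pixel selects the whole left half, because every pixel there shares the prompt's feature direction while the right half is orthogonal to it. Raising or lowering `threshold` plays the role of the sensitivity slider.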
Segmentation tool controls:
- Mouse: Hover for patch similarity, drag for segmentation
- Threshold Slider: Adjust segmentation sensitivity (0.0-1.0)
- Previous/Next: Navigate through images
- Undo: Remove last prompt
- Clear: Remove all prompts
- Q Key: Quit application
Visualizer controls:
- Mouse Movement: Explore patch similarity in real-time
- Arrow Keys: Navigate between images
- Q Key: Quit application
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source. Please check individual model licenses (DINOv3, AnyUp) for their specific terms.
- Meta AI for DINOv3 models
- AnyUp authors for upsampling techniques
- Hugging Face for model hosting and transformers library