Jordan-Pierce/segment-a-saurus

segment-a-saurus 🦕

An interactive tool, powered by DINOv3, for high-resolution image segmentation with real-time visual feedback.

🚀 Features

  • Interactive Segmentation: Click and drag to segment objects in real time
  • DINOv3 Powered: Leverages Meta's powerful DINOv3 vision transformer models
  • Multiple Upsampling Methods: Choose between AnyUp (high-quality) or bilinear (fast) upsampling
  • PyQt5 GUI: Clean, responsive interface for interactive use
  • Similarity Visualization: Explore patch-level feature similarity across images
  • Flexible Resize Methods: Support for both padding and center-cropping preprocessing

📋 Requirements

  • Python 3.8+
  • CUDA-capable GPU (recommended for best performance)
  • 8 GB+ RAM

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/Jordan-Pierce/segment-a-saurus.git
    cd segment-a-saurus
  2. Install dependencies:

    pip install -r requirements.txt

    For GPU acceleration with FAISS:

    pip install faiss-gpu

🎯 Quick Start

Interactive Segmentation Demo

Run the main segmentation tool:

python qtdemo_segmenter.py -f data/ --resolution 448 --upsampler bilinear --segmenter torch

Usage:

  • Hover: See low-resolution patch similarity
  • Left-click and drag: Add positive prompts for high-resolution segmentation
  • Adjust threshold: Use the slider to fine-tune segmentation sensitivity
  • Navigate: Use Previous/Next buttons or arrow keys to cycle through images
  • Clear/Undo: Remove prompts as needed

Similarity Visualization Demo

Explore DINOv3 feature similarity:

python qtdemo_visualizer.py -f data/ --resolution 448

Usage:

  • Mouse over: Visualize similarity between patches in real time
  • Navigate: Use buttons or arrow keys to switch between images
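
To make the hover behaviour above concrete, here is a minimal sketch of how a mouse position could be mapped to the patch underneath it. It is an illustration, not the repository's code: the function name `pixel_to_patch` is hypothetical, and the patch size of 16 is an assumption about the DINOv3 model in use.

```python
def pixel_to_patch(x, y, image_size=448, patch_size=16):
    """Return (row, col, flat_index) of the patch containing pixel (x, y).

    Assumes a square image of side `image_size` that has already been
    resized, and a patch size of 16 (an assumption, not taken from the repo).
    """
    if not (0 <= x < image_size and 0 <= y < image_size):
        raise ValueError("pixel lies outside the resized image")
    grid = image_size // patch_size            # patches per side
    row, col = y // patch_size, x // patch_size
    return row, col, row * grid + col          # row-major flat index


print(pixel_to_patch(100, 200))  # (12, 6, 342)
```

The flat index is what one would use to pick the query row out of a (patches × channels) feature matrix.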

📁 Project Structure

segment-a-saurus/
├── src/                              # Core modules
│   ├── DinoSegmenter.py              # Main segmentation engine
│   ├── DinoVisualizer.py             # Similarity visualization engine
│   └── engine/                       # Additional model components
├── qtdemo_segmenter.py               # Interactive segmentation GUI
├── qtdemo_visualizer.py              # Similarity visualization GUI
├── examples/                         # Example notebooks and scripts
│   ├── anyup.ipynb                   # AnyUp upsampling example
│   ├── foreground_segmentation.py    # Foreground segmentation tutorial
│   └── gradio/                       # Web-based demos
├── data/                             # Sample images
└── web/                              # Browser-based demo

⚙️ Configuration Options

Segmentation Tool Options

  • --folder, -f: Path to image folder (default: data/)
  • --resolution: Target resolution for processing (default: 448)
  • --upsampler: Upsampling method - anyup (high-quality) or bilinear (fast)
  • --segmenter: Segmentation backend - faiss (index-based similarity search) or torch (brute-force comparison)
  • --resize-method: Image preprocessing - pad (add padding) or crop (center crop)
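
For readers who want to see how the options above fit together, the following is a rough `argparse` sketch of a parser accepting the same flags. It is not copied from `qtdemo_segmenter.py`. Only the defaults for `--folder` and `--resolution` come from the list above; the defaults shown for `--upsampler`, `--segmenter`, and `--resize-method` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Interactive segmentation demo (sketch)")
    parser.add_argument("--folder", "-f", default="data/",
                        help="Path to image folder")
    parser.add_argument("--resolution", type=int, default=448,
                        help="Target resolution for processing")
    # The three defaults below are assumed; the README does not state them.
    parser.add_argument("--upsampler", choices=["anyup", "bilinear"], default="bilinear")
    parser.add_argument("--segmenter", choices=["faiss", "torch"], default="torch")
    parser.add_argument("--resize-method", choices=["pad", "crop"], default="pad")
    return parser


args = build_parser().parse_args(["-f", "data/", "--resolution", "224"])
print(args.folder, args.resolution, args.upsampler, args.segmenter, args.resize_method)
```

Restricting each option with `choices` means a typo such as `--upsampler bilnear` fails at startup rather than partway through model loading.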

Example Commands

Fast mode (good for quick testing):

python qtdemo_segmenter.py -f data/ --resolution 224 --upsampler bilinear --segmenter torch

High-quality mode (best results, requires more compute):

python qtdemo_segmenter.py -f data/ --resolution 512 --upsampler anyup --segmenter torch
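
The difference in compute between the two modes comes largely from how `--resolution` scales the work. The sketch below estimates the patch grid and pixel count at a few resolutions. It assumes a patch size of 16 and a square input, neither of which is stated above, and `patch_grid` is a hypothetical helper.

```python
def patch_grid(resolution, patch_size=16):
    """Return (patches per side, total patches) for a square input.

    The patch size of 16 is an assumption about the DINOv3 backbone.
    """
    if resolution % patch_size:
        raise ValueError("resolution should be a multiple of the patch size")
    side = resolution // patch_size
    return side, side * side


for res in (224, 448, 512):
    side, total = patch_grid(res)
    print(f"{res}px: {side}x{side} grid, {total} patches, {res * res} pixels")
# 224px: 14x14 grid, 196 patches, 50176 pixels
# 448px: 28x28 grid, 784 patches, 200704 pixels
# 512px: 32x32 grid, 1024 patches, 262144 pixels
```

Under these assumptions, going from 224 to 512 multiplies both the patch count and the pixel count by a little over five.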

🧠 How It Works

  1. Feature Extraction: DINOv3 processes each image to extract semantic features at the patch level
  2. Upsampling: Features are upsampled to high resolution using either:
    • AnyUp: Learning-based upsampling for maximum quality
    • Bilinear: Fast interpolation for real-time performance
  3. Interactive Prompting: Each user click is converted to a feature query
  4. Similarity Search: FAISS or PyTorch computes the similarity between the prompt features and the features of every pixel
  5. Real-time Feedback: Results are visualized with an adjustable similarity threshold
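
Steps 3 to 5 can be sketched in a few lines of NumPy. This is a simplified brute-force stand-in, not the code in `DinoSegmenter.py`: the function name `segment_from_prompts` is hypothetical, cosine similarity is an assumed choice of metric, and combining several prompts by taking the maximum similarity is likewise an assumption.

```python
import numpy as np


def segment_from_prompts(features, prompts, threshold=0.5):
    """Threshold per-pixel cosine similarity to the prompted pixels.

    features: (H, W, C) array of upsampled per-pixel features.
    prompts:  list of (row, col) click locations.
    Returns a boolean (H, W) mask and the (H, W) similarity map.
    """
    h, w, c = features.shape
    flat = features.reshape(-1, c)
    # L2-normalise so that a dot product equals cosine similarity.
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    queries = flat[[r * w + col for r, col in prompts]]   # (P, C)
    sim = flat @ queries.T                                # (H*W, P)
    best = sim.max(axis=1).reshape(h, w)                  # assumed: max over prompts
    return best >= threshold, best


# Toy feature map: left half and right half have orthogonal features.
features = np.zeros((4, 4, 2))
features[:, :2] = [1.0, 0.0]
features[:, 2:] = [0.0, 1.0]

mask, sim = segment_from_prompts(features, [(0, 0)], threshold=0.5)
print(mask.astype(int))  # left two columns are 1, right two are 0
```

Raising `threshold` shrinks the mask toward pixels that closely match the prompt, which is the effect the slider has in the GUI.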

🎮 Controls

Segmentation Demo

  • Mouse: Hover for patch similarity, drag for segmentation
  • Threshold Slider: Adjust segmentation sensitivity (0.0-1.0)
  • Previous/Next: Navigate through images
  • Undo: Remove last prompt
  • Clear: Remove all prompts
  • Q Key: Quit application
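
The Undo and Clear controls behave like operations on a stack of prompts. A minimal sketch of such a container follows. The class name `PromptStack` and its methods are hypothetical and are not taken from the repository.

```python
class PromptStack:
    """Keeps click prompts in the order they were added (illustrative only)."""

    def __init__(self):
        self._points = []

    def add(self, x, y):
        self._points.append((x, y))

    def undo(self):
        """Remove and return the most recent prompt, or None if empty."""
        return self._points.pop() if self._points else None

    def clear(self):
        """Remove all prompts."""
        self._points.clear()

    @property
    def points(self):
        return list(self._points)  # copy, so callers cannot mutate internal state


prompts = PromptStack()
prompts.add(10, 20)
prompts.add(30, 40)
prompts.undo()
print(prompts.points)  # [(10, 20)]
```

After each change, the segmentation would be recomputed from whatever remains in `points`.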

Visualization Demo

  • Mouse Movement: Explore patch similarity in real time
  • Arrow Keys: Navigate between images
  • Q Key: Quit application

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is open source. Please check individual model licenses (DINOv3, AnyUp) for their specific terms.

🙏 Acknowledgments

  • Meta AI for DINOv3 models
  • AnyUp authors for upsampling techniques
  • Hugging Face for model hosting and transformers library
