DeepVisionary is a modular, PyTorch-based object detection framework inspired by YOLO, designed for developers and researchers to build, train, and evaluate object detection models. It is ideal for learning, experimentation, and small-scale projects, with support for COCO-format datasets and extensible components.
- Modular Architecture: Separates data loading, model components (Backbone, Neck, Head), and training logic for easy customization.
- Configurable: Hyperparameters and dataset paths defined in `config.yaml`.
- Evaluation & Visualization: Includes mAP computation and visualization tools for detection results.
- Extensible: Supports custom data augmentation, loss functions, and inference pipelines.
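As an illustration of the kind of custom augmentation the framework is designed to accept, a transform can be written as a callable that adjusts an image and its boxes together. This is a hedged sketch, not the project's actual plug-in API; the real interface lives in `data/transforms.py` and may differ.

```python
import random

import numpy as np


class HorizontalFlip:
    """Flip an HxWxC image and its boxes left-right with probability p.

    Boxes are [x_min, y_min, x_max, y_max] in pixel coordinates.
    This class is an illustrative example, not part of DeepVisionary.
    """

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, boxes):
        if random.random() < self.p:
            image = image[:, ::-1]  # flip the width axis
            width = image.shape[1]
            # Mirror each box: new x1 is width - old x2, new x2 is width - old x1
            boxes = [[width - x2, y1, width - x1, y2]
                     for x1, y1, x2, y2 in boxes]
        return image, boxes
```

A transform written this way can be composed with others in a list and applied sequentially inside the dataset's `__getitem__`.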
- Clone the repository:

  ```bash
  git clone https://github.com/colinwps/DeepVisionary.git
  cd DeepVisionary
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Prepare a COCO-format dataset in `data/annotations/` and `data/images/`.
```
DeepVisionary/
├── configs/              # Configuration files (e.g., config.yaml)
├── data/                 # Data loading and preprocessing
│   ├── dataset.py
│   ├── transforms.py
│   └── annotations/
├── models/               # Model components
│   ├── yolo.py
│   ├── backbone.py
│   ├── neck.py
│   └── head.py
├── scripts/              # Training and inference scripts
│   ├── train.py
│   └── inference.py
├── utils/                # Utilities
│   ├── loss.py
│   ├── metrics.py
│   ├── logger.py
│   └── visualization.py
├── requirements.txt      # Project dependencies
└── README.md             # Documentation
```
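For orientation, an illustrative `configs/config.yaml` might look like the following. Every key name here is hypothetical; consult the file shipped in `configs/` for the actual schema.

```yaml
# Hypothetical example -- the real keys are defined by configs/config.yaml
dataset:
  train_images: data/images/train
  train_annotations: data/annotations/instances_train.json
  val_images: data/images/val
  val_annotations: data/annotations/instances_val.json
train:
  epochs: 100
  batch_size: 16
  learning_rate: 0.001
model:
  num_classes: 80
  input_size: 640
```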
- Configure: Edit `configs/config.yaml` to set dataset paths and hyperparameters.
- Train: Run the training script:

  ```bash
  python scripts/train.py
  ```

- Inference: Run inference on a single image:

  ```bash
  python scripts/inference.py --image path/to/image.jpg --checkpoint path/to/model.pth --output output.jpg
  ```

  Update `class_names` in `inference.py` to match your dataset.
- Evaluate: Use `utils/metrics.py` to compute mAP on validation data.
- Visualize: Use `utils/visualization.py` to draw bounding boxes on images.
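The mAP computation rests on box intersection-over-union: a detection counts as a true positive only when its IoU with a ground-truth box exceeds a threshold. A minimal sketch of that building block (independent of the actual implementation in `utils/metrics.py`) looks like:

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    # Intersection rectangle: max of the mins, min of the maxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes overlapping in a 1x1 region give an IoU of 1/7 (intersection 1, union 4 + 4 - 1 = 7).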
- Python >= 3.8
- PyTorch >= 1.9.0
- Torchvision >= 0.10.0
- NumPy >= 1.19.0
- OpenCV-Python >= 4.5.0
- PyYAML >= 5.4.0
We welcome contributions! To contribute:
- Fork the repository.
- Create a feature branch (`git checkout -b feature/your-feature`).
- Commit your changes (`git commit -m 'Add your feature'`).
- Push to the branch (`git push origin feature/your-feature`).
- Open a Pull Request.
This project is licensed under the MIT License.