Added support for converting dataset to COCO#3

Open
Kubson900 wants to merge 13 commits into main from feature/coco_format

Conversation

@Kubson900

Collaborator

No description provided.

@Kubson900 Kubson900 requested a review from folkien October 10, 2025 09:16
@Kubson900 Kubson900 self-assigned this Oct 10, 2025
@Kubson900 Kubson900 added the enhancement New feature or request label Oct 10, 2025
@folkien folkien requested a review from Copilot October 10, 2025 12:36

Copilot AI left a comment

Pull Request Overview

This PR adds support for converting datasets to COCO format, a widely used annotation format for computer vision models. It introduces functionality for converting YOLO-format datasets to COCO format with train/validation/test splits.

  • Adds COCO format conversion capability through new helper functions
  • Implements class remapping functionality for annotation datasets
  • Refactors argument parsing with improved formatting and new options
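For readers unfamiliar with the two formats, the core of a YOLO-to-COCO conversion is a bounding-box change of coordinates: YOLO stores normalized `class cx cy w h`, while COCO stores `[x_min, y_min, width, height]` in pixels. The following is a minimal sketch of that step only, not the code from this PR; the function names are hypothetical, and passing the class id through unchanged as `category_id` is an assumption.

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x, center y, width, height)
    to a COCO box [x_min, y_min, width, height] in pixels."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def yolo_line_to_coco_annotation(line, image_id, annotation_id, img_w, img_h):
    """Build one COCO annotation dict from one line of a YOLO label file.

    Assumption: the YOLO class id is reused as the COCO category_id.
    """
    class_id, cx, cy, w, h = line.split()
    bbox = yolo_to_coco_bbox(float(cx), float(cy), float(w), float(h), img_w, img_h)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": int(class_id),
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


if __name__ == "__main__":
    print(yolo_line_to_coco_annotation("2 0.5 0.5 0.5 0.5", 1, 1, 640, 480))
```

A full conversion would also need the `images` and `categories` sections of `_annotations.coco.json`, which this sketch leaves out.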

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| yaya_tools/yaya_dataset.py | Adds COCO conversion command-line options and integration logic with progress tracking |
| yaya_tools/helpers/coco_format.py | New module providing COCO format conversion, RGBA-to-RGB conversion, and dataset splitting functions |
| yaya_tools/helpers/annotations.py | Adds class remapping functionality and reformats existing code for consistency |
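The class remapping mentioned for `annotations.py` is not shown in this conversation. As a rough illustration of the idea, assuming YOLO-style label lines whose first field is the class id, a remap could look like the sketch below; `remap_yolo_classes` and its `drop_unmapped` flag are hypothetical names, not the PR's API.

```python
def remap_yolo_classes(lines, mapping, drop_unmapped=False):
    """Rewrite the class id (first field) of each YOLO label line.

    mapping: dict of old class id -> new class id.
    drop_unmapped: if True, lines whose class id is not in mapping are removed;
    otherwise they are kept unchanged.
    """
    remapped = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        class_id = int(parts[0])
        if class_id in mapping:
            parts[0] = str(mapping[class_id])
        elif drop_unmapped:
            continue
        remapped.append(" ".join(parts))
    return remapped


if __name__ == "__main__":
    labels = ["0 0.5 0.5 0.2 0.2", "3 0.1 0.1 0.1 0.1"]
    print(remap_yolo_classes(labels, {0: 1}))
```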

Comment on lines +8 to +27
"""
COCO format required by e.g. RFDETR

dataset/
├── train/
│   ├── _annotations.coco.json
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ... (other image files)
├── valid/
│   ├── _annotations.coco.json
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ... (other image files)
└── test/
    ├── _annotations.coco.json
    ├── image1.jpg
    ├── image2.jpg
    └── ... (other image files)
"""

Copilot AI Oct 10, 2025

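The module docstring should be a triple-quoted string placed as the first statement of the module. If it sits below the imports, as its position at lines 8 to 27 suggests, Python treats it as an ordinary string expression rather than as the module docstring.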
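To make the point about docstring placement concrete, here is a small check using the standard `ast` module. It assumes the string in question follows an import, which the line range suggests but the diff shown here does not confirm; the two source strings are made-up stand-ins for the real file.

```python
import ast

# A string literal placed after an import is just an expression statement.
AFTER_IMPORTS = 'import logging\n\n"""COCO format layout."""\n'

# The same literal as the first statement becomes the module docstring.
AT_TOP = '"""COCO format layout."""\n\nimport logging\n'


def module_docstring(source):
    """Return the module docstring of the given source text, or None."""
    return ast.get_docstring(ast.parse(source))


if __name__ == "__main__":
    print(module_docstring(AFTER_IMPORTS))
    print(module_docstring(AT_TOP))
```

Moving the string above the imports is enough for tools such as `help()` and documentation generators to pick it up.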

Returns:
tuple[sv.DetectionDataset, sv.DetectionDataset, sv.DetectionDataset]:
A tuple containing training validation and test datasets

Copilot AI Oct 10, 2025

Missing comma in docstring return description. Should be 'A tuple containing training, validation and test datasets'.

Suggested change
- A tuple containing training validation and test datasets
+ A tuple containing training, validation and test datasets
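For context, a function with this kind of return value might be shaped like the sketch below. This is not the PR's implementation, which returns `sv.DetectionDataset` objects; it is a plain-Python stand-in operating on a list, with a hypothetical name and hypothetical default ratios, showing the corrected wording in the docstring.

```python
import random


def split_three_ways(items, train_ratio=0.7, valid_ratio=0.2, seed=0):
    """Shuffle items and split them into three disjoint subsets.

    Returns:
        tuple[list, list, list]:
            A tuple containing training, validation and test subsets
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_train = int(len(shuffled) * train_ratio)
    n_valid = int(len(shuffled) * valid_ratio)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_three_ways(range(10))
    print(len(train), len(valid), len(test))
```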

Comment on lines +1 to 2
import yaml
import logging

Copilot AI Oct 10, 2025

Import statements should be grouped and ordered: standard library imports first, then third-party imports. The yaml import should come after the standard library imports (logging, os, pathlib).

Labels

enhancement New feature or request

2 participants