Dataset merge using filename index introduces redundant columns #2

@JuliaDima

Issue

Problem Description

The _get_combined_labeled_data method in session.py merges two dataframes using filename as the index and left as the merge type. This creates new columns (usually label_x and label_y, since the label column appears in both datasets). Code that uses the merged dataset afterwards can fail because there is no longer a single, proper label column.
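The column duplication can be shown outside the project with a small pandas sketch. The dataframes and filenames below are made-up stand-ins, not the actual data or code from session.py; the only assumption is that both frames carry a label column and are merged on their filename index with a left merge.

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes merged in session.py.
original = pd.DataFrame(
    {"label": ["star", "star", "galaxy"]},
    index=pd.Index(["a.png", "b.png", "c.png"], name="filename"),
)
active_learning = pd.DataFrame(
    {"label": ["galaxy"]},
    index=pd.Index(["b.png"], name="filename"),
)

# Left merge on the index. Because "label" exists in both frames,
# pandas disambiguates with its default suffixes "_x" and "_y".
merged = original.merge(
    active_learning, how="left", left_index=True, right_index=True
)

print(list(merged.columns))
```

Any later code that looks up merged["label"] would raise a KeyError here, which matches the kind of error described above.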

Expected Behaviour

The merge should yield a single label column, with a defined rule for which dataset's value is used when both provide one. For example, if a filename has an update, take the value from the active learning dataset, since this suggests an updated label for that cutout.
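One possible way to get this behaviour is sketched below. It is only an illustration, not the project's fix: combine_labels is a hypothetical helper, the data is invented, and it assumes that the active learning values should win whenever they are present and that only filenames from the first dataset should be kept (matching the current left merge).

```python
import pandas as pd


def combine_labels(original: pd.DataFrame, active_learning: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: single label column, active learning values win."""
    # combine_first keeps the caller's non-null values and fills the rest
    # from the other frame, aligning on the filename index.
    combined = active_learning.combine_first(original)
    # Restrict to the rows and columns of the first dataset (left-merge semantics).
    return combined.reindex(index=original.index, columns=original.columns)


# Made-up example data.
original = pd.DataFrame(
    {"label": ["star", "star", "galaxy"]},
    index=pd.Index(["a.png", "b.png", "c.png"], name="filename"),
)
active_learning = pd.DataFrame(
    {"label": ["galaxy"]},
    index=pd.Index(["b.png"], name="filename"),
)

result = combine_labels(original, active_learning)
print(result)
```

If filenames that appear only in the active learning dataset should also be kept, the reindex step on the index could be dropped; which of the two is wanted is a decision for the maintainers.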

How Can It Be Tested or Reproduced

Label some files using the AM GUI, save the labels, then use the resulting labels dataset.
