Clarification on Dataset Licensing #59

@boeddeker

Dear NotSoFar Team,

I was reviewing the dataset license on Hugging Face (https://huggingface.co/datasets/microsoft/NOTSOFAR) and noticed that the README currently includes the following “Data License” statement:

This public data is currently licensed for use exclusively in the NOTSOFAR challenge event. We appreciate your understanding that it is not yet available for academic or commercial use. However, we are actively working towards expanding its availability for these purposes. We anticipate a forthcoming announcement that will enable broader and more impactful use of this data. Stay tuned for updates. Thank you for your interest and patience.

At the same time, the README of this GitHub repository describes the data as being licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which allows broader use with attribution.

I was wondering if you could clarify the current licensing status of this dataset. As it stands, the license statement attached to the data appears to restrict its use, which may make some users, including myself, hesitant to work with it.

Apologies for cross-posting: I raised the same question on Hugging Face (https://huggingface.co/datasets/microsoft/NOTSOFAR/discussions/1), but I’m not sure whether discussions there generate notifications.

Best regards,
Christoph
