Improve HF integration #5

Hi @WangHelin1997,

Niels here from the open-source team at Hugging Face. I discovered your work through the daily papers: https://huggingface.co/papers/2409.08425, congrats! I work together with [AK](https://x.com/_akhaliq) on improving the visibility of researchers' work on the hub.

I see you already have a model and demo on the 🤗 hub, which is great. I have a couple of suggestions for improving the HF integration:

## Uploading datasets

I see the datasets are currently inside the model repo: https://huggingface.co/westbrook/SoloAudio/tree/main. It would be great to make them available as Dataset repos instead; see https://huggingface.co/docs/datasets/loading for details. The datasets could then be made compatible with the Datasets library, so that people can load the data in 2 lines of code. They would also get the dataset viewer (https://huggingface.co/docs/hub/en/datasets-viewer), which lets people explore the data right from the browser.
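To make this concrete, here's a rough sketch of what moving a local data folder into a Dataset repo could look like with `huggingface_hub`. The repo id `your-username/your-dataset` and the local folder path are placeholders (I don't know your actual layout), and whether `load_dataset` works out of the box depends on how the files are organized, so treat this as a starting point rather than a recipe.

```python
from pathlib import Path


def push_dataset_folder(local_dir: str, repo_id: str, private: bool = False) -> str:
    """Upload a local folder to a Dataset repo on the Hub and return the repo id."""
    # Fail early, before any network call, if the folder does not exist.
    if not Path(local_dir).is_dir():
        raise FileNotFoundError(f"{local_dir} is not a directory")

    # Imported lazily so the helper can be imported without the dependency installed.
    from huggingface_hub import HfApi

    api = HfApi()
    # repo_type="dataset" is what makes this a Dataset repo rather than a model repo.
    api.create_repo(repo_id=repo_id, repo_type="dataset", private=private, exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="dataset")
    return repo_id


def load_from_hub(repo_id: str):
    """The '2 lines of code' users would need once the repo is in a supported layout."""
    from datasets import load_dataset

    return load_dataset(repo_id)


if __name__ == "__main__":
    # Placeholder values; requires being logged in (e.g. via `huggingface-cli login`).
    push_dataset_folder("path/to/local/data", "your-username/your-dataset")
```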

## Uploading models

Regarding the models, we encourage researchers to push each model checkpoint to a separate model repository, so that things like download stats also work. Currently it seems that the VAE and SoloAudio checkpoints are in a single repo: https://huggingface.co/westbrook/SoloAudio/tree/main.

See here for a guide: https://huggingface.co/docs/hub/models-uploading. In case the models are custom PyTorch models, we could probably leverage the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) class, which adds `from_pretrained` and `push_to_hub` to each model. Alternatively, one can leverage the [hf_hub_download](https://huggingface.co/docs/huggingface_hub/en/guides/download#download-a-single-file) one-liner to download a checkpoint from the hub.

Let me know if you're interested/need any help regarding this!

Cheers,

Niels
ML Engineer @ HF 🤗 
