Hello,
Yes, it is a known issue.
The simulated data has been unavailable since the dataset was transferred to Hugging Face. We are working on this. Sorry for the inconvenience!
Igor
From: luoj21 @.***>
Sent: Wednesday, February 19, 2025 8:50 PM
To: microsoft/NOTSOFAR1-Challenge @.***>
Cc: Subscribed @.***>
Subject: [microsoft/NOTSOFAR1-Challenge] Access to Simulated Data (Issue #57)
Hello,
Do we still have access to the full simulated training data (both 200hr and 1000hr variants)? When I run:
train_set_path = download_simulated_subset(version=ver, volume='200hrs', subset_name='train', destination_dir=os.path.join(project_dir, 'train'))
I get:
RuntimeError: Failed to list files in directory css-datasets/v1.5/200hrs/train in the Hugging Face repository: Failed to list directory css-datasets/v1.5/200hrs/train in the Hugging Face repository: 404 Client Error. (Request ID: Root=1-67b6b17c-33b2cee92c21e8c75237416e;a44d6604-d391-4b11-ae74-0a543a8c90f2)
Entry Not Found for url: https://huggingface.co/api/datasets/microsoft/NOTSOFAR/tree/main/css-datasets%2Fv1.5%2F200hrs%2Ftrain?recursive=True&expand=False. css-datasets/v1.5/200hrs/train does not exist on "main"
Originally posted by @igor0304 in #57
Hello, is the simulated data available on Hugging Face yet? There is no css-datasets directory at
https://huggingface.co/datasets/microsoft/NOTSOFAR/tree/main
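
For anyone who wants to check whether the directory has reappeared without running the full download, here is a minimal sketch using only the Python standard library. It rebuilds the tree-listing URL in the same shape as the one in the error message above and treats a 404 as "not there yet". The helper names `tree_url` and `directory_exists` are hypothetical and not part of the NOTSOFAR code; the endpoint layout is taken from the error message, and its behaviour beyond that 404 is an assumption.

```python
import urllib.error
import urllib.request
from urllib.parse import quote

# Base of the dataset API, as it appears in the error message from this issue.
HF_DATASETS_API = "https://huggingface.co/api/datasets"


def tree_url(repo_id, path_in_repo, revision="main"):
    """Build the tree-listing URL (hypothetical helper).

    The directory path is percent-encoded as a single segment, which is
    why the slashes show up as %2F in the reported URL.
    """
    encoded_path = quote(path_in_repo, safe="")
    return (
        f"{HF_DATASETS_API}/{repo_id}/tree/{revision}/"
        f"{encoded_path}?recursive=True&expand=False"
    )


def directory_exists(repo_id, path_in_repo, revision="main", timeout=30):
    """Return True if the listing request succeeds, False on a 404.

    Any other HTTP error is re-raised so that it is not mistaken for
    "directory missing". This function performs a network request.
    """
    url = tree_url(repo_id, path_in_repo, revision)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Calling `directory_exists("microsoft/NOTSOFAR", "css-datasets/v1.5/200hrs/train")` should return `False` for as long as the 404 reported in this issue persists.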