
feat: add 3D mesh support and MeshFolder builder #8055

Merged
lhoestq merged 12 commits into huggingface:main from Vinay-Umrethe:mesh-support
May 27, 2026

Conversation

@Vinay-Umrethe
Contributor

This PR introduces 3D mesh support to the datasets library, mirroring the existing paradigms for the Image, Audio, and Video modalities, so that 3D data can be handled the same way.

A new Mesh feature class, which manages 3D data via a PyArrow struct containing both raw bytes and file paths. Support is intentionally focused on self-contained formats like GLB, PLY, and STL; these seem like the sweet spot to me, because others like .obj and .gltf can require external companion files.
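To make the `{"bytes", "path"}` layout and the extension restriction concrete, here is a minimal standard-library sketch. The helper names `sniff_mesh_format` and `encode_mesh_example` are hypothetical and are not taken from the PR; the actual `Mesh` feature may validate and encode inputs differently.

```python
from pathlib import Path
from typing import Optional

# Extensions the PR description lists as supported, self-contained formats.
SELF_CONTAINED_EXTENSIONS = {".glb", ".ply", ".stl"}


def sniff_mesh_format(data: bytes) -> Optional[str]:
    """Guess a mesh format from its leading bytes (hypothetical helper)."""
    if data[:4] == b"glTF":  # GLB files start with the ASCII magic "glTF"
        return "glb"
    if data[:3] == b"ply":  # PLY headers start with the line "ply"
        return "ply"
    if data[:5] == b"solid":  # ASCII STL starts with the keyword "solid"
        return "stl"
    # Binary STL has no magic number, so it cannot be recognized this way.
    return None


def encode_mesh_example(path: str, data: Optional[bytes] = None) -> dict:
    """Build a dict shaped like the struct<bytes, path> storage (assumed layout)."""
    ext = Path(path).suffix.lower()
    if ext not in SELF_CONTAINED_EXTENSIONS:
        raise ValueError(f"unsupported mesh extension: {ext!r}")
    return {"bytes": data, "path": path}
```

The sniffing step is only a heuristic: the 80-byte header of a binary STL is free-form, so falling back to the file extension is usually still necessary.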

A new MeshFolder builder module. This packaged module enables users to load datasets directly from structured or unstructured directories of mesh files. The implementation has been integrated into the library's core, including registration in the main features module and support for 3D data within `WebDataset`.
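As a rough illustration of what a folder-based builder does, the sketch below walks a directory and infers a label from each mesh file's parent directory, which is the convention the existing folder-based builders follow for structured directories. The function `iter_mesh_examples` is hypothetical and is not the MeshFolder implementation; metadata files and split handling are left out.

```python
import tempfile
from pathlib import Path

MESH_EXTENSIONS = {".glb", ".ply", ".stl"}


def iter_mesh_examples(root):
    """Yield one example per mesh file found under root (illustrative only)."""
    root = Path(root)
    for file in sorted(root.rglob("*")):
        if file.is_file() and file.suffix.lower() in MESH_EXTENSIONS:
            parts = file.relative_to(root).parts
            # Structured layout: root/<label>/<file>. Unstructured: no label.
            label = parts[-2] if len(parts) > 1 else None
            yield {"mesh": str(file), "label": label}


# Usage: build a tiny throwaway directory tree and list what is found.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "chair").mkdir()
    (Path(tmp) / "chair" / "a.glb").write_bytes(b"glTF")
    (Path(tmp) / "loose.stl").write_bytes(b"solid x")
    (Path(tmp) / "notes.txt").write_text("ignored")
    examples = list(iter_mesh_examples(tmp))
```

Files with other extensions are skipped, and a file sitting directly in the root gets no label, which corresponds to the unstructured case.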

Tests were run, including some that use newly added test files.

Tests conducted:

python -m pytest tests/features/test_mesh.py tests/packaged_modules/test_meshfolder.py -vv

Output:

================================================= test session starts =================================================
platform win32 -- Python 3.12.7, pytest-8.4.2, pluggy-1.6.0 -- C:\Users\vinay_\AppData\Local\Programs\Python\Python312\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\vinay_\Desktop\DatasetPR\datasets
configfile: pyproject.toml
plugins: anyio-4.11.0, dash-3.3.0, Faker-38.2.0, hydra-core-1.3.2, langsmith-0.5.0
collected 21 items

tests/features/test_mesh.py::test_mesh_instantiation PASSED                                                      [  4%]
tests/features/test_mesh.py::test_mesh_feature_type_to_arrow PASSED                                              [  9%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>0] PASSED                                  [ 14%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>1] PASSED                                  [ 19%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>2] PASSED                                  [ 23%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>3] PASSED                                  [ 28%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>4] PASSED                                  [ 33%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>5] PASSED                                  [ 38%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>6] PASSED                                  [ 42%]
tests/features/test_mesh.py::test_mesh_feature_encode_example[<lambda>7] PASSED                                  [ 47%]
tests/features/test_mesh.py::test_mesh_decode_example PASSED                                                     [ 52%]
tests/features/test_mesh.py::test_dataset_with_mesh_feature PASSED                                               [ 57%]
tests/features/test_mesh.py::test_dataset_cast_to_mesh_features PASSED                                           [ 61%]
tests/features/test_mesh.py::test_dataset_concatenate_mesh_features PASSED                                       [ 66%]
tests/features/test_mesh.py::test_require_decoding PASSED                                                        [ 71%]
tests/features/test_mesh.py::test_mesh_embed_storage PASSED                                                      [ 76%]
tests/packaged_modules/test_meshfolder.py::test_meshfolder_config_and_extensions PASSED                          [ 80%]
tests/packaged_modules/test_meshfolder.py::test_config_raises_when_invalid_name PASSED                           [ 85%]
tests/packaged_modules/test_meshfolder.py::test_generate_examples_with_labels PASSED                             [ 90%]
tests/packaged_modules/test_meshfolder.py::test_data_files_with_metadata_and_single_split[False] PASSED          [ 95%]
tests/packaged_modules/test_meshfolder.py::test_data_files_with_metadata_and_single_split[True] PASSED           [100%]

================================================= 21 passed in 7.17s ==================================================

Some test files were also added to the tests/features/data folder, as I saw was done for other modalities.

I'm not sure whether this will get attention or be merged, but I'm doing it anyway since I reported this in #8048

@Vinay-Umrethe
Contributor Author

A test I ran:

from datasets import Features, Value, Sequence, Image, Audio, Mesh, load_dataset

# Define features.
features = Features({
    'id': Value('string'),
    'objaverse_uid': Value('string'),
    'text': Value('string'),
    'image': Image(),
    'audio': Audio(),
    'mesh': Mesh(),  # new: stored as struct<bytes, path>
    'metadata': {
        'image_score': Value('double'),
        'audio_score': Value('double'),
        'tags': Sequence(Value('string'))
    }
})

# Load a Parquet.
dataset = load_dataset(
    "parquet", 
    data_files={"train": "train-00001.parquet"}, 
    features=features,
    streaming=True
)["train"]

# Push.
dataset.push_to_hub("VINAY-UMRETHE/Vividha-Test")

This can be viewed at VINAY-UMRETHE/Vividha-test

The dataset viewer does not show anything, though, since the site is not configured to display a Mesh column, either as a rendered image (heavy) or as a simple placeholder icon. That is up to the devs.

@Vinay-Umrethe
Contributor Author

@lhoestq review

@lhoestq
Member

lhoestq commented May 26, 2026

Looks really cool ! is there a python lib that can be used to load the data instead of returning bytes/path ?

and sorry for the delay !

@Vinay-Umrethe
Contributor Author

@lhoestq

> is there a python lib that can be used to load the data instead of returning bytes/path ?

Yes, the trimesh library exists for that, and I think it's the most reliable one.

I updated:

Mesh(decode=True) now loads .glb, .ply, and .stl files with trimesh, returning a trimesh.Trimesh (for single objects) or a trimesh.Scene (for scenes, which is the common case for GLB files).

Mesh(decode=False) still returns the {"path": ..., "bytes": ...} struct, matching the existing Image() and Audio() feature pattern.
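The two modes can be sketched roughly as follows. `decode_mesh` is a hypothetical helper and not the PR's code; the only trimesh call assumed is `trimesh.load`, which accepts either a path or a file object plus a `file_type`. How the real `Mesh` feature handles a missing path is not stated in this thread, so raising here is an assumption.

```python
import io
from pathlib import Path
from typing import Any


def decode_mesh(value: dict, decode: bool = True) -> Any:
    """Return the raw struct, or a trimesh object when decode is True."""
    if not decode:
        # Pass-through: the caller gets {"path": ..., "bytes": ...} unchanged.
        return value
    data, path = value.get("bytes"), value.get("path")
    if path is None:
        # Assumption: the path is needed, at least to infer the file type.
        raise ValueError("cannot infer the mesh file type without a path")
    file_type = Path(path).suffix.lstrip(".").lower()
    # Imported lazily so that decode=False works without trimesh installed.
    import trimesh

    if data is None:
        return trimesh.load(path)
    # In-memory bytes need an explicit file_type, e.g. "glb", "ply" or "stl".
    return trimesh.load(io.BytesIO(data), file_type=file_type)
```

Depending on the file, the decoded value is a `trimesh.Trimesh` or a `trimesh.Scene`, so downstream code should be ready for both.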

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@lhoestq lhoestq left a comment


lgtm !

```py
>>> from datasets import Dataset, Features, Mesh

>>> dataset = Dataset.from_dict({"mesh": ["path/to/model.glb"]}, features=Features({"mesh": Mesh()}))
```
Member


let's update the docs once we have a cool mesh dataset on HF, do you have an idea ?

Contributor Author

@Vinay-Umrethe Vinay-Umrethe May 27, 2026


@lhoestq

I've run a test, which you can now find in the VINAY-UMRETHE/My-Mesh-Dataset dataset repo; it uses the Mesh() feature.

However, while testing I noticed an error with embed_external_files. It is fixed now but pending a merge; the fix was created in #8224

Before you merge that, we can update the docs in that PR as well, which would finalize the whole mesh support.

Commits:

fix: embed_external_files=True

style: Match other test_features

@lhoestq lhoestq merged commit d4284e9 into huggingface:main May 27, 2026
12 of 14 checks passed
