
B4DL

[ACM MM 2025] Official PyTorch implementation of the paper "B4DL: A Benchmark for 4D LiDAR LLM in Spatio-Temporal Understanding".

Links: Hugging Face · arXiv Paper


Data Generation Pipeline

You will need your own OpenAI API key before running the code below.
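
A quick way to verify the key before launching a long generation run is a one-off request with the official openai Python package (pip install openai). This snippet is a minimal sketch and not part of this repository:

from openai import OpenAI

# Pass the same key you will give to --api_key below,
# or rely on the OPENAI_API_KEY environment variable.
client = OpenAI(api_key="sk-...")

# Listing models is a cheap request; it raises
# openai.AuthenticationError if the key is invalid.
models = client.models.list()
print("Key OK, first available model:", models.data[0].id)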

cd datageneration

4D LiDAR Context Extraction Step

Please download the nuScenes dataset and set the nuscenes_root argument to the download path.

Run the following commands:

bash scripts/generate_description.sh

or run the Python script directly:

python3 generate_description.py \
    --start_index 10 \
    --end_index 20 \
    --api_key {your openai api key} \
    --nuscenes_root /mnt/nfs_shared_data/dataset/cch/nuScenes \
    --dataroot ./data
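
Before kicking off extraction, it can help to confirm that nuscenes_root points at a complete download. A minimal sketch using the official nuscenes-devkit (pip install nuscenes-devkit), not part of this repository; the dataset version is an assumption, so adjust it to match your download:

from nuscenes.nuscenes import NuScenes

# Adjust version/dataroot to match your local download.
nusc = NuScenes(version='v1.0-trainval',
                dataroot='/mnt/nfs_shared_data/dataset/cch/nuScenes',
                verbose=True)
print(f"Loaded {len(nusc.scene)} scenes")  # the trainval split ships with 850 scenes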

Context-to-QA Transformation Step

bash scripts/generate_dataset.sh

or run the Python script directly:

python3 generate_dataset.py \
    --start_index 0 \
    --end_index 10 \
    --api_key {your openai api key} \
    --nuscenes_root /mnt/nfs_shared_data/dataset/cch/nuScenes \
    --dataroot ./data \
    --task existence
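
To cover many scenes, you can wrap the script in a small driver. This is a hypothetical sketch built only from the flags shown above; task names other than "existence" are not documented here, so check the scripts in scripts/ for the full list:

import subprocess

API_KEY = "sk-..."  # your OpenAI API key
NUSCENES_ROOT = "/mnt/nfs_shared_data/dataset/cch/nuScenes"

# Process scenes in chunks of 10 so each run stays short and resumable.
for start in range(0, 100, 10):
    subprocess.run([
        "python3", "generate_dataset.py",
        "--start_index", str(start),
        "--end_index", str(start + 10),
        "--api_key", API_KEY,
        "--nuscenes_root", NUSCENES_ROOT,
        "--dataroot", "./data",
        "--task", "existence",
    ], check=True)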

Training Script

Before running, please download the base model file and place it under ./base_model/ (the command below expects it at ./base_model/vicuna-v1-5-7b).

bash run_stages.sh \
     --s1_data ./b4dl_dataset/stage1_lidarllm_mm.json \
     --s1_feat ./b4dl/stage1_features \
     --s2_data ./b4dl_dataset/stage2.json \
     --s2_feat ./b4dl/stage2_features \
     --model_name_or_path ./base_model/vicuna-v1-5-7b
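
Training runs for a long time, so a pre-flight check that every input path exists can save a failed launch. A hedged sketch (not part of the repo) using only the paths from the command above:

from pathlib import Path

# Paths taken verbatim from the run_stages.sh invocation above.
required = [
    "./base_model/vicuna-v1-5-7b",
    "./b4dl_dataset/stage1_lidarllm_mm.json",
    "./b4dl/stage1_features",
    "./b4dl_dataset/stage2.json",
    "./b4dl/stage2_features",
]
missing = [p for p in required if not Path(p).exists()]
if missing:
    raise FileNotFoundError(f"Missing before training: {missing}")
print("All training inputs found.")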

For training details, see mllm/README.md.


Demo

Example of Generated Dataset

[Demo media: Dataset (LiDAR) · Dataset (Camera) · Dataset (Text)]

Example of Inference

[Demo media: Inference (LiDAR) · Inference (Camera) · Inference (Text)]

Acknowledgements

This work was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2024-00439020, Developing Sustainable, Real-Time Generative AI for Multimodal Interaction, SW Starlab) and partly by the IITP grant funded by the Korea government (MSIT) (No. RS-2025-02283048, Developing the Next-Generation General AI with Reliability, Ethics, and Adaptability).

If you use B4DL in your research or applications, please cite it using this BibTeX:

@inproceedings{choi2025b4dl,
  title={B4DL: A Benchmark for 4D LiDAR LLM in Spatio-Temporal Understanding},
  author={Choi, Changho and Shin, Youngwoo and Han, Gyojin and Lee, Dong-Jae and Kim, Junmo},
  booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
  pages={3399--3407},
  year={2025}
}

License


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.
