TransDex: Pre-training Visuo-Tactile Policy with Point Cloud Reconstruction for Dexterous Manipulation of Transparent Objects

🌐Project Page | 📄Arxiv | 🎬Video

Fengguan Li, Yifan Ma, Chen Qian, Wentao Rao, Weiwei Shang

University of Science and Technology of China



We propose TransDex, a 3D visuo-tactile fusion motor policy based on point cloud reconstruction pre-training.


⚙️ Installation

✅ We recommend running this project in the following environment:

  • Linux: Ubuntu 20.04.
  • CUDA version: 12.9.1.
  • Python version: 3.9.
  • PyTorch version: the official build that matches CUDA 12.9.

Important: Make sure CUDA 12.9.1 and the matching PyTorch build are installed before you begin. This guide does not cover CUDA or PyTorch installation.
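A quick way to confirm the interpreter and PyTorch build before continuing is a short check like the one below. This is a minimal sketch and not part of the repository: `environment_report` is a hypothetical helper name, and it only reports what it finds rather than enforcing anything.

```python
import sys

try:
    import torch  # optional here: the report still works if PyTorch is missing
except ImportError:
    torch = None


def environment_report(expected_python=(3, 9)):
    """Collect the Python and PyTorch details relevant to this guide."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "python_ok": sys.version_info[:2] == tuple(expected_python),
        "torch_installed": torch is not None,
    }
    if torch is not None:
        report["torch"] = torch.__version__
        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["torch_cuda"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda` does not start with 12.9, or `cuda_available` is False, the installed PyTorch build probably does not match the recommended environment.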

Clone Repository

git clone https://github.com/LFGfg/TransDex.git
cd TransDex

Setup Instructions

1. Create and activate Conda environment

conda create -n VTFusion python=3.9 -y && conda activate VTFusion

2. Install dependencies

sudo apt install libgl1 -y && sudo apt-get install g++ -y
conda install pinocchio xorg-libx11 -c conda-forge -y
pip install -r requirements.txt

3. Build and install PointOps extension

The paths in this guide assume the repository is located at ~/policy_ws/src/VTFusion. Adjust them according to your project layout:

cd ~/policy_ws/src/VTFusion/src/extensions/pointops
python setup.py install

🛠️ Usage

1. Pretrain

First, enter the pre-training directory:

cd ~/policy_ws/src/VTFusion/src/PretrainPoint

The dataset processing and model code for the pre-training stage is in ./models/Dataset_process_nor.py and ./models/PretrainPoint.py. The pre-training data used in this project were generated in the PyBullet simulator, and the dexterous hand is described in this paper. An example dataset will be released on Google Drive later.

😸 We strongly suggest that you generate point cloud datasets for your own dexterous hand system and process them into the format expected by the data processing code.

👉🏻 Before training, place the hand-object dataset in ~/policy_ws/src/VTFusion/hand_object_data/hand_object_dataset/, or set dataset.data_dir in ./cfgs/pretrain_hand_object.yaml to the location of your own dataset.

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/pretrain_hand_object.yaml

The trained weight files can be found in ./experiments/pretrain_hand_object/.

To perform a simple evaluation across the entire dataset, you can run:

export PYTHONPATH="$HOME/policy_ws/src/VTFusion/src:$PYTHONPATH"
python ./models/Evaluation.py --mask_ratio 0.70 --ckpt_path ./experiments/pretrain_hand_object/ckpt-last.pth --data_dir ~/policy_ws/src/VTFusion/hand_object_data/hand_object_dataset
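The --mask_ratio flag suggests that evaluation hides a fraction of the point cloud and scores the reconstruction of the hidden part. This README does not describe how the repository selects the masked regions, so the following is only an illustrative sketch of random group masking in the style of masked point autoencoders. The name `random_group_mask` and the group count of 64 are assumptions, not the project's API.

```python
import random


def random_group_mask(num_groups, mask_ratio, seed=None):
    """Return a list of booleans; True marks a group hidden from the encoder."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    # Round down so that the number of masked groups never exceeds the ratio.
    num_masked = int(num_groups * mask_ratio)
    masked = set(rng.sample(range(num_groups), num_masked))
    return [index in masked for index in range(num_groups)]


if __name__ == "__main__":
    mask = random_group_mask(num_groups=64, mask_ratio=0.70, seed=0)
    print(f"masked {sum(mask)} of {len(mask)} groups")
```

With 64 groups and a ratio of 0.70, 44 groups are masked and the remaining 20 stay visible.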

2. Policy

Hardware Setup

The real robotic system consists of a 16-DOF dexterous hand and a 7-DOF humanoid arm. The robot used in this project is also described in this paper. The dexterous hand is equipped with Paxini array tactile sensors. The system also requires two Intel RealSense D435i depth cameras, one mounted at the wrist of the robotic arm and the other positioned near the workbench.

Policy Training

Enter the source directory:

cd ~/policy_ws/src/VTFusion/src/

The dataset processing and model code for the policy is in VTFusion_dataset.py and FusionNetwork.py. You can collect a manipulation dataset with your own robotic system.

👉🏻 Before training the policy, please ensure:

  • The pre-trained encoder weight file is copied to ./pretrain_pointencoder/ckpt.pth.
  • The manipulation dataset is placed under ../data_record/task_name/, and task_name is set accordingly in the config file ./config/config.yaml.
  • The URDF file of the robot is placed in ~/policy_ws/src/URDF/.
  • Parameters such as pos_mins/maxs, rpy_mins/maxs, and joint_mins/maxs in the config file are adjusted for your robotic system and task. The sample config file config/config.yaml contains comments that explain them.
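Limits such as pos_mins/maxs are commonly used to scale each state or action dimension into a fixed range. How the repository applies them is explained in the config comments rather than here, so the sketch below only assumes a common min-max mapping to [-1, 1]. The function names and the example limits are hypothetical.

```python
def normalize(values, mins, maxs):
    """Map each value from [min, max] to [-1, 1], one dimension at a time."""
    if not (len(values) == len(mins) == len(maxs)):
        raise ValueError("values, mins and maxs must have the same length")
    scaled = []
    for value, low, high in zip(values, mins, maxs):
        if high <= low:
            raise ValueError("each max must be strictly greater than its min")
        scaled.append(2.0 * (value - low) / (high - low) - 1.0)
    return scaled


def denormalize(scaled, mins, maxs):
    """Inverse of normalize: map values from [-1, 1] back to [min, max]."""
    return [
        (value + 1.0) / 2.0 * (high - low) + low
        for value, low, high in zip(scaled, mins, maxs)
    ]


if __name__ == "__main__":
    # Hypothetical end-effector position limits in metres.
    pos_mins = [0.2, -0.3, 0.0]
    pos_maxs = [0.8, 0.3, 0.5]
    print(normalize([0.5, 0.0, 0.25], pos_mins, pos_maxs))
```

Values that fall outside the configured limits map outside [-1, 1], which is a useful sign that the limits are too tight for the recorded data.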

Use the following script for training:

CUDA_VISIBLE_DEVICES=0,1,2,3 python ./training.py --config ./config/config.yaml

The trained weight files can be found in ./ckpts/.

Note: Some code files, such as ./pin_forward.py and ./VTFusion_dataset.py, contain functions written specifically for the robotic system used in this project. To use these modules with a different system, you need to adapt those functions.

Real-World Deployment

This project uses ROS and TwinCAT communication for low-level motor control. You can evaluate your trained networks on your own robotic system.

💡 Deployment Tips

  • Ensure all hardware devices are supplied with stable power and properly connected.
  • The two RealSense cameras require hand-eye calibration and time synchronization.
  • Point cloud fusion requires coordinate transformation and ICP registration.
  • Calibrate robotic arm/dexterous hand joint zero positions and set limits.
  • Visualize the point cloud to ensure that the 3D position of tactile points is calculated correctly.
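The coordinate transformation mentioned in these tips amounts to applying a 4x4 homogeneous transform (for example a hand-eye calibration result, or a fingertip pose from forward kinematics) to each 3D point. The sketch below shows that step in plain Python. The function name and the example transform are illustrative assumptions and do not come from the repository.

```python
def transform_points(points, transform):
    """Apply a 4x4 homogeneous transform (nested lists) to (x, y, z) points."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            transform[row][0] * x
            + transform[row][1] * y
            + transform[row][2] * z
            + transform[row][3]
            for row in range(3)
        ))
    return transformed


if __name__ == "__main__":
    # Hypothetical camera-to-base transform: rotate 90 degrees about z,
    # then translate by 1 m along x.
    camera_to_base = [
        [0.0, -1.0, 0.0, 1.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    print(transform_points([(1.0, 0.0, 0.0)], camera_to_base))
```

Transforming both camera clouds and the tactile points into one common base frame this way gives ICP a reasonable starting alignment and makes a misplaced tactile point easy to spot when visualizing.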

📚 Citation

😄 If you find our work useful, please consider citing:

@article{li2026transdex,
  title={TransDex: Pre-training Visuo-Tactile Policy with Point Cloud Reconstruction for Dexterous Manipulation of Transparent Objects},
  author={Li, Fengguan and Ma, Yifan and Qian, Chen and Rao, Wentao and Shang, Weiwei},
  journal={arXiv preprint arXiv:2603.13869},
  year={2026}
}

❓ If you have any questions, please contact lfguan@mail.ustc.edu.cn.
