🛫 Introduction to Embodied Intelligence (A Quick Guide of Embodied-AI)

🤖 About

With the rapid advancement of large-scale models, a key question for researchers is how to equip a model-based intelligent agent with a physical body that can interact with the real world. The concept of embodied intelligence was introduced in response and has attracted a growing research community. To help researchers quickly grasp the frontiers of embodied intelligence and intelligent robotics, and to promote developments in the field, this project summarizes representative works in both areas. It will be updated continually to stay current. If you find any errors while reading, please contact us so that we can correct them; we greatly appreciate such feedback. If you would like to contribute to the further exploration and promotion of embodied intelligence, you are welcome to reach out by email: yinchenghust@outlook.com.

Author List

Thank you to all the authors for their contributions to the project.

Cheng Yin, Nengyu Wang, Yimeng Wang, Chenyu Yang, Zhiwen Hu, Yunxiang Mi, Weichen Lin.

Table of Contents

- 🤖 About
- Author List
- Symbol representation
- 📑 Survey
- 👁️ Perception
- 🧠 Brain Model
- 🏆 VLA Model
- 👑 E-AI-RL
- 🏁 Interactive
- 💻 Simulator
- 📊 Dataset
- 🔧 Toolkit
- 📖 Citation
- 😊 Acknowledgements

Symbol representation

represents closed source.
represents open source.

📑 Survey

- Teleoperation of Humanoid Robots: A Survey [Paper Link] [Project Link] [2023]
- Deep Learning Approaches to Grasp Synthesis: A Review [Paper Link] [Project Link] [2023]
- A Survey of Embodied AI: From Simulators to Research Tasks [Paper Link] [2022]
- A Survey of Embodied Learning for Object-Centric Robotic Manipulation [Paper Link] [Project Link] [2024]
- A Survey on Vision-Language-Action Models for Embodied AI [Paper Link] [2024]
- Embodied Intelligence Toward Future Smart Manufacturing in the Era of AI Foundation Model [Paper Link] [2024]
- Towards Generalist Robot Learning from Internet Video: A Survey [Paper Link] [2024]
- A Survey on Robotics with Foundation Models: toward Embodied AI [Paper Link] [2024]
- Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis [Paper Link] [Project Link] [2024]
- Robot Learning in the Era of Foundation Models: A Survey [Paper Link] [2023]
- Foundation Models in Robotics: Applications, Challenges, and the Future [Paper Link] [Project Link] [2023]
- Large Language Models for Robotics: Opportunities, Challenges, and Perspectives [Paper Link] [2024]
- Awesome-Embodied-Agent-with-LLMs [Project Link] [2024]
- Awesome Embodied Vision [Project Link] [2024]
- Awesome Touch [Project Link] [2024]
- Grasp-Anything Project [Project Link] [2024]
- GraspNet Project [Project Link] [2024]
- Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions [Paper Link] [Project Link] [2024]
- Survey of Learning-based Approaches for Robotic In-Hand Manipulation [Paper Link] [2024]
- A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches [Paper Link] [2024]
- Neural Scaling Laws in Robotics [Paper Link] [2025]
- Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes [Paper Link] [2024]
- Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI [Paper Link] [Project Link] [2024]

👁️ Perception

- RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields [Paper Link] [Project Link] [2024]
- RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation [Paper Link] [Project Link] [2024]
- ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation [Paper Link] [Project Link] [2023]
- Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation [Paper Link] [Project Link] [2024]
- A Contact Model based on Denoising Diffusion to Learn Variable Impedance Control for Contact-rich Manipulation [Paper Link] [2024]

🧠 Brain Model

- RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning [Paper Link] [Project Link] [2024]
- Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting [Paper Link] [Project Link] [2023]
- Generalized Planning in PDDL Domains with Pretrained Large Language Models [Paper Link] [Project Link] [2023]
- QueST: Self-Supervised Skill Abstractions for Learning Continuous Control [Paper Link] [Project Link] [2024]
- Plan Diffuser: Grounding LLM Planners with Diffusion Models for Robotic Manipulation [Paper Link] [2024]
- Action-Free Reasoning for Policy Generalization [Paper Link] [Project Link] [2025]
- Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [Paper Link] [Project Link] [2024]
- DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment [Paper Link] [Project Link]
- Chain-of-Thought Predictive Control [Paper Link] [Project Link] [2024]
- CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation [Paper Link] [Project Link] [2024]
- ClevrSkills: Compositional Language and Visual Reasoning in Robotics [Paper Link] [Project Link] [2024]
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [Paper Link] [Project Link] [2022]
- RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World [Paper Link] [Project Link] [2024]
- Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning [Paper Link] [Project Link] [2023]
- Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation [Paper Link] [Project Link] [2024]
- DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation [Paper Link] [Project Link] [2024]
- HumanPlus: Humanoid Shadowing and Imitation from Humans [Paper Link] [Project Link] [2024]
- On Bringing Robots Home [Paper Link] [Project Link] [2023]
- Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots [Paper Link] [Project Link] [2024]
- Diffusion Policy: Visuomotor Policy Learning via Action Diffusion [Paper Link] [Project Link] [2023]
- Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware [Paper Link] [Project Link] [2023]
- Surgical Robot Transformer (SRT): Imitation Learning for Surgical Tasks [Paper Link] [Project Link] [2024]
- Yell At Your Robot: Improving On-the-Fly from Language Corrections [Paper Link] [Project Link] [2024]

🏆 VLA Model

- RDT-1B: A Diffusion Foundation Model for Bimanual Manipulation [Paper Link] [Project Link] [2024]
- π0: A Vision-Language-Action Flow Model for General Robot Control [Paper Link] [Project Link] [2024]
- DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes [Paper Link] [Project Link] [2024]
- Yell At Your Robot: Improving On-the-Fly from Language Corrections [Paper Link] [Project Link] [2024]
- Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation [Paper Link] [Project Link] [2022]
- Q-attention: Enabling Efficient Learning for Vision-based Robotic Manipulation [Paper Link] [Project Link] [2022]
- RVT: Robotic View Transformer for 3D Object Manipulation [Paper Link] [Project Link] [2023]
- UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent [Paper Link] [2025]
- Universal Actions for Enhanced Embodied Foundation Models [Paper Link] [Project Link] [2025]
- OpenVLA: An Open-Source Vision-Language-Action Model [Paper Link] [Project Link] [2024]
- AnyPlace: Learning Generalized Object Placement for Robot Manipulation [Paper Link] [Project Link] [2025]
- Robotic Control via Embodied Chain-of-Thought Reasoning [Paper Link] [Project Link] [2024]
- Language-Guided Object-Centric Diffusion Policy for Collision-Aware Robotic Manipulation [Paper Link] [2024]
- Hierarchical Diffusion Policy: Manipulation Trajectory Generation via Contact Guidance [Paper Link] [Project Link] [2024]
- DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control [Paper Link] [Project Link] [2025]
- RoboGrasp: A Universal Grasping Policy for Robust Robotic Control [Paper Link] [2025]
- Improving Vision-Language-Action Model with Online Reinforcement Learning [Paper Link] [2025]
- RoboHorizon: An LLM-Assisted Multi-View World Model for Long-Horizon Robotic Manipulation [Paper Link] [2025]
- Equivariant Diffusion Policy [Paper Link] [Project Link] [2024]
- FAST: Efficient Action Tokenization for Vision-Language-Action Models [Paper Link] [Project Link] [2025]
- Gemini Robotics: Bringing AI into the Physical World [Paper Link] [2025]
- RT-H: Action Hierarchies Using Language [Paper Link] [Project Link] [2024]
- AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems [Paper Link] [Project Link] [2025]

👑 E-AI-RL

- Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control [Paper Link] [Project Link] [2024]
- MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning [Paper Link] [Project Link] [2024]

🏁 Interactive

- Learning to Learn Faster from Human Feedback with Language Model Predictive Control [Paper Link] [Project Link] [2024]

💻 Simulator

- ORBIT: A Unified Simulation Framework for Interactive Robot Learning Environments [Paper Link] [Project Link] [2023]
- Gazebo [Paper Link] [Project Link] [2004]
- PyBullet, a Python module for physics simulation for games, robotics and machine learning [Project Link] [2021]
- MuJoCo: A physics engine for model-based control [Paper Link] [Project Link] [2012]
- V-REP: A versatile and scalable robot simulation framework [Project Link] [2013]
- AI2-THOR: An Interactive 3D Environment for Visual AI [Paper Link] [Project Link] [2017]
- CLIPORT: What and Where Pathways for Robotic Manipulation [Paper Link] [Project Link] [2021]
- BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation [Paper Link] [Project Link] [2024]
- RLBench: The Robot Learning Benchmark & Learning Environment [Paper Link] [Project Link] [2019]
- MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations [Paper Link] [Project Link] [2023]
- CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks [Paper Link] [Project Link] [2022]
- Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning [Paper Link] [Project Link] [2019]
- ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI [Paper Link] [Project Link] [2024]
- HomeRobot: Open-Vocabulary Mobile Manipulation [Paper Link] [Project Link] [2023]
- ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes [Paper Link] [Project Link] [2023]
- Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots [Paper Link] [Project Link] [2023]
- InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction [Paper Link] [Project Link] [2024]
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [Paper Link] [Project Link] [2022]
- Holodeck: Language Guided Generation of 3D Embodied AI Environments [Paper Link] [Project Link] [2023]
- PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI [Paper Link] [Project Link] [2024]
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [Paper Link] [Project Link] [2023]
- Genesis: A Universal and Generative Physics Engine for Robotics and Beyond [Project Link] [2025]
- Webots: open-source robot simulator [Paper Link] [Project Link] [2018]
- Unity: A General Platform for Intelligent Agents [Paper Link] [Project Link] [2020]
- ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation [Paper Link] [Project Link] [2021]
- iGibson 1.0: A Simulation Environment for Interactive Tasks in Large Realistic Scenes [Paper Link] [Project Link] [2021]
- SAPIEN: A SimulAted Part-based Interactive ENvironment [Paper Link] [Project Link] [2020]
- VirtualHome: Simulating Household Activities via Programs [Paper Link] [Project Link] [2018]
- Modular Open Robots Simulation Engine: MORSE [Paper Link] [Project Link] [2011]
- VRKitchen: an Interactive 3D Virtual Environment for Task-oriented Learning [Paper Link] [Project Link] [2019]
- CHALET: Cornell House Agent Learning Environment [Paper Link] [Project Link] [2018]
- Habitat: A Platform for Embodied AI Research [Paper Link] [Project Link] [2019]
- MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge [Paper Link] [Project Link] [2022]
- ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks [Paper Link] [Project Link] [2019]
- BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning [Paper Link] [Project Link] [2019]
- Gibson Env: Real-World Perception for Embodied Agents [Paper Link] [Project Link] [2018]
- iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks [Paper Link] [Project Link] [2021]
- RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [Paper Link] [Project Link] [2020]
- LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning [Paper Link] [Project Link] [2023]
- robosuite: A Modular Simulation Framework and Benchmark for Robot Learning [Paper Link] [Project Link] [2020]
- Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace [Paper Link] [Project Link] [2024]
- Robomimic: What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [Paper Link] [Project Link] [2021]
- Adroit: Manipulators and Manipulation in high dimensional spaces [Paper Link] [Project Link] [2016]
- Gymnasium-Robotics [Paper Link] [Project Link] [2024]
- RoboHive: A Unified Framework for Robot Learning [Paper Link] [Project Link] [2024]

📊 Dataset

- Efficient Grasping from RGBD Images: Learning using a new Rectangle Representation [Paper Link] [2011]
- Real-World Multiobject, Multigrasp Detection [Paper Link] [Project Link] [2018]
- Jacquard: A Large Scale Dataset for Robotic Grasp Detection [Paper Link] [Project Link] [2018]
- Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations [Paper Link] [Project Link] [2018]
- ACRONYM: A Large-Scale Grasp Dataset Based on Simulation [Paper Link] [Project Link] [2020]
- EGAD! an Evolved Grasping Analysis Dataset for diversity and reproducibility in robotic manipulation [Paper Link] [Project Link] [2020]
- GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping [Paper Link] [Project Link] [2020]
- Grasp-Anything: Large-scale Grasp Dataset from Foundation Models [Paper Link] [Project Link] [2023]
- DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes [Paper Link] [Project Link] [2024]
- Yale-CMU-Berkeley dataset for robotic manipulation research [Paper Link] [Project Link] [2017]
- AKB-48: A Real-World Articulated Object Knowledge Base [Paper Link] [Project Link] [2022]
- GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts [Paper Link] [Project Link] [2022]
- Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation [Paper Link] [Project Link] [2022]
- DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects [Paper Link] [Project Link] [2023]
- PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations [Paper Link] [Project Link] [2023]
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models [Paper Link] [Project Link] [2024]
- RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents [Paper Link] [Project Link] [2025]
- ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleoperation [Paper Link] [Project Link] [2024]
- GRUtopia: Dream General Robots in a City at Scale [Paper Link] [Project Link] [2024]
- All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents [Paper Link] [Project Link] [2024]
- VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks [Paper Link] [Project Link] [2024]
- RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation [Paper Link] [Project Link] [2024]
- On Bringing Robots Home [Paper Link] [Project Link] [2023]
- Empowering Embodied Manipulation: A Bimanual-Mobile Robot Manipulation Dataset for Household Tasks [Paper Link] [Project Link] [2024]
- DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset [Paper Link] [Project Link] [2024]
- BridgeData V2: A Dataset for Robot Learning at Scale [Paper Link] [Project Link] [2024]
- RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking [Paper Link] [Project Link] [2023]
- AgiBot World Colosseum [Paper Link] [Project Link] [2024]
- REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction [Paper Link] [Project Link] [2023]
- OakInk2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion [Paper Link] [Project Link] [2024]
- A Dataset of Relighted 3D Interacting Hands [Paper Link] [Project Link] [2023]
- Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [Paper Link] [Project Link] [2025]
- RoboNet: Large-Scale Multi-Robot Learning [Paper Link] [Project Link] [2020]
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [Paper Link] [Project Link] [2021]
- BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning [Paper Link] [Project Link] [2022]
- VIMA: General Robot Manipulation with Multimodal Prompts [Paper Link] [Project Link] [2023]
- FastUMI: A Scalable and Hardware-Independent Universal Manipulation Interface with Dataset [Paper Link] [Project Link] [2024]

🔧 Toolkit

- PyRep: Bringing V-REP to Deep Robot Learning [Paper Link] [Project Link] [2024]
- Yet Another Robotics and Reinforcement learning framework for PyTorch [Project Link] [2024]

📖 Citation

If you find this repository helpful, please consider leaving a star ⭐️.

😊 Acknowledgements

Thanks for the repository:
