natnew/awesome-physical-ai


Awesome Physical AI

Physical AI and embodied AI bring large-scale machine learning to robotics: agents that perceive, reason, and act in the physical world. This is a curated, engineering-oriented map of Physical AI resources: robot learning, foundation models for robotics, vision-language-action (VLA) models, world models, simulation and sim-to-real, datasets and benchmarks, generalist policies, and patterns for safe, production-grade systems.

Built for researchers and practitioners across the stack, from foundations to deployment, and for technical leaders weighing how embodied intelligence will reshape products, operations, and infrastructure.

Visit the Live Docs Site

Get Started

New to Physical AI? Start here.
Physical AI sits at the intersection of robotics, machine learning, simulation, and embodied decision-making. The easiest way to begin is to start in simulation, understand the learning loop, then move toward robot policies, foundation models, and real-world systems.

Quick start path

1. Learn the core reinforcement learning loop
Start with Gymnasium CartPole to build intuition for states, actions, rewards, and policies.

2. Explore simulation
Use MuJoCo to understand how control, physics, contacts, and robot dynamics are modelled.

3. Inspect a modern robot-learning workflow
Browse LeRobot to see how robot datasets, policies, training, and evaluation are structured in practice.

4. Explore the frontier of embodied foundation models
Look at OpenVLA to understand how vision-language-action models connect perception, language, and robot control.

5. Move deeper when ready
Once the above feels clear, continue into imitation learning, generalist robot policies, sim-to-real methods, hardware, and evaluation.
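
The learning loop from step 1 can be sketched without installing anything. The toy environment below is a hypothetical stand-in, not Gymnasium's CartPole: it only mimics the `reset()` / `step()` shape of the Gymnasium API so the loop structure carries over. The dynamics, reward, class name, and both policies are illustrative assumptions.

```python
import random


class ToyBalanceEnv:
    """Hypothetical 1-D balancing task shaped like a Gymnasium environment."""

    def __init__(self, limit=1.0, max_steps=200, seed=0):
        self.limit = limit          # episode terminates once |position| exceeds this
        self.max_steps = max_steps  # episode is truncated after this many steps
        self.rng = random.Random(seed)
        self.pos = 0.0
        self.t = 0

    def reset(self):
        # Start near the centre and return (observation, info).
        self.pos = self.rng.uniform(-0.05, 0.05)
        self.t = 0
        return self.pos, {}

    def step(self, action):
        # Action 0 pushes left, action 1 pushes right; Gaussian noise perturbs the state.
        push = 0.1 if action == 1 else -0.1
        self.pos += push + self.rng.gauss(0.0, 0.05)
        self.t += 1
        terminated = abs(self.pos) > self.limit
        truncated = self.t >= self.max_steps
        reward = 1.0  # +1 for every step taken
        return self.pos, reward, terminated, truncated, {}


def random_policy(obs, rng):
    # Ignores the state entirely.
    return rng.choice([0, 1])


def corrective_policy(obs, rng):
    # Pushes back towards the centre.
    return 0 if obs > 0 else 1


def run_episode(env, policy, rng):
    """Run one episode and return the undiscounted sum of rewards."""
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(obs, rng)                                  # policy: state -> action
        obs, reward, terminated, truncated, _ = env.step(action)   # environment transition
        total += reward
        done = terminated or truncated
    return total
```

Calling `run_episode(ToyBalanceEnv(), corrective_policy, random.Random(0))` should survive all 200 steps, while `random_policy` usually drifts past the limit earlier. Once Gymnasium is installed, the same loop should apply to its CartPole environment with the environment object swapped.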

Choose your route

  • Foundations - simulation, control, reinforcement learning, and core robot-learning concepts
  • Robot Learning - imitation learning, diffusion policies, and visuomotor control
  • Foundation Models - VLA models, generalist policies, world models, and large-scale robot data
  • Evaluation & Safety - benchmarks, robustness, deployment constraints, and responsible embodied AI
  • Hardware - low-cost robot platforms and practical embodied projects

Tip: You do not need hardware to begin.
Start in simulation, then move to real-world systems once you understand the learning and evaluation loop.


Simulators

Physics engines and high-fidelity simulation environments for robotics and embodied AI.

  • MuJoCo - Multi-joint dynamics with contact; fast, accurate physics widely used for RL research.
  • NVIDIA Isaac Sim - GPU-accelerated robotics simulator on Omniverse with photorealistic rendering and PhysX.
  • Isaac Lab - Unified robot learning framework on Isaac Sim for RL, imitation learning, and motion planning.
  • Drake - Model-based design toolbox from TRI/MIT for planning, control, and rigorous dynamics analysis.
  • Gazebo - Open-source robotics simulator with mature ROS integration and broad sensor support.
  • PyBullet - Open-source physics engine (Bullet) with Python bindings, popular for prototyping and RL.
  • Habitat - Embodied AI platform optimised for high-throughput 3D navigation and instruction-following research.
  • SAPIEN - Physics-rich simulator with the PartNet-Mobility articulated object dataset.
  • Genesis - Universal differentiable simulator for robotics and embodied AI with cross-platform physics solvers.
  • Webots - Open-source robot simulator with mature educational and research workflows for mobile and manipulation robotics.
  • CoppeliaSim - General-purpose robot simulation platform with rich scene scripting and broad manipulation benchmark usage.
  • CARLA - Open urban-driving simulator used for closed-loop autonomy and robustness testing at scale.
  • AirSim - Photoreal simulation for drones and autonomous vehicles with configurable sensors and environments.
  • Brax - Differentiable, accelerator-native physics engine designed for high-throughput RL experimentation.
  • RaiSim - High-performance rigid-body simulator widely used for legged-locomotion research and control.

Datasets

Large-scale teleoperation, demonstration, and interaction datasets used to train robot policies.

  • Open X-Embodiment - 1M+ trajectories across 22 embodiments; the de facto cross-embodiment training corpus.
  • DROID - Large-scale, in-the-wild manipulation dataset collected across 13 institutions.
  • BridgeData V2 - Diverse manipulation behaviours designed to support broad generalisation.
  • RH20T - Robot manipulation dataset with paired human demonstrations for one-shot learning research.
  • RoboMIND - Multimodal bimanual mobile manipulation dataset with 310K+ trajectories.
  • AgiBot World - Large-scale dataset designed to train and evaluate robot foundation models.
  • Ego4D - Massive-scale egocentric video dataset useful for pretraining perception and world models.
  • Something-Something V2 - Action recognition dataset frequently used to pretrain manipulation perception.
  • RLDS - Standardized format and tooling for logged trajectories used across robot-learning datasets.
  • BEHAVIOR-1K - Large-scale household activity dataset and environment targeting realistic embodied task diversity.
  • CALVIN ABC-D - Language-conditioned manipulation dataset for long-horizon policy training and evaluation.
  • EPIC-KITCHENS-100 - Egocentric video corpus useful for action understanding and embodied perception pretraining.
  • nuScenes - Multisensor autonomous-driving dataset with rich annotations for perception and planning research.
  • Waymo Open Dataset - Large-scale real-world driving dataset used for perception, motion forecasting, and closed-loop autonomy studies.
  • Argoverse 2 - High-quality motion-forecasting and 3D tracking datasets for real-world embodied prediction tasks.
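
To make "logged trajectories" concrete, here is a rough sketch of the episode-of-steps layout that RLDS-style datasets follow. The field names mirror the RLDS step convention as commonly documented (`observation`, `action`, `reward`, `discount`, `is_first`, `is_last`, `is_terminal`), but the dataclasses and helper functions are hypothetical illustrations in plain Python, not the RLDS API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Step:
    observation: Any   # what the robot sensed at this step
    action: Any        # action taken after the observation
    reward: float      # reward received for that action
    discount: float    # discount applied to rewards that follow this step
    is_first: bool     # True only on the first step of an episode
    is_last: bool      # True only on the last step of an episode
    is_terminal: bool  # True if the last step ended in a terminal state


@dataclass
class Episode:
    steps: List[Step] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


def is_well_formed(episode):
    """Check that the first/last/terminal flags sit only where they should."""
    steps = episode.steps
    if not steps:
        return False
    first_ok = steps[0].is_first and not any(s.is_first for s in steps[1:])
    last_ok = steps[-1].is_last and not any(s.is_last for s in steps[:-1])
    terminal_ok = not any(s.is_terminal for s in steps[:-1])
    return first_ok and last_ok and terminal_ok


def discounted_return(episode):
    """Sum rewards, scaling each by the product of the discounts before it."""
    total, scale = 0.0, 1.0
    for step in episode.steps:
        total += scale * step.reward
        scale *= step.discount
    return total
```

Keeping every dataset in one step schema like this is what lets a single training pipeline read trajectories from different robots; the real tooling stores the same fields in dataset files rather than Python objects.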

Benchmarks

Task suites and standardised evaluations for manipulation, locomotion, and embodied reasoning.

  • LIBERO - Lifelong robot learning benchmark with 130 diverse manipulation tasks.
  • RLBench - Vision-guided manipulation benchmark covering 100+ tasks in CoppeliaSim.
  • MetaWorld - Meta-RL benchmark with 50 manipulation tasks for multi-task and transfer studies.
  • CALVIN - Benchmark for long-horizon, language-conditioned manipulation.
  • HumanoidBench - Simulated humanoid benchmark for whole-body control across locomotion and manipulation.
  • FurnitureBench - Real-world long-horizon furniture assembly benchmark.
  • ARNOLD - Language-grounded continuous-task benchmark in physically realistic scenes.
  • Colosseum - Generalisation benchmark perturbing 14 axes of variation for manipulation.
  • OpenEQA - Embodied question-answering benchmark over scanned real environments.
  • CARLA Leaderboard - Standardized autonomous-driving benchmark emphasizing closed-loop safety and robustness.
  • MineDojo - Open-ended embodied-agent benchmark for long-horizon decision making in complex 3D worlds.
  • ALFRED - Vision-language benchmark for household instruction following and embodied task completion.
  • TEACh - Interactive benchmark for embodied dialog and task execution in household environments.
  • RoboTHOR - Navigation benchmark focused on sim-to-real transfer and unseen-scene generalization.
  • ManiSkill Benchmark - Manipulation benchmark suite with scalable GPU simulation and reproducible baselines.

Evaluation Methodology

Harnesses, metrics, and methodology for measuring robot policy performance, robustness, and sim-to-real validity.

  • RoboArena - Decentralised real-world evaluation protocol for generalist robot policies.
  • robomimic - Standardised offline-RL and imitation-learning evaluation pipeline with reproducible baselines.
  • Bench2Drive - Closed-loop evaluation protocol for end-to-end driving policies.
  • Eval-vs-Train Mismatch (Kumar et al.) - Methodology paper on why offline metrics mispredict deployed robot performance.
  • SimplerEnv - Aligned simulator-based evaluation that correlates with real-robot performance for VLAs.
  • Statistical Reliability of RL Evaluations - rliable library and methodology for confidence intervals on RL benchmarks.
  • EvalAI - Open platform for challenge hosting, leaderboard management, and standardized evaluation workflows.
  • CodaLab Competitions - Reproducible benchmark and submission platform for shared evaluation protocols.
  • CARLA ScenarioRunner - Scenario-based closed-loop evaluation harness for safety-critical driving behaviors.
  • nuPlan Devkit - End-to-end planning evaluation stack with documented metrics and simulation loops.
  • Waymo Open Challenges - Public challenge suite with fixed protocols for forecasting and planning evaluation.
  • LeRobot Evaluation Scripts - Practical evaluation tooling for imitation-learning and policy-regression checks.
  • RoboHive - Benchmarking suite with standardized tasks and scoring across manipulation and locomotion.
  • Deep RL That Matters - Foundational paper on statistical pitfalls and reproducibility in RL evaluation.
  • Empirical Design in Reinforcement Learning - Guidance on experimental design choices that materially affect reported results.
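
Several entries above argue for reporting uncertainty rather than a single mean score. As a minimal sketch of that idea, the snippet below computes an interquartile mean and a percentile-bootstrap confidence interval over per-seed scores using only the standard library. It is a simplified illustration, not the rliable API (which, for example, uses stratified resampling across tasks); the function names and defaults are assumptions.

```python
import random
import statistics


def interquartile_mean(scores):
    """Mean of the middle 50% of scores (drops the lowest and highest quarter)."""
    ordered = sorted(scores)
    k = len(ordered) // 4
    return statistics.fmean(ordered[k:len(ordered) - k])


def bootstrap_ci(scores, stat=statistics.fmean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for `stat` over per-run scores."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample the runs with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([scores[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

For example, `bootstrap_ci(success_rates, stat=interquartile_mean)` gives an interval for a list of per-seed success rates. With only a handful of seeds the interval is typically wide, which is the point the methodology papers above make.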

Robotics Foundation Models

Generalist policies and vision-language-action (VLA) models for robotic control.

  • Octo - Open-source generalist robot policy trained on Open X-Embodiment with cross-embodiment fine-tuning.
  • OpenVLA - Open-source 7B-parameter vision-language-action model built on Prismatic VLMs.
  • RT-2 - Vision-language-action model that transfers web knowledge to robotic control.
  • RT-X - Cross-embodiment models demonstrating positive transfer across robot platforms.
  • Gemini Robotics - Google DeepMind VLA family with embodied reasoning capabilities.
  • GR00T N1 (NVIDIA) - Open humanoid foundation model with a dual-system slow/fast architecture.
  • Helix (Figure) - Vision-language-action model targeting generalist humanoid control.
  • RT-1 - Robotics Transformer for large-scale real-robot manipulation with language-conditioned control.
  • PaLM-E - Embodied multimodal language model integrating visual and robot-state observations for action.
  • SayCan - Language-model-guided skill selection framework for grounded robot task execution.
  • Code as Policies - Program-synthesis approach that compiles language instructions into executable robot policies.
  • VIMA - Promptable transformer for multimodal robot manipulation via in-context generalization.
  • Gato - Generalist policy architecture spanning embodied control and non-robotic tasks with tokenized actions.
  • RoboFlamingo - Open vision-language-action model for low-cost adaptation to robot manipulation tasks.

World Models

Generative and predictive models of physical dynamics used for planning, simulation, and pretraining.

  • V-JEPA 2 (Meta FAIR) - Self-supervised video world model trained on 1M+ hours enabling zero-shot robot planning.
  • NVIDIA Cosmos - World foundation models for physically grounded synthetic data generation.
  • Genie 2 (DeepMind) - Foundation world model that generates interactive, controllable 3D environments.
  • DreamerV3 - General world-model algorithm achieving strong results across 150+ tasks with fixed hyperparameters.
  • DayDreamer - World models applied to physical robot learning for sample-efficient skill acquisition.
  • UniSim - Universal simulator learning real-world interactions from diverse video data.
  • I-JEPA - Image joint-embedding predictive architecture; foundational to the JEPA world-model line.
  • World Models (Ha & Schmidhuber) - Foundational latent-dynamics framework for planning and control in learned simulators.
  • PlaNet - Latent-space planning with learned dynamics, a core precursor to Dreamer-style methods.
  • Dream to Control - Demonstrates control directly in latent imagination without pixel-space rollouts.
  • SimPLe - Model-based RL baseline showing strong sample efficiency from learned video prediction.
  • MuZero - Learned model-based planning architecture with strong performance across control domains.
  • DreamerV2 - Robust latent world-model RL variant with improved discrete latent representations.
  • GAIA-1 (Wayve) - Driving-oriented generative world model for physically plausible scenario synthesis.
  • TD-MPC2 - Modern latent model-predictive control method with broad robot-control transfer.
  • Robotic World Model (ETH RSL) - Learned world model for legged robots from ETH Zurich's Robotic Systems Lab, with a companion lite variant for lighter-weight experimentation.

Manipulation

Methods, models, and tools for grasping, dexterous manipulation, and contact-rich tasks.

  • Diffusion Policy - Visuomotor policy learning via action diffusion; widely used baseline for imitation.

  • ACT (Action Chunking Transformers) - Transformer policy for bimanual fine manipulation from demonstrations.

  • Mobile ALOHA - Bimanual mobile manipulation system with low-cost teleoperation hardware.

  • ALOHA Unleashed - Recipe for scaling robot dexterity via large-scale imitation learning.

  • Dex-Net - Datasets and models for analytic and learned robust grasping.

  • Contact-GraspNet - Grasp pose generation directly from partial point clouds.

  • RoboCasa - Large-scale household-scene simulation suite for training generalist manipulation policies.

  • MIT 6.4210: Robotic Manipulation - Russ Tedrake's reference text covering perception, planning, and control for manipulation.

  • PerAct - 3D voxel-action transformer for language-conditioned, long-horizon manipulation.

  • CLIPort - Language-conditioned manipulation with CLIP-based perception and transport-based action heads.

  • Transporter Networks - Keypoint-based pick-and-place architecture for data-efficient tabletop manipulation.

  • GraspNet-1Billion - Large-scale benchmark and dataset for robust 6-DoF grasp planning.

  • AnyGrasp - Efficient 6-DoF grasp generation framework for real-time deployment.

  • 3D Diffusion Policy (DP3) - Point-cloud-conditioned diffusion policy that improves data efficiency and robustness over image-based variants.

  • robosuite - Modular simulation framework for manipulation research with reproducible task environments.

  • DreamZero - Zero-shot dexterous manipulation with diffusion models and vision-language-action policies.

Locomotion

Legged, bipedal, and humanoid locomotion: controllers, learning approaches, and reference platforms.

Sim-to-Real

Techniques and case studies for transferring policies trained in simulation to real hardware.

Safety & Robustness

Tools, benchmarks, and methodology for safe exploration, robustness testing, and failure-mode analysis.

Governance & Policy

Standards, frameworks, and policy guidance relevant to deploying Physical AI systems.

  • NIST AI Risk Management Framework - Voluntary framework for managing risks across the AI lifecycle, applicable to robotics.
  • EU AI Act - Regulation establishing risk-tiered obligations for AI systems sold or operated in the EU.
  • ISO 10218 / ISO/TS 15066 - Industrial robot and collaborative-robot safety standards underpinning workplace deployment.
  • IEEE 7000 Series - Standards on ethical and value-based design for autonomous and intelligent systems.
  • OECD AI Principles - Intergovernmental principles guiding trustworthy AI deployment, including embodied systems.
  • UK AI Safety Institute - Government body publishing evaluations and guidance on frontier AI risks.
  • White House Executive Order on AI (14110) - US federal directive on safe and trustworthy AI development relevant to robotics deployers.
  • ISO/IEC 42001 - AI management-system standard for governance, controls, and continuous improvement.
  • NIST AI RMF Generative AI Profile - Practical profile extending AI RMF controls to generative-model deployments.
  • EU Machinery Regulation (EU 2023/1230) - Core legal framework governing safety requirements for machinery and many robotic systems in the EU.
  • UNECE R155 - Cybersecurity requirements for connected and automated road vehicles.
  • UNECE R156 - Software update and update-management requirements for vehicles.
  • ISO 13482 - Safety standard for personal care robots operating near people.
  • UL 4600 - Safety case standard for autonomous products and systems.
  • ISO 26262 - Functional safety standard for electrical and software systems in road vehicles.

Production Patterns / Reference Architectures

Middleware, runtime stacks, and reference patterns for shipping robots in production.

  • ROS 2 - De facto middleware for production robot software, with QoS, security, and real-time profiles.
  • NVIDIA Isaac ROS - GPU-accelerated ROS 2 perception and navigation packages for production robots.
  • MoveIt 2 - Production-grade motion planning framework integrated with ROS 2.
  • Nav2 - ROS 2 navigation stack with behaviour trees, planners, and recovery patterns.
  • Open-RMF - Open-source framework for multi-robot, multi-vendor fleet orchestration.
  • DDS Security (OMG Spec) - Reference standard for authenticated, encrypted robot data buses (used by ROS 2).
  • Foxglove - Observability and visualisation stack for robotics telemetry, logs, and replay.
  • micro-ROS - ROS 2 client stack for microcontrollers and embedded real-time robot components.
  • ros2_control - Standardized hardware abstraction and controller framework for production robot actuation.
  • BehaviorTree.CPP - Widely adopted behavior-tree runtime for deterministic task orchestration.
  • Eclipse Cyclone DDS - High-performance DDS implementation commonly deployed in ROS 2 production stacks.
  • Fast DDS - Industrial-grade DDS middleware with configurable QoS and security support.
  • MCAP - Modern log container format for robotics telemetry, replay, and long-term data retention.
  • Zenoh - Data-centric middleware for distributed robotics over constrained and heterogeneous networks.
  • rosbag2 - Standard ROS 2 recording and replay pipeline for debugging and incident analysis.
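
Behaviour trees appear in several entries above (Nav2, BehaviorTree.CPP). The sketch below shows the core pattern, a Sequence that runs steps in order and a Fallback that tries a recovery when the main branch fails, in plain Python. It is a simplified, memoryless illustration: the node classes, the `blackboard` dictionary, and the action names are hypothetical and do not correspond to Nav2 plugins or the BehaviorTree.CPP API.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class Action:
    """Leaf node wrapping a function that reads/writes the shared blackboard."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, blackboard):
        return self.fn(blackboard)


class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS."""

    def __init__(self, children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first child that is not FAILURE."""

    def __init__(self, children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


# Hypothetical actions for a "navigate, else recover" pattern.
def compute_path(bb):
    bb["log"].append("compute_path")
    return Status.FAILURE if bb.get("path_blocked") else Status.SUCCESS


def follow_path(bb):
    bb["log"].append("follow_path")
    return Status.SUCCESS


def clear_costmap(bb):
    bb["log"].append("clear_costmap")
    bb["path_blocked"] = False  # recovery clears the obstruction flag
    return Status.SUCCESS


navigate = Fallback([
    Sequence([Action("compute_path", compute_path), Action("follow_path", follow_path)]),
    Action("clear_costmap", clear_costmap),
])
```

Ticking `navigate` with `{"path_blocked": True, "log": []}` runs the recovery branch on the first tick and the normal branch on the next. Production runtimes add node memory, asynchronous `RUNNING` actions, and declarative tree descriptions on top of this basic control flow.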

Courses

University courses and structured learning programs in robot learning and embodied AI.

Companies

Organisations advancing Physical AI through foundation models, humanoids, and applied robotics.

  • Physical Intelligence (π) - Foundation models for general-purpose robots; developer of π0.

  • Figure - Humanoid robotics company building general-purpose bipedal platforms with VLA-driven control.

  • 1X Technologies - Humanoid robots designed for safe human interaction and home environments.

  • Boston Dynamics - Pioneers of dynamic legged and humanoid platforms (Spot, Atlas) with active research arm.

  • Agility Robotics - Maker of Digit, a bipedal logistics robot deployed in commercial warehouses.

  • Apptronik - Humanoid robotics company building Apollo for industrial applications.

  • Skild AI - Building scalable, generalist robot intelligence trained across diverse embodiments.

  • Covariant - Foundation-model-driven AI for robotic picking and warehouse manipulation.

  • Wayve - Embodied AI for end-to-end autonomous driving using world models and VLAs.

  • Pollen Robotics (Hugging Face) - Open-source humanoid robotics; maker of Reachy 2 and Reachy Mini.

  • Sanctuary AI - Developing general-purpose humanoid robots for structured workplace environments.

  • Unitree Robotics - Commercial legged-robot and humanoid platforms with broad developer adoption.

  • Intrinsic - Alphabet-backed software platform focused on scalable industrial robotics development.

  • Dexterity - AI robotics company deploying high-throughput manipulation systems for logistics.

  • Tesla Optimus - Humanoid robotics program targeting general-purpose physical task automation.

  • Sunday Robotics - Building generalist robots and foundation models for physical intelligence.


Appendices

The sections below complement the canonical taxonomy with related learning resources, hardware, community pointers, and practitioner recommendations.

Books

Foundational and advanced textbooks.

Tutorials & Guides

Hands-on learning resources.

Key Papers

Influential research papers in Physical AI.

Survey Papers

Comprehensive overviews of key areas.

Hardware Platforms

Physical robots for research and development.

Arms & Manipulators

  • Franka Emika - Research-grade 7-DOF arm with torque sensing and compliant control.

  • Universal Robots - Collaborative robot arms widely used in research and industry.

  • Kinova - Lightweight assistive and research robot arms.

  • xArm - Affordable 6/7-DOF robot arms for research and development.

  • Kuka iiwa - Industrial arm designed for human collaboration.

  • UMI Gripper - Open-source underactuated manipulator for dexterous grasping and manipulation.

  • Dex-UMI - Dexterous manipulation platform and dataset for multi-fingered robotic hands.

  • DexUMI Code & Data - Code and datasets for the Dex-UMI dexterous manipulation benchmark.

Humanoids

Mobile Robots

Low-Cost & DIY

Conferences

Major venues for robotics and AI research.

  • CoRL - Conference on Robot Learning.
  • RSS - Robotics: Science and Systems.
  • ICRA - IEEE International Conference on Robotics and Automation.
  • IROS - IEEE/RSJ International Conference on Intelligent Robots and Systems.
  • NeurIPS - Conference on Neural Information Processing Systems.
  • ICML - International Conference on Machine Learning.
  • ICLR - International Conference on Learning Representations.
  • HRI - ACM/IEEE International Conference on Human-Robot Interaction.
  • Humanoids - IEEE-RAS International Conference on Humanoid Robots.

Community

Forums, discussions, and meetups.

Newsletters & Blogs

Stay updated with the latest developments.

Deep Dives & Analysis

  • Chipstrat - Austin Lyons. Semiconductors, AI, and robotics strategy. Excellent "Robots That See" series on computer vision for robotics.
  • Robots & Startups - Andra Keay (Silicon Valley Robotics). Robot startups and industry trends from the epicenter of the robot revolution.
  • Import AI - Jack Clark. Weekly analysis of AI breakthroughs, policy, and implications for robotics.
  • Interconnects - Nathan Lambert. Technical deep dives on AI from an actual model trainer.
  • Ahead of AI - Sebastian Raschka. Research-focused ML/AI coverage.
  • The Batch - Andrew Ng's weekly AI newsletter with educational focus.

Industry News

Research & Company Blogs

People to Follow

Researchers, engineers, and practitioners shaping Physical AI.

Research Leaders

  • Yann LeCun - Chief AI Scientist at Meta. V-JEPA, world models, and self-supervised learning.
  • Fei-Fei Li - Stanford HAI, World Labs. Computer vision and spatial intelligence pioneer.
  • Pieter Abbeel - Berkeley BAIR, Covariant. Robot learning and RL.
  • Sergey Levine - Berkeley. Reinforcement learning and robot learning.
  • Chelsea Finn - Stanford. Meta-learning and robot learning.
  • Russ Tedrake - MIT, Toyota Research Institute. Manipulation and control.
  • Dieter Fox - NVIDIA, UW. Perception and robot learning.

Robotics & Hardware

  • Kate Darling - MIT Media Lab, BD AI Institute. Robotics ethics and human-robot interaction.
  • Rodney Brooks - iRobot co-founder, Robust.AI. Robotics pioneer and blogger.
  • Angelica Lim - SFU. Social robotics and emotional AI.
  • Andra Keay - Silicon Valley Robotics. Robot ecosystem and startups.

Industry Leaders

Related Awesome Lists

Other curated lists covering adjacent topics.


Contributing

We love Contributors

Thrilled to have you here.
Whether it's a quick typo fix, a fresh resource,
a doc polish, or a sweeping overhaul, every contribution helps this list grow.
Jump in and join the community. PRs of every size are welcome.

📝 Read the contributing guide · 🐛 good first issues
