Skip to content

romizone/optimus

Repository files navigation

Optimus Perceptron

OPTIMUS PERCEPTRON

Humanoid AI Robot Simulation Suite

A next-generation autonomous humanoid robot simulation platform featuring a 7-layer cognitive architecture, multi-environment navigation, competitive padel athletics, and full self-repair diagnostics — all running in pure HTML5 Canvas with zero dependencies.


Live Demo Academic Paper MIT License

JavaScript HTML5 Canvas CSS3 Vercel PRs Welcome Stars


Launch Demo · Read Paper · Author Website · Report Bug · Request Feature



9
Integrated Tabs
7
Cognitive Layers
0
Dependencies
60
FPS Target
6,393
Lines of Code
<384 KB
Total Size

Table of Contents


Overview

Optimus Perceptron is a comprehensive, browser-based humanoid robot simulation platform that models every critical aspect of autonomous robot operation — from raw visual perception through high-level task planning to motor execution. The platform integrates six operational modules into a unified control loop running at 60 fps, demonstrating that complex multi-agent robotic cognition can be prototyped, visualized, and studied without specialized hardware.

Zero install. Zero config. Zero dependencies. Just open the HTML file in any modern browser.

At a Glance

Metric Value
Total Lines of Code 6,393
Main Simulation 2,567 lines (212 KB)
Integrated Tabs 9
AI Cognitive Layers 7
External Dependencies 0
Entity Types 5 (Human, Robot, Car, Child, Optimus)
Padel Shot Types 10
Robot Components Monitored 13
Charging Stations 8
Supported Browsers Chrome, Firefox, Safari, Edge

Why Optimus Perceptron?

The Problem

Existing simulation platforms like NVIDIA Isaac Sim, MuJoCo, and Gazebo deliver high-fidelity physics but require:

  • Specialized GPUs and hardware
  • Complex installation and dependencies
  • Significant computational resources
  • Steep learning curves

This creates barriers for rapid prototyping, education, and cross-disciplinary collaboration.

Our Solution

Optimus Perceptron addresses this gap by delivering:

  • Browser-native — runs on any device with a web browser
  • Zero-dependency — no npm, no frameworks, no CDNs
  • Instant start — open the file and the simulation begins
  • Full cognitive stack — 7-layer architecture from perception to motor control
  • Educational transparency — every decision is logged in real-time

Key Features

Perception & Vision System

Feature Description
Field of View Detection Real-time FOV-based entity detection with configurable range (320 units) and angle (117 degrees)
Progressive Confidence Classification confidence increases with proximity and observation duration using mathematical model
Entity Classification 4-class detection: Human, Robot, Vehicle, Child — each with distinct behavioral response policies
Robot Vision Camera First-person 2.5D perspective projection with bounding boxes and classification labels
LiDAR Point Cloud 128-channel sweep simulation with 280K-300K points/scan, distance-encoded coloring
Multi-Modal Vision RGB View, Depth Map, Semantic Segmentation, and LiDAR Projection

Navigation & Collision Avoidance

Feature Description
Urban Navigation Autonomous traversal through dense city grid (3,200 x 2,400 units) with building avoidance
Park Navigation Gate detection, fence boundary awareness, pond avoidance, organic obstacle distribution
Traffic Compliance Full traffic signal recognition — green (12s), yellow (3s), red (10s) cycle compliance
Collision Avoidance Hierarchical 5-priority system: fences, buildings/trees, water bodies, vehicles, pedestrians
Child Safety Mode Enhanced caution zones — 50% speed reduction, 2.5m buffer around detected children
Heading Smoothing Exponential moving average controller for stable, responsive heading transitions

Energy Management

Feature Description
Battery Model 5.2 kWh Li-ion simulation with discharge rates scaling by locomotion and computation load
8 Charging Stations Distributed across the city with varying speeds (45-150 kW) and availability
Intelligent Selection Weighted scoring algorithm balancing distance (40%), charge speed (35%), availability (25%)
Low Battery Warning Automatic prioritization of nearest high-speed station below 20% charge
Ring Gauge Visualization Animated battery visualization with real-time metrics and charging event log

Task Planning & Scheduling

Feature Description
7-Day Schedule Complete weekly planner with 5-8 tasks per day across multiple categories
5 Task Categories Work, Leisure, Maintenance, Social, Learning — with energy budgeting per category
Auto-Execution Engine Stochastic task completion with energy-aware scheduling and deferral logic
Multi-Day Planning Weekday/weekend activity distribution mirroring real-world service robot schedules

Self-Diagnosis & Repair

Feature Description
13 Components Monitored Head cameras, LiDAR, CPU/NPU, battery, shoulder actuators, hand grippers, hip joints, knee actuators, foot sensors
Degradation Model Stochastic wear proportional to usage intensity per component type
Nano-Repair System Autonomous self-healing at 0.01-0.03% per tick, modeling advanced repair materials
Spare Parts Inventory Stock tracking with unit costs (IDR), supplier info, and replacement event logging
3D Body Map Color-coded robot body visualization with per-component health indicators
Repair History Complete lifecycle log from degradation through repair to replacement

Competitive Padel AI (2v2)

Feature Description
Full Doubles Match 2v2 on official 20m x 10m enclosed court with glass and wire fence walls
10 Shot Types Forehand Drive, Backhand Slice, Overhead Smash, Bandeja, Vibora, Chiquita, Net Volley, Wall Rebound, Defensive Lob, Bajada
Realistic Physics Gravity (9.8 m/s^2), floor/wall/glass bounces with coefficients of restitution, air resistance
Dynamic Roles Real-time net player / back player role switching based on ball position
Official Scoring Points (0/15/30/40/deuce), games, sets following official padel rules
4 Unique Players OPTIMUS (4.2 m/s), NEXUS-4 (4.0 m/s), ATLAS-X9 (3.8 m/s), VOLT-12 (3.6 m/s)

9 Integrated Tabs

All nine modules operate within a single unified interface:

# Tab Icon Description
1 City Navigation 🏙️ Dense urban simulation with buildings, multi-lane traffic, pedestrians, traffic signals, and autonomous sidewalk navigation
2 Park Walking 🌳 Cinematic natural environment with rich flora (roses, jasmine, orchids, lily pads), pond, fireflies, fence gates, and child safety protocols
3 Tech Config ⚙️ Real-time sensor tuning panel — FOV range, angle, speed, detection thresholds, and 7-layer architecture visualization with robot body + vision panels
4 Battery Health 🔋 Animated ring gauge, city-wide charging station map with distance overlays, per-station details, and charging event history
5 Todo List 📋 7-day weekly task planner with 5 categories, energy budgeting, stochastic auto-completion engine
6 Damage & Repair 🔧 13-component health dashboard with 3D body map, nano-repair system, spare parts inventory, and repair history log
7 Padel AI (2v2) 🎾 Full doubles padel match — 4 robot players, 10 shot types, physically accurate ball dynamics, official scoring system
8 Academic Paper 📄 Complete 12-section research paper with system architecture diagrams, 9 data tables, mathematical models, and 15 academic references
9 AI Architecture 🧠 Interactive 7-layer cognitive stack visualization with full robot body SVG, multi-modal vision panels, and 3D LiDAR point cloud rendering

7-Layer Cognitive Architecture

The simulation implements a complete cognitive processing pipeline inspired by modern autonomous robot systems, with each layer operating at its characteristic frequency:

  ┌─────────────────────────────────────────────────────────────────┐
  │  Layer 7  │  Episodic Memory       │  0.1 Hz  │  Experience    │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 6  │  RL Policy Engine      │  50 Hz   │  PPO + SAC     │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 5  │  Task Planner          │  2 Hz    │  LLaMA-3 8B    │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 4  │  World Model           │  10 Hz   │  Dreamer-v3    │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 3  │  Sensor Fusion         │  100 Hz  │  EKF 12-state  │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 2  │  Object Detection      │  30 Hz   │  DETR          │
  ├───────────┼────────────────────────┼──────────┼────────────────┤
  │  Layer 1  │  Motor Control         │  1 kHz   │  PD 28-DOF     │
  └─────────────────────────────────────────────────────────────────┘
Layer Details (click to expand)
Layer Model Function Frequency Key Metrics
L1 — Visual Perception ViT-L/14 (304M params) RGB frame encoding to 1024-dim embeddings 30 Hz 97.3% ImageNet accuracy
L2 — Object Detection DETR (ResNet-50) Bounding box detection, 80-class COCO 30 Hz 44.9 AP, 28 fps
L3 — Sensor Fusion Extended Kalman Filter Multi-modal fusion (camera + LiDAR + IMU) 100 Hz 0.02m position error
L4 — World Model Dreamer-v3 + Voxel Grid Latent dynamics prediction, 15-step horizon 10 Hz 87.4% prediction accuracy
L5 — Task Planner LLaMA-3 8B + PDDL Natural language + symbolic task planning 2 Hz 340 tokens/sec
L6 — RL Policy PPO + SAC Hybrid Continuous action control, 24-dim action space 50 Hz 94.2% success rate
L7 — Motor Control PD Controller (28 DOF) Joint torque computation, 46-DOF kinematics 1 kHz 0.3 deg accuracy

Simulation Environments

City Environment

Dense urban simulation with procedurally generated infrastructure

Parameter Value
World Size 3,200 x 2,400 units
Buildings 15-25 procedurally generated structures
Initial Entities 58 total — 25 humans, 10 children, 8 robots, 15 vehicles
Road System Multi-lane with bidirectional traffic
Traffic Signals Green (12s) / Yellow (3s) / Red (10s) cycles
Navigation Sidewalk-only with pedestrian crossing compliance
Vehicle Spawning Stochastic, 4-10 second intervals

Park Environment

Cinematic natural landscape with rich flora and ambient effects

Parameter Value
World Size 2,800 x 2,000 units
Trees 38 (round and pine types) with shadow rendering
Flowers 80 patches (roses, jasmine, orchids)
Water Elliptical pond with lily pads and reflections
Ambient Effects Fireflies with glow particles
Fence System Perimeter fence with 4 navigable gates (N/S/E/W)
Initial Entities 17 with enhanced child safety protocols

Environment Comparison

Feature City Park
Obstacle Types Buildings, roads, traffic signals Trees, fences, gates, pond
Vehicle Traffic Road lanes Perimeter roads
Traffic Signals Yes No
Fence/Gate System No Yes (4 gates)
Child Safety Mode Standard Enhanced
Visual Style Urban neon Cinematic nature

Padel AI Doubles System

The padel module delivers a complete competitive doubles match simulation on a regulation court:

Team Rosters

Team Color Player 1 Player 2 Strategy
Team Alpha Blue OPTIMUS (4.2 m/s) NEXUS-4 (4.0 m/s) Aggressive net play + baseline coverage
Team Beta Red ATLAS-X9 (3.8 m/s) VOLT-12 (3.6 m/s) Counter-attack + wall play specialization

AI Vision Stack

Module Model Function Performance
Ball Tracker YOLOv9-Padel + Kalman + LSTM-256 Real-time detection + 800ms trajectory prediction 97.8% accuracy, 4.2ms latency
Pose Estimator MediaPipe Pose + Transformer Opponent swing prediction, shot classification 33 keypoints, 94.2% prediction
Strategy Engine PadelGPT (LLaMA-3 8B fine-tuned) Real-time match strategy, opponent adaptation 78.4% win rate, 3-rally adaptation
Swing Controller Imitation Learning + RL (28-DOF) Racket angle, spin, power, timing control 96.3% accuracy, 3200 RPM spin

Ball Physics

Parameter Value
Gravity 9.8 m/s^2
Floor bounce (CoR) 0.65
Side wall bounce (CoR) 0.80 (wire fence)
Back wall bounce (CoR) 0.75 (glass wall)
Air resistance 0.998 damping/tick

Complete Shot Repertoire (10 Types)

Shot Speed Spin Power Accuracy Tactical Purpose
Forehand Drive 95 80 90 88 Aggressive baseline push
Backhand Slice 75 90 65 92 Tempo variation, low bounce
Overhead Smash 100 40 100 78 Maximum power, 50ms timing
Bandeja 60 85 50 95 Controlled overhead cut
Vibora 80 95 70 82 Side-spin wall bounce
Chiquita 40 70 30 96 Soft lob forcing retreat
Net Volley 85 50 75 90 Reflex intercept at net
Wall Rebound 70 60 55 93 Glass wall return
Defensive Lob 50 45 40 97 Recovery under pressure
Bajada (Off-Glass) 88 75 85 74 Advanced back-wall attack

Technical Specifications

Specification Detail
Language Vanilla JavaScript (ES6+)
Rendering Engine HTML5 Canvas 2D Context
Architecture Single-file, zero external dependencies
Frame Rate 60 fps (requestAnimationFrame)
Delta Time Clamped at 50ms for physics stability
UI Updates Throttled at 1.4 Hz (DOM performance optimization)
Typography Inter (UI) + JetBrains Mono (technical/HUD)
Color System CSS custom properties with 12 semantic tokens
Layout CSS Grid + Flexbox with full mobile responsive design
Hosting Vercel (static deployment)
Build Tools None required
Package Manager None required
Frameworks None — pure vanilla implementation

Codebase Breakdown

File Purpose Lines Size
optimus-city-simulation.html Main simulation — 9 tabs 2,567 212 KB
optimus-perceptron-simulation.html AI architecture visualization 2,237 96 KB
optimus-park-simulation.html Standalone park environment 1,537 72 KB
index.html Landing page 52 3 KB
Total 6,393 384 KB

Getting Started

Option 1 — Live Demo (Recommended)

No installation required. Click and explore.

https://optimus-azure.vercel.app/optimus-city-simulation.html

Option 2 — Run Locally

# Clone the repository
git clone https://github.com/romizone/optimus.git

# Navigate to the project
cd optimus

# Open directly in browser — no build step needed
open optimus-city-simulation.html        # macOS
xdg-open optimus-city-simulation.html    # Linux
start optimus-city-simulation.html       # Windows

That's it. No npm install. No build process. No configuration files. Just open the HTML.

Option 3 — Local Server (for iframe features)

# Python
python3 -m http.server 8080

# Node.js
npx serve .

# Then visit
# http://localhost:8080/optimus-city-simulation.html

Option 4 — Deploy Your Own Instance

Deploy with Vercel


Project Structure

optimus/
├── index.html                          # Landing page with navigation cards
├── optimus-city-simulation.html        # Main simulation suite (9 integrated tabs)
│   ├── Tab 1: City Navigation          #   Urban autonomous driving
│   ├── Tab 2: Park Walking             #   Natural environment (iframe)
│   ├── Tab 3: Tech Config              #   Sensor tuning + architecture viz
│   ├── Tab 4: Battery Health           #   Energy management dashboard
│   ├── Tab 5: Todo List                #   7-day task planner
│   ├── Tab 6: Damage & Repair          #   13-component health system
│   ├── Tab 7: Padel AI (2v2)           #   Doubles padel match
│   ├── Tab 8: Academic Paper           #   12-section research paper
│   └── Tab 9: AI Architecture          #   Cognitive stack viz (iframe)
├── optimus-park-simulation.html        # Standalone park environment
├── optimus-perceptron-simulation.html  # AI architecture 4-column layout
├── README.md                           # This file
└── .gitignore                          # Build artifacts exclusion

Performance Benchmarks

Metric Value Condition
Target Frame Rate 60 fps All 9 modules active
Canvas Render (City) < 8 ms 25 humans, 8 robots, 10 children, 15 cars
Collision Check < 0.5 ms Per entity, hierarchical 5-priority system
FOV Computation < 0.3 ms Angular + distance filtering
DOM UI Update < 2 ms Throttled to 1.4 Hz
Memory Usage < 50 MB Chrome, steady state after 5 minutes
First Paint < 100 ms No external resources to load
Total Transfer < 220 KB Single HTML file, gzip compressed

Academic Paper

A comprehensive 12-section research paper is embedded as Tab 8 in the simulation and available as a standalone page.

Read the Full Paper

Paper Contents

# Section Highlights
1 Introduction Problem statement, contributions, accessibility-first philosophy
2 System Architecture 7-layer cognitive pipeline, blackboard data structure, layer interaction model
3 Perception System ViT-L/14, DETR, LiDAR 128ch, multi-modal vision (RGB/Depth/Semantic/LiDAR)
4 Navigation & Collision Avoidance Urban + park navigation, hierarchical collision checking, gate detection
5 Energy Lifecycle Management 5.2 kWh battery model, 8-station network, weighted scoring function
6 Task Planning & Scheduling 7-day planner, 5 categories, stochastic completion engine
7 Self-Diagnosis & Repair 13 components, degradation model, nano-repair, spare parts inventory
8 Padel Athletics System Court physics, doubles AI, 10 shot types, YOLOv9 tracking, PadelGPT strategy
9 Simulation Engine 60 fps game loop, camera system, first-person rendering, delta-time clamping
10 Multi-Environment Design City vs park comparison, entity distributions, obstacle types
11 Discussion Accessibility vs fidelity, educational value, modular extensibility, limitations
12 Conclusion Summary of contributions, future directions

Includes 9 data tables, mathematical models, and 15 academic references.


Roadmap

  • City simulation with autonomous navigation
  • Park simulation with cinematic flora rendering
  • 7-layer cognitive architecture visualization
  • Battery management with 8 charging stations
  • 7-day task planner with auto-execution
  • 13-component damage & repair system
  • Padel AI doubles (2v2) with full physics
  • Academic paper (12 sections)
  • Full mobile responsive design
  • Multiplayer mode — control Optimus across devices
  • 3D WebGL rendering upgrade
  • Voice command interface
  • Real-time collaborative simulation
  • Additional sport modules (basketball, soccer)
  • Sim-to-real ROS2 bridge

Contributing

Contributions are welcome! Whether it's a bug fix, new feature, or documentation improvement — every contribution helps.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Since the project has zero dependencies, contributing is as simple as editing HTML files.


Author

Romi Nur Ismanto

Independent Researcher — Robotics & Artificial Intelligence


Website Twitter GitHub Hugging Face Email


License

This project is licensed under the MIT License — see the LICENSE file for details.

MIT License

Copyright (c) 2026 Romi Nur Ismanto

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

Acknowledgments

This project builds upon the research and ideas from the broader robotics and AI community:

  • Vision Transformers — Dosovitskiy et al. (ViT, 2021)
  • DETR — Carion et al. (End-to-End Object Detection, 2020)
  • Dreamer-v3 — Hafner et al. (World Models, 2023)
  • PPO — Schulman et al. (Proximal Policy Optimization, 2017)
  • SAC — Haarnoja et al. (Soft Actor-Critic, 2018)
  • YOLOv9 — Wang et al. (Programmable Gradient Information, 2024)
  • LLaMA — Touvron et al. (Open Foundation Models, 2023)
  • MediaPipe — Lugaresi et al. (Perception Pipelines, 2019)
  • Tesla Optimus — Gen-2 Humanoid Robot specifications
  • International Padel Federation — Official Rules of Padel


Built with pure HTML5 Canvas + JavaScript

Zero frameworks. Zero dependencies. Zero excuses.


If you find this project useful or interesting, please consider giving it a star!

Star this repo


Made with dedication by Romi Nur Ismanto — Powered by Optimus Prime AI

About

Optimus Perceptron — Humanoid AI Robot Simulation Suite | 7-Layer Cognitive Architecture | 9 Integrated Tabs | Padel AI Doubles 2v2 | Zero Dependencies | Pure HTML5 Canvas + JavaScript | 60fps Browser Simulation

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages