3 changes: 3 additions & 0 deletions dimos/robot/all_blueprints.py
@@ -48,6 +48,8 @@
"demo-skill": "dimos.agents.skills.demo_skill:demo_skill",
"drone-agentic": "dimos.robot.drone.blueprints.agentic.drone_agentic:drone_agentic",
"drone-basic": "dimos.robot.drone.blueprints.basic.drone_basic:drone_basic",
"drone-tello-tt-agentic": "dimos.robot.drone.blueprints.agentic.drone_tello_tt_agentic:drone_tello_tt_agentic",
"drone-tello-tt-basic": "dimos.robot.drone.blueprints.basic.drone_tello_tt_basic:drone_tello_tt_basic",
"dual-xarm6-planner": "dimos.manipulation.blueprints:dual_xarm6_planner",
"keyboard-teleop-piper": "dimos.robot.manipulators.piper.blueprints:keyboard_teleop_piper",
"keyboard-teleop-xarm6": "dimos.robot.manipulators.xarm.blueprints:keyboard_teleop_xarm6",
@@ -161,6 +163,7 @@
"simple-phone-teleop": "dimos.teleop.phone.phone_extensions",
"spatial-memory": "dimos.perception.spatial_perception",
"speak-skill": "dimos.agents.skills.speak_skill",
"tello-connection-module": "dimos.robot.drone.tello_connection_module",
"temporal-memory": "dimos.perception.experimental.temporal_memory.temporal_memory",
"twist-teleop-module": "dimos.teleop.quest.quest_extensions",
"unitree-g1-skill-container": "dimos.robot.unitree.g1.skill_container",
65 changes: 65 additions & 0 deletions dimos/robot/drone/README.md
@@ -19,10 +19,34 @@ dimos run drone-basic --set outdoor=true

# Agentic with LLM control
dimos run drone-agentic

# RoboMaster TT / Tello (Wi-Fi SDK)
dimos run drone-tello-tt-basic --robot-ip 192.168.10.1
dimos run drone-tello-tt-agentic --robot-ip 192.168.10.1 --disable web-input
```

To interact with the agent, run `dimos humancli` in a separate terminal.

## RoboMaster TT / Tello Hardware + Network Setup

Tello control traffic and cloud API traffic should be split across two interfaces.

1. Power on the drone and connect to `TELLO-XXXXXX` from one network interface.
2. Keep internet on a separate interface:
   - wired Ethernet is recommended, or
   - a second Wi-Fi adapter connected to normal internet.
3. Confirm network state:
   - Tello link: `ip a` should show a `192.168.10.x` address on the Tello interface.
   - Tello subnet route: `ip route` should show `192.168.10.0/24` on that same interface.
   - Default route: should go through the internet interface, not via `192.168.10.1`.
4. Verify both paths:
   - `ping -c 1 192.168.10.1`
   - `curl -I https://api.openai.com`
5. Run the Tello blueprint:
   - `dimos --robot-ip 192.168.10.1 run drone-tello-tt-agentic --disable web-input`

If you use only one Wi-Fi radio and connect it to Tello, cloud model calls usually fail due to loss of internet routing.
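
The default-route check in step 3 can also be scripted. The following is a minimal sketch, not part of DimOS: it assumes Linux `ip route` text output, and the names `DefaultRoute`, `preferred_default_route`, and `internet_route_ok` are hypothetical helpers introduced for illustration. It picks the default route with the lowest metric and reports whether that route avoids the Tello gateway.

```python
from typing import NamedTuple, Optional


class DefaultRoute(NamedTuple):
    gateway: str
    device: str
    metric: int


def _value_after(fields: list[str], key: str) -> Optional[str]:
    """Return the token following `key`, or None if it is absent."""
    if key in fields:
        index = fields.index(key)
        if index + 1 < len(fields):
            return fields[index + 1]
    return None


def parse_default_routes(route_output: str) -> list[DefaultRoute]:
    """Collect every `default ...` line from `ip route` style output."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        routes.append(
            DefaultRoute(
                gateway=_value_after(fields, "via") or "",
                device=_value_after(fields, "dev") or "",
                # A missing metric is treated as 0 (highest priority).
                metric=int(_value_after(fields, "metric") or 0),
            )
        )
    return routes


def preferred_default_route(route_output: str) -> Optional[DefaultRoute]:
    """Return the default route with the lowest metric, if any."""
    routes = parse_default_routes(route_output)
    return min(routes, key=lambda route: route.metric) if routes else None


def internet_route_ok(route_output: str, tello_gateway: str = "192.168.10.1") -> bool:
    """True when a default route exists and the preferred one is not the Tello."""
    best = preferred_default_route(route_output)
    return best is not None and best.gateway != tello_gateway
```

To use it on a live machine, pass in the text printed by `ip route show`, for example captured with `subprocess.run(["ip", "route", "show"], capture_output=True, text=True).stdout`.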

## Blueprints

### `drone-basic`
@@ -48,6 +72,25 @@ Composes on top of `drone-basic`, adding autonomous capabilities:
| `McpServer` + `McpClient` | LLM agent (default: GPT-4o) via MCP |
| `WebInput` | Web/CLI interface for human commands |

### `drone-tello-tt-basic`
RoboMaster TT/Tello SDK control over Wi-Fi UDP.

| Module | Purpose |
|--------|---------|
| `TelloConnectionModule` | Tello command/state/video bridge, skills, and follow commands |
| `DroneCameraModule` | Camera intrinsics + camera streams |
| `WebsocketVisModule` | Web-based visualization |
| `RerunBridgeModule` / `FoxgloveBridge` | 3D viewer (selected by `--viewer`) |
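
As an illustration of the state half of that bridge: the Tello SDK reports telemetry as semicolon-separated `key:value` text in UDP datagrams (this blueprint listens on state port 8890). The sketch below shows one way such a datagram could be parsed. `parse_tello_state` is a hypothetical helper, not the actual `TelloConnectionModule` or `tello_sdk.py` code.

```python
def parse_tello_state(packet: bytes) -> dict[str, float]:
    """Parse a Tello state datagram such as b'pitch:0;roll:1;bat:87;\\r\\n'.

    Fields whose value is not a single number (for example comma-separated
    mission-pad fields) are skipped rather than raising.
    """
    state: dict[str, float] = {}
    text = packet.decode("ascii", errors="ignore").strip()
    for item in text.split(";"):
        key, separator, value = item.partition(":")
        if not separator:
            continue  # empty trailing item or malformed field
        try:
            state[key.strip()] = float(value)
        except ValueError:
            continue  # non-numeric value, ignore it
    return state
```

A caller would typically bind a UDP socket to the state port, call `recvfrom`, and pass the received bytes to this function.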

### `drone-tello-tt-agentic`
Composes on top of `drone-tello-tt-basic`, adding:

| Module | Purpose |
|--------|---------|
| `DroneTrackingModule` | Person detection, follow, yaw-centering, tracking overlay |
| `Agent` | LLM agent (default: GPT-4o) |
| `WebInput` | Web/CLI interface for human commands |

## Installation

### Python (included with DimOS)
@@ -73,6 +116,17 @@ export OPENAI_API_KEY=sk-...

# Optional
export GOOGLE_MAPS_API_KEY=... # For GoogleMapsSkillContainer
export ALIBABA_API_KEY=... # Optional Qwen detection for tracking
```

### Optional Tello Tracking Tuning
These variables are optional and only affect local YOLO person detection in `DroneTrackingModule`:

```bash
export DIMOS_DRONE_YOLO_DEVICE=cuda:0 # or cpu
export DIMOS_DRONE_YOLO_MODEL=yolo11s-pose.pt
export DIMOS_DRONE_YOLO_IMGSZ=416
export DIMOS_DRONE_YOLO_MAX_DET=5
```
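
For readers wondering how such variables are usually consumed, here is a minimal sketch of reading them with fallbacks. It is not the `DroneTrackingModule` implementation: `YoloSettings` and `load_yolo_settings` are hypothetical names, and the fallback values simply mirror the example values above rather than the module's real defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class YoloSettings:
    device: str
    model: str
    imgsz: int
    max_det: int


def load_yolo_settings(env: Optional[Mapping[str, str]] = None) -> YoloSettings:
    """Read the DIMOS_DRONE_YOLO_* variables, using assumed fallbacks."""
    env = os.environ if env is None else env
    return YoloSettings(
        # Fallbacks below are assumptions for illustration only.
        device=env.get("DIMOS_DRONE_YOLO_DEVICE", "cpu"),
        model=env.get("DIMOS_DRONE_YOLO_MODEL", "yolo11s-pose.pt"),
        imgsz=int(env.get("DIMOS_DRONE_YOLO_IMGSZ", "416")),
        max_det=int(env.get("DIMOS_DRONE_YOLO_MAX_DET", "5")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to test without touching the process environment.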

## RosettaDrone Setup (Critical)
@@ -139,12 +193,16 @@ DJI Drone ← Wireless → DJI Controller ← USB → Android Device ← WiFi
dimos/robot/drone/
├── blueprints/
│ ├── basic/drone_basic.py # Base blueprint (connection + camera + vis)
│ ├── basic/drone_tello_tt_basic.py # Tello basic blueprint
│ ├── agentic/drone_agentic.py # Agentic blueprint (composes on basic)
│ └── agentic/drone_tello_tt_agentic.py # Tello agentic blueprint
├── connection_module.py # MAVLink communication & skills
├── camera_module.py # Camera processing & intrinsics
├── drone_tracking_module.py # Visual servoing & object tracking
├── drone_visual_servoing_controller.py # PID-based visual servoing
├── mavlink_connection.py # Low-level MAVLink protocol
├── tello_connection_module.py # Tello module and skills
├── tello_sdk.py # Low-level Tello UDP SDK adapter
└── dji_video_stream.py # GStreamer video capture + replay
```

@@ -198,6 +256,13 @@ Parameters: `(Kp, Ki, Kd, (min_output, max_output), integral_limit, deadband_pix
4. Velocity commands sent via LCM stream
5. Connection module converts to MAVLink commands

### Tello Tracking Flow
1. Tello camera stream is decoded in `TelloSdkClient`.
2. `DroneTrackingModule` receives `/video`, publishes `/tracking_overlay`.
3. In the TT agentic blueprint, person following uses detector-driven tracking that combines yaw-centering with a forward approach.
4. `cmd_vel` is remapped to `/movecmd_twist` and converted to Tello `rc` commands by `TelloConnectionModule`.
5. Agent skills call `follow_object`, `center_person_by_yaw`, and `orbit_object`.
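
Step 4 can be sketched as follows. The Tello SDK `rc a b c d` command takes four integers in the range -100 to 100 (left/right, forward/back, up/down, yaw). The function below is an illustrative sketch, not the `TelloConnectionModule` code: the name `twist_to_rc`, the `max_linear` and `max_yaw` scaling limits, and the sign conventions (twist with +x forward, +y left, +z up, counter-clockwise yaw positive; `rc` with positive `a` meaning right and positive `d` meaning clockwise) are all assumptions.

```python
def _scale(value: float, max_value: float) -> int:
    """Map a value in [-max_value, max_value] to an integer in [-100, 100]."""
    ratio = max(-1.0, min(1.0, value / max_value))
    return int(round(ratio * 100))


def twist_to_rc(
    vx: float,
    vy: float,
    vz: float,
    yaw_rate: float,
    max_linear: float = 1.0,  # assumed m/s that maps to full stick
    max_yaw: float = 1.0,  # assumed rad/s that maps to full stick
) -> str:
    """Convert body-frame velocities into a Tello `rc a b c d` command string."""
    left_right = -_scale(vy, max_linear)  # +y (left) becomes negative `a`
    forward_back = _scale(vx, max_linear)
    up_down = _scale(vz, max_linear)
    yaw = -_scale(yaw_rate, max_yaw)  # counter-clockwise becomes negative `d`
    return f"rc {left_right} {forward_back} {up_down} {yaw}"
```

Clamping before scaling means an out-of-range velocity saturates at full stick instead of producing a value the drone would reject.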

## Available Skills

All skills are exposed to the LLM agent via the `@skill` decorator on `DroneConnectionModule`:
57 changes: 57 additions & 0 deletions dimos/robot/drone/blueprints/agentic/drone_tello_tt_agentic.py
@@ -0,0 +1,57 @@
#!/usr/bin/env python3

# Copyright 2025-2026 Dimensional Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Agentic TT/Tello blueprint with tracking and LLM control."""

from dimos.agents.agent import agent
from dimos.agents.web_human_input import web_input
from dimos.core.blueprints import autoconnect
from dimos.robot.drone.blueprints.basic.drone_tello_tt_basic import drone_tello_tt_basic
from dimos.robot.drone.drone_tracking_module import DroneTrackingModule

TELLO_TT_SYSTEM_PROMPT = """\
You are controlling a RoboMaster TT (Tello Talent) drone over the Tello SDK.
Use short, safe movement commands and keep altitude conservative indoors.
Prefer follow_object, orbit_object, move, move_relative, yaw, takeoff, and land.
For "scan/find person then follow", call follow_object(object_description="person", distance_m=1.0).
For any "hover + rotate/center person in frame" request, ALWAYS call
center_person_by_yaw(duration=..., scan_step_deg=0.0, max_scan_steps=1).
Do NOT call follow_object in default mode for rotate-only requests.
For "circle around person", call orbit_object(radius_m=1.0, revolutions=1.0).
Always end with land() when user asks to land.
Use observe to inspect the environment before aggressive motion.
Report what command you executed and the observed result.
"""

drone_tello_tt_agentic = autoconnect(
    drone_tello_tt_basic,
    DroneTrackingModule.blueprint(
        outdoor=False,
        enable_passive_overlay=True,
        use_local_person_detector=True,
        force_detection_servoing_for_person=True,
        person_follow_policy="yaw_forward_constant",
    ),
    agent(system_prompt=TELLO_TT_SYSTEM_PROMPT, model="gpt-4o"),
    web_input(),
).remappings(
    [
        (DroneTrackingModule, "video_input", "video"),
        (DroneTrackingModule, "cmd_vel", "movecmd_twist"),
    ]
)

__all__ = ["TELLO_TT_SYSTEM_PROMPT", "drone_tello_tt_agentic"]
96 changes: 96 additions & 0 deletions dimos/robot/drone/blueprints/basic/drone_tello_tt_basic.py
@@ -0,0 +1,96 @@
#!/usr/bin/env python3

# Copyright 2025-2026 Dimensional Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Basic RoboMaster TT/Tello blueprint with connection, camera, and visualization."""

from typing import Any

from dimos.core.blueprints import autoconnect
from dimos.core.global_config import global_config
from dimos.protocol.pubsub.impl.lcmpubsub import LCM
from dimos.robot.drone.camera_module import DroneCameraModule
from dimos.robot.drone.tello_connection_module import TelloConnectionModule
from dimos.web.websocket_vis.websocket_vis_module import websocket_vis


def _static_drone_body(rr: Any) -> list[Any]:
    """Static visualization of TT body."""
    return [
        rr.Boxes3D(
            half_sizes=[0.09, 0.09, 0.03],
            colors=[(40, 180, 90)],
        ),
        rr.Transform3D(parent_frame="tf#/base_link"),
    ]


def _drone_rerun_blueprint() -> Any:
    """Split layout: camera feed + 3D world view side by side."""
    import rerun as rr
    import rerun.blueprint as rrb

    return rrb.Blueprint(
        rrb.Horizontal(
            rrb.Spatial2DView(origin="world/video", name="Camera"),
            rrb.Spatial2DView(origin="world/tracking_overlay", name="Tracking Overlay"),
            rrb.Spatial3DView(
                origin="world",
                name="3D",
                background=rrb.Background(kind="SolidColor", color=[0, 0, 0]),
                line_grid=rrb.LineGrid3D(
                    plane=rr.components.Plane3D.XY.with_distance(0.2),
                ),
            ),
            column_shares=[1, 1, 2],
        ),
    )


_rerun_config = {
    "blueprint": _drone_rerun_blueprint,
    "pubsubs": [LCM()],
    "static": {
        "world/tf/base_link": _static_drone_body,
    },
}

if global_config.viewer == "foxglove":
    from dimos.robot.foxglove_bridge import foxglove_bridge

    _vis = foxglove_bridge()
elif global_config.viewer.startswith("rerun"):
    from dimos.visualization.rerun.bridge import _resolve_viewer_mode, rerun_bridge

    _vis = rerun_bridge(viewer_mode=_resolve_viewer_mode(), **_rerun_config)
else:
    _vis = autoconnect()

tello_ip = global_config.robot_ip or "192.168.10.1"

drone_tello_tt_basic = autoconnect(
    _vis,
    TelloConnectionModule.blueprint(
        tello_ip=tello_ip,
        local_ip="",
        local_command_port=9000,
        state_port=8890,
        video_port=11111,
    ),
    DroneCameraModule.blueprint(camera_intrinsics=[920.0, 920.0, 480.0, 360.0]),
    websocket_vis(),
)

__all__ = ["drone_tello_tt_basic"]