39 commits
cf498d6
feat(discovery): add HostInfoProvider for default Component from host…
bburda Mar 29, 2026
5d2bfeb
feat(discovery): map namespaces to Functions instead of Areas/Components
bburda Mar 29, 2026
9e10f1c
feat(discovery): wire HostInfoProvider as default Component in runtim…
bburda Mar 29, 2026
9a84514
test: update integration tests for SOVD-aligned entity model
bburda Mar 29, 2026
77fdcc2
feat(aggregation): add resource collections on Function/Area entities…
bburda Mar 30, 2026
da6063e
feat(aggregation): add PeerClient and extend EntityInfo with remote f…
bburda Mar 30, 2026
ff15f5e
feat(aggregation): add EntityMerger with type-aware merge logic
bburda Mar 30, 2026
883f317
feat(aggregation): add AggregationManager with routing, fan-out, heal…
bburda Mar 30, 2026
5c78ae7
feat(aggregation): add StreamProxy abstraction with SSE implementation
bburda Mar 30, 2026
8193518
feat(aggregation): add MdnsDiscovery with configurable announce/discover
bburda Mar 30, 2026
48b4953
fix: remove unused includes and variables in stream_proxy and mdns_di…
bburda Mar 30, 2026
d276bec
feat(aggregation): wire HandlerContext forwarding and GatewayNode int…
bburda Mar 30, 2026
48fdb24
feat(aggregation): add global fan-out for faults and peer status in h…
bburda Mar 30, 2026
cbd520f
docs: add aggregation design doc, config reference, and multi-instanc…
bburda Mar 30, 2026
76224fa
test: add integration tests for entity model and peer aggregation
bburda Mar 30, 2026
1c9cebd
fix: area logs fall through to prefix matching when no components linked
bburda Mar 30, 2026
6de8b95
fix: thread safety, response limits, and header filtering in aggregation
bburda Mar 30, 2026
41b959a
fix: mDNS self-discovery filter, URL validation, socket logging, and …
bburda Mar 30, 2026
038b963
docs: fix stale config examples, entity model descriptions, and cross…
bburda Mar 30, 2026
cf63cc0
test: add mock server tests for PeerClient, SSE streaming, and mDNS c…
bburda Mar 30, 2026
f267a18
fix: body size limits, loopback blocking, lock patterns, and stale docs
bburda Mar 30, 2026
9a8d16e
fix: strip peer prefix from forwarded paths and validate static peer …
bburda Mar 30, 2026
8ac9f8d
fix: move mdns.h to vendored dir (unformatted) and fix copyright headers
bburda Mar 30, 2026
6a45aa2
fix: PlantUML syntax errors in aggregation design doc
bburda Mar 30, 2026
825c75d
fix: remove parentheses and block skinparams from PlantUML diagrams
bburda Mar 30, 2026
cae5367
fix: PlantUML compat with older version - remove actor from class dia…
bburda Mar 30, 2026
9065a3e
fix: correct mDNS query response record order per RFC 6763 (PTR answe…
bburda Mar 30, 2026
97d8534
fix: case-insensitive mDNS service name matching per RFC 1035
bburda Mar 31, 2026
f16e470
fix: add diagnostic logging to mDNS announce and browse loops
bburda Mar 31, 2026
05bb5eb
fix: use hostname for mDNS instance name instead of ROS node name
bburda Mar 31, 2026
e093744
fix: document mdns_name parameter, fix tutorial mDNS examples, add in…
bburda Mar 31, 2026
c546cb7
fix: unmanifested_nodes ignore policy now hides orphan apps and their…
bburda Mar 31, 2026
f7a959e
refactor: remove synthetic/heuristic Area and Component creation per …
bburda Mar 31, 2026
99692c3
fix: component merge by ID, function hosts from cache, and remote com…
bburda Mar 31, 2026
5271f5b
fix: handle_get_area cache-first lookup, update merge docs for Compon…
bburda Mar 31, 2026
49176a4
fix: apply cache-first lookup to all remaining discovery detail handlers
bburda Mar 31, 2026
b38be37
fix: parse parent_component_id, depends_on, and hosts from peer entit…
bburda Mar 31, 2026
23564b0
test: verify fetch_entities parses relationship fields from peer resp…
bburda Mar 31, 2026
20b6c41
fix: fetch entity details from peers for complete relationship data
bburda Mar 31, 2026
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -8,7 +8,7 @@ repos:
- id: clang-format
name: clang-format
types_or: [c, c++]
- exclude: /vendored/
+ exclude: (/vendored/|/third_party/)
# Uses the project's .clang-format automatically

# ── CMake ──────────────────────────────────────────────────────────────
@@ -60,7 +60,7 @@ repos:
entry: ament_copyright
language: system
types_or: [c, c++, python, cmake]
- exclude: /vendored/
+ exclude: (/vendored/|/third_party/)

# ── Incremental clang-tidy (pre-push only) ────────────────────────
# Requires: pre-commit install --hook-type pre-push
@@ -71,5 +71,5 @@ repos:
entry: ./scripts/clang-tidy-diff.sh
language: system
types: [c++]
- exclude: /vendored/
+ exclude: (/vendored/|/third_party/)
stages: [pre-push]
301 changes: 301 additions & 0 deletions docs/config/aggregation.rst
@@ -0,0 +1,301 @@
Aggregation Configuration
=========================

This reference describes the configuration options for multi-instance peer
aggregation in ros2_medkit_gateway.

.. contents:: Table of Contents
   :local:
   :depth: 2

Overview
--------

Aggregation allows multiple gateway instances to federate their entity trees
into a single unified API. A primary gateway merges entities from peer gateways
and transparently forwards requests for remote entities.

All aggregation parameters live under the ``aggregation`` key. Set them in
``gateway_params.yaml`` or pass them as ROS 2 parameters at launch.

Quick Start
-----------

.. code-block:: bash

   # Enable aggregation with a static peer
   ros2 run ros2_medkit_gateway gateway_node --ros-args \
     -p aggregation.enabled:=true \
     -p aggregation.peer_urls:="['http://192.168.1.10:8080']" \
     -p aggregation.peer_names:="['peer_b']"

Or in ``gateway_params.yaml``:

.. code-block:: yaml

   ros2_medkit_gateway:
     ros__parameters:
       aggregation:
         enabled: true
         peer_urls: ["http://192.168.1.10:8080"]
         peer_names: ["peer_b"]

Core Parameters
---------------

.. list-table::
   :header-rows: 1
   :widths: 30 10 10 50

   * - Parameter
     - Type
     - Default
     - Description
   * - ``aggregation.enabled``
     - bool
     - ``false``
     - Master switch for peer aggregation. When disabled, the gateway operates
       in standalone mode with no peer communication.
   * - ``aggregation.timeout_ms``
     - int
     - ``2000``
     - HTTP timeout in milliseconds for all peer communication: health checks,
       entity fetching, and request forwarding. Increase for high-latency
       networks.

mDNS Discovery Parameters
--------------------------

.. list-table::
   :header-rows: 1
   :widths: 30 10 10 50

   * - Parameter
     - Type
     - Default
     - Description
   * - ``aggregation.announce``
     - bool
     - ``false``
     - Broadcast this gateway's presence via mDNS. Other gateways on the local
       network can discover this instance automatically. Opt-in to avoid
       surprising network behavior.
   * - ``aggregation.discover``
     - bool
     - ``false``
     - Browse for peer gateways via mDNS. When a new peer is found, it is
       automatically added to the peer list. Opt-in to avoid surprising
       network behavior.
   * - ``aggregation.mdns_service``
     - string
     - ``"_medkit._tcp.local"``
     - mDNS service type used for announcement and browsing. All gateways in
       the same aggregation cluster must use the same service type.
   * - ``aggregation.mdns_name``
     - string
     - ``""``
     - mDNS instance name for announcement and self-discovery filtering.
       When empty (the default), the system hostname (via ``gethostname()``)
       is used. Must be unique per gateway instance. Set it explicitly when
       running multiple gateways on the same host - otherwise they share the
       same hostname and filter each other out as "self".

Static Peers
------------

Static peers are configured as parallel arrays: ``peer_urls[i]`` pairs with
``peer_names[i]``. Both arrays must have the same length. Empty-string entries
are ignored.

.. list-table::
   :header-rows: 1
   :widths: 30 10 10 50

   * - Parameter
     - Type
     - Default
     - Description
   * - ``aggregation.peer_urls``
     - string[]
     - ``[""]``
     - List of peer gateway base URLs (e.g.,
       ``["http://192.168.1.10:8080", "http://192.168.1.11:8080"]``).
       Each URL must include the scheme and port.
   * - ``aggregation.peer_names``
     - string[]
     - ``[""]``
     - List of human-readable names for peers (e.g.,
       ``["arm_controller", "base_platform"]``).
       Used as a prefix for collision resolution (e.g., ``peername__entity_id``)
       and in the routing table. Must be unique across all peers.
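
The pairing rules above can be illustrated with a short sketch. This is not the gateway's implementation. The function name ``pair_static_peers`` is hypothetical, and skipping a pair when either half is empty is an assumption (the text only says empty-string entries are ignored).

```python
def pair_static_peers(peer_urls, peer_names):
    """Pair peer_urls[i] with peer_names[i] and return {name: url}.

    Hypothetical helper for illustration only, not part of the gateway.
    """
    if len(peer_urls) != len(peer_names):
        raise ValueError("peer_urls and peer_names must have the same length")

    peers = {}
    for url, name in zip(peer_urls, peer_names):
        # Assumption: a pair is dropped if either entry is an empty string.
        if not url or not name:
            continue
        # Peer names must be unique across all peers.
        if name in peers:
            raise ValueError(f"duplicate peer name: {name}")
        peers[name] = url
    return peers


# The documented defaults ([""] for both arrays) yield no static peers.
print(pair_static_peers([""], [""]))
print(pair_static_peers(
    ["http://192.168.1.10:8080", "http://192.168.1.11:8080"],
    ["arm_controller", "base_platform"],
))
```

Under this reading, the default value ``[""]`` is equivalent to an empty peer list.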

Scenario Examples
-----------------

Star Topology
~~~~~~~~~~~~~

One primary gateway aggregates from three subsystem gateways:

.. code-block:: yaml

   # Primary gateway (host-A, port 8080)
   ros2_medkit_gateway:
     ros__parameters:
       aggregation:
         enabled: true
         timeout_ms: 3000
         announce: false
         discover: false  # Use static peers only
         peer_urls: ["http://192.168.1.10:8080", "http://192.168.1.11:8080", "http://192.168.1.12:8080"]
         peer_names: ["arm_controller", "base_platform", "sensor_array"]

The leaf gateways do not need aggregation enabled - they serve their own
entities independently. Only the primary gateway needs aggregation.

Chain Topology
~~~~~~~~~~~~~~

Gateway A aggregates from B, which aggregates from C:

.. code-block:: yaml

   # Gateway A (top-level aggregator)
   ros2_medkit_gateway:
     ros__parameters:
       server:
         port: 8080
       aggregation:
         enabled: true
         peer_urls: ["http://gateway-b:8080"]
         peer_names: ["subsystem_b"]

.. code-block:: yaml

   # Gateway B (mid-level aggregator)
   ros2_medkit_gateway:
     ros__parameters:
       server:
         port: 8080
       aggregation:
         enabled: true
         peer_urls: ["http://gateway-c:8080"]
         peer_names: ["subsystem_c"]

.. code-block:: yaml

   # Gateway C (leaf - no aggregation needed)
   ros2_medkit_gateway:
     ros__parameters:
       server:
         port: 8080
       aggregation:
         enabled: false

Gateway A sees entities from A + B + C. Gateway B sees entities from B + C.
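
To make the visibility rule concrete, the following sketch computes which gateways' entities are reachable from a given gateway by following peer links. It only models the topology. The ``peers_of`` mapping and the function name are hypothetical, and the sketch says nothing about how the gateway fetches or merges entities.

```python
def visible_gateways(peers_of, start):
    """Return the set of gateways whose entities `start` can see.

    peers_of maps a gateway name to the list of peers it aggregates from.
    Illustrative model only; names are hypothetical.
    """
    seen = set()
    stack = [start]
    while stack:
        gateway = stack.pop()
        if gateway in seen:
            continue  # Guard against cycles in the peer graph.
        seen.add(gateway)
        stack.extend(peers_of.get(gateway, []))
    return seen


# Chain topology from the configs above: A -> B -> C.
chain = {"A": ["B"], "B": ["C"], "C": []}
print(sorted(visible_gateways(chain, "A")))
print(sorted(visible_gateways(chain, "B")))
```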

mDNS-Only Discovery
~~~~~~~~~~~~~~~~~~~~

Fully automatic peer discovery with no static configuration:

.. code-block:: yaml

   # All gateways use the same config
   ros2_medkit_gateway:
     ros__parameters:
       aggregation:
         enabled: true
         announce: true
         discover: true
         mdns_service: "_medkit._tcp.local"
         # No static peers - all discovery via mDNS

All gateways on the same network segment automatically find each other. When a
gateway starts, it announces itself and discovers existing peers. When a gateway
stops, it sends an mDNS goodbye and peers remove it automatically.

.. note::

   mDNS requires multicast network support. Docker containers using bridge
   networking may not support mDNS - use static peers or host networking
   instead.

Mixed Static + mDNS
~~~~~~~~~~~~~~~~~~~~

Combine static peers for known infrastructure with mDNS for dynamic discovery:

.. code-block:: yaml

   ros2_medkit_gateway:
     ros__parameters:
       aggregation:
         enabled: true
         announce: true
         discover: true
         # Always connect to the base platform
         peer_urls: ["http://base-platform:8080"]
         peer_names: ["base_platform"]
         # Additional peers discovered via mDNS at runtime

.. note::

   When authentication is enabled, the gateway forwards the client's
   ``Authorization`` header to peer gateways, so peers receive client
   credentials. Configure only peers you trust, and ensure all peers use
   the same JWT configuration.

Entity Merge Behavior
---------------------

When aggregation is enabled, entities from peers are merged with local entities:

- **Areas, Functions, and Components**: Merged by ID. If both local and remote
  have the same ID, they become one entity. Areas and Functions represent logical
  groupings that span hosts. Components represent physical hosts or ECUs defined
  in manifests - the same Component across peers is the same physical entity.

- **Apps**: Prefixed on collision. If a remote App has the same ID as a local
  one, the remote App's ID is prefixed with ``peername__`` (double underscore).
  For example, App ``camera_driver`` from peer ``arm`` becomes
  ``arm__camera_driver``. Apps represent individual ROS 2 nodes with unique
  behavior.

Requests for remote entities are transparently forwarded to the owning peer.
The routing table maps entity IDs to peer names.

See :doc:`../design/ros2_medkit_gateway/aggregation` for detailed merge logic
and architecture diagrams.
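
The merge rules above can be sketched as follows. This is a simplified model, not the gateway's ``EntityMerger``: entities are reduced to IDs grouped by type, the function name is hypothetical, and which entities get a routing entry (here, only those that exist solely on the peer, plus prefixed Apps) is an assumption.

```python
# Entity types that collapse into one entity when IDs match.
MERGE_BY_ID = {"area", "function", "component"}


def merge_peer_entities(local, remote, peer_name):
    """Merge a peer's entity IDs into the local ones.

    local/remote map an entity type to a list of IDs.
    Returns (merged, routing), where routing maps entity ID -> peer name.
    """
    merged = {etype: list(ids) for etype, ids in local.items()}
    routing = {}
    for etype, ids in remote.items():
        known = merged.setdefault(etype, [])
        for entity_id in ids:
            if entity_id in known:
                if etype in MERGE_BY_ID:
                    continue  # Same ID: local and remote are one entity.
                # Apps: prefix the remote ID on collision.
                entity_id = f"{peer_name}__{entity_id}"
            known.append(entity_id)
            routing[entity_id] = peer_name  # Assumed routing behavior.
    return merged, routing


local = {"component": ["host_a"], "app": ["camera_driver"]}
remote = {"component": ["host_a"], "app": ["camera_driver", "gripper"]}
merged, routing = merge_peer_entities(local, remote, "arm")
print(merged)
print(routing)
```

In this sketch the shared Component ``host_a`` stays a single entity, while the colliding App is exposed as ``arm__camera_driver`` and routed to peer ``arm``.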

Health and Partial Results
--------------------------

If a peer is unreachable during a fan-out request (e.g., ``GET /api/v1/components``),
the JSON response body includes:

- ``x-medkit.partial: true`` to mark the result as incomplete
- ``x-medkit.failed_peers`` listing which peers failed

This allows clients to detect degraded responses and take appropriate action.
Individual entity requests for remote entities return ``502 Bad Gateway`` if the
owning peer is unreachable.
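
A client might check for degraded responses along these lines. The exact JSON layout is an assumption: this sketch treats ``x-medkit`` as a nested object with ``partial`` and ``failed_peers`` members, and the function name is hypothetical. Adjust the lookup if the gateway uses a different layout.

```python
import json


def partial_result_info(body):
    """Return (is_partial, failed_peers) for a parsed fan-out response body.

    Assumes {"x-medkit": {"partial": bool, "failed_peers": [...]}}.
    """
    ext = body.get("x-medkit")
    if not isinstance(ext, dict):
        return False, []
    return bool(ext.get("partial", False)), list(ext.get("failed_peers", []))


# Example body shaped according to the assumption above.
raw = '{"items": [], "x-medkit": {"partial": true, "failed_peers": ["peer_b"]}}'
is_partial, failed = partial_result_info(json.loads(raw))
if is_partial:
    print("degraded response, unreachable peers:", failed)
```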

Migration Notes
---------------

The entity model has been simplified. Synthetic/heuristic Area and Component
creation from ROS 2 namespaces has been removed:

- **Areas** come from manifest only. Runtime discovery never creates Areas.
- **Components** come from ``HostInfoProvider`` (single host-level Component)
  or manifest. Runtime discovery never creates Components.
- **Functions** are created from namespace grouping (``create_functions_from_namespaces``
  defaults to ``true``).
- **Apps** are created from ROS 2 nodes with ``source: "heuristic"``.

The removed parameters are: ``create_synthetic_areas``,
``create_synthetic_components``, ``grouping_strategy``,
``synthetic_component_name_pattern``, ``topic_only_policy``,
``min_topics_for_component``, ``allow_heuristic_areas``, and
``allow_heuristic_components``.