From f03df53f58e5a1fd27119057b02f29fa5c7b661a Mon Sep 17 00:00:00 2001
From: Peter Streef <p.streef@gmail.com>
Date: Tue, 31 Mar 2026 09:50:22 +0200
Subject: [PATCH 1/4] Add agent scaling, routing, and deployment guidance

Expand agent configuration docs with sizing rules of thumb, per-agent
resource recommendations, deployment environment guidance (VMs vs.
Kubernetes), and a cross-reference to the HTTP proxy guide. Expand the
routing reference with routing mechanics, an operation routing table, a
capability-splitting guide, and a common pitfall.
---
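Reviewer note (not part of the commit message): the sizing rule of thumb
added in this patch can be sanity-checked with a little arithmetic. This is
a minimal sketch, assuming the baseline is one agent per 20,000 repositories
rounded up, plus one optional agent of headroom. The function name and the
`headroom` parameter are hypothetical and not part of any Moderne tooling.

```python
import math

REPOS_PER_AGENT = 20_000  # starting point quoted in the sizing guidance


def suggested_agent_range(repo_count: int, headroom: int = 1) -> tuple[int, int]:
    """Return (baseline, baseline + headroom) agents for a repository count."""
    if repo_count < 0:
        raise ValueError("repo_count must be non-negative")
    # Round up so a partially filled block of repositories still gets an agent,
    # and never suggest fewer than one agent.
    baseline = max(1, math.ceil(repo_count / REPOS_PER_AGENT))
    return baseline, baseline + headroom
```

Under these assumptions, 40,000 repositories gives a range of 2 to 3 agents
and 100,000 gives 5 to 6, which lines up with the examples in the new
"Sizing guidance" section. Treat the result as a starting point only.
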
 .../agent-configuration/agent-config.md       | 52 +++++++++++++++++--
 .../references/routing-requests-to-agents.md  | 50 +++++++++++++++++-
 2 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
index 8581606edf..4488cc2143 100644
--- a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
+++ b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
@@ -799,14 +799,60 @@ To use the dashboard:
 
 For high availability and increased throughput, you can run multiple Moderne agent instances concurrently.
 
-**Key requirements for multi-instance deployment:**
+### Sizing guidance
+
+The number of agents you need depends on the number of repositories, the performance of your artifact repositories, and how heavily users run recipes.
+
+As a starting point, consider **one agent per 20,000 repositories**. For example, a deployment with 40,000 repositories and daily LST refreshes would typically use 2–3 agents. A deployment with 100,000 repositories might use 5–6.
+
+**Per-agent resource recommendations:**
+
+* CPU: 2–4 cores
+* Memory: 4–8 GB heap
+* Disk: minimal — agents stream artifacts rather than storing them
+* Network: low-latency connectivity to your artifact repositories and SCM
+
+**Signs you need more agents:**
+
+* LST sync jobs take significantly longer than expected
+* Recipe runs queue behind artifact downloads
+* Agent health checks show degraded performance in the Grafana dashboard
+
+These are rough guidelines — monitor agent resource usage and adjust based on your workload.
+
+### Traffic routing
+
+When multiple agents are running, the platform distributes work based on each agent's configuration:
+
+* If two agents are configured with different artifact repositories, each agent handles requests for its own repository.
+* If two agents share the same configuration, requests are distributed across them in a round-robin fashion.
+* The more services an agent is configured with (SCMs, artifact repositories), the more traffic it handles.
+
+This means you can split agents by responsibility — for example, dedicating some agents to artifact repository traffic and others to SCM operations. See [routing requests to agents](../../references/routing-requests-to-agents.md) for a detailed explanation.
+
+:::note
+Building and publishing LSTs is handled by separate containers ([mass ingest](../mass-ingest.md)), not by agents. Recipe execution also does not involve agents — recipes run on Moderne workers in the SaaS environment. Agents pull published LSTs into the platform and handle operations like resolving dependencies and performing SCM operations such as creating branches and commits.
+:::
+
+### Deployment environment
+
+**Virtual machines (recommended):** Static VMs provide the most predictable performance for agents. Agents maintain persistent RSocket connections to the Moderne API Gateway, and VM deployments avoid connection disruption from pod rescheduling. For most deployments, 4–6 static VMs are sufficient.
+
+**Kubernetes:** Agents can run on Kubernetes, but consider the following:
+
+* Use `Recreate` deployment strategy rather than `RollingUpdate` to avoid duplicate agent registrations during deployments
+* Set resource requests equal to limits (guaranteed QoS class) to prevent CPU throttling during artifact transfers
+* Configure liveness and readiness probes using the agent's actuator endpoints (`/actuator/health/liveness` and `/actuator/health/readiness`)
+* Avoid horizontal pod autoscaling — agents maintain long-lived RSocket connections, and scaling events disrupt them
+
+### Requirements for multi-instance deployment
 
 * Each agent instance must have a unique `MODERNE_AGENT_NICKNAME`
 * Each instance requires its own port mapping (e.g., 8080, 8081, 8082)
 * All instances should use the same `MODERNE_AGENT_CRYPTO_SYMMETRICKEY`
 * All instances should connect to the same `MODERNE_AGENT_APIGATEWAYRSOCKETURI`
 
-**Example multi-instance deployment:**
+### Example multi-instance deployment
 
 <Tabs groupId="agent-type">
 <TabItem value="oci-container" label="OCI Container">
@@ -864,7 +910,7 @@ Multiple agent instances will automatically distribute the workload and provide
 
 * **Invalid API endpoint:** Verify the `MODERNE_AGENT_APIGATEWAYRSOCKETURI` matches the URI provided by Moderne
 * **Invalid authentication token:** Confirm the `MODERNE_AGENT_TOKEN` is correct and has not expired
-* **Network connectivity:** Ensure the agent can reach the Moderne API endpoint (check firewalls, proxies, and outbound HTTPS access)
+* **Network connectivity:** Ensure the agent can reach the Moderne API endpoint (check firewalls, proxies, and outbound HTTPS access). If the agent connects through an HTTP proxy or reverse proxy, see [HTTP proxy configuration](./configure-an-agent-to-connect-to-moderne-via-an-http-proxy.md).
 * **SSL/TLS issues:** If using custom certificates, verify they are properly configured in the Java truststore
 
 ### DNS resolution failures in Podman containers
diff --git a/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md b/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
index 804375bf63..3954c76573 100644
--- a/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
+++ b/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
@@ -28,8 +28,56 @@ Depending on the action, requests to these agents are routed differently. Modern
 | yes       | no                                   | many      | No current use case                                                                                                        |
 | no        | no                                   | one       | Git commit                                                                                                                 |
 
+## How routing works
+
+When the Moderne platform needs to communicate with your infrastructure (for example, to download an LST or make a git commit), it selects an agent using the following logic:
+
+1. **Filter by capability** — only agents that have a matching tool configured are considered. For example, a git commit to `github.mycompany.com` only routes to agents that have a GitHub configuration pointing to that host.
+2. **Shuffle and try** — among matching agents, the platform shuffles the list and tries them in order. The first agent to respond successfully handles the request.
+3. **Caching** — the list of available agents is refreshed every 10 seconds. If an agent goes down, it is removed from rotation within that window.
+
+This means:
+
+* There is no explicit load balancing — work distributes naturally across agents through shuffling, but is not evenly weighted.
+* All agents with the same tools configured are interchangeable — you do not need to designate primary and secondary agents.
+* An agent going offline is handled gracefully — requests fail over to the next agent in the shuffled list.
+
+## What routes through agents
+
+Not all platform operations go through agents. Recipe execution happens on Moderne workers in the SaaS environment — agents are not involved.
+
+| Operation                     | Routes through agent? | Routing strategy                             |
+| ----------------------------- | --------------------- | -------------------------------------------- |
+| LST artifact download         | Yes                   | Partitioned by artifact repository           |
+| LST artifact listing          | Yes                   | Partitioned by artifact repository           |
+| Git commit and PR creation    | Yes                   | Routed to agent with matching SCM config     |
+| Repository listing            | Yes                   | Routed to agent with matching SCM config     |
+| Organization mapping          | Yes                   | Agent serves `repos.csv`                     |
+| Maven dependency resolution   | Yes                   | Routed to agent with matching Maven config   |
+| Recipe execution              | No                    | Runs on Moderne workers (SaaS-side)          |
+
+## Splitting agents by capability
+
+You can run specialized agents by configuring different tools on different instances. This is useful when:
+
+* Your artifact repository and SCM are on different network segments
+* You want to scale LST artifact downloads independently from SCM operations
+* Different tools require different authentication credentials
+
+### Example: split by tool type
+
+| Agent               | Configured tools    | Handles                                  |
+| ------------------- | ------------------- | ---------------------------------------- |
+| `artifact-agent-1`  | Artifactory         | LST downloads, recipe artifact sync      |
+| `artifact-agent-2`  | Artifactory         | LST downloads (additional throughput)    |
+| `scm-agent-1`       | GitHub, GitLab      | Git commits, PR creation, repo listing   |
+
+### Common pitfall: missing SCM operations after splitting
+
+If you split agents and find that git operations (commits, PRs) are no longer working, check that at least one running agent has the SCM tool configured. The platform routes git operations only to agents that have the matching SCM configuration — if no agent has it, those operations will fail silently.
+
 ## Multi-tenant customers
 
 For multi-tenant customers, Moderne runs an agent that connects to your artifact repositories. For instance, if you work for a company whose email addresses end with `@mycompany.com`, Moderne configures an agent for you with a `tenantDomain` of `mycompany.com`.
 
-If a user is logged into Moderne with an `@mycompany.com` email address, they will find that their requests (e.g., Maven resolution requests) are made to the `mycompony.com` artifact repositories.
+If a user is logged into Moderne with an `@mycompany.com` email address, they will find that their requests (e.g., Maven resolution requests) are made to the `mycompany.com` artifact repositories.

From 03714899dd6c382e2a18c61b90c1b63dd12f74f2 Mon Sep 17 00:00:00 2001
From: Peter Streef <p.streef@gmail.com>
Date: Tue, 31 Mar 2026 09:57:27 +0200
Subject: [PATCH 2/4] Fix review feedback on agent scaling docs

- Reword misleading "recipe runs queue" bullet to clarify agents don't
  run recipes
- Remove unsupported "4-6 VMs" claim that conflicts with sizing guidance
- Add downtime tradeoff note to Recreate strategy recommendation
---
 .../how-to-guides/agent-configuration/agent-config.md       | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
index 4488cc2143..55f5e1187e 100644
--- a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
+++ b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
@@ -815,7 +815,7 @@ As a starting point, consider **one agent per 20,000 repositories**. For example
 **Signs you need more agents:**
 
 * LST sync jobs take significantly longer than expected
-* Recipe runs queue behind artifact downloads
+* LST artifacts are unavailable or stale because agents cannot keep up with syncing
 * Agent health checks show degraded performance in the Grafana dashboard
 
 These are rough guidelines — monitor agent resource usage and adjust based on your workload.
@@ -836,11 +836,11 @@ Building and publishing LSTs is handled by separate containers ([mass ingest](..
 
 ### Deployment environment
 
-**Virtual machines (recommended):** Static VMs provide the most predictable performance for agents. Agents maintain persistent RSocket connections to the Moderne API Gateway, and VM deployments avoid connection disruption from pod rescheduling. For most deployments, 4–6 static VMs are sufficient.
+**Virtual machines (recommended):** Static VMs provide the most predictable performance for agents. Agents maintain persistent RSocket connections to the Moderne API Gateway, and VM deployments avoid connection disruption from pod rescheduling.
 
 **Kubernetes:** Agents can run on Kubernetes, but consider the following:
 
-* Use `Recreate` deployment strategy rather than `RollingUpdate` to avoid duplicate agent registrations during deployments
+* Use `Recreate` deployment strategy rather than `RollingUpdate` to avoid duplicate agent registrations during deployments. This causes brief downtime during deploys, but the platform handles agent unavailability gracefully.
 * Set resource requests equal to limits (guaranteed QoS class) to prevent CPU throttling during artifact transfers
 * Configure liveness and readiness probes using the agent's actuator endpoints (`/actuator/health/liveness` and `/actuator/health/readiness`)
 * Avoid horizontal pod autoscaling — agents maintain long-lived RSocket connections, and scaling events disrupt them

From 0d9899f5e9aa7dd09d1861231231c4dbea2a8efe Mon Sep 17 00:00:00 2001
From: Peter Streef <p.streef@gmail.com>
Date: Thu, 2 Apr 2026 09:55:48 +0200
Subject: [PATCH 3/4] Add shared tool config requirement for multi-agent
 deployments

When multiple agents configure the same tool (e.g. same GitHub URL),
their credentials must be identical because requests shuffle across
matching agents. A multi-step OAuth flow can span agents and will fail
if credentials differ.
---
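Reviewer note (not part of the commit message): the requirement above can
be checked before rollout by comparing tool settings across agents. This is
a minimal sketch, assuming each agent's configuration has already been
loaded into a plain dictionary. The data shape and the function name are
hypothetical and do not correspond to any Moderne API.

```python
from collections import defaultdict


def find_mismatched_tools(agents: dict[str, dict[str, dict]]) -> dict[str, list[str]]:
    """Map each tool URL whose settings differ between agents to those agents.

    `agents` maps an agent nickname to {tool_url: settings}. This shape is
    an assumption made for the illustration.
    """
    seen = defaultdict(list)  # tool_url -> [(nickname, settings), ...]
    for nickname, tools in agents.items():
        for tool_url, settings in tools.items():
            seen[tool_url].append((nickname, settings))

    mismatched = {}
    for tool_url, entries in seen.items():
        first_settings = entries[0][1]
        # Any difference from the first agent's settings counts as a mismatch.
        if any(settings != first_settings for _, settings in entries[1:]):
            mismatched[tool_url] = sorted(name for name, _ in entries)
    return mismatched
```

An empty result means every tool that is configured on more than one agent
has identical settings, so a multi-step flow that lands on different agents
sees the same credentials.
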
 .../how-to-guides/agent-configuration/agent-config.md            | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
index 55f5e1187e..ede284cd20 100644
--- a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
+++ b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
@@ -851,6 +851,7 @@ Building and publishing LSTs is handled by separate containers ([mass ingest](..
 * Each instance requires its own port mapping (e.g., 8080, 8081, 8082)
 * All instances should use the same `MODERNE_AGENT_CRYPTO_SYMMETRICKEY`
 * All instances should connect to the same `MODERNE_AGENT_APIGATEWAYRSOCKETURI`
+* If multiple agents configure the same tool (e.g., the same GitHub URL), those configurations must be identical — same OAuth client ID/secret, same credentials. Because requests are shuffled across matching agents, a multi-step flow like OAuth authentication can span multiple agents. If the credentials differ between agents, the flow will fail.
 
 ### Example multi-instance deployment
 

From 6a603fd1adec69f02a4e4d661af7f2e703aa580e Mon Sep 17 00:00:00 2001
From: Peter Streef <p.streef@gmail.com>
Date: Tue, 7 Apr 2026 10:40:26 +0200
Subject: [PATCH 4/4] Use round-robin terminology, fix bullet punctuation

- Replace "shuffle and try" with "round-robin" in routing docs
  (simpler mental model for customers)
- Drop the "no explicit load balancing" bullet, which described the
  shuffle behavior
- Add consistent trailing periods to the K8s bullet points
---
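Reviewer note (not part of the commit message): the mental model the
reworded docs describe (filter by capability, rotate among matching agents,
fall through to the next one on failure) can be sketched as follows. This
illustrates the documented behavior only, not the platform's actual
implementation. The class, the data shape, and the `send` callback are
hypothetical, and the 10-second agent-list refresh is not modeled.

```python
from collections import defaultdict
from typing import Callable, Optional


class AgentRouter:
    """Toy router: capability filter, then round-robin with failover."""

    def __init__(self, agents: dict[str, set[str]]):
        # agents: nickname -> set of tool hosts configured on that agent
        self._agents = agents
        self._next = defaultdict(int)  # per-host rotation counter

    def route(self, tool_host: str, send: Callable[[str], bool]) -> Optional[str]:
        """Return the nickname of the agent that handled the request, if any."""
        # Step 1: only agents with a matching tool are candidates.
        candidates = sorted(
            name for name, hosts in self._agents.items() if tool_host in hosts
        )
        if not candidates:
            return None
        # Step 2: start at the next position in the rotation for this host.
        start = self._next[tool_host] % len(candidates)
        self._next[tool_host] += 1
        # Step 3: try each candidate once, moving on when one fails.
        for offset in range(len(candidates)):
            nickname = candidates[(start + offset) % len(candidates)]
            if send(nickname):
                return nickname
        return None
```

In this sketch a request for a host that no agent has configured returns
`None` without contacting anyone, which mirrors the "missing SCM operations
after splitting" pitfall described in the routing reference.
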
 .../how-to-guides/agent-configuration/agent-config.md       | 6 +++---
 .../references/routing-requests-to-agents.md                | 5 ++---
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
index ede284cd20..596d5fdf8b 100644
--- a/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
+++ b/docs/administrator-documentation/moderne-platform/how-to-guides/agent-configuration/agent-config.md
@@ -841,9 +841,9 @@ Building and publishing LSTs is handled by separate containers ([mass ingest](..
 **Kubernetes:** Agents can run on Kubernetes, but consider the following:
 
 * Use `Recreate` deployment strategy rather than `RollingUpdate` to avoid duplicate agent registrations during deployments. This causes brief downtime during deploys, but the platform handles agent unavailability gracefully.
-* Set resource requests equal to limits (guaranteed QoS class) to prevent CPU throttling during artifact transfers
-* Configure liveness and readiness probes using the agent's actuator endpoints (`/actuator/health/liveness` and `/actuator/health/readiness`)
-* Avoid horizontal pod autoscaling — agents maintain long-lived RSocket connections, and scaling events disrupt them
+* Set resource requests equal to limits (guaranteed QoS class) to prevent CPU throttling during artifact transfers.
+* Configure liveness and readiness probes using the agent's actuator endpoints (`/actuator/health/liveness` and `/actuator/health/readiness`).
+* Avoid horizontal pod autoscaling — agents maintain long-lived RSocket connections, and scaling events disrupt them.
 
 ### Requirements for multi-instance deployment
 
diff --git a/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md b/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
index 3954c76573..67920cad90 100644
--- a/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
+++ b/docs/administrator-documentation/moderne-platform/references/routing-requests-to-agents.md
@@ -33,14 +33,13 @@ Depending on the action, requests to these agents are routed differently. Modern
 When the Moderne platform needs to communicate with your infrastructure (for example, to download an LST or make a git commit), it selects an agent using the following logic:
 
 1. **Filter by capability** — only agents that have a matching tool configured are considered. For example, a git commit to `github.mycompany.com` only routes to agents that have a GitHub configuration pointing to that host.
-2. **Shuffle and try** — among matching agents, the platform shuffles the list and tries them in order. The first agent to respond successfully handles the request.
+2. **Round-robin** — among matching agents, requests are distributed in round-robin fashion. If an agent fails to respond, the platform tries the next one.
 3. **Caching** — the list of available agents is refreshed every 10 seconds. If an agent goes down, it is removed from rotation within that window.
 
 This means:
 
-* There is no explicit load balancing — work distributes naturally across agents through shuffling, but is not evenly weighted.
 * All agents with the same tools configured are interchangeable — you do not need to designate primary and secondary agents.
-* An agent going offline is handled gracefully — requests fail over to the next agent in the shuffled list.
+* An agent going offline is handled gracefully — requests fail over to the next agent in the rotation.
 
 ## What routes through agents