Feature: Network-AI as MCP coordination layer for multi-agent NeMo workflows #1807

@Jovancoding

Problem

When NeMo Agent Toolkit workflows involve multiple agents running concurrently (e.g., a LangChain researcher + CrewAI reviewer + custom validator), there is no built-in mechanism to prevent state conflicts between them. If two agents write to the same shared resource simultaneously, you get last-write-wins -- silent data corruption with no error thrown.
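To make the failure mode concrete, here is a minimal, deterministic sketch of a lost update. The dictionary stands in for any shared resource, and the helper names (`read_snapshot`, `write_back`, `atomic_append`) are invented for illustration; none of this is NAT or Network-AI code.

```python
import threading

def read_snapshot(store):
    """Return a copy of the shared list, as an agent would before editing it."""
    return list(store["findings"])

def write_back(store, snapshot, item):
    """Overwrite the shared list with the agent's edited copy."""
    store["findings"] = snapshot + [item]

# Unsafe interleaving: both agents read before either one writes.
shared = {"findings": []}
snap_researcher = read_snapshot(shared)
snap_reviewer = read_snapshot(shared)
write_back(shared, snap_researcher, "researcher: source X")
write_back(shared, snap_reviewer, "reviewer: flag Y")  # silently overwrites
unsafe_result = list(shared["findings"])

# Safe variant: the whole read-modify-write runs under one lock.
lock = threading.Lock()
shared_safe = {"findings": []}

def atomic_append(store, item):
    with lock:
        write_back(store, read_snapshot(store), item)

threads = [
    threading.Thread(target=atomic_append, args=(shared_safe, item))
    for item in ("researcher: source X", "reviewer: flag Y")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
safe_result = sorted(shared_safe["findings"])
```

In the unsafe run only the reviewer's entry survives and no exception is raised, which is the silent corruption described above. The locked variant keeps both entries.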

Proposed Solution

Network-AI is an open-source (MIT) MCP server that provides atomic state coordination for multi-agent systems. Since NeMo Agent Toolkit already supports MCP clients, Network-AI can be used directly as an MCP tool server -- no code changes needed in NAT.

What Network-AI adds via MCP (22 tools):

| Category | Tools | Description |
| --- | --- | --- |
| Atomic shared blackboard | blackboard_read, blackboard_write, blackboard_list, blackboard_delete, blackboard_exists | propose, validate, commit with file-system mutex |
| Permission gating | token_create, token_validate, token_revoke | HMAC-SHA256 scoped tokens |
| Token budgets | budget_status, budget_spend, budget_reset | Per-agent cost ceilings |
| Agent lifecycle | agent_list, agent_spawn, agent_stop | Manage agent instances |
| Audit trail | audit_query | HMAC-signed append-only log |
| FSM transitions | fsm_transition | State machine governance |
| Configuration | config_get, config_set | Live orchestrator config |
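For readers unfamiliar with HMAC-scoped tokens, the sketch below shows the general idea using only the Python standard library. The token layout, the secret, and the function names `create_token` / `validate_token` are assumptions made for illustration; they are not Network-AI's actual token format or the signatures of its `token_create` / `token_validate` tools.

```python
import base64
import hashlib
import hmac
import json

# Illustrative secret only; a real deployment would load this from secure storage.
SECRET = b"demo-secret-not-for-production"

def create_token(agent_id, scope, secret=SECRET):
    """Return '<base64 payload>.<hex HMAC-SHA256 signature>'."""
    payload = json.dumps({"agent": agent_id, "scope": scope}, sort_keys=True).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature

def validate_token(token, required_scope, secret=SECRET):
    """True only if the signature matches and the token carries the required scope."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:  # missing separator or bad base64 padding
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    return json.loads(payload)["scope"] == required_scope

read_token = create_token("agent_b", "blackboard:read")

# Forgery attempt: swap in a write-scoped payload but reuse the old signature.
forged_payload = json.dumps(
    {"agent": "agent_b", "scope": "blackboard:write"}, sort_keys=True
).encode()
forged_token = (
    base64.urlsafe_b64encode(forged_payload).decode() + "." + read_token.rsplit(".", 1)[1]
)
```

Because the signature covers the scope, an agent holding a read token cannot upgrade it to a write token without the secret.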

How it works with NeMo Agent Toolkit

Network-AI runs as a stdio or SSE MCP server. NAT agents connect as MCP clients and use the tools to coordinate:

# Start Network-AI MCP server (stdio -- compatible with NAT MCP client)
npx network-ai
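Before wiring the server into a NAT workflow, it can be useful to confirm which tools it exposes. The sketch below uses the official `mcp` Python SDK as a generic stdio client. It assumes that package and Node.js are installed and that `npx network-ai` speaks MCP over stdio as described above. It is not NAT configuration, and the function name `list_coordination_tools` is made up for this example.

```python
import asyncio

# Command taken from the issue text; split into executable and arguments.
SERVER_COMMAND = ["npx", "network-ai"]

async def list_coordination_tools():
    """Spawn the server over stdio and return the names of the tools it advertises."""
    # Imported lazily so this module still loads where the `mcp` SDK is absent.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=SERVER_COMMAND[0], args=SERVER_COMMAND[1:])
    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()          # MCP handshake
            listing = await session.list_tools()
            return [tool.name for tool in listing.tools]

# Usage (requires the `mcp` package and Node.js):
#   print(asyncio.run(list_coordination_tools()))
```

Listing tools first avoids guessing at argument names: each returned tool also carries an input schema that shows what `blackboard_write` and the other tools expect.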

Example workflow -- a multi-agent pipeline where agents share state through the blackboard instead of passing raw context:

  1. Agent A proposes a state change via blackboard_write, which follows the propose, validate, commit sequence
  2. Agent B reads the validated state via blackboard_read
  3. Both agents check budget before expensive operations via budget_status / budget_spend
  4. All writes are permission-gated via token_create / token_validate
  5. Full audit trail via audit_query
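The steps above can be modeled in a few lines of plain Python. This is an in-process sketch of the coordination pattern only. The classes `Blackboard` and `Budget`, their methods, and the `allowed` / `validate` parameters are hypothetical, and the real server exposes this behavior through MCP tools rather than Python objects.

```python
import threading

class PermissionDenied(Exception):
    pass

class BudgetExceeded(Exception):
    pass

class Budget:
    """Per-agent cost ceiling (stand-in for budget_status / budget_spend)."""
    def __init__(self, ceiling):
        self.ceiling = ceiling
        self.spent = 0

    def spend(self, amount):
        if self.spent + amount > self.ceiling:
            raise BudgetExceeded(f"{self.spent + amount} > {self.ceiling}")
        self.spent += amount

class Blackboard:
    """Shared state whose writes go through propose, validate, commit under a lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}
        self.audit = []  # append-only list of (agent, key, outcome)

    def write(self, agent, key, value, *, allowed, validate):
        if key not in allowed:                        # permission gate
            raise PermissionDenied(f"{agent} may not write {key!r}")
        with self._lock:
            proposed = value                          # propose
            if not validate(self._state.get(key), proposed):  # validate
                self.audit.append((agent, key, "rejected"))
                return False
            self._state[key] = proposed               # commit
            self.audit.append((agent, key, "committed"))
            return True

    def read(self, key):
        with self._lock:
            return self._state.get(key)

def changed(old, new):
    """Example validation rule: reject writes that do not change the value."""
    return new != old

board = Blackboard()
budget_a = Budget(ceiling=100)

budget_a.spend(40)                                    # check budget before the expensive step
first = board.write("agent_a", "draft", "v1", allowed={"draft"}, validate=changed)
seen_by_b = board.read("draft")                       # Agent B reads the committed state
repeat = board.write("agent_a", "draft", "v1", allowed={"draft"}, validate=changed)
```

Holding the lock across validation and commit is what makes the write atomic: no other agent can change the value between the check and the update.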

Why this matters for NAT

  • Race condition prevention: The NAT profiler can identify bottlenecks, but it does not stop concurrent writes from corrupting shared state; Network-AI serializes those writes
  • Framework-agnostic coordination: Works across all 8 NAT-supported frameworks (LangChain, CrewAI, Agno, AutoGen, etc.) via MCP
  • Zero integration code: NAT already has MCP client support -- agents just call the tools
  • Tested and scored: 1,449 tests across 18 suites, Glama A/A/A score, Socket 100/100

Links

Happy to contribute an example workflow showing NAT agents coordinating through Network-AI MCP tools if there is interest.

Labels: Needs Triage (Need team to review and classify), feature request (New feature or request)
