The Network Discovery Pattern (DNS for Agents)

When moving from a local, single-developer environment (where agents use local files and symlinks) to a distributed, cross-organizational enterprise architecture, we face the "Agent Discovery Problem."

How does an agent from Team A (e.g., Application Security) discover what tools and context it can use on Team B's infrastructure (e.g., Cloud Networking) without having direct access to their code repositories or filesystem?

The Problem with Symlinks and Remote Files

In local environments, our library.yaml acts as the definitive index. The agent reads it directly from the disk.

If we naively try to scale this to multiple teams, an agent would have to:

  1. Know exactly where Team B stores their library.yaml (GitHub, GitLab?).
  2. Have a valid Personal Access Token (PAT) to read their private repository.
  3. Be tightly coupled to Team B's internal repository structure.

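This adds security overhead, because every consuming agent needs credentials for another team's repository, and it makes the architecture brittle: any move or restructuring of Team B's repository breaks its consumers.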

The Solution: .well-known Endpoints

Instead of reading a repository remotely, we evolve to a network-routable Service Registry, leveraging standard web protocols.

We adopt the well-known URI convention defined in RFC 8615 (the /.well-known/ path prefix) to expose agent capabilities.

How it works (Distributed Domain Setup):

  1. Distributed Sources of Truth (GitOps): Each team or domain (e.g., Team B) maintains its own library.yaml in its own Git repository. The file is human-readable and is managed through standard pull requests.
  2. The Build Step (CI/CD): When changes are merged in Team B's repository, their CI pipeline parses their local library.yaml, extracts the capabilities that are meant to be public, and compiles a domain-specific agent-capabilities.json file.
  3. The Deployment: This JSON file is published to a highly available endpoint specific to that domain (e.g., a static GCP bucket with CDN, or a Kubernetes Ingress).
    https://api.teamb.internal/.well-known/agent-capabilities.json
    
  4. The Discovery (Runtime): When Team A's agent needs to interact with Team B's domain, it queries Team B's standard endpoint to discover what MCP servers and skills are available.
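
The build step (step 2) can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the internal layout of library.yaml is not specified here, so the `public` flag, the `repo_path` field, and the function name `compile_capabilities` are assumptions made for the example. The input is the already-parsed content of library.yaml (for instance, loaded with a YAML parser in the CI job).

```python
import json


def compile_capabilities(library: dict) -> dict:
    """Build the public capabilities document from a parsed library.yaml.

    Assumption: each MCP server entry carries a boolean `public` flag
    (a hypothetical convention; the real library.yaml schema may differ).
    """
    servers = {}
    for name, entry in library.get("mcp_servers", {}).items():
        # Export only the entries explicitly marked as public.
        if not entry.get("public", False):
            continue
        # Copy only the fields external agents need; internal fields
        # such as repository paths are left out of the published file.
        exported = {
            "endpoint": entry["endpoint"],
            "auth_required": entry.get("auth_required", []),
        }
        if "scopes" in entry:
            exported["scopes"] = entry["scopes"]
        servers[name] = exported
    return {"org_id": library["org_id"], "mcp_servers": servers}


# Stand-in for the parsed library.yaml of a hypothetical team.
library = {
    "org_id": "networking-core",
    "mcp_servers": {
        "terraform": {
            "public": True,
            "endpoint": "https://mcp.networking.internal/v1/terraform",
            "auth_required": ["oauth2"],
            "scopes": ["tf:plan"],
            "repo_path": "tools/terraform",  # internal detail, not exported
        },
        "scratch": {
            "public": False,
            "endpoint": "https://mcp.networking.internal/v1/scratch",
        },
    },
}

document = compile_capabilities(library)
print(json.dumps(document, indent=2))
```

In a pipeline, the printed JSON would be written to agent-capabilities.json and uploaded in the deployment step. Filtering by an explicit allow flag means a new entry stays private until someone deliberately publishes it.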

Example Capabilities Payload

The endpoint acts as a "reception desk," dynamically informing external agents of the current state of tools and how to authenticate:

{
  "org_id": "networking-core",
  "mcp_servers": {
    "terraform": {
      "endpoint": "https://mcp.networking.internal/v1/terraform",
      "auth_required": ["oauth2"],
      "scopes": ["tf:plan"]
    },
    "logs": {
      "endpoint": "grpc://mcp-logs.networking.internal:50051",
      "auth_required": ["mtls"]
    }
  },
  "supported_protocols": ["mcp-v1.2"],
  "contact": "platform-engineering@company.com"
}
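
On the consuming side, the runtime discovery step might look like the sketch below. It fetches a domain's capabilities document and keeps only the MCP servers the agent can authenticate to. The function names and the interpretation of `auth_required` (every listed method must be supported by the agent) are assumptions made for illustration; the payload does not define those semantics.

```python
import json
import urllib.request

WELL_KNOWN_PATH = "/.well-known/agent-capabilities.json"


def fetch_capabilities(host: str, timeout: float = 5.0) -> dict:
    """Fetch and parse a domain's capabilities document over HTTPS."""
    url = f"https://{host}{WELL_KNOWN_PATH}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def usable_servers(capabilities: dict, supported_auth: set) -> dict:
    """Return the MCP servers whose auth methods this agent supports.

    Assumption: all methods listed in `auth_required` must be supported.
    """
    usable = {}
    for name, server in capabilities.get("mcp_servers", {}).items():
        required = set(server.get("auth_required", []))
        if required.issubset(supported_auth):
            usable[name] = server
    return usable


# Offline example using the payload shown above instead of a live request.
payload = json.loads("""
{
  "org_id": "networking-core",
  "mcp_servers": {
    "terraform": {
      "endpoint": "https://mcp.networking.internal/v1/terraform",
      "auth_required": ["oauth2"],
      "scopes": ["tf:plan"]
    },
    "logs": {
      "endpoint": "grpc://mcp-logs.networking.internal:50051",
      "auth_required": ["mtls"]
    }
  }
}
""")

# An agent that only holds OAuth2 credentials can use the terraform server.
for name, server in usable_servers(payload, {"oauth2"}).items():
    print(name, server["endpoint"])
```

Against a live deployment, the payload would come from a call such as `fetch_capabilities("api.teamb.internal")`; the filtering step stays the same.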

Benefits of this Architecture

  • Decoupled: The agent does not need to know the internal Git structure of the target team.
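  • Dynamic: If a tool (MCP server) is retired or moves to a new address, the team republishes its .well-known document and consuming agents pick up the change on their next discovery query, without reading Git or changing their own configuration.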
  • Secure: We separate the discovery of capabilities (reading a read-only discovery endpoint) from the execution (calling an MCP server with the credentials it advertises, such as an OAuth token or an mTLS certificate), eliminating the need to share Git tokens.