Deploy cloud agents that improve in production

Hey there, Roland. Here's a quick look at your managed agents.

Conversations: 284 · Tokens: 428.5k · Deployments: 2

Active experiments
Escalation policy v2 (customer-support-agent), 50%
Baseline (main): 72% pass rate, 31% P(best)
PR #84 (shorter-escalation): 81% pass rate, 69% P(best)

Runtimes
customer-support-agent: 42 conversations today · 1 running · 2% failed

The managed cloud for agents, powered by Pi. Define each agent as a recipe, deploy it to commit-pinned infrastructure, and keep it improving in production with a graded loop you control.

FIG 01

Pi recipes

Your agent is a Pi recipe: YAML in a git folder. Any model, any MCP, your tools, identical on your laptop and in our cloud.

FIG 02

GitOps runtimes

Git is the source of truth: every runtime is a commit-pinned deploy. Each change opens a pull request with a preview runtime and live experiment, and any deploy reverts in one click.
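To make "commit-pinned" and "reverts in one click" concrete, here is a minimal sketch that treats a runtime as a pointer to one commit plus a deploy history. The `Runtime` shape and the `deploy`, `revert`, and `liveCommit` helpers are hypothetical illustrations, not the platform's actual API.

```typescript
// Hypothetical model: a runtime is pinned to one commit and remembers its deploys.
interface Runtime {
  name: string;
  history: string[]; // commit SHAs, oldest first; the last entry is live
}

// Deploying pins the runtime to a new commit.
function deploy(rt: Runtime, sha: string): Runtime {
  return { ...rt, history: [...rt.history, sha] };
}

// Reverting re-deploys the previous commit as a new history entry,
// so the record of what ran is appended to, never rewritten.
function revert(rt: Runtime): Runtime {
  if (rt.history.length < 2) throw new Error("nothing to revert to");
  return deploy(rt, rt.history[rt.history.length - 2]);
}

// The commit currently serving traffic.
function liveCommit(rt: Runtime): string {
  return rt.history[rt.history.length - 1];
}
```

Modeling a revert as one more deploy keeps every state traceable to a commit, which is one plausible reading of "Git is the source of truth".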

FIG 03

Continual learning

An MCP server and a skill for Claude Code, Codex, and Cursor that run the loop on your primitives: patterns, judges, and experiments.

A directory is a multi-agent system

A Pi recipe is a git folder of agents, tools, and skills. Run it locally, then ship the same commit to our cloud.
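As a rough illustration of what a folder of agents implies, the sketch below models only the fields that appear in the example recipes on this page and checks that every listed subagent has its own agent file. The `AgentRecipe` type and the `missingSubagents` helper are hypothetical; they are not part of Pi or of any Introspection SDK.

```typescript
// Hypothetical shape, inferred only from the fields shown in the example recipes.
interface AgentRecipe {
  name: string;
  description: string;
  model: { name: string };
  tools: string[];
  subagents: string[];
  skills: string[];
}

// Returns the subagents that have no matching "<name>.yaml" among the agent files.
function missingSubagents(recipe: AgentRecipe, agentFiles: string[]): string[] {
  const present = agentFiles.map((file) => file.replace(/\.yaml$/, ""));
  return recipe.subagents.filter((name) => present.indexOf(name) === -1);
}

const leadResearcher: AgentRecipe = {
  name: "agent",
  description: "Lead researcher, multi-agent research orchestrator.",
  model: { name: "anthropic/claude-opus-4-8" },
  tools: ["todo_write", "read", "bash", "write"],
  subagents: ["research", "citations"],
  skills: ["research-process"],
};
```

A check like `missingSubagents(leadResearcher, ["agent.yaml", "research.yaml"])` would flag `citations` before the folder is shipped.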

Multi-Agent Research

Orchestrator-worker research: ingest briefs from Drive, decompose, dispatch parallel subagents, synthesize, then cite.

.introspection
multi-agent-research-agent.yaml
agents
  agent.yaml  M
  research.yaml
  citations.yaml
extensions
  research-tools.ts
  web-tools.ts
skills
  research-process
SYSTEM.md
package.json

agents/agent.yaml
name: agent
description: Lead researcher, multi-agent research orchestrator.
model:
  name: anthropic/claude-opus-4-8
  reasoning_effort: high
tools:
  - todo_write
  - read
  - bash
  - write
  - WebSearch
  - WebFetch
  - mcp:google-drive/read_file
  - mcp:google-drive/search_files
subagents:
  - research
  - citations
skills:
  - research-process
system_instructions:
  mode: append
  content: |
    # Role: Lead Researcher
    Pull source docs from Drive, decompose
    the question, delegate parallel research,
    synthesize, then run the citation pass.

Incident Response

Incident orchestrator: investigate Datadog and PagerDuty alerts, auto-triage Sentry errors with fix PRs, and draft postmortems on resolve.

.introspection
incident-response-agent.yaml
agents
  agent.yaml  M
  alert-investigator.yaml
  sentry-triage.yaml
  postmortem-writer.yaml
extensions
  webhook-bridge.ts
skills
  incident-response
  hotfix-playbook
SYSTEM.md
package.json

agents/agent.yaml
name: agent
description: Incident response orchestrator, alerts, triage, postmortems.
model:
  name: anthropic/claude-opus-4-8
  reasoning_effort: high
tools:
  - todo_write
  - read
  - bash
  - edit
  - write
  - mcp:pagerduty/acknowledge
  - mcp:pagerduty/list_incidents
  - mcp:datadog/query_logs
  - mcp:datadog/query_metrics
  - mcp:sentry/get_issue
  - mcp:github/create_pull_request
  - mcp:slack/post_message
subagents:
  - alert-investigator
  - sentry-triage
  - postmortem-writer
skills:
  - incident-response
  - hotfix-playbook
system_instructions:
  mode: append
  content: |
    # Role: Incident commander
    On alert: ack, pull logs, correlate deploys,
    delegate investigation; triage Sentry with fix PR;
    on resolve, draft postmortem with timeline + actions.

Customer Support

Frontline support agent: triage Slack threads, search the Notion KB, draft replies, and escalate when policy requires a human.

.introspection
customer-support-agent.yaml
agents
  agent.yaml  M
  triage.yaml
  responder.yaml
  escalation.yaml
skills
  ticket-triage
SYSTEM.md
package.json

agents/agent.yaml
name: agent
description: Support orchestrator, classify requests, retrieve KB answers, draft replies.
model:
  name: anthropic/claude-sonnet-4-6
  reasoning_effort: medium
tools:
  - read
  - bash
  - mcp:slack/list_threads
  - mcp:slack/post_message
  - mcp:notion/search
  - mcp:notion/get_page
subagents:
  - triage
  - responder
  - escalation
skills:
  - ticket-triage
system_instructions:
  mode: append
  content: |
    # Role: Support lead
    Read the customer thread, search Notion for
    approved answers, draft a reply, escalate
    billing and security issues to a human.

Go-to-Market

GTM orchestrator: qualify inbound leads, research accounts across CRM and calls, draft outbound for Slack approval, and publish account intelligence.

.introspection
gtm-agent.yaml
agents
  agent.yaml  M
  lead-research.yaml
  sales-research.yaml
  account-intel.yaml
extensions
  codex
    index.ts
    shell.ts
    apply-patch.ts
    update-plan.ts
    view-image.ts
    lib
      safe-path.ts
      web-search.ts
skills
  outbound-playbook
  gtm-workflow
SYSTEM.md
package.json

agents/agent.yaml
name: agent
description: GTM orchestrator, inbound qualification, research, and Slack drafts.
model:
  name: openai/gpt-5.5
  thinking_level: medium
tools:
  - shell_command
  - apply_patch
  - update_plan
  - web_search
  - mcp:salesforce/get_lead
  - mcp:salesforce/get_account
  - mcp:gong/search_calls
  - mcp:slack/post_draft
  - mcp:apollo/enrich_contact
  - mcp:bigquery/query
subagents:
  - lead-research
  - sales-research
  - account-intel
skills:
  - outbound-playbook
  - gtm-workflow
system_instructions:
  mode: append
  content: |
    # Role: GTM orchestrator
    Run do-not-send checks, gather CRM + Gong + web
    context, draft relationship-aware outbound with
    rationale, post to Slack for rep approval.

Everything you need for production agents

Drop agents into your own product. Each user gets an agent that acts as them, with conversations, files, and memory sealed to that user.

checkout.tsx
import { IntrospectionClient } from "@introspection-sdk/introspection-browser";

// 1 · your user signed in through your IdP, cookie-auth, no API key
const pi = new IntrospectionClient();

// 2 · run an agent that acts as them, and stream it live
const run = await pi.tasks.create({
  runtime: "support-triage",
  prompt: "Where’s my refund?",
});
for await (const event of run.stream()) render(event);

// 3 · their conversations, files, and memory stay sealed to them
const files = await pi.files.list();
const convo = await pi.conversations.retrieve(run.conversationId);
const link = await pi.shares.create({ conversation: convo.id });

Bring your own auth

Your users sign in through your IdP (Supabase, Auth0, Okta) and the agent acts as them. Cookie-auth in the browser, no key shipped.

Secret-free sandboxes

Connect any MCP server or API as a tool. A reverse proxy authorizes every call and injects credentials at the edge, so secrets never reach the sandbox.
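The idea of authorizing a call and injecting credentials outside the sandbox can be sketched as follows. This is a simplified illustration under stated assumptions: the `OutboundRequest` shape, the `credentialsByHost` table, the example host, and `authorizeAndInject` are all hypothetical, and a real proxy would operate on actual HTTP traffic.

```typescript
// Hypothetical request shape; a real proxy would work on HTTP requests.
interface OutboundRequest {
  host: string;
  path: string;
  headers: Record<string, string>;
}

// Secrets live only on the proxy side, keyed by allowed host (placeholder values).
const credentialsByHost: Record<string, string> = {
  "api.notion.example": "Bearer <notion-token>",
};

// Authorize the destination, then attach the credential to a copy of the request.
// The sandbox builds the request without the secret; only the proxy runs this step.
function authorizeAndInject(req: OutboundRequest): OutboundRequest {
  const allowed = Object.prototype.hasOwnProperty.call(credentialsByHost, req.host);
  if (!allowed) {
    throw new Error("host not allowed: " + req.host);
  }
  return {
    ...req,
    headers: { ...req.headers, Authorization: credentialsByHost[req.host] },
  };
}
```

Because the function returns a new request instead of mutating its input, the object the sandbox created never carries the credential.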

Durable, resumable tasks

Kick off a long-running task, disconnect, and reconnect later: it keeps running and replays from where you left off.
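One common way to get this behavior is an append-only event log with a client-side cursor. The sketch below shows that technique only; `TaskLog`, `TaskEvent`, and `replayAfter` are hypothetical names, and nothing here is taken from the platform's implementation.

```typescript
// Hypothetical event record: a sequence number plus an opaque payload.
interface TaskEvent {
  seq: number;
  data: string;
}

// Append-only log kept on the server while the task keeps running.
class TaskLog {
  private events: TaskEvent[] = [];

  // Record a new event with the next sequence number.
  append(data: string): TaskEvent {
    const event: TaskEvent = { seq: this.events.length + 1, data };
    this.events.push(event);
    return event;
  }

  // On reconnect, return everything after the last event the client saw.
  replayAfter(lastSeenSeq: number): TaskEvent[] {
    return this.events.filter((event) => event.seq > lastSeenSeq);
  }
}
```

A client that disconnected after event 1 would call `replayAfter(1)` on reconnect and receive only the events it missed.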

Conversations

Full per-user traces (turns, tool calls, tokens, and cost) queryable over the API.

Files & memory

Per-user files your agents read and write: versioned, durable memory that persists across sessions.

Fork & branch

Share any conversation via a revocable link, then fork it into a new task to retry or explore with full history intact.
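Forking with history intact can be pictured as copying the turns into a new conversation that remembers its parent. The `Conversation` shape and `fork` helper below are hypothetical and only illustrate the branching idea.

```typescript
// Hypothetical conversation record with a link back to where it was forked from.
interface Conversation {
  id: string;
  parentId: string | null;
  turns: string[];
}

// A fork starts with a copy of the full history; later turns diverge independently.
function fork(source: Conversation, newId: string): Conversation {
  return { id: newId, parentId: source.id, turns: [...source.turns] };
}
```

Copying the turns, instead of sharing the array, is what lets a retry on the branch leave the original conversation untouched.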

With primitives for continual learning

Every deployment becomes a self-improving loop. You set the direction. The infrastructure makes it happen.

judges/escalation_on_frustration.yaml
judge: escalation_on_frustration
description: >
  Over the whole conversation, did the agent
  hand off to a human once the customer
  showed frustration?
model:
  name: "openrouter:deepseek/deepseek-v4-flash"
  temperature: 0
instructions: |
  Read the full trajectory. If the customer
  ever showed frustration (anger, repetition,
  threats to leave), pass only when the agent
  then called the escalate_to_human tool. Fail
  if frustration went unescalated. Skip when
  the customer was never frustrated.

Experiment: baseline@a1b2c3 vs. candidate@e7f8a9 and candidate@d4e5f6 (▲ +9.2pt)

Traces become patterns

A single trace explains one request. A pattern tells you whether the same failure is happening often enough to act on, named and counted automatically.
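The step from traces to patterns is essentially grouping and counting. The sketch below assumes each trace already carries a failure label (how labels are produced is not described on this page); `Trace`, `Pattern`, and `findPatterns` are hypothetical names.

```typescript
// Hypothetical trace record; failureLabel is null when nothing went wrong.
interface Trace {
  id: string;
  failureLabel: string | null;
}

interface Pattern {
  name: string;
  count: number;
}

// Group failed traces by label and keep only labels seen at least minCount times.
function findPatterns(traces: Trace[], minCount: number): Pattern[] {
  const counts: Record<string, number> = Object.create(null);
  for (const trace of traces) {
    if (trace.failureLabel === null) continue;
    counts[trace.failureLabel] = (counts[trace.failureLabel] || 0) + 1;
  }
  return Object.keys(counts)
    .map((name) => ({ name, count: counts[name] }))
    .filter((pattern) => pattern.count >= minCount)
    .sort((a, b) => b.count - a.count);
}
```

The `minCount` threshold is the "enough to act on" part: a one-off failure stays a trace, a recurring one becomes a pattern.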

Patterns become judges

Each pattern becomes a judge: graded code committed beside the recipe, pinned to the commit, running online on live traffic.
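The example judge on this page has three outcomes: pass, fail, and skip. The sketch below is a deterministic stand-in for that rule, useful for seeing how verdicts could roll up into a pass rate. It assumes turns arrive pre-labeled with `customerFrustrated`, which in the YAML judge is the model's job, and it assumes skipped conversations are excluded from the pass rate; both assumptions and all type names are hypothetical.

```typescript
type Verdict = "pass" | "fail" | "skip";

// Hypothetical, pre-labeled turn; the YAML judge reads the raw trajectory instead.
interface Turn {
  customerFrustrated: boolean;
  toolCalls: string[];
}

// Pass only if escalate_to_human is called at or after the first frustrated turn.
function judgeEscalation(turns: Turn[]): Verdict {
  let frustratedAt = -1;
  for (let i = 0; i < turns.length; i++) {
    if (turns[i].customerFrustrated) {
      frustratedAt = i;
      break;
    }
  }
  if (frustratedAt === -1) return "skip"; // customer was never frustrated
  for (let i = frustratedAt; i < turns.length; i++) {
    if (turns[i].toolCalls.indexOf("escalate_to_human") !== -1) return "pass";
  }
  return "fail";
}

// Pass rate over graded conversations only; null when everything was skipped.
function passRate(verdicts: Verdict[]): number | null {
  const graded = verdicts.filter((verdict) => verdict !== "skip");
  if (graded.length === 0) return null;
  return graded.filter((verdict) => verdict === "pass").length / graded.length;
}
```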

Experiments prove changes

Candidates are A/B tested on real users, bucketed per end user, and reported with P(better), so every fix is proven in production before you merge it.
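Two pieces of this can be sketched in a few lines: sticky per-user bucketing and a P(better) estimate. The code below is one standard way to do each, not the platform's method: bucketing hashes the experiment and user IDs with FNV-1a, and P(better) is a Monte Carlo estimate under independent Beta posteriors with a uniform prior. All function names, the seed, and the sample count are assumptions.

```typescript
// FNV-1a (32-bit) over UTF-16 code units: a stable, dependency-free string hash.
function hash32(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to unsigned 32 bits.
    h = (h + ((h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24))) >>> 0;
  }
  return h;
}

type Arm = "baseline" | "candidate";

// The same end user always lands in the same arm of a given experiment.
function assignArm(experimentId: string, userId: string, candidatePercent: number): Arm {
  const bucket = hash32(experimentId + ":" + userId) % 100; // 0..99
  return bucket < candidatePercent ? "candidate" : "baseline";
}

// Seeded linear congruential generator so the estimate is reproducible.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // in [0, 1)
  };
}

// Gamma(k, 1) for integer k, drawn as a sum of k exponential samples.
function sampleGammaInt(k: number, rand: () => number): number {
  let sum = 0;
  for (let i = 0; i < k; i++) sum -= Math.log(1 - rand());
  return sum;
}

// Beta(a, b) for integer a and b, built from two gamma draws.
function sampleBetaInt(a: number, b: number, rand: () => number): number {
  const x = sampleGammaInt(a, rand);
  const y = sampleGammaInt(b, rand);
  return x / (x + y);
}

interface ArmResult {
  passes: number;
  fails: number;
}

// Monte Carlo estimate of P(candidate pass rate > baseline pass rate)
// under independent Beta(1 + passes, 1 + fails) posteriors.
function probabilityBetter(baseline: ArmResult, candidate: ArmResult, samples = 20000, seed = 42): number {
  const rand = seededRandom(seed);
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    const b = sampleBetaInt(1 + baseline.passes, 1 + baseline.fails, rand);
    const c = sampleBetaInt(1 + candidate.passes, 1 + candidate.fails, rand);
    if (c > b) wins++;
  }
  return wins / samples;
}
```

For example, `probabilityBetter({ passes: 72, fails: 28 }, { passes: 81, fails: 19 })` estimates how likely the candidate is to be the better arm given those hypothetical counts. Hashing the user ID, instead of drawing randomly per request, is what keeps an end user's experience consistent across conversations.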

Learnings compound over time

Every cycle leaves patterns, judges, and proven changes keyed to your commits: a record of what works that no fresh start can copy.

Deployed on frontier infrastructure

Your agents run in a single-tenant data plane inside your own cloud. You ship the recipe; the boundary stays yours.

Single-tenant data plane

Your agents run in your own cloud: your region, your VPC, encrypted at rest, never co-mingled with other tenants.

Confidential containers

Every run executes in a confidential, hardened sandbox in your own infrastructure: strong isolation with no performance hit, torn down on completion so nothing persists.

Secrets are never exposed

Agents reach only the hosts you allow, and credentials are injected at the edge, so your keys never enter the sandbox.

Scoped access & roles

Tie every agent, key, and action to your identity provider, gated by scoped role-based access.
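Scoped role-based access usually reduces to one question: does any role held by this identity grant the requested scope? The sketch below shows that check. The role names, scope strings, and the `Principal` shape are invented for illustration and do not describe the platform's actual roles.

```typescript
// Hypothetical identity as asserted by your identity provider.
interface Principal {
  subject: string;
  roles: string[];
}

// Hypothetical role-to-scope table.
const scopesByRole: Record<string, string[]> = {
  viewer: ["conversations:read"],
  operator: ["conversations:read", "runtimes:deploy"],
};

// An action is allowed when at least one of the principal's roles grants the scope.
function isAllowed(principal: Principal, scope: string): boolean {
  for (const role of principal.roles) {
    const known = Object.prototype.hasOwnProperty.call(scopesByRole, role);
    const scopes = known ? scopesByRole[role] : [];
    if (scopes.indexOf(scope) !== -1) return true;
  }
  return false;
}
```

Unknown roles grant nothing, so the check fails closed.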

Built-in observability

Every run is traced end-to-end (turns, tool calls, tokens, and cost), streamed to your own stack and pinned to the exact commit that ran.

Managed models or BYOK

Start on managed keys, or bring your own provider accounts: your spend, under your own data agreements.

Start building with Introspection.