Introduction
AI agents are hard to operate: they behave differently on every input and fail in ways that don’t surface as errors, so dashboards built for microservices can’t tell you whether last week’s change actually helped. Introspection is built for this. You define each agent as a recipe (a versioned, git-backed package of its behavior), deploy it to commit-pinned infrastructure, and run it as a managed runtime, where every run is captured as structured behavior you can query, grade, and A/B test.
From local recipe to self-improving agent
- Build locally. Author and run your agent as a recipe with the pi-recipes CLI (beta): `recipes create` to scaffold, `pi --recipe <name>` to run; iterate before anything ships.
- Deploy with Introspection. Push the recipe to git, pin it to a commit, and register it as a runtime on an environment lane. Every run is now captured and replayable.
- Improve in production. Turn that captured behavior into evidence-based change with the operator loop below.
The operator loop
Each stage maps to a concept you’ll meet throughout these docs:
Connect a recipe as a runtime → observe every task as a conversation with synthesized observations and patterns → investigate top-down with structured filtering → define “good” as a judge → experiment on live traffic → ship & guard the winner so quality can’t silently drift.
See the guides for the loop in full, or Core Concepts for the entities it acts on.
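To make the "judge" and "experiment" stages of the loop concrete, here is a minimal conceptual sketch in plain Python. It is not Introspection's API: the `Run` shape, the `cites_source` judge, the `compare` helper, and the variant names are all hypothetical, chosen only to show the idea of scoring captured runs with a judge and comparing two recipe versions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class Run:
    """A captured run. Hypothetical shape, not Introspection's schema."""
    variant: str  # which recipe version handled the task
    output: str   # the agent's final answer


# In this sketch, a judge is any function that maps a run to a score in [0, 1].
Judge = Callable[[Run], float]


def cites_source(run: Run) -> float:
    """Toy judge (assumption): 1.0 if the answer carries a source marker."""
    return 1.0 if "[source]" in run.output else 0.0


def compare(runs: list[Run], judge: Judge) -> dict[str, float]:
    """Group runs by variant and return the mean judge score for each."""
    scores: dict[str, list[float]] = {}
    for run in runs:
        scores.setdefault(run.variant, []).append(judge(run))
    return {variant: mean(values) for variant, values in scores.items()}


# Made-up runs for illustration only.
runs = [
    Run("baseline", "Paris."),
    Run("baseline", "Paris [source]."),
    Run("candidate", "Paris [source]."),
    Run("candidate", "Lyon is not the capital; Paris is [source]."),
]

result = compare(runs, cites_source)
print(result)
```

A real judge would encode whatever "good" means for your agent, and a real experiment would draw its runs from live traffic rather than a hand-written list; the sketch only shows why a judge turns captured behavior into something two versions can be compared on.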
Get started
- Quickstart: deploy a recipe and run your first task
- Core Concepts: the entities and how they fit together
- API Reference: the REST API for runtimes, tasks, and experiments
