v0.2 · deepagents runtime · playbooks · learnings · evals

The open framework for production-ready agents and agentic workflows.

Build, ship, and observe AI work in your own infra. Git-backed context. Typed plugins. MCP-native. Full observability — without the hosted-SaaS lock-in.

deepagents runtime + subagents · Playbooks, learnings, evals, budgets · MCP-native + 12 connectors · Self-host · Apache 2.0

AI breaks when it leaves the demo.

Most teams get their first agentic workflow working by stitching prompts into app code, bots, cron jobs, and internal tools. Then things drift.

Vocion gives you one runtime for AI work that has to hold up in production.

Five resources to author AI work.

One runtime to operate it.

Vocion stays small on purpose. These five resources are the authoring surface. Everything else is runtime.

Agents are optional. The runtime works just as well for deterministic, reviewed workflows.

Two compositional primitives.

Authored once. Mounted into every relevant agent.

v0.2 added two primitives that compose on top of the five resources — for the procedural knowledge and continuous improvement that agentic systems need to stay accurate.

v0.2
Playbook

Markdown + YAML the agent reads on demand. Procedural guides for "how we draft a proposal" or "how we triage a meeting." Resources (REFERENCE.html, COMPONENTS.md) ride along. Per-agent playbookTags decide what mounts where. Lazy-loaded — no bloat in the per-turn prompt.
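A playbook might look like the sketch below: YAML frontmatter plus a procedural guide in one SKILL.md. Only playbookTags and the sibling-resource convention come from the text above; the other field names and the steps are illustrative.

```markdown
---
name: ece-proposal
description: How we draft a proposal from a discovery call.
playbookTags: [proposal_drafting]
resources:           # sibling files mounted alongside the guide
  - REFERENCE.html
  - COMPONENTS.md
---

# Drafting a proposal

1. Pull the latest discovery notes for the deal.
2. Follow the section order in REFERENCE.html.
3. Flag any pricing outside the approved bands for human review.
```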

v0.2
Learning

Whitelisted rule buckets ("global", "meeting_triage", "proposal_drafting"…). Rules are added at runtime by the self-improver subagent after the user explicitly approves a candidate. Trigram dedup at 0.72 keeps the store clean. The agent reads its applicable rules as /learnings/<step>.md on every turn.
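The trigram dedup can be pictured as Jaccard similarity over character trigrams, rejecting any candidate within 0.72 of a stored rule. A minimal sketch, assuming this is the comparison being made — function names are illustrative, not Vocion's API:

```python
def trigrams(text: str) -> set[str]:
    """Character trigrams of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def is_duplicate(candidate: str, stored: list[str], threshold: float = 0.72) -> bool:
    """Reject a candidate rule that is too close to any stored rule."""
    return any(similarity(candidate, rule) >= threshold for rule in stored)
```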

One runtime, every interface.

Author once. Trigger and review from wherever your team already works. Speak MCP, and every Claude-side client can call your agents as tools.

Run from
web · MCP server · Slack · Teams · CLI · your own apps · scheduled jobs · API triggers
What stays the same underneath
  • context version
  • workflow logic
  • approvals
  • audit trail
  • trace spans
  • output history

No more separate prompt stacks for each surface.

Connect what you already run.

Built for real business systems, not toy demos. Twelve first-class connectors today; typed source plugins when you need more control.

Gmail · HubSpot · Zoom · Slack · Postgres · Stripe · Zendesk · Google Drive · Notion · Salesforce · Custom REST · Webhooks

Starter connectors and source patterns first. Typed source plugins when you need more control.

The operating loop that makes agentic systems usable.

Most AI stacks stop at generation. Vocion ships the five primitives every production agentic system needs — human review, observability, evals, self-improvement, and compute budgets.

Human-in-the-loop

The request_human_review tool pauses a run for approval. Comments on Drive decks and Slack reactions flow into the same queue.
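One way the pause/approve flow could be modeled: a run flips to a paused state when request_human_review fires, the draft lands in a shared queue, and resolving the review updates the run. Everything except the request_human_review name is an illustrative assumption:

```python
from dataclasses import dataclass, field
from enum import Enum

class RunState(Enum):
    RUNNING = "running"
    PAUSED = "paused"      # waiting in the review queue
    APPROVED = "approved"
    REJECTED = "rejected"

@dataclass
class Run:
    id: str
    state: RunState = RunState.RUNNING

@dataclass
class ReviewQueue:
    pending: dict[str, str] = field(default_factory=dict)

    def request_human_review(self, run: Run, draft: str) -> None:
        """Pause the run and park its draft for a human."""
        run.state = RunState.PAUSED
        self.pending[run.id] = draft

    def resolve(self, run: Run, approved: bool) -> None:
        """Approve or reject; either way the run leaves the queue."""
        run.state = RunState.APPROVED if approved else RunState.REJECTED
        self.pending.pop(run.id, None)
```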

Full observability

Every LLM call, tool span, and subagent dispatch lands in Langfuse — joined to the context SHA that produced it.

Eval-driven development

npm run eval:run scores datasets via LLM judge. Stamp every run with its context SHA. Pass-rate < 0.8 fails CI.
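The CI gate reduces to a threshold check over judge scores. A minimal sketch, assuming scores in [0, 1] and a hypothetical 0.5 per-example passing bar (the LLM-judge scoring itself is not shown; only the 0.8 pass-rate gate comes from the text):

```python
def pass_rate(scores: list[float], passing: float = 0.5) -> float:
    """Fraction of judged examples scoring at or above `passing`."""
    return sum(s >= passing for s in scores) / len(scores)

def ci_gate(scores: list[float], threshold: float = 0.8) -> int:
    """Exit code for CI: 0 when the run passes, 1 when pass rate < threshold."""
    return 0 if pass_rate(scores) >= threshold else 1
```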

Self-improving

The self-improver subagent watches feedback, proposes rules, and (after your explicit approval) commits them as learning rows the agent reads on every relevant turn.

Compute budgets

Token and dollar caps per agent, per period. Hard cap refuses new runs. Soft cap warns. Cache reads billed at 10% per the model card.
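The cap logic described above can be sketched as an admission check: refuse past the hard cap, warn past a soft fraction of it, and bill cache reads at 10% of the normal token price. Field names and the 0.8 soft ratio are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class Budget:
    dollar_cap: float        # hard cap for the period
    soft_ratio: float = 0.8  # warn above this fraction of the cap (assumed)
    spent: float = 0.0

def charge(input_tokens: int, cached_tokens: int, price_per_token: float) -> float:
    """Cost of a call; cache reads billed at 10% per the model card."""
    return (input_tokens * price_per_token) + (cached_tokens * price_per_token * 0.10)

def admit(budget: Budget, estimated_cost: float) -> str:
    """'refuse' at the hard cap, 'warn' past the soft cap, else 'ok'."""
    projected = budget.spent + estimated_cost
    if projected > budget.dollar_cap:
        return "refuse"
    if projected > budget.dollar_cap * budget.soft_ratio:
        return "warn"
    return "ok"
```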

Review changes in PRs, not screenshots.

Every resource lives in git as YAML and markdown.

  1. edit operation.yaml, SKILL.md, or prompt.md
  2. commit the change
  3. review it in a PR
  4. apply it to the runtime
  5. run and review with a stamped context version
context/<org>/
  agents/
    sales-assistant/
      agent.yaml          # slug, prompt, subagents, suggestions
      system-prompt.md
  operations/             # v0.2: typed LLM calls (was skills/)
    draft_followup/
      operation.yaml
      prompt.md
      evals.yaml
  playbooks/              # v0.2: markdown the agent reads on demand
    ece-proposal/
      SKILL.md            # YAML frontmatter + procedural guide
      REFERENCE.html      # sibling resources ride along
  learnings/              # v0.2: whitelisted rule-step buckets
    global.yaml
    meeting_triage.yaml
  evals/                  # v0.2: agent eval datasets
    sales-assistant-baseline.yaml
  workflows/
    discovery_followup/
      workflow.yaml
  objects/
    deal/
      type.yaml

Same folder pattern across every resource: structured definition · LLM-facing content · evals · notes. Easy to author, easy to diff, easy to test.
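For the draft_followup operation in the tree above, the structured definition might look like this. Every field name here is a hypothetical illustration of the "structured definition · LLM-facing content · evals" pattern, not Vocion's actual schema:

```yaml
# context/<org>/operations/draft_followup/operation.yaml
slug: draft_followup
prompt: prompt.md        # LLM-facing content lives in the sibling file
input:
  transcript: string
  deal_id: string
output:
  followup_email: string
evals: evals.yaml        # dataset scored by npm run eval:run
```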

Author → Apply → Run → Review → Audit.

One loop, every interface.

  1. Author

     Edit an operation.yaml, workflow.yaml, SKILL.md playbook, or prompt.md in your editor.

  2. Apply

     Reconcile authored context into the runtime and stamp a new context version.

  3. Run

     Trigger from web, Slack, Teams, CLI, your app, or a scheduled workflow.

  4. Review

     Drafts and paused workflows land in one queue. Approve, reject, revise, resume.

  5. Log

     Trace any output back to the exact context version, inputs, retrieval hits, and runtime path that produced it.

Start with a real business workflow, not a blank canvas.

Vocion ships best when you begin with something your team already does every week.

Start with prompts. Graduate to code when the logic gets real.

Start fast with YAML and markdown. Move to typed plugins when the workflow needs stronger contracts, richer logic, or external actions.

This is not a throwaway prototype path. It is the intended upgrade path.
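The "stronger contracts" end of that path could look like a structural interface that every source plugin satisfies. A minimal sketch; the SourcePlugin name, its methods, and the Postgres example are all illustrative assumptions, not Vocion's plugin API:

```python
from typing import Any, Iterable, Protocol

class SourcePlugin(Protocol):
    """Hypothetical shape of a typed source plugin."""
    slug: str

    def fetch(self, query: dict[str, Any]) -> Iterable[dict[str, Any]]:
        """Return records for the agent's retrieval step."""
        ...

class PostgresDeals:
    """Toy plugin backed by an in-memory list standing in for a real connection."""
    slug = "postgres-deals"

    def __init__(self, rows: list[dict[str, Any]]):
        self.rows = rows

    def fetch(self, query: dict[str, Any]) -> Iterable[dict[str, Any]]:
        stage = query.get("stage")
        return [r for r in self.rows if stage is None or r["stage"] == stage]
```

Because Protocol uses structural subtyping, PostgresDeals needs no inheritance; the type checker enforces the contract at the boundary.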

Built for engineers who want leverage and control.

Vocion is for teams that care about:

reproducibility · reviewability · typed boundaries · runtime consistency · MCP-native · operational visibility · self-hosted deployment

Not just "agents."

Open source by default.

Vocion is Apache 2.0 and designed to run on your infrastructure.

Managed services can sit on top later if you want them. The framework does not depend on them.

Need help shipping it in a real business?

MetaCTO uses Vocion to design and deploy production AI workflows for revenue teams, support orgs, operating teams, and internal platforms. If you want help implementing, hosting, or customizing it, work with the team behind the framework.

Framework first. Services if you want them.

Build agents that survive production.

Subagents, playbooks, learnings, evals, budgets, HITL — out of the box. Your code, your infra, your data.