Build, ship, and observe AI work in your own infra. Git-backed context. Typed plugins. MCP-native. Full observability — without the hosted-SaaS lock-in.
Most teams get their first agentic workflow working by stitching prompts into app code, bots, cron jobs, and internal tools. Then things drift.
Vocion gives you one runtime for AI work that has to hold up in production.
Vocion stays small on purpose. These five resources are the authoring surface. Everything else is runtime.
Connected systems that feed raw data in. Zoom, Gmail, HubSpot, Postgres, your own APIs. Typed and authored per tenant.
The business entities you care about. Account, Deal, Ticket, Incident. Canonical grounded records every run reads from.
Typed LLM call. Zod schemas in, Zod schemas out. Approval-gated when it matters. Authored as a prompt today, swapped for a plugin tomorrow under the same slug.
A sequence of Operations with human approval gates where it matters. Durable on Postgres. Resumable from any interface.
A named identity with a system prompt, tool surface, subagents, and budget. Runs on the deepagents runtime — virtual FS, write_todos, subagent dispatch, full observability.
Agents are optional. The runtime works just as well for deterministic reviewed workflows.
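The Operation contract described above ("Zod schemas in, Zod schemas out") can be sketched in plain TypeScript. This is a hypothetical illustration, not the real Vocion API: Zod itself is replaced by a minimal stand-in validator, and all names (followupInput, runDraftFollowup) are assumptions.

```typescript
// Sketch of a "typed LLM call": validate input on the way in and the model's
// response on the way out, so malformed data fails loudly instead of leaking.
interface Schema<T> { parse(value: unknown): T }

// Stand-in for something like z.object({ notes: z.string() }).
const followupInput: Schema<{ notes: string }> = {
  parse(value) {
    const v = value as { notes?: unknown };
    if (typeof v?.notes !== "string") throw new Error("notes must be a string");
    return { notes: v.notes };
  },
};

const followupOutput: Schema<{ draft: string }> = {
  parse(value) {
    const v = value as { draft?: unknown };
    if (typeof v?.draft !== "string") throw new Error("draft must be a string");
    return { draft: v.draft };
  },
};

function runDraftFollowup(raw: unknown): { draft: string } {
  const input = followupInput.parse(raw);
  // Stubbed model call; a real operation would render prompt.md and call an LLM.
  const modelResponse = { draft: `Thanks for the call. Notes: ${input.notes}` };
  return followupOutput.parse(modelResponse);
}
```

Because callers only see the schema pair, the prompt-backed body can later be swapped for a plugin without touching any call site.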
v0.2 added two primitives that compose on top of the five resources — for the procedural knowledge and continuous improvement that agentic systems need to stay accurate.
Markdown + YAML the agent reads on demand. Procedural guides for "how we draft a proposal" or "how we triage a meeting". Resources (REFERENCE.html, COMPONENTS.md) ride along. Per-agent playbookTags decide what mounts where. Lazy-loaded, so nothing bloats the per-turn prompt.
Whitelisted rule buckets ("global", "meeting_triage", "proposal_drafting", …). Rules are added at runtime by the self-improver subagent after the user explicitly approves a candidate. Trigram dedup at a 0.72 similarity threshold keeps the store clean. The agent reads its applicable rules as /learnings/<step>.md on every turn.
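A minimal sketch of what trigram dedup at a 0.72 threshold could look like, assuming Jaccard similarity over character trigrams; the exact metric and function names Vocion uses are not specified here.

```typescript
// Break a string into its set of character trigrams.
function trigrams(text: string): Set<string> {
  const s = text.toLowerCase().replace(/\s+/g, " ").trim();
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= s.length; i++) grams.add(s.slice(i, i + 3));
  return grams;
}

// Jaccard index over trigram sets: |A ∩ B| / |A ∪ B|.
function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  for (const g of ta) if (tb.has(g)) shared++;
  return shared / (ta.size + tb.size - shared);
}

// A candidate rule is dropped when it is too close to an existing one.
function isDuplicate(candidate: string, existing: string[], threshold = 0.72): boolean {
  return existing.some((rule) => trigramSimilarity(candidate, rule) >= threshold);
}
```

Near-identical rewordings score well above 0.72 and get rejected, while genuinely new rules pass through.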
Author once. Trigger and review from wherever your team already works. Speak MCP, and every Claude-side client can call your agents as tools.
No more separate prompt stacks for each surface.
Built for real business systems, not toy demos. Twelve first-class connectors today; typed source plugins when you need more control.
Start with the built-in connectors and source patterns; reach for typed source plugins when an integration needs more control.
Most AI stacks stop at generation. Vocion ships the five primitives every production agentic system needs — human review, observability, evals, self-improvement, and compute budgets.
The request_human_review tool pauses a run for approval. Comments on Drive decks and Slack reactions flow into the same queue.
Every LLM call, tool span, and subagent dispatch lands in Langfuse — joined to the context SHA that produced it.
npm run eval:run scores datasets via an LLM judge. Every run is stamped with its context SHA. A pass rate below 0.8 fails CI.
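The CI gate can be sketched in a few lines. The result shape and function names here are assumptions, not Vocion's actual eval format.

```typescript
// One judged run from an eval dataset.
interface EvalResult {
  runId: string;
  contextSha: string; // the context version that produced the run
  passed: boolean;    // LLM-judge verdict
}

// Fraction of runs the judge passed.
function passRate(results: EvalResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.passed).length / results.length;
}

// CI fails when the pass rate drops below the threshold.
function ciGate(results: EvalResult[], threshold = 0.8): "pass" | "fail" {
  return passRate(results) >= threshold ? "pass" : "fail";
}
```

Stamping each result with its context SHA is what makes a regression traceable to the exact authored context that caused it.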
The self-improver subagent watches feedback, proposes rules, and (after your explicit approval) commits them as learning rows the agent reads on every relevant turn.
Token and dollar caps per agent, per period. Hard cap refuses new runs. Soft cap warns. Cache reads billed at 10% per the model card.
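An illustrative cost-and-cap model, assuming cache reads bill at 10% of the input-token rate as the line above states. The per-million-token prices are made-up placeholders, not real rates, and the function names are assumptions.

```typescript
interface Usage {
  inputTokens: number;
  cachedInputTokens: number; // billed at 10% of the input rate
  outputTokens: number;
}

// Dollar cost for one run, with placeholder prices per 1M tokens.
function costUsd(u: Usage, inputPerM = 3, outputPerM = 15): number {
  const input = (u.inputTokens / 1_000_000) * inputPerM;
  const cached = (u.cachedInputTokens / 1_000_000) * inputPerM * 0.1;
  const output = (u.outputTokens / 1_000_000) * outputPerM;
  return input + cached + output;
}

// Hard cap refuses new runs; soft cap only warns.
function checkBudget(spentUsd: number, softCapUsd: number, hardCapUsd: number): "ok" | "warn" | "refuse" {
  if (spentUsd >= hardCapUsd) return "refuse";
  if (spentUsd >= softCapUsd) return "warn";
  return "ok";
}
```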
Every resource lives in git as YAML and markdown: operation.yaml, SKILL.md, or prompt.md.

context/<org>/
  agents/
    sales-assistant/
      agent.yaml                      # slug, prompt, subagents, suggestions
      system-prompt.md
  operations/                         # v0.2: typed LLM calls (was skills/)
    draft_followup/
      operation.yaml
      prompt.md
      evals.yaml
  playbooks/                          # v0.2: markdown the agent reads on demand
    ece-proposal/
      SKILL.md                        # YAML frontmatter + procedural guide
      REFERENCE.html                  # sibling resources ride along
  learnings/                          # v0.2: whitelisted rule-step buckets
    global.yaml
    meeting_triage.yaml
  evals/                              # v0.2: agent eval datasets
    sales-assistant-baseline.yaml
  workflows/
    discovery_followup/
      workflow.yaml
  objects/
    deal/
      type.yaml

Same folder pattern across every resource: structured definition · LLM-facing content · evals · notes. Easy to author, easy to diff, easy to test.
One loop, every interface.
Edit an operation.yaml, workflow.yaml, SKILL.md playbook, or prompt.md in your editor.
Reconcile authored context into the runtime and stamp a new context version.
Trigger from web, Slack, Teams, CLI, your app, or a scheduled workflow.
Drafts and paused workflows land in one queue. Approve, reject, revise, resume.
Trace any output back to the exact context version, inputs, retrieval hits, and runtime path that produced it.
Vocion works best when you start with something your team already does every week.
Draft outbound follow-ups from CRM notes and call context, route them to review, keep a full trace of every message.
Turn inbound tickets into draft responses with human approval on edge cases and a complete audit trail.
Generate structured updates from raw metrics, review before distribution, keep one history of every run.
Start fast with YAML and markdown. Move to typed plugins when the workflow needs stronger contracts, richer logic, or external actions.
This is not a throwaway prototype path. It is the intended upgrade path.
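What a typed plugin could look like is sketched below. The task() helper and its signature are assumptions modeled on the @vocion/sdk snippet nearby, not the real SDK surface.

```typescript
// A named task pairs a slug with typed logic.
type Task<I, O> = { name: string; run: (input: I) => O };

// Hypothetical stand-in for the SDK's task("name", ...) helper.
function task<I, O>(name: string, run: (input: I) => O): Task<I, O> {
  return { name, run };
}

// The same slug that was backed by prompt.md yesterday can now carry
// real logic and stronger input/output contracts.
const draftFollowup = task(
  "draft_followup",
  (input: { contact: string; notes: string }) => ({
    subject: `Following up with ${input.contact}`,
    body: input.notes.trim(),
  }),
);
```

Because runs are keyed by slug, triggers, review queues, and traces carry over unchanged when the implementation upgrades.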
@vocion/sdk · task("name", "...")

Vocion is for teams that care about:
Not just "agents."
Vocion is Apache 2.0 and designed to run on your infrastructure.
Managed services can sit on top later if you want them. The framework does not depend on them.
MetaCTO uses Vocion to design and deploy production AI workflows for revenue teams, support orgs, operating teams, and internal platforms. If you want help implementing, hosting, or customizing it, work with the team behind the framework.
Framework first. Services if you want them.
Subagents, playbooks, learnings, evals, budgets, HITL — out of the box. Your code, your infra, your data.