# MCP, A2A, and AG-UI

Source: https://docs.ag-ui.com/agentic-protocols

Understanding how AG-UI complements and works with MCP and A2A

## Agentic Protocols

The agentic ecosystem is rapidly organizing around a family of open, complementary protocols — each addressing a distinct layer of interaction. AG-UI has emerged as the third leg of the AI protocol stack:
*[Image: AI Protocol Stack]*
You can connect your application to agents directly via **AG-UI**, **MCP**, and **A2A**.

* **MCP** (Model Context Protocol): Connects agents to tools and to context — but those tools are themselves becoming agentic.
* **A2A** (Agent to Agent): Connects agents to other agents.
* **AG-UI** (Agent–User Interaction): Connects agents to users (through user-facing applications).

You can think of AG-UI as the **"kitchen sink" protocol** — informed by bottom-up, real-world needs for building best-in-class agentic applications.

These three agentic protocols are complementary and have distinct technical goals; a single agent can and often does use all three simultaneously.

## AG-UI Handshakes with MCP and A2A

AG-UI contributors have recently added handshakes that let AG-UI "front for" agents exposed through the MCP and A2A protocols, so AG-UI client apps and libraries can seamlessly use agents that support MCP and A2A. AG-UI's mandate is to support the full set of building blocks required by modern agentic applications.

## Generative UI Specs

Several [generative UI specs](./concepts/generative-ui-specs) (including MCP-UI, Open JSON UI, and A2UI) have recently been released which allow agents to deliver UI widgets through the interaction protocols. AG-UI works with all of these. Visit our [generative UI specs page](./concepts/generative-ui-specs) to learn more.

# Agents

Source: https://docs.ag-ui.com/concepts/agents

Learn about agents in the Agent User Interaction Protocol

# Agents

Agents are the core components in the AG-UI protocol that process requests and generate responses. They establish a standardized way for front-end applications to communicate with AI services through a consistent interface, regardless of the underlying implementation.

## What is an Agent?

In AG-UI, an agent is a class that:

1. Manages conversation state and message history
2. Processes incoming messages and context
3. Generates responses through an event-driven streaming interface
4. Follows a standardized protocol for communication

Agents can be implemented to connect with any AI service, including:

* Large language models (LLMs) like GPT-4 or Claude
* Custom AI systems
* Retrieval augmented generation (RAG) systems
* Multi-agent systems

## Agent Architecture

All agents in AG-UI extend the `AbstractAgent` class, which provides the foundation for:

* State management
* Message history tracking
* Event stream processing
* Tool usage

```typescript theme={null}
import { AbstractAgent, RunAgent, RunAgentInput } from "@ag-ui/client"

class MyAgent extends AbstractAgent {
  run(input: RunAgentInput): RunAgent {
    // Implementation details
  }
}
```

### Core Components

AG-UI agents have several key components:

1. **Configuration**: Agent ID, thread ID, and initial state
2. **Messages**: Conversation history with user and assistant messages
3. **State**: Structured data that persists across interactions
4. **Events**: Standardized messages for communication with clients
5. **Tools**: Functions that agents can use to interact with external systems

## Agent Types

AG-UI provides different agent implementations to suit various needs:

### AbstractAgent

The base class that all agents extend. It handles core event processing, state management, and message history.
### HttpAgent A concrete implementation that connects to remote AI services via HTTP: ```typescript theme={null} import { HttpAgent } from "@ag-ui/client" const agent = new HttpAgent({ url: "https://your-agent-endpoint.com/agent", headers: { Authorization: "Bearer your-api-key", }, }) ``` ### Custom Agents You can create custom agents to integrate with any AI service by extending `AbstractAgent`: ```typescript theme={null} class CustomAgent extends AbstractAgent { // Custom properties and methods run(input: RunAgentInput): RunAgent { // Implement the agent's logic } } ``` ## Implementing Agents ### Basic Implementation To create a custom agent, extend the `AbstractAgent` class and implement the required `run` method: ```typescript theme={null} import { AbstractAgent, RunAgent, RunAgentInput, EventType, BaseEvent, } from "@ag-ui/client" import { Observable } from "rxjs" class SimpleAgent extends AbstractAgent { run(input: RunAgentInput): RunAgent { const { threadId, runId } = input return () => new Observable((observer) => { // Emit RUN_STARTED event observer.next({ type: EventType.RUN_STARTED, threadId, runId, }) // Send a message const messageId = Date.now().toString() // Message start observer.next({ type: EventType.TEXT_MESSAGE_START, messageId, role: "assistant", }) // Message content observer.next({ type: EventType.TEXT_MESSAGE_CONTENT, messageId, delta: "Hello, world!", }) // Message end observer.next({ type: EventType.TEXT_MESSAGE_END, messageId, }) // Emit RUN_FINISHED event observer.next({ type: EventType.RUN_FINISHED, threadId, runId, }) // Complete the observable observer.complete() }) } } ``` ## Agent Capabilities Agents in the AG-UI protocol provide a rich set of capabilities that enable sophisticated AI interactions: ### Interactive Communication Agents establish bi-directional communication channels with front-end applications through event streams. This enables: * Real-time streaming responses character-by-character * Immediate feedback loops between user and AI * Progress indicators for long-running operations * Structured data exchange in both directions ### Tool Usage Agents can use tools to perform actions and access external resources. Importantly, tools are defined and passed in from the front-end application to the agent, allowing for a flexible and extensible system: ```typescript theme={null} // Tool definition const confirmAction = { name: "confirmAction", description: "Ask the user to confirm a specific action before proceeding", parameters: { type: "object", properties: { action: { type: "string", description: "The action that needs user confirmation", }, importance: { type: "string", enum: ["low", "medium", "high", "critical"], description: "The importance level of the action", }, details: { type: "string", description: "Additional details about the action", }, }, required: ["action"], }, } // Running an agent with tools from the frontend agent.runAgent({ tools: [confirmAction], // Frontend-defined tools passed to the agent // other parameters }) ``` Tools are invoked through a sequence of events: 1. `TOOL_CALL_START`: Indicates the beginning of a tool call 2. `TOOL_CALL_ARGS`: Streams the arguments for the tool call 3. `TOOL_CALL_END`: Marks the completion of the tool call Front-end applications can then execute the tool and provide results back to the agent. 
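To make the sequence concrete, here is a minimal client-side sketch that accumulates argument deltas, executes the frontend-defined `confirmAction` tool from above, and feeds the outcome back to the agent. The event fields (`toolCallId`, `delta`) follow the event documentation; the exact shape of the tool result message (in particular the `toolCallId` field linking it to the call) is an assumption for illustration:

```typescript theme={null}
import { EventType } from "@ag-ui/client"

// Accumulate streamed argument chunks, keyed by tool call ID.
const pendingArgs: Record<string, string> = {}

agent.runAgent({ tools: [confirmAction] }).subscribe((event) => {
  switch (event.type) {
    case EventType.TOOL_CALL_START:
      pendingArgs[event.toolCallId] = ""
      break
    case EventType.TOOL_CALL_ARGS:
      // Argument deltas are JSON fragments; concatenate in arrival order.
      pendingArgs[event.toolCallId] += event.delta
      break
    case EventType.TOOL_CALL_END: {
      const args = JSON.parse(pendingArgs[event.toolCallId])
      // Execute the frontend-defined tool, e.g. show a confirmation dialog.
      const confirmed = window.confirm(`Confirm action: ${args.action}`)
      // Feed the outcome back as a tool message (assumed shape) so the
      // next run can continue reasoning with the result.
      agent.messages.push({
        id: `result_${event.toolCallId}`,
        role: "tool",
        toolCallId: event.toolCallId,
        content: JSON.stringify({ confirmed }),
      })
      break
    }
  }
})
```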
This bidirectional flow enables sophisticated human-in-the-loop workflows where: * The agent can request specific actions be performed * Humans can execute those actions with appropriate judgment * Results are fed back to the agent for continued reasoning * The agent maintains awareness of all decisions made in the process This mechanism is particularly powerful for implementing interfaces where AI and humans collaborate. For example, [CopilotKit](https://docs.copilotkit.ai/) leverages this exact pattern with their [`useCopilotAction`](https://docs.copilotkit.ai/guides/frontend-actions) hook, which provides a simplified way to define and handle tools in React applications. By keeping the AI informed about human decisions through the tool mechanism, applications can maintain context and create more natural collaborative experiences between users and AI assistants. ### State Management Agents maintain a structured state that persists across interactions. This state can be: * Updated incrementally through `STATE_DELTA` events * Completely refreshed with `STATE_SNAPSHOT` events * Accessed by both the agent and front-end * Used to store user preferences, conversation context, or application state ```typescript theme={null} // Accessing agent state console.log(agent.state.preferences) // State is automatically updated during agent runs agent.runAgent().subscribe((event) => { if (event.type === EventType.STATE_DELTA) { // State has been updated console.log("New state:", agent.state) } }) ``` ### Multi-Agent Collaboration AG-UI supports agent-to-agent handoff and collaboration: * Agents can delegate tasks to other specialized agents * Multiple agents can work together in a coordinated workflow * State and context can be transferred between agents * The front-end maintains a consistent experience across agent transitions For example, a general assistant agent might hand off to a specialized coding agent when programming help is needed, passing along the conversation context and specific requirements. ### Human-in-the-Loop Workflows Agents support human intervention and assistance: * Agents can request human input on specific decisions * Front-ends can pause agent execution and resume it after human feedback * Human experts can review and modify agent outputs before they're finalized * Hybrid workflows combine AI efficiency with human judgment This enables applications where the agent acts as a collaborative partner rather than an autonomous system. ### Conversational Memory Agents maintain a complete history of conversation messages: * Past interactions inform future responses * Message history is synchronized between client and server * Messages can include rich content (text, structured data, references) * The context window can be managed to focus on relevant information ```typescript theme={null} // Accessing message history console.log(agent.messages) // Adding a new user message agent.messages.push({ id: "msg_123", role: "user", content: "Can you explain that in more detail?", }) ``` ### Metadata and Instrumentation Agents can emit metadata about their internal processes: * Reasoning steps through custom events * Performance metrics and timing information * Source citations and reference tracking * Confidence scores for different response options This allows front-ends to provide transparency into the agent's decision-making process and help users understand how conclusions were reached. 
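As a sketch of how such metadata can flow end to end, an agent might emit a `CUSTOM` event from inside its `run()` observable (as in the `SimpleAgent` example above), and the client can route it to an instrumentation handler. The event name `performance.timing` and the `recordMetric` helper are hypothetical; `CUSTOM` events carry `name` and `value` fields as described in the Events documentation:

```typescript theme={null}
import { EventType } from "@ag-ui/client"

// Agent side, inside run(): surface timing metadata as a CUSTOM event.
observer.next({
  type: EventType.CUSTOM,
  name: "performance.timing", // hypothetical application-defined event name
  value: { phase: "retrieval", durationMs: 142 },
})

// Client side: route matching custom events to a metrics sink.
agent.runAgent().subscribe((event) => {
  if (event.type === EventType.CUSTOM && event.name === "performance.timing") {
    recordMetric(event.value) // hypothetical instrumentation helper
  }
})
```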
## Using Agents Once you've implemented or instantiated an agent, you can use it like this: ```typescript theme={null} // Create an agent instance const agent = new HttpAgent({ url: "https://your-agent-endpoint.com/agent", }) // Add initial messages if needed agent.messages = [ { id: "1", role: "user", content: "Hello, how can you help me today?", }, ] // Run the agent agent .runAgent({ runId: "run_123", tools: [], // Optional tools context: [], // Optional context }) .subscribe({ next: (event) => { // Handle different event types switch (event.type) { case EventType.TEXT_MESSAGE_CONTENT: console.log("Content:", event.delta) break // Handle other events } }, error: (error) => console.error("Error:", error), complete: () => console.log("Run complete"), }) ``` ## Agent Configuration Agents accept configuration through the constructor: ```typescript theme={null} interface AgentConfig { agentId?: string // Unique identifier for the agent description?: string // Human-readable description threadId?: string // Conversation thread identifier initialMessages?: Message[] // Initial messages initialState?: State // Initial state object } // Using the configuration const agent = new HttpAgent({ agentId: "my-agent-123", description: "A helpful assistant", threadId: "thread-456", initialMessages: [ { id: "1", role: "system", content: "You are a helpful assistant." }, ], initialState: { preferredLanguage: "English" }, }) ``` ## Agent State Management AG-UI agents maintain state across interactions: ```typescript theme={null} // Access current state console.log(agent.state) // Access messages console.log(agent.messages) // Clone an agent with its state const clonedAgent = agent.clone() ``` ## Conclusion Agents are the foundation of the AG-UI protocol, providing a standardized way to connect front-end applications with AI services. By implementing the `AbstractAgent` class, you can create custom integrations with any AI service while maintaining a consistent interface for your applications. The event-driven architecture enables real-time, streaming interactions that are essential for modern AI applications, and the standardized protocol ensures compatibility across different implementations. # Core architecture Source: https://docs.ag-ui.com/concepts/architecture Understand how AG-UI connects front-end applications to AI agents Agent User Interaction Protocol (AG-UI) is built on a flexible, event-driven architecture that enables seamless, efficient communication between front-end applications and AI agents. This document covers the core architectural components and concepts. ## Design Principles AG-UI is designed to be lightweight and minimally opinionated, making it easy to integrate with a wide range of agent implementations. The protocol's flexibility comes from its simple requirements: 1. **Event-Driven Communication**: Agents need to emit any of the 16 standardized event types during execution, creating a stream of updates that clients can process. 2. **Bidirectional Interaction**: Agents accept input from users, enabling collaborative workflows where humans and AI work together seamlessly. The protocol includes a built-in middleware layer that maximizes compatibility in two key ways: * **Flexible Event Structure**: Events don't need to match AG-UI's format exactly—they just need to be AG-UI-compatible. This allows existing agent frameworks to adapt their native event formats with minimal effort. 
* **Transport Agnostic**: AG-UI doesn't mandate how events are delivered, supporting various transport mechanisms including Server-Sent Events (SSE), webhooks, WebSockets, and more. This flexibility lets developers choose the transport that best fits their architecture.

This pragmatic approach makes AG-UI easy to adopt without requiring major changes to existing agent implementations or frontend applications.

## Architectural Overview

AG-UI follows a client-server architecture that standardizes communication between agents and applications:

```mermaid theme={null}
flowchart LR
    subgraph "Frontend"
        App["Application"]
        Client["AG-UI Client"]
    end
    subgraph "Backend"
        A1["AI Agent A"]
        P["Secure Proxy"]
        A2["AI Agent B"]
        A3["AI Agent C"]
    end
    App <--> Client
    Client <-->|"AG-UI Protocol"| A1
    Client <-->|"AG-UI Protocol"| P
    P <-->|"AG-UI Protocol"| A2
    P <-->|"AG-UI Protocol"| A3

    class P mintStyle;
    classDef mintStyle fill:#E0F7E9,stroke:#66BB6A,stroke-width:2px,color:#000000;

    style App rx:5, ry:5;
    style Client rx:5, ry:5;
    style A1 rx:5, ry:5;
    style P rx:5, ry:5;
    style A2 rx:5, ry:5;
    style A3 rx:5, ry:5;
```

* **Application**: User-facing apps (e.g., chat or any AI-enabled application).
* **AG-UI Client**: Generic communication clients like `HttpAgent` or specialized clients for connecting to existing protocols.
* **Agents**: Backend AI agents that process requests and generate streaming responses.
* **Secure Proxy**: Backend services that provide additional capabilities and act as a secure proxy.

## Core components

### Protocol layer

AG-UI's protocol layer provides a flexible foundation for agent communication.

* **Universal compatibility**: Connect to any protocol by implementing `run(input: RunAgentInput) -> Observable<BaseEvent>`

The protocol's primary abstraction enables applications to run agents and receive a stream of events:

```typescript theme={null}
import { AbstractAgent, BaseEvent, EventType, RunAgentInput } from "@ag-ui/client"
import { from, Observable } from "rxjs"

// Core agent execution interface
type RunAgent = () => Observable<BaseEvent>

class MyAgent extends AbstractAgent {
  run(input: RunAgentInput): RunAgent {
    const { threadId, runId } = input
    return () =>
      from([
        { type: EventType.RUN_STARTED, threadId, runId },
        {
          type: EventType.MESSAGES_SNAPSHOT,
          messages: [
            { id: "msg_1", role: "assistant", content: "Hello, world!" },
          ],
        },
        { type: EventType.RUN_FINISHED, threadId, runId },
      ])
  }
}
```

### Standard HTTP client

AG-UI offers a standard HTTP client `HttpAgent` that can be used to connect to any endpoint that accepts POST requests with a body of type `RunAgentInput` and sends a stream of `BaseEvent` objects.
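For orientation, here is a minimal sketch of an endpoint `HttpAgent` could connect to, using Express and standard SSE framing. This is illustrative only: a real server would emit events through an official AG-UI server library rather than hand-written objects, and the string event types shown assume the enum's wire values match the constant names:

```typescript theme={null}
import express from "express"

const app = express()
app.use(express.json())

// Minimal endpoint: accepts a RunAgentInput body, streams BaseEvent JSON over SSE.
app.post("/agent", (req, res) => {
  const { threadId, runId } = req.body // RunAgentInput fields

  // Standard SSE framing: one `data:` line per JSON-encoded event.
  res.setHeader("Content-Type", "text/event-stream")
  const send = (event: object) => res.write(`data: ${JSON.stringify(event)}\n\n`)

  send({ type: "RUN_STARTED", threadId, runId })
  send({ type: "TEXT_MESSAGE_START", messageId: "msg_1", role: "assistant" })
  send({ type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello!" })
  send({ type: "TEXT_MESSAGE_END", messageId: "msg_1" })
  send({ type: "RUN_FINISHED", threadId, runId })
  res.end()
})

app.listen(8000)
```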
`HttpAgent` supports the following transports: * **HTTP SSE (Server-Sent Events)** * Text-based streaming for wide compatibility * Easy to read and debug * **HTTP binary protocol** * Highly performant and space-efficient custom transport * Robust binary serialization for production environments ### Message types AG-UI defines several event categories for different aspects of agent communication: * **Lifecycle events** * `RUN_STARTED`, `RUN_FINISHED`, `RUN_ERROR` * `STEP_STARTED`, `STEP_FINISHED` * **Text message events** * `TEXT_MESSAGE_START`, `TEXT_MESSAGE_CONTENT`, `TEXT_MESSAGE_END` * **Tool call events** * `TOOL_CALL_START`, `TOOL_CALL_ARGS`, `TOOL_CALL_END` * **State management events** * `STATE_SNAPSHOT`, `STATE_DELTA`, `MESSAGES_SNAPSHOT` * **Special events** * `RAW`, `CUSTOM` ## Running Agents To run an agent, you create a client instance and execute it: ```typescript theme={null} // Create an HTTP agent client const agent = new HttpAgent({ url: "https://your-agent-endpoint.com/agent", agentId: "unique-agent-id", threadId: "conversation-thread" }); // Start the agent and handle events agent.runAgent({ tools: [...], context: [...] }).subscribe({ next: (event) => { // Handle different event types switch(event.type) { case EventType.TEXT_MESSAGE_CONTENT: // Update UI with new content break; // Handle other event types } }, error: (error) => console.error("Agent error:", error), complete: () => console.log("Agent run complete") }); ``` ## State Management AG-UI provides efficient state management through specialized events: * `STATE_SNAPSHOT`: Complete state representation at a point in time * `STATE_DELTA`: Incremental state changes using JSON Patch format (RFC 6902) * `MESSAGES_SNAPSHOT`: Complete conversation history These events enable efficient client-side state management with minimal data transfer. ## Tools and Handoff AG-UI supports agent-to-agent handoff and tool usage through standardized events: * Tool definitions are passed in the `runAgent` parameters * Tool calls are streamed as sequences of `TOOL_CALL_START` → `TOOL_CALL_ARGS` → `TOOL_CALL_END` events * Agents can hand off to other agents, maintaining context continuity ## Events All communication in AG-UI is based on typed events. Every event inherits from `BaseEvent`: ```typescript theme={null} interface BaseEvent { type: EventType timestamp?: number rawEvent?: any } ``` Events are strictly typed and validated, ensuring reliable communication between components. # Capabilities Source: https://docs.ag-ui.com/concepts/capabilities Dynamic capability discovery for agents in the AG-UI protocol # Capabilities Agents in the AG-UI protocol can declare what they support at runtime through **capability discovery**. This allows clients to query an agent and adapt their behavior based on what features are available — without guessing or hardcoding assumptions. 
## How It Works

`AbstractAgent` exposes an optional `getCapabilities()` method that returns a typed snapshot of everything the agent currently supports:

```typescript theme={null}
const agent = new HttpAgent({ url: "https://my-agent.example.com/api" })

const capabilities = await agent.getCapabilities?.()

if (capabilities?.tools?.supported) {
  console.log(`Agent provides ${capabilities.tools.items?.length} tools`)
}

if (capabilities?.reasoning?.supported) {
  // Show reasoning UI toggle
}
```

### Key Principles

* **Discovery only** — the agent declares what it can do; there is no negotiation
* **Dynamic** — returns the current state at the time of the call (e.g., if tools are added, the next call reflects them)
* **Optional** — agents that don't implement it return `undefined`
* **Absent = unknown** — only declare what you support; omitted fields mean the capability is not declared

## The AgentCapabilities Interface

Capabilities are organized into typed categories, each representing a different aspect of agent functionality:

```typescript theme={null}
interface AgentCapabilities {
  /** Agent identity and metadata. */
  identity?: IdentityCapabilities

  /** Supported transport mechanisms (SSE, WebSocket, binary, etc.). */
  transport?: TransportCapabilities

  /** Tools the agent provides and tool calling configuration. */
  tools?: ToolsCapabilities

  /** Output format support (structured output, MIME types). */
  output?: OutputCapabilities

  /** State and memory management (snapshots, deltas, persistence). */
  state?: StateCapabilities

  /** Multi-agent coordination (delegation, handoffs, sub-agents). */
  multiAgent?: MultiAgentCapabilities

  /** Reasoning and thinking support (chain-of-thought, encrypted thinking). */
  reasoning?: ReasoningCapabilities

  /** Multimodal input/output support organized by direction (input vs output). */
  multimodal?: MultimodalCapabilities

  /** Execution control and limits (code execution, timeouts, iteration caps). */
  execution?: ExecutionCapabilities

  /** Human-in-the-loop support (approvals, interventions, feedback). */
  humanInTheLoop?: HumanInTheLoopCapabilities

  /** Integration-specific capabilities not covered by the standard categories. */
  custom?: Record<string, unknown>
}
```

The `custom` field is an escape hatch for integration-specific capabilities that don't fit into the standard categories.

## Capability Categories

### Identity

Basic metadata about the agent. Useful for discovery UIs, agent marketplaces, and debugging. Set these when you want clients to display agent information or when multiple agents are available and users need to pick one.

```typescript theme={null}
interface IdentityCapabilities {
  /** Human-readable name shown in UIs and agent selectors. */
  name?: string

  /** The framework or platform powering this agent (e.g., "langgraph", "mastra", "crewai"). */
  type?: string

  /** What this agent does — helps users and routing logic decide when to use it. */
  description?: string

  /** Semantic version of the agent (e.g., "1.2.0"). Useful for compatibility checks. */
  version?: string

  /** Organization or team that maintains this agent. */
  provider?: string

  /** URL to the agent's documentation or homepage. */
  documentationUrl?: string

  /** Arbitrary key-value pairs for integration-specific identity info. */
  metadata?: Record<string, unknown>
}
```

### Transport

Declares which transport mechanisms the agent supports. Clients use this to pick the best connection strategy. Only set flags to `true` for transports your agent actually handles — omit or set `false` for unsupported ones.
```typescript theme={null} interface TransportCapabilities { /** Set `true` if the agent streams responses via SSE. Most agents enable this. */ streaming?: boolean /** Set `true` if the agent accepts persistent WebSocket connections. */ websocket?: boolean /** Set `true` if the agent supports the AG-UI binary protocol (protobuf over HTTP). */ httpBinary?: boolean /** Set `true` if the agent can send async updates via webhooks after a run finishes. */ pushNotifications?: boolean /** Set `true` if the agent supports resuming interrupted streams via sequence numbers. */ resumable?: boolean } ``` ### Tools Tool calling capabilities. Distinguishes between tools the agent itself provides (listed in `items`) and tools the client passes at runtime via `RunAgentInput.tools`. Enable this when your agent can call functions, search the web, execute code, etc. ```typescript theme={null} interface ToolsCapabilities { /** Set `true` if the agent can make tool calls at all. Set `false` to explicitly * signal tool calling is disabled even if items are present. */ supported?: boolean /** The tools this agent provides on its own (full JSON Schema definitions). * These are distinct from client-provided tools passed in `RunAgentInput.tools`. */ items?: Tool[] /** Set `true` if the agent can invoke multiple tools concurrently within a single step. */ parallelCalls?: boolean /** Set `true` if the agent accepts and uses tools provided by the client at runtime. */ clientProvided?: boolean } ``` ### Output Output format support. Enable `structuredOutput` when your agent can return responses conforming to a JSON schema, which is useful for programmatic consumption. ```typescript theme={null} interface OutputCapabilities { /** Set `true` if the agent can produce structured JSON output matching a provided schema. */ structuredOutput?: boolean /** MIME types the agent can produce (e.g., `["text/plain", "application/json"]`). * Omit if the agent only produces plain text. */ supportedMimeTypes?: string[] } ``` ### State State and memory management capabilities. These tell the client how the agent handles shared state and whether conversation context persists across runs. ```typescript theme={null} interface StateCapabilities { /** Set `true` if the agent emits `STATE_SNAPSHOT` events (full state replacement). */ snapshots?: boolean /** Set `true` if the agent emits `STATE_DELTA` events (JSON Patch incremental updates). */ deltas?: boolean /** Set `true` if the agent has long-term memory beyond the current thread * (e.g., vector store, knowledge base, or cross-session recall). */ memory?: boolean /** Set `true` if state is preserved across multiple runs within the same thread. * When `false`, state resets on each run. */ persistentState?: boolean } ``` ### Multi-Agent Multi-agent coordination capabilities. Enable these when your agent can orchestrate or hand off work to other agents. ```typescript theme={null} interface MultiAgentCapabilities { /** Set `true` if the agent participates in any form of multi-agent coordination. */ supported?: boolean /** Set `true` if the agent can delegate subtasks to other agents while retaining control. */ delegation?: boolean /** Set `true` if the agent can transfer the conversation entirely to another agent. */ handoffs?: boolean /** List of sub-agents this agent can invoke. Helps clients build agent selection UIs. */ subAgents?: Array<{ name: string; description?: string }> } ``` ### Reasoning Reasoning and thinking capabilities. 
Enable these when your agent exposes its internal thought process (e.g., chain-of-thought, extended thinking). ```typescript theme={null} interface ReasoningCapabilities { /** Set `true` if the agent produces reasoning/thinking tokens visible to the client. */ supported?: boolean /** Set `true` if reasoning tokens are streamed incrementally (vs. returned all at once). */ streaming?: boolean /** Set `true` if reasoning content is encrypted (zero-data-retention mode). * Clients should expect opaque `encryptedValue` fields instead of readable content. */ encrypted?: boolean } ``` ### Multimodal Multimodal input and output support, organized into `input` and `output` sub-objects so clients can independently query what the agent accepts versus what it produces. Clients use this to show/hide file upload buttons, audio recorders, image pickers, etc. ```typescript theme={null} interface MultimodalInputCapabilities { /** Set `true` if the agent can process image inputs (e.g., screenshots, photos). */ image?: boolean /** Set `true` if the agent can process audio inputs (speech, recordings). */ audio?: boolean /** Set `true` if the agent can process video inputs. */ video?: boolean /** Set `true` if the agent can process PDF documents. */ pdf?: boolean /** Set `true` if the agent can process arbitrary file uploads. */ file?: boolean } interface MultimodalOutputCapabilities { /** Set `true` if the agent can generate images as part of its response. */ image?: boolean /** Set `true` if the agent can produce audio output (text-to-speech, audio files). */ audio?: boolean } interface MultimodalCapabilities { /** Modalities the agent can accept as input (images, audio, video, PDFs, files). */ input?: MultimodalInputCapabilities /** Modalities the agent can produce as output (images, audio). */ output?: MultimodalOutputCapabilities } ``` ### Execution Execution control and limits. Declare these so clients can set expectations about how long or how many steps an agent run might take. ```typescript theme={null} interface ExecutionCapabilities { /** Set `true` if the agent can execute code (e.g., Python, JavaScript) during a run. */ codeExecution?: boolean /** Set `true` if code execution happens in a sandboxed/isolated environment. * Only meaningful when `codeExecution` is `true`. */ sandboxed?: boolean /** Maximum number of tool-call/reasoning iterations the agent will perform per run. * Helps clients display progress or set timeout expectations. */ maxIterations?: number /** Maximum wall-clock time (in milliseconds) the agent will run before timing out. */ maxExecutionTime?: number } ``` ### Human-in-the-Loop Human-in-the-loop interaction support. Enable these when your agent can pause execution to request human input, approval, or feedback before continuing. ```typescript theme={null} interface HumanInTheLoopCapabilities { /** Set `true` if the agent supports any form of human-in-the-loop interaction. */ supported?: boolean /** Set `true` if the agent can pause and request explicit approval before * performing sensitive actions (e.g., sending emails, deleting data). */ approvals?: boolean /** Set `true` if the agent allows humans to intervene and modify its plan mid-execution. */ interventions?: boolean /** Set `true` if the agent can incorporate user feedback (thumbs up/down, corrections) * to improve its behavior within the current session. 
   */
  feedback?: boolean
}
```

## Implementing getCapabilities()

### Custom Agents

Implement `getCapabilities()` on your agent subclass, returning only the capabilities you actually support:

```typescript theme={null}
import { AbstractAgent, AgentCapabilities } from "@ag-ui/client"

class MyAgent extends AbstractAgent {
  async getCapabilities(): Promise<AgentCapabilities> {
    return {
      identity: {
        name: "my-agent",
        description: "A custom agent with tool support",
        version: "1.0.0",
      },
      transport: {
        streaming: true,
      },
      tools: {
        supported: true,
        items: this.getRegisteredTools(),
        clientProvided: true,
      },
      state: {
        snapshots: true,
        deltas: true,
      },
    }
  }

  // ... run() implementation
}
```

### Dynamic Capabilities

Since `getCapabilities()` returns a live snapshot, it reflects the agent's current state:

```typescript theme={null}
const agent = new MyAgent(config)

let caps = await agent.getCapabilities()
console.log(caps.tools?.items?.length) // 5

// Register more tools at runtime
agent.registerTool(newTool)

caps = await agent.getCapabilities()
console.log(caps.tools?.items?.length) // 6
```

## Client Usage Patterns

### Adaptive UI

Render UI components based on what the agent supports:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

// Only show reasoning panel if supported
if (capabilities?.reasoning?.supported) {
  showReasoningPanel()
}

// Only show sub-agent selector if available
if (capabilities?.multiAgent?.subAgents?.length) {
  showSubAgentSelector(capabilities.multiAgent.subAgents)
}

// Only show approval UI if HITL is supported
if (capabilities?.humanInTheLoop?.approvals) {
  enableApprovalWorkflow()
}
```

### Feature Gating

Disable features the agent doesn't support instead of failing at runtime:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

const canUseStructuredOutput = capabilities?.output?.structuredOutput ?? false
const canStream = capabilities?.transport?.streaming ?? false
```

### Custom Capabilities

Access integration-specific capabilities via the `custom` field:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

const rateLimit = capabilities?.custom?.rateLimit as
  | { maxRequestsPerMinute: number }
  | undefined

if (rateLimit) {
  configureThrottling(rateLimit.maxRequestsPerMinute)
}
```

# Events

Source: https://docs.ag-ui.com/concepts/events

Understanding events in the Agent User Interaction Protocol

# Events

The Agent User Interaction Protocol uses a streaming, event-based architecture. Events are the fundamental units of communication between agents and frontends, enabling real-time, structured interaction.
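Before walking through the categories, it helps to see what a stream looks like end to end. The following is a hypothetical event sequence for a run that produces a single assistant message; IDs and text are illustrative, and the payload fields follow the definitions below:

```typescript theme={null}
// A complete, minimal run as received by the frontend, in order.
const stream = [
  { type: "RUN_STARTED", threadId: "thread_1", runId: "run_1" },
  { type: "TEXT_MESSAGE_START", messageId: "msg_1", role: "assistant" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hel" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "lo!" },
  { type: "TEXT_MESSAGE_END", messageId: "msg_1" },
  { type: "RUN_FINISHED", threadId: "thread_1", runId: "run_1" },
]
```

Concatenating the `delta` chunks in order yields the complete message, "Hello!".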
## Event Types Overview

Events in the protocol are categorized by their purpose:

| Category                | Description                              |
| ----------------------- | ---------------------------------------- |
| Lifecycle Events        | Monitor the progression of agent runs    |
| Text Message Events     | Handle streaming textual content         |
| Tool Call Events        | Manage tool executions by agents         |
| State Management Events | Synchronize state between agents and UI  |
| Activity Events         | Represent ongoing activity progress      |
| Special Events          | Support custom functionality             |
| Draft Events            | Proposed events under development        |

## Base Event Properties

All events share a common set of base properties:

| Property    | Description                                                       |
| ----------- | ----------------------------------------------------------------- |
| `type`      | The specific event type identifier                                |
| `timestamp` | Optional timestamp indicating when the event was created          |
| `rawEvent`  | Optional field containing the original event data if transformed  |

## Lifecycle Events

These events represent the lifecycle of an agent run. A typical agent run follows a predictable pattern: it begins with a `RunStarted` event, may contain multiple optional `StepStarted`/`StepFinished` pairs, and concludes with either a `RunFinished` event (success) or a `RunError` event (failure).

Lifecycle events provide crucial structure to agent runs, enabling frontends to track progress, manage UI states appropriately, and handle errors gracefully. They create a consistent framework for understanding when operations begin and end, making it possible to implement features like loading indicators, progress tracking, and error recovery mechanisms.

```mermaid theme={null}
sequenceDiagram
    participant Agent
    participant Client

    Note over Agent,Client: Run begins
    Agent->>Client: RunStarted

    opt Sending steps is optional
        Note over Agent,Client: Step execution
        Agent->>Client: StepStarted
        Agent->>Client: StepFinished
    end

    Note over Agent,Client: Run completes
    alt
        Agent->>Client: RunFinished
    else
        Agent->>Client: RunError
    end
```

The `RunStarted` and either `RunFinished` or `RunError` events are mandatory, forming the boundaries of an agent run. Step events are optional and may occur multiple times within a run, allowing for structured, observable progress tracking.

### RunStarted

Signals the start of an agent run.

The `RunStarted` event is the first event emitted when an agent begins processing a request. It establishes a new execution context identified by a unique `runId`. This event serves as a marker for frontends to initialize UI elements such as progress indicators or loading states. It also provides crucial identifiers that can be used to associate subsequent events with this specific run.

| Property      | Description                                                                                                                                                       |
| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `threadId`    | ID of the conversation thread                                                                                                                                       |
| `runId`       | ID of the agent run                                                                                                                                                 |
| `parentRunId` | (Optional) Lineage pointer for branching/time travel. If present, refers to a prior run within the same thread, creating a git-like append-only log                 |
| `input`       | (Optional) The exact agent input payload that was sent to the agent for this run. May omit messages already present in history; `compactEvents()` will normalize this |

### RunFinished

Signals the successful completion of an agent run.

The `RunFinished` event indicates that an agent has successfully completed all its work for the current run.
Upon receiving this event, frontends should finalize any UI states that were waiting on the agent's completion. This event marks a clean termination point and indicates that no further processing will occur in this run unless explicitly requested. The optional `result` field can contain any output data produced by the agent run. | Property | Description | | ---------- | ----------------------------- | | `threadId` | ID of the conversation thread | | `runId` | ID of the agent run | | `result` | Optional result data from run | ### RunError Signals an error during an agent run. The `RunError` event indicates that the agent encountered an error it could not recover from, causing the run to terminate prematurely. This event provides information about what went wrong, allowing frontends to display appropriate error messages and potentially offer recovery options. After a `RunError` event, no further processing will occur in this run. | Property | Description | | --------- | ------------------- | | `message` | Error message | | `code` | Optional error code | ### StepStarted Signals the start of a step within an agent run. The `StepStarted` event indicates that the agent is beginning a specific subtask or phase of its processing. Steps provide granular visibility into the agent's progress, enabling more precise tracking and feedback in the UI. Steps are optional but highly recommended for complex operations that benefit from being broken down into observable stages. The `stepName` could be the name of a node or function that is currently executing. | Property | Description | | ---------- | ---------------- | | `stepName` | Name of the step | ### StepFinished Signals the completion of a step within an agent run. The `StepFinished` event indicates that the agent has completed a specific subtask or phase. When paired with a corresponding `StepStarted` event, it creates a bounded context for a discrete unit of work. Frontends can use these events to update progress indicators, show completion animations, or reveal results specific to that step. The `stepName` must match the corresponding `StepStarted` event to properly pair the beginning and end of the step. | Property | Description | | ---------- | ---------------- | | `stepName` | Name of the step | ## Text Message Events These events represent the lifecycle of text messages in a conversation. Text message events follow a streaming pattern, where content is delivered incrementally. A message begins with a `TextMessageStart` event, followed by one or more `TextMessageContent` events that deliver chunks of text as they become available, and concludes with a `TextMessageEnd` event. This streaming approach enables real-time display of message content as it's generated, creating a more responsive user experience compared to waiting for the entire message to be complete before showing anything. ```mermaid theme={null} sequenceDiagram participant Agent participant Client Note over Agent,Client: Message begins Agent->>Client: TextMessageStart loop Content streaming Agent->>Client: TextMessageContent end Note over Agent,Client: Message completes Agent->>Client: TextMessageEnd ``` The `TextMessageContent` events each contain a `delta` field with a chunk of text. Frontends should concatenate these deltas in the order received to construct the complete message. The `messageId` property links all related events, allowing the frontend to associate content chunks with the correct message. ### TextMessageStart Signals the start of a text message. 
The `TextMessageStart` event initializes a new text message in the conversation. It establishes a unique `messageId` that will be referenced by subsequent content chunks and the end event. This event allows frontends to prepare the UI for an incoming message, such as creating a new message bubble with a loading indicator. The `role` property identifies whether the message is coming from the assistant or potentially another participant in the conversation. | Property | Description | | ----------- | ------------------------------------------------------------------------------- | | `messageId` | Unique identifier for the message | | `role` | Role of the message sender ("developer", "system", "assistant", "user", "tool") | ### TextMessageContent Represents a chunk of content in a streaming text message. The `TextMessageContent` event delivers incremental parts of the message text as they become available. Each event contains a small chunk of text in the `delta` property that should be appended to previously received chunks. The streaming nature of these events enables real-time display of content, creating a more responsive and engaging user experience. Implementations should handle these events efficiently to ensure smooth text rendering without visible delays or flickering. | Property | Description | | ----------- | -------------------------------------- | | `messageId` | Matches the ID from `TextMessageStart` | | `delta` | Text content chunk (non-empty) | ### TextMessageEnd Signals the end of a text message. The `TextMessageEnd` event marks the completion of a streaming text message. After receiving this event, the frontend knows that the message is complete and no further content will be added. This allows the UI to finalize rendering, remove any loading indicators, and potentially trigger actions that should occur after message completion, such as enabling reply controls or performing automatic scrolling to ensure the full message is visible. | Property | Description | | ----------- | -------------------------------------- | | `messageId` | Matches the ID from `TextMessageStart` | ### TextMessageChunk Convenience event that expands to Start → Content → End automatically. The `TextMessageChunk` event lets you omit explicit `TextMessageStart` and `TextMessageEnd` events. The client stream transformer expands chunks into the standard triad: * First chunk for a message must include `messageId` and will emit `TextMessageStart` (role defaults to `assistant` when not provided). * Each chunk with a `delta` emits a `TextMessageContent` for the current `messageId`. * `TextMessageEnd` is emitted automatically when the stream switches to a new message ID or when the stream completes. | Property | Description | | ----------- | ------------------------------------------------------------------------------------ | | `messageId` | Optional unique identifier for the message; required on the first chunk of a message | | `role` | Optional role of the sender ("developer", "system", "assistant", "user") | | `delta` | Optional text content of the message | ## Tool Call Events These events represent the lifecycle of tool calls made by agents. Tool calls follow a streaming pattern similar to text messages. When an agent needs to use a tool, it emits a `ToolCallStart` event, followed by one or more `ToolCallArgs` events that stream the arguments being passed to the tool, and concludes with a `ToolCallEnd` event. 
This streaming approach allows frontends to show tool executions in real-time, making the agent's actions transparent and providing immediate feedback about what tools are being invoked and with what parameters. ```mermaid theme={null} sequenceDiagram participant Agent participant Client Note over Agent,Client: Tool call begins Agent->>Client: ToolCallStart loop Arguments streaming Agent->>Client: ToolCallArgs end Note over Agent,Client: Tool call completes Agent->>Client: ToolCallEnd Note over Agent,Client: Tool execution result Agent->>Client: ToolCallResult ``` The `ToolCallArgs` events each contain a `delta` field with a chunk of the arguments. Frontends should concatenate these deltas in the order received to construct the complete arguments object. The `toolCallId` property links all related events, allowing the frontend to associate argument chunks with the correct tool call. ### ToolCallStart Signals the start of a tool call. The `ToolCallStart` event indicates that the agent is invoking a tool to perform a specific function. This event provides the name of the tool being called and establishes a unique `toolCallId` that will be referenced by subsequent events in this tool call. Frontends can use this event to display tool usage to users, such as showing a notification that a specific operation is in progress. The optional `parentMessageId` allows linking the tool call to a specific message in the conversation, providing context for why the tool is being used. | Property | Description | | ----------------- | ----------------------------------- | | `toolCallId` | Unique identifier for the tool call | | `toolCallName` | Name of the tool being called | | `parentMessageId` | Optional ID of the parent message | ### ToolCallArgs Represents a chunk of argument data for a tool call. The `ToolCallArgs` event delivers incremental parts of the tool's arguments as they become available. Each event contains a segment of the argument data in the `delta` property. These deltas are often JSON fragments that, when combined, form the complete arguments object for the tool. Streaming the arguments is particularly valuable for complex tool calls where constructing the full arguments may take time. Frontends can progressively reveal these arguments to users, providing insight into exactly what parameters are being passed to tools. | Property | Description | | ------------ | ----------------------------------- | | `toolCallId` | Matches the ID from `ToolCallStart` | | `delta` | Argument data chunk | ### ToolCallEnd Signals the end of a tool call. The `ToolCallEnd` event marks the completion of a tool call. After receiving this event, the frontend knows that all arguments have been transmitted and the tool execution is underway or completed. This allows the UI to finalize the tool call display and prepare for potential results. In systems where tool execution results are returned separately, this event indicates that the agent has finished specifying the tool and its arguments, and is now waiting for or has received the results. | Property | Description | | ------------ | ----------------------------------- | | `toolCallId` | Matches the ID from `ToolCallStart` | ### ToolCallResult Provides the result of a tool call execution. The `ToolCallResult` event delivers the output or result from a tool that was previously invoked by the agent. This event is sent after the tool has been executed by the system and contains the actual output generated by the tool. 
Unlike the streaming pattern of tool call specification (start, args, end), the result is delivered as a complete unit since tool execution typically produces a complete output. Frontends can use this event to display tool results to users, append them to the conversation history, or trigger follow-up actions based on the tool's output. | Property | Description | | ------------ | ----------------------------------------------------------- | | `messageId` | ID of the conversation message this result belongs to | | `toolCallId` | Matches the ID from the corresponding `ToolCallStart` event | | `content` | The actual result/output content from the tool execution | | `role` | Optional role identifier, typically "tool" for tool results | ### ToolCallChunk Convenience event that expands to Start → Args → End automatically. The `ToolCallChunk` event lets you omit explicit `ToolCallStart` and `ToolCallEnd` events. The client stream transformer expands chunks into the standard tool-call triad: * First chunk for a tool call must include `toolCallId` and `toolCallName` and will emit `ToolCallStart` (propagating any `parentMessageId`). * Each chunk with a `delta` emits a `ToolCallArgs` for the current `toolCallId`. * `ToolCallEnd` is emitted automatically when the stream switches to a new `toolCallId` or when the stream completes. | Property | Description | | ----------------- | -------------------------------------------------------------------- | | `toolCallId` | Optional on later chunks; required on the first chunk of a tool call | | `toolCallName` | Optional on later chunks; required on the first chunk of a tool call | | `parentMessageId` | Optional ID of the parent message | | `delta` | Optional argument data chunk (often a JSON fragment) | ## State Management Events These events are used to manage and synchronize the agent's state with the frontend. State management in the protocol follows an efficient snapshot-delta pattern where complete state snapshots are sent initially or infrequently, while incremental updates (deltas) are used for ongoing changes. This approach optimizes for both completeness and efficiency: snapshots ensure the frontend has the full state context, while deltas minimize data transfer for frequent updates. Together, they enable frontends to maintain an accurate representation of agent state without unnecessary data transmission. ```mermaid theme={null} sequenceDiagram participant Agent participant Client Note over Agent,Client: Initial state transfer Agent->>Client: StateSnapshot Note over Agent,Client: Incremental updates loop State changes over time Agent->>Client: StateDelta Agent->>Client: StateDelta end Note over Agent,Client: Occasional full refresh Agent->>Client: StateSnapshot loop More incremental updates Agent->>Client: StateDelta end Note over Agent,Client: Message history update Agent->>Client: MessagesSnapshot ``` The combination of snapshots and deltas allows frontends to efficiently track changes to agent state while ensuring consistency. Snapshots serve as synchronization points that reset the state to a known baseline, while deltas provide lightweight updates between snapshots. ### StateSnapshot Provides a complete snapshot of an agent's state. The `StateSnapshot` event delivers a comprehensive representation of the agent's current state. This event is typically sent at the beginning of an interaction or when synchronization is needed. It contains all state variables relevant to the frontend, allowing it to completely rebuild its internal representation. 
Frontends should replace their existing state model with the contents of this snapshot rather than trying to merge it with previous state. | Property | Description | | ---------- | ----------------------- | | `snapshot` | Complete state snapshot | ### StateDelta Provides a partial update to an agent's state using JSON Patch. The `StateDelta` event contains incremental updates to the agent's state in the form of JSON Patch operations (as defined in RFC 6902). Each delta represents specific changes to apply to the current state model. This approach is bandwidth-efficient, sending only what has changed rather than the entire state. Frontends should apply these patches in sequence to maintain an accurate state representation. If a frontend detects inconsistencies after applying patches, it may request a fresh `StateSnapshot`. | Property | Description | | -------- | ----------------------------------------- | | `delta` | Array of JSON Patch operations (RFC 6902) | ### MessagesSnapshot Provides a snapshot of all messages in a conversation. The `MessagesSnapshot` event delivers a complete history of messages in the current conversation. Unlike the general state snapshot, this focuses specifically on the conversation transcript. This event is useful for initializing the chat history, synchronizing after connection interruptions, or providing a comprehensive view when a user joins an ongoing conversation. Frontends should use this to establish or refresh the conversational context displayed to users. | Property | Description | | ---------- | ------------------------ | | `messages` | Array of message objects | ## Activity Events Activity Events expose structured, in-progress activity updates that occur between chat messages. They follow the same snapshot/delta pattern as the state system so that UIs can render a complete activity view immediately and then incrementally update it as new information arrives. ### ActivitySnapshot Delivers a complete snapshot of an activity message. | Property | Description | | -------------- | --------------------------------------------------------------------------------------------- | | `messageId` | Identifier for the `ActivityMessage` this event updates | | `activityType` | Activity discriminator (for example `"PLAN"`, `"SEARCH"`) | | `content` | Structured JSON payload representing the full activity state | | `replace` | Optional. Defaults to `true`. When `false`, ignore the snapshot if the message already exists | Frontends should either create a new `ActivityMessage` or replace the existing one with the payload supplied by the snapshot. ### ActivityDelta Applies incremental updates to an existing activity using JSON Patch operations. | Property | Description | | -------------- | ------------------------------------------------------------------------ | | `messageId` | Identifier for the target activity message | | `activityType` | Activity discriminator (mirrors the value from the most recent snapshot) | | `patch` | Array of RFC 6902 JSON Patch operations to apply to the activity data | Activity deltas should be applied in order to the previously synchronized activity content. If an application detects divergence, it can request or emit a fresh `ActivitySnapshot` to resynchronize. ## Special Events Special events provide flexibility in the protocol by allowing for system-specific functionality and integration with external systems. These events don't follow the standard lifecycle or streaming patterns of other event types but instead serve specialized purposes. 
### Raw

Used to pass through events from external systems.

The `Raw` event acts as a container for events originating from external systems or sources that don't natively follow the AG-UI protocol. This event type enables interoperability with other event-based systems by wrapping their events in a standardized format.

The enclosed event data is preserved in its original form inside the `event` property, while the optional `source` property identifies the system it came from. Frontends can use this information to handle external events appropriately, either by processing them directly or by delegating them to system-specific handlers.

| Property | Description                |
| -------- | -------------------------- |
| `event`  | Original event data        |
| `source` | Optional source identifier |

### Custom

Used for application-specific custom events.

The `Custom` event provides an extension mechanism for implementing features not covered by the standard event types. Unlike `Raw` events which act as passthrough containers, `Custom` events are explicitly part of the protocol but with application-defined semantics.

The `name` property identifies the specific custom event type, while the `value` property contains the associated data. This mechanism allows for protocol extensions without requiring formal specification changes. Teams should document their custom events to ensure consistent implementation across frontends and agents.

| Property | Description                     |
| -------- | ------------------------------- |
| `name`   | Name of the custom event        |
| `value`  | Value associated with the event |

## Reasoning Events

Reasoning events support LLM reasoning visibility and continuity, enabling chain-of-thought reasoning while maintaining privacy. These events allow agents to surface reasoning signals (e.g., summaries) and support encrypted reasoning items for state carry-over across turns—especially under `store:false` or zero data retention policies—without exposing raw chain-of-thought.

See [OpenAI ZDR documentation](https://developers.openai.com/cookbook/examples/responses_api/reasoning_items/#encrypted-reasoning-items), [OpenAI store parameter documentation](https://platform.openai.com/docs/api-reference/responses/create#responses_create-store), and [Gemini Thought Signatures](https://ai.google.dev/gemini-api/docs/thought-signatures) for the underlying concept of encrypted reasoning items, which inspired this design.

See [Reasoning](/concepts/reasoning) for comprehensive documentation including privacy considerations, compliance guidance, and implementation examples.

```mermaid theme={null}
sequenceDiagram
    participant Agent
    participant Client

    Note over Agent,Client: Reasoning begins
    Agent->>Client: ReasoningStart

    Note over Agent,Client: Stream reasoning content
    Agent->>Client: ReasoningMessageStart
    Agent->>Client: ReasoningMessageContent
    Agent->>Client: ReasoningMessageEnd

    Note over Agent,Client: Reasoning completes
    Agent->>Client: ReasoningEnd
```

### ReasoningStart

Marks the start of reasoning.

The `ReasoningStart` event signals that the agent is beginning a reasoning process. It establishes a reasoning context identified by a unique `messageId`.

| Property    | Description                         |
| ----------- | ----------------------------------- |
| `messageId` | Unique identifier of this reasoning |

### ReasoningMessageStart

Signals the start of a reasoning message.

The `ReasoningMessageStart` event begins a streaming reasoning message.
This message will contain the visible portion of the agent's reasoning that should be displayed to users (e.g., a summary or partial chain-of-thought).

| Property    | Description                                   |
| ----------- | --------------------------------------------- |
| `messageId` | Unique identifier of the message              |
| `role`      | Role of the reasoning message (`"reasoning"`) |

### ReasoningMessageContent

Represents a chunk of content in a streaming reasoning message.

The `ReasoningMessageContent` event delivers incremental reasoning content to the client. Multiple content events with the same `messageId` should be concatenated to form the complete visible reasoning.

| Property    | Description                                |
| ----------- | ------------------------------------------ |
| `messageId` | Matches ID from `ReasoningMessageStart`    |
| `delta`     | Reasoning content chunk (non-empty string) |

### ReasoningMessageEnd

Signals the end of a reasoning message.

The `ReasoningMessageEnd` event indicates that all content for the specified reasoning message has been sent. Clients should finalize any UI representing this reasoning message.

| Property    | Description                             |
| ----------- | --------------------------------------- |
| `messageId` | Matches ID from `ReasoningMessageStart` |

### ReasoningMessageChunk

A convenience event that automatically starts and closes reasoning messages.

The `ReasoningMessageChunk` event simplifies implementation by automatically managing the message lifecycle. The first chunk with a `messageId` implicitly starts the message. An empty `delta` or the next non-reasoning event implicitly closes the message.

| Property    | Description                                                |
| ----------- | ---------------------------------------------------------- |
| `messageId` | Message ID (must be non-empty on the first chunk)          |
| `delta`     | Reasoning content chunk (empty string closes the message)  |

### ReasoningEnd

Marks the end of reasoning.

The `ReasoningEnd` event signals that the agent has completed its reasoning process for the given context. No further reasoning events with the same `messageId` should be expected after this event.

| Property    | Description                         |
| ----------- | ----------------------------------- |
| `messageId` | Unique identifier of this reasoning |

### ReasoningEncryptedValue

Attaches encrypted chain-of-thought reasoning to a message or tool call.

The `ReasoningEncryptedValue` event carries encrypted reasoning content that represents the LLM's internal chain-of-thought related to a specific entity. This allows the agent to preserve reasoning state across conversation turns without exposing the raw content to the client. The client stores and forwards these encrypted values opaquely—only the agent (or authorized backend) can decrypt them.

| Property         | Description                                               |
| ---------------- | --------------------------------------------------------- |
| `subtype`        | Entity type: `"message"` or `"tool-call"`                 |
| `entityId`       | ID of the message or tool call this reasoning belongs to  |
| `encryptedValue` | Encrypted chain-of-thought content blob                   |

Use cases:

* **Message reasoning**: Attach encrypted reasoning to an `AssistantMessage` or `ReasoningMessage` to preserve context for follow-up turns
* **Tool call reasoning**: Attach encrypted reasoning to a tool call to capture why the agent chose specific arguments or how it interpreted results

## Deprecated Events

The following events are deprecated and will be removed in version 1.0.0. Use the corresponding Reasoning events instead.
### Thinking Events (Deprecated) The `THINKING_*` events have been replaced by `REASONING_*` events: | Deprecated Event | Replacement | | ------------------------------- | --------------------------- | | `THINKING_START` | `REASONING_START` | | `THINKING_END` | `REASONING_END` | | `THINKING_TEXT_MESSAGE_START` | `REASONING_MESSAGE_START` | | `THINKING_TEXT_MESSAGE_CONTENT` | `REASONING_MESSAGE_CONTENT` | | `THINKING_TEXT_MESSAGE_END` | `REASONING_MESSAGE_END` | See [Reasoning Migration](/concepts/reasoning#migration-from-thinking-events) for detailed migration guidance. ## Draft Events These events are currently in draft status and may change before finalization. They represent proposed extensions to the protocol that are under active development and discussion. ### Meta Events DRAFT [View Proposal](/drafts/meta-events) Meta events provide annotations and signals independent of agent runs, such as user feedback or external system events. #### MetaEvent A side-band annotation event that can occur anywhere in the stream. | Property | Description | | ---------- | ---------------------------------------------------- | | `metaType` | Application-defined type (e.g., "thumbs\_up", "tag") | | `payload` | Application-defined payload | ### Modified Lifecycle Events DRAFT [View Proposal](/drafts/interrupts) Extensions to existing lifecycle events to support interrupts and branching. #### RunFinished (Extended) The `RunFinished` event gains new fields to support interrupt-aware workflows. | Property | Description | | ----------- | ------------------------------------------------ | | `outcome` | Optional: "success" or "interrupt" | | `interrupt` | Optional: Contains interrupt details when paused | See [Serialization](/concepts/serialization) for lineage and input capture. #### RunStarted (Extended) The `RunStarted` event gains new fields to support branching and input tracking. | Property | Description | | ------------- | ------------------------------------------------- | | `parentRunId` | Optional: Parent run ID for branching/time travel | | `input` | Optional: The exact agent input for this run | ## Event Flow Patterns Events in the protocol typically follow specific patterns: 1. **Start-Content-End Pattern**: Used for streaming content (text messages, tool calls) * `Start` event initiates the stream * `Content` events deliver data chunks * `End` event signals completion 2. **Snapshot-Delta Pattern**: Used for state synchronization * `Snapshot` provides complete state * `Delta` events provide incremental updates 3. **Lifecycle Pattern**: Used for monitoring agent runs * `Started` events signal beginnings * `Finished`/`Error` events signal endings ## Implementation Considerations When implementing event handlers: * Events should be processed in the order they are received * Events with the same ID (e.g., `messageId`, `toolCallId`) belong to the same logical stream * Implementations should be resilient to out-of-order delivery * Custom events should follow the established patterns for consistency # Generative UI Source: https://docs.ag-ui.com/concepts/generative-ui-specs Understanding AG-UI's relationship with generative UI specifications ## **AG-UI and Generative UI Specs** Several recently released specs have enabled agents to return generative UI, increasing the power and flexibility of the Agent\<->User conversation. A2UI, MCP-UI, and Open-JSON-UI are all **generative UI specifications.** Generative UIs allow agents to respond to users not only with text but also with dynamic UI components. 
| **Specification** | **Origin / Maintainer** | **Purpose** | | ----------------- | ----------------------- | -------------------------------------------------------------------------------------------------------------------- | | **A2UI** | Google | A declarative, LLM-friendly Generative UI spec. JSONL-based and streaming, designed for platform-agnostic rendering. | | **Open-JSON-UI** | OpenAI | An open standardization of OpenAI's internal declarative Generative UI schema. | | **MCP-UI** | Microsoft + Shopify | A fully open, iframe-based Generative UI standard extending MCP for user-facing experiences. | Despite the naming similarities, **AG-UI is not a generative UI specification** — it's a **User Interaction protocol** that provides the **bi-directional runtime connection** between the agent and the application. AG-UI natively supports all of the above generative UI specs and allows developers to define **their own custom generative UI standards** as well. # Messages Source: https://docs.ag-ui.com/concepts/messages Understanding message structure and communication in AG-UI # Messages Messages form the backbone of communication in the AG-UI protocol. They represent the conversation history between users and AI agents, and provide a standardized way to exchange information regardless of the underlying AI service being used. ## Message Structure AG-UI messages follow a vendor-neutral format, ensuring compatibility across different AI providers while maintaining a consistent structure. This allows applications to switch between AI services (like OpenAI, Anthropic, or custom models) without changing the client-side implementation. The basic message structure includes: ```typescript theme={null} interface BaseMessage { id: string // Unique identifier for the message role: string // The role of the sender (user, assistant, system, tool, reasoning) content?: string // Optional text content of the message name?: string // Optional name of the sender encryptedContent?: string // Optional encrypted content for privacy-preserving state continuity } ``` The `role` discriminator can be `"user"`, `"assistant"`, `"system"`, `"tool"`, `"developer"`, `"activity"`, or `"reasoning"`. Concrete message types extend this shape with the fields they need. > The `encryptedContent` field enables privacy-preserving workflows where > sensitive content (such as reasoning chains) can be passed across turns > without exposing the raw content. This is particularly useful for zero data > retention (ZDR) compliance and `store:false` scenarios. 
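Because `role` acts as a discriminator, clients can branch on it with type narrowing. The following is a minimal sketch, assuming a `Message` union type exported by `@ag-ui/core` that covers the concrete message types described in the next section:

```typescript theme={null}
import type { Message } from "@ag-ui/core"

// A sketch of role-based narrowing over the message union.
function summarize(msg: Message): string {
  switch (msg.role) {
    case "user":
      // content may be a plain string or a multimodal InputContent[]
      return typeof msg.content === "string"
        ? `User: ${msg.content}`
        : "User: [multimodal input]"
    case "assistant":
      return `Assistant: ${msg.content ?? "[tool calls only]"}`
    case "tool":
      return `Tool result for ${msg.toolCallId}`
    default:
      return `${msg.role} message`
  }
}
```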
## Message Types

AG-UI supports several message types to accommodate different participants in a conversation:

### User Messages

Messages from the end user to the agent:

```typescript theme={null}
interface UserMessage {
  id: string
  role: "user"
  content: string | InputContent[] // Text or multimodal input from the user
  name?: string // Optional user identifier
}

type InputContent =
  | TextInputContent
  | ImageInputContent
  | AudioInputContent
  | VideoInputContent
  | DocumentInputContent

interface InputContentDataSource {
  type: "data"
  value: string
  mimeType: string
}

interface InputContentUrlSource {
  type: "url"
  value: string
  mimeType?: string
}

type InputContentSource = InputContentDataSource | InputContentUrlSource

interface TextInputContent {
  type: "text"
  text: string
}

interface ImageInputContent {
  type: "image"
  source: InputContentSource
  metadata?: Record<string, any>
}

interface AudioInputContent {
  type: "audio"
  source: InputContentSource
  metadata?: Record<string, any>
}

interface VideoInputContent {
  type: "video"
  source: InputContentSource
  metadata?: Record<string, any>
}

interface DocumentInputContent {
  type: "document"
  source: InputContentSource
  metadata?: Record<string, any>
}
```

> In Python, the previous `BinaryInputContent` model is deprecated and remains
> temporarily available as a compatibility path.

This structure keeps traditional plain-text inputs working while enabling richer payloads such as images, audio clips, or uploaded files in the same message.

### Assistant Messages

Messages from the AI assistant to the user:

```typescript theme={null}
interface AssistantMessage {
  id: string
  role: "assistant"
  content?: string // Text response from the assistant (optional if using tool calls)
  name?: string // Optional assistant identifier
  toolCalls?: ToolCall[] // Optional tool calls made by the assistant
  encryptedContent?: string // Optional encrypted content for state continuity
}
```

### System Messages

Instructions or context provided to the agent:

```typescript theme={null}
interface SystemMessage {
  id: string
  role: "system"
  content: string // Instructions or context for the agent
  name?: string // Optional identifier
}
```

### Tool Messages

Results from tool executions:

```typescript theme={null}
interface ToolMessage {
  id: string
  role: "tool"
  content: string // Result from the tool execution
  toolCallId: string // ID of the tool call this message responds to
  error?: string // Optional error message if the tool execution failed
  encryptedValue?: string // Optional encrypted reasoning for state continuity
}
```

Key points:

* The `toolCallId` links the result back to the original tool call
* Use `error` to indicate tool execution failures
* Use `encryptedValue` to attach encrypted chain-of-thought related to how the agent interpreted or processed the tool result

### Activity Messages

Structured UI messages that exist only on the frontend. Used for progress, status, or any custom visual element that shouldn't be sent to the model:

```typescript theme={null}
interface ActivityMessage {
  id: string
  role: "activity"
  activityType: string // e.g. "PLAN", "SEARCH", "SCRAPE"
  content: Record<string, any> // Structured payload rendered by the frontend
}
```

Key points:

* Emitted via `ACTIVITY_SNAPSHOT` and `ACTIVITY_DELTA` to support live, updateable UI (checklists, steps, search-in-progress, etc.).
* **Frontend-only:** never forwarded to the agent, so no filtering and no LLM confusion.
* **Customizable:** define your own `activityType` and `content` and render a matching UI component.
* **Streamable:** can be updated over time for long-running operations.
* Helps persist/restore custom events by turning them into durable message objects.

### Developer Messages

Internal messages used for development or debugging:

```typescript theme={null}
interface DeveloperMessage {
  id: string
  role: "developer"
  content: string
  name?: string
}
```

### Reasoning Messages

Messages representing the agent's internal reasoning or chain-of-thought process:

```typescript theme={null}
interface ReasoningMessage {
  id: string
  role: "reasoning"
  content: string // Reasoning content (visible to client)
  encryptedValue?: string // Optional encrypted reasoning for state continuity
}
```

Unlike Activity messages, Reasoning messages represent the agent's internal thought process: they may be encrypted for privacy, and they are meant to be sent back to the agent for further processing on subsequent turns.

Key points:

* Emitted via `REASONING_MESSAGE_START`, `REASONING_MESSAGE_CONTENT`, and `REASONING_MESSAGE_END` events.
* **Visibility control:** Content may be visible to users (as a summary) or fully encrypted.
* **Encrypted values:** Use `REASONING_ENCRYPTED_VALUE` events to attach encrypted chain-of-thought to messages or tool calls without exposing content.
* **State continuity:** Encrypted reasoning items can be passed across conversation turns without exposing raw chain-of-thought.
* **Privacy-first:** Supports `store:false` and zero data retention (ZDR) policies while preserving reasoning capabilities.
* **Separate from assistant messages:** Reasoning is kept distinct from final responses to avoid polluting the conversation history.

See [Reasoning Events](/concepts/events#reasoning-events) for the streaming event lifecycle.

## Vendor Neutrality

AG-UI messages are designed to be vendor-neutral, meaning they can be easily mapped to and from proprietary formats used by various AI providers:

```typescript theme={null}
// Example: Converting AG-UI messages to OpenAI format
const openaiMessages = agUiMessages
  .filter((msg) => ["user", "system", "assistant"].includes(msg.role))
  .map((msg) => ({
    role: msg.role as "user" | "system" | "assistant",
    content: msg.content || "",
    // Map tool calls if present
    ...(msg.role === "assistant" && msg.toolCalls
      ? {
          tool_calls: msg.toolCalls.map((tc) => ({
            id: tc.id,
            type: tc.type,
            function: {
              name: tc.function.name,
              arguments: tc.function.arguments,
            },
          })),
        }
      : {}),
  }))
```

This abstraction allows AG-UI to serve as a common interface regardless of the underlying AI service.

## Message Synchronization

Messages can be synchronized between client and server through two primary mechanisms:

### Complete Snapshots

The `MESSAGES_SNAPSHOT` event provides a complete view of all messages in a conversation:

```typescript theme={null}
interface MessagesSnapshotEvent {
  type: EventType.MESSAGES_SNAPSHOT
  messages: Message[] // Complete array of all messages
}
```

This is typically used:

* When initializing a conversation
* After connection interruptions
* When major state changes occur
* To ensure client-server synchronization

### Streaming Messages

For real-time interactions, new messages can be streamed as they're generated:

1. **Start a message**: Indicate a new message is being created

   ```typescript theme={null}
   interface TextMessageStartEvent {
     type: EventType.TEXT_MESSAGE_START
     messageId: string
     role: string
   }
   ```

2. **Stream content**: Send content chunks as they become available

   ```typescript theme={null}
   interface TextMessageContentEvent {
     type: EventType.TEXT_MESSAGE_CONTENT
     messageId: string
     delta: string // Text chunk to append
   }
   ```

3.
**End a message**: Signal the message is complete ```typescript theme={null} interface TextMessageEndEvent { type: EventType.TEXT_MESSAGE_END messageId: string } ``` This streaming approach provides a responsive user experience with immediate feedback. ## Tool Integration in Messages AG-UI messages elegantly integrate tool usage, allowing agents to perform actions and process their results: ### Tool Calls Tool calls are embedded within assistant messages: ```typescript theme={null} interface ToolCall { id: string // Unique ID for this tool call type: "function" // Type of tool call function: { name: string // Name of the function to call arguments: string // JSON-encoded string of arguments } } ``` Example assistant message with tool calls: ```typescript theme={null} { id: "msg_123", role: "assistant", content: "I'll help you with that calculation.", toolCalls: [ { id: "call_456", type: "function", function: { name: "calculate", arguments: '{"expression": "24 * 7"}' } } ] } ``` ### Tool Results Results from tool executions are represented as tool messages: ```typescript theme={null} { id: "result_789", role: "tool", content: "168", toolCallId: "call_456" // References the original tool call } ``` This creates a clear chain of tool usage: 1. Assistant requests a tool call 2. Tool executes and returns a result 3. Assistant can reference and respond to the result ## Streaming Tool Calls Similar to text messages, tool calls can be streamed to provide real-time visibility into the agent's actions: 1. **Start a tool call**: ```typescript theme={null} interface ToolCallStartEvent { type: EventType.TOOL_CALL_START toolCallId: string toolCallName: string parentMessageId?: string // Optional link to parent message } ``` 2. **Stream arguments**: ```typescript theme={null} interface ToolCallArgsEvent { type: EventType.TOOL_CALL_ARGS toolCallId: string delta: string // JSON fragment to append to arguments } ``` 3. **End a tool call**: ```typescript theme={null} interface ToolCallEndEvent { type: EventType.TOOL_CALL_END toolCallId: string } ``` This allows frontends to show tools being invoked progressively as the agent constructs its reasoning. ## Practical Example Here's a complete example of a conversation with tool usage: ```typescript theme={null} // Conversation history ;[ // User query { id: "msg_1", role: "user", content: "What's the weather in New York?", }, // Assistant response with tool call { id: "msg_2", role: "assistant", content: "Let me check the weather for you.", toolCalls: [ { id: "call_1", type: "function", function: { name: "get_weather", arguments: '{"location": "New York", "unit": "celsius"}', }, }, ], }, // Tool result { id: "result_1", role: "tool", content: '{"temperature": 22, "condition": "Partly Cloudy", "humidity": 65}', toolCallId: "call_1", }, // Assistant's final response using tool results { id: "msg_3", role: "assistant", content: "The weather in New York is partly cloudy with a temperature of 22°C and 65% humidity.", }, ] ``` ## Conclusion The message structure in AG-UI enables sophisticated conversational AI experiences while maintaining vendor neutrality. By standardizing how messages are represented, synchronized, and streamed, AG-UI provides a consistent way to implement interactive human-agent communication regardless of the underlying AI service. This system supports everything from simple text exchanges to complex tool-based workflows, all while optimizing for both real-time responsiveness and efficient data transfer. 
# Middleware

Source: https://docs.ag-ui.com/concepts/middleware

Transform and intercept events in AG-UI agents

# Middleware

Middleware in AG-UI provides a powerful way to transform, filter, and augment the event streams that flow through agents. It enables you to add cross-cutting concerns like logging, authentication, rate limiting, and event filtering without modifying the core agent logic.

Examples below assume the relevant RxJS operators/utilities (`map`, `tap`, `catchError`, `switchMap`, `timer`, etc.) are imported.

## What is Middleware?

Middleware sits between the agent execution and the event consumer, allowing you to:

1. **Transform events** – Modify or enhance events as they flow through the pipeline
2. **Filter events** – Selectively allow or block certain events
3. **Add metadata** – Inject additional context or tracking information
4. **Handle errors** – Implement custom error recovery strategies
5. **Monitor execution** – Add logging, metrics, or debugging capabilities

## How Middleware Works

Middleware forms a chain where each middleware wraps the next, creating layers of functionality. When an agent runs, the event stream flows through each middleware in sequence.

```typescript theme={null}
import { AbstractAgent } from "@ag-ui/client"

const agent = new MyAgent()

// Middleware chain: logging -> auth -> filter -> agent
agent.use(loggingMiddleware, authMiddleware, filterMiddleware)

// When agent runs, events flow through all middleware
await agent.runAgent()
```

Middleware added with `agent.use(...)` is applied in `runAgent()`. `connectAgent()` currently calls `connect()` directly and does not run middleware.

## Function-Based Middleware

For simple transformations, you can use function-based middleware. This is the most concise way to add middleware:

```typescript theme={null}
import { MiddlewareFunction } from "@ag-ui/client"
import { EventType } from "@ag-ui/core"

const prefixMiddleware: MiddlewareFunction = (input, next) => {
  return next.run(input).pipe(
    map(event => {
      if (
        event.type === EventType.TEXT_MESSAGE_CHUNK ||
        event.type === EventType.TEXT_MESSAGE_CONTENT
      ) {
        return { ...event, delta: `[AI]: ${event.delta}` }
      }
      return event
    })
  )
}

agent.use(prefixMiddleware)
```

## Class-Based Middleware

For more complex scenarios requiring state or configuration, use class-based middleware:

```typescript theme={null}
import { Middleware, AbstractAgent, RunAgentInput } from "@ag-ui/client"
import { BaseEvent } from "@ag-ui/core"
import { Observable } from "rxjs"
import { tap, finalize } from "rxjs/operators"

class MetricsMiddleware extends Middleware {
  private eventCount = 0

  constructor(private metricsService: MetricsService) {
    super()
  }

  run(input: RunAgentInput, next: AbstractAgent): Observable<BaseEvent> {
    const startTime = Date.now()

    return this.runNext(input, next).pipe(
      tap(event => {
        this.eventCount++
        this.metricsService.recordEvent(event.type)
      }),
      finalize(() => {
        const duration = Date.now() - startTime
        this.metricsService.recordDuration(duration)
        this.metricsService.recordEventCount(this.eventCount)
      })
    )
  }
}

agent.use(new MetricsMiddleware(metricsService))
```

If you are writing class middleware, prefer the helper methods:

* `runNext(input, next)` normalizes chunk events into full `TEXT_MESSAGE_*`/`TOOL_CALL_*` sequences.
* `runNextWithState(input, next)` also provides accumulated `messages` and `state` after each event.
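As another sketch, error handling can live in its own class middleware built on `runNext`. This is a minimal illustration, not a definitive implementation: the `RUN_ERROR` payload shown here is a simplified assumption, so check the event reference for the exact shape before relying on it.

```typescript theme={null}
import { Middleware, AbstractAgent, RunAgentInput } from "@ag-ui/client"
import { BaseEvent, EventType } from "@ag-ui/core"
import { Observable, of } from "rxjs"
import { catchError } from "rxjs/operators"

class ErrorRecoveryMiddleware extends Middleware {
  run(input: RunAgentInput, next: AbstractAgent): Observable<BaseEvent> {
    return this.runNext(input, next).pipe(
      catchError(error => {
        // Convert a stream failure into a terminal RUN_ERROR event
        // instead of letting the subscription crash.
        console.error("Agent run failed:", error)
        return of({
          type: EventType.RUN_ERROR,
          message: error instanceof Error ? error.message : String(error),
        } as BaseEvent)
      })
    )
  }
}

agent.use(new ErrorRecoveryMiddleware())
```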
## Built-in Middleware

AG-UI provides several built-in middleware components for common use cases:

### FilterToolCallsMiddleware

Filter tool calls based on allowed or disallowed lists:

```typescript theme={null}
import { FilterToolCallsMiddleware } from "@ag-ui/client"

// Only allow specific tools
const allowedFilter = new FilterToolCallsMiddleware({
  allowedToolCalls: ["search", "calculate"]
})

// Or block specific tools
const blockedFilter = new FilterToolCallsMiddleware({
  disallowedToolCalls: ["delete", "modify"]
})

agent.use(allowedFilter)
```

`FilterToolCallsMiddleware` filters emitted `TOOL_CALL_*` events. It does not block tool execution in the upstream model/runtime.

## Middleware Patterns

Common patterns include logging, auth via `forwardedProps`, and rate limiting. See the [JS middleware reference](/sdk/js/client/middleware) for concrete implementations.

## Combining Middleware

You can combine multiple middleware to create sophisticated processing pipelines:

```typescript theme={null}
const logMiddleware: MiddlewareFunction = (input, next) =>
  next.run(input).pipe(tap(event => console.log(event.type)))

const metricsMiddleware = new MetricsMiddleware(metricsService)

const filterMiddleware = new FilterToolCallsMiddleware({
  allowedToolCalls: ["search"]
})

agent.use(logMiddleware, metricsMiddleware, filterMiddleware)
```

## Execution Order

Middleware executes in the order it's added, with each middleware wrapping the next:

1. First middleware receives the original input
2. It can modify the input before passing to the next middleware
3. Each middleware processes events from the next in the chain
4. The final middleware calls the actual agent

```typescript theme={null}
agent.use(middleware1, middleware2, middleware3)

// Execution flow:
// → middleware1
//   → middleware2
//     → middleware3
//       → agent.run()
//     ← events flow back through middleware3
//   ← events flow back through middleware2
// ← events flow back through middleware1
```

## Best Practices

1. **Keep middleware focused** – Each middleware should have a single responsibility
2. **Handle errors gracefully** – Use RxJS error handling operators
3. **Avoid blocking operations** – Use async patterns for I/O operations
4. **Document side effects** – Clearly indicate if middleware modifies state
5. **Test middleware independently** – Write unit tests for each middleware
6. **Consider performance** – Be mindful of processing overhead in the event stream

## Advanced Use Cases

### Conditional Middleware

Apply middleware based on runtime conditions:

```typescript theme={null}
const conditionalMiddleware: MiddlewareFunction = (input, next) => {
  if (input.forwardedProps?.debug === true) {
    // Apply debug logging
    return next.run(input).pipe(
      tap(event => console.debug(event))
    )
  }
  return next.run(input)
}
```

For event transformation and stream-control variants, see the [JS middleware reference](/sdk/js/client/middleware).

## Conclusion

Middleware provides a flexible and powerful way to extend AG-UI agents without modifying their core logic. Whether you need simple event transformation or complex stateful processing, the middleware system offers the tools to build robust, maintainable agent applications.

# Reasoning

Source: https://docs.ag-ui.com/concepts/reasoning

Support for LLM reasoning visibility and continuity in AG-UI

# Reasoning

AG-UI provides first-class support for LLM reasoning, enabling chain-of-thought visibility while maintaining privacy and state continuity across conversation turns.
## Overview

Modern LLMs increasingly use chain-of-thought reasoning to improve response quality. AG-UI's reasoning support addresses three key challenges:

* **Reasoning visibility**: Surface reasoning signals (e.g., summaries) to users without exposing raw chain-of-thought
* **State continuity**: Maintain reasoning context across turns using encrypted reasoning items, even under `store:false` or zero data retention (ZDR) policies
* **Privacy compliance**: Support enterprise privacy requirements while preserving reasoning capabilities

Unlike Activity messages, Reasoning messages represent the agent's internal thought process: they may be encrypted for privacy, and they are meant to be sent back to the agent for further processing on subsequent turns.

## ReasoningMessage

The `ReasoningMessage` type represents reasoning content in the message history:

```typescript theme={null}
interface ReasoningMessage {
  id: string
  role: "reasoning"
  content: string // Reasoning content (visible to client)
  encryptedValue?: string // Optional encrypted reasoning for state continuity
}
```

| Property         | Type          | Description                                          |
| ---------------- | ------------- | ---------------------------------------------------- |
| `id`             | `string`      | Unique identifier for the reasoning message          |
| `role`           | `"reasoning"` | Message role discriminator                           |
| `content`        | `string`      | Reasoning content visible to the client              |
| `encryptedValue` | `string?`     | Encrypted chain-of-thought blob for state continuity |

Key characteristics:

* **Separate from assistant messages**: Reasoning is kept distinct from final responses to avoid polluting conversation history
* **Streamable**: Content arrives via streaming events
* **Optional encryption**: When `encryptedValue` is present, it represents encrypted chain-of-thought that the client stores and forwards opaquely

## Reasoning Events

Reasoning events manage the lifecycle of reasoning messages. See [Events](/concepts/events#reasoning-events) for the complete event reference.

### Event Flow

A typical reasoning flow follows this pattern:

```mermaid theme={null}
sequenceDiagram
    participant Agent
    participant Client

    Note over Agent,Client: Reasoning begins
    Agent->>Client: ReasoningStart

    Note over Agent,Client: Stream visible reasoning
    Agent->>Client: ReasoningMessageStart
    Agent->>Client: ReasoningMessageContent (delta)
    Agent->>Client: ReasoningMessageContent (delta)
    Agent->>Client: ReasoningMessageEnd

    Note over Agent,Client: Attach encrypted chain-of-thought
    Agent->>Client: ReasoningEncryptedValue

    Note over Agent,Client: Reasoning completes
    Agent->>Client: ReasoningEnd
```

### Event Types

| Event                     | Purpose                                                        |
| ------------------------- | -------------------------------------------------------------- |
| `ReasoningStart`          | Marks beginning of reasoning phase                             |
| `ReasoningMessageStart`   | Begins a streaming reasoning message                           |
| `ReasoningMessageContent` | Delivers reasoning content chunks                              |
| `ReasoningMessageEnd`     | Completes a reasoning message                                  |
| `ReasoningMessageChunk`   | Convenience event that auto-manages message lifecycle          |
| `ReasoningEnd`            | Marks completion of reasoning                                  |
| `ReasoningEncryptedValue` | Attaches encrypted chain-of-thought to a message or tool call  |

## Privacy and Compliance

AG-UI reasoning is designed with privacy-first principles:

### Zero Data Retention (ZDR)

For deployments requiring zero data retention:

1. **Encrypted reasoning values** can carry state across turns without storing decryptable content on the client
2.
The client receives and forwards `encryptedValue` blobs opaquely via `ReasoningEncryptedValue` events 3. Only the agent (or authorized backend) can decrypt the reasoning content ### Visibility Control Agents control what reasoning is visible to users: * **Full visibility**: Stream the complete chain-of-thought via `ReasoningMessageContent` events * **Summary only**: Emit a condensed summary while attaching detailed reasoning as encrypted values * **Hidden**: Use only `ReasoningEncryptedValue` events with no visible streaming ### Compliance Considerations | Requirement | Solution | | ----------------------- | ----------------------------------------------------------------------------------- | | GDPR right to erasure | Encrypted content can be discarded without losing reasoning capability | | SOC 2 data handling | Reasoning content never stored in plaintext on client | | HIPAA minimum necessary | Only summaries exposed; detailed reasoning stays encrypted | | Audit logging | `ReasoningStart`/`ReasoningEnd` events provide audit trail without content exposure | ## Example Implementations ### Basic Reasoning Flow A simple implementation showing visible reasoning: ```typescript theme={null} // Agent emits reasoning start yield { type: "REASONING_START", messageId: "reasoning-001", } // Stream visible reasoning content yield { type: "REASONING_MESSAGE_START", messageId: "msg-123", role: "reasoning", } yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-123", delta: "Let me ", } yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-123", delta: "think through ", } yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-123", delta: "this step ", } yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-123", delta: "by step...", } yield { type: "REASONING_MESSAGE_END", messageId: "msg-123", } // End reasoning yield { type: "REASONING_END", messageId: "reasoning-001", } ``` ### Encrypted Content for State Continuity When maintaining reasoning state across turns without exposing content, use the `ReasoningEncryptedValue` event to attach encrypted chain-of-thought to messages or tool calls: ```typescript theme={null} // Agent emits reasoning start yield { type: "REASONING_START", messageId: "reasoning-002", } // Stream a visible summary for the user yield { type: "REASONING_MESSAGE_START", messageId: "msg-456", role: "reasoning", } yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-456", delta: "Analyzing your request...", } yield { type: "REASONING_MESSAGE_END", messageId: "msg-456", } // Attach encrypted chain-of-thought to the reasoning message yield { type: "REASONING_ENCRYPTED_VALUE", subtype: "message", entityId: "msg-456", encryptedValue: "eyJhbGciOiJBMjU2R0NNIiwiZW5jIjoiQTI1NkdDTSJ9...", } yield { type: "REASONING_END", messageId: "reasoning-002", } // On subsequent turns, client sends back the message with encryptedValue // which the agent can decrypt to restore reasoning context ``` ### Attaching Encrypted Reasoning to Tool Calls You can also attach encrypted reasoning to tool calls to capture why the agent chose specific arguments or how it interpreted results: ```typescript theme={null} // Tool call with encrypted reasoning yield { type: "TOOL_CALL_START", toolCallId: "tool-123", toolCallName: "search_database", parentMessageId: "msg-789", } yield { type: "TOOL_CALL_ARGS", toolCallId: "tool-123", delta: '{"query": "user preferences"}', } yield { type: "TOOL_CALL_END", toolCallId: "tool-123", } // Attach encrypted reasoning explaining why this tool was called 
yield { type: "REASONING_ENCRYPTED_VALUE", subtype: "tool-call", entityId: "tool-123", encryptedValue: "encrypted-reasoning-about-tool-selection...", } ``` ### ZDR-Compliant Implementation For zero data retention scenarios: ```typescript theme={null} // Server-side: encrypt reasoning before sending const encryptedReasoning = await encrypt(detailedChainOfThought, secretKey) yield { type: "REASONING_START", messageId: "reasoning-003", } // Only emit a high-level summary to the client yield { type: "REASONING_MESSAGE_CHUNK", messageId: "summary-001", delta: "Processing your request securely...", } yield { type: "REASONING_MESSAGE_CHUNK", messageId: "summary-001", delta: "", // Empty delta closes the message } // Attach the encrypted chain-of-thought yield { type: "REASONING_ENCRYPTED_VALUE", subtype: "message", entityId: "summary-001", encryptedValue: encryptedReasoning, } yield { type: "REASONING_END", messageId: "reasoning-003", } // Client stores only: // - The encrypted blob (cannot decrypt) // - The summary text (no sensitive details) // Full reasoning is never persisted in plaintext ``` ### Using the Convenience Chunk Event The `ReasoningMessageChunk` event simplifies implementation by auto-managing message lifecycle: ```typescript theme={null} // First chunk with messageId starts the message automatically yield { type: "REASONING_MESSAGE_CHUNK", messageId: "msg-789", delta: "Analyzing the problem space...", } // Subsequent chunks continue the stream yield { type: "REASONING_MESSAGE_CHUNK", messageId: "msg-789", delta: " Considering multiple approaches...", } // Empty delta (or next non-reasoning event) closes automatically yield { type: "REASONING_MESSAGE_CHUNK", messageId: "msg-789", delta: "", } ``` ## Client Integration ### Handling Reasoning Events ```typescript theme={null} import { EventType, type BaseEvent } from "@ag-ui/core" function handleEvent(event: BaseEvent) { switch (event.type) { case EventType.REASONING_START: // Initialize reasoning UI (e.g., "thinking" indicator) console.log("Agent is reasoning...") break case EventType.REASONING_MESSAGE_CONTENT: // Append visible reasoning to UI appendReasoningText(event.messageId, event.delta) break case EventType.REASONING_ENCRYPTED_VALUE: // Store encrypted value for the referenced entity if (event.subtype === "message") { storeMessageEncryptedValue(event.entityId, event.encryptedValue) } else if (event.subtype === "tool-call") { storeToolCallEncryptedValue(event.entityId, event.encryptedValue) } break case EventType.REASONING_END: // Finalize reasoning UI console.log("Reasoning complete") break } } ``` ### Passing Encrypted Reasoning Back When making subsequent requests, include stored encrypted values: ```typescript theme={null} const response = await agent.run({ threadId: "thread-123", messages: [ ...previousMessages, { id: "reasoning-002", role: "reasoning", content: "Analyzing your request...", // Visible summary encryptedValue: storedEncryptedBlob, // Opaque to client }, { id: "user-msg-001", role: "user", content: "Follow up question...", }, ], }) ``` ## Migration from Thinking Events The `THINKING_*` events are deprecated and will be removed in version 1.0.0. New implementations should use `REASONING_*` events. 
### Deprecated Events

The following events are deprecated:

| Deprecated Event                | Replacement                 |
| ------------------------------- | --------------------------- |
| `THINKING_START`                | `REASONING_START`           |
| `THINKING_END`                  | `REASONING_END`             |
| `THINKING_TEXT_MESSAGE_START`   | `REASONING_MESSAGE_START`   |
| `THINKING_TEXT_MESSAGE_CONTENT` | `REASONING_MESSAGE_CONTENT` |
| `THINKING_TEXT_MESSAGE_END`     | `REASONING_MESSAGE_END`     |

### Migration Steps

1. **Update event types**: Replace all `THINKING_*` event types with their `REASONING_*` equivalents
2. **Update message types**: Use `ReasoningMessage` with `role: "reasoning"` instead of any thinking-specific message types
3. **Add encrypted value support**: Consider using `ReasoningEncryptedValue` events for improved privacy compliance
4. **Test thoroughly**: Ensure existing functionality works with the new event types

### Example Migration

Before (deprecated):

```typescript theme={null}
// ❌ Deprecated - do not use
yield { type: "THINKING_START", messageId: "think-001" }
yield { type: "THINKING_TEXT_MESSAGE_START", messageId: "msg-001" }
yield { type: "THINKING_TEXT_MESSAGE_CONTENT", messageId: "msg-001", delta: "..." }
yield { type: "THINKING_TEXT_MESSAGE_END", messageId: "msg-001" }
yield { type: "THINKING_END", messageId: "think-001" }
```

After (current):

```typescript theme={null}
// ✅ Current implementation
yield { type: "REASONING_START", messageId: "reasoning-001" }
yield { type: "REASONING_MESSAGE_START", messageId: "msg-001", role: "reasoning" }
yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-001", delta: "..." }
yield { type: "REASONING_MESSAGE_END", messageId: "msg-001" }
yield { type: "REASONING_END", messageId: "reasoning-001" }
```

## Best Practices

1. **Always pair start/end events**: Every `ReasoningStart` should have a corresponding `ReasoningEnd`
2. **Use encrypted values for sensitive reasoning**: When chain-of-thought contains sensitive information, use `ReasoningEncryptedValue` to attach encrypted content to messages or tool calls
3. **Provide user feedback**: Even with encrypted reasoning, emit visible summaries so users know the agent is working
4. **Handle missing events gracefully**: Clients should be resilient to incomplete event streams
5. **Consider bandwidth**: For very long reasoning chains, consider emitting only summaries to reduce data transfer

## Related Documentation

* [Events](/concepts/events#reasoning-events) - Complete event type reference
* [Messages](/concepts/messages#reasoning-messages) - Message type documentation
* [Serialization](/concepts/serialization) - State continuity and lineage

# Serialization

Source: https://docs.ag-ui.com/concepts/serialization

Serialize event streams for history restore, branching, and compaction in AG-UI

# Serialization

Serialization in AG-UI provides a standard way to persist and restore the event stream that drives an agent–UI session. With a serialized stream you can:

* Restore chat history and UI state after reloads or reconnects
* Attach to running agents and continue receiving events
* Create branches (time travel) from any prior run
* Compact stored history to reduce size without losing meaning

This page explains the model, the updated event fields, and practical usage patterns with examples.

## Core Concepts

* Stream serialization – Convert the full event history to and from a portable representation (e.g., JSON) for storage in databases, files, or logs.
* Event compaction – Reduce verbose streams to snapshots while preserving semantics (e.g., merge content chunks, collapse deltas into snapshots).
* Run lineage – Track branches of conversation using a `parentRunId`, forming a git‑like append‑only log that enables time travel and alternative paths.

## Updated Event Fields

The `RunStarted` event includes additional optional fields:

```ts theme={null}
type RunStartedEvent = BaseEvent & {
  type: EventType.RUN_STARTED
  threadId: string
  runId: string
  /** Parent for branching/time travel within the same thread */
  parentRunId?: string
  /** Exact agent input for this run (may omit messages already in history) */
  input?: AgentInput
}
```

These fields enable lineage tracking and let implementations record precisely what was passed to the agent, independent of previously recorded messages.

## Event Compaction

Compaction reduces noise in an event stream while keeping the same observable outcome. A typical implementation provides a utility:

```ts theme={null}
declare function compactEvents(events: BaseEvent[]): BaseEvent[]
```

Common compaction rules include:

* Message streams – Combine `TEXT_MESSAGE_*` sequences into a single message snapshot; concatenate adjacent `TEXT_MESSAGE_CONTENT` for the same message.
* Tool calls – Collapse tool call start/content/end into a compact record.
* State – Merge consecutive `STATE_DELTA` events into a single final `STATE_SNAPSHOT` and discard superseded updates.
* Run input normalization – Remove from `RunStarted.input.messages` any messages already present earlier in the stream.

## Branching and Time Travel

Setting `parentRunId` on a `RunStarted` event creates a git‑like lineage. The stream becomes an immutable append‑only log where each run can branch from any previous run.

```mermaid theme={null}
gitGraph
  commit id: "run1"
  commit id: "run2"
  branch alternative
  checkout alternative
  commit id: "run3 (parent run2)"
  commit id: "run4"
  checkout main
  commit id: "run5 (parent run2)"
  commit id: "run6"
```

Benefits:

* Multiple branches in the same serialized log
* Immutable history (append‑only)
* Deterministic time travel to any point

## Examples

### Basic Serialization

```ts theme={null}
// Serialize event stream
const events: BaseEvent[] = [...];
const serialized = JSON.stringify(events);
await storage.save(threadId, serialized);

// Restore and compact later
const restored = JSON.parse(await storage.load(threadId));
const compacted = compactEvents(restored);
```

### Event Compaction

Before:

```ts theme={null}
[
  { type: "TEXT_MESSAGE_START", messageId: "msg1", role: "user" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "Hello " },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "world" },
  { type: "TEXT_MESSAGE_END", messageId: "msg1" },
  { type: "STATE_DELTA", delta: [{ op: "add", path: "/foo", value: 1 }] },
  { type: "STATE_DELTA", delta: [{ op: "replace", path: "/foo", value: 2 }] },
]
```

After:

```ts theme={null}
[
  {
    type: "MESSAGES_SNAPSHOT",
    messages: [{ id: "msg1", role: "user", content: "Hello world" }],
  },
  {
    type: "STATE_SNAPSHOT",
    state: { foo: 2 },
  },
]
```

### Branching With `parentRunId`

```ts theme={null}
// Original run
{
  type: "RUN_STARTED",
  threadId: "thread1",
  runId: "run1",
  input: { messages: ["Tell me about Paris"] },
}

// Branch from run1
{
  type: "RUN_STARTED",
  threadId: "thread1",
  runId: "run2",
  parentRunId: "run1",
  input: { messages: ["Actually, tell me about London instead"] },
}
```

### Normalized Input

```ts theme={null}
// First run includes full message
{
  type: "RUN_STARTED",
runId: "run1", input: { messages: [{ id: "msg1", role: "user", content: "Hello" }] }, } // Second run omits already‑present message { type: "RUN_STARTED", runId: "run2", input: { messages: [{ id: "msg2", role: "user", content: "How are you?" }] }, // msg1 omitted; it already exists in history } ``` ## Implementation Notes * Provide SDK helpers for compaction and (de)serialization. * Store streams append‑only; prefer incremental writes when possible. * Consider compression when persisting long histories. * Add indexes by `threadId`, `runId`, and timestamps for fast retrieval. ## See Also * Concepts: [Events](/concepts/events), [State Management](/concepts/state) * SDKs: TypeScript encoder and core event types # State Management Source: https://docs.ag-ui.com/concepts/state Understanding state synchronization between agents and frontends in AG-UI # State Management State management is a core feature of the AG-UI protocol that enables real-time synchronization between agents and frontend applications. By providing efficient mechanisms for sharing and updating state, AG-UI creates a foundation for collaborative experiences where both AI agents and human users can work together seamlessly. ## Shared State Architecture In AG-UI, state is a structured data object that: 1. Persists across interactions with an agent 2. Can be accessed by both the agent and the frontend 3. Updates in real-time as the interaction progresses 4. Provides context for decision-making on both sides This shared state architecture creates a bidirectional communication channel where: * Agents can access the application's current state to make informed decisions * Frontends can observe and react to changes in the agent's internal state * Both sides can modify the state, creating a collaborative workflow ## State Synchronization Methods AG-UI provides two complementary methods for state synchronization: ### State Snapshots The `STATE_SNAPSHOT` event delivers a complete representation of an agent's current state: ```typescript theme={null} interface StateSnapshotEvent { type: EventType.STATE_SNAPSHOT snapshot: any // Complete state object } ``` Snapshots are typically used: * At the beginning of an interaction to establish the initial state * After connection interruptions to ensure synchronization * When major state changes occur that require a complete refresh * To establish a new baseline for future delta updates When a frontend receives a `STATE_SNAPSHOT` event, it should replace its existing state model entirely with the contents of the snapshot. ### State Deltas The `STATE_DELTA` event delivers incremental updates to the state using JSON Patch format (RFC 6902): ```typescript theme={null} interface StateDeltaEvent { type: EventType.STATE_DELTA delta: JsonPatchOperation[] // Array of JSON Patch operations } ``` Deltas are bandwidth-efficient, sending only what has changed rather than the entire state. 
This approach is particularly valuable for:

* Frequent small updates during streaming interactions
* Large state objects where most properties remain unchanged
* High-frequency updates that would be inefficient to send as full snapshots

## JSON Patch Format

AG-UI uses the JSON Patch format (RFC 6902) for state deltas, which defines a standardized way to express changes to a JSON document:

```typescript theme={null}
interface JsonPatchOperation {
  op: "add" | "remove" | "replace" | "move" | "copy" | "test"
  path: string // JSON Pointer (RFC 6901) to the target location
  value?: any // The value to apply (for add, replace)
  from?: string // Source path (for move, copy)
}
```

Common operations include:

1. **add**: Adds a value to an object or array

   ```json theme={null}
   { "op": "add", "path": "/user/preferences", "value": { "theme": "dark" } }
   ```

2. **replace**: Replaces a value

   ```json theme={null}
   { "op": "replace", "path": "/conversation_state", "value": "paused" }
   ```

3. **remove**: Removes a value

   ```json theme={null}
   { "op": "remove", "path": "/temporary_data" }
   ```

4. **move**: Moves a value from one location to another

   ```json theme={null}
   { "op": "move", "path": "/completed_items", "from": "/pending_items/0" }
   ```

Frontends should apply these patches in sequence to maintain an accurate state representation. If inconsistencies are detected after applying patches, the frontend can request a fresh `STATE_SNAPSHOT`.

## State Processing in AG-UI

In the AG-UI implementation, state deltas are applied using the `fast-json-patch` library:

```typescript theme={null}
// applyPatch is imported from the fast-json-patch library
case EventType.STATE_DELTA: {
  const { delta } = event as StateDeltaEvent;
  try {
    // Apply the JSON Patch operations to the current state without mutating the original
    const result = applyPatch(state, delta, true, false);
    state = result.newDocument;
    return emitUpdate({ state });
  } catch (error: unknown) {
    const errorMessage = error instanceof Error ? error.message : String(error);
    console.warn(
      `Failed to apply state patch:\n` +
        `Current state: ${JSON.stringify(state, null, 2)}\n` +
        `Patch operations: ${JSON.stringify(delta, null, 2)}\n` +
        `Error: ${errorMessage}`
    );
    return emitNoUpdate();
  }
}
```

This implementation ensures that:

* Patches are applied atomically (all or none)
* The original state is not mutated during the application process
* Errors are caught and handled gracefully

## Human-in-the-Loop Collaboration

The shared state system is fundamental to human-in-the-loop workflows in AG-UI. It enables:

1. **Real-time visibility**: Users can observe the agent's thought process and current status
2. **Contextual awareness**: The agent can access user actions, preferences, and application state
3. **Collaborative decision-making**: Both human and AI can contribute to the evolving state
4. **Feedback loops**: Humans can correct or guide the agent by modifying state properties

For example, an agent might update its state with a proposed action:

```json theme={null}
{
  "proposal": {
    "action": "send_email",
    "recipient": "client@example.com",
    "content": "Draft email content..."
  }
}
```

The frontend can display this proposal to the user, who can then approve, reject, or modify it before execution.

## CopilotKit Implementation

[CopilotKit](https://docs.copilotkit.ai), a popular framework for building AI assistants, leverages AG-UI's state management system through its "shared state" feature. This implementation enables bidirectional state synchronization between agents (particularly LangGraph agents) and frontend applications.
CopilotKit's shared state system is implemented through: ```jsx theme={null} // In the frontend React application const { state: agentState, setState: setAgentState } = useCoAgent({ name: "agent", initialState: { someProperty: "initialValue" }, }) ``` This hook creates a real-time connection to the agent's state, allowing: 1. Reading the agent's current state in the frontend 2. Updating the agent's state from the frontend 3. Rendering UI components based on the agent's state On the backend, LangGraph agents can emit state updates using: ```python theme={null} # In the LangGraph agent async def tool_node(self, state: ResearchState, config: RunnableConfig): # Update state with new information tool_state = { "title": new_state.get("title", ""), "outline": new_state.get("outline", {}), "sections": new_state.get("sections", []), # Other state properties... } # Emit updated state to frontend await copilotkit_emit_state(config, tool_state) return tool_state ``` These state updates are transmitted using AG-UI's state snapshot and delta mechanisms, creating a seamless shared context between agent and frontend. ## Best Practices When implementing state management in AG-UI: 1. **Use snapshots judiciously**: Full snapshots should be sent only when necessary to establish a baseline. 2. **Prefer deltas for incremental changes**: Small state updates should use deltas to minimize data transfer. 3. **Structure state thoughtfully**: Design state objects to support partial updates and minimize patch complexity. 4. **Handle state conflicts**: Implement strategies for resolving conflicting updates from agent and frontend. 5. **Include error recovery**: Provide mechanisms to resynchronize state if inconsistencies are detected. 6. **Consider security implications**: Avoid storing sensitive information in shared state. ## Conclusion AG-UI's state management system provides a powerful foundation for building collaborative applications where humans and AI agents work together. By efficiently synchronizing state between frontend and backend through snapshots and JSON Patch deltas, AG-UI enables sophisticated human-in-the-loop workflows that combine the strengths of both human intuition and AI capabilities. The implementation in frameworks like CopilotKit demonstrates how this shared state approach can create collaborative experiences that are more effective than either fully autonomous systems or traditional user interfaces. # Tools Source: https://docs.ag-ui.com/concepts/tools Understanding tools and how they enable human-in-the-loop AI workflows # Tools Tools are a fundamental concept in the AG-UI protocol that enable AI agents to interact with external systems and incorporate human judgment into their workflows. By defining tools in the frontend and passing them to agents, developers can create sophisticated human-in-the-loop experiences that combine AI capabilities with human expertise. ## What Are Tools? In AG-UI, tools are functions that agents can call to: 1. Request specific information 2. Perform actions in external systems 3. Ask for human input or confirmation 4. Access specialized capabilities Tools bridge the gap between AI reasoning and real-world actions, allowing agents to accomplish tasks that would be impossible through conversation alone. 
## Tool Structure Tools follow a consistent structure that defines their name, purpose, and expected parameters: ```typescript theme={null} interface Tool { name: string // Unique identifier for the tool description: string // Human-readable explanation of what the tool does parameters: { // JSON Schema defining the tool's parameters type: "object" properties: { // Tool-specific parameters } required: string[] // Array of required parameter names } } ``` The `parameters` field uses [JSON Schema](https://json-schema.org/) to define the structure of arguments that the tool accepts. This schema is used by both the agent (to generate valid tool calls) and the frontend (to validate and parse tool arguments). ## Frontend-Defined Tools A key aspect of AG-UI's tool system is that tools are defined in the frontend and passed to the agent during execution: ```typescript theme={null} // Define tools in the frontend const userConfirmationTool = { name: "confirmAction", description: "Ask the user to confirm a specific action before proceeding", parameters: { type: "object", properties: { action: { type: "string", description: "The action that needs user confirmation", }, importance: { type: "string", enum: ["low", "medium", "high", "critical"], description: "The importance level of the action", }, }, required: ["action"], }, } // Pass tools to the agent during execution agent.runAgent({ tools: [userConfirmationTool], // Other parameters... }) ``` This approach has several advantages: 1. **Frontend control**: The frontend determines what capabilities are available to the agent 2. **Dynamic capabilities**: Tools can be added or removed based on user permissions, context, or application state 3. **Separation of concerns**: Agents focus on reasoning while frontends handle tool implementation 4. **Security**: Sensitive operations are controlled by the application, not the agent ## Tool Call Lifecycle When an agent needs to use a tool, it follows a standardized sequence of events: 1. **ToolCallStart**: Indicates the beginning of a tool call with a unique ID and tool name ```typescript theme={null} { type: EventType.TOOL_CALL_START, toolCallId: "tool-123", toolCallName: "confirmAction", parentMessageId: "msg-456" // Optional reference to a message } ``` 2. **ToolCallArgs**: Streams the tool arguments as they're generated ```typescript theme={null} { type: EventType.TOOL_CALL_ARGS, toolCallId: "tool-123", delta: '{"act' // Partial JSON being streamed } ``` ```typescript theme={null} { type: EventType.TOOL_CALL_ARGS, toolCallId: "tool-123", delta: 'ion":"Depl' // More JSON being streamed } ``` ```typescript theme={null} { type: EventType.TOOL_CALL_ARGS, toolCallId: "tool-123", delta: 'oy the application to production"}' // Final JSON fragment } ``` 3. **ToolCallEnd**: Marks the completion of the tool call ```typescript theme={null} { type: EventType.TOOL_CALL_END, toolCallId: "tool-123" } ``` The frontend accumulates these deltas to construct the complete tool call arguments. Once the tool call is complete, the frontend can execute the tool and provide results back to the agent. ## Tool Results After a tool has been executed, the result is sent back to the agent as a "tool message": ```typescript theme={null} { id: "result-789", role: "tool", content: "true", // Tool result as a string toolCallId: "tool-123" // References the original tool call } ``` This message becomes part of the conversation history, allowing the agent to reference and incorporate the tool's result in subsequent responses. 
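To make this lifecycle concrete, here is a minimal client-side sketch that accumulates the streamed argument deltas, executes the tool when the call completes, and builds the resulting tool message. The `toolHandlers` registry and `sendToolResult` function are hypothetical application code, not part of the protocol:

```typescript theme={null}
import { EventType } from "@ag-ui/core"

// Hypothetical registry mapping tool names to local implementations.
declare const toolHandlers: Record<string, (args: any) => Promise<string>>
// Hypothetical function that queues a tool message for the next agent run.
declare function sendToolResult(msg: {
  id: string
  role: "tool"
  content: string
  toolCallId: string
}): void

const pendingCalls = new Map<string, { name: string; argsJson: string }>()

async function handleToolEvent(event: any) {
  switch (event.type) {
    case EventType.TOOL_CALL_START:
      pendingCalls.set(event.toolCallId, { name: event.toolCallName, argsJson: "" })
      break
    case EventType.TOOL_CALL_ARGS: {
      // Accumulate the streamed JSON fragments.
      const call = pendingCalls.get(event.toolCallId)
      if (call) call.argsJson += event.delta
      break
    }
    case EventType.TOOL_CALL_END: {
      const call = pendingCalls.get(event.toolCallId)
      if (!call) break
      pendingCalls.delete(event.toolCallId)
      const args = JSON.parse(call.argsJson) // arguments are now complete
      const result = await toolHandlers[call.name]?.(args)
      sendToolResult({
        id: crypto.randomUUID(),
        role: "tool",
        content: result ?? "",
        toolCallId: event.toolCallId,
      })
      break
    }
  }
}
```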
## Human-in-the-Loop Workflows

The AG-UI tool system is especially powerful for implementing human-in-the-loop workflows. By defining tools that request human input or confirmation, developers can create AI experiences that seamlessly blend autonomous operation with human judgment.

For example:

1. Agent needs to make an important decision
2. Agent calls the `confirmAction` tool with details about the decision
3. Frontend displays a confirmation dialog to the user
4. User provides their input
5. Frontend sends the user's decision back to the agent
6. Agent continues processing with awareness of the user's choice

This pattern enables use cases like:

* **Approval workflows**: AI suggests actions that require human approval
* **Data verification**: Humans verify or correct AI-generated data
* **Collaborative decision-making**: AI and humans jointly solve complex problems
* **Supervised learning**: Human feedback improves future AI decisions

## CopilotKit Integration

[CopilotKit](https://docs.copilotkit.ai/) provides a simplified way to work with AG-UI tools in React applications through its [`useCopilotAction`](https://docs.copilotkit.ai/guides/frontend-actions) hook:

```tsx theme={null}
import { useCopilotAction } from "@copilotkit/react-core"

// Define a tool for user confirmation
useCopilotAction({
  name: "confirmAction",
  description: "Ask the user to confirm an action",
  parameters: [
    {
      name: "action",
      type: "string",
      description: "The action to confirm",
      required: true,
    },
  ],
  handler: async ({ action }) => {
    // Show a confirmation dialog
    const confirmed = await showConfirmDialog(action)
    return confirmed ? "approved" : "rejected"
  },
})
```

This approach makes it easy to define tools that integrate with your React components and handle the tool execution logic in a clean, declarative way.
## Tool Examples Here are some common types of tools used in AG-UI applications: ### User Confirmation ```typescript theme={null} { name: "confirmAction", description: "Ask the user to confirm an action", parameters: { type: "object", properties: { action: { type: "string", description: "The action to confirm" }, importance: { type: "string", enum: ["low", "medium", "high", "critical"], description: "The importance level" } }, required: ["action"] } } ``` ### Data Retrieval ```typescript theme={null} { name: "fetchUserData", description: "Retrieve data about a specific user", parameters: { type: "object", properties: { userId: { type: "string", description: "ID of the user" }, fields: { type: "array", items: { type: "string" }, description: "Fields to retrieve" } }, required: ["userId"] } } ``` ### User Interface Control ```typescript theme={null} { name: "navigateTo", description: "Navigate to a different page or view", parameters: { type: "object", properties: { destination: { type: "string", description: "Destination page or view" }, params: { type: "object", description: "Optional parameters for the navigation" } }, required: ["destination"] } } ``` ### Content Generation ```typescript theme={null} { name: "generateImage", description: "Generate an image based on a description", parameters: { type: "object", properties: { prompt: { type: "string", description: "Description of the image to generate" }, style: { type: "string", description: "Visual style for the image" }, dimensions: { type: "object", properties: { width: { type: "number" }, height: { type: "number" } }, description: "Dimensions of the image" } }, required: ["prompt"] } } ``` ## Best Practices When designing tools for AG-UI: 1. **Clear naming**: Use descriptive, action-oriented names 2. **Detailed descriptions**: Include thorough descriptions to help the agent understand when and how to use the tool 3. **Structured parameters**: Define precise parameter schemas with descriptive field names and constraints 4. **Required fields**: Only mark parameters as required if they're truly necessary 5. **Error handling**: Implement robust error handling in tool execution code 6. **User experience**: Design tool UIs that provide appropriate context for human decision-making ## Conclusion Tools in AG-UI bridge the gap between AI reasoning and real-world actions, enabling sophisticated workflows that combine the strengths of AI and human intelligence. By defining tools in the frontend and passing them to agents, developers can create interactive experiences where AI and humans collaborate efficiently. The tool system is particularly powerful for implementing human-in-the-loop workflows, where AI can suggest actions but defer critical decisions to humans. This balances automation with human judgment, creating AI experiences that are both powerful and trustworthy. # Contributing Source: https://docs.ag-ui.com/development/contributing How to participate in Agent User Interaction Protocol development # Naming conventions Add your package under `integrations/` with docs and tests. If your integration is work in progress, you can still add it to main branch. You can prefix it with `wip-`, i.e. (`integrations/wip-your-integration`) or if you're a third party contributor use the `community` prefix, i.e. (`integrations/community/your-integration`). For questions and discussions, please use [GitHub Discussions](https://github.com/ag-ui-protocol/ag-ui/discussions). 
# Roadmap Source: https://docs.ag-ui.com/development/roadmap Our plans for evolving Agent User Interaction Protocol You can follow the progress of the AG-UI Protocol on our [public roadmap](https://github.com/orgs/ag-ui-protocol/projects/1). ## Get Involved If you’d like to contribute ideas, feature requests, or bug reports to the roadmap, please see the [Contributing Guide](https://github.com/ag-ui-protocol/ag-ui/blob/main/CONTRIBUTING.md) for details on how to get involved. # What's New Source: https://docs.ag-ui.com/development/updates The latest updates and improvements to AG-UI * Initial release of the Agent User Interaction Protocol # Generative User Interfaces Source: https://docs.ag-ui.com/drafts/generative-ui AI-generated interfaces without custom tool renderers # Generative User Interfaces ## Summary ### Problem Statement Currently, creating custom user interfaces for agent interactions requires programmers to define specific tool renderers. This limits the flexibility and adaptability of agent-driven applications. ### Motivation This draft describes an AG-UI extension that addresses **generative user interfaces**—interfaces produced directly by artificial intelligence without requiring a programmer to define custom tool renderers. The key idea is to leverage our ability to send client-side tools to the agent, thereby enabling this capability across all agent frameworks supported by AG-UI. ## Status * **Status**: Draft * **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) ## Challenges and Limitations ### Tool Description Length OpenAI enforces a limit of 1024 characters for tool descriptions. Gemini and Anthropic impose no such limit. ### Arguments JSON Schema Constraints Classes, nesting, `$ref`, and `oneOf` are not reliably supported across LLM providers. ### Context Window Considerations Injecting a large UI description language into an agent may reduce its performance. Agents dedicated solely to UI generation perform better than agents combining UI generation with other tasks. ## Detailed Specification ### Two-Step Generation Process ```mermaid theme={null} flowchart TD A[Agent needs UI] --> B["Step 1: What?
Agent calls generateUserInterface
(description, data, output)"] B --> C["Step 2: How?
Secondary generator builds actual UI
(JSON Schema, React, etc.)"] C --> D[Rendered UI shown to user] D --> E[Validated user input returned to Agent] ```

### Step 1: What to Generate?

Inject a lightweight tool into the agent:

**Tool Definition:**

* **Name:** `generateUserInterface`
* **Arguments:**
  * **description**: A high-level description of the UI (e.g., *"A form for entering the user's address"*)
  * **data**: Arbitrary pre-populated data for the generated UI
  * **output**: A description or schema of the data the agent expects the user to submit back (fields, required/optional, types, constraints)

**Example Tool Call:**

```json theme={null}
{
  "tool": "generateUserInterface",
  "arguments": {
    "description": "A form that collects a user's shipping address.",
    "data": { "firstName": "Ada", "lastName": "Lovelace", "city": "London" },
    "output": {
      "type": "object",
      "required": ["firstName", "lastName", "street", "city", "postalCode", "country"],
      "properties": {
        "firstName": { "type": "string", "title": "First Name" },
        "lastName": { "type": "string", "title": "Last Name" },
        "street": { "type": "string", "title": "Street Address" },
        "city": { "type": "string", "title": "City" },
        "postalCode": { "type": "string", "title": "Postal Code" },
        "country": { "type": "string", "title": "Country", "enum": ["GB", "US", "DE", "AT"] }
      }
    }
  }
}
```

### Step 2: How to Generate?

Delegate UI generation to a secondary LLM or agent:

* The CopilotKit user stays in control: they can build their own generators, add custom libraries, include additional prompts, and so on
* On tool invocation, the secondary model consumes `description`, `data`, and `output` to generate the user interface
* This model is focused solely on UI generation, ensuring maximum fidelity and consistency
* The generation method can be swapped as needed (e.g., JSON, HTML, or other renderable formats)
* The UI format description is not subject to structural or length constraints, allowing arbitrarily complex specifications

## Implementation Examples

### Example Output: UISchemaGenerator

```json theme={null}
{
  "jsonSchema": {
    "title": "Shipping Address",
    "type": "object",
    "required": ["firstName", "lastName", "street", "city", "postalCode", "country"],
    "properties": {
      "firstName": { "type": "string", "title": "First name" },
      "lastName": { "type": "string", "title": "Last name" },
      "street": { "type": "string", "title": "Street address" },
      "city": { "type": "string", "title": "City" },
      "postalCode": { "type": "string", "title": "Postal code" },
      "country": { "type": "string", "title": "Country", "enum": ["GB", "US", "DE", "AT"] }
    }
  },
  "uiSchema": {
    "type": "VerticalLayout",
    "elements": [
      {
        "type": "Group",
        "label": "Personal Information",
        "elements": [
          { "type": "Control", "scope": "#/properties/firstName" },
          { "type": "Control", "scope": "#/properties/lastName" }
        ]
      },
      {
        "type": "Group",
        "label": "Address",
        "elements": [
          { "type": "Control", "scope": "#/properties/street" },
          { "type": "Control", "scope": "#/properties/city" },
          { "type": "Control", "scope": "#/properties/postalCode" },
          { "type": "Control", "scope": "#/properties/country" }
        ]
      }
    ]
  },
  "initialData": {
    "firstName": "Ada",
    "lastName": "Lovelace",
    "city": "London",
    "country": "GB"
  }
}
```

### Example Output: ReactFormHookGenerator

```tsx theme={null}
import React from "react"
import { useForm } from "react-hook-form"
import { z } from "zod"
import { zodResolver } from "@hookform/resolvers/zod"

// ----- Schema (contract) -----
const AddressSchema = z.object({
  firstName: z.string().min(1, "Required"),
  lastName: z.string().min(1, "Required"),
  street: z.string().min(1, "Required"),
  city: z.string().min(1, "Required"),
  postalCode: z.string().regex(/^[A-Za-z0-9\-\s]{3,10}$/, "3–10 chars"),
  country: z.enum(["GB", "US", "DE", "AT", "FR", "IT", "ES"]),
})

export type Address = z.infer<typeof AddressSchema>

type Props = {
  initialData?: Partial<Address>
  meta?: { title?: string; submitLabel?: string }
  respond: (data: Address) => void // <-- called on successful submit
}

const COUNTRIES: Address["country"][] = [
  "GB", "US", "DE", "AT", "FR", "IT", "ES",
]

export default function AddressForm({ initialData, meta, respond }: Props) {
  const {
    register,
    handleSubmit,
    formState: { errors },
  } = useForm<Address>({
    resolver: zodResolver(AddressSchema),
    defaultValues: {
      firstName: "",
      lastName: "",
      street: "",
      city: "",
      postalCode: "",
      country: "GB",
      ...initialData,
    },
  })

  const onSubmit = (data: Address) => {
    // Guaranteed to match AddressSchema
    respond(data)
  }

  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      {meta?.title && <h2>{meta.title}</h2>}

      {/* Section: Personal Information */}
      <fieldset>
        <legend>Personal Information</legend>
        <input {...register("firstName")} placeholder="First name" />
        {errors.firstName && <span>{errors.firstName.message}</span>}
        <input {...register("lastName")} placeholder="Last name" />
        {errors.lastName && <span>{errors.lastName.message}</span>}
      </fieldset>

      {/* Section: Address */}
      <fieldset>
        <legend>Address</legend>
        <input {...register("street")} placeholder="Street address" />
        {errors.street && <span>{errors.street.message}</span>}
        <input {...register("city")} placeholder="City" />
        {errors.city && <span>{errors.city.message}</span>}
        <input {...register("postalCode")} placeholder="Postal code" />
        {errors.postalCode && <span>{errors.postalCode.message}</span>}
        <select {...register("country")}>
          {COUNTRIES.map((c) => (
            <option key={c} value={c}>
              {c}
            </option>
          ))}
        </select>
        {errors.country && <span>{errors.country.message}</span>}
      </fieldset>

      <button type="submit">{meta?.submitLabel ?? "Submit"}</button>
    </form>
) } ``` ## Implementation Considerations ### Client SDK Changes TypeScript SDK additions: * New `generateUserInterface` tool type * UI generator registry for pluggable generators * Validation layer for generated UI schemas * Response handler for user-submitted data Python SDK additions: * Support for UI generation tool invocation * Schema validation utilities * Serialization for UI definitions ### Integration Impact * All AG-UI integrations can leverage this capability without modification * Frameworks emit standard tool calls; client handles UI generation * Backward compatible with existing tool-based UI approaches ## Use Cases ### Dynamic Forms Agents can generate forms on-the-fly based on conversation context without pre-defined schemas. ### Data Visualization Generate charts, graphs, or tables appropriate to the data being discussed. ### Interactive Workflows Create multi-step wizards or guided processes tailored to user needs. ### Adaptive Interfaces Generate different UI layouts based on user preferences or device capabilities. ## Testing Strategy * Unit tests for tool injection and invocation * Integration tests with multiple UI generators * E2E tests demonstrating various UI types * Performance benchmarks comparing single vs. two-step generation * Cross-provider compatibility testing ## References * [AG-UI Tools Documentation](/concepts/tools) * [JSON Schema](https://json-schema.org/) * [React Hook Form](https://react-hook-form.com/) * [JSON Forms](https://jsonforms.io/) # Interrupt-Aware Run Lifecycle Source: https://docs.ag-ui.com/drafts/interrupts Native support for human-in-the-loop pauses and interrupts # Interrupt-Aware Run Lifecycle Proposal ## Summary ### Problem Statement Agents often need to pause execution to request human approval, gather additional input, or confirm potentially risky actions. Currently, there's no standardized way to handle these interruptions across different agent frameworks. ### Motivation Support **human-in-the-loop pauses** (and related mechanisms) natively in AG-UI and CopilotKit. This enables compatibility with various framework interrupts, workflow suspend/resume, and other framework-specific pause mechanisms. ## Status * **Status**: Draft * **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) ## Overview This proposal introduces a standardized interrupt/resume pattern: ```mermaid theme={null} sequenceDiagram participant Agent participant Client as Client App Agent-->>Client: RUN_FINISHED { outcome: "interrupt", interrupt:{ id, reason, payload }} Client-->>Agent: RunAgentInput.resume { threadId, interruptId, payload } Agent-->>Client: RUN_FINISHED { outcome: "success", result } ``` ## Detailed Specification ### Updates to RUN\_FINISHED Event ```typescript theme={null} type RunFinishedOutcome = "success" | "interrupt" type RunFinished = { type: "RUN_FINISHED" // ... existing fields outcome?: RunFinishedOutcome // optional for back-compat (see rules below) // Present when outcome === "success" (or when outcome omitted and interrupt is absent) result?: any // Present when outcome === "interrupt" (or when outcome omitted and interrupt is present) interrupt?: { id?: string // id can be set when needed reason?: string // e.g. "human_approval" | "upload_required" | "policy_hold" payload?: any // arbitrary JSON for UI (forms, proposals, diffs, etc.) } } ``` When a run finishes with `outcome == "interrupt"`, the agent indicates that on the next run, a value needs to be provided to continue. 
### Updates to RunAgentInput

```typescript theme={null}
type RunAgentInput = {
  // ... existing fields

  // NEW: resume channel for continuing a suspension
  resume?: {
    interruptId?: string // echo back if one was provided
    payload?: any // arbitrary JSON: approvals, edits, files-as-refs, etc.
  }
}
```

### Contract Rules

* Resume requests **must** use the same `threadId`
* If the interrupt included an `id`, the resume request must echo it back as `resume.interruptId`
* Agents should handle missing or invalid resume payloads gracefully

## Implementation Examples

### Minimal Interrupt/Resume

**Agent sends interrupt:**

```json theme={null}
{
  "type": "RUN_FINISHED",
  "threadId": "t1",
  "runId": "r1",
  "outcome": "interrupt",
  "interrupt": {
    "id": "int-abc123",
    "reason": "human_approval",
    "payload": {
      "proposal": {
        "tool": "sendEmail",
        "args": { "to": "a@b.com", "subject": "Hi", "body": "…" }
      }
    }
  }
}
```

**User responds:**

```json theme={null}
{
  "threadId": "t1",
  "runId": "r2",
  "resume": {
    "interruptId": "int-abc123",
    "payload": { "approved": true }
  }
}
```

### Complex Approval Flow

**Agent requests approval with context:**

```json theme={null}
{
  "type": "RUN_FINISHED",
  "threadId": "thread-456",
  "runId": "run-789",
  "outcome": "interrupt",
  "interrupt": {
    "id": "approval-001",
    "reason": "database_modification",
    "payload": {
      "action": "DELETE",
      "table": "users",
      "affectedRows": 42,
      "query": "DELETE FROM users WHERE last_login < '2023-01-01'",
      "rollbackPlan": "Restore from backup snapshot-2025-01-23",
      "riskLevel": "high"
    }
  }
}
```

**User approves with modifications:**

```json theme={null}
{
  "threadId": "thread-456",
  "runId": "run-790",
  "resume": {
    "interruptId": "approval-001",
    "payload": {
      "approved": true,
      "modifications": { "batchSize": 10, "dryRun": true }
    }
  }
}
```

## Use Cases

### Human Approval

Agents pause before executing sensitive operations (sending emails, making purchases, deleting data).

### Information Gathering

Agent requests additional context or files from the user mid-execution.

### Policy Enforcement

Automatic pauses triggered by organizational policies or compliance requirements.

### Multi-Step Wizards

Complex workflows where each step requires user confirmation or input.

### Error Recovery

Agent pauses when encountering an error, allowing the user to provide guidance.
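To show how a client might drive this proposed lifecycle end to end, here is a minimal sketch of a resume loop. Everything below follows the draft fields above; `runAgent` (one protocol round trip) and `askUser` (renders the interrupt payload and collects a decision) are hypothetical helpers, not shipped API.

```typescript theme={null}
type InterruptInfo = { id?: string; reason?: string; payload?: any }
type RunResult =
  | { outcome: "success"; result?: any }
  | { outcome: "interrupt"; interrupt: InterruptInfo }

// Keep re-running the same thread until the agent finishes without interrupting
async function runWithInterrupts(
  threadId: string,
  runAgent: (input: any) => Promise<RunResult>, // hypothetical transport
  askUser: (interrupt: InterruptInfo) => Promise<any> // hypothetical UI prompt
): Promise<RunResult> {
  let result = await runAgent({ threadId, runId: crypto.randomUUID() })
  while (result.outcome === "interrupt") {
    const payload = await askUser(result.interrupt)
    result = await runAgent({
      threadId, // resume must reuse the same threadId
      runId: crypto.randomUUID(), // each resume is a new run
      resume: { interruptId: result.interrupt.id, payload },
    })
  }
  return result
}
```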
## Implementation Considerations ### Client SDK Changes TypeScript SDK: * Extended `RunFinishedEvent` type with outcome and interrupt fields * Updated `RunAgentInput` with resume field * Helper methods for interrupt handling Python SDK: * Extended `RunFinishedEvent` class * Updated `RunAgentInput` with resume support * Interrupt state management utilities ### Framework Integration **Planning Frameworks:** * Map framework interrupts to AG-UI interrupt events * Handle resume payloads in execution continuation **Workflow Systems:** * Convert workflow suspensions to AG-UI interrupts * Resume workflow execution with provided payload **Custom Frameworks:** * Provide interrupt/resume adapter interface * Documentation for integration patterns ### UI Considerations * Standard components for common interrupt reasons * Customizable interrupt UI based on payload * Clear indication of pending interrupts * History of interrupt/resume actions ## Testing Strategy * Unit tests for interrupt/resume serialization * Integration tests with multiple frameworks * E2E tests demonstrating various interrupt scenarios * State consistency tests across interrupt boundaries * Performance tests for rapid interrupt/resume cycles ## References * [AG-UI Events Documentation](/concepts/events) * [AG-UI State Management](/concepts/state) # Meta Events Source: https://docs.ag-ui.com/drafts/meta-events Annotations and signals independent of agent runs # Meta Events Proposal ## Summary ### Problem Statement Currently, AG-UI events are tightly coupled to agent runs. There's no standardized way to attach user feedback, annotations, or external signals to the event stream that are independent of the agent's execution lifecycle. ### Motivation AG-UI is extended with **MetaEvents**, a new class of events that can occur at any point in the event stream, independent of agent runs. MetaEvents provide a way to attach annotations, signals, or feedback to a serialized stream. They may originate from users, clients, or external systems rather than from agents. Examples include reactions such as thumbs up/down on a message. ## Status * **Status**: Draft * **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) ## Detailed Specification ### Overview This proposal introduces: * A new **MetaEvent** type for side-band annotations * Events that can appear anywhere in the stream * Support for user feedback, tags, and external annotations * Extensible payload structure for application-specific data ## New Type: MetaEvent ```typescript theme={null} type MetaEvent = BaseEvent & { type: EventType.META /** * Application-defined type of the meta event. * Examples: "thumbs_up", "thumbs_down", "tag", "note" */ metaType: string /** * Application-defined payload. * May reference other entities (e.g., messageId) or contain freeform data. 
   */
  payload: Record<string, any>
}
```

### Key Characteristics

* **Run-independent**: MetaEvents are not tied to any specific run lifecycle
* **Position-flexible**: Can appear before, between, or after runs
* **Origin-diverse**: May come from users, clients, or external systems
* **Extensible**: Applications define their own metaType values and payload schemas

## Implementation Examples

### User Feedback

**Thumbs Up:**

```json theme={null}
{
  "id": "evt_123",
  "ts": 1714063982000,
  "type": "META",
  "metaType": "thumbs_up",
  "payload": { "messageId": "msg_456", "userId": "user_789" }
}
```

**Thumbs Down with Reason:**

```json theme={null}
{
  "id": "evt_124",
  "ts": 1714063985000,
  "type": "META",
  "metaType": "thumbs_down",
  "payload": {
    "messageId": "msg_456",
    "userId": "user_789",
    "reason": "inaccurate",
    "comment": "The calculation seems incorrect"
  }
}
```

### Annotations

**User Note:**

```json theme={null}
{
  "id": "evt_789",
  "ts": 1714064001000,
  "type": "META",
  "metaType": "note",
  "payload": {
    "text": "Important question to revisit",
    "relatedRunId": "run_001",
    "author": "user_123"
  }
}
```

**Tag Assignment:**

```json theme={null}
{
  "id": "evt_890",
  "ts": 1714064100000,
  "type": "META",
  "metaType": "tag",
  "payload": { "tags": ["important", "follow-up"], "threadId": "thread_001" }
}
```

### External System Events

**Analytics Event:**

```json theme={null}
{
  "id": "evt_901",
  "ts": 1714064200000,
  "type": "META",
  "metaType": "analytics",
  "payload": {
    "event": "conversation_shared",
    "properties": { "shareMethod": "link", "recipientCount": 3 }
  }
}
```

**Moderation Flag:**

```json theme={null}
{
  "id": "evt_902",
  "ts": 1714064300000,
  "type": "META",
  "metaType": "moderation",
  "payload": {
    "action": "flag",
    "messageId": "msg_999",
    "category": "inappropriate_content",
    "confidence": 0.95
  }
}
```

## Common Meta Event Types

While applications can define their own types, these are commonly used:

| MetaType | Description | Typical Payload |
| --- | --- | --- |
| `thumbs_up` | Positive feedback | `{ messageId, userId }` |
| `thumbs_down` | Negative feedback | `{ messageId, userId, reason? }` |
| `note` | User annotation | `{ text, relatedId?, author }` |
| `tag` | Categorization | `{ tags[], targetId }` |
| `bookmark` | Save for later | `{ messageId, userId }` |
| `copy` | Content copied | `{ messageId, content }` |
| `share` | Content shared | `{ messageId, method }` |
| `rating` | Numeric rating | `{ messageId, rating, maxRating }` |

## Use Cases

### User Feedback Collection

Capture user reactions to agent responses for quality improvement.

### Conversation Annotation

Allow users to add notes, tags, or bookmarks to important parts of conversations.

### Analytics and Tracking

Record user interactions and behaviors without affecting agent execution.

### Content Moderation

Flag or mark content for review by external moderation systems.

### Collaborative Features

Enable multiple users to annotate or comment on shared conversations.

### Audit Trail

Create a complete record of all interactions, not just agent responses.
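Because MetaEvents are application-defined, clients will typically wrap the common types in small builders. Below is a minimal sketch of such a helper, written against the draft shape above; nothing here is shipped `@ag-ui/core` API, and the commented `emit` call stands in for whatever transport your application uses.

```typescript theme={null}
// Sketch only: mirrors the draft MetaEvent shape, not a shipped API
type MetaEvent = {
  id: string
  ts: number
  type: "META"
  metaType: string
  payload: Record<string, any>
}

// Generic builder for any application-defined meta event
function metaEvent(metaType: string, payload: Record<string, any>): MetaEvent {
  return { id: crypto.randomUUID(), ts: Date.now(), type: "META", metaType, payload }
}

// Convenience wrappers for the common types listed above
const thumbsUp = (messageId: string, userId: string) =>
  metaEvent("thumbs_up", { messageId, userId })
const thumbsDown = (messageId: string, userId: string, reason?: string) =>
  metaEvent("thumbs_down", { messageId, userId, reason })

// Usage: hand the event to your transport, e.g. a queue or HTTP endpoint
// emit(thumbsUp("msg_456", "user_789"))
```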
## Implementation Considerations

### Client SDK Changes

TypeScript SDK:

* New `MetaEvent` type in `@ag-ui/core`
* Helper functions for common meta event types
* MetaEvent filtering and querying utilities

Python SDK:

* `MetaEvent` class implementation
* Meta event builders for common types
* Event stream filtering capabilities

## Testing Strategy

* Unit tests for MetaEvent creation and validation
* Integration tests with mixed event streams
* Performance tests with high-volume meta events
* Security tests for payload validation

## References

* [AG-UI Events Documentation](/concepts/events)
* [Event Sourcing](https://martinfowler.com/eaaDev/EventSourcing.html)
* [CQRS Pattern](https://martinfowler.com/bliki/CQRS.html)

# Overview

Source: https://docs.ag-ui.com/drafts/overview

Draft changes being considered for the AG-UI protocol

# Overview

This section contains draft changes being considered for the AG-UI protocol. These proposals are under internal review and may be modified or withdrawn before implementation.

## Current Drafts

* Support for LLM reasoning visibility and continuity with encrypted content
* Native support for agent pauses requiring human approval or input
* AI-generated interfaces without requiring custom tool renderers
* Annotations and signals independent of agent runs

## Status Definitions

* **Draft** - Initial proposal under consideration
* **Under Review** - Active development and testing
* **Accepted** - Approved for implementation
* **Implemented** - Merged into the main protocol specification
* **Withdrawn** - Proposal has been withdrawn or superseded

# AG-UI Overview

Source: https://docs.ag-ui.com/introduction

# The Agent–User Interaction (AG-UI) Protocol

AG-UI is an open, lightweight, event-based protocol that standardizes how AI agents connect to user-facing applications. AG-UI is designed to be the general-purpose, bi-directional connection between a user-facing application and any agentic backend. Built for simplicity and flexibility, it standardizes how agent state, UI intents, and user interactions flow between your model/agent runtime and user-facing frontend applications, so application developers can ship reliable, debuggable, user-friendly agentic features fast while focusing on application needs instead of complex ad-hoc wiring.
AG-UI Overview
***

## Agentic Protocols

Confused about "A2UI" and "AG-UI"? That's understandable! Despite the naming similarities, they are quite different and work well together. A2UI is a [generative UI specification](./concepts/generative-ui-specs), allowing agents to deliver UI widgets, whereas AG-UI is the Agent↔User Interaction protocol, which connects an agentic frontend to any agentic backend. [Learn more](https://copilotkit.ai/ag-ui-and-a2ui)

AG-UI is one of three prominent open [agentic protocols](./agentic-protocols).

| **Layer** | **Protocol / Example** | **Purpose** |
| --- | --- | --- |
| **Agent ↔ User Interaction** | **AG-UI (Agent–User Interaction Protocol)** | The open, event-based standard that connects agents to user-facing applications, enabling real-time, multimodal, interactive experiences. |
| **Agent ↔ Tools & Data** | **MCP (Model Context Protocol)** | Open standard (originated by Anthropic) that lets agents securely connect to external systems: tools, workflows, and data sources. |
| **Agent ↔ Agent** | **A2A (Agent to Agent)** | Open standard (originated by Google) which defines how agents coordinate and share work across distributed agentic systems. |

***

## Building blocks (today & upcoming)
* **Streaming chat**: Live token and event streaming for responsive multi-turn sessions, with cancel and resume.
* **Multimodality**: Typed attachments and real-time media (files, images, audio, transcripts); supports voice, previews, annotations, provenance.
* **Generative UI, static**: Render model output as stable, typed components under app control.
* **Generative UI, declarative**: A small declarative language for constrained yet open-ended agent UIs; agents propose trees and constraints, the app validates and mounts.
* **Shared state** (read-only & read-write): Typed store shared between agent and app, with streamed event-sourced diffs and conflict resolution for snappy collaboration.
* **Thinking steps**: Visualize intermediate reasoning from traces and tool events; no raw chain of thought.
* **Frontend tool calls**: Typed handoffs from agent to frontend-executed actions, and back.
* **Backend tool rendering**: Visualize backend tool outputs in app and chat; emit side effects as first-class events.
* **Interrupts (human in the loop)**: Pause, approve, edit, retry, or escalate mid-flow without losing state.
* **Sub-agents and composition**: Nested delegation with scoped state, tracing, and cancellation.
* **Agent steering**: Dynamically redirect agent execution with real-time user input to guide behavior and outcomes.
* **Tool output streaming**: Stream tool results and logs so UIs can render long-running effects in real time.
* **Custom events**: Open-ended data exchange for needs not covered by the protocol.
***

## Why Agentic Apps need AG-UI

Agentic applications break the simple request/response model that dominated frontend-backend development in the pre-agentic era: a client makes a request, the server returns data, the client renders it, and the interaction ends.

#### The requirements of user‑facing agents

While agents are just software, they exhibit characteristics that make them challenging to serve behind traditional REST/GraphQL APIs:

* Agents are **long‑running** and **stream** intermediate work—often across multi‑turn sessions.
* Agents are **nondeterministic** and can **control application UI nondeterministically**.
* Agents simultaneously mix **structured + unstructured IO** (e.g. text & voice, alongside tool calls and state updates).
* Agents need user-interactive **composition**: e.g. they may call sub‑agents, often recursively.
* And more...

AG-UI is an event-based protocol that enables dynamic communication between agentic frontends and backends. It builds on top of the foundational protocols of the web (HTTP, WebSockets) as an abstraction layer designed for the agentic age—bridging the gap between traditional client-server architectures and the dynamic, stateful nature of AI agents.

***

## AG-UI in Action
You can see demo apps of the AG-UI features, with preview, code, and walkthrough docs for the framework of your choice, in the [AG-UI Dojo](https://dojo.ag-ui.com/)

***

## Supported Integrations

AG-UI was born from CopilotKit's initial **partnership** with LangGraph and CrewAI, and brings the incredibly popular agent-user-interactivity infrastructure to the wider agentic ecosystem.

**1st party** = the platforms that have AG‑UI built in and provide documentation for guidance.

### Direct to LLM

| Framework | Status | AG-UI Resources |
| :--- | --- | --- |
| Direct to LLM | Supported | [Docs](https://docs.copilotkit.ai/direct-to-llm) |

### Agent Framework - Partnerships

| Framework | Status | AG-UI Resources |
| :--- | --- | --- |
| [LangGraph](https://www.langchain.com/langgraph) | Supported | [Docs](https://docs.copilotkit.ai/langgraph/), [Demos](https://dojo.ag-ui.com/langgraph-fastapi/feature/shared_state) |
| [CrewAI](https://crewai.com/) | Supported | [Docs](https://docs.copilotkit.ai/crewai-flows), [Demos](https://dojo.ag-ui.com/crewai/feature/shared_state) |

### Agent Framework - 1st Party

| Framework | Status | AG-UI Resources |
| :--- | --- | --- |
| [Microsoft Agent Framework](https://azure.microsoft.com/en-us/blog/introducing-microsoft-agent-framework/) | Supported | [Docs](https://docs.copilotkit.ai/microsoft-agent-framework), [Demos](https://dojo.ag-ui.com/microsoft-agent-framework-dotnet/feature/shared_state) |
| [Google ADK](https://google.github.io/adk-docs/get-started/) | Supported | [Docs](https://docs.copilotkit.ai/adk), [Demos](https://dojo.ag-ui.com/adk-middleware/feature/shared_state?openCopilot=true) |
| [AWS Strands Agents](https://github.com/strands-agents/sdk-python) | Supported | [Docs](https://docs.copilotkit.ai/aws-strands), [Demos](https://dojo.ag-ui.com/aws-strands/feature/shared_state) |
| [AWS Bedrock AgentCore](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-agui-protocol-contract.html) | Supported | [Docs](https://github.com/awslabs/fullstack-solution-template-for-agentcore) |
| [Mastra](https://mastra.ai/) | Supported | [Docs](https://docs.copilotkit.ai/mastra/), [Demos](https://dojo.ag-ui.com/mastra/feature/tool_based_generative_ui) |
| [Pydantic AI](https://github.com/pydantic/pydantic-ai) | Supported | [Docs](https://docs.copilotkit.ai/pydantic-ai/), [Demos](https://dojo.ag-ui.com/pydantic-ai/feature/shared_state) |
| [Agno](https://github.com/agno-agi/agno) | Supported | [Docs](https://docs.copilotkit.ai/agno/), [Demos](https://dojo.ag-ui.com/agno/feature/tool_based_generative_ui) |
| [LlamaIndex](https://github.com/run-llama/llama_index) | Supported | [Docs](https://docs.copilotkit.ai/llamaindex/), [Demos](https://dojo.ag-ui.com/llamaindex/feature/shared_state) |
| [AG2](https://ag2.ai/) | Supported | [Docs](https://docs.copilotkit.ai/ag2/), [Demos](https://dojo.ag-ui.com/ag2/feature/shared_state) |
| [AWS Bedrock Agents](https://aws.amazon.com/bedrock/agents/) | In Progress | – |

### Agent Framework - Community

| Framework | Status | AG-UI Resources |
| :--- | --- | --- |
| [OpenAI Agent SDK](https://openai.github.io/openai-agents-python/) | In Progress | – |
| [Cloudflare Agents](https://developers.cloudflare.com/agents/) | In Progress | – |

### Agent Interaction Protocols

| Protocol | Status | AG-UI Resources | Integrations |
| :--- | --- | --- | --- |
| [A2A Middleware](https://a2a-protocol.org/) | Supported | [Docs](https://docs.copilotkit.ai/a2a-protocol) | Partnership |

### Infrastructure / Deployment

| Platform | Status | AG-UI Resources | Integrations |
| :--- | --- | --- | --- |
| [Amazon Bedrock AgentCore](https://aws.amazon.com/bedrock/agentcore/) | Supported | [Docs](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-agui.html) | 1st Party |

### Specification (standard)

| Framework | Status | AG-UI Resources |
| :--- | --- | --- |
| [Oracle Agent Spec](http://oracle.github.io/agent-spec/) | Supported | [Docs](https://go.copilotkit.ai/copilotkit-oracle-docs), [Demos](https://dojo.ag-ui.com/agent-spec-langgraph/feature/tool_based_generative_ui) |

### SDKs

| SDK | Status | AG-UI Resources | Integrations |
| :--- | --- | --- | --- |
| [Kotlin]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/kotlin/overview.mdx) | Community |
| [Golang]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/go/overview.mdx) | Community |
| [Dart]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/dart) | Community |
| [Java]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/java/overview.mdx) | Community |
| [Rust]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/rust/crates/ag-ui-client) | Community |
| [.NET]() | In Progress | [PR](https://github.com/ag-ui-protocol/ag-ui/pull/38) | Community |
| [Nim]() | In Progress | [PR](https://github.com/ag-ui-protocol/ag-ui/pull/29) | Community |
| [Flowise]() | In Progress | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/367) | Community |
| [Langflow]() | In Progress | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/366) | Community |

### Clients

| Client | Status | AG-UI Resources | Integrations |
| :--- | --- | --- | --- |
| [CopilotKit](https://github.com/CopilotKit/CopilotKit) | Supported | [Getting Started](https://docs.copilotkit.ai/direct-to-llm/guides/quickstart) | 1st Party |
| [Terminal + Agent]() | Supported | [Getting Started](https://docs.ag-ui.com/quickstart/clients) | Community |
| [React Native](https://reactnative.dev/) | Help Wanted | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/510) | Community |
***

## Quick Start

Choose the path that fits your needs:

Build agentic applications powered by AG-UI compatible agents.

Build integrations for new agent frameworks, custom in-house solutions, or use AG-UI without any agent framework.

Build new clients for AG-UI-compatible agents (web, mobile, slack, messaging, etc.)

## Explore AG-UI

Dive deeper into AG-UI's core concepts and capabilities:

Understand how AG-UI connects agents, protocols, and front-ends

Learn about AG-UI's event-driven protocol

## Resources

Explore guides, tools, and integrations to help you build, optimize, and extend your AG-UI implementation. These resources cover everything from practical development workflows to debugging techniques.

Use Cursor to build AG-UI implementations faster

Fix common issues when working with AG-UI servers and clients

## Contributing

Want to contribute? Check out our [Contributing Guide](/development/contributing) to learn how you can help improve AG-UI.

## Support and Feedback

Here's how to get help or provide feedback:

* For bug reports and feature requests related to the AG-UI specification, SDKs, or documentation (open source), please [create a GitHub issue](https://github.com/ag-ui-protocol/ag-ui/issues)
* For discussions or Q\&A about AG-UI, please join the [Discord community](https://discord.gg/Jd3FzfdJa8)

# Build applications

Source: https://docs.ag-ui.com/quickstart/applications

Build agentic applications utilizing compatible AG-UI event streams

# Introduction

AG-UI provides a concise, event-driven protocol that lets any agent stream rich, structured output to any client. It can be used to connect any agentic system to any client. A client is defined as any system that can receive, display, and respond to AG-UI events. For more information on existing clients and integrations, see the [integrations](/integrations) page.

# Automatic Setup

AG-UI provides a CLI tool to automatically create or scaffold a new application with any client and server.

```sh theme={null}
npx create-ag-ui-app@latest
```

Once the setup is done, start the server with

```sh theme={null}
npm run dev
```

For the copilotkit example you can head to [http://localhost:3000/copilotkit](http://localhost:3000/copilotkit) to see the app in action.

# Build clients

Source: https://docs.ag-ui.com/quickstart/clients

Showcase: build a conversational CLI agent from scratch using AG-UI and Mastra

# Introduction

A client implementation allows you to **build conversational applications that leverage AG-UI's event-driven protocol**. This approach creates a direct interface between your users and AI agents, demonstrating direct access to the AG-UI protocol.

## When to use a client implementation

Building your own client is useful if you want to explore or hack on the AG-UI protocol. For production use, use a full-featured client like [CopilotKit](https://copilotkit.ai).

## What you'll build

In this guide, we'll create a CLI client that:

1. Uses the `MastraAgent` from `@ag-ui/mastra`
2. Connects to OpenAI's GPT-4o model
3. Implements a weather tool for real-world functionality
4. Provides an interactive chat interface in the terminal

Let's get started!

## Prerequisites

Before we begin, make sure you have:

* [Node.js](https://nodejs.org/) **22.13.0 or later**
* An **OpenAI API key**
* [pnpm](https://pnpm.io/) package manager

### 1. Provide your OpenAI API key

First, let's set up your API key:

```bash theme={null}
# Set your OpenAI API key
export OPENAI_API_KEY=your-api-key-here
```

### 2. Install pnpm
If you don't have pnpm installed:

```bash theme={null}
# Install pnpm
npm install -g pnpm
```

## Step 1 – Initialize your project

Create a new directory for your AG-UI client:

```bash theme={null}
mkdir my-ag-ui-client
cd my-ag-ui-client
```

Initialize a new Node.js project:

```bash theme={null}
pnpm init
```

### Set up TypeScript and basic configuration

Install TypeScript and essential development dependencies:

```bash theme={null}
pnpm add -D typescript @types/node tsx
```

Create a `tsconfig.json` file:

```json theme={null}
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "lib": ["ES2022"],
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}
```

Update your `package.json` scripts:

```json theme={null}
{
  "scripts": {
    "start": "tsx src/index.ts",
    "dev": "tsx --watch src/index.ts",
    "build": "tsc",
    "clean": "rm -rf dist"
  }
}
```

## Step 2 – Install AG-UI and dependencies

Install the core AG-UI packages and dependencies:

```bash theme={null}
# Core AG-UI packages
pnpm add @ag-ui/client @ag-ui/core @ag-ui/mastra

# Mastra ecosystem packages
pnpm add @mastra/core @mastra/client-js @mastra/memory @mastra/libsql

# Mastra peer dependencies
pnpm add zod
```

## Step 3 – Create your agent

Let's create a basic conversational agent. Create `src/agent.ts`:

```typescript theme={null}
import { Agent } from "@mastra/core/agent"
import { MastraAgent } from "@ag-ui/mastra"
import { Memory } from "@mastra/memory"
import { LibSQLStore } from "@mastra/libsql"

export const agent = new MastraAgent({
  resourceId: "cliExample",
  agent: new Agent({
    id: "ag-ui-assistant",
    name: "AG-UI Assistant",
    instructions: `
      You are a helpful AI assistant.
      Be friendly, conversational, and helpful. Answer questions to the
      best of your ability and engage in natural conversation.
    `,
    model: "openai/gpt-4o",
    memory: new Memory({
      storage: new LibSQLStore({
        id: "storage-memory",
        url: "file:./assistant.db",
      }),
    }),
  }),
  threadId: "main-conversation",
})
```

### What's happening in the agent?

1. **MastraAgent** – We wrap a Mastra Agent with the AG-UI protocol adapter
2. **Model Configuration** – We use OpenAI's GPT-4o for high-quality responses
3. **Memory Setup** – We configure persistent memory using LibSQL for conversation context
4. **Instructions** – We give the agent basic guidelines for helpful conversation

## Step 4 – Create the CLI interface

Now let's create the interactive chat interface. Create `src/index.ts`:

```typescript theme={null}
import * as readline from "readline"
import { agent } from "./agent"
import { randomUUID } from "@ag-ui/client"

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
})

async function chatLoop() {
  console.log("🤖 AG-UI Assistant started!")
  console.log("Type your messages and press Enter. Press Ctrl+D to quit.\n")
  return new Promise<void>((resolve) => {
    const promptUser = () => {
      rl.question("> ", async (input) => {
        if (input.trim() === "") {
          promptUser()
          return
        }

        console.log("")

        // Pause input while processing
        rl.pause()

        // Add user message to conversation
        agent.messages.push({
          id: randomUUID(),
          role: "user",
          content: input.trim(),
        })

        try {
          // Run the agent with event handlers
          await agent.runAgent(
            {}, // No additional configuration needed
            {
              onTextMessageStartEvent() {
                process.stdout.write("🤖 Assistant: ")
              },
              onTextMessageContentEvent({ event }) {
                process.stdout.write(event.delta)
              },
              onTextMessageEndEvent() {
                console.log("\n")
              },
            }
          )
        } catch (error) {
          console.error("❌ Error:", error)
        }

        // Resume input
        rl.resume()
        promptUser()
      })
    }

    // Handle Ctrl+D to quit
    rl.on("close", () => {
      console.log("\n👋 Thanks for using AG-UI Assistant!")
      resolve()
    })

    promptUser()
  })
}

async function main() {
  await chatLoop()
}

main().catch(console.error)
```

### What's happening in the CLI interface?

1. **Readline Interface** – We create an interactive prompt for user input
2. **Message Management** – We add each user input to the agent's conversation history
3. **Event Handling** – We listen to AG-UI events to provide real-time feedback
4. **Streaming Display** – We show the agent's response as it's being generated

## Step 5 – Test your assistant

Let's run your new AG-UI client:

```bash theme={null}
pnpm dev
```

You should see:

```
🤖 AG-UI Assistant started!
Type your messages and press Enter. Press Ctrl+D to quit.

>
```

Try asking questions like:

* "Hello! How are you?"
* "What can you help me with?"
* "Tell me a joke"
* "Explain quantum computing in simple terms"

You'll see the agent respond with streaming text in real-time!

## Step 6 – Understanding the AG-UI event flow

Let's break down what happens when you send a message:

1. **User Input** – You type a question and press Enter
2. **Message Added** – Your input is added to the conversation history
3. **Agent Processing** – The agent analyzes your request and formulates a response
4. **Response Generation** – The agent streams its response back
5. **Streaming Output** – You see the response appear word by word

### Event types you're handling:

* `onTextMessageStartEvent` – Agent starts responding
* `onTextMessageContentEvent` – Each chunk of the response
* `onTextMessageEndEvent` – Response is complete

## Step 7 – Add tool functionality

Now that you have a working chat interface, let's add some real-world capabilities by creating tools. We'll start with a weather tool.

### Create your first tool

Let's create a weather tool that your agent can use.
Create the directory structure:

```bash theme={null}
mkdir -p src/tools
```

Create `src/tools/weather.tool.ts`:

```typescript theme={null}
import { createTool } from "@mastra/core/tools"
import { z } from "zod"

interface GeocodingResponse {
  results: {
    latitude: number
    longitude: number
    name: string
  }[]
}

interface WeatherResponse {
  current: {
    time: string
    temperature_2m: number
    apparent_temperature: number
    relative_humidity_2m: number
    wind_speed_10m: number
    wind_gusts_10m: number
    weather_code: number
  }
}

export const weatherTool = createTool({
  id: "get-weather",
  description: "Get current weather for a location",
  inputSchema: z.object({
    location: z.string().describe("City name"),
  }),
  outputSchema: z.object({
    temperature: z.number(),
    feelsLike: z.number(),
    humidity: z.number(),
    windSpeed: z.number(),
    windGust: z.number(),
    conditions: z.string(),
    location: z.string(),
  }),
  execute: async (inputData) => {
    return await getWeather(inputData.location)
  },
})

const getWeather = async (location: string) => {
  const geocodingUrl = `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(
    location
  )}&count=1`
  const geocodingResponse = await fetch(geocodingUrl)
  const geocodingData = (await geocodingResponse.json()) as GeocodingResponse

  if (!geocodingData.results?.[0]) {
    throw new Error(`Location '${location}' not found`)
  }

  const { latitude, longitude, name } = geocodingData.results[0]

  const weatherUrl = `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m,apparent_temperature,relative_humidity_2m,wind_speed_10m,wind_gusts_10m,weather_code`

  const response = await fetch(weatherUrl)
  const data = (await response.json()) as WeatherResponse

  return {
    temperature: data.current.temperature_2m,
    feelsLike: data.current.apparent_temperature,
    humidity: data.current.relative_humidity_2m,
    windSpeed: data.current.wind_speed_10m,
    windGust: data.current.wind_gusts_10m,
    conditions: getWeatherCondition(data.current.weather_code),
    location: name,
  }
}

function getWeatherCondition(code: number): string {
  const conditions: Record<number, string> = {
    0: "Clear sky",
    1: "Mainly clear",
    2: "Partly cloudy",
    3: "Overcast",
    45: "Foggy",
    48: "Depositing rime fog",
    51: "Light drizzle",
    53: "Moderate drizzle",
    55: "Dense drizzle",
    56: "Light freezing drizzle",
    57: "Dense freezing drizzle",
    61: "Slight rain",
    63: "Moderate rain",
    65: "Heavy rain",
    66: "Light freezing rain",
    67: "Heavy freezing rain",
    71: "Slight snow fall",
    73: "Moderate snow fall",
    75: "Heavy snow fall",
    77: "Snow grains",
    80: "Slight rain showers",
    81: "Moderate rain showers",
    82: "Violent rain showers",
    85: "Slight snow showers",
    86: "Heavy snow showers",
    95: "Thunderstorm",
    96: "Thunderstorm with slight hail",
    99: "Thunderstorm with heavy hail",
  }
  return conditions[code] || "Unknown"
}
```

### What's happening in the weather tool?

1. **Tool Definition** – We use `createTool` from Mastra to define the tool's interface
2. **Input Schema** – We specify that the tool accepts a location string
3. **Output Schema** – We define the structure of the weather data returned
4. **API Integration** – We fetch data from Open-Meteo's free weather API
5. **Data Processing** – We convert weather codes to human-readable conditions

### Update your agent

Now let's update our agent to use the weather tool. Update `src/agent.ts`:

```typescript theme={null}
import { weatherTool } from "./tools/weather.tool" // <--- Import the tool

export const agent = new MastraAgent({
  agent: new Agent({
    // ...
    tools: { weatherTool }, // <--- Add the tool to the agent
    // ...
  }),
  threadId: "main-conversation",
})
```

### Update your CLI to handle tools

Update your CLI interface in `src/index.ts` to handle tool events:

```typescript theme={null}
// Add these new event handlers to your agent.runAgent call:
await agent.runAgent(
  {}, // No additional configuration needed
  {
    // ... existing event handlers ...
    onToolCallStartEvent({ event }) {
      console.log("🔧 Tool call:", event.toolCallName)
    },
    onToolCallArgsEvent({ event }) {
      process.stdout.write(event.delta)
    },
    onToolCallEndEvent() {
      console.log("")
    },
    onToolCallResultEvent({ event }) {
      if (event.content) {
        console.log("🔍 Tool call result:", event.content)
      }
    },
  }
)
```

### Test your weather tool

Now restart your application and try asking about weather:

```bash theme={null}
pnpm dev
```

Try questions like:

* "What's the weather like in London?"
* "How's the weather in Tokyo today?"
* "Is it raining in Seattle?"

You'll see the agent use the weather tool to fetch real data and provide detailed responses!

## Step 8 – Add more functionality

### Create a browser tool

Let's add a web browsing capability. First install the `open` package:

```bash theme={null}
pnpm add open
```

Create `src/tools/browser.tool.ts`:

```typescript theme={null}
import { createTool } from "@mastra/core/tools"
import { z } from "zod"
import open from "open"

export const browserTool = createTool({
  id: "open-browser",
  description: "Open a URL in the default web browser",
  inputSchema: z.object({
    url: z.url().describe("The URL to open"),
  }),
  outputSchema: z.object({
    success: z.boolean(),
    message: z.string(),
  }),
  execute: async (inputData) => {
    try {
      await open(inputData.url)
      return {
        success: true,
        message: `Opened ${inputData.url} in your default browser`,
      }
    } catch (error) {
      return {
        success: false,
        message: `Failed to open browser: ${error}`,
      }
    }
  },
})
```

### Update your agent with both tools

Update `src/agent.ts` to include both tools:

```typescript theme={null}
import { Agent } from "@mastra/core/agent"
import { MastraAgent } from "@ag-ui/mastra"
import { Memory } from "@mastra/memory"
import { LibSQLStore } from "@mastra/libsql"
import { weatherTool } from "./tools/weather.tool"
import { browserTool } from "./tools/browser.tool"

export const agent = new MastraAgent({
  resourceId: "cliExample",
  agent: new Agent({
    id: "ag-ui-assistant",
    name: "AG-UI Assistant",
    instructions: `
      You are a helpful assistant with weather and web browsing capabilities.

      For weather queries:
      - Always ask for a location if none is provided
      - Use the weatherTool to fetch current weather data

      For web browsing:
      - Always use full URLs (e.g., "https://www.google.com")
      - Use the browserTool to open web pages

      Be friendly and helpful in all interactions!
    `,
    model: "openai/gpt-4o",
    tools: { weatherTool, browserTool }, // Add both tools
    memory: new Memory({
      storage: new LibSQLStore({
        id: "storage-memory",
        url: "file:./assistant.db",
      }),
    }),
  }),
  threadId: "main-conversation",
})
```

Now you can ask your assistant to open websites: "Open Google for me" or "Show me the weather website".

## Step 9 – Deploy your client

### Building your client

Create a production build:

```bash theme={null}
pnpm build
```

### Create a startup script

Add to your `package.json`:

```json theme={null}
{
  "bin": {
    "weather-assistant": "./dist/index.js"
  }
}
```

Add a shebang to your built `dist/index.js`:

```javascript theme={null}
#!/usr/bin/env node
// ... rest of your compiled code
```

Make it executable:

```bash theme={null}
chmod +x dist/index.js
```

### Link globally

Install your CLI globally:

```bash theme={null}
pnpm link --global
```

Now you can run `weather-assistant` from anywhere!

## Extending your client

Your AG-UI client is now a solid foundation. Here are some ideas for enhancement:

### Add more tools

* **Calculator tool** – For mathematical operations
* **File system tool** – For reading/writing files
* **API tools** – For connecting to other services
* **Database tools** – For querying data

### Improve the interface

* **Rich formatting** – Use libraries like `chalk` for colored output
* **Progress indicators** – Show loading states for long operations
* **Configuration files** – Allow users to customize settings
* **Command-line arguments** – Support different modes and options

### Add persistence

* **Conversation history** – Save and restore chat sessions
* **User preferences** – Remember user settings
* **Tool results caching** – Cache expensive API calls

## Share your client

Built something useful? Consider sharing it with the community:

1. **Open source it** – Publish your code on GitHub
2. **Publish to npm** – Make it installable via `npm install`
3. **Create documentation** – Help others understand and extend your work
4. **Join discussions** – Share your experience in the [AG-UI GitHub Discussions](https://github.com/orgs/ag-ui-protocol/discussions)

## Conclusion

You've built a complete AG-UI client from scratch! Your weather assistant demonstrates the core concepts:

* **Event-driven architecture** with real-time streaming
* **Tool integration** for real-world functionality
* **Conversation memory** for context retention
* **Interactive CLI interface** for user engagement

From here, you can extend your client to support any use case – from simple CLI tools to complex conversational applications. The AG-UI protocol provides the foundation, and your creativity provides the possibilities.

Happy building! 🚀

# Introduction

Source: https://docs.ag-ui.com/quickstart/introduction

Learn how to get started building an AG-UI integration