# Capabilities

> Dynamic capability discovery for agents in the AG-UI protocol

Agents in the AG-UI protocol can declare what they support at runtime through
**capability discovery**. This allows clients to query an agent and adapt their
behavior based on what features are available — without guessing or hardcoding
assumptions.

## How It Works

`AbstractAgent` exposes an optional `getCapabilities()` method that returns a
typed snapshot of everything the agent currently supports:

```typescript theme={null}
const agent = new HttpAgent({ url: "https://my-agent.example.com/api" })

const capabilities = await agent.getCapabilities?.()

if (capabilities?.tools?.supported) {
  // `items` is optional, so fall back to 0 instead of printing "undefined"
  console.log(`Agent provides ${capabilities.tools.items?.length ?? 0} tools`)
}

if (capabilities?.reasoning?.supported) {
  // Show reasoning UI toggle
}
```

### Key Principles

* **Discovery only**: the agent declares what it can do; there is no
  negotiation.
* **Dynamic**: each call returns the agent's state at that moment (e.g., if
  tools are added, the next call reflects them).
* **Optional**: agents are not required to implement it. When the method is
  missing, `agent.getCapabilities?.()` evaluates to `undefined`.
* **Absent = unknown**: declare only what you support. An omitted field means
  the capability is undeclared, not that it is unsupported.
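
Under the "absent = unknown" rule, a client sees three possible states for any flag rather than two. The sketch below makes that explicit. It uses a local stand-in type instead of importing the real `AgentCapabilities`, and `describeSupport`, `Support`, and `CapabilitiesLike` are illustrative names, not part of the AG-UI SDK:

```typescript
// Illustrative three-way reading of an optional capability flag.
type Support = "supported" | "unsupported" | "unknown"

// Local stand-in covering only the fields this sketch touches.
interface CapabilitiesLike {
  tools?: { supported?: boolean }
  reasoning?: { supported?: boolean }
}

function describeSupport(flag: boolean | undefined): Support {
  if (flag === true) return "supported"
  if (flag === false) return "unsupported"
  // Field omitted, category omitted, or getCapabilities() not implemented.
  return "unknown"
}

// An agent that declares tools as disabled and says nothing about reasoning.
const declared: CapabilitiesLike = { tools: { supported: false } }

console.log(describeSupport(declared.tools?.supported)) // "unsupported"
console.log(describeSupport(declared.reasoning?.supported)) // "unknown"
```

Whether a client treats "unknown" as off (hide the feature) or on (try it and handle failure) is a product decision that the protocol leaves to the client.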

## The AgentCapabilities Interface

Capabilities are organized into typed categories, each representing a different
aspect of agent functionality:

```typescript theme={null}
interface AgentCapabilities {
  /** Agent identity and metadata. */
  identity?: IdentityCapabilities
  /** Supported transport mechanisms (SSE, WebSocket, binary, etc.). */
  transport?: TransportCapabilities
  /** Tools the agent provides and tool calling configuration. */
  tools?: ToolsCapabilities
  /** Output format support (structured output, MIME types). */
  output?: OutputCapabilities
  /** State and memory management (snapshots, deltas, persistence). */
  state?: StateCapabilities
  /** Multi-agent coordination (delegation, handoffs, sub-agents). */
  multiAgent?: MultiAgentCapabilities
  /** Reasoning and thinking support (chain-of-thought, encrypted thinking). */
  reasoning?: ReasoningCapabilities
  /** Multimodal input/output support organized by direction (input vs output). */
  multimodal?: MultimodalCapabilities
  /** Execution control and limits (code execution, timeouts, iteration caps). */
  execution?: ExecutionCapabilities
  /** Human-in-the-loop support (approvals, interventions, feedback). */
  humanInTheLoop?: HumanInTheLoopCapabilities
  /** Integration-specific capabilities not covered by the standard categories. */
  custom?: Record<string, unknown>
}
```

The `custom` field is an escape hatch for integration-specific capabilities that
don't fit into the standard categories.

## Capability Categories

### Identity

Basic metadata about the agent. Useful for discovery UIs, agent marketplaces,
and debugging. Set these when you want clients to display agent information or
when multiple agents are available and users need to pick one.

```typescript theme={null}
interface IdentityCapabilities {
  /** Human-readable name shown in UIs and agent selectors. */
  name?: string
  /** The framework or platform powering this agent (e.g., "langgraph", "mastra", "crewai"). */
  type?: string
  /** What this agent does — helps users and routing logic decide when to use it. */
  description?: string
  /** Semantic version of the agent (e.g., "1.2.0"). Useful for compatibility checks. */
  version?: string
  /** Organization or team that maintains this agent. */
  provider?: string
  /** URL to the agent's documentation or homepage. */
  documentationUrl?: string
  /** Arbitrary key-value pairs for integration-specific identity info. */
  metadata?: Record<string, unknown>
}
```

### Transport

Declares which transport mechanisms the agent supports. Clients use this to pick
the best connection strategy. Only set flags to `true` for transports your agent
actually handles — omit or set `false` for unsupported ones.

```typescript theme={null}
interface TransportCapabilities {
  /** Set `true` if the agent streams responses via SSE. Most agents enable this. */
  streaming?: boolean
  /** Set `true` if the agent accepts persistent WebSocket connections. */
  websocket?: boolean
  /** Set `true` if the agent supports the AG-UI binary protocol (protobuf over HTTP). */
  httpBinary?: boolean
  /** Set `true` if the agent can send async updates via webhooks after a run finishes. */
  pushNotifications?: boolean
  /** Set `true` if the agent supports resuming interrupted streams via sequence numbers. */
  resumable?: boolean
}
```
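
As a rough illustration of how a client might turn these flags into a connection choice, the sketch below picks one strategy from the declared transports. The preference order (WebSocket, then binary HTTP, then SSE) and the names `pickConnectionStrategy` and `ConnectionStrategy` are assumptions made for this example, not something the protocol prescribes:

```typescript
// Local copy of the interface above so the sketch is self-contained.
interface TransportCapabilities {
  streaming?: boolean
  websocket?: boolean
  httpBinary?: boolean
  pushNotifications?: boolean
  resumable?: boolean
}

// Hypothetical strategy labels; a real client would map these to its own connection code.
type ConnectionStrategy = "websocket" | "http-binary" | "sse" | "default"

// Assumed preference order: WebSocket, then binary HTTP, then SSE.
function pickConnectionStrategy(transport?: TransportCapabilities): ConnectionStrategy {
  if (transport?.websocket) return "websocket"
  if (transport?.httpBinary) return "http-binary"
  if (transport?.streaming) return "sse"
  // Nothing declared: fall back to whatever the client uses by default.
  return "default"
}

console.log(pickConnectionStrategy({ streaming: true, httpBinary: true })) // "http-binary"
console.log(pickConnectionStrategy(undefined)) // "default"
```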

### Tools

Tool calling capabilities. Distinguishes between tools the agent itself provides
(listed in `items`) and tools the client passes at runtime via
`RunAgentInput.tools`. Enable this when your agent can call functions, search the
web, execute code, etc.

```typescript theme={null}
interface ToolsCapabilities {
  /** Set `true` if the agent can make tool calls at all. Set `false` to explicitly
   *  signal tool calling is disabled even if items are present. */
  supported?: boolean
  /** The tools this agent provides on its own (full JSON Schema definitions).
   *  These are distinct from client-provided tools passed in `RunAgentInput.tools`. */
  items?: Tool[]
  /** Set `true` if the agent can invoke multiple tools concurrently within a single step. */
  parallelCalls?: boolean
  /** Set `true` if the agent accepts and uses tools provided by the client at runtime. */
  clientProvided?: boolean
}
```
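
One way a client could use `supported` and `clientProvided` is to decide whether sending its own tools is worthwhile. The sketch below is a minimal version of that check. The `Tool` type is a simplified stand-in for the SDK's type, and `clientToolsToSend` and the `confirmAction` tool are hypothetical:

```typescript
// Simplified stand-in for the SDK's Tool type (name, description, JSON Schema parameters).
interface Tool {
  name: string
  description: string
  parameters: Record<string, unknown>
}

interface ToolsCapabilities {
  supported?: boolean
  items?: Tool[]
  parallelCalls?: boolean
  clientProvided?: boolean
}

// Decide which client-side tools are worth sending with a run.
function clientToolsToSend(clientTools: Tool[], tools?: ToolsCapabilities): Tool[] {
  // An explicit `false` is a clear signal; an omitted flag is unknown, so we still send.
  if (tools?.supported === false || tools?.clientProvided === false) return []
  return clientTools
}

// Hypothetical client-side tool used only for this example.
const confirmTool: Tool = {
  name: "confirmAction",
  description: "Ask the user to confirm an action",
  parameters: { type: "object", properties: {} },
}

console.log(clientToolsToSend([confirmTool], { supported: true, clientProvided: false }).length) // 0
console.log(clientToolsToSend([confirmTool], undefined).length) // 1
```

Sending tools when the flag is merely omitted follows the "absent = unknown" principle: only an explicit `false` suppresses them.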

### Output

Output format support. Enable `structuredOutput` when your agent can return
responses conforming to a JSON schema, which is useful for programmatic
consumption.

```typescript theme={null}
interface OutputCapabilities {
  /** Set `true` if the agent can produce structured JSON output matching a provided schema. */
  structuredOutput?: boolean
  /** MIME types the agent can produce (e.g., `["text/plain", "application/json"]`).
   *  Omit if the agent only produces plain text. */
  supportedMimeTypes?: string[]
}
```
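
A client that wants a specific format can check `supportedMimeTypes` before requesting it. The sketch below assumes that an omitted list means plain text only, following the field's doc comment; `canProduce` is an illustrative helper name:

```typescript
// Local copy of the interface above so the sketch is self-contained.
interface OutputCapabilities {
  structuredOutput?: boolean
  supportedMimeTypes?: string[]
}

// Assumption: an omitted list means plain text only, per the field's doc comment.
function canProduce(output: OutputCapabilities | undefined, mimeType: string): boolean {
  const declared = output?.supportedMimeTypes ?? ["text/plain"]
  // MIME types are case-insensitive, so compare in lowercase.
  const wanted = mimeType.toLowerCase()
  return declared.some((type) => type.toLowerCase() === wanted)
}

const output: OutputCapabilities = { supportedMimeTypes: ["text/plain", "application/json"] }

console.log(canProduce(output, "application/json")) // true
console.log(canProduce(undefined, "application/json")) // false
console.log(canProduce(undefined, "text/plain")) // true
```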

### State

State and memory management capabilities. These tell the client how the agent
handles shared state and whether conversation context persists across runs.

```typescript theme={null}
interface StateCapabilities {
  /** Set `true` if the agent emits `STATE_SNAPSHOT` events (full state replacement). */
  snapshots?: boolean
  /** Set `true` if the agent emits `STATE_DELTA` events (JSON Patch incremental updates). */
  deltas?: boolean
  /** Set `true` if the agent has long-term memory beyond the current thread
   *  (e.g., vector store, knowledge base, or cross-session recall). */
  memory?: boolean
  /** Set `true` if state is preserved across multiple runs within the same thread.
   *  When `false`, state resets on each run. */
  persistentState?: boolean
}
```

### Multi-Agent

Multi-agent coordination capabilities. Enable these when your agent can
orchestrate or hand off work to other agents.

```typescript theme={null}
interface MultiAgentCapabilities {
  /** Set `true` if the agent participates in any form of multi-agent coordination. */
  supported?: boolean
  /** Set `true` if the agent can delegate subtasks to other agents while retaining control. */
  delegation?: boolean
  /** Set `true` if the agent can transfer the conversation entirely to another agent. */
  handoffs?: boolean
  /** List of sub-agents this agent can invoke. Helps clients build agent selection UIs. */
  subAgents?: Array<{ name: string; description?: string }>
}
```

### Reasoning

Reasoning and thinking capabilities. Enable these when your agent exposes its
internal thought process (e.g., chain-of-thought, extended thinking).

```typescript theme={null}
interface ReasoningCapabilities {
  /** Set `true` if the agent produces reasoning/thinking tokens visible to the client. */
  supported?: boolean
  /** Set `true` if reasoning tokens are streamed incrementally (vs. returned all at once). */
  streaming?: boolean
  /** Set `true` if reasoning content is encrypted (zero-data-retention mode).
   *  Clients should expect opaque `encryptedValue` fields instead of readable content. */
  encrypted?: boolean
}
```
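
The three flags combine into a small decision about how to present reasoning. The sketch below shows one possible mapping; the display modes and the `reasoningDisplay` function are hypothetical client-side names chosen for this example:

```typescript
// Local copy of the interface above so the sketch is self-contained.
interface ReasoningCapabilities {
  supported?: boolean
  streaming?: boolean
  encrypted?: boolean
}

// Hypothetical display modes for a client-side reasoning panel.
type ReasoningDisplay = "hidden" | "opaque" | "streamed" | "complete"

function reasoningDisplay(reasoning?: ReasoningCapabilities): ReasoningDisplay {
  // Not declared or explicitly unsupported: show nothing.
  if (!reasoning?.supported) return "hidden"
  // Encrypted reasoning arrives as opaque values, so there is no text to render.
  if (reasoning.encrypted) return "opaque"
  return reasoning.streaming ? "streamed" : "complete"
}

console.log(reasoningDisplay({ supported: true, streaming: true })) // "streamed"
console.log(reasoningDisplay({ supported: true, streaming: true, encrypted: true })) // "opaque"
console.log(reasoningDisplay({})) // "hidden"
```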

### Multimodal

Multimodal input and output support, organized into `input` and `output`
sub-objects so clients can independently query what the agent accepts versus what
it produces. Clients use this to show/hide file upload buttons, audio recorders,
image pickers, etc.

```typescript theme={null}
interface MultimodalInputCapabilities {
  /** Set `true` if the agent can process image inputs (e.g., screenshots, photos). */
  image?: boolean
  /** Set `true` if the agent can process audio inputs (speech, recordings). */
  audio?: boolean
  /** Set `true` if the agent can process video inputs. */
  video?: boolean
  /** Set `true` if the agent can process PDF documents. */
  pdf?: boolean
  /** Set `true` if the agent can process arbitrary file uploads. */
  file?: boolean
}

interface MultimodalOutputCapabilities {
  /** Set `true` if the agent can generate images as part of its response. */
  image?: boolean
  /** Set `true` if the agent can produce audio output (text-to-speech, audio files). */
  audio?: boolean
}

interface MultimodalCapabilities {
  /** Modalities the agent can accept as input (images, audio, video, PDFs, files). */
  input?: MultimodalInputCapabilities
  /** Modalities the agent can produce as output (images, audio). */
  output?: MultimodalOutputCapabilities
}
```
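
To illustrate the show/hide use case, the sketch below derives a list of input controls from the `input` sub-object. The control identifiers and `visibleInputControls` are hypothetical; the interfaces are local copies of the ones above:

```typescript
interface MultimodalInputCapabilities {
  image?: boolean
  audio?: boolean
  video?: boolean
  pdf?: boolean
  file?: boolean
}

interface MultimodalOutputCapabilities {
  image?: boolean
  audio?: boolean
}

interface MultimodalCapabilities {
  input?: MultimodalInputCapabilities
  output?: MultimodalOutputCapabilities
}

// Hypothetical identifiers for UI controls a client might render.
type InputControl = "image-picker" | "audio-recorder" | "video-upload" | "pdf-upload" | "file-upload"

function visibleInputControls(multimodal?: MultimodalCapabilities): InputControl[] {
  // Treat a missing `input` object as "nothing declared".
  const input: MultimodalInputCapabilities = multimodal?.input ?? {}
  const controls: InputControl[] = []
  if (input.image) controls.push("image-picker")
  if (input.audio) controls.push("audio-recorder")
  if (input.video) controls.push("video-upload")
  if (input.pdf) controls.push("pdf-upload")
  if (input.file) controls.push("file-upload")
  return controls
}

// Output flags do not affect input controls.
console.log(visibleInputControls({ input: { image: true, pdf: true }, output: { audio: true } }))
// [ "image-picker", "pdf-upload" ]
```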

### Execution

Execution control and limits. Declare these so clients can set expectations about
how long or how many steps an agent run might take.

```typescript theme={null}
interface ExecutionCapabilities {
  /** Set `true` if the agent can execute code (e.g., Python, JavaScript) during a run. */
  codeExecution?: boolean
  /** Set `true` if code execution happens in a sandboxed/isolated environment.
   *  Only meaningful when `codeExecution` is `true`. */
  sandboxed?: boolean
  /** Maximum number of tool-call/reasoning iterations the agent will perform per run.
   *  Helps clients display progress or set timeout expectations. */
  maxIterations?: number
  /** Maximum wall-clock time (in milliseconds) the agent will run before timing out. */
  maxExecutionTime?: number
}
```
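
For example, a client could derive its own request timeout from `maxExecutionTime`. The sketch below adds a grace period on top of the declared limit and falls back to a default when nothing usable is declared. The 60-second fallback, the 5-second grace period, and the name `clientTimeoutMs` are arbitrary choices for illustration:

```typescript
// Local copy of the interface above so the sketch is self-contained.
interface ExecutionCapabilities {
  codeExecution?: boolean
  sandboxed?: boolean
  maxIterations?: number
  maxExecutionTime?: number
}

// Returns a client-side timeout in milliseconds.
// The fallback and grace values are illustrative defaults, not protocol constants.
function clientTimeoutMs(
  execution: ExecutionCapabilities | undefined,
  fallbackMs: number = 60000,
  graceMs: number = 5000,
): number {
  const declared = execution?.maxExecutionTime
  // Ignore missing, non-finite, or non-positive values.
  if (typeof declared !== "number" || !Number.isFinite(declared) || declared <= 0) {
    return fallbackMs
  }
  // Leave headroom so the agent's own timeout fires before the client gives up.
  return declared + graceMs
}

console.log(clientTimeoutMs({ maxExecutionTime: 30000 })) // 35000
console.log(clientTimeoutMs(undefined)) // 60000
```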

### Human-in-the-Loop

Human-in-the-loop interaction support. Enable these when your agent can pause
execution to request human input, approval, or feedback before continuing.

```typescript theme={null}
interface HumanInTheLoopCapabilities {
  /** Set `true` if the agent supports any form of human-in-the-loop interaction. */
  supported?: boolean
  /** Set `true` if the agent can pause and request explicit approval before
   *  performing sensitive actions (e.g., sending emails, deleting data). */
  approvals?: boolean
  /** Set `true` if the agent allows humans to intervene and modify its plan mid-execution. */
  interventions?: boolean
  /** Set `true` if the agent can incorporate user feedback (thumbs up/down, corrections)
   *  to improve its behavior within the current session. */
  feedback?: boolean
  /** Set `true` if the agent participates in the AG-UI interrupt protocol (emits
   *  `RunFinished` with `outcome: { type: "interrupt", interrupts: [...] }`,
   *  accepts `RunAgentInput.resume`). */
  interrupts?: boolean
  /** Set `true` if tool-call interrupts accept `editedArgs` in the resume payload.
   *  Only meaningful when `interrupts` is `true`. */
  approveWithEdits?: boolean
}
```

<Note>
  See [Interrupts](/concepts/interrupts) for the full protocol specification.
</Note>
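
Because `approveWithEdits` is only meaningful when `interrupts` is `true`, a client should check the two flags together. The sketch below shows one way to do that; `ResumeMode` and `resumeMode` are hypothetical names, and the interface is a local copy of the one above:

```typescript
interface HumanInTheLoopCapabilities {
  supported?: boolean
  approvals?: boolean
  interventions?: boolean
  feedback?: boolean
  interrupts?: boolean
  approveWithEdits?: boolean
}

// Hypothetical labels for how the client lets a user respond to a tool-call interrupt.
type ResumeMode = "none" | "approve-or-reject" | "approve-with-edits"

function resumeMode(hitl?: HumanInTheLoopCapabilities): ResumeMode {
  // Without the interrupt protocol, `approveWithEdits` carries no meaning.
  if (!hitl?.interrupts) return "none"
  return hitl.approveWithEdits ? "approve-with-edits" : "approve-or-reject"
}

console.log(resumeMode({ interrupts: true, approveWithEdits: true })) // "approve-with-edits"
console.log(resumeMode({ approveWithEdits: true })) // "none"
```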

## Implementing getCapabilities()

### Custom Agents

Implement `getCapabilities()` on your agent subclass, returning only the
capabilities you actually support:

```typescript theme={null}
import { AbstractAgent, AgentCapabilities } from "@ag-ui/client"

class MyAgent extends AbstractAgent {
  async getCapabilities(): Promise<AgentCapabilities> {
    return {
      identity: {
        name: "my-agent",
        description: "A custom agent with tool support",
        version: "1.0.0",
      },
      transport: {
        streaming: true,
      },
      tools: {
        supported: true,
        // getRegisteredTools() stands in for however your agent tracks its tools
        items: this.getRegisteredTools(),
        clientProvided: true,
      },
      state: {
        snapshots: true,
        deltas: true,
      },
    }
  }

  // ... run() implementation
}
```

### Dynamic Capabilities

Each call to `getCapabilities()` returns a fresh snapshot, so changes to the
agent show up in the next call:

```typescript theme={null}
const agent = new MyAgent(config)

let caps = await agent.getCapabilities()
console.log(caps.tools?.items?.length) // 5

// Register more tools at runtime (registerTool() is assumed to exist on MyAgent)
agent.registerTool(newTool)

caps = await agent.getCapabilities()
console.log(caps.tools?.items?.length) // 6
```

## Client Usage Patterns

### Adaptive UI

Render UI components based on what the agent supports:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

// Only show reasoning panel if supported
if (capabilities?.reasoning?.supported) {
  showReasoningPanel()
}

// Only show sub-agent selector if available
if (capabilities?.multiAgent?.subAgents?.length) {
  showSubAgentSelector(capabilities.multiAgent.subAgents)
}

// Only show approval UI if the agent declares approval support
if (capabilities?.humanInTheLoop?.approvals) {
  enableApprovalWorkflow()
}
```

### Feature Gating

Disable features the agent doesn't support instead of failing at runtime. The
`?? false` fallback treats an undeclared capability as unavailable:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

const canUseStructuredOutput = capabilities?.output?.structuredOutput ?? false
const canStream = capabilities?.transport?.streaming ?? false
```

### Custom Capabilities

Access integration-specific capabilities via the `custom` field:

```typescript theme={null}
const capabilities = await agent.getCapabilities?.()

const rateLimit = capabilities?.custom?.rateLimit as
  | { maxRequestsPerMinute: number }
  | undefined

if (rateLimit) {
  configureThrottling(rateLimit.maxRequestsPerMinute)
}
```
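
Since `custom` values are typed as `unknown`, a type assertion is not checked at runtime. A type guard is a safer alternative when the shape matters. The sketch below validates the same hypothetical `rateLimit` entry before using it; `RateLimitCapability`, `isRateLimit`, and `readRateLimit` are illustrative names:

```typescript
// Hypothetical shape of an integration-specific capability.
interface RateLimitCapability {
  maxRequestsPerMinute: number
}

// Type guard: checks the value at runtime and narrows its type for the compiler.
function isRateLimit(value: unknown): value is RateLimitCapability {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).maxRequestsPerMinute === "number"
  )
}

// Returns the rate limit only when the agent declared it with the expected shape.
function readRateLimit(custom?: Record<string, unknown>): RateLimitCapability | undefined {
  const candidate = custom?.rateLimit
  return isRateLimit(candidate) ? candidate : undefined
}

console.log(readRateLimit({ rateLimit: { maxRequestsPerMinute: 60 } })?.maxRequestsPerMinute) // 60
console.log(readRateLimit({ rateLimit: "fast" })) // undefined
```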