Agents
Learn about agents in the Agent User Interaction Protocol
Agents are the core components in the AG-UI protocol that process requests and generate responses. They establish a standardized way for front-end applications to communicate with AI services through a consistent interface, regardless of the underlying implementation.
What is an Agent?
In AG-UI, an agent is a class that:
- Manages conversation state and message history
- Processes incoming messages and context
- Generates responses through an event-driven streaming interface
- Follows a standardized protocol for communication
Agents can be implemented to connect with any AI service, including:
- Large language models (LLMs) like GPT-4 or Claude
- Custom AI systems
- Retrieval augmented generation (RAG) systems
- Multi-agent systems
Agent Architecture
All agents in AG-UI extend the `AbstractAgent` class, which provides the foundation for:
- State management
- Message history tracking
- Event stream processing
- Tool usage
Core Components
AG-UI agents have several key components:
- Configuration: Agent ID, thread ID, and initial state
- Messages: Conversation history with user and assistant messages
- State: Structured data that persists across interactions
- Events: Standardized messages for communication with clients
- Tools: Functions that agents can use to interact with external systems
Agent Types
AG-UI provides different agent implementations to suit various needs:
AbstractAgent
The base class that all agents extend. It handles core event processing, state management, and message history.
HttpAgent
A concrete implementation that connects to remote AI services via HTTP:
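The real `HttpAgent` ships with the AG-UI client SDK; the sketch below is a self-contained stand-in (the class body, field names, and endpoint are illustrative, not the actual SDK class) showing the shape of its configuration and the payload a run would send over HTTP:

```typescript
interface Message {
  id: string;
  role: "user" | "assistant";
  content: string;
}

interface HttpAgentConfig {
  url: string;                      // endpoint of the remote AI service
  headers?: Record<string, string>; // e.g. auth headers
  agentId?: string;
  threadId?: string;
}

class HttpAgent {
  messages: Message[] = [];

  constructor(public config: HttpAgentConfig) {}

  // Build the JSON body a run would POST to the remote service.
  buildRunPayload(): object {
    return {
      agentId: this.config.agentId,
      threadId: this.config.threadId,
      messages: this.messages,
    };
  }
}

const agent = new HttpAgent({
  url: "https://example.com/ag-ui/agent", // hypothetical endpoint
  headers: { Authorization: "Bearer <token>" },
  agentId: "support-agent",
  threadId: "thread-1",
});
```

The remote service answers such a request with a stream of AG-UI events, which the client forwards to the front-end unchanged.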
Custom Agents
You can create custom agents to integrate with any AI service by extending `AbstractAgent`.
Implementing Agents
Basic Implementation
To create a custom agent, extend the `AbstractAgent` class and implement the required `run` method:
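A minimal sketch of that pattern, assuming a simplified `AbstractAgent` whose `run` yields protocol-style events (the production SDK streams events via Observables; a plain generator is used here for clarity, and the event field names are illustrative):

```typescript
type AgentEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string; role: "assistant" }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string };

interface RunInput {
  messages: { role: string; content: string }[];
}

abstract class AbstractAgent {
  abstract run(input: RunInput): Generator<AgentEvent>;
}

// A toy agent that echoes the last user message back in streamed chunks.
class EchoAgent extends AbstractAgent {
  *run(input: RunInput): Generator<AgentEvent> {
    const last = input.messages[input.messages.length - 1];
    const messageId = "msg-1";
    yield { type: "TEXT_MESSAGE_START", messageId, role: "assistant" };
    // Stream the reply word by word, as an LLM-backed agent would stream tokens.
    for (const word of `You said: ${last.content}`.split(" ")) {
      yield { type: "TEXT_MESSAGE_CONTENT", messageId, delta: word + " " };
    }
    yield { type: "TEXT_MESSAGE_END", messageId };
  }
}

const events = [...new EchoAgent().run({ messages: [{ role: "user", content: "hi" }] })];
```

Whatever service sits behind `run`, the contract is the same: open a message, stream its content, close it.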
Agent Capabilities
Agents in the AG-UI protocol provide a rich set of capabilities that enable sophisticated AI interactions:
Interactive Communication
Agents establish bi-directional communication channels with front-end applications through event streams. This enables:
- Real-time streaming responses character-by-character
- Immediate feedback loops between user and AI
- Progress indicators for long-running operations
- Structured data exchange in both directions
Tool Usage
Agents can use tools to perform actions and access external resources. Importantly, tools are defined and passed in from the front-end application to the agent, allowing for a flexible and extensible system:
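For example, a front-end might define a tool like this and pass it along with each run request (the JSON Schema-style `parameters` shape and the specific field names here are illustrative):

```typescript
interface Tool {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

// A front-end-defined tool: the agent can request it, but only the
// front-end (and the human behind it) can execute it.
const confirmPurchase: Tool = {
  name: "confirmPurchase",
  description: "Ask the user to confirm a purchase before it is placed",
  parameters: {
    type: "object",
    properties: {
      itemId: { type: "string", description: "ID of the item to buy" },
      amount: { type: "number", description: "Total price" },
    },
    required: ["itemId", "amount"],
  },
};

// The front-end passes its tools along with the conversation on each run:
const runInput = { messages: [], tools: [confirmPurchase] };
```

Because tools travel with the request, the front-end stays in control of what the agent is allowed to ask for.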
Tools are invoked through a sequence of events:
- `TOOL_CALL_START`: Indicates the beginning of a tool call
- `TOOL_CALL_ARGS`: Streams the arguments for the tool call
- `TOOL_CALL_END`: Marks the completion of the tool call
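The sequence can be sketched as follows: the argument JSON is streamed across one or more delta events, and the client concatenates them and parses the result once the call ends (event field names are illustrative):

```typescript
type ToolCallEvent =
  | { type: "TOOL_CALL_START"; toolCallId: string; toolCallName: string }
  | { type: "TOOL_CALL_ARGS"; toolCallId: string; delta: string }
  | { type: "TOOL_CALL_END"; toolCallId: string };

// What the agent emits for a single call, in order:
const events: ToolCallEvent[] = [
  { type: "TOOL_CALL_START", toolCallId: "call-1", toolCallName: "confirmPurchase" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call-1", delta: '{"itemId":' },
  { type: "TOOL_CALL_ARGS", toolCallId: "call-1", delta: '"sku-42","amount":19.99}' },
  { type: "TOOL_CALL_END", toolCallId: "call-1" },
];

// Client side: concatenate the streamed deltas, then parse once END arrives.
let argsJson = "";
for (const e of events) {
  if (e.type === "TOOL_CALL_ARGS") argsJson += e.delta;
}
const args = JSON.parse(argsJson); // { itemId: "sku-42", amount: 19.99 }
```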
Front-end applications can then execute the tool and provide results back to the agent. This bidirectional flow enables sophisticated human-in-the-loop workflows where:
- The agent can request specific actions be performed
- Humans can execute those actions with appropriate judgment
- Results are fed back to the agent for continued reasoning
- The agent maintains awareness of all decisions made in the process
This mechanism is particularly powerful for implementing interfaces where AI and humans collaborate. For example, CopilotKit leverages this exact pattern with its `useCopilotAction` hook, which provides a simplified way to define and handle tools in React applications.
By keeping the AI informed about human decisions through the tool mechanism, applications can maintain context and create more natural collaborative experiences between users and AI assistants.
State Management
Agents maintain a structured state that persists across interactions. This state can be:
- Updated incrementally through `STATE_DELTA` events
- Completely refreshed with `STATE_SNAPSHOT` events
- Accessed by both the agent and front-end
- Used to store user preferences, conversation context, or application state
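A sketch of how a client might apply those two event types, assuming `STATE_DELTA` carries JSON Patch-style (RFC 6902) operations; only top-level `replace` operations are handled here for brevity:

```typescript
type StateEvent =
  | { type: "STATE_SNAPSHOT"; snapshot: Record<string, unknown> }
  | { type: "STATE_DELTA"; delta: { op: "replace"; path: string; value: unknown }[] };

function applyStateEvent(
  state: Record<string, unknown>,
  event: StateEvent
): Record<string, unknown> {
  // A snapshot replaces the whole state.
  if (event.type === "STATE_SNAPSHOT") return { ...event.snapshot };
  // A delta patches individual fields (top-level paths only in this sketch).
  const next = { ...state };
  for (const op of event.delta) {
    next[op.path.replace(/^\//, "")] = op.value;
  }
  return next;
}

let state: Record<string, unknown> = {};
state = applyStateEvent(state, {
  type: "STATE_SNAPSHOT",
  snapshot: { step: "draft", language: "en" },
});
state = applyStateEvent(state, {
  type: "STATE_DELTA",
  delta: [{ op: "replace", path: "/step", value: "review" }],
});
// state is now { step: "review", language: "en" }
```

Deltas keep payloads small for frequent updates, while snapshots let a client resynchronize from scratch.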
Multi-Agent Collaboration
AG-UI supports agent-to-agent handoff and collaboration:
- Agents can delegate tasks to other specialized agents
- Multiple agents can work together in a coordinated workflow
- State and context can be transferred between agents
- The front-end maintains a consistent experience across agent transitions
For example, a general assistant agent might hand off to a specialized coding agent when programming help is needed, passing along the conversation context and specific requirements.
Human-in-the-Loop Workflows
Agents support human intervention and assistance:
- Agents can request human input on specific decisions
- Front-ends can pause agent execution and resume it after human feedback
- Human experts can review and modify agent outputs before they’re finalized
- Hybrid workflows combine AI efficiency with human judgment
This enables applications where the agent acts as a collaborative partner rather than an autonomous system.
Conversational Memory
Agents maintain a complete history of conversation messages:
- Past interactions inform future responses
- Message history is synchronized between client and server
- Messages can include rich content (text, structured data, references)
- The context window can be managed to focus on relevant information
Metadata and Instrumentation
Agents can emit metadata about their internal processes:
- Reasoning steps through custom events
- Performance metrics and timing information
- Source citations and reference tracking
- Confidence scores for different response options
This allows front-ends to provide transparency into the agent’s decision-making process and help users understand how conclusions were reached.
Using Agents
Once you’ve implemented or instantiated an agent, you can use it like this:
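A minimal sketch of driving an agent and rendering its event stream; the stand-in agent below yields a fixed sequence, where a real agent would stream events from an AI service:

```typescript
type AgentEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string };

// Stand-in for a real agent run: yields a short, fixed event sequence.
function* runAgent(): Generator<AgentEvent> {
  yield { type: "TEXT_MESSAGE_START", messageId: "m1" };
  yield { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hello, " };
  yield { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "world!" };
  yield { type: "TEXT_MESSAGE_END", messageId: "m1" };
}

// A front-end consumes the stream and renders text as it arrives,
// which is what makes character-by-character streaming UIs possible.
let rendered = "";
for (const event of runAgent()) {
  if (event.type === "TEXT_MESSAGE_CONTENT") rendered += event.delta;
}
// rendered === "Hello, world!"
```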
Agent Configuration
Agents accept configuration through the constructor:
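A sketch of what that configuration might look like; the field names mirror the core components listed earlier (agent ID, thread ID, messages, initial state), but the exact constructor signature is SDK-specific:

```typescript
interface AgentConfig {
  agentId?: string;
  threadId?: string;
  initialMessages?: { role: "user" | "assistant"; content: string }[];
  initialState?: Record<string, unknown>;
}

class ConfiguredAgent {
  messages: { role: "user" | "assistant"; content: string }[];
  state: Record<string, unknown>;

  constructor(public config: AgentConfig = {}) {
    // Seed message history and state from the configuration, with
    // sensible empty defaults when they are omitted.
    this.messages = config.initialMessages ?? [];
    this.state = config.initialState ?? {};
  }
}

const agent = new ConfiguredAgent({
  agentId: "research-assistant",
  threadId: "thread-123",
  initialState: { topic: "protocol design" },
});
```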
Agent State Management
AG-UI agents maintain state across interactions:
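For instance, a hypothetical agent whose state carries over between runs, so each response can build on what came before:

```typescript
class StatefulAgent {
  // State persists on the agent instance across runs.
  state: Record<string, unknown> = { turnCount: 0 };

  run(userMessage: string): string {
    const turn = (this.state.turnCount as number) + 1;
    // Each run reads the previous state and writes an updated one.
    this.state = { ...this.state, turnCount: turn, lastMessage: userMessage };
    return `Turn ${turn}: acknowledged "${userMessage}"`;
  }
}

const agent = new StatefulAgent();
agent.run("Plan a trip");
agent.run("Add a museum visit");
// agent.state now records turnCount: 2 and the latest message
```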
Conclusion
Agents are the foundation of the AG-UI protocol, providing a standardized way to connect front-end applications with AI services. By extending the `AbstractAgent` class, you can create custom integrations with any AI service while maintaining a consistent interface for your applications.
The event-driven architecture enables real-time, streaming interactions that are essential for modern AI applications, and the standardized protocol ensures compatibility across different implementations.