Multimodal Inputs

AGUIUserMessage.Content accepts either plain text or an ordered array of multimodal content parts through the AGUIUserContent union.
using System.Text.Json;
using AGUI.Abstractions;

var message = new AGUIUserMessage
{
    Id = "user-1",
    Content =
    [
        new AGUITextInputContent
        {
            Text = "Summarize this PDF and screenshot"
        },
        new AGUIImageInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/screen.png",
                MimeType = "image/png"
            }
        },
        new AGUIDocumentInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/report.pdf",
                MimeType = "application/pdf"
            }
        }
    ]
};

User Message Content

The AG-UI wire model for user messages is content: string | InputContent[]. In .NET, that is represented by AGUIUserContent.
var plainText = new AGUIUserMessage
{
    Id = "user-1",
    Content = "Hello"
};

var parts = new AGUIUserMessage
{
    Id = "user-2",
    Content =
    [
        new AGUITextInputContent { Text = "Describe this image." },
        new AGUIImageInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/image.png",
                MimeType = "image/png"
            }
        }
    ]
};
AGUIUserContent has implicit conversions from string, List<AGUIInputContent>, and AGUIInputContent[], supports collection expressions, and implements IReadOnlyList<AGUIInputContent> for normalized reads. When the stored value is a string, the read-only list facade exposes it as a single AGUITextInputContent.
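
To make the two wire shapes concrete, the sketch below builds them by hand with System.Text.Json.Nodes instead of the AG-UI types, then shows how a consumer might tell them apart. The field names come from the tables on this page. DescribeContent is a hypothetical helper, and the exact JSON the library's own serializer emits is not shown here.

```csharp
using System;
using System.Text.Json.Nodes;

// Shape 1: content is a plain JSON string.
JsonNode stringContent = JsonValue.Create("Hello")!;

// Shape 2: content is an ordered array of typed parts.
JsonNode arrayContent = new JsonArray(
    new JsonObject { ["type"] = "text", ["text"] = "Describe this image." },
    new JsonObject
    {
        ["type"] = "image",
        ["source"] = new JsonObject
        {
            ["type"] = "url",
            ["value"] = "https://example.com/image.png",
            ["mimeType"] = "image/png"
        }
    });

// Hypothetical helper: branch on the JSON node kind to identify the shape.
static string DescribeContent(JsonNode content) => content switch
{
    JsonArray parts => $"{parts.Count} content part(s)",
    JsonValue => "plain text",
    _ => "unexpected shape"
};

Console.WriteLine(DescribeContent(stringContent)); // plain text
Console.WriteLine(DescribeContent(arrayContent));  // 2 content part(s)
```

In application code, AGUIUserContent hides this branching: reading it as a list yields the same sequence of parts for either shape.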

Input Content Types

All content parts derive from AGUIInputContent and are distinguished on the wire by the JSON type discriminator field.

| C# type | JSON type | Properties |
| --- | --- | --- |
| AGUITextInputContent | text | text |
| AGUIImageInputContent | image | source, optional metadata |
| AGUIAudioInputContent | audio | source, optional metadata |
| AGUIVideoInputContent | video | source, optional metadata |
| AGUIDocumentInputContent | document | source, optional metadata |
| AGUIBinaryInputContent | binary | mimeType, optional id, url, data, filename |

Text

var text = new AGUITextInputContent
{
    Text = "What issue do you see in this UI?"
};

| C# property | JSON field | Type | Description |
| --- | --- | --- | --- |
| Type | type | "text" | Content discriminator |
| Text | text | string | Text content |

Media Parts

Images, audio, video, and documents all derive from AGUIMediaInputContent.
var image = new AGUIImageInputContent
{
    Source = new AGUIInputContentUrlSource
    {
        Value = "https://example.com/ui.png",
        MimeType = "image/png"
    },
    Metadata = JsonDocument.Parse("""{"detail":"high"}""").RootElement.Clone()
};

| C# property | JSON field | Type | Description |
| --- | --- | --- | --- |
| Type | type | "image", "audio", "video", or "document" | Content discriminator |
| Source | source | AGUIInputContentSource | Data or URL source |
| Metadata | metadata | JsonElement? | Optional modality-specific metadata |
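
Because Metadata is a JsonElement?, the example above parses a JSON string and calls Clone() on the root element. The sketch below illustrates why the clone matters and shows one alternative way to produce the same value. It uses only System.Text.Json, and the "detail" key is simply the sample metadata from this page, not a documented schema.

```csharp
using System;
using System.Text.Json;

JsonElement metadata;
using (JsonDocument doc = JsonDocument.Parse("""{"detail":"high"}"""))
{
    // Clone() copies the data, so the element stays valid after doc is disposed.
    metadata = doc.RootElement.Clone();
}

// Alternative: serialize an anonymous object straight to a JsonElement.
JsonElement sameMetadata = JsonSerializer.SerializeToElement(new { detail = "high" });

Console.WriteLine(metadata.GetProperty("detail").GetString());     // high
Console.WriteLine(sameMetadata.GetProperty("detail").GetString()); // high
```

Either element can then be assigned to the Metadata property of a media part.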

Binary

AGUIBinaryInputContent represents an arbitrary binary input part.
var binary = new AGUIBinaryInputContent
{
    MimeType = "application/octet-stream",
    Data = "AAECAwQ=",
    Filename = "payload.bin"
};

| C# property | JSON field | Type | Description |
| --- | --- | --- | --- |
| Type | type | "binary" | Content discriminator |
| MimeType | mimeType | string | MIME type |
| Id | id | string? | Optional binary identifier |
| Url | url | string? | Optional URL for the binary |
| Data | data | string? | Optional inline base64 data |
| Filename | filename | string? | Optional file name |
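
As a quick check on what the Data field carries, the sketch below decodes the sample value used above with the standard Convert class and encodes the bytes again. It only illustrates the base64 round trip and does not touch any AG-UI type.

```csharp
using System;

// The sample Data value from the binary example above.
string data = "AAECAwQ=";

// Decoding yields the five raw bytes 0x00 through 0x04.
byte[] payload = Convert.FromBase64String(data);
Console.WriteLine(BitConverter.ToString(payload)); // 00-01-02-03-04

// Encoding the same bytes reproduces the original string.
string encoded = Convert.ToBase64String(payload);
Console.WriteLine(encoded); // AAECAwQ=
```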

Source Types

Media input parts use AGUIInputContentSource, a polymorphic hierarchy whose concrete type is selected by the JSON type field.

Data Source

Use AGUIInputContentDataSource for inline base64 payloads. mimeType is required.
var source = new AGUIInputContentDataSource
{
    Value = "iVBORw0KGgo...",
    MimeType = "image/png"
};

| C# property | JSON field | Type | Description |
| --- | --- | --- | --- |
| Type | type | "data" | Source discriminator |
| Value | value | string | Inline base64 payload |
| MimeType | mimeType | string | Payload MIME type |
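
The sketch below shows one way to produce a Value string from raw bytes, assuming Value holds bare base64 with the MIME type carried separately in MimeType. It encodes the eight-byte PNG file signature, which also explains why base64-encoded PNG payloads such as the truncated sample above begin with iVBORw0KGgo. The file path in the trailing comment is hypothetical.

```csharp
using System;

// The eight-byte signature that opens every PNG file.
byte[] pngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

// Plain base64 of the bytes: no "data:" prefix, no MIME type inside the string.
string signatureBase64 = Convert.ToBase64String(pngSignature);
Console.WriteLine(signatureBase64); // iVBORw0KGgo=

// For a real image, encode the whole file instead (hypothetical path):
// string value = Convert.ToBase64String(System.IO.File.ReadAllBytes("screenshot.png"));
```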

URL Source

Use AGUIInputContentUrlSource for HTTP(S) URLs or data URLs. mimeType is optional.
var source = new AGUIInputContentUrlSource
{
    Value = "https://example.com/meeting.wav",
    MimeType = "audio/wav"
};

| C# property | JSON field | Type | Description |
| --- | --- | --- | --- |
| Type | type | "url" | Source discriminator |
| Value | value | string | URL or data URL |
| MimeType | mimeType | string? | Optional MIME type |
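
Since a URL source also accepts data URLs, the following sketch builds one from raw bytes in the RFC 2397 form data:<mime>;base64,<payload>. ToDataUrl is a hypothetical helper written for this illustration and is not part of AGUI.Abstractions.

```csharp
using System;
using System.Text;

// Hypothetical helper: wraps raw bytes in a base64 data URL.
static string ToDataUrl(byte[] bytes, string mimeType) =>
    $"data:{mimeType};base64,{Convert.ToBase64String(bytes)}";

byte[] bytes = Encoding.UTF8.GetBytes("hello");
string url = ToDataUrl(bytes, "text/plain");

Console.WriteLine(url); // data:text/plain;base64,aGVsbG8=
```

The resulting string would be assigned to Value on an AGUIInputContentUrlSource. Because a data URL already embeds its MIME type, the separate MimeType property can be left unset.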

Common Use Cases

Visual QA

var message = new AGUIUserMessage
{
    Id = "q1",
    Content =
    [
        new AGUITextInputContent
        {
            Text = "What issue do you see in this UI?"
        },
        new AGUIImageInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/ui.png",
                MimeType = "image/png"
            },
            Metadata = JsonDocument.Parse("""{"detail":"high"}""").RootElement.Clone()
        }
    ]
};

Audio Transcription

var message = new AGUIUserMessage
{
    Id = "q2",
    Content =
    [
        new AGUITextInputContent
        {
            Text = "Transcribe this recording."
        },
        new AGUIAudioInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/meeting.wav",
                MimeType = "audio/wav"
            }
        }
    ]
};

Mixed Media Comparison

var message = new AGUIUserMessage
{
    Id = "q3",
    Content =
    [
        new AGUITextInputContent
        {
            Text = "Compare the screenshot with the spec."
        },
        new AGUIImageInputContent
        {
            Source = new AGUIInputContentDataSource
            {
                Value = "iVBORw0KGgo...",
                MimeType = "image/png"
            }
        },
        new AGUIDocumentInputContent
        {
            Source = new AGUIInputContentUrlSource
            {
                Value = "https://example.com/spec.pdf",
                MimeType = "application/pdf"
            }
        }
    ]
};
Use plain string Content for simple text-only turns. Use multimodal parts when order matters or when the user message includes media alongside text.