This document outlines the protocol for interacting with AG-Kit agents via the send-message endpoint. It is intended for developers who wish to build custom SDKs in other languages or interact directly with the AG-Kit server using HTTP, without being restricted to the existing React SDK.

Prerequisites

This document is for developers who want to interact with an AG-Kit agent server over HTTP. Before reading, check that:
  • You have an AG-Kit agent server running. If not, please refer to Run Agent.
  • You do not want to use the existing React SDK. If you do, please refer to Run Agent.

0. send-message Endpoint

The client and server communicate via the send-message endpoint. During an interaction between the client and the server:
  • The client sends a POST request to the endpoint, and carries the payload in the request body as a JSON string.
  • The server responds in SSE format, streaming back a series of events.
The following sections focus on the request and response payloads and walk through a few common interaction cases:
  • Basic chat
  • Client tool calling
  • Server tool calling
  • Interrupt handling
To learn the details of the endpoint, please refer to API Reference.
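
As a rough illustration of the exchange described above, the following Python sketch builds the POST request and parses the streamed `data:` lines into events. The URL, the header values, and the helper names build_request and iter_events are assumptions made for this example, not part of AG-Kit. The parser also assumes one JSON object per `data:` line, as in the examples in this document.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Assumption: placeholder URL, replace it with your own server address.
ENDPOINT = "http://localhost:3000/send-message"


def build_request(body: dict, url: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: these header values are typical for JSON-in, SSE-out.
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed event per SSE `data:` line, skipping other lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and non-data fields
        yield json.loads(line[len("data:"):].strip())
```

To use it against a live server, pass the request to urllib.request.urlopen, decode each line of the response as UTF-8, and feed the lines to iter_events.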

1. Basic Chat

The simplest form of interaction involves the client sending a plain text message and the agent responding with text.

Client Request Payload

The client sends a request whose messages array contains a message with role set to "user" and the user’s input in content. For an ongoing conversation, the conversationId should be included.
conversationId identifies the conversation. AG-Kit’s agent server keeps track of the conversation history based on the conversationId. The messages field carries only the delta, that is, the new messages of the conversation; the agent server appends them to the conversation history automatically.
Request Body Example:
{
    "messages": [
        {
            "role": "user",
            "content": "hi"
        }
    ],
    "conversationId": "2da2146d-9e7e-41f3-9515-b70557e4430c"
}
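
A minimal sketch of building this payload in Python follows. The helper name chat_body is hypothetical. Omitting conversationId on the first turn is an assumption based on the wording "for an ongoing conversation"; check the API Reference for how a conversation is started.

```python
from typing import Optional


def chat_body(user_text: str, conversation_id: Optional[str] = None) -> dict:
    """Return a request body that carries only the new user message."""
    body = {"messages": [{"role": "user", "content": user_text}]}
    # Assumption: conversationId is only needed once a conversation exists.
    if conversation_id is not None:
        body["conversationId"] = conversation_id
    return body


# A follow-up turn sends just the new message, not the whole history.
body = chat_body("hi", "2da2146d-9e7e-41f3-9515-b70557e4430c")
```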

Server Response Events

For a plain text response, the server streams back text events. The client should concatenate the content fields of these events to reconstruct the agent’s complete response. Response Event Examples:
data: {"type": "text", "content": "Hello"}

data: {"type": "text", "content": " there!"}

data: {"type": "text", "content": " How can I help you?"}
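
The concatenation step can be sketched in Python as follows. The function name collect_text is hypothetical, and the events are assumed to be already parsed from the `data:` lines into dictionaries.

```python
from typing import Iterable


def collect_text(events: Iterable[dict]) -> str:
    """Join the content of every text event, ignoring other event types."""
    return "".join(e["content"] for e in events if e.get("type") == "text")


events = [
    {"type": "text", "content": "Hello"},
    {"type": "text", "content": " there!"},
    {"type": "text", "content": " How can I help you?"},
]
reply = collect_text(events)
print(reply)  # Hello there! How can I help you?
```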


2. Client Tool Calling

In this scenario, the agent requests the client to perform a tool call. The client is responsible for executing the tool and submitting its result back to the server.

Client Request Payload (Initial)

In the initial request, the client sends the available client tools together with messages that may trigger the agent to call one of them. The tools are carried in the tools field, which is an array of tool definitions, each with name, description, and parameters. The parameters value should be a JSON Schema serialized as a string. Request Body Example:
{
    "messages": [
        {
            "role": "user",
            "content": "Change background color to blue."
        }
    ],
    "tools": [
        {
            "name": "change-background-color",
            "description": "Change the background color. The color should be one of the following: blue, red, green, transparent, yellow, purple, orange, pink, brown, gray, black, or white. No other colors or any variants.",
            "parameters": "{\"$schema\":\"https://json-schema.org/draft/2020-12/schema\",\"type\":\"object\",\"properties\":{\"color\":{\"type\":\"string\"}},\"required\":[\"color\"],\"additionalProperties\":false}"
        }
    ],
    "conversationId": "c7d334f7-d920-4dd3-91e0-53d695e79fc0"
}
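
Because parameters is a string rather than a nested object, the schema has to be serialized before it goes into the request body. A minimal Python sketch, in which the helper name client_tool and the shortened description are illustrative only:

```python
import json


def client_tool(name: str, description: str, schema: dict) -> dict:
    """Return a tool definition whose parameters field is a JSON Schema string."""
    return {"name": name, "description": description, "parameters": json.dumps(schema)}


schema = {
    "type": "object",
    "properties": {"color": {"type": "string"}},
    "required": ["color"],
    "additionalProperties": False,
}
tool = client_tool("change-background-color", "Change the background color.", schema)
body = {
    "messages": [{"role": "user", "content": "Change background color to blue."}],
    "tools": [tool],
    "conversationId": "c7d334f7-d920-4dd3-91e0-53d695e79fc0",
}
```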

Server Response Events (Tool Call Request)

After the server receives the available client tools and decides to call one of them, the server streams a series of tool call events. Tool call events are designed to support streaming of tool call arguments and come in three variants:
  • tool-call-start: carries toolCallId and toolCallName; indicates the start of a tool call.
  • tool-call-args: carries toolCallId and delta; represents a chunk of the arguments for a tool call.
  • tool-call-end: carries toolCallId; indicates the completion of a tool call.
The client should assemble the arguments by concatenating the delta fields of the tool-call-args events. Once the client receives the tool-call-end event, the assembled arguments should form a valid JSON string, which the client can parse and pass to the client tool. Response Event Examples:
data: {"type":"tool-call-start","toolCallId":"a_b_c","toolCallName":"change-background-color"}

data: {"type":"tool-call-args","toolCallId":"a_b_c","delta":"{\"color\":"}

data: {"type":"tool-call-args","toolCallId":"a_b_c","delta":" \"blue\"}"}

data: {"type":"tool-call-end","toolCallId":"a_b_c"}
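
One possible way to assemble the arguments is sketched below in Python. The function name iter_tool_calls and the shape of the yielded dictionaries are hypothetical. Treating a call with no tool-call-args events as having empty arguments is also an assumption.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_tool_calls(events: Iterable[dict]) -> Iterator[dict]:
    """Yield {"id", "name", "args"} for each completed tool call."""
    pending: Dict[str, dict] = {}  # toolCallId -> name and argument chunks
    for event in events:
        kind = event.get("type")
        if kind == "tool-call-start":
            pending[event["toolCallId"]] = {"name": event["toolCallName"], "chunks": []}
        elif kind == "tool-call-args":
            pending[event["toolCallId"]]["chunks"].append(event["delta"])
        elif kind == "tool-call-end":
            call = pending.pop(event["toolCallId"])
            raw = "".join(call["chunks"])
            # Assumption: no argument chunks means the tool takes no arguments.
            args = json.loads(raw) if raw else {}
            yield {"id": event["toolCallId"], "name": call["name"], "args": args}


events = [
    {"type": "tool-call-start", "toolCallId": "a_b_c", "toolCallName": "change-background-color"},
    {"type": "tool-call-args", "toolCallId": "a_b_c", "delta": "{\"color\":"},
    {"type": "tool-call-args", "toolCallId": "a_b_c", "delta": " \"blue\"}"},
    {"type": "tool-call-end", "toolCallId": "a_b_c"},
]
calls = list(iter_tool_calls(events))
```

Keying the pending chunks by toolCallId keeps the sketch correct even if the chunks of two tool calls were interleaved, although the examples here show only one call at a time.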

Client Action and Request Payload (Tool Execution and Result Submission)

Once the client has received the tool call events, it can execute the tool.
How the tool is executed is up to the client: it can be a local function, a UI component, or an external API call. The client can even add a Human-in-the-Loop step that requires human approval before the tool runs.
After the tool has executed, the client sends a new request with a tool message whose content is the result of the tool execution and whose toolCallId is the id of the tool call. Request Body Example (Submitting Tool Result):
{
    "messages": [
        {
            "role": "tool",
            "content": "Background color successfully changed to: blue",
            "toolCallId": "a_b_c"
        }
    ],
    "conversationId": "c7d334f7-d920-4dd3-91e0-53d695e79fc0"
}
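
The execution and submission step might look like the following Python sketch. The TOOLS registry, the change_background_color function, and the shape of the call dictionary (id, name, and parsed args) are all hypothetical and stand in for whatever the client actually does.

```python
from typing import Callable, Dict


def change_background_color(color: str) -> str:
    """Hypothetical local tool; a real client would update its UI here."""
    return f"Background color successfully changed to: {color}"


# Hypothetical registry mapping tool names to local implementations.
TOOLS: Dict[str, Callable[..., str]] = {"change-background-color": change_background_color}


def tool_result_body(call: dict, conversation_id: str) -> dict:
    """Run the requested tool and wrap its result in a tool message."""
    result = TOOLS[call["name"]](**call["args"])
    return {
        "messages": [{"role": "tool", "content": result, "toolCallId": call["id"]}],
        "conversationId": conversation_id,
    }


call = {"id": "a_b_c", "name": "change-background-color", "args": {"color": "blue"}}
body = tool_result_body(call, "c7d334f7-d920-4dd3-91e0-53d695e79fc0")
```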

Server Response Events (After Tool Result Submission)

As in regular chat, the server appends the tool message to the conversation history and streams further text events, often incorporating the tool’s output into the response. Response Event Examples:
data: {"type":"text","content":"I"}

data: {"type":"text","content":"'ve"}

data: {"type":"text","content":" successfully"}

data: {"type":"text","content":" changed"}

data: {"type":"text","content":" the"}

data: {"type":"text","content":" background"}

data: {"type":"text","content":" color"}

data: {"type":"text","content":" to"}

data: {"type":"text","content":" blue"}

data: {"type":"text","content":" for"}

data: {"type":"text","content":" you"}

data: {"type":"text","content":"."}


3. Server Tool Calling

In this scenario, the agent server has its own server tools available. The client sends a request that leads the server to call a server-side tool. The server executes the tool automatically and streams the whole tool calling process back to the client, along with further text generation. This example assumes that a server tool called get_weather is available.

Client Request Payload

The client sends a request that is likely to trigger the agent to call the get_weather tool. Request Body Example:
{
    "messages": [
        {
            "role": "user",
            "content": "What's the weather in London?"
        }
    ],
    "conversationId": "c7d334f7-d920-4dd3-91e0-53d695e79fc0"
}

Server Response Events

The server streams events progressively as it generates a response and executes a tool. The tool call events have exactly the same shape as those for client tool calls: tool-call-start, tool-call-args, and tool-call-end. After the tool call completes, the server sends a tool-result event carrying the result and the toolCallId, followed by a series of text events. Response Event Examples (Successful Tool Execution):
data: {"type":"text","content":"\n"}

data: {"type":"text","content":"I"}

data: {"type":"text","content":"'ll"}

data: {"type":"text","content":" check"}

data: {"type":"text","content":" the"}

data: {"type":"text","content":" weather"}

data: {"type":"text","content":" in"}

data: {"type":"text","content":" London"}

data: {"type":"text","content":" for"}

data: {"type":"text","content":" you"}

data: {"type":"text","content":".\n"}

data: {"type":"tool-call-start","toolCallId":"call_5fab24926dc542cda0df0bb3","toolCallName":"get_weather"}

data: {"type":"tool-call-args","toolCallId":"call_5fab24926dc542cda0df0bb3","delta":"{\"city\":\"London\"}"}

data: {"type":"tool-call-end","toolCallId":"call_5fab24926dc542cda0df0bb3"}

data: {"type":"tool-result","result":"The weather in London is sunny and 20 degrees Celsius.","toolCallId":"call_5fab24926dc542cda0df0bb3"}

data: {"type":"text","content":"\n"}

data: {"type":"text","content":"The"}

data: {"type":"text","content":" weather"}

data: {"type":"text","content":" in"}

data: {"type":"text","content":" London"}

data: {"type":"text","content":" is"}

data: {"type":"text","content":" sunny"}

data: {"type":"text","content":" and"}

data: {"type":"text","content":" "}

data: {"type":"text","content":"20"}

data: {"type":"text","content":" degrees"}

data: {"type":"text","content":" Celsius"}

data: {"type":"text","content":"."}

data: {"type":"text","content":" It"}

data: {"type":"text","content":"'s"}

data: {"type":"text","content":" a"}

data: {"type":"text","content":" pleasant"}

data: {"type":"text","content":" day"}

data: {"type":"text","content":" for"}

data: {"type":"text","content":" outdoor"}

data: {"type":"text","content":" activities"}

data: {"type":"text","content":"!"}
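
A client that only needs the final text and the tool outputs could reduce such a stream as in the Python sketch below. The function name summarize_stream is hypothetical. The event type "tool-result" and its result field are taken from the example events above.

```python
from typing import Dict, Iterable, Tuple


def summarize_stream(events: Iterable[dict]) -> Tuple[str, Dict[str, str]]:
    """Return the concatenated text and a toolCallId -> result mapping."""
    text_parts = []
    tool_results: Dict[str, str] = {}
    for event in events:
        kind = event.get("type")
        if kind == "text":
            text_parts.append(event["content"])
        elif kind == "tool-result":
            tool_results[event["toolCallId"]] = event["result"]
        # tool-call-start/args/end are informational here, since the server
        # runs the tool itself; a UI might use them to show progress.
    return "".join(text_parts), tool_results
```

Unlike client tool calling, the client does not need to send a follow-up request here, because the tool result arrives in the same stream.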
Response Event Examples (Tool Execution Error):


4. Interrupt Handling

In this scenario, the agent pauses execution and requests user input before continuing (Human-in-the-Loop). The server emits an interrupt event with a unique id, an optional reason, and an arbitrary structured payload. The client should render an appropriate UI based on the payload, collect the user’s input, and then resume the conversation by posting a new request with a resume object.

Server Response Events (Interrupt)

When the agent pauses, the server streams an interrupt event:
data: {"type":"interrupt","id":"a522d9262d6dd44c78777969cb3e58ab","reason":"agent requested interrupt","payload":{"styles":["dark","sweet"]}}
The payload of the interrupt event is arbitrary. It is up to the agent server developer to define the payload.

Client Action and Request Payload (Resume)

After receiving an interrupt, the client renders a UI based on the payload, gathers the resume data, and posts a new request containing a resume object. The resume object carries the interruptId of the interrupt being answered and a payload field holding the resume data serialized as a JSON string. The request should also include the same conversationId. Request Body Example (Resume):
{
    "resume": {
        "interruptId": "a522d9262d6dd44c78777969cb3e58ab",
        "payload": "\"sweet\""
    },
    "conversationId": "6b1f2ed4-3dcd-41fc-ba82-b6de95355982"
}
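
The interrupt and resume round trip can be sketched in Python as follows. The helper names find_interrupt and resume_body are hypothetical. Mapping the interrupt event's id to interruptId follows the example above, where the two values match.

```python
import json
from typing import Any, Iterable, Optional


def find_interrupt(events: Iterable[dict]) -> Optional[dict]:
    """Return the first interrupt event in the stream, or None if there is none."""
    for event in events:
        if event.get("type") == "interrupt":
            return event
    return None


def resume_body(interrupt: dict, answer: Any, conversation_id: str) -> dict:
    """Build the resume request; the payload is the answer serialized as JSON."""
    return {
        "resume": {
            "interruptId": interrupt["id"],
            "payload": json.dumps(answer),
        },
        "conversationId": conversation_id,
    }


events = [{
    "type": "interrupt",
    "id": "a522d9262d6dd44c78777969cb3e58ab",
    "reason": "agent requested interrupt",
    "payload": {"styles": ["dark", "sweet"]},
}]
interrupt = find_interrupt(events)
body = resume_body(interrupt, "sweet", "6b1f2ed4-3dcd-41fc-ba82-b6de95355982")
```

Serializing with json.dumps is what turns the plain answer sweet into the quoted string "\"sweet\"" seen in the request body example.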

Server Response Events (After Resume)

After a valid resume, the server continues streaming as usual, typically with text events and possibly tool call events:
data: {"type":"text","content":"Thanks, proceeding with the requested action."}

data: {"type":"text","content":" Action completed."}
