POST /v1/responses is the core call. You make it against your instance, not against api.agent37.com: every instance serves its own chat API at https://{instanceId}.agent37.app, the URL of the default port in the create response. This page uses https://ab12cd34ef.agent37.app. Authenticate with the same sk_live_ key you use on the hosting API.
The call is agentic by default: the agent can browse, run code, use a terminal, read and write files, call connected tools, and reason across many steps before answering.
Omit session_id to start a conversation; the reply carries the session_id the gateway minted. Send it back to continue. The session keeps the full history, so you never resend a transcript: you send only the new input.
Request body
Request bodies are capped at 2 MB; anything larger returns 413 payload_too_large.
The message or task. Text only: a plain string, with no attachment, file, or image field. See Sessions and models for how history carries across turns.
Continue an existing conversation. Omit it to start a new one; the response returns the new session’s id. An unknown id returns 404 session_not_found.
With stream set to true, the call returns a Server-Sent Events stream; with false, it returns the finished response as one JSON body. See Streaming.
The LLM to run this turn on. Omit it to use the session’s current model (the instance default on a new session). List what the instance can run with GET /v1/models; see Sessions and models.
The model’s provider, for example anthropic. Both model and provider are set per turn, and sending them on a continuation updates the session’s stored pair for the turns that follow.
How hard the model thinks: none, minimal, low, medium, high, or xhigh.
Up to 16 key/value pairs, at most 64 KB serialized. Echoed back on the response object, never interpreted.
Which agent runs the turn. hermes is the default and the only agent today.
chat runs one turn and replies. goal is reserved: sending it returns 400 validation_error today.
instance_id in the body is accepted and ignored. The URL names the instance: one gateway per instance, so there is nothing to route.
Response
The response object. Ids are 32-character hex strings; timestamps from the gateway are epoch milliseconds.
The response id. Use it to fetch, reconnect, or cancel the turn.
The conversation this turn belongs to. Reuse it on the next call to continue the thread.
in_progress, then a terminal completed, failed, or cancelled.
The agent that ran the turn, hermes today.
The model the turn ran on, null when none was set.
The model’s provider, null when none was set.
The agent’s final answer. Always a string, empty if the turn produced none.
Token counts and cost for the turn: { input_tokens, output_tokens, cost_usd }. cost_usd is absent or null when the provider did not report a cost.
Your request metadata, echoed back verbatim.
When the turn started, epoch milliseconds.
A failed turn does not reject the HTTP call. The POST still returns 200 with status: "failed" and error set. Branch on status, not on the HTTP code.
Example
- Request
- Response
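A minimal sketch of the call in Python, using only the standard library. It rests on two assumptions this page does not confirm: that the message field is named input, and that the sk_live_ key is sent as a Bearer token in the Authorization header. Check both against your instance. The names build_request, answer_or_raise, and send are illustrative, and the size check reads the 2 MB cap as decimal megabytes, the stricter interpretation.

```python
import json
import urllib.request

BASE_URL = "https://ab12cd34ef.agent37.app"  # example instance URL from this page
MAX_BODY_BYTES = 2_000_000  # ASSUMPTION: "2 MB" read as decimal megabytes


def build_request(api_key, text, session_id=None, metadata=None):
    """Build (but do not send) a POST /v1/responses request."""
    body = {"input": text}  # ASSUMPTION: the message field is named "input"
    if session_id is not None:
        body["session_id"] = session_id  # continue an existing conversation
    if metadata is not None:
        body["metadata"] = metadata  # echoed back, never interpreted
    data = json.dumps(body).encode("utf-8")
    if len(data) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the 2 MB cap")
    return urllib.request.Request(
        BASE_URL + "/v1/responses",
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the key travels as a Bearer token
            "Authorization": "Bearer " + api_key,
        },
    )


def answer_or_raise(response):
    """Branch on the status field, not on the HTTP code."""
    status = response["status"]
    if status == "completed":
        return response["output_text"]
    if status == "failed":
        raise RuntimeError("turn failed: %r" % (response.get("error"),))
    raise RuntimeError("turn did not complete: " + status)


def send(api_key, text, session_id=None):
    """Send one non-streaming turn and return the parsed response object."""
    req = build_request(api_key, text, session_id)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling answer_or_raise(send(key, "Summarize this repo")) returns the final answer, or raises when the turn failed even though the HTTP call itself returned 200.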
Continue a conversation
The first message omits session_id and starts a session. The reply returns a session_id; pass it on the next message to continue the same thread. The session holds the full history, so you never resend a transcript: you send only the new input.
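The pattern can be sketched as a small wrapper that remembers the session_id between turns. This is an illustration, not a client shipped with the API: Conversation and say are hypothetical names, post stands for whatever function you use to send the JSON body and parse the reply, and the input field name is an assumption.

```python
class Conversation:
    """Tracks the session_id across turns so only the new input is sent."""

    def __init__(self, post):
        # post: hypothetical callable that takes the request body (dict),
        # performs POST /v1/responses, and returns the response object (dict)
        self._post = post
        self.session_id = None  # unset until the first reply arrives

    def say(self, text):
        body = {"input": text}  # ASSUMPTION: the message field is named "input"
        if self.session_id is not None:
            # Continuation: send only the id, never the earlier transcript
            body["session_id"] = self.session_id
        response = self._post(body)
        # The first reply carries the session_id minted by the gateway
        self.session_id = response["session_id"]
        return response
```

The first say call goes out without a session_id; every later call reuses the one returned by the first reply.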
One active turn per session. A session runs one response at a time. Sending new input while one is in flight returns 409 session_busy. Use another session, or cancel the running turn first.
Follow up on a response
Every response has an id you can use after the call returns.
| Action | Endpoint |
|---|---|
| Fetch it again | GET /v1/responses/{id} |
| Reconnect a dropped stream | GET /v1/responses/{id}/stream |
| Stop a running turn | POST /v1/responses/{id}/cancel |
GET /v1/responses/{id} returns the response object at any time, before or after the turn finishes.
GET /v1/responses/{id}/stream replays every event so far in order, then stays attached live, so a dropped connection never loses the answer. See Streaming for the replay window.
POST /v1/responses/{id}/cancel takes no body and stops a running turn, best effort. It returns 200 with the current response object. Cancelling a finished response is a no-op that returns its terminal state, still 200.
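The three follow-up endpoints differ only in method and path, which the sketch below captures in one helper. follow_up_request is a hypothetical name, and the Bearer header is an assumption about how the sk_live_ key is sent.

```python
import urllib.request

BASE_URL = "https://ab12cd34ef.agent37.app"  # example instance URL from this page

# Method and path for each follow-up action, as listed in the table above
ROUTES = {
    "fetch": ("GET", "/v1/responses/{id}"),
    "stream": ("GET", "/v1/responses/{id}/stream"),
    "cancel": ("POST", "/v1/responses/{id}/cancel"),
}


def follow_up_request(api_key, response_id, action="fetch"):
    """Build (but do not send) a follow-up request for an existing response."""
    method, path = ROUTES[action]
    return urllib.request.Request(
        BASE_URL + path.format(id=response_id),
        method=method,  # cancel is a POST that takes no body
        # ASSUMPTION: the key travels as a Bearer token
        headers={"Authorization": "Bearer " + api_key},
    )
```

Pass the result to urllib.request.urlopen to send it. Because cancelling a finished response still returns 200 with its terminal state, a cancel call is safe to retry.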
Status values
A response moves from in_progress to exactly one terminal status.
| Status | Meaning |
|---|---|
| in_progress | The turn is running. |
| completed | The turn finished and output_text holds the answer. |
| failed | The turn ended on an error; error says why. |
| cancelled | You stopped the turn with cancel. |
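Since every response ends in exactly one terminal status, a client that does not stream can poll GET /v1/responses/{id} until it sees one. The sketch below assumes a fetch callable you supply that returns the parsed response object. wait_until_terminal is a hypothetical name, and the polling interval and limit are arbitrary defaults, not values recommended by the API.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_until_terminal(fetch, response_id, interval=1.0, max_polls=60,
                        sleep=time.sleep):
    """Poll until the response reaches a terminal status, then return it.

    fetch: hypothetical callable taking a response id and returning the
    response object (dict), for example via GET /v1/responses/{id}.
    """
    for _ in range(max_polls):
        response = fetch(response_id)
        if response["status"] in TERMINAL_STATUSES:
            return response
        sleep(interval)  # still in_progress: wait before asking again
    raise TimeoutError("response %s still in_progress" % response_id)
```

The caller then branches on the returned status: read output_text on completed, inspect error on failed.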
Next steps
Streaming
The full event list and a client parser for stream: true.
Sessions
List threads, read history, delete a session, and pick a model.
Build a chat app
Put send, continue, and list together into a working chat.
Instances
Create, size, and manage the computer a conversation runs on.