Agent37 Cloud gives every user their own hosted agent computer and routes requests from your app to that isolated instance. Create an instance and Hermes, the live agent, comes back running at its own URL. From there it chats, streams, browses, runs tools, and keeps state between conversations. You never touch a server.

Get started in two calls

Create an instance, then send it a message. The first call returns a running computer with its own URL; the second is a streamed conversation with Hermes, and the returned session_id continues it from there.

```shell
# 1. create an instance. it comes back running.
curl -X POST https://api.agent37.com/v1/instances \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "budget": { "topup_micros": 1000000 } }'
# -> { "id": "ab12cd34ef", "status": "running", ... }

# 2. talk to it at its own URL, streamed.
curl -N https://ab12cd34ef.agent37.app/v1/responses \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Research the top 3 EV makers, write a memo.",
    "stream": true
  }'
```

Setting budget.topup_micros to 1000000 gives the instance $1 of one-time managed-spend headroom, so the LLM credentials it ships with work on the first message. API keys come from the dashboard, and the wallet that pays for all of this is topped up under billing in the dashboard.
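
Since the budget is expressed in micros, a small helper can keep the arithmetic out of your request code. The sketch below relies only on the ratio stated above (1000000 micros = $1); the function names `usd_to_micros` and `create_body` are illustrative and not part of the API.

```shell
# Convert whole US dollars to micros, using the 1000000 micros = $1 ratio.
usd_to_micros() {
  echo $(( $1 * 1000000 ))
}

# Build the create-instance request body for a given whole-dollar top-up.
# (Hypothetical helper; the body shape mirrors the curl example above.)
create_body() {
  printf '{ "budget": { "topup_micros": %s } }' "$(usd_to_micros "$1")"
}
```

Passing `"$(create_body 1)"` to `curl -d` would then reproduce the body used in the first call.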

One key, two planes

Everything above used one sk_live_ key against two different base URLs. That is the whole architecture:
  1. The hosting API at https://api.agent37.com/v1 creates and manages instances: templates, lifecycle, budgets, exec.
  2. Your instance’s agent API at https://ab12cd34ef.agent37.app is served by the agent gateway running inside the instance: chat at POST /v1/responses, sessions, models, health. The platform edge authenticates the same Bearer key and routes the request; the gateway answers it.
Send the Authorization header on every request to either plane. Core concepts walks through the split in full.
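
As a rough sketch of the split, the helpers below wrap both planes behind one authenticated POST. The names `HOSTING_API`, `agent_base_url`, `a37_post`, and the `AGENT37_KEY` variable are assumptions made for illustration. The `<id>.agent37.app` pattern is inferred from the example URL above, so if the create response already carries the instance URL, prefer that value.

```shell
# Hosting plane: creates and manages instances.
HOSTING_API="https://api.agent37.com/v1"

# Agent plane: base URL for one instance, inferred from the example above.
agent_base_url() {
  printf 'https://%s.agent37.app' "$1"
}

# Authenticated JSON POST that works against either plane.
# Assumes AGENT37_KEY holds your sk_live_ key (variable name is illustrative).
a37_post() {
  url=$1
  body=$2
  curl -sS -X POST "$url" \
    -H "Authorization: Bearer ${AGENT37_KEY}" \
    -H "Content-Type: application/json" \
    -d "$body"
}

# Example calls (not executed here):
#   a37_post "$HOSTING_API/instances" '{ "budget": { "topup_micros": 1000000 } }'
#   a37_post "$(agent_base_url ab12cd34ef)/v1/responses" '{ "input": "Hello" }'
```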

How it fits together

Three resources, and one distinction worth holding onto.

| Resource | What it is |
| --- | --- |
| Instance | The agent’s own always-on computer, reachable at its own URL. Built from a template, billed monthly, persistent until you delete it. Create one per end user. |
| Session | One conversation on an instance. A message starts one; reuse its session_id to continue. An instance holds many. |
| Response | One agentic turn: your input, the agent’s work, its reply. Stream it live or fetch it by id. |

The agent is not the model. The template installs the agent: Hermes is live today, and OpenClaw, Claude Code, and Codex are coming soon. The model is the LLM the agent runs on, chosen per turn as model plus provider. They are separate dials.
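
To make the per-turn dials concrete, here is a hedged sketch of a request-body builder for POST /v1/responses. The JSON field names `session_id`, `model`, and `provider` are taken from the wording on this page and are assumptions about the exact request shape, so check the API reference before relying on them. `turn_body` is a hypothetical helper, and it does no JSON escaping.

```shell
# Build a /v1/responses body for one turn.
# Args: input, session_id (may be empty), model (may be empty), provider.
# Field names are assumed from the prose, not confirmed by an API reference.
# Values must not contain double quotes or backslashes (no escaping is done).
turn_body() {
  input=$1
  session_id=$2
  model=$3
  provider=$4
  body=$(printf '"input": "%s", "stream": true' "$input")
  # Continue an existing conversation only when a session id is supplied.
  if [ -n "$session_id" ]; then
    body="$body, \"session_id\": \"$session_id\""
  fi
  # The model is chosen per turn as model plus provider.
  if [ -n "$model" ]; then
    body="$body, \"model\": \"$model\", \"provider\": \"$provider\""
  fi
  printf '{ %s }' "$body"
}
```

Calling `turn_body "Summarize the memo." "$SESSION_ID" "" ""` would continue a conversation on the instance's default model, while supplying the last two arguments switches the model for that turn only.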

Building blocks

Snap these together to build your product. Most act on an instance; templates and billing live at the workspace.

Send a message

The core call: input in, an agentic reply out, on your instance’s own URL.

Streaming

Render text, reasoning, and tool activity live over Server-Sent Events.

Sessions

List conversations, fetch full history, continue or delete them.

Instances

Create, list, stop, start, restart, update, and delete agent computers.

Templates

The system catalog plus your own templates from any public amd64 image.

Instance URLs

Every instance answers at its own subdomain; extra ports get their own URLs.

Run commands

Run any shell command inside the instance from your backend.

Budgets

Cap each instance’s managed LLM, Brave search, and Composio spend.

Billing

A prepaid wallet, monthly compute per instance, managed usage at cost.

Errors

Two error envelopes, one habit: branch on code, display message.
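
Picking up the Streaming block above: Server-Sent Events arrive as text lines, with payloads on lines that start with `data:`. The filter below is a minimal sketch that extracts those payloads from a stream on stdin. It follows the generic SSE line format only; the event names and payload shapes Agent37 emits are not shown on this page, and `sse_data` is a hypothetical helper name.

```shell
# Print the payload of every SSE "data:" line read from stdin.
# Generic SSE handling only; it ignores event names, ids, and comments.
sse_data() {
  cr=$(printf '\r')
  while IFS= read -r line || [ -n "$line" ]; do
    line=${line%"$cr"}   # tolerate CRLF line endings
    case "$line" in
      "data: "*) printf '%s\n' "${line#data: }" ;;
      "data:"*)  printf '%s\n' "${line#data:}" ;;
    esac
  done
}
```

Piping the streamed call through it, as in `curl -N ... | sse_data`, would leave one payload per line for downstream parsing.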

Guides

Build a chat app

Give every user their own always-on agent. Where most teams start.

Example apps

Runnable examples. Clone, add your API key, npm start.

Building with an AI coding agent

Point your coding agent at the full documentation in one file and it can scaffold a working client:
https://www.agent37.com/docs/llms-full.txt
Looking for OpenClaw channel, model, or networking setup? Start at the OpenClaw overview.

Next steps

Quickstart

The same two calls, step by step, from API key to streamed reply.

Core concepts

The two planes, instances, sessions, responses, and how money flows.