Rate Limits#
Every Agent API endpoint belongs to a rate-limit class. Limits are enforced per agent over a rolling window. The live, authoritative state for your agent is always available from GET /v1/agent/self/usage (see the Endpoints catalog); the values below are the current defaults.
Classes#
| Class | Limit | Burst | Window | Applies to |
|---|---|---|---|---|
| read | 600 | 100 | 60s | GET reads (balances, operations, account, activity) |
| write | 60 | 20 | 60s | Mutations (payments, conversions, earn, borrow, withdrawals) |
| simulate | 120 | 30 | 60s | POST /v1/agent/simulate and other dry-run checks |
| webhook_management | 10 | 5 | 60s | All /v1/agent/webhooks routes: list, get, create/update/delete/rotate/test (only GET .../deliveries is charged as read) |
| bulk_write | 20 | 5 | 60s | Reserved for batched mutations |
The class for each endpoint is shown in the Endpoints reference and in the generated SDK, CLI, and MCP references.
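As a rough illustration of what these defaults mean for a client, the sketch below copies the table into a lookup and derives an even request spacing that stays within the per-window limit. The names RateLimitClass, DEFAULT_LIMITS, and minSpacingMs are hypothetical and not part of the SDK; the burst column is recorded but not interpreted, and the live values for your agent may differ from these defaults.

```typescript
// Hypothetical names for illustration; values copied from the defaults table.
type RateLimitClass =
  | 'read'
  | 'write'
  | 'simulate'
  | 'webhook_management'
  | 'bulk_write'

interface ClassLimit {
  limit: number
  burst: number
  windowSeconds: number
}

const DEFAULT_LIMITS: Record<RateLimitClass, ClassLimit> = {
  read: { limit: 600, burst: 100, windowSeconds: 60 },
  write: { limit: 60, burst: 20, windowSeconds: 60 },
  simulate: { limit: 120, burst: 30, windowSeconds: 60 },
  webhook_management: { limit: 10, burst: 5, windowSeconds: 60 },
  bulk_write: { limit: 20, burst: 5, windowSeconds: 60 },
}

// Even spacing between requests that keeps a steady stream at or under
// the window limit: window length divided by the allowed request count.
function minSpacingMs(cls: RateLimitClass): number {
  const { limit, windowSeconds } = DEFAULT_LIMITS[cls]
  return (windowSeconds * 1000) / limit
}
```

With the defaults, a steady stream of write calls would be spaced about one second apart, and read calls about 100 ms apart.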
Headers#
Every response carries the current window state:
X-RateLimit-Limit: <max requests in the window>
X-RateLimit-Remaining: <requests left>
X-RateLimit-Reset: <ISO 8601 reset time>
X-RateLimit-Window-Seconds: <window length>

When you exceed a class limit, the API additionally returns HTTP 429 with the rate_limited error code and a Retry-After header (seconds).
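A client that calls the API directly could read these headers into a typed value, along the lines of the sketch below. The RateLimitState type and the parseRateLimitHeaders function are hypothetical helpers, not SDK exports; the function accepts anything with a get(name) method, which is assumed to cover the standard fetch Headers object.

```typescript
// Hypothetical helper types; only the header names come from the docs above.
interface HeaderReader {
  get(name: string): string | null
}

interface RateLimitState {
  limit: number
  remaining: number
  resetAt: Date
  windowSeconds: number
}

// Returns null when any of the four headers is missing.
function parseRateLimitHeaders(headers: HeaderReader): RateLimitState | null {
  const limit = headers.get('X-RateLimit-Limit')
  const remaining = headers.get('X-RateLimit-Remaining')
  const reset = headers.get('X-RateLimit-Reset')
  const windowSeconds = headers.get('X-RateLimit-Window-Seconds')
  if (limit === null || remaining === null || reset === null || windowSeconds === null) {
    return null
  }
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resetAt: new Date(reset), // ISO 8601 reset time
    windowSeconds: Number(windowSeconds),
  }
}
```

A caller might slow down on its own when remaining gets close to zero, rather than waiting for a 429.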
Handling 429#
Back off for at least the Retry-After duration before retrying.
- Treat Retry-After as the minimum wait, and add jitter when running many agents.
- The error details include limit_count (the effective limit, a number) and attempted_count.
- The SDK surfaces this as HightopAgentSDKError with code === 'rate_limited'; the CLI exits non-zero and prints the error.
import { getAgentApiErrorByCode } from '@hightop/sdk'

// error is the value caught from a failed SDK call
const rateLimit = getAgentApiErrorByCode(error, 'rate_limited')
if (rateLimit) {
  // back off for Retry-After, then retry the same logical request with the same key
  console.log(rateLimit.details.limit_count)
}

A 429 itself creates no idempotency record (see Conventions), so there is nothing to "replay". Retrying the same logical request with the same idempotency key and body is still the right move: it avoids creating a duplicate if an earlier attempt did land.
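The retry guidance above can be sketched as a small helper. This is a minimal illustration, not SDK code: the Attempt result shape, backoffDelayMs, and retryOnRateLimit are hypothetical names, and the caller is assumed to extract the Retry-After value itself and to reuse the same idempotency key and body inside attemptOnce.

```typescript
// Hypothetical result shape for illustration; not the SDK's types.
type Attempt<T> =
  | { ok: true; value: T }
  | { ok: false; retryAfterSeconds: number }

// Retry-After is the minimum wait; random jitter is added on top so that
// many agents do not all retry at the same instant.
function backoffDelayMs(
  retryAfterSeconds: number,
  maxJitterMs: number = 1000,
  random: () => number = Math.random,
): number {
  return retryAfterSeconds * 1000 + Math.floor(random() * maxJitterMs)
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms))

// attemptOnce must send the same logical request each time
// (same idempotency key, same body).
async function retryOnRateLimit<T>(
  attemptOnce: () => Promise<Attempt<T>>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await attemptOnce()
    if (result.ok) return result.value
    if (attempt < maxAttempts) {
      await sleep(backoffDelayMs(result.retryAfterSeconds))
    }
  }
  throw new Error(`still rate limited after ${maxAttempts} attempts`)
}
```

Passing random and sleep as parameters keeps the helper deterministic under test; production callers can rely on the defaults.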
Reducing pressure#
- Prefer a single read poll loop with backoff over tight polling. For operation status, use the SDK operations.wait() helper or the CLI --wait flag, which poll efficiently.
- Batch related reads where an endpoint supports it, and use pagination cursors rather than re-fetching whole lists.
- Keep webhook management (a low-limit class) to setup and rotation, not steady-state traffic — subscribe once and let deliveries flow. See Webhooks.
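For reads that have no built-in wait helper, a poll loop with backoff might look like the sketch below. It assumes a plain exponential schedule with a cap; pollDelayMs and pollUntil are hypothetical names, the base and cap values are arbitrary, and the check callback stands in for whatever read your agent performs.

```typescript
// Delay before the next poll: doubles with each attempt (0-based), capped at maxMs.
function pollDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 10000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

const pause = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms))

// Calls check() until it returns a value other than undefined,
// waiting longer between polls each time.
async function pollUntil<T>(
  check: () => Promise<T | undefined>,
  maxPolls: number = 20,
  sleep: (ms: number) => Promise<void> = pause,
): Promise<T> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const result = await check()
    if (result !== undefined) return result
    if (attempt < maxPolls - 1) {
      await sleep(pollDelayMs(attempt))
    }
  }
  throw new Error(`condition not met after ${maxPolls} polls`)
}
```

Compared with polling at a fixed short interval, this spends far fewer read requests on long-running waits.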
Next#
- Going to Production — retries, idempotency, and error handling end to end
- Errors — the full stable error-code list
- Conventions — shared headers and idempotency rules
