

ACAAS rate limits requests per API key, in fixed windows. When you exhaust your quota for a window, further requests return 429 Too Many Requests until the window resets. This page covers how to read your current status, what the status ladder means, and how to back off gracefully.

How limits work

  • Per key. Each API key has its own quota. Sharing a key between processes shares its quota.
  • Per window. The window is a fixed interval, not a rolling one: when it elapses, your full quota is restored at once. The exact size depends on your plan; the demo key uses a one-minute window with 100 requests.
  • Per request, not per character. A 1-character request and a 10,000-character request both consume one slot.
  • /v1/health is free. It does not require auth and does not count against your quota.
  • /v1/rate-limits is free. Calling it does not consume a request from the window it reports on.
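The fixed-window accounting described above can be mirrored client-side. The sketch below is purely illustrative (the `FixedWindowBudget` name is hypothetical, and the server is always the authoritative source of truth); the limit and window size match the demo key.

```python
import time

class FixedWindowBudget:
    """Client-side mirror of a fixed-window quota.
    Illustrative only; the server's accounting is authoritative."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def try_consume(self) -> bool:
        now = time.monotonic()
        # A new window begins once the previous one has fully elapsed,
        # restoring the full quota at once.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1  # every request costs one slot, regardless of size
            return True
        return False        # exhausted: the server would answer 429

budget = FixedWindowBudget(limit=3, window_seconds=60.0)
print([budget.try_consume() for _ in range(4)])  # [True, True, True, False]
```

A mirror like this can only drift optimistically (clock skew, other processes sharing the key), so treat it as a hint and reconcile against the real endpoint described next.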

Read your current status

Call GET /v1/rate-limits to inspect your quota at any time without spending a request.
curl https://api.acaas.example.com/v1/rate-limits \
  -H "X-API-Key: demo"
Response
{
  "requests_remaining": 73,
  "limit": 100,
  "resets_in_seconds": 28,
  "status": "ok"
}
  • requests_remaining — slots left in the current window.
  • limit — total slots per window for this key.
  • resets_in_seconds — seconds until the window resets and requests_remaining returns to limit.
  • status — categorical health (ok, approaching_limit, at_limit).

The status ladder

The status field is a coarse signal designed for quick branching. Use it when you do not want to make a numeric decision in the client.
  • ok — Plenty of headroom (typically > 25% remaining). Carry on as normal.
  • approaching_limit — Below ~25% remaining; the next burst may exhaust quota. Slow your request rate and defer non-urgent calls.
  • at_limit — Quota exhausted; further requests will return 429. Stop sending and wait resets_in_seconds.
Treat approaching_limit as the moment to apply backpressure — not after you have already received a 429.
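The branching the ladder suggests can be sketched as a small helper. This is a sketch, not part of the API: `backoff_delay` is a hypothetical name, and the dictionary fields assume the /v1/rate-limits response shown above.

```python
def backoff_delay(status: dict) -> float:
    """Translate the coarse status ladder into a pause (in seconds)
    before the next request."""
    if status["status"] == "at_limit":
        # Quota exhausted: nothing to do but wait out the window.
        return float(status["resets_in_seconds"])
    if status["status"] == "approaching_limit":
        # Apply backpressure: spread the remaining slots across the window tail.
        return status["resets_in_seconds"] / max(status["requests_remaining"], 1)
    return 0.0  # ok: no pause needed
```

Calling time.sleep(backoff_delay(status)) before each request gives the same behavior as the proactive-pacing example later on this page.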

Handling 429 responses

When a request returns 429 Too Many Requests, the body is a simple detail string.
{ "detail": "Rate limit exceeded" }
Do not retry immediately. Call /v1/rate-limits to find out how long until the window resets, sleep that duration, then retry.
import time
import requests

BASE = "https://api.acaas.example.com"

def shout(text: str, *, api_key: str) -> str:
    headers = {"X-API-Key": api_key}
    resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})

    if resp.status_code == 429:
        # Rate limited: ask how long until the window resets, sleep, retry once.
        status = requests.get(f"{BASE}/v1/rate-limits", headers=headers).json()
        time.sleep(status["resets_in_seconds"])
        resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})

    resp.raise_for_status()
    return resp.json()["result"]

Proactive pacing

For high-volume workloads, do not wait for 429. Check the status before each batch and pace accordingly.
import time
import requests

BASE = "https://api.acaas.example.com"

def paced_shout_many(texts: list[str], *, api_key: str) -> list[str]:
    headers = {"X-API-Key": api_key}
    results = []

    for text in texts:
        status = requests.get(f"{BASE}/v1/rate-limits", headers=headers).json()

        if status["status"] == "at_limit":
            time.sleep(status["resets_in_seconds"])
        elif status["status"] == "approaching_limit":
            # Stretch remaining requests across the rest of the window.
            remaining = max(status["requests_remaining"], 1)
            time.sleep(status["resets_in_seconds"] / remaining)

        resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})
        resp.raise_for_status()
        results.append(resp.json()["result"])

    return results
The approaching_limit branch — dividing resets_in_seconds by requests_remaining — is a simple way to spread the tail of the window evenly without overshooting.

Best practices

  • Cache your status. When you have just made a request, you know requests_remaining dropped by exactly one. Avoid hammering /v1/rate-limits before every call.
  • Prefer one shared client. Creating a fresh HTTP session per call adds latency without changing your quota math. Reuse connections.
  • Surface limits in your UI. If end users drive request volume, show them when they are approaching quota — approaching_limit is a good trigger for an “easy there, friend” toast.
  • Log resets_in_seconds on 429. It is the single most useful field for diagnosing rate-limit issues after the fact.
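The first two practices combine naturally into a small caching wrapper. A minimal sketch, assuming the /v1/rate-limits response fields shown earlier (`CachedStatus` and its `fetch` parameter are hypothetical names, not part of the API):

```python
import time

class CachedStatus:
    """Keep the last /v1/rate-limits response locally, decrement it as
    requests go out, and refetch only after the cached window expires."""

    def __init__(self, fetch):
        self.fetch = fetch        # callable returning the status dict
        self.status = None
        self.expires_at = 0.0

    def remaining(self) -> int:
        now = time.monotonic()
        if self.status is None or now >= self.expires_at:
            # Cache miss or expired window: hit the free endpoint once.
            self.status = dict(self.fetch())
            self.expires_at = now + self.status["resets_in_seconds"]
        return self.status["requests_remaining"]

    def record_request(self) -> None:
        # Each request consumes exactly one slot, so update locally.
        if self.status is not None and self.status["requests_remaining"] > 0:
            self.status["requests_remaining"] -= 1
```

With this pattern, a shared client calls remaining() before a batch and record_request() after each send, and only touches /v1/rate-limits once per window.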

Next steps

Errors reference

Every status code, what it means, and how to recover.

Cookbook

A complete Python walkthrough using these patterns end-to-end.