

ACAAS rate limits requests per API key, in fixed windows. When you exhaust your quota for a window, further requests return 429 Too Many Requests until the window resets. This page covers how to read your current status, what the status ladder means, and how to back off gracefully.

How limits work

  • Per key. Each API key has its own quota. Sharing a key between processes shares its quota.
  • Per window. The window is a fixed interval, not a rolling one: when it elapses, your full quota is restored at once. The exact size depends on your plan; the demo key uses a one-minute window with 100 requests.
  • Per request, not per character. A 1-character request and a 10,000-character request both consume one slot.
  • /v1/health is free. It does not require auth and does not count against your quota.
  • /v1/rate-limits is free. Calling it does not consume a request from the window it reports on.
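The fixed-window accounting described above can be mirrored client-side. The sketch below is purely illustrative (the `FixedWindowBudget` name is hypothetical, and the server is always the authoritative source of truth); the limit and window size match the demo key.

```python
import time

class FixedWindowBudget:
    """Client-side mirror of a fixed-window quota.
    Illustrative only; the server's accounting is authoritative."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def try_consume(self) -> bool:
        now = time.monotonic()
        # A new window begins once the previous one has fully elapsed,
        # restoring the full quota at once.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1  # every request costs one slot, regardless of size
            return True
        return False        # exhausted: the server would answer 429

budget = FixedWindowBudget(limit=3, window_seconds=60.0)
print([budget.try_consume() for _ in range(4)])  # [True, True, True, False]
```

A mirror like this can only drift optimistically (clock skew, other processes sharing the key), so treat it as a hint and reconcile against the real endpoint described next.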

Read your current status

Call GET /v1/rate-limits to inspect your quota at any time without spending a request.
curl https://api.acaas.example.com/v1/rate-limits \
  -H "X-API-Key: demo"
Response
{
  "requests_remaining": 73,
  "limit": 100,
  "resets_in_seconds": 28,
  "status": "ok"
}
  • requests_remaining — slots left in the current window.
  • limit — total slots per window for this key.
  • resets_in_seconds — seconds until the window resets and requests_remaining returns to limit.
  • status — categorical health (ok, approaching_limit, at_limit).

The status ladder

The status field is a coarse signal designed for quick branching. Use it when you do not want to make a numeric decision in the client.
  • ok — Plenty of headroom (typically > 25% remaining). Carry on as normal.
  • approaching_limit — Below ~25% remaining; the next burst may exhaust quota. Slow your request rate and defer non-urgent calls.
  • at_limit — Quota exhausted; further requests will return 429. Stop sending and wait resets_in_seconds.
Treat approaching_limit as the moment to apply backpressure — not after you have already received a 429.
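The branching the ladder suggests can be sketched as a small helper. This is a sketch, not part of the API: `backoff_delay` is a hypothetical name, and the dictionary fields assume the /v1/rate-limits response shown above.

```python
def backoff_delay(status: dict) -> float:
    """Translate the coarse status ladder into a pause (in seconds)
    before the next request."""
    if status["status"] == "at_limit":
        # Quota exhausted: nothing to do but wait out the window.
        return float(status["resets_in_seconds"])
    if status["status"] == "approaching_limit":
        # Apply backpressure: spread the remaining slots across the window tail.
        return status["resets_in_seconds"] / max(status["requests_remaining"], 1)
    return 0.0  # ok: no pause needed
```

Calling time.sleep(backoff_delay(status)) before each request gives the same behavior as the proactive-pacing example later on this page.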

Handling 429 responses

When a request returns 429 Too Many Requests, the body is a simple detail string.
{ "detail": "Rate limit exceeded" }
Do not retry immediately. Call /v1/rate-limits to find out how long until the window resets, sleep that duration, then retry.
import time
import requests

BASE = "https://api.acaas.example.com"

def shout(text: str, *, api_key: str) -> str:
    headers = {"X-API-Key": api_key}
    resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})

    if resp.status_code == 429:
        # Rate limited: ask how long until the window resets, sleep, retry once.
        status = requests.get(f"{BASE}/v1/rate-limits", headers=headers).json()
        time.sleep(status["resets_in_seconds"])
        resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})

    resp.raise_for_status()
    return resp.json()["result"]

Proactive pacing

For high-volume workloads, do not wait for 429. Check the status before each batch and pace accordingly.
import time
import requests

BASE = "https://api.acaas.example.com"

def paced_shout_many(texts: list[str], *, api_key: str) -> list[str]:
    headers = {"X-API-Key": api_key}
    results = []

    for text in texts:
        status = requests.get(f"{BASE}/v1/rate-limits", headers=headers).json()

        if status["status"] == "at_limit":
            time.sleep(status["resets_in_seconds"])
        elif status["status"] == "approaching_limit":
            # Stretch remaining requests across the rest of the window.
            remaining = max(status["requests_remaining"], 1)
            time.sleep(status["resets_in_seconds"] / remaining)

        resp = requests.post(f"{BASE}/v1/shout", headers=headers, json={"text": text})
        resp.raise_for_status()
        results.append(resp.json()["result"])

    return results
The approaching_limit branch — dividing resets_in_seconds by requests_remaining — is a simple way to spread the tail of the window evenly without overshooting.

Best practices

  • Cache your status. When you have just made a request, you know requests_remaining dropped by exactly one. Avoid hammering /v1/rate-limits before every call.
  • Prefer one shared client. Creating a fresh HTTP session per call adds latency without changing your quota math. Reuse connections.
  • Surface limits in your UI. If end users drive request volume, show them when they are approaching quota — approaching_limit is a good trigger for an “easy there, friend” toast.
  • Log resets_in_seconds on 429. It is the single most useful field for diagnosing rate-limit issues after the fact.
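The first two practices combine naturally into a small caching wrapper. A minimal sketch, assuming the /v1/rate-limits response fields shown earlier (`CachedStatus` and its `fetch` parameter are hypothetical names, not part of the API):

```python
import time

class CachedStatus:
    """Keep the last /v1/rate-limits response locally, decrement it as
    requests go out, and refetch only after the cached window expires."""

    def __init__(self, fetch):
        self.fetch = fetch        # callable returning the status dict
        self.status = None
        self.expires_at = 0.0

    def remaining(self) -> int:
        now = time.monotonic()
        if self.status is None or now >= self.expires_at:
            # Cache miss or expired window: hit the free endpoint once.
            self.status = dict(self.fetch())
            self.expires_at = now + self.status["resets_in_seconds"]
        return self.status["requests_remaining"]

    def record_request(self) -> None:
        # Each request consumes exactly one slot, so update locally.
        if self.status is not None and self.status["requests_remaining"] > 0:
            self.status["requests_remaining"] -= 1
```

With this pattern, a shared client calls remaining() before a batch and record_request() after each send, and only touches /v1/rate-limits once per window.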

Next steps

Errors reference

Every status code, what it means, and how to recover.

Cookbook

A complete Python walkthrough using these patterns end-to-end.