Rate limits

Know exactly what a 429 means here, why authentication surfaces fail closed, and how to retry so your integration recovers cleanly.

The model

Rate limiting is atomic (no check-then-write races) and identity-aware. Two buckets apply simultaneously:

Per-identity bucket. Requests are counted against the acting identity — the SCIM token for provisioning traffic, or the account-plus-IP pair on authentication surfaces. One noisy tenant cannot starve another.
Per-IP ceiling. A separate, higher cap on total requests from a single IP address. It stops fan-out abuse from one host even when that host spreads its load across many identities. Conversely, because each identity has its own bucket, many users behind one corporate NAT do not exhaust each other.
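To make the two-bucket idea concrete, here is a minimal in-memory sketch. It is not the service's implementation: the class name `TwoBucketLimiter`, the fixed-window counting, and the single lock are all assumptions chosen for illustration. What it shows is that the check and the increment happen in one critical section, and that a request must pass both the identity bucket and the IP ceiling.

```python
import threading
import time


class TwoBucketLimiter:
    """Illustrative fixed-window limiter (hypothetical, not the real service code)."""

    def __init__(self, identity_limit, ip_limit, window_seconds, clock=time.monotonic):
        self.identity_limit = identity_limit
        self.ip_limit = ip_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._lock = threading.Lock()
        # (kind, key, window_index) -> count. Old windows are never evicted here;
        # a real limiter would expire them.
        self._counts = {}

    def allow(self, identity, ip):
        window = int(self.clock() // self.window_seconds)
        id_key = ("identity", identity, window)
        ip_key = ("ip", ip, window)
        with self._lock:
            # Check and increment share one critical section, so two concurrent
            # requests cannot both pass a stale check.
            if self._counts.get(id_key, 0) >= self.identity_limit:
                return False
            if self._counts.get(ip_key, 0) >= self.ip_limit:
                return False
            self._counts[id_key] = self._counts.get(id_key, 0) + 1
            self._counts[ip_key] = self._counts.get(ip_key, 0) + 1
            return True
```

With an identity limit of 2 and an IP ceiling of 3, one identity is stopped after two requests while a second identity on the same IP still gets through, until the IP as a whole reaches three.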

Fail-closed on authentication surfaces

Behavior when the limiter itself cannot be evaluated differs by surface:

Authentication endpoints fail closed. If the limit cannot be checked, the request is rejected. The service gives up availability before it gives up brute-force protection.
Non-authentication surfaces fail open, biased toward availability. SCIM in particular tolerates this because identity providers re-list on their next sync sweep, so a transiently accepted request is reconciled.
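The difference between the two policies can be sketched in a few lines. Everything named here (`LimiterUnavailable`, `admit`, the `"auth"` surface label) is hypothetical; the sketch only illustrates the decision taken when the limiter cannot answer.

```python
class LimiterUnavailable(Exception):
    """Raised by the hypothetical limiter backend when it cannot be evaluated."""


def admit(check_limit, surface):
    """Return True if the request may proceed.

    check_limit: zero-argument callable that returns True (under the limit),
                 returns False (over the limit), or raises LimiterUnavailable.
    surface:     "auth" for authentication endpoints, any other string otherwise.
    """
    try:
        return check_limit()
    except LimiterUnavailable:
        # Authentication fails closed (reject); every other surface fails open (accept).
        return surface != "auth"
```

When the limiter is healthy, both surfaces simply follow its answer; the policies diverge only on limiter failure.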

Current limits

Surface	Limit
SCIM (per token)	1,200 requests per hour
Per-IP infrastructure ceiling	Applies in addition to per-identity buckets

Limits are subject to tuning; treat a 429 as authoritative rather than hard-coding thresholds.

What a 429 looks like

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60

{
  "error": "Rate limit exceeded. Please try again later.",
  "retryAfter": 60,
  "message": "Too many requests. Please wait 60 seconds before trying again."
}

The Retry-After header (and the matching retryAfter JSON field) is in seconds.
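A client can read the wait time from either place. The helper below is a sketch, assuming you already have the status code, a header mapping, and the raw body text from your HTTP library; the function name `retry_after_seconds` is made up for this example. It prefers the header and falls back to the `retryAfter` JSON field.

```python
import json
import math


def retry_after_seconds(status, headers, body_text):
    """Return the wait in seconds advertised by a 429, or None if none is usable."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        # No header: fall back to the retryAfter field in the JSON body.
        try:
            payload = json.loads(body_text)
        except (TypeError, ValueError):
            return None
        raw = payload.get("retryAfter") if isinstance(payload, dict) else None
    try:
        seconds = float(raw)
    except (TypeError, ValueError):
        # Missing or non-numeric values are treated as "no hint".
        return None
    if not math.isfinite(seconds) or seconds < 0:
        return None
    return seconds
```

Returning `None` for anything unparseable lets the caller fall through to its own backoff schedule instead of failing.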

Retry guidance

Honor Retry-After first

If the response carries Retry-After, wait at least that long. Retrying earlier only adds more requests against a limit you have already exhausted.

Back off exponentially with jitter

Without a Retry-After value, use exponential backoff (for example 1s, 2s, 4s, 8s…) with random jitter so concurrent clients do not retry in lockstep.
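The two rules above combine naturally into one delay function and a bounded retry loop. This is a sketch under assumptions: the names `next_delay` and `call_with_retries`, the 1-second base, the 60-second cap, and the five-attempt limit are illustrative choices, and `send` stands in for whatever function performs your request and returns a status plus an optional Retry-After value in seconds.

```python
import random
import time


def next_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before the next try; `attempt` is 0 for the first retry."""
    if retry_after is not None:
        # The server's hint wins: never wait less than it asks for.
        return float(retry_after)
    # Exponential schedule 1s, 2s, 4s, 8s... capped at `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter: half of the delay is fixed, the other half is random, so
    # concurrent clients spread out instead of retrying in lockstep.
    return ceiling / 2 + rng() * ceiling / 2


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send: zero-argument callable returning (status, retry_after_or_None).
    Returns the last status seen. The loop is bounded by max_attempts.
    """
    status = None
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt + 1 < max_attempts:
            sleep(next_delay(attempt, retry_after))
    return status
```

Keeping the loop bounded matters most on authentication endpoints, where an unbounded loop keeps hitting a surface that rejects requests whenever the limiter cannot be checked.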

Cap concurrency at the source

For SCIM, avoid running a bulk "push now" while a backfill or initial sync is in progress — that is the common way to trip the 1,200/hour token limit. The limit relaxes at the next hourly window.
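For custom scripts, one complementary way to stay under the token limit is to pace requests on the client. The `Pacer` class below is a hypothetical sketch, not something the service provides: it spaces calls evenly so that a single-threaded job never exceeds a chosen hourly budget. Since limits are subject to tuning, the budget is a parameter rather than a constant, and the sketch is not thread-safe.

```python
import time


class Pacer:
    """Spaces calls evenly so one sequential client stays under an hourly budget."""

    def __init__(self, max_per_hour, clock=time.monotonic, sleep=time.sleep):
        self.interval = 3600.0 / max_per_hour  # seconds between requests
        self.clock = clock
        self.sleep = sleep
        self._next_slot = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request slot is available, then reserve it."""
        now = self.clock()
        if self._next_slot is not None and now < self._next_slot:
            self.sleep(self._next_slot - now)
            now = self._next_slot
        self._next_slot = now + self.interval
```

Calling `pacer.wait()` before each request in a backfill loop keeps that job at a steady rate; choosing a budget below the published limit leaves headroom for other traffic on the same token.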

Retry only what is safe

A 429 is always safe to retry. For a non-idempotent operation that ended in an ambiguous network failure, prefer a read-back (for example, a SCIM filter query) over a blind re-POST, which can produce a 409 uniqueness conflict.
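The read-back pattern can be sketched as follows. The two callables passed in are hypothetical wrappers around your own HTTP client, and the example assumes the resource is a user matched on the standard SCIM `userName` attribute; adjust the attribute to whatever uniquely identifies the resource in your integration.

```python
def scim_user_filter(user_name):
    """Build a SCIM filter expression for an exact userName match."""
    # Escape backslashes and double quotes so the value stays one quoted string.
    escaped = user_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'userName eq "{escaped}"'


def create_user_safely(user_name, find_by_user_name, create):
    """Create a user unless an earlier, ambiguous attempt already succeeded.

    find_by_user_name(user_name) -> existing resource dict, or None
    create(user_name)            -> newly created resource dict
    """
    existing = find_by_user_name(user_name)
    if existing is not None:
        # The earlier POST went through; reuse its result instead of re-POSTing.
        return existing
    return create(user_name)
```

The filter string would typically be URL-encoded and sent as the `filter` query parameter of a list request; the read-back costs one extra request but avoids the 409 path entirely.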

Note

Identity providers already implement compliant retry behavior for SCIM. The guidance above matters most for custom scripts and middleware you build yourself.

Your client honors Retry-After, backs off with jitter when it is absent, and never runs unbounded retry loops against authentication endpoints.

Updated 12 June 2026 · 3 min read
