throttle

Limit the rate of module calls.

Quick Reference

Parameter	Type	Description
`throttle`	Rate	Maximum calls per time period

Syntax:

result = ExternalAPI(input) with throttle: 100/1min

Syntax

result = Module(args) with throttle: <count>/<duration>

Type: Rate (count per duration)

Description

The throttle option limits how frequently a module can be called. Calls that exceed the rate limit are queued and executed when capacity becomes available. This is useful for respecting external API rate limits or controlling resource usage.

The rate is specified as count/duration, where count is the maximum number of calls allowed in the given duration window.

Token bucket allows bursting

The throttle uses a token bucket algorithm, which means up to count calls can execute instantly (the "burst"), then subsequent calls are rate-limited. This is ideal for APIs that allow short bursts but enforce average rate limits.

Throttle vs concurrency

Use throttle to limit calls per time period (rate limiting). Use concurrency to limit simultaneous calls (parallelism). The two can be combined: throttle: 100/1min, concurrency: 10 allows at most 10 calls in parallel and at most 100 calls in total per minute.
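To see which of the two limits actually binds, it can help to work through the arithmetic. The sketch below is only an illustration: the function name sustained_calls_per_minute and the call_latency_s parameter are hypothetical, and it assumes a steady state in which every call takes the same time.

```python
def sustained_calls_per_minute(
    count: int, duration_s: float, concurrency: int, call_latency_s: float
) -> float:
    """Upper bound on steady-state throughput under both limits (illustrative)."""
    # The throttle allows `count` calls per `duration_s` seconds.
    throttle_limit = count * 60.0 / duration_s
    # With `concurrency` slots, each slot finishes one call every `call_latency_s` seconds.
    concurrency_limit = concurrency * 60.0 / call_latency_s
    # The tighter of the two limits determines the sustained rate.
    return min(throttle_limit, concurrency_limit)


# throttle: 100/1min, concurrency: 10, fast calls (1 s each): the throttle binds.
print(sustained_calls_per_minute(100, 60.0, 10, 1.0))
# Same options, slow calls (12 s each): concurrency binds.
print(sustained_calls_per_minute(100, 60.0, 10, 12.0))
```

Under these assumptions, fast calls are capped by the throttle at 100 per minute, while 12-second calls are capped by concurrency at 50 per minute.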

Examples

API Rate Limiting

response = ExternalApi(request) with throttle: 100/1min

Maximum 100 calls per minute to the external API.

Per-Second Limiting

result = FastService(data) with throttle: 10/1s

Maximum 10 calls per second.

Hourly Limiting

report = GenerateReport(params) with throttle: 1000/1h

Maximum 1000 reports per hour.

Combined with Retry

response = RateLimitedApi(request) with
    throttle: 50/1min,
    retry: 3,
    delay: 1s,
    backoff: exponential

Rate limit calls and retry on failure with backoff.

Behavior

The throttle uses a token bucket algorithm:

Each duration period, count tokens are available
Each call consumes one token
If tokens are available:
  - Consume a token
  - Execute immediately
If no tokens are available:
  - Wait until a token becomes available
  - Then execute

This allows bursting up to count calls instantly, with sustained rate averaging to count/duration.
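The behavior described above can be sketched in a few lines of Python. This is not the runtime's actual implementation: the TokenBucket class and its reserve method are hypothetical names, and the sketch assumes tokens refill continuously and that the caller passes in the current time, which keeps the example deterministic.

```python
class TokenBucket:
    """Illustrative token bucket: `count` tokens refilled evenly over `duration_s`."""

    def __init__(self, count: int, duration_s: float, start: float = 0.0) -> None:
        self.count = count
        self.duration_s = duration_s
        self.tokens = float(count)  # start full, so a burst of `count` calls is allowed
        self.last = start           # time of the most recent refill

    def reserve(self, now: float) -> float:
        """Consume one token and return how many seconds the call must wait."""
        # Refill in proportion to elapsed time, capped at the bucket size.
        elapsed = now - self.last
        refill = elapsed * self.count / self.duration_s
        self.tokens = min(float(self.count), self.tokens + refill)
        self.last = now
        # Consume a token; a negative balance means the call is queued.
        self.tokens -= 1.0
        if self.tokens >= 0:
            return 0.0
        # Time needed for the refill to bring the balance back to zero.
        return -self.tokens * self.duration_s / self.count


bucket = TokenBucket(count=3, duration_s=3.0)
print([bucket.reserve(now=0.0) for _ in range(5)])
```

With a rate equivalent to 3/3s, five calls arriving at the same instant get waits of 0, 0, 0, 1 and 2 seconds: the first three form the burst, and the rest are spaced at the sustained rate.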

Rate Format

<count>/<duration>

Examples:
  100/1min   - 100 per minute
  10/1s      - 10 per second
  1000/1h    - 1000 per hour
  50/30s     - 50 per 30 seconds
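As an illustration of how a rate string breaks down, here is a small Python sketch that converts it into a count and a duration in seconds. The parse_rate helper is hypothetical and not part of the language; it handles only the units shown on this page (s, min, h), and whether other units are accepted is not covered here.

```python
import re

# Seconds per unit, limited to the units that appear in the examples above.
_UNIT_SECONDS = {"s": 1.0, "min": 60.0, "h": 3600.0}
_RATE_PATTERN = re.compile(r"(\d+)/(\d+)(s|min|h)")


def parse_rate(text: str) -> tuple[int, float]:
    """Return (count, duration in seconds) for a '<count>/<duration>' string."""
    match = _RATE_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid rate: {text!r}")
    count = int(match.group(1))
    amount = int(match.group(2))
    if count == 0 or amount == 0:
        raise ValueError(f"count and duration must be positive: {text!r}")
    return count, amount * _UNIT_SECONDS[match.group(3)]


count, duration_s = parse_rate("100/1min")
# Average spacing between calls at the sustained rate.
print(duration_s / count)
```

For 100/1min this gives 100 calls per 60 seconds, or one call every 0.6 seconds on average once the burst is used up.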

Related Options

concurrency - Limit parallel executions
retry - Retry after rate limit wait

Per-Module Limiting

Rate limits are tracked per module name. Different modules have independent limits:

# These have separate rate limits
a = ServiceA(x) with throttle: 10/1s
b = ServiceB(y) with throttle: 10/1s
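One way to picture per-module tracking is limiter state stored in a map keyed by module name. The Python sketch below is a simplified assumption, not the runtime's design: PerModuleThrottle and try_call are hypothetical names, and token refill is left out so that only the independence of the two budgets is shown.

```python
class PerModuleThrottle:
    """Illustrative registry: one independent token budget per module name."""

    def __init__(self) -> None:
        self._remaining: dict[str, int] = {}

    def try_call(self, module: str, count: int) -> bool:
        """Consume a token for `module`; return False when its budget is spent."""
        # First use of a module name creates a fresh budget of `count` tokens.
        remaining = self._remaining.setdefault(module, count)
        if remaining == 0:
            return False
        self._remaining[module] = remaining - 1
        return True


throttle = PerModuleThrottle()
# Exhaust ServiceA's budget; ServiceB is unaffected.
print([throttle.try_call("ServiceA", 2) for _ in range(3)])
print(throttle.try_call("ServiceB", 2))
```

Exhausting the budget for ServiceA leaves ServiceB free to run, which mirrors the separate limits in the example above.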

Best Practices

Match throttle rate to external API limits (with margin)
Use throttle for third-party APIs with documented rate limits
Combine with retry for handling rate limit errors
Consider using per-user or per-tenant throttling for fairness
Monitor throttle wait times to detect capacity issues
