
Rate Limiting

Limit the rate and concurrency of module calls to respect API rate limits and protect resources.

Use Case

You call external APIs with rate limits (e.g., 5 requests/second) or run resource-intensive tasks that should not all execute at once.

The Pipeline

# rate-limiting-demo.cst

@example("request-1")
in req1: String

@example("request-2")
in req2: String

@example("request-3")
in req3: String

@example("request-4")
in req4: String

@example("request-5")
in req5: String

# Throttle: limit to 5 requests per second
resp1 = RateLimitedApi(req1) with throttle: 5/1s
resp2 = RateLimitedApi(req2) with throttle: 5/1s
resp3 = RateLimitedApi(req3) with throttle: 5/1s
resp4 = RateLimitedApi(req4) with throttle: 5/1s
resp5 = RateLimitedApi(req5) with throttle: 5/1s

# Concurrency: at most 2 running at the same time
@example("task-A")
in taskA: String

@example("task-B")
in taskB: String

@example("task-C")
in taskC: String

resultA = ResourceIntensiveTask(taskA) with concurrency: 2
resultB = ResourceIntensiveTask(taskB) with concurrency: 2
resultC = ResourceIntensiveTask(taskC) with concurrency: 2

# Combine throttle and concurrency
@example("heavy-task")
in heavyTask: String

heavyResult = ResourceIntensiveTask(heavyTask) with
    throttle: 10/1s,
    concurrency: 3

out resp1
out resp2
out resp3
out resp4
out resp5
out resultA
out resultB
out resultC
out heavyResult

Explanation

| Option | Syntax | Purpose |
| --- | --- | --- |
| throttle | throttle: 5/1s | Maximum N calls per time window |
| concurrency | concurrency: 2 | Maximum N calls running simultaneously |

Throttle vs Concurrency

| Aspect | Throttle | Concurrency |
| --- | --- | --- |
| Controls | Rate (calls per time period) | Parallelism (simultaneous calls) |
| Use case | External API rate limits | CPU/memory-bounded tasks |
| Scope | Per module name | Per module name |

Throttle limits are shared across all calls to the same module. If you have 5 calls to RateLimitedApi with throttle: 5/1s, at most 5 execute per second regardless of how many are queued.

Running the Example

Input

{
"req1": "request-1",
"req2": "request-2",
"req3": "request-3",
"req4": "request-4",
"req5": "request-5",
"taskA": "task-A",
"taskB": "task-B",
"taskC": "task-C",
"heavyTask": "heavy-task"
}

Output

{
"resp1": "rate limited response: request-1",
"resp2": "rate limited response: request-2",
"resp3": "rate limited response: request-3",
"resp4": "rate limited response: request-4",
"resp5": "rate limited response: request-5",
"resultA": "intensive result: task-A",
"resultB": "intensive result: task-B",
"resultC": "intensive result: task-C",
"heavyResult": "intensive result: heavy-task"
}

Variations

Rate-limited with retry

in request: String
in default: String

result = ExternalApi(request) with
    throttle: 10/1s,
    retry: 3,
    delay: 200ms,
    fallback: default

Tip: The throttle option uses a token bucket algorithm. This means short bursts above the rate are allowed as long as the average rate stays within limits. For strict rate limiting, combine with concurrency to cap simultaneous calls.
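The burst behaviour of a token bucket can be seen in a small Python sketch. This is a generic, textbook-style token bucket and not the engine's code. The `TokenBucket` class is hypothetical, the bucket capacity is assumed to equal the call count in the throttle value, and the sketch rejects a call when no token is available, whereas the engine queues it. Time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.last = now

    def try_acquire(self, now: float) -> bool:
        # Refill for the elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Roughly `throttle: 5/1s`, with an assumed capacity of 5 tokens.
bucket = TokenBucket(rate=5, capacity=5)

# Seven calls arrive at the same instant: the bucket absorbs a burst of 5.
burst = [bucket.try_acquire(now=0.0) for _ in range(7)]

# Half a second later 2.5 tokens have refilled: two calls pass, a third does not.
later = [bucket.try_acquire(now=0.5) for _ in range(3)]
```

Here `burst` holds five `True` values followed by two `False` values, and `later` is `[True, True, False]`: the burst goes through immediately, after which calls are admitted only as fast as tokens refill.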

Warning: Setting burst size too high can overwhelm external services during traffic spikes. If an API has a strict rate limit (e.g., 10/second with no burst allowance), use a lower throttle value like 8/1s to leave headroom for retries.

Best Practices

  1. Match throttle to the API's rate limit: check the external service's documentation for its limits
  2. Use concurrency for CPU-bound work: prevent resource exhaustion from too many parallel computations
  3. Combine both for heavy API calls: throttle for rate, concurrency for parallel connections
  4. Same settings per module: all calls to the same module should use the same throttle/concurrency values for consistent behavior
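The last practice can be checked mechanically. The Python sketch below scans pipeline text for calls of the form `Module(args) with key: value, ...` and reports modules whose call sites disagree on throttle or concurrency. It is a rough lint built only from the call syntax shown on this page, not a real parser for the language; `inconsistent_modules` and the sample pipeline are hypothetical.

```python
import re
from collections import defaultdict

# Assumed call shape, taken from the examples on this page:
#   name = Module(args) with key: value, key: value
# Options may continue on following lines after a trailing comma.
CALL = re.compile(r"(\w+)\([^)]*\)\s+with\s+((?:\w+:\s*[^,\n]+(?:,\s*)?)+)")


def inconsistent_modules(source: str) -> dict[str, set[tuple]]:
    """Return modules whose call sites use differing throttle/concurrency values."""
    seen = defaultdict(set)
    for module, options in CALL.findall(source):
        pairs = {
            key.strip(): value.strip()
            for key, value in (
                opt.split(":", 1) for opt in options.split(",") if opt.strip()
            )
        }
        seen[module].add((pairs.get("throttle"), pairs.get("concurrency")))
    return {name: combos for name, combos in seen.items() if len(combos) > 1}


sample = """
a = ExternalApi(x) with throttle: 10/1s
b = ExternalApi(y) with throttle: 5/1s
c = Worker(z) with
    throttle: 10/1s,
    concurrency: 3
d = Worker(w) with throttle: 10/1s, concurrency: 3
"""
report = inconsistent_modules(sample)
```

In this sample `ExternalApi` is reported because its two call sites use different throttle values, while `Worker` is not, since both call sites agree.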