The parallelism limit controls the maximum number of calls that can be executed concurrently. Unlike rate limiting, which caps calls per time window, parallelism caps how many calls are in flight at once, using a token-based system.
Configure Parallelism
import { Client } from "@upstash/workflow";

const client = new Client({ token: "<QSTASH_TOKEN>" })

const { workflowRunId } = await client.trigger({
  url: "https://<YOUR_WORKFLOW_ENDPOINT>/<YOUR-WORKFLOW-ROUTE>",
  flowControl: {
    key: "user-signup",
    parallelism: 10,
  }
})
Example: If parallelism = 3, at most 3 requests can run concurrently. When tokens are available, each request acquires one and starts execution.
When all tokens are in use, additional requests do not fail; they are queued in a waitlist.
A step in the waitlist waits until a running step completes and hands off its token to a pending request.
Token handoff does not guarantee strict ordering. A later request in the waitlist may acquire a token before an earlier one.
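The token mechanics described above can be modeled with a small in-process semaphore. The sketch below is purely illustrative and is not how Upstash implements flow control. `TokenPool` and `runWithLimit` are hypothetical names, and the waitlist here is served first-in, first-out for simplicity, whereas the real system makes no ordering guarantee.

```typescript
// Illustrative model of token-based parallelism (hypothetical, not the Upstash implementation).
class TokenPool {
  private available: number;
  // Pending requests, each represented by the callback that lets it proceed.
  private waitlist: Array<() => void> = [];

  constructor(parallelism: number) {
    this.available = parallelism;
  }

  // Take a token if one is free; otherwise join the waitlist instead of failing.
  acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      this.waitlist.push(resolve);
    });
  }

  // Hand the token directly to a waiting request, or return it to the pool.
  // Assumption: FIFO handoff, chosen only to keep the sketch short.
  release(): void {
    const next = this.waitlist.shift();
    if (next) {
      next();
    } else {
      this.available += 1;
    }
  }

  get free(): number {
    return this.available;
  }

  get waiting(): number {
    return this.waitlist.length;
  }
}

// Run a task while holding a token; the token is released even if the task throws.
async function runWithLimit<T>(pool: TokenPool, task: () => Promise<T>): Promise<T> {
  await pool.acquire();
  try {
    return await task();
  } finally {
    pool.release();
  }
}
```

With `new TokenPool(3)`, the first three calls to `runWithLimit` start immediately, and any further calls wait until one of the running tasks finishes and passes its token on.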