Flow Control enables you to limit the number of calls made to your workflow by delaying the delivery.

There are two main use cases for Flow-Control in Workflow:

  1. Limiting the Workflow Environment: Controlling the execution environment.
  2. Limiting External API Calls: Preventing excessive requests to external services.

There two parameters you can configure to achieve the desired behavior:

  • Rate Per Second Limit: Defines the maximum number of calls per second. Calls within the same FlowControl key contribute to the rate count. Instead of rejecting excess calls, QStash delays them for execution in later seconds, respecting the limit.
  • Parallelism Limit: This controls the number of concurrent executions. Unlike rate limiting, execution duration matters. At no time will there be more than the specified number of active calls. If the limit is reached, other publishes will wait until a slot is available and, once one finishes, they will continue.

Using Rate and Parallelism Together: Both parameters can be combined. For example, with a rate of 10 per second and parallelism of 20, if each request takes a minute to complete, QStash will trigger 10 calls in the first second and another 10 in the next. Since none of them will have finished, the system will wait until one completes before triggering another.

For the FlowControl, you need to choose a key first. This key is used to count the number of calls made to your endpoint.

The rate/parallelism limits are not applied per url, they are applied per Flow-Control-Key.

There are not limits to number of keys you can use.

Limiting the Workflow Environment

To limit the execution environment, you need to configure both the serve and trigger methods. When configured, all the steps of the workflow will respect the limits. Due to the nature of the Workflow SDK, QStash calls the serve method multiple times. This means that to stay within the limits of the deployed environments, the given rate will be applied to all calls going from QStash servers to the serve method.

Note that if there are multiple Workflows running in the same environment, their steps can interleave, but overall rate and parallelism limits will be respected if they share the same flowControl key.

  • In the serve method:
export const { POST } = serve<string>(
  async (context) => {
    await context.run("step-1", async () => {
      return someWork();
    });
  },
  {
    flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 }
  }
);

// TODO links broken For more details, see the flowControl documentation under serve parameters.

  • In the trigger method:
import { Client } from "@upstash/workflow";

const client = new Client({ token: "<QStash_TOKEN>" });
const { workflowRunId } = await client.trigger({
  url: "https://workflow-endpoint.com",
  body: "hello there!",
  flowControl: { key: "app1", parallelism: 3, ratePerSecond: 10 }
});

For more detailso trigger, see the documentation here.

Limiting External API Calls

To limit requests to an external API, use context.call:

import { serve } from "@upstash/workflow/nextjs";

export const { POST } = serve<{ topic: string }>(async (context) => {
  const request = context.requestPayload;

  const response = await context.call(
    "generate-long-essay", 
    {
      url: "https://api.openai.com/v1/chat/completions",
      method: "POST",
      body: {/*****/},
      flowControl: { key: "opani-call", parallelism: 3, ratePerSecond: 10 }
    }
  );
});

For more details, see the documentation here.