Testing Individual StepFunction States


AWS StepFunctions is a fantastic service for orchestrating complex workflows. But testing StepFunctions has always been... tricky.

You could execute the entire state machine, observing side effects and final outcomes, but this approach often feels like using a sledgehammer for a task that needs a scalpel. It's effective for smaller workflows but quickly becomes unwieldy with more complex ones.

Unit testing Lambda functions or other compute components can cover the core logic of a task, but this doesn't cover data transformations or control flows between states.

Local emulation of StepFunctions is another approach, but it often leads to IAM access inconsistencies and other headaches common when emulating cloud services.

While these methods are useful, each comes with drawbacks. The risk is that the beautiful state machine you wrote to handle that one complicated payment workflow becomes just another mess of logic that is hard to maintain or evolve. This is particularly ironic considering that many state machines are built explicitly to untangle complex logic.

Did you know? While StepFunctions is the name of the AWS Service, a workflow in StepFunctions is called a state machine. Each step in a workflow is called a state, and a task state represents a state where another AWS service performs the work.

Right before re:Invent 2023, however, AWS released a new capability to test individual states in a state machine.

[Screenshot: the test feature in the StepFunction console]

This feature is available in the StepFunction console, but, more importantly, it also allows us to write isolated integration tests in code. These can test output, data transformations, and control flow for each individual state in our state machines - all without the hassle of trying to force the entire state machine to go down a particular route to validate the same thing.

Let's take a look at how we can use it.

Integration Tests

First, assume we have a small state machine that upgrades a user account, but only if they have enough credits to cover the cost.
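The definition itself isn't shown in this post, so here is a rough sketch of what such a state machine might look like in Amazon States Language. Only the state names CheckCreditBalance and Choice come from the tests further down; the Lambda ARN, the ResultSelector, the UpgradeAccount and InsufficientCredits states, and the choice rule are all assumptions made for illustration.

```typescript
// Hypothetical definition, for illustration only. The state names
// CheckCreditBalance and Choice appear later in this post; everything
// else (ARN, ResultSelector, UpgradeAccount, InsufficientCredits, the
// choice rule) is assumed.
const definition = {
  StartAt: 'CheckCreditBalance',
  States: {
    CheckCreditBalance: {
      Type: 'Task',
      Resource: 'arn:aws:states:::lambda:invoke',
      Parameters: {
        FunctionName:
          'arn:aws:lambda:eu-west-1:123456789012:function:check-credit-balance',
        'Payload.$': '$',
      },
      // Keep only the fields the Choice state needs.
      ResultSelector: {
        'balance.$': '$.Payload.balance',
        'cost.$': '$.Payload.cost',
      },
      Next: 'Choice',
    },
    Choice: {
      Type: 'Choice',
      Choices: [
        {
          // Upgrade only when the balance covers the cost.
          Variable: '$.balance',
          NumericGreaterThanEqualsPath: '$.cost',
          Next: 'UpgradeAccount',
        },
      ],
      Default: 'InsufficientCredits',
    },
    UpgradeAccount: { Type: 'Pass', End: true },
    InsufficientCredits: { Type: 'Fail', Error: 'InsufficientCredits' },
  },
};

// The Step Functions APIs exchange definitions as JSON strings.
const definitionJson = JSON.stringify(definition);
const stateNames = Object.keys(JSON.parse(definitionJson).States);
console.log(stateNames);
```

Each entry under States is a self-contained JSON object, which is what makes it possible to lift a single state out and test it in isolation.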

The AWS SDK for JavaScript has a new TestStateCommand. This command takes:

  • A task definition

  • A task input payload

  • An IAM role that the StepFunction service will assume to execute the state

It executes the state, and returns:

  • The status of the execution - whether or not it errored, succeeded, or caught an error

  • The output of the task

  • The state that would be next in line

  • Metadata about the execution
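As a small sketch of how a test might interpret those fields: the status values below (SUCCEEDED, FAILED, RETRIABLE, CAUGHT_ERROR) are the ones the TestState API documents, while the TestStateResult interface and the describeResult helper are hypothetical names that mirror only the fields discussed here, not the SDK's own types.

```typescript
// Local stand-in for the parts of the TestState response used here.
// This is not the SDK's type; real code would use TestStateCommandOutput.
type TestStateStatus = 'SUCCEEDED' | 'FAILED' | 'RETRIABLE' | 'CAUGHT_ERROR';

interface TestStateResult {
  status?: TestStateStatus;
  output?: string; // JSON-encoded state output
  nextState?: string;
  error?: string;
  cause?: string;
}

// Hypothetical helper: turn a response into a one-line summary.
const describeResult = (res: TestStateResult): string => {
  switch (res.status) {
    case 'SUCCEEDED':
      return res.nextState
        ? `succeeded, next state is ${res.nextState}`
        : 'succeeded, no next state';
    case 'CAUGHT_ERROR':
      return `error ${res.error ?? 'unknown'} was caught, next state is ${res.nextState ?? 'unknown'}`;
    case 'RETRIABLE':
      return `error ${res.error ?? 'unknown'} would be retried`;
    case 'FAILED':
      return `failed with ${res.error ?? 'unknown'}: ${res.cause ?? 'no cause given'}`;
    default:
      return 'no status returned';
  }
};

console.log(describeResult({ status: 'SUCCEEDED', nextState: 'Choice' }));
```

Asserting on status and nextState together is what lets a test cover control flow, not just the output of a task.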

An example response could look like this:

{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '2916bb74-418d-402b-a903-18c0a9c3e9c9',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  nextState: 'Choice',
  output: '{"ExecutedVersion":"$LATEST","Payload":{"cost":75,"credits":100},"SdkHttpMetadata":{"AllHttpHeaders":{"X-Amz-Executed-Version":["$LATEST"],"x-amzn-Remapped-Content-Length":["0"],"Connection":["keep-alive"],"x-amzn-RequestId":["256f8768-f84e-46b0-ba79-b48dbfd250fa"],"Content-Length":["25"],"Date":["Wed, 06 Dec 2023 19:48:24 GMT"],"X-Amzn-Trace-Id":["root=1-6570d008-34fc67a96baf085e418d0745;sampled=1;lineage=055c6c5a:0"],"Content-Type":["application/json"]},"HttpHeaders":{"Connection":"keep-alive","Content-Length":"25","Content-Type":"application/json","Date":"Wed, 06 Dec 2023 19:48:24 GMT","X-Amz-Executed-Version":"$LATEST","x-amzn-Remapped-Content-Length":"0","x-amzn-RequestId":"256f8768-f84e-46b0-ba79-b48dbfd250fa","X-Amzn-Trace-Id":"root=1-6570d008-34fc67a96baf085e418d0745;sampled=1;lineage=055c6c5a:0"},"HttpStatusCode":200},"SdkResponseMetadata":{"RequestId":"256f8768-f84e-46b0-ba79-b48dbfd250fa"},"StatusCode":200}',
  status: 'SUCCEEDED'
}
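Note that output is a JSON string, and for a Lambda invoke task without further filtering, the function's own result sits under Payload. A minimal sketch of unpacking it, using a trimmed copy of the response above (the parsePayload helper is a hypothetical name, not part of the SDK):

```typescript
// Trimmed copy of the `output` field from the example response.
const rawOutput =
  '{"ExecutedVersion":"$LATEST","Payload":{"cost":75,"credits":100},"StatusCode":200}';

// Hypothetical helper: parse the JSON string and return the Lambda payload,
// or an empty object when there is no output or no Payload key.
const parsePayload = (output: string | undefined): Record<string, unknown> => {
  const parsed = JSON.parse(output ?? '{}');
  return parsed.Payload ?? {};
};

const payload = parsePayload(rawOutput);
console.log(payload.cost, payload.credits);
```

If the state uses ResultSelector or OutputPath, the output will already be reshaped, and the test can parse it directly instead of reaching into Payload.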

Let's put together two helper functions: one that fetches a deployed state machine's definition, and one that executes a given state from that definition.

// util.ts

import {
  DescribeStateMachineCommand,
  SFNClient,
  TestStateCommand,
} from '@aws-sdk/client-sfn';

const client = new SFNClient({});

interface TestStateInput {
  stateMachineDefinition: string;
  roleArn: string;
  taskName: string;
  input?: string;
}

export const fetchStateMachine = async (stateMachineArn: string) => {
  // fetch the deployed definition and the role the state machine runs as
  const stateMachine = await client.send(
    new DescribeStateMachineCommand({
      stateMachineArn: stateMachineArn,
    })
  );

  if (!stateMachine.definition) {
    throw new Error('State machine definition not found');
  }

  if (!stateMachine.roleArn) {
    throw new Error('State machine roleArn not found');
  }

  return {
    definition: stateMachine.definition,
    roleArn: stateMachine.roleArn,
  };
};

const getTask = (taskName: string, definition: string) => {
  // parse the definition and return the given state
  const task = JSON.parse(definition).States[taskName];
  if (!task) {
    throw new Error(`Task ${taskName} not found`);
  }

  return JSON.stringify(task);
};

export const testState = async ({
  stateMachineDefinition,
  roleArn,
  taskName,
  input,
}: TestStateInput) => {
  const task = getTask(taskName, stateMachineDefinition);

  // execute the single state in isolation using the TestState API
  return await client.send(
    new TestStateCommand({
      definition: task,
      roleArn: roleArn,
      input: input,
    })
  );
};

fetchStateMachine takes a state machine ARN and returns the deployed state machine's definition and the IAM role it's configured to use.

testState takes that definition, role, and the name of the state that we want to test. It executes the state using the StepFunction service, and it returns the execution information.

We can now use these to start writing our tests:

// account-upgrade.test.ts
import { describe, it, expect } from 'vitest';
import { fetchStateMachine, testState } from './util';

describe('[accountUpgrade]', async () => {
  const stateMachine = await fetchStateMachine(
    // Placeholder: supply the ARN of your deployed state machine
    '<state-machine-arn>'
  );

  describe('[CheckCreditBalance]', async () => {
    it('returns credit balance', async () => {
      const res = await testState({
        stateMachineDefinition: stateMachine.definition,
        roleArn: stateMachine.roleArn,
        taskName: 'CheckCreditBalance',
        input: JSON.stringify({
          cost: 75,
        }),
      });

      const output = JSON.parse(res.output || '{}');
      expect(output).toMatchObject({
        balance: expect.any(Number),
        cost: 75,
      });
    });
  });

  describe('[Choice]', async () => {
    it('flows to account upgrade if credit balance more than cost', async () => {
      const res = await testState({
        stateMachineDefinition: stateMachine.definition,
        roleArn: stateMachine.roleArn,
        taskName: 'Choice',
        input: JSON.stringify({
          cost: 75,
          balance: 100,
        }),
      });

      // Placeholder: replace 'UpgradeAccount' with the name of the upgrade state in your definition
      expect(res.nextState).toBe('UpgradeAccount');
    });
  });
});

With this, we have everything we need to start covering the full configuration and logic of our state machines.

For a more complete demo, which gives us type safety for the resource ARN and the state names, I've published a demo SST project here.

Additionally, Lars Jacobsson has added support to his samp-cli project for interactively running the individual state tests - it even lets you re-use recent live execution payloads. Incredible!

If you want to dig further into more strategies for testing StepFunctions, Yan Cui has some excellent writing on it here.

In conclusion, testing StepFunctions is hard, but it did just get a whole lot easier. As we've seen, the ability to test individual StepFunction states marks a leap forward in how we can build more robust and well-tested state machines. This feature isn't just a neat trick; it addresses some of the core frustrations we've faced in complex workflow orchestration. It doesn't replace end-to-end testing or unit testing of state machines, but it's a welcome complement!

Hi there, I'm Sebastian Bille! If you enjoyed this post or just want a constant feed of memes, AWS & serverless talk, and the occasional new blog post, make sure to follow me on 𝕏 at @TastefulElk or on LinkedIn 👋
