How to Orchestrate Workflows with AWS Step Functions in 2026

Introduction

AWS Step Functions is a serverless orchestration service that coordinates distributed workflows via state machines defined in Amazon States Language (ASL), a JSON-based format. Unlike standalone Lambda functions, it natively handles state, transitions, errors, and loops, making it a good fit for complex workloads such as ETL pipelines, transaction sagas, or ML workflows.

Why use it in 2026? With the rise of event-driven architectures, Step Functions cuts boilerplate code by 70% (per AWS benchmarks), integrates with 200+ AWS services, and provides graphical visibility into each execution through the console. This advanced tutorial covers essential patterns: Lambda tasks, retries, Parallel, Map, and CDK deployment. By the end, you'll deploy a scalable state machine worthy of any cloud architect's bookmark.

Prerequisites

  • AWS account with IAM admin (or the AWSStepFunctionsFullAccess managed policy)
  • AWS CLI v2 installed and configured (aws configure)
  • Node.js 20+ and AWS CDK v2 (npm i -g aws-cdk)
  • Advanced knowledge of Lambda, IAM, and serverless
  • AWS region eu-west-1 for examples

Install and Configure AWS CLI

setup-aws.sh
#!/bin/bash

# Install AWS CLI v2 if it is not already present
# (Linux x86_64 installer; macOS uses a separate .pkg installer)
if ! command -v aws &> /dev/null; then
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
    unzip awscliv2.zip
    sudo ./aws/install
fi

# Configure credentials
aws configure set aws_access_key_id YOUR_ACCESS_KEY
aws configure set aws_secret_access_key YOUR_SECRET_KEY
aws configure set default.region eu-west-1
aws configure set default.output json

# Verify the configured identity
aws sts get-caller-identity

This script installs AWS CLI v2 (using the Linux x86_64 installer), configures credentials, and verifies the caller identity. Replace YOUR_ACCESS_KEY and YOUR_SECRET_KEY with your IAM access key values. Never use root account keys; use a dedicated IAM user or role with limited permissions.

First State Machine: Hello World

Start with a trivial state machine using a Pass state to validate ASL syntax. Copy the following JSON and deploy it via CLI.

Define a Simple State Machine

hello-world.json
{
  "Comment": "Hello World Step Function",
  "StartAt": "Hello",
  "States": {
    "Hello": {
      "Type": "Pass",
      "Result": "Hello, AWS Step Functions!",
      "End": true
    }
  }
}

This ASL defines a single-step workflow with a Pass state that returns a fixed result. The StartAt field names the first state; "End": true marks that state as terminal and ends the execution. Test it before adding complexity to confirm that your IAM permissions and execution role are set up correctly.
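The structural rules described above can be checked locally before deploying. The following is a minimal sketch, not an AWS tool: the names `AslState`, `AslDefinition`, and `checkDefinition` are hypothetical, and it covers only a small subset of ASL rules (the StartAt target exists, every Next target exists, and non-terminal states declare Next or End).

```typescript
// Hypothetical minimal types covering only the fields this check needs.
type AslState = { Type: string; Next?: string; End?: boolean };
type AslDefinition = { StartAt: string; States: Record<string, AslState> };

// Returns a list of problems; an empty list means these basic checks passed.
function checkDefinition(def: AslDefinition): string[] {
  const problems: string[] = [];
  const names = new Set(Object.keys(def.States));

  if (!names.has(def.StartAt)) {
    problems.push(`StartAt "${def.StartAt}" is not a defined state`);
  }
  for (const name of Object.keys(def.States)) {
    const state = def.States[name];
    // Choice, Succeed, and Fail states do not take a top-level Next or End.
    const needsTransition = !["Choice", "Succeed", "Fail"].includes(state.Type);
    if (needsTransition && state.Next === undefined && state.End !== true) {
      problems.push(`State "${name}" needs either Next or End: true`);
    }
    if (state.Next !== undefined && !names.has(state.Next)) {
      problems.push(`State "${name}" points to unknown state "${state.Next}"`);
    }
  }
  return problems;
}

// Same shape as hello-world.json, reduced to the fields being checked.
const helloWorld: AslDefinition = {
  StartAt: "Hello",
  States: { Hello: { Type: "Pass", End: true } },
};

console.log(checkDefinition(helloWorld)); // empty array: no problems found
```

A check like this only catches wiring mistakes; the service still performs the authoritative validation when the state machine is created.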

Deploy and Run the State Machine

deploy-hello.sh
#!/bin/bash

# Create the state machine
ARN=$(aws stepfunctions create-state-machine \
  --name HelloWorldStateMachine \
  --definition file://hello-world.json \
  --role-arn arn:aws:iam::YOUR_ACCOUNT:role/StepFunctionsExecutionRole \
  --query 'stateMachineArn' --output text)

echo "State Machine ARN: $ARN"

# Start an execution
EXECUTION_ARN=$(aws stepfunctions start-execution \
  --state-machine-arn "$ARN" \
  --input '{}' \
  --query 'executionArn' --output text)

echo "Execution ARN: $EXECUTION_ARN"

# Step Functions has no CLI waiters, so poll until the execution leaves RUNNING
while [ "$(aws stepfunctions describe-execution --execution-arn "$EXECUTION_ARN" \
  --query 'status' --output text)" = "RUNNING" ]; do
  sleep 1
done

# Print the final status and output
aws stepfunctions describe-execution --execution-arn "$EXECUTION_ARN" \
  --query '{status: status, output: output}'

This script creates the state machine, runs it with empty input, and retrieves the output. Replace YOUR_ACCOUNT and create the IAM role StepFunctionsExecutionRole beforehand, with a trust policy that allows states.amazonaws.com to assume it. Add --no-cli-pager to print long output without paging.

Integrate a Lambda Task with Error Handling

Move to a Task state that invokes a Lambda. We'll handle errors with Catch and Retry, crucial for production resilience.

State Machine with Lambda, Retry, and Catch

lambda-workflow.json
{
  "Comment": "Workflow with Lambda, Retry, and Catch",
  "StartAt": "ProcessData",
  "States": {
    "ProcessData": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "arn:aws:lambda:eu-west-1:YOUR_ACCOUNT:function:ProcessFunction",
        "Payload.$": "$"
      },
      "Retry": [
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 2,
          "MaxAttempts": 3,
          "BackoffRate": 2.0
        }
      ],
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "Next": "ErrorHandler",
          "ResultPath": "$.error"
        }
      ],
      "Next": "Success"
    },
    "Success": {
      "Type": "Succeed"
    },
    "ErrorHandler": {
      "Type": "Fail",
      "Error": "WorkflowError",
      "CausePath": "$.error.Cause"
    }
  }
}

This workflow invokes a Lambda through the optimized lambda:invoke integration, retries up to 3 times with exponential backoff on States.TaskFailed, and routes any error that remains to a Fail state. Create the ProcessFunction Lambda first. ResultPath on the Catch stores the error details under $.error instead of overwriting the original input.
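To make the backoff concrete, here is a small sketch that computes the wait before each retry from the three Retry fields used above. The helper name `retryDelays` is illustrative, and the sketch assumes no jitter and no MaxDelaySeconds cap.

```typescript
// Field names mirror the ASL Retry block; the function name is illustrative.
type RetryPolicy = {
  IntervalSeconds: number;
  MaxAttempts: number;
  BackoffRate: number;
};

// The wait before retry n (1-based) is IntervalSeconds * BackoffRate^(n - 1).
function retryDelays(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= policy.MaxAttempts; attempt++) {
    delays.push(policy.IntervalSeconds * Math.pow(policy.BackoffRate, attempt - 1));
  }
  return delays;
}

// Values taken from lambda-workflow.json.
const delays = retryDelays({ IntervalSeconds: 2, MaxAttempts: 3, BackoffRate: 2.0 });
console.log(delays); // [ 2, 4, 8 ]
```

With these settings the task runs at most four times (one initial attempt plus three retries) and spends up to 14 seconds waiting before the Catch takes over.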

Parallel and Iterative Workflows

Parallel runs branches concurrently; Map runs the same sub-workflow for each item of an array (inline mode runs up to 40 iterations concurrently, and MaxConcurrency lowers that cap). Perfect for parallel ETL jobs.

State Machine with Parallel and Map

parallel-map.json
{
  "Comment": "Parallel and Map Example",
  "StartAt": "ParallelProcess",
  "States": {
    "ParallelProcess": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Branch1",
          "States": {
            "Branch1": {
              "Type": "Pass",
              "Result": "Branch 1 done",
              "End": true
            }
          }
        },
        {
          "StartAt": "Branch2",
          "States": {
            "Branch2": {
              "Type": "Pass",
              "Result": "Branch 2 done",
              "End": true
            }
          }
        }
      ],
      "ResultPath": "$.parallelResults",
      "Next": "MapProcess"
    },
    "MapProcess": {
      "Type": "Map",
      "ItemsPath": "$.inputArray",
      "MaxConcurrency": 5,
      "Iterator": {
        "StartAt": "ProcessItem",
        "States": {
          "ProcessItem": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
              "FunctionName": "arn:aws:lambda:eu-west-1:YOUR_ACCOUNT:function:ProcessItem",
              "Payload.$": "$"
            },
            "End": true
          }
        }
      },
      "ResultPath": "$.results",
      "End": true
    }
  }
}

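Parallel launches two simultaneous branches; Map iterates over $.inputArray with at most 5 concurrent iterations, invoking a Lambda per item. ItemsPath selects the input array; ResultPath stores the collected results under $.results. Because a Parallel state outputs an array of its branch results, give it a ResultPath whenever later states still need the original input, such as $.inputArray here. Tune throughput with MaxConcurrency.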
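The data flow through the Map state can be sketched in plain TypeScript. This is a simplified model, not the service's implementation: `simulateMap` is a hypothetical helper, it runs items sequentially rather than concurrently, and it uses plain keys in place of the JSONPath expressions in ItemsPath and ResultPath.

```typescript
type Json = Record<string, unknown>;

// Mimics the data flow of a Map state: read the array at `itemsKey`, apply
// `worker` to each item, and attach the ordered results under `resultKey`
// while keeping the rest of the input (like "ResultPath": "$.results").
function simulateMap<T, R>(
  input: Json,
  itemsKey: string,
  resultKey: string,
  worker: (item: T) => R
): Json {
  const items = input[itemsKey];
  if (!Array.isArray(items)) {
    throw new Error(`Expected an array at "${itemsKey}"`);
  }
  const results = (items as T[]).map(worker);
  return { ...input, [resultKey]: results };
}

const output = simulateMap<number, number>(
  { inputArray: [1, 2, 3], jobId: "demo" },
  "inputArray",
  "results",
  (n) => n * 10
);
console.log(JSON.stringify(output));
// {"inputArray":[1,2,3],"jobId":"demo","results":[10,20,30]}
```

In the real workflow each entry in the results array would be the full response of the lambda:invoke integration rather than a bare value, but the ordering and the merge into the input follow the same pattern.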

Deploy with AWS CDK (IaC)

lib/step-functions-stack.ts
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as stepfunctions from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export class StepFunctionsStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Inline Lambda that returns a fixed response
    const processFn = new lambda.Function(this, 'ProcessFn', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => ({ statusCode: 200, body: "Processed" });')
    });

    // Chain.start is a static factory, so it is called without `new`
    const definition = stepfunctions.Chain.start(new tasks.LambdaInvoke(this, 'ProcessTask', {
      lambdaFunction: processFn
    }));

    const stateMachine = new stepfunctions.StateMachine(this, 'AdvancedStateMachine', {
      // definitionBody replaces the deprecated `definition` property
      definitionBody: stepfunctions.DefinitionBody.fromChainable(definition),
      timeout: cdk.Duration.minutes(5)
    });

    // Explicit grant; LambdaInvoke already adds this permission to the role
    processFn.grantInvoke(stateMachine);
  }
}

This CDK stack creates an inline Lambda and a single-task state machine built from a Chain. Synthesize (cdk synth) and deploy (cdk deploy). Benefits: IaC with diffs, versioning, and automatic IAM grants. To deploy hand-written ASL instead, use DefinitionBody.fromFile or DefinitionBody.fromString.

Deploy the CDK Stack

deploy-cdk.sh
#!/bin/bash

# Create a new TypeScript CDK app (cdk init needs an empty directory
# and installs aws-cdk-lib and constructs itself)
mkdir step-functions && cd step-functions
cdk init app --language typescript

# Copy the stack code above into lib/step-functions-stack.ts, then:
cdk bootstrap aws://YOUR_ACCOUNT/eu-west-1
cdk synth
cdk deploy StepFunctionsStack

# Look up the state machine ARN by name and start an execution
ARN=$(aws stepfunctions list-state-machines \
  --query "stateMachines[?contains(name, 'AdvancedStateMachine')] | [0].stateMachineArn" \
  --output text)
aws stepfunctions start-execution --state-machine-arn "$ARN" --input '{"key":"value"}'

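This script initializes a TypeScript CDK app, bootstraps the account, synthesizes and deploys the stack, then starts an execution. Replace YOUR_ACCOUNT. On later deployments the state machine is updated in place; executions that are already running continue on the definition they started with.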

Best Practices

  • Always use IaC: CDK/Terraform for versioning and audits.
  • Timeouts and limits: Set TimeoutSeconds on each Task state; a Standard workflow execution can run for at most 1 year.
  • CloudWatch Logging: Enable execution logging and query the logs with CloudWatch Logs Insights.
  • InputPath/OutputPath: Filter payloads to avoid bloat (256 KB limit on state input and output).
  • Granular roles: Least-privilege IAM with conditions (e.g., the aws:SourceArn condition key).
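The payload limit from the list above can be checked before starting an execution. This is a rough sketch: the helper names are hypothetical, and it assumes the limit applies to the UTF-8 encoded JSON string (256 KB taken as 262,144 bytes).

```typescript
// 256 KB payload limit, expressed in bytes.
const MAX_PAYLOAD_BYTES = 256 * 1024;

// Size of the value once serialized to JSON and encoded as UTF-8.
function payloadBytes(payload: unknown): number {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

function fitsPayloadLimit(payload: unknown): boolean {
  return payloadBytes(payload) <= MAX_PAYLOAD_BYTES;
}

console.log(payloadBytes({ key: "value" })); // 15
console.log(fitsPayloadLimit({ blob: "x".repeat(300000) })); // false
```

When a payload does not fit, a common approach is to store the data in S3 and pass only the object key through the state machine.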

Common Errors to Avoid

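  • Forgetting ResultPath: Without it, a Catch replaces the state input with the error output, so the input context is lost.
  • MaxConcurrency too high: Causes Lambda throttling (the default account concurrency limit is 1,000 per region).
  • No global Catch: An unhandled States.Timeout fails the whole execution without any cleanup step; wrap the workflow in a Parallel state with a Catch to get a top-level handler.
  • Malformed ASL: Validate the definition in the AWS Console or with aws stepfunctions validate-state-machine-definition before deploying, to catch parse errors early.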
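The first pitfall in the list above is easiest to see by comparing the two outcomes side by side. The sketch below models what a Catch hands to the next state; `catcherOutput` is a hypothetical helper, and a plain key stands in for a JSONPath such as $.error.

```typescript
type StateInput = Record<string, unknown>;

// Shape of the error object that Step Functions passes to a Catch handler.
type ErrorOutput = { Error: string; Cause: string };

// With no resultKey, the error output replaces the input (the default "$").
// With a resultKey such as "error", it is attached next to the input
// (like "ResultPath": "$.error").
function catcherOutput(input: StateInput, err: ErrorOutput, resultKey?: string): StateInput {
  if (resultKey === undefined) {
    return { ...err };
  }
  return { ...input, [resultKey]: err };
}

const orderInput = { orderId: "42" };
const taskError = { Error: "States.TaskFailed", Cause: "Lambda threw" };

console.log(JSON.stringify(catcherOutput(orderInput, taskError)));
// {"Error":"States.TaskFailed","Cause":"Lambda threw"}
console.log(JSON.stringify(catcherOutput(orderInput, taskError, "error")));
// {"orderId":"42","error":{"Error":"States.TaskFailed","Cause":"Lambda threw"}}
```

Only the second form lets an error handler see which order failed, which is why the Catch in lambda-workflow.json sets ResultPath.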

Next Steps