AWS Step Functions is a low-code, visual workflow service that enables developers to build and automate applications using AWS services. Step Functions allows developers to rapidly create and deploy fault-tolerant, reliable, and scalable workflows while writing less integration code. This article explains the foundational concepts of AWS Step Functions.

AWS Step Functions: Workflow = State Machines

In AWS Step Functions, a workflow is called a state machine. State machines represent a series of event-driven steps, and each step in a state machine workflow is called a state. A state machine is programmed to read a set of inputs, and based on the input it receives, switch to a different state. Based on the inputs, individual states can make decisions, perform actions, and transmit outputs to other states. The following is an example of a state machine:

{
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:HelloFunction",
      "Comment": "Run the HelloWorld Lambda function",
      "Next": "GoodbyeWorld"
    },
    "GoodbyeWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:GoodbyeFunction",
      "Comment": "Run the GoodbyeWorld Lambda function",
      "End": true
    }
  }
}

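A state machine is defined in the Amazon States Language (ASL), a JSON-based format; deployment tools such as AWS CloudFormation and the Serverless Framework also accept the definition in YAML. There are multiple ways to create and programmatically deploy a state machine: the AWS API, AWS CloudFormation, Terraform, AWS CDK, or the Serverless Framework with the “serverless-step-functions” plugin.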
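As a rough sketch of the CloudFormation option, the template below declares a one-state machine through the “Definition” property of the AWS::StepFunctions::StateMachine resource. The logical ID, the state machine name, and the IAM role ARN are hypothetical placeholders; the role would need permission to invoke the Lambda function.

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "HelloStateMachine": {
      "Type": "AWS::StepFunctions::StateMachine",
      "Properties": {
        "StateMachineName": "HelloStateMachine",
        "RoleArn": "arn:aws:iam::123456789012:role/ExampleStepFunctionsRole",
        "Definition": {
          "StartAt": "HelloWorld",
          "States": {
            "HelloWorld": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123456789012:function:HelloFunction",
              "End": true
            }
          }
        }
      }
    }
  }
}
```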

The AWS console allows users to build state machines intuitively using a beginner-friendly drag-and-drop interface, providing explanations for each available setting, as seen in the example below.

State Machine Creation on the AWS Console

State Types

When creating a state within a state machine, it is not only important to give it a unique name but also to define its type. A state type determines the behavior of the state being created.

Task

The ‘Task’ state type is the core feature of any state machine, as it executes an action on an AWS resource. Although it is used mainly to invoke Lambda functions, it can interact with almost all AWS services. Tasks can be used to manipulate buckets or objects on S3, run jobs on EKS, execute DynamoDB queries, send messages through SNS, etc.
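As a minimal sketch of a Task that calls a service other than Lambda, the state machine below publishes a message to an SNS topic through the Step Functions SNS integration. The state name, the topic ARN, and the “$.message” input field are hypothetical placeholders.

```json
{
  "StartAt": "NotifyTeam",
  "States": {
    "NotifyTeam": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-1:123456789012:ExampleTopic",
        "Message.$": "$.message"
      },
      "End": true
    }
  }
}
```

The “.$” suffix on “Message.$” tells Step Functions to read the value from the state input rather than treat it as a literal string.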

Choice

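The ‘Choice’ state type adds conditional branching: it evaluates a list of rules against the state input and transitions to the first branch whose rule matches, or to an optional default branch if none does. Similar to coding an “if” condition, it is possible to build complex conditions by comparing booleans, numbers, strings, and timestamps, and by combining the comparisons with “And”, “Or”, and “Not”.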
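A Choice state might look like the following sketch, which routes an order to a discount branch based on a numeric and a boolean comparison. The state names and the “$.total” and “$.customer.isMember” fields are illustrative assumptions, and the two target states are ‘Pass’ placeholders.

```json
{
  "StartAt": "CheckOrderTotal",
  "States": {
    "CheckOrderTotal": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.total",
          "NumericGreaterThan": 100,
          "Next": "ApplyDiscount"
        },
        {
          "And": [
            { "Variable": "$.customer.isMember", "BooleanEquals": true },
            { "Variable": "$.total", "NumericGreaterThan": 50 }
          ],
          "Next": "ApplyDiscount"
        }
      ],
      "Default": "NoDiscount"
    },
    "ApplyDiscount": { "Type": "Pass", "End": true },
    "NoDiscount": { "Type": "Pass", "End": true }
  }
}
```

Rules are evaluated in order, so the first matching entry in “Choices” decides the transition.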

Pass

The ‘Pass’ state type simply passes its input to its output. This state type is used to make workflows more readable on the console, to manipulate the state variable (for example by injecting fixed data), or to serve as a placeholder for a step that has not been implemented yet.
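For instance, a Pass state can inject fixed data into the state variable, as in this small sketch. The state name and the “defaults” values are made up for illustration.

```json
{
  "StartAt": "SetDefaults",
  "States": {
    "SetDefaults": {
      "Type": "Pass",
      "Result": { "currency": "USD", "priority": "standard" },
      "ResultPath": "$.defaults",
      "End": true
    }
  }
}
```

Given an input such as {"orderId": "A-1"}, the output keeps “orderId” and gains a “defaults” object containing the two fixed values.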

Wait

The ‘Wait’ state type is similar to the ‘Pass’ state type in that it passes its input to its output. However, it first delays the execution of the workflow for a specific amount of time or until a specific timestamp.
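Both forms of waiting can be sketched as follows: the first state pauses for a fixed number of seconds, and the second waits until a timestamp read from the input. The state names and the “$.releaseAt” field are assumptions; the field is expected to hold an ISO 8601 timestamp such as "2030-01-01T00:00:00Z".

```json
{
  "StartAt": "WaitTenSeconds",
  "States": {
    "WaitTenSeconds": {
      "Type": "Wait",
      "Seconds": 10,
      "Next": "WaitUntilRelease"
    },
    "WaitUntilRelease": {
      "Type": "Wait",
      "TimestampPath": "$.releaseAt",
      "End": true
    }
  }
}
```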

Parallel

The ‘Parallel’ state type launches several workflow branches that are executed at the same time. Each branch receives a copy of the state’s input, and the state completes once every branch has finished. This state type is best used for tasks that are not interdependent.
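A minimal Parallel state with two independent branches could be written as below. The branch and state names are hypothetical, and ‘Pass’ states stand in for real work.

```json
{
  "StartAt": "LookupCustomer",
  "States": {
    "LookupCustomer": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "LookupAddress",
          "States": {
            "LookupAddress": {
              "Type": "Pass",
              "Result": "address-placeholder",
              "End": true
            }
          }
        },
        {
          "StartAt": "LookupPhone",
          "States": {
            "LookupPhone": {
              "Type": "Pass",
              "Result": "phone-placeholder",
              "End": true
            }
          }
        }
      ],
      "End": true
    }
  }
}
```

The output of the Parallel state is an array with one element per branch, in the order the branches are declared: here ["address-placeholder", "phone-placeholder"].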

Map

The ‘Map’ state type iterates over an array and executes a workflow branch for each element in the array. Map also makes it possible to limit how many iterations run in parallel through the “MaxConcurrency” setting.
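The sketch below shows an inline Map state that runs a Lambda function once per element of an “items” array, with at most five iterations in flight at a time. The state names, the “$.items” field, and the function name are placeholders chosen for this example.

```json
{
  "StartAt": "ProcessItems",
  "States": {
    "ProcessItems": {
      "Type": "Map",
      "ItemsPath": "$.items",
      "MaxConcurrency": 5,
      "ItemProcessor": {
        "ProcessorConfig": { "Mode": "INLINE" },
        "StartAt": "ProcessOneItem",
        "States": {
          "ProcessOneItem": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessItemFunction",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```

The output is an array holding the result of each iteration, in the same order as the input array.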

Succeed

The ‘Succeed’ state type stops the execution of a workflow or a specific branch in the workflow and sets the status of the execution to “Succeeded”. The use of this state type is rarely a technical necessity since the same behavior can be replicated by setting “End” to true on the previous state. However, the ‘Succeed’ state type allows for better readability in the console, and it is a convenient target for a ‘Choice’ branch that should end the workflow, because ‘Choice’ states do not support “End”.

Fail

Like ‘Succeed’, the ‘Fail’ state type stops the execution of a workflow. However, it sets the status to “Failed”. The error returned by the state machine can be customized with the “Error” and “Cause” fields of this state type.
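The two terminal state types can be illustrated together in a short sketch in which a Choice state ends the workflow one way or the other. The state names, the “$.paymentStatus” field, and the error name and message are invented for this example.

```json
{
  "StartAt": "CheckPayment",
  "States": {
    "CheckPayment": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.paymentStatus",
          "StringEquals": "APPROVED",
          "Next": "OrderAccepted"
        }
      ],
      "Default": "OrderRejected"
    },
    "OrderAccepted": {
      "Type": "Succeed"
    },
    "OrderRejected": {
      "Type": "Fail",
      "Error": "PaymentRejected",
      "Cause": "The payment provider did not approve the transaction."
    }
  }
}
```

Neither terminal state takes a “Next” or “End” field, since both always end the execution.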

State Variable

By default, each step of a state machine takes the output of the previous step (a JSON object) as its input. It is possible to select, reshape, and store data in a more controlled manner using JSONPath expressions, which start with “$” (the root of the state input). Examples of this are as follows:

  • “Parameters” can be added to precisely define the payload sent to a task
  • “InputPath” can be added to send a sub-object of the state as the payload
  • “ResultPath” can be added to store the result in a sub-object of the state
  • “ItemsPath” can be added to select the array that a Map state iterates over, a role similar to that of InputPath

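AWS Step Functions limits the payload passed into or out of a state, as well as the input and output of the execution, to 256 KB. The limit applies to each payload individually, not to the sum over the whole execution.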

ResultPath Usage Example
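As a small sketch of “InputPath” and “ResultPath” working together, the Task below sends only the “order” sub-object to a Lambda function and stores the function’s result under a new “pricing” key. The state name, the function name, and the field names are hypothetical.

```json
{
  "StartAt": "GetPrice",
  "States": {
    "GetPrice": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:GetPriceFunction",
      "InputPath": "$.order",
      "ResultPath": "$.pricing",
      "End": true
    }
  }
}
```

With an input of {"order": {...}, "customer": {...}}, the function receives only the contents of “order”, and the state’s output is the original input with an added “pricing” key holding whatever the function returned. Without “ResultPath”, the function’s result would replace the whole state.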

Error Management

When building and automating workflows, it is critical to consider execution contingencies in the event of an error. These errors can be managed from the state machine using the “Retry” and “Catch” settings.

A task can be configured to retry automatically a specific number of times, with an interval between consecutive attempts that is either fixed or grows exponentially (“BackoffRate”). If all the retry attempts result in errors and no “Catch” setting matches the error, the execution of the state machine stops with a “Failed” status. A “Catch” setting can be added to execute another branch of the workflow instead, with the error information passed to that branch as input.

"X": {
  "Type": "Task",
  "Resource": "arn:aws:states:us-east-1:123456789012:task:X",
  "Next": "Y",
  "Retry": [{
    "ErrorEquals": ["ErrorA", "ErrorB"],
    "IntervalSeconds": 1,
    "BackoffRate": 2.0,
    "MaxAttempts": 2
  }, {
    "ErrorEquals": ["ErrorC"],
    "IntervalSeconds": 5
  }],
  "Catch": [{
    "ErrorEquals": ["States.ALL"],
    "Next": "Z"
  }]
}
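To show how the error reaches the fallback branch, here is a complete sketch that combines “Retry” with a “Catch” that uses “ResultPath”. The state names and the Lambda function are placeholders, and the ‘Pass’ state stands in for a real failure handler.

```json
{
  "StartAt": "ChargeCard",
  "States": {
    "ChargeCard": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ChargeCardFunction",
      "Retry": [
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 2,
          "BackoffRate": 2.0,
          "MaxAttempts": 3
        }
      ],
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "ResultPath": "$.error",
          "Next": "HandleFailure"
        }
      ],
      "End": true
    },
    "HandleFailure": {
      "Type": "Pass",
      "End": true
    }
  }
}
```

With these settings the task is retried after roughly 2, 4, and 8 seconds. If the third retry also fails, the “Catch” takes over: the error output, an object with “Error” and “Cause” fields, is stored under “$.error” alongside the original input and handed to “HandleFailure”. Because the placeholder handler is a ‘Pass’ state with “End” set to true, this particular execution would finish as “Succeeded”; a real workflow might end in a ‘Fail’ state instead.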

How TrackIt Uses Step Functions — Order Management System Example

AWS Step Functions has been an integral part of workflows implemented by TrackIt. The development team recently leveraged Step Functions to rebuild an entire Order Management System (OMS) for a major retailer. AWS Step Functions allowed TrackIt engineers to write more efficient code by moving part of the OMS logic into the workflow, and enabled a detailed view of each order. The serverless nature of AWS Step Functions and AWS Lambda also helped ensure efficient scaling during peak usage.

Conclusion

Developers can save significant amounts of time with the strategic use of the various mechanisms offered by the AWS Step Functions service. Workflows built using AWS Step Functions help manage failures, retries, parallelization, service integrations, and observability, allowing developers to shift their focus towards higher-value business logic.

About TrackIt

TrackIt, based in Marina del Rey, CA, is an Amazon Web Services Advanced Tier Services Partner specializing in cloud management, consulting, and software development solutions.

TrackIt specializes in Modern Software Development, DevOps, Infrastructure-As-Code, Serverless, CI/CD, and Containerization with specialized expertise in Media & Entertainment workflows, High-Performance Computing environments, and data storage.

In addition to providing cloud management, consulting, and modern software development services, TrackIt also provides an open-source AWS cost management tool that allows users to optimize their costs and resources on AWS.