Chase Farrant


AWS CloudFormation Best Practices

Created on February 23, 2023

Introduction

CloudFormation is an infrastructure-as-code templating language developed by AWS. I have worked with these templates for several years, so here is a list of tips and tricks I've compiled to take your templates to the next level. Many of these can be applied to other IaC solutions, such as Terraform, CDK, or Azure ARM templates.

Use YAML

I used JSON for YEARS and didn't see a need to switch. I was comfortable with it, so why switch? However, I recently ran into a few situations that changed my mind. Here are some of the benefits of using YAML (based on facts, not opinions):

Supports comments - I find it baffling that comments were removed from the JSON spec.
Smaller file size - Although rare to hit, CloudFormation templates have a file size limit. The limit is the same for both formats, but YAML templates are usually smaller, so more fits under it.
Much easier to write inline scripts - No more escaping strings! This is huge for user-data scripts and CloudFormation hooks.

Tag resources during stack deployment

Most AWS resources support the concept of tags. However, declaring tags for each resource in a template is time-consuming and redundant. Is there a better way? Lo and behold! We can tag every resource within a template at the stack level during deployment. Use the CloudFormation Parameters file to also pass in tags at the stack level:

{
    "Parameters": {
        "Environment": "prod",
        "Prefix": "cf"
    },
    "Tags": {
        "BusinessUnit": "Engineering",
        "Environment": "prod",
        "Owner": "TeamBlue"
    }
}

Now, pass in the parameters file with the --parameter-overrides flag. Unfortunately, the parameters file doesn't yet support YAML :/

aws cloudformation deploy --template-file demotemplate.yaml --s3-bucket demobucket --stack-name demostack --parameter-overrides file://demoparams.json
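As an alternative sketch, the same tags can also be passed on the command line with the --tags flag of aws cloudformation deploy, and describe-stacks shows which tags ended up on the stack. The function names below are hypothetical wrappers; the file, bucket, and stack names are the ones from the command above.

```shell
#!/bin/bash
# Deploy with stack-level tags passed explicitly on the command line.
# The tag values mirror the "Tags" block of the parameters file above.
deploy_with_tags() {
    aws cloudformation deploy \
        --template-file demotemplate.yaml \
        --s3-bucket demobucket \
        --stack-name demostack \
        --parameter-overrides file://demoparams.json \
        --tags BusinessUnit=Engineering Environment=prod Owner=TeamBlue
}

# Print the tags currently attached to a stack (stack name is argument 1).
show_stack_tags() {
    aws cloudformation describe-stacks \
        --stack-name "$1" \
        --query 'Stacks[0].Tags' \
        --output table
}
```

Running deploy_with_tags followed by show_stack_tags demostack is a quick way to confirm the tags were applied before checking individual resources.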

Use the `aws cloudformation package` command

The aws cloudformation package command is a template preprocessor that allows local file references to be used within a template. The command converts local file references to external S3 URLs that the CloudFormation template can reference during deployment. See the documentation for more information.
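Here is a minimal sketch of that workflow, assuming a Lambda function whose code lives in a local ./src directory. The resource name, parameter name, and function names are hypothetical; the bucket is whatever you pass in.

```shell
#!/bin/bash
# Create a scratch directory with a tiny Lambda handler and a template that
# points at the local ./src directory instead of an S3 location.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
echo 'def handler(event, context): return "ok"' > "$workdir/src/index.py"

cat > "$workdir/template.yaml" <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  LambdaRoleArn:
    Type: String
Resources:
  LambdaFunctionDemo:
    Type: AWS::Lambda::Function
    Properties:
      Code: ./src  # local path; package replaces it with S3Bucket/S3Key
      Handler: index.handler
      Role: !Ref LambdaRoleArn
      Runtime: python3.12
EOF

# Zip and upload ./src to the bucket given as argument 1, then write a copy
# of the template that references the uploaded artifact.
package_template() {
    aws cloudformation package \
        --template-file "$workdir/template.yaml" \
        --s3-bucket "$1" \
        --output-template-file "$workdir/template.packaged.yaml"
}
```

Calling package_template demobucket leaves the original template untouched and produces template.packaged.yaml, which is the file to hand to aws cloudformation deploy.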

Use mappings to standardize region names

Region names are variable in length and can be unnecessarily long. This can be frustrating if you need to include the region in a resource name but then subsequently hit a length limit.

CloudFormation mappings can standardize the length and format of region names. By removing the dashes and truncating the length to 5, we can create a short and simple name for each region that is easy to understand without losing any information. Because every shortened name has the same length and contains no dashes, a resource name that includes it can still be parsed by position or by splitting on '-' if needed.

---
AWSTemplateFormatVersion: "2010-09-09"
Mappings:
  RegionMap:
    ap-southeast-1:
      NameShortened: apse1
    ap-southeast-2:
      NameShortened: apse2
    ap-southeast-3:
      NameShortened: apse3
    ca-central-1:
      NameShortened: cace1
    us-east-1:
      NameShortened: usea1
    us-east-2:
      NameShortened: usea2
    us-gov-east-1:
      NameShortened: ugea1
    us-gov-west-1:
      NameShortened: ugwe1
    us-west-2:
      NameShortened: uswe2
Resources:
  S3BucketDemo:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub
        - demo-${RegionShortened}
        - RegionShortened: !FindInMap
            - RegionMap
            - !Ref AWS::Region
            - NameShortened

How to name CloudFormation resources

First, understand that there are two names associated with each resource in a CloudFormation template:

---
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  Route53HostedZoneCF:  # Logical Name
    Type: AWS::Route53::HostedZone
    Properties:
      Name: chasefarrant.com  # Physical Name

Physical Name - The name of the resource within AWS.

It's a common trope in software development - naming things sucks. Luckily for us, CloudFormation supports auto-generating the Physical Name for most resources! A significant benefit to this approach is that CloudFormation can automagically replace resources without manual intervention. It simply creates a new resource with a slightly different name alongside the old one before deleting the old one. With a fixed physical name, an update that requires replacement fails because the new resource would collide with the existing one. TL;DR Don't set physical names if you can help it.

Logical Name - The name of the resource within the CloudFormation template.

The preferred format for the Logical Name is to use the Service Name and Resource Type at the beginning of the string and combine it with a unique Name at the end that describes the resource's purpose. The unique name distinguishes it from other resources within the template and makes it easier to update in the future. Sometimes, the Service Name can be omitted when redundant or misleading. For example, AWS::EC2::VPC could be Vpc.

Service Name: Dynamodb
Resource Type: Table
Unique Descriptive Name: Demo
Final Name: DynamodbTableDemo

---
Resources:
  DynamodbTableDemo: # {ServiceName}{ResourceType}{UniqueDescriptiveName}
    Type: AWS::DynamoDB::Table
    ...

Use lowercase characters and hyphens for stack names

Resources within the template will likely derive their name from the stack name. Therefore, the stack name should abide by the naming requirements of all services. Many AWS services have unique resource naming requirements, each restricted by a different subset of characters and length. Let's review a few of the more restrictive rules:

S3 buckets don't allow capital letters.
S3 buckets are limited to 63 characters in length.
S3 bucket names must be at least three characters in length.
RDS instances only allow hyphens as a "special" character, but names cannot end in a hyphen.

Given these requirements, it is commonly accepted that using lowercase words separated by hyphens is the best approach. Any resources not explicitly assigned a Physical Name will derive their name from the stack name. This is where the benefit of this pattern is fully realized: A resource's physical name is derived from the concatenation of the stack name, the resource's logical name, and a short generated suffix that keeps it unique.

Example stack name:

${Prefix}-${Environment}-${RegionShortened}-${UniqueStackName}, e.g., cf-port-dev-usea1-shared

Example resource physical name:

${StackName}-${ResourceLogicalName}-${GeneratedSuffix}, e.g., cf-port-dev-usea1-shared-DynamodbTableDemo-1A2B3C4D5E6F (the suffix shown is illustrative)
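The stack name pattern above can be assembled and checked before deploying. This is a rough sketch: region_short and build_stack_name are hypothetical helper names, the lookup only covers the regions from the RegionMap earlier in this post, and the validation rule (lowercase letters, digits, single hyphens, no trailing hyphen) is my reading of the constraints listed above.

```shell
#!/bin/bash
# Map a region to its shortened form. Only the regions from the RegionMap
# mapping are covered; anything else is reported as an error.
region_short() {
    case "$1" in
        ap-southeast-1) echo apse1 ;;
        ap-southeast-2) echo apse2 ;;
        ap-southeast-3) echo apse3 ;;
        ca-central-1)   echo cace1 ;;
        us-east-1)      echo usea1 ;;
        us-east-2)      echo usea2 ;;
        us-gov-east-1)  echo ugea1 ;;
        us-gov-west-1)  echo ugwe1 ;;
        us-west-2)      echo uswe2 ;;
        *) echo "unknown region: $1" >&2; return 1 ;;
    esac
}

# Build <prefix>-<environment>-<region short>-<name> and reject names that
# contain anything other than lowercase letters, digits, and single hyphens.
build_stack_name() {
    local prefix=$1 environment=$2 region=$3 name=$4
    local short stack_name
    short=$(region_short "$region") || return 1
    stack_name="$prefix-$environment-$short-$name"
    if ! printf '%s' "$stack_name" | grep -Eq '^[a-z][a-z0-9]*(-[a-z0-9]+)*$'; then
        echo "invalid stack name: $stack_name" >&2
        return 1
    fi
    echo "$stack_name"
}
```

For example, build_stack_name cf-port dev us-east-1 shared prints cf-port-dev-usea1-shared, while an uppercase prefix or an unmapped region makes the function return a non-zero status.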

Resource names can also be used to scope IAM access to resources in AWS. Name stacks from generic -> specific to leverage this behavior as the name traverses from left to right. I like to include a Prefix variable to ensure all resources are globally unique across all AWS accounts. The prefix can consist of a single field (e.g., cf) or multiple fields (e.g., cf-portfolio), but keep it short to prevent running into character length limits.

Now, a trailing wildcard can be used to write an IAM policy scoped by Prefix, Environment, or Region. The example below allows dynamodb:GetItem access to all DynamoDB tables whose names begin with cf-port-dev-:

  Statement:
  - Effect: Allow
    Action: dynamodb:GetItem
    Resource: !Sub 'arn:${AWS::Partition}:dynamodb:${AWS::Region}:${AWS::AccountId}:table/cf-port-dev-*'

Sort templates alphabetically

Used with the resource naming tips above, this is a simple but effective strategy that makes it easy to navigate larger CloudFormation templates.

With consistently named resources sorted alphabetically, finding an S3 Bucket is just a matter of scrolling to the S3Bucket entries.

Some folks prefer to "group" related resources next to each other within a template. I'd argue it's implied they are related since they're already in the same template!

Pro-tip: Use shortcuts within your IDE to quickly collapse code to a specific level. This makes it much easier to move around templates rapidly. Here are some shortcuts I use for VSCode:

Collapse code to a specific level: CTRL+K, CTRL+{LEVEL_NUMBER} (I commonly use CTRL+2 for YAML files and CTRL+3 for JSON files).
Fully expand the entire file: CTRL+K, CTRL+J.

Catch errors earlier with `cfn-lint`

cfn-lint is a CLI linting tool that is incredibly useful for discovering template issues before they are deployed. This dramatically reduces the time it takes to make template changes. VSCode also has a plugin that enables linting directly within your IDE.

Make sure cfn-lint is installed
Install the "CloudFormation Linter" extension within VSCode
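On the command line, a small wrapper along these lines can lint several templates in one go. It is only a sketch: lint_templates is a hypothetical helper name, and it assumes cfn-lint was installed with pip install cfn-lint and is on your PATH.

```shell
#!/bin/bash
# Lint each template passed as an argument and stop at the first one that
# cfn-lint reports findings for (cfn-lint exits non-zero in that case).
lint_templates() {
    if ! command -v cfn-lint >/dev/null 2>&1; then
        echo "cfn-lint not found; try: pip install cfn-lint" >&2
        return 127
    fi
    local template
    for template in "$@"; do
        cfn-lint "$template" || return 1
    done
}
```

For example, lint_templates demotemplate.yaml returns 0 only when the template comes back clean, which makes it easy to chain in front of a deploy command.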

Locally test template deployments before committing

Development speed is crucial, and testing locally is the best way to shorten the feedback loop. Below is a simple example helper script that simplifies the commands needed to deploy a CloudFormation template from your local machine:

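#!/bin/bash
# Usage: <this-script> <template> <parameters> <s3-bucket> <stack-name>
set -euo pipefail

filepath_template=$1
filepath_parameters=$2
s3_bucket=$3
stack_name=$4

# Timestamp used as the S3 prefix so each run uploads to its own folder.
datetime=$(date +"%Y-%m-%dT%H:%M:%S")

# Remove the packaged template when the script exits, even after a failure.
trap 'rm -f "$filepath_template.out"' EXIT

# Upload local artifacts and write the processed template next to the original.
aws cloudformation package \
    --template-file "$filepath_template" \
    --s3-bucket "$s3_bucket" \
    --force-upload \
    --output-template-file "$filepath_template.out" \
    --s3-prefix "$datetime"

# Deploy the processed template with the parameters file.
aws cloudformation deploy \
    --template-file "$filepath_template.out" \
    --s3-bucket "$s3_bucket" \
    --s3-prefix "$datetime" \
    --stack-name "$stack_name" \
    --parameter-overrides "file://$filepath_parameters" \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND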

Use a separate stack for shared resources

On a simple project with minimal requirements, you could squeak by with a single CloudFormation template. More than likely, though, you'll want to split the infrastructure into more manageable chunks. I suggest consolidating resources shared by multiple services into their own "shared" stack. Here are some examples of commonly shared resources:

Alerting (SNS)
API Gateway
Cognito
ECS Cluster
Load Balancers
Messaging and Eventing resources (SNS, EventBridge)
Networking (VPC, Subnets, NAT gateways, etc.)
Route53 Hosted Zones
Security Groups
Shared Databases
Shared Variables and Secrets

Ultimately, this helps prevent "hard" service -> service dependencies. Services shouldn't be required to be deployed in a specific order. That's a fundamental difference between infrastructure and a service:

Infrastructure is inherently dependency-driven; Services are not.

Use nested stacks within the Shared stack

Networking resources alone can blow up the size of a single CloudFormation template. After some trial and error, I have found it best to sort shared resources by purpose into CloudFormation nested stacks. This enables all shared resources to be deployed with a single command while keeping templates to a reasonable size. Check out this simplified example:

# Example Stack Name - cf-port-dev-usea1-shared
# Properties (including the required TemplateURL) are omitted for brevity.
---
Resources:
  Alerts:
    Type: AWS::CloudFormation::Stack
  Pipelines:
    Type: AWS::CloudFormation::Stack
    DependsOn: SecurityGroups
  SecurityGroups:
    Type: AWS::CloudFormation::Stack
    DependsOn: Vpc
  SecurityGroupRules:
    Type: AWS::CloudFormation::Stack
    DependsOn: SecurityGroups
  Vpc:
    Type: AWS::CloudFormation::Stack
    DependsOn: Alerts

The DependsOn attribute ensures that resources get created in the correct order. As your infrastructure evolves, the order of dependencies can be adjusted by simply updating the DependsOn attributes. Make sure to test ordering changes in a fresh environment, since an existing deployment can hide circular dependencies that only surface when every stack is created from scratch.

To share values between the shared nested stacks, I prefer NOT to use outputs as that makes the parent stack template messy and doesn't prevent breaking changes (which is essential when it's the underlying infrastructure of your entire application). Instead, I prefer exports, as explained in the next section.

Use exports between non-nested stacks

CloudFormation Exports often get a bad rap for being difficult to update once used. Indeed, using them can significantly increase the time it takes to make simple changes. But there is a significant upside to them.

By blocking updates to resources already being consumed, using exports helps prevent breaking infrastructure changes by forcing all changes to be backward compatible.

For example, a Lambda depends on an S3 bucket in a different stack. Without exports, the S3 bucket could be ripped out from underneath the Lambda without warning. The Lambda would be none the wiser until trying to run. Using an export would have prevented the S3 bucket change and encouraged an alternative approach, perhaps standing up a new S3 Bucket first and changing the Lambda over to it.
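To make the mechanics concrete, here is a rough sketch that writes two illustrative templates: a shared stack that exports a bucket name and a consumer stack that imports it. The resource names, the export name, and the SSM parameter standing in for the consumer are all hypothetical; list_export_consumers is a hypothetical wrapper around the real aws cloudformation list-imports command.

```shell
#!/bin/bash
# Write both templates into a scratch directory.
workdir=$(mktemp -d)

# Producer: the shared stack exports the generated bucket name.
cat > "$workdir/shared.yaml" <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  S3BucketDemo:
    Type: AWS::S3::Bucket
Outputs:
  S3BucketDemoName:
    Value: !Ref S3BucketDemo
    Export:
      Name: !Sub ${AWS::StackName}-S3BucketDemoName
EOF

# Consumer: a service stack imports the value by its export name.
cat > "$workdir/service.yaml" <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  SharedStackName:
    Type: String
Resources:
  SsmParameterDemo:
    Type: AWS::SSM::Parameter
    Properties:
      Type: String
      Value:
        Fn::ImportValue: !Sub ${SharedStackName}-S3BucketDemoName
EOF

# List the stacks that import a given export (export name is argument 1).
# CloudFormation blocks changes to an export while any stack imports it.
list_export_consumers() {
    aws cloudformation list-imports --export-name "$1"
}
```

Once both stacks are deployed, an update that would replace or remove S3BucketDemo in the shared stack is rejected until the service stack stops importing the value, which is exactly the guard rail described above.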