How To: Mass S3 Object Creation with Terraform

I've been working a bit with Terraform recently to help manage some of our infrastructure at the Mozilla Foundation. In doing so, I came across a problem I couldn't find a documented solution for, so today I'm publishing a short how-to in the hope that folks with similar problems find it helpful.

Let's say you have one hundred files in the "src" directory, and you need to upload them to S3. You could add a resource entry for each file, but that would be tedious and repetitive:


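locals {
  src_dir      = "${path.module}/src"
  # Assumes the bucket is defined elsewhere as aws_s3_bucket.bucket
  my_bucket_id = aws_s3_bucket.bucket.id
}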


resource "aws_s3_bucket_object" "index" {
  bucket        = local.my_bucket_id
  key           = "index.html"
  source        = "${local.src_dir}/index.html"
  content_type  = "text/html"
}

resource "aws_s3_bucket_object" "about" {
  bucket        = local.my_bucket_id
  key           = "about.html"
  source        = "${local.src_dir}/about.html"
  content_type  = "text/html"
}

resource "aws_s3_bucket_object" "main_css" {
  bucket        = local.my_bucket_id
  key           = "main.css"
  source        = "${local.src_dir}/main.css"
  content_type  = "text/css"
}

resource "aws_s3_bucket_object" "javascript" {
  bucket        = local.my_bucket_id
  key           = "main.js"
  source        = "${local.src_dir}/main.js"
  content_type  = "application/javascript"
}

resource "aws_s3_bucket_object" "favicon" {
  bucket        = local.my_bucket_id
  key           = "favicon.ico"
  source        = "${local.src_dir}/favicon.ico"
  content_type  = "image/x-icon"
}

resource "aws_s3_bucket_object" "header_image" {
  bucket        = local.my_bucket_id
  key           = "header.png"
  source        = "${local.src_dir}/header.png"
  content_type  = "image/png"
}

# and so on, 90+ more times

Thankfully, Terraform has the for_each meta-argument, which takes a map or a set of strings and creates one instance of the resource for each element. Pairing it with the built-in fileset function, which returns the set of file paths under a directory that match a pattern, lets us generate all these objects in far fewer lines of configuration:


locals {
  src_dir      = "${path.module}/src"
}

resource "aws_s3_bucket_object" "site_files" {
  # Enumerate all the files in ./src
  for_each = fileset(local.src_dir, "**")

  # Create an object from each
  bucket        = aws_s3_bucket.bucket.id
  key           = each.value
  source        = "${local.src_dir}/${each.value}"
  
  # Uh oh, what should we do here?
  # content_type  = ???
}
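
To see what for_each will iterate over, it can help to expose the result of fileset as an output. This is only a sketch: the output name is made up for illustration, and it assumes the same "src" directory next to the module.

```hcl
# Hypothetical output, for inspection only.
# fileset returns a set of paths relative to the base directory,
# with files in subdirectories reported as "subdir/name.ext".
output "site_file_paths" {
  value = fileset("${path.module}/src", "**")
}
```

Because the argument to for_each is a set of strings here, each.key and each.value are the same relative path, which is why each.value can serve as both the object key and the tail of the source path.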

There is one small problem, though, flagged by the commented-out line in the resource: content_type. How can we set the content type correctly for each file? Well, there are a few built-ins to help us out here. First, there's the lookup built-in, which returns the value for a given key in a map, or a default if the key isn't found. Let's define a content type map like so:

locals {
  content_type_map = {
    html        = "text/html",
    js          = "application/javascript",
    css         = "text/css",
    svg         = "image/svg+xml",
    jpg         = "image/jpeg",
    ico         = "image/x-icon",
    png         = "image/png",
    gif         = "image/gif",
    pdf         = "application/pdf"
  }
}

We can then use lookup to get the content type, keyed by the file extension. To extract the extension from the filename, we use the regex built-in function:

regex("\\.(?P<extension>[A-Za-z0-9]+)$", filename).extension
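
To make the behavior of the two built-ins concrete, here is a small sketch you could drop into a scratch module and inspect with terraform plan or terraform console. The sample filenames, the trimmed-down map, and the local and output names are all made up for illustration.

```hcl
locals {
  # Same pattern as above: capture the characters after the final dot.
  extension_pattern = "\\.(?P<extension>[A-Za-z0-9]+)$"

  # Hypothetical sample filenames, for illustration only.
  sample_files = ["index.html", "css/main.css", "js/app.min.js"]

  # Small stand-in for the content type map.
  sample_types = {
    html = "text/html"
    css  = "text/css"
    js   = "application/javascript"
  }
}

# With a named capture group, regex returns a map keyed by group name,
# so ".extension" picks out the captured text.
output "sample_extensions" {
  value = {
    for f in local.sample_files :
    f => regex(local.extension_pattern, f).extension
  }
}

# lookup falls back to its third argument when the key is missing.
output "sample_content_types" {
  value = {
    for f in local.sample_files :
    f => lookup(local.sample_types, regex(local.extension_pattern, f).extension, "application/octet-stream")
  }
}
```

Since the pattern is anchored to the end of the string, only the last extension is captured, so "js/app.min.js" yields "js" rather than "min".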

So, if we put it all together, we get:

locals {
  src_dir          = "${path.module}/src"
  content_type_map = {
    html        = "text/html",
    js          = "application/javascript",
    css         = "text/css",
    svg         = "image/svg+xml",
    jpg         = "image/jpeg",
    ico         = "image/x-icon",
    png         = "image/png",
    gif         = "image/gif",
    pdf         = "application/pdf"
  }
}

resource "aws_s3_bucket_object" "site_files" {
  # Enumerate all the files in ./src
  for_each = fileset(local.src_dir, "**")

  # Create an object from each
  bucket        = aws_s3_bucket.bucket.id
  key           = each.value
  source        = "${local.src_dir}/${each.value}"
  
  content_type  = lookup(local.content_type_map, regex("\\.(?P<extension>[A-Za-z0-9]+)$", each.value).extension, "application/octet-stream")
}

And there you have it! All the files in your "src" directory should now have an associated S3 object resource managed by Terraform, each with the appropriate content type, so you can serve the files via S3's static site functionality or via a CloudFront CDN.
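
One caveat with the configuration above: regex raises an error when the pattern does not match, so a file with no extension (a hypothetical "LICENSE", say) would make the plan fail instead of falling back to the default. A possible workaround, sketched below, is to wrap the extraction in try so that a failed match yields an empty string, and to lowercase the extension so "PNG" and "png" are treated alike. The local and resource names here are placeholders, the map is abbreviated, and aws_s3_bucket.bucket is assumed to be defined elsewhere, as in the examples above.

```hcl
locals {
  # Hypothetical names, chosen to avoid clashing with the locals above.
  site_src_dir = "${path.module}/src"

  # Abbreviated map for the sketch; extend as needed.
  site_content_types = {
    html = "text/html"
    css  = "text/css"
    js   = "application/javascript"
    png  = "image/png"
  }
}

resource "aws_s3_bucket_object" "site_files_safe" {
  for_each = fileset(local.site_src_dir, "**")

  # Assumes the bucket resource is declared elsewhere.
  bucket = aws_s3_bucket.bucket.id
  key    = each.value
  source = "${local.site_src_dir}/${each.value}"

  # try returns "" when regex finds no extension; "" is not a key in
  # the map, so lookup falls through to the default content type.
  content_type = lookup(
    local.site_content_types,
    try(lower(regex("\\.(?P<extension>[A-Za-z0-9]+)$", each.value).extension), ""),
    "application/octet-stream"
  )
}
```

If you are on version 4 or later of the AWS provider, note that aws_s3_bucket_object is deprecated there in favor of aws_s3_object; the arguments used in this post (bucket, key, source, content_type) should carry over unchanged.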