s3cmd Revisited

Still a work in progress, but here is what I have:

# 24 hours
s3cmd sync --no-preserve --cf-invalidate --add-header="Cache-Control: public, max-age=86400" [local directory path] s3://[bucket_name]/
# 1 hour
s3cmd modify --add-header="Cache-Control: public, max-age=3600" s3://[bucket_name]/categories/sports/index.html
s3cmd modify --add-header="Cache-Control: public, max-age=3600" s3://[bucket_name]/categories/technology/index.html
s3cmd modify --add-header="Cache-Control: public, max-age=3600" s3://[bucket_name]/index.html
s3cmd modify --add-header="Cache-Control: public, max-age=3600" s3://[bucket_name]/atom.xml

Let’s run down the options to the sync command:

  • --no-preserve: don't save filesystem attributes (ownership, permissions, timestamps) in S3 metadata
  • --cf-invalidate: invalidate the uploaded files in CloudFront
  • --add-header=…: explicitly set the Cache-Control header on the uploaded files

The sync is followed by four explicit modify commands that set a shorter max-age on my RSS feed and index pages.
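Those four modify commands can also be collapsed into a loop. A sketch, with a placeholder bucket name and an echo in front of s3cmd so it runs as a dry run:

```shell
BUCKET=my-bucket   # placeholder; substitute the real bucket name
n=0
for key in categories/sports/index.html \
           categories/technology/index.html \
           index.html \
           atom.xml
do
  # Dry run: print the command that would be issued. Drop the leading
  # 'echo' to actually run s3cmd modify against each object.
  echo s3cmd modify --add-header="Cache-Control: public, max-age=3600" "s3://$BUCKET/$key"
  n=$((n + 1))
done
```

Adding a new short-lived page then means appending one path to the list rather than copying a whole command line.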

I had previously planned to control my CloudFront cache behavior via the underlying S3 Cache-Control header. On reflection, I realized this wasn't the best choice: I want a short Cache-Control max-age so browsers check for new content, but I want CloudFront to cache files as long as possible.

My current plan is to use CloudFront's Minimum TTL to control the CloudFront cache behavior and the Cache-Control header to control browser cache behavior.
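My understanding of the interaction (worth double-checking against the CloudFront docs) is that CloudFront clamps the origin's max-age into the [Minimum TTL, Maximum TTL] range, so a high Minimum TTL keeps objects at the edge for a long time while browsers still see the short max-age. A quick arithmetic sketch with example values, not my real distribution settings:

```shell
max_age=3600         # from the Cache-Control header: browsers revalidate hourly
minimum_ttl=86400    # CloudFront Minimum TTL: 24 hours
maximum_ttl=31536000 # CloudFront Maximum TTL: 1 year

# Clamp max-age into [minimum_ttl, maximum_ttl] to get the edge cache lifetime.
effective=$(( max_age < maximum_ttl ? max_age : maximum_ttl ))
effective=$(( effective > minimum_ttl ? effective : minimum_ttl ))
echo "CloudFront keeps the object for ${effective}s; browsers revalidate after ${max_age}s"
```

With these numbers the edge holds the object for a day while browsers check back every hour, which is exactly the split I'm after.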

17 Oct: s3cmd Revisited x2