Amazon Simple Storage Service (S3) provides an online repository for files, all running on the same infrastructure that Amazon uses to run its own network of sites. Files can simply be stored, or made available on a URL for download. This makes it great for anything from backing up all your photos and documents to serving media files on your web site. In this post we'll look at the benefits and costs of using Amazon S3.
Creating an account with Amazon S3 allows you to create what they call 'Buckets' (you can think of these as top-level categories) in which you can place 'Objects' (images, documents etc). Each bucket is created in a single region that you choose (e.g. 'US West'), so you can keep data close to your visitors and optimise how quickly it gets served. Objects can be uploaded using the online console, third-party software such as CloudBerry, or the Developer API, which allows you to connect from your web site or other software.
There is a range of options for access levels to the objects in your buckets - each bucket and individual object can be made 'private' or 'public'. A private bucket would be used when you simply want to store or back up some files. Public buckets and objects can be made available on a URL which will look something like this:
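For illustration, public objects addressed in the 'virtual-hosted' style follow the pattern `https://<bucket>.s3.<region>.amazonaws.com/<key>`. Here is a minimal Python sketch that builds such a URL. The bucket name, key and default region are hypothetical placeholders, and the object only resolves if it has actually been made public.

```python
from urllib.parse import quote


def public_object_url(bucket, key, region="us-west-2"):
    """Build a virtual-hosted-style URL for a publicly readable S3 object.

    `bucket`, `key` and `region` are placeholders for your own values.
    """
    # Percent-encode characters such as spaces, but keep "/" so that
    # folder-like keys stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Hypothetical bucket and key, for illustration only.
print(public_object_url("my-example-bucket", "images/holiday photo.jpg"))
```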
S3 runs on a massive network of servers also used by Amazon's own network of sites, meaning it is extremely reliable, fast and cost-effective. For this reason, serving files such as large images on your site from S3 is an effective way of cutting costs and providing a better experience for your visitors. If you have a static web site, it's even possible to store and serve your entire site from S3.
Here are a few highlights of why S3 makes a good choice for data storage and serving files:
Costs for S3 are broken down into three areas, which can make trying to estimate your usage a little tricky...
Standard storage for the first TB of data per month will cost you $0.095 per GB. After this the rate gradually decreases until you go over 5000TB of data, at which point it costs $0.055 per GB.
Requests are charged at $0.004 per 10,000 GET requests (i.e. each time someone fetches one of your files via a URL).
This refers to the amount of data, in GB, that is transferred out of your buckets. The good news is that the first GB is free, but after that it's charged at $0.12 per GB up to 10TB per month.
So what does all this mean in practice?
If your web site equated to 1MB of CSS, scripts and images (let's say spread over 10 different files) all served from S3, your primary cost would be 'Data Transfer': 10,000 visitors would download around 10GB, which works out at around $1.20. Serving 10 files to each of those 10,000 visitors means 100,000 GET requests, which would cost you $0.04, and your storage cost would be negligible.
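The arithmetic above can be sketched as a small Python estimator. It hard-codes the first-tier rates quoted in this post (a snapshot, not current pricing), ignores the cheaper higher-volume tiers, and treats 1GB as 1,000MB to match the round figures used here. The function and parameter names are illustrative only.

```python
# First-tier rates quoted in this post, in USD. Treat them as a snapshot.
STORAGE_PER_GB_MONTH = 0.095   # first 1 TB / month of storage
GET_PER_10K_REQUESTS = 0.004   # per 10,000 GET requests
TRANSFER_OUT_PER_GB = 0.12     # up to 10 TB / month transferred out


def estimate_monthly_cost(storage_gb, get_requests, transfer_out_gb,
                          free_transfer_gb=0.0):
    """Rough S3 bill using first-tier rates only (no volume discounts)."""
    billable_transfer = max(transfer_out_gb - free_transfer_gb, 0.0)
    costs = {
        "storage": round(storage_gb * STORAGE_PER_GB_MONTH, 2),
        "requests": round(get_requests / 10_000 * GET_PER_10K_REQUESTS, 2),
        "transfer": round(billable_transfer * TRANSFER_OUT_PER_GB, 2),
    }
    costs["total"] = round(sum(costs.values()), 2)
    return costs


# 1MB site in 10 files, 10,000 visitors: 10GB out and 100,000 requests.
print(estimate_monthly_cost(storage_gb=0.001,
                            get_requests=10 * 10_000,
                            transfer_out_gb=10.0))
```

Passing `free_transfer_gb=1.0` applies the free first GB of transfer, which lowers the figure slightly.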
To give you a specific example: one of my client sites received around 7,000 visitors over a week, with approximately 30,000 page views. The site primarily acted as a portfolio and as a result featured hundreds of images averaging 100-500KB each, totalling around 200MB in storage. Over that week, site visitors chomped their way through over 20GB of data (averaging 2.85MB per visitor). The cost for all this?
|Charge||Usage||Cost ($)|
|$0.120 per GB - up to 10 TB / month data transfer out||26.792 GB||3.22|
|$0.004 per 10,000 GET and all other requests||600,414 Requests||0.24|
|$0.095 per GB - first 1 TB / month of storage used||0.236 GB-Mo||0.02|
Total Cost: $4.23 (inc. VAT)
To get an estimate of how much it will cost you each month, try using their usage calculator.
The short answer is no. But if you're concerned about running up a high bill there are a couple of things you can implement...
You can set up as many billing alerts as you like for different thresholds, so if your charges start getting high you can take a look at the logs and decide whether you need to take any further action. Billing alerts can be set up in the AWS console.
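Billing alerts can also be created from code rather than the console. The sketch below builds the arguments for a CloudWatch alarm on the `EstimatedCharges` metric in the `AWS/Billing` namespace. It assumes billing alerts are already enabled on the account and that an SNS topic exists to receive the notification. The topic ARN, alarm name format and helper name are hypothetical.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Arguments for CloudWatch's put_metric_alarm on estimated charges."""
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,               # evaluate over six-hour windows
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical topic ARN; one alarm per threshold you care about.
TOPIC = "arn:aws:sns:us-east-1:123456789012:billing-alerts"
alarms = [billing_alarm_params(t, TOPIC) for t in (5, 20, 50)]

# To create them (requires the boto3 package and AWS credentials;
# billing metrics are published in the us-east-1 region):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
#   for params in alarms:
#       cloudwatch.put_metric_alarm(**params)
```

Note that this alarm watches the estimated charges for the whole account, not just S3.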
One risk of allowing large media files to be accessed on a public URL is the potential for people to do what is known as 'hotlinking' to your content. Hotlinking is when people link directly to your files rather than hosting the files themselves. Why is this relevant? Imagine if someone featured a 'hotlinked' image from one of your buckets in an email campaign that went out to thousands of people, or on a web page that went viral - you could end up picking up the bill for all the requests and data transfer.
To prevent hotlinking you can use something called 'signed URLs', which require an additional query string to be appended to object requests. You can restrict how long an object is available on a URL with a given query string, meaning that if anyone 'hotlinks' to you, the link will only work for a short period of time before the signature expires and is no longer valid.
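As a rough sketch of how this looks in code, the helper below asks an S3 client for a time-limited URL using boto3's `generate_presigned_url` method. The helper name, bucket, key and five-minute expiry are assumptions for illustration. The object itself would stay private, so only the signed URL grants access.

```python
def signed_download_url(s3_client, bucket, key, expires_seconds=300):
    """Return a URL for one object that stops working after the expiry."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_seconds,   # lifetime of the link in seconds
    )


# Typical use (requires the boto3 package and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   url = signed_download_url(s3, "my-example-bucket", "images/large.jpg")
# Generate the URL each time the page is rendered so visitors always
# receive a fresh link, while copied links expire shortly afterwards.
```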
If you want to try out Amazon S3, they provide a free starter tier with 5GB of data storage and 20,000 GET requests.