Amazon EC2 grows 62% in 2 years

I estimated Amazon data center size about two years ago using a unique probing technique that I came up with. Since then, I have been tracking their growth (US East data center monthly, but less frequently for all data centers). Now is the time to give you all an update.

Physical server

I will not cover the technique again here, since you can refer to the original post. But I want to stress that this is measuring the number of physical server racks in their data centers, hence deducing the number of physical servers. There are other approaches, such as Netcraft that measures the web facing virtual servers. However, Netcraft only measures the number of virtual servers (and only a subset of it, those that are web facing), where a virtual server could be a tiny Micro instance, a very small slice of a physical server. If you want to know how big EC2 is physically, this is the definitive research.

The following figure shows the growth of the US East data center.

useastgrowthtrend

Number of server racks in EC2 US East data center

The growth in US East data center slowed down in late 2012 and 2013, but the growth has picked up quite a bit recently. It only added 1,362 racks between Mar. 12, 2012 and Dec. 29th, 2013, whereas, it has been adding on average 1,000 racks per year between 2007 and 2013. Then, all of a sudden, it adds 431 racks in the last month and half. However, other EC2 data centers have enjoyed tremendous growth in the two years period. The following table shows how many racks I can observe today, and at the end of last year vs. two years ago by each data center.

data center # of server racks on 3/12/2012 # of server racks on 12/29/2013 % growth 3/12/2012 to 12/29/2013 # of server racks on 2/18/2014 % growth 3/12/2012 to 2/18/2014
US East (Virginia) 5,030 6,382 26.9% 6,813 35.4%
US West (Oregon) 41 619 1410% 904 2205%
US West (N. California) 630 847 34.4% 950 50.8%
EU West (Ireland) 814 1,340 64.6% 1,556 191.2%
AP Northeast (Japan) 314 589 87.6% 719 229%
AP Southeast (Singapore) 246 371 50.8% 432 75.6%
SA East (Sao Paulo) 25 83 232% 122 488%
Total 7,100 10,231 44.1% 11,496 61.9%

There are a few observations:

1. The overall growth rate shows no sign of slowing down. From Jan. 2007 to Mar. 2012, EC2 grows from almost 0 server to 7,100 racks of servers, roughly 1,420 racks per year. From Mar. 2012 to Feb. 2014, EC2 grows from 7,100 racks to 11,496 racks, which is 2,198 racks per year.

2. Most of the growth is not from the US East data center. The Oregon data center grows the most at 2205%, followed by Sao Paulo at 488%.

3. There is a huge spike within the last 1.5 months. The number of racks increased from 10,231 to 11,496, adding 1,265 racks of servers.

The overall growth in the last two years is 62%, which is quite impressive. However, others have estimated that AWS revenue have been growing at a faster rate of more than 50% per year. The discrepancy could be due to the fact that AWS revenue includes many other AWS services including some new ones they have introduced in recent years, and EC2 is just a smaller component of it.

Virtual server growth

Another way to look at EC2’s growth is to look at how many virtual servers are running. Since a customer is paying for a virtual server, looking at the virtual server trend is also a good predictor of EC2 revenue.

As part of our probing technique, we enumerate all virtual servers, regardless whether it hosts a web server or not. If a virtual server is running, the EC2 DNS server will have an entry translating its external IP address to its internal IP address. By counting the number of DNS entries, we arrive at an upper bound of the number of virtual servers running (it is an upper bound because when a virtual server is terminated, the DNS entry is not deleted right away).

The following figure shows the number of running virtual servers (active DNS entries) in the US East Data center in orange. AWS also publishes the number of IP addresses that are available periodically, and we have been tracking that over time. The blue points shows how many IP addresses that are available to assign to virtual servers. AWS has been constantly adding more IP address allocation ahead of the expected growth.

AWS number of running virtual servers

EC2 number of running virtual servers

The green dots show the total available IP addresses across all data center. It is an upper bound on the maximum number of virtual servers EC2 can run. On Dec. 29th, 2013, our data shows there are up to 2.97 Million virtual machines that are active. You can put in an assumption of the average price AWS charges for an instance to roughly estimate EC2 revenue.

Density

From our data, we can also derive the density — the average number of virtual servers running on a physical server. On Mar. 12, 2012, there are 120 virtual servers running on each server rack. However, on Dec. 29th, 2013, this density has increased to 245 virtual servers per rack. Either the Micro instance is gaining popularity, or AWS has been doing a better job of consolidating their load to increase the profit margin.

Parting comment

I have not been blogging much in the last two years. You may be wondering what I have been doing. Well, I have been working on a startup, today we finally come out of stealth mode, and we are officially launching at the Launch Festival. It is an iPhone app, called Jamo, that brings dance games from Wii and Xbox to the iPhone. If this research has been helpful to you, please help me by downloading the App, and give us a 5* rating. You can read more about the App in a previous post.

Advertisements

Cloud is more secure than your own hard disk

I had several feedbacks from my last post on the Outlook Attachment Remover from my colleagues. The number one response is: “Do not put our client’s data there, even if encrypted, it is against the policy”. In this post, I will discuss why Cloud is secure and what a sensible company policy should be.

When CIO gives us the company laptop, we promise to take full responsibility for it. We are expected to set a strong password so that no one can logon to our machine, and we are expected to lock our screen whenever we are away. When clients send us their confidential data, they expect us to secure it in areas where only we have access to. We do not need client permission to store the data on our hard drive because we have promised to our CIO and our clients that we will guard our laptop and hard drive.

When we request a bucket from Amazon S3, the bucket, by default, is readable/writable by us only. Similar to a password, our access to the bucket is guarded by our Amazon credential, which includes both a 20 alpha-numerical characters of Access Key ID and a 40 alpha-numerical characters of Secret Access Key. We promise to keep the Keys to ourselves and Amazon promises the access right works as designed. So, just like our hard disk, the bucket is ours and ours alone. Why should not we be able to store our and client data there? Why do we need client permission?

As much as we promise, accidents do happen. Our laptop could be infected with virus and Trojan horses, we could lose our laptops, Amazon security could be breached. In the past year alone, I know at least two incidents where our company laptops were stolen. In contrast, I have not heard ANY S3 security breach since they launched their service three years ago. It is a more dramatic contrast than you think because S3 has millions of customers and it hosts 29 billion objects, whereas, our company has much fewer employees and far fewer number of laptops. So, is our hard disk more secure than S3?

Since no one can say their system is 100% secure, we have to put in measures to guard against the rare events. Our company laptop has encryption software installed. When the laptop is lost, we are safe because no one can read the data.

 Now, if I encrypt my email attachments, including client data, and put them in my own S3 bucket that is readable/writable by me only, and hold on to the password to myself, why would I need client permission? Why is it not secure? Why is it against the company’s policy? If anything, based on the past track record, CIO should ban us from storing data on our hard drive instead.

An “unlimited” email inbox in the Cloud

Do you work for one of those stingy companies who only give you a tiny email Inbox? My CIO gives me 120MB, which runs out two month after I joined the company. Even if you have a bigger one, it will run out fast enough because everyone likes to send large attachments around.

If you are like me, you will spend hours each week cleaning up your Inbox, archiving your emails, and backing up all your data. Well, I am happy to report that help is finally here. There is a new Outlook Attachment Remover from a startup that can detach your attachments and embedded images and put them on the Amazon Cloud (i.e., S3). For $0.15/GB/month (the price Amazon charges), you can get rid of all the hassles and have unlimited storage.

When I first started using their software, I did a few experiments to see how well it performs. I have an archive folder which has 12000 messages and it is about 890MB in size. The size is small because I deleted most attachments before I put the emails into my archive folder. After I converted them, my archive folder shrinks to about 400MB, which is very impressive since I did not have many attachments in the folder. I guess those embedded images take quite a bit of space.

Next I ran the same test on my working folder. My working folder has all important emails that I need to keep around for reference. They all have their original attachments because that is the reason I keep them in my working folder. It has 400 messages at about 200MB. After the conversion, it is at 25MB. Wow! That is a size that I can fit into the mailbox my CIO gives me.

After I converted my old mail, I just enabled the “auto-detach” option. So whenever a new mail arrives, the attachments are automatically stripped and stored in Amazon. If I want to convert it back for whatever reason, all I have to do is click the “re-attach” button.

I have been using the product for a few weeks and I am quite happy with it. I hope you find the tool useful too.