More than half a million Android Wear watches sold


Today, Android Wear claims the top spot as the most popular smart watch platform, with more than 500,000 smart watches sold.

According to Google Play, the Android Wear app shows more than 500,000 downloads for the first time. An Android Wear watch requires the Android Wear app in order to pair with a smart phone, so everyone with an Android Wear watch must download the app. Conversely, the app is useless without a watch, so few people without one would download it. Thus, the number of app downloads should be a good approximation of how many Android Wear watches have been sold.

Back in March, it was reported that more than 400,000 Pebble watches had been sold. Compared to Pebble, Android Wear has not only sold more watches, but it did so in a much shorter amount of time — a mere 5 months since its launch in July.

Although Android Wear leads today, it is widely expected that the Apple Watch will shatter the record. Some estimate that Apple could sell 20 or 30 million Apple Watches in the first year.

Some people still doubt that the smart watch market will take off; hopefully the Android Wear growth data provides a good counterargument. Another reason some are skeptical is their experience with simpler wearable devices, such as activity trackers. Many people leave their trackers in a drawer collecting dust after the novelty wears off. But I see smart watches as fundamentally different. With an open platform, creative developers will come up with interesting apps that increase the usefulness of smart watches, and we all know that it is those novel apps that keep us engaged with a device. If you are looking for proof, check out our fitness app or golf app, which show some creative uses of your watch. We see extremely high retention rates from users of our watch apps compared to our phone app. I bet that once you discover a killer watch app, you will never think about putting your watch in a drawer again.


How Apple is planning to pump up the smart watch market

Some people say smart watches, like Android Wear watches or the Apple Watch, are a product with no market and no need. The reasoning is that we can already do everything a smart watch can do with our smart phone. So how could smart watches find widespread adoption? I think Apple already has the successful formula in mind.

Market it as a fashion item

A smart watch is distinct from a smart phone in that it is worn on the wrist. Since we often wear fashion items on our wrists, people will buy a watch for fashion alone. This is how the Swiss luxury watch industry survived the onslaught of cheap Japanese quartz competitors. Apple understands this lesson well. Unlike the original iPhone, which launched with one model in one color, the Apple Watch comes in 2 sizes and 6 different custom alloys (including 2 18k gold alloys), with 6 different bands each available in several colors, all grouped into 3 distinct editions. It is a nightmare for supply chain SKU management, but that is what it takes to build a fashion item.

Migrate existing use cases, making them more convenient

All watch makers understand this point. They have all made glanceable notifications a top priority. Instead of pulling out your phone for each incoming email or text, you can simply glance down at your watch.

While Apple is doing the same thing, it is going one step further. Apple purchased Beats not only for its music streaming service, but more importantly for its headphone business. They are reportedly making Apple-branded Bluetooth headphones, and I think these are designed for the Apple Watch. Listening to music is going to be an important use case, and it is not limited to young folks; I also see many people who like to listen to their iPods during workouts. When you strap your iPod to your arm, a wired earphone may be acceptable, but if you were to listen to music from an Apple Watch on your wrist, the wire would be a significant disadvantage. Bluetooth headphones will greatly improve the experience. With Beats' existing base of young users, Apple can easily move these folks off the iPod and onto the Apple Watch. My prediction is that the Apple Watch will function as a stand-alone music player even when it is not paired with an iPhone.

Create new use cases

A key difference between a smart watch and a smart phone is that the watch is worn on your wrist, so it is finally able to track your fitness intimately. Apple emphasized its use in health and fitness, and has designed a built-in fitness tracking app aimed at replacing activity trackers such as Fitbit or Jawbone. But its tracking capability goes beyond a simple fitness tracker's.

For example, our Jamo Dance product (Jamo Dance on Android) could use the Apple Watch to track your dance moves. Our VimoFit product demonstrated that you can track in-home and in-gym exercises and automatically count repetitions with a watch's built-in motion sensors, and you can even use our VimoGolf product to track your golf swing; neither is possible with just a mobile phone.

With the focus on the right design concepts and on key use cases, I think Apple has cracked the code on how to jump-start the smart watch market. I would not be surprised if it sells tens of millions of Apple Watches in the first year. I would be the first in line to purchase one.

 

Amazon EC2 grows 62% in 2 years

I estimated Amazon's data center size about two years ago using a unique probing technique that I came up with. Since then, I have been tracking their growth (the US East data center monthly, all data centers less frequently). Now is the time to give you all an update.

Physical server growth

I will not cover the technique again here, since you can refer to the original post. But I want to stress that this measures the number of physical server racks in their data centers, and hence deduces the number of physical servers. There are other approaches, such as Netcraft's, which measures web-facing virtual servers. However, Netcraft only counts virtual servers (and only the subset that is web facing), and a virtual server could be a tiny Micro instance, a very small slice of a physical server. If you want to know how big EC2 is physically, this is the definitive research.

The following figure shows the growth of the US East data center.


Number of server racks in EC2 US East data center

The growth in the US East data center slowed down in late 2012 and 2013, but it has picked up quite a bit recently. US East added only 1,352 racks between Mar. 12, 2012 and Dec. 29, 2013, whereas it had been adding on average 1,000 racks per year between 2007 and early 2012. Then, all of a sudden, it added 431 racks in the last month and a half. The other EC2 data centers, however, have enjoyed tremendous growth over the two-year period. The following table shows how many racks I observe in each data center today, at the end of last year, and two years ago.

| data center | # of server racks on 3/12/2012 | # of server racks on 12/29/2013 | % growth 3/12/2012 to 12/29/2013 | # of server racks on 2/18/2014 | % growth 3/12/2012 to 2/18/2014 |
| --- | --- | --- | --- | --- | --- |
| US East (Virginia) | 5,030 | 6,382 | 26.9% | 6,813 | 35.4% |
| US West (Oregon) | 41 | 619 | 1,410% | 904 | 2,105% |
| US West (N. California) | 630 | 847 | 34.4% | 950 | 50.8% |
| EU West (Ireland) | 814 | 1,340 | 64.6% | 1,556 | 91.2% |
| AP Northeast (Japan) | 314 | 589 | 87.6% | 719 | 129% |
| AP Southeast (Singapore) | 246 | 371 | 50.8% | 432 | 75.6% |
| SA East (Sao Paulo) | 25 | 83 | 232% | 122 | 388% |
| Total | 7,100 | 10,231 | 44.1% | 11,496 | 61.9% |

There are a few observations:

1. The overall growth rate shows no sign of slowing down. From Jan. 2007 to Mar. 2012, EC2 grew from almost zero servers to 7,100 racks of servers, roughly 1,420 racks per year. From Mar. 2012 to Feb. 2014, EC2 grew from 7,100 racks to 11,496 racks, roughly 2,198 racks per year.

2. Most of the growth is not from the US East data center. The Oregon data center grew the most, at 2,105%, followed by Sao Paulo at 388%.

3. There is a huge spike within the last 1.5 months. The number of racks increased from 10,231 to 11,496, adding 1,265 racks of servers.

The overall growth in the last two years is 62%, which is quite impressive. However, others have estimated that AWS revenue has been growing at a faster rate of more than 50% per year. The discrepancy could be due to the fact that AWS revenue includes many other AWS services, including some new ones introduced in recent years, and EC2 is just a smaller component of it.

Virtual server growth

Another way to look at EC2's growth is to look at how many virtual servers are running. Since customers pay per virtual server, the virtual server trend is also a good predictor of EC2 revenue.

As part of our probing technique, we enumerate all virtual servers, regardless of whether they host a web server or not. If a virtual server is running, the EC2 DNS server will have an entry translating its external IP address to its internal IP address. By counting the number of DNS entries, we arrive at an upper bound on the number of virtual servers running (it is an upper bound because when a virtual server is terminated, the DNS entry is not deleted right away).
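To illustrate the per-address check, here is a minimal sketch; the address shown is just an example, and the lookup must run from inside EC2 so that the name resolves to the internal address:

```python
# Minimal sketch: if an instance is running at a public IP, its public DNS
# name resolves (from inside EC2) to an internal 10.x.x.x address.
import socket

def internal_ip(public_ip):
    # EC2 public DNS names are derived mechanically from the public IP
    name = "ec2-%s.compute-1.amazonaws.com" % public_ip.replace(".", "-")
    try:
        return socket.gethostbyname(name)   # 10.x.x.x if an instance is running
    except socket.gaierror:
        return None                         # no DNS entry, no running instance

print(internal_ip("50.17.204.150"))
```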

The following figure shows, in orange, the number of running virtual servers (active DNS entries) in the US East data center. AWS also periodically publishes the number of IP addresses available, and we have been tracking that over time. The blue points show how many IP addresses are available to assign to virtual servers. AWS has constantly added IP address allocations ahead of the expected growth.


EC2 number of running virtual servers

The green dots show the total available IP addresses across all data centers; this is an upper bound on the maximum number of virtual servers EC2 can run. On Dec. 29, 2013, our data shows up to 2.97 million active virtual machines. You can plug in an assumption about the average price AWS charges per instance to roughly estimate EC2 revenue.
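As a back-of-the-envelope illustration, with a blended hourly price that is purely my assumption and not a measured number:

```python
# Rough EC2 revenue estimate from the instance count. The instance count is
# our Dec. 29, 2013 upper bound; the blended hourly price is an assumption.
active_vms = 2.97e6         # upper bound on running virtual servers
price_per_hour = 0.10       # assumed blended $/instance-hour
hours_per_year = 24 * 365

annual_revenue = active_vms * price_per_hour * hours_per_year
print("~$%.1fB per year" % (annual_revenue / 1e9))   # ~$2.6B per year
```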

Density

From our data, we can also derive the density — the average number of virtual servers running on each physical server. On Mar. 12, 2012, there were 120 virtual servers running on each server rack. By Dec. 29, 2013, this density had increased to 245 virtual servers per rack. Either the Micro instance is gaining popularity, or AWS has been doing a better job of consolidating load to increase its profit margin.
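Combined with the 64-servers-per-rack assumption used throughout this research, the density per physical machine works out as follows:

```python
# Virtual-to-physical density under the 64-servers-per-rack assumption.
servers_per_rack = 64      # assumption used throughout this research
vms_per_rack_2012 = 120    # observed Mar. 12, 2012
vms_per_rack_2013 = 245    # observed Dec. 29, 2013

print(vms_per_rack_2012 / servers_per_rack)   # ~1.9 VMs per physical server
print(vms_per_rack_2013 / servers_per_rack)   # ~3.8 VMs per physical server
```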

Parting comment

I have not been blogging much in the last two years. You may be wondering what I have been doing. Well, I have been working on a startup; today we finally came out of stealth mode, and we are officially launching at the Launch Festival. It is an iPhone app, called Jamo, that brings dance games from the Wii and Xbox to the iPhone. If this research has been helpful to you, please help me by downloading the app and giving us a 5-star rating. You can read more about the app in a previous post.

Amazon DynamoDB use cases

In-memory computing is clearly hot. It is reported that SAP HANA has been “one of SAP’s more successful new products — and perhaps the fastest growing new product the company ever launched”. Similarly, I have heard that Amazon DynamoDB is also a rapidly growing product for AWS. Part of the reason is that the price of in-memory technology has dropped significantly, both for SSD flash memory and traditional RAM, as shown in the price trend graph in Hasso Plattner and Alexander Zeier's book (page 15).

In-memory technology offers both higher throughput and lower latency, so it could potentially satisfy a range of latency-sensitive or bandwidth-hungry applications. To understand DynamoDB's sweet spots, we looked into many areas where it could be used, and we concluded that DynamoDB does not make sense for applications that primarily want higher throughput, but it does make sense for a portion of the applications that want lower latency. This post describes our reasoning when investigating DynamoDB; I hope it helps those of you who are considering adopting the technology.

Let us start by examining a couple of broad classes of applications, and see which might be a good fit for DynamoDB.

Batch applications

Batch applications are those with a large volume of data that needs to be analyzed. Typically, the latency requirement is less stringent: many batch applications can run overnight or even longer before the report is needed. However, there is a strong requirement for high throughput due to the volume of data. Hadoop, a framework for batch applications, is a good example. It cannot guarantee low latency, but it can sustain high throughput through horizontal scaling.

For data-intensive applications, such as those targeted by the Hadoop platform, it is easy to scale the bandwidth. Because there is an embarrassing amount of parallelism, you can simply add more servers to the cluster to scale out the throughput. Given that it is feasible to get high bandwidth both through in-memory technology and through disk-based technology with horizontal scaling, it comes down to a price comparison.

The RAMCloud project has argued that in-memory technology is actually cheaper in certain cases. As the RAMCloud paper notes, even though hard drive prices have also fallen over the years, the IO bandwidth of a hard disk has not improved much. If you want to access each data item frequently, you simply cannot fill up the disk; otherwise, you will choke the disk's IO interface. For example, the RAMCloud paper calculates that if you fill up a modern disk, you can access each data item only about 6 times a year on average (assuming random access of 1 KB blocks). Since you can only use a small portion of a hard disk if you need high IO throughput, your effective cost per bit goes up, and at some point it is more expensive than an in-memory solution. A figure in the RAMCloud paper shows where each technology is the cheapest solution: when the data set is relatively small and the IO requirement is high, in-memory technology is the winner.
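To see where that access-rate figure comes from, here is the calculation under illustrative assumptions (a 1 TB disk sustaining 200 random 1 KB reads per second; roughly the right ballpark, not the paper's exact parameters):

```python
# Back-of-the-envelope version of the RAMCloud access-rate argument. The
# disk size and IOPS figures below are illustrative assumptions.
disk_size_bytes = 1e12       # assumed 1 TB disk, filled with data
block_size_bytes = 1e3       # random accesses of 1 KB blocks
iops = 200                   # assumed sustainable random reads per second
seconds_per_year = 365 * 24 * 3600

blocks = disk_size_bytes / block_size_bytes     # 1e9 blocks on the disk
accesses_per_year = iops * seconds_per_year     # ~6.3e9 random accesses
print(accesses_per_year / blocks)               # ~6 accesses per block per year
```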

The key to RAMCloud's argument is that you cannot fill up a disk, so the effective cost is higher. However, this argument does not apply in the cloud. You pay AWS for the actual storage space you use, and you do not care that a large portion of the disk is empty. In effect, you count on getting a higher access rate to your data at the expense of other people's data getting a lower access rate (this is certainly true for some of my data in S3, which I have not accessed even once since I started using AWS in 2006). In our own tests, we got a very high throughput rate from both S3 and SimpleDB (by spreading the data over many domains). Although there is no guarantee on access rate, S3 costs 1/8, and SimpleDB 1/4, as much as DynamoDB, making both attractive alternatives for batch applications.

In summary, if you are deploying in house, where you pay for the infrastructure, it may make economic sense to use in-memory technology for your batch applications. However, in a hosted cloud environment, where you only pay for the storage you actually use, in-memory technology such as DynamoDB is less likely to be a candidate for batch applications.

Web applications

We have argued that bandwidth-hungry applications are not a good fit for DynamoDB, because a disk-based solution that leverages shared bandwidth in the cloud is cheaper. But let us look at another type of application — web applications — which may value the lower latency offered by DynamoDB.

Interactive web applications

First, let us consider an interactive web application, where users may create data on your website and then query that data in many different forms. Our work on Gamification typically involves this kind of application. For example, in Steptacular (our previous Gamification project in health care/wellness), users upload their walking history, then query their history in many different formats and look at their friends' actions.

For our current Gamification project, we seriously considered using DynamoDB, but in the end, we concluded that it is not a good fit for two reasons.

1. Immaturity of ORM tools

Many web applications are developed using an ORM (Object Relational Mapping) tool, because an ORM tool shields you from the complexity of the underlying data store, allowing developers to be more productive. Rails' ActiveRecord is the best I have seen: you define your data model in just one place. Unlike earlier ORM tools, such as Hibernate for Java, you do not even have to explicitly define a mapping in an XML file; all the mapping is done automatically.

Even though the Amazon SDK comes with an ORM layer, its feature set is far behind mature ORM tools. People are developing more complete ORM tools, but the missing features in DynamoDB (e.g., no auto-increment ID field support) and the wide ground to cover for each programming language mean that it could be a while before this field matures.

2. Lack of secondary index

The lack of secondary index support makes DynamoDB a no-go for the majority of interactive web applications. These applications need to present data along many different dimensions, and each dimension needs an index for efficient queries.

AWS recommends that you duplicate data across tables so that you can always query efficiently through a primary index. Unfortunately, this is not really practical. It requires multiple writes on every data input, which is not only a performance killer but also creates a coherence management nightmare. The coherence problem is difficult to get around. Consider a failure scenario where you successfully write the first copy, but then fail while updating the second table with its different index structure. What do you do in that case? You cannot simply roll back the first write because, like many other NoSQL data stores, DynamoDB does not support transactions. So you end up in an inconsistent state.
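To make that failure window concrete, here is a minimal sketch of the dual-table write, written against the modern boto3 API purely for illustration; the table and attribute names are hypothetical:

```python
# Sketch of the dual-write hazard. Both tables hold the same item under
# different primary keys; a crash between the two put_item calls leaves
# them inconsistent, and without transactions there is nothing to roll back.
import boto3

dynamodb = boto3.resource("dynamodb")
by_user = dynamodb.Table("steps_by_user")   # hypothetical; keyed on user_id
by_date = dynamodb.Table("steps_by_date")   # hypothetical; keyed on date

def record_steps(user_id, date, steps):
    item = {"user_id": user_id, "date": date, "steps": steps}
    by_user.put_item(Item=item)  # write #1 succeeds...
    by_date.put_item(Item=item)  # ...write #2 can fail: copies now disagree
```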

Hybrid web/batch applications

Next, let us consider a different type of web application, which I refer to as the Google-search-type web application. This type of application has little or no data input from the web front end, or if it takes data from the front end, that data is not queried over more than one dimension. In other words, this type of application is mostly read-only. The data it queries may come from a different source, such as web crawling, and a batch process loads the data, possibly into many tables with different indexes. Consistency is not an issue here, because the batch process can simply retry without worrying about data getting out of sync, since there are no other concurrent writes. The beauty of this type of application is that it can easily get around DynamoDB's feature limitations and yet benefit from the much reduced latency to improve interactivity.

Many applications fall into this category, including BI (Business Intelligence) applications and many visualization applications. Part of the reason SAP HANA is taking off is the demand from BI applications for faster, interactive queries. I think the same demand is probably driving DynamoDB adoption.

What types of applications are you deploying on DynamoDB? If you are deploying an interactive web application or a batch application, I would really like to hear from you to understand the rationale.

Amazon data center size

(Edit 3/16/2012: I am surprised that this post has been picked up by a lot of media outlets. Given the strong interest, I want to emphasize what is measured and what is derived. The number of server racks in EC2 is what I directly observe. By assuming 64 physical servers per rack, I can derive a rough server count, but remember that this is an *assumption*. Check the comments below: some think that AWS uses 1U servers; others think that AWS is less dense. Obviously, with a different assumption, the estimated server count would be different. For example, if a credible source tells you that AWS uses 36 1U servers in each rack, the number of servers would be 255,600. An additional note: please visit my disclaimer page. This is a personal blog; it represents only my personal opinion, not my employer's.)

Similar to the EC2 CPU utilization rate, another piece of secret information Amazon will never share with you is the size of its data centers. But a glimpse is really informative, because Amazon is clearly a leader in this space, and its growth rate is a great indicator of how well the cloud industry is doing.

Although Amazon would never tell you, I have figured out a way to probe for its size. There have been earlier guesstimates of how big Amazon's cloud is, and there are even tricks to figure out how many virtual machines are started in EC2, but this is the first time anyone can estimate the real size of Amazon EC2.

The methodology is fully documented below for inquisitive minds. If you are one of them, read it through and feel free to point out any flaws. For those of you who just want the numbers: Amazon has a pretty impressive infrastructure. The following table shows the number of server racks and physical servers in each of Amazon's data centers as of Mar. 12, 2012. The column on server racks is what I directly probed (see the methodology below); the column on the number of servers is derived by assuming 64 blade servers per rack.

| data center | # of server racks | # of blade servers |
| --- | --- | --- |
| US East (Virginia) | 5,030 | 321,920 |
| US West (Oregon) | 41 | 2,624 |
| US West (N. California) | 630 | 40,320 |
| EU West (Ireland) | 814 | 52,096 |
| AP Northeast (Japan) | 314 | 20,096 |
| AP Southeast (Singapore) | 246 | 15,744 |
| SA East (Sao Paulo) | 25 | 1,600 |
| Total | 7,100 | 454,400 |

The first key observation is that Amazon now has close to half a million servers, which is quite impressive. The other observation is that the US East data center, being the first, is much bigger than the rest. This means it is hard to compete with Amazon on scale in the US, but in other regions the entry barrier is lower. For example, Sao Paulo has only 25 racks of servers.

I also show the growth of Amazon's infrastructure over the past 6 months below. I only collected data for the US East data center because it is the largest and most popular data center. The Y axis shows the number of server racks in the US East data center.

EC2 US East data center growth in the number of server racks

Besides the size, the growth rate is also pretty impressive: the US East data center has been adding roughly 110 racks of servers each month. The growth looks roughly linear, although recently it has shown signs of slowing down.

Probing methodology

Figuring out EC2's size is not trivial. Part of the reason is that EC2 provides virtual machines, and it is difficult to know how many virtual machines are active on a physical host. Thus, even if we can determine how many virtual machines there are, we still cannot figure out the number of physical servers. So instead of counting servers directly, our methodology probes for the number of server racks.

It may sound even harder to probe for the number of server racks. Luckily, EC2 uses a regular pattern of IP address assignment, which can be exploited to correlate IP addresses with server racks. I noticed the pattern by looking at a lot of instances I launched over time and by running traceroutes between my instances. The pattern is as follows:

  • Each EC2 instance is assigned an internal IP address in the form 10.x.x.x.
  • Each server rack is assigned a 10.x.x.x/22 IP address range, i.e., all virtual machines running on that rack have the same 22-bit IP prefix.
  • A 10.x.x.x/22 range has 1,024 IP addresses; the first 256 are reserved for Dom0 virtual machines (the system management VM in Xen), and only the last 768 are used for customers' instances.
  • Within the first 256 addresses, the two at 10.x.x.2 and 10.x.x.3 are reserved for the rack's routers, arranged in a load-balanced, fault-tolerant configuration to route traffic in and out of the rack. I verified that the uplink capacity from 10.x.x.2 and 10.x.x.3 is roughly 2 Gbps total, further suggesting that they are routers, each with a 1 Gbps uplink.

Understanding the pattern allows us to deduce how many racks there are. In particular, if we know a virtual machine has a certain internal IP address (e.g., 10.2.13.243), then we know there is a rack using the corresponding /22 address range (e.g., a rack at 10.2.12.x/22). If we take this to the extreme, where we know the IP address of at least one virtual machine on each rack, then we can see all racks in EC2.

So how can we learn the IP addresses of a large number of virtual machines? You could launch a large number of virtual machines and record the internal IP addresses you get, but unless you are RightScale, where a large number of instances are launched through your service, that is going to be too costly. Another approach is to scan the whole IP address space and see which addresses respond to a ping. There are several problems with this approach. First, it may be considered port scanning, which is a violation of AWS's policy. Second, not all live instances respond to ping, especially with AWS's security groups blocking all ports by default. Lastly, the whole 10.x.x.x address space is huge and would take a considerable amount of time to scan.

While you may be discouraged at this point, it turns out there is another way. In addition to the internal IP address, each AWS instance also has an external IP address. Although we cannot scan the external IP addresses either (so as not to violate the port scanning policy), we can leverage DNS translation to figure out the internal IP addresses. If you query DNS for an EC2 instance's public DNS name from inside EC2, the DNS server returns its internal IP address (query it from outside EC2 and you get the external IP instead). So all that is left is to get a large number of EC2 instances' public DNS names. Luckily, we can easily derive the list of public DNS names, because they are directly tied to the external IP addresses: an instance at external IP address x.y.z.w (e.g., 50.17.204.150) has a public DNS name of the form ec2-x-y-z-w.compute-1.amazonaws.com in the US East data center (e.g., ec2-50-17-204-150.compute-1.amazonaws.com). To enumerate all public DNS names, we just have to find all public IP addresses, which is easy because EC2 publishes the public IP ranges it uses.

Once we have determined the number of server racks, we just multiply by the number of physical servers per rack. Unfortunately, we do not know how many physical servers are in each rack, so we have to make an assumption. I assume Amazon has dense racks, where each rack has four 10U chassis and each chassis holds 16 blades, for a total of 64 blades per rack.

Let us recap how we find all server racks; a code sketch of the full recipe follows the list.

  • Enumerate all public IP addresses EC2 uses.
  • Translate each public IP address to its public DNS name (e.g., ec2-50-17-204-150.compute-1.amazonaws.com).
  • Run a DNS query inside EC2 to get its internal IP address (e.g., 10.2.13.243).
  • Derive the rack's IP range from the internal IP address (e.g., 10.2.12.x/22).
  • Count how many unique racks we have seen, then multiply by the number of physical servers in a rack (I assume 64 servers/rack).
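Putting the recipe together, a minimal sketch looks like this. It must run inside EC2; the IP range shown is only an illustrative slice of the published list, and 64 servers per rack is the same assumption as above:

```python
# Minimal sketch of the rack-counting recipe. Must run inside EC2 so that
# public DNS names resolve to internal 10.x.x.x addresses. The range below
# is an illustrative slice; the real published list is much longer.
import ipaddress
import socket

SERVERS_PER_RACK = 64  # assumption: 4 chassis x 16 blades per rack

def count_racks(public_ranges):
    racks = set()
    for cidr in public_ranges:
        for ip in ipaddress.ip_network(cidr).hosts():
            name = "ec2-%s.compute-1.amazonaws.com" % str(ip).replace(".", "-")
            try:
                internal = socket.gethostbyname(name)  # 10.x.x.x if running
            except socket.gaierror:
                continue                # no DNS entry: no instance here
            # collapse the internal address to its /22 rack prefix
            rack = ipaddress.ip_network(internal + "/22", strict=False)
            racks.add(rack)
    return len(racks)

n = count_racks(["50.17.204.0/24"])     # example slice of a published range
print(n, "racks, ~", n * SERVERS_PER_RACK, "servers")
```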

Caveat

Even though my methodology provides insights that were not possible before, it has shortcomings that could lead to inaccurate results. The limitations are:

  • The methodology requires an active instance on a rack for the rack to be observed. If a rack has no instances running on it, we cannot count it.
  • We cannot know how many physical servers are in a rack. I assume Amazon has dense racks, where each rack has four 10U chassis and each chassis holds 16 blades.
  • My methodology cannot tell whether the racks I observe are for EC2 only. Other AWS services (such as S3, SQS and SimpleDB) may run on virtual servers on the same set of racks. It is also possible that they run on dedicated racks, in which case AWS is bigger than what I can observe. So what I am observing is only a lower bound on the size of AWS.

Launch a new site in 3.5 weeks with Amazon

Getting started quickly is one of the reasons people adopt the cloud, and that is why Amazon Web Services (AWS) is so popular. But people often overlook the fact that the retail side of Amazon is also amazing. If your project involves a supply chain, you can also leverage Amazon retail to get up and running quickly.

We recently launched a wellness pilot project at Accenture where we leveraged both Amazon retail and Amazon Web Services. The Steptacular pilot is designed to encourage Accenture US employees to lead a healthy lifestyle. We all make our New Year's resolutions, but we always procrastinate, and we never exercise as much as we should. Why? Because there is a lack of motivation and engagement. The Steptacular pilot uses a pedometer to track a participant's physical activity, then leverages concepts from Gamification, using social incentives (peer pressure) and monetary incentives to constantly engage participants. I will talk about the pilot and its results in detail in a future post; in this post, let me share how we were able to launch within 3.5 weeks, the key capabilities we leveraged from Amazon, and some lessons we learned from the experience.

Supply chain side

The Steptacular pilot requires participants to carry a pedometer to track their physical activity. This is the first step in increasing engagement — using technology to alleviate the hassle of manual (and inaccurate) entry. We quickly settled on the Omron HJ-720 model because it is low cost and has a USB connector, so we could automate the step upload process.

We got in touch with Omron. The folks at Omron are super nice. Once they learned what we were trying to do, they immediately approved us as a reseller, which means we could buy pedometers at the wholesale price. Unfortunately, we still had to figure out how to get the devices into our participants' hands. Accenture is a distributed organization with 42 offices in the US alone. To make matters worse, many consultants work from client sites, so distributing in person was not feasible. We seriously considered three options:

  1. Ask our participants to order directly from Amazon. This is the solution we chose in the end, after connecting with the Amazon buyer in charge of the Omron pedometer and being assured that Amazon would have no problem handling the volume. It turned out that this not only saved us a significant amount of shipping hassle, but it was also very cost effective for our participants.
  2. Be a vendor ourselves and use Amazon for the supply chain. Although I did not know about it before, I was pleasantly surprised to learn about the Fulfillment by Amazon capability. This is Amazon's cloud for the supply chain. Like a cloud, it is provided as a service: you store your merchandise in Amazon's warehouses, and they handle inventory and shipping. Also like a cloud, it is pay per use with no long-term commitment. Although equally good at reducing hassle for us, we did not find that it would save us money. Amazon retail is so efficient, and has such a small margin, that we could not compete even at a 0% margin and even though we (supposedly) pay the same wholesale price.
  3. Ship and manage the devices ourselves. The only way we could be cheaper is if we managed the supply chain and shipping logistics ourselves, and of course this assumes we work for free. The amount of work is huge, however, and none of us wants to lick envelopes for a few weeks, definitely not for free.

The pilot officially launched on Mar. 29. Besides Amazon itself, another Amazon affiliate, J&R Music, also sells the same pedometer on Amazon's website. Within a few minutes, our participants managed to totally drain J&R's stock; Amazon, however, remained in stock for the whole duration. Within a week, they sold roughly 3,000 pedometers. I am sure J&R is still mystified by the sudden surge in demand. If you are from J&R, my apologies for not giving adequate warning ahead of time, and kudos to you for not overcommitting your stock like many TouchPad vendors did recently (I am one of those burned by OnSale).

In addition to managing device distribution, we also had to figure out how to subsidize our participants. Our sponsors agreed to subsidize each pedometer by $10 to ease adoption, but we could not just write each participant a $10 check — that is too much work. Again, Amazon came to the rescue, with two options. One: Amazon could generate a batch of one-time-use $10 discount codes tied specifically to the pedometer product, then bill us for the total based on how many were redeemed. Two: we could buy $10 gift cards in bulk and distribute them to our participants electronically. We ultimately chose the gift card option for its flexibility, and also because a gift card is not considered a discount, so the device would still cost more than $25 and our participants would qualify for super saver shipping. Looking back, I do regret choosing the gift card option, because managing squatters turned out to be a big hassle, but that is not Amazon's fault; it is just human nature.

Technology platform side

It is a no-brainer to use Amazon for its scaling capabilities, especially for a short, quick project like ours. One key thing we learned from this experience is that you should use only what you need. Amazon Web Services offers a wide range of services, all designed for scale, so it is likely that you will find a service that serves your need.

Take, for example, the email service Amazon provides. Initially, we used Gmail to send signup confirmations and email notifications. During the initial scaling trial, we soon hit Gmail's limit on how fast we could send emails. Once we realized the problem, we quickly switched to Amazon SES (Simple Email Service). There is an initial cap on how many emails we could send, but it only took a couple of emails for the limit to be lifted. With a couple of hours of coding and testing, we could all of a sudden send thousands of emails at once.
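For illustration, sending a notification through SES takes only a few lines. This sketch uses the modern boto3 client (not the 2011-era API we actually used), and the addresses are placeholders; the sender address must be verified with SES:

```python
# Minimal sketch of sending one notification via Amazon SES using boto3.
# Addresses are placeholders; the Source address must be SES-verified.
import boto3

ses = boto3.client("ses", region_name="us-east-1")
ses.send_email(
    Source="noreply@example.com",
    Destination={"ToAddresses": ["participant@example.com"]},
    Message={
        "Subject": {"Data": "Steptacular signup confirmation"},
        "Body": {"Text": {"Data": "Welcome to the Steptacular pilot!"}},
    },
)
```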

In addition to SES, we leveraged AWS's CloudWatch service to closely monitor the system and be alerted to failures. Best of all, it all comes essentially for free, without any development effort on our side.

Even though Amazon Web Services offers a large array of services, you should choose only what you absolutely need; in other words, do not over-engineer. Take auto scaling as an example. If you host a website on Amazon, it is natural to think about putting an auto-scaling solution in place to handle the unexpected. Amazon has its own auto-scaling solution, and we at Accenture labs have even developed an auto-scaling solution called WebScalar in the past. If you are Netflix, it makes absolute sense to auto-scale, because your traffic is huge and fluctuates widely. But if you are smaller, you may never need to scale beyond a single instance, and in that case auto scaling is extra complexity that you do not want to deal with, especially when you want to launch quickly. We estimated that we would have around 4,000 participants, and a quick profiling showed that a standard extra-large instance in Amazon would be adequate to handle the load. Sure enough, even though the website experienced a slowdown for a short period during launch, a single instance remained adequate for the whole duration of the pilot.

We also learned a lesson on fault tolerance: really think through your backup solution. Steptacular survived two large-scale failures in the US East data center. We enjoyed peace of mind partly because we were lucky, and partly because we had a plan. Steptacular uses an instance-store instance (instead of an EBS-backed instance). We made the choice mainly for performance reasons — we wanted to free up network bandwidth and leverage the local hard disk bandwidth. This turned out to save us from the first outage in April, which was caused by EBS block failures. Since we could not count on EBS for persistence, we built our own solution. Most static content on the instance is bundled into an Amazon Machine Image (AMI). Two pieces of less static content (content that changes often) are stored on the instance: the website logic and the steps database. The website logic is kept in a Subversion repository, and the database is synced to another database running outside the US East data center. This architecture allows us to get back up and running quickly: first launch the AMI, then check out the website code from the repository, and lastly dump and reload the database from the mirror. Even though we never had to initiate this backup procedure, it is good to have the peace of mind of knowing your data is safe.

Thanks to Amazon, both Amazon retail and Amazon Web Services, we were able to pull off the pilot in 3.5 weeks. More importantly, the pilot itself has collected some interesting results on how we can motivate people to exercise more. But I will leave that to a future post, after we have had a chance to dig deep into the data.

Acknowledgments

Launching Steptacular in 3.5 weeks would not have been possible without the help of many people. We would like to especially thank the following folks:

  • Jim Li from Omron for providing hardware, software and logistics support
  • Jeff Barr from Amazon for connecting us with the right folks at Amazon retail
  • James Hamilton from Amazon for increasing our email limit on the spot
  • Charles Allen from Amazon for getting us the gift codes quickly
  • Tiffany Morley and Helen Shen from Amazon for managing the inventory so that the pedometer miraculously stayed in stock despite the huge demand

Last but not least, big kudos to the Steptacular team, which includes several Stanford students who worked really hard, even through finals week, to get the pilot up and running. They are one of the best teams I have ever had the privilege to work with.

Amazon’s physical hardware and EC2 compute unit

Ever wonder what hardware is running behind Amazon’s EC2? Why would you even care? Well, there are at least a couple of reasons.

  1. Side-by-side comparisons. Amazon expresses its machine power in terms of EC2 compute units (ECU), and other cloud providers just express theirs in number of cores. In either case, it is vague, and you cannot perform an economic comparison between different cloud offerings, or against the own-your-own-hardware approach. Knowing how much an EC2 compute unit is in terms of raw hardware power allows you to perform an apples-to-apples comparison.
  2. Physical isolation. In many enterprise clients' minds, security is the number one concern. Even though hypervisor isolation is robust, they feel more comfortable with physical separation, i.e., they do not want their VM to sit on the same physical hardware right next to a hacker's VM. Knowing the hardware's computing power and the largest VM's computing power, one can determine whether there is enough room left on the machine to host a hacker's VM.

The observations below are based only on what we see in the N. Virginia data center; the underlying hardware may well be different in other data centers (Ireland, N. California and Singapore). If you are curious, feel free to use the methodology we describe to see what is going on in other data centers.

Our observations are based on a combination of hints from several different tools and methods, including the following:

CPUID

The “cpuid” instruction is supported by all x86 CPU manufacturers, and it is designed to report the capabilities of the CPU. The instruction is non-trapping, meaning that you can execute it in user mode without triggering a protection trap. On the Xen paravirtualized hypervisor (what Amazon uses), this means the hypervisor cannot intercept the instruction and change the result it returns. Therefore, the output from “cpuid” is the real output from the physical CPU.

We look at several fields in the “cpuid” output. First and foremost, we look at the branding string, which identifies the CPU's model number. We also look at the “local APIC physical ID” (leaf 1/ebx). The APIC ID is unique to a physical core, so by enumerating all APIC IDs, we know how many physical cores there are. Lastly, we look at “Logical CPU cores” (leaf 0x80000008/ecx), which is supposed to show how many hyper-threaded cores are on a physical core.
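The instruction itself is easiest to issue from C; as a rough sketch, the same fields can also be read from what the Linux kernel reports in /proc/cpuinfo (an approximation for illustration, not the exact tooling we used):

```python
# Rough sketch: read the CPU branding string and the per-core APIC IDs from
# /proc/cpuinfo, which surfaces the same cpuid-derived fields. This is an
# approximation for illustration; we issued cpuid directly in our probing.
apic_ids, model = set(), None

with open("/proc/cpuinfo") as f:
    for line in f:
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "model name":
            model = value              # CPU branding string
        elif key == "apicid":
            apic_ids.add(int(value))   # unique per core

print(model)
print(sorted(apic_ids))  # e.g., 1, 3, 5, 7, 17, 19, 21, 23 on an m2.4xlarge
```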

Intel processor specifications

With the model numbers reported by “cpuid”, we can look up the data sheets to determine the exact specification of a processor, including how many cores per socket, how many sockets per system, its cache sizes, etc.

/sys/devices/system/cpu/cpu?/cache/index?/shared_cpu_map

This is a file in the Linux file system. It lists the cache hierarchy, including which cores share a particular cache. We use it as validation, matching it against the cache hierarchy specified in the CPU data sheet. However, we never use it to reach a conclusion, as we have seen it report wrong information in some cases.

Performance benchmark

We use PassMark-CPU Mark — a performance benchmark — to compare CPU performance against other systems with the same CPU configuration. A matching performance number confirms our observation.

System statistics

A variety of tools, such as “mpstat” and “top”, report the system's performance statistics, including CPU and memory usage. In particular, on a Xen hypervisor, a VM can see its steal cycles — time that is stolen from the VM to run other things, including other VMs. The documentation states that steal time counts the time your VM is ready to run but cannot because others are competing for the CPU. Thus, if you keep your VM busy, all the CPU time capped off by the hypervisor shows up as steal. For example, on an m1.small VM, you will see roughly 60% steal, and you can keep the CPU busy at most 40% of the time. This is a hard cap Amazon imposes to limit you to one EC2 compute unit.
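As a minimal sketch, the steal percentage can be computed directly from /proc/stat, where steal is the eighth time column of the aggregate cpu line; keep the VM busy during the interval to see the cap:

```python
# Minimal sketch: sample the aggregate "cpu" line of /proc/stat twice and
# compute the share of time stolen by the hypervisor during the interval.
import time

def cpu_times():
    with open("/proc/stat") as f:
        fields = f.readline().split()   # "cpu user nice system idle ..."
    return [int(v) for v in fields[1:]]

before = cpu_times()
time.sleep(10)          # run a busy loop in another process meanwhile
after = cpu_times()

delta = [b - a for a, b in zip(before, after)]
steal_pct = 100.0 * delta[7] / sum(delta)   # steal is the 8th time column
print("steal: %.1f%%" % steal_pct)          # roughly 60% on a busy m1.small
```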

Now that the methodology is clear, we can dive into the observations. The Amazon infrastructure runs on three distinct sets of hardware.

High-memory instances

The high-memory instances (m2.xlarge, m2.2xlarge, m2.4xlarge) run on systems with dual-socket Intel Xeon X5550 (Nehalem) 2.66GHz processors. The Intel Xeon X5550 has 4 cores, each capable of hyper-threading, i.e., there could be 8 cores from the software's point of view. However, Amazon disables hyper-threading: “cpuid” 0x80000008/ecx reports only one logical core. Further, the APIC IDs are 1, 3, 5, 7, 17, 19, 21, 23; the missing IDs (9, 11, 13, 15) are probably reserved for the hyper-threading cores, which are not used. The m2.4xlarge instance occupies the whole physical machine. An m2.4xlarge instance's PassMark-CPU Mark is 10,052.6, on par with other dual-socket X5550 systems (the average is 10,853). Furthermore, we never observe steal cycles beyond 1 or 2%.

High-CPU instances

The high-CPU instances (c1.medium, c1.xlarge) run on systems with dual-socket Intel Xeon E5410 2.33GHz processors. We know it is dual-socket because we see APIC IDs 0 to 7, and the E5410 has only 4 cores. A c1.xlarge instance almost takes up the whole physical machine. However, we frequently observe steal cycles on a c1.xlarge instance, ranging from 0% to 25% with an average of about 10%. The amount of steal time is not enough to host another, smaller VM, i.e., a c1.medium; maybe those stolen cycles are used to run Amazon's software firewall (security groups). On PassMark-CPU Mark, a c1.xlarge machine achieves 7,962.6, actually higher than what an average dual-socket E5410 system achieves (the average is 6,903).

Standard instances

The standard instances (m1.small, m1.large, m1.xlarge) run on systems with a single-socket Intel Xeon E5430 4-core 2.66GHz processor. An m1.small instance may occasionally run on a system with an AMD Dual-Core Opteron 2218 HE processor, but that system is rare (<10%), so we will not focus on it here. The Xeon E5430 platform is single-socket because we see only APIC IDs 0, 1, 2, 3.

By simple deduction, we can reason that an m1.xlarge instance does not take up a whole physical machine. Since a c1.xlarge instance is 20 ECU (8 cores at 2.5 ECU each), an E5410 processor is at least 10 ECU. An E5430 would then have roughly 11.4 ECU, since its clock frequency is a little higher than the E5410's (2.66GHz vs. 2.33GHz). Since an m1.xlarge instance has only 8 ECU (4 cores at 2 ECU each), there is room for at least 3 more m1.small instances on the same host. This is an example of how knowing the physical hardware configuration helps us reason about the CPU power allocated. In addition to reasoning from the hardware configuration, we also observe large steal cycles on m1.xlarge instances, ranging from 0% to 75% with an average of about 30%.
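Spelled out as arithmetic, under the same assumptions as above:

```python
# ECU deduction for the standard-instance hardware. Inputs come from
# Amazon's instance definitions and the CPU models we observed; the
# result is an estimate, not a published figure.
c1_xlarge_ecu = 8 * 2.5              # c1.xlarge: 8 cores x 2.5 ECU = 20 ECU
e5410_ecu = c1_xlarge_ecu / 2        # dual-socket system: ~10 ECU per E5410
e5430_ecu = e5410_ecu * 2.66 / 2.33  # scale by clock: ~11.4 ECU per E5430
m1_xlarge_ecu = 4 * 2                # m1.xlarge: 4 cores x 2 ECU = 8 ECU

spare = e5430_ecu - m1_xlarge_ecu
print("%.1f ECU to spare" % spare)   # ~3.4 ECU: room for 3+ m1.small VMs
```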

An m1.xlarge instance achieves a PassMark-CPU Mark score of 3,627.8. We cannot find other single-socket E5430 systems in PassMark's database, but the score is less than half of what a c1.xlarge instance achieves. This again confirms the large steal cycles.

In conclusion, we believe that c1.xlarge and m2.4xlarge instances occupy their own physical machines. People who are security conscious should choose those instances to avoid co-location attacks. In addition, an Intel Xeon X5550 has about 13 ECU, an Intel Xeon E5430 about 11 ECU, and an Intel Xeon E5410 about 10 ECU, where one ECU is roughly equivalent to a PassMark-CPU Mark score of 400. Using this information, you can perform an economic comparison between the cloud and your favorite alternative approach.