Comparing cloud providers on VM cost

How do you compare two IaaS clouds? Is Amazon EC2’s small standard instance (1 ECU, 1.7GB RAM, 160GB storage) cheaper, or is Rackspace cloud’s 256MB server (4 cores, 256MB RAM, 10GB storage) cheaper? It is obviously simpler to compare them if you focus on only one metric. For example, let us assume your application is CPU bound and does not require much memory at all. Then you should focus solely on the CPU power a cloud VM gives you. We have translated GoGrid, Rackspace, and Terremark’s VM configurations into their equivalent ECU, so you can simply take a ratio between the cost and the ECU rating and pick the lowest ratio. Unfortunately, real-life applications are never that simple. They demand CPU cycles, memory, and hard disk storage capacity. So, how do you compare apples-to-apples?

The methodology

No such methodology exists yet, so we will propose one. Since the comparison results depend heavily on the methodology chosen, we first spell out the methodology we use, so that if you follow a different one and come up with a different result, you can trace the source of the difference. If you see areas where we can improve the methodology, please do leave a comment. The methodology works as follows:

  1. We first break down the cost components of Amazon EC2. We assume Amazon has priced its instances using a linear model, i.e., the cost is equal to c * CPU + m * Mem + s * Storage, where c is the unit cost of CPU per ECU per hour, m is the unit cost of memory per GB per hour, and s is the unit cost of storage per GB per hour. Amazon provides several types of instances, each with a different combination of CPU, memory, and storage, which is enough of a hint for us to use regression analysis to estimate c, m, and s. The details are in our ECU cost breakdown analysis.
  2. Once we have the unit costs in EC2, we can compare it with another cloud provider. We take one VM configuration from a cloud provider at a time, then compute what Amazon EC2 would charge for an instance with the exact same specification if EC2 were to offer it. This is easily done by multiplying the EC2 unit costs (c, m, and s) by the amount of CPU, RAM, and storage in the VM, and adding them up. Of course, this is hypothetical, because EC2 does not offer an instance with the exact same spec, so even if the EC2 price is lower, you cannot simply buy a substitute from Amazon. However, it gives us a good sense of the relative cost.
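As a sketch, the two steps above boil down to a few lines of Python. The unit costs below are the N. Virginia figures derived later in this post; the example VM's specification and price are hypothetical placeholders, not any provider's actual offering.

```python
# Sketch of step 2: price a competitor's VM as if EC2 sold it, then take the
# cost ratio. Unit costs are the EC2 N. Virginia figures derived in this post.

C_ECU = 0.01369      # $/ECU/hour
M_RAM = 0.0201       # $/GB of RAM/hour
S_DISK = 0.000159    # $/GB of local storage/hour

def ec2_equivalent_cost(ecu, ram_gb, storage_gb):
    """Hourly cost EC2 would charge for this exact spec under the linear model."""
    return C_ECU * ecu + M_RAM * ram_gb + S_DISK * storage_gb

def cost_ratio(vm_hourly_price, ecu, ram_gb, storage_gb):
    """Ratio < 1 means the VM is cheaper than its hypothetical EC2 twin."""
    return vm_hourly_price / ec2_equivalent_cost(ecu, ram_gb, storage_gb)

# Hypothetical example: a $0.015/hour VM rated at 6 ECU, with 0.25GB RAM
# and 10GB of local storage.
r = cost_ratio(0.015, ecu=6, ram_gb=0.25, storage_gb=10)
print(round(r, 3))
```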

We have done the analysis with GoGrid, Rackspace, and Terremark.

We can compute a ratio between a cloud VM’s cost and that of its hypothetical equivalent in EC2. The following lists the few VMs with the lowest ratios. If you are curious about the ratio for other VM configurations, feel free to dig into the individual posts on each provider. The ratios listed assume that you get the maximum CPU allowed under bursting, which is frequently the case with these cloud providers. Further, the ratios are computed against Amazon's N. Virginia data center; other EC2 data centers have a higher cost.

| Provider | RAM (GB) | CPU (cores) | Storage (GB) | Cost ratio with an equivalent in EC2 |
|---|---|---|---|---|
| Rackspace | 0.25 | 4 | 10 | 0.168 |
| Terremark | 0.5 | 8 | charged separately at $0.25/month/GB | 0.19 |
| Rackspace | 0.5 | 4 | 20 | 0.314 |
| Terremark | 0.5 | 4 | charged separately at $0.25/month/GB | 0.338 |
| Terremark | 1 | 8 | charged separately at $0.25/month/GB | 0.375 |
| Terremark | 1.5 | 8 | charged separately at $0.25/month/GB | 0.491 |

How to use this data?

Due to the limitations of this methodology (comparing with a hypothetical equivalent in EC2), it only makes sense if one of the cloud providers you are comparing is Amazon EC2. In other words, do not compare Rackspace with Terremark based on the ratio.

It also makes no sense to use our results if you know the exact specification of your server. In that case, you should find the minimum VM configuration that is just barely bigger than your requirement and compare prices directly.

Our results are useful if your application is flexible. For example, instead of using one m1.small instance in EC2, you could use several Rackspace 256MB VMs to achieve dramatic cost savings. Examples of a flexible application include a batch application, such as a MapReduce job, which can be chopped into finer granularity, or web servers in a server farm, where the load balancer can divide up the work to take advantage of whatever computation capacity is provisioned on each web server.

Our results are also useful if you want a high-level overview. Consider an enterprise purchaser who wants to choose a cloud platform. There are many dimensions to consider, e.g., features, cost, SLA, and contract terms. A deep analysis at the beginning would be overwhelming. Since Amazon is a big player in the cloud, it will most likely be part of the evaluation. Having a ratio gives a ten-thousand-foot view, so the decision maker knows roughly whether an alternative cloud would save money. Then, as the evaluation progresses, he can dig into a finer comparison.

Caveats:

There are many caveats in using our results that we should spell out.

  • This compares only the VM cost, including its CPU, memory, and storage. It does not include other costs, such as bandwidth. Bandwidth costs vary wildly; for example, GoGrid offers free inbound traffic, which can translate into significant savings.
  • When we compare CPUs, we are only comparing their processing power, not their IO capabilities (both disk and network IO). In Amazon, we sometimes observe degraded IO performance, possibly due to competing VMs on the same host. It is a sad side effect of using popular cloud offerings.
  • As we mentioned, this only applies to fungible applications that can take full advantage of provisioned CPU, memory and storage resources. For example, if you cannot take advantage of the provisioned RAM, it does not matter if it is a good deal. You are wasting the memory, and you may be better off with a VM configuration from a different cloud provider with a smaller provisioned RAM.
  • This is not a substitute for feature comparisons. For example, GoGrid offers a free F5 hardware load balancer. If you need a hardware load balancer, you should factor that in separately.

The true cost of an ECU

How do you compare the cost of two cloud or IaaS offerings? Is Amazon EC2’s small instance (1 ECU, 1.7GB RAM, 160GB storage) cheaper, or is Rackspace cloud’s 256MB server (4 cores, 256MB RAM, 10GB storage) cheaper? Unfortunately, answering this question is very difficult. One reason is that cloud vendors offer virtual machines with different configurations, i.e., different combinations of CPU power, memory, and storage, making it difficult to perform an apples-to-apples comparison.

Towards the goal of a better apples-to-apples comparison, I will break down the cost of CPU, memory, and storage individually for Amazon EC2 in this post. For those not interested in the methodology, the high-level conclusions are as follows. In Amazon's N. Virginia data center, the unit costs are:

  • 1 ECU costs $0.01369/hour
  • 1 GB of RAM costs $0.0201/hour
  • 1 GB of local storage costs $0.000159/hour
  • A 10GB network interface costs $0.41/hour
  • A GPU costs $0.52/hour

Before we can break down the cost, we have to know what an instance's cost consists of (an instance is Amazon's term for a virtual machine). We assume the cost consists solely of the cost of its CPU, its memory, and its local storage space. This means there is no fixed cost component, for example, to account for the hardware chassis or the static IP address. We make this assumption purely for simplicity; in practice, it makes little difference to the end result even if we assume a fixed cost component. We also note that the instance cost does not include the cost of network bandwidth consumed, which is always charged separately, at least among the cloud providers we looked at.

Let us assume the instance cost is a linear function of the three components, i.e., Cost = c * CPU + m * Mem + s * Storage, where c, m, and s are the unit costs of CPU, memory, and local storage respectively. Fortunately, Amazon EC2 offers several types of instances, each with a different combination of CPU, memory, and storage, which gives us a clue to what each component costs. Combining the many instance types, we can estimate the parameters c, m, and s using least-squares regression. Let us first look at Amazon's N. Virginia data center. We use only Linux instances' hourly cost as the instance cost, to avoid accounting for an OS's licensing cost. The results from the least-squares regression are:

s = 0.0159 cent/GB/hour
m = 2.01 cent/GB/hour
c = 1.369 cent/ECU/hour

The linear model and the estimated parameters match the real data quite well. The following table shows the instances we used for the regression. The last column shows the instance cost predicted by our estimated parameters, and the second-to-last column shows the real EC2 cost. As you can see, the two costs match fairly well, suggesting that a linear model is a good approximation. Note that we list the Micro instance at 0.35 ECU; this is the average of its ECU allocation, as shown in our Micro instance analysis.

| Instance | CPU (ECU) | RAM (GB) | Storage (GB) | Instance cost per hour (cents) | Fitted cost per hour (cents) |
|---|---|---|---|---|---|
| m1.small | 1 | 1.7 | 160 | 8.5 | 7.33 |
| m1.large | 4 | 7.5 | 850 | 34 | 34.07 |
| m1.xlarge | 8 | 15 | 1,690 | 68 | 67.97 |
| t1.micro | 0.35 | 0.613 | 0 | 2 | 1.71 |
| m2.xlarge | 6.5 | 17.1 | 420 | 50 | 49.96 |
| m2.2xlarge | 13 | 34.2 | 850 | 100 | 100.1 |
| m2.4xlarge | 26 | 68.4 | 1,690 | 200 | 200 |
| c1.medium | 5 | 1.7 | 350 | 17 | 15.83 |
| c1.xlarge | 20 | 7 | 1,690 | 68 | 68.32 |
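The fit itself is easy to reproduce. The following sketch re-runs the least-squares regression on the table above using the normal equations (plain Python, no external libraries); it recovers unit costs close to the c, m, and s reported earlier. Values are in cents per hour, matching the table.

```python
# Re-fit Cost = c*CPU + m*Mem + s*Storage on the N. Virginia instances above,
# by solving the 3x3 normal equations (A^T A) x = A^T b with Gaussian
# elimination. Costs are in cents/hour.

rows = [  # (ECU, RAM GB, storage GB, cost in cents/hour)
    (1,    1.7,   160,  8.5),   # m1.small
    (4,    7.5,   850,  34),    # m1.large
    (8,    15,    1690, 68),    # m1.xlarge
    (0.35, 0.613, 0,    2),     # t1.micro
    (6.5,  17.1,  420,  50),    # m2.xlarge
    (13,   34.2,  850,  100),   # m2.2xlarge
    (26,   68.4,  1690, 200),   # m2.4xlarge
    (5,    1.7,   350,  17),    # c1.medium
    (20,   7,     1690, 68),    # c1.xlarge
]

def least_squares(rows):
    A = [r[:3] for r in rows]
    b = [r[3] for r in rows]
    # Build the normal equations.
    ata = [[sum(a[i] * a[j] for a in A) for j in range(3)] for i in range(3)]
    atb = [sum(a[i] * y for a, y in zip(A, b)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    M = [ata[i] + [atb[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r2: abs(M[r2][col]))
        M[col], M[piv] = M[piv], M[col]
        for r2 in range(col + 1, 3):
            f = M[r2][col] / M[col][col]
            for k in range(col, 4):
                M[r2][k] -= f * M[col][k]
    # Back substitution.
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x  # [c, m, s] in cents/hour

c, m, s = least_squares(rows)
print(round(c, 3), round(m, 3), round(s, 4))
```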

It should come as no surprise that memory is a significant component of the instance cost. Next time you compare two cloud offerings, make sure to compare the RAM available.

In the estimation, we did not include EC2 cluster instances and cluster GPU instances, because they are different from other instances (both have a 10GB network interface and one has a GPU). But, now that we have a unit cost for CPU, memory and storage, we can estimate what those extra features cost.

For a cluster instance, combining the cost of CPU (33.5 ECU), memory (23GB), and storage (1690 GB) using our estimated parameters, the cost comes out to be $1.19/hour. Since Amazon charges $1.60/hour, the extra charge must be for the 10GB interface, which is the only feature that is different from other instances. Subtracting the two, the 10GB interface costs $0.41/hour.

For a cluster GPU instance, combining the cost of CPU (33.5 ECU), memory (22GB), and storage (1690 GB), the cost comes out to be $1.17/hour. Since Amazon charges $2.10/hour, the extra charge must be for the 10GB interface and the GPU. Subtracting the two costs and taking out the 10GB interface cost, we find the GPU costs $0.52/hour.
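The subtraction above can be sketched as follows, using the N. Virginia unit costs estimated earlier:

```python
# Estimate what the 10GB network interface and the GPU cost: price the
# cluster instances' CPU/RAM/storage with the estimated unit costs, then
# subtract from Amazon's actual charge. All figures in $/hour.

C_ECU, M_RAM, S_DISK = 0.01369, 0.0201, 0.000159

def base_cost(ecu, ram_gb, storage_gb):
    return C_ECU * ecu + M_RAM * ram_gb + S_DISK * storage_gb

cluster = base_cost(33.5, 23, 1690)       # CPU/RAM/storage only, ~$1.19
nic_10g = 1.60 - cluster                  # Amazon charges $1.60 -> ~$0.41

cluster_gpu = base_cost(33.5, 22, 1690)   # ~$1.17
gpu = 2.10 - cluster_gpu - nic_10g        # Amazon charges $2.10 -> ~$0.52

print(round(cluster, 2), round(nic_10g, 2), round(gpu, 2))
```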

We can perform the same analysis for the other three Amazon data centers: N. California, Ireland, and Singapore. Luckily, their cost structures are the same, so I only need to present one result. The unit costs are as follows:

s = 0.0169 cent/GB/hour
m = 2.316 cent/GB/hour
c = 1.575 cent/ECU/hour

The actual instance costs and the projected instance costs are shown in the following table. Again, they agree very well. There are no cluster or cluster GPU instances in the other data centers, so no costs for the 10GB interface and the GPU are shown.

| Instance | CPU (ECU) | RAM (GB) | Storage (GB) | Instance cost per hour (cents) | Fitted cost per hour (cents) |
|---|---|---|---|---|---|
| m1.small | 1 | 1.7 | 160 | 9.5 | 8.22 |
| m1.large | 4 | 7.5 | 850 | 38 | 38.07 |
| m1.xlarge | 8 | 15 | 1,690 | 76 | 75.97 |
| t1.micro | 0.35 | 0.613 | 0 | 2.5 | 1.97 |
| m2.xlarge | 6.5 | 17.1 | 420 | 57 | 56.96 |
| m2.2xlarge | 13 | 34.2 | 850 | 114 | 114.1 |
| m2.4xlarge | 26 | 68.4 | 1,690 | 228 | 228 |
| c1.medium | 5 | 1.7 | 350 | 19 | 17.74 |
| c1.xlarge | 20 | 7 | 1,690 | 76 | 76.34 |

Amazon EC2 Micro instances deeper dive

Amazon today announced a new instance type called “Micro instances” (t1.micro). The official announcement states that it comes with 613MB RAM and up to 2 ECUs of compute. It also supports both 32- and 64-bit operation. Starting at $0.02/hour, Micro instances are the least expensive instances offered by AWS.

To understand Micro instances, let us first see what underlying physical hardware powers them. In a previous post, we analyzed AWS's physical hardware and the ECU. Using the same methodology, we see that Micro instances use the same physical hardware that powers the standard instances, i.e., systems based on a single-socket Intel Xeon E5430 processor. In fact, they probably run on the same clusters as the standard instances.

To understand the actual computing power they deliver, we run a CPU-intensive application that tries to grab as much CPU as it is allowed. We then use the UNIX command top to monitor the actual CPU usage. top computes the average utilization every second. We observe that the CPU cycles we are allocated vary wildly from second to second. For a short period, we have 100% of a single core, but at other times we have a much smaller allocation, often as low as 2%.

To see the long-term average, we monitor the /proc/stat statistics reported by the Linux OS. Using the statistics reported at two time instances spaced sufficiently far apart, we can compute the long-term average (this is in fact how top computes its second-by-second average). We observe that we get roughly 15% of the CPU cycles of a single core. Since a small instance (m1.small) at 1 ECU gets 40% of a single core on the same physical hardware, we conclude that a Micro instance has roughly 0.35 ECU. This is consistent with the memory allocation, where a Micro instance's 613MB is roughly 35% of an m1.small's 1.7GB.
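The deduction above amounts to a couple of ratios; here is a minimal sketch, using the measured percentages quoted in the text:

```python
# Deduce the Micro instance's ECU rating from the measured CPU share.
# An m1.small (1 ECU) gets ~40% of one core on the same hardware; a Micro
# gets ~15% long-term. These figures are the measurements quoted in the text.

micro_share, small_share = 0.15, 0.40
micro_ecu = micro_share / small_share * 1.0   # m1.small = 1 ECU
print(round(micro_ecu, 3))                    # ~0.375, i.e. roughly 0.35 ECU

# Cross-check against memory: Micro's 613MB vs m1.small's 1.7GB.
mem_ratio = 0.613 / 1.7
print(round(mem_ratio, 2))                    # ~0.36, a similar fraction
```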

In summary, a Micro instance has 0.35 ECU on average, but it can burst up to 2.5 ECU (a full core) for a few seconds at a time. The price of bursting to 2.5 ECU is that you end up with almost no CPU cycles at other times, so that the average comes out to 0.35 ECU.

From a pricing perspective, you pay only roughly 25% of the price of an m1.small instance ($0.02 vs. $0.085 per hour). So, for the average CPU you are getting, it is a better deal (25% of the price for 35% of the CPU). However, to get that deal, you have to tolerate the wildly fluctuating CPU allocation. If your application can tolerate it, or if it never needs to burst much beyond 0.35 ECU, Micro instances may be a good solution.

Even with Micro instances, AWS is still more expensive than some competitors. For example, at Rackspace cloud, you can pay $0.03/hour for a 512MB instance that can potentially burst to use up to 4 cores' worth of CPU capacity (albeit on a different, AMD-based processor). The key is that there is no limit on bursting at other cloud vendors: as long as no other VMs demand the CPU, you get the full allocation for as long as you want. We frequently get a whole CPU for many hours at other cloud vendors, which is a great bargain.

Amazon’s physical hardware and EC2 compute unit

Ever wonder what hardware is running behind Amazon’s EC2? Why would you even care? Well, there are at least a couple of reasons.

  1. Side-by-side comparisons. Amazon expresses its machine power in terms of EC2 compute units (ECU), while other cloud providers express theirs in terms of number of cores. In either case, the measure is vague, and you cannot perform an economic comparison between different cloud offerings, or against an own-your-own-hardware approach. Knowing how much an EC2 compute unit is in terms of raw hardware power allows you to perform an apples-to-apples comparison.
  2. Physical isolation. In many enterprise clients' minds, security is the number one concern. Even though hypervisor isolation is robust, they feel more comfortable with physical separation, i.e., they do not want their VM to sit on the same physical hardware right next to a hacker's VM. Knowing the hardware's computing power and the largest VM's computing power, one can determine whether there is enough room left on the host for a hacker's VM.

The observations below are based only on what we see in the N. Virginia data center; the underlying hardware may well be different in other data centers (i.e., Ireland, N. California, and Singapore). If you are curious, feel free to use the methodology we describe to see what is going on in the other data centers.

Our observations are based on a combination of “hints” from several different tools and methods, including the following:

CPUID

The “cpuid” instruction is supported by all x86 CPU manufacturers, and it is designed to report the capabilities of the CPU. This instruction is non-trapping, meaning that you can execute it in user mode without triggering a protection trap. Under the Xen paravirtualized hypervisor (which Amazon uses), this means the hypervisor cannot intercept the instruction and change the result it returns. Therefore, the output from “cpuid” is the real output from the physical CPU.

We look at several fields in the “cpuid” output. First and foremost, we look at the branding string, which identifies the CPU's model number. We also look at the “local APIC physical ID” in (1/ebx). The APIC ID is unique to each physical core, so by enumerating all APIC IDs, we know how many physical cores there are. Lastly, we look at “Logical CPU cores” in (0x80000008/ecx), which is supposed to show how many hyper-threaded cores there are per physical core.
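Executing cpuid directly requires native code, but on Linux the kernel exposes the APIC ID in /proc/cpuinfo as the “apicid” field, so the same enumeration can be done from a script. The sketch below counts distinct APIC IDs; the /proc/cpuinfo excerpt it parses is a fabricated sample for illustration, not actual EC2 output.

```python
# Count visible physical cores by enumerating APIC IDs, mirroring what the
# cpuid-based probing does. The sample /proc/cpuinfo excerpt is hypothetical.

SAMPLE_CPUINFO = """\
processor : 0
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
apicid : 0
processor : 1
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
apicid : 1
processor : 2
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
apicid : 2
processor : 3
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
apicid : 3
"""

def apic_ids(cpuinfo_text):
    """Return the sorted set of APIC IDs found in /proc/cpuinfo text."""
    ids = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("apicid"):
            ids.add(int(line.split(":")[1]))
    return sorted(ids)

print(apic_ids(SAMPLE_CPUINFO))   # four distinct IDs -> four visible cores
```

On a real machine, read the text with `open("/proc/cpuinfo").read()` instead of the sample string.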

Intel processor specifications

With the model numbers reported by “cpuid”, we can look up the data sheets to determine the exact specification of a processor, including how many cores per socket, how many sockets per system, its cache size, etc.

/sys/devices/system/cpu/cpu?/cache/index?/shared_cpu_map

This is a file in the Linux file system. It lists the cache hierarchy, including which cores share a particular cache. We use it to validate against the cache hierarchy specified in the CPU data sheet. However, we do not use it to reach any conclusions, as we have seen it report wrong information in some cases.

Performance benchmark

We use PassMark CPU Mark, a performance benchmark, to compare the CPU performance with other systems that have the same CPU configuration. A matching performance number confirms our observation.

System statistics

A variety of tools, such as “mpstat” and “top”, report the system's performance statistics, including CPU and memory usage. In particular, on a Xen hypervisor, a VM can see steal-cycle statistics: time that is stolen from the VM to run other things, including other VMs. The documentation states that steal cycles count the time your VM is ready to run but cannot, because others are competing for the CPU. Thus, if you keep your VM busy, you will see all of the remaining CPU cycles stolen from you. For example, on an m1.small VM, you will see steal cycles at roughly 60%, and you can keep your CPU busy at most up to 40%. This is a hard cap Amazon puts in place to limit you to one EC2 compute unit.
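The steal fraction can be computed directly from two /proc/stat snapshots taken some time apart: the eighth field of the “cpu” line is accumulated steal time. A sketch follows; the two sample lines are made-up illustrations, not real measurements.

```python
# Compute the steal-cycle fraction from two /proc/stat "cpu" lines taken some
# time apart. On an m1.small kept fully busy, this comes out around 60%,
# reflecting the 1-ECU cap. The two sample lines below are hypothetical.

def fields(line):
    # Fields after "cpu": user nice system idle iowait irq softirq steal
    return [int(v) for v in line.split()[1:9]]

def steal_fraction(snap1, snap2):
    delta = [b - a for a, b in zip(fields(snap1), fields(snap2))]
    return delta[7] / sum(delta)   # steal jiffies / total jiffies elapsed

SNAP1 = "cpu 200 0 100 100 0 0 0 400"
SNAP2 = "cpu 500 0 150 150 0 0 0 1000"
print(steal_fraction(SNAP1, SNAP2))   # 0.6 for these sample snapshots
```

On a live system, capture each snapshot with the first line of `/proc/stat`.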

Now that the methodology is clear, we can dive into the observations. Amazon's infrastructure runs on three distinct sets of hardware.

High-memory instances

The high-memory instances (m2.xlarge, m2.2xlarge, m2.4xlarge) run on systems with dual-socket Intel Xeon X5550 (Nehalem) 2.66GHz processors. The Intel Xeon X5550 processor has 4 cores, each capable of hyper-threading, i.e., there could be 8 cores from the software's point of view. However, Amazon disables hyper-threading: “cpuid” 0x80000008/ecx reports only one logical core. Further, the APIC IDs are 1, 3, 5, 7, 17, 19, 21, 23; the missing IDs (9, 11, 13, 15) are probably reserved for the hyper-threading cores, which are not used. The m2.4xlarge instance occupies the whole physical machine. An m2.4xlarge instance's PassMark CPU Mark is 10,052.6, on par with other dual-socket X5550 systems (the average is 10,853). Furthermore, we never observe steal cycles beyond 1 or 2%.

High-CPU instances

The high-CPU instances (c1.medium, c1.xlarge) run on systems with dual-socket Intel Xeon E5410 2.33GHz processors. We know they are dual-socket because we see APIC IDs 0 to 7, and an E5410 has only 4 cores. A c1.xlarge instance takes up almost the whole physical machine. However, we frequently observe steal cycles on a c1.xlarge instance, ranging from 0% to 25% with an average of about 10%. The amount of steal is not enough to host another, smaller VM (i.e., a c1.medium); maybe those cycles are used to run Amazon's software firewall (security groups). On PassMark CPU Mark, a c1.xlarge machine achieves 7,962.6, actually higher than an average dual-socket E5410 system (the average is 6,903).

Standard instances

The standard instances (m1.small, m1.large, m1.xlarge) run on systems with a single-socket Intel Xeon E5430 4-core 2.66GHz processor. An m1.small instance may occasionally run on a system with an AMD Dual-Core Opteron 2218 HE processor, but that system is rare (<10% of cases), so we do not focus on it here. The Xeon E5430 platform is single-socket because we only see APIC IDs 0, 1, 2, 3.

By simple deduction, we can reason that an m1.xlarge instance does not take up a whole physical machine. Since a c1.xlarge instance is 20 ECU (8 cores at 2.5 ECU each) on a dual-socket system, an E5410 processor is at least 10 ECU. Thus a single-socket E5430 system has roughly 11.4 ECU, since its clock frequency is a little higher than an E5410's (2.66GHz vs. 2.33GHz). Since an m1.xlarge instance has only 8 ECU (4 cores at 2 ECU each), there is room for at least 3 more m1.small instances. This is an example of how knowing the physical hardware configuration helps us reason about the CPU power allocated. In addition to reasoning from the hardware configuration, we also observe large steal cycles on m1.xlarge instances, ranging from 0% to 75% with an average of about 30%.
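The arithmetic in this deduction can be laid out explicitly; here is a sketch using the figures from the text:

```python
# Reason about per-processor ECU from instance ratings, as in the text.

c1_xlarge_ecu = 8 * 2.5         # c1.xlarge: 20 ECU on a dual-socket E5410 box
e5410_ecu = c1_xlarge_ecu / 2   # at least 10 ECU per E5410 processor

# Scale by clock frequency to estimate the single-socket E5430 system.
e5430_ecu = e5410_ecu * 2.66 / 2.33   # roughly 11.4 ECU

m1_xlarge_ecu = 4 * 2                 # m1.xlarge: 8 ECU
headroom = e5430_ecu - m1_xlarge_ecu  # ~3.4 ECU left on an m1.xlarge host,
                                      # i.e. room for at least 3 m1.small VMs

print(round(e5430_ecu, 1), round(headroom, 1))
```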

An m1.xlarge instance achieves a PassMark CPU Mark score of 3,627.8. We cannot find other single-socket E5430 systems in PassMark's database, but the score is less than half of what a c1.xlarge instance achieves, which again confirms the large steal cycles.

In conclusion, we believe that c1.xlarge and m2.4xlarge instances occupy their own physical hardware. Security-conscious users should choose those instances to avoid co-location attacks. In addition, an Intel Xeon X5550 has 13 ECU, an Intel Xeon E5430 has about 11 ECU, and an Intel Xeon E5410 has 10 ECU, where one ECU is roughly equivalent to a PassMark CPU Mark score of 400. Using this information, you can perform an economic comparison between the cloud and your favorite alternative approach.