How to run MapReduce in the Amazon EC2 spot market

If you often run large-scale MapReduce/Hadoop jobs in Amazon EC2, you have probably thought about using the spot market. The price of a spot instance is typically 60% or more below that of an on-demand instance. For a large job, where you use many instances for many hours, a saving of that size can be a substantial amount.

Unfortunately, using the spot market has not been trivial. In exchange for the lower price, you explicitly agree that Amazon can terminate your instances at any time. This is a problem, since you may lose all your work. A research paper from HotCloud last year showed that even adding more spot instances (not replacing existing nodes) could be detrimental to a running MapReduce job. In other words, you add more resources to your cluster, but your running time could actually be longer.

Beyond lengthening your computation, the spot market could even make you lose your data. Existing MapReduce implementations, such as Google’s internal implementation or Hadoop, are already designed with failure in mind. However, the scenario they assume is hardware failure, i.e., a small fraction of nodes may go down at any time. This assumption does not hold in the spot market, where all nodes of a cluster may be terminated at the same time. You can lose not only all your state (when the master nodes go down), but also your data (when all the nodes holding replicas of a piece of data go down).
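To make the difference between the two failure models concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that hardware failures are independent across nodes; the function names and the example probabilities are hypothetical and not taken from any measurement.

```python
def loss_probability_independent(node_failure_prob, replicas):
    """Chance that every replica of one block is lost when nodes fail
    independently (the hardware-failure model; independence is an assumption)."""
    return node_failure_prob ** replicas


def loss_probability_simultaneous(cluster_termination_prob):
    """Chance that every replica is lost when the whole cluster is terminated
    at once (the spot market case). The replica count no longer helps,
    because all replicas go down together."""
    return cluster_termination_prob


# Hypothetical numbers: a 1% chance per node versus a 1% chance that the
# spot price rises above the bid and takes the whole cluster down.
independent = loss_probability_independent(0.01, replicas=3)
simultaneous = loss_probability_simultaneous(0.01)
```

Under these made-up numbers, three replicas cut the chance of losing a block to about one in a million in the independent case, while in the simultaneous case replication buys nothing.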

What about bidding a really high price for your spot instances and hoping that Amazon never raises the price that high? Unfortunately, there is no guarantee on how high the spot market price can go. There were several occasions last year when the spot instance price actually exceeded the on-demand price. This is likely because some users were bidding at a higher-than-on-demand price, and Amazon really needed to kill those instances to free up capacity.

While the naive approach of bidding at a high price may not work, I am happy to report that there is a new technique that can help you leverage the spot market to save money. We recently developed a MapReduce implementation that can tolerate large-scale node failures (e.g., when your bid price is below Amazon’s spot price). Even if all nodes in your cluster are terminated, we can guarantee that no state is lost and that you can continue to make forward progress when your cluster comes back online (e.g., when your bid price is higher than Amazon’s spot price).

Our implementation leverages two key things. First, when Amazon terminates your instance, it is not a hard power-off. Instead, it is a soft OS shutdown, where you have a couple of minutes to execute your shutdown script. We modified our shutdown script to save the current progress and generate a new task for the remaining work, so that another node can take over in the future. In other words, we use on-demand checkpointing to save state only when needed.
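The idea of on-demand checkpointing can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: MapTask, run_task, and install_shutdown_handler are hypothetical names, a plain Python list stands in for the real task queue, and it assumes the worker process receives SIGTERM during the soft OS shutdown.

```python
import signal
import threading
from dataclasses import dataclass

# Set when the OS asks the process to stop (assumed to arrive as SIGTERM).
shutdown_requested = threading.Event()


def install_shutdown_handler():
    """Register a handler that only records the shutdown request.
    Must be called from the main thread."""
    signal.signal(signal.SIGTERM, lambda signum, frame: shutdown_requested.set())


@dataclass
class MapTask:
    """Hypothetical task covering records [start, end) of one input split."""
    split_id: str
    start: int
    end: int


def run_task(task, process_record, task_queue, should_stop=shutdown_requested.is_set):
    """Process records until the task is done or a shutdown is requested.

    On shutdown, enqueue a new task for the unprocessed remainder so that
    another node can pick it up later. Returns the number of records processed.
    """
    position = task.start
    while position < task.end:
        if should_stop():
            # Checkpoint on demand: only the remaining range is saved.
            task_queue.append(MapTask(task.split_id, position, task.end))
            break
        process_record(task.split_id, position)
        position += 1
    return position - task.start
```

Nothing is written while the node is running normally; the checkpoint is just the task for the remaining work, created only when a termination is actually under way.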

Second, we constantly save intermediate data in order to minimize the volume of state we have to save in the shutdown phase. Our solution is built on Cloud MapReduce, which continuously streams intermediate data out of the local node. In comparison, other MapReduce implementations, such as Hadoop, keep all intermediate data on the local node until a task finishes. This can leave a dataset too large to save during the short shutdown window.
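A small sketch shows why streaming keeps the shutdown-time state small. It is illustrative only and does not reflect Cloud MapReduce's real interfaces: StreamingEmitter, map_words, and batch_size are hypothetical, and a Python list stands in for the remote store that receives intermediate data.

```python
class StreamingEmitter:
    """Pushes intermediate key/value pairs to a remote sink as they are
    produced, keeping at most batch_size - 1 pairs on the local node
    between emits."""

    def __init__(self, sink, batch_size=2):
        self.sink = sink              # stand-in for a remote store
        self.batch_size = batch_size  # hypothetical tuning knob
        self.pending = []             # the only state at risk on shutdown

    def emit(self, key, value):
        self.pending.append((key, value))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is still local; cheap enough for a shutdown script."""
        if self.pending:
            self.sink.extend(self.pending)
            self.pending = []


def map_words(line, emitter):
    """Word-count style map function that emits (word, 1) pairs."""
    for word in line.split():
        emitter.emit(word, 1)
```

With this shape, the shutdown script only has to flush the small pending list, whereas a design that buffers all map output locally would have to move the entire output of the task within the couple of minutes available.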

I will not belabor the details of our implementation, except to mention that it was published last week at the USENIX HotCloud conference. You can read the Spot Cloud MapReduce paper for the full details.