Energy Efficiency for MapReduce Workloads: An In-depth Study

Feng, B., Lu, J., Zhou, Y. and Yang, N.

    Energy efficiency has emerged as a crucial optimization goal in data centers. MapReduce has become a popular, even fashionable, distributed processing model for parallel computing in data centers. Hadoop is an open-source implementation of MapReduce that is widely used for short jobs requiring low response time. In this paper, we conduct an in-depth study of the energy efficiency of MapReduce workloads. We identify four factors that affect the energy efficiency of MapReduce. In particular, we run experiments on four typical MapReduce workloads that represent different kinds of application scenarios, and we measure energy consumption under varied cluster parameters. Our key finding is that, with well-tuned system parameters and adaptive resource configurations, a MapReduce cluster can in some instances achieve both improved performance and good energy savings simultaneously, which is in surprising contrast to previous work on cluster-level energy conservation.
Cite as: Feng, B., Lu, J., Zhou, Y. and Yang, N. (2012). Energy Efficiency for MapReduce Workloads: An In-depth Study. In Proc. Australasian Database Conference (ADC 2012), Melbourne, Australia. CRPIT, 124. Zhang, R. and Zhang, Y. Eds., ACS. 61-70.