Matej Ugrin (2012) Deferable server for a Hadoop system. EngD thesis.
Abstract
An ever-growing amount of data, its variety, and the demand for faster and more responsive systems are among the most common challenges in the emerging field of data-intensive and distributed computing. The diploma thesis discusses the MapReduce programming model and its implementation in the Apache Hadoop framework, which enables us to manage and process distributed data-intensive applications. The thesis describes several implementations of the MapReduce model, with core emphasis on the Apache Hadoop project and its scheduling, which is one of the key components of the framework. In the second part of the thesis we introduce several types of schedulers, which we group according to their characteristics and common uses. The main focus of this part is to present the Fair Scheduler, which serves as the basis for the implementation of the new FWS scheduler. The aim of the new scheduler is to improve job response times and the fairness of job execution by introducing a time window structure and tie-breaking mechanisms. The thesis concludes with a comparison of the two schedulers and the impact of various parameters on slot allocation for the FWS scheduler. The results show that the effect of the FWS scheduler on job response times is negligible, whereas it achieves a significant improvement in the fairness and equality of resource allocation.