ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Increasing efficiency of job execution with resource co-allocation in distributed computer systems

Matija Cankar (2014) Increasing efficiency of job execution with resource co-allocation in distributed computer systems. PhD thesis.


    Abstract

    The field of distributed computer systems, while not new in computer science, still attracts considerable interest in both industry and academia. More powerful computers, faster and more ubiquitous networks, and complex distributed applications are accelerating the growth of distributed computing. Large numbers of computers interconnected in a single network provide additional computing power to users whenever required. Such systems are, however, expensive and complex to manage, which can lead to unduly high expenses unless the infrastructure is efficiently utilised. Currently the most attractive forms of distributed systems are grid and cloud computing. In this dissertation we review some of the resource management approaches commonly used in grid and cloud computing. We examine scheduling approaches in systems with distributed and centralised infrastructure management and highlight the key properties of the applications for which distributed infrastructures are typically used. We present the advantages of scheduling flexible jobs, which can scale themselves to the amount of allocated resources, and propose two scheduling approaches. The first approach supports co-allocation of computer resources to jobs on distributed infrastructures with distributed resource management. Distributed resource management means that the system can use multiple autonomous schedulers, none of which has global control over the state of the resources on the nodes. We focus on schedulers that map only a single job to the infrastructure at a time. We propose an approach that supports collective demands, i.e. requests for a set of nodes that must collectively meet the specified demands for resources. We implemented this approach in the XtreemOS operating system and evaluated it in real and simulated environments.
The results show that the use of collective demands extends search times, but this is compensated by the fact that the scheduled jobs load the infrastructure more sparingly, which allows jobs to start earlier. The second approach is applicable to offline resource scheduling in distributed infrastructures with global control over the resources. In other words, there is a single central scheduler that can schedule a whole set of jobs simultaneously. For such a set-up we propose analysing the jobs in a batch in order to pair and scale them into co-located subsets and thus improve utilisation. We implemented the proposed approach to run with the Haizea scheduler and evaluated its behaviour. The results show that adjusting a small subset of jobs improves the utilisation of the infrastructure, and that the savings obtained more than outweigh the extra work needed for the adjustment. The proposed approaches allow the schedulers to better utilise the infrastructure and increase the likelihood of finding appropriate resources for a job. Through the approaches described and the experiments presented, we contribute to the formulation of new solutions for schedulers in the fields of grid and cloud computing. Some possible extensions are given in the conclusions.
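
The idea of a collective demand, as described in the abstract, can be illustrated with a minimal sketch. This is not the algorithm from the thesis and does not use any XtreemOS interface; the `Node` type, the `select_collective` function, the greedy ordering, and the two resource dimensions (cores and memory) are all assumptions made purely for illustration. The sketch picks nodes until their summed free resources cover the request, even when no single node could satisfy it alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    # Hypothetical node description; field names are illustrative only.
    name: str
    free_cores: int
    free_memory_gb: int


def select_collective(nodes: List[Node], cores_needed: int,
                      memory_needed_gb: int) -> Optional[List[Node]]:
    """Greedily pick nodes whose summed free resources meet the demand.

    Returns the chosen nodes, or None if all nodes together fall short.
    """
    chosen: List[Node] = []
    cores = 0
    memory = 0
    # Assumption: prefer nodes with the most free cores so fewer are needed.
    for node in sorted(nodes, key=lambda n: n.free_cores, reverse=True):
        if cores >= cores_needed and memory >= memory_needed_gb:
            break  # the demand is already met collectively
        chosen.append(node)
        cores += node.free_cores
        memory += node.free_memory_gb
    if cores >= cores_needed and memory >= memory_needed_gb:
        return chosen
    return None


if __name__ == "__main__":
    pool = [Node("a", 4, 8), Node("b", 2, 4), Node("c", 8, 16)]
    # No single node offers 10 cores, but "c" and "a" together do.
    result = select_collective(pool, cores_needed=10, memory_needed_gb=20)
    print([n.name for n in result] if result else "demand cannot be met")
```

A real distributed scheduler would not see the whole node pool at once, which is where the longer search times reported in the abstract come from; the sketch ignores that and assumes a complete local list.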

    Item Type: Thesis (PhD thesis)
    Keywords: Distributed systems, resource management, scheduling, grid computing, cloud computing.
    Number of Pages: 96
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname                ID     Function
    Assoc. Prof. Dr. Uroš Lotrič    270    Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=10770260)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 2675
    Date Deposited: 13 Sep 2014 11:11
    Last Modified: 26 Sep 2014 08:44
    URI: http://eprints.fri.uni-lj.si/id/eprint/2675
