ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Simulation of job execution in managed distributed system

Janez Perme (2011) Simulation of job execution in managed distributed system. MSc thesis.


    Abstract

    Computers are now used almost everywhere to solve complex computational problems, and the number of problems with high time and space complexity keeps growing. Such problems are solved on purpose-built parallel systems. Most modern systems of this type fall almost exclusively into Flynn's MIMD category; examples are tightly coupled SMP systems, loosely coupled MPP systems, and clusters. Especially in scientific computation, distributed systems have become established as an affordable substitute for parallel systems. They are suitable mainly for problems whose decomposition into subproblems is trivial. An important difference between the two approaches lies in their intended purpose. Parallel systems are designed to speed up problem solving (a faster system response time), while distributed systems are mainly intended for environments where a large number of problems are solved (increased system throughput). A special type of distributed system is the voluntary system, which allows computing resources (processing and memory resources) to be exploited. The best-known and most widespread such systems are BOINC and Condor. Both have the same aim: to exploit computing resources during idle time. BOINC is used to exploit computing resources in a wide-area environment, while Condor is suited to closed environments. The two can complement each other. The ability to examine and evaluate the performance characteristics of such systems is crucial because it allows computing resources to be used more efficiently. Following the example of BOINC and Condor, and after defining simplifications, a simulation model was designed in the form of an open queueing network with feedback. The simplifications were necessary because the simulation model would otherwise be difficult to manage. The queueing network consists of a central queue and working queues.
    Working queues receive tasks from the central queue, carry them out, and send the results back. Verification of the model confirmed that it operates correctly, and validation confirmed that it is designed correctly. The simulation model and the simulation were implemented in ns-2, an open-source discrete-event simulation tool, which proved extremely versatile and flexible. Each queue was built from two network nodes and the network connection between them. Service was realized by transferring a packet over a UDP network connection between the nodes. Based on the designed model, assumptions can be made about the behaviour and performance of the real system. Any change to the defined model parameters (such as queue capacity, number of queues, routing type, or the probability of a feedback loop) can significantly affect its behaviour. Experiments with the simulation model provided insight into how the model behaves as its parameters change. A Poisson process of UDP packets was used as the input. As the input intensity increases, so does the model load, which means that the central server unit can quickly become a bottleneck. Its load increases further with the number of feeding units and with the probability that UDP packets return to the central queue. If the central queue is sufficiently powerful, the bottleneck may instead be caused by the working queues. Random routing does not load the working queues evenly, especially if they differ in performance. In that case, the number of pending tasks in the lower-performance working queues grows rapidly, and the model consequently saturates. Balance is achieved by routing according to the "shortest queue first" principle, under which the working queues are almost the same length. Combining this routing method with working queues of different performance reveals some surprising behaviour.
    Working queues with better performance automatically tend to receive longer UDP packets, while those with worse performance receive shorter ones. This might mean that more powerful free computing resources tend to take on more complex tasks, while less powerful ones focus on less demanding tasks. The designed model is not restricted to simulating a voluntary system. It is useful for any system consisting of a central unit and a set of work units. The only requirement is that the central unit sends jobs or tasks to the work units, which return the results to the central unit upon completion.
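
    The model described in the abstract (a central queue feeding several working queues, with a feedback loop and a choice of routing policy) can be illustrated with a small discrete-event sketch. This is not the thesis implementation, which was built in ns-2 with UDP packet transfers; it is a plain Python stand-in under stated assumptions. The function name `simulate`, all parameter names and default values, and the use of exponentially distributed service times are hypothetical choices made only for illustration.

```python
import heapq
import random


def simulate(routing, lam=8.0, mu_central=50.0, mu_workers=(6.0, 2.5, 2.5),
             p_feedback=0.1, horizon=1000.0, seed=1):
    """Return the time-averaged number of jobs at each working queue.

    All rates and probabilities are illustrative assumptions, not thesis values.
    """
    rng = random.Random(seed)
    n = len(mu_workers)
    central = 0            # jobs at the central queue, including the one in service
    workers = [0] * n      # jobs at each working queue, including the one in service
    area = [0.0] * n       # integral of each working-queue length over time
    events = []            # min-heap of (time, sequence number, kind, worker index)
    counter = 0
    now = 0.0

    def schedule(delay, kind, idx=-1):
        nonlocal counter
        counter += 1       # unique tie-breaker keeps heap ordering well defined
        heapq.heappush(events, (now + delay, counter, kind, idx))

    def enter_central():
        nonlocal central
        central += 1
        if central == 1:   # server was idle, so start serving immediately
            schedule(rng.expovariate(mu_central), "central_done")

    schedule(rng.expovariate(lam), "arrival")   # Poisson input process
    while True:
        t, _, kind, idx = heapq.heappop(events)
        if t > horizon:
            break
        for i in range(n):
            area[i] += workers[i] * (t - now)
        now = t
        if kind == "arrival":
            enter_central()
            schedule(rng.expovariate(lam), "arrival")
        elif kind == "central_done":
            central -= 1
            if central > 0:
                schedule(rng.expovariate(mu_central), "central_done")
            if routing == "random":
                i = rng.randrange(n)
            else:  # "shortest": pick the working queue with the fewest jobs
                i = min(range(n), key=lambda k: workers[k])
            workers[i] += 1
            if workers[i] == 1:
                schedule(rng.expovariate(mu_workers[i]), "worker_done", i)
        else:  # "worker_done"
            workers[idx] -= 1
            if workers[idx] > 0:
                schedule(rng.expovariate(mu_workers[idx]), "worker_done", idx)
            if rng.random() < p_feedback:
                enter_central()   # feedback loop: job returns to the central queue
    for i in range(n):
        area[i] += workers[i] * (horizon - now)
    return [a / horizon for a in area]


if __name__ == "__main__":
    for policy in ("random", "shortest"):
        print(policy, [round(q, 1) for q in simulate(policy)])
```

    With these assumed defaults, random routing sends each of the two slower working queues more work than it can serve, so their backlog keeps growing, while "shortest queue first" routing keeps the queue lengths close to one another. The sketch does not model packet lengths, so it cannot reproduce the observation about longer packets being drawn to faster working queues.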

    Item Type: Thesis (MSc thesis)
    Keywords: distributed systems, parallel systems, BOINC, Condor, simulation, queuing network, performance evaluation systems, ns-2 simulation tool
    Number of Pages: 114
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname          ID      Function
    prof. dr. Nikolaj Zimic   244     Mentor
    doc. dr. Andrej Brodnik   5540    Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=00008606292)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 1493
    Date Deposited: 09 Sep 2011 10:04
    Last Modified: 19 Sep 2011 14:14
    URI: http://eprints.fri.uni-lj.si/id/eprint/1493
