Uroš Čibej (2007) Data replication in grid computing. PhD thesis.
A grid is a distributed system that enables dynamic aggregation of geographically dispersed computing and data resources. In data grids, data management applications often use data replication to improve data access time and provide better fault tolerance. The goal of this thesis is to study data replication in data grids, present a theoretical basis for the design of new replication methods, and propose a set of new algorithms and methods. We present a set of models that include different parameters of a data grid. We prove that data replication is an NP-hard and non-approximable optimization problem. Furthermore, we demonstrate with simulations that the models describe the quality of data placements well. For the formulated optimization problem we develop a set of heuristic algorithms, compare them on a problem set, and identify the best one. Since centralized replication scales poorly, we also develop a distributed replication method. Using simulations, we demonstrate the superiority of this method over other existing methods.