Data Clustering for Autonomic Application Replication
Author: Jie Yang
Source: Master's thesis, Vrije Universiteit, August 2005.
Abstract
This thesis has been realized in the context of GlobeDB, a system for
hosting Web applications that can automatically replicate application
data and maintain distributed consistency. GlobeDB adopts partial
replication to reduce the network latency and traffic, and adopts data
clusters to reduce the overhead of fine-grained replication. However,
GlobeDB originally proposed only a naive clustering algorithm, which was a
bottleneck for the system's performance. This thesis discusses the issue
of data clustering in GlobeDB. The main challenges include evaluating
the quality of clusters, selecting a clustering algorithm, and
deciding on a suitable number of clusters. We systematically study
various clustering algorithms and propose several new
algorithms. Experiments show that the new algorithms can
improve the performance of GlobeDB. We also propose criteria for selecting
the best clustering algorithm and parameters for a given
situation. In addition, we find that periodic reclustering can
improve performance compared with a non-reclustering strategy, and that the
best reclustering period depends on the stability of the application
data's popularity.
Bibtex Entry
@MastersThesis{Yang2005,
author = {Jie Yang},
title = {Data Clustering for Autonomic Application Replication},
school = {Vrije Universiteit},
address = {Amsterdam, The Netherlands},
year = {2005},
month = aug
}
gpierre@cs.vu.nl