An Incremental Multi-Centroid, Multi-Run Sampling Scheme For K-medoids-based Algorithms
Price
Free (open access)
Volume
28
Pages
Published
2002
Size
695 kb
Paper DOI
10.2495/DATA020531
Copyright
WIT Press
Author(s)
S-C Chu, J F Roddick & J-S Pan
Abstract
Data clustering has become an important task for discovering significant patterns and characteristics in large spatial databases. In our previous work, the Multi-Centroid, Multi-Run Sampling Scheme (MCMRS) was shown to be effective in improving k-medoids-based clustering algorithms. In this paper, a more advanced sampling scheme, termed the Incremental Multi-Centroid, Multi-Run Sampling Scheme (IMCMRS), is proposed for k-medoids-based clustering algorithms. Experimental results demonstrate that, compared with CLARA and CLARANS, the proposed scheme not only reduces computation time by more than 80% but also reduces the average distance per object. IMCMRS is also superior to MCMRS.
1 Introduction
Clustering is a useful practice of classification imposed over a finite set of objects. The goal of clustering is to group objects into classes such that objects within a group have similar characteristics, while dissimilar objects fall into separate groups. Various clustering algorithms have been proposed and designed to fit different formats and application constraints, including k-means [16], k-medoids [11], BIRCH [18], CURE [8], CHAMELEON [10], DBSCAN [4],
Keywords