TY - THES
T1 - Clustering with neighborhood graphs
A1 - Maier,Markus Martin
Y1 - 2010/05/06
N2 - Graph clustering methods are defined for general weighted graphs. If data is given in the form of points and distances between them, a neighborhood graph, such as the r-graph or kNN-graphs, is constructed and graph clustering is applied to this graph. We investigate the influence of the type and parameter of the neighborhood graph on the clustering results, when n sample points are drawn independently from a density in Euclidean space. In Chapter 2 we study "cluster identification';: the true clusters are the connected components of density level sets and a cluster is identified if its points are a connected component of the graph. We compare (modifications of) the mutual and the symmetric kNN-graph. They behave differently if the goal is to identify the "most significant'; clusters, whereas there is no difference if the goal is to identify all clusters. We give the range of k for which the clusters are identified in the graphs and derive the optimal choice of k, which, surprisingly, is linear in n. In Chapter 3 we study the convergence of the normalized cut (Ncut) and the ratio cut as n -> for cuts in the kNN- and the r-graph induced by a hyperplane. The limits differ; consequently Ncut on a kNN-graph does something systematically different than Ncut on an r-graph! This can be experimentally observed on toy and real data sets. Therefore, graph clustering criteria cannot be studied independently of the type of graph to which they are applied.
KW - Graph
KW - Cluster-Analyse
KW - Clusteridentifizierung
CY - Saarbrücken
PB - Saarländische Universitäts- und Landesbibliothek
AD - Postfach 151141, 66041 Saarbrücken
UR - http://scidok.sulb.uni-saarland.de/volltexte/2010/2966
ER -