Institute of Computing Technology, Chinese Academy IR
A unique property of single-link distance and its application in data clustering | |
Song, Yuqing1; Jin, Shuyuan2; Shen, Jie3 | |
2011-11-01 | |
Journal | DATA & KNOWLEDGE ENGINEERING |
ISSN | 0169-023X |
Volume | 70, Issue: 11, Pages: 984-1003 |
Abstract | We prove a unique property of single-link distance, based on which an algorithm is designed for data clustering. The property states that a single-link cluster is a subset with inter-subset distance greater than intra-subset distance, and vice versa. Among the major linkages (single, complete, average, centroid, median, and Ward's), only single-link distance has this property. Based on this property we introduce monotonic sequences of iclusters (i.e., single-link clusters) to model the phenomenon that a natural cluster has a dense kernel and the density decreases as we move from the kernel to the boundary. A monotonic sequence of iclusters is a sequence of nested iclusters such that each icluster in the sequence is a dominant child (in terms of size) of the icluster before it. Our data clustering algorithm is based on monotonic sequences. We classify a data set of one monotonic sequence into two classes by splitting the sequence into two parts: the kernel part and the surrounding part. For a data set of multiple monotonic sequences, each leaf monotonic sequence represents the kernel of a class, which then "grows" by absorbing nearby non-kernel points. Experiments show that this algorithm compares favorably in effectiveness with other clustering algorithms. (C) 2011 Elsevier B.V. All rights reserved. |
Keywords | Hierarchical clustering; Single-link cluster; Icluster; Isolation compactness; Monotonic sequence |
DOI | 10.1016/j.datak.2011.07.003 |
Indexed in | SCI |
Language | English |
Funding | Natural Science Foundation of China[61070112] ; Natural Science Foundation of China[61070116] ; Hi-Tech Research and Development Program of China (863 Program)[2009AA01Z317] |
WOS Research Area | Computer Science |
WOS Category | Computer Science, Artificial Intelligence ; Computer Science, Information Systems |
WOS Accession Number | WOS:000295431000003 |
Publisher | ELSEVIER SCIENCE BV |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/12582 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Song, Yuqing |
Author Affiliations | 1. Tianjin Univ Technol & Educ, Tianjin 300222, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Beijing 100864, Peoples R China; 3. Univ Michigan, Dept Comp & Informat Sci, Dearborn, MI 48128 USA |
Recommended Citation: GB/T 7714 | Song, Yuqing, Jin, Shuyuan, Shen, Jie. A unique property of single-link distance and its application in data clustering[J]. DATA & KNOWLEDGE ENGINEERING, 2011, 70(11): 984-1003. |
APA | Song, Yuqing, Jin, Shuyuan, & Shen, Jie. (2011). A unique property of single-link distance and its application in data clustering. DATA & KNOWLEDGE ENGINEERING, 70(11), 984-1003. |
MLA | Song, Yuqing, et al. "A unique property of single-link distance and its application in data clustering". DATA & KNOWLEDGE ENGINEERING 70.11 (2011): 984-1003. |
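The property stated in the abstract (a subset is a single-link cluster exactly when its inter-subset distance exceeds its intra-subset distance) can be illustrated with a small Python sketch. This record does not define the two distances, so the definitions below are assumptions made for illustration only: the inter-subset distance is taken to be the single-link (minimum pairwise) distance between the subset and the rest of the data, and the intra-subset distance is taken to be the longest edge of a minimum spanning tree of the subset. The function names are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_link(a, b):
    """Single-link distance: smallest pairwise distance between two point sets."""
    return min(dist(p, q) for p in a for q in b)


def intra_distance(subset):
    """ASSUMED definition: longest edge of a minimum spanning tree of `subset`.

    Computed with Prim's algorithm; a singleton has intra-distance 0.
    """
    if len(subset) < 2:
        return 0.0
    rest = list(subset[1:])
    # best[i] = distance from rest[i] to the tree grown so far
    best = {i: dist(subset[0], p) for i, p in enumerate(rest)}
    longest = 0.0
    while best:
        i = min(best, key=best.get)          # closest point not yet in the tree
        longest = max(longest, best.pop(i))  # record the edge that attaches it
        for j in best:
            best[j] = min(best[j], dist(rest[i], rest[j]))
    return longest


def inter_distance(points, subset):
    """ASSUMED definition: single-link distance from `subset` to all other points."""
    outside = [p for p in points if p not in subset]
    return single_link(subset, outside) if outside else float("inf")


def is_icluster(points, subset):
    """Test the property from the abstract: inter-subset > intra-subset distance."""
    return inter_distance(points, subset) > intra_distance(subset)


# Five distinct points on a line: a tight group of three and a tight group of two.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
print(is_icluster(points, [(0, 0), (1, 0), (2, 0)]))  # True: gap 8 outside, 1 inside
print(is_icluster(points, [(2, 0), (10, 0)]))         # False: gap 1 outside, 8 inside
```

Under these assumed definitions, the first subset is isolated from the rest by more than its internal gaps, which is the situation in which single-link agglomeration would form it as a cluster, whereas the second subset would be merged with its neighbors before its two points are joined to each other.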