(A) new similarity measure for categorical attribute-based clustering

Clustering is widely used in numerous applications, such as pattern recognition, image analysis, and market analysis. The important factors that determine cluster quality are the similarity measure and the number of attributes. Similarity measures should be defined with respect to the data types. Existing similarity measures work well for numerical attribute values. However, they do not work well when the data is described by categorical attributes, that is, when there is no inherent similarity measure between values. In high-dimensional spaces, conventional clustering algorithms tend to break down because of the sparsity of data points. To overcome this difficulty, a subspace clustering approach has been proposed. It is based on the observation that different clusters may exist in different subspaces. In this paper, we propose a new similarity measure for clustering high-dimensional categorical data. The measure is based on the idea that, in a good clustering, each cluster should carry information that distinguishes it from the other clusters. We also try to capture attribute dependencies. Experimental results on real datasets show that the clusters obtained with the proposed similarity measure are good enough with respect to clustering accuracy.
Advisors
Kim, Myoung-Ho
Description
Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2009
Identifier
327352/325007 / 020063056
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science major, 2009. 8., [ v, 22 p. ]

Keywords

clustering; similarity measure; k-means clustering

URI
http://hdl.handle.net/10203/34887
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=327352&flag=dissertation
Appears in Collection
CS-Theses_Master (Master's theses)
