Support vector based classification for bio-data mining (바이오 데이터 마이닝을 위한 서포트 벡터 기반 분류)

Data mining algorithms based on support vector learning, such as Support Vector Machines (SVMs) and Support Vector Data Description (SVDD), have many benefits over other data mining algorithms and have been widely used in many research areas, including bioinformatics. The conventional SVMs and the original SVDD, however, lack a mechanism for reflecting variations in the significance of data in a given data set; they treat all data points equally. In many real-world problems, data may have different degrees of significance due to noise, missing values, or density. The main objective of this thesis is therefore to propose new methods that identify the optimal solution more accurately by incorporating these differences in significance into data mining algorithms such as SVMs and SVDD, and to apply the proposed methods to real problems, especially in bioinformatics. To this end, the thesis addresses three questions: 1) how to design new SVMs that take differences in significance into account to more accurately identify the optimal hyperplane (OHP) from a data set whose members differ in significance, 2) how to design a new SVDD that more correctly identifies the optimal description of a target data set by reflecting the differences in significance arising from the different density of each data point, and 3) how to apply the proposed methods to real-world problems such as the prediction of protein subcellular localization. To answer the first question, we propose S-SVMs, which reflect the differences in the significance of training data. In the S-SVMs we introduce a new distance measure, called a significance-based distance, which calculates the distance between a data point and a hyperplane based on the significance degree of the data point. Using this distance measure, S-SVMs find the optimal hyperplane whil...
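As a rough illustration of what it can mean for a support vector classifier to stop treating all data equally, the sketch below trains a linear SVM in which each training point's hinge-loss penalty is scaled by a significance value. This is not the S-SVM formulation of the thesis: the significance-based distance is not specified in this abstract, so the sketch substitutes the simpler and widely used idea of per-sample weighting of the slack penalty. The function names, the toy data, and the significance values are all assumptions made for illustration.

```python
# Illustrative sketch only: a linear SVM with a per-sample weighted hinge loss.
# This is NOT the thesis's S-SVM or its significance-based distance; the
# weighting scheme, names, and data below are hypothetical.

def train_weighted_linear_svm(X, y, significance, C=1.0, lr=0.01, epochs=2000):
    """Minimise 0.5*||w||^2 + C * sum_i s_i * max(0, 1 - y_i*(w.x_i + b))
    with full-batch subgradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regulariser 0.5*||w||^2
        grad_b = 0.0
        for xi, yi, si in zip(X, y, significance):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge term is active for this point
                for j in range(n_features):
                    grad_w[j] -= C * si * yi * xi[j]
                grad_b -= C * si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return the predicted label (+1 or -1) for a single point x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: two well-separated clusters plus one point that sits in the
# negative cluster but carries a (presumably noisy) positive label.
X = [[2, 2], [3, 3], [2, 3], [3, 2],
     [-2, -2], [-3, -3], [-2, -3], [-3, -2],
     [-2.5, -2.5]]
y = [1, 1, 1, 1, -1, -1, -1, -1, 1]
# Hypothetical significance degrees: the suspicious point gets a low value.
significance = [1.0] * 8 + [0.05]

w, b = train_weighted_linear_svm(X, y, significance)
```

With a low significance value, the suspicious point contributes little to the objective, so the learned hyperplane is determined almost entirely by the eight reliable points. Setting all significance values to 1.0 recovers an ordinary soft-margin linear SVM objective.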
Advisors
Lee, Kwang-Hyung (이광형)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Computer Science
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2006
Identifier
254437/325007  / 020025207
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science, 2006.2, [ xiii, 122 p. ]

Keywords

classification; support vector data description; support vector machines; support vector learning; protein subcellular localization

URI
http://hdl.handle.net/10203/32909
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=254437&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
