Text mining: effective feature extraction and classification using NMF algorithm = 텍스트 마이닝: NMF 알고리즘을 이용한 효율적 특징 선택 및 분류

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Lee, Soo-Young | - |
| dc.contributor.advisor | 이수영 | - |
| dc.contributor.author | Barman, Paresh Chandra | - |
| dc.contributor.author | 발만, 파레쉬 찬드라 | - |
| dc.date.accessioned | 2011-12-12T07:25:37Z | - |
| dc.date.available | 2011-12-12T07:25:37Z | - |
| dc.date.issued | 2008 | - |
| dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=303563&flag=dissertation | - |
| dc.identifier.uri | http://hdl.handle.net/10203/27063 | - |
| dc.description | Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering, August 2008, [vii, 103 p.] | - |
| dc.description.abstract | In this dissertation, we propose a novel algorithm, nonnegative matrix factorization based on supervised feature selection and adaptation (NSFA), as an extension of unsupervised nonnegative matrix factorization (NMF) to document classification. In text mining systems, the term-frequency-based document vector representation is the most common model, and it regards terms as features. Natural language terms or words have inherent problems, such as synonymy, that prevent terms from being optimal features. The unsupervised NMF algorithm is used to extract meaningful basis factors and the corresponding coefficient factors of the documents, where the basis vectors capture the concepts of the documents by analyzing the co-occurrence distribution of terms. These basis vectors are used as features instead of individual terms. The unsupervised feature extraction reduces the feature dimension and also addresses the problems of natural language text. Not all features extracted by the unsupervised NMF algorithm are necessarily relevant or optimal for classification. Based on the given category information, the relevant features are selected and adapted to enhance classification performance. As the selection criterion, the rank of a mutual information (MI) based relevance measure is used. For the adaptation process, a standard NMF structure with a single-layer perceptron (NMF-SLP) and feed-forward multilayer perceptron (MLP) networks are used. For the NMF-SLP network, a hybrid feature adaptation algorithm (NMFH) is proposed, in which the document feature vectors and the classifier layer are trained by a gradient-descent-based error minimization rule and the basis or concept vectors of the NMF layer are trained by the KL-divergence minimization rule. For the feed-forward multilayer perceptron (MLP) network, we propose two learning algorithms, named MLP-NMFI and MLP-NMFI-NC. MLP-NMFI is defined as the MLP training by error back-propagation (EBP) rule w... | eng |
| dc.language | eng | - |
| dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
| dc.subject | Non-negative Matrix Factorization | - |
| dc.subject | Text Mining | - |
| dc.subject | Document Classification | - |
| dc.subject | Feature Adaptation | - |
| dc.subject | Feature Selection | - |
| dc.subject | 비음수 행렬 분해법 | - |
| dc.subject | 텍스트 마이닝 | - |
| dc.subject | 문서 분류 | - |
| dc.subject | 특징 적응 | - |
| dc.subject | 특징 선택 | - |
| dc.title | Text mining: effective feature extraction and classification using NMF algorithm = 텍스트 마이닝: NMF 알고리즘을 이용한 효율적 특징 선택 및 분류 | - |
| dc.type | Thesis(Ph.D) | - |
| dc.identifier.CNRN | 303563/325007 | - |
| dc.description.department | KAIST: Department of Bio and Brain Engineering | - |
| dc.identifier.uid | 020044523 | - |
| dc.contributor.localauthor | Lee, Soo-Young | - |
| dc.contributor.localauthor | 이수영 | - |
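
The abstract describes two building blocks: unsupervised NMF trained by KL-divergence minimization, and ranking of the extracted features by a mutual information (MI) based relevance measure. The following is a minimal Python sketch of those two ideas, not the thesis's NSFA implementation. It assumes the standard Lee-Seung multiplicative updates for the KL objective and, because the abstract does not specify the relevance measure, it binarizes each feature at its mean before computing MI. The names `nmf_kl`, `mutual_information`, and `select_features` are hypothetical.

```python
import math
import random

EPS = 1e-12  # guards against division by zero and log(0)

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def ratio(V, WH):
    """Elementwise V / (WH), the term shared by both multiplicative updates."""
    return [[v / (a + EPS) for v, a in zip(vr, ar)] for vr, ar in zip(V, WH)]

def kl_divergence(V, WH):
    """Generalized KL divergence D(V || WH), with 0 * log(0) taken as 0."""
    total = 0.0
    for v_row, a_row in zip(V, WH):
        for v, a in zip(v_row, a_row):
            if v > 0:
                total += v * math.log(v / (a + EPS))
            total += a - v
    return total

def nmf_kl(V, rank, iters=100, seed=0):
    """Factor a nonnegative term-by-document matrix V into
    W (terms x rank, basis vectors) and H (rank x docs, coefficients)."""
    rng = random.Random(seed)
    n_terms, n_docs = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_terms)]
    H = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(rank)]
    history = []
    for _ in range(iters):
        # H <- H * (W^T (V / WH)) / (column sums of W)
        num = matmul(transpose(W), ratio(V, matmul(W, H)))
        w_sums = [sum(col) for col in zip(*W)]
        H = [[H[k][j] * num[k][j] / (w_sums[k] + EPS) for j in range(n_docs)]
             for k in range(rank)]
        # W <- W * ((V / WH) H^T) / (row sums of H)
        num = matmul(ratio(V, matmul(W, H)), transpose(H))
        h_sums = [sum(row) for row in H]
        W = [[W[i][k] * num[i][k] / (h_sums[k] + EPS) for k in range(rank)]
             for i in range(n_terms)]
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history

def mutual_information(xs, ys):
    """MI in nats between two equal-length sequences of discrete values."""
    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def select_features(H, labels, k):
    """Rank the rows of H by MI with the class labels; return top-k indices."""
    scores = []
    for row in H:
        threshold = sum(row) / len(row)  # assumption: binarize at the row mean
        bits = [1 if h > threshold else 0 for h in row]
        scores.append(mutual_information(bits, labels))
    ranked = sorted(range(len(H)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores
```

For a small term-by-document count matrix `V` and one class label per document, `W, H, history = nmf_kl(V, rank=2)` followed by `select_features(H, labels, k=1)` returns the index of the coefficient row most informative about the labels. The supervised adaptation stages named in the abstract (NMFH, MLP-NMFI, MLP-NMFI-NC) are not covered by this sketch.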
Appears in Collection
BiS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
