Computational identification of sequence variation and environmental condition in clinical depression from biomedical literature생물학 문헌을 통한 우울증에서의 염기서열 변화와 환경조건 자동 식별

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 513
  • Download : 0
Clinical depression is a complex disease, which is known to be influenced by various factors. As genetic and environmental factors are frequently referred to as the most influential in causing depression, there have been many studies that try to identify genes or proteins and environmental conditions associated with depression. While a number of text-mining (TM) systems identifying information about the genetic factors in the biomedical literature have consequently been developed, there is currently no TM system specifically targeted at extracting environmental conditions. As a result, biologists are provided only with incomplete information about depression by these TM systems, unable to help them to discover the etiology and treatment of depression. In the thesis, we propose a TM system that considers an interaction between genetic and environmental factors associated with depression. The system identifies not only relations between a sequence variation and depression but also changes in the relations according to environmental conditions. In order to develop the system, we split the system into two TM subsystems. The first system is applied to an existing system for extracting the relations between a sequence variation and depression from the biomedical literature. The system classifies whether the relations are positive or negative on a document level. Based on the dictionary with candidate terms for environmental conditions, the second system identifies the conditions in the biomedical literature containing the binary relations. Using the dependency of sentence, the system excludes terms wrongly classified as the conditions. The system is a first TM system considering a ternary relation among sequence variation, disease and condition. Through the system, we are able to provide more comprehensive information about depression than other systems. We expect that, as the system is applied to other diseases, biologists can easily identify diverse information associated with changes in symptoms of diseases including depression.
Advisors
Park, Jong C.researcher박종철researcher
Description
한국과학기술원 :전산학부,
Publisher
한국과학기술원
Issue Date
2016
Identifier
325007
Language
eng
Description

학위논문(석사) - 한국과학기술원 : 전산학부, 2016.8 ,[iv, 27 p. :]

Keywords

depression; sequence variation; environmental condition; TM system; NLP; 우울증; 염기서열 변화; 환경조건; 텍스트 마이닝 시스템; 자연언어처리

URI
http://hdl.handle.net/10203/221875
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=663487&flag=dissertation
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0