DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 김현욱 | - |
dc.contributor.author | Kim, Minji | - |
dc.contributor.author | 김민지 | - |
dc.date.accessioned | 2024-07-25T19:31:03Z | - |
dc.date.available | 2024-07-25T19:31:03Z | - |
dc.date.issued | 2023 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045830&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/320622 | - |
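dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Department of Chemical and Biomolecular Engineering, 2023.8, [iii, 36 p.] | - |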
dc.description.abstract | The field of biological sciences has seen a significant increase in the number of published papers, which serve as a crucial source for novel discoveries. However, despite the availability of advanced search engines, efficiently gathering and processing newly reported data from this vast body of literature has become increasingly challenging. One critical type of information that requires systematic extraction from literature is gene-protein-reaction (GPR) associations. GPR associations play an instrumental role in studying the connection between an organism’s genetic makeup and its observable characteristics. They also enable the development of computational models such as genome-scale metabolic models. This study introduces a Python-based text-mining framework that facilitates the efficient and systematic extraction of GPR association information from literature. The system employs multiple deep learning-based language models, namely BioBERT, PubMedBERT, and BioGPT, to retrieve data on five entity types: species, genes, proteins, chemicals, and metabolites. The extracted GPR associations are subsequently reconstructed as Boolean logic expressions. The text-mining framework developed in this study holds promise for enhancing the efficient and comprehensive collection of biological information (i.e., GPR associations) from an extensive corpus of literature. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | genome analysis; gene-protein-biochemical reaction association; text mining; deep learning-based language model | - |
dc.subject | genome annotation; gene-protein-reaction association; text mining; deep learning-based language model | - |
dc.title | Reconstruction of gene-protein-reaction associations using biological big data and deep learning | - |
dc.title.alternative | Construction of gene-protein-biochemical reaction associations based on bio big data and deep learning | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): Department of Chemical and Biomolecular Engineering | - |
dc.contributor.alternativeauthor | Kim, Hyun Uk | - |