FP2VEC: new molecular featurizer inspired by natural language processing

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Kim, Dong Sup | - |
| dc.contributor.advisor | 김동섭 | - |
| dc.contributor.author | Jeon, Woosung | - |
| dc.date.accessioned | 2019-09-03T02:40:24Z | - |
| dc.date.available | 2019-09-03T02:40:24Z | - |
| dc.date.issued | 2019 | - |
| dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=843182&flag=dissertation | en_US |
| dc.identifier.uri | http://hdl.handle.net/10203/266145 | - |
| dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Bio and Brain Engineering, 2019.2, [iv, 28 p.] | - |
| dc.description.abstract | Quantitative structure-activity relationship (QSAR) models are regression or classification models that predict the chemical properties of compounds. Accurate QSAR predictions can save time and cost compared with actual experiments. The quality of a QSAR prediction also depends on the molecular featurizer, the numerical representation of a chemical compound. Recently, machine learning and deep learning techniques have been widely used to develop new molecular featurizers that improve the prediction accuracy of QSAR models. Here we introduce FP2VEC, a new molecular featurizer inspired by natural language processing (NLP) techniques. FP2VEC expresses a chemical compound as a vector representation that is trained by supervised learning. To evaluate the prediction performance of FP2VEC, we built a QSAR model using a simple convolutional neural network (CNN). We evaluated the model on four datasets for classification tasks and five datasets for regression tasks, and compared it with other molecular featurizer models. On the classification tasks, our model showed the best prediction accuracy among the benchmark models on three of the four datasets, and our model implemented with multi-task learning outperformed the other benchmark models. On the regression tasks, our model showed the best performance on two of the five datasets. Lastly, we tested the effect of the hyperparameters in our model and found that some of them significantly influenced the prediction accuracy. Overall, our new molecular featurizer based on NLP techniques provides more useful information and improves QSAR prediction accuracy compared with previous methods. | - |
| dc.language | eng | - |
| dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
| dc.subject | Molecular featurizer; quantitative structure-activity relationship; QSAR; natural language processing; NLP; convolutional neural network; CNN; multi-task learning; QSAR prediction | - |
| dc.subject | molecular representation; quantitative structure-activity relationship model; natural language processing; convolutional neural network; multi-task learning; quantitative structure-activity relationship prediction | - |
| dc.title | FP2VEC | - |
| dc.title.alternative | FP2VEC : a new molecular representation using natural language processing | - |
| dc.type | Thesis(Master) | - |
| dc.identifier.CNRN | 325007 | - |
| dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Bio and Brain Engineering | - |
| dc.contributor.alternativeauthor | 전우성 | - |
| dc.title.subtitle | new molecular featurizer inspired by natural language processing | - |
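
The abstract describes a featurizer that turns a compound into a trainable vector representation and feeds it to a simple CNN, but it does not spell out the encoding. The sketch below shows one plausible reading of that idea, in the way NLP models embed word tokens: the indices of the set bits of a binary fingerprint are treated as tokens, looked up in an embedding table, and passed through a single 1D convolution filter with global max pooling. Every name here (`fingerprint_to_tokens`, `pad`, `make_embedding`, `embed`, `conv1d_maxpool`), the toy 8-bit fingerprint, and the random, untrained embedding values are illustrative assumptions, not the thesis's implementation.

```python
import random


def fingerprint_to_tokens(bits):
    """Return 1-based indices of the set bits; 0 is reserved for padding."""
    return [i + 1 for i, b in enumerate(bits) if b]


def pad(tokens, length):
    """Right-pad with 0 (or truncate) so every compound has the same length."""
    return (tokens + [0] * length)[:length]


def make_embedding(vocab_size, dim, seed=0):
    """Random lookup table (assumption); a trained model would learn these rows."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(vocab_size + 1)]
    table[0] = [0.0] * dim  # the padding token maps to the zero vector
    return table


def embed(tokens, table):
    """Replace each token by its embedding row: a (length x dim) matrix."""
    return [table[t] for t in tokens]


def conv1d_maxpool(matrix, kernel):
    """Slide one (window x dim) filter over the sequence, apply ReLU,
    then take the global maximum over all positions."""
    window = len(kernel)
    best = 0.0  # ReLU floor: the result is never negative
    for start in range(len(matrix) - window + 1):
        s = sum(kernel[k][d] * matrix[start + k][d]
                for k in range(window)
                for d in range(len(kernel[k])))
        best = max(best, s)
    return best


bits = [0, 1, 0, 0, 1, 1, 0, 1]        # toy 8-bit fingerprint (hypothetical)
tokens = pad(fingerprint_to_tokens(bits), 6)
table = make_embedding(vocab_size=len(bits), dim=4)
matrix = embed(tokens, table)
kernel = [[0.5] * 4, [-0.5] * 4]       # one toy filter spanning two tokens
feature = conv1d_maxpool(matrix, kernel)
```

In a supervised setup of the kind the abstract mentions, the embedding table and the filter weights would be fitted jointly against the activity labels; here they are fixed so the data flow stays easy to follow.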
Appears in Collection
BiS-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
