Application of machine learning for language models in porous materials

Porous materials have received a great deal of attention in recent years for their wide range of applications, such as energy storage, gas separation and storage, catalysis, and sensing. This is due to their excellent properties, including large surface area, high chemical/thermal stability, and tunability. These materials are assembled from tunable molecular building blocks connected through covalent bonds or coordinated to metal ions (or clusters), and they can, in principle, be synthesized in a virtually unlimited number of combinations, spanning classes such as metal-organic frameworks (MOFs), covalent organic frameworks (COFs), porous polymer networks (PPNs), and zeolites. Recently, machine learning has seen rapid development across a wide range of applications, in particular language and vision, and a considerable amount of research has been conducted on its application to crystalline porous materials. In particular, identifying structure-property relationships and performing inverse design via machine learning have the potential to accelerate the discovery of optimal materials with desired properties when exploring the vast chemical space of porous materials. This dissertation aims to develop machine learning models that predict various properties of porous materials, such as synthesizability, gas uptake, diffusivity, and band gap, by leveraging machine learning architectures for language models, which exhibit state-of-the-art performance in natural language processing.

First, a positive-unlabeled learning algorithm was developed to predict the synthesizability of MOFs given synthesis conditions as inputs. To this end, synthesis conditions of MOFs were collected from the scientific literature using a text-mining code developed for this work. The algorithm correctly predicted successful synthesis for 83.1% of the synthesized data in the test set.

Second, the Transformer, which has become the dominant neural network architecture in language models, was introduced for universal transfer learning in MOFs, enabling transfer learning across various MOF properties. Specifically, MOFTransformer, a multi-modal Transformer encoder pre-trained on 1 million hypothetical MOFs, was developed. This multi-modal model integrates atom-based graph embeddings and energy-grid embeddings to capture the local and global features of MOFs, respectively. By fine-tuning the pre-trained model, it achieves state-of-the-art results in predicting a variety of properties.

Third, going beyond MOFs, we introduce PMTransformer (Porous Material Transformer), a multi-modal Transformer model pre-trained on a vast dataset of 1.9 million hypothetical porous materials, including MOFs, COFs, PPNs, and zeolites. PMTransformer showcases remarkable transfer learning capabilities, achieving state-of-the-art performance in predicting various porous material properties.

Fourth, a reinforcement learning framework was developed for the inverse design of MOFs with desired properties, motivated by the design of promising materials for direct air capture of CO2 (DAC), an important environmental application. We demonstrate that the reinforcement learning framework can successfully design MOFs with characteristics critical for DAC. Together, these approaches demonstrate how machine learning models developed for language can accelerate the discovery and design of porous materials.
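To make the multi-modal pre-training and fine-tuning idea concrete, the sketch below shows, in minimal PyTorch, how atom-based graph embeddings and energy-grid embeddings could be fused by a shared Transformer encoder and fine-tuned on a labeled property. This is an illustrative sketch of the general approach described in the abstract, not the published MOFTransformer or PMTransformer code; the module name (MultiModalPorousEncoder), tensor shapes, and hyperparameters are all hypothetical choices made for brevity.

# Minimal, illustrative sketch (not the actual MOFTransformer code) of the
# multi-modal idea: atom-based graph embeddings and energy-grid embeddings are
# treated as two token sequences, fused by a shared Transformer encoder, and a
# pooled [CLS]-style token feeds a small head fine-tuned for a target property.
# All shapes and hyperparameters below are assumptions chosen for brevity.
import torch
import torch.nn as nn


class MultiModalPorousEncoder(nn.Module):
    def __init__(self, graph_dim=64, grid_dim=32, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        # Project each modality into the shared embedding space of the encoder.
        self.graph_proj = nn.Linear(graph_dim, d_model)  # local, atom-based features
        self.grid_proj = nn.Linear(grid_dim, d_model)    # global, energy-grid features
        # Learnable [CLS] token whose final state summarizes the whole structure.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Property-specific head, re-initialized when fine-tuning on a new task.
        self.head = nn.Linear(d_model, 1)

    def forward(self, graph_tokens, grid_tokens):
        # graph_tokens: (batch, n_atoms, graph_dim); grid_tokens: (batch, n_cells, grid_dim)
        b = graph_tokens.size(0)
        tokens = torch.cat(
            [
                self.cls_token.expand(b, -1, -1),
                self.graph_proj(graph_tokens),
                self.grid_proj(grid_tokens),
            ],
            dim=1,
        )
        hidden = self.encoder(tokens)
        return self.head(hidden[:, 0])  # predict the property from the [CLS] state


# Fine-tuning loop sketch on dummy data standing in for a labeled MOF property set.
model = MultiModalPorousEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

graph_tokens = torch.randn(8, 50, 64)  # 8 structures, 50 atom tokens each
grid_tokens = torch.randn(8, 27, 32)   # 8 structures, 27 energy-grid patches each
labels = torch.randn(8, 1)             # target property values (e.g., gas uptake)

for _ in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(graph_tokens, grid_tokens), labels)
    loss.backward()
    optimizer.step()

In a realistic workflow, the encoder weights would be loaded from large-scale pre-training on hypothetical structures, and only the small property head would be re-initialized for each downstream task, which is what enables transfer learning across different MOF properties.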
Advisors
김지한 (Jihan Kim)
Publisher
한국과학기술원 (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST), Department of Chemical and Biomolecular Engineering, 2023.8, [vii, 80 p.]

Keywords

Porous materials; Molecular simulation; Machine learning; Inverse design; Natural language models

URI
http://hdl.handle.net/10203/320884
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1046843&flag=dissertation
Appears in Collection
CBE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
