Construction of explainable deep learning architectures for molecular graphs (Korean title: A study on deep learning architectures and explainability for molecular graphs)

Predicting the physical properties and biological activities of a compound from its molecular structure has long been of great interest to chemists, and the search for new methodologies with better prediction accuracy and efficiency has continued for a long time. Traditionally, properties were predicted through a physicochemical understanding of the molecule obtained from ab initio quantum chemistry methods, or through rule-based expert systems built on past experience. However, deep learning, which has been adopted in chemistry over the past five years, has shown its strength in both accuracy and speed when predicting desired properties and has revolutionized the strategy for solving problems in chemistry. In particular, graph-based deep learning methods, which use molecular graphs rather than knowledge-based molecular fingerprints, learn the relationship between molecular structures and properties directly and provide highly accurate predictions for previously unseen molecules. Herein, we therefore explore possible improvements to molecular graphs and the corresponding deep learning architectures for the efficient use of deep learning in chemistry, and we seek explainability in predictions made from structures. Molecular structures, which consist of atoms and bonds, can be intuitively converted into graph structures defined by nodes and edges. Consequently, representation learning on molecular structures encoded as molecular graphs has been explored and has recently been shown to outperform other deep learning methods. However, three-dimensional information such as the conformers and orientations of molecules is lost in the conversion to molecular graphs, because the graphs are essentially two-dimensional structures. To overcome this limitation of traditional molecular graphs, we investigated an extension of molecular graphs that retains three-dimensional information and developed a corresponding graph neural network for learning from the 3D information.
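The atoms-to-nodes, bonds-to-edges conversion described above, extended with 3D coordinates, can be sketched roughly as follows. This is a minimal illustration only: the `MolGraph3D` class, its field names, and the approximate water geometry are assumptions made for the example, not the representation used in the thesis.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist


@dataclass
class MolGraph3D:
    """Hypothetical container: atoms are nodes, bonds are edges, plus 3D coordinates."""

    symbols: list[str]                        # one element symbol per node
    coords: list[tuple[float, float, float]]  # Cartesian coordinates in Å
    bonds: list[tuple[int, int]]              # undirected edges between node indices

    def neighbors(self) -> dict[int, list[int]]:
        """Adjacency list derived from the bond list (the 2D graph view)."""
        adj: dict[int, list[int]] = {i: [] for i in range(len(self.symbols))}
        for i, j in self.bonds:
            adj[i].append(j)
            adj[j].append(i)
        return adj

    def bond_lengths(self) -> dict[tuple[int, int], float]:
        """Geometric edge feature that a purely 2D graph would not carry."""
        return {(i, j): dist(self.coords[i], self.coords[j]) for i, j in self.bonds}


# Illustrative, approximate geometry for water (assumed values).
water = MolGraph3D(
    symbols=["O", "H", "H"],
    coords=[(0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.2400, 0.9266, 0.0)],
    bonds=[(0, 1), (0, 2)],
)
print(water.neighbors())
print(water.bond_lengths())
```

Two conformers of the same molecule would share `symbols` and `bonds` but differ in `coords`, which is the kind of information a 2D-only graph cannot distinguish.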
Our proposed graph neural network on 3D graph representations showed improved prediction of physical properties and biochemical activities compared with traditional graph neural networks, even surpassing chemical accuracy on the hydration free energy prediction task. Moreover, when the network was trained on three-dimensional molecular structures and their enzyme inhibition activities, molecules with an appropriate orientation toward the enzyme were predicted to be active inhibitors, whereas other orientations of the same molecule were predicted to be less effective. Protein-ligand interaction is an important concept in pharmacochemistry, which seeks to understand the interactions between ligand molecules and their target proteins and their affinities. Approaches for predicting protein-ligand interactions with graph-based deep learning methods have attracted wide interest; however, the fact that the prediction depends on two independent molecules and is governed by noncovalent interactions has made the development of prediction models difficult. In this research, we developed a graph neural network that sequentially learns two molecular graphs, one representing the noncovalent interactions and one representing the covalent interactions, and analyzed the influence of noncovalent interactions on the prediction of the binding affinity of the complex. For efficient training of the model, we restricted the protein structure to a smaller pocket in the neighborhood of the ligand and found that a 5 Å cutoff was the most effective. The noncovalent interactions were found to be far more important than the covalent interactions for predicting binding affinity. These findings indicate that, with appropriate construction of the molecular graph, graph-based models can be applied to diverse problems in chemistry beyond the prediction of properties of a single molecule.
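The pocket restriction mentioned above can be illustrated with a simple distance filter. This is a hedged sketch: the function name `pocket_atom_indices`, the inclusive comparison at the cutoff, and the toy coordinates are assumptions for the example; the abstract states only that a 5 Å cutoff around the ligand worked best.

```python
from math import dist


def pocket_atom_indices(protein_coords, ligand_coords, cutoff=5.0):
    """Return indices of protein atoms within `cutoff` Å of any ligand atom.

    Hypothetical helper; coordinates are (x, y, z) tuples in Å.
    """
    return [
        i
        for i, p in enumerate(protein_coords)
        if any(dist(p, l) <= cutoff for l in ligand_coords)
    ]


# Toy coordinates (assumed): one ligand atom at the origin, four protein atoms.
ligand = [(0.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (6.0, 0.0, 0.0), (4.0, 4.0, 0.0)]

pocket = pocket_atom_indices(protein, ligand)
print(pocket)  # atoms at 3 Å and 4 Å are kept; those at 6 Å and about 5.66 Å are dropped
```

The filter is quadratic in the number of atoms, which is fine for a sketch; a spatial index would be the usual choice for full-size protein structures.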
Finally, applying deep learning models to real-world problems requires understanding the evidence behind their predictions and the consistency of those predictions, in addition to good accuracy. Because of their opacity, deep learning models are vulnerable to improper training that yields good average performance but relies on fragmentary knowledge. To mitigate the risk of misguided training, we applied explainability techniques to the aforementioned graph neural networks and evaluated the basis of their predictions. By visualizing atomic influences on the predicted properties as a heat map, we observed that hydroxyl and amine groups contribute the most to the solubility of a molecule, in good agreement with chemical knowledge. In particular, information on the presence of carbon rings and hydrogen bonds included in the graph representation had a strong influence on prediction accuracy. When the model was trained on protein-ligand complexes and their binding affinities, the relevance of each atom correlated well with the hydrogen-bond patterns between the ligand and the protein expected from chemical knowledge. Our experiments indicate a promising future for graph neural networks and their explainability analysis in applications across multiple fields of chemistry.
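The abstract does not name the explainability technique used, so the following is only a generic stand-in: an occlusion-style attribution that scores each atom by how much the prediction changes when that atom's feature is masked. The functions `occlusion_attribution` and `toy_predict`, and the per-atom feature values, are all hypothetical and carry no chemical meaning.

```python
def occlusion_attribution(predict, atom_features, mask_value=0.0):
    """Per-atom relevance as the prediction drop when one atom is masked.

    `predict` is any callable mapping a list of per-atom features to a number.
    """
    baseline = predict(atom_features)
    scores = []
    for i in range(len(atom_features)):
        masked = atom_features[:i] + [mask_value] + atom_features[i + 1:]
        scores.append(baseline - predict(masked))
    return scores


def toy_predict(atom_features):
    """Arbitrary stand-in model (assumption): sum of squared atom features."""
    return sum(f * f for f in atom_features)


# Made-up scalar features for three atoms, purely for illustration.
features = [2.0, 1.0, 0.0]
scores = occlusion_attribution(toy_predict, features)
print(scores)
```

Scores like these are what a heat map over the atoms would display; with a trained graph neural network in place of `toy_predict`, larger values would mark atoms the model relies on more.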
Advisors
Choi, Insung S. (최인성)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Chemistry
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Chemistry, 2020.8, [viii, 51 p.]

Keywords

deep learning; molecular graph; molecular property prediction; protein-ligand interaction; explainable AI

URI
http://hdl.handle.net/10203/295823
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=964776&flag=dissertation
Appears in Collection
CH-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
