Integrative analysis of human tissue 3D epigenomes by combining multi-omics data

Background: After the Human Genome Project, a large number of genomic variants were identified, and subsequent studies investigated how these variants are associated with traits such as disease. The 3D genome is particularly useful for estimating the effects of variants in distal non-coding regions. However, the 3D genome data available for human tissues are insufficient to understand the regulatory mechanisms of non-coding variants in vivo. For this reason, many 3D genome prediction models have been developed, but most of them are either transcription factor (TF)-dependent or tissue-invariant. TF-dependent models are not suitable for predicting the 3D genome of human tissues that lack TF binding information, while tissue-invariant models do not reflect the cell-type-specific nature of genome structure.

Results: To overcome these limitations, we combined multi-omics data from human tissues and built a deep-learning model to predict 3D genome structure. First, 3D genomic and epigenomic data were curated, processed, and normalized to build a database. This database, 3DIV, provides intuitive visualization of uniformly processed human multi-omics data. Using these multi-omics data, we built DeepLUCIA, a deep-learning-based model that predicts chromatin loops, one of the distinctive features of the human 3D genome. DeepLUCIA predicts chromatin loops well from 12 epigenetic marks, even without CTCF, the critical TF in chromatin loop formation. As a benchmark, we compared DeepLUCIA with 3DEpiLoop, a chromatin loop prediction model based on a classical machine learning approach. The prediction accuracy was comparable, even though 3DEpiLoop requires extensive TF binding information. Moreover, DeepLUCIA can predict inter-domain chromatin loops, which might be crucial for higher-order genome structure and which 3DEpiLoop does not address.
Having verified the prediction performance, we connected genomic variants to traits in the context of the human tissue 3D genome. First, a fetal-heart-specific physical interaction between Brugada syndrome-associated variants and their target gene SCN5A was predicted. Similarly, genomic variants associated with hospitalization for SARS-CoV-2 infection were predicted to interact with the promoters of the CCR gene cluster in a monocyte- and lung-specific manner. Finally, age-related macular degeneration (AMD)-associated variants in the KCNT2 gene body were predicted to interact with the promoters of the complement system-related CFH/CFHR gene cluster in a liver-specific manner.

Conclusion: We curated massive epigenomic and genomic data made available by recent advances in genomics. These data were then combined with 3D genomics data to construct a multi-omics database. By applying deep learning to this database, we built a model that predicts the 3D genome of human tissues not covered by previous models based on classical machine learning. The database and prediction model should help biologists who are not familiar with 3D genome data by providing 3D genomic context for the interpretation of non-coding variants.
Advisors
Kim, Dongsup
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

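Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering, 2022.2, [viii, 104 p.]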

URI
http://hdl.handle.net/10203/308025
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=996463&flag=dissertation
Appears in Collection
BiS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
