Machine learning for the identification of noncoding driver mutations in cancer = A study of machine learning algorithms for identifying the function of mutations arising in cancer cells

One of the greatest challenges in cancer genomics is distinguishing driver mutations from passenger mutations. Although recurrence is a hallmark of driver mutations, recurring noncoding mutations are difficult to observe because few whole-genome sequenced samples are available. A method for predicting potentially recurrent mutations is therefore needed. In this work, I developed a random forest classifier that predicts regulatory mutations likely to recur, based on the features of mutations that appear repeatedly in a given cohort. Recurrent mutations can arise at the same site or can affect the same gene from different sites. Here I identified a set of mutations that arise in individual samples and alter different cis-regulatory elements, yet converge on a common gene via chromatin interactions. Using breast cancer and lung cancer as models, I profiled up to 50 quantitative features describing genetic and epigenetic signals at the mutation site, transcription factors whose binding motifs were disrupted by the mutation, and genes targeted by long-range chromatin interactions. A true set of mutations for training the random forest was generated by interrogating publicly available pan-cancer genomes with our statistical model of mutation recurrence. The performance of the random forest classifier was evaluated by cross-validation. My methods make it possible to characterize recurrent regulatory mutations from a limited number of whole-genome samples and, based on that characterization, to predict potential driver mutations whose recurrence is not found in the given samples but is likely to be observed with additional samples. The mutations and genes identified in this fashion showed strong relevance to cancer, in contrast to those with site-specific recurrence. My methods accurately predicted mutations recurring at the target-gene level, but not those recurring at the same site.
In conclusion, I propose a novel approach to discovering potential cancer-driving mutations in noncoding regions.
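The distinction between site-level and target-gene-level recurrence described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy mutation records, the element-to-gene map standing in for chromatin interactions, the function names, and the simple "at least two samples" cutoff, which replaces the statistical model of mutation recurrence used in the thesis. It is not the thesis implementation, and the random forest step is not shown.

```python
from collections import defaultdict

# Hypothetical toy data: (sample_id, chromosome, position, regulatory_element_id).
MUTATIONS = [
    ("S1", "chr1", 1000, "enhancer_A"),
    ("S2", "chr1", 1000, "enhancer_A"),
    ("S3", "chr1", 5200, "enhancer_B"),
    ("S4", "chr2", 8800, "promoter_C"),
    ("S5", "chr3", 4100, "enhancer_D"),
]

# Hypothetical map from a cis-regulatory element to the genes it contacts
# through (long-range) chromatin interactions.
INTERACTIONS = {
    "enhancer_A": ["GENE1"],
    "enhancer_B": ["GENE2"],
    "promoter_C": ["GENE2"],
    "enhancer_D": ["GENE2", "GENE3"],
}


def site_recurrence(mutations):
    """Count distinct samples mutated at each (chromosome, position) site."""
    samples_by_site = defaultdict(set)
    for sample, chrom, pos, _element in mutations:
        samples_by_site[(chrom, pos)].add(sample)
    return {site: len(samples) for site, samples in samples_by_site.items()}


def gene_recurrence(mutations, interactions):
    """Count distinct samples whose mutated elements target each gene."""
    samples_by_gene = defaultdict(set)
    for sample, _chrom, _pos, element in mutations:
        for gene in interactions.get(element, []):
            samples_by_gene[gene].add(sample)
    return {gene: len(samples) for gene, samples in samples_by_gene.items()}


def recurrent(counts, min_samples=2):
    """Keep keys seen in at least min_samples samples (an assumed cutoff)."""
    return {key for key, n in counts.items() if n >= min_samples}


if __name__ == "__main__":
    print("Recurrent sites:", recurrent(site_recurrence(MUTATIONS)))
    print("Recurrent genes:", recurrent(gene_recurrence(MUTATIONS, INTERACTIONS)))
```

In this toy cohort, GENE2 is hit in three samples through three different elements, so it is recurrent at the gene level even though none of its sites recurs. Mutations flagged this way could serve as positive examples for a classifier such as the random forest mentioned above.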
Choi, Jung Kyoon (최정균)
Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering
Issue Date

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering, 2017.8, [iv, 83 p.]


machine learning; epigenome; cancer somatic mutation; distal chromatin interaction; transcriptome
