Finding type 2 diabetes causal single nucleotide polymorphism combinations and functional modules from genome-wide association data

Background: Due to the low statistical power of individual markers from a genome-wide association study (GWAS), detecting causal single nucleotide polymorphisms (SNPs) for complex diseases is a challenge. SNP combinations have been suggested to compensate for the low statistical power of individual markers, but generating SNP combinations from GWAS data incurs high computational complexity.
Methods: We aim to detect type 2 diabetes (T2D) causal SNP combinations from a GWAS dataset with optimal filtration and to discover the biological meaning of the detected SNP combinations. Optimal filtration enhances the statistical power of SNP combinations by comparing their error rates across various Bonferroni thresholds and p-value range-based thresholds, combined with linkage disequilibrium (LD) pruning. T2D causal SNP combinations are selected from an optimal SNP dataset using random forests with variable selection. T2D causal SNP combinations and genome-wide SNPs are mapped into functional modules using expanded gene set enrichment analysis (GSEA), which considers pathway, transcription factor (TF)-target, miRNA-target, gene ontology, and protein complex functional modules. The prediction error rates are measured for SNP sets from functional module-based filtration, which selects SNPs within functional modules from genome-wide SNPs based on expanded GSEA.
Results: A T2D causal SNP combination containing 101 SNPs from the Wellcome Trust Case Control Consortium (WTCCC) GWAS dataset is selected using optimal filtration criteria, with an error rate of 10.25%. Matching the 101 SNPs with known T2D genes and functional modules reveals the relationships between T2D and SNP combinations. The prediction error rates of SNP sets from functional module-based filtration show no significant difference from the prediction error rates of randomly selected SNP sets and of T2D causal SNP combinations from optimal filtration.
Conclusions: We propose a detection method for complex disease causal SNP combinations from an optimal SNP dataset by using random forests with variable selection. Mapping the biological meanings of detected SNP combinations can help uncover complex disease mechanisms.
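The filtration step described in the Methods (a significance threshold followed by LD pruning) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the significance level `alpha=0.05`, the LD cutoff `r2_max=0.8`, the greedy pruning order, the SNP identifiers, and the toy genotype vectors are all assumptions made for illustration, and the random forest and GSEA stages are not shown.

```python
from statistics import mean


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a SNP with no variation carries no LD information
    return cov * cov / (vx * vy)


def filter_and_prune(p_values, genotypes, alpha=0.05, r2_max=0.8):
    """Keep SNPs that pass the Bonferroni threshold, then greedily drop any SNP
    whose r^2 with an already kept (more significant) SNP exceeds r2_max."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    passing = sorted((s for s, p in p_values.items() if p <= cutoff),
                     key=p_values.get)
    kept = []
    for snp in passing:
        if all(r_squared(genotypes[snp], genotypes[k]) <= r2_max for k in kept):
            kept.append(snp)
    return kept


# Hypothetical toy data: four SNPs genotyped in six individuals.
p_values = {"rs_a": 1e-6, "rs_b": 2e-6, "rs_c": 1e-4, "rs_d": 0.3}
genotypes = {
    "rs_a": [0, 1, 2, 1, 0, 2],
    "rs_b": [0, 1, 2, 1, 0, 2],  # identical to rs_a, so in perfect LD with it
    "rs_c": [2, 0, 1, 1, 0, 2],
    "rs_d": [1, 1, 0, 2, 1, 0],
}

selected = filter_and_prune(p_values, genotypes)
print(selected)  # ['rs_a', 'rs_c']
```

In this toy run, `rs_d` fails the threshold of 0.05 / 4 = 0.0125 and `rs_b` is pruned because it duplicates `rs_a`. Repeating the procedure over several thresholds and comparing the classifier error on each resulting SNP set would correspond to the threshold comparison the abstract describes.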
Publisher
BIOMED CENTRAL LTD
Issue Date
2013-04
Language
English
Article Type
Article; Proceedings Paper
Keywords

ENRICHMENT ANALYSIS; METABOLIC PATHWAYS; PROTEIN COMPLEXES; GENE SELECTION; RESOURCE; RISK; SET; CLASSIFICATION; PERSPECTIVES; ANNOTATION

Citation

BMC MEDICAL INFORMATICS AND DECISION MAKING, v.13, no.sup.1

ISSN
1472-6947
DOI
10.1186/1472-6947-13-S1-S3
URI
http://hdl.handle.net/10203/173869
Appears in Collection
BiS-Journal Papers(저널논문)