Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Use of support vector machines for disease risk prediction in genome-wide association studies: Concerns and opportunities

Mittag, Florian, Büchel, Finja, Saad, Mohamad, Jahn, Andreas, Schulte, Claudia, Bochdanovits, Zoltan, Simón-Sánchez, Javier, Nalls, Mike A., Keller, Margaux, Hernandez, Dena G., Gibbs, J. Raphael, Lesage, Suzanne, Brice, Alexis, Heutink, Peter, Martinez, Maria, Wood, Nicholas W, Hardy, John, Singleton, Andrew B., Zell, Andreas, Gasser, Thomas, Sharma, Manu, Williams, Nigel Melville and Morris, Huw 2012. Use of support vector machines for disease risk prediction in genome-wide association studies: Concerns and opportunities. Human Mutation 33 (12), pp. 1708-1718. 10.1002/humu.22161

Full text not available from this repository.


The success of genome-wide association studies (GWAS) in deciphering the genetic architecture of complex diseases has fueled the expectation that individual risk can also be quantified from that architecture. So far, disease risk prediction based on top-validated single-nucleotide polymorphisms (SNPs) has shown little predictive value. Here, we applied a support vector machine (SVM) to Parkinson disease (PD) and type 1 diabetes (T1D) to show that, apart from the magnitude of the effect sizes of risk variants, the heritability of the disease also plays an important role in disease risk prediction. Furthermore, we performed a simulation study to show the role of uncommon (frequency 1-5%) as well as rare (frequency <1%) variants in the etiology of complex diseases. Using a cross-validation model, we achieved predictions with an area under the receiver operating characteristic curve (AUC) of ~0.88 for T1D, highlighting its strong heritable component (~90%). This is in contrast to PD, where we were unable to achieve a satisfactory prediction (AUC ~0.56; heritability ~38%). Our simulations showed that simultaneous inclusion of uncommon and rare variants in GWAS would eventually lead to feasible disease risk prediction for complex diseases such as PD. The software used is available at
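
For readers unfamiliar with the AUC values quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It is not the authors' software. The function name `auc` and the toy scores and labels are hypothetical and chosen only for illustration. The sketch relies on the standard equivalence between the AUC and the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparison.

    scores: predicted risk scores (higher means more likely a case)
    labels: 1 for case, 0 for control
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half

    return wins / (len(cases) * len(controls))


# Hypothetical scores for three cases and three controls (not study data).
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 8 of 9 case/control pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of ~0.88 (T1D) and ~0.56 (PD) should be read.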

Item Type: Article
Date Type: Publication
Status: Published
Schools: MRC Centre for Neuropsychiatric Genetics and Genomics (CNGG)
Neuroscience and Mental Health Research Institute (NMHRI)
Subjects: R Medicine > R Medicine (General)
Additional Information: Huw Morris and Nigel Williams are collaborators on this article
Publisher: Wiley-Blackwell
ISSN: 1059-7794
Last Modified: 31 Oct 2022 08:59

Citation Data

Cited 32 times in Scopus.
