Machine learning for genetic prediction of psychiatric disorders: a systematic review

Bracher-Smith, Matthew, Crawford, Karen and Escott-Price, Valentina

2021. Machine learning for genetic prediction of psychiatric disorders: a systematic review. Molecular Psychiatry 26, pp. 70-79. doi:10.1038/s41380-020-0825-2

Official URL: http://dx.doi.org/10.1038/s41380-020-0825-2

Abstract

Machine learning methods have been employed to make predictions in psychiatry from genotypes, with the potential to bring improved prediction of outcomes in psychiatric genetics; however, their current performance is unclear. We aim to systematically review machine learning methods for predicting psychiatric disorders from genetics alone and evaluate their discrimination, bias and implementation. Medline, PsycInfo, Web of Science and Scopus were searched for terms relating to genetics, psychiatric disorders and machine learning, including neural networks, random forests, support vector machines and boosting, on 10 September 2019. Following PRISMA guidelines, articles were screened for inclusion independently by two authors, extracted, and assessed for risk of bias. Overall, 63 full texts were assessed from a pool of 652 abstracts. Data were extracted for 77 models of schizophrenia, bipolar, autism or anorexia across 13 studies. Performance of machine learning methods was highly varied (0.48–0.95 AUC) and differed between schizophrenia (0.54–0.95 AUC), bipolar (0.48–0.65 AUC), autism (0.52–0.81 AUC) and anorexia (0.62–0.69 AUC). This is likely due to the high risk of bias identified in the study designs and analysis for reported results. Choices for predictor selection, hyperparameter search and validation methodology, and viewing of the test set during training were common causes of high risk of bias in analysis. Key steps in model development and validation were frequently not performed or unreported. Comparison of discrimination across studies was constrained by heterogeneity of predictors, outcome and measurement, in addition to sample overlap within and across studies. Given widespread high risk of bias and the small number of studies identified, it is important to ensure established analysis methods are adopted. We emphasise best practices in methodology and reporting for improving future studies.
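
Note on the discrimination measure: the AUC (area under the receiver operating characteristic curve) quoted above equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, so 0.5 corresponds to chance-level prediction. The following is a minimal illustrative sketch in Python, not code from the review; the function name `auc` and the scores are hypothetical and invented only to show the calculation.

```python
from itertools import product


def auc(case_scores, control_scores):
    """Pairwise (Mann-Whitney) estimate of AUC.

    Returns the fraction of case/control pairs in which the case has the
    higher score, counting ties as half a win.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risk scores, invented purely for illustration.
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2]
example_auc = auc(cases, controls)  # 12 of 16 pairs ranked correctly
```

The pairwise form is quadratic in sample size and is meant only to make the definition concrete; any such estimate is only meaningful when computed on data that played no part in predictor selection or hyperparameter tuning.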

Item Type:	Article
Date Type:	Publication
Status:	Published
Schools:	Research Institutes & Centres > MRC Centre for Neuropsychiatric Genetics and Genomics (CNGG); Schools > Medicine
Publisher:	Springer Nature
ISSN:	1359-4184
Funders:	MRC
Projects:	MR/L010305/1
Date of First Compliant Deposit:	12 June 2020
Date of Acceptance:	5 June 2020
Last Modified:	07 Nov 2024 01:30
URI:	https://orca.cardiff.ac.uk/id/eprint/132397

Citation Data

Cited 71 times in Scopus.
