Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Identifying the neurodevelopmental and psychiatric signatures of genomic disorders associated with intellectual disability: a machine learning approach

Donnelly, Nicholas, Cunningham, Adam, Salas, Sergio Marco, Bracher-Smith, Matthew, Chawner, Samuel, Stochl, Jan, Ford, Tamsin, Raymond, F. Lucy, Escott-Price, Valentina and van den Bree, Marianne B. M. 2023. Identifying the neurodevelopmental and psychiatric signatures of genomic disorders associated with intellectual disability: a machine learning approach. Molecular Autism 14 (1), 19. 10.1186/s13229-023-00549-2

PDF - Published Version
Available under License Creative Commons Attribution.

License Start date: 23 May 2023


Background: Genomic conditions can be associated with developmental delay, intellectual disability, autism spectrum disorder, and physical and mental health symptoms. They are individually rare and highly variable in presentation, which limits the use of standard clinical guidelines for diagnosis and treatment. A simple screening tool to identify young people with genomic conditions associated with neurodevelopmental disorders (ND-GCs) who could benefit from further support would be of considerable value. We used machine learning approaches to address this need.
Method: A total of 493 individuals were included: 389 with a ND-GC (mean age = 9.01, 66% male) and 104 siblings without known genomic conditions (controls, mean age = 10.23, 53% male). Primary carers completed assessments of behavioural, neurodevelopmental and psychiatric symptoms and physical health and development. Machine learning techniques (penalised logistic regression, random forests, support vector machines and artificial neural networks) were used to develop classifiers of ND-GC status and to identify limited sets of variables that gave the best classification performance. Exploratory graph analysis was used to understand associations within the final variable set.
Results: All machine learning methods identified variable sets giving high classification accuracy (AUROC between 0.883 and 0.915). We identified a subset of 30 variables best discriminating between individuals with ND-GCs and controls, which formed 5 dimensions: conduct, separation anxiety, situational anxiety, communication and motor development.
Limitations: This study used cross-sectional data from a cohort study which was imbalanced with respect to ND-GC status. Our model requires validation in independent datasets and with longitudinal follow-up data before clinical application.
Conclusions: In this study, we developed models that identified a compact set of psychiatric and physical health measures that differentiate individuals with a ND-GC from controls and highlight higher-order structure within these measures. This work is a step towards developing a screening instrument to identify young people with ND-GCs who might benefit from further specialist assessment.
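The AUROC figures quoted in the Results can be read as the probability that a randomly chosen individual with a ND-GC receives a higher classifier score than a randomly chosen control. The following is a minimal sketch of that calculation, not the authors' code: the function name `auroc`, the toy labels and the toy scores are all hypothetical and are included only to illustrate the metric.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score, counting ties as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one case from each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study): 1 = ND-GC, 0 = sibling control.
toy_labels = [1, 1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.5, 0.2]
print(auroc(toy_labels, toy_scores))  # 7 of 8 pairs ranked correctly: 0.875
```

Because the value depends only on how cases and controls are ranked relative to each other, it can be computed for an imbalanced sample such as the one described under Limitations, although it does not by itself show how a classifier would perform at a specific screening threshold.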

Item Type: Article
Date Type: Publication
Status: Published
Schools: Medicine
Additional Information: License information from Publisher: LICENSE 1: URL:, Type: open-access
Publisher: BioMed Central
ISSN: 2040-2392
Funders: MRC
Date of First Compliant Deposit: 24 May 2023
Date of Acceptance: 16 April 2023
Last Modified: 09 Sep 2023 21:43
