Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Critical assessment of missense variant effect predictors on disease-relevant variant data

Rastogi, Ruchir, Chung, Ryan, Li, Sindy, Li, Chang, Lee, Kyoungyeul, Woo, Junwoo, Kim, Dong-Wook, Keum, Changwon, Babbi, Giulia, Martelli, Pier Luigi, Savojardo, Castrense, Casadio, Rita, Chennen, Kirsley, Weber, Thomas, Poch, Olivier, Ancien, François, Cia, Gabriel, Pucci, Fabrizio, Raimondi, Daniele, Vranken, Wim, Rooman, Marianne, Marquet, Céline, Olenyi, Tobias, Rost, Burkhard, Andreoletti, Gaia, Kamandula, Akash, Peng, Yisu, Bakolitsa, Constantina, Mort, Matthew ORCID: https://orcid.org/0000-0002-3986-0935, Cooper, David N. ORCID: https://orcid.org/0000-0002-8943-8484, Bergquist, Timothy, Pejaver, Vikas, Liu, Xiaoming, Radivojac, Predrag, Brenner, Steven E. and Ioannidis, Nilah M. 2025. Critical assessment of missense variant effect predictors on disease-relevant variant data. Human Genetics 10.1007/s00439-025-02732-2

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Regular, systematic, and independent assessments of computational tools that are used to predict the pathogenicity of missense variants are necessary to evaluate their clinical and research utility and guide future improvements. The Critical Assessment of Genome Interpretation (CAGI) conducts the ongoing Annotate-All-Missense (Missense Marathon) challenge, in which missense variant effect predictors (also called variant impact predictors) are evaluated on missense variants added to disease-relevant databases following the prediction submission deadline. Here we assess predictors submitted to the CAGI 6 Annotate-All-Missense challenge, predictors commonly used in clinical genetics, and recently developed deep learning methods. We examine performance across a range of settings relevant for clinical and research applications, focusing on different subsets of the evaluation data as well as high-specificity and high-sensitivity regimes. Our evaluations reveal notable advances in current methods relative to older, well-cited tools in the field. While meta-predictors tend to outperform their constituent individual predictors, several newer individual predictors perform comparably to commonly used meta-predictors. Predictor performance varies between high-specificity and high-sensitivity regimes, highlighting that different methods may be optimal for different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors trained on pathogenicity labels from curated variant databases often inherit gene-level label imbalances. Our findings help illuminate the clinical and research utility of modern missense variant effect predictors and identify potential areas for future development.
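As an illustration of what evaluating a predictor in a high-specificity regime can mean, the following minimal Python sketch computes the best sensitivity a score threshold can reach while keeping specificity at or above a chosen floor. It is not the evaluation code used in the paper: the function name, the label encoding (1 = pathogenic, 0 = benign), the assumption that higher scores mean more pathogenic, and the toy data are all illustrative assumptions.

```python
def sensitivity_at_specificity(scores, labels, min_specificity=0.95):
    """Best sensitivity over all thresholds whose specificity >= min_specificity.

    Assumptions (illustrative, not from the paper): labels are 1 for
    pathogenic and 0 for benign, and a variant is called pathogenic
    when its score is >= the threshold.
    """
    pathogenic = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not pathogenic or not benign:
        raise ValueError("need at least one pathogenic and one benign variant")

    best = 0.0  # a threshold above every score gives specificity 1, sensitivity 0
    for threshold in sorted(set(scores)):
        # Specificity: fraction of benign variants scored below the threshold.
        specificity = sum(s < threshold for s in benign) / len(benign)
        # Sensitivity: fraction of pathogenic variants at or above the threshold.
        sensitivity = sum(s >= threshold for s in pathogenic) / len(pathogenic)
        if specificity >= min_specificity and sensitivity > best:
            best = sensitivity
    return best


# Toy data, made up for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
strict = sensitivity_at_specificity(toy_scores, toy_labels, min_specificity=1.0)
relaxed = sensitivity_at_specificity(toy_scores, toy_labels, min_specificity=0.75)
```

On this toy data the achievable sensitivity changes with the specificity floor, which is one simple way two predictors could rank differently depending on the regime of interest. A high-sensitivity analysis would mirror this by fixing a sensitivity floor and maximizing specificity.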

Item Type: Article
Date Type: Published Online
Status: Published
Schools: Schools > Medicine
Publisher: Springer
ISSN: 0340-6717
Date of First Compliant Deposit: 3 April 2025
Date of Acceptance: 7 February 2025
Last Modified: 03 Apr 2025 14:15
URI: https://orca.cardiff.ac.uk/id/eprint/177386
