Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Genome-wide identification of dominant polyadenylation hexamers for use in variant classification

Shiferaw, Henoke K., Hong, Celine S., Cooper, David N. ORCID: https://orcid.org/0000-0002-8943-8484, Johnston, Jennifer J. and Biesecker, Leslie G. 2023. Genome-wide identification of dominant polyadenylation hexamers for use in variant classification. Human Molecular Genetics 32 (23), pp. 3211-3224. 10.1093/hmg/ddad136

PDF - Accepted Post-Print Version (494kB)

Abstract

Polyadenylation is an essential process for the stabilization and export of mRNAs to the cytoplasm, and the polyadenylation signal hexamer (herein referred to as hexamer) plays a key role in this process. Yet only 14 Mendelian disorders have been associated with hexamer variants. This is likely an under-ascertainment, as hexamers are not well defined and not routinely examined in molecular analysis. To facilitate the interrogation of putatively pathogenic hexamer variants, we set out to define functionally important hexamers genome-wide as a resource for research and clinical testing. We identified predominant polyA sites (herein referred to as pPAS) and putative predominant hexamers across protein-coding genes (PAS usage >50% per gene). As a measure of the validity of these sites, the population constraint of 4532 predominant hexamers was measured. The predominant hexamers had fewer observed variants than non-predominant hexamers and trimer controls, and CADD scores for variants in these hexamers were significantly higher than those for controls. Exome data for 1477 individuals were interrogated for hexamer variants, and transcriptome data were generated for 76 individuals with 65 variants in predominant hexamers. 3′ RNA-seq data showed that these variants resulted in alternate polyadenylation events (38%) and in elongated mRNA transcripts (12%). Our list of pPAS and predominant hexamers is available in the UCSC genome browser and on GitHub. We suggest that this list of predominant hexamers can be used to interrogate exome and genome data. Variants in these predominant hexamers should be considered candidates for pathogenic variation in human disease, and to that end we suggest pathogenicity criteria for classifying hexamer variants.
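
The "PAS usage >50% per gene" criterion in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the input layout (a mapping from hypothetical gene and site identifiers to read counts), the function name `predominant_sites`, and the example numbers are all assumptions made purely for illustration. The sketch only shows how a per-gene usage fraction could be computed and thresholded.

```python
from collections import defaultdict

def predominant_sites(site_counts, threshold=0.5):
    """Return {gene: (site, usage)} for genes where one site's usage exceeds threshold.

    site_counts maps (gene, site) -> read count. This layout is an assumption
    for illustration, not the format used in the paper.
    """
    # Sum the reads over all polyA sites of each gene.
    totals = defaultdict(int)
    for (gene, _site), count in site_counts.items():
        totals[gene] += count

    result = {}
    for (gene, site), count in site_counts.items():
        total = totals[gene]
        if total == 0:
            continue  # no reads for this gene, so usage is undefined
        usage = count / total
        # With a threshold of 0.5 or more, at most one site per gene can pass.
        if usage > threshold:
            result[gene] = (site, usage)
    return result

# Hypothetical counts: GENE_A has a clear predominant site, GENE_B does not.
example_counts = {
    ("GENE_A", "site1"): 80,
    ("GENE_A", "site2"): 20,
    ("GENE_B", "site1"): 50,
    ("GENE_B", "site2"): 50,
}
predominant = predominant_sites(example_counts)
```

In this sketch a gene whose reads are split evenly between two sites yields no predominant site, because the comparison is strictly greater than 50%, matching the ">50%" wording of the abstract.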

Item Type: Article
Date Type: Publication
Status: Published
Schools: Medicine
Publisher: Oxford University Press
ISSN: 0964-6906
Date of First Compliant Deposit: 20 October 2023
Date of Acceptance: 14 August 2023
Last Modified: 08 Nov 2024 15:15
URI: https://orca.cardiff.ac.uk/id/eprint/163345
