Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species

Kutschera, Verena E., Kierczak, Marcin, van der Valk, Tom, von Seth, Johanna, Dussex, Nicolas, Lord, Edana, Dehasque, Marianne, Stanton, David W.G., Khoonsari, Payam Emami, Nystedt, Björn, Dalén, Love and Díez-del-Molino, David 2022. GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species. BMC Bioinformatics 23 (1) , 228. 10.1186/s12859-022-04757-0

PDF - Published Version
Available under License Creative Commons Attribution.


Abstract

Background: Many wild species have suffered drastic population size declines over the past centuries, which have led to ‘genomic erosion’ processes characterized by reduced genetic diversity, increased inbreeding, and accumulation of harmful mutations. Yet, genomic erosion estimates of modern-day populations often lack concordance with dwindling population sizes and conservation status of threatened species. One way to directly quantify the genomic consequences of population declines is to compare genome-wide data from pre-decline museum samples and modern samples. However, doing so requires computational data processing and analysis tools specifically adapted to comparative analyses of degraded, ancient or historical, DNA data with modern DNA data as well as personnel trained to perform such analyses.

Results: Here, we present a highly flexible, scalable, and modular pipeline to compare patterns of genomic erosion using samples from disparate time periods. The GenErode pipeline uses state-of-the-art bioinformatics tools to simultaneously process whole-genome re-sequencing data from ancient/historical and modern samples, and to produce comparable estimates of several genomic erosion indices. No programming knowledge is required to run the pipeline and all bioinformatic steps are well-documented, making the pipeline accessible to users with different backgrounds. GenErode is written in Snakemake and Python3 and uses Conda and Singularity containers to achieve reproducibility on high-performance compute clusters. The source code is freely available on GitHub (https://github.com/NBISweden/GenErode).

Conclusions: GenErode is a user-friendly and reproducible pipeline that enables the standardization of genomic erosion indices from temporally sampled whole genome re-sequencing data.
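
To make the idea of comparing a genomic erosion index across time periods concrete, the following is a minimal illustrative sketch in Python. It is not taken from GenErode and does not reflect how the pipeline computes its indices. It assumes a simple proxy for genetic diversity (the fraction of called diploid genotypes that are heterozygous), and the sample names, genotype values, and function names are all hypothetical.

```python
from statistics import mean


def heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    `genotypes` is a list of (allele_a, allele_b) tuples; None marks a
    missing call, which is common in degraded historical DNA.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    het = sum(1 for a, b in called if a != b)
    return het / len(called)


def mean_heterozygosity(samples):
    """Average per-sample heterozygosity over a dict of sample -> genotypes."""
    return mean(heterozygosity(g) for g in samples.values())


# Hypothetical toy data: 0 = reference allele, 1 = alternate allele.
historical = {
    "museum_1": [(0, 1), (0, 0), (1, 1), (0, 1)],
    "museum_2": [(0, 1), (0, 1), None, (0, 0)],
}
modern = {
    "modern_1": [(0, 0), (0, 0), (1, 1), (0, 1)],
    "modern_2": [(0, 0), (1, 1), (1, 1), (0, 0)],
}

hist_h = mean_heterozygosity(historical)
mod_h = mean_heterozygosity(modern)

# A negative difference indicates lower diversity in the modern samples.
delta = mod_h - hist_h
print(f"historical={hist_h:.3f} modern={mod_h:.3f} delta={delta:.3f}")
```

In this toy example the modern samples show lower heterozygosity than the historical ones, which is the kind of temporal contrast the abstract describes. A real analysis would also need to account for DNA damage, sequencing depth, and missing data.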

Item Type: Article
Date Type: Published Online
Status: Published
Schools: Biosciences
Publisher: Springer Nature
ISSN: 14712105
Date of First Compliant Deposit: 3 August 2023
Date of Acceptance: 30 May 2022
Last Modified: 03 Aug 2023 10:32
URI: https://orca.cardiff.ac.uk/id/eprint/161415
