Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 
WelshClear Cookie - decide language by browser settings

Comparing long-read assemblers to explore the potential of a sustainable low-cost, low-infrastructure approach to sequence antimicrobial resistant bacteria with Oxford nanopore sequencing

Boostrom, Ian, Portal, Edward A. R., Spiller, Owen B. ORCID: https://orcid.org/0000-0002-9117-6911, Walsh, Timothy R. and Sands, Kirsty 2022. Comparing long-read assemblers to explore the potential of a sustainable low-cost, low-infrastructure approach to sequence antimicrobial resistant bacteria with Oxford nanopore sequencing. Frontiers in Microbiology 13 , 796465. 10.3389/fmicb.2022.796465

[thumbnail of fmicb-13-796465 Frontiers Minion.pdf]
Preview
PDF - Published Version
Available under License Creative Commons Attribution.

Download (7MB) | Preview

Abstract

Long-read sequencing (LRS) can resolve repetitive regions, a limitation of short read (SR) data. Reduced cost and instrument size has led to a steady increase in LRS across diagnostics and research. Here, we re-basecalled FAST5 data sequenced between 2018 and 2021 and analyzed the data in relation to gDNA across a large dataset (n = 200) spanning a wide GC content (25–67%). We examined whether re-basecalled data would improve the hybrid assembly, and, for a smaller cohort, compared long read (LR) assemblies in the context of antimicrobial resistance (AMR) genes and mobile genetic elements. We included a cost analysis when comparing SR and LR instruments. We compared the R9 and R10 chemistries and reported not only a larger yield but increased read quality with R9 flow cells. There were often discrepancies with ARG presence/absence and/or variant detection in LR assemblies. Flye-based assemblies were generally efficient at detecting the presence of ARG on both the chromosome and plasmids. Raven performed more quickly but inconsistently recovered small plasmids, notably a ∼15-kb Col-like plasmid harboring blaKPC. Canu assemblies were the most fragmented, with genome sizes larger than expected. LR assemblies failed to consistently determine multiple copies of the same ARG as identified by the Unicycler reference. Even with improvements to ONT chemistry and basecalling, long-read assemblies can lead to misinterpretation of data. If LR data are currently being relied upon, it is necessary to perform multiple assemblies, although this is resource (computing) intensive and not yet readily available/useable.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Advanced Research Computing @ Cardiff (ARCCA)
Medicine
Additional Information: This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
Publisher: Frontiers Media
ISSN: 1664-302X
Date of First Compliant Deposit: 16 March 2022
Date of Acceptance: 26 January 2022
Last Modified: 12 Jun 2024 12:27
URI: https://orca.cardiff.ac.uk/id/eprint/148420

Citation Data

Cited 1 time in Scopus. View in Scopus. Powered By Scopus® Data

Actions (repository staff only)

Edit Item Edit Item

Downloads

Downloads per month over past year

View more statistics