HAPP: High-Accuracy Pipeline for Processing deep metabarcoding data
Sundh, John; Granqvist, Emma; Iwaszkiewicz-Eggebrecht, Ela; Manoharan, Lokeshwaran; van Dijk, Laura J. A.; Goodsell, Robert; Godeiro, Nerivania N.; Bellini, Bruno C.; Łukasik, Piotr; Miraldo, Andreia; Roslin, Tomas; Tack, Ayco J. M.; Andersson, Anders F.; Ronquist, Fredrik
Authors
John Sundh
Emma Granqvist
Ela Iwaszkiewicz-Eggebrecht
Lokeshwaran Manoharan
Laura J. A. van Dijk
Robert Goodsell
Nerivania N. Godeiro
Bruno C. Bellini
Piotr Łukasik
Andreia Miraldo
Tomas Roslin
Ayco J. M. Tack
Anders F. Andersson
Fredrik Ronquist
Abstract
We introduce HAPP, a high-accuracy pipeline for processing deep metabarcoding data, leveraging data richness to enhance the signal-to-noise ratio. Starting with denoised amplicon sequence variants, the pipeline consists of four steps: (1) additional chimera removal, using UCHIME and a strict sample-based approach; (2) taxonomic annotation, combining k-mer matching (SINTAX) to a reference library with phylogenetic placement (EPA-NG) on a reference tree; (3) OTU clustering using SWARM, an open-source algorithm with precision and recall comparable to RESL used in circumscribing BOLD BINs; and (4) noise filtering (NUMTs and sequencing errors), using a new algorithm introduced here, NEEAT, which combines “echo” signals across samples with detection of unusual evolutionary signatures among clusters with similar DNA sequences. HAPP computations are parallelized across taxa, making analyses tractable on very large datasets. The performance of HAPP was validated through extensive benchmarks, involving CO1 data from BOLD and Malaise trap data, demonstrating significant improvements over the state of the art.
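As an illustration of the statement that computations are parallelized across taxa, the following is a minimal Python sketch and not HAPP's actual implementation. It assumes that each ASV already carries a taxon label from the annotation step, so that groups can be processed independently. The function names (`group_by_taxon`, `process_taxon`, `run_per_taxon`) and the example labels are hypothetical, and the per-taxon work is a placeholder.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_taxon(asv_to_taxon):
    """Group ASV identifiers by their assigned taxon (for example, an order)."""
    groups = defaultdict(list)
    for asv, taxon in asv_to_taxon.items():
        groups[taxon].append(asv)
    return dict(groups)


def process_taxon(taxon, asvs):
    """Hypothetical stand-in for per-taxon work such as clustering or
    noise filtering. Here it only sorts the ASV ids so the sketch runs."""
    return taxon, sorted(asvs)


def run_per_taxon(asv_to_taxon, max_workers=4):
    """Process every taxon group independently and collect the results.

    Threads keep the sketch simple; a real pipeline would more likely use
    separate processes or cluster jobs (an assumption, not stated in the
    abstract).
    """
    groups = group_by_taxon(asv_to_taxon)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_taxon, t, a) for t, a in groups.items()]
        return dict(f.result() for f in futures)
```

Calling `run_per_taxon({"asv3": "Diptera", "asv1": "Diptera", "asv2": "Coleoptera"})` returns one result per taxon. Because no group depends on another, the work scales with the number of available workers.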
Field | Value |
---|---|
Working Paper Type | Preprint |
Deposit Date | Jul 18, 2025 |
Publicly Available Date | Jul 22, 2025 |
DOI | https://doi.org/10.1101/2024.12.20.629441 |
Public URL | https://uwe-repository.worktribe.com/output/14704075 |
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/