HAPP: High-Accuracy Pipeline for Processing deep metabarcoding data

Sundh, John; Granqvist, Emma; Iwaszkiewicz-Eggebrecht, Ela; Manoharan, Lokeshwaran; van Dijk, Laura J. A.; Goodsell, Robert; Godeiro, Nerivania N.; Bellini, Bruno C.; Łukasik, Piotr; Miraldo, Andreia; Roslin, Tomas; Tack, Ayco J. M.; Andersson, Anders F.; Ronquist, Fredrik

Abstract

We introduce HAPP, a high-accuracy pipeline for processing deep metabarcoding data that leverages data richness to enhance the signal-to-noise ratio. Starting with denoised amplicon sequence variants, the pipeline consists of four steps: (1) additional chimera removal, using UCHIME and a strict sample-based approach; (2) taxonomic annotation, combining k-mer matching (SINTAX) to a reference library with phylogenetic placement (EPA-NG) on a reference tree; (3) OTU clustering using SWARM, an open-source algorithm with precision and recall comparable to RESL, the algorithm used to circumscribe BOLD BINs; and (4) noise filtering (NUMTs and sequencing errors), using a new algorithm introduced here, NEEAT, which combines “echo” signals across samples with detection of unusual evolutionary signatures among clusters with similar DNA sequences. HAPP computations are parallelized across taxa, making analyses tractable on very large datasets. The performance of HAPP was validated through extensive benchmarks involving CO1 data from BOLD and Malaise trap data, demonstrating significant improvements over the state of the art.
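
The idea of parallelizing across taxa can be illustrated with a minimal Python sketch. It is not HAPP's implementation: the function names (`partition_by_taxon`, `process_taxon`, `run_per_taxon`), the input mapping, and the placeholder per-taxon step are all hypothetical, and it assumes only that each amplicon sequence variant has already received a taxon label from the annotation step. The per-taxon work (clustering and noise filtering in the real pipeline) is replaced here by a trivial sort.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_taxon(asv_taxa):
    """Group ASV identifiers by their assigned taxon label.

    `asv_taxa` is a hypothetical mapping {asv_id: taxon}.
    """
    groups = defaultdict(list)
    for asv_id, taxon in asv_taxa.items():
        groups[taxon].append(asv_id)
    return dict(groups)


def process_taxon(taxon, asv_ids):
    """Placeholder for the per-taxon work.

    A real pipeline would cluster and filter the sequences here;
    this stand-in only sorts the identifiers.
    """
    return taxon, sorted(asv_ids)


def run_per_taxon(asv_taxa, max_workers=4):
    """Process each taxon independently, several taxa at a time."""
    groups = partition_by_taxon(asv_taxa)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda item: process_taxon(*item), groups.items())
        return dict(results)
```

Because each taxon is handled independently, the groups can be dispatched to separate workers and the results merged afterwards, which is what keeps the cost manageable as the number of sequences grows.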

Working Paper Type: Preprint
Deposit Date: Jul 18, 2025
Publicly Available Date: Jul 22, 2025
DOI: https://doi.org/10.1101/2024.12.20.629441
Public URL: https://uwe-repository.worktribe.com/output/14704075
