IPED: a highly efficient denoising tool for Illumina MiSeq Paired-end 16S rRNA gene amplicon sequencing data

Background: The development of high-throughput sequencing technologies has revolutionized the field of microbial ecology via the sequencing of phylogenetic marker genes (e.g. 16S rRNA gene amplicon sequencing). Denoising, the removal of sequencing errors, is an important step in preprocessing amplicon sequencing data. The increasing popularity of the Illumina MiSeq platform for these applications requires the development of appropriate denoising methods.

Results: The newly proposed denoising algorithm IPED includes a machine learning method which predicts potentially erroneous positions in sequencing reads based on a combination of quality metrics. Subsequently, this information is used to group those error-containing reads with correct reads, resulting in error-free consensus reads. This is achieved by masking potentially erroneous positions during the clustering step. Compared to the second-best algorithm available, IPED detects twice as many errors. Reducing the error rate had a positive effect on the clustering of reads into operational taxonomic units, with an almost perfect correspondence between the number of clusters and the theoretical number of species present in the mock communities.

Conclusion: Our algorithm IPED is a powerful denoising tool for correcting sequencing errors in Illumina MiSeq 16S rRNA gene amplicon sequencing data. Apart from significantly reducing the error rate of the sequencing reads, it also has a beneficial effect on their clustering into operational taxonomic units. IPED is freely available at http://science.sckcen.be/en/Institutes/EHS/MCB/MIC/Bioinformatics/.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1061-2) contains supplementary material, which is available to authorized users.
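
The masking idea described in the abstract can be illustrated with a small sketch. This is not IPED's implementation: a fixed Phred quality threshold stands in for the machine-learning classifier, reads are assumed to be non-empty and already aligned to equal length, and the names `mask_read`, `compatible`, `denoise`, and `min_q` are hypothetical.

```python
from collections import Counter

MASK = "N"  # symbol used for positions flagged as potentially erroneous


def mask_read(seq, quals, min_q=20):
    """Replace every base whose quality is below min_q with MASK.

    Assumption: a plain quality cutoff replaces the classifier
    described in the abstract.
    """
    return "".join(b if q >= min_q else MASK for b, q in zip(seq, quals))


def compatible(masked, reference):
    """True if the masked read agrees with reference at all unmasked positions."""
    return len(masked) == len(reference) and all(
        m == MASK or m == r for m, r in zip(masked, reference)
    )


def denoise(reads, min_q=20):
    """Assign reads with suspect positions to matching unflagged reads.

    reads: list of (sequence, list of per-base quality scores).
    Returns a Counter mapping each resulting sequence to its read count.
    """
    # Reads without any flagged position act as candidate correct sequences.
    clean = Counter(seq for seq, quals in reads if min(quals) >= min_q)
    result = Counter(clean)
    for seq, quals in reads:
        if min(quals) >= min_q:
            continue  # already counted as a clean read
        masked = mask_read(seq, quals, min_q)
        # Try the most abundant clean sequences first.
        for ref, _ in clean.most_common():
            if compatible(masked, ref):
                result[ref] += 1
                break
        else:
            result[seq] += 1  # no compatible clean read; keep it unchanged
    return result


reads = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGT", [30, 30, 30, 30]),
    ("ACTT", [30, 30, 10, 30]),  # low-quality third base, likely an error
    ("GGGG", [30, 30, 10, 30]),  # flagged, but matches no clean read
]
counts = denoise(reads)
```

Here the flagged read `ACTT` is absorbed into the `ACGT` group because the two differ only at the masked position, while `GGGG` is kept as its own sequence.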

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.1186/s12859-016-1061-2
PID pmc:PMC4850673
PID pmid:27130479
URL https://dx.doi.org/10.1186/s12859-016-1061-2
URL http://europepmc.org/articles/PMC4850673
URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4850673/
URL https://doi.org/10.1186/s12859-016-1061-2
URL http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1061-2
URL https://link.springer.com/article/10.1186/s12859-016-1061-2
URL https://paperity.org/p/76111417/iped-a-highly-efficient-denoising-tool-for-illumina-miseq-paired-end-16s-rrna-gene
URL https://lirias.kuleuven.be/bitstream/123456789/540259/4/2016081.pdf
URL https://academic.microsoft.com/#/detail/2343340722
URL http://link.springer.com/content/pdf/10.1186/s12859-016-1061-2
URL https://dblp.uni-trier.de/db/journals/bmcbi/bmcbi17.html#MysaraLRM16
URL http://dx.doi.org/10.1186/s12859-016-1061-2
URL https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/s12859-016-1061-2
URL https://core.ac.uk/display/34662925
URL https://lirias.kuleuven.be/handle/123456789/540259
URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1061-2
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author Jeroen Raes, 0000-0002-1337-041X
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Europe PubMed Central; PubMed Central; ORCID; UnpayWall; Datacite; Crossref; Lirias; Microsoft Academic Graph; CORE (RIOXX-UK Aggregator)
Hosted By Europe PubMed Central; SpringerOpen; BMC Bioinformatics; Lirias
Journal BMC Bioinformatics, 17, 1
Publication Date 2016-04-29
Publisher Springer Nature
Additional Info
Field Value
Country Belgium
Format Electronic
Language English
Resource Type Other literature type; Article; UNKNOWN
keyword 16S rRNA gene amplicon sequencing
system:type publication
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::f6b964c372a24566e509b5379ae72467
Author jsonws_user
Last Updated 24 December 2020, 20:16 (CET)
Created 24 December 2020, 20:16 (CET)