Cnidaria: fast, reference-free clustering of raw and assembled genome and transcriptome NGS data

Background: Identification of biological specimens is a major requirement for a range of applications. Reference-free methods analyse unprocessed sequencing data without relying on prior knowledge, but generally do not scale to arbitrarily large genomes or arbitrarily large phylogenetic distances. Results: We present Cnidaria, a practical tool for clustering genomic and transcriptomic data with no limitation on genome size or phylogenetic distance. We simultaneously clustered 169 genomic and transcriptomic datasets from 4 kingdoms, achieving 100% identification accuracy at supra-species level and 78% accuracy at species level. Discussion: Cnidaria allows for fast, resource-efficient comparison and identification of both raw and assembled genome and transcriptome data. This can help answer both fundamental questions (e.g. in phylogeny and ecological diversity analysis) and practical ones (e.g. sequencing quality control, primer design).

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.1186/s12859-015-0806-7
PID urn:urn:nbn:nl:ui:32-493341
PID pmc:PMC4630969
PID pmid:26525298
PID arXiv:1511.05530
URL http://europepmc.org/abstract/MED/26525298
URL https://dblp.uni-trier.de/db/journals/bmcbi/bmcbi16.html#AflitosSSPJR15
URL https://dx.doi.org/10.1186/s12859-015-0806-7
URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4630969/
URL https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/s12859-015-0806-7
URL http://europepmc.org/articles/PMC4630969
URL http://dx.doi.org/10.1186/s12859-015-0806-7
URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0806-7
URL https://arxiv.org/pdf/1511.05530.pdf
URL https://academic.microsoft.com/#/detail/2103383866
URL https://www.narcis.nl/publication/RecordID/oai%3Alibrary.wur.nl%3Awurpubs%2F493341
URL https://core.ac.uk/display/92650414
URL http://arxiv.org/abs/1511.05530
URL https://ui.adsabs.harvard.edu/abs/2015arXiv151105530A/abstract
URL https://paperity.org/p/74736613/cnidaria-fast-reference-free-clustering-of-raw-and-assembled-genome-and-transcriptome-ngs
URL https://research.wur.nl/en/publications/cnidaria-fast-reference-free-clustering-of-raw-and-assembled-geno
URL http://link.springer.com/content/pdf/10.1186/s12859-015-0806-7.pdf
URL http://library.wur.nl/WebQuery/wurpubs/493341
URL http://link.springer.com/article/10.1186/s12859-015-0806-7/fulltext.html
URL https://link.springer.com/article/10.1186/s12859-015-0806-7
URL https://doi.org/10.1186/s12859-015-0806-7
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author S.A. Aflitos, 0000-0002-9179-5309
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Europe PubMed Central; PubMed Central; ORCID; UnpayWall; Datacite; arXiv.org e-Print Archive; NARCIS; Crossref; Microsoft Academic Graph; CORE (RIOXX-UK Aggregator)
Hosted By Europe PubMed Central; SpringerOpen; Wageningen Yield; arXiv.org e-Print Archive; NARCIS; BMC Bioinformatics
Journal BMC Bioinformatics, 16, 1
Publication Date 2015-01-01
Additional Info
Field Value
Country Netherlands
Description Comment: 47 pages, 13 figures
Format application/octet-stream
Language English
Resource Type Article; Preprint; UNKNOWN
keyword 92D20, 92B10, 92-08
system:type publication
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::cd1c7b0808fd0fbcbafcbf3c7c832cf1
Author jsonws_user
Last Updated 26 December 2020, 00:36 (CET)
Created 26 December 2020, 00:36 (CET)