Prioritization, clustering and functional annotation of MicroRNAs using latent semantic indexing of MEDLINE abstracts

Background: The amount of scientific information about MicroRNAs (miRNAs) is growing exponentially, making it difficult for researchers to interpret experimental results. In this study, we present an automated text mining approach using Latent Semantic Indexing (LSI) for prioritization, clustering and functional annotation of miRNAs.

Results: For approximately 900 human miRNAs indexed in miRBase, text documents were created by concatenating titles and abstracts of MEDLINE citations which refer to the miRNAs. The documents were parsed and a weighted term-by-miRNA frequency matrix was created, which was subsequently factorized via singular value decomposition to extract pair-wise cosine values between the term (keyword) and miRNA vectors in reduced rank semantic space. LSI enables derivation of both explicit and implicit associations between entities based on word usage patterns. Using miR2Disease as a gold standard, we found that LSI identified keyword-to-miRNA relationships with high accuracy. In addition, we demonstrate that pair-wise associations between miRNAs can be used to group them into categories which are functionally aligned. Finally, term ranking by querying the LSI space with a group of miRNAs enabled annotation of the clusters with functionally related terms.

Conclusions: LSI modeling of MEDLINE abstracts provides a robust and automated method for miRNA related knowledge discovery. The latest collection of miRNA abstracts and LSI model can be accessed through the web tool miRNA Literature Network (miRLiN) at http://bioinfo.memphis.edu/mirlin.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1223-2) contains supplementary material, which is available to authorized users.
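The pipeline summarized in the abstract (weighted term-by-miRNA matrix, singular value decomposition, cosine similarity in a reduced-rank space) can be illustrated with a minimal sketch. This is not the authors' implementation. The miRNA names (miR-A to miR-D) and their texts are invented toy data, the log-times-IDF weighting is an assumed scheme (the abstract only says "weighted"), the rank of 2 is arbitrary, and scaling both term and miRNA vectors by the singular values is one of several conventions in use.

```python
import numpy as np

# Toy stand-ins for per-miRNA documents (concatenated titles/abstracts).
# Names and texts are hypothetical, chosen only to form two topical groups.
docs = {
    "miR-A": "breast cancer tumor growth proliferation",
    "miR-B": "tumor proliferation apoptosis cancer",
    "miR-C": "cardiac hypertrophy heart failure",
    "miR-D": "heart cardiac fibrosis",
}


def build_matrix(docs):
    """Return (terms, names, weighted term-by-miRNA matrix)."""
    terms = sorted({w for text in docs.values() for w in text.split()})
    names = list(docs)
    row = {t: i for i, t in enumerate(terms)}
    counts = np.zeros((len(terms), len(names)))
    for j, name in enumerate(names):
        for w in docs[name].split():
            counts[row[w], j] += 1
    # Assumed weighting: log local weight times an IDF-style global weight.
    doc_freq = (counts > 0).sum(axis=1)
    idf = np.log(len(names) / doc_freq) + 1.0
    return terms, names, np.log1p(counts) * idf[:, None]


def lsi(matrix, rank):
    """Truncated SVD; rows of each result are vectors in the rank-k space."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    term_vecs = u[:, :rank] * s[:rank]    # one row per term
    doc_vecs = vt[:rank].T * s[:rank]     # one row per miRNA
    return term_vecs, doc_vecs


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_mirnas(term, terms, names, term_vecs, doc_vecs):
    """Rank miRNAs by cosine similarity to a single keyword vector."""
    q = term_vecs[terms.index(term)]
    scores = [(name, cosine(q, d)) for name, d in zip(names, doc_vecs)]
    return sorted(scores, key=lambda pair: -pair[1])


terms, names, weighted = build_matrix(docs)
term_vecs, doc_vecs = lsi(weighted, rank=2)

for name, score in rank_mirnas("apoptosis", terms, names, term_vecs, doc_vecs):
    print(f"{name}\t{score:.3f}")
```

In this toy corpus the word "apoptosis" occurs only in the miR-B text, yet miR-A also ranks near the top for that keyword because the two documents share vocabulary. That is the kind of implicit association the abstract attributes to LSI. The same `cosine` function applied to pairs of rows in `doc_vecs` gives miRNA-to-miRNA similarities that could feed a clustering step.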

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.1186/s12859-016-1223-2
PID pmid:27766940
PID pmc:PMC5073981
URL https://dblp.uni-trier.de/db/journals/bmcbi/bmcbi17S.html#RoyCMH16
URL http://dx.doi.org/10.1186/s12859-016-1223-2
URL https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/s12859-016-1223-2
URL https://link.springer.com/article/10.1186/s12859-016-1223-2
URL https://academic.microsoft.com/#/detail/2529040676
URL http://europepmc.org/articles/PMC5073981
URL https://link.springer.com/content/pdf/10.1186/s12859-016-1223-2.pdf
URL https://paperity.org/p/78637943/prioritization-clustering-and-functional-annotation-of-micrornas-using-latent-semantic
URL https://doi.org/10.1186/s12859-016-1223-2
URL https://core.ac.uk/display/81750623
URL http://link.springer.com/content/pdf/10.1186/s12859-016-1223-2.pdf
URL https://dx.doi.org/10.1186/s12859-016-1223-2
URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5073981
URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1223-2
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author Roy, Sujoy
Author Curry, Brandon C.
Author Madahian, Behrouz
Author Homayouni, Ramin
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Europe PubMed Central; PubMed Central; UnpayWall; Datacite; Crossref; Microsoft Academic Graph; CORE (RIOXX-UK Aggregator)
Hosted By Europe PubMed Central; SpringerOpen; BMC Bioinformatics
Journal BMC Bioinformatics, 17
Publication Date 2016-10-06
Publisher Springer Science and Business Media LLC
Additional Info
Field Value
Language Undetermined
Resource Type Other literature type; Article
system:type publication
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::fadb1b0f97571215f73c60c472b88da6
Author jsonws_user
Last Updated 25 December 2020, 17:20 (CET)
Created 25 December 2020, 17:20 (CET)