AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

Abstract

Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry.

Results: We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four.

Conclusions: By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.1186/1471-2105-11-117
PID pmid:20202218
PID pmc:PMC2846907
URL https://link.springer.com/article/10.1186/1471-2105-11-117
URL https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-117
URL http://europepmc.org/articles/PMC2846907
URL https://doaj.org/article/68a72fcabed940809212261615607bb3
URL http://core.ac.uk/display/26637792
URL http://www.ceng.metu.edu.tr/~tcan/ceng734_20101/Schedule/AutoSOME.pdf
URL http://www.biomedcentral.com/1471-2105/11/117
URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2846907/
URL https://academic.microsoft.com/#/detail/2061239067
URL https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-11-117
URL https://dblp.uni-trier.de/db/journals/bmcbi/bmcbi11.html#NewmanC10
URL https://paperity.org/p/56777646/autosome-a-clustering-method-for-identifying-gene-expression-modules-without-prior
URL https://doi.org/10.1186/1471-2105-11-117
URL https://dx.doi.org/10.1186/1471-2105-11-117
URL http://link.springer.com/content/pdf/10.1186/1471-2105-11-117.pdf
URL http://dx.doi.org/10.1186/1471-2105-11-117
URL https://core.ac.uk/display/26637792
URL http://link.springer.com/article/10.1186/1471-2105-11-117/fulltext.html
URL https://doaj.org/toc/1471-2105
URL https://www.researchgate.net/profile/Aaron_Newman2/publication/41759735_AutoSOME_a_clustering_method_for_identifying_gene_expression_modules_without_prior_knowledge_of_cluster_number/links/00b7d534feca288153000000.pdf

Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access

Attribution

Description: Authorships and contributors

Field Value
Author Aaron M Newman, 0000-0002-1857-8172

Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Europe PubMed Central; PubMed Central; ORCID; Datacite; UnpayWall; DOAJ-Articles; Crossref; Microsoft Academic Graph; CORE (RIOXX-UK Aggregator)
Hosted By Europe PubMed Central; SpringerOpen; BMC Bioinformatics
Publication Date 2010-03-04
Publisher Springer Science and Business Media LLC

Additional Info

Field Value
Language UNKNOWN
Resource Type Other literature type; Article; UNKNOWN
system:type publication

Management Info

Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::528ffed85b4bb8f6a5b12978a10b84a3
Author jsonws_user
Last Updated 23 December 2020, 14:38 (CET)
Created 23 December 2020, 14:38 (CET)