Badapple: promiscuity patterns from noisy evidence.

Background: Bioassay data analysis continues to be an essential, routine, yet challenging task in modern drug discovery and chemical biology research. The challenge is to infer reliable knowledge from big and noisy data. Some aspects of this problem are general, with solutions informed by existing and emerging data science best practices. Other aspects are domain specific and rely on expertise in bioassay methodology and chemical biology. Testing compounds for biological activity requires complex and innovative methodology, producing results that vary widely in accuracy, precision, and information content. Hit selection criteria are optimized so that the overall probability of success in a project is maximized and resource-wasteful “false trails” are avoided. This “fail-early” approach is embraced in both pharmaceutical and academic drug discovery, since follow-up capacity is resource-limited. Thus, early identification of likely promiscuous compounds has practical value.

Results: Here we describe Badapple (bioassay-data associative promiscuity pattern learning engine), an algorithm for identifying likely promiscuous compounds via associated scaffolds, which combines general and domain-specific features to assist and accelerate drug discovery informatics. Results are described from an analysis using data from MLP assays via the BioAssay Research Database (BARD, http://bard.nih.gov). Specific examples are analyzed in the context of medicinal chemistry to illustrate associations with mechanisms of promiscuity. Badapple has been developed at UNM, and released and deployed for public use in two ways: (1) as a BARD plugin, integrated into the public BARD REST API and BARD web client; and (2) as a public web app hosted at UNM.

Conclusions: Badapple is a method for rapidly identifying likely promiscuous compounds via associated scaffolds. Badapple generates a score associated with a pragmatic, empirical definition of promiscuity, with the overall goal of identifying “false trails” and streamlining workflows. Unlike methods reliant on expert curation of chemical substructure patterns, Badapple is fully evidence-driven, automated, self-improving via integration of additional data, and focused on scaffolds. Badapple is robust with respect to noise and errors, and skeptical of scanty evidence.

Electronic supplementary material: The online version of this article (doi:10.1186/s13321-016-0137-3) contains supplementary material, which is available to authorized users.
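The abstract says the score is evidence-driven and "skeptical of scanty evidence" but does not state a formula. The following is a minimal sketch of one way such skepticism can be built into a scaffold score, not the published implementation. It assumes a damped ratio of active to tested counts, with the dataset median of the tested count added to the denominator, multiplied across three evidence levels. The level names (`substances`, `assays`, `samples`), the `Evidence` type, the scale factor, and all counts are hypothetical and chosen for illustration only.

```python
from statistics import median
from typing import Dict, NamedTuple


class Evidence(NamedTuple):
    """Counts for one scaffold at one evidence level (hypothetical layout)."""
    tested: int
    active: int


# Assumed evidence levels; the abstract does not enumerate them.
LEVELS = ("substances", "assays", "samples")


def damped_ratio(ev: Evidence, median_tested: float) -> float:
    """Active fraction, shrunk toward zero when little has been tested.

    Adding the dataset median to the denominator means a scaffold needs
    roughly a median amount of testing before its ratio carries weight.
    """
    denom = ev.tested + median_tested
    return ev.active / denom if denom > 0 else 0.0


def score_scaffolds(
    evidence: Dict[str, Dict[str, Evidence]], scale: float = 1e5
) -> Dict[str, float]:
    """Score every scaffold as the scaled product of its damped ratios."""
    # Medians are taken over all scaffolds in the dataset, per level.
    medians = {
        lvl: median(e[lvl].tested for e in evidence.values()) for lvl in LEVELS
    }
    scores = {}
    for scaffold, per_level in evidence.items():
        score = scale
        for lvl in LEVELS:
            score *= damped_ratio(per_level[lvl], medians[lvl])
        scores[scaffold] = score
    return scores


# Illustrative, made-up counts.
example = {
    # Always active, but tested only once: scanty evidence.
    "scaffold_A": {
        "substances": Evidence(1, 1),
        "assays": Evidence(1, 1),
        "samples": Evidence(1, 1),
    },
    # Frequently active across a lot of testing.
    "scaffold_B": {
        "substances": Evidence(40, 20),
        "assays": Evidence(200, 100),
        "samples": Evidence(4000, 1000),
    },
    # Heavily tested, rarely active.
    "scaffold_C": {
        "substances": Evidence(40, 1),
        "assays": Evidence(200, 2),
        "samples": Evidence(4000, 5),
    },
}

scores = score_scaffolds(example)
for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.4f}")
```

In this sketch, scaffold_A is active in every test it appears in yet receives a near-zero score because it has been tested only once, while scaffold_B, with consistent activity across many tests, ranks highest. Adding more data shifts both the counts and the medians, which is one way a score of this kind can update automatically as evidence accumulates.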

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID pmid:27239230
PID https://www.doi.org/10.1186/s13321-016-0137-3
PID pmc:PMC4884375
URL http://link.springer.com/content/pdf/10.1186/s13321-016-0137-3
URL https://doi.org/10.1186/s13321-016-0137-3
URL https://core.ac.uk/display/81914325
URL https://dblp.uni-trier.de/db/journals/jcheminf/jcheminf8.html#YangULSOB16
URL http://dx.doi.org/10.1186/s13321-016-0137-3
URL https://jcheminf.biomedcentral.com/track/pdf/10.1186/s13321-016-0137-3
URL https://link.springer.com/article/10.1186/s13321-016-0137-3
URL https://paperity.org/p/76094547/badapple-promiscuity-patterns-from-noisy-evidence
URL http://europepmc.org/articles/PMC4884375
URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4884375/
URL https://academic.microsoft.com/#/detail/2397609778
URL http://jcheminf.springeropen.com/articles/10.1186/s13321-016-0137-3
URL https://dx.doi.org/10.1186/s13321-016-0137-3
URL http://link.springer.com/content/pdf/10.1186/s13321-016-0137-3.pdf
URL http://link.springer.com/article/10.1186/s13321-016-0137-3/fulltext.html
URL https://jcheminf.biomedcentral.com/articles/10.1186/s13321-016-0137-3
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author Tudor Oprea, 0000-0002-6195-6976
Author Christopher Lipinski, 0000-0001-7355-7254
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Europe PubMed Central; PubMed Central; ORCID; Datacite; UnpayWall; Crossref; Microsoft Academic Graph; CORE (RIOXX-UK Aggregator)
Hosted By Europe PubMed Central; SpringerOpen; Journal of Cheminformatics
Publication Date 2016-05-28
Additional Info
Field Value
Language Undetermined
Resource Type Other literature type; Article; UNKNOWN
system:type publication
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/publication?articleId=dedup_wf_001::be9da50392b68d5de4028e59f8c3780a
Author jsonws_user
Last Updated 22 December 2020, 15:21 (CET)
Created 22 December 2020, 15:21 (CET)