CIViCmine

This describes the output files for the CIViCmine project. These files are loaded directly by the CIViCmine viewer. The code for the viewer is available in the CIViCmine GitHub repo if you want to run it independently. Each file is tab-delimited with a header, no comments and no quoting.

If you just want the list of cancer biomarkers, you likely want civicmine_collated.tsv. If you want the supporting sentences, look at civicmine_sentences.tsv. The matching_id column connects the two files. If you want to dig further and can accept a higher false positive rate, look at civicmine_unfiltered.tsv.

civicmine_collated.tsv: This contains the cancer biomarkers with the citation counts supporting them. It contains the normalized cancer and gene names along with IDs for HUGO, Entrez Gene and the Disease Ontology.

civicmine_sentences.tsv: This contains the supporting sentences for the cancer biomarkers in the collated file. Each row is a single supporting sentence for one cancer biomarker. This file contains information on the source publication (e.g. journal, publication date, etc.), the actual sentence and the cancer biomarker extracted.

civicmine_unfiltered.tsv: This is the raw output of the applyModelsToSentences.py script across all of PubMed, PubMed Central Open Access and PubMed Central Author Manuscript Collection. It contains every predicted relation with a prediction score above 0.5, so it may contain many false positives. Each row contains information on the publication (e.g. journal, publication date, etc.) along with the sentence and the specific cancer biomarker extracted (with HUGO, Entrez Gene and Disease Ontology IDs). This file is further processed to create the other two.
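The file format and the matching_id link described above can be sketched in Python as follows. This is a minimal illustration, not code from the CIViCmine repo: the function names are hypothetical, UTF-8 encoding is an assumption, and the only column name taken from the description is matching_id.

```python
import csv
from collections import defaultdict


def read_tsv(path):
    """Read a tab-delimited file with a header row, no comments and no quoting.

    csv.QUOTE_NONE keeps any quote characters in the text as literal data.
    UTF-8 is an assumption; the description does not state the encoding.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def group_sentences(sentence_rows):
    """Index sentence rows by their matching_id value."""
    by_id = defaultdict(list)
    for row in sentence_rows:
        by_id[row["matching_id"]].append(row)
    return by_id


def attach_sentences(collated_rows, sentence_rows):
    """Pair each collated biomarker row with its supporting sentence rows."""
    by_id = group_sentences(sentence_rows)
    return [(row, by_id.get(row["matching_id"], [])) for row in collated_rows]
```

With the files downloaded to the working directory, calling attach_sentences(read_tsv("civicmine_collated.tsv"), read_tsv("civicmine_sentences.tsv")) would return one (biomarker row, list of sentence rows) pair per collated biomarker. Rows come back as dictionaries keyed by the header names, so no column other than matching_id needs to be known in advance.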

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.5281/zenodo.3529793
URL https://figshare.com/articles/CIViCmine/11465799
URL http://dx.doi.org/10.5281/zenodo.3529793
URL https://zenodo.org/record/3529793
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right Open Access
Attribution

Description: Authorships and contributors

Field Value
Author Lever, Jake
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Zenodo; figshare; Datacite
Hosted By Zenodo; figshare
Publication Date 2019-11-05
Publisher Zenodo
Additional Info
Field Value
Language UNKNOWN
Resource Type Dataset
system:type dataset
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/dataset?datasetId=dedup_wf_001::91fe5b5394b657ab4d3eec1ea3425682
Author jsonws_user
Last Updated 12 January 2021, 17:49 (CET)
Created 12 January 2021, 17:49 (CET)