BT101: SourceData: Making Data discoverable
|Title||BT101: SourceData: Making Data discoverable|
|Publication Type||Conference Paper|
|Year of Publication||2016|
|Authors||George N, El-Gebali S, Lemberger T|
|Conference Name||International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016)|
|Publisher||CEUR-ws.org Volume 1747|
In molecular and cell biology, most of the data presented in published papers are not available in formats that allow direct analysis and systematic mining. The goal of the SourceData project (http://sourcedata.embo.org) is to make published data easier to find, to connect papers containing related information, and to promote the reuse and novel analysis of published data. The main concept underlying the project is that the structure of a dataset provides information about the design of the study in question and can be exploited in powerful data-oriented search strategies. SourceData has therefore developed tools to generate machine-readable descriptive metadata from figures in published manuscripts. Experimentally tested hypotheses are represented as directed relationships between standardized biological entities. Once the figures have been processed, a comprehensive ‘scientific knowledge graph’ can be generated from these data (see demo video at https://vimeo.com/sourcedata/kg), making the body of data efficiently searchable. Importantly, this graph is objectively grounded in published data rather than in the potentially subjective interpretation of the results.
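The idea of representing tested hypotheses as directed relationships between entities, and then merging them into a searchable graph, can be illustrated with a minimal Python sketch. This is not the SourceData data model or API: the class names, the `perturbed`/`measured` reading of edge direction, the entity names, the `EX:` identifiers and the panel labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A standardized biological entity (illustrative only)."""
    name: str
    identifier: str  # placeholder ID, not a real database accession


@dataclass(frozen=True)
class Experiment:
    """One tested hypothesis, read here as 'perturbed -> measured'.

    Assumption: the edge direction runs from the manipulated entity
    to the observed one. The abstract only says 'directed'.
    """
    panel: str        # hypothetical figure-panel label
    perturbed: Entity
    measured: Entity


def build_graph(experiments):
    """Merge per-panel relationships into one adjacency map."""
    graph = defaultdict(list)
    for exp in experiments:
        graph[exp.perturbed].append((exp.measured, exp.panel))
    return graph


def panels_testing(graph, source, target):
    """Return the panels in which `source -> target` was tested."""
    return [panel for measured, panel in graph.get(source, [])
            if measured == target]


# Hypothetical entities and panels, invented for this example.
gene_a = Entity("GeneA", "EX:0001")
protein_b = Entity("ProteinB", "EX:0002")
protein_c = Entity("ProteinC", "EX:0003")

experiments = [
    Experiment("Paper1 Fig 2A", gene_a, protein_b),
    Experiment("Paper2 Fig 1C", gene_a, protein_b),
    Experiment("Paper2 Fig 3B", protein_b, protein_c),
]

graph = build_graph(experiments)
print(panels_testing(graph, gene_a, protein_b))
# → ['Paper1 Fig 2A', 'Paper2 Fig 1C']
```

In this toy graph, a query for one relationship returns panels from two different papers, which is the sense in which a shared graph connects papers containing related information. The edges record only what was tested, not what the authors concluded.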