IT703: Semantic Digitization of Experimental Data in Biological Sciences
|Title||IT703: Semantic Digitization of Experimental Data in Biological Sciences|
|Publication Type||Conference Paper|
|Year of Publication||2016|
|Conference Name||International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016)|
|Publisher||CEUR-ws.org Volume 1747|
A large portion of published experimental data, referred to as 'Gold Standard' data, is available in a format that cannot be easily accessed by computers unless effectively curated. Most curation techniques rely on mining the text for information. Here we propose data models for curating the experimental data itself and demonstrate their efficacy. The data models facilitate digitization of every aspect of the information associated with the experimental data. The models utilize several universally accepted ontologies as well as in-house developed alphanumeric notations for digitizing different aspects of the data. The data models are flexible enough to address the extensive variability in experimental data. They are generic and can be used to curate and digitize experimental data from any organism. The digitized data is easily stored in a relational database management system and can thus be rapidly searched and integrated. These models have been successfully used to digitize data from over 20,000 experiments spanning over 500 research articles on rice biology. The entire dataset is available as a database entitled 'Manually Curated Database of Rice Proteins' at www.genomeindia.org/biocuration.
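The abstract does not describe the schema, so the following is only a minimal sketch of how ontology-coded experimental records might be stored and searched in a relational database. The table names, column names, the `find_experiments` helper, the `DEMO:` term identifiers, and the sample rows are all hypothetical placeholders, not the authors' data model or real ontology terms.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    # One row per curated experiment (hypothetical columns).
    conn.execute(
        """CREATE TABLE experiment (
               id INTEGER PRIMARY KEY,
               article_ref TEXT NOT NULL,
               organism TEXT NOT NULL
           )"""
    )
    # One row per digitized aspect of an experiment, coded with a term ID.
    conn.execute(
        """CREATE TABLE annotation (
               experiment_id INTEGER NOT NULL REFERENCES experiment(id),
               aspect TEXT NOT NULL,
               term_id TEXT NOT NULL
           )"""
    )
    # Placeholder records; not taken from the actual database.
    conn.executemany(
        "INSERT INTO experiment (id, article_ref, organism) VALUES (?, ?, ?)",
        [
            (1, "article-A", "Oryza sativa"),
            (2, "article-B", "Oryza sativa"),
            (3, "article-C", "Oryza sativa"),
        ],
    )
    conn.executemany(
        "INSERT INTO annotation (experiment_id, aspect, term_id) VALUES (?, ?, ?)",
        [
            (1, "tissue", "DEMO:0000001"),
            (1, "method", "DEMO:0000010"),
            (2, "tissue", "DEMO:0000001"),
            (3, "tissue", "DEMO:0000002"),
        ],
    )
    conn.commit()
    return conn


def find_experiments(conn, aspect, term_id):
    """Return (id, article_ref) for experiments annotated with a given term."""
    cursor = conn.execute(
        """SELECT e.id, e.article_ref
           FROM experiment AS e
           JOIN annotation AS a ON a.experiment_id = e.id
           WHERE a.aspect = ? AND a.term_id = ?
           ORDER BY e.id""",
        (aspect, term_id),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in find_experiments(db, "tissue", "DEMO:0000001"):
        print(row)
```

Because each aspect is coded as a term identifier rather than free text, a search of this kind reduces to an exact-match join, which is one plausible reading of why digitized records can be searched and integrated rapidly.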