Most pharmaceutical companies have deployed an Electronic Laboratory Notebook (ELN) with the goal of centralising their R&D data.
ELNs have become an important source of experimental data, together with the development of methods and Standard Operating Procedures (SOPs). However, much of the information stored within an ELN is captured as free text or a collection of attachments. As a result, the ability to mine it is typically limited to rudimentary text and keyword searches.
Most ELNs are limited to searching for only those terms used by the author of an experiment. Because of this, any inconsistent use of synonyms during data entry makes it difficult to identify and collate all relevant data for a disease or target of interest. For example, an experiment describing work on “muscarinic acetylcholine receptor M1” will not be found by a scientist who performs a search using the more commonly used synonym “cholinergic receptor muscarinic 1”.
Even for more structured entries, the meaning of a field or its contents may be ambiguous or imprecise, or a single field may mix several different data types, such as “gene”, “assay type” and “species”.
At the core of SciBite’s platform are established vocabularies which apply an explicit, unique meaning and description to scientific terms. This enables complex experimental text to be contextualised so that it can be understood and used as high-quality, actionable data – irrespective of its source.
Standard reference vocabularies can be augmented with proprietary information, such as project codes and IDs used to track materials such as compounds and cell lines.
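To make the idea of an augmented vocabulary concrete, the following is a minimal sketch in plain Python, not SciBite’s implementation. The concept IDs, the project code and the compound ID are hypothetical placeholders; only the gene and receptor synonyms come from the examples in this article.

```python
# Reference vocabulary: concept ID -> synonyms.
# The IDs are illustrative placeholders, not real database accessions.
REFERENCE_VOCAB = {
    "GENE:PSEN1": ["PSEN1", "Presenilin-1", "AD3", "PSNL1"],
    "GENE:CHRM1": [
        "muscarinic acetylcholine receptor M1",
        "cholinergic receptor muscarinic 1",
    ],
}

# Proprietary vocabulary: hypothetical in-house project codes and compound IDs.
PROPRIETARY_VOCAB = {
    "PROJECT:0001": ["PRJ-0001", "Project Alpha"],
    "COMPOUND:0042": ["CMP-0042"],
}


def build_lookup(*vocabularies):
    """Merge vocabularies into one case-insensitive synonym -> concept ID map."""
    lookup = {}
    for vocab in vocabularies:
        for concept_id, synonyms in vocab.items():
            for synonym in synonyms:
                key = synonym.casefold()
                existing = lookup.setdefault(key, concept_id)
                if existing != concept_id:
                    # The same label pointing at two concepts would be ambiguous.
                    raise ValueError(f"{synonym!r} maps to {existing} and {concept_id}")
    return lookup


lookup = build_lookup(REFERENCE_VOCAB, PROPRIETARY_VOCAB)
print(lookup["ad3"])
print(lookup["cmp-0042"])
```

Public and in-house terms end up in the same lookup, so a project code can be resolved in exactly the same way as a gene synonym.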
SciBite’s tools generate a semantic index which transforms unstructured experimental text (including supporting files such as Word documents, PowerPoint presentations and PDFs) into a structure that can be queried via a simple user interface. These queries can be used to provide answers to questions that would otherwise require time-consuming, error-prone manual aggregation.
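As a rough illustration of what a semantic index does, the sketch below tags free text with concept IDs and builds an inverted index from concept to experiment. It is a simplified stand-in rather than SciBite’s method: the concept IDs, experiment IDs and notebook text are invented for the example, and matching is a plain case-insensitive whole-term search.

```python
import re
from collections import defaultdict

# Concept ID -> synonyms. IDs are illustrative placeholders.
SYNONYMS = {
    "GENE:PSEN1": ["PSEN1", "Presenilin-1", "AD3", "PSNL1"],
    "GENE:CHRM1": [
        "muscarinic acetylcholine receptor M1",
        "cholinergic receptor muscarinic 1",
    ],
}

# Hypothetical free-text notebook entries.
DOCUMENTS = {
    "exp-001": "Western blot of Presenilin-1 in HEK293 lysate.",
    "exp-002": "AD3 knockdown reduced amyloid-beta secretion.",
    "exp-003": "Binding assay for cholinergic receptor muscarinic 1.",
    "exp-004": "Buffer preparation, pH 7.4.",
}


def annotate(text):
    """Return the concept IDs whose synonyms occur as whole terms in the text."""
    found = set()
    for concept_id, names in SYNONYMS.items():
        for name in names:
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept_id)
                break
    return found


def build_index(documents):
    """Build an inverted index: concept ID -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept_id in annotate(text):
            index[concept_id].add(doc_id)
    return index


index = build_index(DOCUMENTS)


def search(term):
    """Resolve the search term to concepts, then look those up in the index."""
    hits = set()
    for concept_id in annotate(term):
        hits |= index.get(concept_id, set())
    return sorted(hits)


print(search("PSEN1"))  # ['exp-001', 'exp-002']
```

Neither matching entry contains the string “PSEN1”. They are found because the query and the text are both resolved to the same concept before comparison.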
As illustrated below, this semantically enriched data can be discovered using SciBite’s built-in user interface, a third-party search and visualization tool such as Spotfire, the ELN itself, or any combination of the three.
Most ELNs only have rudimentary search capabilities. For example, a search of a typical ELN for the Alzheimer’s-related gene PSEN1 would miss references to synonyms such as Presenilin-1, AD3 and PSNL1.
Semantic enrichment ensures that all relevant data is found, regardless of which synonym is used as the search term. SciBite not only makes it simpler to interrogate ELN data, it also facilitates more complex ontology-based questions as shown below.
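An ontology-based question can be pictured as a query on a parent term that also returns everything tagged with its child terms. The sketch below assumes a tiny hand-written hierarchy and a set of hypothetical experiment tags. It illustrates the general technique only and does not represent SciBite’s ontologies or query engine.

```python
# Parent term -> direct child terms (a deliberately tiny, illustrative hierarchy).
CHILDREN = {
    "cholinergic receptor": ["muscarinic receptor", "nicotinic receptor"],
    "muscarinic receptor": ["CHRM1", "CHRM2", "CHRM3"],
    "nicotinic receptor": ["CHRNA7"],
}

# Hypothetical experiments, each already tagged with the concepts it mentions.
EXPERIMENT_TAGS = {
    "exp-101": {"CHRM1"},
    "exp-102": {"CHRNA7"},
    "exp-103": {"PSEN1"},
    "exp-104": {"CHRM3", "PSEN1"},
}


def descendants(term):
    """Return the term itself plus every term below it in the hierarchy."""
    result = {term}
    stack = [term]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in result:
                result.add(child)
                stack.append(child)
    return result


def find_experiments(term):
    """Find experiments tagged with the term or any of its descendants."""
    wanted = descendants(term)
    return sorted(exp for exp, tags in EXPERIMENT_TAGS.items() if tags & wanted)


print(find_experiments("muscarinic receptor"))   # ['exp-101', 'exp-104']
print(find_experiments("cholinergic receptor"))  # ['exp-101', 'exp-102', 'exp-104']
```

A scientist asking about a whole receptor family does not need to list every member, because the hierarchy expands the query for them.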
Questions that can be rapidly answered with semantically enriched ELN data include:-
Semantic enrichment should not be limited to the retrospective analysis of existing data. SciBite can make any ELN data entry form semantically intelligent through SciBite Forms, enabling organisations to enrich their data in real time, at the point of capture.
By leveraging this capability, a field to capture ‘Species’ can be made both semantically aware and computationally accessible without adding unnecessary burden to scientists who subsequently enter data.
In place of restrictive and lengthy drop-down menus, users enter text into semantically aware fields and have relevant terms suggested to them as they type.
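The type-ahead behaviour described above can be approximated with a simple prefix match over a vocabulary, as in the following sketch. It is an assumption-laden illustration rather than the SciBite Forms API: the species list, the `suggest` function and its `limit` parameter are all hypothetical.

```python
# Preferred species name -> common-name synonyms (small illustrative list).
SPECIES = {
    "Homo sapiens": ["human"],
    "Mus musculus": ["mouse", "house mouse"],
    "Rattus norvegicus": ["rat", "Norway rat", "brown rat"],
    "Macaca mulatta": ["rhesus macaque", "rhesus monkey"],
}


def suggest(typed, limit=5):
    """Suggest (preferred name, matched label) pairs for partially typed text.

    A label matches when any of its words starts with the typed prefix,
    ignoring case. Only the first matching label per species is reported.
    """
    prefix = typed.strip().casefold()
    if not prefix:
        return []
    matches = []
    for preferred, synonyms in SPECIES.items():
        for label in [preferred, *synonyms]:
            if any(word.startswith(prefix) for word in label.casefold().split()):
                matches.append((preferred, label))
                break
    return matches[:limit]


print(suggest("mou"))     # [('Mus musculus', 'mouse')]
print(suggest("rhesus"))  # [('Macaca mulatta', 'rhesus macaque')]
```

Whichever label the scientist types, the form can store the preferred name, so downstream searches see one consistent value for each species.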
© SciBite Limited / Registered in England & Wales No. 07778456