SciBite’s newest VOCabs – What’s in our 6.5 release

SciBite’s VOCabs power a host of semantic use cases including search and analytics. Created from public ontologies and reference databases for a wide range of topics, these vocabularies are enriched using our proprietary tools, and curated by our team of experts to ensure maximum capture of relevant synonyms by subject area and context.

In the latest 6.5 release of VOCabs, we have added the following new vocabularies.

Enhanced small molecule identification

CHEBICHEM is a comprehensive dictionary of small molecules and their abbreviations, with applications across clinical and manufacturing semantic search, data analytics and pharmacovigilance. Analysts can use this vocabulary to identify reactants and products in sources ranging from journals to patent experimentals, and to analyse drug metabolism and pharmacokinetic data or chemical properties within Chemistry, Manufacturing and Controls (CMC) regulatory submissions.

With the addition of CHEBIROLE, users can identify chemical entities by their role. For example, food manufacturers can search chemical catalogues for food components (e.g. preservatives). Users can also link this information to create knowledge graphs: researchers in biomedical R&D, for instance, can connect compounds with their activity on a target or their role in order to repurpose drugs or identify potential adverse events.

Expanded biological entities

Biological pathways describe cellular processes, including metabolism, signalling, transport, cell motility, and host-virus interactions, and are key to understanding diseases and their potential treatments. The new PATHWAY vocabulary can be used to analyse omics data reported in sources such as journals or clinical trial reports (for example, proteins identified in the OAS antiviral response), or to create knowledge graphs to predict outcomes (e.g. identifying diseases implicated by changes to the NTRK1 signalling pathway).

This release also includes a new human microRNA (HUMIRNA) vocabulary, based on miRBase. MicroRNAs are small, single-stranded, non-coding RNA molecules that can be linked to many human diseases, making them interesting targets for clinical diagnostics or therapeutic agents. This vocabulary includes synonym variation rules that allow for the many different ways microRNAs are written, helping researchers maximise the extraction of these entities from journals, experimental protocols, and inventory lists. For example, hsa-mir-21 could also be written as human miR-21, human miRNA-21, human mir21, mir-21 etc.
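To illustrate the idea of a synonym variation rule, here is a minimal Python sketch that maps the spellings listed above to a single identifier. It is not SciBite's implementation: the pattern `MIRNA_PATTERN`, the function `normalise_mirna`, and the choice to treat mentions without a species prefix as human are all assumptions made for this example.

```python
import re

# Hypothetical variation rule (an assumption, not the HUMIRNA rule set):
# optional "hsa"/"human" prefix, then "mir" or "miRNA" in any case,
# an optional hyphen, and the microRNA number with an optional letter suffix.
MIRNA_PATTERN = re.compile(
    r"\b(?:(?:hsa|human)[-\s])?mi(?:r|rna)-?(\d+[a-z]?)\b",
    re.IGNORECASE,
)


def normalise_mirna(text):
    """Return an 'hsa-mir-N' style identifier for each microRNA mention in text.

    Mentions without a species prefix are assumed to be human here, so a
    mention from another species (e.g. one prefixed "mmu-") would be
    mislabelled; a real rule set would need to handle that.
    """
    return [
        f"hsa-mir-{match.group(1).lower()}"
        for match in MIRNA_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = "hsa-mir-21, human miR-21, human miRNA-21, human mir21, mir-21"
    print(normalise_mirna(sample))
```

A single pattern keeps the sketch short; in practice, rules like this are usually combined with curated synonym lists, because free-text microRNA naming is more varied than one regular expression can capture.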

Access more disease terms  

DOID, the Human Disease vocabulary, is another new addition and covers human diseases of inheritable, environmental, and infectious origin. As with all our new vocabularies, it includes corrections, synonym additions and disambiguations by SciBite. DOID can feature in multiple use cases, including the creation of application ontologies as part of “clean data” entry (for example in LIMS or ELNs) and semantic searches of clinical literature, company websites, press releases etc. for drug repurposing or competitor analysis, where users scan extensive lists of human disease terms.

The SNOMED vocabulary, which previously covered only Diseases, has now been expanded to include Procedures and Findings. For convenience, we have created a vocabulary for each of these branches (SNOMEDDIS, SNOMEDPROC, SNOMEDFIND). However, the SNOMED vocabulary, now an application vocabulary, can be used to extract entities from all three.

Analyse experimental data

The LABSPECIES vocabulary consists of terminology used to describe common laboratory animals, including their Latin names and breeds. This dictionary has been designed to assist researchers analysing and comparing preclinical animal models, for example the pharmacokinetic and pharmacodynamic effects of drug candidates.

No single public ontology provides comprehensive coverage of laboratory equipment; therefore, we have created LABEQUIP, a new vocabulary for common laboratory equipment. We have incorporated classes from public ontologies such as OBI, NCIT and CHMO and, where no appropriate class exists in the public ontologies, we have supplemented these with SciBite-created classes.

Updated vocabularies 

In addition to the new vocabularies, we have updated over 20 vocabularies using the latest available public data, some of which are highlighted here:

  • INDICATION has been updated from MeSH version 2021 and includes the addition of variation rules to expand synonym coverage.
  • DRUG has been updated from the ChEMBL29 release. This vocabulary features novel compounds extracted from press and product releases and pharmaceutical websites, as well as the COVID-19 vaccines. Using TERMite’s hitFilter option, it is now possible to filter searches by clinical phase (PHASE1-4) and molecule type (small molecule, biologicals).
  • The MEDDRA-based vocabularies, MDRAE and MDRACUTEAE, have been updated with terms from MedDRA 24.1 (September 2021).
  • EFO has been updated to the 3.36.0 (November 2021) version, which features the removal of 8,000 terms, largely anatomy terms from Uberon.

SciBite’s VOCabs allow researchers to follow FAIR (Findable, Accessible, Interoperable and Reusable) principles when entering and mining context-specific data for a range of scientific search and analytic use cases.

For more information about SciBite’s VOCabs and SciBite products, contact us.
