What’s in our 6.5.2 TERMite / VOCabs release

SciBite’s vocabularies fuel a host of use cases, from complex querying to data integration and discovery of new knowledge. In the 6.5.2 release of VOCabs, SciBite introduces the new Emtree VOCab pack and adds a new Sequence Ontology vocabulary to the Genotype-Phenotype VOCab pack. Several updates to existing vocabularies are also included.


In the 6.5.2 release of VOCabs, we have added the following new vocabularies.


The Emtree VOCab pack, based on Elsevier’s Emtree taxonomy, includes 90,000+ entities covering drugs, diseases, medical devices, essential life science concepts, and broader society and environment topics. SciBite has further enriched and optimized these thesauri using our proprietary tools and manual curation, adding rules and additional vocabularies to generate an Emtree VOCab pack with 2.5 million synonyms. For the first time, users can annotate both proprietary and public text-based datasets to this standard, which is the underlying index used to structure Embase, Elsevier’s comprehensive biomedical literature database.

The VOCab pack includes five dictionaries:

  • EMTREE_DEVICES (Devices branch)
  • EMTREE_PROCEDURES (Procedures branch)
  • EMTREE_HEALTHCARE (Health Care Concepts branch)
  • EMTREE_SOCENV (Society and Environment branch)
  • EMTREE_OTHER (the remainder of Emtree, excluding the branches above)

Emtree_Devices, which includes the Medical Device Trade and General Names and the Global Medical Device Nomenclature (GMDN), supports those involved in the manufacture and conformity of medical devices. Analysts can use this vocabulary to identify a wide variety of medical devices for activities ranging from effectiveness studies and regulatory submissions to product vigilance.

The Emtree_Procedures dictionary includes an extensive collection of scientific, procedural and statistical techniques organised by subject. Users can couple this Emtree vocabulary with SciBite’s extensive biomedical dictionaries to serve a host of use cases, for example identifying suitable methods for gene sequencing or analysing the adverse events of drugs in rechallenge studies.

The Emtree_Healthcare dictionary includes concepts related to health care from multiple dictionaries, including health care facilities and services, management, organizations, personnel, quality and economics.

Emtree_SocEnv is a much broader branch featuring more holistic societal and environmental terms. 

Sequence Ontology

SciBite’s enriched SEQONT module, developed from the Sequence Ontology (SO), includes terms used to describe the features, attributes, collections and variants of biological sequences, such as reference_genome, chromosome_breakpoint and dominant_negative_variant. Because these terms identify features in gene regulatory networks and disease pathways, they are interesting entities for biomedical studies, for example studies of target druggability. With TERMite, SciBite’s named entity recognition platform, researchers can now more easily identify and annotate text containing these descriptors.
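To make the idea of annotating text with such descriptors concrete, here is a minimal sketch of dictionary-based matching in Python. It is not TERMite and does not use TERMite's API; the `TOY_VOCAB` mapping, the `annotate` function and the synonym lists are hypothetical and for illustration only. Only the three term names come from the text above.

```python
import re

# Toy vocabulary keyed by Sequence Ontology term names mentioned above.
# The synonym lists are illustrative assumptions, not SEQONT content.
TOY_VOCAB = {
    "reference_genome": ["reference genome"],
    "chromosome_breakpoint": ["chromosome breakpoint", "chromosomal breakpoint"],
    "dominant_negative_variant": [
        "dominant negative variant",
        "dominant-negative variant",
    ],
}


def annotate(text, vocab=TOY_VOCAB):
    """Return sorted (start, end, matched_text, term) tuples for synonym hits."""
    hits = []
    for term, synonyms in vocab.items():
        for synonym in synonyms:
            # Whole-phrase, case-insensitive match on the literal synonym.
            pattern = r"\b" + re.escape(synonym) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((match.start(), match.end(), match.group(0), term))
    return sorted(hits)


if __name__ == "__main__":
    sample = "A dominant-negative variant was mapped against the reference genome."
    for start, end, matched, term in annotate(sample):
        print(f"{start}-{end}: '{matched}' -> {term}")
```

A real named entity recognition platform does far more than this (synonym enrichment, disambiguation, contextual rules); the sketch only shows how a curated synonym list maps surface text to a standard term.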

Updated vocabularies  

In addition to the new vocabularies, we have updated the following vocabularies using the latest available public data: 

  • CELLTYPE has been updated to the 2021-08-10 version of the Cell Ontology
  • CHEMMETH has been updated to the latest public version (17th February 2022)
  • CHEMREC has been updated to the latest public version (RXNO 16th December 2021; MOP 1st February 2022)
  • CORONAPROT has been updated to the latest public version (19th May 2022)
  • EDAM has been updated to the latest public version (19th May 2022)
  • HPO (Human Phenotype Ontology) has been updated to the latest public version (14th February 2022)
  • SNOMED, SNOMEDDIS, SNOMEDPROC and SNOMEDFIND have been updated to the latest public version (31st July 2021)
  • TAXVIRUS has been updated to the latest public version (14th December 2021)
  • TAXEUM has been updated to the latest public version (14th December 2021)

For more information about SciBite’s VOCabs and SciBite products, contact us here.
