SKOS in CENtree: Further support in our latest 2.1 release

At SciBite, terminologies underpin all that we do. There are many ways to represent and build a standardised terminology, each with a different level of complexity. At one end of the spectrum are simple, informal, lightweight terminologies (e.g., glossaries, dictionaries, and thesauri), where the meaning (semantics) of terms is captured using natural language.

These become more informative when structure and semantic relationships are encoded into them, as in taxonomies or controlled vocabularies. At the other end of the spectrum are full-blown formal ontology languages, such as OWL, that allow you to build terminologies with strict and precise semantics.

At SciBite, we recognize that there’s a mixture of formats for capturing terminologies in the wild and that each serves different use cases.

Standards such as Medical Subject Headings (MeSH) are designed for document indexing and categorization. Concepts in MeSH are organized into a hierarchy using generic broader/narrower relationships that are useful in supporting document retrieval and navigation.

For example, in MeSH, the Anatomy branch organizes Body Regions into a hierarchy where concepts such as Eye, Mouth, Nose, and Chin are all narrower terms under Face, which is itself a narrower term for Head. In contrast, other standards, such as Uberon, represent body regions using an OWL ontology where stricter relationships such as subclass and part-of are used to organise the hierarchy and provide a more meaningful description of these concepts.
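As a rough illustration of the generic broader/narrower hierarchy described above, the following Python sketch stores the Body Regions example as simple broader links and walks them. The dictionary layout and the function names (`ancestors`, `narrower`) are assumptions made for this example; they are not part of MeSH or of any SciBite product.

```python
# Generic "broader" links for the Body Regions example in the text.
# Each key is a term and each value is its single broader term.
BROADER = {
    "Eye": "Face",
    "Mouth": "Face",
    "Nose": "Face",
    "Chin": "Face",
    "Face": "Head",
}


def ancestors(term, broader=BROADER):
    """Return the chain of broader terms from `term` up to the top."""
    chain = []
    while term in broader:
        term = broader[term]
        chain.append(term)
    return chain


def narrower(term, broader=BROADER):
    """Return the direct narrower terms of `term`, sorted by name."""
    return sorted(child for child, parent in broader.items() if parent == term)


print(ancestors("Eye"))   # ['Face', 'Head']
print(narrower("Face"))   # ['Chin', 'Eye', 'Mouth', 'Nose']
```

Note that the links say nothing about why Eye sits under Face. An OWL ontology such as Uberon would instead state the specific relationship (for example subclass or part-of) on each link.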

Figure 1: Terminologies overview. Terminologies may be represented with varying levels of expressivity and formality, depending on the use case each is designed to serve and on its level of maturity.

Figure 2: Strict semantics in OWL vs. weaker semantics in MeSH. Above we can see a class from Uberon, an OWL-based ontology, in CENtree and the equivalent class in MeSH. The graph view shows the type of relationship for each class: strict part_of relations in Uberon and less specific relations in MeSH.

To improve the interoperability of controlled vocabularies and terminologies where weaker semantics are enough to organize concepts into hierarchies, the Simple Knowledge Organization System (SKOS) provides a convenient alternative to more formal ontology modeling languages such as OWL.

What is SKOS?

SKOS was developed by the W3C as a standard for representing controlled vocabularies and thesauri, and became a W3C Recommendation in 2009. SKOS is often a good starting point when building new vocabularies that may later become ontologies; it provides a standard way to describe common features of a controlled terminology, such as label predicates (skos:prefLabel, skos:altLabel, etc.) and taxonomic information (skos:broader and skos:narrower relationships).

SKOS is predominantly used to support search and navigation use cases. In such settings, the alt label predicate enables synonyms to be captured, while the broader and narrower predicates allow users to browse for search terms and enable information retrieval applications to use this structure to automatically expand queries.
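The query expansion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular search product implements it: the `Concept` class, the `expand_query` function, and the alternative labels shown are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    pref_label: str
    alt_labels: list = field(default_factory=list)  # synonyms (alt labels)
    narrower: list = field(default_factory=list)    # pref labels of narrower concepts


# Illustrative vocabulary keyed by preferred label; the synonyms are made up.
VOCAB = {
    "Face": Concept("Face", ["Facial region"], ["Eye", "Nose"]),
    "Eye": Concept("Eye", ["Eyes"]),
    "Nose": Concept("Nose"),
}


def expand_query(term, vocab):
    """Expand a search term with its synonyms and all narrower concepts."""
    # Resolve the term against preferred and alternative labels, ignoring case.
    lookup = {}
    for concept in vocab.values():
        for label in [concept.pref_label, *concept.alt_labels]:
            lookup[label.lower()] = concept.pref_label
    start = lookup.get(term.lower())
    if start is None:
        return [term]  # unknown term: search for it unchanged

    expanded, stack, seen = [], [start], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        concept = vocab[name]
        expanded.extend([concept.pref_label, *concept.alt_labels])
        # Push narrower concepts so they are visited in their listed order.
        stack.extend(reversed(concept.narrower))
    return expanded


print(expand_query("facial region", VOCAB))
# ['Face', 'Facial region', 'Eye', 'Eyes', 'Nose']
```

A search for a synonym of Face therefore also retrieves documents that mention only Eye or Nose, which is the behaviour the broader and narrower predicates make possible.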

Furthermore, SKOS-XL, which defines an extension for SKOS, allows for the representation of literal entities (e.g., a label or synonym) as resources in their own right. This feature allows vocabulary editors to give textual labels unique identifiers and to define relationships between these entities. At SciBite, we can take advantage of the SKOS-XL representation to add additional information about synonyms to aid named entity recognition (NER) in our TERMite system.

This makes SKOS-XL the perfect means for representing and sharing vocabularies within the SciBite stack. SKOS-XL allows for NER ‘rules’ to be captured in the vocabulary before it is passed on to TERMite to be used for marking up text. The same SKOS-XL representation can also be used by our search solution, SciBite Search, for encoding the associated taxonomy of the vocabulary.
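To make the idea of labels as resources more concrete, here is a small Python sketch that models SKOS-XL statements as plain (subject, predicate, object) tuples. The SKOS-XL terms used (`skosxl:Label`, `skosxl:prefLabel`, `skosxl:altLabel`, `skosxl:literalForm`) come from the W3C specification. Everything under the `http://example.org/` namespace, including the `caseSensitive` and `source` properties, is a hypothetical placeholder and does not represent actual TERMite switches or the CENtree export format.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
EX = "http://example.org/vocab/"  # hypothetical namespace for this sketch

# Each label is a resource with its own identifier, so extra statements
# (here two invented annotation properties) can be attached to it.
TRIPLES = [
    (EX + "face", SKOSXL + "prefLabel", EX + "label/face-pref"),
    (EX + "label/face-pref", RDF_TYPE, SKOSXL + "Label"),
    (EX + "label/face-pref", SKOSXL + "literalForm", "Face"),
    (EX + "face", SKOSXL + "altLabel", EX + "label/face-alt1"),
    (EX + "label/face-alt1", RDF_TYPE, SKOSXL + "Label"),
    (EX + "label/face-alt1", SKOSXL + "literalForm", "Facial region"),
    (EX + "label/face-alt1", EX + "caseSensitive", "false"),  # hypothetical
    (EX + "label/face-alt1", EX + "source", "curator"),       # hypothetical
]


def describe_labels(concept, triples):
    """Map each label's literal form to the extra properties stated about it."""
    label_predicates = (SKOSXL + "prefLabel", SKOSXL + "altLabel")
    result = {}
    for subject, predicate, obj in triples:
        if subject == concept and predicate in label_predicates:
            # Gather every statement about the label resource except its type.
            props = {p: o for s, p, o in triples if s == obj and p != RDF_TYPE}
            literal = props.pop(SKOSXL + "literalForm")
            result[literal] = props
    return result


labels = describe_labels(EX + "face", TRIPLES)
print(sorted(labels))  # ['Face', 'Facial region']
```

With plain SKOS, the synonym would be a bare string attached to the concept, leaving nowhere to record the per-label annotations shown here.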

SKOS in CENtree

Up until recently, CENtree primarily supported OWL, hiding a lot of the complexity captured in OWL through the utilization of the internal CENtree representation model. This internal representation is also used for controlled vocabularies, which aligns well with SKOS. We are very pleased to announce that in CENtree 2.1 we have some additional features that build upon CENtree’s ability to support the ingestion, manipulation, and export of SKOS-based terminologies:

  • We support ingest and export of both SKOS and OWL, supporting organizations that are working with mixed representations
  • We use SKOS as a higher-level exchange format for vocabularies, where we are mostly focused on lexical information (labels and synonyms) and some taxonomy
  • We support SKOS-XL where we want to say additional things about lexical entities (synonyms), such as capturing provenance or adding TERMite switch information

Conclusion

Although CENtree has been designed to handle a wide variety of standards seamlessly, we have extended its SKOS support in the latest release. The additional SKOS support not only provides smooth integration from CENtree to TERMite but also helps two groups of users: those at the start of their ontology journey, who are ingesting less complex terminologies into CENtree, and those who use the SKOS format to represent terminologies across the business.

To learn more about CENtree or find out more about how we can help you get more from your data, contact the SciBite team.

Andy Balfe
Product Manager, SciBite

Andy Balfe received his BSc and PhD in organic chemistry from the University of East Anglia. He coordinates the delivery of innovative projects across SciBite’s product suite.
