SKOS in CENtree: Further support in our latest 2.1 release

At SciBite, terminologies underpin all that we do. There are many ways to represent and build a standardised terminology, each with a different level of complexity. At one end of the spectrum are simple, informal, lightweight terminologies (e.g., glossaries, dictionaries, and thesauri), where the meaning (semantics) of terms is captured using natural language.


These become more informative when we encode structure and semantic relationships into them, as in taxonomies or controlled vocabularies. At the other end of the spectrum, we have full-blown formal ontology languages, such as OWL, that allow you to build terminologies with strict and precise semantics.

 


Figure 1. Terminologies overview. Terminologies may be represented using varied levels of expressivity and formality, depending on the use case each is designed to serve, as well as its level of maturity.

At SciBite, we recognize that there’s a mixture of formats for capturing terminologies in the wild and that each serves different use cases.

Standards such as Medical Subject Headings (MeSH) are designed for document indexing and categorization. Concepts in MeSH are organized into a hierarchy using generic broader/narrower relationships that are useful in supporting document retrieval and navigation. For example, in MeSH, the Anatomy branch organizes Body Regions into a hierarchy where concepts such as Eye, Mouth, Nose, and Chin are all narrower terms under Face, which is itself a narrower term of Head. In contrast, other standards, such as Uberon, represent body regions using an OWL ontology where stricter relationships such as subclass and part-of are used to organize the hierarchy and provide a more meaningful description of these concepts.


Figure 2. Strict semantics in OWL vs. weaker semantics in MeSH. Above we can see a class from Uberon, an OWL-based ontology, in CENtree and the equivalent class in MeSH. The graph view shows the type of relationships for each class: strict part_of relations in Uberon and less specific broader/narrower relations in MeSH.

Where weaker semantics are sufficient to organize concepts into hierarchies, the Simple Knowledge Organization System (SKOS) provides a convenient alternative to more formal ontology modeling languages like OWL, and helps improve the interoperability of controlled vocabularies and terminologies.

What is SKOS?

SKOS was developed by the W3C as a standard for representing controlled vocabularies and thesauri, and became a W3C Recommendation in 2009. SKOS is often a good starting point when building new vocabularies that may later become ontologies; it provides a standard way to describe common features of a controlled terminology, such as label predicates (skos:prefLabel, skos:altLabel, etc.) and taxonomic information (skos:broader and skos:narrower relationships).
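To make the label and hierarchy predicates concrete, here is a minimal Python sketch that writes a single concept out as SKOS in Turtle syntax. The `ex:` namespace, the `Concept` class, the `to_turtle` helper, and the synonym "Faces" are all illustrative assumptions, not part of CENtree or of any published vocabulary.

```python
from dataclasses import dataclass, field

# Prefix declarations; the ex: namespace is a made-up example namespace.
SKOS_PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/anatomy/> .\n"
)


@dataclass
class Concept:
    local_id: str                                      # hypothetical local name under ex:
    pref_label: str                                    # the single preferred label
    alt_labels: list[str] = field(default_factory=list)  # synonyms
    broader: list[str] = field(default_factory=list)     # local ids of parent concepts


def to_turtle(concept: Concept) -> str:
    """Serialise one concept as a SKOS Turtle snippet (English labels only).

    Labels are assumed not to contain double quotes; no escaping is done.
    """
    lines = [f"ex:{concept.local_id} a skos:Concept ;"]
    lines.append(f'    skos:prefLabel "{concept.pref_label}"@en ;')
    for alt in concept.alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@en ;')
    for parent in concept.broader:
        lines.append(f"    skos:broader ex:{parent} ;")
    # Swap the trailing " ;" of the last statement for the closing " ."
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


face = Concept("face", "Face", alt_labels=["Faces"], broader=["head"])
print(SKOS_PREFIXES + to_turtle(face))
```

In practice an RDF library would handle escaping and serialisation; plain string formatting is used here only to keep the sketch self-contained.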

SKOS is predominantly used to support search and navigation use cases. In such settings, the altLabel predicate enables synonyms to be captured, while the broader and narrower predicates allow users to browse for search terms and enable information retrieval applications to use this structure to automatically expand queries.
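As a rough illustration of query expansion, the sketch below uses a toy vocabulary built from the MeSH Head/Face example earlier in this post. The dictionaries, the `expand_query` function, and the synonyms are assumptions made for illustration; they are not taken from MeSH or from any SciBite product.

```python
# Narrower-term links, following the Head > Face > Eye/Mouth/Nose/Chin example.
NARROWER = {
    "Head": ["Face"],
    "Face": ["Eye", "Mouth", "Nose", "Chin"],
}

# Synonyms (altLabels); these values are illustrative only.
ALT_LABELS = {
    "Eye": ["Eyes"],
    "Mouth": ["Oral Cavity"],
}


def expand_query(term: str) -> list[str]:
    """Return the term, every concept reachable via narrower links, and their synonyms."""
    expanded: list[str] = []
    seen: set[str] = set()
    stack = [term]
    while stack:
        concept = stack.pop()
        if concept in seen:          # guard against cycles or repeated children
            continue
        seen.add(concept)
        expanded.append(concept)
        expanded.extend(ALT_LABELS.get(concept, []))
        stack.extend(NARROWER.get(concept, []))
    return expanded


# A search for "Face" can now also match documents mentioning its narrower terms.
print(" OR ".join(f'"{t}"' for t in expand_query("Face")))
```

Note that skos:narrower is not itself defined as transitive; following the links repeatedly, as this sketch does, corresponds to the skos:narrowerTransitive property, and whether expanding that far is desirable depends on the search use case.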

Furthermore, SKOS-XL, an extension of SKOS, allows lexical entities (e.g., a label or synonym) to be represented as resources in their own right. This feature allows vocabulary editors to give textual labels their own unique identifiers and to define relationships between these entities. At SciBite, we can take advantage of the SKOS-XL representation to attach additional information to synonyms to aid named entity recognition (NER) in our TERMite system.

This makes SKOS-XL the perfect means for representing and sharing vocabularies within the SciBite stack. SKOS-XL allows for NER ‘rules’ to be captured in the vocabulary before it is passed on to TERMite to be used for marking up text. The same SKOS-XL representation can also be used by our search solution, SciBite Search, for encoding the associated taxonomy of the vocabulary.
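The idea of a label as a resource can be sketched as follows. In SKOS-XL, a concept points to a `skosxl:Label` resource via `skosxl:prefLabel` or `skosxl:altLabel`, and the label carries its text in `skosxl:literalForm`, which leaves room for further statements about the label itself. The `XLLabel` class, the `ex:` namespace, and in particular the `ex:caseSensitive` property are hypothetical stand-ins; they do not describe how CENtree or TERMite actually encode NER settings.

```python
from dataclasses import dataclass, field

XL_PREFIXES = (
    "@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
    "@prefix ex: <http://example.org/anatomy/> .\n"   # made-up example namespace
)


@dataclass
class XLLabel:
    local_id: str                  # identifier of the label resource itself
    literal_form: str              # the text of the label
    # Extra statements about the label: prefixed predicate -> plain literal value.
    annotations: dict[str, str] = field(default_factory=dict)


def label_to_turtle(concept_id: str, label: XLLabel, preferred: bool = False) -> str:
    """Link a concept to an SKOS-XL label resource and describe that label."""
    link = "skosxl:prefLabel" if preferred else "skosxl:altLabel"
    lines = [
        f"ex:{concept_id} {link} ex:{label.local_id} .",
        f"ex:{label.local_id} a skosxl:Label ;",
        f'    skosxl:literalForm "{label.literal_form}"@en ;',
    ]
    for predicate, value in label.annotations.items():
        lines.append(f'    {predicate} "{value}" ;')
    # Swap the trailing " ;" of the last statement for the closing " ."
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


synonym = XLLabel(
    "face_alt_1",
    "Faces",
    {
        "dct:source": "curator",        # provenance, using Dublin Core Terms
        "ex:caseSensitive": "false",    # hypothetical NER hint, not a real TERMite switch
    },
)
print(XL_PREFIXES + label_to_turtle("face", synonym))
```

Because the synonym has its own identifier, provenance and matching hints attach to that one label rather than to the concept as a whole.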

SKOS in CENtree

Until recently, CENtree primarily supported OWL, hiding much of OWL's complexity behind CENtree’s internal representation model. This internal representation is also used for controlled vocabularies, which aligns well with SKOS. We are very pleased to announce that CENtree 2.1 adds features that build upon CENtree’s ability to support the ingestion, manipulation, and export of SKOS-based terminologies:

  • We support ingest and export of both SKOS and OWL, supporting organizations that work with mixed representations
  • We use SKOS as a higher-level exchange format for vocabularies, where we are mostly focused on lexical information (labels and synonyms) and some taxonomy
  • We support SKOS-XL where we want to say additional things about lexical entities (synonyms), such as capturing provenance or adding TERMite switch information

Conclusion

CENtree has been designed to handle a wide variety of standards in a seamless manner, and in the latest release we have extended its SKOS support. This additional support not only provides smooth integration from CENtree to TERMite but also helps users who are at the start of their ontology journey, and are therefore ingesting less complex terminologies into CENtree, as well as those who use SKOS as a means of representing terminologies across the business.

To learn more about CENtree or find out more about how we can help you get more from your data, contact the SciBite team.
