Using Enterprise Search to unlock the wealth of R&D data

In this blog post, learn about our partnership with Sinequa and how our technologies combine Cognitive Search with powerful Life Sciences semantics, and how enterprise search platforms are enabling Pharmaceutical companies to become more information-driven.

Unlocking data

Finding relevant information to help make effective decisions can be a challenge for most Pharmaceutical companies. It typically starts with laboriously searching a range of sources, including both internal documents and the biomedical literature; then reading and interpreting everything that looks interesting.

This process is resource-intensive, which constrains the number of different sources that can be searched and the depth of review possible. The exponentially growing amount of data and increasingly diverse range of sources, coupled with the reliance on human interpretation, significantly increase the risk of missing something important.

We were pleased that Gengis Birsen, Sales Engineer from Sinequa, joined our SciBite User Group Meeting in Boston to explain how enterprise search platforms (also known as Insight Engines per Gartner’s definition or Cognitive Search per Forrester’s definition) are enabling Pharmaceutical companies to become more information-driven.

Gengis explained that “enterprise search platforms surface the most relevant information in a way that can be easily consumed and interpreted by users.” Sinequa’s product is one such platform: it provides a scalable, high-performance infrastructure that enables secure access to valuable insights and relevant information from within millions of documents across the entire organization. This, in turn, empowers companies to analyse all of their enterprise content and data securely, without losing context, and so make faster, more informed decisions.

Transitioning to an information-driven approach to decision-making [1] Source: Sinequa

Semantically enriching enterprise data

So where does SciBite fit in? Well, in science, it’s pretty common that a search for, say, the Alzheimer’s-related gene PSEN1 would miss references to synonyms such as Presenilin-1, AD3, and PSNL1. The inconsistent use of synonyms during data entry makes it difficult to identify and collate all relevant data for a disease or target of interest.
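To make the synonym problem concrete, here is a minimal Python sketch. It is not SciBite's API or vocabulary: the `SYNONYMS` table, the `find_mentions` helper, and the sample sentences are all made up for illustration, and a real vocabulary would be far larger and handle ambiguity. It simply shows how a search for one concept can match documents that use any of its synonyms.

```python
import re

# Hypothetical, hand-written synonym table (assumption, for illustration only).
SYNONYMS = {
    "PSEN1": ["PSEN1", "Presenilin-1", "AD3", "PSNL1"],
}

def find_mentions(text, concept):
    """Return the synonyms of `concept` that occur in `text` as whole terms."""
    found = []
    for term in SYNONYMS[concept]:
        # Lookarounds stop a term from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found

# Made-up sample sentences standing in for internal documents.
docs = [
    "Presenilin-1 activity was measured in the first cohort.",
    "The AD3 results are summarised in the appendix.",
    "This report covers an unrelated kinase.",
]

# Documents that mention PSEN1 under any of its names.
hits = [doc for doc in docs if find_mentions(doc, "PSEN1")]
```

A plain keyword search for "PSEN1" would return none of these sample documents, whereas the synonym-aware lookup returns the first two.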

Sinequa has integrated SciBite’s semantic layer via our simple API, which is specifically designed to integrate and enrich complementary technologies. According to Gengis, “SciBite brings scientific understanding to Sinequa-based solutions and enables them to ‘understand’ scientific concepts such as drugs, targets, and indications.”

SciBite’s world-class ontologies apply an explicit, unique meaning and description to scientific terms found in a company’s entire document collection as well as public data such as PubMed, patents, and grant applications. This enables complex scientific text to be contextualised so that it can be understood and used as high-quality, actionable data, irrespective of its source. SciBite generates a semantic index by transforming unstructured scientific text, including Word documents, PowerPoint presentations, and PDFs, into a structure that can be queried in a simple fashion to answer questions that would otherwise require time-consuming, error-prone manual aggregation.

Semantic enrichment of enterprise data [2] Source: Sinequa

Users can perform more comprehensive and inclusive queries, ensuring that all relevant data is found, regardless of which synonym is used as the search term. They can also answer more complex questions, such as concept-type searches (e.g., mentions of a drug or indication of interest) and ontology-based queries (e.g., mentions of a ‘type’ of entity such as a gene). And they can explore relationships between entities, for example by identifying the studied targets that are associated with inflammatory disorders, or by generating a list of the disease indications that most frequently co-occur with a gene of interest.
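As a rough illustration of the last kind of question, the sketch below counts which indications co-occur with a gene across a small set of annotated documents. Everything here is an assumption made for the example: the `annotated_docs` structure, the entity labels, and the `co_occurring_indications` function are hypothetical and do not reflect how SciBite or Sinequa store or query their indexes.

```python
from collections import Counter

# Hypothetical annotated documents: each entry lists the entities
# that a semantic enrichment step is assumed to have tagged in it.
annotated_docs = [
    {"genes": {"PSEN1"}, "indications": {"Alzheimer's disease"}},
    {"genes": {"PSEN1", "APP"}, "indications": {"Alzheimer's disease", "dementia"}},
    {"genes": {"APP"}, "indications": {"dementia"}},
]

def co_occurring_indications(docs, gene):
    """Count indications in documents that also mention `gene`, most frequent first."""
    counts = Counter()
    for doc in docs:
        if gene in doc["genes"]:
            counts.update(doc["indications"])
    return counts.most_common()

ranking = co_occurring_indications(annotated_docs, "PSEN1")
```

Because the entities are stored as normalised concepts rather than raw strings, the count does not depend on which synonym each document happened to use.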

Finding experts

Gengis gave a fantastic example of a semantically enriched enterprise search in action, which involved finding people with specific expertise within an organization. Having an understanding of the availability and location of internal expertise is fundamental to effective and efficient project planning. However, this is a challenge for most organizations, particularly for global companies with geographically dispersed R&D teams. Although a person’s expertise is often reflected in the content they generate, such as reports, presentations, experimental write-ups, email messages, and posts to internal messaging boards, it’s typically difficult to extract and use this information effectively.

Gengis used the example of a global company with over 10,000 people in R&D that used the Sinequa platform to analyze over 200 million documents (40% of which were internal) for scientific entities, including drugs, genes, modes of action, and diseases.

In conjunction, Sinequa’s NLP was used to automatically identify people of significance within these documents. When combined, this information enabled expertise to be mapped and relationships to be identified across the entire organization.

Today, stakeholders within this organization can find and rank experts in a given area, e.g., diabetes mellitus or arteriosclerosis, and unify information in 360° views.
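The idea of ranking experts can be sketched in a few lines of Python. This is a simplified stand-in, not the method used by Sinequa or the company in Gengis' example: the `records` data, the names in it, and the `rank_experts` function are hypothetical, and the score is just a count of documents per person that mention the concept.

```python
from collections import defaultdict

# Hypothetical records: (person identified in a document, concepts tagged in it).
records = [
    ("A. Rivera", {"diabetes mellitus"}),
    ("A. Rivera", {"diabetes mellitus", "arteriosclerosis"}),
    ("B. Chen", {"arteriosclerosis"}),
    ("B. Chen", {"diabetes mellitus"}),
    ("A. Rivera", {"diabetes mellitus"}),
]

def rank_experts(records, concept):
    """Rank people by how many of their documents mention `concept`."""
    counts = defaultdict(int)
    for person, concepts in records:
        if concept in concepts:
            counts[person] += 1
    # Highest count first; ties are broken alphabetically by name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

diabetes_experts = rank_experts(records, "diabetes mellitus")
```

A production system would likely weight document type, recency, and authorship role rather than relying on raw counts alone.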

Identification of expertise across the entire organisation [3]

We’ve heard many other great examples of how coupling semantic enrichment and enterprise search is unlocking the potential of R&D data to drive drug discovery initiatives. As Gengis said, “The combination of Sinequa and SciBite opens unparalleled access to drug discovery intelligence and vast amounts of knowledge, previously hidden in scattered document repositories, and ensures users are better informed, without overloading them with information.” Thanks again to Gengis for his interesting and insightful presentation and demonstration.

To learn more, you can watch our webinar with Sinequa on Unlocking the Wealth of R&D Data.

If you’d like to find out more about how SciBite’s solutions can help unlock the potential of the R&D data in your business, contact the SciBite team today.

[1] Taken from Gengis’ presentation ‘Unlock the Wealth of R&D Data’, presented at SciBite’s 2018 UGM in Boston
[2] Taken from Gengis’ presentation ‘Unlock the Wealth of R&D Data’, presented at SciBite’s 2018 UGM in Boston
[3] Identification of expertise across the entire organization
