In an increasingly data-driven society, it can be overwhelming to keep your knowledge base current and to use data effectively. As a result, many organizations underutilize the data and content available to them. This is particularly common in disciplines like the life sciences, where the data is largely unstructured and where terminology is heterogeneous and rapidly evolving. Such data is less amenable to computational processing, which hampers search and the extraction of scientific insights. These challenges underpin the increasing adoption of FAIR (Findable, Accessible, Interoperable and Reusable) data principles.
The content landscape is also becoming more complex: more content providers, together with a broader range of internal data management tools, lead to further siloing of data and make it inaccessible to users within an organization. Data-driven organizations need to support both internal and external content, making it accessible to users and enabling the organization to realize the value of its investment in this content.
Access to data is one part of the problem; findability is a second challenge that data-driven organizations face. In this blog post, learn how Copyright Clearance Center (CCC) and SciBite have combined their expertise to deliver a FAIR data platform that offers full semantic search within RightFind Navigate. RightFind Navigate streamlines the delivery and aggregation of content and uses industry-leading named entity recognition to enable full semantic search over aggregated content sets.
As leaders in providing access to content, CCC developed RightFind Navigate to break down data silos and to streamline the delivery of information to users, whether that is scientific literature, global life science patents, or the latest information on drugs or ongoing clinical trials. RightFind provides access to the most comprehensive collection of scientific, medical and technical content, including over 5 million open access articles, in a copyright-compliant manner.
RightFind Navigate provides a unified view of both internal and external content sets, saves time and effort, and eliminates data silos that limit the accessibility and usability of this content. Through a search experience backed by machine learning (ML), users can personalize how they search, helping them to find key insights quickly and efficiently and feeding directly into the development of the next generation of therapeutics. This search experience is further enhanced through the application of semantics, powered by the SciBite semantic platform.
SciBite are industry leaders in the enterprise-wide implementation of FAIR data principles. Through an award-winning semantic platform, SciBite uses hand-curated and optimized ontologies (VOCabs) that contextualize data and align it with the broader scientific community. These ontologies cover a broad range of concepts within the life sciences, such as drugs, genes, proteins, combination therapies and medical devices. When combined with a named entity recognition and extraction tool (TERMite), these VOCabs are used to identify key concepts or entities within the data and assign an ID to each of them.
Key to this tagging process is the ability to leverage the broad range of synonyms and naming conventions that the VOCabs support. For example, breast cancer, cancer of the breast, mammary tumour and breast tumour are all different names for the same thing. This means that, regardless of how a concept was written in the original text, it will be accurately identified. Once a term is tagged as an entity and assigned an ID (for example, MedDRA ID 10006187 for breast cancer), users can leverage all of the synonyms associated with that ID for search and analytics.
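The tagging idea described above can be illustrated with a minimal sketch. This is not TERMite or the VOCabs themselves, whose internals and API are not described here; it is a simple dictionary lookup written for illustration, assuming a tiny hand-written vocabulary. The names `VOCAB`, `build_matcher` and `tag_entities` are hypothetical, and only the synonyms and the ID are taken from the example in the text.

```python
import re

# Hypothetical mini-vocabulary for illustration only: one entity ID mapped to
# the synonyms mentioned in the text. A real ontology would be far larger.
VOCAB = {
    "10006187": [
        "breast cancer",
        "cancer of the breast",
        "mammary tumour",
        "breast tumour",
    ],
}


def build_matcher(vocab):
    """Compile one case-insensitive pattern covering every synonym."""
    lookup = {syn.lower(): entity_id
              for entity_id, synonyms in vocab.items()
              for syn in synonyms}
    # Try longer synonyms first so the longest phrase wins at each position.
    ordered = sorted(lookup, key=len, reverse=True)
    alternation = "|".join(re.escape(syn) for syn in ordered)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return pattern, lookup


def tag_entities(text, vocab=VOCAB):
    """Return (surface form, entity ID, start offset) for each match."""
    pattern, lookup = build_matcher(vocab)
    return [(m.group(0), lookup[m.group(0).lower()], m.start())
            for m in pattern.finditer(text)]


if __name__ == "__main__":
    sentence = "Patients with mammary tumour or Breast Cancer were enrolled."
    for surface, entity_id, offset in tag_entities(sentence):
        print(f"{surface!r} at {offset} -> {entity_id}")
```

Both mentions in the sample sentence resolve to the same ID even though they are worded differently, which is the property the paragraph above relies on. A production system would also need to handle ambiguity, abbreviations and context, which this sketch does not attempt.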
For the RightFind Navigate user, this means that whichever search term they use, they will receive robust and inclusive search results. For example, whether the user searches for breast cancer or cancer of the breast, RightFind Navigate will know that these are equivalent and will return the same results, something that wouldn’t necessarily be the case if relying on string matching (keyword search). This improves the user experience and ensures that relevant content and insights are not missed.
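To make the contrast with string matching concrete, here is a small sketch comparing a plain keyword search with a search that first resolves the query and the documents to entity IDs. It is an illustration under simplifying assumptions, not how RightFind Navigate is implemented: the document list, the `SYNONYMS` table and the function names are all hypothetical, and the naive substring matching stands in for a real named entity recognition step.

```python
# Hypothetical synonym table: entity ID -> names that refer to it.
SYNONYMS = {
    "10006187": {
        "breast cancer",
        "cancer of the breast",
        "mammary tumour",
        "breast tumour",
    },
}

# Hypothetical document titles for illustration.
DOCS = [
    "Trial of a new therapy for breast cancer.",
    "Imaging outcomes in cancer of the breast.",
    "Mammary tumour models in mice.",
    "Lung cancer screening update.",
]


def normalise(text):
    """Lowercase and collapse whitespace so comparisons are consistent."""
    return " ".join(text.lower().split())


def entity_ids(text):
    """Return the IDs of all entities with a synonym appearing in the text."""
    cleaned = normalise(text)
    return {entity_id for entity_id, names in SYNONYMS.items()
            if any(name in cleaned for name in names)}


def keyword_search(query, docs):
    """Plain string matching: the query must appear verbatim."""
    needle = normalise(query)
    return [doc for doc in docs if needle in normalise(doc)]


def semantic_search(query, docs):
    """Match on shared entity IDs rather than on the literal query string."""
    query_ids = entity_ids(query)
    return [doc for doc in docs if query_ids & entity_ids(doc)]


if __name__ == "__main__":
    for query in ("breast cancer", "cancer of the breast"):
        print(query, "| keyword:", len(keyword_search(query, DOCS)),
              "| semantic:", len(semantic_search(query, DOCS)))
```

In this toy setup the keyword search returns a different single document for each phrasing, while the ID-based search returns the same three documents for both, because every synonym resolves to one identifier before matching.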
Data silos hamper access to data and limit the utility of internal and external content sets, as does the procurement of commercial data sources. Through RightFind Navigate, CCC provides a solution that streamlines the delivery of published content and presents it alongside internal content and datasets. Combined with industry-leading semantic search powered by the SciBite platform, RightFind Navigate enables this aggregated content to be navigated effectively and insights to be extracted from it.
De-siloing this content not only saves time and improves the user experience, but also maximizes ROI for organizations that invest heavily in both internal and external content. In addition, a platform of organized and harmonized data unlocks a range of downstream applications that can take this ROI to another level, such as analytical dashboards and even predictive ML and AI models.
Keep Learning: In this white paper from CCC and SciBite, we highlight four practical applications for semantic enrichment across the drug development pipeline, enabling life sciences organizations to not only save time but increase the accuracy and efficiency of their processes.
Copyright Clearance Center (CCC) is a global leader in content management, licensing, discovery, and delivery solutions. Through its relationships with those who use and create content, CCC drives market-based solutions that fuel research, power publishing, and respect copyright. With its subsidiaries RightsDirect and Ixxus, CCC provides solutions for millions of people from the world’s largest companies and academic institutions.
We partner with leading enterprise search platforms to enhance real-time big data analytics for pharma and biotech companies. Semantic search capabilities improve the accuracy of search results, allowing companies to make data-informed decisions. Find out more about how SciBite’s solutions can help unlock the potential of the R&D data in your business.
Sam leads partnerships and alliances at SciBite, working collaboratively with existing partners and developing new partnerships aligned to SciBite’s strategic goals. He has a strong technical background in the life sciences, with a PhD in Protein Biochemistry from the University of Nottingham and post-doctoral training in bioinformatics within the Department of Neurosurgery at the University of California San Francisco.
Prior to joining SciBite, he held technical sales and commercial roles at Carl Zeiss and most recently led business development at Repositive, building relationships with contract research organisations, biotechs and pharma companies, and facilitating data exchange and search across multiomic datasets. He has a good grasp of the challenges of dealing with unstructured scientific data and of collaboratively developing practical solutions to overcome them.