Enterprise search represents a complex challenge: extracting accurate results from the ever-increasing volumes and variability of data.
But (and it’s a big but) how does a search system know what to look for? Hammond’s syndrome, Anton Vogt syndrome and athetosis are all terms referring to the same condition, yet a computer cannot determine this without some help. Try searching for each of these in your web browser: do you get the same results? No?
The role of semantics in bringing clarity to synonymous data.
Semantics are at the heart of what we do at SciBite. We offer a collection of over 85 scientific vocabularies covering a diverse range of topics across the biopharmaceutical R&D process (genes, adverse events, pathology, indications, etc.). Each vocabulary contains a list of terms (we call them entities) and their various synonyms, which enables search to become more scientifically aware and ultimately simplifies the experience for the end user. Nobody wants a system more complicated than it needs to be, right?
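To make the idea of entities and synonyms concrete, here is a minimal sketch of synonym-aware query expansion. It is an illustration only, not SciBite's implementation: the entity ID `EXAMPLE:0001`, the function names and the normalisation rules are all assumptions made for this example, and the vocabulary holds just the three terms mentioned above.

```python
# Hypothetical toy vocabulary: one entity with a made-up ID and the three
# synonyms named in the text. Real vocabularies are far larger.
VOCABULARY = {
    "EXAMPLE:0001": ["Athetosis", "Hammond's syndrome", "Anton Vogt syndrome"],
}


def normalise(term):
    """Lower-case, unify curly apostrophes and collapse whitespace."""
    return " ".join(term.replace("\u2019", "'").lower().split())


def build_index(vocabulary):
    """Map every normalised synonym to the ID of the entity it belongs to."""
    index = {}
    for entity_id, synonyms in vocabulary.items():
        for synonym in synonyms:
            index[normalise(synonym)] = entity_id
    return index


def expand_query(query, vocabulary, index):
    """Return every synonym of the entity the query names, else the query alone."""
    entity_id = index.get(normalise(query))
    if entity_id is None:
        return [query]
    return list(vocabulary[entity_id])


INDEX = build_index(VOCABULARY)
print(expand_query("hammond\u2019s syndrome", VOCABULARY, INDEX))
```

Whichever of the three terms the user types, the search is run against all of them, so documents that use a different name for the same condition are still found.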
So you’re now sold on the power of semantic enrichment. The next step is to ensure that these vocabularies are both exhaustive and well maintained: what value do they have if they are out of date?
Our semantic library is constantly updated and expanded by a team of scientific experts. Across the collection of vocabularies we count over 20 million synonyms, many times more than may be available in the public domain.
Consider MeSH, a set of controlled vocabularies for indexing life sciences information, as a reference example. Search for Hammond’s syndrome and MeSH has 14 synonyms; SciBite has twice that, at 28. Abetalipoproteinemia has 7 synonyms in MeSH; SciBite has more than 100. Put simply, a larger reference library yields more extensive results.
Once a system can identify and extract entities, it then needs to determine the correct meaning in cases of ambiguity. EGFR could be Epidermal Growth Factor Receptor or estimated Glomerular Filtration Rate. Our tools not only identify and extract entities in text but also disambiguate them to uncover, where possible, the correct meaning of such terms.
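One simple way to picture disambiguation is to look at the words surrounding the ambiguous term. The sketch below is a deliberately naive illustration, not a description of how our tools work: the cue-word lists and the `disambiguate` function are assumptions invented for this example, and it scores each sense of EGFR by how many of its cue words appear in the text.

```python
import re

# Hypothetical cue words for each sense of "EGFR", chosen for illustration only.
SENSES = {
    "Epidermal Growth Factor Receptor": {
        "tumor", "tumour", "cancer", "kinase", "mutation", "inhibitor",
    },
    "Estimated Glomerular Filtration Rate": {
        "kidney", "renal", "creatinine", "ml", "min", "dialysis",
    },
}


def disambiguate(text, senses=SENSES):
    """Return the sense whose cue words overlap most with the text, else None."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in senses.items()}
    top = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == top]
    # No cue words found, or a tie between senses: decline to guess.
    if top == 0 or len(winners) > 1:
        return None
    return winners[0]


print(disambiguate("EGFR mutation status predicted response to the kinase inhibitor"))
print(disambiguate("The patient's eGFR fell below 30 mL/min, indicating renal impairment"))
print(disambiguate("EGFR was measured"))
```

Returning `None` when the context gives no clear signal mirrors the "where possible" caveat: a wrong guess is usually worse than an unresolved term.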
With enterprise search initiatives aiming to interrogate vast big data architectures, adding in our semantic capabilities will provide users with more comprehensive, relevant and accurate results than ever before.
Explore the value of our semantic search layer.
© SciBite Limited / Registered in England & Wales No. 07778456