Having been an active member of the Pistoia Alliance since 2016, SciBite has been involved in numerous Pistoia projects. We are delighted to have strengthened this relationship of late, with the appointment of our Head of Ontologies, Jane Lomax, to the Board!
Pistoia events, as anyone who has attended will attest, provide a great opportunity to come together as an industry to share insights, discuss common issues and, most importantly, identify where we can collaborate in a non-competitive setting to deliver solutions that benefit all!
This year’s Spring Conference was the biggest Pistoia event yet, with approximately 300 attendees from over 100 organizations – a massive testament to the Alliance and all involved! Here I summarize some of the key takeaways from a typically engaging and collaborative Pistoia event.
The conference touched on a variety of important topics, including a healthy panel discussion on the importance of diversity and inclusion, both in supporting creativity and in ensuring representative trial recruitment. Other areas included: data – its ownership, accessibility, and management; ontologies and standards to support data management, particularly the FAIRification thereof; and the use of technology, such as artificial intelligence, to facilitate data analysis.
All topics supported the common goal of Alliance members: accelerating the extraction of insight. Gabriel Boronat, from Janssen, built on this, introducing a thought-provoking concept: a marketplace to connect insights from various departments within an enterprise.
Ok, so what was covered? Let’s start with the data. There are still common problems with data, particularly around ownership, governance, and security, which, if not addressed, can compound data debt. Many of these issues appear to stem from scientists’ reluctance to share – but fear not, this restrictive culture can be addressed, whether through education, incentivization, and change management… or, more abruptly, via a mandate from above! Multimodal data (that is, data spanning different types and contexts) is also referenced more and more; handling image data, sequence data, quantitative data, and associated metadata is just as important as handling textual data.
One of the main aims of Achim Plueckebaum, via the Data42 initiative at Novartis, was to slash the effort associated with accessing, managing, and FAIRifying clinical data in the business – knowing that this could massively improve the ability to unlock hypotheses that would otherwise remain hidden.
An understanding of the importance of quality, foundational data management, achieved by implementing the FAIR principles, is ubiquitous across the Alliance. This is borne out by the notable lack of talks (bar Pablo Millan’s from AstraZeneca) that defined the FAIR principles – highlighting the fact that the industry gets it and doesn’t need reminding. Instead, concepts such as FAIRgreen, a FAIR-e(nough) framework, and metrics for quantifying how FAIR a dataset is were discussed, each building on the premise that FAIRification is a must – an effort that depends on standards.
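To make the idea of "metrics for how FAIR a dataset is" concrete, here is a minimal, purely illustrative sketch: a toy scorer that checks a dataset's metadata against one simple proxy check per FAIR letter. The field names and checks are hypothetical assumptions for illustration; real FAIR maturity indicators are far richer and community-defined.

```python
# Illustrative only: a toy FAIR "score" over dataset metadata.
# Field names and checks are hypothetical; real FAIR maturity
# indicators are far more thorough than this sketch.

def fair_score(meta: dict) -> float:
    """Return the fraction of simple FAIR proxy checks this metadata passes."""
    checks = {
        # Findable: a globally unique, persistent identifier is present
        "findable": meta.get("id", "").startswith("doi:"),
        # Accessible: a resolvable access URL is recorded
        "accessible": "access_url" in meta,
        # Interoperable: terms are drawn from a named ontology
        "interoperable": bool(meta.get("ontology_terms")),
        # Reusable: an explicit licence is attached
        "reusable": "licence" in meta,
    }
    return sum(checks.values()) / len(checks)

dataset = {
    "id": "doi:10.1234/example",
    "access_url": "https://example.org/data/42",
    "ontology_terms": ["MONDO:0005148"],
    # no licence recorded, so one check fails
}
print(fair_score(dataset))  # 3 of 4 checks pass -> 0.75
```

Even a checklist this crude shows why such metrics are useful: they turn "is this dataset FAIR?" into a measurable, improvable number, and they make the dependence on standards explicit (the interoperability check only means something if the terms come from a shared ontology).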
Moving on to ontologies and standards. Whilst progress is being made across the industry, supported directly by the Alliance, in creating and extending standards (one need only look at the great work of the SEED project for an example), there still appear to be gaps in how best to utilize specific, more complex ontologies.
The IDMP work, an area that has made great strides in recent times, highlights the necessity of making standards accessible to all, in a format that can be easily understood; removing technical barriers and democratizing ontologies is paramount to their being embraced. Accessibility of these ontologies is also hugely important when it comes to articulating the ROI of investing in an ontology-driven, data-centric approach.
Among others, the Data42 initiative and the ARC project (introduced by Maren Monnich, GSK) aim to provide data scientists with an accessible platform on which to perform data analysis. Many techniques for data analysis are utilized throughout the industry, including… wait for it… artificial intelligence (AI).
Unless you have lived under a rock for the last 12 months, you will be aware of, and more than likely have played with, ChatGPT. Mentions of ChatGPT, along with other large language models (LLMs) and machine learning (ML) approaches, were scattered throughout the day, with the afternoon panel talking a little more about the future of AI. It appears to be generally accepted by the evidence-based audience that although AI and ML can bring great improvements to the analysis of data, they need to be handled with care.
The adage “garbage in, garbage out” has evolved somewhat into “misinformation in, misinformation out,” particularly in the context of LLMs. Although the excitement around LLMs is clear to see, it is also clear that smaller, more explainable, and more deployable models are the approach of choice – a view supported by some of the applications of AI mentioned by Richard Jackson of AstraZeneca during an informative Q&A session, and by the fantastic talk given by Reinhard Pietzsch, who spoke of AI in the context of pharmacovigilance.
I was honoured to stand up and talk to the audience about how we, at SciBite, can support data-centric approaches to R&D, particularly in the context of clinical trial data – a topic that aligned nicely with the morning sessions from Gabriel Boronat and Achim Plueckebaum.
It’s always great to get face-to-face with the community and talk over the latest issues and trends. It’s clear that the Alliance understands the value of FAIR data… but communicating the added value of ontologies, and making those ontologies accessible to all, remain clear points to consider. I look forward to the next!
Please reach out if you have any questions or comments about the above, or would like to hear more on the topics touched on during my presentation!
Joe Mullen, Director of Science & Professional Services. Joe holds a Ph.D. from Newcastle University in the development of computational approaches to drug repositioning, with a focus on semantic data integration and data mining. He has been with SciBite since 2017, initially as part of the Data Science team.