The work presented here describes the winning entry to a challenge set by ARM, Atos and Cavium at the Wellcome Genome Campus Hackathon 2018.
Image Source: Paul Rogers
Essential to the success of the project was our team of fellow Hackathon delegates from Cambridge University, Copenhagen University, KANO Computing, King’s College London and Red Hat, each offering unique skills and insights into how the challenge could be addressed.
Hackathon participants were challenged to develop innovations in the use of mobile technology for biological data processing. The challenge sponsors highlighted the fact that while our everyday mobile devices frequently make use of compute-intensive techniques like image processing and machine learning, there remains a notable lack of biodata applications for this type of hardware. Almost invariably, biological processing is performed in centralized laboratories and data centers. But, with the capacity of hardware continuously increasing, this is ceasing to be a necessity for many tasks.
One such computationally intensive biodata task is genetic sequencing. By looking at an individual’s genome, scientists are able to predict susceptibility to an array of illnesses, as well as adverse events in response to certain medications. However, the genome is, for all intents and purposes, static. Once it has been sequenced, there is no need to sequence it again, and so there seems to be limited use in making this technology portable.
As humans, however, we are not alone; within each of us is an ecosystem forever in flux. It is a commonly cited figure that microbial cells outnumber human cells in the body by a ratio of 10:1. More recent research places the ratio much closer to 1:1, but what remains undisputed is that the composition of this ecosystem, known as the microbiome, can have a profound impact on our health and our reaction to medications.
When the microbiome and the body become desynchronized, the results can be deeply unpleasant, and none are more conscious of this than those who suffer from IBDs (inflammatory bowel diseases). IBDs used to have relatively low incidence rates, but the Western world in particular has seen those rates rise dramatically over the past 50 years. It is clear that, while an individual’s genetics play an important role in their susceptibility to these diseases, the environment is also having an enormous impact. Sufferers of these diseases tend to have microbiome profiles significantly different from those of the general population, and ongoing research aims to target the microbiome with therapies in order to rectify these imbalances.
This informed the basis of team GoGut’s project. If we could deliver microbiome sequencing pipelines and analysis directly into the hands of users, it would empower them to monitor their own microbiome and quickly react to downward trends or identify positive stimuli. What’s more, as data is accumulated it will become more and more feasible to recommend lifestyle, diet or medication changes to pursue (or avoid) based on the reaction of other users with similar distributions of gut flora. This will also enable less invasive diagnostics and monitoring, and generate vital data for clinicians.
The problem from a patient’s perspective can be summarised in the following statement:
In a recent news story, for example, medical students complained that they were not being taught about nutrition, despite the fact that diet is suspected to play a huge role in the recent growth in the number of IBD sufferers. This gap in teaching is partly due to the difficulty of controlling variables when studying diet, and to the complexity of the effects of foods on the body. One result is that many foods have now been implicated in both causing and preventing cancer. The data generated by individuals monitoring their microbiomes in their own homes will enable doctors to recommend dietary changes more confidently.
By connecting a portable sequencer (such as those produced by Nanopore) to an embedded device containing a pipeline augmented with AI analytics, a user would be able to perform the assessment procedure, from start to finish, from the comfort of their own home, and then monitor the progression of their illness as they attempt to get it under control. They could be helped by a suggestive model, recommending changes tailored to their own microbiome, which would become more and more powerful as more people adopted the platform.
Team GoGut therefore tested the feasibility of porting a complete metagenomics pipeline for analysing an individual’s microbiome onto ARM’s latest portable architecture. This pipeline filters out human DNA, aligns the remaining DNA to microbial species and then calculates the distribution of those species. It can also be used to analyse the transcriptomic data of those species in order to assess what the microbes are doing, allowing more fine-tuned analysis. With the code successfully ported by the software engineers of the team, SciBite’s Oliver Giles implemented a neural network that receives the output of the pipeline, in the form of a proportional profile of microbe species, and predicts disease status (Figs 1-2).
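To make the "proportional profile" concrete, here is a minimal Python sketch of the final step of such a pipeline. It is not the team's code: the input format (one genus label per aligned read), the `human_label` marker and the function name `proportional_profile` are all assumptions made for illustration, and the alignment itself is taken as already done.

```python
from collections import Counter


def proportional_profile(read_taxa, human_label="Homo"):
    """Convert per-read genus assignments into relative abundances.

    read_taxa   -- iterable of genus names, one per aligned read
                   (hypothetical input format, not the team's actual one)
    human_label -- label marking host reads, which are discarded to
                   mirror the human-DNA filtering step
    """
    counts = Counter(t for t in read_taxa if t != human_label)
    total = sum(counts.values())
    if total == 0:
        return {}  # no microbial reads left after filtering
    return {genus: n / total for genus, n in counts.items()}


# Illustrative reads only; real input would come from the aligner.
reads = ["Bacteroides", "Homo", "Bacteroides",
         "Faecalibacterium", "Escherichia", "Homo"]
profile = proportional_profile(reads)
```

The resulting dictionary sums to 1 across genera, which is the kind of fixed-scale input a classifier can consume regardless of how many reads a given sample produced.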
Figure 1. Neural network diagram. The neural network accepts a microbiome profile, which is fed through hidden layers to extract patterns used to predict disease (IBD) status.
Figure 2. Microbiome composition over time. The genus-level makeup of the microbiome (a single column) is graphed over time with disease status (overlaid in black) in order to assess the impact lifestyle changes have on disease status.
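The flow described in Figure 1 (profile in, hidden layer, disease probability out) can be sketched in a few lines of plain Python. This is only an illustration of the general technique: the layer sizes, the ReLU and sigmoid activations, and every weight in `toy_params` are made-up assumptions, not the architecture or trained parameters used at the Hackathon.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_ibd_probability(profile_vec, params):
    """Forward pass: profile -> one hidden layer -> probability in (0, 1)."""
    hidden = relu(dense(profile_vec, params["W1"], params["b1"]))
    logit = dense(hidden, params["W2"], params["b2"])[0]
    return sigmoid(logit)


# Hypothetical, untrained weights for a 3-genus profile (illustration only).
toy_params = {
    "W1": [[1.5, -2.0, 0.5], [-1.0, 1.0, 2.0]],
    "b1": [0.0, 0.1],
    "W2": [[-1.2, 2.3]],
    "b2": [-0.4],
}
profile_vec = [0.5, 0.25, 0.25]  # relative abundances, summing to 1
p = predict_ibd_probability(profile_vec, toy_params)
```

In practice the weights would be learned from labelled control and IBD samples, and the input vector would have one entry per genus or species reported by the pipeline.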
With a very small dataset of files from healthy control patients and IBD-affected patients fed through the pipeline, the neural network was able to predict an individual’s IBD status with over 90% accuracy. This could be improved upon greatly with more time and data. A score derived from the output of the neural network can then be tracked over time, allowing individuals to observe the effect of lifestyle or medication changes on their disease. The next step would be to use the data generated in this process to craft suggestive models, offering individuals advice based on what has been effective in sufferers with microbiome profiles similar to their own.
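As a rough illustration of tracking such a score, the Python sketch below smooths a series of per-sample scores and compares the average before and after a lifestyle change. The smoothing window, the function names and the example scores are all hypothetical; the post does not say how the team's score was derived or summarised.

```python
from statistics import mean


def rolling_mean(scores, window=3):
    """Trailing moving average; early points use however many samples exist."""
    return [mean(scores[max(0, i - window + 1): i + 1])
            for i in range(len(scores))]


def change_after(scores, change_index):
    """Mean score after a change minus mean score before it.

    change_index is the position of the first sample taken after the
    change; a negative result means the score fell afterwards.
    """
    before, after = scores[:change_index], scores[change_index:]
    if not before or not after:
        raise ValueError("need at least one sample on each side of the change")
    return mean(after) - mean(before)


# Made-up scores: network output per sample, with a diet change before sample 3.
scores = [0.8, 0.8, 0.6, 0.4, 0.4]
smoothed = rolling_mean(scores, window=2)
delta = change_after(scores, change_index=2)
```

A simple before-and-after difference like this cannot separate the effect of the change from natural fluctuation, which is one reason pooled data from many users would matter for any recommendation model.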
Thanks go out to everybody from team GoGut, the ARM and Atos teams, the mentors and the campus team for hosting a fantastic event.