Parkinson’s disease (PD) is a progressive neurodegenerative and movement disorder. It affects millions of people worldwide [1], and its prevalence is projected to increase [2]. While dopamine replacement therapy can treat its symptoms effectively, there is no cure for PD. Importantly, by the time PD is diagnosed, patients may have lost over half of their brain’s dopamine-producing cells, and damage to other brain systems may be underway [3]. Early diagnosis is therefore key.
Reliable biomarkers may address this challenge and help in developing accurate tests, ideally even before motor and cognitive symptoms appear [4]. Early intervention would then improve patients’ quality of life and alleviate the burden on carers and health systems.
Biomarker validation is a complex, multi-step process. Regardless of the disease being studied, it requires searching the literature and other documents to investigate the roles that the marker plays in areas of interest. Searching articles with plain text strings for one or more specific genes or diseases may miss knowledge that is closely relevant to the query, ultimately delaying the discovery process.
Figure 1: A sample result of querying SciBite Search for the HSPA5 gene and applying a semantic-based filter as described below.
SciBite’s software tools use semantic enrichment to take biomarker investigation beyond plain text search, helping users to extract and integrate new bodies of knowledge. The building blocks are our VOCabs: comprehensive Named Entity Recognition (NER) dictionaries based on authoritative resources such as MeSH, HGNC and ChEMBL.
They cover a wide range of biomedical domains, from genes and pathways to symptoms, diseases, drugs and therapies, adverse events and more. Thanks to curation by SciBite’s in-house experts, these VOCabs are enriched with synonyms beyond their source ontologies that supercharge your text-searching and knowledge-extraction capabilities.
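To make the idea of synonym-enriched dictionaries concrete, here is a minimal sketch of dictionary-based entity matching in Python. It is not SciBite's implementation or API: the dictionary `TOY_VOCAB`, the entity IDs and the function names are all hypothetical and chosen for illustration. The only domain facts it relies on are that GRP78 and BiP are widely used synonyms of HSPA5.

```python
import re

# Hypothetical toy dictionary (NOT an actual VOCab): entity ID -> known surface forms.
TOY_VOCAB = {
    "GENE:HSPA5": ["HSPA5", "GRP78", "BiP"],
    "DISEASE:PD": ["Parkinson's disease", "Parkinson disease", "PD"],
}


def build_matchers(vocab):
    """Compile one case-sensitive regex per entity, matching any synonym as a whole token."""
    matchers = {}
    for entity_id, synonyms in vocab.items():
        # Try longer synonyms first so they take precedence over shorter ones.
        ordered = sorted(synonyms, key=len, reverse=True)
        alternation = "|".join(re.escape(s) for s in ordered)
        matchers[entity_id] = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
    return matchers


def annotate(text, matchers):
    """Return the sorted entity IDs that have at least one synonym in the text."""
    return sorted(eid for eid, rx in matchers.items() if rx.search(text))


MATCHERS = build_matchers(TOY_VOCAB)

if __name__ == "__main__":
    sentence = "GRP78 expression is altered in Parkinson disease."
    print("HSPA5" in sentence)           # plain string search misses the gene
    print(annotate(sentence, MATCHERS))  # synonym-aware matching finds both entities
```

In this sketch, a sentence that only mentions GRP78 is still annotated with the HSPA5 entity, which is the kind of hit a plain text-string search for "HSPA5" would miss.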
Figure 2: Detail of Search Filters functionality in SciBite Search. In the example, results of a literature search for the HSPA5 gene are filtered to return documents that also mention synucleinopathy diseases.
Let’s look at how SciBite Search, our semantic search solution, can be applied to biomarker validation in PD. We created a sample query using our VOCab entity for the HSPA5 gene, recently proposed as a co-classifier for de novo PD [4].
We then applied a taxonomy search filter, restricting results to articles relevant to synucleinopathies, a group of diseases of which PD is a subtype.
We configured the filter to include results not just for synucleinopathy, but also for diseases that are child nodes of the VOCab entity “Synucleinopathies”.
Figure 3: Detail of the taxonomy tree for Synucleinopathies. Using the search filters pictured above means that all entities in that tree branch are included in the query filter. Diseases in the SciBite INDICATION VOCab are based on the MeSH vocabulary.
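The effect of including child nodes in a filter can be sketched as a simple tree traversal. The example below is an assumption-laden illustration in Python, not SciBite Search code: `TOY_TAXONOMY` is a small hand-written parent-to-children map (not the actual contents of the INDICATION VOCab), and `expand_with_descendants` is a hypothetical helper name.

```python
# Hypothetical parent -> children map, for illustration only.
# It does not reproduce the real taxonomy branch shown in Figure 3.
TOY_TAXONOMY = {
    "Synucleinopathies": [
        "Parkinson Disease",
        "Lewy Body Disease",
        "Multiple System Atrophy",
    ],
    "Multiple System Atrophy": ["Shy-Drager Syndrome"],
}


def expand_with_descendants(root, taxonomy):
    """Return the root entity plus every entity below it in the tree."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against repeated nodes or cycles in the map
        seen.add(node)
        stack.extend(taxonomy.get(node, []))
    return seen


if __name__ == "__main__":
    for name in sorted(expand_with_descendants("Synucleinopathies", TOY_TAXONOMY)):
        print(name)
```

Filtering on the expanded set rather than on the single parent term means that an article mentioning only a child disease still passes the filter.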
The query returned ~1,300 results, starting from a dataset of ~4.5 million full-text articles from PubMed Central, which had been ingested into SciBite Search and indexed with SciBite VOCabs.
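As a rough mental model of how a query can run against a corpus that has already been indexed with entity annotations, consider the following Python sketch of an inverted index. Everything in it is hypothetical: the four toy documents, their annotations and the function names are invented for illustration and say nothing about how SciBite Search is built internally.

```python
from collections import defaultdict

# Hypothetical pre-annotated corpus: document ID -> entities found in that document.
TOY_ANNOTATIONS = {
    "doc1": {"HSPA5", "Parkinson Disease"},
    "doc2": {"HSPA5", "Breast Neoplasms"},
    "doc3": {"SNCA", "Lewy Body Disease"},
    "doc4": {"HSPA5", "Multiple System Atrophy"},
}


def build_inverted_index(annotations):
    """Map each entity to the set of document IDs that mention it."""
    index = defaultdict(set)
    for doc_id, entities in annotations.items():
        for entity in entities:
            index[entity].add(doc_id)
    return index


def search(index, query_entity, filter_entities):
    """Documents mentioning query_entity AND at least one of filter_entities."""
    query_hits = index.get(query_entity, set())
    filter_hits = set()
    for entity in filter_entities:
        filter_hits |= index.get(entity, set())
    return sorted(query_hits & filter_hits)


INDEX = build_inverted_index(TOY_ANNOTATIONS)

if __name__ == "__main__":
    branch = {
        "Synucleinopathies",
        "Parkinson Disease",
        "Lewy Body Disease",
        "Multiple System Atrophy",
    }
    print(search(INDEX, "HSPA5", branch))
```

Because annotation happens once at ingestion time, a query of this kind reduces to set operations on the index rather than a rescan of every article.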
Entities highlighted in yellow are full matches to elements in the query; underlined entities match annotations from a range of VOCabs. The representative search result in the screenshot illustrates:
SciBite Search allows seamless, integrated search on multiple corpora of documents, and supports public data and proprietary information alike.
[1] Parkinson disease (9 August 2023). World Health Organization. Retrieved from https://www.who.int/news-room/fact-sheets/detail/parkinson-disease. Accessed 09.04.2025.
[2] Su, D. et al., Projections for prevalence of Parkinson’s disease and its driving factors in 195 countries and territories to 2050: modelling study of Global Burden of Disease Study 2021. BMJ, 2025 Mar 5;388:e080952. DOI: 10.1136/bmj-2024-080952. Accessed 09.04.2025.
[3] Parkinson’s Disease: Challenges, Progress, and Promise. National Institute of Neurological Disorders and Stroke. Retrieved from https://www.ninds.nih.gov/current-research/focus-disorders/parkinsons-disease-research/parkinsons-disease-challenges-progress-and-promise#toc-resources. Accessed 09.04.2025.
[4] Hällqvist, J. et al., Plasma proteomics identify biomarkers predicting Parkinson’s disease up to 7 years before symptom onset. Nat. Commun., 2024 Jun 18;15(1):4759. DOI: 10.1038/s41467-024-48961-3. Accessed 09.04.2025.
Paola Roncaglia is a skilled bioinformatician and biomedical ontologist based in Trieste, Italy. With over 20 years of experience, she specializes in semantic enrichment and analysis of large-scale biological data. Currently a Scientific Curation Consultant at SciBite, Paola previously worked as a Gene Ontology Developer at the European Bioinformatics Institute. She holds a Ph.D. in Biophysics and Neurobiology and has made significant contributions to the integration of biological data, resulting in numerous impactful publications.