How Artificial Intelligence Can Transform the Diagnosis of a Polygenic Disorder
It’s been a while since I last posted here - life has been a bit busy with school, exams, and a few other projects I’ve been giving more time to. I do have a couple of posts in the works that I’m hoping to publish soon, so this blog definitely hasn’t been forgotten!
This post is slightly different from what I usually write. I wrote this for an article competition for the Penrose Magazine (for which I am now a Biology ambassador!) and it's my first piece of more formal academic writing. The idea actually came from one of my earlier blog posts, where I first explored the intersection of AI and hypermobile Ehlers-Danlos Syndrome (hEDS). This version is more detailed and goes much deeper into the science - I've focused more on research and analysis, and less on personal reflection.
Writing this taught me a lot - from how to structure a scientific argument more clearly, to using IEEE-style citations and engaging with sources in a more rigorous way. It was definitely a challenge, but one I really enjoyed, and it’s made me more confident about exploring scientific writing in the future.
I’d love to know what you think :)
Exploring Hypermobile Ehlers-Danlos Syndrome: How Artificial Intelligence Can Transform the Diagnosis of a Polygenic Disorder
Genetic disorders are complex, and current diagnostic tools cannot always detect or explain them effectively. Improved understanding enables faster, more accurate diagnosis and treatment. Hypermobile Ehlers-Danlos Syndrome (hEDS) is a connective tissue disorder that is often invisible.
The problem with hEDS is that it lacks a known genetic marker, and its symptoms vary widely between individuals, overlap with other conditions, and may be subtle or intermittent. These factors contribute to diagnostic uncertainty and delay. Artificial Intelligence can be a useful solution for this, as it has the capacity to process huge amounts of data and identify patterns, enabling it to provide insights at a scale and speed beyond human capability [1].
Connective tissue provides structural support and cohesion between cells, tissues and organs, and it contributes to a large number of body functions, such as transporting nutrients, defending against pathogens, storing fat, and repairing damaged tissues [2]. Connective tissue consists of an extracellular matrix (a protein network that supports and structures cells and tissues in the body) and various cells. Most connective tissues are composed of ground substance, fibres, and cells, although some specialised fluid connective tissues do not have fibres [3]. Because connective tissue is found throughout the body, hEDS can cause widespread pain, fatigue, and joint instability, and can affect organ tissue, skin, digestion, and even brain function. Despite its wide-ranging impact, diagnosis is often delayed by years due to reliance on symptom-based criteria, overlap with other conditions and a lack of definitive tests [4]. Because it is invisible both to the eye and often to current scientific tools, hEDS remains misunderstood and frequently misdiagnosed. Current estimates suggest that it affects 1 in 3000 people; however, this is now thought to be an underestimate [5]. There are 13 recognised types of Ehlers-Danlos Syndrome, and hEDS is the only one without a currently identified genetic mutation [6].
Despite ongoing research into the genetic basis of hEDS, no single causative mutation has been identified. Scientists think that hEDS is likely to be polygenic [7], meaning it is caused by a combination of multiple gene variants, each contributing in a small way. Consequently, research into hEDS is complicated by the small individual effects of each variant and the absence of a single causative mutation. Additionally, many candidate variants are also common among people in the general population who do not have hEDS, which makes it difficult to identify which gene variants are linked to the condition [8].
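To make the idea of "many small contributions" more concrete, here is a minimal Python sketch of an additive polygenic score. The variant names and effect weights are invented purely for illustration; no validated weights exist for hEDS, and real scores are built from far more variants.

```python
# Hypothetical per-variant effect weights (invented for illustration only;
# no validated weights exist for hEDS).
EFFECT_WEIGHTS = {
    "variant_A": 0.05,
    "variant_B": 0.03,
    "variant_C": 0.08,
    "variant_D": 0.02,
}


def polygenic_score(genotype: dict[str, int]) -> float:
    """Sum of (copies of the risk allele) x (effect weight) over all variants.

    `genotype` maps a variant name to the number of risk-allele copies
    a person carries (0, 1 or 2). Variants missing from it count as 0.
    """
    score = 0.0
    for variant, weight in EFFECT_WEIGHTS.items():
        copies = genotype.get(variant, 0)
        if copies not in (0, 1, 2):
            raise ValueError(f"{variant}: expected 0, 1 or 2 copies, got {copies}")
        score += copies * weight
    return score


# Two made-up people with different combinations of the same variants.
person_1 = {"variant_A": 2, "variant_B": 1, "variant_C": 0, "variant_D": 1}
person_2 = {"variant_A": 0, "variant_B": 0, "variant_C": 1, "variant_D": 0}

print(round(polygenic_score(person_1), 2))  # 0.15
print(round(polygenic_score(person_2), 2))  # 0.08
```

No single variant decides the outcome here: each one nudges the total slightly, which is why a variant that is common in healthy people can still be part of the picture.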
Additionally, hEDS isn’t a straightforward condition as symptoms can vary widely between individuals. Symptoms can be represented on a circular spectrum as shown in Figure 1 (below).
Different people present different levels of symptoms such as joint instability, fatigue, brain fog, chronic pain, digestive issues, etc. This symptom heterogeneity means that no two people experience hEDS in exactly the same way, and such varied symptom profiles make the condition difficult to study with traditional tools [9].
In genetic research, scientists need to clearly define who does and does not have a condition so that they can compare groups accurately. Symptom heterogeneity complicates this because patients may be grouped together despite differing underlying genetic and biological contributors [10]. Additionally, when a study includes patients with varying symptoms, the extra variation in the data can hide any shared genetic signals and make patterns harder to see [11]. Therefore, even if certain gene variants do contribute to hEDS, traditional genetic studies may fail to detect them.
Another reason why hEDS resists traditional genetic research is small sample sizes. Genetic effects are usually small, so very large datasets are needed to detect them. If only a few people are studied, researchers are unable to tell whether a genetic pattern is just random chance or actually linked to the condition [12]. In hEDS research, studies often have small sample sizes because hEDS is underdiagnosed and misdiagnosed, and strict diagnostic criteria reduce the number of eligible participants. Additionally, research funding and recruitment are limited. All of this means that researchers may only have dozens or hundreds of patients instead of tens of thousands.
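The link between effect size and sample size can be sketched with a standard textbook approximation for comparing two proportions. The carrier frequencies below are invented, and the function name `cases_needed` is my own; this is a rough illustration rather than how any particular hEDS study was designed.

```python
from math import ceil
from statistics import NormalDist


def cases_needed(p_control: float, p_case: float,
                 alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of people needed in EACH group (cases and controls)
    to detect a difference in how often a variant is carried, using the
    normal-approximation formula for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = p_control * (1 - p_control) + p_case * (1 - p_case)
    n = (z_alpha + z_power) ** 2 * variance / (p_case - p_control) ** 2
    return ceil(n)


# A large effect: carried by 30% of controls but 50% of cases (invented numbers).
n_large_effect = cases_needed(0.30, 0.50)

# A small effect: carried by 30% of controls but 32% of cases (invented numbers).
n_small_effect = cases_needed(0.30, 0.32)

print(n_large_effect, n_small_effect)
```

Under these assumptions the large effect needs roughly 90 people per group, while the small effect needs more than 8,000 per group. Testing many variants at once requires stricter significance thresholds, which pushes the numbers higher still.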
Combining symptom heterogeneity with small sample sizes results in weak patterns. If each person has different symptoms and possibly different genetic contributors, shared genetic signals become faint, and patterns do not repeat often enough to stand out from random noise. Statistical methods also become less reliable, as they depend heavily on repetition and consistency. Small, diverse groups reduce statistical power and increase false negatives, which means that even if a gene does matter, the study might miss it.
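A rough calculation shows how these two problems compound. The sketch below assumes a hypothetical variant that is enriched only in one subtype of patients, while the remaining patients carry it as often as controls do. All frequencies, the group size of 200 and the function names are assumptions made for illustration, and the power formula is a simple normal approximation.

```python
from statistics import NormalDist


def approx_power(n_per_group: int, p_control: float, p_case: float,
                 alpha: float = 0.05) -> float:
    """Approximate chance that a two-sided two-proportion z-test detects
    the difference between p_control and p_case."""
    variance = p_control * (1 - p_control) + p_case * (1 - p_case)
    standard_error = (variance / n_per_group) ** 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_case - p_control) / standard_error - z_alpha)


def diluted_case_frequency(p_control: float, p_subtype: float,
                           fraction_with_subtype: float) -> float:
    """Carrier frequency across ALL cases when only a fraction of them belong
    to the subtype in which the variant is enriched."""
    return (fraction_with_subtype * p_subtype
            + (1 - fraction_with_subtype) * p_control)


P_CONTROL = 0.30    # invented carrier frequency in controls
P_SUBTYPE = 0.45    # invented carrier frequency in the affected subtype
N_PER_GROUP = 200   # a small study: 200 cases and 200 controls

powers = []
for fraction in (1.0, 0.5, 0.25):
    p_case = diluted_case_frequency(P_CONTROL, P_SUBTYPE, fraction)
    power = approx_power(N_PER_GROUP, P_CONTROL, p_case)
    powers.append(power)
    print(f"{fraction:.0%} of cases in subtype -> power {power:.2f}")
```

With these made-up numbers, the chance of detecting the variant falls from close to 0.9 when every case belongs to the subtype to around 0.1 when only a quarter do, even though the variant's true effect within the subtype never changes.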
Currently, genetic research relies on whole genome sequencing (WGS), which reads every base pair of DNA, or whole exome sequencing (WES), which only looks at the 2% of DNA that codes for proteins. WGS allows a wide range of variant types in a large number of genes to be tested simultaneously, making it the most comprehensive genomic test available [13]. WES tests a much smaller region of the genome, so it is faster and generates fewer variants of uncertain significance than WGS [14]. When using WES and WGS in hEDS research, researchers compare the DNA of patients with hEDS to that of healthy controls to try to find patterns. However, this is a slow and limited process, relying heavily on manual comparison and strong statistical signals, which polygenic conditions often do not produce. With complex conditions like hEDS, where there are large variations between patients, it is extremely difficult to draw meaningful conclusions with traditional tools. WGS generates large numbers of variants, which traditional analyses filter individually based on predefined assumptions, limiting discovery in complex disorders. These tools are very powerful, but not ideal for polygenic and heterogeneous conditions like hEDS.
Artificial Intelligence offers a potential approach to address the challenges of analysing polygenic and heterogeneous conditions. Machine learning is suited to polygenic conditions because it can analyse thousands of genomes simultaneously to identify combinations of variants rather than single mutations [15]. It looks for patterns across many weak signals rather than one strong signal, which is exactly what hEDS research requires, and it may detect small combinations of genetic variants correlating with hEDS symptoms that humans are unable to see. AI is also able to go beyond genetics alone by combining genetic data with clinical records, imaging, joint flexibility scores, and symptom histories. Traditional methods usually analyse these separately, and combining different factors takes much more time and effort than it would with AI [16]. Since symptoms do not necessarily map cleanly onto genes, AI can be a powerful tool for linking genetic profiles to specific symptom clusters and identifying patterns across different data types simultaneously. Incorporating longitudinal data, such as repeated clinical measurements over time, could further improve AI predictions by allowing models to track symptom progression and changes in clinical features that static snapshots of patient data may miss. This is particularly valuable for fluctuating conditions such as hEDS [17]. However, the performance of AI models depends on the quality and diversity of the data they are trained on. Incomplete, biased or unrepresentative datasets can lead to inaccurate predictions, misclassification, or underdiagnosis in certain groups. Ensuring high-quality and representative datasets is therefore crucial for reliable AI-assisted diagnosis.
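The advantage of combining many weak signals can be illustrated with a small simulation. To be clear, this is not a trained machine-learning model: it simply adds up risk-allele counts with equal weight, on entirely synthetic data with invented allele frequencies. It is only meant to show why looking across many variants at once can separate groups that no single variant can.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_VARIANTS = 200      # hypothetical number of contributing variants
N_PER_GROUP = 300     # simulated cases and simulated controls
FREQ_CONTROL = 0.30   # risk-allele frequency in controls (invented)
FREQ_CASE = 0.35      # slightly higher frequency in cases (invented)


def simulate_person(allele_freq):
    """Return the risk-allele count (0, 1 or 2) for each variant."""
    return [
        (random.random() < allele_freq) + (random.random() < allele_freq)
        for _ in range(N_VARIANTS)
    ]


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control
    (ties count as half). A value of 0.5 is no better than chance."""
    wins = 0.0
    for case_score in case_scores:
        for control_score in control_scores:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


cases = [simulate_person(FREQ_CASE) for _ in range(N_PER_GROUP)]
controls = [simulate_person(FREQ_CONTROL) for _ in range(N_PER_GROUP)]

# One variant on its own: a weak signal.
single_auc = auc([p[0] for p in cases], [p[0] for p in controls])

# All variants added together: many weak signals combined.
combined_auc = auc([sum(p) for p in cases], [sum(p) for p in controls])

print(f"single variant AUC: {single_auc:.2f}")
print(f"combined score AUC: {combined_auc:.2f}")
```

In this toy setup the single variant separates cases from controls only slightly better than a coin flip, while the combined score separates them far more clearly. Real machine-learning models go further by learning different weights for each variant and by including non-genetic data, but they also need far more care to avoid overfitting.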
The use of AI in research could potentially address the heterogeneity problem through ‘unsupervised learning’, which clusters patients based on shared patterns in their data rather than on predefined diagnostic labels [18]. This may even reveal subtypes of hEDS, giving better-defined groups of patients with similar biology and resulting in more targeted research and treatment. Current diagnostic criteria may group together biologically different patients due to heterogeneity, but subtypes could reflect different underlying mechanisms. This matters because it could have real-world impact. AI models trained on known cases could flag patients earlier than human doctors are able to, especially patients with atypical presentations of hEDS. For example, an AI model could analyse a patient’s genetic profile alongside clinical history and symptom patterns, and flag potential hEDS cases even when joint instability or fatigue alone might not trigger a referral. This could alert clinicians to investigate further, reducing the diagnostic delay commonly reported by patients. This is particularly relevant for conditions like hEDS that are invisible illnesses and often misdiagnosed.
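As a minimal sketch of what clustering without predefined labels looks like, here is a plain k-means implementation in Python. The six "patients", their two symptom scores (both on an invented 0 to 10 scale) and the starting centroids are all made up for illustration; real studies would use many more features and patients, and usually an established library rather than hand-written code.

```python
from math import dist


def k_means(points, centroids, n_iterations=10):
    """Plain k-means: repeatedly assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        converged = new_centroids == centroids
        centroids = new_centroids
        if converged:
            break
    return labels, centroids


# Invented (joint_instability_score, fatigue_score) pairs for six patients.
patients = [(8, 2), (9, 3), (8, 3), (3, 8), (2, 9), (3, 9)]

# Start from two of the patients as initial centroids (a simple, fixed choice).
labels, centroids = k_means(patients, [patients[0], patients[3]])

print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm is never told which patients belong together, yet it separates a group dominated by joint instability from a group dominated by fatigue. Whether such groups reflect real biological subtypes is a separate question that would need clinical and genetic validation.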
Hypermobile Ehlers-Danlos Syndrome presents a major challenge for genetic research because it is likely a polygenic condition with heterogeneous symptoms. Traditional genetic approaches are powerful for single-gene disorders; however, they struggle to identify patterns in complex and variable conditions such as hEDS. Artificial Intelligence offers a promising solution by enabling pattern-based analysis across large datasets. By combining genetic data with clinical data, imaging and symptom information, AI has the potential to uncover hidden relationships, reduce diagnostic uncertainty, and perhaps identify biological subtypes. In the future, AI-driven research tools could support patients with hEDS. However, deploying AI in clinical settings faces challenges, such as integration with electronic health records, clinician training, regulatory approval, and maintaining patient privacy and data security. Even though much further research and validation are needed, AI represents a powerful step towards improving understanding, diagnosis, and outcomes for individuals living with this often misunderstood condition.
Citations reference list:
[1] S. Joksimovic, D. Ifenthaler, R. Marrone, M. De Laat, and G. Siemens, “Opportunities of artificial intelligence for supporting complex problem-solving: Findings from a scoping review,” Computers and Education: Artificial Intelligence, vol. 4, no. 100138, p. 100138, 2023, doi: https://doi.org/10.1016/j.caeai.2023.100138.
[2] P. Kamrani, A. Jan, G. Marston, and T. C. Arbor, “Anatomy, Connective Tissue,” National Library of Medicine, Mar. 05, 2023. https://www.ncbi.nlm.nih.gov/books/NBK538534/
[3] P. Kamrani, A. Jan, G. Marston, and T. C. Arbor, “Anatomy, Connective Tissue,” National Library of Medicine, Mar. 05, 2023. https://www.ncbi.nlm.nih.gov/books/NBK538534/
[4] “How do I get diagnosed with Ehlers-Danlos Syndrome? | The EDS Clinic,” Eds.clinic, 2024. https://www.eds.clinic/articles/how-to-official-eds-diagnosis?
[5] “Are the Ehlers-Danlos Syndromes and Hypermobility Spectrum Disorders Rare or Common? - The Ehlers Danlos Society,” The Ehlers Danlos Society, Aug. 30, 2024. https://www.ehlers-danlos.com/prevalence/
[6] The Ehlers-Danlos Society, “What is EDS?,” The Ehlers Danlos Society, 2017. https://www.ehlers-danlos.com/what-is-eds/
[7] “hEDS-START study | Institute of Genetics and Cancer,” Institute of Genetics and Cancer, Sep. 23, 2024. https://institute-genetics-cancer.ed.ac.uk/research/research-groups-a-z/ralston-group/heds-start
[8] “The HEDGE Study - The Ehlers Danlos Society,” The Ehlers Danlos Society, Nov. 04, 2025. https://www.ehlers-danlos.com/the-hedge-study/
[9] “Challenges and Progress in Diagnosing Ehlers-Danlos Syndrome,” Rheumatology Advisor, Oct. 03, 2025. https://www.rheumatologyadvisor.com/features/diagnosing-ehlers-danlos-syndrome/
[10] E. Feczko, O. Miranda-Dominguez, M. Marr, A. M. Graham, J. T. Nigg, and D. A. Fair, “The Heterogeneity Problem: Approaches to Identify Psychiatric Subtypes,” Trends in Cognitive Sciences, vol. 23, no. 7, pp. 584–601, Jul. 2019, doi: https://doi.org/10.1016/j.tics.2019.03.009.
[11] M. Jackson, L. Marks, G. H. W. May, and Joanna B. Wilson, “The genetic basis of disease,” Essays In Biochemistry, vol. 62, no. 5, pp. 643–723, Dec. 2018, doi: https://doi.org/10.1042/ebc20170053.
[12] C. C. Serdar, M. Cihan, D. Yücel, and M. A. Serdar, “Sample size, Power and Effect Size revisited: Simplified and Practical Approaches in pre-clinical, Clinical and Laboratory Studies,” Biochemia Medica, vol. 31, no. 1, pp. 27–53, Feb. 2021, doi: https://doi.org/10.11613/bm.2021.010502.
[13] “Whole genome sequencing — Knowledge Hub,” GeNotes. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/whole-genome-sequencing/#advantages-and-limitations-of-wgs
[14] A. Frost, “Whole exome sequencing — Knowledge Hub,” GeNotes, 2016. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/whole-exome-sequencing/#how-does-it-work (accessed Jan. 08, 2026)
[15] G. Soldà and R. Asselta, “Applying artificial intelligence to uncover the genetic landscape of coagulation factors,” Journal of Thrombosis and Haemostasis, vol. 23, no. 4, pp. 1133–1145, Apr. 2025, doi: https://doi.org/10.1016/j.jtha.2024.12.030.
[16] M. Wang, W. Chang, and Y. Zhang, “Artificial Intelligence for the Diagnosis and Management of Cancers: Potentials and Challenges,” MedComm, vol. 6, no. 11, pp. e70460–e70460, Nov. 2025, doi: https://doi.org/10.1002/mco2.70460.
[17] A. Rajkomar, J. Dean, and I. Kohane, “Machine Learning in Medicine,” New England Journal of Medicine, vol. 380, no. 14, pp. 1347–1358, 2019, doi: https://doi.org/10.1056/nejmra1814259.
[18] C. M. Eckhardt et al., “Unsupervised machine learning methods and emerging applications in healthcare,” Knee Surgery, Sports Traumatology, Arthroscopy, vol. 31, no. 2, pp. 376–381, Nov. 2022, doi: https://doi.org/10.1007/s00167-022-07233-7.