The Enigma of the Genome: How "Junk DNA" Emerged as the Genetic Dark Matter Shaping Our Health
Once dismissed as evolutionary debris, the vast non-coding regions of our genome are now understood to be a complex and vital command center, orchestrating everything from our development to our susceptibility to disease. This is the story of how science's biggest mystery became its most promising frontier.
For decades, a tantalizing puzzle lay at the heart of genetics. When the first glimpses of the human genome became available, scientists were met with a startling revelation: a mere 2% of our DNA contains the recipes for the proteins that build and operate our bodies. The other 98%, a vast and seemingly barren expanse, was initially christened with the dismissive moniker "junk DNA." This so-called genetic detritus was long thought to be a meaningless collection of evolutionary leftovers, a genomic graveyard of failed experiments and parasitic elements.
But as the resolution of our scientific instruments sharpened, a new and far more intricate picture began to emerge. This "junk" was not junk at all. Instead, it was revealed to be a bustling, dynamic, and profoundly important regulatory landscape, a hidden layer of control that dictates how, when, and where our genes are switched on and off. This is the genetic dark matter that, while not shining with the obvious light of protein-coding genes, exerts a powerful gravitational influence on our entire biology. The journey from "junk DNA" to the "dark matter of the genome" is a paradigm shift that has not only rewritten textbooks but is also paving the way for revolutionary new approaches to understanding and treating human disease.
From "Junk" to Gem: A Historical Perspective on Non-Coding DNA
The notion that a significant portion of our genome might be non-functional has its roots in the mid-20th century. Pioneers of population genetics like J.B.S. Haldane and Hermann Muller raised a critical question: if every piece of our DNA was essential, wouldn't the natural rate of deleterious mutations be overwhelmingly high, leading to a genetic load that no species could survive? This led to the early hypothesis that only a small fraction of the genome could be functionally indispensable.
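The arithmetic behind this argument can be sketched in a few lines. The sketch below assumes a standard textbook simplification that is not spelled out in this article: if each genome picks up on average U new deleterious mutations per generation, the population's mutational load is roughly 1 - e^(-U). The function name, the figure of 70 new mutations per generation, and the default assumption that every mutation in functional DNA is harmful are all illustrative choices, so the output is an upper bound meant to show the trend rather than a measured value.

```python
import math


def mutational_load(new_mutations, functional_fraction, deleterious_fraction=1.0):
    """Approximate mutational load as 1 - exp(-U).

    U is the expected number of new deleterious mutations per genome per
    generation: total new mutations, scaled by the share of the genome that
    is functional and the share of hits there that are harmful (assumed 1.0
    by default, which is deliberately pessimistic).
    """
    u = new_mutations * functional_fraction * deleterious_fraction
    return 1.0 - math.exp(-u)


if __name__ == "__main__":
    # 70 new mutations per generation is an illustrative round number.
    for fraction in (0.02, 0.10, 0.80):
        load = mutational_load(70, fraction)
        print(f"functional fraction {fraction:.0%}: load = {load:.6f}")
```

The larger the functional fraction, the closer the load climbs toward 1, which is the intuition behind the early suspicion that only a modest share of the genome could be indispensable.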
The term "junk DNA" was formally introduced into the scientific lexicon in the 1960s, but it was geneticist Susumu Ohno who popularized it in his seminal 1972 paper, "So much 'junk' DNA in our genome." Ohno's argument was multifaceted. He pointed to the "C-value paradox," the perplexing observation that the size of an organism's genome does not correlate with its biological complexity. For instance, some amphibians and plants have genomes many times larger than humans. This suggested that the excess DNA in these organisms was not contributing to their complexity and was therefore likely non-functional.
Ohno also expanded on his theory of evolution by gene duplication, proposing that much of this "junk" was the fossilized remains of genes that had been duplicated and then silenced by mutations over evolutionary time. In his view, our genome was akin to an archaeological site, strewn with the "fossil remains of extinct genes."
The discovery of introns in the 1970s seemed to bolster the "junk DNA" theory. Introns are non-coding sequences that interrupt the protein-coding regions (exons) of genes and are spliced out before a protein is made. The existence of these lengthy intervening sequences further suggested that much of our DNA was not directly contributing to the final protein product.
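As a rough illustration of what "spliced out" means, the sketch below treats a gene transcript as a string and keeps only the exon intervals. The sequence and coordinates are invented for the example, and real splicing is carried out by the spliceosome reading sequence signals rather than by a list of coordinates.

```python
def splice(pre_mrna, exons):
    """Join exon intervals in order and drop everything between them.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript: exon 1, one intron, exon 2 (all sequences are made up).
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"  # introns typically begin with GU and end with AG
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

mature = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature)  # the two exons joined, intron removed
```

In this toy case half of the transcript never reaches the final message, which is the kind of observation that made introns look like wasted sequence at the time.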
Adding to this picture was the discovery of transposable elements (TEs), or "jumping genes." These are DNA sequences that can move from one location in the genome to another. In the 1980s, it became clear that a huge portion of the repetitive DNA in the human genome was made up of these TEs. Scientists like Francis Crick and Leslie Orgel championed the idea that these were "selfish DNA" elements, acting as parasites within our genome, propagating themselves without providing any benefit to the host.
However, even during the heyday of the "junk DNA" concept, there were dissenting voices. Many scientists were uncomfortable with the idea that the vast majority of our genetic material was useless. They proposed that this non-coding DNA might have other, more subtle functions, such as regulating gene expression or maintaining the structure of our chromosomes. As we will see, these early inklings of a more complex reality were remarkably prescient.
The ENCODE Revolution: A New Encyclopedia of the Genome
The early 21st century marked a major turning point in our understanding of the genome. The completion of the Human Genome Project in 2003 provided the first comprehensive look at our genetic blueprint, but it also highlighted how little we knew about the function of the vast non-coding regions. To address this, the National Human Genome Research Institute (NHGRI) launched the ENCyclopedia Of DNA Elements (ENCODE) project. This ambitious international collaboration aimed to create a "parts list" of all the functional elements in the human genome.
The initial phase of ENCODE, a pilot project focusing on just 1% of the genome, yielded surprising results. Published in 2007, the findings provided compelling evidence that the genome was "pervasively transcribed," meaning that the vast majority of its DNA was being copied into RNA, not just the protein-coding genes.
The full-scale ENCODE project, the results of which were published in a flurry of papers in 2012, went even further. The consortium, which had grown to include hundreds of scientists from around the world, had analyzed 147 different cell types using a variety of high-throughput techniques. Their bombshell conclusion was that a staggering 80% of the human genome has some kind of "biochemical function." This finding was widely reported in the media as the death knell for "junk DNA."
The ENCODE project identified millions of "gene switches" – regulatory elements that control when and where genes are turned on and off. These findings had profound implications for our understanding of disease. For instance, it was already known that about 90% of the genetic variations associated with common diseases are located outside of protein-coding genes. ENCODE's work suggested that these disease-associated variants were likely disrupting the function of these newly discovered regulatory elements.
However, the "80% functional" claim was not without its critics. A heated debate erupted in the scientific community over what it means for a piece of DNA to be "functional." The ENCODE consortium used a broad definition of function, classifying any part of the genome that showed "specific biochemical activity," such as being transcribed into RNA or having a protein bind to it, as functional.
Critics, however, argued for a stricter, evolutionary definition of function based on natural selection. They contended that for a DNA sequence to be truly functional, its sequence must be conserved over evolutionary time, indicating that it is important for survival. They pointed out that only about 5-10% of the human genome is evolutionarily conserved, a far cry from ENCODE's 80% figure. This discrepancy became known as the "ENCODE incongruity."
Proponents of the ENCODE findings countered that the project was a biomedical discovery effort, not an evolutionary one, and that its goal was to identify all biochemically active regions that might be relevant to human health. They argued that even if a sequence is not strictly conserved, its activity could still have important biological consequences.
Despite the controversy, the ENCODE project fundamentally shifted the way we view the genome. It moved the conversation away from the simplistic "junk" versus "not junk" dichotomy and towards a more nuanced understanding of the genome as a complex, multi-layered information processing system. It laid the groundwork for a new era of research into the "dark matter" of the genome and its profound influence on our health.
The Dark Matter's Secrets: Unveiling the Functions of Non-Coding DNA
The revelations from the ENCODE project and subsequent research have unveiled a dazzling array of functions for the once-dismissed "junk DNA." This genetic dark matter is not a void, but a universe of regulatory elements and functional molecules that are essential for life.
The Conductors of the Genetic Orchestra: Regulatory Elements
Much of the non-coding DNA acts as a sophisticated control panel for our genes. These regulatory elements are short sequences of DNA that don't code for proteins themselves but act as binding sites for proteins called transcription factors. These transcription factors, in turn, control the rate at which genes are transcribed into RNA. The main types of regulatory elements include:
- Enhancers: These are the accelerators of the genome. They can be located thousands of base pairs away from the gene they regulate, but when a transcription factor binds to an enhancer, it dramatically increases the transcription of that gene. They often work in a tissue-specific manner, ensuring that genes are turned on only in the cells where they are needed.
- Silencers: As their name suggests, silencers are the brakes of the genome. They bind to repressor proteins that interfere with the transcription machinery, slowing down or stopping gene expression. Like enhancers, they are crucial for ensuring that genes are expressed at the appropriate times and in the appropriate cells.
- Insulators: These elements act as barriers, preventing enhancers from activating the wrong genes or silencers from repressing genes that should be active. They help to organize the genome into distinct regulatory neighborhoods.
The interplay between these regulatory elements is a complex dance that fine-tunes gene expression during development and in response to environmental cues.
The Three-Dimensional Genome: A World of Loops and Folds
For many years, the genome was depicted as a linear string of DNA. However, we now know that it is folded into a complex three-dimensional structure within the nucleus. This 3D architecture is not random; it plays a critical role in gene regulation.
Chromatin, the complex of DNA and proteins that makes up our chromosomes, is organized into loops. These loops bring distant enhancers and their target promoters into close physical proximity, allowing for the activation of gene expression. The formation of these loops is often mediated by proteins like CTCF and cohesin, which bind to specific sites in the non-coding DNA.
This 3D organization also creates insulated neighborhoods called topologically associating domains (TADs). These TADs help to ensure that enhancers only interact with the genes within their own domain, preventing promiscuous gene activation. Disruptions to this 3D architecture, often caused by mutations in non-coding DNA, can lead to developmental disorders and diseases like cancer.
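The logic of these insulated neighborhoods can be sketched as a toy model. The version below assumes a deliberately simplified picture: domain boundaries are single coordinates on a line, and an enhancer can reach any gene that falls between the same pair of boundaries. The gene names, positions, and function names are hypothetical.

```python
from bisect import bisect_right


def domain_index(position, boundaries):
    """Return which domain a position falls in, given sorted boundary coordinates."""
    return bisect_right(boundaries, position)


def reachable_genes(enhancer_pos, genes, boundaries):
    """List the genes that share a domain with the enhancer."""
    home = domain_index(enhancer_pos, boundaries)
    return sorted(
        name for name, pos in genes.items()
        if domain_index(pos, boundaries) == home
    )


# Hypothetical coordinates on one chromosome.
genes = {"GENE_A": 120_000, "GENE_B": 180_000, "GENE_C": 230_000}
enhancer = 150_000

intact = reachable_genes(enhancer, genes, [100_000, 200_000])
# Model the loss of the boundary at 200,000, for example through a deletion.
disrupted = reachable_genes(enhancer, genes, [100_000])
print(intact, disrupted)
```

With both boundaries in place the enhancer sees only GENE_A and GENE_B. Once the boundary at 200,000 is gone, GENE_C falls into the same domain, which is the kind of rewiring the paragraph above describes.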
The RNA Revolution: When the Message is the Medium
One of the most profound discoveries in modern biology is that RNA is not just a passive messenger between DNA and protein. A vast and diverse class of non-coding RNAs (ncRNAs) are transcribed from our "junk DNA," and these molecules have a wide range of important functions.
- Long Non-Coding RNAs (lncRNAs): As their name suggests, lncRNAs are long RNA molecules (over 200 nucleotides) that do not code for proteins. They are a diverse and versatile class of molecules that can act as:
  - Signals: Some lncRNAs are produced in response to specific stimuli, acting as signals to trigger a cellular response.
  - Decoys: LncRNAs can act as molecular sponges, binding to and sequestering transcription factors or other proteins, preventing them from carrying out their normal functions.
  - Guides: LncRNAs can guide chromatin-modifying enzymes to specific locations in the genome, where they can either activate or repress gene expression. A classic example is the lncRNA Xist, which coats one of the X chromosomes in female mammals, leading to its inactivation to ensure equal dosage of X-linked genes between males and females.
  - Scaffolds: LncRNAs can act as scaffolds, bringing together multiple proteins to form a functional complex. A well-studied example is the lncRNA HOTAIR, which brings two different protein complexes to specific genes, leading to their repression.
- MicroRNAs (miRNAs): These are small RNA molecules (around 22 nucleotides long) that play a major role in post-transcriptional gene regulation. They typically bind to messenger RNAs (mRNAs), leading to their degradation or preventing them from being translated into proteins. It is estimated that miRNAs regulate the expression of at least half of all human genes, and they are involved in virtually every biological process.
The discovery of this vast and complex world of non-coding RNAs has fundamentally changed our understanding of gene regulation. It has revealed a hidden layer of control that is just as important as the protein-coding genes themselves.
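How a miRNA "finds" its target messenger RNAs can be sketched with simple string matching. The sketch assumes the common seed-matching simplification, which this article does not go into: nucleotides 2 to 8 of the miRNA pair with a complementary stretch of the mRNA. Both sequences are invented, and real target prediction weighs much more than a perfect seed match.

```python
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna):
    """Return the RNA strand that would base-pair with `rna`."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr):
    """Return start positions in `utr` that perfectly match the miRNA seed.

    The seed is taken as nucleotides 2-8 of the miRNA (a simplifying
    assumption); a site is the reverse complement of that seed.
    """
    site = reverse_complement(mirna[1:8])
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)
    return hits


# Invented sequences, for illustration only.
toy_mirna = "UACGGAUCCAGUUGACUAGCAA"
toy_utr = "AAAGAUCCGUGGGAAAGAUCCGUA"
print(seed_sites(toy_mirna, toy_utr))
```

Because a seed is only seven nucleotides long, matches turn up in many different messages, which helps explain how a single miRNA can influence a large number of genes.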
The Engine of Evolution: How "Junk DNA" Drives Genetic Innovation
Far from being a static graveyard of evolutionary relics, the non-coding genome is a dynamic and creative force in evolution. The very features that once led to its dismissal as "junk" – its repetitive nature and its population of "selfish" transposable elements – are now understood to be powerful engines of genetic change.
The Evolutionary Arms Race: A Battle Within Our Genome
Our genomes are in a constant state of conflict. Transposable elements, which make up nearly half of our DNA, are driven to replicate and insert themselves throughout the genome. This can be highly disruptive, causing mutations that can lead to disease. In response, our genomes have evolved sophisticated defense mechanisms to silence these "jumping genes."
One of the main lines of defense is a large family of proteins called KRAB-zinc finger proteins (KRAB-ZFPs). These proteins have rapidly evolved to recognize and bind to new TEs, recruiting other proteins to epigenetically silence them. This has led to an "evolutionary arms race" where TEs are constantly evolving to evade recognition by the host's defense machinery, while the host is constantly evolving new KRAB-ZFPs to counteract them. This ongoing battle has been a major driver of the evolution of our gene regulatory networks.
From Parasite to Partner: The Creative Power of Exaptation
While TEs can be destructive, they can also be a source of genetic innovation. Over evolutionary time, many of these "selfish" elements have been co-opted or "exapted" by the host genome for new functions.
TEs are peppered with regulatory sequences that can act as enhancers, promoters, and insulators. When a TE inserts itself near a gene, it can create a new regulatory switch, altering the gene's expression pattern. This has been a major source of evolutionary novelty. For example:
- The evolution of the mammalian brain has been linked to the acquisition of new enhancers derived from a type of TE called AmnSINE1.
- A class of TEs called MER41B contains binding sites for the transcription factor STAT1, and these have been repurposed to create new regulatory elements involved in our innate immune response.
- Some TEs have even been "domesticated" to become new host genes. The most famous example is the RAG1 gene, which is essential for the development of our adaptive immune system and is derived from a DNA transposon.
The Viral Ghosts in Our Genome
About 8% of the human genome is made up of human endogenous retroviruses (HERVs), the fossilized remains of ancient retroviral infections that integrated into the germline of our primate ancestors millions of years ago. While most of these HERVs are now inactive due to mutations, some have been co-opted for important biological functions.
One of the most remarkable examples is the syncytin genes, which are derived from the envelope genes of ancient retroviruses. These genes are essential for the formation of the placenta, the organ that nourishes the developing fetus. It is thought that the fusogenic properties of the viral envelope protein were repurposed to help form the syncytiotrophoblast, a layer of cells in the placenta that is formed by the fusion of multiple cells. The acquisition of syncytin genes from different retroviruses appears to have occurred independently in different mammalian lineages, a striking example of convergent evolution.
HERVs have also played a major role in shaping our immune system. Their long terminal repeats (LTRs) contain regulatory sequences that can be activated by interferons, key signaling molecules in the immune response. The dispersal of these LTRs throughout the genome has helped to create a vast network of interferon-inducible genes, enhancing our ability to fight off infections.
The study of TEs and HERVs has transformed our view of the non-coding genome. What was once seen as a collection of parasites and junk is now understood to be a dynamic and creative force, a genomic playground where evolution can experiment with new functions and drive the emergence of novel traits.
When the Dark Matter Goes Awry: Non-Coding DNA in Human Disease
The discovery of the vast regulatory landscape within our non-coding DNA has opened up a new frontier in our understanding of human disease. It is now clear that mutations in these regions, which were once invisible to standard genetic tests, can have profound consequences for our health. About 90% of the genetic variants associated with common diseases through genome-wide association studies (GWAS) are located in non-coding DNA, highlighting the critical role of these regions in disease susceptibility.
The Broken Switches: Non-Coding Mutations in Disease
Mutations in enhancers, silencers, and other regulatory elements can disrupt the delicate balance of gene expression, leading to a wide range of diseases. These mutations can:
- Create or destroy binding sites for transcription factors: This can lead to the inappropriate activation or repression of genes.
- Disrupt the 3D architecture of the genome: Mutations in insulator elements or other architectural sequences can alter chromatin looping, leading to enhancers activating the wrong genes.
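The first of these mechanisms can be sketched as a before-and-after comparison. The example assumes a very reduced model in which a binding site is an exact match to a short core motif on either strand. The motif used here, GGAA, is the core recognized by the ETS family of transcription factors, while the reference and variant sequences are invented. Real analyses score sites with weighted matrices rather than exact matches.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def count_sites(dna, motif):
    """Count windows that match the motif on either strand."""
    other_strand = revcomp(motif)
    width = len(motif)
    return sum(
        1
        for i in range(len(dna) - width + 1)
        if dna[i:i + width] in (motif, other_strand)
    )


def classify_variant(ref, alt, motif):
    """Say whether a variant adds or removes motif matches."""
    before, after = count_sites(ref, motif), count_sites(alt, motif)
    if after > before:
        return "creates site"
    if after < before:
        return "destroys site"
    return "no change"


# Invented promoter fragment with a single C-to-T change at the third base.
reference = "CCCTCCGGGC"
variant = "CCTTCCGGGC"
print(classify_variant(reference, variant, "GGAA"))
```

A single letter change is enough to create a match on the opposite strand in this toy case, so a transcription factor that previously ignored the region could now bind there.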
This new understanding of the non-coding genome is providing insights into a host of diseases:
- Cancer: Many cancers are driven by the overexpression of oncogenes or the underexpression of tumor suppressor genes. Mutations in non-coding regulatory elements can contribute to this by altering the expression of these key cancer genes. For example, mutations in the promoter of the TERT gene are common in many cancers and lead to its overexpression, which allows cancer cells to divide indefinitely.
- Neurological and Neurodevelopmental Disorders: The development and function of the brain are exquisitely sensitive to the precise timing and levels of gene expression. It is therefore not surprising that mutations in non-coding DNA have been linked to a variety of brain disorders, including language impairment, autism, schizophrenia, and bipolar disorder. For example, a recent study found that non-coding repeat expansions are a significant cause of familial and sporadic epilepsy and other neurological syndromes.
- Cardiovascular Disease: Non-coding RNAs, such as lncRNAs and miRNAs, have been shown to play critical roles in the development and progression of cardiovascular diseases, including cardiac hypertrophy, heart failure, and myocardial infarction. For instance, the lncRNA CHRF has been shown to induce cardiac hypertrophy by acting as a sponge for miR-489.
- Autoimmune Diseases: The immune system must be tightly regulated to distinguish between self and non-self. Dysregulation of gene expression in immune cells can lead to autoimmune diseases, where the body attacks its own tissues. LncRNAs and other non-coding elements have been implicated in the pathogenesis of a wide range of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus, and multiple sclerosis.
The study of the non-coding genome is still in its early days, but it is already clear that this "dark matter" holds the key to understanding the genetic basis of many common and complex diseases.
Taming the Dark Matter: The Therapeutic Potential of Non-Coding DNA
The growing understanding of the role of non-coding DNA in disease is not just an academic exercise; it is opening up exciting new avenues for the development of novel therapies and diagnostics. For the first time, we are able to target the root causes of many diseases by modulating the activity of these once-hidden regulatory elements.
New Drugs for "Undruggable" Targets
Many disease-causing proteins are difficult to target with traditional small-molecule drugs because they lack well-defined binding pockets. Non-coding RNAs, however, offer a way to target these "undruggable" proteins at the source by modulating the expression of their genes. Several therapeutic strategies are being developed to target non-coding RNAs:
- Antisense Oligonucleotides (ASOs): These are short, synthetic strands of nucleic acid that are designed to bind to a specific RNA molecule through Watson-Crick base pairing. This can lead to the degradation of the target RNA or block its function. ASOs are a versatile technology that can be used to target both lncRNAs and miRNAs. Several ASO-based drugs have already been approved by the FDA for the treatment of various diseases, and many more are in clinical trials.
- Small Interfering RNAs (siRNAs): These are short, double-stranded RNA molecules that can trigger the degradation of a specific mRNA. SiRNA-based therapies are a powerful way to silence the expression of a particular gene. The FDA has approved several siRNA drugs, including patisiran for the treatment of hereditary transthyretin-mediated amyloidosis.
- Small Molecules: While challenging, it is possible to develop small molecules that can bind to and modulate the function of non-coding RNAs. These molecules can interfere with the RNA's structure or its interaction with other molecules. For example, researchers are developing small molecules that can inhibit the processing of oncogenic miRNAs.
- CRISPR-Based Therapies: The revolutionary gene-editing technology CRISPR-Cas9 can be used to directly edit the DNA sequence of non-coding regulatory elements. This could be used to correct disease-causing mutations in enhancers, silencers, or other regulatory regions. While still in its early stages, CRISPR-based therapies hold immense promise for the treatment of a wide range of genetic diseases.
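The base-pairing principle behind antisense oligonucleotides is simple enough to sketch. The example below assumes a DNA-based oligonucleotide and an invented target sequence, and the function names are hypothetical. It leaves out everything that makes real design hard, such as chemical modifications, target accessibility, and off-target screening.

```python
# RNA base -> the DNA base that pairs with it.
PAIRING = str.maketrans("AUGC", "TACG")


def design_aso(target_rna, start, length):
    """Return a DNA oligo (written 5' to 3') complementary to a target window."""
    window = target_rna[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target")
    return window.translate(PAIRING)[::-1]


def fully_pairs(aso, rna_window):
    """Check that the oligo is a perfect Watson-Crick match to the window."""
    return aso == rna_window.translate(PAIRING)[::-1]


# Invented target RNA, for illustration only.
target = "AUGGCCUUUUAA"
oligo = design_aso(target, start=0, length=6)
print(oligo, fully_pairs(oligo, target[0:6]))
```

The same reverse-complement rule is what lets one oligonucleotide be aimed at an lncRNA and another at a miRNA: only the target sequence changes.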
Biomarkers for a New Era of Personalized Medicine
Non-coding RNAs are also proving to be valuable biomarkers for disease. Because they are often found in bodily fluids like blood and urine, they can be used for non-invasive "liquid biopsies." The expression levels of specific miRNAs and lncRNAs can be used to:
- Diagnose diseases earlier: Changes in the levels of circulating ncRNAs can be an early sign of disease, even before symptoms appear.
- Predict disease prognosis: The expression profile of certain ncRNAs can help to predict how a disease will progress and how a patient will respond to treatment.
- Monitor treatment response: Changes in ncRNA levels can be used to monitor how well a patient is responding to therapy.
The use of non-coding RNAs as biomarkers is ushering in a new era of personalized medicine, where treatments can be tailored to the specific molecular profile of an individual's disease.
Challenges on the Horizon
Despite the enormous promise of non-coding RNA therapeutics, there are still significant challenges to overcome. These include:
- Delivery: Getting RNA-based drugs to the right cells and tissues in the body is a major hurdle.
- Specificity: Ensuring that RNA-based drugs only affect their intended target and do not have off-target effects is crucial.
- Immunogenicity: The immune system can sometimes recognize RNA-based drugs as foreign, leading to an unwanted immune response.
Researchers are actively working on solutions to these challenges, developing new delivery systems and chemical modifications to improve the safety and efficacy of RNA-based therapies.
The Uncharted Territory: The Future of "Junk DNA" Research
While we have made tremendous strides in understanding the non-coding genome, we have still only scratched the surface. Much of this genetic dark matter remains uncharted territory, and the full extent of its influence on our biology is yet to be discovered.
The next generation of large-scale genomics projects is already underway, aiming to build on the foundation laid by ENCODE. These projects will use even more powerful technologies to map the functional elements of the genome in a wider range of cell types and developmental stages.
One of the biggest challenges for the future is to move from simply identifying functional elements to understanding their precise mechanisms of action. How do these millions of gene switches work together to create a symphony of gene expression? How does the 3D architecture of the genome change during development and in response to environmental signals? And how do disruptions in these complex regulatory networks lead to disease?
Answering these questions will require a multidisciplinary approach, combining the power of genomics, molecular biology, computational biology, and clinical medicine. It will also require the development of new technologies to probe the inner workings of the cell with even greater resolution.
As we delve deeper into the mysteries of the genetic dark matter, we must also grapple with the ethical, legal, and social implications of this new knowledge. How do we interpret the results of genetic tests that identify variants in non-coding DNA, especially when their functional consequences are unknown? How do we ensure that this new genetic information is not used to discriminate against individuals? And what are the ethical considerations of using technologies like CRISPR to edit the non-coding genome?
The journey into the dark matter of the genome is far from over. It is a journey that is sure to be filled with more surprises and paradigm shifts. But one thing is certain: the once-dismissed "junk DNA" has emerged from the shadows to take its rightful place as a key player in the grand drama of life. It is a testament to the boundless complexity and elegance of the genome, and a reminder that even in the most familiar of places, there are still vast and undiscovered worlds waiting to be explored. The secrets hidden within this genetic dark matter hold the promise of a future where we can understand, treat, and even prevent some of our most devastating diseases. The age of the non-coding genome has truly begun.