In the hallowed halls of paleontology, where fossils have long been the silent storytellers of our planet's deep past, a revolutionary new technique is making these ancient remains speak in ways we never thought possible. This groundbreaking field, known as paleoproteomics, is centered on the analysis of ancient proteins. More specifically, scientists are now able to extract and sequence proteins from the enamel of fossilized teeth, opening a new, remarkably clear window into the lives of our ancient relatives and the world they inhabited. This is not merely an incremental advance; it is a paradigm shift that is pushing back the boundaries of what we can know about extinct species, including early hominins.
For decades, the study of ancient biomolecules was dominated by the analysis of ancient DNA (aDNA). While aDNA has undeniably revolutionized our understanding of the recent evolutionary past, its fragile nature means that it rarely survives for more than a few hundred thousand years, especially in the warm climates where much of human evolution unfolded. Proteins, however, are far more resilient. Encased within the dense, mineralized fortress of tooth enamel, some proteins can endure for millions of years, patiently waiting for scientists with the right tools to unlock their secrets.
Tooth enamel proteomics is not just about identifying the presence of proteins; it is about sequencing them, which means determining the precise order of their amino acid building blocks. Because the blueprint for these proteins is encoded in an organism's DNA, their sequences hold a treasure trove of genetic information. This allows paleontologists to piece together evolutionary relationships, determine the biological sex of individuals, and even gain insights into their physiology and development, all from a tiny fragment of a fossilized tooth.
This burgeoning field is already yielding spectacular results. From clarifying the evolutionary position of the giant ape Gigantopithecus blacki to determining the sex of 2-million-year-old hominins in South Africa, tooth enamel proteomics is answering questions that have long perplexed paleontologists. It is a story of cutting-edge science meeting the deep past, of microscopic molecules revealing macroscopic truths, and of how the hardest tissue in the vertebrate body is providing some of the most profound insights into our own origins.
The Indestructible Time Capsule: Why Tooth Enamel?
The remarkable ability of tooth enamel to preserve proteins for millions of years lies in its unique structure and composition. Enamel is the hardest and most highly mineralized tissue in the vertebrate body, composed of approximately 96% inorganic material, primarily in the form of hydroxyapatite crystals. These crystals are tightly packed in a highly organized structure of rods and interrod enamel. During its formation, a suite of specialized proteins orchestrates this intricate mineralization process.
As the enamel matures, most of these proteins are degraded and removed, leaving behind a tissue that is almost entirely mineral. However, trace amounts of protein fragments become entrapped within the hydroxyapatite crystals. This crystalline matrix acts as a natural time capsule, shielding these residual proteins from the ravages of time, such as water, microbes, and fluctuating temperatures, which would quickly degrade more exposed organic matter.
"Teeth are rocks in our mouths," explains Dr. Daniel Green, a researcher at Harvard University and Columbia University. "They're the hardest structures that any animals make, so you can find a tooth that is a hundred or a hundred million years old, and it will contain a geochemical record of the life of the animal." This includes not only chemical signatures of diet and environment but also these precious protein sequences.
While other fossilized tissues like bone also contain proteins, primarily collagen, enamel offers a distinct advantage. The enamel proteome—the complete set of proteins in enamel—is relatively small and tissue-specific, which simplifies analysis and reduces the chances of contamination from the surrounding burial environment. Furthermore, the unique proteins of enamel, such as amelogenin, are not found elsewhere in the body, providing a clear and unambiguous signal.
The Protein Advantage: Surpassing the Limits of Ancient DNA
For many years, ancient DNA (aDNA) was the gold standard for exploring the genetics of extinct species. It allowed scientists to redraw the human family tree, revealing the existence of previously unknown hominins like the Denisovans and uncovering a history of interbreeding between our ancestors and other archaic humans. However, DNA is a notoriously fragile molecule. It is susceptible to chemical decay, particularly through hydrolysis and oxidation, processes that are accelerated by water and heat. As a result, the effective lifespan of aDNA is limited. Outside of the frigid conditions of the permafrost, aDNA has rarely been recovered from fossils older than about 500,000 years. The oldest human DNA ever sequenced, for instance, dates back about 400,000 years. This leaves a vast and frustrating gap in our understanding of earlier hominins, as much of our evolutionary story played out in the warmer climates of Africa and Asia, which are notoriously poor for DNA preservation.
This is where paleoproteomics steps in, offering a powerful solution to this temporal impasse. Proteins are inherently more stable than DNA. The peptide bonds that link amino acids together are more robust than the bonds in the DNA backbone. When locked within the protective mineral matrix of tooth enamel, proteins can survive for astonishingly long periods, far exceeding the known limits of aDNA.
Recent studies have dramatically underscored this longevity. In 2019, an international team of researchers, including Enrico Cappellini and Frido Welker, successfully sequenced proteins from a 1.9-million-year-old molar of Gigantopithecus blacki, an extinct giant ape from a subtropical region of southern China. That same year, the team also analyzed a ~1.77-million-year-old rhino tooth from Dmanisi, Georgia. Together, these groundbreaking studies demonstrated that protein analysis could break through the "aDNA barrier," providing genetic information from fossils found in warmer regions and from a much deeper time frame.
More recently, in 2025, two studies published in the journal Nature shattered previous records. One team, led by researchers from the Smithsonian’s Museum Conservation Institute and Harvard University, recovered protein fragments from rhino, hippo, and elephant teeth in Kenya's Turkana Basin that were between 1.5 million and 18 million years old. Simultaneously, another study described proteins from a rhino relative's tooth found in Canada's High Arctic, dating back more than 20 million years. These discoveries have firmly established that proteins are a far more durable source of ancient genetic information than DNA.
While the information gleaned from proteins is not as comprehensive as a full genome sequence, it provides crucial data that would otherwise be completely lost to time. As Dr. Michael Buckley of the University of Manchester puts it, "The preservation potential of proteins is known to be much greater than for DNA, so we can simply go further back in time and look at species that may have gone extinct beyond the capabilities of ancient DNA studies."
The Keys to the Kingdom: Proteins of the Enamel Proteome
The secret to enamel's remarkable preservation power lies in a small but highly informative group of proteins. While developing enamel is rich in protein, mature enamel, the kind found in fossils, contains only trace amounts. The most important of these for paleoproteomic analysis is amelogenin.
Amelogenin is the most abundant protein in the developing enamel matrix, making up about 90% of its organic content. Its primary role is to guide the growth and organization of hydroxyapatite crystals during amelogenesis (enamel formation). What makes amelogenin particularly valuable for paleontologists is that the gene for it is located on both the X and Y sex chromosomes (as AMELX and AMELY, respectively).
These two forms of the protein differ slightly in their amino acid sequences. By detecting peptides unique to the Y-chromosome version (AMELY), scientists can confidently identify a fossil as belonging to a male individual. If only the X-chromosome version (AMELX) is found, the individual is most likely female, although the absence of AMELY can also reflect poor preservation rather than a true absence. This ability to determine the biological sex of a fossil is a significant breakthrough, as it is often difficult or impossible to do so from skeletal remains alone, especially when dealing with fragmentary fossils or juvenile individuals.
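The sex-calling logic can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any published study: the function name, the labels, and the `min_amely_peptides` threshold are assumptions made for the example, and the input is taken to be a list naming the protein each confidently identified, protein-specific peptide maps to.

```python
from collections import Counter


def infer_sex(protein_hits, min_amely_peptides=2):
    """Infer biological sex from protein-specific peptide identifications.

    protein_hits: iterable of protein labels, one per identified peptide that
        is specific to that protein (for example "AMELX", "AMELY", "ENAM").
    min_amely_peptides: hypothetical threshold that guards against calling
        a male from a single spurious AMELY match.
    """
    counts = Counter(protein_hits)
    if counts["AMELY"] >= min_amely_peptides:
        # AMELY is only encoded on the Y chromosome.
        return "male"
    if counts["AMELX"] > 0 and counts["AMELY"] == 0:
        # Missing AMELY may also mean the peptides did not survive,
        # so this call is weaker than a male call.
        return "female (tentative)"
    return "undetermined"


if __name__ == "__main__":
    print(infer_sex(["AMELX", "AMELX", "AMELY", "AMELY", "ENAM"]))
    print(infer_sex(["AMELX", "AMBN"]))
```

The asymmetry is deliberate: a male call rests on positive evidence, whereas a female call rests on the absence of a signal, so the sketch labels it as tentative.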
Beyond amelogenin, other proteins, though less abundant, are also part of the enamel proteome and can be recovered from fossils. These are often referred to as non-amelogenin proteins and include:
- Ameloblastin (AMBN): Also involved in enamel formation, this protein helps to control crystal elongation.
- Enamelin (ENAM): This protein is crucial for the initiation of crystal formation and the overall structural integrity of the enamel.
- Tuftelin (TUFT1): Found at the junction between the enamel and the underlying dentin, its precise role is still being studied.
Together, these proteins, along with others that can be found in trace amounts, such as serum albumin, make up the "enamelome." The sequences of these proteins are unique to different species and evolve over time. By comparing the amino acid sequences from an extinct animal to those of living species, scientists can reconstruct evolutionary relationships with a high degree of confidence.
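To make the idea of sequence comparison concrete, here is a small Python sketch that measures the proportion of differing amino acids between a partially recovered fossil sequence and several reference sequences. It is a simple pairwise distance, not the phylogenetic inference used in published analyses, and the sequences and species names are invented toy strings rather than real enamel protein data. A `-` marks a position that was not recovered.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues over positions present in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # position missing in one sequence, so it carries no signal
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no overlapping positions to compare")
    return differing / compared


def closest_relative(fossil_seq, references):
    """Return the reference name with the smallest distance to the fossil."""
    return min(references, key=lambda name: p_distance(fossil_seq, references[name]))


if __name__ == "__main__":
    # Toy, made-up aligned sequences (not real protein data).
    fossil = "MPLP-HPGYIN"
    references = {
        "species_A": "MPLPPHPGYVN",
        "species_B": "MPLPSHPGHVN",
        "species_C": "MALPSHPGHVN",
    }
    for name, seq in references.items():
        print(name, p_distance(fossil, seq))
    print(closest_relative(fossil, references))  # species_A
```

Skipping unrecovered positions matters for ancient samples, because coverage is patchy and a missing residue should not be counted as a difference.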
Unlocking the Past: The Methodology of Tooth Enamel Proteomics
Extracting and analyzing proteins that have been locked away for millions of years is a delicate and highly sophisticated process. It requires ultra-clean laboratory conditions to prevent contamination from modern proteins (like skin keratin from the researchers themselves) and cutting-edge analytical equipment. The general workflow can be broken down into several key stages:
1. Sample Preparation: The process begins with the careful selection of a fossil tooth. To access the pristine interior of the enamel, the outer surface is meticulously cleaned and a small amount of enamel powder is drilled out. This powder is then treated with a weak acid to dissolve the mineral matrix of hydroxyapatite, releasing the trapped protein fragments.
An important distinction from many modern proteomic techniques is that the ancient proteins are already broken down into smaller fragments, called peptides, through natural degradation over geological time. Therefore, the enzymatic digestion step typically used in "bottom-up" proteomics to break down proteins is often omitted. This "shotgun" approach works directly with the diagenetically generated peptides.
2. Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS): The extracted peptide mixture is then injected into a machine that is the heart of the paleoproteomics laboratory: a liquid chromatograph coupled to a tandem mass spectrometer.
- Liquid Chromatography (LC): First, the complex mixture of peptides is separated based on their chemical properties (like size or hydrophobicity) as they pass through a long, thin column. This separation is crucial because it simplifies the mixture before it enters the mass spectrometer, allowing for more comprehensive analysis.
- Tandem Mass Spectrometry (MS/MS): As the separated peptides emerge from the chromatograph, they are ionized (given an electrical charge) and sent into the mass spectrometer. The instrument performs two stages of mass analysis:
  - MS1: The first mass spectrometer measures the mass-to-charge ratio of all the intact peptides, creating a "survey scan" of everything that is present at that moment.
  - MS2: The instrument then intelligently selects individual peptide ions, one by one, fragments them further by colliding them with an inert gas, and sends these smaller fragments into a second mass spectrometer. This second stage measures the mass-to-charge ratios of the fragment ions.
3. Data Analysis and Sequence Reconstruction: The result of the MS/MS analysis is a massive dataset of spectra. Each spectrum from the MS2 stage represents the fragmentation pattern of a single peptide. Sophisticated bioinformatics software is then used to interpret these patterns.
The software works by taking the fragmentation data and comparing it against a database of known protein sequences from related living organisms. By matching the observed fragment masses to the predicted fragments of a known protein, the algorithm can identify the peptide and, by extension, the protein it came from. When multiple overlapping peptides are identified, their sequences can be pieced together to reconstruct a larger portion of the original ancient protein.
4. Authentication: A critical step in the process is to ensure that the identified proteins are genuinely ancient and endogenous to the fossil, not modern contaminants. Scientists use several lines of evidence for this. One of the most powerful is the pattern of chemical damage. Over time, certain amino acids spontaneously degrade in predictable ways, such as the deamidation of asparagine and glutamine. The presence of these specific damage patterns is a strong indicator of antiquity. Furthermore, the recovery of tissue-specific proteins like amelogenin provides confidence that the sequences are authentic.
This entire workflow, from drilling the tooth to identifying the proteins, allows researchers to read the genetic information preserved in enamel, a feat that was considered science fiction just a few decades ago.
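The database-matching idea in step 3 can be illustrated with a toy Python sketch. It predicts the singly charged b- and y-type fragment ions of each candidate peptide from standard monoisotopic residue masses, then counts how many predictions fall within a tolerance of the observed peaks. Real search engines use far more elaborate scoring and handle modifications and multiple charge states. The function names, the 0.02 Da tolerance, and the candidate peptides here are illustrative assumptions, not taken from any specific software or study.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the residues of an intact peptide
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in seq) + WATER


def mass_to_charge(mass, charge):
    """m/z of a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def fragment_ions(seq):
    """Predicted m/z values of singly charged b and y ions."""
    ions = []
    for i in range(1, len(seq)):
        b_ion = sum(RESIDUE_MASS[r] for r in seq[:i]) + PROTON
        y_ion = sum(RESIDUE_MASS[r] for r in seq[i:]) + WATER + PROTON
        ions.extend([b_ion, y_ion])
    return ions


def match_score(observed_peaks, seq, tol=0.02):
    """Number of predicted fragments found among the observed peaks."""
    return sum(
        any(abs(peak - ion) <= tol for peak in observed_peaks)
        for ion in fragment_ions(seq)
    )


def best_match(observed_peaks, candidates, tol=0.02):
    """Candidate peptide whose predicted fragments best explain the spectrum."""
    return max(candidates, key=lambda seq: match_score(observed_peaks, seq, tol))


if __name__ == "__main__":
    # Simulate a clean spectrum from a hypothetical peptide, then search it.
    spectrum = fragment_ions("GASPVK")
    print(best_match(spectrum, ["GASPLK", "TASPVK", "GASPVK"]))  # GASPVK
```

Because leucine and isoleucine have identical masses, a sketch like this cannot tell them apart. Deamidation of asparagine or glutamine, mentioned in step 4, would appear as a mass increase of about 0.984 Da on the affected fragments, which this unmodified-peptide sketch does not model.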
Landmark Discoveries: Rewriting the Story of Our Ancestors
The application of tooth enamel proteomics to the fossil record has already led to a series of stunning discoveries that are reshaping our understanding of human evolution and the history of other extinct mammals.
Clarifying the Place of a Giant Ape: Gigantopithecus blacki
For a long time, the evolutionary relationships of Gigantopithecus blacki, a massive ape that roamed the forests of Southeast Asia from about 2 million to 350,000 years ago, were a mystery. Due to the scarcity of its remains—mostly teeth and a few jawbones—and the inability to recover aDNA from its subtropical habitat, its position on the primate family tree was debated. In 2019, a team led by Frido Welker and Enrico Cappellini managed to extract enamel proteins from a 1.9-million-year-old Gigantopithecus molar found in a cave in southern China. Their analysis of the protein sequences revealed that Gigantopithecus was a close relative of the modern orangutan, with their lineages diverging around 10-12 million years ago. This study was a landmark achievement, as it not only solved a long-standing paleontological puzzle but also proved that informative proteomes could be retrieved from fossils nearly two million years old and from warm environments.
Meeting Our Ancient Cousins: Homo antecessor and Paranthropus robustus
The technique has also shed new light on early members of our own genus. In 2020, researchers analyzed a tooth from an 800,000-year-old Homo antecessor fossil from Spain. The protein evidence placed H. antecessor as a close sister lineage to the last common ancestor of modern humans, Neanderthals, and Denisovans, clarifying its crucial position in the hominin family tree.
Even more recently, in 2025, a study on 2-million-year-old fossils of Paranthropus robustus from Swartkrans Cave in South Africa yielded extraordinary results. Paranthropus, a distant, robust-jawed cousin of early humans, lived alongside our direct ancestors. The team successfully extracted enamel proteins from four individuals and, using the amelogenin protein, determined that two were male and two were female. This was a significant first, providing direct molecular evidence for the sex of such ancient hominins. The analysis also revealed unexpected genetic diversity within the species, suggesting that the Paranthropus population may have been more complex than previously thought, perhaps consisting of different subgroups. This discovery highlighted the power of paleoproteomics to uncover biological information that is completely invisible in the fossilized bones themselves.
Resolving Rhino Relationships: The Case of Stephanorhinus
The power of tooth enamel proteomics extends far beyond hominins. A ~1.77-million-year-old rhinoceros fossil from the famous archaeological site of Dmanisi in Georgia provided a perfect test case. The fossil belonged to the extinct genus Stephanorhinus, whose evolutionary relationship to other rhinos, like the woolly rhino (Coelodonta antiquitatis), was unclear. Ancient DNA could not be recovered from a fossil of this age and location. However, the enamel proteome was beautifully preserved. The analysis of its protein sequences placed the Dmanisi rhino as a sister group to the clade containing the woolly rhino and Merck's rhinoceros. This demonstrated that the woolly rhino evolved from an early Stephanorhinus lineage, effectively resolving a long-standing debate in paleontology. The study also successfully identified the specimen as male via its amelogenin protein.
Pushing the Limits in the Turkana Basin
Perhaps the most dramatic demonstration of the technique's potential comes from a 2025 study on fossils from the Turkana Basin in Kenya, a region famous for its hominin fossils and its relentlessly hot climate. A team led by Daniel Green recovered peptide fragments from the teeth of ancient rhinos and elephant relatives dating back as far as 18 million years. This extended the record for recovered proteins fivefold, a truly field-changing result. The fact that these proteins survived for so long in one of the hottest places on Earth is a testament to the incredible preservative power of tooth enamel. The protein sequences, though fragmented, were still informative enough to confirm the evolutionary relationships of these ancient beasts to their modern counterparts. For instance, a 16-million-year-old proboscidean shared enamel sequences with today's elephants, while an ancient hippo fossil aligned with living hippos and their closest living relatives, whales.
These landmark studies are just the beginning. They serve as a powerful proof of concept, demonstrating that the biological information locked in fossil teeth can now be accessed, providing a new and exciting avenue for exploring the deep past.
Limitations and the Path Forward
Despite its revolutionary potential, tooth enamel proteomics is not without its limitations, and it is not a "magic bullet" that will answer all questions. It is essential to have a realistic perspective on what the technique can and cannot do.
The Information Gap: Proteins vs. Genomes
The most significant limitation is the amount of genetic information that can be retrieved. The enamel proteome is very small, based on only about 10 genes, compared to the tens of thousands of genes in a full genome. Even with perfect preservation, the total amount of sequence data from proteins is orders of magnitude less than what can be obtained from aDNA. This means that some of the most profound discoveries made with aDNA, such as quantifying the percentage of Neanderthal ancestry in modern humans, are currently beyond the reach of proteomics. Similarly, while protein data can be used to estimate when evolutionary lineages diverged, the precision of these dates is lower than estimates based on whole genomes due to the smaller number of available data points.
The Challenge of Degradation
While proteins are far more durable than DNA, they are not immortal. Over millions of years, the original protein chains inevitably break down into smaller and smaller peptide fragments. This fragmentation means that researchers can only recover a portion of the original protein sequence, and the coverage across the protein is often uneven. As fossils get older, the number and length of recoverable peptides decrease, eventually reaching a point where no useful information can be obtained.
The Future is Bright: Overcoming the Hurdles
Despite these limitations, the future of paleoproteomics is incredibly bright. The field is rapidly advancing on multiple fronts, from improvements in instrumentation to more sophisticated data analysis techniques.
- Technological Advancements: Mass spectrometers are becoming ever more sensitive, allowing researchers to detect and sequence ever smaller quantities of protein. New methods of data acquisition and analysis are being developed to maximize the information that can be extracted from fragmented and damaged peptides.
- Expanding the Toolkit: Researchers are exploring ways to combine paleoproteomics with other analytical techniques to get the most out of precious fossil samples. For example, methods are being developed to screen samples for protein preservation before destructive sampling, helping to conserve valuable fossil resources.
- Pushing Deeper in Time: The recent discovery of proteins in fossils over 20 million years old has opened the tantalizing possibility of exploring even deeper time. While the challenges are immense, some scientists are hopeful that it may one day be possible to recover protein fragments from even older organisms, perhaps even from dinosaurs. As Dr. Enrico Cappellini noted, with these recent results, such ideas are starting to move from the realm of science fiction toward plausible scientific inquiry.
- Answering New Questions: Beyond phylogenetic relationships and sex determination, researchers are beginning to explore what else ancient proteins can tell us. For example, studying post-translational modifications—chemical changes made to proteins after they are synthesized—could provide insights into the physiology and health of ancient organisms.
In conclusion, tooth enamel proteomics represents a monumental leap forward in our ability to study the deep past. By providing a direct window into the genetic makeup of long-extinct species, it is allowing us to answer questions that were once thought to be permanently lost to history. It has broken through the time barrier that limited ancient DNA research, opening up millions of years of evolutionary history for molecular investigation, particularly in the very regions where our own lineage was born. While the road ahead has its challenges, this remarkable technique has already transformed our understanding of ancient life and promises to continue revealing the secrets of our ancient relatives, one protein at a time. The silent stones are finally beginning to tell their most profound stories.