A chicken-egg question: Where do baby genes come from?

28 Apr 2017

New genes are more likely to appear on the stage of evolution in full-fledged form rather than gradually take shape through successive stages of "proto genes" that become more and more refined over generations.

This is the surprising upshot from research led by Benjamin Wilson and Joanna Masel at the University of Arizona, published as an Advance Online Publication by the scientific journal Nature Ecology & Evolution on 24 April.

Evolutionary biologists have long pored over the question of where new genes come from, which poses something of a chicken-and-egg problem. Conventional wisdom has it that new genes - DNA sequences that code for a protein molecule - evolve from existing genes through duplication and divergence. This happens when DNA copying mechanisms accidentally leave behind an extra copy of a particular gene. Naturally occurring mutations subsequently introduce changes that alter the DNA sequence such that the new gene assumes a function previously not found in the organism's lineage.

Previous studies by other researchers suggested that new genes also emerge from non-coding DNA sequences, via primitive "proto-genes" that become refined over generations, resulting in an "adult," fully functional gene.

Masel and her team found the opposite to be more likely, based on the fact that non-coding DNA sequences are likely to give rise to highly ordered proteins. Proteins, which consist of amino acids chained together into so-called polypeptides, tend to fold into three-dimensional structures that range from simple to mind-bogglingly complicated. And while "ordered" may sound like a good thing, Masel is quick to point out that a healthy dose of disorder is key to success when evolution comes up with new genes that serve as blueprints for new proteins.

For the study, the researchers compiled data on full-genome DNA sequences downloaded from yeast and mouse databases.

"We take all the known mouse genes and yeast genes and query them against everything that's ever been sequenced and see what they're related to," explains Masel, a professor in the Department of Ecology and Evolution and a member of the UA's BIO5 Institute, "and based on that, we assign each gene an age that tells us when it was born."

In the next step, the team used statistical analyses to create a model revealing the average degree of order that would be present in each gene's product.

"We found that the youngest genes are the least ordered of all, which is what you would expect to get if you birthed a gene," Masel says.

The key to a protein that can contribute a useful function for its organism while not harming it is a healthy mix between regions that are soluble because they consist of hydrophilic, or "water-loving," amino acids and stretches that are insoluble because of their hydrophobic, or "water-repelling," amino acids.

If a protein consists of too many water-loving amino acids, it will remain largely unfolded, floating around inside the cell as an unorganized chain incapable of performing biological tasks. If too much of its length is water-repelling, the amino acids will clump together, rendering the protein unusable, and even dangerous, because when such misfolded proteins bump into each other, they tend to stick to each other and accumulate.
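The balance described above can be made concrete with a rough sketch. It is not the statistical model used in the study, which this article does not describe in detail. It only scores a polypeptide with the widely used Kyte-Doolittle hydropathy scale, where positive values mark water-repelling amino acids and negative values mark water-loving ones. The function names and the two example sequences are hypothetical, chosen purely for illustration.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
# Positive = hydrophobic ("water-repelling"), negative = hydrophilic ("water-loving").
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _scores(sequence: str) -> list[float]:
    """Look up the hydropathy value of every residue in the sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    unknown = set(residues) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"unknown amino acid codes: {sorted(unknown)}")
    return [KYTE_DOOLITTLE[r] for r in residues]


def hydrophobic_fraction(sequence: str) -> float:
    """Share of residues with a positive (water-repelling) hydropathy value."""
    scores = _scores(sequence)
    return sum(1 for s in scores if s > 0) / len(scores)


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy over the whole chain (often called the GRAVY score)."""
    scores = _scores(sequence)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the study.
    examples = {
        "mostly water-loving": "KDEKSEEKRD",
        "mostly water-repelling": "LIVFALLVIA",
    }
    for label, seq in examples.items():
        print(label, hydrophobic_fraction(seq), round(mean_hydropathy(seq), 2))
```

A chain that scores near one extreme on either measure corresponds to the two failure modes in the text: largely unfolded when almost everything is water-loving, prone to clumping when almost everything is water-repelling. Real disorder predictors take much more than these two averages into account.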

"Now think about the most highly ordered proteins we know - amyloids," Masel says, referring to the infamous piles of proteins found in the brain of Alzheimer's patients. "Because of this, the first order of business for any prospective gene is: 'Do no harm. Do not misfold.'"

This has profound implications for the evolution of new genes from non-coding DNA sequences. Because such sequences are likely to give rise to highly ordered proteins, they are likely to be deleterious to the organism. In this scenario, any prospective new gene must start out as some kind of "super gene," in contrast to a "proto gene." Rather than making its debut in the gene pool as an unrefined gene that still bears many similarities to the non-coding DNA sequences it came from, a new gene must encode a protein with a higher-than-average degree of disorder to prove itself before evolution allows it to become a permanent member of the gene pool.

"Instead of gradually working up to having more hydrophilic regions, young genes work their way down from being more hydrophilic and disordered, to more hydrophobic regions," Masel says. "In other words, when it comes to structural disorder, a polypeptide has the highest chance of being born if it is 'extra gene-like,' rather than 'sort of gene-like.'"

The probability that a gene could arise from a random, non-coding sequence, also known as "junk DNA," used to be considered negligible, based on the premise that in the vast majority of cases, a random sequence does more harm than good. This may not be so, argues a second paper in the same issue by Rafik Neme, one of the co-authors of the study discussed here. Neme, currently a postdoctoral researcher at Columbia University Medical Center in New York, found the first experimental evidence that non-coding, "silent" stretches of DNA are anything but that.

"Until now, nobody knew whether a random sequence could immediately have any effect that would result in a function, or whether function was slowly acquired over time," Neme says. "It's similar to the idea of having a monkey typewriting at random, and expecting it to produce meaningful work."

Neme's experiments show that many sequences exhibit relevant activities immediately, some good and some bad. This, in turn, suggests a discrete transition between non-genes and genes and would favor certain kinds of sequences and functions over others.

Based on their findings, Neme and Masel point out, the pool from which genes are born might be more conducive to birthing new genes than one might expect.

"In our scenario, a gene precursor would be a transcript that happened to be translated into a protein sometimes but has no function," Masel says. "These things come up in evolution all the time, and mutation will quickly destroy it unless that polypeptide provides the organism with some advantage. There either is an advantage that natural selection can act on, or there isn't, so we don't think the would-be genes stick around for very long."

This in turn suggests that gene birth is a sudden transition, rather than a gradual process involving many intermediate steps.
