Function, non-function, some function: a brief history of junk DNA.

It is commonly suggested by anti-evolutionists that recent discoveries of function in non-coding DNA support intelligent design and refute “Darwinism”. This misrepresents both the history and the science of this issue. I would like to provide some clarification of both aspects.

When people began estimating genome sizes (amounts of DNA per genome) in the late 1940s and early 1950s, they noticed that this is largely a constant trait within organisms and species. In other words, if you look at nuclei in different tissues within an organism or in different organisms from the same species, the amount of DNA per chromosome set is constant. (There are some interesting exceptions to this, but they were not really known at the time.) This observed constancy in DNA amount was taken as evidence that DNA, rather than proteins, is the substance of inheritance.

These early researchers also noted that some “less complex” organisms (e.g., salamanders) possess far more DNA in their nuclei than “more complex” ones (e.g., mammals). This rendered the issue quite complex, because on the one hand DNA was thought to be constant because it’s what genes are made of, and yet the amount of DNA (“C-value”, for “constant”) did not correspond to assumptions about how many genes an organism should have. This (apparently) self-contradictory set of findings became known as the “C-value paradox” in 1971.

This “paradox” was solved with the discovery of non-coding DNA. Because most DNA in eukaryotes does not encode a protein, there is no longer a reason to expect C-value and gene number to be related. Not surprisingly, there was speculation about what role the “extra” DNA might be playing.

In 1972, Susumu Ohno coined the term “junk DNA”. The idea did not come from throwing his hands up and saying “we don’t know what it does so let’s just assume it is useless and call it junk”. He developed the idea based on knowledge about a mechanism by which non-coding DNA accumulates: the duplication and inactivation of genes. “Junk DNA,” as formulated by Ohno, referred to what we now call pseudogenes, which are non-functional from a protein-coding standpoint by definition. Nevertheless, a long list of possible functions for non-coding DNA continued to be proposed in the scientific literature.

In 1979, Gould and Lewontin published their classic “spandrels” paper (Proc. R. Soc. Lond. B 205: 581-598) in which they railed against the apparent tendency of biologists to attribute function to every feature of organisms. In the same vein, Doolittle and Sapienza published a paper in 1980 entitled “Selfish genes, the phenotype paradigm and genome evolution” (Nature 284: 601-603). In it, they argued that there was far too much emphasis on function at the organism level in explanations for the presence of so much non-coding DNA. Instead, they argued, self-replicating sequences (transposable elements) may be there simply because they are good at being there, independent of effects (let alone functions) at the organism level. Many biologists took their point seriously and began thinking about selection at two levels, within the genome and on organismal phenotypes. Meanwhile, functions for non-coding DNA continued to be postulated by other authors.

As the tools of molecular genetics grew increasingly powerful, there was a shift toward close examinations of protein-coding genes in some circles, and something of a divide emerged between researchers interested in particular sequences and others focusing on genome size and other large-scale features. This became apparent when technological advances allowed thoughts of sequencing the entire human genome: a question asked in all seriousness was whether the project should bother with the “junk”.

Of course, there is now a much greater link between genome sequencing and genome size research. For one, you need to know how much DNA is there just to get funding. More importantly, sequence analysis is shedding light on the types of non-coding DNA responsible for the differences in genome size, and non-coding DNA is proving to be at least as interesting as the genic portions.

To summarize:

Since the first discussions about DNA amount there have been scientists who argued that most non-coding DNA is functional, others who focused on mechanisms that could lead to more DNA in the absence of function, and yet others who took a position somewhere in the middle. This is still the situation now.
Lots of mechanisms are known that can increase the amount of DNA in a genome: gene duplication and pseudogenization, duplicative transposition, replication slippage, unequal crossing-over, aneuploidy, and polyploidy. By themselves, these could lead to increases in DNA content independent of benefits for the organism, or even despite small detrimental impacts, which is why non-function is a reasonable null hypothesis.
Evidence currently available suggests that about 5% of the human genome is functional. The least conservative guesses put the possible total at about 20%. The human genome is mid-sized for an animal, which means that most likely an even smaller percentage is functional in the many genomes larger than ours. None of the discoveries suggest that all (or even more than a minor percentage) of non-coding DNA is functional, and the corollary is that there is indirect evidence that most of it is not.
Identification of function is done by evolutionary biologists and genome researchers using an explicit evolutionary framework. One of the best indications of function that we have for non-coding DNA is to find parts of it conserved among species. This suggests that changes to the sequence have been selected against over long stretches of time because those regions play a significant role. Obviously you cannot talk about evolutionarily conserved DNA without evolutionary change.
Examples of transposable elements acquiring function represent co-option. This is the same phenomenon that is involved in the evolution of complex features like eyes and flagella. In particular, co-option of TEs appears to have happened in the evolution of the vertebrate immune system. Again, this makes no sense in the absence of an evolutionary scenario.
Most transposable elements do not appear to be functional at the organism level. In humans, most are inactive molecular fossils. Some are active, however, and can cause all manner of diseases through their insertions. To repeat: some transposons are functional, some are clearly deleterious, and most probably remain more or less neutral.
Any suggestion that all non-coding DNA is functional must explain why an onion needs five times more of it than you do. So far, none of the proposed universal functions has done this. It therefore remains most reasonable to take a pluralistic approach in which only some non-coding elements are functional for organisms.
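
The onion comparison can be put in rough numbers. The sketch below is only an illustration under stated assumptions: it treats the onion genome as five times the size of the human genome (rounding the “five times” figure above, which strictly refers to non-coding DNA), uses the 5% and 20% functional estimates quoted earlier, and assumes, purely for the sake of argument, that an onion requires the same absolute amount of functional DNA as a human. The function name and the constant are hypothetical, not taken from any published analysis.

```python
# Illustrative arithmetic only; all inputs are assumptions stated in the text above.

ONION_TO_HUMAN_RATIO = 5.0  # assumed genome size ratio (onion / human)


def functional_fraction_in_larger_genome(human_fraction, size_ratio):
    """Fraction of a larger genome that would be functional if it carried
    the same absolute amount of functional DNA as the human genome.

    human_fraction: functional fraction of the human genome (0 to 1)
    size_ratio:     size of the other genome relative to the human genome
    """
    if not 0.0 <= human_fraction <= 1.0:
        raise ValueError("human_fraction must be between 0 and 1")
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    # Same absolute amount of functional DNA, spread over a genome
    # that is size_ratio times larger.
    return human_fraction / size_ratio


if __name__ == "__main__":
    estimates = (
        ("evidence-based estimate", 0.05),
        ("least conservative guess", 0.20),
    )
    for label, human_fraction in estimates:
        onion_fraction = functional_fraction_in_larger_genome(
            human_fraction, ONION_TO_HUMAN_RATIO
        )
        print(f"{label}: human {human_fraction:.0%}, onion {onion_fraction:.0%}")
```

Under those assumptions, function would account for only about 1% to 4% of the onion genome, so any proposed universal function has to explain what the remaining DNA is doing there.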

I realize that this will have no effect on the arguments made by anti-evolutionists, but I hope it at least clarifies the issue for readers who are interested in the actual science involved and its historical development.

3 thoughts on “Function, non-function, some function: a brief history of junk DNA.”

Jonathan Badger on June 14, 2007 at 8:23 am said:

Well, while I certainly agree that functionality/adaptive nature shouldn’t be the null hypothesis, I don’t buy the assumption that most non-coding DNA must be non-functional simply because onions have much more non-coding DNA than we do.

For that argument to work, you have to buy into the anthropocentric notion of “less complex” and “more complex” organisms, which you (quite correctly) put in scare quotes earlier in your article. As humbling as it may be, humans may not have the most “complicated” gene regulation.

A better argument (which you use yourself on the linked page) is pointing out that apparently closely related species can have vastly different genome sizes. This brings up the issue without anthropocentric bias.

TR Gregory on June 14, 2007 at 8:56 am said:

Quite right, although I don’t think there is anything inherently anthropocentric in my argument. I only make a connection to humans in this way for two reasons: 1) most of the discussion is about function in the human genome, so even if you manage to show function little by little in our genome and make it to a majority of sequences, you’d have to do that five times over in the onion, and 2) it gets people to think about the issue in a new way because most non-specialists are used to thinking of humans as “complex” in some way.

In the more developed version, I point out that several of the domesticated onion’s congeners have much larger genomes, and I have also pointed out that it is equally challenging to find that a pufferfish “needs” (if absolute functionality is expected) only 1/10 as much non-coding DNA as humans.

Humans indeed may have less complex gene regulation. But is it 10 times less complex than a salamander or 10 times more complex than a pufferfish? In any case, the evidence to date points to only a small portion of our own genome being functional in this way, so in that sense a comparison based on regulation is moot.

But yes, let me be clear and agree with Jonathan unambiguously: there is no objective basis for anthropocentrism in discussions about genomic features.

phoboskitty on April 17, 2008 at 11:30 am said:

About the “amount” of non-coding DNA, and the idea of less or more “complex”:

Is this tied to time? Would a species that has been around for, let’s say, 100,000,000 years not have more of this than, let’s say, humans (around 200,000 years old), assuming there is no environmental disaster or mass extinction?

And is the “junk” the same across a species? Would a wolf have the same junk, or mostly the same junk, as a basset hound? Is the amount of “junk” in the human genome about the same as in the Neanderthal genome? Or will you find the same kind of non-coding DNA across a diverse range of species?
