We have been told in science news stories since the early 1990s that biologists long neglected the potential significance of noncoding DNA. (Sadly, this is in line with the assertions of creationists, who claim that “Darwinism” is to blame despite the obvious fact that Darwinian adaptationism would expect functions. Some biologists likewise play up the notion that we have ignored noncoding sequences and are only now coming to appreciate them, thanks, no doubt, to their own revolutionary insights, but again, this ignores a diverse literature on the topic spanning the rise of the tools necessary for such work up to the present.) But what about the science stories that were actually written during the period in which noncoding DNA was supposedly dismissed as uninteresting (i.e., 1980 to the early 1990s)?
If you had a subscription to Science in the 1980s, you would have read stories like these by their science writer Roger Lewin:
Lewin, R. 1981. Evolutionary history written in globin genes. Science 214: 426-427.
Even though the human β-globin complex contains a relatively large number of active genes, 95 percent of the locus is made up of DNA that does not code for proteins. What is the role of this extra DNA, if any? The pseudogenes constitute just a small proportion of the region, although more pseudogenes might exist. Some of the DNA is made up of representatives of well-known families of repetitive sequences. And the remainder is DNA of no known function or comparable sequence.
“We wanted to test the hypothesis that this extra DNA is ‘junk DNA,’” says Jeffreys, “so we compared the β loci in humans, gorillas, and baboons.” Jeffreys and his colleagues reasoned that if it were junk DNA, then over the 20 to 40 million years of evolution represented by humans, apes, and Old World monkeys both the sequence and the overall quantity of intergenic DNA could be expected to vary. “It turned out that the cluster is remarkably stable,” reports Jeffreys. “The overall pattern and size of the cluster is the same, and the rate of nucleotide substitutions is one-quarter to one-fifth of what would be expected in functionless DNA”. The noncoding DNA therefore appears not to be junk, but what function it might perform is still a mystery.
Lewin, R. 1982. Repeated DNA still in search of a function. Science 217: 621-623.
[Reporting on an NIH International Workshop on Highly Repeated DNA, July 1982]
Interest in repetitive DNA sequences goes back many years but, as with many aspects of molecular biology, the advent of recombinant DNA technology and DNA sequencing now permits previously unmatched scrutiny of the structures of interest.
If mobility is a reality, and most agree that it probably is, then it seems likely that at least some members of repeat families will have important effects in the genome, even if they have no formal function. Enhancing recombination and altering rates of gene expression are obvious possibilities, while the initiation of new species is a more recondite proposal.
The truth is, however, that the functions of the large and motley collection of repeated DNA families are proving particularly resistant to elucidation. Putative functions are many, including, variously, involvement in chromosome pairing, control of gene expression, processing of messenger RNA precursors, and participation in DNA replication. So far none has been established, save for the single exception of a small family that gives rise to 7S RNA, a molecule that recently was serendipitously discovered to be an essential component of a particle that mediates the secretion of proteins from cells.
Some repetitive DNA will undoubtedly be shown to have a function, in the formal sense; some will likely be shown to exert important effects; and the remainder may well have no function or effect at all and can therefore be called selfish DNA. Repetitive DNA constitutes a substantial proportion of the genome (up to 90 percent in some cases), and there is considerable speculation on how it will eventually be divided between these three groups. Current bets would put a small fraction in the function category, with distribution of the rest rising steeply through the effect and selfish categories.
Satellite DNA unquestionably is a puzzle. What determines the number of copies in a repeat family? And how does the genome tolerate so much of it? Perhaps, as Singer has recently promulgated, just a small fraction of the satellite sequences is essential to some genomic function while the remainder is harmless surplus. This, she indicates, is a comfortable middle ground between the extreme selfish DNA position, which sees no function in all this “junk DNA,” and the adaptationist position, which looks for functions in every structure. The same questions and speculations can be applied to dispersed repetitive DNA.
One observation that might be taken as evidence of function in repeated sequences is the frequency of transcription into RNA. A significant proportion of nuclear RNA contains transcripts of repeated sequences, although 90 percent of this is lost in RNA processing and exit to the cytoplasm. Davidson and his colleagues have shown that in sea urchin the spectrum of repeat families that are transcribed changes during development, an appealing argument for some regulatory function. Most intriguing, however, is the discovery that only a small proportion of any repeat family is ever transcribed. “Most members appear to be quiescent, which must make you cautious when isolating samples in search of their function.”
It is clear that, from their abundance, their unusual structure, and their frequent transcription, dispersed repetitive DNA families cannot be ignored. But it is equally clear that for the most part they, like their tandemly repeated relatives, remain a phenomenon in search of a function.
Lewin, R. 1982. Adaptation can be a problem for evolutionists. Science 216: 1212-1213.
Molecular biology of recent years has revealed many new and intriguing categories of DNA, some of which appear to have no role. One explanation of this has been that the nonaptive sequences provide raw material for future evolution. But the logic of natural selection does not allow for selection for future use. More likely is that the accumulation of nonaptive DNA is a consequence of the innate property of repeated sequences of nucleic acid to replicate and move around the genome. Later it may be recruited to perform some role, in which case it becomes an exaptation.
Lewin, R. 1983. A naturalist of the genome. Science 222: 402-405.
Some mobile elements are large and complex, measuring as much as 10,000 nucleotides in length and carrying many genes, while others are simple sections of repeated DNA just a few hundred nucleotides long. Some people would classify all such elements as “junk” or “parasitic” DNA. Others strongly demur and insist that, for instance, although there is yet to be found any convincing evidence for the involvement of a limited class of elements in development in organisms other than maize, the possibility should by no means be dismissed. In any case it is clear that the mobility of certain genetic elements is essential in the generation of the huge diversity of antibodies in vertebrates and in the production of different antigenic coats in certain parasites. Jumping genes clearly represent a potentially rich source of mutation. In addition, an evolutionary link between mobile elements and retroviruses now seems incontrovertible, as does a causal relationship with certain cancers.
Lewin, R. 1985. More progress in messenger RNA splicing. Science 228: 977.
This summer marks 8 years since eukaryotic genes were first discovered to be interrupted by noncoding sequences, known variously as intervening sequences or introns. The discovery raised two sets of questions. The first concerns the origin and function, if any, of introns, which, by its very nature, is a very difficult question to test and therefore remains somewhat in the realms of speculation, although significant insights are being made. The second focuses on the mechanics of removal of these sequences in the production of mature RNA molecules, and in principle should be experimentally more tractable. The immense effort directed at this second question has produced during the past 8 years some conventional biochemistry, some novel and surprising nucleic acid chemistry, and a great deal of frustration.
Lewin, R. 1986. “Computer genome” is full of junk DNA. Science 232: 577-578.
Many biologists were unhappy with the idea that much of the DNA might have no function, says Loomis. “There is a very strong feeling that if a molecule, or any kind of biological structure, exists, then it must be serving some kind of selectively advantageous purpose. I disagree with this viewpoint very strongly.” Loomis prefers to turn the question around. “We should ask, ‘what is the selective advantage of getting rid of a particular structure?’ This is not common thinking.”
It is of course very difficult to prove that a structure or a sequence of DNA has no function. “People will always say, ah, but you haven’t looked under the right conditions,” says Loomis. In the case of multigene families, the best data come from mutation experiments.
Lewin, R. 1988. Chance and repetition. Science 240: 603.
With some kind of concerted effort to map and sequence the entire human genome now appearing to be inevitable, there will be much excitement at the prospect of discovering what is encoded in the 3-billion-base “message”. There are certain to be some surprises, perhaps even equivalent in magnitude to the discovery a decade ago of long, noncoding sequences that interrupt the great majority of eukaryotic genes. But there are many biologists who expect large parts of the genome to be devoid of any function at all: “We face the prospect of trudging through huge tracts of junk DNA,” remarked British molecular biologist Sydney Brenner during one of the many recent panel discussions on the project.
At least some proportion of the DNA in the genomes of most organisms is in the form of these so-called middle repetitive sequences, ranging from 3% to as much as 70%: typically, the bigger the genome, the more repetitive DNA. There is a long tradition in biology that, seeing structures as extensive as these, argues that there must be a functional explanation for them.
Biologists have long speculated about the function of middle repetitive sequences, with regulation of gene expression being one popular notion. Loomis and Gilpin’s perspective, however, is that, although some middle repetitive sequences may have acquired a function once they have formed, there is no need to invoke function as a selective pressure for their origin.
Part of the Quotes of interest series.