Yet another link between dinosaurs and birds.

Score another point for Darwin’s bulldog. T.H. Huxley, who argued nearly 140 years ago that birds are descended from dinosaurs (Huxley 1868), would undoubtedly be pleased with himself were he privy to the wealth of data that has accrued in support of his conjecture. Evidence from the rather good fossil records of dinosaurs and birds strongly supports this notion, and now two additional lines of evidence can be added in its favour.

The first relates to my own area of study, that of genome size diversity. In a recent paper in Nature, Organ et al. used the link between genome size and cell size in extant taxa (in this case, the size of osteocyte spaces) to infer genome sizes for extinct dinosaurs from various lineages (see Zimmer 2007 for a helpful overview). I have promised to post a detailed discussion of the genome size-flight issue another time, so for now I will only briefly note that the main finding is that saurischian dinosaurs, the lineage from within which birds are thought to have descended, had small genomes relative to other reptiles, including ornithischian dinosaurs. This suggests that genome size reduction began (but probably did not culminate) prior to the origin of avian flight and provides yet another intriguing link between saurischian dinosaurs and birds.

The second was reported in papers published by Asara et al. and Schweitzer et al. in the April 13 issue of Science, which revealed intriguing similarities between the amino acid sequences of collagen proteins from a 68-million-year-old Tyrannosaurus rex bone and those of a modern chicken. Specifically, amino acid sequence identity was higher with chicken (58%), and presumably with all other modern birds, than with frog or newt (51%). Note that collagen sequence is usually quite conserved and that what the authors were dealing with were fragments of proteins. Note also that some other interesting comparisons, especially with other living archosaurs (namely alligators or crocodiles), were not possible based on the currently available data. There is plenty more to do as follow-up to this study, but it is a very interesting result. A summary of the studies can be found in National Geographic and the New York Times.
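For readers unfamiliar with the measure, percent sequence identity is simply the share of aligned positions at which two sequences carry the same amino acid. The sketch below illustrates the calculation only. The function name and the short peptide strings are invented for illustration (they are not the T. rex, chicken, frog, or newt collagen sequences), and the sketch assumes the sequences have already been aligned to equal length with no gaps, which real comparisons of protein fragments would need to handle.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Count positions where both sequences carry the same residue.
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy fragments in one-letter amino acid code (NOT real data).
query = "GPAGPPGAKG"
ref_1 = "GPAGPQGAKG"  # differs from query at one position
ref_2 = "GPSGPQGARG"  # differs from query at three positions

print(f"query vs ref_1: {percent_identity(query, ref_1):.0f}%")  # 9 of 10 positions match
print(f"query vs ref_2: {percent_identity(query, ref_2):.0f}%")  # 7 of 10 positions match
```

With fragments this short a single substitution moves the score by ten percentage points, which is one reason to treat identity values from fragmentary proteins with some caution.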

This is a nice example of the way in which independent sources of information converge on a common conclusion in evolutionary biology, and of how new discoveries can simultaneously raise novel questions and provide innovative means by which to approach them.

———–

References

Asara, J.M., M.H. Schweitzer, L.M. Freimark, M. Phillips, and L.C. Cantley. 2007. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry. Science 316: 280-285.

Huxley, T.H. 1868. On the animals which are most nearly intermediate between birds and the reptiles. Annals and Magazine of Natural History, Series 4, 2: 66-75.

Organ, C.L., A.M. Shedlock, A. Meade, M. Pagel, and S.V. Edwards. 2007. Origin of avian genome size and structure in non-avian dinosaurs. Nature 446: 180-184.

Schweitzer, M.H., Z. Suo, R. Avci, J.M. Asara, M.A. Allen, F.T. Arce, and J.R. Horner. 2007. Analyses of soft tissue from Tyrannosaurus rex suggest the presence of protein. Science 316: 277-280.

Zimmer, C. 2007. Jurassic genome. Science 315: 1358-1359.


The discovery of DNA.

The following is an adapted excerpt from The Evolution of the Genome, © 2005 Elsevier Academic Press.

In the mid- to late 1800s (and to an extent, well into the 20th century), proteins were considered the most significant components of cells. Their very name reflects this fact, being derived from the Greek proteios, meaning “of the first importance”. In 1869, while developing techniques to isolate nuclei from white blood cells (which he obtained from pus-filled bandages, a plentiful source of cellular material in the days before antiseptic surgical techniques), 25-year-old Swiss biologist Friedrich Miescher stumbled across a phosphorus-rich substance which, he stated, “cannot belong among any of the protein substances known hitherto” (quoted in Portugal and Cohen 1977 [1]). He gave this substance the name nuclein and published his results in 1871 after confirmation of the remarkable finding by his advisor, Felix Hoppe-Seyler (for reviews, see Mirsky 1968; Portugal and Cohen 1977; Lagerkvist 1998; Wolf 2003) [2, 3].

Miescher continued his work on nuclein for many years, in part refuting claims that it was merely a mixture of inorganic phosphate salts and proteins. Yet Miescher never departed from the common proteinocentric wisdom, and instead suggested that the nuclein molecule served as little more than a storehouse of cellular phosphorus. In 1879, Walther Flemming coined the term chromatin (Gr. “colour”) in reference to the coloured components of cell nuclei observed after treatment with various chemical stains, and in 1888 Wilhelm Waldeyer used the term chromosome (Gr. “colour body”) to describe the threads of stainable material found within the nucleus. For some time, debate existed over whether or not chromatin and nuclein were one and the same. The argument was largely settled when Richard Altmann obtained protein-free samples of nuclein in 1889. As part of this work, Altmann proposed a more appropriate (and familiar) term for the substance, nucleic acid. Over time, the components of the nucleic acid molecules were deduced, and by the 1930s, nuclein had become desoxyribose nucleic acid, and later, deoxyribonucleic acid (DNA).

The important developments that took place over the ensuing decades are well documented (e.g., Portugal and Cohen 1977; Judson 1996), including early hypotheses of DNA’s structure (such as Phoebus Levene’s failed tetranucleotide hypothesis, or the incorrect helical model of Linus Pauling), Erwin Chargaff’s discovery of the constant ratio of the two purines with their respective pyrimidines, Rosalind Franklin’s x-ray crystallography of the DNA molecule, and other key developments leading up to Watson and Crick’s monumental synthesis in 1953 and the subsequent deciphering of the genetic code.

Miescher died of tuberculosis in 1895 at the age of 51. His was a major contribution to biology, as were the discoveries of countless other individuals up to and beyond the elucidation of DNA’s physical structure and the dawn of molecular genetics.

————

Notes

[1] I stumbled across this book at a used bookstore in Madison, Wisconsin at the 1999 SSE meeting. That was in the days before searches on Amazon.com, Google, and Wikipedia were easy and routine, and I was unaware that the book existed so I considered it quite a lucky find.

[2] Hoppe-Seyler also had his own journal, in which Miescher’s results were published, but was not a co-author on the paper. My, how things have changed!

[3] For more information about Miescher, see the following:


References

Judson, H.F. 1996. The Eighth Day of Creation. CSHL Press, Plainview, NY.

Lagerkvist, U. 1998. DNA Pioneers and Their Legacy. Yale University Press, New Haven, CT.

Miescher, F. 1871. Über die chemische Zusammensetzung der Eiterzellen. Hoppe-Seyler’s medizinisch-chemischen Untersuchungen 4: 441-460.

Mirsky, A.E. 1968. The discovery of DNA. Scientific American 218 (June): 78-88.

Portugal, F.H. and J.S. Cohen. 1977. A Century of DNA. MIT Press, Cambridge, MA.

Tracy, K. 2005. Friedrich Miescher and the Story of Nucleic Acid. Mitchell Lane Publishers.

Wolf, G. 2003. Friedrich Miescher, the man who discovered DNA.


Macaque genome published.

The April 13 issue of Science includes a collection of papers reporting and analyzing the sequence of the macaque (Macaca mulatta) genome. This marks the third primate genome to be sequenced (after human in 2001 and chimpanzee in 2005). Needless to say, comparisons of three genomes are far more informative than analyses involving only one or two sequences, and the papers contained in the special issue of Science already include some novel insights of evolutionary and medical significance that were previously unattainable. Carl Zimmer at The Loom provides a general summary of some key findings.

There is, rightly, a lot of interest in comparing genes among the three primate species. Non-coding DNA also gets a much-deserved amount of attention; in fact, this time we are fortunate enough to see an entire paper devoted to transposable elements. One general finding of interest relates to the number of transposable elements in the three genomes, which is remarkably similar (and quite high) in the three species. Here is the breakdown:


No wonder Ford Doolittle once remarked, probably only half-jokingly, that “our genomes … might be ironically viewed as vehicles for the replication of Alu sequences”. They do, after all, outnumber protein-coding genes by about 50 : 1.

The Genomes OnLine Database (GOLD) provides a list of other completed and forthcoming genome sequences. The macaque is only the latest in a rapidly growing list of genome projects that will continue to provide exciting new information about the evolution of genomes and the organisms carrying them.


A word about "junk DNA".

“It seems as though ‘junk DNA’ has become a legitimate jargon in a glossary of molecular biology. Considering the violent reactions this phrase provoked when it was first proposed in 1972, the aura of legitimacy it now enjoys is amusing, indeed.”

– Ohno and Yomo, 1991


The origin of “junk DNA”

Two main problems struck Susumu Ohno as particularly important in his seminal work on the genetics of evolutionary diversification. The first was the lack of correspondence between genome size (amount of DNA) and morphological complexity (taken as a proxy for gene number), which was a prominent topic of discussion in the early 1970s. As he noted in 1972, “If we take the simplistic assumption that the number of genes contained is proportional to the genome size, we would have to conclude that 3 million or so genes are contained in our genome. The falseness of such an assumption becomes clear when we realize that the genome of the lowly lungfish and salamanders can be 36 times greater than our own” (Ohno 1972a). In fact, Ohno and his colleagues were well aware that much of the DNA in the mammalian genome could not code for proteins, lest the mutational load become fatally high (e.g., Comings 1972; Ohno 1972b, 1974).
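The arithmetic behind this objection can be sketched in a few lines. The figures below are assumptions chosen for illustration rather than values taken from Ohno's text: a round 3 billion base pairs for the human genome, and 1,000 base pairs of DNA per gene, the latter picked simply because it reproduces the quoted figure of 3 million. The function name is likewise hypothetical.

```python
# Assumed round figures for illustration only (not taken from Ohno 1972a).
HUMAN_GENOME_BP = 3_000_000_000  # approximate haploid human genome size in base pairs
BP_PER_GENE = 1_000              # assumed DNA per gene, chosen to yield "3 million or so"


def naive_gene_count(genome_bp: int, bp_per_gene: int = BP_PER_GENE) -> int:
    """Gene number under the simplistic assumption that it scales with genome size."""
    return genome_bp // bp_per_gene


human_genes = naive_gene_count(HUMAN_GENOME_BP)
# The quoted passage puts lungfish and salamander genomes at up to 36 times our own.
lungfish_genes = naive_gene_count(36 * HUMAN_GENOME_BP)

print(f"naive human gene count:    {human_genes:,}")
print(f"naive lungfish gene count: {lungfish_genes:,}")
```

Under the proportionality assumption a lungfish would need 36 times as many genes as a human, which is the absurdity the quotation is pointing at.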

The second problem related to the conservative force of purifying selection and the limitations it places on the diversification of species. Ohno (1973) attempted to kill both of these vexatious birds with a single conceptual stone:

The points I wish to make are: 1) Natural selection is an extremely conservative force. So long as a particular function is assigned to a single gene locus in the genome, natural selection only permits trivial mutations of that locus to accompany evolution. 2) Only a redundant copy of a gene can escape from natural selection and while being ignored by natural selection can accumulate meaningful mutation to emerge as a new gene locus with a new function. Thus, evolution has been heavily dependent upon the mechanism of gene duplication. 3) The probability of a redundant copy of an old gene emerging as a new gene, however, is quite small. The more likely fate of a base sequence which is not policed by natural selection is to become degenerate. My estimate is that for every new gene locus created about 10 redundant copies must join the ranks of functionless DNA base sequence. 4) As a consequence, the mammalian genome is loaded with functionless DNA.

The corpulent genomes of dipnoans and urodele amphibians were similarly accounted for under this view: “Lungfish and salamanders clearly show the tragic consequences of exclusive dependence upon tandem duplication” (Ohno 1970, p.96). Of course, this differs from current thinking about lungfish and salamander genome size, but that’s another story.

To Ohno, this situation not only permitted, but also paralleled, the evolution of life at large. As he put it, “The earth is strewn with fossil remains of extinct species; is it any wonder that our genome too is filled with the remains of extinct genes?” (Ohno 1972a). The primary outcome of this gene duplication mechanism would not be the generation of new genes, but the deactivation of redundant copies – just as extinction has been the fate of more than 99% of species that have ever lived (Raup 1991). Once purifying selection ceased to shelter gene sequences from change, they would be free to mutate and, if one imagines a set of three gene copies initially sharing the same sequence, it is likely that “in a relatively short time, two of the three duplicates would join the ranks of ‘garbage DNA’” (Ohno 1970, p.62).

In Ohno’s usage, as in the vernacular, “garbage” refers to both the loss of function and the lack of any further utility (it was once useful, but now it isn’t). “Garbage DNA” proved to be an unsuccessful meme, but its essence remains in the wildly popular term coined by Ohno two years later – “junk DNA”. Thus, as Ohno (1972b) stated, “at least 90% of our genomic DNA is ‘junk’ or ‘garbage’ of various sorts”. Interestingly, Ohno mentioned “junk DNA” only in the titles of two of his papers (1972a, 1973), and invoked the term only once in passing in a third (1972b). Comings (1972), on the other hand, gave what must be considered the first explicit discussion of the nature of “junk DNA”, and was the first to apply the term to all non-coding DNA.

There are several independent mechanisms by which non-coding DNA can accumulate in the genome. Gene duplication and deactivation is one such mechanism, but this, we now know, applies to only a minority of the non-coding sequences. Nevertheless, the term “junk DNA” was used in some early general descriptions of non-coding elements, including heterochromatin. For example, Comings (1972) noted that:

It has frequently been suggested that the DNA of genetically inactive heterochromatin represents the degenerate and useless DNA of the genome. However, heterochromatin rarely constitutes more than 20% of the genome. This suggests that there are two categories of junk DNA, (1) DNA of constitutive heterochromatin which is neither transcribed nor translated, and (2) nonheterochromatic junk DNA which is probably transcribed, but not translated. This distinction adds one more dimension to the mystery of heterochromatic DNA. Why is it singled out to be nontranscribable when being nontranslatable seems adequate for most of the junk DNA? Perhaps there is clustered junk (heterochromatic DNA) and nonclustered junk, just as there is clustered repetitious DNA (satellite DNA) and nonclustered repetitious DNA.


Later, Ohno himself began applying the term “junk” to heterochromatic, intergenic, and intronic sequences: “Much of this junk DNA occurs as large heterochromatin blocks, often localized in pericentric regions of mammalian chromosomes, or as intergenic spacers and intervening sequences within genes.” (Ohno 1985).

It is clear, however, that Ohno (1982) believed all these sequences were produced by gene duplication:

This great preponderance of intergenic spacers in the euchromatic region is due mostly to the extreme inefficacy of the mechanism of gene duplication as a means of creating new genes with altered active sites. For every redundant copy of the pre-existent gene that emerged triumphant as a new gene, hundreds of other copies must have degenerated to join the rank of junk DNA.


This mechanism alone was considered capable of explaining the vast intergenic regions of eukaryotic genomes. According to Ohno (1985):


Indeed, the abundance of pseudogenes (recent degenerates) attests to the inefficacy of gene duplication as a means of acquiring new genes with novel functions. The net consequence of hundreds of millions of years of continuous gene duplication is the desertification of the euchromatic region of modern vertebrates; the average distance between still functioning gene loci becoming progressively longer.


Junk DNA, function, and non-function

“Junk DNA” had a specific meaning when it first was formulated. It was meant to describe the loss of protein-coding function by deactivated gene duplicates, which in turn were believed to constitute the bulk of eukaryotic genomes. As different types of non-coding DNA were identified, the concept of gene duplication as their source – and therefore “junk DNA” as their descriptor – found new and broader application. However, it is now clear that most non-coding DNA is not produced by this mechanism, and is therefore not accurately described as “junk” in the original sense.

The term “pseudogene” — the technical term for functionless gene copies — was not coined until 1977 (Jacq et al. 1977), and the more explicit definition of these sequences that specified non-function in terms of protein-coding emerged almost a decade later. So, although Ohno’s original description of “junk DNA” obviously involved what are now called “pseudogenes”, there was no initial requirement for non-function. As Comings (1972) put it, “Being junk doesn’t mean it is entirely useless. Common sense suggests that anything that is completely useless would be discarded.” (This is what Sydney Brenner meant by the distinction between “trash” or “rubbish”, which one throws away, and “junk”, which one keeps; Brenner 1998). Of course, Ohno did reject the notion of protein-coding function for the extinct genes. As he described it, “a functional gene locus is defined as that DNA base sequence which may sustain deleterious mutations”, and from this it followed that “a DNA base sequence in which all sorts of mutational changes are permissible is obviously not contributing to the well-being of an organism, and for this very reason, it has no function” (Ohno 1973). On the other hand, and in the same publication, Ohno (1973) suggested a different role for non-coding DNA: “The bulk of functionless DNA in the mammalian genome may serve as a damper to give a reasonably long cell generation time (12 hours or so instead of several minutes)”.

From the very beginning, the concept of “junk DNA” has implied non-functionality with regard to protein-coding, but left open the question of sequence-independent impacts (perhaps even functions) at the cellular level. “Junk DNA” may now be taken to imply total non-function and is rightly considered problematic for that reason, but no such tacit assumption was present in the term when it was coined.

Two groups of people, though maximally divergent in their reasons for so doing, have been driven by a philosophical need to identify functions for all non-coding DNA. The first includes strict adaptationists, among whom it was often assumed that all non-coding DNA, by virtue of its very existence, must be endowed with some as-yet-unknown function of critical importance: “The very fact that amplified sequences have been maintained, withstanding rigours of selection, indicates some adaptive significance” (Sharma 1985).

We may also consider the following discussion comments recorded at the end of Ohno (1973):

Yunis: “This is what I emphasized earlier, that this DNA must have a functional value since nothing is known so widespread and universal in nature that has proven useless.”

Fraccaro: “Well, there is an exception to that rule. A lot of us have permanent positions at the University but are considered by others (mainly by students) meaningless and of no utility whatsoever.”


These examples aside, it seems likely that most evolutionary biologists today could tolerate a conclusion, if such were rendered, that a significant fraction of non-coding DNA is functionless. This is not true of the second group in question, whose passion for function is unrivaled. As Dawkins (1999) suggested, “creationists might spend some earnest time speculating on why the Creator should bother to litter genomes with untranslated pseudogenes and junk tandem repeat DNA”. In fact, many have done so (e.g., Gibson 1994; Wieland 1994; Batten 1998; Jerlström 2000; Walkup 2000; Woodmorappe 2000; Bergman 2001). Although apparently “not enough is yet known about eukaryotic genomes to construct a comprehensive creationist model of pseudogenes” (Woodmorappe 2000), the theme that undergirds all of these discussions is that all non-coding DNA must, a priori, be functional.

To satisfy this expectation, creationist authors (borrowing, of course, from the work of molecular biologists, as they do no such research themselves) simply conflate the various types of non-coding DNA, and mistakenly suggest that functions discovered for a few examples of some types of non-coding sequences indicate functions for all (see Max 2002 for a cogent rebuttal to these creationist confusions). Case in point: a few years ago, much ado was made of Beaton and Cavalier-Smith’s (1999) titular proclamation, based on a survey of cryptomonad nuclear and nucleomorphic genomes, that “eukaryotic non-coding DNA is functional”. The point was evidently lost that the function proposed by Beaton and Cavalier-Smith (1999) was based entirely on coevolutionary interactions between nucleus size and cell size.

Those who complain about a supposed unilateral neglect of potential functions for non-coding DNA simply have been reading the wrong literature. In fact, quite a lengthy list of proposed functions for non-coding DNA could be compiled (for an early version, see Bostock 1971). Examples include buffering against mutations (e.g., Comings 1972; Patrushev and Minkevich 2006), retroviruses (e.g., Bremermann 1987), or fluctuations in intracellular solute concentrations (Vinogradov 1998), serving as binding sites for regulatory molecules (Zuckerkandl 1981), facilitating recombination (e.g., Comings 1972; Gall 1981; Comeron 2001), inhibiting recombination (Zuckerkandl and Hennig 1995), influencing gene expression (Britten and Davidson 1969; Georgiev 1969; Nowak 1994; Zuckerkandl and Hennig 1995; Zuckerkandl 1997), increasing evolutionary flexibility (e.g., Britten and Davidson 1969, 1971; Jain 1980; reviewed critically in Doolittle 1982), maintaining chromosome structure and behaviour (e.g., Walker et al. 1969; Yunis and Yasmineh 1971; Bennett 1982; Zuckerkandl and Hennig 1995), coordinating genome function (Shapiro and von Sternberg 2005), and providing multiple copies of genes to be recruited when needed (Roels 1966).

Does non-coding DNA have a function? Some of it does, to be sure. Some of it is involved in chromosome structure and cell division (e.g., telomeres, centromeres). Some of it is undoubtedly regulatory in nature. Some of it is involved in alternative splicing (Kondrashov and Koonin 2003). A fair portion of it in various genomes shows signs of being evolutionarily conserved, which may imply function (Bejerano et al. 2004; Andolfatto 2005; Kondrashov 2005; Woolfe et al. 2005; Halligan and Keightley 2006). On the other hand, the largest fraction is composed of transposable elements, some of which become co-opted by the host genome, some of which play a major role in generating genomic variation, some of which may be involved in cellular stress response, and yet others of which remain detrimental to host fitness (Kidwell and Lisch 2001; Biémont and Vieira 2006). The upshot is that some non-coding DNA is most certainly functional, but when it is, this usually makes sense only in an evolutionary context, particularly through processes like co-option. More broadly, those who would attribute a universal function to non-coding DNA must bear the following in mind: any proposed function for all non-coding DNA must explain why an onion or a grasshopper needs five times more of it than anyone reading this sentence.

Should “junk” be thrown out?

There is nothing wrong with a word taking on a new meaning as knowledge changes – that is, unless reference to an original (and outmoded) sense lingers as a source of confusion, or the term expands so much as to lose contact with an initially accurate definition. Indeed, even the term “evolution” is technically a misnomer since its etymology implies an “unfolding”, as of a pre-determined developmental program (see Bowler 1975). The objection raised here is not to terms that change in usage per se, but to those whose shifting usage involves collecting or retaining unwanted conceptual baggage. This is especially relevant when the baggage is toted surreptitiously (note that no serious biologist takes “evolution” to mean a pre-determined unfolding but that ideas of inherent “progress” have been almost impossible to shake; see Gould 1996; Ruse 1996).

“Junk DNA”, which originally was coined in reference to now-functionless gene duplicates (i.e., true broken-down “junk”), is now used as “a catch-all phrase for chromosomal sequences with no apparent function” (Moore 1996). Its current usage also implies a lack of function which is accurate by definition for pseudogenes in regard to protein-coding, but which does not hold for all non-coding elements. The term has deviated from or outgrown its original use, and its continued invocation is non-neutral in its expression – and generation – of conceptual biases.

“Junk DNA” is not the only offender. Non-coding DNA has been called by many names that have had the same pejorative undertones (intentional or not) implying uselessness, if not outright wastefulness. Examples include excess DNA (Zuckerkandl 1976; Doolittle and Sapienza 1980), surplus or nonessential or degenerate or silent DNA (Comings 1972; Gilbert 1978), quiet DNA (Lefevre 1971), garbage DNA (Ohno 1970), non-informational or nonsense DNA (Ohno 1972b), worthless DNA (Ohno 1973), trivial DNA (Ohno 1974), vestigial DNA (Loomis 1973), redundant DNA (Vinogradov 1998), supplementary DNA (Hutchinson et al. 1980), secondary DNA (Hinegardner 1976), and incidental DNA (Jain 1980).

As Gould (2002, p.503) stated, “A rose may retain its fragrance under all vicissitudes of human taxonomy, but never doubt the power of a name to shape and direct our thoughts”. Because it is generally no longer applied in its original meaningful sense, because the type of DNA to which it actually relates now has a more descriptive name (pseudogenes), and because of its connotations of total phenotypic inertness, the term “junk DNA” should probably be abandoned in favour of less subjective terminology. “Non-coding DNA” serves this purpose quite well.

Concluding remarks

It is an exciting time in genome biology. Aspects of genomic form and function that were largely inconceivable only a few decades ago are now being revealed on a daily basis. It should come as no surprise (and indeed, it probably does not) that new roles are being discovered for non-coding DNA and that some of yesterday’s buzzwords — including “junk DNA” — are destined for the dustbin. However, extrapolating each report that a given small segment of DNA may be functional to mean that all non-coding DNA is vital is as counterproductive as dismissing non-coding DNA as totally non-functional. Genomes are complex, and there is little use in approaching them from a simplistic point of view.

——

Andolfatto, P. 2005. Adaptive evolution of non-coding DNA in Drosophila. Nature 437: 1149-1152.

Batten, D. 1998. ‘Junk’ DNA (again). Creation Ex Nihilo Technical Journal 12: 5.

Beaton, M.J. and T. Cavalier-Smith. 1999. Eukaryotic non-coding DNA is functional: evidence from the differential scaling of cryptomonad genomes. Proceedings of the Royal Society of London, Series B 266: 2053-2059.

Bejerano, G., M. Pheasant, I. Makunin, S. Stephen, W.J. Kent, J.S. Mattick, and D. Haussler. 2004. Ultraconserved elements in the human genome. Science 304: 1321-1325.

Bennett, M.D. 1982. Nucleotypic basis of the spatial ordering of chromosomes in eukaryotes and the implications of the order for genome evolution and phenotypic variation. In Genome Evolution (eds. G.A. Dover and R.B. Flavell), pp. 239-261. Academic Press, New York.

Bergman, J. 2001. The functions of introns: from junk DNA to designed DNA. Perspectives on Science and Christian Faith 53: 170-178.

Biémont, C. and C. Vieira. 2006. Junk DNA as an evolutionary force. Nature 443: 521-524.

Bostock, C. 1971. Repetitious DNA. Advances in Cell Biology 2: 153-223.

Bowler, P.J. 1975. The changing meaning of “evolution”. Journal of the History of Ideas 36: 95-114.

Bremermann, H.J. 1987. The adaptive significance of sexuality. In The Evolution of Sex and its Consequences (ed. S.C. Stearns), pp. 135-161. Birkhauser Verlag, Basel.

Brenner, S. 1998. Refuge of spandrels. Current Biology 8: R669.

Britten, R.J. and E.H. Davidson. 1969. Gene regulation for higher cells: a theory. Science 165: 349-357.

Britten, R.J. and E.H. Davidson. 1971. Repetitive and non-repetitive DNA sequences and a speculation on the origins of evolutionary novelty. Quarterly Review of Biology 46: 111-138.

Castillo-Davis, C.I. 2005. The evolution of noncoding DNA: how much junk, how much func? Trends in Genetics 21: 533-536.

Comeron, J.M. 2001. What controls the length of noncoding DNA? Current Opinion in Genetics & Development 11: 652-659.

Comings, D.E. 1972. The structure and function of chromatin. Advances in Human Genetics 3: 237-431.

Dawkins, R. 1999. The “information challenge”: how evolution increases information in the genome. Skeptic 7: 64-69.

Doolittle, W.F. and C. Sapienza. 1980. Selfish genes, the phenotype paradigm and genome evolution. Nature 284: 601-603.

Doolittle, W.F. 1982. Selfish DNA after fourteen months. In Genome Evolution (eds. G.A. Dover and R.B. Flavell), pp. 3-28. Academic Press, New York.

Gall, J.G. 1981. Chromosome structure and the C-value paradox. Journal of Cell Biology 91: 3s-14s.

Georgiev, G.P. 1969. On the structural organization of operon and the regulation of RNA synthesis in animal cells. Journal of Theoretical Biology 25: 473-490.

Gibbs, W.W. 2003. The unseen genome: gems among the junk. Scientific American 289(5): 46-53.

Gibson, L.J. 1994. Pseudogenes and origins. Origins 21: 91-108.

Gilbert, W. 1978. Why genes in pieces? Nature 271: 501.

Gould, S.J. 1996. Full House. Harmony Books, New York.

Gould, S.J. 2002. The Structure of Evolutionary Theory. Harvard University Press, Cambridge, MA.

Halligan, D.L. and P.D. Keightley. 2006. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Research 16: 875-884.

Hinegardner, R. 1976. Evolution of genome size. In Molecular Evolution (ed. F.J. Ayala), pp. 179-199. Sinauer Associates, Inc., Sunderland.

Hutchinson, J., R.K.J. Narayan, and H. Rees. 1980. Constraints upon the composition of supplementary DNA. Chromosoma 78: 137-145.

Jacq, C., J.R. Miller, and G.G. Brownlee. 1977. A pseudogene structure in 5S DNA of Xenopus laevis. Cell 12: 109-120.

Jain, H.K. 1980. Incidental DNA. Nature 288: 647-648.

Jerlström, P. 2000. Pseudogenes: are they non-functional? Creation Ex Nihilo Technical Journal 14: 15.

Kidwell, M.G. and D.R. Lisch. 2001. Transposable elements, parasitic DNA, and genome evolution. Evolution 55: 1-24.

Kondrashov, F.A. and E.V. Koonin. 2003. Evolution of alternative splicing: deletions, insertions and origin of functional parts of proteins from intron sequences. Trends in Genetics 19: 115-119.

Kondrashov, A.S. 2005. Fruitfly genome is not junk. Nature 437: 1106.

Lefevre, G. 1971. Salivary chromosome bands and the frequency of crossing over in Drosophila melanogaster. Genetics 67: 497-513.

Loomis, W.F. 1973. Vestigial DNA? Developmental Biology 30: F3-F4.

Makalowski, W. 2003. Not junk after all. Science 300: 1246-1247.

Max, E.E. 2002. Plagiarized errors and molecular genetics: another argument in the evolution-creation controversy. Talk.Origins Archive.

Moore, M.J. 1996. When the junk isn’t junk. Nature 379: 402-403.

Nowak, R. 1994. Mining treasures from ‘junk DNA’. Science 263: 608-610.

Ohno, S. 1970a. Evolution by Gene Duplication. Springer-Verlag, New York.

Ohno, S. 1970b. The enormous diversity in genome sizes of fish as a reflection of nature’s extensive experiments with gene duplication. Transactions of the American Fisheries Society 1970: 120-130.

Ohno, S. 1972. So much “junk” DNA in our genome. In Evolution of Genetic Systems (ed. H.H. Smith), pp. 366-370. Gordon and Breach, New York.

Ohno, S. 1973. Evolutional reason for having so much junk DNA. In Modern Aspects of Cytogenetics: Constitutive Heterochromatin in Man (ed. R.A. Pfeiffer), pp. 169-173. F.K. Schattauer Verlag, Stuttgart, Germany.

Ohno, S. 1974. Chordata 1: protochordata, cyclostomata, and pisces. In Animal Cytogenetics, Vol. 4 (ed. B. John), pp. 1-92. Gebrüder Borntraeger, Berlin.

Ohno, S. 1982. The common ancestry of genes and spacers in the euchromatic region: omnis ordinis hereditarium a ordinis priscum minutum. Cytogenetics and Cell Genetics 34: 102-111.

Ohno, S. 1985. Dispensable genes. Trends in Genetics 1: 160-164.

Patrushev, L.I. and I.G. Minkevich. 2006. Eukaryotic noncoding DNA sequences provide genes with an additional protection against chemical mutagens. Russian Journal of Bioorganic Chemistry 32: 1068-1620.

Petsko, G.A. 2003. Funky, not junky. Genome Biology 4: 104.

Raup, D.M. 1991. Extinction. W.W. Norton & Co., New York.

Roels, H. 1966. “Metabolic” DNA: a cytochemical study. International Review of Cytology 19: 1-34.

Ruse, M. 1996. Monad to Man. Harvard University Press, Cambridge, MA.

Shapiro, J.A. and R. von Sternberg. 2005. Why repetitive DNA is essential to genome function. Biological Reviews 80: 227-250.

Sharma, A.K. 1985. Chromosome architecture and additional elements. In Advances in Chromosome and Cell Genetics (eds. A.K. Sharma and A. Sharma), pp. 285-293. Oxford and IBH Publishing Co., New Delhi.

Slack, F.J. 2006. Regulatory RNAs and the demise of ‘junk’ DNA. Genome Biology 7: 328.

Vinogradov, A.E. 1998. Buffering: a possible passive-homeostasis role for redundant DNA. Journal of Theoretical Biology 193: 197-199.

Walker, P.M.B., W.G. Flamm, and A. McLaren. 1969. Highly repetitive DNA in rodents. In Handbook of Molecular Cytology (ed. A. Lima-de-Faria), pp. 52-66. North-Holland Publishing Co., Amsterdam.

Walkup, L.K. 2000. Junk DNA: evolutionary discards or God’s tools? Creation Ex Nihilo Technical Journal 14: 18-30.

Wickelgren, I. 2003. Spinning junk into gold. Science 300: 1646-1649.

Wieland, C. 1994. Junk moves up in the world. Creation Ex Nihilo Technical Journal 8: 125.

Woodmorappe, J. 2000. Are pseudogenes ‘shared mistakes’ between primate genomes? Creation Ex Nihilo Technical Journal 14: 55-71.

Woolfe, A., M. Goodson, D.K. Goode, P. Snell, G.K. McEwen, T. Vavouri, S.F. Smith, P. North, H. Callaway, K. Kelly, K. Walter, I. Abnizova, W. Gilks, Y.J.K. Edwards, J.E. Cooke, and G. Elgar. 2005. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biology 3: e7.

Yunis, J.J. and W.G. Yasmineh. 1971. Heterochromatin, satellite DNA, and cell function. Science 174: 1200-1209.

Zuckerkandl, E. 1976. Gene control in eukaryotes and the C-value paradox: “Excess” DNA as an impediment to transcription of coding sequences. Journal of Molecular Evolution 9: 73-104.

Zuckerkandl, E. and W. Hennig. 1995. Tracking heterochromatin. Chromosoma 104: 75-83.

Zuckerkandl, E. 1997. Junk DNA and sectorial gene expression. Gene 205: 323-343.

__________

Update: At Sandwalk, Larry Moran argues that the term “junk DNA” is “a good term”, “an accurate term”, and “a useful term”. You can read my response in the comments section of the original post or in my re-post on this blog.


Units of measurement.

There is sometimes confusion surrounding the units employed in genome size publications. Genome sizes (the amount of DNA per copy of a genome) have traditionally been given in units of mass, namely picograms (1 pg = 10^-12 g). More recently, people have been interested in knowing the number of base pairs per genome rather than genomic mass, which makes sense if one wishes to sequence those base pairs.

The fact is that essentially all genome size measurements represent relative estimates based on the density of stain or fluorescence of dye as compared to a standard (more on this in a later post), with the exception of truly complete genomic sequencing, which is very uncommon for eukaryotes. The choice of units into which these relative data are converted is simply a matter of preference, and if one knows the mass of a given nucleotide then one can easily convert between picograms and base pairs. Indeed, Dolezel et al. (2003) did this calculation based on the following data:


The net result is that one easily can convert between picograms and base pairs as follows:

DNA content in bp = (0.978 x 10^9) x DNA content in pg

DNA content in pg = DNA content in bp / (0.978 x 10^9)
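For anyone who would rather not do this by hand, here is a minimal sketch of the two formulas above in Python. The function names and the example C-value are my own illustrative choices, not anything from Dolezel et al. (2003); only the 0.978 x 10^9 factor comes from that paper.

```python
# Conversion factor from Dolezel et al. (2003): base pairs per picogram of DNA.
BP_PER_PG = 0.978e9


def pg_to_bp(picograms: float) -> float:
    """Convert a DNA content given in picograms to base pairs."""
    return picograms * BP_PER_PG


def bp_to_pg(base_pairs: float) -> float:
    """Convert a DNA content given in base pairs to picograms."""
    return base_pairs / BP_PER_PG


if __name__ == "__main__":
    # Hypothetical genome size, chosen only to illustrate the round trip.
    c_value_pg = 3.5
    as_bp = pg_to_bp(c_value_pg)
    print(f"{c_value_pg} pg is about {as_bp / 1e6:.0f} Mbp")
    print(f"{as_bp:.0f} bp is about {bp_to_pg(as_bp):.2f} pg")
```

Keeping the factor in a single named constant makes it easy to swap in a different value should one prefer another estimate of mean nucleotide-pair mass.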

Yes, there will be a little bit of error involved if there are biases toward AT or GC in the genome. However, “by using the data in Table 1, relative weights of nucleotide pairs can be calculated as follows: AT = 615.3830 and GC = 616.3711, bearing in mind that one phosphodiester linkage involves a loss of one H2O molecule” (Dolezel et al. 2003). In other words, the difference is very slight and is negligible relative to the experimental error inherent in any genome size estimate.
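To get a feel for how small that bias is, the rough sketch below recomputes the conversion factor from the two relative weights quoted above. It assumes those weights can be treated as masses in daltons and uses the standard value of the atomic mass unit in grams; the function name and the idea of passing in a GC fraction are mine, for illustration only.

```python
# Relative weights of nucleotide pairs quoted from Dolezel et al. (2003).
AT_WEIGHT = 615.3830
GC_WEIGHT = 616.3711

# One atomic mass unit (dalton) in grams; assumes the weights above are in daltons.
DALTON_IN_GRAMS = 1.66053906660e-24

PICOGRAM_IN_GRAMS = 1e-12


def bp_per_pg(gc_fraction: float) -> float:
    """Base pairs per picogram for a genome with the given GC fraction (0 to 1)."""
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    mean_pair_weight = gc_fraction * GC_WEIGHT + (1.0 - gc_fraction) * AT_WEIGHT
    return PICOGRAM_IN_GRAMS / (mean_pair_weight * DALTON_IN_GRAMS)


if __name__ == "__main__":
    # Compare an all-AT genome, a balanced genome, and an all-GC genome.
    for gc in (0.0, 0.5, 1.0):
        print(f"GC fraction {gc:.1f}: {bp_per_pg(gc) / 1e9:.4f} x 10^9 bp per pg")
```

Even the two impossible extremes, a genome made entirely of AT pairs versus one made entirely of GC pairs, differ by only a fraction of a percent, and a balanced genome lands very close to the 0.978 x 10^9 figure.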

To put it very simply, the units are interchangeable, with

1 pg = 978 Mbp.

__________

Dolezel, J., J. Bartos, H. Voglmayr, and J. Greilhuber. 2003. Nuclear DNA content and genome size of trout and human. Cytometry 51A: 127-128.


A nod to (and from) the Sandwalk.

Larry Moran recently gave the lab a nice nod on his Sandwalk blog. It seems fitting to have one of the first posts on this new blog be an act of reciprocity. So here, in photographic form, is a personal nod to Sandwalk from the Sandwalk.

I’ll probably post a more detailed discussion of the association between genome size, metabolic rate, and flight another day. In the meantime, you can check out the original paper by Chris Organ and colleagues in Nature and the very nice piece about it by Carl Zimmer in Science that prompted Larry’s post. There is also a discussion by Greg Laden which seems pretty reasonable overall.
_________________________________

Further reading:

Gregory, T.R. 2002. A bird’s-eye view of the C-value enigma: genome size, cell size, and metabolic rate in the class Aves. Evolution 56: 121-130.

Organ, C.L., A.M. Shedlock, A. Meade, M. Pagel, and S.V. Edwards. 2007. Origin of avian genome size and structure in non-avian dinosaurs. Nature 446: 180-184.

Zimmer, C. 2007. Jurassic genome. Science 315: 1358-1359.


My grad student made me do it.

One of the joys of advising graduate students is that you get to interact with people who have widely divergent backgrounds, expertise, and interests. It so happens that the newest member of the lab is an avid blogger, and has managed to convince me that this can be a useful means of communication (and perhaps a somewhat productive distraction from tedious duties). The net result, after consultation with our resident expert on options for the name, is Genomicron — a blog devoted to exploring genomic diversity and evolution. Due to time constraints I expect to only post semi-regularly for the foreseeable future, but this will provide a venue for discussing interesting and exciting developments in genomics, biodiversity science, and evolutionary biology*. Welcome.

TRG

* Assuming, of course, that anyone other than my students will read it.