Here’s the first sentence from a paper published recently in Genome by Vibhu Ranjan Prasad and Karin Isler:
Gene content, the number of genes coding for proteins, is correlated with genome size in both noneukaryotes and eukaryotes (Lynch and Conery 2003; Konstantinidis and Tiedje 2004; Gregory 2002, 2005).
The whole C-value enigma is based on the well-known discrepancy between genome size and gene number, which should be very clear from my papers (including the two that they cite).
From the first page of Gregory (2002):
Genome size bears no relationship to organismal complexity. If C-values are constant because DNA is the stuff of genes, then how could they be unrelated to gene number?
From the first page of Gregory (2005):
The discordance between genome size and organism complexity or gene number did not remain a paradox — that is, a pair of mutually exclusive truths — for very long. The discovery of non-coding DNA in the early 1970s explained the failure of DNA content to reflect the number of genes, and in so doing resolved the paradox. However, as with most significant advances in genetic knowledge, this finding raised more questions than it answered.
Not to mention this figure from Gregory (2005):
As is often the case, these authors focused only on sequenced genomes. For prokaryotes this isn’t such a problem because there is a much narrower range in genome size. The main bias in prokaryote genome sequencing has more to do with which species can be cultured in the lab. In eukaryotes, only species with small to medium-sized genomes have been sequenced, owing to the difficulty and expense of working with large genomes.
The authors make the valid point that we should incorporate phylogenetic information where possible in doing comparisons across species, such as when assessing potential relationships between genome size and gene number. However, they are missing the much more important bias in the data, which is that it only includes eukaryotes of smallish genome sizes.
And for crying out loud, how did such a blatant miscitation make it into the journal?
Prasad, V.R. and K. Isler. (2012). Assessment of phylogenetic structure in genome size – gene content correlations. Genome 55: 391-395.
Gregory, T.R. (2005). Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics 6: 699-708.
Gregory, T.R. (2002). A bird’s-eye view of the C-value enigma: genome size, cell size, and metabolic rate in the class Aves. Evolution 56: 121-130.