Genome size is good for you.

I imagine that every practicing scientist has experienced, in one form or another, the tendency of many non-scientists to expect all research to be directly beneficial to human health and well-being. I used to respond facetiously to these kinds of expectations when expressed by friends or family members, with something along the lines of “My work has absolutely no practical applications to human welfare whatsoever”.

Of course, this is not true. Genome size is becoming very relevant to fields of inquiry that are likely to have major significance for medicine. Notably, genome size data provide an important indication of the cost and difficulty of sequencing a given genome, and thus represent a prime criterion in the choice of sequencing targets. As an example, I performed a genome size estimate for Biomphalaria glabrata, a planorbid snail that serves as an intermediate host for the trematode flatworm Schistosoma mansoni, which causes the debilitating disease known as schistosomiasis. The genome of B. glabrata is one of the smallest so far reported for a gastropod, and is now being sequenced (along with that of S. mansoni).

More recently, Jenner and Wills (2007) made explicit mention of genome size as an important factor in deciding on the next set of models for evo-devo studies. Discoveries regarding the fundamental genetic underpinnings of development have obvious implications for medical science, and here, too, genome size is increasingly seen as important. As they put it,

Whole-genome sequences are an increasingly important resource for many biological disciplines, including evo–devo¹⁵,⁴⁹,⁵⁰. However, financial and technical constraints mean that there is currently a preference for species with small genomes. This compounds the bias that is already introduced by the big six. First, putatively general conclusions about genome evolution might actually be specific to those smaller genomes that have been fully sequenced. For example, when focusing only on sequenced genomes, a close correspondence between genome size and gene number in eukaryotes is observed. The C-value paradox becomes apparent only when genome-size data from non-sequenced genomes is included⁵¹. Second, there are important genetic, morphological, physiological and ecological correlates of genome size in a range of animals and plants⁵¹,⁵². Some correlates seem ubiquitous in animals and plants, such as those between genome size and cell size, body size and the inverse of developmental rate⁵². Others are group specific: genome size correlates mostly with metabolic rate in homeotherms, but with developmental type and ecology in amphibians⁵³, and is positively correlated with egg size in copepods, plethodontid salamanders and fishes⁵¹,⁵²,⁵⁴. Studying these correlated traits in phylogenetically disparate taxa could illuminate the relationships between small genome size and rapid development, as well as the evolution of strongly cell-lineage-dependent development in taxa such as tunicates and nematodes, and the partial fragmentation of their Hox clusters⁵⁵,⁵⁶.

References 51, 52, and 53 in that paragraph are papers of mine, so again I am forced to admit that my work may have some practical application after all.

My main focus is on genome size diversity in eukaryotes, which mostly means differences among species in the abundance of noncoding DNA. In bacteria, most of the genome is composed of protein-coding genes, so unlike in eukaryotes there is a very strong correlation between genome size and gene number. Genome size is generally small in parasites and endosymbionts and larger in free-living species (probably because population bottlenecks and relaxed selection on gene function result in gene loss by deletion bias in bacteria associated with hosts [Mira et al. 2001]).

But this observation is not the link between genome size and human health that I had in mind for this post. In this month’s issue of Antimicrobial Agents and Chemotherapy, Steven Projan argues that genome size is associated with the evolution of antibiotic resistance in bacteria. In Dr. Projan’s own words,

It is observed here that the ability of a given bacterium to evolve toward a multidrug resistance phenotype is a function of genome size. In Table 1, a number of examples are provided, but even an expanded analysis shows that this observation holds true. That is, the larger the genome the greater the propensity of a bacterium to display multidrug resistance phenotypes and the smaller the genome the less likely it is that antibacterial resistance will emerge and disseminate within that species. What is proposed here is that, just as there is a continuum of genome sizes among bacteria, there is a continuum in the ability or propensity of a bacterium to become “multidrug resistant” and that continuum is reflected in the size of the genome. This is not to say that we do not observe resistance to certain agents even in organisms with the smallest genomes (macrolide resistance appears in virtually every pathogen at some level). There is probably a solid biological reason for this observation; organisms with larger genomes are more adaptable to environmental changes because they have more (genetic) information to draw upon. It appears that organisms with smaller genomes have become more “specialized,” residing in particular environmental niches (Treponema pallidum and the Chlamydiae are cases in point), and their lack of versatility in adapting to different environments is also manifest in an inability to develop mechanisms for coping with antibiotics. Indeed, we have learned that virtually each and every time a bacterium either acquires a novel resistance determinant or a mutant strain arises with decreased susceptibility to an antibacterial drug, the bacterium experiences a “fitness burden.” With time, compensatory mutations are selected in which the bacterium accumulates mutations that allow for something like wild-type growth in a strain that is now phenotypically resistant (e.g., topA mutations in gyrB mutant strains). Bacteria with larger genomes simply have a greater opportunity to develop these compensatory mutations. It must be emphasized that it does not matter whether we are discussing the acquisition of a novel resistance gene as opposed to a mutation that alters the target or results in up-regulation of an efflux pump. The accumulating evidence tells us that all require some form of adaptation. Another consequence of this phenomenon is that antibiotic cycling in health care settings is unlikely to result in a reversion of the local microflora to susceptibility as the compensatory mutations “lock in” the resistance phenotype.

He continues by noting, “I and several of those I have discussed this observation with were perplexed that it had not previously been articulated. Although to be fair, others have suggested it is a trivial, if not nonsensical, observation and worthy only of cocktail party conversation… in fact, I believe that this is an important guide as to where and which organisms we actually need novel antibacterial agents for.” Projan blames an overemphasis on individual organisms with small genomes for the overlooking of this potentially important pattern. In other words, it is the sort of thing that can only be applied to human health research if one takes a broad view of genomic diversity.

As much fun as it is to study genome size for purely academic reasons, it seems it actually may be good for us too.

3 thoughts on “Genome size is good for you.”

rajeev on April 25, 2007 at 4:16 am said:

From my limited IT support experience – the newer sequencing technologies like 454 produce raw images of the plate. As such, it does not matter how large the genome is – the cost of a sequencing run is the same whether the genome is 5Mbp or 500Mbp. The analyses of that genome can still be priced using the genome size, however.

TR Gregory on April 25, 2007 at 4:44 am said:

Certainly, the cost of sequencing can be expected to continue to plummet. However, so far what you’re describing applies mostly to bacteria and archaea, which have tiny genomes. Animal genome sizes range from 30Mb to 130,000Mb. Vertebrate genome sizes alone vary from about 400Mb to 130,000Mb. Flowering plants range from 60Mb to 124,000Mb. For the time being, genome size remains an important (and more or less required) datum in sequencing proposals. It is also the case that a large and (presumably) highly repetitive genome sequence will be more challenging to assemble than a simple, mostly single-copy genome. Not surprisingly, mammals (~3,000Mb) are at the top end of the range of genome sizes sequenced so far, which provides a very biased view of genome size if this is the only source of data considered.

TheBrummell on April 26, 2007 at 8:05 am said:

“As much fun as it is to study genome size for purely academic reasons, it seems it actually may be good for us too.”

D’oh! I wanted to get into a field that was as completely divorced from human health as possible. Maybe if I focus on invertebrates and ignore the human side as much as possible it will just go away.
