Bacterial genomes and evolution.

The seminar that I give most often when I am invited to speak at other universities begins with a brief introduction to genomes, sets up some comparisons between bacteria and eukaryotes, and then moves into a short overview of bacterial genome size evolution before spending the remainder of the time on genome size diversity and its importance among animals.

The main things that I have to say about bacterial genomes are:

1) Unlike in eukaryotes, bacterial genome size shows a strong positive relationship with gene number (in other words, bacterial genomes contain little non-coding DNA).

Genome size and gene number in bacteria and archaea.
From Gregory and DeSalle (2005).

2) Bacterial genome sizes do not vary anywhere near as much as those of animals do (on the order of 20-fold versus 7,000-fold).

The diversity of archaeal, bacterial, and eukaryotic genome
sizes as currently known from more than 10,000 species.
From Gregory (2005).

3) The major pattern in bacteria is that, on average, free-living species have larger genomes than parasitic species, which in turn have larger genomes than obligate endosymbionts (Mira et al. 2001; Gregory and DeSalle 2005; Ochman and Davalos 2006).

Genome sizes among bacteria with differing lifestyles.
Because genome size is primarily determined by the
number of genes in bacteria, the question to be addressed
is why symbionts have fewer genes in their genomes.
From Gregory and DeSalle (2005).

In order to explain these patterns, it was sometimes argued that some bacteria have small genomes because there is selection for rapid cell division, with larger DNA contents taking longer to replicate and thereby slowing down the cell cycle. However, when Mira et al. (2001) compared doubling time and genome size in bacteria that could be cultured in the lab, they found no significant relationship between them. In other words, selection for small genome size is probably not responsible for the highly compact genomes of some bacteria, even though it seems plausible that, more generally, selection does prevent the accumulation of non-coding DNA to eukaryote levels in bacterial cells.

Mira et al. (2001) suggested a different interpretation that is based on two other major processes in evolution: mutation and genetic drift. In terms of mutation, they pointed out that among individual changes that add or subtract relatively small quantities of DNA (i.e., insertions or deletions, or “indels”), deletions tend to be somewhat larger than insertions. The insertions in this case are separate from the addition of whole genes, which happens often in bacteria through gene duplication or through the sharing of genes among individuals or even across species (“horizontal gene transfer” or “lateral gene transfer”).

In bacteria (and eukaryotes) small-scale deletions tend
to involve more base pairs than insertions, creating a
“deletion bias”. Of course, larger insertions such as of
transposable elements or gene duplicates are not part
of this calculation as they add much more DNA at once.
From Mira et al. (2001).

So, on the one hand, there are processes that can add genes (duplication and lateral gene transfer), whereas in the absence of these processes, and if there are no adverse consequences to losing DNA (i.e., there is no selective constraint), genomes should tend to get smaller as a result of this deletion bias. In free-living bacteria, there are many opportunities for gene exchange, with lateral gene transfer adding DNA at an appreciable frequency. Moreover, free-living bacteria tend to occur in astronomical numbers, and elementary population genetics shows that selection is highly effective relative to drift in such large populations (so that even a mildly deleterious mutation, such as a deletion or disruptive insertion, will probably be lost from the population over time). Finally, free-living bacteria must produce their own protein products, and therefore tend to make use of all their genes, which places selective constraints on changes (including indels) in those sequences.
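The population-size argument can be made concrete with Kimura's diffusion approximation for the probability that a new mutation eventually becomes fixed. The sketch below is only an illustration: it assumes an idealized haploid population whose effective size equals its census size (which is unlikely to hold for real bacteria), and the selection coefficient and population sizes are arbitrary values, not estimates for any actual species.

```python
import math


def fixation_probability(s, n):
    """Kimura's diffusion approximation for a single new mutation
    in an idealized haploid population of n individuals.

    s < 0 means deleterious, s == 0 neutral, s > 0 beneficial.
    """
    if s == 0:
        return 1.0 / n          # a neutral mutation fixes with probability 1/n
    exponent = -2.0 * n * s
    if exponent > 700:          # exp() would overflow; probability is effectively zero
        return 0.0
    # (1 - e^(-2s)) / (1 - e^(-2ns)), written with expm1 for numerical accuracy
    return math.expm1(-2.0 * s) / math.expm1(exponent)


s = -1e-4  # hypothetical mildly deleterious mutation
for n in (10**2, 10**4, 10**6, 10**9):
    p = fixation_probability(s, n)
    # p * n compares the result with the neutral expectation of 1/n
    print(f"n = {n:>13,}  P(fix) relative to neutral = {p * n:.3g}")
```

In this toy calculation the same mutation behaves almost neutrally when n is 100, but is essentially never fixed once n reaches the millions, which is the contrast between small and very large populations described above.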

Endosymbiotic bacteria, especially those that live within the cells of eukaryote hosts, are different in multiple relevant respects. First, they do not regularly encounter other bacteria from whom they can receive genes. Second, they occur in drastically smaller numbers — indeed, they experience a population bottleneck severe enough to shift the balance from selection to drift. Third, they come to rely on some metabolites provided by the host and no longer make use of all their own genes. These factors in combination mean that the selective constraints on many endosymbiont genes are relaxed, and the dominant processes become deletion bias and random drift. Over many generations, endosymbiotic bacteria lose the genes they are not using (and some that are only mildly constrained by selection, such is the strength of drift under such conditions) due to deletion bias, and the end result is highly compact genomes.
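The combined effect of these factors can be sketched with a toy simulation. This is not the analysis of Mira et al. (2001); it is a minimal illustration in which the function name, the two gene classes, and every parameter value are hypothetical. Genes under selective constraint are never lost, unconstrained genes are each deleted with a small probability per step, and new genes arrive only if there is an opportunity for transfer or duplication.

```python
import random


def simulate_gene_count(constrained, unconstrained, gain_prob, loss_prob, steps, seed=0):
    """Toy model of gene content over time.

    constrained   -- genes kept by selection (never lost in this sketch)
    unconstrained -- genes free to decay through deletion bias
    gain_prob     -- chance per step of acquiring one gene (transfer or duplication)
    loss_prob     -- chance per step that each unconstrained gene is deleted
    """
    rng = random.Random(seed)
    for _ in range(steps):
        # Each unconstrained gene is independently at risk of deletion.
        lost = sum(1 for _ in range(unconstrained) if rng.random() < loss_prob)
        unconstrained -= lost
        if rng.random() < gain_prob:
            unconstrained += 1  # newly acquired genes start out unconstrained
    return constrained + unconstrained


# Hypothetical parameter values, chosen only to make the contrast visible.
# Both lineages start with 3,500 genes in total.
free_living = simulate_gene_count(3000, 500, gain_prob=1.0, loss_prob=0.002, steps=5000)
endosymbiont = simulate_gene_count(500, 3000, gain_prob=0.0, loss_prob=0.002, steps=5000)

print("free-living gene count:", free_living)
print("endosymbiont gene count:", endosymbiont)
```

With these made-up settings the free-living lineage stays near its starting gene count, because gains roughly balance losses, while the endosymbiont lineage shrinks toward its constrained core. Drift is not modelled explicitly; its effect is represented only by moving genes from the constrained class to the unconstrained one.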

The compaction of genomes in endosymbionts can be extreme. The smallest genome known in any cellular organism (except, perhaps, one in Craig Venter’s lab) is found in the bacterial genus Carsonella, a symbiont that lives within the cells of psyllid insects. It contains only 159,662 base pairs of DNA and 182 genes, some of which overlap (Nakabachi et al. 2006).

Carsonella (dark blue) living within the cells and
around the nucleus (light blue) of a psyllid insect.
From Nakabachi et al. (2006).

In some other bacteria, genes that are not used (including non-functional duplicates) may not be lost for some time and may persist as pseudogenes, like those observed in large numbers in eukaryote genomes. These tend to undergo additional mutations and to degrade over time but can still be recognized as copies of existing genes. In Mycobacterium leprae, the pathogen that causes leprosy, for example, there are more than 1,100 pseudogenes alongside roughly 1,600 functional genes (Cole et al. 2001). Its genome is about 1 million base pairs smaller than that of its relative M. tuberculosis, but clearly many of the inactive genes have not (yet) been deleted.
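A quick back-of-the-envelope check, using only the figures quoted above, shows how different these two outcomes are. The variable names are just for illustration, and the M. leprae counts are the rounded values given in the text rather than exact tallies.

```python
# Figures quoted in the text (Nakabachi et al. 2006; Cole et al. 2001).
carsonella_bp = 159_662
carsonella_genes = 182
leprae_pseudogenes = 1_100   # "more than 1,100" in the text, so a lower bound
leprae_functional = 1_600    # "roughly 1,600" in the text

# Average stretch of genome per gene; genes overlap, so this is not a gene length.
bp_per_gene = carsonella_bp / carsonella_genes

# Share of recognizable genes that are no longer functional.
pseudogene_share = leprae_pseudogenes / (leprae_pseudogenes + leprae_functional)

print(f"Carsonella: about {bp_per_gene:.0f} bp of genome per gene")
print(f"M. leprae: about {pseudogene_share:.0%} of recognizable genes are pseudogenes")
```

Carsonella has well under 1,000 base pairs of genome per gene, leaving almost no room for anything else, whereas in M. leprae roughly two in five recognizable genes are pseudogenes that have not yet been removed.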

The two major influences on bacterial genomes: insertion of
genes by duplication and lateral gene transfer, and the loss
of non-functional sequences by deletion.
From Mira et al. (2001).

It would be nice if this post could end there, having delivered a brief overview of an interesting issue in comparative genomics. Sadly, there is more to say because some anti-evolutionists apparently have begun using the topic in a confused attempt to challenge evolutionary science. In particular, though I note that I have become aware of this only second hand, some creationists apparently have suggested that all bacterial genomes are degrading and therefore that bacteria today are simpler than they were in the past, such that complex structures like flagella could not have evolved from less complicated antecedents.

It should be obvious that not all genomes are necessarily “degrading” just because there is a net deletion bias. For starters, selective constraints prevent essential genes from being lost by this mechanism in most bacteria. Furthermore, there exist well-established mechanisms that can add new genes to bacterial genomes, including lateral gene transfer and gene duplication. In fact, the rate of gene duplication seems to be related to genome size in bacteria (Gevers et al. 2004). Also, as Nancy Moran noted in an email, “The most primitive bacteria were certainly simple, but they are not around or at least are not easily identified. Many modern bacteria have large genomes and are very complex.” Finally, the compact genomes of endosymbionts, such as in the aphid symbiont Buchnera aphidicola, tend to be more stable than the genomes of free-living bacteria in terms of larger-scale perturbations such as chromosomal rearrangements (Silva et al. 2003).

Some bacteria, in particular those that have shifted to a
parasitic or endosymbiotic dependence on a eukaryote host,
have undergone genome reductions (green, red) as compared
to inferred ancestral conditions. Nevertheless, many other
species continue to display large genomes (blue).
However, the very earliest bacteria probably began
with small genomes and simple cellular features.
From Ochman (2006).

As with eukaryotes, the genomes of bacteria provide exceptional confirmation of the fact of common descent. Not only do comparative gene sequence analyses shed light on the relatedness of different bacterial lineages and the evolution of features like flagella, but the presence — and loss to varying degrees — of non-functional DNA highlights a strong historical signal.

Given that it is her work that is being misused by anti-evolutionists, it is fitting that Dr. Moran be given the last word:

“It seems to me that the widespread occurrence of degrading genes, which are present in most genomes including those of animals, plants, and bacteria, argues pretty strongly in favor of evolution. They are the molecular equivalent of vestigial organs.”

Quite right.



Cole, S.T., K. Eiglmeier, J. Parkhill, K.D. James, N.R. Thomson, P.R. Wheeler, et al. 2001. Massive gene decay in the leprosy bacillus. Nature 409: 1007-1011.

Gevers, D., K. Vandepoele, C. Simillion, and Y. Van de Peer. 2004. Gene duplication and biased functional retention of paralogs in bacterial genomes. Trends in Microbiology 12: 148-154.

Gregory, T.R. 2005. Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics 6: 699-708.

Gregory, T.R. and R. DeSalle. 2005. Comparative genomics in prokaryotes. In The Evolution of the Genome, ed. T.R. Gregory. Elsevier, San Diego, pp. 585-675.

Mira, A., H. Ochman, and N.A. Moran. 2001. Deletional bias and the evolution of bacterial genomes. Trends in Genetics 17: 589-596.

Nakabachi, A., A. Yamashita, H. Toh, H. Ishikawa, H.E. Dunbar, N.A. Moran, and M. Hattori. 2006. The 160-kilobase genome of the bacterial endosymbiont Carsonella. Science 314: 267.

Ochman, H. 2006. Genomes on the shrink. Proceedings of the National Academy of Sciences of the USA 102: 11959-11960.

Ochman, H. and L.M. Davalos. 2006. The nature and dynamics of bacterial genomes. Science 311: 1730-1733.

Silva, F.J., A. Latorre, and A. Moya. 2003. Why are the genomes of endosymbiotic bacteria so stable? Trends in Genetics 19: 176-180.

11 comments to Bacterial genomes and evolution.

  • RPM

    I’m really not happy with the conclusion that the deletion bias in bacteria is the result of mutational bias. Mira et al. inferred the deletion bias by comparing pseudogenes from different species, which does not allow one to decouple mutation from selection. A more appropriate analysis would look at polymorphism data or lab experiments that would allow for more decoupling between mutational and selection pressures.


  • TR Gregory

    What kind of experiment do you have in mind? And what do you expect the selection pressure to be for favouring small deletions to pseudogenes? Do you mean competition experiments within populations, perhaps on the basis of cell division rate due to faster DNA replication? They looked across species and found no such relationship, but presumably you could look at a microevolutionary scale also.


  • Ed Yong

    Lovely post TR – I suspect that the problem the creationists are having with this (aside from, you know, the general idiocy) is the false idea that evolution is about progress. It’s the trite pop definition of the word which can’t possibly be in line with a deletion of genetic material.


  • RPM

    A competition experiment is not necessary. Just grow up a population of bacteria starting with a genome that contains pseudogenes and give them unlimited resources. If there is a deletion bias, you should see more deletions than insertions in the pseudogenes.

    The problem with looking across species in any evolutionary analysis is that it’s nearly impossible to separate mutation from selection (if in “junk DNA”). To really do that, you need to look at smaller time scales.


  • TR Gregory

    I don’t think a general bias toward deletion is particularly controversial.

    There’s at least one study that looked at the question at the level you suggest (including a competition experiment).

    Genome Research 12: 408-413 (2002)

    Domain-Level Differences in Microsatellite Distribution and Content Result from Different Relative Rates of Insertion and Deletion Mutations

    David Metzgar, Li Liu, Christian Hansen, Kevin Dybvig, and Christopher Wills

    Microsatellites (short tandem polynucleotide repeats) are found throughout eukaryotic genomes at frequencies many orders of magnitude higher than the frequencies predicted to occur by chance. Most of these microsatellites appear to have evolved in a generally neutral manner. In contrast, microsatellites are generally absent from bacterial genomes except in locations where they provide adaptive functional variability, and these appear to have evolved under selection. We demonstrate a mutational bias towards deletion (repeat contraction) in a native chromosomal microsatellite of the bacterium Mycoplasma gallisepticum, through the collection and analysis of independent mutations in the absence of natural selection. Using this and similar existing data from two other bacterial species and four eukaryotic species, we find strong evidence that deletion biases resulting in repeat contraction are common in bacteria, while eukaryotic microsatellites generally experience unbiased mutation or a bias towards insertion (repeat expansion). This difference in mutational bias suggests that eukaryotic microsatellites should generally expand wherever selection does not exclude them, whereas bacterial microsatellites should be driven to extinction by mutational pressure wherever they are not maintained by selection. This is consistent with observed bacterial and eukaryotic microsatellite distributions. Hence, mutational biases that differ between eukaryotes and bacteria can account for many of the observed differences in microsatellite DNA content and distribution found in these two groups of organisms.


  • RPM

    microsats have a different mutational process than non-microsats. I’d be very hesitant to generalize from microsats.


  • TR Gregory

    Then I guess you’ll have to continue assuming it’s selection until evidence that fits your requirement arises. :-)


  • TR Gregory

    Anyway, I’m with you on the usefulness of other sources of data, but the larger point of my post is that only certain bacterial genomes shrink and that this follows established evolutionary principles. That there is disagreement about which of those principles is the most important in this instance is a fine illustration that in evolution we can argue about the explanation but this has no bearing on the underlying fact.


  • Blake Stacey

    While we’re on the general subject, I’d be interested to hear a biology person’s opinion of this preprint I noticed, “An evolutionary model with Turing machines”, which investigates via simulation the growth of non-coding genetic material. My comments, from a decidedly amateur perspective, are here.

    I have no stake in this issue and (to my knowledge) no connection with the authors — it just caught my eye whilst I was browsing the arXiv’s quantitative biology RSS feed.


  • SteveF

    Here’s an upcoming paper on bacterial genomes and evolution, from Nancy Moran:

    Obligate symbioses with nutrient-provisioning bacteria have originated often during animal evolution and have been key to the ecological diversification of many invertebrate groups. To date, genome sequences of insect nutritional symbionts have been restricted to a related cluster within Gammaproteobacteria and have revealed distinctive features, including extreme reduction, rapid evolution, and biased nucleotide composition. Using recently developed sequencing technologies, we show that Sulcia muelleri, a member of the Bacteroidetes, underwent similar genomic changes during coevolution with its sap-feeding insect host (sharpshooters) and the coresident symbiont Baumannia cicadellinicola (Gammaproteobacteria). At 245 kilobases, Sulcia’s genome is approximately one tenth of the smallest known Bacteroidetes genome and among the smallest for any cellular organism. Analysis of the coding capacities of Sulcia and Baumannia reveals striking complementarity in metabolic capabilities.

