Are you a cat genome person or a dog genome person?

The most recent issue of Genome Research contains a report of the cat genome sequence (Pontius et al. 2007), adding Felis catus to the rapidly growing collection of animal genome sequences. One of the reasons that the number of mammal sequences is increasing so quickly is that the standards for sequence coverage have been relaxed. To wit, the cat is one of 24 mammal species approved by NHGRI for “low redundancy” sequencing, meaning that the sequence will be covered only 2-fold (vs. up to 7x coverage in dog, chimp, human, mouse, and rat). Moreover, in this report, only 60% of the euchromatic DNA was actually sequenced (and never mind the heterochromatin). Seventeen of these low redundancy genomes have already been released, as noted in the table from Green (2007). This leaves many gaps in the sequence, but the rationale is that having incomplete genomes from many species can be at least as informative as having more thorough sequences from only a few species.

In the trade-off between breadth and depth (or phylogenetic diversity and individual resolution), this approach leans more towards the former. Of course, this does not preclude improving coverage later, and in fact many of the 2x genomes are already being sequenced to a higher redundancy.

http://www.genome.org/cgi/content/short/17/11/1547

Of the greatest interest to me, about 32% of the available cat sequence is made up of transposable elements, mostly LINEs and SINEs as in other mammals. The percentage might be higher overall since much of the non-coding portion of the genome was not sequenced in the cat. Not having this information is one of the downsides of low coverage. On the other hand, the TE content looks to be very similar to dog anyway, so this is useful information that would not be available yet if we had to insist on 7x coverage for every species.

Speaking of the dog genome, it bears noting that a survey sequence of only about 25% of the genome at 1.5x coverage was released in 2003 (Kirkness et al. 2003). This initial sequence (from Craig Venter’s poodle Shadow) was followed by work from a different set of authors who released a complete dog genome (7.5x coverage) in 2005 (Lindblad-Toh et al. 2005). So again, releasing a partial sequence certainly does not stop more detailed coverage from being done down the line.

In an ideal world we might have high redundancy, totally complete (not just euchromatic), fully annotated, completely accurate genome sequences from multiple individuals from thousands of species — but that isn’t reality for the time being.

Given such constraints, do you think we should have incomplete data from lots of species, or high depth information from a few species? In other words, are you a cat genome person or a dog genome person?

_________

ps: You’ll note that I resisted the temptation to post pictures of my own cats — you’re welcome.

Let’s try the "Wiki database" idea.

In the comments to my previous post, Nick Matzke* suggested the idea that one could create an open database on the Wikipedia model in which people could compile and contribute published data and access the complete dataset freely. This would not replace other databases that give a pile of information for each species (e.g., FishBase), but would deliver one parameter at a time for a number of species. It would be good if the datasets could be exported to a spreadsheet for analysis as well.
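To make the "one parameter at a time, exportable to a spreadsheet" idea a bit more concrete, here is a minimal Python sketch. Everything about its layout is an assumption made for illustration: the field names, the parameter label `chromosome_number_2n`, the helper `export_parameter`, and the placeholder `source` entries are all hypothetical, and a real community database would need proper citations and curation. The diploid chromosome numbers for cat, dog, and human are the standard published values; the last record is a pure placeholder.

```python
import csv
import io

# Hypothetical contributed records: one row per species per parameter.
# The "source" strings are placeholders, not real citations.
records = [
    {"species": "Felis catus", "parameter": "chromosome_number_2n",
     "value": 38, "source": "citation goes here"},
    {"species": "Canis familiaris", "parameter": "chromosome_number_2n",
     "value": 78, "source": "citation goes here"},
    {"species": "Homo sapiens", "parameter": "chromosome_number_2n",
     "value": 46, "source": "citation goes here"},
    # Placeholder entry for a different parameter; the value is not real data.
    {"species": "Example species", "parameter": "red_blood_cell_size",
     "value": 1.0, "source": "citation goes here"},
]


def export_parameter(records, parameter):
    """Return CSV text with one row per species for a single parameter."""
    rows = sorted(
        (r for r in records if r["parameter"] == parameter),
        key=lambda r: r["species"],
    )
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["species", "value", "source"],
        extrasaction="ignore",  # drop the "parameter" column from the export
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Print the chromosome-number table as spreadsheet-ready CSV text.
print(export_parameter(records, "chromosome_number_2n"))
```

The resulting text can be saved with a .csv extension and opened in any spreadsheet program. Keeping a source alongside every value is the part that matters most if many people are contributing.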

Having been through the extremely time-consuming task of building a database from the bottom up (and, thankfully, having a pro web designer build it, which took a massive effort on his part as well), I would be all for trying a community-assembled database for other kinds of information. Kehan has suggested that something like the scratchpad used by the European Distributed Institute of Taxonomy (EDIT) would work nicely for this. I think that looks very promising.

So, I am totally up for trying this experiment. I should tell you that WikiData.org, WikiBase.org, OpenData.org, and a bunch of the obvious domains are unavailable. Anyone got a good name? Maybe a biological term that reflects a bunch of individuals contributing to an emergent database? (Please don’t say “EmergentDatabase.org”; try to be more creative.) I’ll register it, then hopefully we can recruit someone with web experience to help set it up. I have a reasonably complete dataset on red blood cell sizes that we could start with, and of course a chromosome number dataset could also be started and then contributed to by users.

No point letting a great idea pass by!

________

* Incidentally, Nick, when are you going to get your own blog?



We need more online databases.

If you are a graduate student or postdoc and are wondering how you might make a significant contribution, I recommend assembling an online database to be made freely available. In particular, a database of chromosome numbers for animals would be very useful, and I don’t mind if you start out with the limited sampling of karyotype information that I have included in the Animal Genome Size Database. Why not start with, oh I don’t know, butterflies, and then let me see the dataset once you have it compiled? Anytime in the next week or so would be great. Not that I am trying to do a comparison of chromosome numbers in butterflies or anything.


Zimmer on science writing.

Over at The Loom, science writer Carl Zimmer provides a link to what he considers an excellent example of science writing. I want to point out the way he sets this up, which I think is of interest in light of what is often discussed on this blog:

I’m sometimes asked who my favorite science writers are. I don’t like science writers per se; I like science writing, or rather some science writing–the passages and chapters and books that remind me just how good science writing can get, just how high above the wasteland of hackery, dishonest simplification, and cliches it can rise.

Carl is, of course, quite right about this. Not just in expressing frustration at the wasteland of mediocrity, but also about the fact that it is good writing rather than writers that should be of interest. Obviously, one who produces consistently good writing can be said to be a good writer, but this does not mean that he or she can be considered, or expected to be, infallible when it comes to accurately reporting scientific information. I have tried to focus on individual stories rather than criticizing particular writers, but this is a lesson that I will have to keep more clearly in mind.

The evolution of Genomicron.

As promised, here is my first response to a blog meme, namely the “evolution” meme with which I was tagged some time ago by Larry Moran of Sandwalk. I probably won’t make a habit of responding to these, but this one seems particularly appropriate given the content of the blog. I’m also not going to play by the rules, in which one is supposed to list five posts and tag others — I am listing a whole bunch of posts by genre and not tagging anybody else. Please bear in mind that evolution is not a progressive process, and that the particular path followed by a given lineage is not necessarily adaptive. So, that said, here is a brief history of the Genomicron blog…

Origins: I started the blog more or less as a lark. I had a new grad student who was an avid blogger, and through conversations with him I began reading some of the bigger science blogs. When I noticed that my work was being discussed on some of them, I obviously became more interested. However, it struck me that most of the big blogs were chimaeras of politics, religion, and science, and that few were dedicated to genome biology of the sort that I am interested in (The Tree of Life notwithstanding!). It seemed there was an available niche for a (mostly) science-only, genomes-plus-evolution blog, and so I decided to launch one. The first post was just a “well, here I am” sort of announcement [My grad student made me do it], and the second was a hat-tip to Larry for mentioning my lab [A nod to (and from) the Sandwalk]. At the time, I said that “Due to time constraints I expect to only post semi-regularly for the foreseeable future”. So much for that prediction.

Exaptation: Several of the early posts on Genomicron were recycled from things I had written for other reasons. I wanted to get some content up, so I modified existing texts or simply wrote about very basic topics in my field. Specifically, I posted some historical information about “junk DNA”, and some general information about genome size. I also began discussing new reports in genomics, such as the sequencing of the macaque genome and some studies that made reference to genome size.

Here are some examples from this phase:

Flagellum flap: My first foray into the wild world of blog debates came when I posted a brief summary of a PNAS paper by Liu and Ochman on the evolution of bacterial flagella [Genome sequences reduce the complexity of bacterial flagella]. I mostly commented on how this shows that so-called “irreducibly complex” structures can be and are studied from an evolutionary perspective, in particular using genomic data. I also gave roughly equal time to Nick Matzke’s excellent model. Little did I know that this paper would be highly controversial, and that I would be sucked into a vortex of bandwagon-jumping and ad hominem attacks that almost turned me off blogging [Doubts about blogging for scientists] and solidified my view that blogs are fine for commentary but that one must never give the false impression that this represents scientific peer review [Peer review or peanut gallery?]. I also tried to provide some comments on how one could conduct scientific discussions on blogs that were, most likely, rather unrealistic given the nature of teh interweb [Blogs as a medium for scientific discussion]. Incidentally, as far as I know, everyone involved in this dust-up is getting along just fine now.

Junk, junk, everywhere: Meanwhile, I continued discussing “junk DNA” and the various misconceptions and intentional misrepresentations about it. I was not aware that this was such a hot issue, in particular with regard to the creationism/ID-evolution “debate”. To paraphrase Alvar Ellegard, junk DNA is one of those focal points of debate on which practically everybody is compelled to have some sort of opinion, though it cannot in most cases be called an informed one. This is especially true of anti-evolutionists, who really don’t get it at all. By way of example, I point to the short piece in Wired, in which I was interviewed along with Francis Collins and some IDers, that set off a firestorm of criticism in the science blogosphere [Junk DNA gets Wired].

Here are some posts on the topic:

The media and me: When I started this blog, I really did not have any intention of commenting regularly on the quality of media reports. Honestly. However, it became clear that the state of science reporting is, shall we say, problematic. The first post about this that I wrote was about a common misconception regarding some living species being “more evolved” than others [Chimps are not more evolved than humans or anyone else]. Since then, I have become increasingly frustrated with media reports on science, as reflected by a growing list of posts on the subject, which extends up to the present:

Interestingly, one of the most widely read posts I have written was on this topic, in which I vented some frustration with a snarky, sarcastic list entitled Anatomy of a bad science story. This was picked up by various blogs, and was even printed in an issue of The Skeptic magazine published by Australian Skeptics [Terrible science writing in The Skeptic magazine].

To be fair, I have also regularly complimented good science writing:

DAPs: More recently, I have turned some attention to bad science in addition to bad science reporting. In particular, I have noted a few examples of biased data used to support hypotheses regarding genome size and noncoding DNA. I haven’t discussed this kind of thing much on the blog up to this point because I prefer to deal with it in the peer-reviewed literature, but since most people would not encounter those discussions, it seems useful to present it à la blogue as well. Examples of this include:

And there you have it. What does the future hold? Given the role of contingency in evolution, I’m as curious to find out as you are.


Et tu, Daily Mail?

The BBC story about Dr. Curry’s “predictions” for the future of human evolution that I discussed in the previous post was released in Oct. 2006, but now the Daily Mail has run a very similar article as well. Like the BBC, they claim that a “top scientist” made serious “predictions” along these lines. Other blogs seem to have taken them at their word and have assumed that Curry really meant all this as real science.

If you read his original “Bravo Report”, it is pretty obvious that he was using very hypothetical sci-fi kinds of examples as illustrations of general evolutionary concepts (sympatric speciation, assortative mating, sexual selection) and not much more. An easily sensationalized and therefore ill-chosen tack, to be sure, but not the outrageous idiocy for which he is being slammed.

I don’t think my interpretation is far off, given the following preamble to Curry’s essay:

In the summer of 2006 I was commissioned by Bravo Television to write an essay on the future of human evolution. The essay was intended as a ‘science fiction’ way of illustrating some aspects of evolutionary theory.

Bravo then sent out a press release on the essay, but did not release the essay itself. As a result, a wildly distorted version of what I had written ended up being reported as ‘science fact’ in the media. I do not endorse the content of these media reports.

A lesson for bloggers: always read the original source, especially if a media story seems too silly to be true.


Just when I thought it couldn’t get any stupider, along comes the BBC.

I honestly, and obviously naively, thought that I had seen the stupidest speculation passed off as science news with the LiveScience “report” that humans will be marrying robots within 45 years (at least in Massachusetts) [The story that caused me to stop reading LiveScience].

But I stand corrected — or, rather, sit dumbfounded — by a story on the BBC website entitled Human species ‘may split in two’. This time we are served some predictions by a Dr. Oliver Curry, described as an “evolutionary theorist at the London School of Economics”. Here are some highlights:

“Humanity may split into two sub-species in 100,000 years’ time as predicted by HG Wells, an expert has said.”

“The human race would peak in the year 3000, he said – before a decline due to dependence on technology.”

“In the nearer future, humans will evolve in 1,000 years into giants between 6ft and 7ft tall, he predicts, while life-spans will have extended to 120 years, Dr Curry claims.”

“Physical appearance, driven by indicators of health, youth and fertility, will improve, he says, while men will exhibit symmetrical facial features, look athletic, and have squarer jaws, deeper voices and bigger penises.”

“Chins would recede, as a result of having to chew less on processed food.”

One of two things is happening here. Either Dr. Curry is the so-bad-it-hurts-my-head, absolute worst evolutionary theorist I have ever encountered, or the BBC is distorting the heck out of what he said to make it sound as though he is the worst evolutionary theorist I have ever encountered.

At this point, I was prepared to enter into a tirade about people who know nothing about evolution talking entirely out of their posteriors, but something told me that no self-respecting evolutionary theorist could say anything this silly and not mean it as a parody or an April Fool’s joke. And guess what? Dr. Curry wasn’t this silly. Not by a long shot.

This BBC story is one of many to have picked up these “expert predictions” as though they had merit. However, these were not predictions at all, but intentionally amusing speculations written in a short piece for a television station. As Dr. Curry put it on his website, “The Bravo Evolution Report was a brief ‘think piece’, commissioned by Bravo Television to celebrate their 21st anniversary. Writing about the future of evolution for Bravo seemed to offer a fun, ‘sci fi’ way to introduce some evolutionary principles to a popular audience.”

As Dr. Curry notes in a remarkably restrained understatement, “Unfortunately, when filtered through headlines and talkshows, the coverage did not faithfully reflect the aim and scope of the original piece”.

Instead of commenting further, I invite you to read the BBC story — then go and read what Dr. Curry actually wrote, also remembering the context in which it was written.

My head hurts, and it’s the BBC’s fault.


Science spam.

A few weeks ago, Jonathan Eisen vented some frustration at science spam on his blog [Biotechnology spam getting worse and worse]. Like many other researchers, I am inundated with irritating spam from science companies, usually selling reagents or equipment that I don’t use. They just end up in the spam filter with the rest of the aggravating, time-wasting solicitations that we all receive.

This one struck me as particularly amusing, however, so I thought I would share it. Next time you’re thinking of writing a paper about macroevolutionary theory in a paleontology journal, be sure to order your monoclonal antibodies from these folks.

Hi Dr. Gregory,

We’ve learned of your research with TWIST from the journal article titled Macroevolution, hierarchy theory, and the C-value enigma. MyBioSource is currently has the TWIST monoclonal antibody in our catalog products. Please click on the link below to view the datasheet of TWIST antibody.

TWIST Monoclonal Antibody (Datasheet: http://www.mybiosource.com/datasheet.php?products_id=120163
Other TWIST Antibody or Protein (Products Listing: https://www.mybiosource.com/advanced_search_result.php?keywords=TWIST&Submit=SEARCH&search_from_header=on

Additionally, we have over 12,000 monoclonal, polyclonal antibodies, recombinant proteins and peptides. Please spend a few minutes to browse our catalog offering.

Please visit MyBioSource.com and get started: http://www.mybiosource.com

Best regards,

MBS Sales Team
sales@mybiosource.com
Tel: 619-795-6727
Fax: 619-512-4535


The It’s Its There Their They’re Quiz (with a note about data and bacteria).

Everyone seemed to be taking The It’s Its There Their They’re Quiz, so I thought I would give it a try.

As annoying as these all-too-common errors are, I find it worse when science bloggers use “data” or “bacteria” as the singular (both are plural — of “datum” and “bacterium,” respectively). In other words, “this data shows…” irritates me more than “this bird uses it’s beak to…” on a scientific website.


The story that caused me to stop reading LiveScience.

I have quite a few science news feeds pumping content into my aggregator. These have been quite useful, and have brought to my attention several interesting studies that I would not have read about otherwise. (Although one does, of course, have to consult the primary articles to get past the common problems in media reports.)

Unfortunately, the gems appear to be rare amidst the rubble of science writing on these services. Many of them are just parroted press releases (which, as we all know, can be the worst for sensationalizing research). Others have staff writers producing original contributions. LiveScience is one with its own writers, some of whom I have complimented (here and here), and others I have criticized (here and here), in previous posts.

But here’s the post that finally earned LiveScience their walking papers:

Forecast: Sex and Marriage with Robots by 2050

In it, we are treated to the musings of David Levy at the University of Maastricht in the Netherlands, who earned a Ph.D. for writing a thesis claiming that humans will marry robots within the next few decades. If you think that’s ridiculous, wait until you see what he told LiveScience in this story:

“My forecast is that around 2050, the state of Massachusetts will be the first jurisdiction to legalize marriages with robots.”

“Massachusetts is more liberal than most other jurisdictions in the United States and has been at the forefront of same-sex marriage. There’s also a lot of high-tech research there at places like MIT.”

“If you ask me if every human will want to marry a robot, my answer is probably not. But will there be a subset of people? There are people ready right now to marry sex toys.”

“Maybe some other relationships could welcome a robot. Instead of a woman saying, ‘Darling, not tonight, I have a headache,’ you could get ‘Darling, I have a headache, why not use your robot?’”

“The question is not if this will happen, but when. I am convinced the answer is much earlier than you think.”

“Love and sex with robots are inevitable.”


This is not science, it is not news, and it is not something I want cluttering up my aggregator feed. So long, LiveScience, it’s been a slice.

It pains me to say it, but in light of other complaints from scientists in the blogosphere (e.g., here and here), I am actually beginning to wonder if, despite the efforts of some excellent writers, science reporting on the whole does more harm than good. I despise the ivory tower approach to academia (hence this blog), but in my opinion misinformation is worse than missing information. PZ suggests that scientists are just going to have to handle much of the reporting themselves — maybe, but we also have other things to do. Science writers — I mean, the good ones (and I know there are lots of you out there and that you are just as frustrated as I am) — what can be done about all this?