Let’s try the "Wiki database" idea.

In the comments to my previous post, Nick Matzke* suggested creating an open database on the Wikipedia model, in which people could compile and contribute published data and freely access the complete dataset. This would not replace other databases that give a pile of information for each species (e.g., FishBase), but would instead deliver one parameter at a time for a number of species. It would also be good if the datasets could be exported to a spreadsheet for analysis.
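To make the "one parameter at a time" idea a bit more concrete, here is a rough sketch of what the export step might look like. Everything in it is an assumption for illustration only: the field names, the `export_parameter` helper, and the records themselves are hypothetical placeholders (not real measurements), and a real site would read from a proper database rather than a Python list.

```python
import csv
import io

# Hypothetical contributed records. Species names, values, and sources are
# placeholders for illustration only, not real data.
RECORDS = [
    {"species": "Species a", "parameter": "chromosome_number",
     "value": 10, "source": "Placeholder ref 1"},
    {"species": "Species b", "parameter": "chromosome_number",
     "value": 12, "source": "Placeholder ref 2"},
    {"species": "Species a", "parameter": "red_blood_cell_size",
     "value": 5.0, "source": "Placeholder ref 3"},
]


def export_parameter(records, parameter):
    """Return spreadsheet-ready CSV text: one row per species for one parameter."""
    # Keep only the records for the requested parameter.
    rows = [r for r in records if r["parameter"] == parameter]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["species", "value", "source"],
        extrasaction="ignore",  # drop the now-redundant "parameter" column
        lineterminator="\n",
    )
    writer.writeheader()
    # Sort by species name so the export is stable between runs.
    writer.writerows(sorted(rows, key=lambda r: r["species"]))
    return buffer.getvalue()


print(export_parameter(RECORDS, "chromosome_number"))
```

Keeping a source alongside every value is the part I would not want to skip, since the whole point is to compile published data that others can trace back and check.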

Having been through the extremely time-consuming task of building a database from the bottom up (and, thankfully, having a pro web designer build it, which took a massive effort on his part as well), I would be all for trying a community-assembled database for other kinds of information. Kehan has suggested that something like the scratchpad used by the European Distributed Institute of Taxonomy (EDIT) would work nicely for this. I think that looks very promising.

So, I am totally up for trying this experiment. I should tell you that WikiData.org, WikiBase.org, OpenData.org, and a bunch of the obvious domains are unavailable. Anyone got a good name? Maybe a biological term that reflects a bunch of individuals contributing to an emergent database? (Please don’t say “EmergentDatabase.org”; try to be more creative.) I’ll register it, then hopefully we can recruit someone with web experience to help set it up. I have a reasonably complete dataset on red blood cell sizes that we could start with, and of course a chromosome number dataset could also be started and then contributed to by users.

No point letting a great idea pass by!


* Incidentally, Nick, when are you going to get your own blog?

We need more online databases.

If you are a graduate student or postdoc and are wondering how you might make a significant contribution, I recommend assembling an online database to be made freely available. In particular, a database of chromosome numbers for animals would be very useful, and I don’t mind if you start out with the limited sampling of karyotype information that I have included in the Animal Genome Size Database. Why not start with, oh I don’t know, butterflies, and then let me see the dataset once you have it compiled? Anytime in the next week or so would be great. Not that I am trying to do a comparison of chromosome numbers in butterflies or anything.

Biodiversity databases.

The recent launch of the Encyclopedia of Life has generated quite a bit of excitement. It is my hope that advances such as this will help to make information about the millions of species that inhabit the planet accessible to everyone. It is the ultimate in open access science. In keeping with this, here is a list of biodiversity databases that are freely available to anyone. (I am sure to have missed some and I left out many taxon-specific pages — please leave me a comment or send me an email if you know of any other major resources and I will update the compilation).

Genome size databases.

In case anyone is unaware of their existence, here are the links to the available genome size databases.

For a summary of the databases, see Gregory et al. (2007).

For a discussion about units of measurement in genome size, see here.

A summary of genome size ranges in various animals is available here.

A much smaller database of genome sizes that also includes some taxa besides animals, plants, and fungi is posted here.

For bacterial and archaeal (“prokaryote”) genome size data, see here and here and here.

For a list of completed and ongoing genome sequencing initiatives, see the Genomes OnLine Database (GOLD).

For vertebrate red blood cell sizes, see here.