What’s wrong with this figure? (Round three).

Posted on January 22, 2008 by T. Ryan Gregory

In the process of finishing up a paper, I came across this figure (Gilbert 2007).

Figure 1. Drosophila species assemblies, showing assembly sizes and coverage of these by D. melanogaster genome DNA (top and middle lines, in megabases, left ordinate), and counts of chromosome segments inverted relative to Dmel (bottom line, right ordinate). Species on abscissa are taxonomically ordered with Dgri most distant from Dmel.

Two questions. One, why are these points joined by lines? Two, what does “taxonomically ordered” mean? I suspect it is equivalent to “phylogenetic sequence”, which was discussed in a previous edition of “What’s wrong with this figure?”.

Here is the phylogeny that is most often seen in discussions of the 12 Drosophila genome sequences (in this instance, based on Crosby et al. 2007).

Pop quiz: Are D. grimshawi and D. melanogaster the most distantly related species in this subsample of the genus?

The blogosphere overreacts!

Posted on January 18, 2008 by T. Ryan Gregory

I am not interested in getting into a battle with any of my fellow bloggers on this issue, especially since I actually read all of the blogs involved and appreciate what each one of them has to say. But I do have to point out that sometimes the blogosphere overreacts, and things get blown out of proportion. Also, the sun rises in the morning and snow is cold.

On his new powerblog (seriously, Seed must have told him he has to post something every 15 minutes), Greg Laden made the following statement:

The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the DNA.

What I actually said was, I think, pretty innocuous and mostly accurate:

Over on his blog, Greg Laden points to some new work by John Mattick’s group on non-coding RNA expression in mouse brains. It’s interesting stuff, and worth a look. Please bear in mind as you do, however, that non-protein-coding but functional RNA is nothing new. Ribosomes are made of non-coding RNA, for one thing. Sadly, Greg seems to have bought into the distortions (several promoted by Mattick) about what people have said about non-coding DNA.

That was the extent of my discussion of Greg in particular. I then provided a list of examples of functions that have been suggested, and concluded by giving my opinion about how the results of this quite interesting paper should be interpreted realistically.

Larry says I have “already tried to teach Greg some real science about junk DNA”. RPM says I “put Greg in his place”. Genome Technology Online says I “blasted” Greg. And SF Matheson says Greg is “being spanked a little too hard” (he could be referring to commenters, but this follows a line about what Larry and I wrote).

In his reaction and in the comments to others, Greg decides to:

1) School me on why genome size is relevant, with special reference to birds and flying. Since this is based partly on my own work, I find this curious.

2) Insinuate that objections to claims of function for all eukaryotic DNA are cultish, and that those who agree with Larry Moran and me are “disciples”.

3) Say (to RPM) “…this post of yours, Moran’s writing on this, and to a much lesser extent T.R. Gregory’s work, is sufficiently impolite and tending sometimes to the obnoxious that it makes it hard for people to engage in learning, as opposed to debate.” (I get a qualifier, but am listed).

Again, here is what Greg claimed:

The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the DNA.

And yet again, here is what I actually said about Greg’s statement — no more, no less:

Sadly, Greg seems to have bought into the distortions (several promoted by Mattick) about what people have said about non-coding DNA.

I think they are distortions. And I think Greg’s statement shows he agrees with them. Judge for yourself if the blogosphere got this one right with regard to what I, myself, actually wrote.

___________

Update:

I feel I should provide some clarification, so let me address the statements that Greg made and explain why they are inaccurate.

(1) The “Junk DNA” story is largely a myth.
This is false. There is good reason to expect that much or most of the genome is non-functional, and it takes evidence to show otherwise.

(2) DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that…
Yes, we do. We know that about half of the genome in humans is made of inactive transposable elements. We know that many mechanisms can add or subtract DNA without being related to function. We know the patterns of diversity in genome size for 10,000 species of eukaryotes.

(3)… or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.”
This implies that there is a role and we just don’t have the details about it, but the premise is not something you can assume as a given.

(4) It does really seem that scientists assumed for too long that there was no function in the DNA.
This is not true, but it is the claim made by Mattick and others (usually non-scientists). People assumed function from the very beginning, either for all DNA or simply a lot of it. This is true right back to the very first use of the term “junk DNA”, and it was true when people had to explicitly challenge the assumption of function, and it has continued up to the present. Some people, mostly sequencers, may have ignored the rest of the genome and focused on genes, but that does not reflect the range of views that have always been expressed.

How much DNA could be deleted from the human genome?

Posted on January 17, 2008 by T. Ryan Gregory

Larry Moran asks an interesting set of questions about human DNA:

How much of it could be removed without affecting our species in any significant way in terms of viability and reproduction? Or even in terms of significant ability to evolve in the future? In other words, how much is junk?

The options are:

None
less than 10%
between 11% and 49%
between 50% and 74%
between 75% and 89%
90% or more

I hope people take his poll, because I think it will be intriguing to see what most people think. However, I have to admit that I won’t really be able to vote, for the reason I outlined in the comments to his article:

I think this is where a distinction between nonfunctional and inconsequential is important [see Effect versus function for more details]. In terms of whether most DNA is functional, I would agree on the basis of what we currently know that much of it could be deleted in principle. However, this would also affect cell size, and therefore organs, and therefore organisms. And yet, I would not necessarily consider the influence of DNA amount on the cell as a function. It could very well be that there is upward pressure from transposable elements and other mechanisms that cause DNA to accumulate, and downward pressure against this accumulation through selection on organisms. The balance that has been reached is not necessarily adaptive for either side. However, deleting a lot of it could have impacts on development and morphology nonetheless. Maybe it would even be a beneficial change, maybe deleterious, but I can’t assume that there would be no effect.

In some ways, it’s a little like asking, how much of the bacteria in your gut could you kill without having an effect on your health? In principle, a lot of it is clearly not functional, and none of it is there just to function on your behalf. But if you cleared the gut of all bacteria, or even just the commensal and parasitic species, would there be no effect? And if there were adverse consequences, would you take this as evidence that all those bacteria had a function after all?

Genome Technology Online is still confused.

Posted on January 16, 2008 by T. Ryan Gregory

I mentioned once before that Genome Technology Online is confused about non-coding DNA. Today they confirmed this: The Semantics of “Junk” DNA — We’re Confused, Too.

Greg Laden points out a recently published paper in PNAS that gives credence to the theory that non-coding RNA has specific function. The work identified 849 ncRNAs (of 1,328 examined) that are expressed in the adult mouse brain, the majority of which they also found were associated with and expressed in specific regions, cell types, or cellular compartments. He’s blasted here and here for his supposed naivety, but he maintains that the “paper is interesting and the evidence for ncRNA having some function is reasonable.”

Funny, that’s pretty much what I concluded about the paper too:

I am not about to claim that the study hasn’t shown evidence of function for these non-coding regions. I think it’s quite interesting, and it wouldn’t surprise me if lots of non-coding RNA turned out to have a regulatory function.

Greg wasn’t “blasted” by me, but I did point out his misinterpretation of the evidence for function in non-coding DNA, of which these non-coding RNAs are a tiny fraction. The extrapolation from this to “junk DNA” in general is what I was noting. I think aggregating services like Genome Technology Online are useful, but not when they do little more than offer inaccurate commentary.

Your inner fish.

Posted on January 15, 2008 by T. Ryan Gregory

Neil Shubin was on the Colbert Report recently, just one of a growing list of scientists who have appeared on the show (including, for example, Craig Venter and Ken Miller). You may also remember seeing Shubin interviewed for the NOVA program Judgment Day: Intelligent Design on Trial. Shubin was the man behind the discovery of the transitional fossil Tiktaalik roseae, which represents a superb example of the predictive power and massive wealth of supportive evidence within evolutionary biology.

Shubin’s book Your Inner Fish was released today, and I have been waiting eagerly for it. The excerpt I read was very well written and interesting, and the idea of having examples of human characteristics that are holdovers from our evolutionary history is very exciting for teaching. Plus, anyone who discovers such an awesome transitional species — in Canada, no less — will find his way onto my reading list.

HT: Pharyngula

Signs of function in non-coding RNAs in mouse brain.

Posted on January 15, 2008 by T. Ryan Gregory

Over on his blog, Greg Laden points to some new work by John Mattick’s group on non-coding RNA expression in mouse brains. It’s interesting stuff, and worth a look. Please bear in mind as you do, however, that non-protein-coding but functional RNA is nothing new. Ribosomes are made of non-coding RNA, for one thing. Sadly, Greg seems to have bought into the distortions (several promoted by Mattick) about what people have said about non-coding DNA.

The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the DNA.

People have been proposing functions for non-coding DNA since the beginning. As I noted in one of my first Genomicron posts,

Those who complain about a supposed unilateral neglect of potential functions for non-coding DNA simply have been reading the wrong literature. In fact, quite a lengthy list of proposed functions for non-coding DNA could be compiled (for an early version, see Bostock 1971). Examples include buffering against mutations (e.g., Comings 1972; Patrushev and Minkevich 2006) or retroviruses (e.g., Bremmerman 1987) or fluctuations in intracellular solute concentrations (Vinogradov 1998), serving as binding sites for regulatory molecules (Zuckerkandl 1981), facilitating recombination (e.g., Comings 1972; Gall 1981; Comeron 2001), inhibiting recombination (Zuckerkandl and Hennig 1995), influencing gene expression (Britten and Davidson 1969; Georgiev 1969; Nowak 1994; Zuckerkandl and Hennig 1995; Zuckerkandl 1997), increasing evolutionary flexibility (e.g., Britten and Davidson 1969, 1971; Jain 1980; reviewed critically in Doolittle 1982), maintaining chromosome structure and behaviour (e.g., Walker et al. 1969; Yunis and Yasmineh 1971; Bennett 1982; Zuckerkandl and Hennig 1995), coordinating genome function (Shapiro and von Sternberg 2005), and providing multiple copies of genes to be recruited when needed (Roels 1966).

I am not about to claim that the study hasn’t shown evidence of function for these non-coding regions. I think it’s quite interesting, and it wouldn’t surprise me if lots of non-coding RNA turned out to have a regulatory function. But let’s be realistic with this. The authors consider a “long” non-coding RNA transcript to be >200bp. So let’s just round up and say 1,000bp for convenience. They identified around 850 potentially functional sequences (and ~500 that do not show evidence of functional expression, at least in the brain) and estimate that there are 20,000 of them all told. 1,000bp x 20,000 = 20Mb. The mouse genome is about 3Gb. In other words, this study, even read generously, has identified possible function for 0.7% of the mouse genome.
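For anyone who wants to check the arithmetic or try other numbers, here is a minimal sketch of the same back-of-envelope estimate. The variable and function names are mine, chosen for illustration only, and the inputs are the rounded figures used above (1,000bp per transcript, 20,000 transcripts, a 3Gb genome), not values taken from the paper's data.

```python
# Back-of-envelope estimate using the rounded figures from the text.
# All names here are illustrative assumptions, not from the paper.
TRANSCRIPT_LENGTH_BP = 1_000       # generous round-up from the >200bp cutoff
ESTIMATED_TRANSCRIPTS = 20_000     # the authors' estimated total
MOUSE_GENOME_BP = 3_000_000_000    # about 3Gb


def functional_fraction(length_bp, count, genome_bp):
    """Fraction of the genome covered by `count` transcripts of `length_bp` each."""
    return (length_bp * count) / genome_bp


covered_bp = TRANSCRIPT_LENGTH_BP * ESTIMATED_TRANSCRIPTS
fraction = functional_fraction(
    TRANSCRIPT_LENGTH_BP, ESTIMATED_TRANSCRIPTS, MOUSE_GENOME_BP
)

print(f"{covered_bp / 1e6:.0f}Mb covered, {fraction:.1%} of the genome")
# prints "20Mb covered, 0.7% of the genome"
```

Even if you double the assumed transcript length or the transcript count, the result stays in the low single digits as a percentage of the genome.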

In summary, cool research. Important question, neat result. But let’s not start the usual extrapolationfest that normally accompanies such publications.

Course evaluations.

Posted on January 15, 2008 by T. Ryan Gregory

There is more than enough discussion about the usefulness (or not) of student evaluations of courses. I will refer you to Larry Moran’s recent discussion where you can find one perspective as well as links to other articles. More recently, there is an interesting piece in University Affairs by Brett Zimmerman of York University.

Course evaluations – students’ revenge?

To some extent, the issue appears to be that anonymous evaluations are not particularly helpful. I don’t know how it is done at most universities, but at Guelph evaluations are now done online on a volunteer basis (so the problem becomes getting them to participate) and only comments that are signed count in T&P reviews. I personally find the comments useful, if in no other way than to provide confidence that some of the new approaches I try to implement are successful.

I don’t think we want to abolish student feedback, and I think this should be part of T&P reviews, especially in disciplines like science where there can be a tendency to consider oneself a researcher first and an educator a distant second. However, unsigned comments, especially when they allow students to make insulting statements without consequences, are not useful and should be updated.

Junk DNA and ID redux.

Posted on January 15, 2008 by T. Ryan Gregory

Just a reminder, these are the important points under discussion:

* Proponents of ID themselves clearly suggest that “junk DNA” will mostly or all be functional.

* No unambiguous explanation has been given for why ID must assume that non-coding DNA is functional, especially since they say nothing can be known about the designer or the mechanism.

* The existence of much non-functional DNA would not necessarily refute the idea of design, as many human-designed structures have redundant, non-functional, or even counterproductive characteristics. It would, however, challenge certain assumptions about the designer and the mechanism, which again is why these must be made explicit if the junk DNA argument is to be invoked. Therefore, this is only a useful prediction if one includes details about the mechanism of design.

* The demonstration that all or most non-coding DNA is functional would not support ID to the exclusion of evolution, because a strict interpretation of Darwinian processes has always been taken to propose function as well.

* The demonstration that all or most non-coding DNA in the human genome is functional would still leave the question unanswered as to why the designer put five times more in onion genomes.

* Many functions that have been proposed or demonstrated are dependent on the process of co-option, the same process that is involved in the evolution of complex features.

* Evidence for function in non-coding DNA comes from analyses using evolutionary methods. Other approaches, such as deleting some, have not supported the hypothesis that it is functional.

* The current evidence for function, and other details about how non-coding DNA forms, both suggest that most non-coding DNA is non-functional, or at least that this is the most plausible condition pending much more evidence.

Feel free to comment, but please address these points directly.

"Fx of Junk DNA" or "Mondo hackitude-o-rama".

Posted on January 14, 2008 by T. Ryan Gregory

I do not read the intelligent design blogs, but I do read the blogs of some people who read the intelligent design blogs. Today, Afarensis points to a series of “predictions” put forward by ID proponents at the behest of William Dembski on his blog Uncommon Descent for use in an upcoming interview or something. Curious to see what ID predicts (since, as noted, I don’t read their blogs and they don’t publish in the scientific literature), I took a look.

Apparently, many ID proponents don’t know what a prediction is, or at least not what a useful, testable, scientific prediction is. For example:

“Intelligent design can predict that science will never be able to explain how this complex life arose (homochirality). This prediction has been confirmed every year for decades.”

“That after “billions and billions” of generations of any particular biological entity no new morphology will occur due to random mutations and natural selection.”

“ID predicts that many, if not all, innovative technology achievements of human kind (read agency) will have direct parallels in, or derivation from, biological systems.”

And so on.

But here is what caught my attention.

“The single most important prediction of Intelligent Design is that, although there might be the occasional degeneration of either macroscopic or microscopic structure, most structures should serve a purpose. Thus most organs should not be vestigial, and most DNA should not be “junk DNA”.”

Others concurred that 1) junk DNA won’t be junk, and 2) this is a prediction of ID, and 3) this distinguishes ID from evolutionary science.

I have been over this many times, but of course ID proponents probably don’t read my blog (or the scientific literature). Here are the salient points:

1) There is good reason to believe that non-functional DNA is common. The mechanisms of formation — primarily things like transposable element multiplications, gene duplication and pseudogenization, replication slippage, and unequal crossing over — can add DNA without any requirement that it serve a function. Actual data from genome sequences confirms that the most substantial fraction of eukaryotic DNA is transposable elements, some of which are functional, some of which cause disease, and perhaps most of which are now inactive molecular vestiges. Evolutionary biologists do not simply assume non-function out of ignorance. The default assumption for much non-coding DNA is non-function because of what we do know about how it gets there.

2) I have yet to see a convincing, or even unconvincing, argument for why ID requires that non-functional features be rare or non-existent if indeed nothing can be known about the designer. Even very sophisticated products of design by humans have their redundancies and non-functional aspects. For a fun example, consider what was discovered when parts of the source code for Windows 2000 were leaked to the internet during development. It was laced with curse words, warnings of “hacks” that had to be written in to make parts work, and just general expressions of frustration. For example:

“ // We are morons. We changed the IDeskTray interface between IE4”

“ // TERRIBLE HORRIBLE NO GOOD VERY BAD HACK”

“ * The magnitude of this hack compares favorably with that of the national debt.”

“ // Mondo hackitude-o-rama.”

These are remarks that do not code for anything. You could take them all out and nothing would happen to the function of the software. Or, taking a larger view of this analogy, your computer hard drive probably has on it all sorts of redundant, partly deleted, or perhaps even malevolent bits of code, and yet it nonetheless was a designed structure.

Here is the argument I am making. Either IDists cannot say anything one way or the other about non-function, or they must provide information about the method and motive of the designer to justify the assumption.

3) One of the basic assumptions made by hardcore adaptationists is that non-coding DNA must be functional or it would have been deleted. So, this “prediction” is not exclusive, or even original, to ID — it is based firmly in the most rigid applications of Darwinian processes.

One commenter on the ID blog said this:

“As already mentioned, “junk-DNA” would completely undermine ID if it turned out to really be “junk”. But, of course it isn’t.”

I disagree on both fronts. The existence of truly non-functional DNA would not automatically indicate a conclusive refutation of intelligent design, it would only be evidence against design by a divine designer. One can have design with non-function. I also note that there presently is no convincing evidence that more than a small portion of the human genome is functional, let alone the many genomes that are much larger than that of humans.

IDists can consider “all or most non-coding DNA will have some function” their “single most important prediction” if they choose, but this is meaningless because it provides no specifics, it does not actually allow a test of ID unless they acknowledge the features of the designer, and it was already made by some evolutionary biologists decades ago.

The Dr. Credibility principle.

Posted on January 13, 2008 by T. Ryan Gregory

The title “Doctor” and the abbreviated prefix “Dr.” come from the Latin for “teacher”, and are traditionally bestowed on those who have earned a doctoral degree, the highest academic degree attainable. The suffix Ph.D. is an abbreviation for Philosophiæ Doctor (L. “Teacher of philosophy”), with “philosophy” from the Greek for “love or pursuit of wisdom”. The Ph.D. is awarded in most academic disciplines, including science (in many cases, the D.Sc., or Doctor of Science, is awarded as an honour for special accomplishment). Medical professionals may also hold the title “Doctor” even though they may do little or no teaching, with common suffixes being M.D. (Medicinae Doctor, or Doctor of Medicine), D.V.M. (Doctor of Veterinary Medicine), D.D.S. (Doctor of Dental Surgery), and so on. It is also possible to obtain the title “Doctor” in areas such as homeopathic medicine or chiropractic. In other words, the title alone does not provide much information about what the individual’s qualifications are. However, one can reasonably argue that the prestige attached to it comes from the fact that individuals such as medical doctors and scientists who hold the title are typically ranked among those with greatest prestige.

For the past several years, i.e., since completing my Ph.D. degree, my official title has been “Dr.” and not “Mr.”. Some people become upset if you call them “Mr.” (or “Ms.”) instead of “Dr.”, but I generally try not to make too much of this. In academic settings, I think a good rule is that people who do not hold the degree, or colleagues in formal settings, should address those with the degree by their official title of “Dr.”. On the other hand, in informal settings with colleagues who hold the same title, or with students with whom one is familiar (e.g., graduate students in one’s own group), using the title simply becomes awkward and first names are appropriate. I will admit that it did take me some time to get used to calling my advisors by their first names, and I have since observed this in several of my students. Outside of academic settings, I generally don’t correct people if they call me “Mr.”, and I don’t introduce myself with the title. I believe most of my colleagues take a similar approach, though of course there is some variation.

Two recent discussions on science blogs have prompted this post, in case you were curious as to why I was bothering to tell you this. Both relate to what I am calling the “Dr. Credibility principle”, which is an attempt to gain undeserved credibility simply through invoking the title.

The first is the example of Dr. Sharon Moalem, author of the book Survival of the Sickest. It appears Dr. Moalem has been misrepresenting his credentials, making it seem as though he has obtained a medical degree, when in fact he has not (though he is a med student). PZ Myers has already discussed the book, which seems to be largely based on non-scientific and pseudoscientific arguments. I had a feeling that this would be so right away. Why? Because he used the title “Dr.” on the cover.

You see, scientists almost never use the prefix “Dr.” on their book covers. It doesn’t matter if it is a popular book or a technical one. If you have science books, go see for yourself. We just don’t do it. In fact, I am willing to extend the “Dr. Credibility principle” to claim that whenever you see someone making a big deal out of their title, especially on a book cover, you can usually bet it is because the content of the book cannot stand on its own without a deflecting appeal to authority. Yet, as noted, those with real authority, and because of whom the title has some prestige, do not use it in that way.

The second case is the Discovery Institute’s public relations gimmick of having “scientists” sign the following “Dissent from Darwinism” statement:

We are skeptical of claims for the ability of random mutation and natural selection to account for the complexity of life. Careful examination of the evidence for Darwinian theory should be encouraged.

As I tell my students, this is so benign that I could easily sign it, and I suspect any biologist could. I too am skeptical of claims that random mutation and natural selection account for everything. And careful examination of the evidence should be encouraged in all sciences, always. Nevertheless, as John Lynch has demonstrated, people with qualifications in evolutionary biology don’t sign this list because it is obviously intended to confuse the uninitiated.

Compare this with a statement that many biologists (though only those with the name Steve or its derivatives) are only too happy to put their names on:

Evolution is a vital, well-supported, unifying principle of the biological sciences, and the scientific evidence is overwhelmingly in favor of the idea that all living things share a common ancestry. Although there are legitimate debates about the patterns and processes of evolution, there is no serious scientific doubt that evolution occurred or that natural selection is a major mechanism in its occurrence. It is scientifically inappropriate and pedagogically irresponsible for creationist pseudoscience, including but not limited to “intelligent design,” to be introduced into the science curricula of our nation’s public schools.

The Discovery Institute has managed to find only 700 signatories since 2001. The NCSE’s list, which is meant only as a parody, has 860 Steves.

Most of the individuals who have signed the Discovery Institute’s list have degrees in engineering (see the Salem conjecture), chemistry, physics, or computer science, though several at least have biological training, albeit in physiology or molecular biology. Some of them are not scientists of any sort. The only unifying feature appears to be that they hold a Ph.D. in something. Once again, it’s the “Dr. Credibility principle”: If one has the title, one must be an authority on the issue under discussion.

The Dr. Credibility principle finds its most extraordinary application in the Orwellian-entitled Physicians and Surgeons for Scientific Integrity. Who can join, you ask? “Any person with an M.D., D. O., D.D.S., D.M.D., D.V.M. or equivalent degree may become a physician/surgeon member of PSSI”. Personally, I don’t think anyone should take a medical doctor, dentist, or veterinarian as an authority on biology any more than they should let biologists like me prescribe drugs or perform a root canal. The only possible explanation is that this is another misguided appeal to authority. Unfortunately, many people are liable to buy into it, especially in groups like anti-evolutionists, where authority seems to trump data regularly.

Let me say it clearly. Having the title “Dr.” does not make anyone an authority on anything except, one would hope, the field in which they obtained a degree. Anything else is just playing doctor to gain undue credibility.