Over on his blog, Greg Laden points to some new work by John Mattick’s group on non-coding RNA expression in mouse brains. It’s interesting stuff, and worth a look. Please bear in mind as you do, however, that non-protein-coding but functional RNA is nothing new. Ribosomes are built largely of non-coding RNA, for one thing. Sadly, Greg seems to have bought into the distortions (several promoted by Mattick) about what people have said about non-coding DNA.
The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the DNA.
People have been proposing functions for non-coding DNA since the beginning. As I noted in one of my first Genomicron posts,
Those who complain about a supposed unilateral neglect of potential functions for non-coding DNA simply have been reading the wrong literature. In fact, quite a lengthy list of proposed functions for non-coding DNA could be compiled (for an early version, see Bostock 1971). Examples include buffering against mutations (e.g., Comings 1972; Patrushev and Minkevich 2006) or retroviruses (e.g., Bremmerman 1987) or fluctuations in intracellular solute concentrations (Vinogradov 1998), serving as binding sites for regulatory molecules (Zuckerkandl 1981), facilitating recombination (e.g., Comings 1972; Gall 1981; Comeron 2001), inhibiting recombination (Zuckerkandl and Hennig 1995), influencing gene expression (Britten and Davidson 1969; Georgiev 1969; Nowak 1994; Zuckerkandl and Hennig 1995; Zuckerkandl 1997), increasing evolutionary flexibility (e.g., Britten and Davidson 1969, 1971; Jain 1980; reviewed critically in Doolittle 1982), maintaining chromosome structure and behaviour (e.g., Walker et al. 1969; Yunis and Yasmineh 1971; Bennett 1982; Zuckerkandl and Hennig 1995), coordinating genome function (Shapiro and von Sternberg 2005), and providing multiple copies of genes to be recruited when needed (Roels 1966).
I am not about to claim that the study hasn’t shown evidence of function for these non-coding regions. I think it’s quite interesting, and it wouldn’t surprise me if lots of non-coding RNA turned out to have a regulatory function. But let’s be realistic with this. The authors consider a “long” non-coding RNA transcript to be >200bp. So let’s just round up and say 1,000bp for convenience. They identified around 850 potentially functional sequences (and ~500 that do not show evidence of functional expression, at least in the brain) and estimate that there are 20,000 of them all told. 1,000bp x 20,000 = 20Mb. The mouse genome is about 3Gb. In other words, this study, even read generously, has identified possible function for 0.7% of the mouse genome.
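For anyone who wants to check the arithmetic or plug in different numbers, here is a minimal sketch of that back-of-the-envelope estimate. The constants are the round figures used above; the function name is just illustrative, and the calculation assumes the transcripts do not overlap.

```python
# Back-of-the-envelope estimate using the round numbers from the post.
TRANSCRIPT_LENGTH_BP = 1_000      # generous round-up from the >200bp cutoff
ESTIMATED_TRANSCRIPTS = 20_000    # the authors' estimated total
MOUSE_GENOME_BP = 3_000_000_000   # "about 3Gb", as stated above


def fraction_of_genome(transcript_bp, n_transcripts, genome_bp):
    """Return the fraction of the genome covered, assuming no overlaps."""
    return transcript_bp * n_transcripts / genome_bp


covered_bp = TRANSCRIPT_LENGTH_BP * ESTIMATED_TRANSCRIPTS
percent = 100 * fraction_of_genome(
    TRANSCRIPT_LENGTH_BP, ESTIMATED_TRANSCRIPTS, MOUSE_GENOME_BP
)

print(f"{covered_bp / 1e6:.0f}Mb covered, {percent:.1f}% of the genome")
```

Even with the deliberately generous 1,000bp figure, the total stays under 1% of the genome; using the actual 200bp cutoff as the length would shrink it fivefold.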
In summary, cool research. Important question, neat result. But let’s not start the usual extrapolationfest that normally accompanies such publications.