If you’ve read The Double Helix, you’ll be very familiar with it. If you haven’t, stop reading this post and go get a copy at Amazon. You won’t regret it.
Anyway, a great pic was posted on Twitter a few days ago, and I wanted to share it with you.
It was on February 28th, 1953, that Watson and Crick claimed they had cracked a big problem they had been working on; in fact, Crick is said to have stormed into the Eagle proclaiming that they had “found the secret of life”: they finally had a model for the structure of DNA.
Later that year, in April, the idea was formalized in the classic Nature paper.
2013, then, marks the 60th anniversary of this event, which opened the path for the explosion of molecular biology as a field.
Watson, now 84, appears in this great pic, taken to commemorate such an important event in the history of biological research. He is, of course, having a beer, as great scientists do.
Although this is not the original Eagle pub in Cambridge but its counterpart at CSHL, it is still a nice photograph.
Just for comparison, this is a picture of the two of them taken in 1959.
Soon after the recent set of ENCODE papers came out, several scientists raised concerns regarding the estimate the authors put forward for the fraction of the genome that appears to be functional: according to them, ~80% of the human genome is functional.
This, of course, differs greatly from what most of us think, considering, among other things, that the fraction of the genome that is evolutionarily conserved through purifying selection appears to be under 10% (what about the rest? We think it divides between junk DNA and some “unknowns”).
The problem mainly arose from the definition of “functional” that ENCODE used, one so loose that it may not be useful at all.
In fact, “according to ENCODE, for a DNA segment to be ascribed functionality it needs to (1) be transcribed or (2) associated with a modified histone or (3) located in an open-chromatin area or (4) to bind a transcription factor or (5) to contain a methylated CpG dinucleotide” (Graur et al., 2013). You would agree that these criteria are very lenient; hence, the 80% estimate.
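To see just how lenient that disjunction is, here is a minimal sketch of the rule as a classifier. The `Segment` fields and the `encode_functional` helper are hypothetical names chosen for illustration; this is not ENCODE’s actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """Illustrative biochemical signals annotated on one genomic segment."""
    is_transcribed: bool = False
    has_modified_histone: bool = False
    in_open_chromatin: bool = False
    binds_transcription_factor: bool = False
    has_methylated_cpg: bool = False

def encode_functional(seg: Segment) -> bool:
    """Label a segment "functional" if ANY single signal is present."""
    return (seg.is_transcribed
            or seg.has_modified_histone
            or seg.in_open_chromatin
            or seg.binds_transcription_factor
            or seg.has_methylated_cpg)

# A segment carrying nothing but one methylated CpG dinucleotide already
# qualifies, which is how an OR over five pervasive signals can sweep up
# most of the genome.
print(encode_functional(Segment(has_methylated_cpg=True)))  # True
```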
A recent paper ruthlessly discusses the ENCODE findings and takes great issue with the “80%” estimate. The authors “detail the many logical and methodological transgressions involved in assigning functionality to almost every nucleotide in the human genome”. The manuscript reviewers could have suggested that the authors tone it down a little, but from what I have found on the web, evolutionary biologists tend to be very outspoken in print when discussing work they disagree with.
I encourage you to read the article, which is freely available. In the meantime, here are a few quotes (a quick numerical sketch of the sensitivity point follows them):
The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.
ENCODE adopted a strong version of the causal role definition of function, according to which a functional element is a discrete genome segment that produces a protein or an RNA or displays a reproducible biochemical signature (for example, protein binding). Oddly, ENCODE not only uses the wrong concept of functionality, it uses it wrongly and inconsistently.
We identified three main statistical infractions. ENCODE used methodologies encouraging biased errors in favor of inflating estimates of functionality, it consistently and excessively favored sensitivity over specificity, and it paid unwarranted attention to statistical significance, rather than to the magnitude of the effect.
At this point, we must ask ourselves, what is the aim of ENCODE: Is it to identify every possible functional element at the expense of increasing the number of elements that are falsely identified as functional? Or is it to create a list of functional elements that is as free of false positives as possible?
Comparative studies have repeatedly shown that pseudogenes, which have been so defined because they lack coding potential due to the presence of disruptive mutations, evolve very rapidly and are mostly subject to no functional constraint (Pei et al. 2012). Hence, regardless of their transcriptional or translational status, pseudogenes are nonfunctional!
For example, according to ENCODE, the putative function of the H4K20me1 modification is “preference for 5’ end of genes.” This is akin to asserting that the function of the White House is to occupy the lot of land at the 1600 block of Pennsylvania Avenue in Washington, D.C.
So, what have we learned from the efforts of 442 researchers consuming 288 million dollars? According to Eric Lander, a Human Genome Project luminary, ENCODE is the “Google Maps of the human genome” (Durbin et al. 2010). We beg to differ, ENCODE is considerably worse than even Apple Maps.
Evolutionary conservation may be frustratingly silent on the nature of the functions it highlights, but progress in understanding the functional significance of DNA sequences can only be achieved by not ignoring evolutionary principles.
High-throughput genomics and the centralization of science funding have enabled Big Science to generate “high-impact false positives” by the truckload (The PLoS Medicine Editors 2005; Platt et al. 2010; Anonymous 2012; MacArthur 2012; Moyer 2012). Those involved in Big Science will do well to remember the depressingly true popular maxim: “If it is too good to be true, it is too good to be true.”
We conclude that the ENCODE Consortium has, so far, failed to provide a compelling reason to abandon the prevailing understanding among evolutionary biologists according to which most of the human genome is devoid of function.
(…) according to the ENCODE Consortium, a biological function can be maintained indefinitely without selection, which implies that at least 80 – 10 = 70% of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these “functional” regions, or because no mutation in these regions can ever be deleterious. This absurd conclusion was reached through various means (…)
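As promised above, here is a back-of-the-envelope sketch of the sensitivity-versus-specificity point. All the numbers are assumptions chosen for illustration (they are not figures from ENCODE or from Graur et al.): if only ~10% of the genome is truly functional, even moderate leakage from the non-functional ~90% dominates the calls.

```python
# Illustrative numbers only: suppose 10% of the genome is truly functional,
# and a detector is tuned for high sensitivity at the cost of specificity.
truly_functional = 0.10
sensitivity = 0.95   # fraction of truly functional DNA correctly detected
specificity = 0.25   # fraction of non-functional DNA correctly rejected

true_positives = truly_functional * sensitivity               # 0.095
false_positives = (1 - truly_functional) * (1 - specificity)  # 0.675
called_functional = true_positives + false_positives

print(f"Fraction of genome called functional: {called_functional:.0%}")  # 77%
print(f"Share of those calls that are wrong: "
      f"{false_positives / called_functional:.0%}")  # 88%
```

Because non-functional DNA outnumbers functional DNA roughly nine to one under these assumptions, a test tuned for sensitivity yields calls that are mostly false positives, which is exactly the inflation the authors describe.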
Reading a post by Larry Moran on Sandwalk, in which he “ridicules the enthusiasm James Shapiro expresses in his book ‘Evolution: A View from the 21st Century’ for physicists coming into evolutionary studies and bringing new skills and new ideas” (this according to Shapiro), I was reminded of an article I read recently (see below).
This is part of what Moran wrote:
Why don’t I move to physics and solve their problems? I’ve got all the proper qualifications, “lacking a formal education,” “less prejudicial background,” and I haven’t been taught to exclude impossible things. I bet I could convince half a dozen of my biologist colleagues to abandon the difficult problems of biology in order to help the physicists. It shouldn’t take more than a few years.
We need a name for this discovery, let’s call it The Shapiro Conjecture.
Meanwhile, I welcome all those physicists who know nothing about evolution, protein structure, genetics, physiology, metabolism, and ecology. That’s just what we need in the biological sciences to go along with all the contributions made by equally ignorant creationists.
This is one of the comments on that post:
The best ones don’t do this, but it is fairly common for mathematicians and physicists to waltz into biology convinced that their powerful mathematical techniques will be unknown there, and that they can revolutionize computational biology, to universal applause.
Anyway, this is from the book “A Random Walk in Science”, which was brought back to my mind while reading Moran’s post:
Two great comments from T. Ryan Gregory at Ewan Birney’s blog post on ENCODE.
He is replying to someone in the comments section. In the first quote, he again (and correctly) highlights that there was never a period when researchers dismissed all ncDNA as junk. I’ve quoted him stating that before here.
In the second one, he talks about intelligent design.
What I am talking about is what was actually being said in the primary literature. In that case, it is abundantly clear that there was no widespread dismissal of possible functions for non-coding DNA among the researchers working in the field. I have not been quoting selectively, either — I looked at the papers that first described each type of non-coding DNA elements, for example, or review papers from the period, or news stories in Science and Nature from that time. All of them show that function was a common expectation, or at least a serious question, throughout the supposed period of dismissal.
I personally don’t think non-functional junk DNA is all that relevant to the fight from either side. Why should “intelligent design” assume that nothing is non-functional? Our own intelligently designed artifacts often have crap in them. Who would argue that the computer code for Windows is flawless and without non-functional, redundant, sloppy, or otherwise unnecessary bits? I have never seen a clear articulation of the reason that ID predicts no non-functional DNA (other than the obvious “God don’t make no junk” — and even then, why not?).
In light of the recent ENCODE papers, several researchers have questioned some of their conclusions and the way the authors have described their findings (“These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions,” to quote one of the articles), not only in the primary articles, but also in interviews and press releases.
Michael Eisen has discussed some of these issues on his blog, and this is a brief extract from one of his posts:
But if you think about this, you will realize that this simply can not be true. As we and many others have now shown, molecular interactions are not rare. Transcripts, transcription factor binding sites, DNA modifications, chromatin modifications, RNA binding sites, phosphorylation sites, protein-protein interactions, etc… are everywhere. This suggests that these kinds of biochemical events are easy to create – change a nucleotide here – wham, a new transcription factor binds, a splicing site is lost, a new promoter is created, a glycosylation site is eliminated.
And, as I’ve mentioned before…
Rather than assuming – as so many of the ENCODE researchers apparently do – that the millions (or is it billions?) of molecular events they observe are a treasure trove of functional elements waiting to be understood, they should approach each and every one of them with Kimurian skepticism.
Thank you, Michael.
T. Ryan Gregory discussing Michael Eisen’s take on ENCODE. This is exactly my take on the matter (my emphasis).
It’s important to distinguish between the different views that one can have on the function of non-coding DNA. The notion that “all non-coding DNA is functionless junk” is a straw man position that no one has ever seriously held. So, yes, there was already an expectation that some non-coding DNA would turn out to have function — and plenty of examples were already well known. This extreme “it’s all junk” idea did not need to be overturned, because no one claimed it and, as noted, other examples that refute it were already known. However, there is another claim that is coming out in these media reports and in quotes from the ENCODE authors — that the evidence indicates that 80% or more of the genome is functional. This is a claim that is based on the flimsiest of definitions of “functional” and is not one that is likely to be convincing to many experts. So, the claim that “there is little or no non-functional DNA at all” is somewhat unique to ENCODE (John Mattick thinks so too). But it is a problematic claim because the evidence for this assertion is very tenuous.
Somewhere in the middle is what I believe to be the most reasonable view: a significant percentage of the non-coding DNA in the human genome is functional in the sense of being biologically meaningful, but most of it probably is not. This is certainly the view that is most consistent with the evidence, and it is, in fact, the one that the early proponents of the “junk DNA” concept actually held. As I noted in a previous post, the ENCODE authors have to work pretty hard to even get the 80% figure, which would still leave an awful lot of non-functional nucleotides in the genome.
Let’s also not forget about this, written in 1972:
These considerations suggest that up to 20% of the genome is actively used and the remaining 80+% is junk. But being junk doesn’t mean it is entirely useless. Common sense suggests that anything that is completely useless would be discarded. There are several possible functions for junk DNA.
Comings, D.E. 1972. The structure and function of chromatin. Advances in Human Genetics 3: 237-431.