Saturday, March 16, 2013

Sandwalk: Anonymous Nature Editors Respond to ENCODE Criticism

There have now been four papers in the scientific literature criticizing the way ENCODE leaders hyped their data by claiming that most of our genome is functional [see Ford Doolittle's Critique of ENCODE]. There have been dozens of blog postings on the same topic.

The worst of the papers were published by Nature; this includes the abominable summary that should never have made it past peer review (ENCODE Project Consortium, 2012).

The lead editor on the ENCODE story was Brendan Maher and he promoted the idea that the ENCODE results showed that most of our genome has a function [ENCODE: The human encyclopaedia].

The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 'promoter' regions – the sites, just upstream of genes, where proteins bind to control gene expression – and nearly 400,000 'enhancer' regions that regulate expression of distant genes.

But the very next day (Sept. 6, 2012) Brendan Maher got wind of the controversy and started to defend Nature's decisions. He quoted several bloggers, including me [Fighting about ENCODE and junk]. His main defense was ...
ENCODE was conceived of and practised as a resource-building exercise. In general, such projects have a huge potential impact on the scientific community, but they don't get much attention in the media. The journal editors and authors at ENCODE collaborated over many months to make the biggest splash possible and capture the attention of not only the research community but also of the public at large.

In other words, the editors of Nature thought about this for several months and then decided that it was okay to attack junk DNA because that would make a big splash in the media.

A few days ago (March 12, 2013) the editors of Nature published another response to criticism [Form and Function]. These editors don't identify themselves.

Let's see how they do by analyzing each part of the editorial. Let's begin with the subtitle ...

Although debate over scientific definitions is important, it risks obscuring the real issues.

The real issues are whether most of our genome is functional or not and whether the ENCODE leaders understood the concept of noise and chance associations. I hope that Nature realizes that it really screwed up by allowing stupid definitions of function to obscure those issues, giving rise to the idea that junk DNA was debunked. Let's see if they understand where they went wrong.

Science is at the mercy of its language. It can be difficult for researchers to communicate what most excites them about the beauty, intricacy and complexity of the natural world. And when words fail, debates and arguments often arise.

One enduring debate has been resurrected by ENCODE, the Encyclopedia of DNA Elements – an ongoing multimillion-dollar project to catalogue the functional elements of the human genome. A headline-grabbing claim, first made in this publication last September, was that roughly 80% of human DNA had been ascribed some 'biochemical function' thanks to the efforts of more than 440 scientists (The ENCODE Project Consortium Nature 489, 57–74; 2012).

That percentage is remarkably high, in part because of a broad definition of 'function'. The ENCODE team used the term to include binding by a regulatory protein, or transcription into RNA – activities identified as widespread. But almost immediately, other scientists began to take this definition to task, calling it essentially meaningless.

They got that part right. The immediate reaction to the Nature papers is that the journal made a big mistake by using a silly definition of "function", one that was bound to be misinterpreted by everyone. Many of us thought (and still think) that the authors actually believed that most of the genome is functional in the classic sense. In other words, it's not at all clear that there's a difference between the ENCODE definition of function and the definition used by everyone else, at least in the minds of the ENCODE leaders.

Some background is useful. Genomes vary dramatically in size – sometimes irrespective of the complexity of the organism. Take, for example, the genome of the marbled lungfish (Protopterus aethiopicus), which clocks in at an excessive 133 billion base pairs. That of the pufferfish (Takifugu rubripes), by contrast, sports only 365 million.

For the ENCODE paper to suggest that humans have little genomic redundancy implies that the 3.2-billion-base-pair human genome hits a sweet spot in efficiency. Critics suggested, sometimes sharply, that this was both anthropocentric and ignorant of how evolution shapes the genome. Much of human DNA is non-functional, they insisted. It is a relic of history, garbled by mutation and essentially junk.

The most recent formal critique, published this week, suggests that similar analyses on organisms with very large and very small genomes would probably find the same density of functional elements (W. F. Doolittle Proc. Natl Acad. Sci. USA http://doi.org/kr3; 2013). This investigation has yet to be done.

This is Ford Doolittle's critique but there are many others. Clearly, it's important to bring up this issue (variations in C-value) when discussing the possibility of junk DNA. I don't recall that this discussion took place in the original Nature papers or editorial comments. In fact, I don't recall any substantive discussion of junk or the possibility of "non-function" in any of the papers I read. Can anyone else find a reference?

The debate over ENCODE's definition of function retreads some old battles, dating back perhaps to geneticist Susumu Ohno's coinage of the term junk DNA in the 1970s. The phrase has had a polarizing effect on the life-sciences community ever since, despite several revisions of its meaning. Indeed, many news reports and press releases describing ENCODE's work claimed that by showing that most of the genome was 'functional', the project had killed the concept of junk DNA. This claim annoyed both those who thought it a premature obituary and those who considered it old news.

There is a valuable and genuine debate here. To define what, if anything, the billions of non-protein-coding base pairs in the human genome do, and how they affect cellular and system-level processes, remains an important, open and debatable question. Ironically, it is a question that the language of the current debate may detract from. As Ewan Birney, co-director of the ENCODE project, noted on his blog: 'Hindsight is a cruel and wonderful thing, and probably we could have achieved the same thing without generating this unneeded, confusing discussion on what we meant and how we said it' (see go.nature.com/8xorge).

Excellent! I'm glad to see that the editors are admitting some responsibility even though they are shifting most of the blame to "big talker" Ewan Birney and not to their reviewers (or themselves). On the other hand, to claim that junk DNA is still an "open and debatable question" seems like a bit of a cop-out. Yes, it's "debatable" but the proponents of junk DNA will probably win any debates. Rumors of the death of junk DNA are not "premature" and they are not "old news." Most of our genome is junk whether the ENCODE leaders believe it or not. It's a fact even if the editors of Nature are skeptical.

Any knowledgeable reviewer would have said the same thing. They would have pointed out that the discussions of function have to include all of the data suggesting that most of these sites are nothing but nonfunctional noise. After all, we went through this same debate in 2007 when the preliminary ENCODE data was published. Ignoring this possibility is not good science. Good scientists think of ways their data could be falsified and they give appropriate credit to other interpretations that disagree with their own. It looks like the ENCODE scientists learned nothing in 2007 [see The ENCODE Data Dump and the Responsibility of Science Journalists for a discussion of what happened in 2007].

We didn't see very much of that kind of good science in Nature last September and I'm still not seeing much of it here.

The ferocity of the criticism has no doubt been fuelled by dissatisfaction over ENCODE's top-down, big-science approach and the large share of research funds that it has attracted. Many biologists have called the 80% figure more a publicity stunt than a statement of scientific fact. Nevertheless, ENCODE leaders say, the data resources that they have provided have been immensely popular. So far, papers that use the data have outnumbered those that take aim at the definition of function.

If you read Dan Graur's critique you'll see that the data resources are difficult to use and that they are contaminated by an emphasis on function. I think most biologists would be happy if the huge amount of money spent on the project really did yield useful databases. That's by no means certain.

And just because a lot of people might be using the data is no excuse for the publicity hype that misled thousands of scientists and all of the general public. It will take a long time to undo the damage caused by Nature (and Science) and the ENCODE leaders. Most people now believe that our genome is packed with important regulatory features and that junk DNA no longer exists. I would be more impressed with this editorial if they made it clear that such conclusions were not supported by the ENCODE data.

The debate sounds like a matter of definitional differences. But to dismiss it as semantics minimizes the importance of words and definitions, and of how they are used to engage in research and to communicate findings. ENCODE continues to collect data and to characterize what the 3.2 billion base pairs might be doing in our genome and whether that activity is important. If a better word than 'function' is needed to describe those activities, so be it. Suggestions on a postcard please.

The editors are correct. This isn't just about semantics. Do they really need help in defining the abundant binding sites and transcripts that ENCODE found? If so, then clearly they haven't learned their lesson.

As Martin Hafner says in the second comment on the Nature website ...

'The English word to describe those activities you mention in your last paragraph is noise.'

Funny that the editors never discuss this possibility, isn't it?

The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57-74. (E. Birney, corresponding author)

Source: http://sandwalk.blogspot.com/2013/03/anonymous-nature-editors-respond-to.html
