The Climate Modeling Leak: Code and Data Generating Published Results Must be Open and Facilitate Reproducibility

On November 20, documents including email and code spanning more than a decade were leaked from the Climatic Research Unit (CRU) at the University of East Anglia in the UK.

The Leak Reveals a Failure of Reproducibility of Computational Results

It appears as though the leak came about through a long battle to get the CRU scientists to reveal the code and data associated with published results, and highlights a crack in the scientific method as practiced in computational science. Publishing standards have not yet adapted to the relatively new computational methods used pervasively across scientific research today.

Other branches of science have long-established methods to bring reproducibility into their practice. Deductive or mathematical results are published only with proofs, and there are long-established standards for an acceptable proof. Empirical science contains clear mechanisms for communicating methods, with the goal of facilitating replication. Computational methods are a relatively new addition to a scientist’s toolkit, and the scientific community is only just establishing similar standards for verification and reproducibility in this new context. Peer review and journal publishing have generally not yet adapted to the use of computational methods and still operate in the way suited to the deductive or empirical branches, creating a growing credibility gap in computational science.

Verifying Computational Results without Clear Communication of the Steps Taken is Near-Impossible

When reproducibility is not treated as a research goal, verifying computational results is often nearly impossible, as shown by the miserable travails of “Harry,” a CRU employee with access to their system who was trying to reproduce the temperature results. The leaked documents contain logs of his unsuccessful attempts. It seems reasonable to conclude that CRU’s published results aren’t reproducible if Harry, an insider, was unable to reproduce them after four years.

This example also illustrates why a decision to leave reproducibility to others, beyond a cursory description of methods in the published text, is wholly inadequate for computational science. Harry seems to have had access to the data and code used, yet he couldn’t replicate the results. The merging and preprocessing of data in preparation for modeling and estimation encompasses a potentially very large number of steps, and a change in any one could produce different results. The same holds when fitting models or running simulations: parameter settings and function invocation sequences must be communicated, because the final results are the culmination of many decisions, and without this information a replicator must independently match every small step of the original work, a Herculean task. Responding with raw data when questioned about computational results is merely a canard, not intended to seriously facilitate reproducibility.
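To make the point about communicating steps concrete, here is a minimal sketch in Python of one way a preprocessing pipeline could record each function invocation, its parameter settings, and a fingerprint of its output. Everything in it is hypothetical and chosen for illustration: the helper names (fingerprint, run_pipeline), the two toy steps, and the sample numbers are assumptions, not anything taken from CRU's code or data.

```python
import hashlib
import json


def fingerprint(values):
    """Return a short SHA-256 digest of a JSON-serializable value."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_pipeline(data, steps):
    """Apply each (function, params) step in order and log what was done."""
    log = [{"step": "input", "params": {}, "output_hash": fingerprint(data)}]
    for func, params in steps:
        data = func(data, **params)
        log.append({
            "step": func.__name__,
            "params": params,
            "output_hash": fingerprint(data),
        })
    return data, log


# Hypothetical preprocessing steps, for illustration only.
def drop_missing(series, sentinel):
    """Remove readings equal to the missing-value sentinel."""
    return [x for x in series if x != sentinel]


def to_anomalies(series, baseline):
    """Express each reading as a difference from a baseline value."""
    return [round(x - baseline, 3) for x in series]


# Made-up readings; -999.0 marks a missing value.
raw = [14.1, -999.0, 14.3, 14.6, -999.0, 15.0]
steps = [
    (drop_missing, {"sentinel": -999.0}),
    (to_anomalies, {"baseline": 14.0}),
]

result, log = run_pipeline(raw, steps)
print(json.dumps(log, indent=2))
```

Publishing a log like this alongside the data lets a reader see which step, with which settings, produced each intermediate result, and a mismatch in any output_hash points to the first step where a replication diverges.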

The story of Penn State professor of meteorology Michael Mann’s famous hockey stick temperature time series estimates is an example where lack of verifiability had important consequences. Release of the code and data used to generate the results in the hockey stick paper likely would have avoided the convening of panels to assess the papers. The hockey stick is a dramatic illustration of global warming and became something of a logo for the U.N.’s Intergovernmental Panel on Climate Change (IPCC). Mann was an author of the 2001 IPCC Assessment report, and was a lead author on the “Copenhagen Diagnosis,” a report released Nov 24 and intended to synthesize the hundreds of research papers about human-induced climate change that have been published since the last assessment by the IPCC two years ago. The report was prepared in advance of the Copenhagen climate summit scheduled for Dec 7-18. Emails between CRU researchers and Mann are included in the leak, which happened right before the release of the Copenhagen Diagnosis (a quick search of the leaked emails for “Mann” provided 489 matches).

These reports are important in part because of their impact on policy. As CBS News reports, “In global warming circles, the CRU wields outsize influence: it claims the world’s largest temperature data set, and its work and mathematical models were incorporated into the United Nations Intergovernmental Panel on Climate Change’s 2007 report. That report, in turn, is what the Environmental Protection Agency acknowledged it ‘relies on most heavily’ when concluding that carbon dioxide emissions endanger public health and should be regulated.”

Discussions of Appropriate Level of Code and Data Disclosure on, Before and After the CRU Leak

For years researchers had requested the data and programs used to produce Mann’s Hockey Stick result, and met resistance. The repeated requests for code and data culminated in Freedom of Information (FOI) requests, in particular those made by Willis Eschenbach, who tells his story of requests he made for underlying code and data up until the time of the leak. It appears that a file was placed on CRU’s FTP server and then comments alerting people to its existence were posted on several key blogs.

The thinking regarding disclosure of code and data in one part of the climate change community is illustrated in this fascinating discussion on the blog in February. (Thank you to Michael Nielsen for the pointer.) The blog has 5 primary authors, one of whom is Michael Mann, and its primary author is Gavin Schmidt. In this RealClimate blog post from November 27, “Where’s the Data,” the position now seems to be very much in favor of data release, but the first comment asks for the steps taken in reconstructing the results as well. This is right: reproducibility of results should be the concern (as argued here for example).

Policy and Public Relations

The Hill’s Blog Briefing Room reported that Senator Inhofe (R-Okla.) will investigate whether the IPCC “cooked the science to make this thing look as if the science was settled, when all the time of course we knew it was not.” With the current emphasis on evidence-based policy making, Inhofe’s review should recommend code and data release and require reliance on verified scientific results in policy making. The Federal Research Public Access Act should be modified to include reproducibility in publicly funded research.

A dangerous ramification from the leak could be an undermining of public confidence in science and the conduct of scientists. My sense is that making code and data readily available in a way that facilitates reproducibility of results can help avoid distractions from the real science, such as potential evasions of FOIA requests, whether data were fudged, or whether scientists acted improperly in squelching dissent or manipulating journal editorial boards. Perhaps data release is becoming an accepted norm, but code release for reproducibility must follow. The issue here is verification and reproducibility, without which it is all but impossible to tell whether the core science done at CRU was correct or not, even for peer reviewing scientists.


7 Responses to The Climate Modeling Leak: Code and Data Generating Published Results Must be Open and Facilitate Reproducibility

  1. Pingback: Caesar’s Wife « Software Carpentry

  2. Dan says:

    Excellent work. Please let me add that Rajendra K Pachauri, chairman of the IPCC, was given an honorary Doctorate by the University of East Anglia. So don’t expect impartiality from him.

  3. vcs says:

    Thanks Dan. Also interesting is that Phil Jones, head of CRU, is stepping down pending an independent investigation.

    Michael Mann is also under investigation by Penn State, his employer, for possible research ethics violations.

  4. Victoria,
    I think your analysis is too simplistic. Open data and open source on its own is a fine and laudable goal, but is not the problem in the CRU case, nor in the wider field of climate modeling.

    First, most climate code and data is freely available. The research results are reproduced widely throughout the field, and the comparison and validation of results is built into the IPCC process. Can you name any other field of science in which 25 different research centres around the world are building large scale models of the same phenomena, and comparing the results in detail through controlled model inter-comparison projects (see CMIP5 for a taste)?

    Second, climate phenomena are sufficiently complex that *exact* reproducibility is effectively impossible. This deserves a longer explanation (and I’m working on a paper on that).
    So what is needed is a different approach to reproducibility, which is to independently arrive at comparable results via different methods. This is exactly what the climate science community has been doing for decades, and it is very effective (unfortunately, little has been written on this, but I’m working on it – it’s a fascinating discipline).

    Finally, you’ve blown the Mann and Jones work out of all proportion. The field of dendro-chronology is a minor sideshow in climate science, and a very immature discipline. Mann’s early work on this had errors in it, but of course it did – that’s what science is! Subsequent work has improved the methodologies, without changing the results at all. But then the politicians and lawyers get involved, fail to understand the scientific process, and think that science should be a perfect process every step of the way. Can you honestly say that any of your publications could stand up to the kind of scrutiny at congressional levels that Mann’s has been subjected to? This is not how we do science.

    The net result of politically inspired attacks on climate scientists has been an understandable siege mentality, which is what you see in the CRU emails. I have characterized it as a kind of denial of service attack, and it’s not healthy for science in any way.

  5. vcs says:

    Hi Steve,

    The simple availability of code and data isn’t enough – these need to be shared in such a way that the results can be verified. According to HARRY’s readme file he apparently had access to code and data from within the CRU and still wasn’t able to replicate results. The file also shows that layers and layers of complex data processing are involved in the work he was trying to replicate – all part of the scientific research, and important for understanding and evaluating the results. I haven’t questioned the scientific results, but secretiveness or obfuscation with regard to code and data makes it easier to believe computational results could possibly be in error, regardless of the field. Reproducibility of computational results is not a concern unique to climate science.

    Reproducibility is being taken seriously in a growing number of fields, such as among seismic researchers and in the machine learning community. I don’t agree that the potential for public harassment is an argument for an exception to the scientific method. I’d argue instead that openness is especially important in the case of climate change because of its salience in public policy. I have a previous blog post on the release of government data arguing that the answer to bad speech is more speech (doesn’t necessarily have to come from the original researchers).

    Sharing code and data from results that have not been prepared with the intention that they are to be reproducible is tough and I see your work as vital in this effort, for example your two recent papers in CiSE. I agree we self-correct in science, but this comes through openness and disclosure. Encouragingly, the view that science involves reproducibility of results appears to be resonating. Today The Times reported that The Met will conduct a review of its temperature data since it used work emanating from the CRU – and it will do this in a transparent fashion because it “wants to create a new and fully open method of analysing temperature data.” In yesterday’s Scientific American Michael Mann was quoted saying his most recent publication in Science includes the underlying code and data as a supplement – I’m hoping it was released in a reproducible fashion, and that we get to a place where our published results are routinely verifiable.

  6. Pingback: Software Quality in Climate Research | Serendipity

  7. Pingback: Victoria Stodden
