<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The Climate Modeling Leak: Code and Data Generating Published Results Must be Open and Facilitate Reproducibility</title>
	<atom:link href="http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Mon, 17 May 2010 01:59:26 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Victoria Stodden</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-708</link>
		<dc:creator>Victoria Stodden</dc:creator>
		<pubDate>Mon, 21 Dec 2009 23:48:54 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-708</guid>
		<description>[...] recent file leak from a major climate modeling center in England (I blogged my response to the leak here). The video is here, see especially 16:27, and the transcript is [...]</description>
		<content:encoded><![CDATA[<p>[...] recent file leak from a major climate modeling center in England (I blogged my response to the leak here). The video is here, see especially 16:27, and the transcript is [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Software Quality in Climate Research &#124; Serendipity</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-707</link>
		<dc:creator>Software Quality in Climate Research &#124; Serendipity</dc:creator>
		<pubDate>Tue, 08 Dec 2009 15:09:42 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-707</guid>
		<description>[...] An argument that when a paper is published, all of the code and data on which it is based should be released so that other scientists (who have the appropriate background) can re-run it and validate the results. In fields with complex, messy datasets, this is exceedingly hard, but might be achievable with good tools. The complete toolset needed to do this does not exist today, so just calling for making the code open source is pointless. Much climate code is already open source, but that doesn&#8217;t mean anyone in another lab can repeat a run and check the results. The problems of reproducibility have very little to do with whether the code is open &#8211; the key problem is to capture the entire scientific workflow and all data provenance. This is very much an active line of research, and we have a long way to go. In the absence of this, climate scientists rely on other scientists testing the results with other methods, rather than repeating the same tests. Which is the way it&#8217;s done in most branches of science. [...]</description>
		<content:encoded><![CDATA[<p>[...] An argument that when a paper is published, all of the code and data on which it is based should be released so that other scientists (who have the appropriate background) can re-run it and validate the results. In fields with complex, messy datasets, this is exceedingly hard, but might be achievable with good tools. The complete toolset needed to do this does not exist today, so just calling for making the code open source is pointless. Much climate code is already open source, but that doesn&#8217;t mean anyone in another lab can repeat a run and check the results. The problems of reproducibility have very little to do with whether the code is open &#8211; the key problem is to capture the entire scientific workflow and all data provenance. This is very much an active line of research, and we have a long way to go. In the absence of this, climate scientists rely on other scientists testing the results with other methods, rather than repeating the same tests. Which is the way it&#8217;s done in most branches of science. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: vcs</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-705</link>
		<dc:creator>vcs</dc:creator>
		<pubDate>Sun, 06 Dec 2009 01:52:54 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-705</guid>
		<description>Hi Steve,
&lt;P&gt;
The simple availability of code and data isn&#039;t enough - these need to be shared in such a way that the results can be verified. According to &lt;a href=&quot;http://www.anenglishmanscastle.com/HARRY_READ_ME.txt&quot; rel=&quot;nofollow&quot;&gt;HARRY&#039;s readme file&lt;/a&gt; he apparently had access to code and data from within the CRU and still wasn&#039;t able to replicate results. The file also shows that layers and layers of complex data processing are involved in the work he was trying to replicate - all part of the scientific research, and important for understanding and evaluating the results. I haven&#039;t questioned the scientific results, but secretiveness or obfuscation with regard to code and data makes it easier to believe computational results could possibly be in error, regardless of the field. Reproducibility of computational results is not a concern unique to climate science.
&lt;P&gt;
Reproducibility is being taken seriously in a growing number of fields, such as among &lt;a href=&quot;http://sepwww.stanford.edu/data/media/public/sep//jon/reproducible.html&quot; rel=&quot;nofollow&quot;&gt;seismic researchers&lt;/a&gt; and in the &lt;a href=&quot;http://videolectures.net/mloss08_braun_rr/&quot; rel=&quot;nofollow&quot;&gt;machine learning community&lt;/a&gt;. I don&#039;t agree that the potential for public harassment is an argument for an exception to the scientific method. I&#039;d argue instead that openness is especially important in the case of climate change because of its salience in public policy. I have a &lt;a href=&quot;http://blog.stodden.net/2009/09/27/optimal-information-disclosure-levels-datagov-and-talebs-criticism/&quot; rel=&quot;nofollow&quot;&gt;previous blog post&lt;/a&gt; on the release of government data arguing that the answer to bad speech is more speech (doesn&#039;t necessarily have to come from the original researchers).
&lt;P&gt;
Sharing code and data from results that have not been prepared with the intention that they are to be reproducible is tough and I see your work as vital in this effort, for example your two recent papers in CiSE. I agree we self-correct in science, but this comes through openness and disclosure. Encouragingly, the view that science involves reproducibility of results appears to be resonating. Today &lt;a href=&quot;http://www.timesonline.co.uk/tol/news/environment/article6945445.ece&quot; rel=&quot;nofollow&quot;&gt;The Times reported&lt;/a&gt; that The Met will conduct a review of its temperature data since it used work emanating from the CRU - and it will do this in a transparent fashion because it &quot;wants to create a new and fully open method of analysing temperature data.&quot; In &lt;a href=&quot;http://www.scientificamerican.com/article.cfm?id=scientists-respond-to-climategate-controversy&quot; rel=&quot;nofollow&quot;&gt;yesterday&#039;s Scientific American&lt;/a&gt; Michael Mann was quoted saying his most recent publication in Science includes the underlying code and data as a supplement - I&#039;m hoping it was released in a reproducible fashion, and that we get to a place where our published results are routinely verifiable.</description>
		<content:encoded><![CDATA[<p>Hi Steve,</p>
<p>
The simple availability of code and data isn&#8217;t enough &#8211; these need to be shared in such a way that the results can be verified. According to <a href="http://www.anenglishmanscastle.com/HARRY_READ_ME.txt" rel="nofollow">HARRY&#8217;s readme file</a> he apparently had access to code and data from within the CRU and still wasn&#8217;t able to replicate results. The file also shows that layers and layers of complex data processing are involved in the work he was trying to replicate &#8211; all part of the scientific research, and important for understanding and evaluating the results. I haven&#8217;t questioned the scientific results, but secretiveness or obfuscation with regard to code and data makes it easier to believe computational results could possibly be in error, regardless of the field. Reproducibility of computational results is not a concern unique to climate science.
</p>
<p>
Reproducibility is being taken seriously in a growing number of fields, such as among <a href="http://sepwww.stanford.edu/data/media/public/sep//jon/reproducible.html" rel="nofollow">seismic researchers</a> and in the <a href="http://videolectures.net/mloss08_braun_rr/" rel="nofollow">machine learning community</a>. I don&#8217;t agree that the potential for public harassment is an argument for an exception to the scientific method. I&#8217;d argue instead that openness is especially important in the case of climate change because of its salience in public policy. I have a <a href="http://blog.stodden.net/2009/09/27/optimal-information-disclosure-levels-datagov-and-talebs-criticism/" rel="nofollow">previous blog post</a> on the release of government data arguing that the answer to bad speech is more speech (doesn&#8217;t necessarily have to come from the original researchers).
</p>
<p>
Sharing code and data from results that have not been prepared with the intention that they are to be reproducible is tough and I see your work as vital in this effort, for example your two recent papers in CiSE. I agree we self-correct in science, but this comes through openness and disclosure. Encouragingly, the view that science involves reproducibility of results appears to be resonating. Today <a href="http://www.timesonline.co.uk/tol/news/environment/article6945445.ece" rel="nofollow">The Times reported</a> that The Met will conduct a review of its temperature data since it used work emanating from the CRU &#8211; and it will do this in a transparent fashion because it &#8220;wants to create a new and fully open method of analysing temperature data.&#8221; In <a href="http://www.scientificamerican.com/article.cfm?id=scientists-respond-to-climategate-controversy" rel="nofollow">yesterday&#8217;s Scientific American</a> Michael Mann was quoted saying his most recent publication in Science includes the underlying code and data as a supplement &#8211; I&#8217;m hoping it was released in a reproducible fashion, and that we get to a place where our published results are routinely verifiable.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Easterbrook</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-704</link>
		<dc:creator>Steve Easterbrook</dc:creator>
		<pubDate>Thu, 03 Dec 2009 19:23:07 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-704</guid>
		<description>Victoria,
I think your analysis is too simplistic. Open data and open source on its own is a fine and laudable goal, but is not the problem in the CRU case, nor in the wider field of climate modeling.

First, most climate code and data is freely available. The research results are reproduced widely throughout the field, and the comparison and validation of results is built into the IPCC process. Can you name any other field of science in which 25 different research centres around the world are building large-scale models of the same phenomena and comparing the results in detail through controlled model inter-comparison projects (see CMIP5 for a taste: http://cmip-pcmdi.llnl.gov/cmip5/)?

Second, climate phenomena are sufficiently complex that *exact* reproducibility is effectively impossible. This deserves a longer explanation (and I&#039;m working on a paper on that), but for a taste, see: http://moregrumbinescience.blogspot.com/2009/11/data-set-reproducibility.html
So what is needed is a different approach to reproducibility, which is to independently arrive at comparable results via different methods. Which is exactly what the climate science community has been doing for decades, and it is very effective (unfortunately, little has been written on this, but I&#039;m working on it - it&#039;s a fascinating discipline).

Finally, you&#039;ve blown the Mann and Jones work out of all proportion. The field of dendro-chronology is a minor sideshow in climate science, and a very immature discipline. Mann&#039;s early work on this had errors in it, but of course it did - that&#039;s what science is! Subsequent work has improved the methodologies, without changing the results at all. But then the politicians and lawyers get involved, fail to understand the scientific process, and think that science should be a perfect process every step of the way. Can you honestly say that any of your publications could stand up to the kind of scrutiny at congressional levels that Mann&#039;s has been subjected to? This is not how we do science.

The net result of politically inspired attacks on climate scientists is an understandable siege mentality, which is what you see in the CRU emails. I have characterized it as a kind of denial of service attack, and it&#039;s not healthy for science in any way. Here&#039;s my take: http://www.easterbrook.ca/steve/?p=1001</description>
		<content:encoded><![CDATA[<p>Victoria,<br />
I think your analysis is too simplistic. Open data and open source on its own is a fine and laudable goal, but is not the problem in the CRU case, nor in the wider field of climate modeling.</p>
<p>First, most climate code and data is freely available. The research results are reproduced widely throughout the field, and the comparison and validation of results is built into the IPCC process. Can you name any other field of science in which 25 different research centres around the world are building large-scale models of the same phenomena and comparing the results in detail through controlled model inter-comparison projects (see CMIP5 for a taste: <a href="http://cmip-pcmdi.llnl.gov/cmip5/" rel="nofollow">http://cmip-pcmdi.llnl.gov/cmip5/</a>)?</p>
<p>Second, climate phenomena are sufficiently complex that *exact* reproducibility is effectively impossible. This deserves a longer explanation (and I&#8217;m working on a paper on that), but for a taste, see: <a href="http://moregrumbinescience.blogspot.com/2009/11/data-set-reproducibility.html" rel="nofollow">http://moregrumbinescience.blogspot.com/2009/11/data-set-reproducibility.html</a><br />
So what is needed is a different approach to reproducibility, which is to independently arrive at comparable results via different methods. Which is exactly what the climate science community has been doing for decades, and it is very effective (unfortunately, little has been written on this, but I&#8217;m working on it &#8211; it&#8217;s a fascinating discipline).</p>
<p>Finally, you&#8217;ve blown the Mann and Jones work out of all proportion. The field of dendro-chronology is a minor sideshow in climate science, and a very immature discipline. Mann&#8217;s early work on this had errors in it, but of course it did &#8211; that&#8217;s what science is! Subsequent work has improved the methodologies, without changing the results at all. But then the politicians and lawyers get involved, fail to understand the scientific process, and think that science should be a perfect process every step of the way. Can you honestly say that any of your publications could stand up to the kind of scrutiny at congressional levels that Mann&#8217;s has been subjected to? This is not how we do science.</p>
<p>The net result of politically inspired attacks on climate scientists is an understandable siege mentality, which is what you see in the CRU emails. I have characterized it as a kind of denial of service attack, and it&#8217;s not healthy for science in any way. Here&#8217;s my take: <a href="http://www.easterbrook.ca/steve/?p=1001" rel="nofollow">http://www.easterbrook.ca/steve/?p=1001</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: vcs</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-703</link>
		<dc:creator>vcs</dc:creator>
		<pubDate>Tue, 01 Dec 2009 22:06:36 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-703</guid>
		<description>Thanks Dan. Also interesting is that Phil Jones, head of CRU, is stepping down pending an independent investigation:
http://dotearth.blogs.nytimes.com/2009/12/01/head-of-climate-unit-steps-down-pending-inquiry/

Michael Mann is also under investigation by Penn State, his employer, for possible research ethics violations:
http://blogs.sciencemag.org/scienceinsider/2009/11/climate-hack-sc.html</description>
		<content:encoded><![CDATA[<p>Thanks Dan. Also interesting is that Phil Jones, head of CRU, is stepping down pending an independent investigation:<br />
<a href="http://dotearth.blogs.nytimes.com/2009/12/01/head-of-climate-unit-steps-down-pending-inquiry/" rel="nofollow">http://dotearth.blogs.nytimes.com/2009/12/01/head-of-climate-unit-steps-down-pending-inquiry/</a></p>
<p>Michael Mann is also under investigation by Penn State, his employer, for possible research ethics violations:<br />
<a href="http://blogs.sciencemag.org/scienceinsider/2009/11/climate-hack-sc.html" rel="nofollow">http://blogs.sciencemag.org/scienceinsider/2009/11/climate-hack-sc.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-702</link>
		<dc:creator>Dan</dc:creator>
		<pubDate>Tue, 01 Dec 2009 19:00:58 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-702</guid>
		<description>Excellent work. Please let me add that Rajendra K Pachauri, chairman of the IPCC, was given an honorary Doctorate by the University of East Anglia. So don&#039;t expect impartiality from him.
Dan</description>
		<content:encoded><![CDATA[<p>Excellent work. Please let me add that Rajendra K Pachauri, chairman of the IPCC, was given an honorary Doctorate by the University of East Anglia. So don&#8217;t expect impartiality from him.<br />
Dan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Caesar&#8217;s Wife &#171; Software Carpentry</title>
		<link>http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/comment-page-1/#comment-701</link>
		<dc:creator>Caesar&#8217;s Wife &#171; Software Carpentry</dc:creator>
		<pubDate>Tue, 01 Dec 2009 10:46:03 +0000</pubDate>
		<guid isPermaLink="false">http://blog.stodden.net/?p=114#comment-701</guid>
		<description>[...] see also Victoria Stodden&#8217;s post. Possibly related posts: (automatically generated)American Scientist Article on How Scientists Use [...]</description>
		<content:encoded><![CDATA[<p>[...] see also Victoria Stodden&#8217;s post. Possibly related posts: (automatically generated)American Scientist Article on How Scientists Use [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
