Letter Re Software and Scientific Publications – Nature

Mark Gerstein and I penned a reaction to two pieces published in Nature News last October, “Publish your computer code: it is good enough,” by Nick Barnes, and “Computational Science…. Error” by Zeeya Merali. Nature declined to publish our note, so here it is.

Dear Editor,

We have read with great interest the recent pieces in Nature about the importance of computer codes associated with scientific manuscripts. As participants in the Yale roundtable mentioned in one of the pieces, we agree that these codes must be constructed robustly and distributed widely. However, we disagree with the implicit assertion that computer codes are a component separate from the actual publication of scientific findings, often neglected in preference to the manuscript text in the race to publish. More and more, the key research results in papers are not fully contained within the small amount of manuscript text allotted to them. That is, the crucial aspects of many Nature papers are often sophisticated computer codes, and these cannot be separated from the prose narrative communicating the results of computational science. If the computer code associated with a manuscript were laid out according to accepted software standards, made openly available, and looked over by the journal as thoroughly as the text in the figure legends, many of the issues alluded to in the two pieces would simply disappear overnight.

The approach taken by the journal Biostatistics serves as an exemplar: code and data are submitted to a designated “reproducibility editor” who tries to replicate the results. If he or she succeeds, the first page of the article is kitemarked “R” (for reproducible) and the code and data are made available as part of the publication. We propose that high-quality journals such as Nature have not only editors and reviewers who focus on the prose of a manuscript but also “computational editors” who look over computer codes and verify results. Moreover, many of the points made here in relation to computer codes apply equally well to the large datasets that underlie experimental manuscripts. These are often organized, formatted, and deposited into databases as an afterthought. Thus, one could also imagine a “data editor” who would look after these aspects of a manuscript. All in all, we have to come to the realization that current scientific papers are more complicated than just a few thousand words of narrative text and a couple of figures, and we need to update journals to handle this reality.

Yours sincerely,

Mark Gerstein (1,2,3)
Victoria Stodden (4)

(1) Program in Computational Biology and Bioinformatics,
(2) Department of Molecular Biophysics and Biochemistry, and
(3) Department of Computer Science,
Yale University, Bass 432, 266 Whitney Avenue, New Haven, CT 06520
Mark.Gerstein@Yale.edu

(4) Department of Statistics, Columbia University, 1255 Amsterdam Ave, New York, NY 10027
vcs@stodden.net

4 Responses to “Letter Re Software and Scientific Publications – Nature”


  • I also wholeheartedly agree with the points raised, and it’s a shame they were not published. As well as the editorial policies outlined in Biostatistics, the new journal Open Research Computation is requiring independent code audits, and we at the BGI are keen to follow the “data editor” approach with our upcoming “big-data” journal GigaScience. Doing such assessments properly may be a significant amount of work, so it’s always good to hear confirmation that it’s worth the effort. Any ideas or feedback regarding the practicalities and willingness of potential reviewers/editors to devote some of their time to this would also be very interesting to hear.

  • Dear Dr Stodden
    Discussion of your suggestion that journals run/verify code before publication is going on at Steve McIntyre’s Climate Audit blog.

    http://climateaudit.org/2011/02/27/shub-niggurath-on-archiving-code/

    Any feedback/comments would be welcome.

    Thanks.

  • I would love to see ‘reproducibility editors’ everywhere, and it’s completely obvious to me that full code publication is a future norm of science. But it’s a long, long way from here to there, and many scientists don’t yet understand the destination or realise the necessity of the journey. The Climate Code Foundation is working on baby steps.
