<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Victoria Stodden &#187; Intellectual Property</title>
	<atom:link href="http://blog.stodden.net/category/intellectual-property/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.stodden.net</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Sun, 16 May 2010 02:13:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Open Data Dead on Arrival</title>
		<link>http://blog.stodden.net/2010/02/03/open-data-dead-on-arrival/</link>
		<comments>http://blog.stodden.net/2010/02/03/open-data-dead-on-arrival/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 04:17:27 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Developing world]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=174</guid>
		<description><![CDATA[In 1984 Karl Popper wrote a private letter to an inquirer he didn&#8217;t know, responding to enclosed interview questions. The response was subsequently published and in it he wrote, among other things, that: &#8220;Every intellectual has a very special responsibility. He has the privilege and opportunity of studying. In return, he owes it to his [...]]]></description>
			<content:encoded><![CDATA[<p>In 1984 Karl Popper wrote a private letter to an inquirer he didn&#8217;t know, responding to enclosed interview questions. The response was subsequently published and in it he wrote, among other things, that:</p>
<blockquote><p>
&#8220;Every intellectual has a very special responsibility. He has the privilege and opportunity of studying. In return, he owes it to his fellow men (or &#8216;to society&#8217;) to represent the results of his study as simply, clearly and modestly as he can. The worst thing that intellectuals can do &#8212; the cardinal sin &#8212; is to try to set themselves up as great prophets vis-a-vis their fellow men and to impress them with puzzling philosophies. Anyone who cannot speak simply and clearly should say nothing and continue to work until he can do so.&#8221;
</p></blockquote>
<p>Aside from the offensive sexism in referring to intellectuals as males, there is another way this imperative should be updated for intellectualism today. The movement to make data available online is picking up momentum &#8212; as it should &#8212; and open code is following suit (see <a href=http://mloss.org/>http://mloss.org</a> for example). But data should not be confused with facts, and applying the simple communication that Popper refers to beyond the written or spoken word is the only way open data will produce dividends. It isn&#8217;t enough to post raw data, or undocumented code. Data and code should be considered part of intellectual communication, and made as simple as possible for &#8220;fellow men&#8221; to understand. Just as knowledge of adequate English vocabulary is assumed in the nonquantitative communication Popper refers to, certain basic coding and data knowledge can be assumed as well. This means the same thing as it does in the literary case: the elimination of extraneous information and obfuscating terminology. No need to bury interested parties in an Enron-like shower of bits. It also means using a format for digital communication that is conducive to reuse, such as a flat text file or another non-proprietary format; PDF files, for example, cannot be considered acceptable for either data or code. Facilitating reproducibility must be the gold standard for data and code release.</p>
<h4>And who are these &#8220;fellow men&#8221;?</h4>
<p>Well, fellow men and women that is, but back to the issue. Much of the history of scientific communication has dealt with the question of demarcation of the appropriate group to whom the reasoning behind the findings would be communicated, the definition of the scientific community. Clearly, communication of very technical and specialized results to a layman would take intellectuals&#8217; time away from doing what they do best, being intellectual. On the other hand some investment in explanation is essential for establishing a finding as an accepted fact &#8212; assuring others that sufficient error has been controlled for and eliminated in the process of scientific discovery. These others ought to be able to verify results, find mistakes, and hopefully build on the results (or the gaps in the theory) and thereby further our understanding. So there is a tradeoff. Hence the establishment of the Royal Society, for example, as a body with the primary purpose of discussing scientific experiments and results. Couple this with Newton&#8217;s surprise, or even irritation, at having to explain results he put forth to the Society in his one and only journal publication in their journal Philosophical Transactions (he called the various clarifications tedious, sought to withdraw from the Royal Society, and never published another journal paper; see the last chapter of <a href=http://mitpress.mit.edu/catalog/item/default.asp?tid=10611&#038;ttype=2>The Access Principle</a>). <b>There is a mini-revolution afoot that has escaped the spotlight of attention on open data, open code, and open scientific literature. That is, the fact that the intent is to open to the public.</b> Not open to peers, or appropriately vetted scientists, or selected ivory tower mates, but to anyone. Never before has the standard for communication been &#8220;everyone,&#8221; in fact quite the opposite. 
Efforts had traditionally been expended narrowing and selecting the community privileged enough to participate in scientific discourse.</p>
<h4>So what does public openness mean for science?</h4>
<p>Recall the leaked files from the University of East Anglia&#8217;s Climatic Research Unit last November. Much of the information revealed concerned scientifically suspect (and ethically dubious) attempts not to reveal data and methods underlying published results. Although that tack seems to have <a href=http://www.realclimate.org/index.php/archives/2009/12/please-show-us-your-code/>softened now</a>, some initial responses defended the climate scientists&#8217; right to be closed with regard to their methods due to the possibility of &#8220;<a href=http://sgillies.net/blog/970/ddos-on-climate-science/>denial of service attacks</a>&#8221; &#8211; the ripping apart of methodology (recall all science is wrong, an asymptotic progression toward truth at best) not with the intent of finding meaningful errors that halt the acceptance of findings as facts, but merely to tie up the climate scientists so they cannot attend to real research. This is the same tradeoff as described above. An interpretation of this situation cannot be made without the complicating realization that peer review &#8212; the review process that vets articles for publication &#8212; doesn&#8217;t check computational results but largely operates as if the papers are expounding results from the pre-computational scientific age. The outcome, if computational methodologies are able to remain closed from view, is that they are directly vetted nowhere. Hardly an acceptable basis for establishing facts. My own view is that data and code must be communicated publicly with attention paid to Popper&#8217;s admonition: as simply and clearly as possible, such that the results can be replicated. Not participating in dialog with those insufficiently knowledgeable to engage will become part of our scientific norms; in fact this is enshrined in the structure of our scientific societies of old. Others can take up those ends of the discussion, on blogs, in digital forums. 
But public openness is important not just because taxpayers have a right to what they paid for (perhaps they do, but this quickly falls apart since not all the public are technically taxpayers, and that seems a wholly unjust way of deciding who shall have access to scientific knowledge and who not; clearly we mean society), but because of the increasing inclusiveness of the scientific endeavor. How do we determine who is qualified to find errors in our scientific work? We don&#8217;t. Real problems will get noticed regardless of with whom they originate, many eyes making all bugs shallow. And I expect peer review for journal publishing to incorporate computational evaluation as well.</p>
<h4>Where does this leave all the open data?</h4>
<p>Unused, unless efforts are expended to communicate the meaning of the data, and to maximize the usability of the code. Data is not synonymous with facts &#8211; methods for understanding data, and turning its contents into facts, are embedded within the documentation and code. Take for granted that users understand the coding language or basic scientific computing functions, but clearly and modestly explain the novel contributions. Facilitate reproducibility. Without this, data may be open but will remain de facto in the ivory tower.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2010/02/03/open-data-dead-on-arrival/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Post 3: The OSTP’s call for comments regarding Public Access Policies for Science and Technology Funding Agencies Across the Federal Government</title>
		<link>http://blog.stodden.net/2010/01/07/post-3-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/</link>
		<comments>http://blog.stodden.net/2010/01/07/post-3-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/#comments</comments>
		<pubDate>Thu, 07 Jan 2010 20:13:53 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Law]]></category>
		<category><![CDATA[OSTP]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=152</guid>
		<description><![CDATA[The following comments were posted in response to the OSTP’s call as posted here: http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf. The first wave, comments posted here, asked for feedback on implementation issues. The second wave requested input on Features and Technology (our post is here). For the third and final wave on Management, Chris Wiggins, Matt Knepley, and I posted [...]]]></description>
			<content:encoded><![CDATA[<p>The following comments were posted in response to the OSTP’s call as posted here: <a href=http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf>http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf</a>. The <a href=http://blog.ostp.gov/2009/12/10/policy-forum-on-public-access-to-federally-funded-research-implementation>first wave</a>, comments posted <a href=http://blog.stodden.net/2009/12/21/the-ostps-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government>here</a>, asked for feedback on implementation issues. The <a href=http://blog.ostp.gov/2009/12/21/policy-forum-on-public-access-to-federally-funded-research-features-and-technology>second wave</a> requested input on Features and Technology (our post is <a href=http://blog.stodden.net/2009/12/28/post-2-the-ostp%E2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government>here</a>). For the <a href=http://blog.ostp.gov/2010/01/01/policy-forum-on-public-access-to-federally-funded-research-management>third and final wave</a> on Management, Chris Wiggins, Matt Knepley, and I posted the following comments:</p>
<p> Q1: <i>Compliance. What features does a public access policy need to ensure compliance? Should this vary across agencies?</i></p>
<p></p>
<p>One size does not fit all research problems across all research communities, and a heavy-handed general release requirement across agencies could result in de jure compliance – release of data and code as per the letter of the law – without the extra effort necessary to create usable data and code facilitating reproducibility (and extension) of the results. One solution to this barrier would be to require grant applicants to formulate plans for release of the code and data generated through their research proposal, if funded. This creates a natural mechanism by which grantees (and peer reviewers), who best know their own research environments and community norms, contribute complete strategies for release. This would allow federal funding agencies to gather data on needs for release (repositories, further support, etc.); understand which research problem characteristics engender which particular solutions, which solutions are most appropriate in which settings, and uncover as-yet unrecognized problems particular researchers may encounter. These data would permit federal funding agencies to craft release requirements that are more sensitive to barriers researchers face and the demands of their particular research problems, and implement strategies for enforcement of these requirements. This approach also permits researchers to address confidentiality and privacy issues associated with their research.</p>
<p>
Examples:</p>
<p></p>
<p>    One exemplary precedent by a UK funding agency is the January 2007 &#8220;Policy on data management and sharing&#8221;<br />
(<a href="http://bit.ly/74pXhT">http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm</a>)<br />
adopted by The Wellcome Trust (<a href="http://bit.ly/wRu9b">http://www.wellcome.ac.uk/About-us/index.htm</a>) according to which &#8220;the Trust will require that the applicants provide a data management and sharing plan as part of their application; and review these data management and sharing plans, including any costs involved in delivering them, as an integral part of the funding decision.&#8221; A comparable policy statement by US agencies would be quite useful in clarifying OSTP&#8217;s intent regarding the relationship between publicly-supported research and public access to the research products generated by this support.</p>
<p>
<span id="more-152"></span><br />
</p>
<p>    An exemplary precedent by a US funding agency is that of NSF&#8217;s &#8220;broader impact criterion&#8221; (cf. <a href="http://bit.ly/wRu9b">http://www.ndsciencehumanitiespolicy.org/workshop/</a> for links to extensive discussions on history and examples of what qualifies as evidence of broad impact). Such an existing requirement could allow, encourage, or require data and code sharing plans as possible examples of broader impact.</p>
<p></p>
<p>    A second exemplary precedent by a US funding agency is that of NIH&#8217;s development of <a href="http://bit.ly/51tyAI">PubMed Central</a>. Submission of manuscripts resulting from NIH support is now mandatory (cf. <a href="http://bit.ly/5dy1ic">http://grants.nih.gov/grants/guide/notice-files/NOT-OD-08-033.html</a>). NIH or other agencies might consider developing a similar repository for code, data, or (even better) full compendia (manuscript, data, and code together) of computational research, and possibly requiring use of this reliable, searchable, open repository for future federal funding. By creating and requiring an open access repository for manuscripts, NIH has avoided the possibility that research results can only be accessed by libraries able to pay the increasing costs of subscriptions to closed-access journals.</p>
<p></p>
<p>
<p>    Q2: <i>Evaluation. How should an agency determine whether a public access policy is successful? What measures could agencies use to gauge whether there is increased return on federal investment gained by expanded access?</i></p>
<p></p>
<p>One simple gauge is the proportion of funded projects, by field and by agency, which are in compliance. Compliance could be easily measured: whether the research compendia have been made available according to agency policy and the details of the particular grant funding the researcher. When the work is computational, funding agencies could consider implementation of the Reproducible Research Standard (cf. V. Stodden &#8220;Enabling Reproducible Research: Licensing For Scientific Innovation&#8221; at <a href="http://bit.ly/RC7lx">http://www.ijclp.net/issue_13.html</a>) to untangle intellectual property rights associated with research release and clarify requirements.</p>
<p></p>
<p>    The Reproducible Research Standard (RRS) realigns the Intellectual Property framework faced by computational researchers with longstanding scientific norms. The RRS suggests a licensing structure for research compendia, including code and data, that permits others to use and re-use code and data without having to obtain prior permission or assume a Fair Use exception to copyright, so long as attribution is given. The RRS utilizes existing open licenses that permit the free use of licensed work, so long as attribution is given, and is satisfied if the following four conditions hold:</p>
<p></p>
<p>1. The full research compendium, including code and data, is available on the Internet,<br />
<br />
2. The media components such as text or figures, (including original selection and arrangement of the data), are licensed under the Creative Commons Attribution License 3.0 or released to the public domain under CC0,<br />
<br />
3. The code components are licensed under one of Apache 2.0, the MIT License, or the Modified BSD license, or released to the public domain under CC0,<br />
<br />
4. The data have been released into the public domain under CC0 or according to the Science Commons Open Data Protocol.<br />
</p>
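<p>As a rough illustration only, the four conditions above could be checked mechanically. The Python sketch below assumes a hypothetical <code>Compendium</code> record and informal license labels invented for this example (with &#8220;BSD-3-Clause&#8221; standing in for the Modified BSD license); none of these names are part of the RRS itself.</p>

```python
from dataclasses import dataclass
from typing import List

# Informal license labels, chosen for this sketch only (an assumption).
MEDIA_OK = {"CC-BY-3.0", "CC0"}
CODE_OK = {"Apache-2.0", "MIT", "BSD-3-Clause", "CC0"}
DATA_OK = {"CC0", "SC-Open-Data-Protocol"}


@dataclass
class Compendium:
    """Hypothetical description of one research compendium."""
    online: bool          # condition 1: available on the Internet
    media_license: str    # condition 2: text, figures, data arrangement
    code_license: str     # condition 3: code components
    data_license: str     # condition 4: the data themselves


def rrs_violations(c: Compendium) -> List[str]:
    """Return a list of unmet conditions; an empty list means all four hold."""
    problems = []
    if not c.online:
        problems.append("compendium is not available on the Internet")
    if c.media_license not in MEDIA_OK:
        problems.append("media license not accepted: " + c.media_license)
    if c.code_license not in CODE_OK:
        problems.append("code license not accepted: " + c.code_license)
    if c.data_license not in DATA_OK:
        problems.append("data license not accepted: " + c.data_license)
    return problems
```

<p>For instance, <code>rrs_violations(Compendium(True, "CC-BY-3.0", "MIT", "CC0"))</code> returns an empty list, while a compendium with GPL-licensed code would be flagged under condition 3.</p>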
<p>Using the RRS on all components of computational scholarship will encourage reproducible scientific investigation, facilitate greater collaboration, and promote engagement of the larger community in scientific learning and discovery.</p>
<p></p>
<p>Moreover, in evaluating compliance, we would also want to encompass the ability to build, run, and verify any source code. This might be accomplished using<br />
    <br />* spot checks of the repository<br />
    <br />* automated checks akin to unit tests<br />
    <br />* tests run by a separate reviewer at the time of inclusion</p>
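<p>One possible shape for an automated check &#8220;akin to unit tests&#8221; is sketched below in Python. It assumes, purely for illustration, that a compendium ships a script whose standard output can be hashed and compared with a digest recorded at deposit time; the function names and the idea of a recorded digest are assumptions of this sketch, not an existing agency procedure.</p>

```python
import hashlib
import subprocess
import sys
from pathlib import Path


def output_digest(script: Path) -> str:
    """Run the script with the current interpreter and hash its stdout."""
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,   # collect stdout and stderr as bytes
        check=True,            # raise if the script exits with an error
        cwd=script.parent,     # run next to the script so relative data paths resolve
    )
    return hashlib.sha256(result.stdout).hexdigest()


def reproduces(script: Path, expected_digest: str) -> bool:
    """True if rerunning the script yields the digest recorded at deposit."""
    return output_digest(script) == expected_digest
```

<p>A byte-for-byte comparison is deliberately strict; results that depend on floating point details or random seeds would need a tolerance-based comparison instead.</p>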
<p></p>
<p>
<p>    Q3: <i>Roles. How might a public private partnership promote robust management of a public access policy? Are there examples already in use that may serve as models? What is the best role for the Federal government?</i></p>
<p></p>
<p>Two notable examples of public-private partnership which have benefited science are <a href="http://bit.ly/12MYix">http://arxiv.org</a>, which is partially NSF-supported, and <a href="http://bit.ly/16Imki">http://PDB.org</a>, funded by a number of (public and private) sources. PDB in particular has for more than a decade been an integral part of the funding and publication policies in the structural biology community (cf. <a href="http://bit.ly/5wzxeH">http://www.nature.com/nsmb/wilma/v5n3.892130820.html</a>).</p>
<p></p>
<p>That said, previous experimentation with private management of scientific works has been problematic in at least one case. In December 2008 Google shut down http://researchdatasets.google.com &#8211; a repository for research data (cf. <a href="http://bit.ly/7U10NJ">http://www.wired.com/wiredscience/2008/12/googlescienceda/</a>). Private interests are not aligned with those of the scientific community, and there must be a public role in the preservation of this aspect of our culture. Moreover, reliance on private resources comes with vulnerability to the changing missions or solvency of these private and/or corporate partners. The principle of Open Access recognizes that such collections should be considered valuable stewards of our culture just as the Library of Congress and the National Archives. Rewards for the availability of scientific compendia &#8212; papers, data, and code &#8212; come not only through views and downloads, but through the acceleration of scientific research, technological development, and an increase in scientific integrity.</p>
<p></p>
<p>Possible roles for the federal government include:<br />
<br />* facilitating and supporting an open and sustainable database comparable to the PDB for research compendia (manuscripts, data, and code)<br />
<br />* encouraging funding agencies to draft clear statements encouraging reproducibility (e.g., distribution of compendia) and public access to research results (e.g., submission to open access journals or arxiv.org)<br />
<br />* clarification of the relationship between copyright and open access (a topic currently under debate in the form of competing proposed congressional bills, cf.<br />
<a href="http://bit.ly/5Ehyj6">http://www.publishers.org/main/PressCenter/Archicves/2009_Feb/02_FairCopyright.htm</a> and<br />
<a href="http://bit.ly/8QvNU0">http://www.taxpayeraccess.org/issues/access/access_supporters/</a> for background)<br />
<br />* clarification of the relationship between broad impact of publicly-funded research (and public access to the output of this federal support) versus university-specific IP policies (e.g., governing code and data even where generated by publicly-funded research), which often act as a disincentive to sharing the results of federally-funded research.</p>
<p>
<p>
Victoria Stodden<br />
Yale Law School, New Haven, CT<br />
Science Commons, Cambridge, MA<br />
<a href="http://bit.ly/4Uq6DT">http://www.stanford.edu/~vcs</a></p>
<p>
<p>
Chris Wiggins<br />
Columbia University, New York, NY<br />
<a href="http://bit.ly/4Uq6DT">http://www.columbia.edu/~chw2</a></p>
<p>
<p>
Matthew G. Knepley<br />
University of Chicago, Chicago, IL<br />
<a href="http://bit.ly/4oEU1V">http://www.cs.uchicago.edu/~knepley</a></p>
<p>
<p><b>References</b> These issues were discussed at a roundtable convened by one of the authors on research sharing issues held at Yale Law School on November 21, 2009.  The webpage, along with thought pieces and research materials, is located at <a href="http://bit.ly/5L6mTh">http://www.stanford.edu/~vcs/Conferences/RoundtableNov212009/</a>.</p>
<p>
<p>Crossposted to the <a href=http://blog.ostp.gov/2010/01/01/policy-forum-on-public-access-to-federally-funded-research-management/comment-page-2/#comment-10974>OSTP blog</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2010/01/07/post-3-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Post 2: The OSTP’s call for comments regarding Public Access Policies for Science and Technology Funding Agencies Across the Federal Government</title>
		<link>http://blog.stodden.net/2009/12/28/post-2-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/</link>
		<comments>http://blog.stodden.net/2009/12/28/post-2-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/#comments</comments>
		<pubDate>Tue, 29 Dec 2009 02:54:38 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Law]]></category>
		<category><![CDATA[OSTP]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=144</guid>
		<description><![CDATA[The following comments were posted in response to the second wave of the OSTP&#8217;s call as posted here: http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf. The first wave, comments posted here and on the OSTP site here (scroll to the second last comment), asked for feedback on implementation issues. The second wave requests input on Features and Technology and Chris Wiggins [...]]]></description>
			<content:encoded><![CDATA[<p>The following comments were posted in response to the second wave of the OSTP&#8217;s call as posted here: <a href=http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf>http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf</a>. The first wave, comments posted <a href=http://blog.stodden.net/2009/12/21/the-ostps-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government>here</a> and on the OSTP site <a href=http://blog.ostp.gov/2009/12/10/policy-forum-on-public-access-to-federally-funded-research-implementation>here</a> (scroll to the second last comment), asked for feedback on implementation issues. The <a href=http://blog.ostp.gov/2009/12/21/policy-forum-on-public-access-to-federally-funded-research-features-and-technology>second wave requests input on Features and Technology</a> and Chris Wiggins and I posted the following comments:</p>
<p>We address each of the questions for phase two of OSTP&#8217;s forum on public access in turn. The answers generally depend on the community involved and (particularly question 7, asking for a cost estimate) on the scale of implementation. Inter-agency coordination is crucial however in (i) providing a centralized repository to access agency-funded research output and (ii) encouraging and/or providing a standardized tagging vocabulary and structure (as discussed further below).</p>
<p><span id="more-144"></span><br />
Agency-funded research output will contain at least a peer-reviewed final paper, and if computational, should also contain data and code ensuring that the work is reproducible (the paper, code, and data together are described as the research &#8220;compendium&#8221;). It is imperative to provide public access to taxpayer-funded scientific output &#8212; not only to the final published paper but also the supporting data and code &#8212; for the reproducibility and skepticism fundamental to scientific communication and progress.</p>
<p>
<p>
We address these eight questions in turn:</p>
<p>
<p>
<i>1. In what format should published papers be submitted in order to make them easy to find, retrieve, and search and to make it easy for others to link to them?</i></p>
<p>
<p>
As a general rule publication formats and standards evolve over time as technologies develop and should not be mandated. Any development of research sharing platforms should take into account the evolving nature of standards and formats, and permit this innovation in an open community-driven way. Likely the easiest format for searching, at present, is that of XML; however, as this is not a publishing standard, a more reasonable intermediate goal is that of annotated PDFs and LaTeX comments which can easily be converted into XML given their rich use of structured environments (e.g., tables, figures, and citations). PDF is largely standard for scientific publications today, but is a proprietary format and should not be regulated as a standard. Proprietary formats, particularly those requiring purchase of specific commercial software, should strongly and unambiguously be discouraged by OSTP.</p>
<p>
<p>
<i>2. Are there existing digital standards for archiving and interoperability to maximize public benefit?</i></p>
<p>
<p>
For manuscripts, there are at least two examples of widely-used standards for archiving. The first is the NIH&#8217;s use of PubMed and PubMed Central. PubMed is a list of pointers with unique stable IDs (a.k.a. PMIDs) pointing to the peer-reviewed manuscript&#8217;s citation or, if available, online presence. PubMed Central serves as an archive of published, peer-reviewed manuscripts. PubMed is coupled to the dynamics of both publishing and funding, in that the final requirement the NIH makes of grant recipients is to use the PubMed Central identifier at the end of citations. The use of unique identifiers of papers, as well as of data and code, can encourage the release and hence citation of all forms of research. PubMed also assists in citation by exporting citations in several formats (though, unfortunately, not in BibTeX, the most widely-used format among quantitative and computational scientists). Such a unique identifier would also indicate compliance with agency open access policies.</p>
<p>
<p>
The second example is <a href=http://arXiv.org>http://arXiv.org</a>, which originates from a different set of communities and is used purely for archiving; uploaded manuscripts need not ever be submitted for peer-review. ArXiv entries are given a unique &#8220;tag&#8221; pointing to the uploaded manuscript. After April 2007, the format was changed to a simple YYMM.NNNN, serving as a date-specific quantitative ID.</p>
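<p>To make the YYMM.NNNN scheme concrete, here is a minimal Python sketch that splits such an identifier into year, month, and sequence number. The function name is hypothetical, the sketch covers only the four-digit form described above, and it assumes every two-digit year falls in the 2000s.</p>

```python
import re
from typing import Optional, Tuple

# Two digits of year, two of month, a dot, then a four-digit sequence number.
ARXIV_ID = re.compile(r"^(\d{2})(\d{2})\.(\d{4})$")


def parse_arxiv_id(identifier: str) -> Optional[Tuple[int, int, int]]:
    """Return (year, month, number), or None if the string does not fit."""
    match = ARXIV_ID.match(identifier)
    if match is None:
        return None
    year = 2000 + int(match.group(1))   # assumption: all years are 20YY
    month = int(match.group(2))
    number = int(match.group(3))
    if month not in range(1, 13):       # reject impossible months such as 13
        return None
    return (year, month, number)
```

<p>For example, <code>parse_arxiv_id("0704.0001")</code> gives <code>(2007, 4, 1)</code>, the date portion being readable directly from the identifier.</p>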
<p>
<p>
Not yet developed is a similar set of IDs for research compendia (defined above as the manuscript, code, and data required for reproducing the work). Tagging of research compendia is an important issue for communicating work, facilitating topical web searches, and aggregating a researcher&#8217;s contributions, including their data and code. Development of a standard RDFa vocabulary for HTML tags for agency funded research would enable search for data, code, and research as well as facilitating the transmission of licensing information, authorship, and sources. Enabling search by author would allow a more granular understanding of a researcher&#8217;s contributions, beyond citations. This would provide an incentive to release data and code, and give others &#8212; such as funders, award committees, and university hiring and promotion committees &#8212; access to a more representative assessment of the researcher&#8217;s contributions to the community than mere publication-counting.  Such a tagging vocabulary could include unique identifiers for data and code, ideally the same as those required for repository deposit as discussed in the previous section, and thus facilitate and encourage their citation.</p>
<p>
<p>
The leading efforts on these topics include <a href=http://www.datacite.org/>http://www.datacite.org/</a> and <a href=http://www.openarchives.org/ore>http://www.openarchives.org/ore/</a>. The issue is not restricted to data however; for computational work the entire research compendium must be incorporated into the semantic structure. A recent talk by one of the authors on this issue, proposing HTML+RDFa tagging for research compendia, is available via <a href=http://www.stanford.edu/~vcs/talks/CCTechSummitVCS06262009.pdf>http://www.stanford.edu/~vcs/talks/CCTechSummitVCS06262009.pdf</a>.</p>
<p>
<p>
<i>3. How are these anticipated to change?</i></p>
<p>
<p>
Technical challenges ahead will be set, as they have been for the past decades, by the growing sizes of the data files and code bases to be shared. The flexibility of XML (allowing future-defined environment tags, for example) has so far kept up with the unpredictably changing demands of users. We anticipate that such a mark-up language standard, which includes the possibility of defining new environments, is the likely best option for moving forward.</p>
<p>
<p>
The recent increase in research collaboration and virtual organizations suggests another possible pressure on standards. As scientific research becomes more highly tied to massive computation, for example the NSF&#8217;s <a href=http://www.teragrid.org>TeraGrid</a> computing infrastructure, research will tend to proceed through virtual environments allowing intensive collaboration by researchers separated geographically. The sharing of code and data in concurrent use is already happening, in addition to the downstream reuse of code and data by subsequent researchers. These virtual environments are developing standards for sharing that could exert pressure on the evolution of formats and protocols for code, data, and manuscript communication.</p>
<p>
<p>
<i>4. Are there formats that would be especially useful to researchers wishing to combine datasets or other published results published from various papers in order to conduct comparative studies or meta-analyses?</i></p>
<p>
<p>
Formats should emerge from the researching communities (as was the case with the Protein Data Bank (PDB), at <a href=http://pdb.org>http://pdb.org</a>), with encouragement toward HTML+RDFa standards for inclusion of meta-data. Careful consideration should be given to the locus of the digital archiving, however. The creation of multiple, community-specific or agency-specific repositories does not facilitate interdisciplinary communication and thwarts scripted search and API usage; a national research repository should be established to house released agency-funded manuscripts, including supporting digital materials such as data and code, and provide links to research housed elsewhere. Many institutions do not have repositories, nor do they have the resources to maintain them. For computational work, supporting data and code must accompany article release, creating additional demands on a repository. For papers whose results can be replicated from short scripts and small datasets, many computational scientists who do engage in reproducible research are able to host their research compendia (paper, data, and code) on their institutional web-pages or using hosting resources their institution is willing to provide. These individual contributions, however, may not conform to standardized formats that facilitate scripted search, nor display transparent versioning and crucial time-stamping of edits and revisions, and may not be labeled with unique object identifiers as required by the NIH Open Access policy. These desiderata could be implemented in a straightforward manner by a neutral third-party site such as one coordinated among multiple funding agencies (as is the case with PDB). Not all computational research involves small amounts of supplemental data and code, and an inter-agency repository could host very large datasets or complex bodies of code in cases where institutional support is not available to the researcher.
Such a repository could extend the capabilities of http://arXiv.org or PubMed Central for all federally funded research (data, code, and peer-reviewed final manuscripts; perhaps renaming PubMed Central the more representative &#8220;PubSci&#8221; or &#8220;PubCentral&#8221;). A centralized repository is especially useful in encouraging researchers to combine datasets and/or code, as opposed to siloing the research by topic area.</p>
<p>
<p>
<i>5. What are the best examples of usability in the private sector (both domestic and international) and what makes them exceptional?</i></p>
<p>
<p>
There are few in the private sector, in which there are often disincentives to transparency and interoperability. Successes at standardizing the maintenance and submission of code, for example, can be found in the private sector efforts at <a href=http://code.google.com>http://code.google.com</a>, <a href=http://sourceforge.net>http://sourceforge.net</a>, and <a href=http://github.com>http://github.com</a>, which are actively used by some academic researchers.</p>
<p>
<p>
In the academic sector, notable examples to be emulated include <a href=http://arXiv.org>http://arXiv.org</a> (for manuscripts) and the Protein Data Bank (<a href=http://pdb.org>http://pdb.org</a>; for protein structure data, one specific data type), which has worked since 1971 to solve the complexities of data sharing as well as the loosely-aligned interests of publishers, scientists, and funding agencies. There are many successful examples of data sharing in academic communities, such as Gary King&#8217;s Social Science research repository at Harvard, <a href=http://TheData.org>http://TheData.org</a>, or Pat Brown&#8217;s Stanford MicroArray Database at <a href=http://smd.stanford.edu>http://smd.stanford.edu</a>. Note that the MicroArray community publishes their data with every publication as a routinely accepted requirement; similar standards have been enforced in protein structure since the 1990s (cf. <a href=http://www.nature.com/nsmb/wilma/v5n3.892130820.html>http://www.nature.com/nsmb/wilma/v5n3.892130820.html</a>).</p>
<p>
<p>
Since the data and code are being shared and reused, licensing agreements in these repositories come to the fore. This is an open and active problem across academia, largely with the goal of securing attribution rights for owners and permitting use and reuse by others while minimizing or eliminating licensing incompatibilities between different datasets. Licenses must be compatible for different datasets or different programs to be combined.</p>
<p>
<p>
<i>6. Should those who access papers be given the opportunity to comment or provide feedback?</i></p>
<p>
<p>
Online submission is clearly advantageous for the open and democratic sharing of opinion. However, given the very real consequences (including to future funding, careers, and, in the case of such fields as climate and medicine, policy and political decisions), feedback should be moderated, restricted to verified email addresses, and provided via unique IPs.</p>
<p>
<p>
<i>7. What are the anticipated costs of maintaining publicly accessible libraries of available papers, and how might various public access business models affect these maintenance costs?</i></p>
<p>
<p>
Memory and disk space get cheaper with each year, but such a site requires staffing. The answer to this question, however, depends entirely on the scale of the implementation. What is important to note is the principle of Open Access, and such libraries should be considered valuable stewards of our culture, just as the Library of Congress and the National Archives are.</p>
<p>
<p>
<i>8. By what metrics (e.g. number of articles or visitors) should the Federal government measure success of its public access collections?</i></p>
<p>
<p>
As mentioned above, the principle of Open Access recognizes that such collections should be considered valuable stewards of our culture, just as the Library of Congress and the National Archives are. Rewards to the availability of scientific compendia &#8212; papers, data, and code &#8212; come not only through views and downloads, but through the acceleration of scientific research, technological development, and an increase in scientific integrity.</p>
<p>
<p>
Victoria Stodden<br />
Yale Law School, New Haven, CT<br />
Science Commons, Cambridge, MA</p>
<p>http://www.stanford.edu/~vcs</p>
<p>
<p>
Chris Wiggins<br />
Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY</p>
<p>http://www.columbia.edu/~chw2</p>
<p>
<p>
<b>References</b> These issues were discussed at a roundtable convened by one of the authors on research sharing issues held at Yale Law School on November 21, 2009.  The webpage, along with thought pieces and research materials, is located at <a href=http://www.stanford.edu/~vcs/Conferences/RoundtableNov212209>http://www.stanford.edu/~vcs/Conferences/RoundtableNov212209/</a>.</p>
<p>
Crossposted at <a href=http://blog.ostp.gov/2009/12/21/policy-forum-on-public-access-to-federally-funded-research-features-and-technology>http://blog.ostp.gov/2009/12/21/policy-forum-on-public-access-to-federally-funded-research-features-and-technology/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2009/12/28/post-2-the-ostp%e2%80%99s-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Nathan Myhrvold advocates for Reproducible Research on CNN</title>
		<link>http://blog.stodden.net/2009/12/21/nathan-myhrvold-advocates-for-reproducible-research/</link>
		<comments>http://blog.stodden.net/2009/12/21/nathan-myhrvold-advocates-for-reproducible-research/#comments</comments>
		<pubDate>Mon, 21 Dec 2009 23:45:56 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=135</guid>
		<description><![CDATA[On yesterday&#8217;s edition of Fareed Zakaria&#8217;s GPS on CNN former Microsoft CTO and current CEO of Intellectual Ventures Nathan Myhrvold said reproducible research is an important response for climate science in the wake of Climategate, the recent file leak from a major climate modeling center in England (I blogged my response to the leak here). [...]]]></description>
			<content:encoded><![CDATA[<p>On yesterday&#8217;s edition of Fareed Zakaria&#8217;s GPS on CNN, former Microsoft CTO and current CEO of <a href="http://www.intellectualventures.com/">Intellectual Ventures</a> Nathan Myhrvold said reproducible research is an important response for climate science in the wake of Climategate, the recent file leak from a major climate modeling center in England (I blogged my response to the leak <a href="http://blog.stodden.net/2009/11/30/the-climate-modeling-leak-code-and-data-generating-published-results-must-be-open-and-facilitate-reproducibility/">here</a>). The video is <a href="http://edition.cnn.com/video/data/2.0/video/podcasts/fareedzakaria/site/2009/12/20/gps.podcast.12.20.cnn.html">here</a> (see especially 16:27), and the transcript is <a href="http://transcripts.cnn.com/TRANSCRIPTS/0912/20/fzgps.01.html">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2009/12/21/nathan-myhrvold-advocates-for-reproducible-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The OSTP&#039;s call for comments regarding Public Access Policies for Science and Technology Funding Agencies Across the Federal Government</title>
		<link>http://blog.stodden.net/2009/12/21/the-ostps-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/</link>
		<comments>http://blog.stodden.net/2009/12/21/the-ostps-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/#comments</comments>
		<pubDate>Mon, 21 Dec 2009 21:54:26 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Law]]></category>
		<category><![CDATA[OSTP]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=132</guid>
		<description><![CDATA[The following comments were posted in response to the OSTP&#8217;s call as posted here: http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf: Open access to our body of federally funded research, including not only published papers but also any supporting data and code, is imperative, not just for scientific progress but for the integrity of the research itself. We list below nine [...]]]></description>
			<content:encoded><![CDATA[<p>The following comments were posted in response to the OSTP&#8217;s call as posted here: <a href=http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf>http://www.ostp.gov/galleries/default-file/RFI%20Final%20for%20FR.pdf</a>:</p>
<p>Open access to our body of federally funded research, including not only published papers but also any supporting data and code, is imperative, not just for scientific progress but for the integrity of the research itself. We list below nine focus areas and recommendations for action.</p>
<p><span id="more-132"></span></p>
<p>[1] Each Funding Agency Must Address Open Access: The disparate nature of research discourages the use of blanket mandates in favor of an approach, at least initially, tailored to the research environment at the level of the funding agency. For example, the initiative shown by the National Institutes of Health regarding Open Access derives from the established norms of openness emerging from the Human Genome Project, which may not be directly applicable to each agency. Awards from agencies may be currently subject to data sharing agreements that must be reconciled with Open Access. We recommend advising the funding agencies to develop plans to implement Open Access within a six-month time frame before turning to the powers vested in the Executive Branch. We discuss issues for consideration in the enactment of an Open Access policy for federally funded research.</p>
<p>[2] Public Access to Federally Funded Research: It is imperative to provide public access to taxpayer-funded scientific output, not only the final published paper but also the supporting data and code necessary for the reproducibility and skepticism fundamental to scientific communication and progress.</p>
<p>[3] Exceptions to Open Access: These must be minimized. The goal of transparency in research must accommodate exceptions, such as research used for national security purposes or those with privacy or confidentiality concerns. Research relevant to national security interests falls outside the mandate of these recommendations. Confidentiality must be circumscribed to apply to data with individual subjects for which anonymization techniques are ineffective.</p>
<p>[4] Timeliness and Embargo Periods: Funding agencies should require the deposit of agency-funded final peer-reviewed manuscripts. The NIH requires that papers that arise from NIH funds comply with their public access policy: final peer-reviewed journal manuscripts are submitted to PubMed Central upon acceptance for publication, and become accessible to the public no later than 12 months after publication. Ideally, research should be made public as close to the date of publication as possible, but 12 months should be the maximum embargo period for federally funded research.</p>
<p>[5] Digital Archiving: Careful consideration must be given to the locus of the digital archiving. The creation of agency-specific repositories does not facilitate interdisciplinary communication and thwarts scripted search and API usage; a national research repository should be established to house released agency-funded manuscripts, including supporting digital materials such as data and code, and provide links to research housed elsewhere. Many institutions do not have repositories, nor do they have the resources to maintain them. For computational work, supporting data and code must accompany article release, creating additional demands on a repository. For papers whose results can be replicated from short scripts and small datasets, many computational scientists who do engage in reproducible research are able to host their research compendia (paper, data, and code) on their institutional webpages or using hosting resources their institution is willing to provide. These individual contributions, however, may not conform to standardized formats that facilitate scripted search, nor display transparent versioning and crucial time-stamping of edits and revisions, and may not be labeled with unique object identifiers as required by the NIH Open Access policy. These desiderata could be implemented in a straightforward manner by a neutral third-party site such as one coordinated among multiple funding agencies. Not all computational research involves small amounts of supplemental data and code, and an inter-agency repository could host very large datasets or complex bodies of code in cases where institutional support is not available to the researcher. Such a repository could extend the capabilities of arxiv.org or PubMed Central for all federally funded research (data, code, and manuscripts; perhaps renaming PubMed Central the more representative &#8220;PubSci&#8221; or &#8220;PubCentral&#8221;).</p>
<p>[6] Copyright and Ownership Issues: The NIH further requires that copyright be lawfully addressed. Many journals require authors to assign copyright to the journal as a condition of publication, but will allow an earlier version to be posted publicly. The NIH has made publication in journals that permit the article &#8212; or a version thereof &#8212; to be posted in PubMed Central a requirement for funding; this strategy is an option for all funding agencies to consider, as well as a generalization to include data and code deposit (for computational research).</p>
<p>Current complex ownership issues must be clarified between the public, the researcher, the institution at which the researcher works, and publishing entities. OSTP&#8217;s current RFI could be viewed as a step in untangling ownership in favor of the taxpayer. Since the passage of the Bayh-Dole Act in 1980, universities have taken a strong interest in maintaining a proprietary interest in research produced at their institutions. Patenting and other forms of intellectual property limit the ability of other researchers to reuse and build upon the research, and thus work against scientific norms and hinder scientific progress.</p>
<p>[7] Incentives to Open Science &#8212; Citation and Future Grants: The final requirement the NIH makes of grant recipients is use of the PubMed Central identifier at the end of citations. Encouraging the use of unique identifiers of papers, as well as of data and code, can encourage the release and hence citation of all forms of computational research. Such a unique identifier would indicate compliance with agency open access policies.</p>
<p>Tagging of research compendia is an important issue for communicating work, facilitating topical web searches, and aggregating a researcher&#8217;s contributions, including their data and code. Development of a standard RDFa vocabulary for HTML tags for agency funded research would enable search for data, code, and research as well as facilitating the transmission of licensing information, authorship, and sources. Enabling search by author would allow a more granular understanding of a researcher&#8217;s contributions, beyond citations. This would provide an incentive to release data and code, and give others &#8212; such as funders, award committees, and university hiring and promotion committees &#8212; access to a more representative assessment of the researcher&#8217;s contributions to the community than mere publication-counting. Such a tagging vocabulary could include unique identifiers for data and code, ideally the same as those required for repository deposit as discussed in the previous section, and thus facilitate and encourage their citation.</p>
<p>It is important that these requirements be tied to grant funding and a mechanism established that allows compliance to be reflected in future grant determinations. Strategies for release of data and code arising from a particular grant should be subject to peer review in the grant evaluation process.</p>
<p>[8] Posted Guidelines and Recommended Best Practices: A &#8220;best practices&#8221; document should be publicly available at a stable URL, be updated with versions, and provide clarity regarding the above issues, either at the agency level or at the OSTP. It should be framed to suggest ideal recommendations, rather than list a series of requirements. Some points such a document may wish to address follow.</p>
<p>Reproducibility is a goal of computational science, and practicing reproducible research means:<br />
* Uploading the final peer-reviewed journal manuscripts that arise from federally funded research to a digital archive upon acceptance for publication,<br />
* Making the data and code required to reproduce results from federally funded works publicly available online upon acceptance for publication,<br />
* Utilizing appropriate licensing structures for federally funded research, such as the Reproducible Research Standard (see IJCLP Webdoc 1-13-2009 at <a href=http://www.ijclp.net/issue_13.html>http://www.ijclp.net/issue_13.html</a> ),<br />
* Utilizing tagging structures for agency funded compendia release, as part of inclusion in repositories or posting in institutional repositories, in order to facilitate search of research results.</p>
<p>[9] References: These issues were discussed at a roundtable on research sharing issues held at Yale Law School on November 21, 2009. The webpage, along with thought pieces and research materials on the subject, is located at <a href=http://www.stanford.edu/~vcs/Conferences/RoundtableNov212209/>http://www.stanford.edu/~vcs/Conferences/RoundtableNov212209/</a>. A possibly useful reference discussing the communication of research and scientific progress, &#8220;Reproducible Research in Computational Harmonic Analysis,&#8221; is available at <a href=http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.15>http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.15</a>.</p>
<p>Victoria Stodden<br />
Yale Law School, New Haven, CT, 06511<br />
Science Commons, Cambridge, MA 02138<br />
<a href=http://www.stanford.edu/~vcs>http://www.stanford.edu/~vcs</a></p>
<p>Chris Wiggins<br />
Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY<br />
<a href=http://www.columbia.edu/~chw2/>http://www.columbia.edu/~chw2/</a></p>
<p>Cross-posted on <a href=http://blog.ostp.gov/2009/12/10/policy-forum-on-public-access-to-federally-funded-research-implementation/>http://blog.ostp.gov/2009/12/10/policy-forum-on-public-access-to-federally-funded-research-implementation/</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2009/12/21/the-ostps-call-for-comments-regarding-public-access-policies-for-science-and-technology-funding-agencies-across-the-federal-government/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Interview with ITConversations on Reproducible Research</title>
		<link>http://blog.stodden.net/2009/10/04/my-interview-with-itconversations-on-reproducible-research/</link>
		<comments>http://blog.stodden.net/2009/10/04/my-interview-with-itconversations-on-reproducible-research/#comments</comments>
		<pubDate>Sun, 04 Oct 2009 15:46:29 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Law]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Reproducible Research]]></category>
		<category><![CDATA[Scientific Method]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[shameless self-promotion]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/?p=95</guid>
		<description><![CDATA[On September 30, I was interviewed by Jon Udell from ITConversations.org in his Interviews with Innovators series, on Reproducibility of Computational Science. Here&#8217;s the blurb: &#8220;If you&#8217;re a writer, a musician, or an artist, you can use Creative Commons licenses to share your digital works. But how can scientists license their work for sharing? In [...]]]></description>
			<content:encoded><![CDATA[<p>On September 30, I was interviewed by Jon Udell from <a href="http://itc.conversationsnetwork.org/series/innovators.html">ITConversations.org</a> in his <strong>Interviews with Innovators</strong> series, on <a href="http://itc.conversationsnetwork.org/shows/detail4255.html">Reproducibility of Computational Science</a>.</p>
<p>Here&#8217;s the blurb: &#8220;If you&#8217;re a writer, a musician, or an artist, you can use Creative Commons licenses to share your digital works. But how can scientists license their work for sharing? In this conversation, Victoria Stodden &#8212; a fellow with Science Commons &#8212; explains to host Jon Udell why scientific output is different and how Science Commons aims to help scientists share it freely.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2009/10/04/my-interview-with-itconversations-on-reproducible-research/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
<enclosure url="http://itc.conversationsnetwork.org/audio/download/ITC.INNO-VictoriaStodden-2009.09.30.mp3" length="18973734" type="audio/mpeg" />
		</item>
		<item>
		<title>Stuart Shieber and the Future of Open Access Publishing</title>
		<link>http://blog.stodden.net/2008/11/23/stuart-shieber-and-the-future-of-open-access-publishing/</link>
		<comments>http://blog.stodden.net/2008/11/23/stuart-shieber-and-the-future-of-open-access-publishing/#comments</comments>
		<pubDate>Sun, 23 Nov 2008 19:02:54 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Developing world]]></category>
		<category><![CDATA[Economics]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Talks]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/2008/11/23/stuart-shieber-and-the-future-of-open-access-publishing/</guid>
		<description><![CDATA[Back in February Harvard adopted a mandate requiring its faculty members to make their research papers available within a year of publication. Stuart Shieber is a computer science professor at Harvard and responsible for proposing the policy. He has since been named director of Harvard&#8217;s new Office for Scholarly Communication. On November 12 Shieber gave [...]]]></description>
			<content:encoded><![CDATA[<p>Back in February <a xhref="http://www.libraryjournal.com/info/CA6532658.html#news1">Harvard adopted a mandate</a> requiring its faculty members to make their research papers available within a year of publication. <a xhref="http://www.eecs.harvard.edu/~shieber/">Stuart Shieber</a> is a computer science professor at Harvard and responsible for proposing the policy. He has since <a xhref="http://www.earlham.edu/~peters/fos/2008/05/stuart-shieber-reflects-on-his-new.html">been named</a> director of Harvard&#8217;s new <a xhref="http://osc.hul.harvard.edu/osc.php">Office for Scholarly Communication</a>.</p>
<p>On November 12 Shieber gave a talk entitled &#8220;The Future of Open Access &#8212; and How to Stop It&#8221; to give an update on where things stand after the adoption of the open access mandate. Open access isn&#8217;t just something that makes sense from an ethical standpoint: Shieber points out that (for-profit) journal subscription costs have risen out of proportion with inflation and with the costs of nonprofit journals. He notes that the <a xhref="http://www.econ.ucsb.edu/~tedb/Journals/jpricing.html">cost per published page in a commercial journal is six times that of the nonprofits</a>. With the current library budget cuts, open access &#8212; meaning both access to articles directly on the web and shifting subscriptions away from for-profit journals &#8212; is something that appears financially unavoidable.</p>
<p>Here&#8217;s the business model for an Open Access (OA) journal: authors pay a fee upfront in order for their paper to be published. Then the issue of the journal appears on the web (possibly also in print) without an access fee. Conversely, traditional for-profit publishing doesn&#8217;t charge the author to publish, but keeps the journal closed and charges subscription fees for access.</p>
<p>Shieber recaps Harvard&#8217;s policy:</p>
<p>1. The faculty member grants permission to the University to make the article available through an OA repository.</p>
<p>2. There is a waiver for articles: a faculty member can opt out of the OA mandate at his or her sole discretion. For example, if you have a prior agreement with a publisher you can abide by it.</p>
<p>3. The author themselves deposits the article in the repository.</p>
<p>Shieber notes that the policy is also valuable because it allows Harvard to make a collective statement of principle, systematically provide metadata about articles, clarify the rights accruing to the article, facilitate the article deposit process, and negotiate collectively. Having the mandate be opt-out rather than opt-in might also increase rights retention at the author level.</p>
<p>So the concern Shieber set up in his talk is whether standards for research quality and peer review will be weakened. Here&#8217;s how the dystopian argument runs:</p>
<p>1. all universities enact OA policies<br />
2. all articles become OA<br />
3. libraries cancel subscriptions<br />
4. prices go up on remaining journals<br />
5. these remaining journals can&#8217;t recoup their costs<br />
6. publishers can&#8217;t adapt their business model<br />
7. so the journals, and the logistics of peer review they provide, disappear</p>
<p>Shieber counters this argument: 1 through 5 are good because journals will start to feel some competitive pressure. What would be bad is if publishers cannot change their way of doing business. Shieber thinks that even if this is so it will have the effect of pushing us towards OA journals, which provide the same services, including peer review, as the traditional commercial journals.</p>
<p>But does the process of getting there cause a race to the bottom? The argument goes like this: since OA journals are paid by the number of articles published they will just publish everything, thereby destroying standards. Shieber argues this won&#8217;t happen because there is price discrimination among journals &#8211; authors will pay more to publish in the more prestigious journals. For example, PLOS costs about $3k, Biomed Central about $1000, and Scientific Publishers International is $96 for an article. Shieber also makes an argument that Harvard should have a fund to support faculty who wish to publish in an OA journal and have no other way to pay the fee.</p>
<p>This seems to imply that researchers with sufficient grant funding or falling under his proposed Harvard publication fee subsidy, would then be immune to the fee pressure and simply submit to the most prestigious journal and work their way down the chain until their paper is accepted. This also means that editors/reviewers decide what constitutes the best scientific articles by determining acceptance.</p>
<p>But is democratic representation in science a goal of OA? Missing from Shieber&#8217;s described market for scientific publications is any kind of feedback from the readers. The content of these journals, and the determination of prestige, is defined solely by the editors and reviewers. Maybe this is a good thing. But maybe there&#8217;s an opportunity to open this by allowing readers a voice in the market. This could done through ads or a very tiny fee on articles &#8211; both would give OA publishers an incentive to respond to the preferences of the readers. Perhaps OA journals should be commercial in the sense of profit-maximizing: they might have a reason to listen to readers and might be more effective at maximizing their prestige level.</p>
<p>This vision of OA publishing still effectively excludes researchers who are unable to secure grants or are not affiliated with a university that offers a publication subsidy. The dream behind OA publishing is that everyone can read the articles, but to fully engage in the intellectual debate, quality research must still find its way into print, and at the appropriate level of prestige, regardless of the affiliation of the researcher. This is the other side of OA, and it is very important for researchers from the developing world or thinkers whose research is not mainstream (see, for example, <a href=http://en.wikipedia.org/wiki/Antony_Garrett_Lisi>Garrett Lisi</a>, a high impact researcher who is unaffiliated with an institution).</p>
<p>The OA publishing model Shieber describes is a clear step forward from the current model, where journals are only accessible to affiliates of universities that have paid the subscription fees. It might be worth continuing to move toward an OA system where not only can anyone access publications, but any quality research can be published, regardless of the author&#8217;s affiliation and wealth. To get around the financial constraints, one approach might be to allow journals to fund themselves through ads, or to provide subsidies to certain researchers. This also opens up the question of who decides what is quality research.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2008/11/23/stuart-shieber-and-the-future-of-open-access-publishing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A2K3: Opening Scientific Research Requires Societal Change</title>
		<link>http://blog.stodden.net/2008/09/10/a2k3-opening-scientific-research-requires-societal-change/</link>
		<comments>http://blog.stodden.net/2008/09/10/a2k3-opening-scientific-research-requires-societal-change/#comments</comments>
		<pubDate>Wed, 10 Sep 2008 14:05:02 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[A2K3]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Developing world]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/2008/09/10/a2k3-opening-scientific-research-requires-societal-change/</guid>
		<description><![CDATA[In the A2K3 panel on Open Access to Science and Research, Eve Gray, from the Centre for Educational Technology, University of Cape Town, sees the Open Access movement as a real societal change. Accordingly she shows us a picture of Nelson Mandela and asks us to think about his release from prison and the amount [...]]]></description>
			<content:encoded><![CDATA[<p>In the A2K3 panel on <a href=http://www.law.yale.edu/intellectuallife/7122.htm>Open Access to Science and Research</a>, Eve Gray, from the Centre for Educational Technology, University of Cape Town, sees the Open Access movement as a real societal change. Accordingly she shows us a picture of Nelson Mandela and asks us to think about his release from prison and the amount of change that ushered in. She also asks us to consider whether Mandela is an international person or a local person. She sees a parallel between how South African society changed with Mandela and the change people are advocating, toward open access to research knowledge. She shows a <a href=http://www.worldmapper.org>worldmapper.org</a> map of countries distorted by the amount of (copyrighted) scientific research publications. South Africa looks small. She blames this on South Africa&#8217;s willingness to uphold colonial traditions in copyright law and norms in knowledge dissemination. She says this happens almost unquestioningly, and in South Africa to rise in the research world you are expected to publish in &#8216;international&#8217; journals &#8211; the prestigious journals are not South African, she says. (I am familiar with this attitude from my own experience in Canada. The top American journals and schools were considered the holy grail. When I asked about attending a top American graduate school I was laughed at by a professor and told that maybe it could happen, if perhaps I had an Olympic gold medal.) She states that for real change in this area to come about, people have to recognize that they must mediate a &#8220;complex meshing&#8221; of policies: at the university level, at the various government levels, in norms, and at the individual scientist level&#8230; just as Mandela had to mediate a large number of complex policies at a variety of different levels in order to bring about the change he did.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2008/09/10/a2k3-opening-scientific-research-requires-societal-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Legal Barriers to Open Science: my SciFoo talk</title>
		<link>http://blog.stodden.net/2008/09/10/legal-barriers-to-open-science-my-scifoo-talk/</link>
		<comments>http://blog.stodden.net/2008/09/10/legal-barriers-to-open-science-my-scifoo-talk/#comments</comments>
		<pubDate>Wed, 10 Sep 2008 12:12:23 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Talks]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[shameless self-promotion]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/2008/09/10/legal-barriers-to-open-science-my-scifoo-talk/</guid>
		<description><![CDATA[I had an amazing time participating at Science Foo Camp this year. This is a unique conference: there are 200 invitees comprising some of the most innovative thinkers about science today. Most are scientists but not all &#8211; there are publishers, science reporters, scientific entrepreneurs, writers on science, and so on. I met old friends [...]]]></description>
			<content:encoded><![CDATA[<p>I had an amazing time participating at <a href=http://www.nature.com/nature/meetings/scifoo/index.html>Science Foo Camp</a> this year. This is a unique conference: there are 200 invitees comprising some of the most innovative thinkers about science today. Most are scientists but not all &#8211; there are publishers, science reporters, scientific entrepreneurs, writers on science, and so on. I met old friends there and found many amazing new ones.</p>
<p>One thing that I was glad to see was the level of interest in Open Science. Some of the top thinkers in this area were there and I&#8217;d guess at least half the participants are highly motivated by this problem. There were sessions on reporting negative results, the future of the scientific method, and reproducibility in science. I organized a session with <a href=http://michaelnielsen.org/blog/>Michael Nielsen</a> on overcoming barriers in open science. I spoke about the legal barriers and O&#8217;Reilly Media has made the talk available <a href=http://www.youtube.com/watch?v=4J4IwzUfvoo>here</a>.</p>
<p>I have forthcoming papers on this topic, which you can find on <a href=http://www.stodden.net>my website</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2008/09/10/legal-barriers-to-open-science-my-scifoo-talk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A2K3 Kaltura Award</title>
		<link>http://blog.stodden.net/2008/09/10/a2k3-kaltura-award/</link>
		<comments>http://blog.stodden.net/2008/09/10/a2k3-kaltura-award/#comments</comments>
		<pubDate>Wed, 10 Sep 2008 11:13:18 +0000</pubDate>
		<dc:creator>vcs</dc:creator>
				<category><![CDATA[A2K3]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Open Science]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[shameless self-promotion]]></category>

		<guid isPermaLink="false">http://blog.stodden.net/2008/09/10/a2k3-kaltura-award/</guid>
		<description><![CDATA[I am honored and humbled to win the A2K3 Kaltura prize for best paper. Peter Suber posts about it here and gives the abstract. His post also includes a link to a draft of the paper, which can also be found here: Enabling Reproducible Research: Open Licensing For Scientific Innovation. I&#8217;d love comments and feedback [...]]]></description>
			<content:encoded><![CDATA[<p>I am honored and humbled to win the <a href=http://www.law.yale.edu/news/6991.htm>A2K3 Kaltura prize</a> for best paper. Peter Suber posts about it <a href=http://www.earlham.edu/~peters/fos/2008/09/open-licensing-to-enable-reproducible.html>here</a> and gives the abstract. His post also includes a link to a draft of the paper, which can also be found here: <em><a href=http://www.stanford.edu/~vcs/papers/Licensing08292008.pdf>Enabling Reproducible Research: Open Licensing For Scientific Innovation</a></em>. I&#8217;d love comments and feedback, although please be aware that since the paper is forthcoming in the <a href=http://www.ijclp.net/>International Journal of Communications Law and Policy</a> it will very likely undergo changes. Thank you to <a href=http://corp.kaltura.com/>Kaltura.com</a> and the entire A2K3 committee. I&#8217;m very happy to be here in Geneva and enjoying every minute. <img src='http://blog.stodden.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.stodden.net/2008/09/10/a2k3-kaltura-award/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
