Advancing Software and Data Citation Best Practices

The research community urgently needs new practices and incentives to ensure that data producers, software and tool developers, and data curators are credited for their contributions.

Software and data are essential parts of the modern practice of scientific research. When scientists share data and software with their colleagues in addition to research results, they vastly amplify the reach, relevance and transparency of science. Yet there are still substantial social, systemic and technological barriers that prevent scientists from sharing data and software.

No standard practices exist for citing data and software, meaning it may be difficult for a researcher to give appropriate credit to contributors, or to measure the impact and value of data and software contributions. Although numerous data and software sharing repositories exist, each repository uses a slightly different approach. Many scientists still distrust the public access model, preferring to share data and software only by personal request, which assures attribution through personal contact and implicit social contract but substantially limits the reach and benefit of shared data and software.

Scientific researchers – particularly academics – are also embedded in a reputation economy in which tenure, promotion and acclaim are achieved by producing influential research results. Tenure and promotion decisions are typically blind to a researcher’s contributions to shared data or software, despite the crucial role of these activities in the scientific endeavor.

To facilitate conversation around such issues and to develop actionable plans to move citation best practices forward across disciplines, the Foundation for Earth Science (FES), with support from the National Science Foundation and the Alfred P. Sloan Foundation, held a Data and Software Citation Workshop in Arlington, VA, on Jan. 27-28, 2015.

The result was a wide-ranging interdisciplinary discussion and exploration of new norms and practices for software and data citation and attribution.

The workshop generated substantial interest and excitement among participants. A majority of those in attendance agreed that now is the time to move beyond workshops and discussions on data and software citation. Their next step is to begin implementing the actions articulated at the workshop, which will guide further progress in data and software citation and attribution. Participants expressed great interest in advancing pilot programs to help research communities implement practices and procedures that facilitate improved credit, measurement, and attribution of research.

“We feel the January workshop was a seed that built significant cross-discipline community support around software sustainability across science,” said Erin Robinson, a member of the workshop organizing committee and Executive Director of FES.

The interest in pursuing a coordinated effort among research communities and related projects is now stronger than ever as a result of the workshop. The consensus built, insights gained and recommendations provided will be of great value to the National Science Foundation and other government agencies, research communities and industry partners.