Now in its third year, the 2018 State of Open Data report shows some encouraging progress in respondents reporting that they make data, the foundation of academic research, openly available. Yet a closer look makes clear the work still to be done and raises the question: what steps can publishers, funders and institutions take to make data sharing worth a researcher’s time and energy, and to accelerate progress?
Simply encouraging data sharing is not enough. Calls for data sharing mandates from funders and institutions are increasing. This year’s State of Open Data survey shows support from respondents for national mandates increasing to 63%, up from 55% in 2017. This is also reflected in new policies around the world, from China’s Ministry of Science and Technology, which effectively mandates data sharing at a national level, to the European Commission, whose Horizon Europe proposal will mandate open access to research data as well as publications. It is great to see that globally more than 50 funders now require data sharing. The majority of funders with data mandates are in the US and UK, yet when we surveyed more than 7000 researchers, we found that self-reported levels of sharing among respondents in the UK (58%) and US (55%) were below the global average of 63%.
In this year’s State of Open Data survey we saw a marked increase in uncertainty about where funds will come from to support making data open. While funders are increasingly committed to coupling policy with practical support for researchers, dedicated funding and clear guidance about using grants are also needed.
Researchers are still experiencing barriers and challenges to sharing data. The top six responses to the question in this survey “What problems/concerns do you have with sharing datasets?” were “Concerns about misuse of my data”, “Unsure about copyright and licensing”, “Not receiving appropriate credit or acknowledgement”, “Unsure I have the rights to share”, “Organising data in a presentable and useful way” and “Contains sensitive information”. All were selected by more than 400 respondents. To my knowledge, this is the first time concerns about misuse of data have been expressed so strongly in a global survey.
Publishers need to better understand researchers’ concerns about “misuse of data”, and are well-placed to help make sure that researchers are clear about their rights to share, and the copyright and licensing options available to them. A good place to start is with journal data policies that are consistent and easy to understand.
When we asked about practical challenges in our report Practical challenges for researchers in data sharing, “Organising data in a presentable and useful way” was the most stated reason for not sharing data. That is why helping researchers to deposit, describe and share their data, using good metadata, remains a priority for Springer Nature.
Researchers also want to receive credit when their data is shared or cited. This year’s survey found that 58% of respondents felt they did not get sufficient credit for sharing data, compared with 9% who felt they did. In my view, researchers would share data more routinely, and more openly, if they received proper credit for their work that counted in advancing their academic standing. To provide true credit for good data practice, datasets which are published and citable need to be viewed as research outputs on a par with a research article in terms of career advancement and assessment. Realistically, routine inclusion of datasets, their citations and impact in grant assessments and CV evaluation is probably still years away, though 2018 has been an encouraging year of progress from the Belmont Forum, the Open Research Funders Group Incentivisation Blueprint and others.
In the meantime, we can encourage and measure the usage and citations of datasets. Initiatives such as the GO FAIR metrics group, the FAIRdata project from DANS and MakeDataCount are making strides in this area. Figshare and other repositories include download and citation statistics, and alternative metrics for datasets. They also provide DOIs or other unique identifiers for datasets, ensuring they are citable in their own right.
Data articles provide an established credit mechanism - a citable publication - while making datasets easier to find, access and reuse. Yet uptake of publishing data descriptors in data journals continues to be low. Perhaps there is more we can do here to make it easier for researchers to write and publish data articles, and see the benefits to their research in doing so.
Ultimately we need to tell more stories about the benefits of data sharing. There is compelling evidence as to the benefits of managing and sharing data, including productivity and citation advantages. I referenced these in my contribution to last year’s State of Open Data report. We need to do a much better job of finding and telling stories about researchers who are sharing data, the impact on their work and on the fields they work in. These real-world examples and evidence, coupled with clear policy, better credit, explicit funding, practical help and answers to common questions, are all essential to making data sharing an established norm more quickly. There are no easy answers, and no “silver bullet”, but there is much we can act on now.
Finding answers to these challenges, and helping researchers and funders make the most of their data, is at the heart of our approach at Springer Nature. Find out more about the actions we're taking here.
A longer version of this post originally appeared as part of Digital Science’s “The State of Open Data Report 2018”, and is published under a CC BY 4.0 license. The full report can be found on Figshare: https://doi.org/10.6084/m9.figshare.7195058.v1