“Funder requirements definitely change behaviour and they have taken away some of the fear of sharing data”
Name: Rosie Higman
Position: Research data advisor, University of Cambridge, now Research Data Librarian at the University of Manchester Library
Institution: University of Cambridge
ORCID ID: http://orcid.org/0000-0001-5329-7168
An interview with Rosie Higman on 10 April 2017
“Some of us have the privilege of having the funding to generate data, and this should be shared with others.”
I think it is therefore very important to have the data in the open and to be better at communicating the scientific process behind it. There’s also a slight social justice / political element to it, in that we have a lot of money in the UK in higher education to produce a lot of data. If we sit on our data then that’s not fair to researchers in other countries who do not have the funding available to create data.
So particularly, when you look beyond the hard sciences, i.e. at the social sciences, which is my background, I think we should be sharing our social science data so that others can do secondary analyses and do more work with it, as otherwise the field won’t progress. Some of us have the privilege of having the funding to generate data, and this should be shared with others.
Furthermore, the more that people who are passionate about this hold senior positions and are in a position to hire, the better. You are already seeing Principal Investigators (PIs) in charge of labs saying, “My policy is being an open researcher, and if you want to come work at my lab, this is how we do it.” We need more PIs to do the same. At the moment, that’s incredibly unusual.
Currently, we are seeing more engagement with Open Data from early career researchers than from PIs. I understand that: if you’re a PI, you have to run the lab, and you don’t want to think about yet another thing to do if you can’t see a clear benefit. PIs have a large influence on junior researchers’ behaviour, so until there are clearer benefits from Open Data for PIs it’s going to be tricky to get wider engagement.
I guess I came to data sharing via Open Access and, having spent some time outside of academia, I saw how frustrating it was to not be able to get access to things behind a paywall. Also, when you get to the end of an article and realise that it was really badly written but that some really interesting research had been carried out, access to that data should be there. Having also been in that situation makes you think that sharing would be a good idea.
Furthermore, opening up data also benefits start-ups, government and society. For example, look at the work that the Open Data Institute does with public and governmental datasets: Open Data Institute Leeds created an app for their local council that told people when their bins were next due to be collected, and sent them a reminder. Opening up data can lead to more useful commercial applications, too. I find it quite depressing that we have to put a monetary value on everything, but seeing as that is the policy climate that we’re in at the moment, perhaps that will help academia to continue to justify funding research data management.
What sort of further obstacles do you see that may be preventing researchers from making their data openly available?
“We need a slightly more collegiate spirit of not shouting, ‘Look, there’s a mistake in your data!’ but instead saying, ‘Let’s improve it and then I can reuse it.’”
I think this is also really hard as researchers are having this conversation in public, in a climate which isn’t the most friendly, but we need to say, “Actually, everything that’s being done isn’t very good,” or “There are flaws …. Had we been working more openly, someone else could have helped catch it.”
What frustrates you the most about current systems? If you could change something about the current systems, what would it be?
Is there more you see that isn't being addressed in terms of data sharing, especially in terms of practicality and implementation?
It’s not something that seems to be part of day-to-day practice. In some areas I think it is: if we look at our most regular depositors in Chemistry, I think it must be part of lab processes because they deposit so regularly. However, I believe that this is not the case in many areas. We can tell from the enquiries we get that in a lot of areas it’s not something people think about until they come to publish and exclaim, “Oh no, the publisher says I need to share my data!” And we respond by saying, “Yes, you do, and I wish you’d thought about it two years ago. But you haven’t, so let’s work out what we can do.” So I think data sharing becoming part of business as usual is still some way off.
We have some challenges here as well in the area of sensitive data, where we are still very much working this out, and that will hold things back. Data within the Clinical and Social Sciences is sometimes held back for good reason. I entirely understand that social scientists are sometimes extremely nervous. They work with sensitive populations that are quite vulnerable and they don’t want to share such data. I think we’ve probably got some work to do with people in those disciplines to talk them through their options: “These are the bits you could share. These are the bits you couldn’t. These are the ways you could do it and these are what your options are.”
Another challenge is in the area of longitudinal data where sharing data could potentially help out people’s careers, and I think we’re still working out how best to do that if you’re collecting data over, say, 20 years. One example saw someone collecting data on tree rings throughout his career, producing a really valuable dataset about what was happening to the climate. He spent 40 years collecting it, and did not want to share it. I understand this inasmuch as, as a researcher, you want to get everything from your dataset that you can because you spent a lot of your life collecting it. And particularly if you have a research project that’s going on for 20 years or more, then you don’t want to share very much. However, I think we just need to work out mechanisms that keep you ahead of others, for example by letting you sit on your data for just two years, after which you are obliged to share. I think when you have a dataset that will evolve over time, you don’t just want to impose a single embargo based on a dataset that then changes. You want to be able to update this dataset and have a kind of rolling embargo almost. That can be difficult, and I think that’s something we’re still working out how best to support. We also still need to work out the workflows for researchers. If you’re only sharing something every two years, you don’t want it to be a long, convoluted process. It has to be something easy for the researcher to do, where the data, people and processes related to that data are recorded.
Is there a person / project / service that inspires you and makes you optimistic about the future of Open Science and Open Data?
Copyright: Dr Joyce Heckman, University of Cambridge. Creative Commons CC-BY Licence.