Preserve and Share
You have invested a lot of time and effort in creating your data, so keep it safe in the long run.
Learn how to select what to keep and how to preserve it properly. When you share your data, you need to choose an appropriate licence. Don’t forget that Pure, the University’s Research Information System, can help you make your data visible.
What is digital preservation?
"The set of processes, activities and management of digital information over time to ensure its long-term accessibility. Because of the relatively short lifecycle of digital information, preservation is an ongoing process." — University of Bristol.
Preservation involves actions and procedures to hold data for some period of time and/or to set data aside for future use. This may include data archiving and submission to a data repository.
Lancaster University recognises that it is good practice for researchers to manage and retain their research data. Sometimes they are legally required to do so for many years after project funding has ceased.
Why preserve and share?
- Publicly funded research should be a public good. Increasing access to research data at the end of a project, when legally, ethically and commercially appropriate, allows the widest possible audience to access and reuse it.
- Sharing data also shows your scientific integrity. It allows others to replicate, validate, or correct your results, thereby improving the scientific record.
- Lancaster University supports access to data by other researchers who could re-use the data, thus maximising the effectiveness of our research funding.
Many funders specify how long research data has to be preserved. Look at funders' requirements to find out more.
There might be reasons not to share your data
- If your data has financial value or is the basis for potentially valuable patents that could be exploited by the University, it may be unwise to share it.
- If the data contains sensitive, personal information about human subjects, sharing it, even with other researchers, may violate the Data Protection Act, ethics codes, or your own written consent forms. However, there might be ways of anonymising the data to make it sharable.
Please note that if you think you cannot share your data, you may need to provide a statement to your funder, as part of the application process, justifying why access to the data should be restricted.
A lot of data with personal or confidential information can be anonymised without compromising its value and can then be shared. For example, you should consider whether to:
- Remove direct identifiers (e.g. personal information such as addresses);
- Aggregate or reduce the precision of variables that might be identifiable (such as postcodes); and
- Generalise text variables to reduce identifiability.
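The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete anonymisation procedure: the field names (`name`, `address`, `email`, `postcode`, `notes`), the sample record and the replacement phrases are all hypothetical, and real datasets usually need a case-by-case review as well.

```python
import re

# Hypothetical field names treated as direct identifiers; adjust to your dataset.
DIRECT_IDENTIFIERS = {"name", "address", "email"}


def coarsen_postcode(postcode: str) -> str:
    """Reduce a UK postcode to its outward code, e.g. 'LA1 4YW' -> 'LA1'."""
    compact = "".join(postcode.split()).upper()
    # The inward code of a UK postcode is always its last three characters.
    return compact[:-3]


def generalise_text(text: str, replacements: dict[str, str]) -> str:
    """Replace specific, potentially identifying phrases with broader ones."""
    for specific, general in replacements.items():
        pattern = re.compile(re.escape(specific), flags=re.IGNORECASE)
        # A function replacement avoids any special handling of backslashes.
        text = pattern.sub(lambda _match, g=general: g, text)
    return text


def anonymise(record: dict[str, str], replacements: dict[str, str]) -> dict[str, str]:
    """Apply the three steps: remove, reduce precision, generalise."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "postcode" in cleaned:
        cleaned["postcode"] = coarsen_postcode(cleaned["postcode"])
    if "notes" in cleaned:
        cleaned["notes"] = generalise_text(cleaned["notes"], replacements)
    return cleaned


# Made-up record for illustration only.
record = {
    "name": "Jane Doe",
    "address": "1 Example Street",
    "postcode": "LA1 4YW",
    "notes": "Works as a nurse at Example General Hospital",
}
anonymised = anonymise(record, {"Example General Hospital": "a hospital"})
print(anonymised)
```

Removing and coarsening fields in this way lowers the risk of identification but does not guarantee it; combinations of the remaining variables may still identify a person.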
Note that re-users of data have the same legal and ethical obligation to NOT disclose confidential information as primary users.
A guide to data anonymisation can be found on the website of the UK Data Archive.
Data Access Restrictions
If necessary, you need to restrict access according to the data's level of detail, sensitivity and confidentiality. Some data centres like the UK Data Archive allow you to do that. You will also be able to do this when you use Pure to publish your data with Lancaster University.
The University recommends restricting the dataset's visibility at the electronic document file level (not the dataset record level) to control access to it. If you deposit your data into Pure, you have the following options:
- Unrestricted public access to data files if you choose Public - No restriction in Visibility of your data file in Pure. Data files will be freely available on the Research Directory.
- No public access if you choose Backend - Restricted to Pure users. In this case data files will not be visible on the Research Directory (but the dataset record describing the data will be).
- Access on request. Choose the same visibility setting as for no public access, but state in your description that the data is available on request and summarise the conditions under which it can be accessed. For example: "Data can be made available on request to bona fide researchers who provide information regarding proposed use".
You can also note any legal or ethical constraints when creating a dataset entry on Pure. You will need to provide details of why your data is subject to these constraints.
Please note that funders expect that most datasets will be publicly available (or Open) and you will have to justify restrictions (best to do that in your Data Management Plan).
All research data you want to publish need to conform to Lancaster University's Code of Practice (pdf).
Find out more about Ethics from the Research Support Office.
What if my consent forms do not allow my data to be shared?
If the consent forms explicitly prohibit data publication and sharing, your funding body will not expect you to share your data. You are required to have a data access statement in your research publication that links to a webpage where you explain why the data cannot be shared. This can be done in Pure.
However, for future research projects, funders like EPSRC and ESRC expect you to plan for sharing from the start: "when gaining informed consent, include consent for data sharing" (ESRC Data Policy). Get in touch with RDM Support if you have any questions: firstname.lastname@example.org.
Consent form templates
The UK Data Archive has a useful help site including example information sheets and consent forms.
When and for how long do I need to preserve research data?
Data must be kept securely even once the research has ended. Your funder might specify when you need to deposit the data with a data archive, for example "as soon after the end of data collection as is possible" (NERC, DCC summary), "within three months of the end of the award" (ESRC, DCC summary), or "normally within 12 months of the data being generated" (EPSRC, DCC summary).
Many research funders specify which data need preserving and for how long. This can range from "a minimum of three years" (AHRC) to ten years (BBSRC (pdf)). EPSRC (Data Principle VII) expects data to be preserved for 10 years after the last "privileged access" to the data. Because the period is counted from the most recent access, data that continues to be accessed might have to be kept in perpetuity.
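To see why a retention period counted from the last access can keep moving, here is a small Python sketch. It assumes the simplified reading given above (10 years from the most recent access); the function names and dates are hypothetical, and your funder's actual policy text takes precedence.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; round up to
        # 1 March so the period is never cut short.
        return date(start.year + years, 3, 1)


def retention_end(access_dates: list[date], years: int = 10) -> date:
    """Earliest date the retention period could end, given all accesses so far."""
    return add_years(max(access_dates), years)


# Made-up access history for illustration only.
accesses = [date(2015, 6, 1)]
first_estimate = retention_end(accesses)

# A later access moves the end of the retention period forward.
accesses.append(date(2024, 5, 20))
revised_estimate = retention_end(accesses)

print(first_estimate.isoformat(), revised_estimate.isoformat())
```

Every further access pushes the end date out again, which is why data in regular use may never reach the point where it can be disposed of.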
Please consult your funders' data policy expectations for specific details.
Lancaster University expectations
If your research is not funded by an external funder, you will need to comply with Lancaster University's Research Data Management Policy (Word doc) which states "all research data will be stored in either electronic or paper form for a minimum of 10 years after the end of a project, unless ethical considerations, participant confidentiality, FOI requirements or external agencies e.g. NHS, specifically require otherwise."
Please contact email@example.com if you have questions regarding the preservation of your research data.
Data access statements
Data access statements are used in publications to describe where supporting data can be found and under what conditions they can be accessed. Many funders' data policies require them, as does the UKRI (formerly RCUK) Policy on Open Access (pdf, Section 3.3 ii).
Learn more about data access statements.