Data access statements
Data access statements are used in publications to describe where supporting data can be found and under what conditions they can be accessed. Examples are given at the bottom of the page.
Why do you need a data access statement?
Data access statements, sometimes called data availability statements, are required for all publications arising from publicly-funded research. They are a requirement of many funders' data policies and of the UKRI (formerly RCUK) Policy on Open Access (pdf, Section 3.3 ii).
Some funders have indicated that they now check for the inclusion of data access statements in publications that acknowledge their support. In particular, the requirement applies to all papers that acknowledge EPSRC funding and have a publication date after 1 May 2015 (see our EPSRC guide).
Does this mean all data needs to be published?
The aim of the data access statement is discoverability: the data referenced by the statement do not have to be openly available. There are many reasons why access to data may need to be restricted. If you are unsure whether you should publish your data openly, please contact firstname.lastname@example.org for advice.
Where to provide the data access statement?
We recommend one of the following options:
- Some journals (for example PLOS) now provide a separate section in articles for the data access statement.
- You can include the data access statement with the acknowledgement of funder support.
If these options are not available, you can include a data access statement in your main reference section. Please email email@example.com if you have any questions.
What to include in the data access statement?
Depending on whether your data are openly available, one of the following options will apply:
- If data are openly available, the name(s) of the data repositories should be provided, as well as any persistent identifiers or accession numbers for the dataset. The data repository could be Pure or an external data preservation archive.
- If there are justifiable legal or ethical reasons why your data cannot be made openly available, these should be included in the data access statement. In this case, the data access statement must direct users to a permanent record that describes any access constraints or conditions that must be satisfied for access to be granted. You can do this in Pure.
- If you did not collect the research data yourself but instead used existing data obtained from another source, this source should be credited.
Please note that a simple direction to interested parties to contact the author would not normally be considered sufficient.
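The three cases above can be sketched as a small helper that assembles a statement from a few fields. This is only an illustration: the `DatasetRecord` class, its field names and the `build_statement` function are hypothetical and do not correspond to anything in Pure or in a funder's policy. The wording is modelled on the example statements on this page, and the DOI is not genuine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical description of a supporting dataset (not a Pure schema)."""
    repository: str                    # name of the data repository
    identifier: str                    # persistent identifier or accession number
    restriction: Optional[str] = None  # legal or ethical reason for restricted access
    third_party: bool = False          # True if the data came from another source


def build_statement(record: DatasetRecord) -> str:
    """Return a data access statement for one of the three cases listed above."""
    if not record.repository or not record.identifier:
        # Pointing readers to the author alone is not normally sufficient,
        # so require a permanent record to cite.
        raise ValueError("repository name and persistent identifier are required")
    location = f"{record.repository} at {record.identifier}"
    if record.restriction:
        return (
            f"Due to {record.restriction}, supporting data cannot be made openly "
            f"available. Further information about the data and conditions for "
            f"access is available from {location}."
        )
    if record.third_party:
        return (
            "This study was a re-analysis of existing data that are available "
            f"from {location}."
        )
    return f"All data created during this research are openly available from {location}."


example = DatasetRecord(
    repository="Lancaster University data archive",
    identifier="http://dx.doi.org/10.17635/lancaster/researchdata/15",  # not a genuine DOI
)
print(build_statement(example))
```

A generated sentence of this kind is a starting point only. Check it against your funder's and journal's requirements before use.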
When should I deposit data in order to get a DOI for my data access statement?
The difficulty is that the paper has to cite the DOI of the dataset, but once a dataset has been given a DOI the dataset can no longer be changed. We therefore recommend the following steps:
- While the paper goes through the journal's review process, include in the draft's Acknowledgements a sentence like "The underlying data in this paper are available from http://dx.doi.org/10.17635/lancaster/researchdata/xxx."
- When the paper is accepted, generate and deposit the dataset. You will need to edit the paper's text to give the dataset DOI that you will receive from the Library when the dataset is validated. This version of your paper is the AAM (author's accepted manuscript), which you will also need to deposit in Pure to satisfy HEFCE's open access requirements.
- When the paper finally appears in print or online, update the dataset metadata to give the full details of the paper, including its DOI.
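Before depositing the accepted manuscript, it can help to confirm that the placeholder DOI from the draft has been replaced. The sketch below assumes the placeholder follows the sample sentence above, that is, a DOI link whose final segment is "xxx". The function name and the pattern are illustrative assumptions, not part of any Library or Pure tooling.

```python
import re

# Assumed placeholder convention: a doi.org link whose last path segment is "xxx",
# as in the sample Acknowledgements sentence above.
PLACEHOLDER_DOI = re.compile(r"https?://(?:dx\.)?doi\.org/\S*/xxx\b", re.IGNORECASE)


def has_placeholder_doi(text: str) -> bool:
    """Return True if the text still contains a placeholder dataset DOI."""
    return PLACEHOLDER_DOI.search(text) is not None


draft = (
    "The underlying data in this paper is available from "
    "http://dx.doi.org/10.17635/lancaster/researchdata/xxx."
)
if has_placeholder_doi(draft):
    print("Replace the placeholder DOI before depositing the manuscript.")
```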
Example data access statements
Please note that the URLs/DOIs in these examples are not genuine.
Openly available data
"All data created during this research are openly available from Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/15."
"All data are provided in full in the results section / the supplementary section of this paper."
"Crystal structures are available from the Cambridge Crystallographic Data Centre (Identifier BATHRS) at http://dx.doi.org/10.15125/010203. Microscopy images are openly available from Dryad at http://dx.doi.org/10.17635/lancaster/researchdata/1."
"The 1962 birth cohort data can be accessed via the UK Data Service (http://ukdataservice.ac.uk/)."
Secondary analysis of existing data
"This study was a re-analysis of existing data that are publicly available from EMBL at http://dx.doi.org/10.15125/12345. Further documentation about data processing are available from the Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/3."
"The study brought together existing data obtained upon request and subject to licence restrictions from a number of different sources. Full details how these data were obtained are available in the documentation available at http://dx.doi.org/10.17635/lancaster/researchdata/28."
"Due to the (commercially, politically, ethically) sensitive nature of the research, no participants consented to their data being retained or shared. Additional details relating to the data are available from the Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/22."
"Anonymised interview transcripts from participants who consented to data sharing, plus other supporting information, are available from the UK Data Service, subject to registration, at http://dx.doi.org/10.17635/lancaster/researchdata/24."
Data available on request only
"Due to ethical concerns, supporting data cannot be made openly available. Further information about the data and conditions for access is available at the Lancaster University data archive: http://dx.doi.org/10.17635/lancaster/researchdata/123."
"Due to the (commercially, politically, ethically) sensitive nature of the research, no interviewees consented to their data being retained or shared. Additional details relating to other aspects of the data are available from the Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/222."
"Supporting data are available to bona fide researchers, subject to registration, from the UK Data Service at http://dx.doi.org/10.15125/12345."
"Supporting data will be available from Lancaster University research portal at http://dx.doi.org/10.17635/lancaster/researchdata/17 after a 6 month embargo from the data of publication to allow for commercialisation of research findings."
"Due to confidentiality agreements with research collaborators, supporting data can only be made available to bona fide researchers subject to a non-disclosure agreement. Details of the data and how to request access are available at Lancaster University research portal: http://dx.doi.org/10.15125/12345."
Citation of multiple datasets
"This publication is supported by multiple datasets, which are openly available at locations cited in the reference section."
Physical data (samples, specimens, paper collections etc.)
"Non-digital data supporting this study are stored by the corresponding author at Lancaster University. Details of how to request access to these data are provided in the documentation available from the Lancaster University data archive at http://dx.doi.org/10.17635/lancaster/researchdata/42."
No new data created
"No new data were created during this study."
"This research did not produce new data, other data sources are referenced throughout the paper".
All data are included in paper
"All relevant data are within the paper and its Supporting Information files."
Examples from journals
Data Availability statement example 2 from a PLOS article (below Figures just under Copyright notice)
Examples from Springer Nature.
We acknowledge the work of the University of Bath in the development of this guidance.