Plan your research
Why is planning important?
Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased, follow-up projects may analyse or add to the data, and data may be re-used by other researchers. This makes it all the more relevant to manage data across their whole lifecycle.
You should make plans for your data before you start to create and collect it. Many funders are now asking you to do this as part of their application process. Make sure you know about your funders' expectations!
Planning at an early stage can help you make the right decisions about creating, storing and sharing your data.
What is data planning?
What counts as research data?
Research data are information that is involved directly in funded or unfunded research activities. They are often arranged or formatted in such a way as to make them suitable for communication, interpretation and processing. Put more simply, research data are all of the information that you use as an integral part of your research.
Research data are "a reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing." (Digital Curation Centre, DCC)
Research data do not include incidental or administrative data generated in the course of personal activities, desktop or mailbox backups, or data produced by non-research activities such as University administration or teaching.
Information becomes data through the context of research activity. See the following two examples (adapted from Bristol University):
- A photographic image of an old municipal building in an historical archive is an archived image in an image bank. When used by a researcher to study the history of a city, the photographic image becomes research data, for that researcher.
- CCTV footage may be archived by a security firm. However, when used by a researcher to study human behaviour or 21st-century surveillance methods, the video footage becomes research data, for that researcher.
Research data classification
Data may have been created ‘from scratch’ by research efforts or it may be existing data which has been transformed, adjusted or reinterpreted. It can be generated for different purposes and through different processes:
- Observational: data captured in real time that is usually unique and irreplaceable. For example, remote sensing data, survey data, field recordings, sample data
- Experimental: data captured from lab equipment that is often reproducible. For example, gene sequences, chromatograms, magnetic field data
- Models or simulation: data generated from test models where model and metadata may be more important than output data from the model. For example, climate models, economic models
- Derived or compiled: resulting from processing or combining ‘raw’ data. For example, text and data mining, compiled databases, 3D models
- Reference or canonical: a static or organic conglomeration or collection of datasets, probably published and curated. For example, gene sequence databanks, collection of letters or archive of historical images
Examples of research data
Research data can be electronic or in hardcopy (e.g. paper) and may include the following:
- Documents (text, Word), spreadsheets
- Laboratory notebooks, field notebooks, diaries
- Questionnaires, transcripts, codebooks
- Audiotapes, videotapes
- Photographs, films
- Test responses
- Slides, artefacts, specimens, samples
- Collection of digital objects acquired and generated during the process of research
- Database contents (video, audio, text, images)
- Models, algorithms, scripts
- Contents of an application (input, output, log files for analysis software, simulation software, schemas)
- Methodologies and workflows
- Standard operating procedures and protocols
List adapted from Leeds University.
The what, why and how of data management planning
This short video by Research Data Netherlands provides an introduction to data management planning.