Organise

With a little planning, you can make your data files easy to store, find and reuse. Failing to organise, however, can render data unusable.

University data storage

The best place to store your research data while you gather and work with it is one of the University’s filestores. See our data storage guide below for more information.

Choosing formats

In planning a research project, it’s important that you consider which file formats you will use to store your data. Please read our help on choosing file formats below to see how your choice affects the usability and long-term readability of files.

Naming and folders

Research data can easily become disorganised if a proper set of file naming protocols is not developed. Creating appropriate file and folder structures will save time, avoid loss of data, allow re-use of the data, and help you locate data accurately in the future. Please consult our file naming guide below.

Documentation and metadata

Good documentation and metadata make material understandable, verifiable, and reusable. See our help on documentation and metadata below.

Backup and security

Safe storage of your working data with regular backup is essential during your research project. Please read our backup and security guide below to see your options.

  • Data storage

    Safe storage of your working data is essential during your research project.

    Your working data

    During your research project you will need to store your research data so it is secure and backed up regularly, but is easily accessible to those who are authorised. The University supports you to do so. We define working data as the data you collect or create as the basis for your research publications (journal articles, conference papers, book chapters, etc.) 

    Where should I store (working) research data?

    Lancaster University provides storage space for the diverse data needs of staff and postgraduate researchers. The choice of location for your data depends on a number of factors, such as the space required and data churn. If you make a request through the online tool, ISS will be able to advise you on the best location for your research data.

    What are my data storage options?

    Research filestore

    Projects with larger storage requirements can use the research filestore on site at Lancaster University. Files are accessible via PCs on campus and from off campus (using the VPN). Data may optionally be shared with colleagues and is backed up nightly.

    To request storage on the research filestore, download and complete this request form (Word doc) and email it to iss-service-desk@lancaster.ac.uk.

    There are no individual file size limits. Your overall quota will be agreed when you request storage space, and may be expanded depending on your requirements. Contact the ISS Service Desk to request any changes to your current research filestore storage space.

    Departmental filestore

    Departmental filestores can be used to share data between colleagues in the same department or faculty, or for cross-faculty collaborations.

    Projects with smaller storage requirements (under 50GB) can use departmental filestores on site at Lancaster University. Files are accessible via PCs on campus (normally via R:, the Research Drive) and from off campus (using the VPN). Data may optionally be shared with departmental colleagues.

    For further information, see departmental filestore help. If you have further questions about this, please contact the ISS Service Desk.

    Box (cloud storage)

    • Lancaster University provides access to the Lancaster University version of Box (http://lancaster.box.com). See LU Box help to find out about how to use Box.
    • LU Box provides secure cloud storage. Research data is accessible from anywhere via the web, mobile apps and on your PC or Mac using Box Sync. Data may optionally be shared with specified colleagues, students and external parties.
    • Box is based in America. It abides by the US 'Safe Harbor agreement', thus following the stated levels of protection within the UK Data Protection Act. Find out more about data security.
    • It is acceptable to store ordinary, confidential, restricted or personal classifications of information on Box. However, if sensitive personal information must be stored in Box, additional security is required: native file encryption must be applied to these files before uploading them to Box.

    Personal Filestore (H: drive)

    All staff including researchers are allocated a personal filestore (also known as your H: drive) of 1GB. Your personal filestore might be suitable to store your working data if your data files:

    • Are relatively small; and
    • Do not need to be shared with your Lancaster colleagues.

    It is possible to increase the storage of your H: drive if needed. Please contact ISS Help and Support if you want to request additional space.

    What about using my laptop or external hard drive to store data?

    Data stored on devices such as laptops, external hard drives and USB sticks is not part of the University backup routine, and we do not encourage researchers to use ad hoc solutions or to rely upon consumer devices for backup purposes.

    If you do use your laptop or another storage device, please ensure it is encrypted. All University-provided laptops should be encrypted; contact the ISS Service Desk to arrange encryption if needed.

    Other cloud storage solutions

    • Cloud storage such as Dropbox or Google Drive is a convenient option to access files across devices. The University discourages the use of Cloud based storage solutions (other than Box) for anything but the most public of data.
    • Data in cloud storage may not be held within the EU and may not be subject to our levels of data protection (remember that all personal information must, by law, be stored within the EU). Once files are no longer within your control, there is a risk that others could access them. Data on commercial cloud services is also not necessarily backed up.
    • Cloud storage in services other than Box might be an option for the above mentioned public data classified as Ordinary by the University's Information Classifications. Learn more on storing information in the Cloud.

    How long do I have access to my research data?

    In general, your access to Lancaster University data will end when you leave the organisation. The last day for staff is the final day of their contract, and for PGR students it is the day they graduate. More detailed information on the different systems is given below.

    Research and Departmental filestores

    Data in the Research filestore and the Departmental filestore are not deleted after you leave, as these are University–shared storage spaces. Data in these areas will be inaccessible to you whilst your account is suspended and once it is deleted. Access can be granted to others.

    Box

    Data in LU Box becomes inaccessible to you when your account is suspended on leaving the University. It can be restored on request during the three-month period following suspension, after which it is permanently deleted.

    Personal filestore

    Data in your personal filestore becomes inaccessible to you when your account is suspended. It can be restored on request during the three-month period following suspension.

  • Choose file formats

    The format in which research data are created usually depends on how researchers choose to collect and analyse data. This is often determined by discipline-specific standards and customs. Ensuring long-term usability of data requires consideration of the most appropriate file formats.

    Available file formats

    The safest option to guarantee long-term data access and usable data is to convert data to standard formats that most software is capable of interpreting, and that are suitable for data interchange and transformation.

    It is important to choose platform and vendor-independent file formats where possible to ensure the best chance for future compatibility.

    Danger of obsolescence

    In principle, all software is bound to become obsolete — however there are factors that should be considered in assessing a file format's long-term stability:

    • Is it widely adopted?
    • Does it have a history of backward compatibility?
    • Does it have good metadata support (in an open format such as XML)?
    • Does it have a good range of functionality without being overly complex?
    • Does it have an available interchange format with a usable target?
    • Does it use built-in error checking?
    • Does it have a reasonable upgrade cycle?

    Choose non-proprietary formats over proprietary ones

    Popular formats such as those produced by Microsoft Office products (e.g. Word documents or Excel spreadsheets) are very likely to have reasonable longevity, but be aware that they are proprietary (owned by someone) and so will not necessarily exist forever or remain easily readable. We encourage researchers to store important information in open, non-proprietary formats, for example:

    • PDF/A rather than Microsoft Word (.docx);
    • CSV rather than Excel (.xlsx);
    • TIFF rather than Photoshop files (.psd); and
    • XML rather than a database.
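
    As a small illustration of the CSV option, the sketch below writes tabular data as plain CSV text using Python's standard csv module. The column names and values are made up for the example, and rows_to_csv is a hypothetical helper, not a University tool.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical survey responses, for illustration only.
responses = [
    {"participant_id": "P001", "age": 34, "answer": "Agree"},
    {"participant_id": "P002", "age": 51, "answer": "Disagree, strongly"},
]

csv_text = rows_to_csv(responses, ["participant_id", "age", "answer"])
print(csv_text)
```

    The csv module quotes values that contain commas, so the text can be read back without ambiguity. To keep the result, write csv_text to a file with an explicit encoding such as UTF-8.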

    File format table

    Here is a simple overview of some popular data formats and which to choose for long-term preservation. If you need more detailed advice, please look at the UK Data Archive file format table.

    Format genre | Great | Acceptable | Avoid
    TEXT | .txt; .odt; .xml; .html | .pdf; .rtf; .docx | .doc
    AUDIO | .flac; .wav | .ogg; .mp3; .aif | .wma; .ra; .ram; compression
    VIDEO | .mp2/.mp4; MKV | - | .wmv; .mov; .avi; compression
    IMAGE | .tif; .png; .svg; .jpg2000 | .gif; .jpg | .psd; compression
    DATA | .sql; .csv; .xml | .xlsx | .xls; proprietary DB formats
    QUANTITATIVE TABULAR DATA | .por | .sav; .dta; mdb; accb | -
    GEOSPATIAL DATA | ESRI Shapefile (essential - .shp, .shx, .dbf, optional - .prj, .sbx, .sbn); geo-referenced TIFF (.tif, .tfw); CAD data (.dwg) | .mdb; .mif; .kml; .ai; .dxf; .svg | -

  • Organise files and folders

    Choosing a logical and consistent way to name and organise your files allows you and others to locate and use them. Think about this at the start of your project!

    Why think about naming and versioning?

    • Agreeing on a naming convention will help to provide consistency. This will make it easier to find and correctly identify your files.  You might run into version control problems when working on files collaboratively, so think about versioning early.
    • Organising your files carefully will save you time and frustration and prevent duplication or errors by helping you and your colleagues find what you need when you need it.

    Tips on creating file names

    Names

    • Name folders meaningfully — name folders after the areas of work and content to which they relate and not after individual researchers or students.
    • Be as concise with your names as you can — if possible, no more than 25 characters.
    • You might also think about using a standard vocabulary for file names, so that everyone uses a common language. 

    Consistency

    • When developing a naming scheme for your folders you need to be consistent and stick to it.

    Versioning

    • Use a revision numbering system. Any major changes to a file can be indicated by numbers — for example v01 would be the first version, v02 the second version, and so on.
    • Be consistent throughout your project and specify the number of digits (e.g. 01).
    • For the final version, substitute the word FINAL for the version number.

    Version control

    • Include a version control table for each important document, noting each change and its date alongside the corresponding version number of the document.

    Dates

    • When using dates, put them in the order YYYY_MM_DD at the beginning of the file or folder name.

    Renaming files

    • There are occasions when you may want to rename a large number of files at once. For example, the default file names of digital images such as photographs are often just a string of numbers.
    • Special software to carry out batch renaming is available — for example, the free Bulk Rename Utility (for Windows) or Renamer (Mac). You can also do this in Windows Explorer without additional software. 
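
    If you are comfortable with scripting, batch renaming can also be done in a few lines of Python. The sketch below is one possible approach, not a University-provided tool: plan_renames and apply_renames are hypothetical helpers, and the date prefix and label are example values. It builds the list of new names first so you can check it before any file is touched.

```python
import tempfile
from pathlib import Path


def plan_renames(folder, date_prefix, label):
    """Return (old_path, new_path) pairs without touching any files."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    width = max(2, len(str(len(files))))  # at least two digits: 01, 02, ...
    plan = []
    for number, old_path in enumerate(files, start=1):
        suffix = old_path.suffix.lower()
        new_name = f"{date_prefix}_{label}_{number:0{width}d}{suffix}"
        plan.append((old_path, old_path.with_name(new_name)))
    return plan


def apply_renames(plan):
    """Rename the files, refusing to overwrite anything that already exists."""
    for old_path, new_path in plan:
        if new_path.exists():
            raise FileExistsError(f"{new_path} already exists")
        old_path.rename(new_path)


# Demonstration in a temporary folder with made-up camera file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("IMG_4401.JPG", "IMG_4402.JPG", "IMG_4417.JPG"):
        (Path(tmp) / name).touch()
    plan = plan_renames(tmp, "2014_06_02", "Fieldwork")
    apply_renames(plan)
    renamed = sorted(p.name for p in Path(tmp).iterdir())
    print(renamed)
```

    Try a script like this on a copy of your files first, since a rename cannot easily be undone.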

    Naming Examples

    • 2013_11_05_Interview_SM
    • 2014_06_02_Survey
    • Methodology_v02 
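
    The tips above can be sketched as a small Python helper that builds names of this shape. This is only an illustration: build_name and is_concise are hypothetical names, and replacing spaces with underscores is an assumption based on the examples rather than a stated rule.

```python
import re
from datetime import date

MAX_LENGTH = 25  # the guide suggests no more than 25 characters where possible


def build_name(description, when=None, version=None, final=False):
    """Build a name such as 2013_11_05_Interview_SM or Methodology_v02."""
    parts = []
    if when is not None:
        parts.append(when.strftime("%Y_%m_%d"))  # date first, as YYYY_MM_DD
    # Replace runs of characters other than letters and digits with underscores.
    parts.append(re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_"))
    if final:
        parts.append("FINAL")  # FINAL takes the place of the version number
    elif version is not None:
        parts.append(f"v{version:02d}")  # two digits: v01, v02, ...
    return "_".join(parts)


def is_concise(name):
    """Check a name against the suggested length limit."""
    return len(name) <= MAX_LENGTH


print(build_name("Interview SM", when=date(2013, 11, 5)))
print(build_name("Methodology", version=2))
print(build_name("Methodology", final=True))
```

    Generating names from one function keeps the date order and the number of version digits consistent across a project.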

    Useful videos

    Version Control

    by University of Wisconsin Data Services

    File Naming Conventions

    by University of Wisconsin Data Services

  • Documentation and metadata

    What are documentation and metadata, and why should I consider creating them?

    Why is it important?

    Good documentation makes material understandable, verifiable, and reusable. Just making data available to others does not make it usable or useful. Anyone who comes back to your data at a later time, including you, will need this documentation to understand when, why, and by whom the data was created, what methods were used, and what any acronyms or jargon mean.

    Creating good metadata is part of good practice in Research Data Management.

    What do Funding Bodies expect?

    Research funders now require researchers to create documentation and metadata and to make them openly available, facilitating access to and re-use of often complex datasets.

    "Research data should be accompanied by high-quality documentation and metadata to provide secondary users with essential information to independently understand the data, enable discovery, and allow for scientific re-use. Documentation should describe at least the origin of data, fieldwork and data collection methods, processing and/or the researcher’s management of the data. Individual data items such as variables or transcripts should be clearly labelled and described." (ESRC Research Data Policy (pdf), Principle 3 Implementation Note, March 2015)

    Please note that Lancaster University also expects researchers to create metadata that is "sufficient to enable other researchers to understand how it was created or acquired, and, if it is to be made openly available, to discover it and assess its reuse potential." (Lancaster University Research Data Management Policy, Word).

    What does documentation include?

    Supporting documentation can include information on:

    • What hardware and software were used to create the data?
    • What methodologies were used to create the data?
    • What assumptions were made in your experiments?
    • Why are there anomalies in your data?

    When should I write research project documentation?

    Much of what you need in project level documentation will have already been included in the project application. Documentation content, such as the aims and objectives of the project, any hypotheses and the methodologies used in the project, can be created even before the project has begun and so should not be very time consuming.

    What is metadata?

    Metadata is often defined as data about data.

    It is related to the broader contextual information that describes your data, but is usually more structured in that it conforms to set standards and is machine readable.  One typical use of metadata is to create a catalogue record for a dataset held in an archive. Bibliographical metadata about a journal article is another example of metadata.

    Metadata commonly includes information such as:

    • Who created the data;
    • Who published the data;
    • An abstract of the data; and
    • A description of the data.

    What is "minimal" or "mandatory" metadata?

    These are basic elements that most researchers consider "must-haves" for data documentation. Most metadata standards have formalised a set of mandatory metadata fields without which the metadata record is not valid. These often include Title, Creator, Description, Rights or Date.

    Lancaster University recommends a minimal set of metadata fields to describe data sets:

    Metadata field | Description
    Title | A name or title by which a resource is known
    Unique resource identifier | For your working data, this could be a project ID or a departmental identifier. Once you publish your data the unique resource identifier will be a DOI (Digital Object Identifier)
    Description | Description of the data set, like an abstract for a paper
    Subject | Subject or classification code describing the resource chosen from one or more authoritative sources
    Creator(s) | The main researchers involved in producing the data in priority order
    Funder | Sources of financial support for the development of the resource (e.g. ESRC or Wellcome Trust)
    Resource Language | Default will be set to 'eng' (English)
    Publication date | The date when the data was or will be made publicly available
    Publisher | The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource. For your working data this will be Lancaster University.
    Contact email address | Person or service with knowledge of how to access, troubleshoot, or otherwise field issues related to the data set

    You will need these metadata when you submit your data information into Pure or when you deposit your data set into a data centre as required by your funding body, so it will save you time if you have them ready early in your project.
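
    One way to have these fields ready early is to keep them in a small structured file alongside your data. The sketch below stores the minimal fields in a Python dictionary, reports any that are still empty, and serialises the record as JSON. The key names, the example values and the missing_fields helper are assumptions made for illustration; they are not a format required by Pure or by any data centre.

```python
import json

# Key names are illustrative; they mirror the minimal fields listed above.
REQUIRED_FIELDS = [
    "title",
    "unique_resource_identifier",
    "description",
    "subject",
    "creators",
    "funder",
    "resource_language",
    "publication_date",
    "publisher",
    "contact_email",
]


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record for a working data set.
record = {
    "title": "Example interview transcripts",
    "unique_resource_identifier": "PROJECT-0001",
    "description": "Anonymised transcripts of example interviews.",
    "subject": "Sociology",
    "creators": ["Researcher, A.", "Researcher, B."],
    "funder": "ESRC",
    "resource_language": "eng",
    "publication_date": "",  # not yet known
    "publisher": "Lancaster University",
    "contact_email": "researcher@example.org",
}

print(missing_fields(record))
print(json.dumps(record, indent=2))
```

    Running the check from time to time shows which fields still need to be completed before you deposit the data.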

    Subject–specific metadata

    Many research domains, research data repositories, and funding agencies have specific requirements for metadata and data documentation. Please contact Lancaster RDM support if you have questions regarding your data's metadata.

    Other metadata

    Good metadata (often called "rich" metadata) will provide a relevant context for research data, help track its provenance, and in the longer term, make it easier to find and use research data, and for others to discover it. The above "minimal" metadata fields can be enhanced by including more relevant information such as:

    • Keywords;
    • Version number;
    • Collection dates;
    • Geographic coverage;
    • Provenance;
    • Item embargo;
    • Item MIME type;
    • Related resources; and
    • Access restrictions.
  • Securing your data

    You can mitigate the effects of data loss by backing up and securing your work effectively. Security and backup are part of good practice in Research Data Management.

    Data security

    Storing your research data securely while maintaining ongoing access can be a challenge, especially for large or sensitive datasets. Data security is needed to prevent unauthorised access or disclosure and changes to or destruction of data. 

    Level of security needed

    • The level of security required depends upon the nature of the data: the more personal or restricted the information, the higher the level of security needs to be. The University's information classification document (Word) helps you ascertain what level of protection is required when storing information.

    Security of University filestore

    • Data held on ISS systems are stored in a resilient storage infrastructure.  
    • Access to the University's IT infrastructure and data centres is tightly controlled.  

    Encryption

    • You can use encryption to secure restricted or personal information that is stored on hardware or media, or while it is in transit (e.g. when sending emails). Encryption can be applied to various hardware and media, such as computers, laptops, mobile phones, CDs, DVDs, external hard drives, USB memory sticks and other storage devices. All University-provided laptops should be encrypted; you can ask the ISS Service Desk to arrange encryption if needed.
    • Recommended encryption standards:
      • For Windows machines or for all drives that are being accessed by Windows operating systems (e.g. external hard drives and USB sticks), BitLocker is the ISS–recommended encryption method.
      • For Macs, the recommended encryption method is FileVault 2, which is included in all versions of macOS since 10.7.

    If you have any questions regarding encryption of your data, please get in touch with the ISS Service Desk, who will be happy to help.

    Bespoke security settings

    If you are working on a project that deals with particularly sensitive information (e.g. health data), you might like to think about special data security arrangements. Contact ISS to discuss your security needs.

    Data backup

    Digital files may be accidentally lost, or corrupted so that errors are introduced or the file becomes unreadable. To guard against such loss, the researcher is responsible for ensuring that data are backed up regularly.

    • We recommend storing the master copy of all important research data on the University's filestore systems. All data storage options, including personal filestores (your H: Drive), departmental filestores and research filestores, are backed up each night.
    • We advise you NOT to rely on local solutions such as USB sticks or external hard drives for backup purposes. If you use laptop or desktop computers, use one of the University's filestores as the primary storage option for your data.
    • If you work on your laptop and modify data while travelling, you should update your master copy of your data on its University filestore as soon as you can.

    What about using my laptop or external hard drive to back up data?

    • Data stored on devices such as laptops, external hard drives or USB sticks is not part of the University backup routine, and we do not encourage researchers to use ad hoc solutions or to rely upon consumer devices for backup purposes. If you must use such a device, keep it secure by encrypting it.
    • If you work on or create new data while you are not connected to the University filestore, please ensure the device it is stored on is encrypted and update (or create) the master copy on the filestore as soon as you can (on campus or via VPN).
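
    When you update the master copy from a laptop, it can be worth confirming that the copy arrived intact. The sketch below is one possible approach, not an ISS procedure: it copies a working file into a folder standing in for the filestore and compares SHA-256 checksums of the two files. The update_master_copy helper and the folder names are hypothetical, and temporary folders are used here in place of a real laptop and filestore.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def update_master_copy(working_file, master_folder):
    """Copy a working file into the master folder and verify the copy."""
    working_file = Path(working_file)
    target = Path(master_folder) / working_file.name
    shutil.copy2(working_file, target)  # copy2 also keeps the modification time
    if sha256_of(working_file) != sha256_of(target):
        raise IOError(f"Checksum mismatch for {target}")
    return target


# Demonstration with temporary folders standing in for laptop and filestore.
with tempfile.TemporaryDirectory() as laptop, tempfile.TemporaryDirectory() as filestore:
    working = Path(laptop) / "2014_06_02_Survey.csv"
    working.write_text("participant_id,answer\nP001,Agree\n", encoding="utf-8")
    master = update_master_copy(working, filestore)
    copied_ok = master.read_text(encoding="utf-8") == working.read_text(encoding="utf-8")
    print(master.name, copied_ok)
```

    Note that shutil.copy2 overwrites an existing file of the same name, so use versioned file names if you need to keep earlier copies.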