License your data
Plan licensing early
As a data creator, you should clarify ownership of, and rights relating to, research data before a project starts. If you collaborate on a project, there may be multiple rights holders, and all of them must give permission before the data can be shared.
Types of licenses
A license can grant a spectrum of permissions for the use, re-use, or distribution of data. The least restrictive license states that anyone can use, re-use, share, and redistribute the data for any purpose and without attribution. In essence, you waive your claim to the copyright. Restrictions that can be added to a license include:
- Attribution: If you allow use or re-use of the data, decide whether the license will require attribution or acknowledgement in any resulting output.
- Derivative Works: Will you allow re-use of your data in the creation of derivative works, and if so, with or without attribution? You can stipulate that any derivative work must be licensed under the same terms as your data; this is called ShareAlike (or SA).
- Commercial/Non-Commercial Use: Decide whether to allow commercial re-use of your data or to restrict re-use to non-commercial (NC) purposes only.
Various forms of license exist, ranging from standard licenses like Creative Commons to bespoke restrictive data licenses. Standard licenses suit most research projects, but you need to familiarise yourself with the options available. Here are some examples:
- Creative Commons provides a range of licenses that are used widely but were not designed for data specifically. The CC BY-NC-SA licence (Creative Commons Attribution-NonCommercial-ShareAlike) lets others share, remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.
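- Open Data Commons (ODC) offers three licenses written specifically for data and databases, which makes them better suited to research data than Creative Commons: the Public Domain Dedication and License (PDDL), the Attribution License (ODC-By), and the Open Database License (ODbL).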
- Open Government Licence (pdf): Suitable for UK public sector databases and datasets.
The majority of these licences have been designed to enable open sharing of data. If you are working with a dataset that cannot be made available under an open licence (e.g. it is subject to third-party rights that prevent it from being made freely available), you will need to produce a Data Transfer Agreement tailored to the specific needs of your work.
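As a rough illustration of how the restrictions described above combine, the sketch below maps a set of choices to the short codes Creative Commons uses, such as CC BY-NC-SA. The function name and its parameters are hypothetical and not part of any Creative Commons tool. The NoDerivatives (ND) element and the CC0 public domain dedication are Creative Commons options not discussed above; they are included here only so that the "no derivative works" and "no attribution" choices have an outcome.

```python
def cc_license_code(attribution=True, non_commercial=False,
                    share_alike=False, allow_derivatives=True):
    """Return the short Creative Commons code for a set of choices.

    Hypothetical helper for illustration only.
    """
    if not attribution:
        # Every Creative Commons licence requires attribution (BY), so
        # waiving attribution leaves only CC0, which waives all conditions.
        if non_commercial or share_alike or not allow_derivatives:
            raise ValueError("NC, SA and ND conditions all require attribution (BY)")
        return "CC0"
    if share_alike and not allow_derivatives:
        # ShareAlike governs how derivatives are licensed, so it cannot
        # be combined with a ban on derivatives.
        raise ValueError("ShareAlike requires that derivatives are allowed")
    parts = ["BY"]
    if non_commercial:
        parts.append("NC")
    if share_alike:
        parts.append("SA")
    elif not allow_derivatives:
        parts.append("ND")
    return "CC " + "-".join(parts)


print(cc_license_code(non_commercial=True, share_alike=True))
```

Called with `non_commercial=True` and `share_alike=True`, the function returns the code for the Attribution-NonCommercial-ShareAlike licence described in the list above. The version number of the licence is deliberately left out of the sketch.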
Open Source Software Licenses
Open source licences allow other people to take the source code for your software, modify it, and redistribute it; most require that they give credit to you as an author. Popular and widely used open source software licenses are:
- Apache License 2.0
- BSD 3-Clause "New" or "Revised" license
- BSD 2-Clause "Simplified" or "FreeBSD" license
- GNU General Public License (GPL)
- GNU Library or "Lesser" General Public License (LGPL)
- MIT license
- Mozilla Public License 2.0
- Common Development and Distribution License
- Eclipse Public License
Please note that all open source licenses allow the commercial use of your code. If you want to retain control over commercial use of the software, you can consider multiple licensing (DCC guide).
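Once a license is chosen, one common way to state it in each source file is a one-line SPDX identifier comment. The sketch below assumes that convention, which this guide does not itself prescribe; the dictionary and function names are hypothetical. It covers only the licenses from the list above whose version is unambiguous, because the identifiers for the GPL, LGPL, CDDL, and Eclipse licenses depend on the version you pick.

```python
# SPDX short identifiers for the listed licences with an unambiguous version.
SPDX_IDS = {
    "Apache License 2.0": "Apache-2.0",
    "BSD 3-Clause": "BSD-3-Clause",
    "BSD 2-Clause": "BSD-2-Clause",
    "MIT license": "MIT",
    "Mozilla Public License 2.0": "MPL-2.0",
}


def spdx_header(license_name, comment_prefix="#"):
    """Build the comment line that declares a file's licence.

    Hypothetical helper; raises KeyError for licences not in SPDX_IDS.
    """
    spdx_id = SPDX_IDS[license_name]
    return f"{comment_prefix} SPDX-License-Identifier: {spdx_id}"


print(spdx_header("MIT license"))
print(spdx_header("Apache License 2.0", comment_prefix="//"))
```

The identifier line supplements the full license text, which normally still ships in a LICENSE file at the top of the project.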
The Digital Curation Centre (DCC) has published an extensive guide on how to license data:
Ball, A. (2012). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre.
Another good overview including a table of licence types:
Korn, N., Oppenheim, C. (2011). ‘Licensing Open Data: A Practical Guide’ (pdf).