What license should I use?
Brief introduction to licenses in the research process
Creating a research compendium for a project and making it publicly available is necessary but not quite sufficient for making the research project reproducible. The issue is that, by default, the author retains all rights to the research compendium. This limits what others can legally do with it, such as making a copy to reproduce the results of the project. The solution is to attach a license to the compendium that allows others to use this research artifact in a way that enables reproducible research. Many open licenses are used for this purpose, and the goal of this section is to outline which types of licenses suit which types of projects. In all cases, using a license consists of placing a file called LICENSE in the top level of your research compendium, containing the standard text of the license.
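As a minimal sketch of that last step, the Python snippet below writes a LICENSE file to the top level of a compendium. The function name add_license and its arguments are hypothetical, chosen only for illustration. The license text itself is assumed to be copied verbatim from the official source of whichever license you choose.

```python
from pathlib import Path


def add_license(compendium_dir, license_text):
    """Write license_text to a file named LICENSE at the top level of compendium_dir.

    add_license is a hypothetical helper for illustration; license_text is
    assumed to be the unmodified standard text of the chosen license.
    """
    root = Path(compendium_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")

    license_path = root / "LICENSE"
    if license_path.exists():
        # Refuse to overwrite silently: changing a license should be deliberate.
        raise FileExistsError(f"{license_path} already exists")

    license_path.write_text(license_text, encoding="utf-8")
    return license_path
```

Calling add_license("my-compendium", text) would create my-compendium/LICENSE, where my-compendium is a placeholder directory name. Refusing to overwrite an existing file is a design choice of this sketch, not a requirement of any license.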
Licensing is a complex topic. The comprehensive Turing Way chapter on licensing gives a broad overview of licenses and their role in reproducible research. Rather than try to summarize that material, the goal here is to give some practical advice on how licensing interacts with a research project. The Choose a License tool can help if you’re unsure which license to pick.
Note that licensing is a legal matter, so the details can vary by jurisdiction. To the extent that your employer holds the copyright to your work, the choice of license may also be decided by institutional policy.
Picking a license for a research project
In most cases a research compendium is a collection of scripts, along with some documentation, that takes data and produces research results. A common choice for this type of project is one of the Creative Commons licenses. These are general-purpose licenses for any copyrightable work (except software) that enable authors to grant the public permission to use their works. These licenses are also useful for research papers (e.g., articles are licensed as CC BY-NC 4.0 for the Journal of Official Statistics and CC BY 4.0 for the Journal of Open Source Software) and data sets. There are other licenses in this space, including some very colourful ones, but the Creative Commons licenses appear to be the most common.
Picking a license for research software
Software licenses are different from the general-purpose Creative Commons licenses. Without getting into the details, the practical difference is that making a piece of research software generally available for others to use requires releasing that software with an open-source software license. For example, releasing an R package on CRAN requires that the package use one of the standard open-source licenses. Some research projects include software as one component, and consequently there may be different licenses for different parts of the project.
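When different parts of a project carry different licenses, it helps to state the split explicitly, for example in the README. The sketch below generates such a summary. The function name license_summary, the directory names, and the license choices are all hypothetical and illustrative only. The license names are written as SPDX short identifiers.

```python
def license_summary(parts):
    """Return a plain-text summary mapping each part of a project to its license.

    parts is a dict from a path within the project to a license identifier.
    """
    lines = ["This project uses different licenses for different parts:", ""]
    for path, license_id in sorted(parts.items()):
        lines.append(f"- {path}: {license_id}")
    return "\n".join(lines) + "\n"


# Hypothetical layout: these directories and license choices are assumptions
# made for illustration, not a recommendation for any particular project.
example_parts = {
    "R/": "MIT",
    "data/": "CC-BY-4.0",
    "paper/": "CC-BY-4.0",
}
print(license_summary(example_parts))
```

A summary like this does not replace the license files themselves. Each part would still need the full text of its license alongside it.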
In contrast to the general-purpose licenses, software licenses come in two flavors: permissive and copyleft. The MIT license is a popular permissive license that puts minimal restrictions on what someone can do with a piece of software. By contrast, the GPL, probably the best-known copyleft license, requires that derivative works be distributed under the same license, which prevents open-source software from being repackaged as non-open software. For most research projects there is little practical difference between the two flavors, as both fulfill the purpose of making research software accessible to others. The R Packages book has an excellent chapter on software licenses for research software.