University of York Library

Research data management: a practical guide

Software/computer code

A practical guide to help you manage your research data well, covering best practice for the successful organisation, storage, documentation, archiving and sharing of research data.

Software and computer code

You may develop your own software to generate, process and/or analyse research data. This software may be necessary to enable others to verify your findings or reproduce your methodology, and may need to be made available to others alongside your research data.

The University encourages researchers to follow best practice in the management, archiving and sharing of software written in the research process.


The sections below give guidance on the things you need to consider when you manage software.

Develop a software management plan

Developing a software management plan can help you to manage and publish your software easily at the end of your project.

DMPonline (see planning your data management) includes a template from the Software Sustainability Institute.

Use a code repository

To manage your code it is recommended that you use an online code repository when developing your software. A code repository will provide version control, code review, bug tracking, documentation, user support and other features. Code repositories can be either private or public, which means that your code can be maintained during a closed development phase and then released for open use at an appropriate point (e.g. if you want your code to be used and developed by other researchers / developers).

The University has a licence for and supports GitHub. More information can be found on IT Services' access to revision control services page.

If the software is needed to validate research results, you should use a data repository to archive and share it. See "Archive and share software in a data repository" below for more guidance.

Document and describe your software

Software documentation is the contextual information that describes the software and provides others with information on how the software was created, how it operates and how it can be used. You can record documentation by embedding it in the source code or by including text(s) to accompany the software.

As a minimum it is recommended that you include a README file with your software. If you use GitHub, a file with the name README will be displayed on your repository's top-level page, surfacing your README to repository visitors. GitHub provides guidance on adding a README file to your repository.

A clear link should also be established between the shared research data and the shared software. This can be achieved by including a link or a DOI to the code in a dataset's metadata record.
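As an illustration of linking a dataset record to its software, the sketch below adds a DOI reference to a metadata record held as a Python dictionary. The field names follow the related-identifier pattern used in DataCite-style metadata, but the record, the DOI and the choice of relation type are placeholders and assumptions; your repository's own metadata form may use different fields.

```python
import json


def add_software_link(dataset_record, software_doi, relation="IsSupplementedBy"):
    """Return a copy of the dataset record with a related-identifier entry
    pointing at the software DOI. The relation type is an assumption;
    choose whichever value your repository recommends."""
    linked = dict(dataset_record)
    related = list(linked.get("relatedIdentifiers", []))
    related.append(
        {
            "relatedIdentifier": software_doi,
            "relatedIdentifierType": "DOI",
            "relationType": relation,
        }
    )
    linked["relatedIdentifiers"] = related
    return linked


# Hypothetical dataset record and placeholder DOI, for illustration only.
record = {"title": "Example survey dataset"}
linked = add_software_link(record, "10.1234/example-software")
print(json.dumps(linked, indent=2))
```

The function returns a new record rather than changing the original, so the unlinked version stays available for comparison.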

You can find further guidance on documentation and metadata on the describing your data page.

Clarify copyright and ownership of software

Before you can share (and license) software you need to ensure that you have the right to do so. It is therefore vital to clarify copyright and intellectual property ownership of the software you will use before your research begins.

You should be mindful that although you are the developer of a given piece of software, you may not have the right to do whatever you want with it:

  • copyright is likely to reside with your employer (the University) or a hiring party
  • co-developers and owners of software also have rights that need to be taken into account
  • any use of third-party code needs to be carefully documented, as the licence terms for that part of the code will also require consideration before publishing.

University guidance: The University's policy on intellectual property and guidelines are intended to address both the rights and property aspects of IP generated within the institution.

The University owns the software created by staff in the course of their employment, unless otherwise stated within a research contract. However, researchers have discretion to choose the most suitable publication (and licensing) option.

Contact: Commercialisation team

Identify if there are valid reasons not to share software openly

Valid reasons may exist for not making the software openly available. For example:

The software you develop might have commercial potential. It is worth considering whether significant impact could be achieved through companies having access to the software. This route does not prevent the publishing of the software for research and non-commercial purposes. With the support of the Commercialisation team it is possible to dual-license the software for both non-commercial and commercial use.

If you think your research might have commercial potential, you should consult with the Commercialisation team before you share (and license) software.

The software you develop might have ongoing research value. For example, the development of the software may have required significant intellectual investment, and making it openly available would seriously compromise your future research. In such cases, it may be possible to make the software available on an ad hoc basis, subject to a non-disclosure agreement.

You may not be the owner of the software you develop or improve. If you develop or improve third-party software, making the code you have developed openly available is likely to be constrained by the commercial interests of the software owner. The software owner may permit you to make the software available to specific individuals subject to a non-disclosure agreement, so that other researchers can verify your published findings. At the outset of your research, you should clarify what you will be permitted to do by the software owner and/or the original licence.

You can find further guidance and other reasons not to openly share on the restricting access page.

Archive and share software in a data repository

Versions of software that are needed to verify published results should be archived and shared in a data repository. A DOI will be issued for each new version, which will enable users to easily find and cite a specific version of your software.

If you use GitHub, you can push a release to the data repository Zenodo in just a few clicks. It's also possible to connect figshare (a generalist repository) with your GitHub account to import and publish your software and code there. If your code is in a public GitHub repository, the Software Heritage Archive may have saved it automatically. This archive aims to collect, preserve, and share all software that is publicly available in source code form.

The Software Sustainability Institute provides comprehensive software deposit guidance for researchers.

Ensure that your published software is covered by an appropriate licence agreement

A licence is how you explicitly give someone else permission to use your work. In addition to communicating a work's terms of use, a licence also provides protection to the creators and owners of intellectual property.

Please note that Creative Commons licences are not suitable for software.

Open Source licences place no restrictions on use and minimal restrictions on the distribution of derivative works; as such, they are an important tool for making software accessible and reusable.

If you believe that companies may be interested in using the software you have developed please consult with the Commercialisation team so they can support you with choosing a suitable licence.
 

Open Source Initiative approved licences

The vast number of licences available can cause confusion. Choosing an Open Source Initiative (OSI) approved licence is recommended.

Popular OSI approved licences: Apache-2.0 | BSD License(s) | GNU General Public License(s) | The MIT License
 

Choose an open source license is a useful site that explores licensing options for software. tl;drLegal explains software licences in plain English.

Guidance on licensing software is available from the Software Sustainability Institute.

Enable others to cite your software

You should ensure that it is possible for other people to cite your software. You may wish to write a plain text file with citation metadata, e.g. a Citation File Format file (CITATION.cff), which is then distributed with your software. GitHub and other version control platforms are increasingly offering tools for making your software easier to cite, including assigning DOIs to your code. For further guidance see DataCite's software citation workflows.
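As a rough illustration of what a citation metadata file might contain, the sketch below builds the text of a minimal CITATION.cff file. The title, author, version and date are placeholders, and the helper function is hypothetical; the Citation File Format documentation lists the full set of fields, and only a small subset is shown here.

```python
from datetime import date


def build_citation_cff(title, authors, version, released, doi=None):
    """Return the text of a minimal CITATION.cff file.

    `authors` is a list of (family name, given name) pairs. Values are
    written inside double quotes, so this simple sketch assumes they do
    not themselves contain double quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    lines.append(f'version: "{version}"')
    lines.append(f"date-released: {released.isoformat()}")
    if doi:
        lines.append(f'doi: "{doi}"')
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
cff_text = build_citation_cff(
    title="Example analysis toolkit",
    authors=[("Researcher", "Alex")],
    version="1.0.0",
    released=date(2024, 1, 15),
)
print(cff_text)
```

Saving the returned text as CITATION.cff in the top-level folder of the software keeps the citation details alongside the code.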

Cite others' software that has played an integral part in your research

It is important to give credit to other researchers for software they have developed which you have used in your research. Moreover, some software publishers may require the use of software to be acknowledged or cited in published research outputs.

In general, software should be cited in a similar manner to research data and research papers. Many software packages (and authors) give guidance on how they want to be cited; this is typically found in the documentation for the software. If you are writing for publication and your publisher or citation style provides guidelines, follow them, or check with your editor. If no guidance exists, various organisations have been working to develop guidelines for software citation; examples include:
   - FORCE11 software citation principles
   - DataCite metadata schema 4.1 with new additions to describe software and example software citation
   - Software Sustainability Institute how to cite and describe software.

Clarify funder terms

The terms “open source” and “open access” are sometimes used interchangeably (including by funders) but it’s important to note that they are different concepts.

Open source refers exclusively to software.

“Open source doesn’t just mean access to the source code. The distribution terms of open-source software must comply with” The Open Source Definition.

Put simply, open source software must be fully open to reuse.

“By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.” - Budapest Open Access Initiative

The term open access is typically applied to research publications, for which the University has a policy. When the term is applied to software, it can cover source code that is shared and available to read but licensed in a way that does not permit full “open” reuse.

Funders may have their own definitions of these concepts. You should therefore clarify with your funder what they mean when their terms require the software to be “open source” or “open access”.

Software publishing checklist

Deciding if software should be published

To help you make your decision, you should consider the following questions:


What do the software terms in your research contract require you to do?
Many terms require the software to "be made openly available with as few restrictions as possible". However, it's widely accepted that a requirement to share software does not equate to "open" access to software in every case. There may be valid reasons for placing restrictions on who can use the source code or how it can be used. See the guidance under "Identify if there are valid reasons not to share software openly" above.

Consider the likely users of the software you have developed. Are any companies likely to be interested in using it? If so, we recommend that you discuss this with the Commercialisation team before choosing a licence and publishing the software.


Do you need to publish your software source code in order to enable others to verify your published findings or reproduce your methodology?
If yes, you should publish the software source code no later than the publication of the research findings. Delaying publication (e.g. applying an embargo) or restricting access to the software source code can often be justified if there is a need to protect the intellectual property pending commercial exploitation.


Are you required to publish your software source code?
For instance, to meet a policy requirement. If your research is funded, you should check the policy requirements of your funder. For example, the Wellcome Trust includes software alongside data in its Data, software and materials management and sharing policy.


Does the software you develop have commercial potential?
You should consult with the Commercialisation team before you publish (and license) software.


Do you own all aspects of the software source code and do you have the right to publish it?
If the answer is no, you must obtain permission from the owner(s) of the software before you can publish (and license) it. If you are unsure, see the guidance under "Clarify copyright and ownership of software" above, which points to the University's guidance and policy on intellectual property.


Have you identified an appropriate licence for your published software?
If you haven't, see the guidance under "Ensure that your published software is covered by an appropriate licence agreement" above. If you have any further questions about protecting and licensing your software, contact either the Open Research team (non-commercial enquiries) or the Commercialisation team (commercial enquiries).


Is your software ready to publish?
You should ensure that your software is clearly and consistently formatted, organised and documented, and is packaged with everything necessary to enable others to read and execute it.
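One way to spot obvious gaps before publishing is a small script that looks for the supporting files discussed in this guide. The sketch below is only an illustration: the function name and the list of expected files (a README, a licence file and a citation file) are assumptions drawn from the recommendations above, not a University requirement, and the check says nothing about the quality of what those files contain.

```python
from pathlib import Path

# Assumed checklist: a label for each item and the file names (without
# extension, upper case) that would satisfy it.
EXPECTED_FILES = {
    "README": {"README"},
    "licence": {"LICENSE", "LICENCE", "COPYING"},
    "citation file": {"CITATION"},
}


def missing_release_files(project_dir):
    """Return the labels of expected files not found in the top level
    of project_dir. Matching ignores case and file extension."""
    present = {
        path.stem.upper()
        for path in Path(project_dir).iterdir()
        if path.is_file()
    }
    return [
        label
        for label, accepted in EXPECTED_FILES.items()
        if not (accepted & present)
    ]


if __name__ == "__main__":
    # Check the current working directory.
    missing = missing_release_files(".")
    if missing:
        print("Missing before publishing:", ", ".join(missing))
    else:
        print("README, licence and citation file all found.")
```

An empty result means only that the files exist; you still need to read them through to confirm they are complete and up to date.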


The Software Sustainability Institute lays out the importance of archiving software in its briefing paper Digital preservation and curation - the danger of overlooking software.

The Software Sustainability Institute was founded in 2010 as the first organisation in the world dedicated to improving software in research.

The Institute provides guidance and resources to assist with the development, management and publication of research software, including guides for researchers and guides for developers.

University support

Research Coding Club

The Research Coding Club is an informal group for people who work with research software. The Club meets every two weeks, offering drop-in code clinics and seminars. You can subscribe to the Club's mailing list and join its Slack channel to get help and advice.


Research Software Engineers

IT Services' Research Software Engineers offer specialised software support and expertise to academics. Additional guidance on facilities, support and training can be found on the Research Computing Support wiki pages.

Society of Research Software Engineering

If you regularly develop research software you may wish to consider joining the Society of Research Software Engineering.

The term 'software' is used throughout this page and represents both software and computer code.