University of York Library

Research data management: a practical guide

File formats for your data

A practical guide to help you manage your research data well, covering best practice for the successful organisation, storage, documentation, archiving and sharing of research data.

Future-proof your file formats

It is important to give some thought to the life of your research data after your project is completed and what file formats will enable long-term accessibility and reusability.

File format considerations

When you create digital research data, its file format is often dictated by how you choose to collect, analyse and store your data, by the hardware being used or by the availability of software. It can also be determined by discipline-specific standards and customs.

The file formats used when collecting and working with research data are not always ideal for long-term accessibility and reusability. You should therefore consider:

  • What formats are easiest to share
  • What formats are easiest for others to reuse, both now and in the future
  • What formats are at the lowest risk of obsolescence as new versions are produced
  • What formats include appropriate levels of metadata (many formats have embedded metadata that is essential for the future reuse of the data)
  • What file formats your chosen data repository will accept, if you are planning on depositing your files in one

You may find that you need to use one format for your own data recording and analysis and another for data archiving and sharing. Where possible, convert your files to open, non-proprietary formats that can be used on any operating system, to maximise accessibility and interoperability.
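As a starting point for deciding which files need converting, the short Python sketch below lists the files in a project folder whose extension is not on a list of preferred formats. The function name `flag_for_conversion` and the contents of `PREFERRED_EXTENSIONS` are assumptions made for illustration only; replace the list with the formats your chosen repository actually recommends.

```python
from pathlib import Path

# Illustrative only: an assumed set of open formats, not any repository's official list.
PREFERRED_EXTENSIONS = {".csv", ".txt", ".json", ".xml"}


def flag_for_conversion(folder, preferred=PREFERRED_EXTENSIONS):
    """Return the files under `folder` whose extension is not in `preferred`."""
    flagged = []
    for path in Path(folder).rglob("*"):
        # Compare extensions case-insensitively so ".CSV" and ".csv" match.
        if path.is_file() and path.suffix.lower() not in preferred:
            flagged.append(path)
    return sorted(flagged)
```

Calling `flag_for_conversion("my_project")` (a hypothetical folder name) returns the paths you may want to review. The check only looks at file extensions, so it cannot tell you whether a file's content is valid or complete.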

Data repositories

Many data repositories specify the file formats they will accept. These are formats chosen by the repository to help it keep data usable over the long term.

The file formats recommended by the UK Data Service and the data requirements table of the Archaeology Data Service, for example, offer useful lists of preferred formats that can help you to choose the file formats to use.

Native Google files

Native Google files (e.g. Google Docs and Google Sheets) will need to be downloaded in a different file format before deposit with a data repository. You should ensure that the file format chosen adequately and accurately captures the content of the item, e.g. that calculated values in spreadsheets are retained and comments within documents are captured.
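One way to spot-check a spreadsheet that has been downloaded as CSV is to look for cells that contain formula text or an error marker instead of a calculated value. The following Python sketch does this; the function name `suspicious_cells` and the `ERROR_MARKERS` list are assumptions for illustration and the list is not exhaustive.

```python
import csv

# Assumed markers of a calculation that did not survive export (not exhaustive).
ERROR_MARKERS = {"#REF!", "#N/A", "#VALUE!", "#DIV/0!", "#NAME?", "#ERROR!", "#NUM!"}


def suspicious_cells(csv_path):
    """Return (row, column, value) for cells holding formula text or an error marker."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row_number, row in enumerate(csv.reader(handle), start=1):
            for column_number, value in enumerate(row, start=1):
                text = value.strip()
                # A leading "=" suggests a formula was exported as text, not as its result.
                if text.startswith("=") or text in ERROR_MARKERS:
                    problems.append((row_number, column_number, text))
    return problems
```

An empty result does not prove the export is faithful; it only rules out the most obvious problems, so a manual comparison against the original is still advisable.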

It's important to consider carefully, from the outset, which file formats you are going to use, as converting to different file formats later will add to your workload.

Software and applications

You should also consider the software and/or application you are using to present your research data.

If you are using specialist, new and/or expensive software, you should consider converting your data to a format that can be opened in a more user-friendly application, if feasible.