
Research data management: a practical guide

Describing your data

A practical guide to help you manage your research data well, covering best practice for the successful organisation, storage, documentation, archiving and sharing of research data.

Writing documentation to accompany your research data will help you understand your data and how you derived it. It will also help other researchers to interpret and reuse your data.

Data documentation is the contextual and descriptive information required to make data discoverable, understandable, accessible and reusable. Without it, future users, including yourself, may not be able to interpret your data.

Examples: a description of methodology, dataset structure, database schema, data dictionaries, user guides, laboratory notebooks, protocols, a codebook (with information about the study, files, and variables), questionnaires, interview schedules, equipment settings, instrument calibration, references to related data/sources or to software/computer code.
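One of the examples above, a data dictionary, can be sketched in a few lines of Python. This is a minimal illustration, not a standard format: the sample data, the variable names and the descriptions are all hypothetical, and in practice you would write the descriptions yourself for your own files.

```python
import csv
import io

# Hypothetical sample data standing in for a real data file.
SAMPLE_CSV = """participant_id,age,height_cm
P001,34,172.5
P002,29,165.0
"""

# Hypothetical descriptions; in practice the researcher writes these.
DESCRIPTIONS = {
    "participant_id": "Anonymised participant identifier",
    "age": "Age in years at time of interview",
    "height_cm": "Height in centimetres",
}


def build_data_dictionary(csv_text, descriptions):
    """Return one entry per variable: name, description, example value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    first_row = next(reader, None)  # used only to show an example value
    entries = []
    for name in reader.fieldnames:
        entries.append({
            "variable": name,
            # Flag any variable that has no description yet.
            "description": descriptions.get(name, "UNDOCUMENTED"),
            "example": first_row[name] if first_row else "",
        })
    return entries


data_dictionary = build_data_dictionary(SAMPLE_CSV, DESCRIPTIONS)
for entry in data_dictionary:
    print(f"{entry['variable']}: {entry['description']} (e.g. {entry['example']})")
```

Marking undescribed variables as "UNDOCUMENTED" is a design choice for this sketch: it makes gaps in the documentation visible rather than silently leaving them out.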

Why you should document your data

Good documentation:

  • helps you to understand your own data when you need to come back to it.
  • enables you to find and use your own data quickly and easily.
  • helps you to share data with others: data repositories ask for documentation and/or metadata about the data for deposit.
  • provides context to minimise the risk of misunderstanding or misuse.
  • is essential to the longer term preservation of data as a record of provenance, licensing and access arrangements.
  • is required to make data FAIR:
    • Findable: keywords, DOIs, controlled vocabularies, repository metadata standards, metadata exchange.
    • Accessible: metadata is openly available (wherever possible) even if the data isn't.
    • Interoperable: standards and controlled vocabularies.
    • Reusable: describing data - What? Why? When? Where? - so that others can understand it.

When

When should you create documentation?

You need to think about the documentation you might need for your data at an early stage of your project and while data collection and analysis are being carried out. It is much easier to record relevant information about your dataset while it is fresh in your memory, and doing so ensures that key details are not forgotten.

If you plan to deposit/share your data with a data repository, you should contact the repository in the early stages of your project to discuss their metadata and/or documentation requirements.

What

What should you document?

There are no hard and fast rules. You should aim to provide enough information so that someone working in your research field can understand and reuse the data without having to contact you. At a minimum, your documentation should include information on who created the data, how the data was gathered and used, and for what purpose.

Try to look at your data with a fresh pair of eyes and imagine trying to reuse it in five or ten years' time. Think about the level of information you might need in order to fully understand it.

The level and type of documentation you will need will depend on a number of factors:

  • Standards within your discipline or subject area. Identify metadata standards for your discipline using:
  • The nature of your research and the types of data that you are creating or collecting
  • The reuse potential of your data. If you envisage your data being widely reused beyond the period of your research you may need a higher level of documentation to enable this.
  • The retention period for your research data. If your data is to be kept (and remain usable) for a long period, more documentation is likely to be needed to make sense of it as time passes. Knowledge that we take for granted today may not be so obvious in twenty years' time.

You understand your research data better than anyone else, so you are best placed to make the final decision about the level of documentation needed.

The UK Data Service "Document your data" web pages give more detail on documentation and metadata.

Data can be described at different levels

Utrecht University: The ins and outs of metadata and data documentation

A video about metadata and documentation, with advice on integrating metadata into spreadsheets and an overview of the types of metadata.

Metadata (or data about data) is a structured and machine-readable description of data. Metadata often follows international standards that serve specific research disciplines, data types or purposes.

A good example of metadata is the set of bibliographic records on YorSearch, the University Library catalogue.

Data repositories expose descriptive metadata (title, creator, organisation, abstract, keywords, etc.). For example, the University exposes descriptive metadata about York datasets through the York Research Database.
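To show what "structured and machine-readable" can mean in practice, here is a minimal Python sketch of a descriptive metadata record written out as JSON. The field names simply mirror the descriptive elements listed above; they are not the schema of the York Research Database or of any particular metadata standard, and all values are hypothetical placeholders.

```python
import json

# Field names mirror the descriptive metadata listed above;
# all values are hypothetical placeholders.
record = {
    "title": "Example interview transcripts",
    "creator": "A. Researcher",
    "organisation": "Example University",
    "abstract": "Placeholder abstract describing what the dataset contains.",
    "keywords": ["example", "interviews"],
}

# Assumed for this sketch only; a real repository defines its own requirements.
REQUIRED_FIELDS = ("title", "creator", "organisation", "abstract", "keywords")


def missing_fields(metadata):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


metadata_json = json.dumps(record, indent=2)
print(metadata_json)
print("Missing fields:", missing_fields(record))
```

A check like `missing_fields` is one simple way to catch incomplete records before deposit. The repository you plan to use will have its own required fields, so ask what they are before relying on a list like this one.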

Where

Where should you record your documentation?

Documentation can be in any form that is appropriate to your research and the dataset it describes.

Documentation can be recorded:

Whatever form your documentation takes, you need to ensure that it is accessible alongside your data if anyone needs to use it to interpret your results. For example, a readme file is typically located at the root of your dataset.
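As an illustration of keeping documentation alongside the data, the following Python sketch writes a plain-text readme at the root of a dataset folder. It records only the minimum suggested in this guide (who created the data, how it was gathered, and for what purpose). The function name `write_readme`, the file name `README.txt` and all values are assumptions made for this example.

```python
from pathlib import Path
import tempfile


def write_readme(dataset_root, creator, method, purpose):
    """Write a plain-text readme at the root of the dataset folder."""
    text = (
        "README\n"
        f"Created by: {creator}\n"
        f"How the data was gathered: {method}\n"
        f"Purpose: {purpose}\n"
    )
    readme_path = Path(dataset_root) / "README.txt"
    readme_path.write_text(text, encoding="utf-8")
    return readme_path


# A temporary folder stands in for a real dataset directory.
dataset_root = tempfile.mkdtemp()
readme_path = write_readme(
    dataset_root,
    creator="A. Researcher",
    method="Semi-structured interviews",
    purpose="Placeholder project aim",
)
print(readme_path.read_text(encoding="utf-8"))
```

Plain text is used here because it can be opened without specialist software, which helps when the data is revisited years later.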