How to manage and publish research data

Benefits of managing and sharing research data

  • Enables validation of results.
  • Makes data easier to locate and understand.
  • Reduces the duplication of data collection and the costs involved.
  • Complies with the requisites of calls for research.
  • Promotes scientific debate.
  • Promotes innovation and new potential uses of the data.
  • Encourages collaboration between data users and creators.
  • Increases research impact and visibility.
  • Increases your reputation when other people cite your work.

Data Management Plan

What is a DMP?

A data management plan (DMP) is a formal document that must describe the data life cycle both during a research project and when it has been completed. The DMP's objective is to consider aspects such as the methodology and standards to be used for data management and how they will be shared, curated and preserved in the future.

The data management plan is not a fixed document; it evolves over the lifespan of the research project. The specifications for developing a DMP for an H2020 project can be consulted in the Guidelines on Data Management in Horizon 2020 (Annex 1 and 2). Horizon 2020 currently requests the deposit and preservation of the data (including the associated metadata) that are necessary to validate the research results presented in scientific publications.

 

How is a DMP produced?

Tools and resources that are available and will help you draft your DMP:

 

Data management plan template

This document was prepared by the CSUC's Working Group to Support Research.

 

Where can I find DMP examples?

Successful cases of data management plans developed with DMP Online or other tools.

Working with the research data

How are the data described? Datasets, standards and metadata

The description of the data must include the information required to understand and analyse them and/or reproduce the results in 20 years' time.

  • Datasets: each dataset must be referenced and named. The description of each dataset should include the following information:
      • Origin of the data: whether the data are generated within the project or are collected. If the data are collected, indicate the source they have been taken from.
      • Typology and format of the research data (observational, experimental, computational, etc.).
  • Standards: the metadata standard that will be used must be identified.
  • Description metadata: the metadata should answer questions such as:
      • What are the data?
      • Who can use them?
      • When can they be used?
      • How can they be used?
      • For what purpose can they be used?
      • Where can they be found?
      • For how long will they be available?
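The questions above can be captured in a small machine-readable record kept alongside each dataset. The following is a minimal sketch in Python, not a metadata standard: the class name DatasetDescription, its field names and all sample values are hypothetical, and a real project would map them onto the metadata standard identified in its DMP.

```python
from dataclasses import dataclass, asdict
from typing import List
import json


@dataclass
class DatasetDescription:
    """Hypothetical record; field names are illustrative, not a standard."""
    name: str               # reference name of the dataset
    origin: str             # "generated" within the project, or "collected"
    source: str             # where collected data come from ("" if generated)
    data_type: str          # observational, experimental, computational, ...
    file_format: str        # e.g. "CSV"
    metadata_standard: str  # the standard chosen for the project
    what: str               # What are the data?
    who: str                # Who can use them?
    when: str               # When can they be used?
    how: str                # How can they be used?
    purpose: str            # For what purpose can they be used?
    where: str              # Where can they be found?
    available_until: str    # For how long will they be available?

    def missing_fields(self) -> List[str]:
        """Return the names of empty fields, ignoring `source`
        when the data were generated within the project."""
        missing = []
        for key, value in asdict(self).items():
            if key == "source" and self.origin == "generated":
                continue
            if not value.strip():
                missing.append(key)
        return missing


# All values below are invented placeholders for illustration.
record = DatasetDescription(
    name="survey-wave1",
    origin="collected",
    source="Hypothetical national statistics office",
    data_type="observational",
    file_format="CSV",
    metadata_standard="Dublin Core",
    what="Anonymised survey responses",
    who="Anyone, under CC BY",
    when="After a six-month embargo",
    how="Download and reuse with attribution",
    purpose="Research and teaching",
    where="Institutional repository (placeholder)",
    available_until="",  # left empty on purpose
)

print(json.dumps(asdict(record), indent=2))
print(record.missing_fields())  # ['available_until']
```

A check such as missing_fields() is one way to spot an incomplete description before the dataset is deposited.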

 

What are the legal aspects concerning data protection?

The protection of personal data covers people's fundamental rights and freedoms as they apply to an RDI project, and the protection of those data against possible use by unauthorized third parties.

 

What are the ethical aspects concerning data protection?

Ethical aspects concern the data that can be shown, the time spent and the anonymity of the people involved, respecting dignity and integrity in order to guarantee privacy and confidentiality.

 

Resources and related documentation:

 

Under which licence can you publish your data?

The document Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 states:

 

"As far as possible, projects must then take measures to enable third parties to access, mine, exploit, reproduce and disseminate (free of charge for any user) this research data. One straightforward and effective way of doing this is to attach Creative Commons Licence (CC-BY or CC0 tool) to the data deposited."

 

Further information can be found at:

 

How are data cited?

DataCite establishes that data must be cited in the same way that we cite other bibliographic information sources, such as articles or books.

Citing research data will enable you to:

  • Easily reuse and verify the data.
  • Monitor the potential impact of the data.
  • Create an academic structure that acknowledges and rewards the data producers.

 

Structure templates:

  1. Creator (Year of publication): Title. Publisher. Identifier
  2. Creator (Year of publication): Title. Version. Publisher [Type of resource]. Identifier

Note: The identifier refers to the permanent DOI, handle or URL (preferably linkable).
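The two structure templates can be sketched as a small formatting function. This is an illustrative Python sketch only: the function name format_data_citation and the sample values (including the placeholder DOI) are hypothetical and are not taken from DataCite.

```python
from typing import Optional


def format_data_citation(
    creator: str,
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
    resource_type: Optional[str] = None,
) -> str:
    """Build a citation following the templates:
    1. Creator (Year): Title. Publisher. Identifier
    2. Creator (Year): Title. Version. Publisher [Type of resource]. Identifier
    """
    citation = f"{creator} ({year}): {title}."
    if version:
        citation += f" {version}."
    citation += f" {publisher}"
    if resource_type:
        citation += f" [{resource_type}]"
    citation += f". {identifier}"
    return citation


# Placeholder values, invented for illustration.
short = format_data_citation(
    "Doe, J", 2020, "Example dataset", "Example Repository",
    "https://doi.org/10.1234/example",
)
full = format_data_citation(
    "Doe, J", 2020, "Example dataset", "Example Repository",
    "https://doi.org/10.1234/example",
    version="Version 2", resource_type="Dataset",
)
print(short)
print(full)
```

Leaving out the optional arguments gives the first template; supplying the version and the type of resource gives the second.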

 

Examples of data citation (source DataCite):

Irino, T; Tada, R (2009): Chemical and mineral compositions of sediments from ODP Site 127‐797. Geological Institute, University of Tokyo.

Geofon operator (2009): GEFON event gfz2009kciu (NW Balkan Region). GeoForschungsZentrum Potsdam (GFZ).

Denhard, Michael (2009): dphase_mpeps: MicroPEPS LAF‐Ensemble run by DWD for the MAP D‐PHASE project. World Data Center for Climate.

Publishing in open data journals

The last few years have seen growing interest in publishing research data in open access. The aim is to increase the transparency, visibility and impact of research, and to guarantee that the data can be freely accessed, preserved, exploited and reproduced.

This is the context that has spawned data journals. There are two types of data journal, allowing authors to publish their articles in two different ways:

  1. Publishing data as a data paper: these journals only publish data in data paper format. This is a new publishing format based on datasets.
     
  2. Publishing data together with an article (enriched or enhanced publication): these journals present data and articles side-by-side. Usually this type of journal doesn't collect complete data, instead providing links within the articles to specific data repositories where you can find the data.

Examples of this type of publisher include:

Have a look at the following compilation of data journals according to subject area, access type and scientific impact, based on the most relevant international indexes.

 

Where can research data be deposited?

To select a repository where the research data can be deposited, we recommend you take into account the following considerations:

  • Take into account the thematic field (there are multidisciplinary and thematic repositories) and the geographical scope.
  • Identify which type of data you have (software, images, raw data, etc.).
  • Identify whether the data are open, embargoed, restricted or closed.
  • Take into account the approximate size of the data files.
  • Take into account the licence under which you want to disseminate the data.
  • Identify whether it is necessary to use permanent identifiers (DOI, Handle).

It is important to use European resources to be sure that the current General Data Protection Regulation (GDPR) is complied with.
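The considerations above can be treated as a checklist and compared against what each candidate repository offers. The following Python sketch shows one possible way to do that. Every name in it (Access, Repository, Requirements, is_suitable) and both sample repositories are hypothetical, and the size limits and licences are invented values, not properties of any real service.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Access(Enum):
    OPEN = "open"
    EMBARGOED = "embargoed"
    RESTRICTED = "restricted"
    CLOSED = "closed"


@dataclass
class Repository:
    name: str
    subjects: Set[str]          # thematic fields, or {"multidisciplinary"}
    data_types: Set[str]        # e.g. software, images, raw data
    access_levels: Set[Access]  # access modes the repository supports
    max_file_gb: float          # approximate size limit
    licences: Set[str]          # licences that can be attached
    persistent_ids: bool        # assigns DOI or Handle
    in_europe: bool             # hosted in Europe


@dataclass
class Requirements:
    subject: str
    data_type: str
    access: Access
    size_gb: float
    licence: str
    needs_persistent_id: bool
    needs_europe: bool


def is_suitable(repo: Repository, req: Requirements) -> bool:
    """Return True if the repository meets every stated requirement."""
    return (
        (req.subject in repo.subjects or "multidisciplinary" in repo.subjects)
        and req.data_type in repo.data_types
        and req.access in repo.access_levels
        and req.size_gb <= repo.max_file_gb
        and req.licence in repo.licences
        and (repo.persistent_ids or not req.needs_persistent_id)
        and (repo.in_europe or not req.needs_europe)
    )


# Invented sample repositories, for illustration only.
repos = [
    Repository("Repo A (hypothetical)", {"multidisciplinary"},
               {"raw data", "images", "software"},
               {Access.OPEN, Access.EMBARGOED}, 50.0,
               {"CC-BY", "CC0"}, True, True),
    Repository("Repo B (hypothetical)", {"social sciences"},
               {"raw data"},
               {Access.OPEN, Access.RESTRICTED}, 2.0,
               {"CC-BY"}, False, True),
]
needs = Requirements("social sciences", "raw data", Access.OPEN,
                     5.0, "CC-BY", True, True)

matches = [r.name for r in repos if is_suitable(r, needs)]
print(matches)  # ['Repo A (hypothetical)']
```

In practice the repository characteristics would come from each repository's own documentation; the sketch only shows how the criteria combine.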

 

What are the most notable multidisciplinary data repositories and their characteristics?

Multidisciplinary Research Data Repositories:

Data repositories and data portals focused on the social sciences:

You can also check the comparative chart of open access data repositories prepared by the Library of the Autonomous University of Barcelona, or the list of repositories compiled by the Library of Erasmus University.