Research data on the cloud

Subject: Multidisciplinary

Rosa Padrós is attending the EOSC Stakeholder Forum to find out the latest about a European tool for retrieving open-access research data.

The storage capacity of your mobile phone or computer is measured in megabytes and gigabytes. Big data uses a very different unit of measurement: the petabyte (1,024 terabytes), equivalent to 13.3 years of HD video. How does one manage information in this labyrinth of data? This is one of the challenges of open science: how can one manage the data produced during a research project?
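To put the petabyte figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes binary units (1 terabyte = 1,024 gigabytes), a 365-day year of continuous playback, and function names invented for this example. It works backwards from the 13.3 years quoted above to the video size per hour that the figure implies.

```python
# Binary units: the article gives 1 petabyte = 1,024 terabytes.
# Assumption: 1 terabyte = 1,024 gigabytes as well.
GIGABYTES_PER_TERABYTE = 1024
TERABYTES_PER_PETABYTE = 1024

# Figure quoted in the article.
YEARS_OF_HD_VIDEO_PER_PETABYTE = 13.3

# Assumption: a 365-day year of non-stop playback.
HOURS_PER_YEAR = 365 * 24


def gigabytes_per_petabyte() -> int:
    """Number of gigabytes in one petabyte (binary units)."""
    return GIGABYTES_PER_TERABYTE * TERABYTES_PER_PETABYTE


def implied_gigabytes_per_hour(years: float = YEARS_OF_HD_VIDEO_PER_PETABYTE) -> float:
    """Size of one hour of HD video implied by 'one petabyte = `years` of video'."""
    return gigabytes_per_petabyte() / (years * HOURS_PER_YEAR)


if __name__ == "__main__":
    print(f"1 petabyte = {gigabytes_per_petabyte():,} gigabytes")
    print(f"Implied HD video size: {implied_gigabytes_per_hour():.1f} GB per hour")
```

Under these assumptions, the 13.3-year figure corresponds to HD video of roughly 9 gigabytes per hour; a different video quality would give a different number of years.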

This is the context in which the EOSC has been created: a European infrastructure conceived as a "system-of-systems", because it brings together different functions: "research infrastructures, e-infrastructures & service providers". The European cloud "will leverage on existing capacities and support reuse of scientific data across disciplinary, social and geographical barriers".

«Not just publications, not just data, and not just in Europe»

These are some of the ideas highlighted by Rosa Padrós, a research support librarian and co-manager of the O2 repository, who attended the EOSC Stakeholder Forum on 28 and 29 November. The meeting brought together scientific stakeholders (members of assessment agencies, research managers, researchers, technicians and librarians) to learn more about the EOSC.

The 3 Os: Open Innovation, Open Science and Open to the World

Within the framework of the European H2020 programme, projects funded with European public funds must publish in open access both their publications and the data generated during the research process, such as the answers to a survey or the tests carried out as part of an experiment. "A paradigm shift is taking place, with a transition from open access to open science. The phrase 'open innovation, open science and open to the world' defines it very well. Not just publications, not just data, and not just in Europe".

What is a data management plan (DMP)?

The researcher must draw up a DMP, a document which describes the life cycle of the data: "How the data will be identified, how they will be described, who they will be shared with, and which will be deposited in open access and which won't. We must not forget the principle 'as open as possible, as closed as necessary': we must publish with the fewest possible restrictions while protecting sensitive data".
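The questions a DMP answers can be pictured as a small record kept for each dataset. The Python sketch below is purely illustrative and is not an official DMP template: the class, its field names and the sample datasets are all hypothetical, and the rule it applies (open unless the data are sensitive) is a deliberately simplified reading of "as open as possible, as closed as necessary".

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DatasetEntry:
    """One dataset described in a (hypothetical) data management plan."""
    identifier: str                  # how the data are identified
    description: str                 # how the data are described
    shared_with: List[str] = field(default_factory=list)  # who they are shared with
    contains_sensitive_data: bool = False

    def deposit_in_open_access(self) -> bool:
        # Simplified rule: open by default, closed only when sensitive.
        return not self.contains_sensitive_data


def split_by_access(entries: List[DatasetEntry]) -> Tuple[List[str], List[str]]:
    """Return (open, closed) lists of dataset identifiers."""
    open_ids = [e.identifier for e in entries if e.deposit_in_open_access()]
    closed_ids = [e.identifier for e in entries if not e.deposit_in_open_access()]
    return open_ids, closed_ids


# Invented sample plan, for illustration only.
plan = [
    DatasetEntry("survey-answers", "Anonymised answers to a survey",
                 ["project partners"]),
    DatasetEntry("interview-audio", "Recordings with identifiable participants",
                 ["research team"], contains_sensitive_data=True),
]

if __name__ == "__main__":
    open_ids, closed_ids = split_by_access(plan)
    print("Open access:", open_ids)
    print("Restricted:", closed_ids)
```

In practice the decision to restrict a dataset involves more than a single flag (consent, licences, embargoes), so a real plan would record the reasons as well as the outcome.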

The libraries, driving forces for change

The latest edition of the Horizon Report, Library Edition, highlights research data management as one of the most immediate trends for academic and research libraries. As the report says, "data generation methods, and the capacity to store them in large quantities, are growing constantly".

«There is work to be done, it's a challenge and the Library has a key role to play»

But what challenges are involved for the Library in publishing open access data?

"Until now, our job was to ensure compliance with the open access mandates for scientific publications, articles, chapters... but now we also have to provide support to researchers in publishing their research data, with all the complexity that this entails (diversity of formats, file volume, etc.). The challenge is to include these new mandates within the libraries' work dynamic, and training is vital for this: we must learn new metadata vocabularies, new standards, implement data management policies, know and use tools other than institutional repositories, etc. There is work to be done, it's a challenge and the Library has a key role to play".

What are the benefits of sharing data in open access?

"There are many but I could highlight two: it helps increase research impact and visibility and it fosters collaboration between the data's users and their creators. For example, if you are doing a scientific experiment and it doesn't work, sharing the data used is fantastic because it will allow other researchers to take these results as a starting point for performing new tests, reducing duplication in data collection and the associated costs. This is the philosophy of open access".

She also points out that the UOC Library has a support service that helps researchers draw up their data management plan and provides information for choosing the most suitable repository in which to deposit their data.