EOSC Future COVID-19 Metadata Findability and Interoperability

Partners: Instruct-ERIC, EuroBioimaging

Project description

Although useful resources for data handling and integration are being developed, they remain hard for the scientific community to find because few platforms make them visible. These resources also need to stay sustainable in the long term, even after the projects that produced them have ended. This science project therefore brought together the social sciences and humanities and the life sciences clusters to assess their metadata schemas and domain catalogues and to develop a strategy for mapping and relating the different schemas.

This project focused specifically on data objects related to COVID-19:

reviewing models for automatic raw and intermediate data preservation and sharing, including software and analysis workflows;
identifying and sharing best practice;
linking the catalogues with the emerging EOSC resources and the EOSC interoperability framework.

This work aligned with, and enriched, the data currently being exposed within the European COVID-19 data portal.
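
One way to picture the schema-mapping strategy described above is a field-level crosswalk into a small common record. The following is only a minimal sketch under stated assumptions: the schema names, field names, and sample records are hypothetical and are not taken from the project's actual metadata schemas or catalogues.

```python
from typing import Any, Dict

# Hypothetical crosswalks: source field name -> common field name.
# Neither schema below is a real community standard; both are illustrative.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "life_sciences": {
        "name": "title",
        "contributors": "creators",
        "released": "date",
    },
    "social_sciences": {
        "study_title": "title",
        "investigators": "creators",
        "publication_date": "date",
    },
}


def to_common(record: Dict[str, Any], schema: str) -> Dict[str, Any]:
    """Map a source record onto the common fields, keeping unmapped fields."""
    mapping = CROSSWALKS[schema]
    common: Dict[str, Any] = {}
    unmapped: Dict[str, Any] = {}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            unmapped[field] = value
    if unmapped:
        # Nothing is silently dropped: domain-specific fields travel along.
        common["extra"] = unmapped
    return common


# Hypothetical sample records, one per cluster.
ls_record = {
    "name": "Example imaging dataset",
    "contributors": ["A. Researcher"],
    "released": "2021-03-01",
    "resolution_angstrom": 3.2,
}
ss_record = {
    "study_title": "Example survey dataset",
    "investigators": ["B. Researcher"],
    "publication_date": "2021-06-15",
}

harmonised = [
    to_common(ls_record, "life_sciences"),
    to_common(ss_record, "social_sciences"),
]
for entry in harmonised:
    print(entry)
```

Keeping unmapped fields under a separate key is a design choice made for this sketch: it lets a catalogue search on the shared fields while preserving domain-specific metadata.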

Technical challenge

Data silos emerged as a result of the huge amount of biological and chemical data generated by different scientific communities (EU-OpenScreen, ChEMBL, Work Package partners, etc.). The data therefore need to be made FAIR so that diseases and medical conditions can be understood as a whole. The task in this project was to create a workflow that produces FAIR data and thus enables data integration and harmonisation. The harmonised data are then visualised as semantic networks known as Knowledge Graphs (KGs). The KGs can in turn be used to run scientific queries or to perform downstream analyses. The project produced reproducible workflows that make building KGs time- and cost-effective. Within these workflows, APIs from public databases (ChEMBL, UniProt and so on) were used to enrich the KG.
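
The workflow described above (turn records into graph edges, enrich them from an external source, then query the graph) can be sketched roughly as follows. This is an illustrative outline, not the project's code: the identifiers, predicate names, and the `offline_lookup` function are hypothetical. In a real workflow the lookup would call a public database API such as ChEMBL or UniProt; it is replaced here by a local dictionary so that the example runs without network access.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical screening hits: a compound ID and the protein it acts on.
SCREENING_HITS = [
    {"compound": "CPD-1", "target": "P-0001"},
    {"compound": "CPD-2", "target": "P-0001"},
    {"compound": "CPD-2", "target": "P-0002"},
]


def hits_to_triples(hits: Iterable[Dict[str, str]]) -> List[Triple]:
    """Turn flat records into (compound, 'targets', protein) triples."""
    return [(hit["compound"], "targets", hit["target"]) for hit in hits]


def enrich(
    triples: List[Triple],
    lookup: Callable[[str], List[Tuple[str, str]]],
) -> List[Triple]:
    """Add annotation triples for each distinct object node via lookup()."""
    enriched = list(triples)
    seen: Set[str] = set()
    for _, _, obj in triples:
        if obj in seen:
            continue  # query each node only once
        seen.add(obj)
        for predicate, value in lookup(obj):
            enriched.append((obj, predicate, value))
    return enriched


def build_graph(triples: Iterable[Triple]) -> Dict[str, Set[Tuple[str, str]]]:
    """Store the KG as an adjacency map: subject -> {(predicate, object)}."""
    graph: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for subject, predicate, obj in triples:
        graph[subject].add((predicate, obj))
    return graph


def compounds_linked_to(graph: Dict[str, Set[Tuple[str, str]]], disease: str) -> List[str]:
    """Two-hop query: compound -targets-> protein -associated_with-> disease."""
    found = set()
    for node, edges in graph.items():
        for predicate, obj in edges:
            if predicate == "targets" and ("associated_with", disease) in graph.get(obj, set()):
                found.add(node)
    return sorted(found)


def offline_lookup(protein_id: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for a call to a public database API."""
    annotations = {
        "P-0001": [("associated_with", "Mpox")],
        "P-0002": [("associated_with", "COVID-19")],
    }
    return annotations.get(protein_id, [])


triples = enrich(hits_to_triples(SCREENING_HITS), offline_lookup)
graph = build_graph(triples)
print(compounds_linked_to(graph, "Mpox"))  # ['CPD-1', 'CPD-2']
```

Passing the lookup in as a function keeps the graph-building steps reproducible and testable offline, and lets the data source be swapped without touching the rest of the workflow.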

The EOSC Future added value

EOSC provided an environment for collaborating with experts from different domains
The resources were hosted in the EOSC Marketplace and can be found easily
Service providers could see up-to-date totals of views and downloads, which helps to monitor the impact of specific resources
The resources were created using EGI-Notebooks, a horizontal service in EOSC. This is very useful because developers do not have to install any software or tools on their local PC; everything can be done on a personalised and secure remote server

Main results

Resources for in silico methods for drug repurposing (use case: Mpox KG)
Characterisation of novel fragments using KGs and fingerprint analysis
Alignment with similar resources developed by others (e.g. Disease Maps in WP5 of BY-COVID)
Enriching KGs with resources developed by other partners

Other resources

The codebases for the KG resources are maintained at:

Publications:

Poster: https://doi.org/10.5281/zenodo.7990992

KG hosted at NDex: https://doi.org/10.18119/N9SG7D