Improving ecological (meta)data FAIRness through semantic services: integration of EcoPortal in LifeWatch Italy new platforms

Interoperability
Metadata & Ontologies
Metrics, Certification and Guidelines
Earth and environmental sciences
LifeWatch ERIC



Overview

LifeWatch Italy serves as the Italian Distributed Center for the LifeWatch ERIC Infrastructure, contributing significantly to the ERIC's functionality. Focused on biodiversity and ecosystem research data management, LifeWatch Italy enhances data sharing, integration, and analysis through its Data Portal and Metadata Catalogue. Recent efforts by LifeWatch Italy aimed at improving FAIRness involve integrating EcoPortal, a semantic artefacts catalogue, with the Data Portal and Metadata Catalogue.

EcoPortal supports the scientific community in managing semantic artefacts in the ecological domain and employs the Ontology FAIRness Evaluator (O’FAIRe) tool for FAIRness assessment. The main challenge we address is enhancing metadata annotation with FAIR semantic artefacts. The expected impacts of the integration between EcoPortal and the new Data Portal and Metadata Catalogue of LifeWatch Italy are easier discovery and annotation of ecological data, machine-actionable (meta)data, and a push towards Linked Open Data.

 

Context and objectives

LifeWatch Italy (LW ITA) is the Italian Distributed Center of the LifeWatch ERIC Infrastructure and contributes in-kind to the functioning of the ERIC (European Research Infrastructure Consortium). Its activity focuses on the management of biodiversity and ecosystem research data, and on the development of tools and services for their sharing, integration and analysis. Among the services offered, LW ITA provides the Data Portal and the Metadata Catalogue, which are currently in the final testing stage before deployment in production.

The Data Portal is a data repository, based on DSpace, that provides FAIR data and metadata. It helps scientists share their (meta)data and reuse data created by others. The data schema is based on the Darwin Core standard and controlled vocabularies. The metadata schema associated with each dataset is the LifeWatch profile of the Ecological Metadata Language (EML 2.2.0; Vaira et al., 2022).

The LifeWatch Italy Metadata Catalogue is an information management system based on GeoNetwork 4.2.2, designed and implemented to enable access to several resources from a variety of providers through descriptive metadata, enhancing and promoting the information exchange and sharing among organizations and researchers. The LifeWatch Italy Metadata Catalogue gives access to different resources and their metadata that are based on two main standards:

ISO 19139 (VREs, services, workflows and research sites);

EML 2.2.0 (datasets).

Each metadata profile is organized in sections, which reflect the main information related to each specific resource. Each section contains optional and mandatory metadata elements (Vaira et al., 2022).

One of the objectives of the latest implementation was to provide a new version of the Data Portal and the Metadata Catalogue and to integrate both with EcoPortal. The integration aimed to use well-defined and agreed concepts/classes for the metadata attributes and their values.

EcoPortal uses an advanced version of the OntoPortal technology, collaboratively developed and tested in FAIR-IMPACT. It supports the scientific community in the creation and management of semantic artefacts and their use to harmonise (meta)data. Recent updates of EcoPortal have improved the metadata schema (MOD 2.0) associated with semantic artefacts and aligned it with other catalogues and repositories (Tarallo et al., 2024). In addition, the integration of the Ontology FAIRness Evaluator (O’FAIRe) tool makes it possible to assess the level of FAIRness of semantic artefacts through a metadata-based automatic FAIRness assessment methodology (Amdouni et al., 2022).

 

Challenges and implemented solutions

The challenges we aim to address involve harmonising metadata and data in our platforms using semantic artefacts, while ensuring that semantic artefacts in use are FAIR so that others, humans or machines, can find, access, interoperate and reuse them.

The integration encompasses the implementation of a REST API connector, through HTTP GET calls, that links the editing tools of Data Portal and Metadata Catalogue with EcoPortal, thus enhancing the management of data and metadata to ensure FAIR compliance.
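A request of this kind can be sketched as follows. This is a minimal illustration, not the connector's actual code: the base URL, the `/search` path and the `q`, `ontologies` and `apikey` parameters are assumed from the conventions of OntoPortal-style REST APIs, and the function name, the artefact acronym `EXAMPLE` and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the EcoPortal REST API; replace with the real endpoint.
ECOPORTAL_API = "https://data.ecoportal.lifewatch.eu"


def build_search_url(term, ontologies=None, api_key="YOUR_API_KEY"):
    """Build the URL of an HTTP GET search request (hypothetical helper)."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Restrict the search to selected semantic artefacts (assumed parameter).
        params["ontologies"] = ",".join(ontologies)
    return f"{ECOPORTAL_API}/search?{urlencode(params)}"


# "EXAMPLE" is a made-up artefact acronym used only for illustration.
url = build_search_url("salinity", ontologies=["EXAMPLE"])
print(url)
```

The resulting URL could then be fetched with any HTTP client; keeping URL construction separate from the network call makes the connector easier to test offline.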

For the Data Portal, two REST API endpoints (classes and properties) are used to extract the hierarchy of selected semantic artefacts. These artefacts are then converted and saved within the platform in the DSpace proprietary format, which keeps the system performant. The semantic artefacts are scheduled to be updated every night.
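The hierarchy extraction step could look roughly like the sketch below. It assumes the classes have already been downloaded as flat records with a URI, a label and a list of parent URIs; these field names, the helper functions and the sample data are hypothetical, and the conversion to the DSpace format is not shown.

```python
from collections import defaultdict


def build_hierarchy(classes):
    """Map each parent URI to its sorted child URIs; root classes sit under None."""
    children = defaultdict(list)
    for cls in classes:
        # A class without parents is treated as a root of the hierarchy.
        for parent in cls.get("parents") or [None]:
            children[parent].append(cls["uri"])
    return {parent: sorted(kids) for parent, kids in children.items()}


def render(children, labels, node=None, depth=0):
    """Return the hierarchy as indented label lines (assumes no cycles)."""
    lines = []
    for uri in children.get(node, []):
        lines.append("  " * depth + labels[uri])
        lines.extend(render(children, labels, uri, depth + 1))
    return lines


# Made-up records standing in for the output of the "classes" endpoint.
classes = [
    {"uri": "ex:water", "label": "water", "parents": []},
    {"uri": "ex:freshwater", "label": "fresh water", "parents": ["ex:water"]},
    {"uri": "ex:seawater", "label": "sea water", "parents": ["ex:water"]},
]
labels = {cls["uri"]: cls["label"] for cls in classes}
tree = render(build_hierarchy(classes), labels)
print("\n".join(tree))
```

Storing a precomputed structure like this locally, and refreshing it on a schedule, avoids walking the remote hierarchy on every user interaction.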

This integration allows users to select attribute values for the metadata schema directly from semantic artefacts published within EcoPortal, with autocompletion support.

The interaction with EcoPortal consists of searching for concepts/classes within the editing interface and then retrieving information such as label, definition and URI. This process, implemented for both the Data Portal and the Metadata Catalogue, facilitates the unambiguous annotation of metadata with concepts and classes from FAIR semantic artefacts.
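The retrieval of label, definition and URI can be illustrated with the sketch below. The response shape (a `collection` list whose items carry `prefLabel`, `definition` and `@id`) is an assumption based on OntoPortal-style search responses; the sample payload, the URIs and the function name are made up for illustration.

```python
import json

# Hand-written sample payload; field names are assumed, URIs are fictional.
SAMPLE_RESPONSE = json.dumps({
    "collection": [
        {
            "prefLabel": "salinity",
            "definition": ["Concentration of dissolved salts in water."],
            "@id": "http://example.org/vocab/salinity",
        },
        {
            "prefLabel": "sea water",
            "@id": "http://example.org/vocab/sea-water",
        },
    ]
})


def extract_concepts(payload):
    """Reduce a search response to label, definition and URI per concept."""
    data = json.loads(payload)
    concepts = []
    for item in data.get("collection", []):
        definitions = item.get("definition") or []
        concepts.append({
            "label": item.get("prefLabel", ""),
            # Keep the first definition if any; some concepts have none.
            "definition": definitions[0] if definitions else "",
            "uri": item.get("@id", ""),
        })
    return concepts


concepts = extract_concepts(SAMPLE_RESPONSE)
for concept in concepts:
    print(concept["label"], concept["uri"])
```

Keeping the URI alongside the label is what makes the annotation unambiguous: two artefacts may share a label, but not a URI.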

Figure 1 shows the workflow of semantic annotation within the Data Portal, in particular for the “Keywords” attribute, which is followed for other attributes like “Software” and “Protocol”. Users are able to access (Fig. 1a, b) and search (Fig. 1c) concepts and classes from semantic artefacts published within EcoPortal through a wizard. Users can select concepts and classes that are then reported, with their URI, inside the “Keywords” element (Fig. 1d).

A similar approach is followed for the attributes of the data table. The difference is that the definition of each concept or class is also retrieved and reported inside the element, as shown in Figure 2.

 

 

Figure 1a. Data Portal wizard. Keywords to compile

 

 

 

Figure 1b. Data Portal wizard. Point of access for semantic artefacts published within EcoPortal.

 

 

 

 

Figure 1c. Data Portal wizard. Example search for keywords. 

 

 

 

Figure 1d. Data Portal wizard. Keywords filled in. 

 

 

 

Figure 2. Example of data table's attribute compilation within the Data Portal. The "attribute definition" is retrieved directly from EcoPortal. 

 

The process for semantic annotation within the Metadata Catalogue is shown in Figure 3. Users can search for concepts and classes directly in the search box (Fig. 3a). Administrators can configure these fields to retrieve data from external resources (e.g. EcoPortal), which enables autocompletion and the search for controlled terms through the connection to EcoPortal.

When the user starts typing in the field, the list of matching terms is filtered and the possible values are suggested in the field (Fig. 3b).
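The filtering behaviour described above can be approximated as follows. This is only a sketch of one possible matching rule (case-insensitive substring match, with labels that start with the typed text listed first); the rule actually applied by the Metadata Catalogue is not stated here, and the function name and sample labels are hypothetical.

```python
def suggest(typed, labels, limit=10):
    """Return up to `limit` labels matching the typed text (hypothetical helper)."""
    needle = typed.strip().casefold()
    if not needle:
        return []
    matches = [label for label in labels if needle in label.casefold()]
    # Labels that start with the typed text come first, then alphabetical order.
    matches.sort(
        key=lambda label: (not label.casefold().startswith(needle), label.casefold())
    )
    return matches[:limit]


# Made-up labels standing in for terms retrieved from EcoPortal.
labels = ["sea water", "water temperature", "fresh water", "salinity"]
suggestions = suggest("wat", labels)
print(suggestions)
```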

 

 

 

Figure 3a. Metadata Catalogue wizard. Keywords to compile.

 

Figure 3b. Metadata Catalogue wizard. Drop-down list of concepts and classes. 

 

 

Expected/measured impacts

 

The main impacts expected from the integration of semantic artefacts with the LifeWatch Italy Data Portal and Metadata Catalogue are:

 

  • Make ecological (meta)data discovery easier;
  • Make annotation of ecological (meta)data quick and easy for users;
  • Make (meta)data machine-actionable and machine-interpretable;
  • Push towards Linked Open Data.

 

The use of semantic artefacts for the annotation of (meta)data promotes their harmonisation and integration.

 

Reference materials

 

Amdouni, E., Bouazzouni, S., & Jonquet, C. (2022). O'FAIRe makes you an offer: Metadata-based Automatic FAIRness Assessment for Ontologies and Semantic Resources. International Journal of Metadata, Semantics and Ontologies, 16(1), 16-46. https://hal-lirmm.ccsd.cnrs.fr/lirmm-03630233

Tarallo, A., Pulieri, M., Ramezani, P., & Rosati, I. (2024). Advancements in EcoPortal: Enhancing functionalities for the ecological domain semantic artefacts repository. FAIR Connect, 2(1), 1-7. https://doi.org/10.3233/FC-240002

Vaira, L., Fiore, N., & Rosati, I. (2022). LifeWatch ERIC Application Profiles (Version 1). LifeWatch ERIC. https://doi.org/10.48372/8528-9Z45

 

 

 

 

 


Contributors

Ilaria Rosati
Enrica Nestola
Martina Pulieri