PIDs as a cornerstone in actualising the FAIR principles within the LifeWatch infrastructure

PIDs
LifeWatch ERIC


zenodo

LifeWatch ERIC is a European Research Infrastructure Consortium dedicated to advancing e-Science research on biodiversity and ecosystems in support of global sustainability challenges. Our mission is to unite diverse scientific communities and to build an e-Science Research Infrastructure that connects distributed observatories and research centres through unified, accessible online platforms. A main component of our infrastructure, the LifeWatch ERIC Metadata Catalogue, is built on GeoNetwork and manages metadata for several resource types, including Datasets, Research Sites, Services, Virtual Research Environments, and Workflows. It offers advanced search and user management functionalities, supporting resource discovery and access control. We also use semantic technologies to describe and interconnect diverse data sources, lowering barriers to data exchange among researchers. To this end, LifeWatch ERIC's EcoPortal provides a Semantic Artefact Repository for ecological research that consolidates core ontologies, domain-specific vocabularies, and reference lists, together with services for discovering and integrating them.


Persistent identifiers (PIDs) are fundamental in actualising the FAIR principles within our infrastructure. EcoPortal supports the acquisition of Digital Object Identifiers (DOIs) for hosted resources via DataCite services, and lets submitters identify authors and contributors through ORCID iD integration. Institutional affiliations are identified through the Research Organization Registry (ROR). Within the LifeWatch ERIC Metadata Catalogue, registered users who want to create a new resource (Dataset, Virtual Research Environment, etc.) must select the resource type and template and provide the mandatory metadata. After validation and verification, the Catalogue can generate a DOI, through its DataCite connection, for any resource that lacks one. Workflow submissions raise a further challenge: PIDs must be assigned not only to the workflow itself but also to each of its constituent services. This finer granularity of identification is a key design consideration.
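As an illustration of how a DOI registration request can carry ORCID and ROR identifiers, the following minimal Python sketch validates an ORCID iD (using the ISO 7064 MOD 11-2 check character that ORCID iDs carry) and assembles a metadata payload. It is not the EcoPortal or Metadata Catalogue implementation. The function names, the ROR identifier, and the DOI-free payload are hypothetical, and the field names are intended to follow the DataCite Metadata Schema and should be checked against the current DataCite documentation before use. The ORCID iD shown is the public sample iD from ORCID's documentation.

```python
import re

# Format of an ORCID iD: four groups of four, last character a digit or X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD is well formed and its check character matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


def build_doi_payload(title, resource_type, publisher, year, creators):
    """Assemble a DataCite-style payload (hypothetical helper, no network call).

    Each creator is a dict with 'name', 'orcid', 'affiliation' and 'ror' keys.
    """
    creator_entries = []
    for creator in creators:
        if not is_valid_orcid(creator["orcid"]):
            raise ValueError(f"Invalid ORCID iD: {creator['orcid']}")
        creator_entries.append({
            "name": creator["name"],
            "nameType": "Personal",
            "nameIdentifiers": [{
                "nameIdentifier": f"https://orcid.org/{creator['orcid']}",
                "nameIdentifierScheme": "ORCID",
                "schemeUri": "https://orcid.org",
            }],
            "affiliation": [{
                "name": creator["affiliation"],
                "affiliationIdentifier": creator["ror"],
                "affiliationIdentifierScheme": "ROR",
            }],
        })
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "titles": [{"title": title}],
                "creators": creator_entries,
                "publisher": publisher,
                "publicationYear": year,
                "types": {"resourceTypeGeneral": resource_type},
            },
        }
    }


if __name__ == "__main__":
    payload = build_doi_payload(
        title="Example dataset",
        resource_type="Dataset",
        publisher="LifeWatch ERIC",
        year=2023,
        creators=[{
            "name": "Carberry, Josiah",            # ORCID's sample researcher
            "orcid": "0000-0002-1825-0097",        # ORCID's sample iD
            "affiliation": "Example Institute",    # hypothetical
            "ror": "https://ror.org/00example0",   # hypothetical placeholder
        }],
    )
    print(payload["data"]["attributes"]["creators"][0]["nameIdentifiers"])
```

Validating identifiers before the payload is built keeps malformed ORCID iDs out of the registered metadata, where they would be harder to correct.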

Challenges that need to be addressed

Managing dataset granularity in the Virtual Research Environment raises a critical question: how should the provenance of newly composed datasets be handled? The challenge is to decide whether DOIs or other PIDs are assigned to the composed dataset as a whole, to the source subsets from which it was derived, or to both. A second issue is version granularity, which covers versioning of the entire artefact as well as of individual entities and requires a clear link between deprecated and valid entities. Furthermore, we must establish a PID policy that aligns with the EOSC PID Policy.

In response to these challenges, we are developing LifeBlock, a blockchain-based prototype that tracks all activities related to a specific research object and aims to automate PID management. With LifeBlock, we intend to make our infrastructure more robust and to manage complex datasets and their associated PIDs more efficiently.
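The link between deprecated and valid entities, and between a composed dataset and its sources, can be sketched with relation types of the kind defined in the DataCite Metadata Schema (IsPreviousVersionOf, IsNewVersionOf, IsDerivedFrom). The Python example below is only an illustration under stated assumptions: the DOIs are placeholders, the in-memory RECORDS dictionary and its "status" field are hypothetical, and the sketch does not describe how LifeBlock or the Metadata Catalogue store these links.

```python
# Hypothetical in-memory metadata store keyed by placeholder DOIs.
RECORDS = {
    "10.5072/example.v1": {
        "status": "deprecated",  # hypothetical field
        "relatedIdentifiers": [
            {"relatedIdentifier": "10.5072/example.v2",
             "relationType": "IsPreviousVersionOf"},
        ],
    },
    "10.5072/example.v2": {
        "status": "deprecated",
        "relatedIdentifiers": [
            {"relatedIdentifier": "10.5072/example.v1",
             "relationType": "IsNewVersionOf"},
            {"relatedIdentifier": "10.5072/example.v3",
             "relationType": "IsPreviousVersionOf"},
        ],
    },
    "10.5072/example.v3": {
        "status": "valid",
        "relatedIdentifiers": [
            {"relatedIdentifier": "10.5072/example.v2",
             "relationType": "IsNewVersionOf"},
            # Provenance: the composed dataset was derived from two subsets.
            {"relatedIdentifier": "10.5072/subset.a",
             "relationType": "IsDerivedFrom"},
            {"relatedIdentifier": "10.5072/subset.b",
             "relationType": "IsDerivedFrom"},
        ],
    },
}


def related(pid, relation_type, records):
    """Return the identifiers linked from pid with the given relation type."""
    return [
        link["relatedIdentifier"]
        for link in records[pid]["relatedIdentifiers"]
        if link["relationType"] == relation_type
    ]


def resolve_current(pid, records):
    """Follow IsPreviousVersionOf links from pid to the newest version."""
    seen = set()
    while True:
        if pid in seen:
            raise ValueError(f"Version cycle detected at {pid}")
        seen.add(pid)
        successors = related(pid, "IsPreviousVersionOf", records)
        if not successors:
            return pid
        pid = successors[0]


def sources_of(pid, records):
    """Return the PIDs of the subsets a composed dataset was derived from."""
    return related(pid, "IsDerivedFrom", records)


if __name__ == "__main__":
    current = resolve_current("10.5072/example.v1", RECORDS)
    print(current, RECORDS[current]["status"], sources_of(current, RECORDS))
```

In this sketch, both the composed dataset and its source subsets carry their own PIDs, so a citation of a deprecated version can still be resolved to the valid one without losing the provenance trail.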

Expected impact of the Use Case

PIDs are essential for tackling challenges associated with name changes, ensuring unambiguous references to individuals, and accurately attributing credit throughout a researcher's career. They also remove the dependency of digital objects on a particular storage location. Beyond unique identification, PIDs establish links between research entities, creators, and institutions, and these links are expressed as machine-readable metadata in our systems. They enhance the discoverability, accessibility, and usability of research entities and enable precise referencing of specific resource versions. By exposing the origin of a resource, PIDs make it easier to interpret and improve the accuracy and flow of information. They also promote interoperability and strengthen trust through transparency and provenance. This interconnected network of uniquely identified entities forms a robust foundation for assessment and evaluation within our infrastructure.

Expected outputs

While the original aim of PID services was to offer persistence, LifeWatch ERIC aims to go beyond simply assigning persistent identifiers to content. By continuing to address the challenges above and by gaining a better understanding of how PIDs relate to each other and to their broader context, we aspire to establish a holistic research ecosystem.


Contributors

Parham Ramezani - LifeWatch ERIC
Nicola Fiore - LifeWatch ERIC