Released first DOI minted by GSSC

Why are DOIs so important?

In a complex world where data assets are growing at an enormous pace in every field of knowledge, access to scientific data will become extremely difficult and opaque for everyone unless it is guided by clear principles. Any sensible policy on access to scientific data must take into account that data access is now, more than ever, a machine-based activity, and that it should be guided by the criteria of Findability, Accessibility, Interoperability and Reuse of digital assets: the so-called FAIR principles. One important tool in applying the FAIR principles is the Digital Object Identifier (DOI). This is perhaps the most notable reason why the GSSC has now taken a big step forward and is becoming DOI aware.

The first DOI

Because some of our data assets are extremely complex and span very large time intervals, we are proceeding at a slow but steady pace! Decisions taken now will influence the usability of our data infrastructure in the future, so we decided to start with a well-known DOI assignment use case. Today we are launching the first DOI in the GSSC: the DOI for our public 7 day GREAT dataset. You can go and check for yourself here: https://doi.org/10.57780/esa-nlhirh0. And yes, that is what a DOI looks like (although some purists would say the DOI is doi:10.57780/esa-nlhirh0).
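
For readers who handle DOIs in scripts, the two spellings above (the https://doi.org/ link and the doi: form) can be reduced to the same bare identifier. The following is only a minimal sketch: the function names normalize_doi and doi_url are hypothetical, and the pattern used to recognise a DOI is a loose heuristic assumed for illustration, not an official validation rule.

```python
import re

# Loose heuristic (an assumption of this sketch): "10.", a numeric
# registrant code, a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written in front of the bare identifier.
LEADS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(text: str) -> str:
    """Return the bare DOI (prefix/suffix) from any common written form."""
    value = text.strip()
    lowered = value.lower()
    for lead in LEADS:
        if lowered.startswith(lead):
            value = value[len(lead):]
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"not recognised as a DOI: {text!r}")
    # DOI names are case-insensitive, so lower-casing is a safe canonical form.
    return value.lower()


def doi_url(text: str) -> str:
    """Return the resolvable https://doi.org/ link for a DOI in any form."""
    return "https://doi.org/" + normalize_doi(text)
```

With this sketch, both doi:10.57780/esa-nlhirh0 and https://doi.org/10.57780/esa-nlhirh0 reduce to the same identifier, 10.57780/esa-nlhirh0.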

What will a DOI do for us?

What, then, is a DOI and what does it do for us? In the same way that state authorities have an unequivocal means of identifying us, be it an ID number or a National Insurance number, a DOI allows us to refer to and find anything in the digital world – texts, images, videos, data, software, … – in a simple way.

Two of the most important features of a DOI are its uniqueness and its persistence. These are indeed very desirable characteristics to possess. Uniqueness means there is a one-to-one relation between the DOI and the object it stands for – the DOI can stand for the object itself once we need to refer to it – and persistence means it should last in time as long as the object persists. Notice that even here the analogy with a National ID can be easily extended as each of us has a unique ID and IDs will last while we live. The principles and the technical background of the DOIs – there is of course more to a DOI than just uniqueness and persistence – can be found with all manner of details at the DOI foundation website.

Why, then, is this something the GSSC should be concerned with? Perhaps not so surprisingly, there are many reasons. They are mostly encapsulated in the FAIR principles but are easier to understand with a few simple examples. Assigning a DOI to a dataset (or a collection, or a datalab) creates an immutable reference to the object of interest – remember persistence? – and by invoking the same DOI we will always be referring to the same dataset, greatly simplifying the process of referencing data. As it turns out, in the digital world a DOI is represented as a web page – we call it the landing page. As the DOI is persistent, it follows the landing page will also be persistent. Even if the object of interest (a dataset for instance) is moved to a new location, the landing page is always kept at the same URL (this is part of the commitment between the organization that manages the DOIs and the organization publishing the data).

The landing page

The most important information contained in the landing page is a link to the dataset of interest. But a landing page may contain a lot more: a description of the dataset, information on the instrument(s) used to acquire the data, any information required to use the data correctly and, if there are more versions of the dataset of interest, their DOIs (a version of an object is a different object and therefore has a different DOI) and the dates associated with the versions. This means anyone with the appropriate knowledge of the field will always be able to recover any results obtained with the dataset, as all versions of the dataset will be available and documented.

The landing page is meant to facilitate access and make the use of datasets easier for humans. All the metadata available in the landing page will also be available in machine-readable format, making it easily known to search engines and ultimately to users themselves.
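
One common way for a program to ask for machine-readable metadata is DOI content negotiation: the same https://doi.org/ link is requested with an Accept header naming a metadata format instead of a web page. The sketch below only builds such a request and does not send it. It assumes that the registration agency behind the DOI supports content negotiation and the CSL JSON media type used here, which this post does not confirm; the function name metadata_request is hypothetical.

```python
import urllib.request

# Media type for citation metadata in CSL JSON. Whether a given DOI
# answers to it depends on its registration agency (an assumption here).
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str, media_type: str = CSL_JSON) -> urllib.request.Request:
    """Build (but do not send) a request for a DOI's machine-readable metadata."""
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": media_type},
    )


request = metadata_request("10.57780/esa-nlhirh0")

# Sending it needs network access, for example:
#   import json
#   with urllib.request.urlopen(request) as response:
#       record = json.load(response)
```

Without the Accept header, the same link simply leads a browser to the landing page described above.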

The future

This is just the first step in the adoption of DOIs by the GSSC, embracing the FAIR principles as a mechanism to improve the specification of our datasets.

Sincerely, The GSSC Team, Navigation Science Office