What is a PID?

In the current discussion about openness in science and across research fields, Persistent Identifiers (PIDs) play a significant role. They actively support the FAIR Data Principles for good scientific practice and, consequently, Open Science. To conform to these principles, data needs to be Findable, Accessible, Interoperable and Reusable. But how can a Persistent Identifier help with that?

A PID is a unique and permanent identifier that is linked to a document, file, or digital object, but it can also be used to point to an abstract entity (e.g. a time span, location, event, person). The identifier is generated by a specific provider that allows permanent linking via a resolving service. The resolver keeps a record of where the object currently lives, so the object can be found even if its location on the World Wide Web changes. This way, the instability of URLs and broken website links can be avoided. The PID usually resolves to a landing page describing the entity in question, enabling information to be shared across systems. Many different PIDs are used across the scientific sector and its various domains. Below are some of the more commonly known PIDs that can be used for digital objects in cultural heritage and natural history collections:

  • DOI (Digital Object Identifier): is used especially for written text and research outputs, but can also be used for any other purpose. DOIs are provided by registration agencies such as DataCite and Crossref.
  • Handle: relies on a central registry that resolves each identifier to the object's current location. The Handle System also underpins the technical infrastructure of DOIs, which are a special type of Handle.
  • ARK (Archival Resource Key): is a PID commonly used by archives and collections. ARKs are a non-proprietary, decentralised service which must be managed individually by each institution.
  • PURL (Persistent Uniform Resource Locator): resolves to the current location of an object by using standard HTTP redirect status codes.
  • GND (Gemeinsame Normdatei, Integrated Authority File): is used in library science and other humanities research environments to identify persons, corporate bodies, geographical features, works, concepts and events.
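
To illustrate how a resolving service turns an identifier into a working link, here is a minimal Python sketch. It only builds the resolver URL and does not contact any service. The function name resolver_url and the table RESOLVERS are hypothetical names chosen for this example, the base URLs are the public resolvers commonly used for each scheme at the time of writing, and the ARK value in the usage lines is a made-up placeholder.

```python
from urllib.parse import quote

# Public resolver base URLs per PID scheme (assumption: these are the
# commonly used global resolvers; an institution may run its own).
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
    "gnd": "https://d-nb.info/gnd/",
}


def resolver_url(scheme: str, identifier: str) -> str:
    """Build the URL under which a resolver redirects to the object."""
    key = scheme.lower()
    if key not in RESOLVERS:
        raise ValueError(f"no resolver known for scheme: {scheme!r}")
    pid = identifier.strip()
    # DOIs are often written with a "doi:" label, which the resolver
    # does not need. ARKs keep their "ark:" label as part of the path.
    if key == "doi" and pid.lower().startswith("doi:"):
        pid = pid[4:]
    # Percent-encode characters that are unsafe in a URL path, but keep
    # the slashes and colons that structure the identifier.
    return RESOLVERS[key] + quote(pid, safe="/:")


print(resolver_url("doi", "doi:10.1000/182"))
print(resolver_url("ark", "ark:/12345/x5example"))  # placeholder ARK
```

Because a DOI is a special type of Handle, the same DOI string can in principle also be passed with the "handle" scheme. The link that is shared and cited stays the same, while the resolver is responsible for redirecting it to wherever the landing page currently lives.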

Opening Up Research Infrastructure

As previously mentioned, PIDs can facilitate the opening up of research infrastructures. As an individual researcher, you can register for an ORCID iD (Open Researcher and Contributor ID) or an ISNI (International Standard Name Identifier), for example. Large research databases such as Scopus and Web of Science also assign a system-bound PID to researchers. The ROR (Research Organization Registry) standard is commonly used for organisations. Assigning a Persistent Identifier to every entity in a research process, including data, instruments, projects, and heritage objects, makes these processes easier for the public to access and understand. PIDs also support interoperability and the unambiguity of information in a rather diverse field.

What is the best PID for your collection items?

The choice of a PID standard for digital objects is generally a matter of preference and institutional resources. The advantages are manifold, but it is important to bear in mind that, despite the persistence of the identifiers themselves, there can be no guarantee that the service maintaining the system will continue to operate. If the service provider ceases to exist, the quality management of the identifiers ceases, too. The most suitable type depends on your needs, the size of your collection and institution, and your human, technical, and financial resources.

The DOI is widely used for a variety of objects, particularly scholarly publications. The diversity of its application scenarios suggests that the service will remain stable in the future. While the identifiers can be assigned by each institution individually, they must be acquired from a DOI registration agency such as Crossref or DataCite, which makes them a proprietary format. On the other hand, DOIs free you from the need to manage the resolving service. Many national agencies already offer DOI assignment and resolution services.

Similar to DOIs, URNs (Uniform Resource Names) are well established for data and digital objects. There are also national agencies that provide assignment and resolution services.

ARKs were developed at the California Digital Library and have not yet reached the level of coverage of DOIs. The ARK Alliance was launched in 2018 and is now supported by about 50 institutions. ARKs are non-proprietary, decentralised and cost-effective, as the use of the identifiers themselves is free of charge. They are not bound to a metadata scheme and can be created without any descriptive information. However, the effort involved in assigning, hosting, monitoring and forwarding the content is higher, because each institution handles these tasks itself.

PIDs for Natural History Data

In theory, there is no single Persistent Identifier dedicated to a particular field; there are only recommendations and best practices. You can assign all of the above-mentioned identifiers to your digital specimen records or other digital data. Nevertheless, certain PIDs were specially designed for natural history data, such as the IGSN (International Generic Sample Number; technically a form of DOI) for physical samples or the CETAF Stable Identifier for specimens. In addition, there are numerous standards and taxonomies whose identifiers can be assigned to people, organizations, taxonomic concepts and names, as well as geographic places. Visit SPNHC to learn more about enriching your datasets.

PIDs and Europeana

In order to adhere to the FAIR Data Principles, Europeana promotes the use of Persistent Identifiers in cultural heritage institutions. The conception of the Common European Data Space for Cultural Heritage stressed the need for a PID usage policy. Europeana will further explore the best PID practices, identifying challenges and future actions towards sustainable practice.

Including a PID in the Europeana metadata is not mandatory but highly recommended, since it contributes to Open Science and the interoperability of cultural heritage data. Europeana will soon update its metadata scheme, incorporating new elements to support the inclusion of PIDs. View the documentation of the various metadata schemes on our website to find information on metadata fields used for PIDs.

Further Information