Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Making heterogeneous specimen data ‘FAIR’: Implementing a digital specimen repository

Nieva De La Hidalga, Abraham ORCID: https://orcid.org/0000-0001-7348-7612 and Hardisty, Alex ORCID: https://orcid.org/0000-0002-0767-4310 2019. Making heterogeneous specimen data ‘FAIR’: Implementing a digital specimen repository. Presented at: Biodiversity_Next 2019, Leiden, The Netherlands, 21-25 October 2019. Biodiversity Information Science and Standards, vol. 3. https://biss.pensoft.net/: Pensoft, e37163. 10.3897/biss.3.37163

PDF - Published Version
Available under License Creative Commons Attribution.


Abstract

The definition of a digital specimen is proposed to encompass the digital representation(s) of physical specimens from natural science collections. The digital specimen concept is intended to define a representation (digital object) that brings together an array of heterogeneous data types, which are themselves alternative representations of the physical specimen. In this case, the digital specimen (DS) holds references to specimen data from a collection management system, images, 3D models, research articles, DNA sequences, collector information, and many other data types. The proposal is to create persistent relationships between the DS and other categories of digital objects (e.g. the resource types mentioned above, collections, storage platforms, organisations, databases, and provenance data). Complying with the FAIR data principles (findability, accessibility, interoperability, and reuse), i.e. achieving data ‘FAIRness’, eases data integration, which is needed for cross-disciplinary linking and for combining data from different domains, and makes the DS a comprehensive package of information about a specimen.

Implementing, and providing access to, a digital specimen repository (DSR) as a Digital Object Architecture (Sharp 2016) component demonstrates the alignment of the DS concept with the FAIR data principles (Wilkinson et al. 2016, Kahn and Wilensky 2006). The DSR fulfils four roles: data producer, resource manager, data publisher, and collaboration space. As data producer, the DSR allows acquisition and curation (indexing, storage) of DSs linking primary data, models, analyses, and other digital object types. As resource manager, the DSR manages access to distributed platforms, ranging from acquisition networks (digitisation stations, museums, herbaria) to processing services, advanced computational resources, data asset storage systems, and specialised servers. As data publisher, the DSR provides access to data assets from national and transnational data archives. As collaboration space, the DSR supports users in accessing, sharing, and (re)using data assets and derived data products and services. Adopting the collaboration space and data publisher roles, the DSR implements interfaces that expose the DSs to the research community, fulfilling the FAIR findability, accessibility, and reuse principles. Adopting the data producer and resource manager roles, the DSR creates the meaningful and persistent relationships required to link DSs and other types of digital objects, fulfilling the FAIR interoperability principle.

A prototype DSR based on the Cordra digital object repository has been deployed (Corporation for National Research Initiatives (CNRI) 2018, Reilly and Tupelo-Schneck 2010). The advantages of Cordra are: rapid deployment, a customisable object model, creation of relations between digital objects, and application programming interfaces (APIs) for programmatic access. Rapid deployment of the DSR provides a tangible target for discussing the implementation of the DS concept. The customisable object model enables the DS definition to be refined and enhanced in response to feedback from colleagues who have accessed the DSR and used its contents. Creating relations between digital objects enables flexible linking to digital objects stored in different repositories. Accessing the DSR programmatically through APIs enables the use of the repository to be extended to different platforms (e.g. mobile devices) and allows integration with other repositories and services. As well as supporting an HTTP-oriented API, Cordra implements the Digital Object Interface Protocol (DONA Foundation 2018), allowing the definition of operations that act directly on selected DSs in the repository.

The DSR prototype has been demonstrated both through the repository's administrative interface and through a custom interface designed to facilitate access by different user groups, such as collection curators, researchers, teachers, and students. The client interface has been designed to demonstrate a subset of the functionalities derived from user stories, which describe software features from the end-user perspective. Demonstrating the DSR capabilities as proposed will inform the refinement of the design of the DS model and provide early feedback about the software features needed.
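
To make the idea of a DS that holds typed, persistent references to other digital objects more concrete, the following is a minimal Python sketch. It is not the authors' object model and not Cordra's schema: the class names, the relation labels ("hasImage", "hasSequence"), the field names, and the "example/..." identifiers are all hypothetical placeholders chosen for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Relation:
    """A typed link from a digital specimen to another digital object."""
    relation_type: str  # hypothetical label, e.g. "hasImage"
    target_id: str      # persistent identifier of the linked object (placeholder)


@dataclass
class DigitalSpecimen:
    """Illustrative DS: an identifier plus references to related digital objects."""
    identifier: str
    physical_specimen_id: str
    relations: List[Relation] = field(default_factory=list)

    def link(self, relation_type: str, target_id: str) -> Relation:
        """Add a relation unless an identical one is already recorded."""
        relation = Relation(relation_type, target_id)
        if relation not in self.relations:
            self.relations.append(relation)
        return relation

    def targets(self, relation_type: str) -> List[str]:
        """Return the identifiers linked under the given relation type."""
        return [r.target_id for r in self.relations
                if r.relation_type == relation_type]

    def to_dict(self) -> dict:
        """Serialise to a plain dictionary suitable for a JSON document."""
        return {
            "id": self.identifier,
            "physicalSpecimenId": self.physical_specimen_id,
            "relations": [
                {"type": r.relation_type, "target": r.target_id}
                for r in self.relations
            ],
        }


# All identifiers below are invented placeholders, not real handles.
ds = DigitalSpecimen(identifier="example/ds-0001",
                     physical_specimen_id="HERB-000123")
ds.link("hasImage", "example/img-0001")
ds.link("hasImage", "example/img-0002")
ds.link("hasSequence", "example/seq-0001")
ds.link("hasImage", "example/img-0001")  # duplicate link, ignored

print(json.dumps(ds.to_dict(), indent=2))
```

Keeping the links as identifier references rather than embedded copies reflects the abstract's point that the linked objects may be stored in different repositories; how a real DSR records and resolves such links is outside the scope of this sketch.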

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Chemistry
Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QH Natural history
Uncontrolled Keywords: digital specimen repository, digital specimen, natural history collection, digitisation, FAIR
Publisher: Pensoft
Funders: Horizon 2020 Framework Programme of the European Union, H2020-INFRADEV-2016-2017 Grant Agreement No. 777483
Date of First Compliant Deposit: 29 July 2019
Date of Acceptance: 12 June 2019
Last Modified: 10 Dec 2022 02:25
URI: https://orca.cardiff.ac.uk/id/eprint/124491
