Cardiff University | Prifysgol Caerdydd
ORCA: Online Research @ Cardiff

Predicting distance and direction from text locality descriptions for biological specimen collections

Liao, Ruoxuan, Das, Pragyan P, Jones, Christopher B (ORCID: https://orcid.org/0000-0001-6847-7575), Aflaki, Niloofar and Stock, Kristin 2022. Predicting distance and direction from text locality descriptions for biological specimen collections. Presented at: 15th International Conference on Spatial Information Theory (COSIT 2022), Kobe, Japan, 5-9 September 2022. Leibniz International Proceedings in Informatics. International Conference on Spatial Information Theory, vol. 240 (4). Leibniz, Germany: Dagstuhl Publishing, pp. 1-15. DOI: 10.4230/LIPIcs.COSIT.2022.4

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

A considerable proportion of records that describe biological specimens (flora, soil, invertebrates), and especially those that were collected decades ago, are not attached to corresponding geographical coordinates, but rather have their location described only through textual descriptions (e.g. North Canterbury, Selwyn River near bridge on Springston-Leeston Rd). Without geographical coordinates, millions of records stored in museum collections around the world cannot be mapped. We present a method for predicting the distance and direction associated with human language location descriptions which focuses on the interpretation of geospatial prepositions and the way in which they modify the location represented by an associated reference place name (e.g. near the Manawatu River). We study eight distance-oriented prepositions and eight direction-oriented prepositions and use machine learning regression to predict distance or direction, relative to the reference place name, from a collection of training data. The results show that, compared with a simple baseline, our model improved distance predictions by up to 60% and direction predictions by up to 31%.
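The abstract describes predicting a distance or a direction relative to a reference place name. As a minimal sketch of what those two quantities could look like numerically, the following computes the great-circle distance and initial compass bearing from a reference point to a specimen location. This is an illustration only: the function name, the choice of the haversine formula, and the example coordinates are assumptions and are not taken from the paper, whose features and regression model are not described on this page.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def distance_and_bearing(ref_lat, ref_lon, lat, lon):
    """Return (distance_km, bearing_deg) from a reference point to a target.

    Hypothetical helper, not the paper's code. Distance uses the haversine
    formula; bearing is the initial bearing in degrees clockwise from north,
    in the range [0, 360).
    """
    phi1, phi2 = math.radians(ref_lat), math.radians(lat)
    dphi = phi2 - phi1
    dlmb = math.radians(lon - ref_lon)

    # Haversine great-circle distance
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

    # Initial bearing from the reference point towards the target
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing_deg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

    return distance_km, bearing_deg


if __name__ == "__main__":
    # Made-up coordinates for illustration; not real georeferenced records.
    ref = (-43.60, 172.30)       # assumed location of a reference place name
    specimen = (-43.55, 172.35)  # assumed true collection location
    d, b = distance_and_bearing(*ref, *specimen)
    print(f"distance: {d:.2f} km, bearing: {b:.1f} degrees")
```

Under these assumptions, pairs of (distance, bearing) computed from already georeferenced records could serve as regression targets, with the preposition (for example "near") and properties of the reference place as inputs.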

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: Dagstuhl Publishing
ISBN: 978-3-95977-257-0
Funders: New Zealand Ministry of Business, Innovation and Employment Endeavour Fund
Date of First Compliant Deposit: 23 October 2025
Date of Acceptance: 21 March 2022
Last Modified: 24 Oct 2025 15:15
URI: https://orca.cardiff.ac.uk/id/eprint/181855
