Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Detecting geospatial location descriptions in natural language text

Stock, Kristin, Jones, Christopher B. ORCID: https://orcid.org/0000-0001-6847-7575, Russell, Shaun, Radke, Mansi, Das, Prarthana and Aflaki, Niloofar 2022. Detecting geospatial location descriptions in natural language text. International Journal of Geographical Information Science 36 (3), pp. 547-584. 10.1080/13658816.2021.1987441

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

References to geographic locations are common in text data sources, including social media and web pages. They take different forms, from simple place names to relative expressions that describe location through a spatial relationship to a reference object (e.g. the house beside the Waikato River). Complex multi-word phrases are often employed (e.g. the road and railway cross at right angles; the road in line with the canal), in which spatial relationships are communicated with various parts of speech, including prepositions, verbs, adverbs and adjectives. We address the problem of automatically detecting relative geospatial location descriptions, which we define as those that include spatial relation terms referencing geographic objects, and distinguishing them from non-geographical descriptions of location (e.g. the book on the table). We experiment with several methods for automated classification of text expressions, using machine learning features that include bag-of-words features that detect distinctive words, word embeddings that encode the meanings of words, and manually identified language patterns that characterise geospatial expressions. Using three data sets created for this study, we find that ensemble and meta-classifier approaches, which variously combine predictions from several other classifiers with data features, achieve the best F-measure, 0.90, for detecting geospatial expressions.
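
The classification task described in the abstract can be illustrated with a toy sketch. The following is not the authors' implementation: the training examples, word lists, pattern and majority-vote rule are all hypothetical, chosen only to show how a bag-of-words classifier, a hand-written language pattern and a simple lexicon lookup might be combined in an ensemble. Word embeddings and the meta-classifier approach are omitted to keep the example within the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical toy training data: (expression, is_geospatial).
TRAIN = [
    ("the house beside the river", True),
    ("the road in line with the canal", True),
    ("the school near the lake", True),
    ("the book on the table", False),
    ("the cup beside the laptop", False),
    ("the keys in the drawer", False),
]

# Hypothetical lexicons; a real system would need far larger ones.
SPATIAL_TERMS = ("in line with", "beside", "near", "across", "along", "on", "in")
GEO_NOUNS = ("river", "canal", "lake", "road", "railway", "mountain", "harbour")

# Pattern: spatial term, optional "the", optional one-word name, geographic noun.
PATTERN = re.compile(
    r"\b(?:" + "|".join(SPATIAL_TERMS) + r")\s+(?:the\s+)?(?:\w+\s+)?(?:"
    + "|".join(GEO_NOUNS) + r")\b"
)


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(examples):
    """Count tokens and documents per class for a bag-of-words Naive Bayes model."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, label in examples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def nb_predict(model, text):
    """Pick the class with the higher log-probability (add-one smoothing)."""
    counts, docs, vocab = model
    scores = {}
    for label in (True, False):
        total = sum(counts[label].values())
        score = math.log(docs[label] / sum(docs.values()))
        for tok in tokenize(text):
            score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]


def pattern_predict(text):
    """Language-pattern classifier: spatial term followed by a geographic noun."""
    return PATTERN.search(text.lower()) is not None


def lexicon_predict(text):
    """Lexicon classifier: does any token name a geographic feature type?"""
    return any(tok in GEO_NOUNS for tok in tokenize(text))


def ensemble_predict(model, text):
    """Majority vote over the three base classifiers."""
    votes = [nb_predict(model, text), pattern_predict(text), lexicon_predict(text)]
    return sum(votes) >= 2


MODEL = train_naive_bayes(TRAIN)
print(ensemble_predict(MODEL, "the house beside the Waikato River"))  # True
print(ensemble_predict(MODEL, "the book on the table"))  # False
```

In this sketch the two example expressions from the abstract are decided by the pattern and lexicon votes alone, so the outcome does not depend on the tiny training set. A majority vote is only one way to combine classifiers; the abstract also mentions meta-classifiers that use the base predictions together with data features.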

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Additional Information: This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Publisher: Taylor and Francis
ISSN: 1365-8816
Date of First Compliant Deposit: 4 March 2022
Date of Acceptance: 26 September 2021
Last Modified: 18 May 2023 11:13
URI: https://orca.cardiff.ac.uk/id/eprint/147975
