Extracting geographic relations from large social media text data
Journal article   Open access   Peer reviewed

Yi-Yun Cheng and Ly Dinh
International Journal of Geographical Information Science, pp.1-29
05/28/2025

Abstract

Keywords: geographic information extraction; social media; semantic relations; spatial relations; large language models
Extracting geographical information from a large corpus of social media text is useful for monitoring live events such as natural disasters and public health crises. However, the noisy nature of these texts creates challenges for reliable geoparsing, geocoding, and geotagging. These challenges can be remedied with relation extraction (RE) to efficiently identify geographic relations, but existing RE solutions are not domain-specific for geographic contexts. In this study, we domain-adapt and validate existing RE models to identify variations of geographic relations used in unstructured texts. We analyze 163,037 tweets containing county and state names of the United States, with 1,672 annotated tweets as training data. We apply five RE models with domain adaptation: (1) our own heuristics-based model; (2) OpenIE; (3) BERT; (4) GPT-3.5-turbo; and (5) Mistral. We identify a list of lexically similar but semantically different relations and categorize them into meronymic, prepositional, "include", "locate", geographic noun, and other spatial relations. The contributions of this study are twofold: (1) we provide a list of geographic relations that can be used for GIS tasks that require more accurate detection of location from text; (2) we show that the performance of large language models (LLMs) improved with domain-specific training for geographic relation extraction.
Accepted Manuscript (AM) International Journal of Geographical Information Science Open Access
https://doi.org/10.1080/13658816.2025.2510419
Version of Record (VoR) International Journal of Geographical Information Science Restricted
