Extracting and Geocoding Locations in Social Media Posts: a Comparative Analysis

  • Helen Ngonidzashe Serere
  • Bernd Resch
  • Clemens Havas
  • Andreas Petutschnig

Research output: Contribution to journal > Article > peer-review

Abstract

Geo-social media have become an established data source for spatial analysis of geographic and social processes in various fields. However, only a small share of geo-social media data are explicitly georeferenced, which often compromises the reliability of the analysis results by excluding large volumes of data from the analysis. To increase the number of georeferenced tweets, inferred locations can be extracted from the texts of social media posts. We propose a customized workflow for location extraction from tweets and subsequent geocoding. We compare the results of two methods: DBpedia Spotlight (using linked Wikipedia entities), and spaCy combined with the geocoding methods of OpenStreetMap Nominatim. The results suggest that the workflow using spaCy and Nominatim identifies more locations than DBpedia Spotlight. For 50,616 tweets posted within California, USA, the granularity of the extracted locations is reasonable. However, several directions for future research were identified, including improved semantic analysis, the creation of a cascading workflow, and the need to integrate different data sources in order to increase reliability and spatial accuracy.
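
The spaCy and Nominatim workflow named in the abstract could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the choice of entity labels (`GPE`, `LOC`, `FAC`), the model name `en_core_web_sm`, the `User-Agent` string, and the helper names (`spacy_extractor`, `nominatim_geocode`, `geocode_posts`) are all hypothetical. The extractor and geocoder are passed in as functions so that either step can be swapped out, for example for a DBpedia Spotlight client.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Entity labels treated as locations. This selection is an assumption,
# not something stated in the abstract.
LOCATION_LABELS = {"GPE", "LOC", "FAC"}

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def spacy_extractor(model_name: str = "en_core_web_sm") -> Callable[[str], list]:
    """Return a function that lists location-like entity strings in a text."""
    import spacy  # imported lazily so the rest of the sketch runs without spaCy

    nlp = spacy.load(model_name)

    def extract(text: str) -> list:
        return [ent.text for ent in nlp(text).ents if ent.label_ in LOCATION_LABELS]

    return extract


def build_nominatim_url(place: str) -> str:
    """Build a Nominatim search URL that asks for the single best match."""
    query = urllib.parse.urlencode({"q": place, "format": "jsonv2", "limit": 1})
    return f"{NOMINATIM_URL}?{query}"


def nominatim_geocode(place: str) -> Optional[tuple]:
    """Geocode one place name; return (lat, lon), or None if nothing matched."""
    request = urllib.request.Request(
        build_nominatim_url(place),
        headers={"User-Agent": "location-extraction-sketch"},  # hypothetical app name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        results = json.load(response)
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode_posts(posts, extract, geocode):
    """Map each post to a list of (place name, coordinates) pairs."""
    cache = {}  # avoids geocoding the same place name twice
    output = []
    for text in posts:
        hits = []
        for place in extract(text):
            if place not in cache:
                cache[place] = geocode(place)
            if cache[place] is not None:
                hits.append((place, cache[place]))
        output.append(hits)
    return output
```

With spaCy and a model installed, a call such as `geocode_posts(tweets, spacy_extractor(), nominatim_geocode)` would run the two steps end to end. The public Nominatim service expects an identifying `User-Agent` and a low request rate, so a run over tens of thousands of posts would need throttling or a self-hosted instance.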
Original language: English
Journal: GI_Forum
Publication status: Published - 2021
