TweetNorm: a benchmark for lexical normalization of Spanish tweets

Author
Alegria, I.; Aranberri, N.; Comas, P.R.; Fresno, V.; Gamallo, P.; Padro, L.; San Vicente, I.; Turmo, J.; Zubiaga, A.
Type of activity
Journal article
Journal
Language Resources and Evaluation
Date of publication
2015-12-01
Volume
49
Number
4
First page
883
Last page
905
DOI
https://doi.org/10.1007/s10579-015-9315-6
Project funding
Adquisición de escenarios de conocimiento a través de la lectura de textos: inferencia de relaciones entre eventos (SKATeR)
Repository
http://hdl.handle.net/2117/80964
Abstract
The language used in social media is often characterized by the abundance of informal and non-standard writing. The normalization of this non-standard language can be crucial to facilitate the subsequent textual processing and to consequently help boost the performance of natural language processing tools applied to social media text. In this paper we present a benchmark for lexical normalization of social media posts, specifically for tweets in the Spanish language. We describe the tweet normalizat...
Citation
Alegria, I., Aranberri, N., Comas, P.R., Fresno, V., Gamallo, P., Padro, L., San Vicente, I., Turmo, J., Zubiaga, A. TweetNorm: a benchmark for lexical normalization of Spanish tweets. "Language Resources and Evaluation", 1 December 2015, vol. 49, no. 4, pp. 883-905.
Keywords
Corpus, Evaluation, Lexical normalization, Social media, Twitter
Group of research
GPLN - Natural Language Processing Group
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications

Participants

  • Alegria, Iñaki  (author)
  • Aranberri, Nora  (author)
  • Comas Umbert, Pere Ramon  (author)
  • Fresno, Víctor  (author)
  • Gamallo Otero, Pablo  (author)
  • Padró Cirera, Lluís  (author)
  • San Vicente Roncal, Iñaki  (author)
  • Turmo Borras, Jorge  (author)
  • Zubiaga, Arkaitz  (author)
