Developing competitive HMM PoS taggers using small training corpora

Author
Padro, M.; Padro, L.
Type of activity
Report
Date
2004-06
Code
LSI-04-36-R
Repository
http://hdl.handle.net/2117/97920
Abstract
This paper presents a study aiming to find the best strategy for developing a fast and accurate HMM tagger when only a limited amount of training material is available. This is a crucial factor when dealing with languages for which annotated material is not easily available. First, we run experiments in English, using the WSJ corpus as a test bench to establish the differences caused by using a large or a small training set. Then, we port the results to develop an accurate Spanish Po...
Citation
Padro, M., Padro, L. "Developing competitive HMM PoS taggers using small training corpora". 2004.
Keywords
HMM PoS taggers, NLP, Natural language, Training corpora
Research group
GPLN - Natural Language Processing Group
IDEAI-UPC - Intelligent Data Science and Artificial Intelligence Research Center
TALP - Centre for Language and Speech Technologies and Applications
