The entropy of words-learnability and expressivity across more than 1000 languages

Author
Bentz, C.; Alikaniotis, D.; Cysouw, M.; Ferrer-i-Cancho, R.
Type of activity
Journal article
Journal
Entropy: international and interdisciplinary journal of entropy and information studies
Date of publication
2017-06-01
Volume
19
Number
6
First page
1
Last page
32
DOI
https://doi.org/10.3390/e19060275
Repository
http://hdl.handle.net/2117/106703
URL
http://www.mdpi.com/1099-4300/19/6/275
Abstract
The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics and language sciences more generally. Information theory gives us tools at hand to measure precisely the average amount of choice associated with words: the word entropy. Here, we use three parallel corpora, encompassing ca. 450 million words in 1916 texts and 1259 languages, to tackle some of the major conceptual and practical problems of w...
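The quantity the abstract refers to, the unigram word entropy, can be illustrated with a minimal sketch. The snippet below uses the standard maximum-likelihood (plug-in) estimator over word frequencies; the article itself compares several more sophisticated estimators on large parallel corpora, so this is only a toy illustration, and the example sentence is invented for demonstration.

```python
from collections import Counter
from math import log2

def unigram_entropy(words):
    """Plug-in estimate of unigram word entropy in bits:
    H = -sum_w p(w) * log2 p(w), where p(w) is the relative
    frequency of word type w in the token list."""
    counts = Counter(words)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Hypothetical example text, tokenized naively on whitespace.
tokens = "the cat sat on the mat the cat slept".split()
h = unigram_entropy(tokens)  # average choice per word token, in bits
```

Note that this plug-in estimator is known to underestimate the true entropy for small samples, which is one of the practical problems the article addresses.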
Citation
Bentz, C., Alikaniotis, D., Cysouw, M., Ferrer-i-Cancho, R. The entropy of words-learnability and expressivity across more than 1000 languages. "Entropy: international and interdisciplinary journal of entropy and information studies", 1 June 2017, vol. 19, no. 6, p. 1-32.
Keywords
Entropy rate, Quantitative language typology, Unigram entropy
Group of research
LARCA - Laboratory of Relational Algorithmics, Complexity and Learnability

Participants

  • Bentz, Chris  (author)
  • Alikaniotis, Dimitrios  (author)
  • Cysouw, Michael  (author)
  • Ferrer Cancho, Ramon  (author)
