In this paper, an extension of n-grams, called x-grams, is proposed. In this extension, the memory of the model (n) is not fixed a priori. Instead, large memories are accepted first, and merging criteria are then applied to reduce the complexity and to ensure reliable estimations. The results show that the perplexity obtained with x-grams is smaller than that of n-grams. Furthermore, the complexity is smaller than that of trigrams and can become close to that of bigrams.
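The idea of accepting long histories first and then merging contexts can be sketched as follows. This is only an illustrative sketch: the paper's actual merging criteria are not given in the abstract, so a simple count threshold (`min_count`, a hypothetical parameter) stands in for them, and the back-off scheme is an assumption of this sketch rather than the authors' method.

```python
from collections import defaultdict, Counter

def train_xgram(tokens, max_order=4, min_count=2):
    """Count all n-grams up to max_order, then merge away long contexts
    seen too rarely for reliable estimation, so the model falls back to
    a shorter history there. The count threshold is a stand-in for the
    paper's merging criteria (assumption of this sketch)."""
    counts = defaultdict(Counter)  # context tuple -> counts of next word
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            ctx, w = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
            counts[ctx][w] += 1
    # Merging step: keep the empty context plus contexts with enough mass.
    return {ctx: c for ctx, c in counts.items()
            if len(ctx) == 0 or sum(c.values()) >= min_count}

def prob(model, history, word, max_order=4):
    """Use the longest retained context that predicts `word`; the
    effective memory thus varies across the model, as with x-grams."""
    for k in range(min(len(history), max_order - 1), -1, -1):
        ctx = tuple(history[len(history) - k:])
        if ctx in model and word in model[ctx]:
            c = model[ctx]
            return c[word] / sum(c.values())
    return 0.0
```

With a higher threshold, more long contexts are merged and predictions rely on shorter histories, trading memory (and complexity) against sharpness of the estimates.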
Bonafonte, A., Mariño, J. B. "Language modeling using x-grams". In: Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), Philadelphia, PA, 1996, pp. 394-397.