Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
The fact that the World Wide Web is being used for various purposes also implies that users may have various information quality factors to consider according to their current co...
In this paper, we address the question of what kind of knowledge is generally transferable from unlabeled text. We suggest and analyze the semantic correlation of words as a gener...
For the patent classification task of the 2010 CLEF-IP evaluation, we used three different approaches combining semantics- and statistics-driven techniques: the first approach is b...
Franck Derieux, Mihaela Bobeica, Delphine Pois, Je...
Recently, web-based educational systems have been collecting vast amounts of data on user patterns, and data mining methods can be applied to these databases to discover interesting associations...
Behrouz Minaei-Bidgoli, Gerd Kortemeyer, William F...