We present an approximation to the Bayesian hierarchical Pitman-Yor process language model which maintains the power-law distribution over word tokens, while not requiring a comput...
This paper describes a query algebra for queries over XML P2P databases that provides explicit mechanisms for modeling data dissemination, replication constraints, and for capturi...
The ultimate goal of data visualization is to clearly portray features relevant to the problem being studied. This goal can be realized only if users can effectively communicate t...
C. Ryan Johnson, Markus Glatter, Wesley Kendall, J...
In this paper, we describe CALM, a method for building statistical language models for the Web. CALM addresses several unique challenges in dealing with Web content. First, CALM...
We present a critique of language-based modelling for text input research, and propose an alternative input-based approach. Current language-based statistical models are derived fr...