We describe a novel class of distributions, called Mondrian processes, which can be interpreted as probability distributions over kd-tree data structures. Mondrian processes are m...
Accurately estimating probabilities from observations is important for probabilistic-based approaches to problems in computational biology. In this paper we present a biologically...
We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequ...
Fernando C. N. Pereira, Naftali Tishby, Lillian Le...
Most probabilistic classi ers used for word-sense disambiguationhave either been based on onlyone contextual feature or have used a model that is simply assumed to characterize th...
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor pr...