We present a joint model for Chinese word segmentation and new word detection. We present high dimensional new features, including word-based features and enriched edge (label-tra...
Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such a...
Is the real problem in resolving correspondence using current stereo algorithms the lack of the "right" matching criterion? In studying the related task of reconstructin...
The practice of software development can likely be improved if an externalized model of each programmer's knowledge of a particular code base is available. Some tools already...
We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1? ) factor. It is known that if the stream may consist of arbitra...