We describe here a method for automatically identifying word sense variation in a dated collection of historical books in a large digital library. By leveraging a small set of kno...
Many semi-supervised learning algorithms only deal with binary classification. Their extension to the multi-class problem is usually obtained by repeatedly solving a set of bina...
We present a data structure for storing huge natural language data sets that is very efficient in terms of both space and access speed. The structure is described as LZ (Ziv Lempel) compresse...
The general goal of data mining is to extract interesting correlated information from large collections of data. A key computationally intensive subproblem of data mining involves ...
Multi-task learning aims at combining information across tasks to boost prediction performance, especially when the number of training samples is small and the number of predictor...