We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...
This paper stresses the contribution of the process of knowledge discovery in databases for the effective creation and sharing of organizational knowledge. The focus on the proces...
A good distance metric is crucial for many data mining tasks. To learn a metric in the unsupervised setting, most metric learning algorithms project observed data to a lowdimensio...
—Subspaces offer convenient means of representing information in many pattern recognition, machine vision, and statistical learning applications. Contrary to the growing populari...
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to d...