Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr&eg...
An active learner has a collection of data points, each with a label that is initially hidden but can be obtained at some cost. Without spending too much, it wishes to find a cla...
As a new paradigm for data warehousing demanded by today's decision support community, DW 2.0 recognized the life cycle of data with it, that make metadata evolution mechanis...
The problem of automatically extracting the most interesting and relevant keyword phrases in a document has been studied extensively as it is crucial for a number of applications. ...
—A Bloom filter is an effective, space-efficient data structure for concisely representing a set, and supporting approximate membership queries. Traditionally, the Bloom filter a...
Deke Guo, Jie Wu, Honghui Chen, Ye Yuan, Xueshan L...