Sciweavers

KDD
2008
ACM
An inductive database prototype based on virtual mining views
We present a prototype of an inductive database. Our system enables the user to query not only the data stored in the database but also generalizations (e.g. rules or trees) over ...
Élisa Fromont, Adriana Prado, Bart Goethals...
KDD
2008
ACM
Locality sensitive hash functions based on concomitant rank order statistics
Locality Sensitive Hash functions are invaluable tools for approximate near neighbor problems in high dimensional spaces. In this work, we are focused on LSH schemes where the si...
Kave Eshghi, Shyamsundar Rajaram
KDD
2008
ACM
Structured entity identification and document categorization: two tasks with one joint model
Traditionally, research in identifying structured entities in documents has proceeded independently of document categorization research. In this paper, we observe that these two t...
Indrajit Bhattacharya, Shantanu Godbole, Sachindra...
KDD
2008
ACM
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD
2008
ACM
Extracting shared subspace for multi-label classification
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye
KDD
2008
ACM
The future of image search
Jitendra Malik
KDD
2008
ACM
Fast logistic regression for text categorization with variable-length n-grams
Gökhan H. Bakir, Georgiana Ifrim, Gerhard Wei...
KDD
2008
ACM
Using predictive analysis to improve invoice-to-cash collection
Sai Zeng, Prem Melville, Christian A. Lang, Ioana ...
KDD
2008
ACM
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry
Text classification has matured as a research discipline over the last decade. Independently, business intelligence over structured databases has long been a source of insights fo...
Shantanu Godbole, Shourya Roy
KDD
2008
ACM
Topical query decomposition
We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting...
Francesco Bonchi, Carlos Castillo, Debora Donato, ...