Sciweavers

KDD
2008
ACM
Data Mining
Anticipating annotations and emerging trends in biomedical literature
The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therap...
Bernd Wachmann, Dmitriy Fradkin, Fabian Mörch...
KDD
2008
ACM
Get another label? improving data quality and data mining using multiple, noisy labelers
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated la...
Victor S. Sheng, Foster J. Provost, Panagiotis G. ...
KDD
2008
ACM
ArnetMiner: extraction and mining of academic social networks
This paper addresses several key issues in the ArnetMiner system, which aims at extracting and mining academic social networks. Specifically, the system focuses on: 1) Extracting ...
Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zha...
KDD
2008
ACM
Context-aware query suggestion by mining click-through and session data
Query suggestion plays an important role in improving the usability of search engines. Although some recently proposed methods can make meaningful query suggestions by mining quer...
Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen L...
KDD
2008
ACM
An inductive database prototype based on virtual mining views
We present a prototype of an inductive database. Our system enables the user to query not only the data stored in the database but also generalizations (e.g. rules or trees) over ...
Élisa Fromont, Adriana Prado, Bart Goethals...
KDD
2008
ACM
Locality sensitive hash functions based on concomitant rank order statistics
Locality Sensitive Hash functions are invaluable tools for approximate near neighbor problems in high dimensional spaces. In this work, we are focused on LSH schemes where the si...
Kave Eshghi, Shyamsundar Rajaram
KDD
2008
ACM
Structured entity identification and document categorization: two tasks with one joint model
Traditionally, research in identifying structured entities in documents has proceeded independently of document categorization research. In this paper, we observe that these two t...
Indrajit Bhattacharya, Shantanu Godbole, Sachindra...
KDD
2008
ACM
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD
2008
ACM
Extracting shared subspace for multi-label classification
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye
KDD
2008
ACM
The future of image search
Jitendra Malik