Sciweavers

KDD
2009
ACM
Collective annotation of Wikipedia entities in web text
To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world...
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, ...
KDD
2009
ACM
TANGENT: a novel, 'Surprise me', recommendation algorithm
Most recommender systems try to find items that are most relevant to the older choices of a given user. Here we focus on the "surprise me" query: A user may be bored ...
Kensuke Onuma, Hanghang Tong, Christos Faloutsos
KDD
2009
ACM
Meme-tracking and the dynamics of the news cycle
Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long ...
Jure Leskovec, Lars Backstrom, Jon M. Kleinberg
KDD
2009
ACM
Effective multi-label active learning for text classification
Labeling text data is quite time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical ...
Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Ch...
KDD
2009
ACM
Mining social networks for personalized email prioritization
Email is one of the most prevalent communication tools today, and solving the email overload problem is urgent. A good way to alleviate email overload is to automatical...
Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon
KDD
2009
ACM
Structured correspondence topic models for mining captioned figures in biological literature
A major source of information (often the most crucial and informative part) in scholarly articles from scientific journals, proceedings and books are the figures that directly pro...
Amr Ahmed, Eric P. Xing, William W. Cohen, Robert ...
KDD
2009
ACM
Name-ethnicity classification from open sources
The problem of ethnicity identification from names has a variety of important applications, including biomedical research, demographic studies, and marketing. Here we report on th...
Anurag Ambekar, Charles B. Ward, Jahangir Mohammed...
KDD
2009
ACM
Mining broad latent query aspects from search sessions
Search queries are typically very short, which means they are often underspecified or have senses that the user did not think of. A broad latent query aspect is a set of keywords ...
Xuanhui Wang, Deepayan Chakrabarti, Kunal Punera
KDD
2009
ACM
Scalable pseudo-likelihood estimation in hybrid random fields
Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such a...
Antonino Freno, Edmondo Trentin, Marco Gori
KDD
2009
ACM
Combining link and content for community detection: a discriminative approach
In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and the World Wide Web. Most ...
Tianbao Yang, Rong Jin, Yun Chi, Shenghuo Zhu