SIGIR
2008
ACM
13 years 4 months ago
Named entity normalization in user generated content
Named entity recognition is important for semantically oriented retrieval tasks, such as question answering, entity retrieval, biomedical retrieval, trend detection, and event and...
Valentin Jijkoun, Mahboob Alam Khalid, Maarten Mar...
SIGIR
2008
ACM
13 years 4 months ago
Blogger, stick to your story: modeling topical noise in blogs with coherence measures
Topical noise in blogs arises when bloggers digress from the central topical thrust of their blogs. We introduce a method to explicitly incorporate a model of topical noise into a...
Jiyin He, Wouter Weerkamp, Martha Larson, Maarten ...
SIGIR
2008
ACM
13 years 4 months ago
To tag or not to tag: harvesting adjacent metadata in large-scale tagging systems
We present HAMLET, a suite of principles, scoring models and algorithms to automatically propagate metadata along edges in a document neighborhood. As a showcase scenario we consi...
Adriana Budura, Sebastian Michel, Philippe Cudré...
SIGIR
2008
ACM
13 years 4 months ago
TopicRank: bringing insight to users
Ivan Berlocher, Kyung-Il Lee, Kono Kim
SIGIR
2008
ACM
13 years 4 months ago
Asymmetric distance estimation with sketches for similarity search in high-dimensional spaces
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 dist...
Wei Dong, Moses Charikar, Kai Li
SIGIR
2008
ACM
13 years 4 months ago
Social tag prediction
In this paper, we look at the "social tag prediction" problem. Given a set of objects, and a set of tags applied to those objects by users, can we predict whether a give...
Paul Heymann, Daniel Ramage, Hector Garcia-Molina
SIGIR
2008
ACM
13 years 4 months ago
On document splitting in passage detection
Passages can be hidden within a text to circumvent their disallowed transfer. Such release of compartmentalized information is of concern to all corporate and governmental organiz...
Nazli Goharian, Saket S. R. Mengle
SIGIR
2008
ACM
13 years 4 months ago
Combining document- and paragraph-based entity ranking
We study entity ranking on the INEX entity track and propose a simple graph-based ranking approach that makes it possible to combine scores at the document and paragraph level. The combined app...
Henning Rode, Pavel Serdyukov, Djoerd Hiemstra
SIGIR
2008
ACM
13 years 4 months ago
Generating diverse katakana variants based on phonemic mapping
In Japanese, it is quite common for the same word to be written in several different ways. This is especially true for katakana words which are typically used for transliterating ...
Kazuhiro Seki, Hiroyuki Hattori, Kuniaki Uehara
SIGIR
2008
ACM
13 years 4 months ago
A general optimization framework for smoothing language models on graph structures
Recent work on language models for information retrieval has shown that smoothing language models is crucial for achieving good retrieval performance. Many different effective smo...
Qiaozhu Mei, Duo Zhang, ChengXiang Zhai