Sciweavers

CIKM
2011
Springer
12 years 5 months ago
Classifying trending topics: a typology of conversation triggers on Twitter
Twitter summarizes the large volume of messages posted by users in the form of trending topics that reflect the top conversations being discussed at a given moment. These trending ...
Arkaitz Zubiaga, Damiano Spina, Víctor Fres...
CIKM
2011
Springer
12 years 5 months ago
The impact of author ranking in a library catalogue
The field of information retrieval has witnessed over 50 years of research on retrieval methods for metadata descriptions and controlled indexing languages, the prototypical exam...
Jaap Kamps
CIKM
2011
Springer
12 years 5 months ago
Joint inference for cross-document information extraction
Previous information extraction (IE) systems are typically organized as a pipeline architecture of separated stages which make independent local decisions. When the data grows bey...
Qi Li, Sam Anzaroot, Wen-Pin Lin, Xiang Li, Heng J...
CIKM
2011
Springer
12 years 5 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
CIKM
2011
Springer
12 years 5 months ago
Integrating and querying web databases and documents
There exist many interrelated information sources on the Internet that can be categorized into structured (database) and semistructured (documents). A key challenge is to integrat...
Carlos Garcia-Alvarado, Carlos Ordonez
CIKM
2011
Springer
12 years 5 months ago
Towards a framework for attribute retrieval
In this paper, we propose an attribute retrieval approach which extracts and ranks attributes from HTML tables. We distinguish between class attribute retrieval and instance attri...
Arlind Kopliku, Mohand Boughanem, Karen Pinel-Sauv...
CIKM
2011
Springer
12 years 5 months ago
Semi-supervised multi-task learning of structured prediction models for web information extraction
Extracting information from web pages is an important problem; it has several applications such as providing improved search results and construction of databases to serve user qu...
Paramveer S. Dhillon, Sundararajan Sellamanickam, ...
CIKM
2011
Springer
12 years 5 months ago
Toward interactive training and evaluation
Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the developme...
Gregory Druck, Andrew McCallum
CIKM
2011
Springer
12 years 5 months ago
PDFMeat: managing publications on the semantic desktop
Researchers maintain bibliographies and extensive sets of PDF files of scholarly publications on their desktop. The lack of proper metadata of downloaded PDFs makes this task a t...
David Aumüller, Erhard Rahm
CIKM
2011
Springer
12 years 5 months ago
Supervised language modeling for temporal resolution of texts
We investigate temporal resolution of documents, such as determining the date of publication of a story based on its text. We describe and evaluate a model that builds histograms e...
Abhimanu Kumar, Matthew Lease, Jason Baldridge