Sciweavers

IPPS
2010
IEEE
13 years 1 month ago
Large-scale multi-dimensional document clustering on GPU clusters
Document clustering plays an important role in data mining systems. Recently, a flocking-based document clustering algorithm has been proposed to solve the problem through simulat...
Yongpeng Zhang, Frank Mueller, Xiaohui Cui, Thomas...
EMNLP
2010
13 years 1 month ago
Translingual Document Representations from Discriminative Projections
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...
John Platt, Kristina Toutanova, Wen-tau Yih
BMVC
2010
13 years 1 month ago
Probabilistic Latent Sequential Motifs: Discovering Temporal Activity Patterns in Video Scenes
This paper introduces a novel probabilistic activity modeling approach that mines recurrent sequential patterns from documents given as word-time occurrences. In this model, docum...
Jagannadan Varadarajan, Rémi Emonet, Jean-M...
SPIRE
2010
Springer
13 years 2 months ago
Dual-Sorted Inverted Lists
Several IR tasks rely, to achieve high efficiency, on a single pervasive data structure called the inverted index. This is a mapping from the terms in a text collection to the docu...
Gonzalo Navarro, Simon J. Puglisi
SPIRE
2010
Springer
13 years 2 months ago
Colored Range Queries and Document Retrieval
Colored range queries are a well-studied topic in computational geometry and database research that, in the past decade, have found exciting applications in information r...
Travis Gagie, Gonzalo Navarro, Simon J. Puglisi
PVLDB
2010
13 years 2 months ago
TimeTrails: A System for Exploring Spatio-Temporal Information in Documents
Spatial and temporal data have become ubiquitous in many application domains such as the geosciences or life sciences. Sophisticated database management systems are employed to ma...
Jannik Strötgen, Michael Gertz
PVLDB
2010
13 years 2 months ago
ROXXI: Reviving witness dOcuments to eXplore eXtracted Information
In recent years, there has been considerable research on information extraction and constructing RDF knowledge bases. In general, the goal is to extract all relevant information f...
Shady Elbassuoni, Katja Hose, Steffen Metzger, Ral...
PKDD
2010
Springer
13 years 2 months ago
Topic Models Conditioned on Relations
Latent Dirichlet allocation is a fully generative statistical language model that has been proven to be successful in capturing both the content and the topics of a corpus of docum...
Mirwaes Wahabzada, Zhao Xu, Kristian Kersting
PKDD
2010
Springer
13 years 2 months ago
AnswerArt - Contextualized Question Answering
The focus of this paper is a question answering system, where the answers are retrieved from a collection of textual documents. The system also includes automatic document summariz...
Lorand Dali, Delia Rusu, Blaz Fortuna, Dunja Mlade...
OTM
2010
Springer
13 years 2 months ago
Integrating Keywords and Semantics on Document Annotation and Search
This paper describes GoNTogle, a framework for document annotation and retrieval, built on top of Semantic Web and IR technologies. GoNTogle supports ontology-based annotation for ...
Nikos Bikakis, Giorgos Giannopoulos, Theodore Dala...