Sciweavers

CIKM
2010
Springer
13 years 3 months ago
Decomposing background topics from keywords by principal component pursuit
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...
Kerui Min, Zhengdong Zhang, John Wright, Yi Ma
CIKM
2010
Springer
13 years 3 months ago
Automatically suggesting topics for augmenting text documents
We present a method for automated topic suggestion. Given a plain-text input document, our algorithm produces a ranking of novel topics that could enrich the input document in a m...
Robert West, Doina Precup, Joelle Pineau
SAC
2008
ACM
13 years 4 months ago
An objective way to evaluate and compare binarization algorithms
The choice of the best binarization algorithm is very critical for any document image processing system, since it is one of the first tasks and any mistake it performs will be car...
Ergina Kavallieratou
SAC
2008
ACM
13 years 4 months ago
Exploring social annotations for web document classification
Social annotation via so-called collaborative tagging describes the process by which many users add metadata in the form of unstructured keywords to shared content. In this paper,...
Michael G. Noll, Christoph Meinel
SAC
2008
ACM
13 years 4 months ago
A generic XML language for characterising objects to support digital preservation
The dominance of digital objects in today's information landscape has changed the way humankind creates and exchanges information. However, it has also brought an entirely ne...
Christoph Becker, Andreas Rauber, Volker Heydegger...
PVLDB
2008
13 years 4 months ago
Generating XML structure using examples and constraints
This paper presents a framework for automatically generating structural XML documents. The user provides a target DTD and an example of an XML document, called a Generate-XML-ByEx...
Sara Cohen
PVLDB
2008
13 years 4 months ago
ManyAspects: a system for highlighting diverse concepts in documents
We demonstrate MANYASPECTS
Kun Liu, Evimaria Terzi, Tyrone Grandison
DSS
2008
13 years 4 months ago
A Latent Semantic Indexing-based approach to multilingual document clustering
The creation and deployment of knowledge repositories for managing, sharing, and reusing tacit knowledge within an organization has emerged as a prevalent approach in current know...
Chih-Ping Wei, Christopher C. Yang, Chia-Min Lin
TON
2002
13 years 4 months ago
Efficient randomized web-cache replacement schemes using samples from past eviction times
The problem of document replacement in web caches has received much attention in recent research, and it has been shown that the eviction rule "replace the least recently used...
Konstantinos Psounis, Balaji Prabhakar
TKDE
2002
13 years 4 months ago
Query Relaxation by Structure and Semantics for Retrieval of Logical Web Documents
Since WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...
Wen-Syan Li, K. Selçuk Candan, Quoc Vu, Div...