Sciweavers

» Indexing Text Documents Based on Topic Identification
CIKM
2010
Springer
14 years 10 months ago
Hypergraph-based multilevel matrix approximation for text information retrieval
In Latent Semantic Indexing (LSI), a collection of documents is often pre-processed to form a sparse term-document matrix, followed by a computation of a low-rank approximation to...
Haw-ren Fang, Yousef Saad
ICDAR
2007
IEEE
15 years 6 months ago
An Efficient Word Segmentation Technique for Historical and Degraded Machine-Printed Documents
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...
Michael Makridis, N. Nikolaou, Basilios Gatos
PVLDB
2008
14 years 11 months ago
Relaxation in text search using taxonomies
In this paper we propose a novel document retrieval model in which text queries are augmented with multi-dimensional taxonomy restrictions. These restrictions may be relaxed at a ...
Marcus Fontoura, Vanja Josifovski, Ravi Kumar, Chr...
ICASSP
2011
IEEE
14 years 3 months ago
Concept-based classification for multi-document summarization
Documents often contain inherently many concepts reflecting specific and generic aspects. To automatically generate a short summary text of documents on similar topics, it is im...
Asli Çelikyilmaz, Dilek Hakkani-Tür
DL
1994
Springer
15 years 3 months ago
Corpus Linguistics for Establishing The Natural Language Content of Digital Library Documents
Digital Libraries will hold huge amounts of text and other forms of information. For the collections to be maximally useful, they must be highly organized with useful indexes and ...
Robert P. Futrelle, Xiaolan Zhang 0002, Yumiko Sek...