Sciweavers

TKDE
1998
13 years 3 months ago
Performance Analysis of Three Text-Join Algorithms
When a multidatabase system contains textual database systems (i.e., information retrieval systems), queries against the global schema of the multidatabase system may contain a ...
Weiyi Meng, Clement T. Yu, Wei Wang 0010, Naphtali...
SIGIR
2002
ACM
13 years 3 months ago
Document clustering with committees
Document clustering is useful in many information retrieval tasks: document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of docume...
Patrick Pantel, Dekang Lin
SIGIR
2002
ACM
13 years 3 months ago
Document clustering with cluster refinement and model selection capabilities
In this paper, we propose a document clustering method that strives to achieve: (1) a high accuracy of document clustering, and (2) the capability of estimating the number of clus...
Xin Liu, Yihong Gong, Wei Xu, Shenghuo Zhu
SIGIR
2002
ACM
13 years 3 months ago
Cross-document summarization by concept classification
In this paper we describe a Cross Document Summarizer XDoX designed specifically to summarize large document sets (50-500 documents and more). Such sets of documents are typically...
Hilda Hardy, Nobuyuki Shimizu, Tomek Strzalkowski,...
SIGIR
1998
ACM
13 years 3 months ago
Exploring the Similarity Space
Ranked queries are used to locate relevant documents in text databases. In a ranked query a list of terms is specified, then the documents that most closely match the query are re...
Justin Zobel, Alistair Moffat
PR
2002
13 years 3 months ago
Text extraction in complex color documents
Text extraction in mixed-type documents is a pre-processing and necessary stage for many document applications. In mixed-type color documents, text, drawings and graphics appear w...
Charalambos Strouthopoulos, Nikos Papamarkos, Anto...
IJON
1998
13 years 3 months ago
WEBSOM - Self-organizing maps of document collections
With the WEBSOM method a textual document collection may be organized onto a graphical map display that provides an overview of the collection and facilitates interactive browsing...
Samuel Kaski, Timo Honkela, Krista Lagus, Teuvo Ko...
IPM
2002
13 years 3 months ago
A feature mining based approach for the classification of text documents into disjoint classes
This paper proposes a new approach for classifying text documents into two disjoint classes. The new approach is based on extracting patterns, in the form of two logical expressio...
Salvador Nieto Sánchez, Evangelos Triantaph...
CN
1998
13 years 3 months ago
Improving the WWW: Caching or Multicast?
We consider two schemes for the distribution of Web documents. In the first scheme the sender repeatedly transmits the Web document into a multicast address, and receivers asynchr...
Pablo Rodriguez, Keith W. Ross, Ernst Biersack
PAMI
2000
13 years 4 months ago
A Statistical, Nonparametric Methodology for Document Degradation Model Validation
Tapas Kanungo, Robert M. Haralick, Henry S. Baird,...