Sciweavers

» Ontology-based Text Document Clustering
COLING
2008
15 years 1 month ago
A Framework for Identifying Textual Redundancy
The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
Kapil Thadani, Kathleen McKeown
ICML
2006
IEEE
16 years 1 month ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
JCST
2010
14 years 11 months ago
A New Approach for Multi-Document Update Summarization
Fast changing knowledge on the Internet can be acquired more efficiently with the help of automatic document summarization and updating techniques. This paper describes a novel app...
Chong Long, Minlie Huang, Xiaoyan Zhu, Ming Li
WWW
2002
ACM
16 years 1 month ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
DOCENG
2005
ACM
15 years 2 months ago
Structuring documents according to their table of contents
In this paper, we present a method for structuring a document according to the information present in its Table of Contents. The detection of the ToC as well as the determination ...
Hervé Déjean, Jean-Luc Meunier