Document clustering has long been an important problem in information retrieval. In this paper, we present a new clustering algorithm ASI1, which uses explicitly modeling of the s...
Software system documentation is almost always expressed informally in natural language and free text. Examples include requirement specifications, design documents, manual pages, ...
A segmentation algorithm, which can detect different regions of a handwritten document such as text lines, tables and sketches will be extremely useful in a variety of application...
Similarity measures are mechanisms that assign a numeric score indicating how closely two documents, or a document and a query match. The Cosine measure is one of the similarity m...
Summarizing web pages have recently gained much attention from researchers. Until now two main types of approaches have been proposed for this task: content- and context-based met...