Where Information Retrieval (IR) and Text Categorization delivers a set of (ranked) documents according to a query, users of large document collections would rather like to receiv...
Algorithms that enable the process of automatically mining distinct topics in document collections have become increasingly important due to their applications in many fields and ...
This paper presents a system that combines two text mining techniques; information extraction and clustering. A rulebased approach is used to perform the information extraction tas...
The classical (ad hoc) document retrieval problem has been traditionally approached through ranking according to heuristically developed functions (such as tf.idf or bm25) or gene...