The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one ...
We propose a multi-document generic summarization model based on the budgeted median problem. Our model selects sentences to generate a summary so that every sentence in the docum...
This paper introduces a new technique of document clustering based on frequent senses. The proposed system, GDClust (Graph-Based Document Clustering) works with frequent senses ra...
Short texts clustering is one of the most difficult tasks in natural language processing due to the low frequencies of the document terms. We are interested in analysing these kind...
Diego Ingaramo, David Pinto, Paolo Rosso, Marcelo ...
Text representation plays a crucial role in classical text mining, where the primary focus was on static text. Nevertheless, well-studied static text representations including TFI...