Manual classification of free-text documents within a predefined hierarchy is highly time-consuming. This is especially true for clinical guidelines, which are often indexed by mu...
Robert Moskovitch, Shiva Cohen-Kashi, Uzi Dror, If...
Many applications that use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process generating the corpus and utilize a novel,...
Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Althou...
Authorship attribution is the task of identifying the author of a given text. The main concern of this task is to define an appropriate characterization of documents that captures ...