Previous work on Natural Language Processing for Information Retrieval has shown the inadequateness of semantic and syntactic structures for both document retrieval and categoriza...
As the web expands exponentially, the need to put some order to its content becomes apparent. Hypertext categorization, that is the automatic classification of web documents into ...
In this paper we propose PARTfs which adopts a supervised machine learning algorithm, namely partial decision trees, as a method for feature subset selection. In particular, it is...
Most Twitter search systems generally treat a tweet as a plain text when modeling relevance. However, a series of conventions allows users to tweet in structural ways using combin...
Zhunchen Luo, Miles Osborne, Sasa Petrovic, Ting W...
This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and ...