In this paper, we present Concept Chain Queries (CCQ), a special case of text mining in document collections focusing on detecting links between two topics across text documents. ...
Many text documents naturally have two kinds of labels. For example, we may label web pages from universities according to their categories, such as "student" or "fa...
Abstract. We investigate a generative latent variable model for model-based word saliency estimation for text modelling and classification. The estimation algorithm derived is able ...
Abstract. In this paper we describe an efficient and scalable implementation for grammar induction based on the EMILE approach ([2], [3], [4], [5], [6]). The current EMILE 4.1 implementa...
Pieter W. Adriaans, Marten Trautwein, Marco Vervoo...
We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences...
Huma Lodhi, John Shawe-Taylor, Nello Cristianini, ...