Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for intelligent services. For this reason, feature extraction ...
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
Little work to date in sentiment analysis (classifying texts by ‘positive’ or ‘negative’ orientation) has attempted to use fine-grained semantic distinctions in features ...
Detailed knowledge about implemented concerns in the source code is crucial for the cost-effective maintenance and successful evolution of large systems. Concern mining techniques...