We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical fo...
The multiple-instance learning (MIL) model has been very successful in application areas such as drug discovery and content-based imageretrieval. Recently, a generalization of thi...
Qingping Tao, Stephen D. Scott, N. V. Vinodchandra...
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
Background: Understanding how amino acid substitutions affect protein functions is critical for the study of proteins and their implications in diseases. Although methods have bee...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...