This paper explores the use of set expansion (SE) to improve question answering (QA) when the expected answer is a list of entities belonging to a certain class. Given a small set...
Richard C. Wang, Nico Schlaefer, William W. Cohen,...
This article presents an online cluster using genetic algorithms to increase information retrieval efficiency. Information retrieval (IR) is based on the grouping of documents...
Spam sender detection based on email subject data is a complex large-scale text mining task. The dataset consists of email subject lines and the corresponding IP address of the em...
Chemistry research papers are a primary source of information about chemistry, as in any scientific field. The presentation of the data is, predominantly, unstructured information...
C. J. Rupp, Ann A. Copestake, Peter Corbett, Peter...
Web 2.0 applications like Flickr, YouTube, or Del.icio.us are increasingly popular online communities for creating, editing and sharing content. However, the rapid increase in siz...