Spam and phishing emails are not only annoying to users, but are a real threat to internet communication and web economy. The fight against unwanted emails has become a cat-and-mo...
Abstract. This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic featur...
Janara Christensen, Mausam, Stephen Soderland, Ore...
We present a hybrid method to turn off-the-shelf information retrieval (IR) systems into future event predictors. Given a query, a time series model is trained on the publication...
We consider the task of suggesting related queries to users after they issue their initial query to a web search engine. We propose a machine learning approach to learn the probab...