Inverse document frequency (IDF) is one of the most useful and widely used concepts in information retrieval. There have been various attempts to provide theoretical justification...
We introduce a multi-stage ensemble framework, Error-Driven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a ...
Many of the recently proposed algorithms for learning feature-based ranking functions are based on the pairwise preference framework, in which instead of taking documents in isola...
Vitor R. Carvalho, Jonathan L. Elsas, William W. C...
Document understanding techniques such as document clustering and multi-document summarization have been receiving much attention in recent years. Current document clustering meth...
Dingding Wang, Shenghuo Zhu, Tao Li, Yun Chi, Yiho...
This paper describes the starting points of how to design and build tools to help individual users track and monitor their presence on the web from the standpoints of individual p...
Markus Bylund, Jussi Karlgren, Fredrik Olsson, Ped...