Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over multip...
Nowadays, Internet content such as weblogs, Wikipedia, and news sites has become "live". Notifying users of relevant content and providing it to them has become a challenge. Unlike...
Weixiong Rao, Ada Wai-Chee Fu, Lei Chen 0002, Hanh...
This paper uses the URL word breaking task as an example to elaborate what we identify as crucial in designing statistical natural language processing (NLP) algorithms for Web scale ...
Kuansan Wang, Christopher Thrasher, Bo-June Paul H...
String matching is a ubiquitous problem that arises in a wide range of applications in computing, e.g., packet routing, intrusion detection, web querying, and genome analysis. D...
In TREC-10, we participated in the web track (ad-hoc task only) and the QA track (main task only). In the QA track, our QA system (SiteQ) has a general architecture with three proc...
Gary Geunbae Lee, Jungyun Seo, Seungwoo Lee, Hanmi...