We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...
Webbases are database systems that enable creation of Web applications that allow end users to shop around for products and services at various Web sites without having to manually...
Hasan Davulcu, Guizhen Yang, Michael Kifer, I. V. ...
In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mi...
In TREC-10, we participated in the web track (only ad-hoc task) and the QA track (only main task). In the QA track, our QA system (SiteQ) has a general architecture with three proc...
Gary Geunbae Lee, Jungyun Seo, Seungwoo Lee, Hanmi...
Users prefer navigating subjects through organized topics among abundant resources to scanning lists of pages retrieved from search engines. We propose a framework to cluster frequent items...