Readability is a crucial presentation attribute that web summarization algorithms consider while generating a querybaised web summary. Readability quality also forms an important ...
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...
Abstract. We present the data modeling concepts of Tricia, an opensource Java platform used to implement enterprise web information systems as well as social software solutions inc...
Many web documents are dynamic, with content changing in varying amounts at varying frequencies. However, current document search algorithms have a static view of the document con...
Abstract. This paper describes our approach to the Person Name Disambiguation clustering task in the Third Web People Search Evaluation Campaign(WePS3). The method focuses on two a...