Information retrieval algorithms leverage various collection statistics to improve performance. Because these statistics are often computed on a relatively small evaluation corpus...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
The paper introduces mixed networks, a new graphical model framework for expressing and reasoning with probabilistic and deterministic information. The motivation to devel...
In the ever-growing world of distributed systems, different middleware implementations can be compared qualitatively or quantitatively. Existing evaluation techniques are often...
Position information has proven very effective in document summarization, especially in generic summarization. Existing approaches mostly consider the information of se...