As the number of available Web pages grows, users experience increasing difficulty finding documents relevant to their interests. One of the underlying reasons for this is that mo...
In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
It has been widely observed that search queries are composed in a very different style from that of the body or the title of a document. Many techniques explicitly accounting for...
Complex network analysis is a growing research area in a wide variety of domains and has recently become closely associated with data, text and web mining. One of the most active ...
Cristian Klen dos Santos, Alexandre Evsukoff, Beat...
An information retrieval performance measure that is interpreted as the percent of perfect performance (PPP) can be used to study the effects of the inclusion of specific documen...