Sciweavers

» Using the Structure of HTML Documents to Improve Retrieval
WWW
2008
ACM
16 years 13 days ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
FTDCS
2003
IEEE
15 years 5 months ago
pFilter: Global Information Filtering and Dissemination Using Structured Overlay Networks
The exponential data growth rate of the Internet makes it increasingly difficult for people to find desired information in a timely fashion. Information filtering and dissemina...
Chunqiang Tang, Zhichen Xu
IPM
2008
14 years 11 months ago
Towards a unified approach to document similarity search using manifold-ranking of blocks
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
Xiaojun Wan, Jianwu Yang, Jianguo Xiao
SIGIR
2008
ACM
14 years 11 months ago
Improving biomedical document retrieval using domain knowledge
Research articles typically introduce new results or findings and relate them to knowledge entities of immediate relevance. However, a large body of context knowledge related to t...
Shuguang Wang, Milos Hauskrecht
DIAL
2004
IEEE
15 years 3 months ago
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...