Sciweavers

IADIS
2003
Significance of HTML Tags for Document Indexing and Retrieval
Indexing quality has an overwhelming effect on retrieval effectiveness of search engines. In the past few years it has become one of the major challenges in the search engines are...
Byurhan Hyusein, Ahmed Patel
DOCENG
2007
ACM
Structure and Content Analysis for HTML Medical Articles: A Hidden Markov Model Approach
We describe ongoing research on segmenting and labeling HTML medical journal articles. In contrast to existing approaches in which HTML tags usually serve as strong indicators, we...
Jie Zou, Daniel X. Le, George R. Thoma
ICTAI
1999
IEEE
A New Study on Using HTML Structures to Improve Retrieval
Locating useful information effectively from the World Wide Web (WWW) is of wide interest. This paper presents new results on a methodology of using the structures and hyperlinks ...
Michal Cutler, H. Deng, S. Maniccam, Weiyi Meng
SIGIR
2005
ACM
Title Extraction from Bodies of HTML Documents and Its Application to Web Page Retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
WWW
2010
ACM
CETR: Content Extraction via Tag Ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han