Sciweavers

116 search results - page 2 / 24
» Significance of HTML Tags for Document Indexing and Retrieva...
JCDL
2006
ACM
13 years 11 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
SIGIR
2009
ACM
13 years 12 months ago
Personalized tag recommendation using graph-based ranking on multi-type interrelated objects
Social tagging is becoming increasingly popular in many Web 2.0 applications where users can annotate resources (e.g. Web pages) with arbitrary keywords (i.e. tags). A tag recomme...
Ziyu Guan, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can ...
ICDAR
1997
IEEE
13 years 9 months ago
Representing OCRed documents in HTML
OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Tao Hong, Sargur N. Srihari
KDD
2002
ACM
14 years 5 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho
WWW
2008
ACM
14 years 6 months ago
Mining, indexing, and searching for textual chemical molecule information on the web
Current search engines do not support user searches for chemical entities (chemical names and formulae) beyond simple keyword searches. Usually a chemical molecule can be represen...
Bingjun Sun, Prasenjit Mitra, C. Lee Giles