Sciweavers

80 search results - page 1 / 16
WWW
2005
ACM
14 years 5 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
WWW
2005
ACM
14 years 5 months ago
Hybrid semantic tagging for information extraction
The semantic web is expected to have an impact at least as big as that of the existing HTML-based web, if not greater. However, the challenge lies in creating this semantic web an...
Ronen Feldman, Binyamin Rosenfeld, Moshe Fresko, B...
INFORMATICALT
2007
13 years 4 months ago
Extracting Personalised Ontology from Data-Intensive Web Application: an HTML Forms-Based Reverse Engineering Approach
The advance of the Web has significantly and rapidly changed the way of information organization, sharing and distribution. The next generation of the web, the semantic web, seeks...
Sidi Mohamed Benslimane, Mimoun Malki, Mustapha Ka...
SIGIR
2005
ACM
13 years 10 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
IPM
2007
13 years 4 months ago
Web page title extraction and its application
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Yewei Xue, Yunhua Hu, Guomao Xin, Ruihua Song, Shu...