Sciweavers

Search results for "Web page title extraction and its application" (433 results, page 11 of 87)
LREC
2010
14 years 10 months ago
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Georgios Petasis, Dimitrios Petasis
ICDAR
2003
IEEE
15 years 2 months ago
Identifying Story and Preview Images in News Web Pages
The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving various different purposes. Th...
Jianying Hu, Amit Bagga
WECWIS
2003
IEEE
15 years 2 months ago
Page Digest for Large-Scale Web Services
The rapid growth of the World Wide Web and the Internet has fueled interest in Web services and the Semantic Web, which are quickly becoming important parts of modern electronic c...
Daniel Rocco, David Buttler, Ling Liu
INFORMATICALT
2007
14 years 9 months ago
A Matrix-Based Model for Web Page Community Construction and More
The rapid development of network technologies has made the web a huge information source with its own characteristics. In most cases, traditional database-based technologies are no...
Jingyu Hou
IICS
2010
Springer
15 years 1 months ago
Local Aspects of the Global Ranking of Web Pages
Started in 1998, the search engine Google estimates page importance using several parameters. PageRank is one of those. Precisely, PageRank is a distribution of probability on the ...
Fabien Mathieu, Laurent Viennot