Sciweavers

WWW
2008
ACM
16 years 4 months ago
IRLbot: scaling to 6 billion pages and beyond
This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmit...
WWW
2008
ACM
16 years 4 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
WWW
2008
ACM
16 years 4 months ago
Can Chinese web pages be classified with English data source?
As the World Wide Web in China grows rapidly, mining knowledge in Chinese Web pages becomes more and more important. Mining Web information usually relies on the machine learning ...
Xiao Ling, Gui-Rong Xue, Wenyuan Dai, Yun Jiang, Q...
WWW
2008
ACM
16 years 4 months ago
Substructure similarity measurement in Chinese recipes
Improving the precision of information retrieval has been a challenging issue on the Chinese Web. As exemplified by Chinese recipes on the Web, it is not easy/natural for people to us...
Liping Wang, Qing Li, Na Li, Guozhu Dong, Yu Yang
WWW
2007
ACM
16 years 4 months ago
Dynamic personalized PageRank in entity-relation graphs
Extractors and taggers turn unstructured text into entity-relation (ER) graphs where nodes are entities (email, paper, person, conference, company) and edges are relations (wrote, ...
Soumen Chakrabarti