Search Sciweavers | Sciweavers

9730 search results - page 1825 / 1946

» Relating models of backtracking

WWW
2008
ACM

91 views, Internet Technology

IRLbot: scaling to 6 billion pages and beyond

16 years 4 months ago

Download irl.cs.tamu.edu

This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...

Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmit...

WWW
2008
ACM

163 views, Internet Technology

As we may perceive: finding the boundaries of compound documents on the web

16 years 4 months ago

Download www2008.org

This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...

Pavel Dmitriev

WWW
2008
ACM

179 views, Internet Technology

Can Chinese web pages be classified with English data source?

16 years 4 months ago

Download www2008.org

As the World Wide Web in China grows rapidly, mining knowledge in Chinese Web pages becomes more and more important. Mining Web information usually relies on the machine learning ...

Xiao Ling, Gui-Rong Xue, Wenyuan Dai, Yun Jiang, Q...

WWW
2008
ACM

125 views, Internet Technology

Substructure similarity measurement in Chinese recipes

16 years 4 months ago

Download www2008.org

Improving the precision of information retrieval has been a challenging issue on Chinese Web. As exemplified by Chinese recipes on the Web, it is not easy/natural for people to us...

Liping Wang, Qing Li, Na Li, Guozhu Dong, Yu Yang

WWW
2007
ACM

178 views, Internet Technology

Dynamic personalized PageRank in entity-relation graphs

16 years 4 months ago

Download www2007.org

Extractors and taggers turn unstructured text into entity-relation (ER) graphs where nodes are entities (email, paper, person, conference, company) and edges are relations (wrote, ...

Soumen Chakrabarti
