WWW
2008
ACM
14 years 5 months ago
IRLbot: scaling to 6 billion pages and beyond
This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmit...
WWW
2008
ACM
14 years 5 months ago
Ranking refinement and its application to information retrieval
We consider the problem of ranking refinement, i.e., to improve the accuracy of an existing ranking function with a small set of labeled instances. We are, particularly, intereste...
Rong Jin, Hamed Valizadegan, Hang Li
WWW
2008
ACM
14 years 5 months ago
Learning to classify short and sparse text & web with hidden topics from large-scale data collections
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from larges...
Xuan Hieu Phan, Minh Le Nguyen, Susumu Horiguchi
WWW
2008
ACM
14 years 5 months ago
A combinatorial allocation mechanism with penalties for banner advertising
Most current banner advertising is sold through negotiation, thereby incurring large transaction costs and possibly suboptimal allocations. We propose a new automated system for se...
Uriel Feige, Nicole Immorlica, Vahab S. Mirrokni, ...
WWW
2008
ACM
14 years 5 months ago
Generating hypotheses from the web
Hypothesis generation is a crucial initial step for making scientific discoveries. This paper addresses the problem of automatically discovering interesting hypotheses from the we...
Wei Jin, Rohini K. Srihari, Abhishek Singh
WWW
2008
ACM
14 years 5 months ago
SessionLock: securing web sessions against eavesdropping
Typical web sessions can be hijacked easily by a network eavesdropper in attacks that have come to be designated "sidejacking." The rise of ubiquitous wireless networks,...
Ben Adida
WWW
2008
ACM
14 years 5 months ago
How people use the web on mobile devices
This paper describes a series of user studies on how people use the Web via mobile devices. The data primarily comes from contextual inquiries with 47 participants between 2004 an...
Yanqing Cui, Virpi Roto
WWW
2008
ACM
14 years 5 months ago
Automatically refining the Wikipedia infobox ontology
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia inf...
Fei Wu, Daniel S. Weld
WWW
2008
ACM
14 years 5 months ago
DTWiki: a disconnection and intermittency tolerant wiki
Wikis have proven to be a valuable tool for collaboration and content generation on the web. Simple semantics and ease-of-use make wiki systems well suited for meeting many emergi...
Bowei Du, Eric A. Brewer
WWW
2008
ACM
14 years 5 months ago
Computing minimum cost diagnoses to repair populated DL-based ontologies
Ontology population is prone to cause inconsistency because the populating process is imprecise or the populated data may conflict with the original data. By assuming that the int...
Jianfeng Du, Yi-Dong Shen