Sciweavers

179 search results - page 18 / 36
» Improvement of HITS-based algorithms on web documents
SIGIR
2008
ACM
14 years 9 months ago
Classifiers without borders: incorporating fielded text from neighboring web pages
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
Xiaoguang Qi, Brian D. Davison
SIGIR
2010
ACM
15 years 1 month ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
GRID
2006
Springer
14 years 9 months ago
A Parallel Approach to XML Parsing
A language for semi-structured documents, XML has emerged as the core of the web services architecture, and is playing crucial roles in messaging systems, databases, and document p...
Wei Lu, Kenneth Chiu, Yinfei Pan
WSDM
2012
ACM
13 years 5 months ago
Probabilistic models for personalizing web search
We present a new approach for personalizing Web search results to a specific user. Ranking functions for Web search engines are typically trained by machine learning algorithms u...
David Sontag, Kevyn Collins-Thompson, Paul N. Benn...
WWW
2009
ACM
15 years 4 months ago
Bootstrapped extraction of class attributes
As an alternative to previous studies on extracting class attributes from unstructured text, which consider either Web documents or query logs as the source of textual data, a boo...
Joseph Reisinger, Marius Pasca