Sciweavers

54 search results - page 11 / 11
» Improving Web Spam Classification using Rank-time Features
WWW
2007
ACM
14 years 6 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
WWW
2006
ACM
14 years 6 months ago
Relaxed: on the way towards true validation of compound documents
To maintain interoperability in the Web environment it is necessary to comply with Web standards. Current specifications of HTML and XHTML languages define conformance conditions ...
Jirka Kosek, Petr Nálevka
AAAI
2008
13 years 7 months ago
Text Categorization with Knowledge Transfer from Heterogeneous Data Sources
Multi-category classification of short dialogues is a common task performed by humans. When assigning a question to an expert, a customer service operator tries to classify the cu...
Rakesh Gupta, Lev-Arie Ratinov
WWW
2009
ACM
14 years 6 months ago
XQuery in the browser
Since the invention of the Web, the browser has become more and more powerful. By now, it is a programming and execution environment in itself. The predominant language to program...
Ghislain Fourny, Markus Pilman, Daniela Florescu, ...