Sciweavers

146 search results - page 25 / 30
» RoadRunner: Towards Automatic Data Extraction from Large Web...
ECIR
2008
Springer
14 years 11 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Thomas Gottron
CICLING
2009
Springer
15 years 4 months ago
Semi-supervised Word Sense Disambiguation Using the Web as Corpus
Like any other classification task, Word Sense Disambiguation requires a large number of training examples. These examples, which are easily obtained for most of the tasks,...
Rafael Guzmán-Cabrera, Paolo Rosso, Manuel ...
KDD
2005
ACM
15 years 10 months ago
Making holistic schema matching robust: an ensemble approach
The Web has been rapidly "deepened" by myriad searchable databases online, where data are hidden behind query interfaces. As an essential task toward integrating these m...
Bin He, Kevin Chen-Chuan Chang
NAR
2000
14 years 9 months ago
The Eukaryotic Promoter Database (EPD)
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters. The underlying definition of a promote...
Rouaïda Cavin Périer, Viviane Praz, Th...
CIKM
2006
Springer
15 years 1 month ago
Summarizing local context to personalize global web search
The PC Desktop is a very rich repository of personal information, efficiently capturing the user's interests. In this paper we propose a new approach towards an automatic persona...
Paul-Alexandru Chirita, Claudiu S. Firan, Wolfgang...