Sciweavers

Search results for "Classifying Web Data in Directory Structures"
LREC
2008
14 years 11 months ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert
WWW
2008
ACM
15 years 10 months ago
Floatcascade learning for fast imbalanced web mining
This paper is concerned with the problem of Imbalanced Classification (IC) in web mining, which often arises on the web due to the "Matthew Effect". As web IC applicatio...
Xiaoxun Zhang, Xueying Wang, Honglei Guo, Zhili Gu...
WWW
2004
ACM
15 years 10 months ago
Link fusion: a unified link analysis framework for multi-type interrelated data objects
Web link analysis has proven to be a significant enhancement for quality based web search. Most existing links can be classified into two categories: intra-type links (e.g., web h...
Wensi Xi, Benyu Zhang, Zheng Chen, Yizhou Lu, Shui...
EUPROJECTS
2006
Springer
15 years 1 month ago
Web Mediators for Accessible Browsing
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number...
Benjamin N. Waber, John J. Magee, Margrit Betke
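
The link percentage feature described in this entry can be illustrated with a rough sketch. The abstract snippet is cut off before it names the normalizing quantity, so the code below assumes, purely for illustration, that the denominator is the total number of non-whitespace text characters on the page. The names `LinkTextCounter` and `link_percentage` are hypothetical and do not come from the paper.

```python
from html.parser import HTMLParser


class LinkTextCounter(HTMLParser):
    """Counts non-whitespace text characters inside and outside <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0   # > 0 while the parser is inside an <a> element
        self.link_chars = 0     # text characters that are part of a link
        self.total_chars = 0    # all text characters seen

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1

    def handle_data(self, data):
        # Assumption: whitespace is ignored when counting characters.
        count = len("".join(data.split()))
        self.total_chars += count
        if self.anchor_depth > 0:
            self.link_chars += count


def link_percentage(html: str) -> float:
    """Percentage of text characters that sit inside links.

    Assumption: normalized by total text characters; the source snippet
    does not state the actual denominator.
    """
    counter = LinkTextCounter()
    counter.feed(html)
    counter.close()
    if counter.total_chars == 0:
        return 0.0
    return 100.0 * counter.link_chars / counter.total_chars


if __name__ == "__main__":
    page = '<p>abcd <a href="x">efgh</a></p>'
    print(link_percentage(page))
```

A page dominated by navigation links would score high on this measure, while an article page with long running text would score low. How the paper turns the value into a class decision is not visible in the snippet.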
ICDM
2002
IEEE
15 years 2 months ago
Automatic Web Page Classification in a Dynamic and Hierarchical Way
Automatic classification of web pages is an effective way to deal with the difficulty of retrieving information from the Internet. Although there are many automatic classification...
Xiaogang Peng, Ben Choi