Sciweavers

KDD
2003
ACM
14 years 4 months ago
Eliminating noisy information in Web pages for data mining
A commercial Web page typically contains many information blocks. Apart from the main content blocks, it usually has such blocks as navigation panels, copyright and privacy notice...
Lan Yi, Bing Liu, Xiaoli Li
ITCC
2005
IEEE
13 years 9 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
SEBD
2008
13 years 5 months ago
Using PageRank in Feature Selection
Abstract. Feature selection is an important task in data mining because it reduces data dimensionality and eliminates noisy variables. Traditionally, feature selec...
Dino Ienco, Rosa Meo, Marco Botta
EMNLP
2008
13 years 5 months ago
Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model
Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
Lei Shi, Ming Zhou
CIKM
2004
Springer
13 years 9 months ago
Optimizing web search using web click-through data
The performance of web search engines may often deteriorate due to the diversity and noisy information contained within web pages. User click-through data can be used to introduce...
Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Yong Yu, W...