Sciweavers

311 search results - page 32 / 63
» Cleaning Web Pages for Effective Web Content Mining
WWW
2007
ACM
15 years 10 months ago
Classifying web sites
In this paper, we present a novel method for the classification of Web sites. This method exploits both structure and content of Web sites in order to discern their functionality....
Christoph Lindemann, Lars Littig
HT
2003
ACM
15 years 2 months ago
Enhanced web document summarization using hyperlinks
This paper addresses the issue of Web document summarization. As textual content of Web documents is often scarce or irrelevant and existing summarization techniques are based on ...
Jean-Yves Delort, Bernadette Bouchon-Meunier, Mari...
COLING
2010
14 years 4 months ago
Large Scale Parallel Document Mining for Machine Translation
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...
KDD
2010
ACM
15 years 1 months ago
A probabilistic model for personalized tag prediction
Social tagging systems have become increasingly popular for sharing and organizing web resources. Tag recommendation is a common feature of social tagging systems. Social tagging ...
Dawei Yin, Zhenzhen Xue, Liangjie Hong, Brian D. D...
CORR
2011
Springer
14 years 1 months ago
Link Spam Detection based on DBSpamClust with Fuzzy C-means Clustering
Search engines have become an omnipresent means of accessing the web. Search engine spamming is a technique for deceiving a search engine's ranking, thereby inflating it. ...
S. K. Jayanthi, S. Sasikala