Sciweavers

» Using Document Dimensions for Enhanced Information Retrieval
ECIR
2003
Springer
15 years 5 months ago
Discretizing Continuous Attributes in AdaBoost for Text Categorization
Abstract. We focus on two recently proposed algorithms in the family of “boosting”-based learners for automated text classification, AdaBoost.MH and AdaBoost.MHKR. While the ...
Pio Nardiello, Fabrizio Sebastiani, Alessandro Spe...
WWW
2009
ACM
16 years 5 months ago
User-centric content freshness metrics for search engines
In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...
Ali Dasdan, Xinh Huynh
WWW
2007
ACM
16 years 5 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
HT
2005
ACM
15 years 9 months ago
Activity links: supporting communication and reflection about action
Tasks that take place over a long period of time or collaborative tasks where participants are required to develop an understanding of each other’s effort benefit from better co...
Hao-wei Hsieh, Frank M. Shipman III
SIGIR
2010
ACM
15 years 4 months ago
Analysis of structural relationships for hierarchical cluster labeling
Cluster label quality is crucial for browsing topic hierarchies obtained via document clustering. Intuitively, the hierarchical structure should influence the labeling accuracy. H...
Markus Muhr, Roman Kern, Michael Granitzer