Sciweavers

938 search results - page 143 / 188
» Space-Efficient Algorithms for Document Retrieval
SAC
2006
ACM
15 years 5 months ago
Exploiting partial decision trees for feature subset selection in e-mail categorization
In this paper we propose PARTfs which adopts a supervised machine learning algorithm, namely partial decision trees, as a method for feature subset selection. In particular, it is...
Helmut Berger, Dieter Merkl, Michael Dittenbach
IJCNLP
2005
Springer
15 years 5 months ago
Using Multiple Discriminant Analysis Approach for Linear Text Segmentation
Research on linear text segmentation has been an on-going focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarizati...
Jingbo Zhu, Na Ye, Xinzhi Chang, Wenliang Chen, Be...
CIKM
2011
Springer
13 years 11 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
SIGIR
2008
ACM
14 years 11 months ago
Query dependent ranking using K-nearest neighbor
Many ranking models have been proposed in information retrieval, and recently machine learning techniques have also been applied to ranking model construction. Most of the existin...
Xiubo Geng, Tie-Yan Liu, Tao Qin, Andrew Arnold, H...
CIKM
2005
Springer
15 years 5 months ago
Biasing web search results for topic familiarity
Depending on a web searcher’s familiarity with a query’s target topic, it may be more appropriate to show her introductory or advanced documents. The TREC HARD [1] track defi...
Giridhar Kumaran, Rosie Jones, Omid Madani