Sciweavers

6103 search results - page 1007 / 1221
» Multimedia Retrieval Algorithmics
CIKM
2011
Springer
14 years 2 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
CLEF
2011
Springer
14 years 2 months ago
A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
SIGIR
2012
ACM
13 years 4 months ago
Parallelizing ListNet training using Spark
As ever-larger training sets for learning to rank are created, scalability of learning has become increasingly important to achieving continuing improvements in ranking accuracy [...
Shilpa Shukla, Matthew Lease, Ambuj Tewari
ICDE
2008
IEEE
16 years 3 months ago
Dominant Graph: An Efficient Indexing Structure to Answer Top-K Queries
Given a record set D and a query score function F, a top-k query returns k records from D, whose values of function F on their attributes are the highest. In this paper, we investi...
Lei Zou, Lei Chen 0002
WWW
2009
ACM
16 years 3 months ago
Estimating the ImpressionRank of web pages
The ImpressionRank of a web page (or, more generally, of a web site) is the number of times users viewed the page while browsing search results. ImpressionRank captures the visibi...
Ziv Bar-Yossef, Maxim Gurevich