Sciweavers

71 search results - page 13 / 15
» The Case of the Duplicate Documents Measurement, Search, and...
WISE
2009
Springer
15 years 10 months ago
Entry Pairing in Inverted File
Abstract. This paper proposes to exploit content and usage information to rearrange an inverted index for a full-text IR system. The idea is to merge the entries of two frequently ...
Hoang Thanh Lam, Raffaele Perego, Nguyen Thoi Minh...
ICTAI
2008
IEEE
15 years 7 months ago
Fuzzy Information Retrieval Model Based on Multiple Related Ontologies
With the popularity of the World Wide Web, the information retrieval area faces a new challenge: retrieving information resources by their meaning, using a knowledge base. No...
Maria Angelica A. Leite, Ivan L. M. Ricarte
ISAAC
2005
Springer
15 years 6 months ago
On the Complexity of Rocchio's Similarity-Based Relevance Feedback Algorithm
In this paper, we prove for the first time that the learning complexity of Rocchio’s algorithm is $O(d + d^2(\log d + \log n))$ over the discretized vector space $\{0, \ldots, n-1\}^d$,...
Zhixiang Chen, Bin Fu
CIKM
2005
Springer
15 years 6 months ago
Maximal termsets as a query structuring mechanism
Search engines process queries conjunctively to restrict the size of the answer set. Further, it is not rare to observe a mismatch between the vocabulary used in the text of Web p...
Bruno Pôssas, Nivio Ziviani, Berthier A. Rib...
GECCO
2007
Springer
15 years 5 months ago
Using code metric histograms and genetic algorithms to perform author identification for software forensics
We have developed a technique to characterize software developers' styles using a set of source code metrics. This style fingerprint can be used to identify the likely author...
Robert Charles Lange, Spiros Mancoridis