Sciweavers

CIKM
2010
Springer
14 years 8 months ago
Improved index compression techniques for versioned document collections
Current Information Retrieval systems use inverted index structures for efficient query processing. Due to the extremely large size of many data sets, these index structures are u...
Jinru He, Junyuan Zeng, Torsten Suel
SIGIR
2010
ACM
14 years 4 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
ICUIMC
2009
ACM
15 years 4 months ago
PicAChoo: a tool for customizable feature extraction utilizing characteristics of textual data
Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for intelligent services. For this reason, feature extraction ...
Jaeseok Myung, Jung-Yeon Yang, Sang-goo Lee
SAC
2006
ACM
15 years 4 months ago
Light stemming approaches for the French, Portuguese, German and Hungarian languages
This paper describes and evaluates various general stemming approaches for the French, Portuguese (Brazilian), German and Hungarian languages. Based on the CLEF test-collections, ...
Jacques Savoy
WWW
2007
ACM
15 years 10 months ago
Answering bounded continuous search queries in the world wide web
Search queries applied to extract relevant information from the World Wide Web over a period of time may be denoted as continuous search queries. The improvement of continuous sea...
Dirk Kukulenz, Alexandros Ntoulas