We present a novel language modeling approach to capturing the query reformulation behavior of Web search users. Based on a framework that categorizes eight different types of “...
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
Mobile terminals (cellular phones, PDAs, palmtops, etc.) emerge as a new class of small-scale, ad-hoc service providers that share data and functionality via mobile web services’...
Context-awareness has become a key desired feature of today’s mobile systems, yet its realization remains a challenge. On the one hand, mobile computing provides great po...
Lukasz Juszczyk, Harald Psaier, Atif Manzoor, Scha...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz