It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...
An empirical study has been conducted investigating the relationship between the performance of a generative language model in terms of perplexity and the corresponding informatio...
Leif Azzopardi, Mark Girolami, Keith van Rijsberge...
This report describes the English-Chinese cross-language retrieval experiments at Berkeley for the TREC-9 Cross-Language Information Retrieval track. We present a simple and effective ...
In the processing of Chinese documents and queries in information retrieval (IR), one has to identify the units that are used as indexes. Words and n-grams have been used as inde...
We propose a self-supervised word-segmentation technique for Chinese information retrieval. This method combines the advantages of traditional dictionary-based approaches with cha...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...