Technology in the field of digital media generates huge amounts of nontextual information, audio, video, and images, along with more familiar textual information. The potential for...
This paper describes the participation of Columbus Project of Microsoft Research Asia (MSRA) in the GeoCLEF 2006 (a cross-language geographical retrieval track which is part of Cr...
Zhisheng Li, Chong Wang 0002, Xing Xie, Xufa Wang,...
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from larges...
Approximate queries on a collection of strings are important in many applications such as record linkage, spell checking, and Web search, where inconsistencies and errors exist in...
We are concerned with dependencyoriented morphosyntactic parsing of running text. While a parsing grammar should avoid introducing structurallyunresolvable distinctionsin order to...