Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimi...
Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
Recently, there has been an emphasis on creating shared resources for natural language processing applications. This has resulted in the development of high-quality tools and data...
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in...
With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evid...