Abstract. Extensive work has been done in recent years on automatically grouping words into categories. For example, {Wednesday, Monday, Tuesday} could be grouped into a `days of w...
Neil Rubens, Vera Sheinman, Takenobu Tokunaga, Mas...
We introduce factored language models (FLMs) and generalized parallel backoff (GPB). An FLM represents words as bundles of features (e.g., morphological classes, stems, data-drive...
A well-known challenge of information retrieval is how to infer a user's underlying information need when the input query consists of only a few keywords. Question Answering (...
Abstract. The idea of an online visual vocabulary is proposed. In contrast to the accepted strategy of generating vocabularies offline using k-means clustering over all the ...
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...