We investigate using topic prediction data, as a summary of document content, to compute measures of search result quality. Unlike existing quality measures such as query clarity t...
Users attempt to express their search goals through web search queries. When a search goal has multiple components or aspects, documents that represent all the aspects are likely ...
The integration of facts derived from information extraction systems into existing knowledge bases requires a system to disambiguate entity mentions in the text. This is challengi...
Mark Dredze, Paul McNamee, Delip Rao, Adam Gerber,...
In many natural language processing tasks, the classes of interest are heavily imbalanced in the underlying data set, and classifiers trained on such skewed data t...
A semi-structured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because eac...