Recent advances in information retrieval over hyperlinked corpora have convincingly demonstrated that links carry less noisy information than text. We investigate the feasibility of...
Georeferenced data sets are often large and complex. Natural Language Generation (NLG) systems are beginning to emerge that generate texts from such data. One of the challenges th...
Caches are utilized very inefficiently because not all of the excess data fetched into the cache to exploit spatial locality is actually used. We define cache utilization as the percent...
We consider the problem of content-based spam filtering for short text messages that arise in three contexts: mobile (SMS) communication, blog comments, and email summary informa...
TextWise LLC participated in the TREC-7 Cross-Language Retrieval track using the CINDOR system, which utilizes a "conceptual interlingua" representation of documents a...
Anne Diekema, Farhad Oroumchian, Paraic Sheridan, ...