Latent semantic indexing (LSI) is a well-known unsupervised approach for dimensionality reduction in information retrieval. However, if the output information (i.e. category labels...
Text similarity spans a spectrum, with broad topical similarity near one extreme and document identity at the other. Intermediate levels of similarity – resulting from summariza...
Donald Metzler, Yaniv Bernstein, W. Bruce Croft, A...
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
This CHI Note proposes a new research initiative for the HCI community: multi-lifespan information system design. The central idea begins with the identification of categories of ...
Query translation in Cross Language Information Retrieval (CLIR) can be performed using multiple resources. Previous attempts to combine different translation resources use simple...