The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (...
We have investigated two major issues in Distributed Information Retrieval (DIR), namely collection selection and search results merging. While most published works on these two ...
Probabilistic retrieval models usually rank documents based on a scalar quantity. However, such models lack any estimate for the uncertainty associated with a document’s rank. Fu...
Jianhan Zhu, Jun Wang, Michael J. Taylor, Ingemar ...
The probability that a term appears in relevant documents (P(t | R)) is a fundamental quantity in several probabilistic retrieval models; however, it is difficult to estimate without rele...
Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relation...