A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...
We profile a system for search and analysis of large-scale email archives. The system builds around four facets: content-based search engine, statistical topic model, automaticall...
Textual entailment recognition plays a fundamental role in tasks that require in-depth natural language understanding. In order to use entailment recognition technologies for real-...
The problem of finding trust paths and estimating the trust one can place in a partner arises in various application areas, including virtual organisations, authentication systems ...
Polysemy is one of the most difficult problems when dealing with natural language resources. Consequently, automated ontology learning from textual sources (such as web resources) ...