An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
This paper introduces a method to train an error-corrective model for Automatic Speech Recognition (ASR) without using audio data. In existing techniques, it is assumed that suf...
We demonstrate a fully working system for multifaceted browsing over large collections of text-annotated data, such as annotated images, that are stored in relational databases. T...
Wisam Dakka, Panagiotis G. Ipeirotis, Kenneth R. W...
Email spam is a much-studied topic, but even though current email spam detection software has been gaining a competitive edge against text-based email spam, new advances in spam g...
This paper describes an efficient method to extract large n-best lists from a word graph produced by a statistical machine translation system. The extraction is based on the k sh...