Information retrieval methods are frequently used for indexing and retrieving spoken documents, and more recently have been proposed for voice-search amongst a pre-defined set of ...
Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, Alex...
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
WINACS (Web-based Information Network Analysis for Computer Science) is a project that incorporates many recent, exciting developments in data sciences to construct a Web-based co...
This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this pro...
Generic database replication algorithms do not scale linearly in throughput as all update, deletion and insertion (UDI) queries must be applied to every database replica. The thro...