Current speech recognition systems are often based on HMMs with state-clustered Gaussian Mixture Models (GMMs) to represent the context dependent output distributions. Though high...
The Tarragon Consulting team participated in the primary task of the TREC 2003 Genomics Track. We used a combination of knowledge-engineering and corpus analysis to construct sema...
—Although popular text search engines allow users to retrieve similar web pages, source code search engines do not have this feature. Detecting similar applications is a notoriou...
The analysis of the leading social video sharing platform YouTube reveals a high amount of redundancy, in the form of videos with overlapping or duplicated content. In this paper,...
Querying semantically related data sources depends on the ability to map between their schemas. Unfortunately, in most cases matching between schema is still largely performed man...
Fabien Duchateau, Zohra Bellahsene, Mark Roantree,...