The increasing throughput of sequencing raises growing needs for methods of sequence analysis and comparison on a genomic scale, notably, in connection with phylogenetic tree reco...
Abstract. Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...
In practice, lexical chains are typically built using term reiteration or resource-based measures of semantic distance. The former approach misses out on a significant portion of ...
Abstract. We propose inductive distance-based methods for instance classification and retrieval in ontologies. Casting retrieval as a classification problem with the goal of assess...
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 dist...