Spectral clustering is a widely used method for organizing data that relies only on pairwise similarity measurements. This makes its application to non-vectorial data straightforw...
Fabian L. Wauthier, Nebojsa Jojic, Michael I. Jord...
Fast retrieval methods are critical for large-scale and data-driven vision applications. Recent work has explored ways to embed high-dimensional features or complex distance fun...
Comprehensive coverage of the public web is crucial to web search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' o...
A growing amount of structured information is now available as semantic data. It offers a wide range of new possibilities, especially for semantic search and Web data in...
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...