Abstract. Latent Semantic Indexing(LSI) has been proved to be effective to capture the semantic structure of document collections. It is widely used in content-based text retrieval...
Recent progress in information extraction has shown how to automatically build large ontologies from high-quality sources like Wikipedia. But knowledge evolves over time; facts ha...
Yafang Wang, Mingjie Zhu, Lizhen Qu, Marc Spaniol,...
In contextual advertising, estimating the number of impressions of an ad is critical in planning and budgeting advertising campaigns. However, producing this forecast, even within...
Xuerui Wang, Andrei Z. Broder, Marcus Fontoura, Va...
Searching approximate nearest neighbors in large scale high dimensional data set has been a challenging problem. This paper presents a novel and fast algorithm for learning binary...
: A fully operational large scale digital library is likely to be based on a distributed architecture and because of this it is likely that a number of independent search engines m...