A continuous top-k query retrieves the k most preferred objects in a data stream according to a given preference function. These queries are important for a broad spectrum of appl...
Avani Shastri, Di Yang, Elke A. Rundensteiner, Mat...
Entity matching (EM) is the task of identifying records that refer to the same real-world entity from different data sources. While EM is widely used in data integration and data...
A key component of BM25 contributing to its success is its sub-linear term frequency (TF) normalization formula. The scale and shape of this TF normalization component is controll...
The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks. In the...
The study of information flow analyzes the principles and mechanisms of social information distribution. It is becoming an extremely important research topic in social network re...
Hongliang Fei, Ruoyi Jiang, Yuhao Yang, Bo Luo, Ju...
NoSQL databases focus on analytical processing of large-scale datasets, offering increased scalability over commodity hardware. One of their strongest features is elasticity, whi...
Ioannis Konstantinou, Evangelos Angelou, Christina...
In [She82], it is shown that four basic functional properties of plain Kolmogorov complexity are enough to characterize it, hence obtaining an axiomatic characterization...
Automatic classes are classes of languages for which a finite automaton can decide whether a given element is in a set given by its index. The present work studies the learnabilit...
John Case, Sanjay Jain, Yuh Shin Ong, Pavel Semukh...