Abstract A rich family of generic Information Extraction (IE) techniques have been developed by researchers nowadays. This paper proposes WebKER, a system for automatically extract...
Data mining allows the exploration of sequences of phenomena, whereas one usually tends to focus on isolated phenomena or on the relation between two phenomena. It offers invaluab...
The increasing complexity of enterprise databases and the prevalent lack of documentation incur significant cost in both understanding and integrating the databases. Existing solu...
In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for util...
Avrim Blum, John D. Lafferty, Mugizi Robert Rweban...
Minkowski-sum cost model indicates that balanced data partitioning is not beneficial for high dimensional data. Thus we study several unbalanced partitioning methods and propose ...