This paper studies the ensemble selection problem for unsupervised learning. Given a large library of different clustering solutions, our goal is to select a subset of solutions t...
When non-unique values are used as the identifier of entities, due to their homonym, confusion can occur. In particular, when (part of) “names” of entities are used as their ...
Decision lists (or ordered rule sets) have two attractive properties compared to unordered rule sets: they require a simpler classification procedure and they allow for a more co...
On large datasets, the popular training approach has been stochastic gradient descent (SGD). This paper proposes a modification of SGD, called averaged SGD with feedback (ASF), tha...
We have developed Environmental Scenario Search Engine (ESSE) for parallel data mining of a set of conditions inside distributed, very large databases from multiple environmental ...
Mikhail N. Zhizhin, Eric A. Kihn, Vassily Lyutsare...