We propose a new machine learning paradigm called Graph Transformer Networks that extends the applicability of gradient-based learning algorithms to systems composed of modules th...
Erasure of information incurs an increase in entropy and dissipates heat. Therefore, information-preserving computation is essential for constructing computers that use e...
Greedy machine learning algorithms suffer from shortsightedness, potentially returning suboptimal models due to limited exploration of the search space. Greedy search mis...
In this chapter, we describe a view of statistical learning in the inductive logic programming setting based on kernel methods. The relational representation of data and background...
We propose a novel variant of the conjugate gradient algorithm, Kernel Conjugate Gradient (KCG), designed to speed up learning for kernel machines with differentiable loss functio...