We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
Form documents or screen forms carry essential information about the data manipulated by an organization. They can be considered as different but often overlapping views of its whole...
Jan Hidders, Jan Paredaens, Philippe Thiran, Geert...
Human visual capability has remained largely beyond the reach of engineered systems despite intensive study and considerable progress in problem understanding, algorithms and comp...
Abiteboul et al. initiated the systematic study of distributed XML documents consisting of several logical parts, possibly located on different machines. The physical distributio...
The bag-of-words (BoW) model treats images as an unordered set of local regions and represents them by visual word histograms. Implicitly, regions are assumed to be identically an...
Ramazan Gokberk Cinbis, Jakob J. Verbeek, Cordelia...