We study the problem of active learning with convex loss functions. We prove that even under bounded noise constraints, the minimax rates for proper active learning are often no b...
ABSTRACT. In the framework of the LegDoc project at Xerox Research Centre Europe, we are developing components for the semantic annotation of semi-structured documents. While certa...
We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
We report about a project which brings together Natural Language Processing and eLearning. One of the functionalities developed within this project is the possibility to annotate ...
Abstract. Machine learning approaches in natural language processing often require a large annotated corpus. We present a complementary approach that utilizes expert knowledge to o...