The paper presents Bulgarian National Corpus project (BulNC) - a large-scale, representative, online available corpus of Bulgarian. The BulNC is also a monolingual general corpus,...
ImageNet is a large-scale database of object classes with millions of images. Unfortunately only a small fraction of them is manually annotated with bounding-boxes. This prevents ...
Abstract. The purpose of information extraction (IE) is to find desired pieces of information in natural language texts and store them in a form that is suitable for automatic pro...
Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is...
We derive two variants of a semi-supervised model for fine-grained sentiment analysis. Both models leverage abundant natural supervision in the form of review ratings, as well as...