Abstract. Even a relatively unstructured captioned image set depicting a variety of objects in cluttered scenes contains strong correlations between caption words and repeated visu...
Automatic summarization systems usually are trained and evaluated in a particular domain with fixed data sets. When such a system is to be applied to slightly different input, la...
We describe two corpora of question and answer pairs collected for complex, open-domain Question Answering (QA) to enable answer classification and re-ranking experiments. We deli...
We describe the process of converting plain text cultural heritage data to elements of a domain-specific knowledge base, using general machine learning techniques. First, digitise...
This paper presents the evaluation of the dictionary look-up component of Mayo Clinic's Information Extraction system. The component was tested on a corpus of 160 free-text c...
Karin Schuler, Vinod Kaggal, James J. Masanz, Phil...