Distinguishing speculative statements from factual ones is important for most biomedical text mining applications. We introduce an approach which is based on solving two sub-probl...
We investigate the problem of automatically labelling
faces of characters in TV or movie material with their
names, using only weak supervision from automaticallyaligned
subtitl...
Low overhead analysis of large distributed data sets is necessary for current data centers and for future sensor networks. In such systems, each node holds some data value, e.g., ...
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...
Background: Document classification is a wide-spread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...