As part of a large effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly sup...
Vast amounts of text on the Web are unstructured and ungrammatical, such as classified ads, auction listings, forum postings, etc. We call such text “posts.” Despite their in...
Nowadays, images have become widely available on the World Wide Web (WWW). It’s essential to develop effective ways for managing and retrieving such abundant images. Advantageou...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio...
Gianluca Pollastri, Alberto J. M. Martin, Catherin...