The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...
One may need to build a statistical parser for a new language, using only a very small labeled treebank together with raw text. We argue that bootstrapping a parser is most promis...
Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance...
Erliang Zeng, Chengyong Yang, Tao Li, Giri Narasim...
TalkMiner is a search engine for lecture webcasts. Lecture videos are processed to recover a set of distinct slide images and OCR is used to generate a list of indexable terms fro...
John Adcock, Matthew Cooper, Laurent Denoue, Hamed...
Background: The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data min...