In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
We present an algorithm, Nomen, for learning generalized names in text. Examples of these are names of diseases and infectious agents, such as bacteria and viruses. These names ex...
The purpose of extractive document summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a tar...
Shih-Hsiang Lin, Yi-Ting Chen, Hsin-Min Wang, Bin ...
An unsupervised method for word sense disambiguation using a bilingual comparable corpus was developed. First, it extracts statistically significant pairs of related words from th...
Structural imaging investigations commonly apply a segmentation step followed by the extraction of feature data that can be used to compare or discriminate groups. We pre...