In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Creating labeled training data for relation extraction is expensive. In this paper, we study relation extraction in a special weakly-supervised setting when we have only a few see...
“The curse of dimensionality” is pertinent to many learning algorithms, and it denotes the drastic raise of computational complexity and classification error in high dimension...
Mykola Pechenizkiy, Seppo Puuronen, Alexey Tsymbal
Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web’s natural language text. Knowledge-based weak supervision, using structured...
Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke S....
In this paper, we investigate how modeling content structure can benefit text analysis applications such as extractive summarization and sentiment analysis. This follows the lingu...