In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
We investigate the automatic detection of sentences containing linguistic hedges using corpus statistics and syntactic patterns. We take Wikipedia as an already annotated corpus u...
There is an ever-growing need to add structure in the form of semantic markup to the huge amounts of unstructured text data now available. We present the technique of shallow seman...
Sameer Pradhan, Kadri Hacioglu, Wayne Ward, James ...
Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic featur...
Janara Christensen, Mausam, Stephen Soderland, Ore...
This paper compares a deep and a shallow processing approach to the problem of classifying a sentence as grammatically well-formed or ill-formed. The deep processing approach uses ...
Joachim Wagner, Jennifer Foster, Josef van Genabit...