Search results generated by searchable databases are served dynamically and are far larger than the static documents on the Web. These results pages have been referred to as the Deep ...
Yasuhiro Yamada, Nick Craswell, Tetsuya Nakatoh, S...
Existing HTML markup is used only to indicate the structure and layout of documents, but not the document semantics. As a result, web documents are difficult to be semantically p...
Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic featur...
Janara Christensen, Mausam, Stephen Soderland, Ore...
Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relation...
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...