Sciweavers

54 search results - page 1 / 11
» Long-tail Vocabulary Dictionary Extraction from the Web
KDD
2008
ACM
Information extraction from Wikipedia: moving down the long tail
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Fei Wu, Raphael Hoffmann, Daniel S. Weld
ICDAR
2003
IEEE
Lexical Postcorrection of OCR-Results: The Web as a Dynamic Secondary Dictionary?
Postcorrection of OCR-results for text documents is usually based on electronic dictionaries. When scanning texts from a specific thematic area, conventional dictionaries often m...
Christian M. Strohmaier, Christoph Ringlstetter, K...
TVCG
2008
Vispedia: Interactive Visual Exploration of Wikipedia Data via Search-Based Integration
Wikipedia is an example of the collaborative, semi-structured data sets emerging on the Web. These data sets have large, nonuniform schema that require costly data integra...
Bryan Chan, Leslie Wu, Justin Talbot, Mike Cammara...
WWW
2008
ACM
WWW 2008 workshop: NLPIX2008 summary
The amount of information available on the Web has increased rapidly, reaching levels that few would ever have imagined possible. We live in what could be called the "informa...
Hiroshi Nakagawa, Kentaro Torisawa, Masaru Kitsure...
INTERSPEECH
2010
Wiktionary as a source for automatic pronunciation extraction
In this paper, we analyze whether dictionaries from the World Wide Web that contain phonetic notations may support the rapid creation of pronunciation dictionaries within the sp...
Tim Schlippe, Sebastian Ochs, Tanja Schultz