Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics,...
One may need to build a statistical parser for a new language, using only a very small labeled treebank together with raw text. We argue that bootstrapping a parser is most promis...
One challenge in text processing is the treatment of case insensitive documents such as speech recognition results. The traditional approach is to re-train a language model exclud...
Cheng Niu, Wei Li 0003, Jihong Ding, Rohini K. Sri...