The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc.) is called tagging and is a key step in parsing and many other language processi...
Chinese part-of-speech (POS) tagging assigns one POS tag to each word in a Chinese sentence. However, since words are not demarcated in a Chinese sentence, Chinese POS tagging req...
Overlap in markup occurs where some markup structures do not nest, such as where the sentence and phrase boundaries of a poem and the metrical line structure describe different hi...
We show how web mark-up can be used to improve unsupervised dependency parsing. Starting from raw bracketings of four common HTML tags (anchors, bold, italics and underlines), we ...
Valentin I. Spitkovsky, Daniel Jurafsky, Hiyan Als...
Our system for the Novelty Track at TREC 2004 looks beyond sentence boundaries as well as within sentences to identify novel, nonduplicative passages. It tries to identify text sp...