This paper presents a simple and efficient algorithm for approximate dictionary matching designed for similarity measures such as cosine, Dice, Jaccard, and overlap coefficients. ...
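For reference, the four measures named in this abstract can be written over feature sets as in the minimal sketch below. It assumes strings are represented as sets of padded character trigrams, and it uses a brute-force scan of the dictionary. Neither detail is confirmed by the abstract above, and the scan is a plain baseline, not the paper's algorithm. The names `char_ngrams`, `similarities`, and `approximate_lookup` are hypothetical.

```python
import math


def char_ngrams(s, n=3):
    """Return the set of character n-grams of s, padded with '$' (assumed feature choice)."""
    padded = "$" * (n - 1) + s + "$" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarities(x, y):
    """Compute the four set-based coefficients for two non-empty feature sets."""
    common = len(x & y)
    return {
        "cosine": common / math.sqrt(len(x) * len(y)),
        "dice": 2 * common / (len(x) + len(y)),
        "jaccard": common / len(x | y),
        "overlap": common / min(len(x), len(y)),
    }


def approximate_lookup(query, dictionary, measure="cosine", threshold=0.7):
    """Brute-force baseline: keep entries whose similarity to query meets the threshold."""
    q = char_ngrams(query)
    return [
        word
        for word in dictionary
        if similarities(q, char_ngrams(word))[measure] >= threshold
    ]
```

The scan compares the query against every entry, so its cost grows linearly with dictionary size. That cost is what an efficient approximate matching algorithm sets out to avoid.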
There often exist multiple corpora for the same natural language processing (NLP) tasks. However, such corpora are generally used independently due to distinctions in annotation s...
Because English is a low morphology language, current statistical parsers tend to ignore morphology and accept some level of redundancy. This paper investigates how costly such re...
Matthew Honnibal, Jonathan K. Kummerfeld, James R....
We present a theoretical and empirical comparative analysis of the two dominant categories of approaches in Chinese word segmentation: word-based models and character-based models...
Lexicalized Well-Founded Grammar (LWFG) is a recently developed syntactic-semantic grammar formalism for deep language understanding, which balances expressiveness with provable le...