In the literature, Tree Adjoining Grammars (TAGs) are promoted as adequate for natural language description, for analysis as well as generation. In this paper we concentrate on...
The longest-common-prefix (LCP) array is an adjunct to the suffix array that allows many string processing problems to be solved in optimal time and space. Its construction is a bo...
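
To make the relationship between the suffix array and the LCP array concrete, here is a minimal sketch. It is not the construction method of the paper summarized above. It assumes a naive sort-based suffix array and then fills in the LCP values with the well-known linear-time scan of Kasai et al. The function names `suffix_array` and `lcp_array` are illustrative only.

```python
def suffix_array(s):
    # Naive construction by sorting suffix start positions; acceptable for
    # short illustrative inputs, not for large texts.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # lcp[k] = length of the longest common prefix of the suffixes starting
    # at sa[k-1] and sa[k]; lcp[0] is defined as 0 here.
    n = len(s)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos  # inverse permutation of the suffix array

    lcp = [0] * n
    h = 0  # number of characters already known to match
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


sa = suffix_array("banana")
print(sa)                       # [5, 3, 1, 0, 4, 2]
print(lcp_array("banana", sa))  # [0, 1, 3, 0, 0, 2]
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so adjacent suffixes share 1, 3, 0, 0 and 2 leading characters respectively.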
We investigate the problem of evaluating the performance of text processing algorithms on inputs that contain errors as a result of optical character recognition. A new hierarchic...
We propose a new unsupervised training method for acquiring probability models that accurately segment Chinese character sequences into words. By constructing a core lexi...
There exists a positive constant α < 1 such that for any function T(n) ≤ n^α and for any problem L ∈ BPTIME(T(n)), there exists a deterministic algorithm running in poly(T(n)) time w...