One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to deter...
It is a truism that for a machine to have useful access to memory or workspace, it must "know" where its input ends and its working memory begins. Most machine models ...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF-8 to UTF-...
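As a minimal illustration of the SWAR idea only, and not of the parallel bit stream algorithms the abstract refers to, the sketch below tests eight bytes at once for non-ASCII content by treating them as one 64-bit word. The names `HIGH_BITS`, `has_non_ascii`, and `is_ascii` are hypothetical, and Python integers stand in for a machine register here.

```python
# Mask selecting the top bit of each of the 8 byte lanes in a 64-bit word.
HIGH_BITS = 0x8080808080808080


def has_non_ascii(chunk: bytes) -> bool:
    """Test 8 bytes with one AND: any byte >= 0x80 has its top bit set."""
    if len(chunk) != 8:
        raise ValueError("chunk must be exactly 8 bytes")
    word = int.from_bytes(chunk, "little")
    return (word & HIGH_BITS) != 0


def is_ascii(data: bytes) -> bool:
    """Scan the input one 8-byte word at a time instead of byte by byte."""
    # Pad the tail with NUL bytes (which are ASCII) so every chunk is a full word.
    padded = data + b"\x00" * (-len(data) % 8)
    return not any(
        has_non_ascii(padded[i:i + 8]) for i in range(0, len(padded), 8)
    )
```

On real hardware the same mask-and-test would run on a native register, which is where the speedup over per-byte checks would come from; the Python version only shows the logic.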
This paper presents an empirical study for improving the performance of text chunking. We focus on two issues: the problem of selecting feature spaces, and the problem of alleviat...
RDF uses the RFC3066 standard for language tags for literals in natural languages. The revision RFC3066bis includes productive use of language, country and script codes. These form...