Given two strings, a pattern P of length m and a text T of length n over some alphabet , we consider the string matching problem under k mismatches. The well
Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...
Abstract. Stochastic finite automata are useful for identifying substrings (chunks) within larger units of text. Relevant applications include tokenization, base-NP chunking, name...