We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
String-to-string transduction is a central problem in computational linguistics and natural language processing. It occurs in tasks as diverse as name transliteration, spelling co...
We give a universal kernel that renders all the regular languages linearly separable. We are not able to compute this kernel efficiently and conjecture that it is intractable, but...
Abstract. Effective fractal dimension was defined by Lutz (2003) in order to quantitatively analyze the structure of complexity classes. Interesting connections of effective dim...
In this paper we present a noun phrase coreference resolution system which aims to enhance the identification of the coreference realized by string matching. For this purpose, we ...
Xiaofeng Yang, Guodong Zhou, Jian Su, Chew Lim Tan