A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depend...
Building a model using machine learning that can classify the sentiment of natural language text often requires an extensive set of labeled training data from the same domain as t...
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented corporations...
We present a new bit-parallel technique for approximate string matching. We build on two previous techniques. The first one, BPM [Myers, J. of the ACM, 1999], searches for a patte...
This paper presents a method for improving phrase-based Statistical Machine Translation systems by enriching the original translation model with information derived from a multilin...