Given a sequence S of n symbols over some alphabet Σ, we develop a new compression method that (i) is very simple to implement; (ii) provides O(1) time random access to any symbol...
There are many emerging database applications that require accurate selectivity estimation of approximate string matching queries. Edit distance is one of the most commonly used s...
Today, in many practical E-Commerce systems, the stored data are usually short strings, such as names, addresses, or other information. Searching data within these short stri...
Large graph datasets are common in many emerging database applications, and most notably in large-scale scientific applications. To fully exploit the wealth of informati...
Symbolic Indirect Correlation (SIC) is a new classification method for unsegmented patterns. SIC requires two levels of comparisons. First, the feature sequences from an unknown q...
George Nagy, Ashutosh Joshi, Mukkai S. Krishnamoor...