— This paper studies the analysis on the Cyber Clean Center (CCC) Data Set 2009, consisting of raw packets captured more than 90 independent honeypots, in order for detecting beh...
There are many advantages to be gained by storing the lexicon of a full text database in main memory. In this paper we describe how to use a compressed inverted file index to sear...
This paper considers enumeration of substring equivalence classes introduced by Blumer et al. [1]. They used the equivalence classes to define an index structure called compact dir...
We describe a framework for representing email as well as meeting information as a joint graph. In the graph, documents and meeting descriptions are connected via other nontextual...
Many algorithms for grammatical inference can be viewed as instances of a more general algorithm which maintains a set of primitive elements, which distributionally define sets of ...