Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are often closel...
Clustering the results of a search helps the user get an overview of the information returned. In this paper, we regard the clustering task as indexing the search results. Here, an index...
UML sequence diagrams are commonly used to represent object interactions in software systems. This work considers the problem of extracting UML sequence diagrams from existing cod...
In many natural language processing tasks, the classes to be dealt with are often heavily imbalanced in the underlying data set, and classifiers trained on such skewed data t...
We describe active measurements of topology and end-to-end latency characteristics between several of the DNS root servers and a subset of their clients using the skitter tool dev...
Marina Fomenkov, Kimberly C. Claffy, Bradley Huffa...