PageRank is known to be an efficient metric for computing general document importance on the Web. While commonly used as a one-size-fits-all measure, the ability to produce topica...
The already huge number of blogs in existence is increasing rapidly, and many users are struggling to find a way to keep up with the expansion. A number of existing tools aim to c...
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
The increasing complexity of enterprise databases and the prevalent lack of documentation incur significant cost in both understanding and integrating the databases. Existing solu...
Searching an organization’s document repositories for experts provides a cost-effective solution for the task of expert finding. We present two general strategies to expert se...