Abstract. Organizations are increasingly using the eXtensible Markup Language (XML) for document representation and exchange on the Web. To protect an XML document from unautho...
Knowledge Sifter is a scalable agent-based system that supports access to heterogeneous information sources such as the Web, open-source repositories, XML databases and the emergi...
Larry Kerschberg, Mizan Chowdhury, Alberto Damiano...
Preparing a knowledge bank is a very difficult task. In this paper, we discuss knowledge extraction from the manually examined Sinica Treebank. Categorical information, wor...
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...
In this paper, a series of window-based methods is proposed for information retrieval. Compared with the traditional tf-idf model, our approaches are based on two new key notions. The ...