This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre-segmented into a sequence of word atoms using a maximum...
Packet traces of operational Internet traffic are invaluable to network research, but public sharing of such traces is severely limited by the need to first remove all sensitive...
A collection of distributed databases forms an important architectural component of the ATON project for networked incidence management of highway traffic. The database sub-archit...
Mohan M. Trivedi, Shailendra K. Bhonsle, Amarnath ...
Query translation for Cross-Lingual Information Retrieval (CLIR) has attracted increasing research attention. Previous work mainly used machine translation systems, bilin...
Rong Hu, Weizhu Chen, Jian Hu, Yansheng Lu, Zheng ...
Substantial medical data, such as discharge summaries and operative reports, are stored in electronic textual form. Databases containing free-text clinical narrative reports often...