Named Entity Recognition (NER) is an important subtask of document processing tasks such as Information Extraction. This paper describes an NER algorithm that uses a Multi-Layer Percept...
The purpose of this paper is to build a digital archive database that lets Taiwanese people learn about Taiwanese culture through the internet and an e-learning environment. This study will be ...
The Cross-Language Information Retrieval community has produced search engines over multilingual corpora and multilingual text categorization systems. In this paper, we focus on th...
A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segm...
Elizabeth Shriberg, Andreas Stolcke, Dilek Z. Hakk...
Most common feature selection techniques for document categorization are supervised and require large amounts of training data in order to accurately capture the descriptive and d...