Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation a...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
—Content-based video indexing and retrieval has become an interesting research area with the tremendous growth in the amount of digital media. In addition to the audio-visual con...
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from larges...
This paper describes the framework of the StatCan Daily Translation Extraction System (SDTES), a computer system that maps and compares webbased translation texts of Statistics Can...