Almost all Chinese language processing tasks involve word segmentation of the language input as their first steps, thus robust and reliable segmentation techniques are always requ...
In this paper we introduce a machine learning approach for automatic text segmentation. Our text segmenter clusters text-segments containing similar concepts. It first discovers th...
—Handwritten text line segmentation on real-world data presents significant challenges that cannot be overcome by any single technique. Given the diversity of approaches and the...
With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Larger texts can then be replaced by important text exc...
Gerard Salton, Amit Singhal, Chris Buckley, Mandar...
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...