We present a joint model for Chinese word segmentation and new word detection. We present new high-dimensional features, including word-based features and enriched edge (label-tra...
We consider the task of summarizing a cluster of related sentences with a short sentence, which we call multi-sentence compression, and present a simple approach based on shortest p...
The Block Sorting process of Burrows and Wheeler can be applied to any sequence in which symbols are (or might be) conditioned upon each other. In particular, it is possible to pa...
R. Yugo Kartono Isal, Alistair Moffat, A. C. H. Ng...
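As a rough illustration of the claim that block sorting applies to any sequence of symbols, the following minimal sketch runs a naive Burrows-Wheeler transform over an arbitrary Python sequence, such as characters or word tokens. The function names `bwt` and `inverse_bwt` are hypothetical, the rotation sort is the simple quadratic textbook form, and nothing here is taken from the paper's own implementation.

```python
def bwt(seq):
    """Naive Burrows-Wheeler transform of any sequence of comparable symbols.

    Returns (last_column, primary), where primary is the row of the
    original sequence among the sorted rotations.
    Illustrative only: sorting full rotations is quadratic or worse.
    """
    seq = list(seq)
    n = len(seq)
    # Sort rotation start positions by the rotation each one produces.
    order = sorted(range(n), key=lambda i: seq[i:] + seq[:i])
    # The last symbol of the rotation starting at i is seq[i - 1].
    last = [seq[(i - 1) % n] for i in order]
    return last, order.index(0)


def inverse_bwt(last, primary):
    """Invert bwt() using a stable sort of the last column."""
    n = len(last)
    # A stable sort pairs the k-th occurrence of a symbol in the first
    # column with its k-th occurrence in the last column.
    nxt = sorted(range(n), key=lambda i: last[i])
    out = []
    row = primary
    for _ in range(n):
        row = nxt[row]
        out.append(last[row])
    return out


if __name__ == "__main__":
    # Character symbols.
    last, primary = bwt("banana")
    print("".join(last), primary)

    # Word symbols: the same code, with tokens instead of characters.
    words = "to be or not to be".split()
    last_w, primary_w = bwt(words)
    print(last_w, primary_w)
    print(inverse_bwt(last_w, primary_w) == words)
```

Because the transform only requires that symbols can be compared and sorted, the same two functions work unchanged on characters and on word tokens.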
The paper presents a novel method for compressing large database workloads for the purpose of autonomic, continuous index selection. The compressed workload contains a small subset of ...
Some text compression methods benefit from using compression units more complex than characters. The synchronization between coder and decoder can then be done by transferri...