Editing speech data is currently time-consuming and errorprone. Speech editors rely on acoustic waveform representations, which force users to repeatedly sample the underlying spe...
Abstract. Active probing has increasingly been used to collect information about the topological and functional characteristics of the Internet. Given the need for active probing a...
Abstract. This paper presents the final version of the Czech Broadcast Conversation Corpus released at the Linguistic Data Consortium (LDC). The corpus contains 72 recordings of a...
This paper explores the issue of term-weighting in the genre of spontaneous, multi-party spoken dialogues, with the intent of using such term-weights in the creation of extractive ...
Reordering model is important for the statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering...