Automatic speech recognition (ASR) results contain not only ASR errors, but also disfluencies and colloquial expressions that must be corrected to create readable transcripts. We...
Graham Neubig, Yuya Akita, Shinsuke Mori, Tatsuya ...
Browsing through collections of audio recordings of conversation nominally relies on the processing of participants’ lexical productions. The evolving verbal and non-verbal cont...
In this paper we propose the use of intra-frame prediction with lapped transforms for image coding. Both lapped transforms and intra prediction exploit the redundancies of neighbo...
Greenberg reported in a 1998 paper that, in conversational speech, the phone deletion rate may reach 12%, whereas the syllable deletion rate is about 1%. The fi...
Selective attention in the human visual system is the mechanism by which humans focus on the most important parts of a visual scene. Many bottom-up computational mo...