Although the Web lets users freely browse and publish information, most Web information, unlike that of conventional mass media, is unauthorized. Therefore, it is not always credibl...
The paper describes the IBM systems submitted to the NIST Rich Transcription 2007 (RT07) evaluation campaign for the speech-to-text (STT) and speaker-attributed speech-to-text (SAST...
We view scientific workflows as the domain scientist's way to harness cyberinfrastructure for e-Science. Domain scientists are often interested in "end-to-end" fram...
Classification of texts potentially containing a complex and specific terminology requires the use of learning methods that do not rely on extensive feature engineering. In this w...
Background: Hidden Markov Models (HMMs) provide an excellent means for structure identification and feature extraction on stochastic sequential data. An HMM-with-Duration (HMMwD) ...