Sciweavers

TASLP
2011
14 years 5 months ago
A Probabilistic Interaction Model for Multipitch Tracking With Factorial Hidden Markov Models
We present a simple and efficient feature modeling approach for tracking the pitch of two simultaneously active speakers. We model the spectrogram features of single speakers u...
Michael Wohlmayr, Michael Stark, Franz Pernkopf
TASLP
2011
14 years 5 months ago
Robust Voice Activity Detection Using Long-Term Signal Variability
Prasanta Kumar Ghosh, Andreas Tsiartas, Shrikanth ...
TASLP
2011
14 years 5 months ago
Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment
This paper presents a blind source separation method for convolutive mixtures of speech/audio sources. The method can even be applied to an underdetermined case where there are ...
Hiroshi Sawada, Shoko Araki, Shoji Makino
TASLP
2011
14 years 5 months ago
Predicting Preference Judgments of Individual Normal and Hearing-Impaired Listeners With Gaussian Processes
A probabilistic kernel approach to pairwise preference learning based on Gaussian processes is applied to predict preference judgments for sound quality degradation mech...
Perry Groot, Tom Heskes, Tjeerd Dijkstra, James M....
TASLP
2011
14 years 5 months ago
Estimating Dominance in Multi-Party Meetings Using Speaker Diarization
With the increase in cheap commercially available sensors, recording meetings is becoming an increasingly practical option. With this trend comes the need to summarize the recor...
Hayley Hung, Yan Huang, Gerald Friedland, Daniel G...
TASLP
2011
14 years 5 months ago
On the Information Geometry of Audio Streams With Applications to Similarity Computing
This paper proposes methods for information processing of audio streams using methods of information geometry. We lay the theoretical groundwork for a framework allowing...
Arshia Cont, Shlomo Dubnov, Gérard Assayag
TASLP
2011
14 years 5 months ago
Speaker Diarization Based on Intensity Channel Contribution
The time delay of arrival (TDOA) between multiple microphones has been used since 2006 as a source of information (localization) to complement the spectral features for speaker di...
Roberto Barra-Chicote, José Manuel Pardo, J...
TASLP
2011
14 years 5 months ago
Advances in Missing Feature Techniques for Robust Large-Vocabulary Continuous Speech Recognition
Missing feature theory (MFT) has demonstrated great potential for improving the noise robustness in speech recognition. MFT was mostly applied in the log-spectral domain since ...
Maarten Van Segbroeck, Hugo Van Hamme
TASLP
2011
14 years 5 months ago
Time-Domain Blind Separation of Audio Sources on the Basis of a Complete ICA Decomposition of an Observation Space
Time-domain algorithms for blind separation of audio sources can be classified as being based either on a partial or complete decomposition of an observation space. The decompo...
Zbynek Koldovský, Petr Tichavský
TASLP
2011
14 years 5 months ago
Reasons why Current Speech-Enhancement Algorithms do not Improve Speech Intelligibility and Suggested Solutions
Existing speech enhancement algorithms can improve speech quality but not speech intelligibility, and the reasons for that are unclear. In the present paper, we present a theore...
Philipos C. Loizou, Gibak Kim