Discriminant binary data representation for speaker recognition

In the supervector UBM/GMM paradigm, each acoustic file is represented by the mean parameters of a GMM. The resulting supervector space serves as the data representation space and has a high dimensionality. Moreover, this space is not intrinsically discriminant, and a complete speech segment is represented by a single vector, which largely removes the possibility of taking temporal or sequential information into account. This work proposes a new approach in which each acoustic frame is represented in a discriminant binary space. The proposed approach relies on a UBM to structure the acoustic space into regions. Each region is then populated with a set of Gaussian models, denoted "specificities", that emphasize speaker-specific information. Each acoustic frame is mapped into the discriminant binary space by turning each specificity "on" or "off", which yields a large binary vector. All the following steps, speaker reference extraction, likelihood estimation or decision take place in...
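The frame-to-binary mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not state the activation rule, so everything specific here is an assumption made for illustration: diagonal-covariance Gaussians, selection of the single most likely UBM component as the frame's region, and switching "on" the `top_k` most likely specificities of that region. The function names `log_gauss_diag` and `binarize_frame` and the parameter `top_k` are hypothetical and do not come from the paper.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log density of frame x under each diagonal-covariance Gaussian.

    x: (dim,), means and variances: (n_gauss, dim). Returns (n_gauss,).
    """
    dim = x.shape[0]
    diff = x - means
    return -0.5 * (
        dim * np.log(2.0 * np.pi)
        + np.sum(np.log(variances), axis=1)
        + np.sum(diff * diff / variances, axis=1)
    )


def binarize_frame(x, ubm_means, ubm_vars, spec_means, spec_vars, top_k=1):
    """Map one acoustic frame to a binary vector (illustrative rule only).

    ubm_means, ubm_vars: (n_regions, dim), one UBM component per region.
    spec_means, spec_vars: (n_regions, n_spec, dim), the specificities.
    Returns a uint8 vector of length n_regions * n_spec.
    """
    n_regions, n_spec, _ = spec_means.shape
    bits = np.zeros(n_regions * n_spec, dtype=np.uint8)

    # Assumption: the region is the most likely UBM component for this frame.
    region = int(np.argmax(log_gauss_diag(x, ubm_means, ubm_vars)))

    # Assumption: turn "on" the top_k most likely specificities of that region.
    scores = log_gauss_diag(x, spec_means[region], spec_vars[region])
    on = np.argsort(scores)[-top_k:]
    bits[region * n_spec + on] = 1
    return bits
```

With this rule, every frame yields a sparse vector with exactly `top_k` active bits, so a speech segment becomes a sequence of binary vectors rather than a single supervector, which is the property the abstract highlights.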
Added 21 Aug 2011
Updated 21 Aug 2011
Type Journal
Year 2011
Authors Jean-François Bonastre, Pierre-Michel Bousquet, Driss Matrouf, Xavier Anguera Miró