Machine Learning for Speech Processing

Speaker Verification


A Syllable Lattice Approach to Speaker Verification


Minho Jin, Frank K. Soong and Chang D. Yoo


This paper proposes a syllable-lattice-based speaker verification algorithm for Mandarin Chinese input. For each speech utterance, a syllable lattice is generated by a speaker-independent large-vocabulary continuous speech recognition system running in free syllable decoding mode. The verification decision is based on the likelihood ratio between a target-speaker model and a speaker-independent background model, computed on the decoded syllable lattice. The likelihood is calculated efficiently with a forward algorithm that considers all paths in the lattice. The proposed algorithm was evaluated on a Mandarin Chinese database in which 1832 true and 26 250 impostor trials were recorded by 19 target speakers and 180 impostors. Each trial averages 2 s of speech, excluding silence. The target-speaker model was adapted from the speaker-independent background model using two minutes of enrollment data, including silence. The proposed algorithm achieved an equal-error rate of 0.857%, which is better than the 1.21% of the hidden Markov model-based speaker verification algorithm that does not use syllable lattices. The equal-error rate was further reduced to 0.617% by combining the proposed algorithm with the Gaussian mixture model–universal background model algorithm with 2048 Gaussian kernels, whose equal-error rate on its own is 0.990%.
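
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the lattice is a directed acyclic graph whose nodes are numbered in topological order, and that each arc already carries a log-score under a given model; how the paper derives those arc scores is not stated on this page. All names (`lattice_log_likelihood`, `lattice_llr`, the arc tuples) are hypothetical. The forward pass sums over every path from the start node to the end node in the log domain, and the verification score is the difference between the target-speaker and background results.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def lattice_log_likelihood(num_nodes, arcs):
    """Forward algorithm over a lattice.

    arcs: list of (src, dst, log_score) tuples. Nodes are assumed to be
    numbered topologically (src < dst); node 0 is the start and node
    num_nodes - 1 is the end. Returns the log of the sum, over all
    start-to-end paths, of the product of arc scores along the path.
    """
    incoming = defaultdict(list)
    for src, dst, log_score in arcs:
        if not 0 <= src < dst < num_nodes:
            raise ValueError("arcs must satisfy 0 <= src < dst < num_nodes")
        incoming[dst].append((src, log_score))

    # alpha[n] holds the log of the total score of all partial paths ending at n.
    alpha = [-math.inf] * num_nodes
    alpha[0] = 0.0
    for node in range(1, num_nodes):
        terms = [alpha[src] + log_score for src, log_score in incoming[node]]
        if terms:
            alpha[node] = logsumexp(terms)
    return alpha[-1]


def lattice_llr(num_nodes, target_arcs, background_arcs):
    """Log-likelihood ratio between target and background scores on one lattice."""
    return (lattice_log_likelihood(num_nodes, target_arcs)
            - lattice_log_likelihood(num_nodes, background_arcs))


# Toy lattice with two competing paths: 0 -> 1 -> 3 and 0 -> 2 -> 3.
# The arc scores below are made-up values for illustration only.
target = [(0, 1, math.log(0.5)), (0, 2, math.log(0.5)),
          (1, 3, math.log(0.2)), (2, 3, math.log(0.4))]
background = [(0, 1, math.log(0.5)), (0, 2, math.log(0.5)),
              (1, 3, math.log(0.1)), (2, 3, math.log(0.1))]
score = lattice_llr(4, target, background)
```

In this sketch a trial would be accepted when `score` exceeds a threshold chosen on held-out data. Each arc is visited once, so the cost grows with the number of arcs in the lattice and not with the number of paths.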


Related Papers

  1. "A Syllable Lattice Approach to Speaker Verification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2476-2484, Nov. 2007 (with Minho Jin and Frank K. Soong).
  2. "Syllable Lattice Based Re-Scoring For Speaker Verification," ICASSP 2006, Toulouse, France, May 14-19, 2006 (with Minho Jin and Frank K. Soong).