department of informatics


FASLAV: Towards Fully Automated Spoken Language Acquisition, Understanding and Speaker Verification by Machines

Summary :

Current state-of-the-art speaker verification systems are limited to frame-based spectral features that are modeled globally with Gaussian Mixture Models (GMM). Such systems do not take the linguistic structure of the speech signal into account, and all sounds are represented by a single model. In recent years, research on text-independent speaker verification has expanded from using only the acoustic content of speech to exploiting high-level information such as linguistic content, pronunciation and idiolectal word usage. Work on exploiting these high-level information sources has provided strong evidence that gains in speaker recognition accuracy are possible. These promising techniques, however, rely on manually transcribed databases, which are error-prone and expensive to create. These databases also need to be updated with new data sets to match potentially new conditions (channel, microphones, context of use, ...) of the verification data. Data-driven segmentation techniques offer a potential solution to these problems: they do not require transcribed data and can easily be applied to development data, minimizing the mismatch. In this work, our objective is to show that automatic segmentation based on Automatic Language Independent Speech Processing (ALISP) tools can be used instead of phonetic segmentation for the speaker verification task, in order to extract complementary information.


Period : The project started in January 2003 and is planned to last 4 years.


Funding : The project is funded by the Swiss National Science Foundation (SNF) and by BioSecure - Biometrics for Secure Authentication, Network of Excellence.


Participants :

  • asmaa.elhannani AT
  • dijana.petrovska AT (thesis supervisor)
  • rolf.ingold AT

Partners : Members of the EU BioSecure project

Publications related to this project
  • A. El-Hannani, D. T. Toledano, D. Petrovska-Delacrétaz, A. Montero-Asenjo and J. Hennebert, “Using Data-driven and Phonetic Units for Speaker Verification,” In proc. of ODYSSEY06, The Speaker and Language Recognition Workshop, 2006.

  • A. El Hannani and D. Petrovska-Delacrétaz, “Improving Speaker Verification System using ALISP-based Specific GMMs,” To be published in proc. of Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA), July 20-22 2005.

  • A. El Hannani and D. Petrovska-Delacrétaz, “Exploiting High-Level Information Provided by ALISP in Speaker Recognition,” In proc. of NOLISP, Non Linear Speech Processing Workshop, April 19-22 2005.

  • A. El Hannani and D. Petrovska-Delacrétaz, “Segmental Score Fusion for ALISP-based GMM Text-Independent Speaker Verification,” In the book Advances in Nonlinear Speech Processing and Applications, edited by G. Chollet, A. Esposito, M. Faundez-Zanuy, M. Marinaro, pp. 385–394, 2004.

  • A. El Hannani, D. Petrovska-Delacrétaz, G. Chollet; "Linear and non-linear fusion of ALISP-based and GMM systems for text-independent speaker verification", In ODYSSEY, The Speaker and Language Recognition Workshop, Toledo, Spain, May 31 - June 3, 2004.

  • A. El Hannani, D. Petrovska-Delacrétaz, R. Blouet, G. Chollet; “Segmental Score Fusion for Text-independent Speaker Verification”; Proc. of the Multimodal User Authentication Workshop, Santa Barbara, USA, December 2003.

  • D. Petrovska-Delacrétaz, M. Abalo, A. El Hannani, G. Chollet. "Data-driven Speech Segmentation for Speaker Verification and Language Identification", In proc. of Non Linear Speech Processing (NOLISP 03), Le Croisic, France, May 20-23, 2003.

  • D. Petrovska-Delacrétaz, A. El Hannani, G. Chollet. "Searching Through a Speech Memory for Text-independent Speaker Verification", 4th International Conference on Audio and Video-Based Biometric Person Authentication (AVBPA), University of Surrey, Guildford, UK, June 9-11, 2003.