Srikanth Madikeri received his Ph.D. in Computer Science and Engineering from the Indian Institute of Technology Madras (India) in 2013. During his Ph.D., he worked on automatic speaker recognition and spoken keyword spotting. He is currently a Research Associate in the Speech Processing group at the Idiap Research Institute (Martigny, Switzerland). His current research interests include automatic speech recognition for low-resource languages, automatic speaker recognition, and speaker diarization.
Automatic Speech Recognition for Low-Resource Languages
This talk focuses on automatic speech recognition (ASR) systems for low-resource languages with applications to information retrieval.
A common approach to improving ASR performance for low-resource languages is to train multilingual acoustic models by pooling resources from multiple languages. In this talk, we present the challenges and benefits of different multilingual modeling approaches with Lattice-Free Maximum Mutual Information (LF-MMI), the state-of-the-art technique for hybrid ASR systems. We also present an incremental semi-supervised learning approach applied to multi-genre speech recognition, a common task in the MATERIAL program. This simple approach helps avoid the rapid saturation of performance improvements when using large amounts of data for semi-supervised learning. Finally, we present Pkwrap, a PyTorch wrapper on Kaldi (among the most popular speech recognition toolkits) that combines the benefits of training acoustic models with PyTorch and Kaldi. The toolkit, now available at https://github.com/idiap/pkwrap, is intended to provide the fast-prototyping benefits of PyTorch while using the necessary functionality from Kaldi (LF-MMI, parallel training, decoding, etc.).
The talk will take place on Monday, March 8, 2021, at 13:00 CET, virtually on Zoom: https://cesnet.zoom.us/j/98589068121.