Upcoming Talks

S. Umesh

S. Umesh is a professor in the Department of Electrical Engineering at the Indian Institute of Technology – Madras. His research interests are mainly in automatic speech recognition, particularly low-resource modelling and speaker normalization and adaptation. He has also been a visiting researcher at AT&T Laboratories, Cambridge University and RWTH Aachen under the Humboldt Fellowship. He is currently leading a consortium of 12 Indian institutions to develop speech-based systems in the agricultural domain. His talk takes place on Tuesday, June 27, 2017 at 13:00 in room E105.

Acoustic Modelling of Low-Resource Indian Languages

In this talk, I will present recent efforts in India to build speech-based systems in the agriculture domain that give about 600 million farmers easy access to information. These systems are being developed by a consortium of 12 Indian institutions, initially in 12 languages, and will then be expanded to another 12 languages. Since the systems are used in extremely noisy environments such as fields, the emphasis is on high accuracy, achieved by using directed queries which elicit short phrase-like responses. Within this framework, we explored cross-lingual and multilingual acoustic modelling techniques using subspace-GMM and phone-CAT approaches. We also extended the use of phone-CAT to phone mapping and articulatory feature extraction, the outputs of which were then fed to a DNN-based acoustic model. Further, we explored the joint estimation of the acoustic model (DNN) and the articulatory feature extractors. These approaches significantly improved recognition performance compared with systems built using data from only one language. Finally, since the speech consisted mostly of short and noisy utterances, conventional adaptation and speaker-normalization approaches could not easily be used. We investigated the use of a neural network to map filter-bank features to fMLLR/VTLN features, so that the normalization can be done at the frame level without a first-pass decode and without needing long utterances to estimate the transforms. Alternatively, we used a teacher-student framework in which the teacher, trained on normalized features, provides “soft targets” to the student network, which is trained on un-normalized features. In both approaches, we obtained recognition performance that is better than that of i-vector-based normalization schemes.
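The teacher-student idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss used, so this assumes the common choice of cross-entropy between the teacher's and the student's per-frame posteriors. The function names, the `temperature` parameter and the example logits are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw network outputs into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy between teacher posteriors (soft targets) and
    student posteriors for a single frame (assumed loss, see lead-in)."""
    targets = softmax(teacher_logits, temperature)
    predictions = softmax(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, predictions))

# Hypothetical outputs for one frame over three classes. In the setup the
# abstract describes, the teacher would see normalized features and the
# student the un-normalized features of the same frame.
teacher_logits = [2.0, 0.5, -1.0]
matching = soft_target_loss(teacher_logits, [2.0, 0.5, -1.0])
mismatched = soft_target_loss(teacher_logits, [-1.0, 0.5, 2.0])
print(matching < mismatched)
```

The loss is smallest when the student reproduces the teacher's distribution, which is what lets the student learn from normalized-feature behaviour without needing the normalization transforms at test time.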

 

Interspeech Guests

Interspeech, NTT Japan, Monday, August 28, 2017.

 

Miloslav Druckmüller

Miloslav Druckmüller is a Professor of Applied Mathematics at the Institute of Mathematics, Faculty of Mechanical Engineering, Brno University of Technology, and the head of the Department of Computer Graphics and Geometry. His main interests are numerical methods of image analysis, digital image processing, computer graphics and complex variable analysis. During the last 10 years he has been cooperating widely with the Institute for Astronomy, University of Hawaii, in the field of solar coronal plasma research. He created a large archive of K-corona (photospheric light scattered on free electrons) images and temperature maps based on observations of Fe and Ni ions, using data obtained during total solar eclipses over the last two decades. Nowadays his research is mainly focused on the processing and analysis of data obtained by the NASA SDO spacecraft. His talk has been postponed.

 

Vlastimil Havran

Vlastimil Havran is an associate professor at the Czech Technical University in Prague. His research interests include data structures and algorithms for rendering images and videos, visibility calculations, geometric range searching for global illumination, software architectures for rendering, applied Monte Carlo methods and data compression. His talk has been postponed.

 

Kevin Köser

Kevin Köser is a senior researcher at the GEOMAR Helmholtz Centre for Ocean Research, Kiel. His main research interest lies in novel camera-based measurement techniques for (deep) sea environments and processes (3D underwater vision). These help to study resources, to explore and monitor (deep) sea habitats or to assess hazards, e.g. with respect to gas flux or seafloor dynamics. In past years Dr. Köser has taught the classes 3D Photography and Computer Vision Lab at the Swiss Federal Institute of Technology (ETH Zurich) and has worked as a senior researcher in ETH’s Computer Vision and Geometry Lab on shape and motion extraction from photos and videos, geolocalization and image registration. His talk has been postponed.