Monthly Archives: July 2014

Sanjeev Khudanpur: Statistical Language Modeling Turns Thirty-Something: Are We Ready To Settle Down?

Sanjeev Khudanpur received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, Bombay, in 1988, and the Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park, in 1997. His doctoral dissertation, supervised by Prof. Prakash Narayan, was titled “Model Selection and Universal Data Compression.” Since 1996, he has been on the faculty of the Johns Hopkins University. Until June 2001, he was an Associate Research Scientist in the Center for Language and Speech Processing and, from July 2001 to June 2008, an Assistant Professor in the Department of Electrical and Computer Engineering and the Department of Computer Science. He became an Associate Professor in July 2008. He is also affiliated with the Johns Hopkins University Human Language Technology Center of Excellence. In Fall 2000, he held a visiting appointment in the Institute for Mathematics and its Applications (IMA), University of Minnesota, Minneapolis, MN. He organized two IMA workshops on the role of mathematics in multimedia: “Mathematical Foundations of Speech Processing and Recognition” and “Mathematical Foundations of Natural Language Modeling.” His talk takes place on Friday, July 4, at 1pm in E104.

Statistical Language Modeling Turns Thirty-Something: Are We Ready To Settle Down?

Abstract: It has been 14 years since Roni Rosenfeld described “Two Decades of Statistical Language Modeling: Where Do We Go From Here?” in a special issue of the Proceedings of the IEEE (August 2000). Perhaps it is time to review what we have learnt in the years since? This lecture will begin with what was well known in 2000 (n-grams, decision tree language models, syntactic language models, maximum entropy (log-linear) models, latent semantic analysis and dynamic adaptation) and then move on to discuss new techniques that have emerged since, such as models with sparse priors, nonparametric Bayesian methods (including Dirichlet processes), and models based on neural networks, including feed-forward, recurrent and deep belief networks. Rather than just a survey, the main goal of the lecture will be to expose the core mathematical and statistical problems in language modeling, and to explain how various competing methods address these issues. It will be argued that the key to solving what appears at first blush to be a hopelessly high-dimensional, sparse-data estimation problem is to structure the model (family) and to guide the choice of parameter values using linguistic knowledge. It is hoped that viewing the core issues in this manner will enable the audience to gain a deeper understanding of the strengths and weaknesses of various approaches. And, no, we are not yet ready to settle down. But we now know what we are looking for: it varies from application to application. To each his own!

Hynek Hermansky: My Adventures with Speech

Hynek Hermansky is the Julian S. Smith Professor of Electrical Engineering and the Director of the Center for Language and Speech Processing at the Johns Hopkins University in Baltimore, Maryland. His main research interests are in bio-inspired speech processing. He has been working in speech research for over 30 years, previously as a Director of Research at the IDIAP Research Institute, Martigny and a Titular Professor at the Swiss Federal Institute of Technology in Lausanne, Switzerland, a Professor and Director of the Center for Information Processing at OHSU Portland, Oregon, a Senior Member of Research Staff at U S WEST Advanced Technologies in Boulder, Colorado, a Research Engineer at Panasonic Technologies in Santa Barbara, California, a Research Fellow at the University of Tokyo, and an Assistant Professor at the Brno University of Technology, Czech Republic. He is a Fellow of the IEEE for “Invention and development of perceptually-based speech processing methods”, and a Fellow of the International Speech Communication Association for “Pioneering bio-inspired approaches to processing of speech”. He is the holder of the 2013 International Speech Communication Association Medal for Scientific Achievement, a Member of the Board of the International Speech Communication Association, and a Member of the Editorial Board of Speech Communication. He was the General Chair of the 2013 ICASSP Workshop on Automatic Speech Recognition and Understanding, a Member of the Organizing Committee at the 2011 ICASSP in Prague, Technical Chair at the 1998 ICASSP in Seattle and an Associate Editor for IEEE Transactions on Speech and Audio Processing. He holds 10 US patents and has authored or co-authored over 200 papers in reviewed journals and conference proceedings.
His speech processing techniques, such as Perceptual Linear Prediction, RASTA spectral filtering, multi-stream speech information processing and the data-driven discriminative Tandem technique, are widely used in research laboratories worldwide as well as in industrial applications. Prof. Hermansky holds a Dr.Eng. degree from the University of Tokyo and a Dipl. Ing. degree from Brno University of Technology, Czech Republic. His talk takes place on Thursday, July 3, at 1pm in E104.

My Adventures with Speech

Abstract: I intend to mention some techniques I got involved in during the past 40 years. I will not dwell too much on the details of the techniques, which are documented in various publications. Rather, I will try to talk about things which we researchers may say in private but seldom write about: personal intuitions and beliefs, excitements, frustrations, surprises, and interesting encounters on the road, while struggling to understand and emulate one of the most significant achievements of the human race, the ability to communicate by speech.