Tag Archives: Past talks

Talks already given.

Daniel Ramos: Meeting Forensic Science Requirements with Automatic Speaker Recognition Systems

Dr. Daniel Ramos finished his PhD in 2007 at the Universidad Autonoma de Madrid (UAM), Spain. Since 2011, he has been an Associate Professor at the UAM. He is a member of the ATVS – Biometric Recognition Group. During his career, he has visited several research laboratories and institutions around the world, including the Institute of Scientific Police at the University of Lausanne (Switzerland), the School of Mathematics at the University of Edinburgh (Scotland), the Electrical Engineering School at the University of Stellenbosch (South Africa), and more recently the Netherlands Forensic Institute, where he co-organized a workshop on the scientific validation of evidence evaluation methods. His research interests focus on the forensic evaluation of evidence using Bayesian techniques, validation of forensic evaluation methods, speaker and language recognition, biometric systems and, more generally, signal processing and pattern recognition.

Dr. Ramos is actively involved in several projects focused on different aspects of forensic science, such as yearly R&D contracts with the Spanish Guardia Civil and the Management Committee of the EU COST 1106 Action on Forensic Biometrics. He has also participated in several international competitive evaluations of speaker and language recognition technology, such as the NIST Speaker Recognition Evaluations since 2004, the Forensic Speaker Recognition Evaluation NFI/TNO 2003 and the NIST Language Recognition Evaluation since 2007. Dr. Ramos has received several distinctions and awards. He is regularly a member of scientific committees at international conferences, and he is often invited to give talks at conferences and institutions, for instance as a keynote speaker at the European Academy of Forensic Sciences Conference 2012 (EAFS 2012) in The Hague, The Netherlands. His talk takes place on Wednesday, December 3, at 10am in E105.

Meeting Forensic Science Requirements with Automatic Speaker Recognition Systems

Abstract: This talk describes the requirements of forensic science that are critical for the use of automatic speaker recognition systems in forensic applications. We first introduce the current context of forensic science in Europe at different levels. Then, we describe how to meet those requirements with score-based automatic speaker recognition systems, focusing on two main forensic scenarios: investigative and evaluative. Investigative applications mainly include database search and distributed speech acquisition and sharing, and we give some examples of systems and projects in this context. Evaluative applications include interpretation and reporting of the output of the system to court, where a likelihood ratio approach is the preferred recommendation in Europe, and where effective communication to court is of paramount importance. We will address key concepts like the evaluation of evidence; the expression of conclusions; the validation of forensic interpretation methods; the accreditation of laboratories by appropriate standards; and the current efforts for converging to common procedures in Europe.
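
As a rough illustration of the likelihood ratio approach mentioned in the abstract (this sketch is not taken from the talk), a comparison score can be turned into a likelihood ratio by modelling the scores of same-speaker and different-speaker comparisons separately. The Gaussian score models, the function names, and the calibration values below are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def score_to_lr(score, same_scores, diff_scores):
    """Likelihood ratio p(score | same speaker) / p(score | different speakers).

    Assumption: both score populations are modelled as Gaussians fitted to
    calibration scores; real systems may use other calibration models.
    """
    p_same = gaussian_pdf(score, mean(same_scores), stdev(same_scores))
    p_diff = gaussian_pdf(score, mean(diff_scores), stdev(diff_scores))
    return p_same / p_diff


# Hypothetical calibration scores (illustrative values, not real data).
same_scores = [2.1, 2.8, 3.0, 3.4, 2.5]
diff_scores = [-1.9, -2.4, -3.1, -2.0, -2.6]

# A score close to the same-speaker population supports that hypothesis.
lr_high = score_to_lr(2.9, same_scores, diff_scores)
# A score close to the different-speaker population supports the alternative.
lr_low = score_to_lr(-2.2, same_scores, diff_scores)
```

A ratio above 1 supports the same-speaker hypothesis and a ratio below 1 supports the different-speaker hypothesis; how such values are validated and reported to court is the subject of the talk itself.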

Daniel Sýkora: Adding Depth to Hand-drawn Images

Daniel Sýkora is an Assistant Professor at the Czech Technical University in Prague. His main research interest is strongly coupled with his long-standing passion for hand-drawn animation. He has developed numerous techniques that eliminate repetitive and time-consuming tasks while still preserving the full creative freedom of manual work. To turn these research ideas into practical products, Daniel cooperates intensively with the Anifilm studio in Prague as well as with renowned industrial partners such as Disney, Adobe, and TVPaint Development. His talk takes place on November 19 at 2pm in E104.

Adding Depth to Hand-drawn Images

Abstract: Recovering depth from a single image remains an open problem after decades of active research. In this talk we focus on a specific variant of the problem where the input image is a hand-crafted line drawing. As opposed to previous attempts to provide complete 3D reconstruction either by imposing various geometric constraints or using sketch-based interfaces to produce a full 3D model incrementally, we seek a specific kind of bas-relief approximation which is less complex to create while still sufficient for many important tasks that can arise in 2D pipelines. It makes it possible to maintain correct visibility and connectivity of individual parts during interactive shape manipulation, deformable image registration, and fragment composition. In the context of image enhancement it helps to improve perception of depth, generates 3D-like shading or even global illumination effects, and makes it possible to produce stereoscopic imagery as well as a source for 3D printing.

Josef Kittler: 3D Assisted 2D Face Recognition

Josef Kittler is Professor of Machine Intelligence at the Centre for Vision, Speech and Signal Processing, University of Surrey. He received his BA, PhD and DSc degrees from the University of Cambridge. He teaches and conducts research in the subject area of Signal Processing and Machine Intelligence, with a focus on Machine Learning, Biometrics, Video and Image Database retrieval, Automatic Inspection, Medical Data Analysis, and Cognitive Vision. He published the Prentice Hall textbook Pattern Recognition: A Statistical Approach and several edited volumes, as well as more than 700 scientific papers, including in excess of 200 journal papers. He serves on the Editorial Board of several scientific journals in Pattern Recognition and Computer Vision. He became Series Editor of Springer Lecture Notes in Computer Science in 2004. He served as President of the International Association for Pattern Recognition from 1994 to 1996. He was elected Fellow of the Royal Academy of Engineering in 2000. In 2006 he was awarded the KS Fu Prize from the International Association for Pattern Recognition, for outstanding contributions to pattern recognition. He received Honorary Doctorates from the University of Lappeenranta in 1999 and the Czech Technical University in Prague in 2007. In 2008 he was awarded the IET Faraday Medal and in 2009 he became a EURASIP Fellow. His talk takes place on November 4 at 2pm in E104.

3D Assisted 2D Face Recognition

Abstract: 3D Morphable Face Models (3DMM) have been used in face recognition for some time now. They can be applied in their own right as a basis for 3D face recognition and analysis involving 3D face data. However, their prevalent use over the last decade has been as a versatile tool designed to assist 2D face recognition in many different ways. For instance, a 3DMM can be used for pose, illumination and expression normalisation of 2D face images. It has the generative capacity to augment the training and test databases for various 2D face processing related tasks. It can expand the gallery set for pose-invariant face matching. For any 2D face image it can furnish complementary information, in terms of its 3D shape and texture. It can also aid multiple frame fusion by providing the means of registering a set of 2D images.

A key enabling technology for this versatility is 3D face model to 2D face image fitting. The recent developments in 3D model to 2D image fitting will be discussed. They include the use of symmetry to improve the accuracy of illumination estimation, multistage closed-form fitting to accelerate the fitting process, modifying the imaging model to cope with 2D images of low resolution, and building an illumination-free 3DMM. These various enhancements will be overviewed and their merit demonstrated on a number of face analysis related problems in the context of 2D face recognition.

Karol Myszkowski: Perceptual Display: Towards Reducing Gaps Between Real World and Displayed Scenes

Karol Myszkowski is a senior researcher in the Computer Graphics Group of the Max-Planck-Institut für Informatik. In the past, he served as an Associate Professor at the University of Aizu, Japan. He also worked as a Research Associate and then Assistant Professor at Szczecin University of Technology. His research interests include perception issues in graphics, high-dynamic range imaging, global illumination, rendering, and animation. His talk takes place on Thursday, October 9, 10:00, E104.

Perceptual Display: Towards Reducing Gaps Between Real World and Displayed Scenes

Abstract: The human visual system (HVS) has its own limitations (e.g., the quality of eye optics, the luminance range that can be simultaneously perceived, and so on), which to a certain extent reduce the requirements imposed on display devices. Still, a significant deficit can be observed in reproducible contrast, brightness, spatial pixel resolution, and depth ranges, all of which fall short of the HVS capabilities. Moreover, unfortunate interactions between technological and biological aspects create new problems, which are unknown in real-world observation conditions.

In this talk, we aim to exploit perceptual effects to enhance apparent image qualities. First, we show how the perceived image contrast and brightness can be improved by exploiting the Cornsweet and glare illusions. Then, we present techniques for reducing hold-type blur, which is inherent to LCD displays. We also investigate apparent resolution enhancements, which enable showing image details beyond the physical pixel resolution of the display device. Finally, we discuss the problem of perceived depth enhancement in stereovision, as well as comfortable handling of specular effects, film grain, and video cuts.

Video recording of the talk is publicly available.

Sanjeev Khudanpur: Statistical Language Modeling Turns Thirty-Something: Are We Ready To Settle Down?

Sanjeev Khudanpur received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, Bombay, in 1988, and the Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park, in 1997. His doctoral dissertation was supervised by Prof. Prakash Narayan and was titled Model Selection and Universal Data Compression. Since 1996, he has been on the faculty of the Johns Hopkins University. Until June 2001, he was an Associate Research Scientist in the Center for Language and Speech Processing and, from July 2001 to June 2008, an Assistant Professor in the Department of Electrical and Computer Engineering and the Department of Computer Science. He became an Associate Professor in July 2008. He is also affiliated with the Johns Hopkins University Human Language Technology Center of Excellence. In Fall 2000, he held a visiting appointment in the Institute for Mathematics and its Applications (IMA), University of Minnesota, Minneapolis, MN. He organized two IMA workshops on the role of mathematics in multimedia – “Mathematical Foundations of Speech Processing and Recognition,” and “Mathematical Foundations of Natural Language Modeling.” The talk of Sanjeev Khudanpur takes place on Friday, July 4, 1pm, at E104.

Statistical Language Modeling Turns Thirty-Something: Are We Ready To Settle Down?

Abstract: It has been 14 years since Roni Rosenfeld described “Two Decades of Statistical Language Modeling: Where Do We Go From Here?” in a special issue of the Proceedings of the IEEE (August 2000). Perhaps it is time to review what we have learnt in the years since? This lecture will begin with what was well known in 2000 (n-grams, decision tree language models, syntactic language models, maximum entropy (log-linear) models, latent semantic analysis and dynamic adaptation) and then move on to discuss new techniques that have emerged since, such as models with sparse priors, nonparametric Bayesian methods (including Dirichlet processes), and models based on neural networks, including feed-forward, recurrent and deep belief networks. Rather than just a survey, the main goal of the lecture will be to expose the core mathematical and statistical problems in language modeling, and to explain how various competing methods address these issues. It will be argued that the key to solving what appears at first blush to be a hopelessly high-dimensional, sparse-data estimation problem is to structure the model (family) and to guide the choice of parameter values using linguistic knowledge. It is hoped that viewing the core issues in this manner will enable the audience to gain a deeper understanding of the strengths and weaknesses of various approaches. And, no, we are not ready to settle down yet. But we now know what we are looking for: it varies from application to application. To each his own!
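
To make the sparse-data problem mentioned in the abstract concrete, here is a minimal sketch of the simplest of the listed models, a bigram (n-gram with n = 2) estimator. It is not taken from the lecture: the toy sentence, the function names, and the choice of add-one (Laplace) smoothing are assumptions made for the example, and practical systems use more refined smoothing.

```python
from collections import Counter


def train_bigram(tokens):
    """Count contexts and bigrams in a token sequence."""
    contexts = Counter(tokens[:-1])                  # words that have a successor
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))  # (previous word, word) pairs
    vocab = set(tokens)
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


# Toy corpus (hypothetical); most word pairs never occur in it.
tokens = "the cat sat on the mat".split()
contexts, bigrams, vocab = train_bigram(tokens)

# A pair seen in the corpus versus a pair that was never observed.
p_seen = bigram_prob("the", "cat", contexts, bigrams, vocab)
p_unseen = bigram_prob("the", "sat", contexts, bigrams, vocab)
```

Without smoothing, the unseen pair would receive probability zero even though it is perfectly plausible; with a realistic vocabulary almost all pairs are unseen, which is the estimation problem the lecture refers to.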

Hynek Hermansky: My Adventures with Speech

Hynek Hermansky is the Julian S. Smith Professor of Electrical Engineering and the Director of the Center for Language and Speech Processing at the Johns Hopkins University in Baltimore, Maryland. His main research interests are in bio-inspired speech processing. He has been working in speech research for over 30 years, previously as a Director of Research at the IDIAP Research Institute, Martigny and a Titular Professor at the Swiss Federal Institute of Technology in Lausanne, Switzerland, a Professor and Director of the Center for Information Processing at OHSU Portland, Oregon, a Senior Member of Research Staff at U S WEST Advanced Technologies in Boulder, Colorado, a Research Engineer at Panasonic Technologies in Santa Barbara, California, a Research Fellow at the University of Tokyo, and an Assistant Professor at the Brno University of Technology, Czech Republic. He is a Fellow of IEEE for “Invention and development of perceptually-based speech processing methods”, and a Fellow of the International Speech Communication Association for “Pioneering bio-inspired approaches to processing of speech”. He is the holder of the 2013 International Speech Communication Association Medal for Scientific Achievement, a Member of the Board of the International Speech Communication Association, and a Member of the Editorial Board of Speech Communication. He was the General Chair of the 2013 ICASSP Workshop on Automatic Speech Recognition and Understanding, a Member of the Organizing Committee at the 2011 ICASSP in Prague, Technical Chair at the 1998 ICASSP in Seattle and an Associate Editor for IEEE Transactions on Speech and Audio. He holds 10 US patents and has authored or co-authored over 200 papers in reviewed journals and conference proceedings.
His speech processing techniques, such as Perceptual Linear Prediction, RASTA spectral filtering, multi-stream speech information processing and the data-driven discriminative Tandem technique, are widely used in research laboratories worldwide as well as in industrial applications. Prof. Hermansky holds a Dr.Eng. degree from the University of Tokyo and a Dipl. Ing. degree from Brno University of Technology, Czech Republic. His talk takes place on Thursday, July 3, 1pm, at E104.

My Adventures with Speech

Abstract: I intend to mention some techniques I got involved in during the past 40 years. I will not dwell too much on details of the techniques. These are documented in various publications. Rather, I will try to talk about things which we, researchers, may say in private but seldom write about: about personal intuitions and beliefs, about excitements, frustrations, surprises, and interesting encounters on the road, while struggling to understand and emulate one of the most significant achievements of the human race, the ability to communicate by speech.

Brian Barsky: The BLUR Project at Berkeley

Brian A. Barsky is Professor of Computer Science and Affiliate Professor of Optometry and Vision Science at the University of California, Berkeley. His research interests include computer aided geometric design and modeling, interactive 3D computer graphics, computer aided cornea modeling and visualization, medical imaging, and virtual environments for surgical simulation. His talk takes place on Thursday, June 12, 3pm, at E104.

The BLUR Project at Berkeley

Abstract: The multidisciplinary BLUR project at UC Berkeley combines computer graphics with optics, optometry, and photography. This research investigates mathematical models to describe the shape of the cornea and algorithms for cornea measurement, scientific and medical visualization for the display of cornea shape, mathematics and algorithms for the design and fabrication of contact lenses, simulation of vision using actual patient data measured by wavefront aberrometry, photo-realistic rendering algorithms for generating imagery with optically correct depth of field, and view camera simulation. This talk will present an overview of rendering algorithms for simulating depth of field found in photographs and of vision-realistic rendering algorithms for simulating a subject’s vision. Recent work on vision-correcting displays will also be briefly introduced.

Alexander Wilkie: Predictive Rendering

Alex Wilkie will kindly share his deep knowledge of predictive rendering. Alex is a well-known, rigorous researcher and a very good speaker. Do not miss the chance to learn about his recent advances in realistic rendering, presented at SIGGRAPH, EUROGRAPHICS, and EGSR. His talk takes place on Monday, May 19, 2pm, at E105.

Predictive Rendering – The Other Type of Realistic Computer Graphics

Abstract: This talk has two parts: in the first, we discuss the basic differences between mainstream computer graphics and genuinely predictive image synthesis. In the second part, we give a brief overview of the application domains predictive rendering is useful for, the technological state of the art in this field, and the main research directions that are currently being investigated. This includes the specific topics that our group in Prague is working on now, and the directions that will probably become research areas in the near future.

VGS Invited Talks @ FIT

Invited Talks on Vision, Graphics, and Speech at Faculty of Information Technology, Brno University of Technology