Themos Stafylakis is a Marie Curie Research Fellow in audiovisual automatic speech recognition at the Computer Vision Laboratory of the University of Nottingham (UK). He holds a PhD from the National Technical University of Athens (Greece) on speaker diarization for broadcast news. He has a strong publication record in speaker recognition and diarization, the result of a five-year post-doc at CRIM (Montreal, Canada) under the supervision of Patrick Kenny. He is currently working on lip-reading and audiovisual speech recognition using deep learning methods. His talk takes place on Wednesday, November 22, 2017 at 13:00 in room A112.
Deep Word Embeddings for Audiovisual Speech Recognition
During the last few years, visual and audiovisual automatic speech recognition (ASR) have been witnessing a renaissance, which can largely be attributed to the advent of deep learning methods. Deep architectures and learning algorithms initially proposed for audio-based ASR are being combined with powerful computer vision models and are finding their way into lip-reading and audiovisual ASR. In my talk, I will go through some of the most recent advances in audiovisual ASR, with emphasis on those based on deep learning. I will then present a deep architecture for visual and audiovisual ASR that attains state-of-the-art results on the challenging Lip Reading in the Wild database. Finally, I will focus on how this architecture can generalize to words unseen during training and discuss its applicability to continuous-speech audiovisual ASR.
Vlastimil Havran is an Associate Professor at the Czech Technical University in Prague. His research interests include data structures and algorithms for rendering images and videos, visibility calculations, geometric range searching for global illumination, software architectures for rendering, applied Monte Carlo methods, and data compression. His talk takes place on Monday, December 4, 2017 at 12:00 in room E105.
Surface reflectance in rendering algorithms
The rendering of images by computers, i.e., computationally solving the rendering equation, consists of three components: computing visibility (for example by ray tracing), modeling the interaction of light with surfaces, and efficient Monte Carlo sampling algorithms. In this talk, we focus on various aspects of surface reflectance. It is a key issue for achieving high-fidelity visual appearance of objects in rendered images, not only in the movie industry but also in real-time applications of virtual and augmented reality. First, we recall the basic concepts of surface reflectance and its use in the rendering equation. Then we will present our results on surface reflectance characterization and its possible use in rendering algorithms. Further, we will show why the standard surface reflectance model, usually represented as a bidirectional reflectance distribution function (BRDF), needs to be extended spatially to achieve high fidelity of visual appearance. As this spatial extension leads to big-data problems, we will describe our algorithm for compressing spatially varying surface reflectance data. We will also describe an effective, perceptually motivated method for comparing two similar surface reflectance datasets, where one can be the reference data and the other the result of its compression. As the last topic, we will describe the concepts and problems of measuring such surface reflectance datasets for real-world applications.
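To make the Monte Carlo component of the abstract concrete, here is a minimal sketch (not taken from the talk; all function names and the Lambertian test case are illustrative) of estimating the reflection integral of the rendering equation, L_o = ∫ f_r(ω_i, ω_o) L_i(ω_i) cos θ_i dω_i, by uniform hemisphere sampling:

```python
import math
import random

def sample_hemisphere_uniform(rng):
    # Uniformly sample a direction on the unit hemisphere around the
    # surface normal, taken here to be the z axis; pdf = 1 / (2*pi).
    u1, u2 = rng.random(), rng.random()
    cos_theta = u1
    sin_theta = math.sqrt(max(0.0, 1.0 - u1 * u1))
    phi = 2.0 * math.pi * u2
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)

def estimate_outgoing_radiance(brdf, incoming_radiance, n_samples=20000, seed=0):
    # Monte Carlo estimate of the reflection integral:
    #   L_o = integral over hemisphere of f_r(w_i) * L_i(w_i) * cos(theta_i) dw_i
    # Each sample is weighted by 1/pdf, then the samples are averaged.
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(n_samples):
        wi = sample_hemisphere_uniform(rng)
        cos_theta = wi[2]  # normal is (0, 0, 1)
        total += brdf(wi) * incoming_radiance(wi) * cos_theta / pdf
    return total / n_samples

# Sanity check: a Lambertian BRDF (f_r = rho / pi) under uniform unit
# illumination reflects exactly its albedo rho, so the estimate should
# converge to rho as the sample count grows.
rho = 0.6
lambertian = lambda wi: rho / math.pi
constant_sky = lambda wi: 1.0
L_o = estimate_outgoing_radiance(lambertian, constant_sky)
```

A spatially varying reflectance model, as discussed in the talk, would replace the single `brdf` function with one that also depends on the surface position, which is exactly what drives the data volumes mentioned in the abstract.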
Santosh Mathan is an Engineering Fellow at Honeywell Aerospace and a Principal Scientist in the Human Centered Systems group at Honeywell Laboratories. His research lies at the intersection of human computer interaction, machine learning, and biological signal processing. Santosh is the principal investigator and program manager on several efforts to use neurotechnology in practical settings. These efforts, carried out in collaboration with academic and industry researchers around the world, have led to the development of machine learning and signal processing algorithms that can estimate changes in cognitive function following brain trauma, identify fluctuations in attention, boost the activity of cortical networks underlying fluid intelligence, and serve as the basis for hands-free robotic control. Papers describing these projects have won multiple best paper awards at research conferences, and have been covered by the press in publications including the Wall Street Journal and Wired. He has been awarded over 19 US patents. Santosh has a doctoral degree in Human Computer Interaction from the School of Computer Science at Carnegie Mellon University, where his research explored the use of computational cognitive models for diagnosing and remedying student difficulties during skill acquisition. His talk takes place in December 2017 / January 2018.
Scaling up Cognitive Efficacy with Neurotechnology
Cognition and behavior arise from the activity of billions of neurons. Ongoing research indicates that non-invasive neural sensing techniques can provide a window into this never-ending storm of electrical activity in our brains, and yield rich information of interest to system designers and trainers. Direct measurement of brain activity has the potential to provide objective measures that can help system designers and trainers in a variety of ways, including estimating the impact of a system on users during the design process, estimating cognitive proficiency during training, and providing new modalities for humans to interact with computer systems. In this presentation, Santosh Mathan will review research in the Honeywell Advanced Technology organization that offers novel tools and techniques to advance human-computer interaction. While many of these research explorations are at an early stage, they offer a preview of practical tools that lie just around the corner for researchers and practitioners with an interest in boosting human performance in challenging task environments.
Jan Kybic was born in Prague, Czech Republic, in 1974. He received Mgr. (BSc.) and Ing. (MSc.) degrees with honors from the Czech Technical University, Prague, in 1996 and 1998, respectively. In 2001, he obtained his Ph.D. in biomedical image processing from Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland, for his thesis on elastic image registration using parametric deformation models. Between October 2002 and February 2003, he held a post-doc research position at INRIA, Sophia-Antipolis, France. Since 2003 he has been a Senior Research Fellow with the Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague; he passed his habilitation (Associate Professor) in 2010 and became a full professor in 2015. He was a Vice-Dean in 2011-2013 and a Department Head in 2013-2017. Jan Kybic has authored or co-authored 31 articles in peer-reviewed international scientific journals, one book, two book chapters, and over 80 conference publications. He has supervised nine PhD students, six of whom have already successfully graduated. He has also supervised over twenty master, bachelor, and short-term student projects.
He is a member of IEEE and has served as an Associate Editor for IEEE Transactions on Medical Imaging and as a reviewer for numerous international journals and conferences. He was General Chair of the ISBI 2016 conference.
His research interests include signal and image processing, medical imaging, image registration, splines and wavelets, inverse problems, elastography, computer vision, numerical methods, algorithm theory and control theory.
He teaches Digital Image Processing and Medical Imaging courses.
His talk takes place on Thursday, March 1, 2018.
Miloslav Druckmüller is a Professor of Applied Mathematics at the Institute of Mathematics, Faculty of Mechanical Engineering, Brno University of Technology, and the head of the Department of Computer Graphics and Geometry. His main interests are numerical methods of image analysis, digital image processing, computer graphics, and complex variable analysis. During the last 10 years he has been cooperating widely with the Institute for Astronomy, University of Hawaii, in the field of solar coronal plasma research. He created a large archive of K-corona images (photospheric light scattered on free electrons) and temperature maps based on observations of Fe and Ni ions, derived from data obtained during total solar eclipses over the last two decades. Nowadays his research is mainly focused on processing and analyzing data obtained by the NASA SDO spacecraft. His talk has been POSTPONED.
Kevin Köser is a senior researcher at the GEOMAR Helmholtz Centre for Ocean Research Kiel. His main research interest lies in novel camera-based measurement techniques for (deep) sea environments and processes (3D underwater vision). These help to study resources, to explore and monitor (deep) sea habitats, and to assess hazards, e.g. with respect to gas flux or seafloor dynamics. In the past years Dr. Köser has taught the classes 3D Photography and Computer Vision Lab at the Swiss Federal Institute of Technology (ETH Zurich) and has worked as a senior researcher in ETH's Computer Vision and Geometry Lab on shape and motion extraction from photos and videos, geolocalization, and image registration. His talk has been POSTPONED.