Category Archives: News

Tomáš Mikolov: Neural Networks for Natural Language Processing

Tomáš Mikolov has been a research scientist at Facebook AI Research since 2014. Previously he was a member of the Google Brain team, where he developed efficient algorithms for computing distributed representations of words (the word2vec project). He obtained his PhD from Brno University of Technology for work on recurrent neural network based language models (RNNLM). His long-term research goal is to develop intelligent machines capable of communicating with people using natural language. His talk will take place on Tuesday, January 3rd, 2017, 5pm in room E112.

Neural Networks for Natural Language Processing

Abstract: Neural networks are currently very successful in various machine learning tasks that involve natural language. In this talk, I will describe how recurrent neural network language models have been developed, as well as their most frequent applications to speech recognition and machine translation. Next, I will talk about distributed word representations, their interesting properties, and efficient ways to compute them. Finally, I will describe our latest efforts to create a novel dataset that would allow researchers to develop new types of applications that include communication with human users in natural language.

Gernot Ziegler: Data Parallelism in Computer Vision

Gernot Ziegler (Dr.Ing.) is an Austrian engineer with an MSc degree in Computer Science and Engineering from Linköping University, Sweden, and a PhD from the University of Saarbrücken, Germany. He pursued his PhD studies at the Max-Planck-Institute for Informatics in Saarbrücken, Germany, specializing in GPU algorithms for computer vision and data-parallel algorithms for spatial data structures. He then joined NVIDIA’s DevTech team, where he consulted in high performance computing and automotive computer vision on graphics hardware. In 2016, Gernot founded his own consulting company to explore the applications of his computer vision expertise on graphics hardware in mobile consumer, industrial vision and heritage digitalization. His talk will take place on Wednesday, December 14th, 2016, 1pm in room E105.

Data Parallelism in Computer Vision

Abstract: In algorithmic design, serial data dependencies which accelerate CPU processing for computer vision are often counterproductive for the data-parallel GPU. The talk presents data structures and algorithms that enable data parallelism for connected components, line detection, feature detection, marching cubes or octree generation. We will point out the important aspects of data parallel design that will allow you to design new algorithms for GPGPU-based computer vision and image processing yourself. As food for thought, I will sketch algorithmic ideas that could lead to new collaborative results in real-time computer vision.

Video recording of the talk is publicly available.

Stefan Jeschke: Recent Advances in Vector Graphics Creation and Display

Stefan Jeschke is a scientist at IST Austria. He received an M.Sc. in 2001 and a Ph.D. in 2005, both in computer science from the University of Rostock, Germany. Afterwards, he spent several years as a postdoctoral researcher in several projects at Vienna University of Technology and Arizona State University. His research interests include modeling and display of vectorized image representations, applications and solvers for PDEs, as well as modeling and rendering complex natural phenomena, preferably in real time. His talk will take place on Tuesday, November 8th, 2016, 1pm in room G202.

Recent Advances in Vector Graphics Creation and Display

This talk gives an overview of my recent work on vector graphics representations as semantically meaningful image descriptions, in contrast to pixel-based raster images. I will cover the problem of how to efficiently create vector graphics either from scratch or from given raster images. The goal was to support designers in producing complex, high-quality representations with only limited manual input. Furthermore, I will talk about various new developments that are mainly based on the so-called “diffusion curves”. Here the goal is to improve the expressiveness of such representations, for example, by adding textures so that natural images appear more realistic without adding excessive amounts of geometry beyond what can be handled by a designer. Rendering such representations at interactive frame rates on modern GPUs is another aspect I will cover in this talk.

Video recording of the talk is publicly available.

Tomáš Pajdla: 3D Reconstruction from Photographs and Algebraic Geometry

Tomáš Pajdla is a Distinguished Researcher at the CIIRC – Czech Institute of Informatics, Robotics and Cybernetics (ciirc.cvut.cz) and an Assistant Professor at the Faculty of Electrical Engineering (fel.cvut.cz) of the Czech Technical University in Prague. He works in geometry, algebra and optimization of computer vision and robotics, 3D reconstruction from images, and visual object recognition. He is known for his contributions to geometry of cameras, image matching, 3D reconstruction, visual localization, camera and hand-eye calibration, and algebraic methods in computer vision (Google Scholar citations). He coauthored works awarded the best paper prizes at OAGM 1998 and 2013, BMVC 2002 and ACCV 2014. His talk will take place on Wednesday, November 2nd, 2016, 1pm in room E105.

3D Reconstruction from Photographs and Algebraic Geometry

Abstract: We will show a connection between state-of-the-art 3D reconstruction from photographs and algebraic geometry. In particular, we will show how some modern tools from computational algebraic geometry can be used to solve some classical as well as recent problems in computing camera calibration and orientation in space. We will present applications in large scale reconstruction from photographs, robotics and camera calibration.

Video recording of the talk is publicly available.

Ralf Schlüter: On the Relation between Error Measures, Statistical Modeling, and Decision Rules

Ralf Schlüter studied physics at RWTH Aachen University, Germany, and Edinburgh University, Scotland. He received the Diplom degree with honors in physics in 1995 and the Dr.rer.nat. degree with honors in computer science in 2000, from RWTH Aachen University. From November 1995 to April 1996 Ralf Schlüter was with the Institute for Theoretical Physics B at RWTH Aachen, where he worked on statistical physics and stochastic simulation techniques. Since May 1996 Ralf Schlüter has been with the Faculty of Mathematics, Computer Science and Natural Sciences of RWTH Aachen University, where he currently is Academic Director. He leads the automatic speech recognition group at the Human Language Technology and Pattern Recognition lab. His research interests cover speech recognition in general, discriminative training, neural networks, information theory, stochastic modeling, signal analysis, and theoretic aspects of pattern classification. His talk will take place on Tuesday, August 23rd, 2016, 10am in room A112.

On the Relation between Error Measures, Statistical Modeling, and Decision Rules

Abstract: The aim of automatic speech recognition (ASR), or more generally, pattern classification, is to minimize the expected error rate.  This requires a consistent interaction of the error measure with statistical modeling and the corresponding decision rule. Nevertheless, the error measure often is not considered consistently in ASR:

  • error measures usually are not easily tractable due to their discrete nature,
  • the quantitative relation between modeling and error measure is unclear, at least analytically, and usually is exploited only empirically,
  • the standard decision rule does not consider word error loss.

In this presentation, bounds on the classification error will be presented that can directly be related to acoustic and language modeling. A first analytic relation between language model perplexity and sentence error is established, and the quantitative effects of context reduction and feature omission on the error rate are derived. The corresponding error bounds were discovered and finally analytically proven within a simulation-induced framework, which will be outlined. Also, first attempts to design a training criterion that supports the use of the standard decision rule while retaining the target of minimum word error rate are discussed. Finally, conditions will be presented under which the standard decision rule does in fact implicitly optimize word/token error rate in spite of its sentence/segment-based target.
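
The gap between the standard decision rule and a rule that accounts for word errors can be made concrete on a toy example. The following is only a minimal Python sketch: the three hypotheses, their posterior probabilities, and the position-wise word loss (a simplified stand-in for edit distance) are illustrative assumptions and are not taken from the talk.

```python
# Toy posterior over three equal-length sentence hypotheses (made-up numbers).
posterior = {
    ("the", "cat", "sat"): 0.4,
    ("a", "dog", "sat"): 0.3,
    ("a", "dog", "ran"): 0.3,
}


def word_loss(hyp, ref):
    """Count position-wise word mismatches (simplified stand-in for edit distance)."""
    return sum(h != r for h, r in zip(hyp, ref))


def map_decision(post):
    """Standard rule: pick the single most probable sentence."""
    return max(post, key=post.get)


def min_expected_word_loss_decision(post):
    """Pick the hypothesis with the lowest expected word loss under the posterior."""
    def expected_loss(hyp):
        return sum(p * word_loss(hyp, ref) for ref, p in post.items())
    return min(post, key=expected_loss)


map_hyp = map_decision(posterior)
mbr_hyp = min_expected_word_loss_decision(posterior)
print(" ".join(map_hyp))
print(" ".join(mbr_hyp))
```

On this toy posterior the two rules disagree: the most probable sentence is "the cat sat", whereas "a dog sat" has the lower expected word loss (1.1 versus 1.5), because it shares words with the other likely hypothesis.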

Elmar Eisemann: Everything Counts – Rendering Highly-detailed Environments in Real-time

Elmar Eisemann is a professor at TU Delft, heading the Computer Graphics and Visualization Group. Before that, he was an associate professor at Telecom ParisTech (until 2012) and a senior scientist heading a research group in the Cluster of Excellence (Saarland University / MPI Informatik) (until 2009). He studied at the École Normale Supérieure in Paris (2001-2005) and received his PhD from the University of Grenoble at INRIA Rhône-Alpes (2005-2008). He spent several research visits abroad, at the Massachusetts Institute of Technology (2003), University of Illinois Urbana-Champaign (2006), and Adobe Systems Inc. (2007, 2008). His interests include real-time and perceptual rendering, alternative representations, shadow algorithms, global illumination, and GPU acceleration techniques. He coauthored the book “Real-time shadows” and participated in various committees and editorial boards. He was local organizer of EGSR 2010, 2012, HPG 2012, and is paper chair of HPG 2015. His work received several distinction awards and he was honored with the Eurographics Young Researcher Award 2011. His talk will take place on Friday, May 20th, 2016, 2pm in room E105.

Everything Counts – Rendering Highly-detailed Environments in Real-time

A traditional challenge in computer graphics is the simulation of natural scenes, including complex geometric models and a realistic reproduction of physical phenomena, requiring novel theoretical insights, appropriate algorithms, and well-designed data structures. In particular, there is a need for efficient image-synthesis solutions, which is fueled by the development of modern display devices, which support 3D stereo, have high resolution and refresh rates, and deep color palettes.

In this talk, we will present methods for efficient image synthesis to address recent rendering challenges. In particular, we will focus on large-scale data sets and present novel techniques to encode highly detailed geometric information in a compact representation. Further, we will give an outlook on rendering techniques for modern display devices, as these often require very different solutions. In particular, human perception starts to play an increasing role and has high potential to be a key factor in future rendering solutions.

Video recording of the talk is publicly available.

Josef Sivic: Learning visual representations from Internet data

Josef Sivic holds a permanent position as an INRIA senior researcher (directeur de recherche) in the Department of Computer Science at the École Normale Supérieure (ENS) in Paris. He received a degree from the Czech Technical University, Prague, in 2002 and a PhD from the University of Oxford in 2006. His research interests are in developing learnable image representations for automatic visual search and recognition applied to large image and video collections. Before joining INRIA Dr. Sivic spent six months at the Computer Science and Artificial Intelligence Lab at the Massachusetts Institute of Technology. He has published more than 60 scientific publications, has served as an area chair for major computer vision conferences (CVPR’11, ICCV’11, ECCV’12, CVPR’13 and ICCV’13) and as a program chair for ICCV’15. He currently serves as an associate editor for the International Journal of Computer Vision and is a Senior Fellow in the Learning in Machines & Brains program of the Canadian Institute for Advanced Research. He was awarded an ERC grant in 2013. His talk will take place on Friday, April 22nd, 2016, 10:30am in room E105.

Learning visual representations from Internet data

Abstract:
An unprecedented amount of visual data is now available on the Internet. Wouldn’t it be great if a machine could automatically learn from this data? For example, imagine a machine that can learn how to change a flat tire of a car by watching instruction videos on YouTube, or that can learn how to navigate in a city by observing street-view imagery. Learning from Internet data is, however, a very challenging problem as the data is equipped only with weak supervisory signals such as human narration of the instruction video or noisy geotags for street-level imagery. In this talk, I will describe our recent progress on learning visual representations from such weakly annotated visual data.

In the first part of the talk, I will describe a new convolutional neural network architecture that is trainable in an end-to-end manner for the visual place recognition task. I will show that the network can be trained from weakly annotated Google Street View Time Machine imagery and significantly improves over the current state of the art in visual place recognition.

In the second part of the talk, I will describe a technique for automatically learning the main steps to complete a certain task, such as changing a car tire, from a set of narrated instruction videos. The method solves two clustering problems, one in text and one in video, linked by joint constraints to obtain a single coherent sequence of steps in both modalities. I will show results on a newly collected dataset of instruction videos from YouTube that include complex interactions between people and objects, and are captured in a variety of indoor and outdoor settings.

Joint work with J.-B. Alayrac, P. Bojanowski, N. Agrawal, S. Lacoste-Julien, I. Laptev, R. Arandjelovic, P. Gronat, A. Torii and T. Pajdla.

Tomáš Werner: Linear Programming Relaxation Approach to Discrete Energy Minimization

Tomáš Werner works as a researcher at the Center for Machine Perception, Faculty of Electrical Engineering, Czech Technical University, where he also obtained his PhD degree. In 2001-2002 he worked as a post-doc at the Visual Geometry Group, Oxford University, U.K. In the past, his main interest was multiple view geometry and three-dimensional reconstruction in computer vision. Today, his interest is in machine learning and optimization, in particular graphical models. He is a (co-)author of more than 70 publications, with 350 citations in WoS. His talk will take place on Wednesday, February 24, 2016, 1pm in room G202. THE TALK IS POSTPONED; it will take place on Tuesday, April 12, 2016, 2pm in room A113.

Linear Programming Relaxation Approach to Discrete Energy Minimization

Abstract: Discrete energy minimization consists in minimizing a function of many discrete variables that is a sum of functions, each depending on a small subset of the variables. This is also known as MAP inference in graphical models (Markov random fields) or weighted constraint satisfaction. Many successful approaches to this useful but NP-complete problem are based on its natural LP relaxation. I will discuss this LP relaxation in detail, along with algorithms able to solve it for very large instances, which appear e.g. in computer vision. In particular, I will discuss in detail a convex message passing algorithm, generalized min-sum diffusion.
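
To illustrate the problem statement only, the following minimal Python sketch minimizes a tiny energy consisting of unary terms (one variable each) and pairwise terms (two variables each) by exhaustive search. The three-variable chain, the cost values, and the function names are hypothetical. Exhaustive search is exponential in the number of variables and is neither the LP relaxation nor the min-sum diffusion algorithm discussed in the talk; it only shows what is being minimized.

```python
from itertools import product


def energy(labels, unary, pairwise, edges):
    """Sum of unary costs plus pairwise costs over the given edges."""
    total = sum(unary[i][labels[i]] for i in range(len(labels)))
    total += sum(pairwise[labels[i]][labels[j]] for i, j in edges)
    return total


def brute_force_min(unary, pairwise, edges, num_labels):
    """Try every labeling; feasible only for very small instances."""
    candidates = product(range(num_labels), repeat=len(unary))
    return min(candidates, key=lambda l: energy(l, unary, pairwise, edges))


# Hypothetical instance: three binary variables on a chain 0 - 1 - 2.
unary = [[0.0, 2.0], [1.5, 1.0], [2.0, 0.0]]   # unary[i][label]
pairwise = [[0.0, 1.0], [1.0, 0.0]]            # cost 1 when neighbours disagree
edges = [(0, 1), (1, 2)]

best = brute_force_min(unary, pairwise, edges, num_labels=2)
best_energy = energy(best, unary, pairwise, edges)
print(best, best_energy)
```

Here the middle variable has a mild preference of its own, and the pairwise term decides which neighbour it agrees with. Instances from computer vision have far too many variables for enumeration, which is why relaxations are used in practice.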

Christian Theobalt: Reconstructing the Real World in Motion

Christian Theobalt is a Professor of Computer Science and the head of the research group “Graphics, Vision, & Video” at the Max-Planck-Institute for Informatics, Saarbruecken, Germany. He is also an adjunct faculty at Saarland University. From 2007 until 2009 he was a Visiting Assistant Professor in the Department of Computer Science at Stanford University. Most of his research deals with algorithmic problems that lie on the boundary between the fields of Computer Vision and Computer Graphics, such as dynamic 3D scene reconstruction and marker-less motion capture, computer animation, appearance and reflectance modelling, machine learning for graphics and vision, new sensors for 3D acquisition, advanced video processing, as well as image- and physically-based rendering.

For his work, he received several awards, including the Otto Hahn Medal of the Max-Planck Society in 2007, the EUROGRAPHICS Young Researcher Award in 2009, and the German Pattern Recognition Award 2012. Further, in 2013 he was awarded an ERC Starting Grant by the European Union. In 2015, the German business magazine Capital elected him as one of the top 40 innovation leaders under 40. Christian Theobalt is a Principal Investigator and a member of the Steering Committee of the Intel Visual Computing Institute in Saarbruecken. He is also a co-founder of a spin-off company from his group – www.thecaptury.com – that is commercializing a new generation of marker-less motion and performance capture solutions.

Reconstructing the Real World in Motion

Even though many challenges remain unsolved, in recent years computer graphics algorithms to render photo-realistic imagery have seen tremendous progress. An important prerequisite for high-quality renderings is the availability of good models of the scenes to be rendered, namely models of shape, motion and appearance. Unfortunately, the technology to create such models has not kept pace with the technology to render the imagery. In fact, we observe a content creation bottleneck, as it often takes man-months of tedious manual work by animation artists to craft models of moving virtual scenes.

To overcome this limitation, the graphics and vision communities have been developing techniques to capture dense 4D (3D+time) models of dynamic scenes from real world examples, for instance from footage of real world scenes recorded with cameras or other sensors. One example is performance capture methods that measure detailed dynamic surface models, for example of actors or an actor’s face, from multi-view video and without markers in the scene. Even though such 4D capture methods have made big strides, they are still at an early stage. Their application is limited to scenes of moderate complexity in controlled environments, reconstructed detail is limited, and captured content cannot be easily modified, to name only a few restrictions. Recently, the need for efficient dynamic scene reconstruction methods has been further increased by developments in other thriving research domains, such as virtual and augmented reality, 3D video, or robotics.

In this talk, I will elaborate on some ideas on how to go beyond the current limits of 4D reconstruction, and show some results from our recent work. For instance, I will show how we can take steps to capture dynamic models of humans and general scenes in unconstrained environments with few sensors. I will also show how we can capture higher shape detail as well as material parameters of scenes outside of the lab. The talk will also show how one can effectively reconstruct very challenging scenes of a smaller scale, such as hand motion. Further on, I will discuss how we can capitalize on more sophisticated light transport models to enable high-quality reconstruction in much more uncontrolled scenes, eventually also outdoors, with only a few cameras, or even just a single one. Ideas on how to perform deformable scene reconstruction in real-time will also be presented, if time allows.

His talk takes place on Wednesday, March 23, 2016, 1pm in room G202.

Video recording of the talk is publicly available.

Christoph H. Lampert: Classifier Adaptation at Prediction Time

Christoph Lampert received the PhD degree in mathematics from the University of Bonn in 2003. In 2010 he joined the Institute of Science and Technology Austria (IST Austria) first as an Assistant Professor and since 2015 as a Professor. His research on computer vision and machine learning won several international and national awards, including the best paper prize of CVPR 2008. In 2012 he was awarded an ERC Starting Grant by the European Research Council. He is an Editor of the International Journal of Computer Vision (IJCV), Action Editor of the Journal of Machine Learning Research (JMLR), and Associate Editor in Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). His talk takes place on Tuesday, January 12, 2016, 1pm in room E104.

Classifier Adaptation at Prediction Time

Abstract: In the era of “big data” and a large commercial interest in computer vision, it is only a matter of time until we buy commercial object recognition systems in pre-trained form instead of training them ourselves. This, however, poses a problem of domain adaptation: the data distribution in which a customer plans to use the system will almost certainly differ from the data distribution that the vendor used during training. Two relevant effects are a change of the class ratios and the fact that the image sequences that need to be classified in real applications are typically not i.i.d. In my talk I will introduce a simple probabilistic technique that can adapt the object recognition system to the test time distribution without having to change the underlying pre-trained classifiers. I will also introduce a framework for creating realistically distributed image sequences that offer a way to benchmark such adaptive recognition systems. Our results show that the above “problem” of domain adaptation can actually be a blessing in disguise: with proper adaptation the error rates on realistic image sequences are typically lower than on standard i.i.d. test sets.
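
The first effect named above, a change of class ratios, can be illustrated with the classical prior-shift correction: the classifier's posterior is reweighted by the ratio of test-time to training-time class priors and then renormalized. The Python sketch below is a textbook correction shown under stated assumptions; it is not claimed to be the technique presented in the talk, and the function name and all numbers are hypothetical.

```python
def adapt_posterior(posterior, train_priors, test_priors):
    """Reweight class posteriors by test/train prior ratios and renormalize.

    Assumes the class-conditional densities p(x | y) are unchanged between
    training and test time, so only the class ratios differ.
    """
    weighted = [p * t / s for p, s, t in zip(posterior, train_priors, test_priors)]
    norm = sum(weighted)
    return [w / norm for w in weighted]


# Hypothetical two-class example: the classifier was trained on balanced data,
# but class 1 is much more frequent at test time.
posterior = [0.6, 0.4]        # output of the unchanged pre-trained classifier
train_priors = [0.5, 0.5]
test_priors = [0.1, 0.9]

adapted = adapt_posterior(posterior, train_priors, test_priors)
predicted_before = posterior.index(max(posterior))
predicted_after = adapted.index(max(adapted))
print(adapted, predicted_before, predicted_after)
```

In this example the predicted class flips from 0 to 1 once the test-time class ratios are taken into account, while the pre-trained classifier itself is left untouched. In practice the test-time priors are unknown and have to be estimated, which this sketch does not attempt.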

Video recording of the talk is publicly available.

Petr Kubánek: Data processing of Astronomical Images

Petr Kubánek received a master’s degree in software engineering from the Faculty of Mathematics and Physics of Charles University in Prague, and a master’s degree in fuzzy logic from the University of Granada in Spain. Currently he is a research fellow at the Institute of Physics of the Czech Academy of Sciences in Prague. He is developing RTS2 (Remote Telescope System 2nd Version), a package for fully autonomous astronomical observatory control and scheduling. RTS2 is being used at multiple observatories around the planet, on all continents (one of the RTS2 collaborators is currently wintering over at Dome C in Antarctica). Petr’s interests and expertise span from distributed device control through databases to image processing and data mining. During his career, he has collaborated with top world institutions (Harvard/CfA on FLWO 48″ telescope, UC Berkeley on RATIR 1.5m telescope, NASA/IfA on ATLAS project, ESA/ISDEFE on TBT project, SLAC and BNL on Large Synoptic Survey Telescope (LSST) CCD testing) and enjoyed travel to restricted areas (scheduled for observing run at US Naval Observatory in Arizona). He is on a kind of parental leave, enjoying his new family, and slowly returning to the vivid astronomical world. His talk takes place on Tuesday, December 8, 2pm in room E104.

Data processing of Astronomical Images

Astronomy and astrophysics are among the scientific fields that leverage technological progress most rapidly. From the simple lens used by Galileo to study the stars and planets to modern, huge, marvellous telescopes using state-of-the-art control systems and detectors, technological progress is tightly coupled with progress in astronomy and astrophysics. In this talk, I will review principles of data acquisition and processing as performed by astronomers around the planet. I will start with basic processing done on film cameras and photography, progressing towards advanced processing and interpretation of multi-terabyte digital data acquired by the most productive astronomical instruments.

Video recording of the talk is publicly available.

Yosi Keller: Probabilistic approach to high order assignment problems

Yosi Keller received the BSc degree in Electrical Engineering in 1994 from the Technion-Israel Institute of Technology, Haifa. He received the MSc and PhD degrees in electrical engineering from Tel-Aviv University, Tel-Aviv, in 1998 and 2003, respectively. From 2003 to 2006 he was a Gibbs assistant professor with the Department of Mathematics, Yale University. He is an Associate Professor at the Faculty of Engineering in Bar Ilan University, Israel. His research relates to the applications of graph theory and machine learning to signal processing, computer vision and 3D modelling. His talk takes place on Thursday, November 26, 1pm in room E105.

Probabilistic approach to high order assignment problems

A gamut of computer vision and engineering problems can be cast as high order matching problems, where one considers the affinity/probability of two or more assignments simultaneously. The spectral matching approach of Leordeanu and Hebert (2005) was shown to provide an approximate solution of this NP-hard problem. In this talk we present recent results on the probabilistic interpretation of spectral matching. We extend the results of Zass and Shashua (2008) and provide a probabilistic interpretation to the spectral matching and graduated assignment (1996) algorithms. We then derive a new probabilistic matching scheme, and show that it can be extended to a high order matching scheme, via a dual marginalization-decomposition scheme. We will present a novel Integer Least Squares algorithm and apply it to the decoding of MIMO and OFDM channels, in the uncoded and coded cases, respectively. Joint work with Amir Egozi, Michael Chertok, Avi Septimus, Ayelet Haimovitch, Shimrit Haber and Dr. Itzik Bergel.

Video recording of the talk is publicly available.

Michael Wimmer: Computer Graphics Meets Computational Design

Michael Wimmer is currently an Associate Professor at the Institute of Computer Graphics and Algorithms of the Vienna University of Technology, where he heads the Rendering Group. His academic career started with his M.Sc. in 1997 at the Vienna University of Technology, where he obtained his Ph.D. in 2001. His research interests are real-time rendering, computer games, real-time visualization of urban environments, point-based rendering, procedural modeling and shape modeling. He has coauthored over 100 papers in these fields. He also coauthored the book Real-Time Shadows. He served on many program committees, including ACM SIGGRAPH and SIGGRAPH Asia, Eurographics, Eurographics Symposium on Rendering, ACM I3D, etc. He is currently associate editor of Computers & Graphics and TVCG. He was papers co-chair of EGSR 2008, Pacific Graphics 2012, and Eurographics 2015. His talk takes place on Tuesday, October 20, 1 pm in room A112.

Computer Graphics Meets Computational Design

In this talk, I will report on recent advancements in Computer Graphics, which will be of great interest for next-generation computational design tools. I will present methods for modeling from images, modeling by examples and multiple examples, but also procedural modeling, modeling of physical behavior and light transport, all recently developed in our group. The common rationale behind our research is that we exploit real-time processing power and computer graphics algorithms to enable interactive computational design tools that allow short feedback loops in design processes.

Shinji Watanabe: Practical Bayesian Methods for Speech and Language Processing

Shinji Watanabe is a Senior Principal Researcher at Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA, USA. He received his Ph.D. from Waseda University, Tokyo, Japan, in 2006. From 2001 to 2011, he was a research scientist at NTT Communication Science Laboratories, Kyoto, Japan. In 2009, he was a visiting scholar at the Georgia Institute of Technology, Atlanta, GA. His research interests include Bayesian machine learning, and speech and language processing. He has published more than 100 papers in journals and conferences, and received several awards including the best paper award from IEICE in 2003. He is currently an Associate Editor of the IEEE Transactions on Audio Speech and Language Processing, and a member of several committees including the IEEE Signal Processing Society Speech and Language Technical Committee (SLTC). His talk will take place on Tuesday, September 15, 1 pm in room E104.

Practical Bayesian Methods for Speech and Language Processing

In this talk, I will introduce practical Bayesian methods for speech and language processing, mainly focusing on Bayesian acoustic models for speech recognition. In general, speech and language processing involves extensive knowledge of statistical models. Both acoustic and language models are important parts of modern speech recognition systems where the models learned from real-world data present large complexity, ambiguity and uncertainty. Modeling the uncertainty is crucial to tackle model regularization for robust speech recognition. I will introduce the applications of several approximated Bayesian inference techniques including maximum a posteriori, asymptotic, and variational Bayesian methods to acoustic modeling, and discuss the effectiveness and the difficulties of these approximated methods. In addition, I will briefly explain the recent activities of the MERL speech and audio research.

Jiří Bittner: Recent Advances in Bounding Volume Hierarchies for Ray Tracing

Jiří Bittner is an associate professor at the Faculty of Electrical Engineering of the Czech Technical University in Prague. He received his Ph.D. in 2003 at the same institute. His research interests include visibility computations, real-time rendering, spatial data structures, and global illumination. He participated in a number of national and international research projects and also several commercial projects dealing with real-time rendering of complex scenes. His talk took place on Wednesday, June 10, 1pm in room E104.

Recent Advances in Bounding Volume Hierarchies for Ray Tracing

Abstract: In my talk I will briefly survey the usage of bounding volume hierarchies (BVH) for ray tracing acceleration. I will present a technique optimizing bounding volume hierarchies using an insertion-based global optimization procedure that leads to hierarchies of higher quality compared to the previous state-of-the-art methods. I will also discuss a modification of this technique for the incremental construction of BVH and outline the usage of the incremental construction for real-time ray tracing of complex models streamed over a network. I will further present a method that allows constructing a single BVH optimized for all frames of a given animation sequence. I will conclude my talk by presenting a new ray tracing acceleration technique combining BVHs and ray space hierarchies that allows real-time ray tracing of complex scenes that do not fit into the memory of the GPU.

Branislav Mičušík: Calibrating Surveillance Camera Networks

Branislav Mičušík is a senior scientist at the Austrian Institute of Technology. Prior to that in ’07-’09 he was a visiting research scholar at Stanford University, USA. In ’04-’07 he was a postdoctoral researcher at the Vienna University of Technology, Austria. He received his Ph.D. in ’04 from the Czech Technical University in Prague, at the Center of Machine Perception. His research interests are driven by the wish to teach computers and machines to understand what they see in order to infer their own location. He is a holder of the Microsoft Visual Computing Award 2011 given to the best young scientist in Visual Computing in Austria and the Best Scientific Paper Prize at the British Machine Vision Conference in ’07. His talk takes place on Wednesday, May 27, 11am in room E104.

Calibrating Surveillance Camera Networks

Abstract: Camera systems have witnessed a huge increase in the number of installed cameras, generating a massive amount of video data. Current computer vision technologies are not fully able to exploit the visual information available in such large camera networks, partially due to the lack of information about the cameras’ exact locations. Manual calibration with special calibration targets, especially in ad hoc large camera networks, does not scale well with the number of cameras and is too time-consuming to be practical. Therefore, a fully or semi-automatic method that requires minimal user effort and relies solely on visual information is an inevitable objective.

I present three approaches to tackle the calibration and localization problem of self-calibrating camera networks relying purely on available visual data. First, I present an approach for the calibration of cameras that builds on the latest achievements in the Structure from Motion community. This amounts to localizing a camera in a 3D model built a priori, consisting of either points or line segments. Second and third, I review our approaches to calibrating a single camera and multiple surveillance cameras, respectively, by detecting and tracking people. I show how multiple view geometry between overlapping and non-overlapping camera views with static and dynamic point correspondences gives a strong cue towards calibrating the cameras, yielding practically appealing solutions.

Rafał Mantiuk: From high dynamic range to perceptual realism

Rafał Mantiuk is a senior lecturer (associate professor) at Bangor University (UK) and a member of the Research Institute of Visual Computing. Before coming to Bangor he received his PhD from the Max-Planck-Institute for Computer Science (2006, Germany) and was a postdoctoral researcher at the University of British Columbia (Canada). He has published numerous journal and conference papers presented at ACM SIGGRAPH, Eurographics, CVPR and SPIE HVEI conferences, applied for several patents and was recognized by the Heinz Billing Award (2006). Rafał Mantiuk investigates how the knowledge of the human visual system and perception can be incorporated within computer graphics and imaging algorithms. His recent interests focus on designing imaging algorithms that adapt to human visual performance and viewing conditions in order to deliver the best images given limited resources, such as computation time or display contrast. His talk takes place on Friday, March 27 at 1pm, in room E104.

From high dynamic range to perceptual realism

Abstract: Today’s computer graphics techniques make it possible to create imagery that is hardly distinguishable from photographs. However, a photograph is clearly no match for an actual real-world scene. I argue that the next big challenge in graphics is to achieve perceptual realism by creating artificial imagery that would be hard to distinguish from reality. This requires profound changes in the entire imaging pipeline, from acquisition and rendering to display, with a strong focus on visual perception.

In this talk I will give a brief overview of several projects related to high dynamic range imaging and the applications of visual perception. Then I will discuss in more detail a project in which we explored the “dark side” of the dynamic range in order to model how people perceive images at low luminance. We use such a model to simulate the appearance of night scenes on regular displays, or to generate compensated images that reverse the changes in vision due to low luminance levels. The method can be used in games, driving simulators, or as a compensation for displays used under varying ambient light levels.

Jörn Anemüller: Machine learning approaches for estimation of a neuron’s spectro-temporal filter from non-Gaussian stimulus ensembles

Jörn Anemüller studied Physics at the University of Oldenburg, Germany, and Information Processing and Neural Networks at King’s College, University of London, where he received the M.Sc. in 1996. He earned the Ph.D. in Physics at the University of Oldenburg in 2001 with a dissertation on “Across frequency-processing in convolutive blind source separation”. From 2001 to 2004 he conducted work on biomedical signal analysis as a post-doctoral fellow at the Salk Institute for Biological Studies and at the University of California, San Diego. Since 2004 he has been a member of the scientific staff at the Dept. of Physics, University of Oldenburg, currently leading the statistical signal models research group. His interests include statistical signal processing and machine learning techniques with application to acoustic, speech and biomedical signals. His talk takes place on Thursday, March 26 at 1pm, in room D0206.

Machine learning approaches for estimation of a neuron’s spectro-temporal filter from non-Gaussian stimulus ensembles

Abstract: Engineers may view an auditory neuron as an unknown but (hopefully) identifiable system that transforms the acoustic stimulus input into a binary “spike” or “no-spike” output. The linear part of the neuron’s spectro-temporal transfer function is commonly referred to as the spectro-temporal receptive field (STRF). From a machine learning perspective, this setting corresponds to the binary classification problem of discriminating spike-eliciting from non-spike-eliciting stimulus examples. The classification-based receptive field (CbRF) estimation method that we proposed recently adapts a linear large-margin classifier to optimally predict experimental stimulus-response data and subsequently interprets the learned classifier weights as the neuron’s receptive field filter.

Efficacy of the CbRF method is validated with simulations and for auditory spectro-temporal receptive field estimation from experimental recordings in the auditory midbrain of Mongolian gerbils. Acoustic stimulation is performed with frequency-modulated tone complexes that mimic properties of natural stimuli, specifically non-Gaussian amplitude distribution and higher-order correlations. Results demonstrate that the proposed approach successfully identifies correct underlying STRFs, even in cases where standard second-order methods based on the spike-triggered average (STA) do not.

Applied to small data samples, the method is shown to converge on smaller amounts of experimental recordings and with lower estimation variance than the generalized linear model and recent information theoretic methods. Analysis of temporal variability of receptive fields quantifies differences between processing at different stages along the auditory pathway.

Implications for speech recognition and acoustic event detection are briefly discussed.
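
The classification view described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' CbRF implementation: it assumes a hypothetical neuron with a made-up filter `w_true`, a Laplace-distributed (non-Gaussian) stimulus ensemble, and a plain hinge-loss linear classifier trained by subgradient descent. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_neuron(n_samples=2000, n_dims=16, seed=0):
    """Simulate stimuli and binary spike labels for a hypothetical neuron."""
    rng = np.random.default_rng(seed)
    # Non-Gaussian (Laplace) stimulus ensemble; a stand-in for real stimuli.
    X = rng.laplace(size=(n_samples, n_dims))
    # Made-up "true" receptive field: a smooth bump, normalised to unit length.
    t = np.arange(n_dims)
    w_true = np.exp(-0.5 * ((t - 5) / 2.0) ** 2)
    w_true /= np.linalg.norm(w_true)
    # Spike (+1) when the noisy filtered stimulus exceeds a threshold, else -1.
    drive = X @ w_true + 0.1 * rng.standard_normal(n_samples)
    y = np.where(drive > 0.5, 1.0, -1.0)
    return X, y, w_true

def fit_linear_large_margin(X, y, lam=1e-3, lr=0.1, n_iter=300):
    """L2-regularised hinge loss minimised by full-batch subgradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        margins = y * (X @ w + b)
        active = margins < 1.0  # examples that violate the margin
        grad_w = lam * w - X[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

if __name__ == "__main__":
    X, y, w_true = simulate_neuron()
    w_hat, _ = fit_linear_large_margin(X, y)
    # The learned weight vector is read as the receptive field estimate.
    print("cosine similarity to true filter:", cosine_similarity(w_hat, w_true))
```

In this toy setting the learned weights are compared with the simulated filter by cosine similarity, since a classifier determines the filter only up to scale.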

Filip Šroubek – Advances in Image Restoration: from Theory to Practice

Filip Šroubek received the M.Sc. degree in computer science from the Czech Technical University, Prague, Czech Republic in 1998 and the Ph.D. degree in computer science from Charles University, Prague, Czech Republic in 2003. From 2004 to 2006, he held a postdoctoral position at the Instituto de Optica, CSIC, Madrid, Spain. In 2010/2011 he received a Fulbright Visiting Scholarship at the University of California, Santa Cruz. Currently he is with the Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic. His talk takes place on Tuesday, February 24 at 11am in room E105.

Advances in Image Restoration: from Theory to Practice

Abstract: We rely on images with ever growing emphasis. Our perception of the world is however limited by imperfect measuring conditions and devices used to acquire images. By image restoration, we understand mathematical procedures removing degradation from images. Two prominent topics of image restoration that have evolved considerably in the last 10 years are blind deconvolution and superresolution. Deconvolution by itself is an ill-posed inverse problem and one of the fundamental topics of image processing. The blind case, when the blur kernel is also unknown, is even more challenging and requires special optimization approaches to converge to the correct solution. Superresolution extends blind deconvolution by recovering lost spatial resolution of images. In this talk we will cover the recent advances in both topics that pave the way from theory to practice. Various real acquisition scenarios will be discussed together with proposed solutions for both blind deconvolution and superresolution and efficient numerical optimization methods, which allow fast implementation. Examples with real data will illustrate performance of the proposed solutions.
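
To make the remark about ill-posedness concrete, here is a minimal one-dimensional sketch of the simpler non-blind case, where the blur kernel is known. It is a textbook illustration, not the speaker's method: it assumes circular convolution, a Gaussian kernel, and a Tikhonov-regularised inverse in the Fourier domain, and all names and parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(n, sigma):
    """Circularly centred Gaussian blur kernel of length n, summing to 1."""
    x = np.arange(n)
    d = np.minimum(x, n - x)  # circular distance from index 0
    k = np.exp(-0.5 * (d / sigma) ** 2)
    return k / k.sum()

def blur(signal, kernel):
    """Circular convolution computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))

def inverse_filter(observed, kernel):
    """Naive deconvolution: divide by the kernel spectrum (amplifies noise)."""
    return np.real(np.fft.ifft(np.fft.fft(observed) / np.fft.fft(kernel)))

def tikhonov_deconvolve(observed, kernel, lam=1e-2):
    """Regularised deconvolution: conj(H) Y / (|H|^2 + lam)."""
    H = np.fft.fft(kernel)
    Y = np.fft.fft(observed)
    return np.real(np.fft.ifft(np.conj(H) * Y / (np.abs(H) ** 2 + lam)))

def demo(seed=0):
    rng = np.random.default_rng(seed)
    n = 256
    truth = np.zeros(n)
    truth[80:120] = 1.0
    truth[160:170] = 2.0
    kernel = gaussian_kernel(n, sigma=2.0)
    # Blurred observation with a small amount of additive noise.
    observed = blur(truth, kernel) + 0.01 * rng.standard_normal(n)
    err_naive = np.linalg.norm(inverse_filter(observed, kernel) - truth)
    err_reg = np.linalg.norm(tikhonov_deconvolve(observed, kernel) - truth)
    return err_naive, err_reg

if __name__ == "__main__":
    err_naive, err_reg = demo()
    print("naive inverse error:", err_naive)
    print("regularised error:  ", err_reg)
```

The naive inverse divides by near-zero high-frequency components of the kernel spectrum, so even slight noise dominates the result; the regularisation term `lam` bounds that amplification at the cost of some smoothing.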

Ondřej Chum: Visual Retrieval with Geometric Constraint

Ondřej Chum received the MSc degree in computer science from Charles University, Prague, in 2001 and the PhD degree from the Czech Technical University in Prague, in 2005. From 2005 to 2006, he was a research Fellow at the Centre for Machine Perception, Czech Technical University. From 2006 to 2007 he was a post-doc at the Visual Geometry Group, University of Oxford, UK. He is now an associate professor back at the Centre for Machine Perception. His research interests include object recognition, large-scale image and particular-object retrieval, invariant feature detection, and RANSAC-type optimization. He has co-organized the “25 years of RANSAC” Workshop in conjunction with CVPR 2006, Computer Vision Winter Workshop 2006, and Vision and Sports Summer School (VS3) in Prague 2012 and 2014. He was the recipient of the runner up award for the “2012 Outstanding Young Researcher in Image & Vision Computing” by the Journal of Image and Vision Computing for researchers within seven years of their PhD, and the Best Paper Prize at the British Machine Vision Conference in 2002. In 2013, he was awarded an ERC-CZ grant. His talk takes place on Wednesday, January 28 at 3pm in room E104.

Visual Retrieval with Geometric Constraint

Abstract: In the talk, I will address the topic of image retrieval. In particular, I will focus on retrieval methods based on bag of words image representation that exploit geometric constraints. Novel formulations of the image retrieval problem will be discussed, showing that the classical ranking of images based on similarity addresses only one of the possible user requirements. Retrieval methods efficiently solving the new formulations by exploiting geometric constraints will be used in different scenarios. These include online browsing of image collections, image analysis based on large collections of photographs, or model construction.

For online browsing, I will show queries that try to answer questions such as: “What is this?” (zoom in on a detail), “Where is that?” (zoom out to a larger visual context), or “What is to the left / right of this?”. For image analysis, two novel problems straddling the boundary between image retrieval and data mining are formulated: for every pixel in the query image, (i) find the database image with the maximum resolution depicting the pixel and (ii) find the frequency with which it is photographed in detail.