Monthly Archives: March 2018

Niko Brummer: Tractable priors, likelihoods, posteriors and proper scoring rules for the astronomically complex problem of partitioning a large set of recordings w.r.t. speaker

Niko Brummer received B.Eng (1986), M.Eng (1988) and Ph.D. (2010) degrees, all in electronic engineering, from Stellenbosch University. He worked as a researcher at DataFusion (later called Spescom DataVoice) and AGNITIO, and is currently with Nuance Communications. Most of his research over the last 25 years has been applied to automatic speaker and language recognition, and he has participated in most of the NIST SRE and LRE evaluations in these technologies from 2000 to the present. He has been contributing to the Odyssey Workshop series since 2001 and was the organizer of Odyssey 2008 in Stellenbosch. His FoCal and Bosaris Toolkits are widely used for fusion and calibration in speaker and language recognition research.

His research interests include the development of new algorithms for speaker and language recognition, as well as evaluation methodologies for these technologies. In both cases, his emphasis is on probabilistic modelling. He has worked with both generative (eigenchannel, JFA, i-vector PLDA) and discriminative (system fusion, discriminative JFA and PLDA) recognizers. In evaluation, his focus is on judging the goodness of classifiers that produce probabilistic outputs in the form of well-calibrated class likelihoods.

Tractable priors, likelihoods, posteriors and proper scoring rules for the astronomically complex problem of partitioning a large set of recordings w.r.t. speaker

Real-world speaker recognition problems are not always arranged into neat, NIST-style challenges with large labelled training databases and binary target/non-target evaluation trials. In the most general case we are given a (sometimes large) collection of recordings and ideally we just want to go and recognize the speakers in there. This problem is usually called speaker clustering, and solutions like AHC (agglomerative hierarchical clustering) exist. The catch is that neither AHC nor indeed any other yet-to-be-invented algorithm can find the correct solution with certainty. In the simple case of binary trials, we in the speaker recognition world are already very comfortable dealing with this uncertainty: the recognizers quantify their uncertainty as likelihood ratios. We know how to calibrate these likelihood ratios, how to use them to make Bayes decisions and how to judge their goodness with proper scoring rules. At first glance, all of these things seem to be hopelessly intractable for the clustering problem because of the astronomically large size of the solution space. In this talk I show otherwise and propose a suite of tractable tools for probabilistic clustering.
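To give a feel for how large the solution space is, the short sketch below counts the ways a set of n recordings can be partitioned into speaker clusters. This count is the Bell number B(n), a standard combinatorial fact rather than something taken from the talk itself. The function name `bell_number` is chosen here for illustration only, and the sketch assumes every partition of the recordings is a candidate solution.

```python
def bell_number(n: int) -> int:
    """Return B(n), the number of ways to partition n labelled items.

    Uses the Bell triangle: each new row starts with the last entry of
    the previous row, and each further entry adds the entry to its left
    to the entry above that one. The first entry of row n is B(n).
    """
    row = [1]  # row 0 of the Bell triangle, so B(0) = 1
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    # Illustrative sizes only; real collections can be far larger.
    for n in (5, 10, 20, 50):
        print(f"{n} recordings: {bell_number(n)} possible partitions")
```

Even 10 recordings already admit 115,975 distinct partitions, and the count grows faster than exponentially, which is why enumerating every hypothesis is not an option for collections of realistic size.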

His talk takes place on Monday, April 16, 2018 at 13:00 in room G202.

Video recording of the talk is publicly available.

Slides of the talk are publicly available.

David Filip: Standardization and Research

David Filip is Chair (Convener) of OASIS XLIFF OMOS TC; Secretary, Lead Editor and Liaison Officer of OASIS XLIFF TC; a former Co-Chair and Editor for the W3C ITS 2.0 Recommendation; Steering Committee member of GALA TAPICC; Advisory Editorial Board member for the Multilingual magazine; and co-moderator of the Standards IG at JIAMCATT. David has also been appointed as NSAI expert to ISO TC 37/SC 3 and /SC 5, and to ISO/IEC JTC 1/WG 9, /SC 38, and /SC 42. His specialties include open standards and process metadata, workflow and meta-workflow automation. David works as a Research Fellow at the ADAPT Research Centre, Trinity College Dublin, Ireland. Before 2011, he oversaw key research and change projects for Moravia’s worldwide operations. David held research scholarships at universities in Vienna, Hamburg and Geneva, and graduated in 2004 from Brno University with a PhD in Analytic Philosophy. David also holds master’s degrees in Philosophy, Art History, Theory of Art and German Philology.

Standardization and Research

David will explain the multilingual content standardization ecosystem, starting with foundational standards such as XML and Unicode, moving through XML vocabularies for payload and metadata exchange, and ending with API and reference architecture specifications. He will explain basic standardization principles with special regard for internet-based technologies, touching on different standardization cultures ranging from industry associations and ad hoc consortia, through IETF, OASIS, W3C and Unicode, to traditional SDOs such as ISO, ISO/IEC and ASTM. David will also touch on the relationship between standardization, research, and innovation, and on whether it is important for research groups and institutes to participate in standardization. He will explain the difference between anticipatory and post hoc standardization and how royalty-free standards create and grow markets for technology and innovation.

His talk takes place on Thursday, March 22, 2018 at 13:00 in room E104.