Post-doctoral position on audiovisual speech separation (deadline Sept 1)

TITLE: Environment-robust audiovisual speech separation
RECRUITMENT DATE: as soon as possible between October 1 and December 1, 2011
DURATION: 18 months
SALARY: depending on experience
PRINCIPAL INVESTIGATOR: Nancy Bertin (nancy.bertin@xxxxxxxx)
CO-PRINCIPAL INVESTIGATOR: Emmanuel Vincent (emmanuel.vincent@xxxxxxxx)

Speech separation is the task of estimating the signal of each speaker within a recorded sound scene involving one or more speakers and background noise. Existing approaches have typically been assessed in specific environments, e.g. meeting environments involving concurrent speakers and moderate reverberation or outdoor environments involving diffuse background noise but no reverberation [1].

The purpose of this postdoctoral position is to propose a source separation algorithm applicable to a wide range of environments, together with a set of associated use case scenarios. In the first stage, an experimental multi-environment benchmark will be developed and a number of state-of-the-art algorithms will be evaluated on it. In the second stage, new environment-robust algorithms will be investigated by designing improved speaker and background noise models and integrating them, together with the available video information, into the state-of-the-art source separation framework based on variance modeling [2,3,4]. A range of separation-related tasks, such as enhancement and denoising, will be proposed and evaluated in order to identify the use case scenarios that make the best use of the technology at hand.

[1] E. Vincent, S. Araki, F.J. Theis, G. Nolte, P. Bofill, H. Sawada, A. Ozerov, B.V. Gowreesunker, D. Lutter, and N.Q.K. Duong, "The Signal Separation Evaluation Campaign (2007-2010): Achievements and remaining challenges", Technical Report RR-7581, INRIA, 2011.
[2] E. Vincent, M.G. Jafari, S.A. Abdallah, M.D. Plumbley, and M.E. Davies, "Probabilistic modeling paradigms for audio source separation", in Machine Audition: Principles, Algorithms and Systems, IGI Global, pp. 162-185, 2010.
[3] A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework for the handling of prior information in audio source separation", Technical Report RR-7453, INRIA, 2010.
[4] A. Llagostera Casanovas, G. Monaci, P. Vandergheynst, and R. Gribonval, "Blind audiovisual source separation based on sparse redundant representations", IEEE Transactions on Multimedia, 12(5), pp. 358-371, 2010.

CNRS, the French National Center for Scientific Research, and INRIA, the French National Institute for Research in Computer Science and Control, both play a leading role in the development of Information Science and Technology (IST) in Europe. The METISS team (http://www.irisa.fr/metiss/) gathers a staff of 20 people focusing on audio signal processing research within the joint CNRS/INRIA lab called IRISA in Rennes. This position is part of a collaborative project with Canon Research Centre France (CRF) in nearby Cesson-Sévigné. It will involve regular exchanges and collaboration with the Audio Research Team at CRF.

Prospective candidates must hold, or be about to defend, a PhD in audio signal processing. Proficiency in Matlab programming is required. Additional expertise in audio benchmarking, source separation, or audiovisual processing would be an asset.

Applications, including a full resume, a letter of motivation, and up to three reference letters, must be sent by email to the principal investigator before September 1, 2011. Phone interviews with selected candidates will be held in early September.