[AUDITORY] Final registration reminder: CHiME 2013, 2nd International Workshop on Machine Listening in Multisource Environments

               2nd International Workshop on 
Machine Listening in Multisource Environments (CHiME 2013) 

              in conjunction with ICASSP 2013 
              June 1, 2013, Vancouver, Canada 


*REGISTRATION DEADLINE*: May 11, 2013 (only 2 weeks left!) 




Model-based Speech Separation and Recognition: Yesterday, Today, and 
Steven J. Rennie, IBM 

Recently, model-based approaches for multi-talker speech separation and 
recognition have demonstrated great success in highly constrained 
scenarios, and efficient algorithms for separating data with literally 
*trillions* of underlying states have been unveiled. In less constrained 
scenarios, deep neural networks (DNNs) learned on features inspired by 
human auditory processing have shown great capacity for directly 
learning masking functions from parallel data. Ideally, a robust speech 
separation/recognition system should be continuously learning, adapting, 
and exploiting structure that is present in both target and peripheral 
signals and interactions, make minimal assumptions about the data to be 
separated/recognized, not require parallel data streams, and have 
essentially unlimited information capacity. In this talk I'll briefly 
review the current state of robust speech separation/recognition 
technology--where we are, where we apparently need to go, and how we 
might get there. I'll then discuss in more detail recent work that I've 
been involved with that is aligned with these goals. Specifically, I 
will discuss some new results on efficiently learning the structure of 
models and efficiently optimizing a wide class of matrix-valued 
functions, some recent work on Factorial Restricted Boltzmann machines 
for robust ASR, and finally, Direct product DBNs, a new architecture 
that makes it feasible to learn DNNs with literally *millions* of neurons. 

Recognizing and Classifying Environmental Sounds 
Daniel P.W. Ellis, Columbia University 

Animal hearing exists to extract useful information from the 
environment, and for many animals, over much of the evolutionary 
history of hearing, this sound environment has consisted not of 
speech or music but of more generic acoustic information arising 
from collisions, motions, and other events in the external world.  This 
aspect of sound analysis -- getting information out of non-speech, 
non-music, environmental sounds -- is finally beginning to gain 
popularity in research since it holds promise as a tool for automatic 
search and retrieval of audio/video recordings, an increasingly urgent 
problem.  I will discuss our recent work in using audio analysis to 
manage and search environmental sound archives (including personal audio 
lifelogs and consumer video collections), and illustrate with some of 
the approaches that work more or less well, with an effort to explain why. 


CHiME 2013 will consider the challenge of developing machine listening 
applications for operation in multisource environments, i.e. real-world 
conditions with acoustic clutter, where the number and nature of the 
sound sources are unknown and changing over time. It will bring together 
researchers from a broad range of disciplines (computational hearing, 
blind source separation, speech recognition, machine learning) to 
discuss novel and established approaches to this problem. The 
cross-fertilisation of ideas will foster fresh approaches that 
efficiently combine the complementary strengths of each research field. 

One highlight of the Workshop will be the presentation of the results of 
the 2nd CHiME Speech Separation and Recognition Challenge, a 
two-microphone multisource speech separation and recognition challenge 
supported by the IEEE AASP, MLSP and SLTC Technical Committees. To find 
out more, please visit http://spandh.dcs.shef.ac.uk/chime_challenge


To register, please visit 

The registration fee is 35 UK pounds and includes admission to the 
sessions, electronic proceedings, buffet lunch, and tea and coffee breaks. 


The workshop is taking place at the Hyatt Regency Vancouver, 655 Burrard 
Street -- close to the ICASSP 2013 venue -- on the day after ICASSP 
finishes, Saturday 1st June. Information about accommodation and how to 
get to and from downtown Vancouver can be found on the main ICASSP website: 

See you in Vancouver. 

Best regards, 

CHiME Organising Committee