Summary of responses to my previous enquiry. (Chen Ke)

Subject: Summary of responses to my previous enquiry.
From: Chen Ke <chenke(at)PKU.EDU.CN>
Date: Tue, 11 Apr 1995 12:41:17 +0800

Hello,

I recently posted an enquiry about computational auditory models and received several responses. I am pleased to post those responses here. I would also like to thank again everyone who responded.

Ke Chen

---------------------------------------
Date: Mon, 3 Apr 1995 11:04:36 -0700
To: Chen Ke <chenke(at)pku.edu.cn>
From: lyon(at)apple.com (Richard Lyon)
Subject: Re: Enquiry about the work on computational auditory model.
Cc: AUDITORY(at)vm1.mcgill.ca

>To my best knowledge, almost all of work in this field focuses in the
>peripheral auditory model. Little work on the central auditory, auditory
>path and auditory cortex has already been reported. As a result, I want
>to investigate aforementioned work and conduct my research. I would appreciate
>it if anyone could give me pointers.

Dr. Ke Chen,

It's true that the modeling work gets thinner, harder to find, and harder to understand, assess, and apply, as you move more centrally into the auditory system. But there is a substantial body of work out there if you look hard enough.

Of course, it depends on what you mean by central, too. Are pitch and binaural mechanisms central? The auditory pathway has many levels, and these are probably more peripheral than central, but are not as peripheral as the cochlea and cochlear nucleus.

The correlation models of Licklider (1951, for pitch) and Jeffress (1948, for binaural) have spawned a lot of work in the last decade, my own included. Neurophysiologists have confirmed the existence of binaural cross-correlation circuits (e.g. TC Yin in cats; Konishi, Knudsen, and Sullivan in barn owls) and of delay-tuned correlators for pitch-like operations (N. Suga in bats). Knudsen's and Konishi's groups continue to do lots of studies and models of primarily spatial processing through I.C. and tectum, including learning.

Cortical modeling is in a more primitive state, but some attempts are being made (e.g. by Shamma) to understand and model the physiology.

There are numerous other groups active in auditory physiology and modeling, and I apologize for not having time to give a more balanced account.

Let us know what you intend to do with modeling, and maybe we can make more specific suggestions. Do you have an application in mind, or a particular level you want to model?

\Dick Lyon
(408)974-4245
Apple/ATG/InteractiveMedia/PerceptionSystems

-----------------------------------------
Date: Mon, 3 Apr 95 11:14:57 EDT
To: chenke(at)pku.edu.cn
From: "Dave Mellinger" <dave(at)ornith.cornell.edu>
Subject: Re: Enquiry about the work on computational auditory model.
Reply-To: dave(at)ornith.cornell.edu

Below are references for three Ph.D. theses (one of them mine) you might want to look at. They all have computational models which include higher parts of the auditory system.

There is an active group at the University of Sheffield in Britain working in this area (Cooke and Brown, below, were associated with it). Also, I'd suggest contacting DeLiang Wang at Ohio State University (+1-614-292-2911), as he is currently working on neurocomputational models of auditory processing. Try also Dan Ellis at the MIT Media Lab (dpwe(at)media.mit.edu), who's doing some interesting work.

=======================================================================
David K. Mellinger, Postdoctoral Research Associate
Bioacoustics Research Program          email dave(at)ornith.cornell.edu
Cornell Laboratory of Ornithology      phone +1-607-254-2431
159 Sapsucker Woods Road               fax   +1-607-254-2415
Ithaca, NY 14850-1999 USA
=======================================================================

@phdthesis{cooke:thesis,
  author = "Martin Peter Cooke",
  title = "Modelling Auditory Processing and Organisation",
  school = "University of Sheffield",
  year = 1991,
  month = may,
}

@phdthesis{brown:thesis,
  author = "Guy Jason Brown",
  title = "Computational Auditory Scene Analysis",
  school = "University of Sheffield",
  year = 1992,
  note = "published as Department of Computer Science Rept. CS-92-22",
}

@phdthesis{mellinger:thesis,
  author = "David K. Mellinger",
  title = "Event Formation and Separation in Musical Sound",
  school = "Stanford University",
  year = 1991,
  address = "Stanford, CA 94305",
}

-------------------------
Date: Mon, 3 Apr 95 09:21:22 EDT
From: ballas(at)AIC.NRL.Navy.Mil
To: chenke(at)pku.edu.cn
Subject: Re: Enquiry about the work on computational auditory model.

There was a series of papers published on computational approaches to sound interpretation in the following book:

Natural Computation, edited by Whitman Richards, Cambridge, MA: the MIT Press, 1988.

I have published a series of papers on how well people do in identifying brief everyday sounds, and examined a series of factors.
This work might provide guidance on what would be important in a computational model.

jim

--------------------
From: "Kien Seng, Wong" <ksw2(at)leicester.ac.uk>
To: chenke(at)pku.edu.cn
Date: Mon, 3 Apr 1995 14:10:15 +0100 (BST)
Subject: Re: Enquiry about the work on computational auditory model.

Hello there,

I read your message on the news server. I am also currently trying to model the auditory nervous system (ANS), starting with the VCN cells. I have looked at a few types of cells in the VCN and also the SOC in the past few months. It all depends on which area of sound perception you want to model. You should know that the SOC is recognised by many as the beginning stage of sound localisation processing. However, I must caution you that the SOC of humans is rather different from that of other mammals. The inferior colliculus has still not been investigated very much, so there seems to be little data in that area at the moment.

My current interests are the onset-c units in the VCN. Many have suspected that they are used as pitch processors.... A guess anyway...

I recommend you read papers by Young E.D, Sachs, Alan Palmer and Ray Meddis for some recent findings on the ANS. I have been trying to get some nice papers recently but have found nothing interesting. I will try to inform you if I come across anything.
Please let me know also if you have anything interesting.

Thanks

Kien Seng Wong
BTSP: Speech and Hearing Section
Engineering Department
University of Leicester
E-Mail : KSW2(at)LE.AC.UK

-----------------
Date: Tue, 4 Apr 1995 10:02:51 -0700
To: Chen Ke <chenke(at)pku.edu.cn>
From: lyon(at)apple.com (Richard Lyon)
Subject: Re: Enquiry about the work on computational auditory model.

> In particular, we're going to apply this computational
>model to speaker recognition.

Ke,

In that case maybe you don't care about binaural and spatial aspects of the auditory system, which is a lot of what the auditory system is about, and where a lot of the best work has been on modeling it.

Pitch, on the other hand, is almost certainly an important cue for speaker recognition, and this is an area where I have done some work.

I don't have much insight to offer on how auditory models are likely to apply to speaker recognition more generally.

\Dick Lyon
(408)974-4245
Apple/ATG/InteractiveMedia/PerceptionSystems

Here is a selection of my relevant publications. Let me know if you want copies of any that are hard to find locally.

Malcolm Slaney, Daniel Naar, and Richard F. Lyon, "Auditory Model Inversion for Sound Separation," Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, April 1994.

Richard F. Lyon, "Cost, Power, and Parallelism in Speech Signal Processing," Proc. IEEE 1993 Custom Integrated Circuits Conference, pp. 15.1.1-15.1.9, San Diego, CA, May 9-12, 1993.

Malcolm Slaney and Richard F. Lyon, "On the Importance of Time-- A Temporal Representation of Sound," chapter 5 in Visual Representations of Speech Signals, M. Cooke and Steve Beet (eds.), John Wiley & Sons Ltd., 1992.

Lloyd Watts, Doug Kerns, Richard Lyon, and Carver Mead, "Improved Implementation of the Silicon Cochlea," IEEE J. Solid State Circuits 27(5) pp. 692-700, May 1992.

Clive Summerfield and Richard Lyon, "ASIC Implementation of the Lyon Cochlea Model," Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, San Francisco, March 1992.

Malcolm Slaney and Richard F. Lyon, "Visualizing Sound with Auditory Correlograms," DRAFT submitted to JASA 1991; unfinished.

Malcolm Slaney and Richard F. Lyon, "Apple Hearing Demo Reel," Apple Technical Report #25, Apple Computer, Inc., Cupertino, 1991.

Richard F. Lyon, "Automatic Gain Control in Cochlear Mechanics", The Mechanics and Biophysics of Hearing, P. Dallos et al., eds., Springer-Verlag, 1990.

Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector," Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, April 1990.

Richard O. Duda, Richard F. Lyon, and Malcolm Slaney, "Correlograms and the Separation of Sound," 24th Asilomar Conference on Signals, Systems and Computers, IEEE, Maple Press, 1990.

Richard F. Lyon and Carver Mead, "Cochlear Hydrodynamics Demystified", Caltech Computer Science Technical Report Caltech-CS-TR-88-4, 1989.

Richard F. Lyon and Carver Mead, "Electronic Cochlea", Ch. 16 in Analog VLSI and Neural Systems, Carver Mead, Addison Wesley, 1989.

Richard F. Lyon and Carver A. Mead, "An Analog Electronic Cochlea" IEEE Trans. ASSP. 36(7), July 1988.

Richard F. Lyon and Carver A. Mead, "A CMOS VLSI Cochlea," Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, New York, April 1988.

Richard F. Lyon and Eric P. Loeb, "Experiments in Isolated Digit Recognition with a Cochlear Model--An Update", Proceedings Speech Recognition Workshop, DARPA, San Diego, March 1987.

Richard F. Lyon, "Speech Recognition in Scale Space", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, 1987.

Richard F. Lyon, "Speech Recognition Experiments with a Cochlear Model", Proceedings, DARPA Speech Recognition Workshop, Palo Alto, Feb. 1986, and shorter version in Proceedings of Montreal Symposium on Speech Recognition, McGill Univ., July, 1986.

Richard F. Lyon and Lounette Dyer, "Experiments with a Computational Model of the Cochlea", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Tokyo, 1986.

Richard F. Lyon and Niels Lauritzen, "Processing Speech with the Multi-Serial Signal Processor", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Tampa, March, 1985.

Richard F. Lyon, "Computational Models of Neural Auditory Processing", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, San Diego, March, 1984.

Richard F. Lyon, "A Computational Model of Binaural Localization and Separation", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Boston, April 1983.

Richard F. Lyon, "A Computational Model of Filtering, Detection, and Compression in the Cochlea", Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, Paris, May 1982.

-------------------------
To: Chen Ke <chenke(at)pku.edu.cn>
Subject: Re: Enquiry about your work in auditory model.
Date: Fri, 07 Apr 1995 10:41:15 -0400
From: "Dan Ellis" <dpwe(at)media.mit.edu>

Dear Dr. Chen -

Thank you for your message. I was interested in your original post to AUDITORY and in Dick Lyon's subsequent response.

My own interests lie in functional modeling of the higher auditory system -- a field which is now gaining some identity under the title of "Computational Auditory Scene Analysis": broadly, computer models attempting to reproduce the kinds of phenomena described in the book "Auditory Scene Analysis" by psychologist Albert Bregman. Although ultimately the study of the neurophysiology of the auditory centers in the brain will inform and (hopefully) confirm this work, my feeling is that it is difficult to interpret such research at the moment, and we are better able to make progress simply trying to reproduce the phenomena by any method we can get to work. Also, I am an engineer, not a physiologist, so I suppose my bias is showing.

I am focusing on the problem of auditory event detection (building a computer model able to predict when a listener will report that a new 'event' has occurred in a sound signal) and source separation (systems that can partition acoustic energy into different groups corresponding to percepts of independent sound sources); my tools are signal processing and the techniques of artificial intelligence, and my inspiration comes from psychoacoustics and auditory neurophysiology.

You can read about my previous work in the following short papers, available over the internet:

Ellis, D.P.W., Vercoe, B.L. (1992). A perceptual representation of audio for sound source separation. Presented to the 123rd meeting of the Acoustical Society of America, Salt Lake City.
ftp://sound.media.mit.edu/pub/Papers/asa-slc-92.ps.Z

Ellis, D.P.W. (1993). Hierarchic models of sound for separation and restoration. Proc. 1993 IEEE Mohonk workshop on Applications of Signal Processing to Acoustics and Audio.
ftp://sound.media.mit.edu/pub/Papers/waspaa93.ps.Z

Ellis, D.P.W. (1994). A computer implementation of psychoacoustic grouping rules. Proc. 12th Intl. Conf. on Pattern Recognition, Jerusalem.
ftp://sound.media.mit.edu/pub/Papers/ICPR-94.ps.Z

You can find out more about the research in our group through our web server, http://sound.media.mit.edu/, although it's not really all that informative at the moment.

If you have trouble downloading the papers, I can mail you paper copies. I should be glad to keep in touch over the areas that interest you.

I was in Beijing (and various other places in China) in 1987, as a tourist. It was fascinating but quite austere. How are things there now?

Best wishes,

-- DAn Ellis <dpwe(at)media.mit.edu>
MIT Media Lab Perceptual Computing - Machine Listening group.

----------------------------
Date: Fri, 7 Apr 1995 15:36:30 GMT
From: Jont Allen <jba(at)research.att.com>
To: Chen Ke <chenke(at)pku.edu.cn>
Subject: Re: Enquiry about the work on computational auditory model.

Ke,

I don't work that much on neural models, more on cochlear models. One of the best pieces of work is that of Shamma, J Neuro Phy. Also there are some papers in IEEE acoustics and speech, by Kuansan Wang (tiwain) and Shamma. Send Kuansan mail at kuansan(at)research.att.com. He is one of my coworkers here at Bell Labs.

------------------
Date: Mon, 10 Apr 1995 14:39:04 GMT
From: Kuansan Wang <kuansan(at)research.att.com>
To: Chen Ke <chenke(at)pku.edu.cn>
Subject: Re: Jont Allen

My PhD work was to establish an integrated computational model for auditory processing, ranging from the peripheral system to the primary cortex (A1).
I have a mathematical model based on physiological data on ferrets, and, from the data of psychoacoustic experiments, it seems to work for humans as well. More experiments are being designed and conducted by my thesis advisor and his colleagues, and they are recruiting more students with engineering backgrounds (like myself) to formulate observations and construct models.

The journal paper documenting the first stage of this work is going to appear in IEEE Trans on Speech and Audio this September. However, a shorter version has appeared in IEEE EMB magazine this March.

Since I joined Bell Labs, I have been applying my PhD work to speech recognition. It sounds like we have overlapping interests. I'll be happy to engage in more discussion and brainstorming on this matter. Please don't hesitate to write me.

p.s. Are you attending any conference in the US in the future by any chance? This year's ASA meeting in Washington DC has a special session on the central auditory system. It is held at the end of May.

--------------------------------------------------------------------------

This message came from the mail archive
http://www.auditory.org/postings/1995/
maintained by:

DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University