Re: [AUDITORY] Logan's theorem - a challenge (Ken Grant)

Subject: Re: [AUDITORY] Logan's theorem - a challenge
From: Ken Grant <ken.w.grant@xxxxxxxx>
Date: Tue, 28 Sep 2021 04:43:32 -0400

This discussion reminded me of an important and often overlooked paper by Ron Cole and Brian Scott:

Cole, R. A., & Scott, B. (1974). Toward a theory of speech perception. Psychological Review, 81(4), 348–374. https://doi.org/10.1037/h0036656 <https://psycnet.apa.org/doi/10.1037/h0036656>

V/r

Ken

On Tue, Sep 28, 2021 at 2:42 AM Prof Leslie Smith <l.s.smith@xxxxxxxx> wrote:

> I sent this originally to Alain de Cheveigne, but perhaps I should have made
> it more public. Here goes.
>
> Dear Alain:
>
> I did some related work with my student Madhuranda Pahar some while ago:
> it ended up with the publication linked to below.
>
> What we did was to resynthesize speech (or any other sound) from
> zero-crossings (positive-going only) in band-limited signals (using the
> gammatone filterbank) plus some information about the maximal size of the
> signal in the previous half-cycle.
>
> In essence, given a surprisingly small number of channels, plus a little
> information about the signal level (i.e. a log-based coding of the signal
> amplitude in the previous half-cycle, using 4 or 5 values - called
> threshold levels in the paper), one can quite easily make out the speech.
>
> It's not a wonderful paper, and could do with more work and more examples,
> and the resynthesis is not particularly straightforward (but that's not
> important - what matters is the possibility of resynthesis, as the brain
> interprets the AN signal, rather than re-creating it). And we'd never heard
> of Logan's theorem (unfortunately!).
>
> Still, I hope this might be of interest. I believe I have the Matlab code
> still (but it could do with being reworked).
>
> The paper can be found at
> http://www.cs.stir.ac.uk/~lss/recentpapers/PID6701133.pdf
>
> Reference: M. Pahar, L.S. Smith, Coding and Decoding Speech using a
> Biologically Inspired Coding System,
> presented at IEEE SSCI 2020 (virtual conference), 1-4 December 2020. DOI
> 10.1109/SSCI47803.2020.9308328.
>
> --Leslie Smith
>
> Alain de Cheveigne wrote:
> > Hi all,
> >
> > Here’s a challenge for the young nimble minds on this list, and the old
> > and wise.
> >
> > Logan’s theorem states that a signal can be reconstructed from its zero
> > crossings, to a scale, as long as the spectral representation of that
> > signal is less than an octave wide. It sounds like magic given that zero
> > crossing information is so crude. How can the full signal be recovered
> > from a sparse series of time values (with signs but no amplitudes)?
> > “Band-limited” is clearly a powerful assumption.
> >
> > Why is this of interest in the auditory context? The band-limited premise
> > is approximately valid for each channel of the cochlear filterbank
> > (sometimes characterized as a 1/3 octave filter). While cochlear
> > transduction is non-linear, Logan’s theorem suggests that any
> > information lost due to that non-linearity can be restored, within each
> > channel. If so, cochlear transduction is “transparent”, which is
> > encouraging for those who like to speculate about neural models of
> > auditory processing. An algorithm applicable to the sound waveform can be
> > implemented by the brain with similar results, in principle.
> >
> > Logan’s theorem has been invoked by David Marr for vision and several
> > authors for hearing (some refs below).
> > The theorem is unclear as to how
> > the original signal should be reconstructed, which is an obstacle to
> > formulating concrete models, but in these days of machine learning it
> > might be OK to assume that the system can somehow learn to use the
> > information, granted that it’s there. The hypothesis has far-reaching
> > implications, for example it implies that spectral resolution of central
> > auditory processing is not limited by peripheral frequency analysis (as
> > already assumed by for example phase opponency or lateral inhibitory
> > hypotheses).
> >
> > Before venturing further along this limb, it’s worth considering some
> > issues. First, Logan made clear that his theorem only applies to a
> > perfectly band-limited signal, and might not be “approximately valid”
> > for a signal that is “approximately band-limited”. No practical
> > signal is band-limited, if only because it must be time limited, and thus
> > the theorem might conceivably not be applicable at all. On the other
> > hand, half-wave rectification offers much richer information than zero
> > crossings, so perhaps the end result is valid (information preserved) even
> > if the theorem is not applicable stricto sensu. Second, there are many
> > other imperfections such as adaptation, stochastic sampling to a
> > spike-based representation, and so on, that might affect the usefulness of
> > the hypothesis.
> >
> > The challenge is to address some of these loose ends. For example:
> > (1) Can the theorem be extended to make use of a halfwave-rectified signal
> > rather than zero crossings? Might that allow it to be applicable to
> > practical time-limited signals?
> > (2) What is the impact of real cochlear filter characteristics,
> > adaptation, or stochastic sampling?
> > (3) In what sense can one say that the acoustic signal is “available” to
> > neural signal processing? What are the limits of that concept?
> > (4) Can all this be formulated in a way intelligible by non-mathematical
> > auditory scientists?
> >
> > This is the challenge. The reward is - possibly - a better understanding
> > of how our brain hears the world.
> >
> > Alain
> >
> > ---
> > Logan BF, Jr. (1977) Information in the zero crossings of bandpass
> > signals. Bell Syst. Tech. J. 56:487–510.
> >
> > Marr, D. (1982) VISION - A Computational Investigation into the Human
> > Representation and Processing of Visual Information. W.H. Freeman and Co,
> > republished by MIT press 2010.
> >
> > Heinz, M.G., Swaminathan J. (2009) Quantifying Envelope and Fine-Structure
> > Coding in Auditory Nerve Responses to Chimaeric Speech, JARO 10:407–423.
> > DOI: 10.1007/s10162-009-0169-8.
> >
> > Shamma, S, Lorenzi, C (2013) On the balance of envelope and temporal fine
> > structure in the encoding of speech in the early auditory system, J.
> > Acoust. Soc. Am. 133, 2818–2833.
> >
> > Parida S, Bharadwaj H, Heinz MG (2021) Spectrally specific temporal
> > analyses of spike-train responses to complex sounds: A unifying framework.
> > PLoS Comput Biol 17(2): e1008155.
> > https://doi.org/10.1371/journal.pcbi.1008155
> >
> > de Cheveigné, A. (in press) Harmonic Cancellation, a Fundamental of
> > Auditory Scene Analysis. Trends in Hearing (https://psyarxiv.com/b8e5w/).
>
> --
> Prof Leslie Smith (Emeritus)
> Computing Science & Mathematics,
> University of Stirling, Stirling FK9 4LA
> Scotland, UK
> Tel +44 1786 467435
> Web: http://www.cs.stir.ac.uk/~lss
> Blog: http://lestheprof.com

--
Ken W. Grant, Ph.D.
Chief, Scientific and Clinical Studies Section
America Building, Room 5601
Walter Reed National Military Medical Center
4954 North Palmer Road
Bethesda, MD 20889-5630
OFFICE: 301-319-7043
CELL: 301-919-2957
kenneth.w.grant.civ@xxxxxxxx
ken.w.grant@xxxxxxxx
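
Editorial sketch (not part of the original message): the coding Leslie Smith describes, positive-going zero crossings plus a coarse log-scale code for the signal size in the previous half-cycle, might look roughly like the Python below. This is only an illustration under stated assumptions and is not the authors' Matlab code. A single sine wave stands in for one band-limited filterbank channel (no gammatone filter is applied), and the function names, the 4 levels and the -40 dB floor are all hypothetical choices made for the example.

```python
import numpy as np

def positive_zero_crossings(x, fs):
    """Return (times in seconds, sample indices) of positive-going zero crossings."""
    x = np.asarray(x, dtype=float)
    # Sample i is at or below zero and sample i + 1 is above zero.
    idx = np.flatnonzero((x[:-1] <= 0.0) & (x[1:] > 0.0))
    # Linear interpolation between the two samples that bracket each crossing.
    frac = -x[idx] / (x[idx + 1] - x[idx])
    return (idx + frac) / fs, idx

def previous_half_cycle_levels(x, idx, n_levels=4, floor_db=-40.0):
    """Quantize the size of the negative half-cycle before each crossing.

    The size is expressed in dB relative to the largest |x| in the signal and
    mapped onto n_levels equal-width steps between floor_db and 0 dB.
    Both n_levels and floor_db are assumptions made for this sketch.
    """
    x = np.asarray(x, dtype=float)
    ref = np.max(np.abs(x))
    # Each segment runs from just after the previous crossing to this one.
    starts = np.concatenate(([0], idx[:-1] + 1))
    codes = np.zeros(len(idx), dtype=int)
    for k, (a, b) in enumerate(zip(starts, idx)):
        peak = np.max(-x[a:b + 1])            # depth of the negative excursion
        peak_db = 20.0 * np.log10(max(peak, 1e-12) / ref)
        step = (peak_db - floor_db) / (-floor_db) * n_levels
        codes[k] = int(np.clip(np.floor(step), 0, n_levels - 1))
    return codes

# Stand-in for one narrow-band channel: a 100 Hz sine sampled at 8 kHz.
fs = 8000
t = np.arange(800) / fs
x = np.sin(2 * np.pi * 100 * t - 0.3)

times, idx = positive_zero_crossings(x, fs)
codes = previous_half_cycle_levels(x, idx)
```

Each channel is then represented by the pairs (times[k], codes[k]). Going back from those pairs to a waveform is the harder step, and it is not attempted here.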

This message came from the mail archive
src/postings/2021/
maintained by:

DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University