
Re: [AUDITORY] Logan's theorem - a challenge



This discussion reminded me of an important and often overlooked paper by Ron Cole and Brian Scott:

Cole, R. A., & Scott, B. (1974). Toward a theory of speech perception. Psychological Review, 81(4), 348–374. https://doi.org/10.1037/h0036656

V/r

Ken

On Tue, Sep 28, 2021 at 2:42 AM Prof Leslie Smith <l.s.smith@xxxxxxxxxxxxx> wrote:
I sent this originally to Alain de Cheveigné, but perhaps I should have made
it more public. Here goes.

Dear Alain:

I did some related work with my student Madhuranda Pahar some while ago:
it ended up with the publication linked to below.

What we did was to resynthesize speech (or any other sound) from
zero-crossings (positive-going only) in band-limited signals (using the
gammatone filterbank) plus some information about the maximal size of the
signal in the previous half-cycle.

In essence, given a surprisingly small number of channels, plus a little
information about the signal level (i.e. a log-based coding of the signal
amplitude in the previous half-cycle, using 4 or 5 values - called
threshold levels in the paper), one can quite easily make out the speech.
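
[Editorial illustration, not the authors' Matlab code.] The coding described above can be sketched for a single channel as follows. This is a minimal sketch under stated assumptions: the input is taken to be already band-limited (no gammatone filterbank is applied here), the function name `encode_channel`, the 40 dB range, and the equal-width dB steps are hypothetical choices, and "maximal size in the previous half-cycle" is read as the peak magnitude of the negative half-cycle that ends at each positive-going zero crossing.

```python
import math

def encode_channel(x, fs, n_levels=4, floor_db=-40.0):
    """Code one band-limited channel as (time_s, level) events.

    One event is emitted per positive-going zero crossing. `level` is a
    log-quantized code (0 .. n_levels-1) of the peak magnitude reached in
    the negative half-cycle just before the crossing, in dB re 1.0.
    The dB range and step size are illustrative assumptions.
    """
    step_db = -floor_db / n_levels
    events = []
    peak = 0.0
    for i in range(1, len(x)):
        prev, cur = x[i - 1], x[i]
        if prev < 0.0:
            # Track the largest magnitude seen in the current negative half-cycle.
            peak = max(peak, -prev)
        if prev < 0.0 <= cur:
            # Linear interpolation of the crossing time between the two samples.
            frac = prev / (prev - cur)
            t = (i - 1 + frac) / fs
            level_db = 20.0 * math.log10(peak)
            level = int((level_db - floor_db) // step_db)
            events.append((t, min(max(level, 0), n_levels - 1)))
            peak = 0.0
    return events

# Usage: a 100 Hz tone sampled at 8 kHz for 0.1 s.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
events = encode_channel(tone, fs)
```

Scaling the input changes only the level codes, not the crossing times, which is why some amplitude information has to be carried alongside the zero crossings.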

It's not a wonderful paper, and could do with more work and more examples,
and the resynthesis is not particularly straightforward (but that's not
important: what matters is the possibility of resynthesis, as the brain
interprets the AN signal rather than re-creating it). And we'd never heard
of Logan's theorem (unfortunately!).

Still, I hope this might be of interest. I believe I still have the Matlab
code (but it could do with being reworked).

The paper can be found at
http://www.cs.stir.ac.uk/~lss/recentpapers/PID6701133.pdf

Reference: M. Pahar, L.S. Smith, Coding and Decoding Speech using a
Biologically Inspired Coding System,
presented at IEEE SSCI 2020 (virtual conference), 1-4 December 2020. DOI
10.1109/SSCI47803.2020.9308328.

--Leslie Smith

Alain de Cheveigne wrote:
> Hi all,
>
> Here’s a challenge for the young nimble minds on this list, and the old
> and wise.
>
> Logan’s theorem states that a signal can be reconstructed from its zero
> crossings, to a scale, as long as the spectral representation of that
> signal is less than an octave wide.  It sounds like magic given that zero
> crossing information is so crude. How can the full signal be recovered
> from a sparse series of time values (with signs but no amplitudes)?
> “Band-limited” is clearly a powerful assumption.
>
> Why is this of interest in the auditory context?  The band-limited premise
> is approximately valid for each channel of the cochlear filterbank
> (sometimes characterized as a 1/3 octave filter).  While cochlear
> transduction is non-linear, Logan’s theorem suggests that any
> information lost due to that non-linearity can be restored, within each
> channel. If so, cochlear transduction is “transparent”, which is
> encouraging for those who like to speculate about neural models of
> auditory processing. An algorithm applicable to the sound waveform can be
> implemented by the brain with similar results, in principle.
>
> Logan’s theorem has been invoked by David Marr for vision and several
> authors for hearing (some refs below). The theorem is unclear as to how
> the original signal should be reconstructed, which is an obstacle to
> formulating concrete models, but in these days of machine learning it
> might be OK to assume that the system can somehow learn to use the
> information, granted that it’s there.  The hypothesis has far-reaching
> implications, for example it implies that spectral resolution of central
> auditory processing is not limited by peripheral frequency analysis (as
> already assumed by for example phase opponency or lateral inhibitory
> hypotheses).
>
> Before venturing further along this limb, it’s worth considering some
> issues.  First, Logan made clear that his theorem only applies to a
> perfectly band-limited signal, and might not be “approximately valid”
> for a signal that is “approximately band-limited”.  No practical
> signal is band-limited, if only because it must be time limited, and thus
> the theorem might conceivably not be applicable at all.  On the other
> hand, half-wave rectification offers much richer information than zero
> crossings, so perhaps the end result is valid (information preserved) even
> if the theorem is not applicable stricto sensu.  Second, there are many
> other imperfections such as adaptation, stochastic sampling to a
> spike-based representation, and so on, that might affect the usefulness of
> the hypothesis.
>
> The challenge is to address some of these loose ends. For example:
> (1) Can the theorem be extended to make use of a halfwave-rectified signal
> rather than zero crossings? Might that allow it to be applicable to
> practical time-limited signals?
> (2) What is the impact of real cochlear filter characteristics,
> adaptation, or stochastic sampling?
> (3) In what sense can one say that the acoustic signal is “available” to
> neural signal processing?  What are the limits of that concept?
> (4) Can all this be formulated in a way intelligible by non-mathematical
> auditory scientists?
>
> This is the challenge.  The reward is - possibly - a better understanding
> of how our brain hears the world.
>
> Alain
>
> ---
> Logan BF, JR. (1977) Information in the zero crossings of bandpass
> signals. Bell Syst. Tech. J. 56:487–510.
>
> Marr, D. (1982) VISION - A Computational Investigation into the Human
> Representation and Processing of Visual Information. W.H. Freeman and Co,
> republished by MIT press 2010.
>
> Heinz, M.G., Swaminathan J. (2009) Quantifying Envelope and Fine-Structure
> Coding in Auditory Nerve Responses to Chimaeric Speech, JARO 10: 407–423
> DOI: 10.1007/s10162-009-0169-8.
>
> Shamma, S, Lorenzi, C (2013) On the balance of envelope and temporal fine
> structure in the encoding of speech in the early auditory system, J.
> Acoust. Soc. Am. 133, 2818–2833.
>
> Parida S, Bharadwaj H, Heinz MG (2021) Spectrally specific temporal
> analyses of spike-train responses to complex sounds: A unifying framework.
> PLoS Comput Biol 17(2): e1008155.
> https://doi.org/10.1371/journal.pcbi.1008155
>
> de Cheveigné, A. (in press) Harmonic Cancellation, a Fundamental of
> Auditory Scene Analysis. Trends in Hearing (https://psyarxiv.com/b8e5w/).


--
Prof Leslie Smith (Emeritus)
Computing Science & Mathematics,
University of Stirling, Stirling FK9 4LA
Scotland, UK
Tel +44 1786 467435
Web: http://www.cs.stir.ac.uk/~lss
Blog: http://lestheprof.com
--
Ken W. Grant, Ph.D.
Chief, Scientific and Clinical Studies Section
America Building, Room 5601
Walter Reed National Military Medical Center
4954 North Palmer Road
Bethesda, MD 20889-5630
 
OFFICE:  301-319-7043
CELL:  301-919-2957
 
kenneth.w.grant.civ@xxxxxxxx
ken.w.grant@xxxxxxxxx