Re: [AUDITORY] Logan's theorem - a challenge (Malcolm Slaney)


Subject: Re: [AUDITORY] Logan's theorem - a challenge
From:    Malcolm Slaney  <000001757ffb5fe1-dmarc-request@xxxxxxxx>
Date:    Sun, 26 Sep 2021 08:01:22 -0700

POCS. Projections onto Convex Sets [1]. Dick Lyon and I used POCS to invert [2] our favorite auditory model. A contemporaneous paper [3] from Shamma’s lab did the same.

Both the band-limited constraint and the known positive values of the signal define convex sets. We know in the frequency domain many parts of the spectrum are equal to zero. And in the time domain we know the values that are positive. We can iterate between the time and the frequency domain, each time projecting onto the appropriate constraint, to find the best solution.

I didn’t work out the theory, but since the auditory filter bank has a bandwidth of less than an octave, I think there must be only a single solution. In practice, just a handful of back and forth iterations was sufficient to find the solution. Piece of cake. :-)

Our interest in this problem was not to generate audio, a cute parlor trick, but to show that the auditory representation we were working with did not lose any perceptually important information.

- Malcolm

P.S. Reconstruction from zero crossings requires infinite resolution of the time of each zero crossing. That would be hard to do with a spike representation. Fortunately, there is a LOT more information in the HWR signal.

[1] https://en.wikipedia.org/wiki/Projections_onto_convex_sets

[2] M. Slaney, D. Naar, R. Lyon. Auditory model inversion for sound separation. Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing, 1994.
https://engineering.purdue.edu/~malcolm/apple/icassp94/CorrelogramInversion.pdf

[3] X. Yang, K. Wang, S.A. Shamma. Auditory representations of acoustic signals. IEEE Transactions on Information Theory, Volume: 38, Issue: 2, March 1992.
https://ieeexplore.ieee.org/document/119739

> On Sep 25, 2021, at 11:03 PM, Alain de Cheveigne <alain.de.cheveigne@xxxxxxxx> wrote:
>
> Hi all,
>
> Here’s a challenge for the young nimble minds on this list, and the old and wise.
>
> Logan’s theorem states that a signal can be reconstructed from its zero crossings, to a scale, as long as the spectral representation of that signal is less than an octave wide. It sounds like magic given that zero crossing information is so crude. How can the full signal be recovered from a sparse series of time values (with signs but no amplitudes)? “Band-limited” is clearly a powerful assumption.
>
> Why is this of interest in the auditory context? The band-limited premise is approximately valid for each channel of the cochlear filterbank (sometimes characterized as a 1/3 octave filter). While cochlear transduction is non-linear, Logan’s theorem suggests that any information lost due to that non-linearity can be restored, within each channel. If so, cochlear transduction is “transparent”, which is encouraging for those who like to speculate about neural models of auditory processing. An algorithm applicable to the sound waveform can be implemented by the brain with similar results, in principle.
>
> Logan’s theorem has been invoked by David Marr for vision and several authors for hearing (some refs below). The theorem is unclear as to how the original signal should be reconstructed, which is an obstacle to formulating concrete models, but in these days of machine learning it might be OK to assume that the system can somehow learn to use the information, granted that it’s there.
> The hypothesis has far-reaching implications, for example it implies that spectral resolution of central auditory processing is not limited by peripheral frequency analysis (as already assumed by, for example, phase opponency or lateral inhibitory hypotheses).
>
> Before venturing further along this limb, it’s worth considering some issues. First, Logan made clear that his theorem only applies to a perfectly band-limited signal, and might not be “approximately valid” for a signal that is “approximately band-limited”. No practical signal is band-limited, if only because it must be time limited, and thus the theorem might conceivably not be applicable at all. On the other hand, half-wave rectification offers much richer information than zero crossings, so perhaps the end result is valid (information preserved) even if the theorem is not applicable stricto sensu. Second, there are many other imperfections such as adaptation, stochastic sampling to a spike-based representation, and so on, that might affect the usefulness of the hypothesis.
>
> The challenge is to address some of these loose ends. For example:
> (1) Can the theorem be extended to make use of a halfwave-rectified signal rather than zero crossings? Might that allow it to be applicable to practical time-limited signals?
> (2) What is the impact of real cochlear filter characteristics, adaptation, or stochastic sampling?
> (3) In what sense can one say that the acoustic signal is “available” to neural signal processing? What are the limits of that concept?
> (4) Can all this be formulated in a way intelligible by non-mathematical auditory scientists?
>
> This is the challenge. The reward is - possibly - a better understanding of how our brain hears the world.
>
> Alain
>
> ---
> Logan BF, JR. (1977) Information in the zero crossings of bandpass signals. Bell Syst. Tech. J. 56:487–510.
>
> Marr, D. (1982) VISION - A Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman and Co, republished by MIT press 2010.
>
> Heinz, M.G., Swaminathan J. (2009) Quantifying Envelope and Fine-Structure Coding in Auditory Nerve Responses to Chimaeric Speech, JARO 10: 407–423
> DOI: 10.1007/s10162-009-0169-8.
>
> Shamma, S, Lorenzi, C (2013) On the balance of envelope and temporal fine structure in the encoding of speech in the early auditory system, J. Acoust. Soc. Am. 133, 2818–2833.
>
> Parida S, Bharadwaj H, Heinz MG (2021) Spectrally specific temporal analyses of spike-train responses to complex sounds: A unifying framework. PLoS Comput Biol 17(2): e1008155. https://doi.org/10.1371/journal.pcbi.1008155
>
> de Cheveigné, A. (in press) Harmonic Cancellation, a Fundamental of Auditory Scene Analysis. Trends in Hearing (https://psyarxiv.com/b8e5w/).
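
The POCS iteration described in the reply above (alternate between zeroing the out-of-band spectrum and restoring the known positive samples) can be sketched roughly as follows. This is only a toy illustration, not the code behind [2] or [3]: it assumes a single channel whose passband is an exact set of FFT bins, a periodic test signal, and noise-free half-wave rectification. The function names, the bin range 40..69, the signal length, and the iteration count are all made up for the example.

```python
import numpy as np

def project_bandlimited(x, passband):
    """Project onto the set of signals whose spectrum is zero outside the passband."""
    spectrum = np.fft.rfft(x)
    spectrum[~passband] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def project_hwr(x, hwr):
    """Project onto the set of signals consistent with the half-wave rectified data:
    equal to hwr where hwr > 0, and non-positive everywhere else."""
    known = hwr > 0
    y = np.minimum(x, 0.0)   # samples that were clipped must be <= 0
    y[known] = hwr[known]    # positive samples are known exactly
    return y

def pocs_invert(hwr, passband, n_iter=300):
    """Alternate the two projections, ending on the band-limited set."""
    x = hwr.copy()
    for _ in range(n_iter):
        x = project_bandlimited(x, passband)
        x = project_hwr(x, hwr)
    return project_bandlimited(x, passband)

# Toy test signal (assumption): random spectrum confined to bins 40..69,
# i.e. less than an octave wide.
rng = np.random.default_rng(0)
n = 1024
passband = np.zeros(n // 2 + 1, dtype=bool)
passband[40:70] = True
coeffs = np.zeros(n // 2 + 1, dtype=complex)
n_bins = int(passband.sum())
coeffs[passband] = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
original = np.fft.irfft(coeffs, n=n)

hwr = np.maximum(original, 0.0)          # half-wave rectified observation
estimate = pocs_invert(hwr, passband)

rel_error = np.linalg.norm(estimate - original) / np.linalg.norm(original)
print(f"relative reconstruction error: {rel_error:.2e}")
```

Both constraint sets are convex (a linear subspace, and an intersection of fixed-value and half-space constraints), which is what lets the alternating projections settle on a signal satisfying both. A real cochlear channel is only approximately band-limited, so the behaviour on real data may differ from this idealized case.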


This message came from the mail archive
src/postings/2021/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University