
Re: Roughness in audio and vision

Hi Bryan,

The classic models for auditory roughness (the sensation caused by
temporal envelope modulations in the frequency range of 20-120 Hz)
came from Terhardt and are described in Zwicker and Fastl (1990,
2007).  Daniel and Weber (1997) have an updated model and that's a
good reference for the auditory version of roughness.
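
To make the envelope-modulation definition above concrete, here is a
minimal sketch (Python, standard library only) of a sinusoidally
amplitude-modulated tone. It is only a stimulus generator, not the
Terhardt or Daniel and Weber roughness model. The function name
am_tone, the 1 kHz carrier, the 70 Hz modulation rate (an arbitrary
value inside the 20-120 Hz range), and the list of depths are all
illustrative assumptions, not values taken from those references.

```python
import math

def am_tone(carrier_hz=1000.0, mod_hz=70.0, depth=1.0, dur_s=1.0, sr=44100):
    """Return samples of (1 + depth*sin(2*pi*fm*t)) * sin(2*pi*fc*t), peak-scaled."""
    n = int(dur_s * sr)
    samples = []
    for i in range(n):
        t = i / sr
        # Temporal envelope: depth=0 gives a steady tone, depth=1 full modulation
        envelope = 1.0 + depth * math.sin(2 * math.pi * mod_hz * t)
        carrier = math.sin(2 * math.pi * carrier_hz * t)
        # Divide by the largest possible envelope value so |sample| <= 1
        samples.append(envelope * carrier / (1.0 + depth))
    return samples

# Hypothetical stimulus series: same carrier and modulation rate, increasing depth
depths = [0.0, 0.25, 0.5, 0.75, 1.0]
stimuli = [am_tone(depth=d) for d in depths]
```

Note that scaling by 1 + depth keeps every stimulus within [-1, 1] but
also changes the overall level with depth, so a listening test would
probably want to equalize loudness separately.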

It's interesting to study how the sensation in one modality might
affect the perception in another.  I've discussed, but haven't seen
any studies on, the link between *tactile* roughness and auditory
roughness.  For a general ref on tactile roughness, see Connor et al.
(1990), for example.  It might be interesting to combine that modality
as well.

Fastl, H. & Zwicker, E. Psychoacoustics: Facts and Models. Springer-Verlag, 2007

Daniel, P. & Weber, R. Psychoacoustical Roughness: Implementation of
an Optimized Model. Acta Acustica united with Acustica, 1997, 83,

Connor, C. E.; Hsiao, S. S.; Phillips, J. R. & Johnson, K. O. Tactile
roughness: neural codes that account for psychophysical magnitude
estimates. J Neurosci, 1990, 10,

Good luck.

Martin McKinney
Starkey Laboratories

On Mon, Feb 16, 2009 at 10:59 AM, AUDITORY automatic digest system
<LISTSERV@xxxxxxxxxxxxxxx> wrote:
> ------------------------------
> Date:    Mon, 16 Feb 2009 11:01:05 +0000
> From:    Dan Stowell <dan.stowell@xxxxxxxxxxxxxxx>
> Subject: Re: Roughness in audio and vision
> Hi Bryan -
> Bryan Pardo wrote:
>> Some colleagues of mine are interested in the relationship between
>> roughness in visual images and audio images. They sent me the following
>> questions they were thinking about in hopes that I might be able to
>> provide some references to get them started.  I figured this is just the
>> mailing list to get some pointers to papers. If any of these questions
>> make you think of a paper or two, I’d appreciate your emailing the
>> reference.
>>
>> 1) Do we have a reliable method to measure the roughness of a given
>> natural sound or image?
> For images I wouldn't know (maybe some measure of fractal dimension?
> http://dx.doi.org/10.1103/PhysRevA.39.1500 ) but for sound, I keep
> finding papers where auditory roughness is said to be related to fast
> amplitude modulation (AM), e.g.
> Joder et al (2009), TASLP
> http://dx.doi.org/10.1109/TASL.2008.2007613
> which references a thesis I haven't read (Eronen 2001) as the source of
> their method for measuring AM in the 10--40 Hz range.
>> 2) How could one synthesize sound clips (and images) with ascending or
>> descending order of roughness?
> If that kind of AM does indeed cause auditory roughness then
> synthesising is easy, just change the depth of the AM.
>> 3) How can acoustic roughness influence the perceived roughness of the
>> vision?
> Good question!
> Dan
> --
> Dan Stowell
> Centre for Digital Music
> School of Electronic Engineering and Computer Science
> Queen Mary, University of London
> Mile End Road, London E1 4NS
> http://www.elec.qmul.ac.uk/department/staff/research/dans.htm
> http://www.mcld.co.uk/
> ------------------------------
> Date:    Mon, 16 Feb 2009 08:03:03 -0500
> From:    Mary Andrianopoulos <mva@xxxxxxxxxxxxxxxx>
> Subject: Re: Roughness in audio and vision
> Hi Bryan;
> I just want to add some additional information to Question 1:
>> 1) Do we have a reliable method to measure the roughness of a given
>> natural sound or image?
> One can calculate roughness or aperiodicities in the acoustic signal by
> looking at not only amplitude perturbation (shimmer), but frequency
> perturbation (jitter) as well as noise-to-harmonics ratio (NHR), degree of
> tremor, voice breaks (for isolated vowel prolongations), number or degree
> of sub-harmonics, etc. Some commercially prepared acoustic software
> programs (such as KayPentax's MDVP program) allow one to measure a host of
> 30 or so parameters on one acoustic signal at a single time.
> With respect to reliability, published literature suggests that some
> measurements are more reliable than others for various speaking tasks.
> Single vowel prolongations, e.g., [a] prolongation, tend to yield more
> consistency than connected speech. However, the idea is that the less
> jitter, shimmer, NHR, and sub-harmonics, the less "roughness" in the voice
> and chaos in the spectrogram. Of course, one needs to control the
> collection of the acoustic signal by recording in a sound-treated room or a
> room with an ambient noise level of < 50 dB, using a condenser microphone,
> etc. I believe the specs to control artifacts in the recording and
> collection of the acoustic signal can be found on the NCVS website:
> http://www.ncvs.org/
> There are also some great tutorials on such topics that can be viewed on
> this website at:
> http://www.ncvs.org/ncvs/tutorials/voiceprod/tutorial/index.html
> There are some issues with the reliability of some of these parameters,
> and there are numerous published articles on this topic. My research team
> found relatively strong reliability in measuring degrees of roughness, as
> reported in the following articles:
> Multimodal Standardization of Voice Among Four Multicultural
> Populations: Fundamental Frequency and Spectral Characteristics
> Journal of Voice, Volume 15, Issue 2, Pages 194-219
> M. Andrianopoulos, K. Darrow, J. Chen
> Multimodal Standardization of Voice Among Four Multicultural
> Populations: Formant Structures
> Journal of Voice, Volume 15, Issue 1, Pages 61-77
> M. Andrianopoulos, K. Darrow, J. Chen
> Good luck:
> Mary A
> UMass-Amherst
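
The jitter and shimmer measures mentioned in the quoted message above
can be illustrated with a minimal sketch (Python, standard library
only). It uses one common "local" definition: the mean absolute
difference between consecutive cycles, divided by the mean value. The
function name local_perturbation and the per-cycle numbers are made up
for illustration, and the exact parameter definitions used by MDVP or
other packages may differ.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value

# Hypothetical per-cycle measurements from a sustained vowel
periods_s = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]  # glottal cycle lengths
peak_amps = [0.80, 0.82, 0.79, 0.81, 0.80]            # per-cycle peak amplitudes

jitter = local_perturbation(periods_s)   # frequency (period) perturbation
shimmer = local_perturbation(peak_amps)  # amplitude perturbation
print(f"jitter = {100 * jitter:.2f} %, shimmer = {100 * shimmer:.2f} %")
```

Extracting the per-cycle periods and amplitudes from a recording is the
harder part and is not shown here.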