
Re: Articulation Index

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Articulation Index
From: "D. Sen" <dsen@xxxxxxxx>
Date: Tue, 20 Aug 2002 10:19:54 -0400
Comments: To: David.Isherwood@NOKIA.COM
Delivery-date: Tue Aug 20 10:25:00 2002
References: <7F874D8CD4FDA54AAAE7C8B43D32B8070C24B8@trebe004.europe.nokia.com>
Reply-to: "D. Sen" <dsen@xxxxxxxx>
Sender: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529

It's of course not just Very Low Bit Rate coders where quality and
intelligibility are not correlated. (Indeed, given the time, resources
and computational complexity, you could design an extremely low bit rate
coder with both high quality and intelligibility.)

The point is that it is possible to destroy the quality cues while
maintaining the intelligibility cues. Various kinds of distortions and
manipulations including peak clipping will do this to the signal.
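
As a rough illustration of the peak-clipping case, here is a minimal Python sketch. The test signal is a synthetic stand-in for speech, and the function names and the 0.1 threshold are assumptions chosen for the example, not anything taken from a standard. It only shows that clipping flattens the waveform detail while leaving the zero-crossing pattern intact.

```python
import numpy as np

def peak_clip(signal, threshold):
    """Hard-limit every sample to the range [-threshold, +threshold]."""
    return np.clip(signal, -threshold, threshold)

def infinite_clip(signal):
    """Extreme peak clipping: keep only the sign of each sample."""
    return np.where(signal >= 0, 1.0, -1.0)

# Synthetic stand-in for a speech waveform (assumption): a 200 Hz tone
# with a slow 3 Hz amplitude envelope, one second at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 200 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))

clipped = peak_clip(x, 0.1)
binary = infinite_clip(x)

# The sign pattern (and hence the zero crossings) is unchanged by clipping,
# even though the amplitude detail above the threshold is gone.
nonzero = x != 0
same_signs = np.array_equal(np.sign(clipped[nonzero]), np.sign(x[nonzero]))
```

Listening to `clipped` or `binary` for a real speech recording is the quickest way to hear the quality loss for yourself.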

Quality cues are mainly derived from how smooth the evolution (in time)
of parameters such as formants and pitch is. Intelligibility cues have more
to do with "matching" of parameters and patterns with those stored in
memory. Unless constraints are placed, such as fairly clean
environmental conditions and measuring only compression algorithms which
strive to maintain the original signal waveform, a generalized quality
measure will not be able to predict intelligibility with any degree of
accuracy.

David Isherwood wrote:

Hi,

I think it's true to say that quality and intelligibility are not necessarily correlated for (V)LBR coders, some purposely degrading the overall quality of the signal to purportedly increase intelligibility in noisy environments. Another problem for objective metrics is that there can also be differences in the recognition of speech for single words and complete sentences, making it difficult to define an optimal perceptually motivated objective metric for SI.

I'd be interested in whether anyone has any experience of how the various objective measures associated with speech intelligibility correlate with subjective results obtained by speech+noise methodologies for single word recognition (e.g. diagnostic rhyme test [DRT] (ANSI S3.2-1989), modified rhyme test [MRT], phonetically balanced word test [PB], etc.) and sentence recognition (e.g. speech perception in noise test [SPIN], hearing in noise test [HINT], connected speech test [CST], etc.) for LBR and VLBR speech coders.

David Isherwood
Speech and Audio Systems Laboratory
Nokia Research Center

-----Original Message-----
From: ext John G. Beerends [mailto:J.G.Beerends@KPN.COM]
Sent: 20 August, 2002 10:08
To: AUDITORY@LISTS.MCGILL.CA
Subject: Re: Articulation Index


Agreed that for very low bit rate, especially when robotization occurs,
quality and intelligibility are different. For the commercially used codecs
like GSM FR/EFR/HR/AMR, IS54/96, ITU G.723.1/727/728/729 I think it will
work. We have a formal listening test, based on subjective listening-effort
ratings, which shows that P.862 PESQ can be used for intelligibility
within limits.

In general the STI cannot be used to assess intelligibility of speech
degraded by low bit rate codecs.

John Beerends


-----Original Message-----
From: D. Sen [mailto:dsen@ieee.org]
Sent: vrijdag 16 augustus 2002 16:44
To: J.G.Beerends@KPN.COM
Cc: AUDITORY@LISTS.MCGILL.CA
Subject: Re: Articulation Index


John G. Beerends wrote:

If you want to measure speech intelligibility of coded speech
(distortion+noise) the AI can definitely not be applied. An option is to use
ITU-T recommendation P.862 that describes a method called Perceptual
Evaluation of Speech Quality

http://www.itu.int/rec/recommendation.asp?type=folders&lang=e&parent=T-REC-P.862

The method was developed for speech quality but in general speech quality
and intelligibility are closely linked


There are many instances where speech intelligibility and quality are
not correlated. The LPC-10e (US Federal Standard - 1015) algorithm is an
example where intelligibility is high but quality is atrocious.

The Speech Transmission Index (Steeneken, H.J.M. and Houtgast, T., "A
physical method for measuring speech-transmission quality", JASA, 67(1),
1980) might be a better objective measure of intelligibility than the AI
for systems with nonlinear distortion.
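
For readers unfamiliar with how the STI turns a measurement into a 0..1 figure, the core step can be sketched in Python as below. This is only the per-band, per-modulation-frequency conversion from a modulation index m to a transmission index. The full STI also averages over modulation frequencies and applies octave-band weights, which are omitted here. The function names are made up for this example.

```python
import numpy as np

def transmission_index(m):
    """Convert a modulation index m (0..1) into a transmission index (0..1)."""
    # Keep m away from 0 and 1 so the logarithm stays finite.
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1 - 1e-6)
    snr_apparent = 10 * np.log10(m / (1 - m))         # apparent SNR in dB
    snr_apparent = np.clip(snr_apparent, -15.0, 15.0)  # limit to a 30 dB range
    return (snr_apparent + 15.0) / 30.0

def modulation_index_for_noise(snr_db):
    """Modulation reduction caused by stationary additive noise alone."""
    return 1.0 / (1.0 + 10 ** (-snr_db / 10.0))

# With speech and noise at equal level, half of the modulation depth survives.
ti = transmission_index(modulation_index_for_noise(0.0))
```

For a system with nonlinear distortion, m would have to be measured by passing a modulated test signal through the system, rather than derived from a noise level as in the helper above.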

and the algorithm can be re-optimized
for intelligibility. A paper describing the method will be published
(probably this year) in the J. Audio Eng. Soc.

John Beerends
KPN Research


-----Original Message-----
From: Brent Edwards [mailto:brent@edwards.net]
Sent: donderdag 15 augustus 2002 0:51
To: AUDITORY@LISTS.MCGILL.CA
Subject: Re: Articulation Index


If the coder is only introducing stationary additive noise to the speech,
then you can do this. If the perceptual coder is affecting the speech in a
way different from this (which I suspect it is), then you cannot apply the
AI (to my understanding of the Speech Intelligibility Index ANSI standard).

--Brent

----- Original Message -----
From: "Hugo de Paula" <hugodepaula@GMX.NET>
To: <AUDITORY@LISTS.MCGILL.CA>
Sent: Wednesday, August 14, 2002 12:56 PM
Subject: Articulation Index

Well,

The Articulation Index gives a measure of the intelligibility of speech
heard in a given noise environment. I have some recordings to which I applied
some perceptual coding techniques that caused distortion to these signals. I
would like to calculate the AI of the distorted sound. As the AI is
calculated based on the environmental noise, I would use the reference sound
to measure the 'noise' in the distorted sound.

Hugo.
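
The procedure described in this message can be sketched in Python roughly as follows. This is a simplified illustration, not the ANSI procedure. The band edges are arbitrary, the equal band weights are placeholders for the band-importance values a real calculation would use, and the mapping of band SNR onto 0..1 via (SNR + 15) / 30 follows the SII style. It treats the difference between the two signals as noise, which, as pointed out elsewhere in the thread, is only meaningful when the degradation behaves like stationary additive noise.

```python
import numpy as np

def band_snr_db(reference, degraded, fs, band_edges_hz):
    """Per-band SNR in dB, treating (degraded - reference) as the 'noise'."""
    noise = degraded - reference
    freqs = np.fft.rfftfreq(len(reference), d=1.0 / fs)
    ref_power = np.abs(np.fft.rfft(reference)) ** 2
    noise_power = np.abs(np.fft.rfft(noise)) ** 2
    eps = 1e-12  # avoids division by zero and log(0) in empty bands
    snrs = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        ratio = (ref_power[in_band].sum() + eps) / (noise_power[in_band].sum() + eps)
        snrs.append(10 * np.log10(ratio))
    return np.array(snrs)

def simplified_ai(snr_db, weights=None):
    """Map band SNRs onto 0..1 over a 30 dB range and take a weighted sum."""
    audibility = np.clip((snr_db + 15.0) / 30.0, 0.0, 1.0)
    if weights is None:
        # Placeholder: equal weights, NOT real band-importance values.
        weights = np.full(len(snr_db), 1.0 / len(snr_db))
    return float(np.sum(weights * audibility))

# Synthetic example (assumption): white noise stands in for both signals.
rng = np.random.default_rng(0)
fs = 8000
ref = rng.standard_normal(fs)
noise = rng.standard_normal(fs)
edges = [200, 400, 800, 1600, 3200]  # arbitrary band edges in Hz

ai_clean = simplified_ai(band_snr_db(ref, ref, fs, edges))
ai_equal = simplified_ai(band_snr_db(ref, ref + noise, fs, edges))
ai_buried = simplified_ai(band_snr_db(ref, ref + 100 * noise, fs, edges))
```

The two signals must be time-aligned and level-matched before subtraction, otherwise the "noise" estimate is dominated by the misalignment rather than by the coder.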

Hi Hugo,

Sorry, I cannot help you, but I would like to know what "articulation
index" means for a pair of sounds.
What is this figure used for?

Thanks,

Regis

Hugo de Paula wrote:

Hi all,

Does anybody know of MATLAB code for calculating the Articulation Index
given a pair of sound files: the first with the original recorded source and
the other the live recording?

thank you,

Hugo


--
D. Sen, PhD
http://www.auditorymodels.org/~dsen


