Re: AUDITORY Digest - 12 Sep 2005 to 13 Sep 2005 (#2005-181)



Quoting AUDITORY automatic digest system <LISTSERV@xxxxxxxxxxxxxxx>:

> There are 6 messages totalling 418 lines in this issue.
>
> Topics of the day:
>
>   1. BMLD questions
>   2. Amek angela
>   3. lead vocal detection, removal and extraction
>   4. Speakers for speech testing (2)
>   5. fMRI compatible sound system
>
> ----------------------------------------------------------------------
>
> Date:    Tue, 13 Sep 2005 09:21:56 -0400
> From:    Xuejing Sun <xuejing@xxxxxxxxxxx>
> Subject: Re: BMLD questions
>
> Thanks for all the replies. I appreciate your help.
> Best regards,
> Xuejing
>
> ------------------------------
>
> Date:    Tue, 13 Sep 2005 16:11:01 +0100
> From:    "f.neff" <fn2@xxxxxxxxxxxxx>
> Subject: Amek angela
>
> Hi list,
>
> Does anybody know if it is possible to rack mount the channels of an amek
> angela desk and use them as separate preamps?
>
> Thanks,
>
> Flaithri Neff,
> Ireland.
>
> ps. Tony from Queen Mary University London sent me an email but I have somehow
> lost his email address and I would like to reply to him... so if you receive
> this Tony you might send me an email, thanks.
>
> ------------------------------
>
> Date:    Tue, 13 Sep 2005 17:19:31 +0200
> From:    Tobias Wagenblaß <tobias.wagenblass@xxxxxx>
> Subject: lead vocal detection, removal and extraction
>
> hi,
>
> i have to do a "karaoke algorithm" as my diploma thesis and i am currently
> collecting information about this topic. so, it would be helpful if anyone
> could help me finding informations on
>
> - detecting and extracting the lead vocals (in stereo signals)
> - removing the lead vocals (from stereo signals)
>
> and related stuff like source separation.
>
> thanks so far,
> tobias
>
> ------------------------------
>
> Date:    Tue, 13 Sep 2005 13:32:11 -0700
> From:    "Ward R. Drennan" <drennan@xxxxxxxxxxxxxxxx>
> Subject: Speakers for speech testing
>
> Does anyone know of research that has investigated an effect of the
> speaker's frequency response on speech perception ability? We could try
> to get speaker response to be perfectly flat, but so long as we are
> within an ANSI standard, does it really make a difference? Does anyone
> know the scientific basis of this standard?
>
> ANSI standard 3.6-1996 (from Katz on speech audiometry):
> No more than 10 dB attenuation 125-250 Hz
> +/- 3 dB 250-4000 Hz
> +/- 5 dB 4000-6000 Hz
>
> Ward R. Drennan, Ph. D.
> VM Bloedel Hearing Research Center
> University of Washington Box 357923
> Seattle, WA 98195-7923
> Phone: (206) 897-1848
> Fax: (206) 616-1828
>
> ------------------------------
>
> Date:    Tue, 13 Sep 2005 13:32:06 -0800
> From:    Brent Edwards <brent@xxxxxxxxxxx>
> Subject: Re: Speakers for speech testing
>
> From an AI perspective, the speaker response shouldn't matter as long as
> audibility is ensured, the change in response doesn't affect spread of
> masking relative to a flat response, and the rollover level isn't reached.
> Not a direct answer to your question, but related nonetheless: I had this
> to say about the frequency response of hearing aids in my chapter of the
> Springer speech book:
>
> "The slope of the frequency response can change considerably and not affect
> intelligibility as long as speech remains between the threshold of
> audibility and discomfort (Lippman, et al. 1981; van Dijkhuizen et al.
> 1987), although a negative slope may result in a deterioration of
> intelligibility due to upward spread of masking (van Dijkhuizen et al.
> 1989)."
>
> --Brent
>
> ----- Original Message -----
> From: "Ward R. Drennan" <drennan@xxxxxxxxxxxxxxxx>
> To: AUDITORY@xxxxxxxxxxxxxxx
> Subject: Speakers for speech testing
> Date:         Tue, 13 Sep 2005 13:32:11 -0700
>
> >
> > Does anyone know of research that has investigated an effect of the
> > speaker's frequency response on speech perception ability? We could try to
> > get speaker response to be perfectly flat, but so long as we are within an
> > ANSI standard, does it really make a difference? Does anyone know the
> > scientific basis of this standard?
> >
> > ANSI standard 3.6-1996 (from Katz on speech audiometry):
> > No more than 10 dB attenuation 125-250 Hz
> > +/- 3 dB 250-4000 Hz
> > +/- 5 dB 4000-6000 Hz
> >
> > Ward R. Drennan, Ph. D.
> > VM Bloedel Hearing Research Center
> > University of Washington Box 357923
> > Seattle, WA 98195-7923
> > Phone: (206) 897-1848
> > Fax: (206) 616-1828
>
> ------------------------------
>
> Date:    Tue, 13 Sep 2005 19:48:11 -0400
> From:    Nadine Gaab <gaab@xxxxxxx>
> Subject: fMRI compatible sound system
>
> Hello List!
>
> We are interested in purchasing a new high quality fMRI compatible sound
> system. I was wondering what other people use and what their experiences
> are. Thanks in advance for your suggestions and reviews
>
> Nadine Gaab
>
>
>
> --
>
> "If we knew what it was we were doing, it would not be called research,
> would it?"
>
> A. Einstein (1879-1955)
>
> Nadine Gaab, PhD
>
> Postdoctoral Associate
>
> Department of Brain and Cognitive Sciences Massachusetts Institute of
> Technology
>
> 77 Massachusetts Avenue
>
> Room NE20-423
>
> Cambridge, MA 02138-4307
>
> phone: (617)-258-8221
>
> email: gaab@xxxxxxx
>
> http://web.mit.edu/bcs/
>
>
>
>
>
> ------------------------------
>
> End of AUDITORY Digest - 12 Sep 2005 to 13 Sep 2005 (#2005-181)
> ***************************************************************
>
>
Dear Tobias,

Regarding your "karaoke algorithm": I hope you will treat it more as a
theoretical experiment than as a practical attempt to extract vocals from a
stereo mix. My guess is that it would require massive processing power. In
practice, karaoke recordings are sometimes made from scratch by studio session
musicians who recreate the original recording without the vocals. Another
source is record companies, which hold instrumental versions of popular songs
and make them available to karaoke manufacturers.

In a typical recording, the vocal is mixed in along with stereo effects such
as reverb and delay. So even if you removed the dry vocal itself, its effects
would remain, blended into the sonic ambience. Another problem is that the
vocal occupies the same frequency range as many other instruments, so
extracting it might also take out a lot of similar timbres.
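
To make the point about effects concrete, here is a minimal sketch of the
common centre-cancellation trick (subtracting one channel from the other).
It is only an illustration and not a method proposed in this thread. It
assumes the dry vocal is identical in both channels, and every signal, delay
and gain below is made up for the example.

```python
import math

def tone(freq_hz, n, rate=8000):
    """Return n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def delayed(signal, delay, gain):
    """Crude stand-in for an effect return: one attenuated echo."""
    return [gain * signal[i - delay] if i >= delay else 0.0
            for i in range(len(signal))]

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

n = 4000
vocal = tone(440, n)                # hypothetical dry lead vocal, panned centre
guitar = tone(330, n)               # hypothetical instrument, panned hard left
echo_l = delayed(vocal, 37, 0.3)    # vocal effect as heard in the left channel
echo_r = delayed(vocal, 53, 0.3)    # same effect, different delay on the right

left = [v + g + e for v, g, e in zip(vocal, guitar, echo_l)]
right = [v + e for v, e in zip(vocal, echo_r)]

# Centre cancellation: anything identical in both channels drops out.
side = [l - r for l, r in zip(left, right)]

# The dry vocal is gone, but its effects differ per channel and survive.
vocal_residue = [a - b for a, b in zip(echo_l, echo_r)]

print("dry vocal level:       ", round(rms(vocal), 3))
print("vocal effects remaining:", round(rms(vocal_residue), 3))
```

The subtracted signal contains the guitar plus the leftover vocal effects, and
anything else panned to the centre (often bass and kick drum) would be
cancelled along with the vocal.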

Good luck,

Susan Rogers
Levitin Lab
Psychology Department
McGill University