Re: Audio editing (Matt Winn )


Subject: Re: Audio editing
From:    Matt Winn  <mwinn83@xxxxxxxx>
Date:    Tue, 18 Dec 2012 11:34:27 -0600
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Abin and List,

Forgive the double posting; I apparently made an error trying to attach a file.

It may be easier for you to do this in a scriptable (and free) environment like Praat instead of Audition. I would like to share a simple tool that I have made for this kind of intensity normalization.

With the Praat script linked here, you can scale the intensities of all sounds in a folder to a selected level. It will alert you if any of the sounds clip, and offer you the option of decreasing your target intensity level until none of them clip. In the end, you will have a folder full of normalized sounds and an info text file that tells you what changes were applied. The original sounds are preserved.

This is designed for use with a folder full of short sounds (e.g. words), and might not be ideal for longer sounds. It does not perform compression.

Find the script here:
http://www.mattwinn.com/Scale_intensity_of_all_sounds_check_maxima_v2.txt

To use it in Praat, either copy the text into a new Praat Script window or open it directly.

Regarding naturalness - you should be aware that compression and (to a lesser extent) normalization actually decrease the naturalness of the signals by altering each of them in different ways. There are some inherent volume differences between some speech sounds (e.g. /s/ is louder than /f/, /a/ is louder than /u/), so normalizing levels for these sounds would decrease naturalness to some extent.

Good luck,
Matt

On Tue, Dec 18, 2012 at 9:04 AM, Matt Winn <mwinn83@xxxxxxxx> wrote:
> Abin,
>
> It may be easier to do this in a scriptable environment like Praat instead
> of Audition. I have attached a script that you can use in Praat to scale
> the intensities of all sounds in a folder to a selected level. If the
> sounds clip, it will alert you and offer you the option of decreasing your
> target intensity level. This way, you can prevent any clipping in the
> output files. In the end, you will have a folder full of normalized sounds
> and an info text file to let you know what changes were applied. None of
> the original sounds are altered.
>
> This is designed for use with a folder full of short sounds (e.g. words),
> and might not be ideal for longer sounds. It does not perform compression.
>
> Regarding naturalness - you should be aware that compression and (to a
> lesser extent) normalization actually decrease the naturalness of the
> signals by altering each of them in different ways. There are some inherent
> volume differences between some speech sounds (e.g. /s/ is louder than /f/,
> /a/ is louder than /u/), so normalizing levels for these sounds would
> decrease naturalness to some extent.
>
> Good luck,
> Matt
>
> On Mon, Dec 17, 2012 at 5:05 PM, Abin Kuruvilla Mathew <amat527@xxxxxxxx> wrote:
>
>> Dear All,
>>
>> I have a set of audio files (consonants and vowels) to be edited in
>> Adobe Audition and was wondering to what extent and how much of
>> Normalization (RMS) and dynamic compression (if necessary) would be needed
>> so that the naturalness is preserved and clipping doesn't occur.
>>
>> kind regards,
>> Abin
>>
>> --
>> Abin K. Mathew
>> Doctoral student
>> Department of Psychology (Speech Science)
>> Tamaki Campus, 261 Morrin Road, Glen Innes
>> The University of Auckland
>> Private Bag 92019
>> Auckland- 1142
>> New Zealand
>> Email: amat527@xxxxxxxx
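
The scale-then-check-for-clipping procedure described in the message can be sketched roughly as below. This is not the linked Praat script, only a minimal Python illustration under stated assumptions: each sound is a non-silent list of float samples with full scale at 1.0, levels are expressed in dB relative to full scale (Praat itself reports intensity in dB SPL), and the function names and the 1 dB step are hypothetical choices made for this example.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples):
    """RMS level in dB relative to full scale (assumed amplitude 1.0)."""
    return 20 * math.log10(rms(samples))


def scale_to_level(samples, target_db):
    """Apply one constant gain so the RMS level equals target_db.

    No compression is involved: every sample gets the same multiplier.
    """
    gain = 10 ** ((target_db - level_db(samples)) / 20)
    return [s * gain for s in samples]


def clips(samples, limit=1.0):
    """True if any sample exceeds full scale after scaling."""
    return max(abs(s) for s in samples) > limit


def highest_safe_target(sounds, target_db, step_db=1.0):
    """Lower the shared target in fixed steps until no scaled sound clips.

    step_db is a hypothetical step size chosen for this sketch.
    """
    while any(clips(scale_to_level(s, target_db)) for s in sounds):
        target_db -= step_db
    return target_db
```

Calling `highest_safe_target(sounds, -3.0)` on a list of sounds returns the first target level, at or below the requested one, at which every sound can be scaled without clipping; `scale_to_level` is then applied to each sound with that shared target, leaving the original sample lists untouched. Because every sound ends up at the same RMS level, any inherent level difference between them (such as /s/ versus /f/) is removed, which is the naturalness cost mentioned in the message.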


This message came from the mail archive
/var/www/postings/2012/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University