Re: [AUDITORY] Cut silence in beginning and end of speech recordings automatically? (Christine Rankovic)


Subject: Re: [AUDITORY] Cut silence in beginning and end of speech recordings automatically?
From:    Christine Rankovic  <rankovic@xxxxxxxx>
Date:    Fri, 5 Mar 2021 09:35:31 -0500

Hello Tamar:

The noise at the beginning and ending of recorded waveforms is annoying. It is distracting to listeners and the waves should be carefully trimmed.

Edit by hand if at all possible, especially if the waves are intended for intelligibility testing and consist of nonsense syllables, words, or sentences. You’ll see that it is easy to inadvertently cut off low-intensity speech sounds such as ‘s’, ‘th’, etc. Also, make sure to cut the wave only at zero-crossings to avoid clicks.

Wave editing isn’t difficult if you practice on a set of waves and listen carefully to the results. I can’t imagine that automatic editing methods can achieve this.

Best wishes,
Christine Rankovic, PhD


From: AUDITORY - Research in Auditory Perception [mailto:AUDITORY@xxxxxxxx] On Behalf Of Gabriele Bunkheila
Sent: Friday, March 05, 2021 5:20 AM
To: AUDITORY@xxxxxxxx
Subject: Re: Cut silence in beginning and end of speech recordings automatically?

Hi Tamar,

Since you mentioned MATLAB, I thought I’d share a couple of pointers. A good fit for this would be detectSpeech (https://www.mathworks.com/help/audio/ref/detectspeech.html), which uses a fairly accessible algorithm based on short-term energy and spectral spread. detectSpeech has been available in Audio Toolbox since release R2020a.

In case any of your data was more challenging, you could consider trying the function classifySound (https://www.mathworks.com/help/audio/ref/classifysound.html), which has only been available since release R2020b and uses the pre-trained YAMNet network under the hood.

I hope this helps – feel free to get in touch directly if you need more guidance.

Regards and good luck,
Gabriele.

--
Gabriele Bunkheila [he/him] – Product Management, DSP and Audio
MathWorks


From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> On Behalf Of Tamar Regev
Sent: Wednesday, March 3, 2021 16:10
To: AUDITORY@xxxxxxxx
Subject: [AUDITORY] Cut silence in beginning and end of speech recordings automatically?

Hi all,

Does anyone know of a good way to automatically trim silent parts (which may contain some minor background noise) at the beginning and end of speech recordings?

Preferably using MATLAB, but any other automatic way would work (we want to run this on many sound files).

Thanks a lot!
Tamar
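
Two pieces of advice in this thread, detecting speech from short-term energy and cutting only at zero-crossings, can be combined in a rough sketch. This is not the algorithm behind detectSpeech (which also uses spectral spread); it is a plain energy gate written in Python with only the standard library. The function names, the 10 ms frame length, the -40 dB threshold relative to the loudest frame, and the 20 ms padding are all illustrative assumptions, not values taken from the messages above.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each non-overlapping frame of frame_len samples."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels


def snap_to_zero_crossing(samples, index, step):
    """Move a cut point in direction step (-1 or +1) until it sits on a sign change.

    A cut at position i is accepted when samples[i - 1] and samples[i] differ in
    sign (or one of them is zero). The array boundaries 0 and len(samples) are
    always accepted.
    """
    i = index
    while 0 < i < len(samples) and samples[i - 1] * samples[i] > 0:
        i += step
    return i


def trim_silence(samples, sample_rate, frame_ms=10.0, threshold_db=-40.0, pad_ms=20.0):
    """Return (start, end) so that samples[start:end] drops leading/trailing silence.

    Assumed parameters (hypothetical defaults, tune by listening):
      frame_ms      length of each energy frame
      threshold_db  gate level relative to the loudest frame (should be negative)
      pad_ms        safety margin kept around the detected speech
    """
    if not samples:
        return 0, 0
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    levels = frame_rms(samples, frame_len)
    peak = max(levels)
    if peak == 0.0:
        return 0, 0  # digital silence: nothing to keep
    threshold = peak * 10 ** (threshold_db / 20)
    active = [k for k, level in enumerate(levels) if level >= threshold]
    if not active:
        return 0, 0
    pad = int(sample_rate * pad_ms / 1000)
    start = max(0, active[0] * frame_len - pad)
    end = min(len(samples), (active[-1] + 1) * frame_len + pad)
    # Widen (never narrow) the kept region until both cuts fall on zero-crossings.
    start = snap_to_zero_crossing(samples, start, -1)
    end = snap_to_zero_crossing(samples, end, +1)
    return start, end
```

With a recording loaded as a list of floats, `start, end = trim_silence(samples, sample_rate)` followed by `samples[start:end]` gives the trimmed wave. The padding and the outward-only snapping are there because a simple energy gate can clip low-intensity sounds such as ‘s’ and ‘th’, as noted above, so the results should still be checked by ear.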


This message came from the mail archive
src/postings/2021/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University