Re: [AUDITORY] Cut silence in beginning and end of speech recordings automatically? (Olivier Lartillot)


Subject: Re: [AUDITORY] Cut silence in beginning and end of speech recordings automatically?
From:    Olivier Lartillot  <olartillot@xxxxxxxx>
Date:    Fri, 5 Mar 2021 12:16:09 +0100

Hi,

Still in MATLAB, using MIRtoolbox, you could simply write:

a = miraudio('Folder','Trim')
mirsave(a)

This performs the operation on all files in the folder automatically. Using 'Folders' instead also processes subfolders recursively.

I hope it works OK. There is a 'TrimThreshold' parameter also.

https://www.jyu.fi/hytk/fi/laitokset/mutku/en/research/materials/mirtoolbox

All the best,

Olivier

> On 5 Mar 2021, at 11:19, Gabriele Bunkheila <gbunkhei@xxxxxxxx> wrote:
>
> Hi Tamar,
>
> Since you mentioned MATLAB, I thought I'd share a couple of pointers.
> A good fit for this would be detectSpeech
> (https://www.mathworks.com/help/audio/ref/detectspeech.html), which
> uses a fairly accessible algorithm based on short-term energy and
> spectral spread. detectSpeech has been available in Audio Toolbox since
> release R2020a.
>
> If any of your data is more challenging, you could consider trying the
> function classifySound
> (https://www.mathworks.com/help/audio/ref/classifysound.html), which
> has only been available since release R2020b and uses the pre-trained
> YAMNet network under the hood.
>
> I hope this helps. Feel free to get in touch directly if you need more
> guidance.
>
> Regards and good luck,
> Gabriele.
>
> --
> Gabriele Bunkheila [he/him] - Product Management, DSP and Audio
> MathWorks
>
> From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx> On Behalf Of Tamar Regev
> Sent: Wednesday, 3 March 2021 16:10
> To: AUDITORY@xxxxxxxx
> Subject: [AUDITORY] Cut silence in beginning and end of speech recordings automatically?
>
> Hi all,
>
> Does anyone know of a good way to automatically trim silent parts
> (which may contain some minor background noise) at the beginning and end
> of speech recordings?
>
> Preferably using MATLAB, but any other automatic way would work (we
> want to run this on many sound files).
>
> Thanks a lot!
> Tamar
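
For readers without either toolbox, the general idea behind energy-based trimming mentioned in this thread can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the algorithm used by MIRtoolbox's 'Trim' option or by detectSpeech: the function name `trim_silence`, the non-overlapping 256-sample frames, and the -40 dB threshold relative to the loudest frame are all hypothetical choices, and the input is assumed to be a mono signal already loaded as a list of floats.

```python
import math


def trim_silence(samples, frame_len=256, threshold_db=-40.0):
    """Drop leading and trailing frames whose energy is far below the peak.

    Hypothetical helper: a frame counts as active when its mean-square
    energy is within threshold_db of the loudest frame in the recording.
    """
    if not samples:
        return []
    # Mean-square energy of each non-overlapping frame.
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))
    peak = max(energies)
    if peak == 0.0:
        return []  # the whole recording is digital silence
    # Convert the dB threshold into a linear energy ratio.
    limit = peak * 10.0 ** (threshold_db / 10.0)
    active = [i for i, e in enumerate(energies) if e >= limit]
    first, last = active[0], active[-1]
    # Keep everything from the first to the last active frame, inclusive.
    return samples[first * frame_len:(last + 1) * frame_len]


# Synthetic test signal: faint noise, a 440 Hz tone, faint noise again.
noise = [0.001 * (-1) ** n for n in range(1024)]
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(2048)]
signal = noise + tone + noise
trimmed = trim_silence(signal)
print(len(signal), len(trimmed))
```

Only the start and end are cut; pauses inside the speech are kept because everything between the first and last active frame is returned. Recordings with louder background noise would likely need a higher threshold or an additional spectral feature, which is the kind of case the toolbox functions above are designed for.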


This message came from the mail archive
src/postings/2021/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University