
[AUDITORY] AI and audio discussion



Hi all,

I'm a PhD candidate in ethnomusicology at UC Berkeley, and I have a research question I'm hoping you can help with. My research focuses on sound communication between Taiwan and China, and especially on how sound gets around censorship mechanisms. I'm trying to understand whether there are technological reasons why audio communication might be harder to censor than visual communication. Since AI is central to many censorship tools, I am especially interested in the unique challenges of using sound data with AI.

My understanding thus far is that certain developments around 2010 prompted the use of GPUs for AI and led to huge breakthroughs in industry applications. My question is whether the switch to GPUs also led to a greater focus on visual data, because the physical architecture of GPUs lends itself better to visual than to audio data. Thoughts on this topic? Is there a visual bias in AI research? If so, is this bias technological or cultural? And what are some of the unique challenges of using AI technologies with sound data?
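To make my question more concrete: my (possibly naive) understanding is that audio is often converted into a spectrogram, an image-like 2-D array, precisely so that the same GPU-friendly vision models can be reused on it. Below is a minimal sketch of the pipeline I have in mind, using the librosa library; the file name is hypothetical, and this is my assumption about a typical workflow, not a claim about any specific censorship system:

import numpy as np
import librosa

# Load one minute of audio as a 1-D waveform (a time series).
# "example_broadcast.wav" is a placeholder file name.
waveform, sample_rate = librosa.load("example_broadcast.wav", sr=22050, duration=60.0)

# Convert the waveform into a mel spectrogram: a 2-D array of
# (frequency bands x time frames), structurally similar to an image.
mel = librosa.feature.melspectrogram(y=waveform, sr=sample_rate, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

# The result can then be fed to the same convolutional networks
# that were originally designed for, and benchmarked on, visual data.
print(mel_db.shape)  # e.g. (128, ~2584): 128 bands, ~43 frames per second

If that picture is roughly right, it would suggest that the "visual bias" is at least partly baked into the tooling: audio gets reshaped to look like an image before the AI ever sees it.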

Looking forward to hearing your candid feedback, and thanks to Justin Salamon for pointing me toward this listserv.

Best,

Sarah