Few-Shot and Zero-Shot Learning for Music Information Retrieval

By Yu Wang, Hugo Flores García, and Jeong Choi

Welcome to the web book written for our tutorial at ISMIR 2022. It is shared under a Creative Commons BY-NC-SA 4.0 license.

Overview

In this tutorial, we will go over:

  1. Foundations of few-shot learning (FSL) and zero-shot learning (ZSL) - General introduction including task definition and existing approaches.

  2. Coding examples - Showcasing the training and evaluation pipeline of FSL and ZSL models on specific MIR tasks.

  3. Recent advances of FSL/ZSL in music information retrieval (MIR) - Discussing the techniques used in these works together with their findings and contributions.

  4. Remaining challenges and future directions

We aim for this tutorial to be useful to researchers and practitioners in the ISMIR community who are facing labeled data scarcity, looking for new interaction paradigms between users and MIR systems, or generally interested in the techniques and applications of FSL and ZSL. We assume the audience is familiar with basic machine learning concepts.

About the authors

Yu Wang is a Ph.D. candidate in Music Technology at the Music and Audio Research Laboratory at New York University, working under Prof. Juan Pablo Bello. Her research interests focus on machine learning and signal processing for music and general audio. Specifically, she is interested in adaptive and interactive machine listening with minimal supervision. She has interned with Adobe Research, Spotify, and Google Magenta. Japanese math rock is her current favorite music genre.

Hugo Flores García is a Ph.D. candidate in Computer Science at Northwestern University, working under Prof. Bryan Pardo in the Interactive Audio Lab. Hugo’s research interests lie at the intersection of machine learning, signal processing, and human-computer interaction for music and audio. Hugo has previously worked on a deep learning framework for Audacity, an open-source audio editor, and has interned with Spotify and Descript. Hugo holds a B.S. in Electrical Engineering from Georgia Southern University (2020). He is a jazz guitarist and can be seen playing with various groups local to the Chicago area. Hugo enjoys augmenting musical instruments with technology, as well as making interactive music and art in SuperCollider and Max/MSP.

Jeong Choi is a machine learning researcher at Naver, where he leads the NOW AI team, which is working on a multi-modal recommendation system for the video streaming service Naver NOW. Before joining Naver, he was a researcher at NCSOFT, working on a recommendation system for the music game FUSER. He also interned at Deezer Research. He received an M.S. in Culture Technology from the Music and Audio Computing Lab at the Korea Advanced Institute of Science and Technology, under the supervision of Prof. Juhan Nam. His research interests lie in representation learning of various signals that can further contribute to diverse music recommendation strategies.

Referencing this book

@book{music-fsl-zsl:book,
	Author = {Yu Wang and Hugo Flores García and Jeong Choi},
	Month = dec,
	Publisher = {https://music-fsl-zsl.github.io/tutorial},
	Title = {Few-Shot and Zero-Shot Learning for Music Information Retrieval},
	Year = 2022,
	Url = {https://music-fsl-zsl.github.io/tutorial}
}