
Interpreting CNN Models for Musical Instrument Recognition Using Multi-Spectrogram Heatmap Analysis: A Preliminary Study

aut.relation.journal: Frontiers in Artificial Intelligence
aut.relation.startpage: 1499913
aut.relation.volume: 7
dc.contributor.author: Chen, R
dc.contributor.author: Ghobakhlou, Ali
dc.contributor.author: Narayanan, A
dc.date.accessioned: 2025-01-29T23:38:15Z
dc.date.available: 2025-01-29T23:38:15Z
dc.date.issued: 2024-12-18
dc.description.abstract: Introduction: Musical instrument recognition is a critical component of music information retrieval (MIR), aimed at identifying and classifying instruments from audio recordings. This task poses significant challenges due to the complexity and variability of musical signals. Methods: In this study, we employed convolutional neural networks (CNNs) to analyze the contributions of various spectrogram representations (STFT, Log-Mel, MFCC, Chroma, Spectral Contrast, and Tonnetz) to the classification of ten different musical instruments. The NSynth database was used for training and evaluation. Visual heatmap analysis and statistical metrics, including Difference Mean, KL Divergence, JS Divergence, and Earth Mover’s Distance, were utilized to assess feature importance and model interpretability. Results: Our findings highlight the strengths and limitations of each spectrogram type in capturing distinctive features of different instruments. MFCC and Log-Mel spectrograms demonstrated superior performance across most instruments, while others provided insights into specific characteristics. Discussion: This analysis provides some insights into optimizing spectrogram-based approaches for musical instrument recognition, offering guidance for future model development and improving interpretability through statistical and visual analyses.
dc.identifier.citation: Frontiers in Artificial Intelligence, ISSN: 2624-8212 (Print); 2624-8212 (Online), Frontiers Media SA, 7, 1499913-. doi: 10.3389/frai.2024.1499913
dc.identifier.doi: 10.3389/frai.2024.1499913
dc.identifier.issn: 2624-8212
dc.identifier.issn: 2624-8212
dc.identifier.uri: http://hdl.handle.net/10292/18557
dc.language: eng
dc.publisher: Frontiers Media SA
dc.relation.uri: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1499913/full
dc.rights: © 2024 Chen, Ghobakhlou and Narayanan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
dc.rights.accessrights: OpenAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: convolutional neural networks
dc.subject: feature extraction
dc.subject: feature maps
dc.subject: heatmaps
dc.subject: music information retrieval
dc.subject: musical instrument recognition
dc.subject: pattern recognition
dc.subject: spectrogram analysis
dc.subject: 46 Information and Computing Sciences
dc.subject: 4602 Artificial Intelligence
dc.subject: 4611 Machine Learning
dc.subject: Bioengineering
dc.subject: 4007 Control engineering, mechatronics and robotics
dc.subject: 4602 Artificial intelligence
dc.subject: 4611 Machine learning
dc.title: Interpreting CNN Models for Musical Instrument Recognition Using Multi-Spectrogram Heatmap Analysis: A Preliminary Study
dc.type: Journal Article
pubs.elements-id: 582094
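
Illustrative note: the abstract names four statistical metrics used to compare heatmaps (Difference Mean, KL Divergence, JS Divergence, and Earth Mover's Distance) but this record does not say how they were computed. The following is a minimal sketch, not the authors' implementation. It assumes each heatmap has been flattened to a 1-D list of non-negative activations and normalized to a probability distribution, and it assumes "Difference Mean" means the mean absolute element-wise difference. All function names and the sample values are hypothetical.

```python
import math


def normalize(values, eps=1e-12):
    """Clip negatives to zero, add a small eps to avoid log(0), scale to sum to 1."""
    clipped = [max(v, 0.0) + eps for v in values]
    total = sum(clipped)
    return [v / total for v in clipped]


def difference_mean(a, b):
    """Mean absolute element-wise difference (an assumed reading of 'Difference Mean')."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; asymmetric in p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def earth_movers_distance(p, q):
    """1-D Earth Mover's Distance with unit bin spacing: sum of absolute CDF gaps."""
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap)
    return total


# Hypothetical flattened heatmaps for two spectrogram types.
heatmap_a = [0.1, 0.4, 0.3, 0.2]
heatmap_b = [0.2, 0.2, 0.3, 0.3]
p, q = normalize(heatmap_a), normalize(heatmap_b)

print("difference mean:", difference_mean(heatmap_a, heatmap_b))
print("KL(p || q):", kl_divergence(p, q))
print("JS(p, q):", js_divergence(p, q))
print("EMD(p, q):", earth_movers_distance(p, q))
```

KL divergence depends on argument order, whereas JS divergence and Earth Mover's Distance are symmetric, which may be one reason to report several metrics side by side.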

Files

Original bundle

Name: Interpreting CNN models for musical instrument recognition.pdf
Size: 1.57 MB
Format: Adobe Portable Document Format
Description: Evidence for verification