Hierarchical Residual Attention Network for Musical Instrument Recognition Using Scaled Multi-Spectrogram
| aut.relation.endpage | 10837 | |
| aut.relation.issue | 23 | |
| aut.relation.journal | Applied Sciences | |
| aut.relation.startpage | 10837 | |
| aut.relation.volume | 14 | |
| dc.contributor.author | Chen, Rujia | |
| dc.contributor.author | Ghobakhlou, Akbar | |
| dc.contributor.author | Narayanan, Ajit | |
| dc.date.accessioned | 2024-12-11T21:32:42Z | |
| dc.date.available | 2024-12-11T21:32:42Z | |
| dc.date.issued | 2024-11-22 | |
| dc.description.abstract | Musical instrument recognition is a relatively unexplored area of machine learning due to the need to analyze complex spatial–temporal audio features. Traditional methods using individual spectrograms, like STFT, Log-Mel, and MFCC, often miss the full range of features. Here, we propose a hierarchical residual attention network using a scaled combination of multiple spectrograms, including STFT, Log-Mel, MFCC, and CST features (Chroma, Spectral contrast, and Tonnetz), to create a comprehensive sound representation. This model enhances the focus on relevant spectrogram parts through attention mechanisms. Experimental results with the OpenMIC-2018 dataset show significant improvement in classification accuracy, especially with the “Magnified 1/4 Size” configuration. Future work will optimize CST feature scaling, explore advanced attention mechanisms, and apply the model to other audio tasks to assess its generalizability. | |
| dc.identifier.citation | Applied Sciences, ISSN: 2076-3417 (Print); 2076-3417 (Online), MDPI AG, 14(23), 10837-10837. doi: 10.3390/app142310837 | |
| dc.identifier.doi | 10.3390/app142310837 | |
| dc.identifier.issn | 2076-3417 | |
| dc.identifier.uri | http://hdl.handle.net/10292/18449 | |
| dc.language | en | |
| dc.publisher | MDPI AG | |
| dc.relation.uri | https://www.mdpi.com/2076-3417/14/23/10837 | |
| dc.rights | © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | |
| dc.rights.accessrights | OpenAccess | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | 46 Information and Computing Sciences | |
| dc.subject | 4611 Machine Learning | |
| dc.title | Hierarchical Residual Attention Network for Musical Instrument Recognition Using Scaled Multi-Spectrogram | |
| dc.type | Journal Article | |
| pubs.elements-id | 576448 |
Files
Original bundle
- Name: Hierarchical Residual Attention Network for Musical Instrument Recognition.pdf
- Size: 10.22 MB
- Format: Adobe Portable Document Format
- Description: Journal article
