All-in-One Metrical and Functional Structure Analysis with Neighborhood Attentions on Demixed Audio

Music is characterized by complex hierarchical structures. Developing a comprehensive model that captures these structures has been a significant challenge in the field of Music Information Retrieval (MIR). Prior research has mainly addressed individual tasks at specific hierarchical levels rather than providing a unified approach. In this paper, we introduce a versatile, all-in-one model that jointly performs beat and downbeat tracking as well as functional structure segmentation and labeling. The model takes source-separated spectrograms as input and employs dilated neighborhood attentions to capture long-term temporal dependencies, along with non-dilated attentions for local instrumental dependencies. The proposed model achieves state-of-the-art performance on all four tasks on the Harmonix Set while using fewer parameters than recent state-of-the-art models. Furthermore, our ablation study demonstrates that learning beats, downbeats, and segments concurrently can improve performance, with each task benefiting from the others.
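To make the idea of dilated neighborhood attention along the time axis concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function name `neighborhood_attention_1d`, its parameters, and the choice to mask out-of-range neighbors at the sequence edges are all assumptions made for illustration. Each time frame attends only to `kernel_size` frames spaced `dilation` steps apart, so a larger dilation widens the temporal span without adding neighbors.

```python
import numpy as np

def neighborhood_attention_1d(q, k, v, kernel_size=3, dilation=1):
    """Illustrative 1D (dilated) neighborhood attention.

    q, k, v: arrays of shape (T, D). kernel_size is assumed to be odd.
    Out-of-range neighbors are masked (a simplifying assumption).
    """
    T, D = q.shape
    half = kernel_size // 2
    # Relative offsets of the neighbors, stretched by the dilation factor.
    offsets = np.arange(-half, half + 1) * dilation           # (K,)
    idx = np.arange(T)[:, None] + offsets[None, :]            # (T, K)
    valid = (idx >= 0) & (idx < T)
    idx_clipped = np.clip(idx, 0, T - 1)

    # Scaled dot-product scores between each query and its neighbors only.
    scores = np.einsum("td,tkd->tk", q, k[idx_clipped]) / np.sqrt(D)
    scores = np.where(valid, scores, -np.inf)

    # Numerically stable softmax over the neighborhood axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted sum of the neighbors' values.
    return np.einsum("tk,tkd->td", weights, v[idx_clipped])

# Hypothetical usage: 16 time frames with 8 features each.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))
local_out = neighborhood_attention_1d(x, x, x, kernel_size=5, dilation=1)
long_range_out = neighborhood_attention_1d(x, x, x, kernel_size=5, dilation=3)
```

With `dilation=1` the window covers 5 adjacent frames, while `dilation=3` covers a span of 13 frames using the same 5 neighbors, which is the sense in which dilation extends the temporal range at constant cost.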
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2023-10-24
Language
English
Citation
2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
ISSN
1931-1168
DOI
10.1109/WASPAA58266.2023.10248148
URI
http://hdl.handle.net/10203/317170
Appears in Collection
GCT-Conference Papers