Effective and compact neural autoregressive models for piano music transcription

In this dissertation, I focus on autoregressive models among neural network-based automatic transcription models. Every piano sound arises either from a note onset or from the continuation of a note that has already been struck, so an autoregressive model is expected to have an advantage in capturing this causal relationship in frame-by-frame prediction. I designed the autoregressive prediction model by combining an acoustic module with a music language module. To take advantage of the autoregressive formulation, I built the model on a unidirectional RNN so that it can operate in real time, and I suggest methods to overcome its disadvantages relative to models that use a bidirectional RNN: it receives less information and is vulnerable to exposure bias. For stable learning, I propose a network and a learning method that express the states of notes in more detail and make effective use of recursive information. In addition, I induce the model to learn the pitch-shift invariance of the piano and the independence of each pitch. To this end, neurons in the acoustic module are separated by pitch, and each pitch is processed through a shared network. The music language model is likewise simplified to model the state progression of the note at each pitch. The results show that an autoregressive model can also achieve high performance when appropriately adjusted, and that the hypothesized factors contribute to the performance improvement. To confirm the practical performance of the proposed model, I verified it on multiple datasets with varied recording environments. The effectiveness of the proposed elements was examined through a detailed note-level analysis. The proposed model operates in real time with low complexity and performs on par with non-real-time models.
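The causal, per-pitch decoding described in the abstract can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the dissertation's model: the three-state note vocabulary, the scalar recurrence, the function names `shared_pitch_step` and `transcribe`, and the values in `TOY_WEIGHTS` are all hypothetical. It shows three ideas from the abstract: each frame is predicted from past information only, the previous prediction is fed back as input, and one set of weights is shared across all pitches.

```python
import math

# Hypothetical note states; the abstract only says note states are
# expressed "in more detail", without listing them.
STATES = ("off", "onset", "sustain")

# Hypothetical hand-picked weights, shared by every pitch.
TOY_WEIGHTS = {"in": 2.0, "prev": 0.2, "rec": 0.2, "threshold": 0.5}


def shared_pitch_step(acoustic, prev_state, hidden, w):
    """One recurrent step for a single pitch.

    acoustic   -- scalar acoustic evidence for this pitch in the current frame
    prev_state -- state predicted for this pitch in the previous frame
    hidden     -- scalar recurrent state carried over from the previous frame
    """
    prev_active = 0.0 if prev_state == "off" else 1.0
    hidden = math.tanh(
        w["in"] * acoustic + w["prev"] * prev_active + w["rec"] * hidden
    )
    if hidden <= w["threshold"]:
        state = "off"
    elif prev_state == "off":
        state = "onset"  # a note can only start from silence
    else:
        state = "sustain"  # otherwise it continues an earlier note
    return state, hidden


def transcribe(frames, w):
    """Greedy causal decoding: frame t sees only frames <= t and the
    model's own earlier predictions (the autoregressive feedback)."""
    n_pitches = len(frames[0])
    states = ["off"] * n_pitches
    hiddens = [0.0] * n_pitches
    roll = []
    for frame in frames:
        for p in range(n_pitches):
            # The same function and weights serve every pitch.
            states[p], hiddens[p] = shared_pitch_step(
                frame[p], states[p], hiddens[p], w
            )
        roll.append(list(states))
    return roll


if __name__ == "__main__":
    # Two pitches, four frames; pitch 0 sounds in frames 1 and 2.
    toy_frames = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
    for row in transcribe(toy_frames, TOY_WEIGHTS):
        print(row)
```

Because the weights are shared, moving the same input to a different pitch moves the output with it, which is the pitch-shift invariance the abstract refers to. A trained model would replace the scalar recurrence with learned acoustic and language modules.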
Advisors
Nam, Juhan (남주한)
Description
Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

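Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology, 2024.2, [x, 114 p.]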

Keywords

Piano transcription; Deep learning; Autoregressive model

URI
http://hdl.handle.net/10203/321996
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1098151&flag=dissertation
Appears in Collection
GCT-Theses_Ph.D. (Doctoral dissertations)
Files in This Item
There are no files associated with this item.
