Deep learning for vocal melody extraction (보컬 멜로디 추출을 위한 딥러닝)

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Nam, Juhan | - |
| dc.contributor.advisor | 남주한 | - |
| dc.contributor.author | Kum, Sangeun | - |
| dc.date.accessioned | 2022-04-15T01:53:38Z | - |
| dc.date.available | 2022-04-15T01:53:38Z | - |
| dc.date.issued | 2021 | - |
| dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=956569&flag=dissertation | en_US |
| dc.identifier.uri | http://hdl.handle.net/10203/294535 | - |
| dc.description.abstract | In this thesis, we propose several deep learning (DL) based methods for vocal melody extraction. Vocal melody extraction is the task of identifying the melody pitch contour of the singing voice in a mixture of multiple sources. Previous studies proposed methods that compute pitch saliency from a spectrogram or isolate the melody source from the mixture. However, these methods have limitations in obtaining optimal outputs across diverse music. Although melody extraction performance has improved with recent advances in DL, limitations remain in overall performance, in the use of music-related knowledge in models, and in the availability of labeled data. Here we report effective methods for estimating the melody pitch and detecting the singing voice by introducing novel DL models and a loss function. We also propose a multi-task network in which pitch estimation and voice detection are tightly coupled. To address the lack of labeled data, we apply semi-supervised learning that utilizes large amounts of unlabeled data. We explore the effects of three teacher-student model setups, data augmentation, and unlabeled data, and propose the most effective learning method for vocal melody extraction. In addition, we apply semi-supervised learning to singing voice detection and show that it can be extended to other music information retrieval (MIR) tasks that suffer from a lack of labeled data. | - |
| dc.language | eng | - |
| dc.title | Deep learning for vocal melody extraction | - |
| dc.title.alternative | 보컬 멜로디 추출을 위한 딥러닝 (Deep learning for vocal melody extraction) | - |
| dc.identifier.CNRN | 325007 | - |
| dc.description.department | KAIST (Korea Advanced Institute of Science and Technology): Graduate School of Culture Technology | - |
| dc.description.isOpenAccess | Doctoral thesis (Ph.D.) - KAIST: Graduate School of Culture Technology, 2021.2, [iv, 75 p.] | - |
| dc.publisher.country | KAIST (Korea Advanced Institute of Science and Technology) | - |
| dc.type.journalArticle | Thesis(Ph.D) | - |
| dc.contributor.alternativeauthor | 금상은 | - |
| dc.subject.keywordAuthor | Deep Learning; Vocal Melody Extraction; Singing Voice Detection; Semi-Supervised Learning; Teacher-Student Framework | - |
| dc.subject.keywordAuthor | 딥러닝 (deep learning); 보컬 멜로디 추출 (vocal melody extraction); 음성 구간 탐지 (voice segment detection); 반지도 학습 (semi-supervised learning); 교사-학생 프레임워크 (teacher-student framework) | - |
Appears in Collection
GCT-Theses_Ph.D. (Doctoral theses)
Files in This Item
There are no files associated with this item.
