DSpace at KOASAS: PSEUDO-LABEL TRANSFER FROM FRAME-LEVEL TO NOTE-LEVEL IN A TEACHER-STUDENT FRAMEWORK FOR SINGING TRANSCRIPTION FROM POLYPHONIC MUSIC


PSEUDO-LABEL TRANSFER FROM FRAME-LEVEL TO NOTE-LEVEL IN A TEACHER-STUDENT FRAMEWORK FOR SINGING TRANSCRIPTION FROM POLYPHONIC MUSIC


DC Field	Value	Language
dc.contributor.author	Kum, Sangeun	ko
dc.contributor.author	Lee, Jongpil	ko
dc.contributor.author	Kim, Keunhyoung Luke	ko
dc.contributor.author	Kim, Taehyoung	ko
dc.contributor.author	Nam, Juhan	ko
dc.date.accessioned	2022-09-27T10:00:18Z	-
dc.date.available	2022-09-27T10:00:18Z	-
dc.date.created	2022-09-14	-
dc.date.issued	2022-05-24	-
dc.identifier.citation	47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022, pp. 796-800	-
dc.identifier.issn	1520-6149	-
dc.identifier.uri	http://hdl.handle.net/10203/298716	-
dc.description.abstract	The lack of large-scale note-level labeled data is the major obstacle to singing transcription from polyphonic music. We address the issue by using pseudo labels that vocal pitch estimation models generate from unlabeled data. The proposed method first converts the frame-level pseudo labels to note-level labels through pitch and rhythm quantization steps. Then, it further improves the label quality through self-training in a teacher-student framework. To validate the method, we run experiments that vary three factors: two vocal pitch estimation models as pseudo-label generators, two setups of the teacher-student framework, and the number of self-training iterations. The results show that the proposed method can effectively leverage large-scale unlabeled audio data and that self-training with the noisy student model helps to improve performance. Finally, we show that the model trained with only unlabeled data performs comparably to previous works, and that the model trained with additional labeled data achieves higher accuracy than the model trained with only labeled data.	-
dc.language	English	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	PSEUDO-LABEL TRANSFER FROM FRAME-LEVEL TO NOTE-LEVEL IN A TEACHER-STUDENT FRAMEWORK FOR SINGING TRANSCRIPTION FROM POLYPHONIC MUSIC	-
dc.type	Conference	-
dc.identifier.wosid	000864187901014	-
dc.identifier.scopusid	2-s2.0-85131239972	-
dc.type.rims	CONF	-
dc.citation.beginningpage	796	-
dc.citation.endingpage	800	-
dc.citation.publicationname	47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022	-
dc.identifier.conferencecountry	SI	-
dc.identifier.conferencelocation	Marina Bay Sands Expo & Convention Center	-
dc.identifier.doi	10.1109/ICASSP43922.2022.9747147	-
dc.contributor.localauthor	Nam, Juhan	-
dc.contributor.nonIdAuthor	Kum, Sangeun	-
dc.contributor.nonIdAuthor	Lee, Jongpil	-
dc.contributor.nonIdAuthor	Kim, Keunhyoung Luke	-
dc.contributor.nonIdAuthor	Kim, Taehyoung	-
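
The abstract in the record above describes converting frame-level pitch pseudo labels to note-level labels through pitch and rhythm quantization, but the record does not give the details of those steps. The following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that pitch quantization means rounding each voiced frame to the nearest MIDI semitone, and that rhythm quantization can be approximated by snapping note boundaries to a fixed time grid and dropping notes shorter than a minimum duration. The function names and the parameters `hop_sec`, `grid_sec`, and `min_dur_sec` are hypothetical.

```python
import math


def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)


def frames_to_notes(f0_hz, hop_sec=0.01, grid_sec=0.05, min_dur_sec=0.1):
    """Turn a frame-level f0 track into (onset_sec, offset_sec, midi) notes.

    f0_hz: one pitch value per frame in Hz; values <= 0 mean "unvoiced".
    All parameter values are illustrative assumptions, not from the paper.
    """
    # Pitch quantization (assumed): nearest semitone, None for unvoiced frames.
    midi = [round(hz_to_midi(f)) if f > 0 else None for f in f0_hz]

    min_steps = round(min_dur_sec / grid_sec)
    notes = []
    start = 0
    for i in range(1, len(midi) + 1):
        # A segment ends at the last frame or when the quantized pitch changes.
        if i == len(midi) or midi[i] != midi[start]:
            if midi[start] is not None:
                # Rhythm quantization (assumed): snap boundaries to the grid.
                on_step = round(start * hop_sec / grid_sec)
                off_step = round(i * hop_sec / grid_sec)
                # Drop segments too short to count as a note.
                if off_step - on_step >= min_steps:
                    notes.append((on_step * grid_sec, off_step * grid_sec, midi[start]))
            start = i
    return notes


# Example: a slightly detuned A4 held for 0.3 s, then a very short B4 blip.
f0_track = [0] * 10 + [440.0] * 15 + [445.0] * 15 + [0] * 5 + [493.88] * 3 + [0] * 10
example_notes = frames_to_notes(f0_track)
```

In this sketch the 440 Hz and 445 Hz frames merge into a single MIDI 69 note because both round to the same semitone, and the three-frame blip is discarded as too short. A real pipeline would likely need more robust smoothing before segmentation.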

Appears in Collection: GCT-Conference Papers(학술회의논문)

Files in This Item: There are no files associated with this item.

This item is cited by 7 documents in Web of Science (WoS).


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
