Speech and music pitch trajectory classification using recurrent neural networks for monaural speech segregation

Cited 3 times in Web of Science; cited 3 times in Scopus
DC Field | Value | Language
dc.contributor.author | Kim, Han-Gyu | ko
dc.contributor.author | Jang, Gil-Jin | ko
dc.contributor.author | Oh, Yung-Hwan | ko
dc.contributor.author | Choi, Ho-Jin | ko
dc.date.accessioned | 2020-10-15T00:55:11Z | -
dc.date.available | 2020-10-15T00:55:11Z | -
dc.date.created | 2019-07-01 | -
dc.date.issued | 2020-10 | -
dc.identifier.citation | JOURNAL OF SUPERCOMPUTING, v.76, no.10, pp.8193 - 8213 | -
dc.identifier.issn | 0920-8542 | -
dc.identifier.uri | http://hdl.handle.net/10203/276584 | -
dc.description.abstract | In this paper, we propose speech/music pitch classification based on recurrent neural networks (RNNs) for monaural speech segregation from music interference. The speech segregation methods in this paper exploit sub-band masking to construct segregation masks modulated by the estimated speech pitch. However, for speech signals mixed with music, speech pitch estimation becomes unreliable, as speech and music have similar harmonic structures. In order to remove the music interference effectively, we propose an RNN-based speech/music pitch classification. Our proposed method models the temporal trajectories of speech and music pitch values and determines whether an unknown continuous pitch sequence belongs to speech or music. Among various types of RNNs, we chose the simple recurrent network, long short-term memory (LSTM), and bidirectional LSTM for pitch classification. The experimental results show that our proposed method significantly outperforms the baseline methods on speech–music mixtures without loss of segregation performance on speech–noise mixtures. (An illustrative sketch of such a trajectory classifier appears after this record.) | -
dc.language | English | -
dc.publisher | SPRINGER | -
dc.title | Speech and music pitch trajectory classification using recurrent neural networks for monaural speech segregation | -
dc.type | Article | -
dc.identifier.wosid | 000569152500032 | -
dc.identifier.scopusid | 2-s2.0-85077145665 | -
dc.type.rims | ART | -
dc.citation.volume | 76 | -
dc.citation.issue | 10 | -
dc.citation.beginningpage | 8193 | -
dc.citation.endingpage | 8213 | -
dc.citation.publicationname | JOURNAL OF SUPERCOMPUTING | -
dc.identifier.doi | 10.1007/s11227-019-02785-x | -
dc.contributor.localauthor | Oh, Yung-Hwan | -
dc.contributor.localauthor | Choi, Ho-Jin | -
dc.contributor.nonIdAuthor | Kim, Han-Gyu | -
dc.contributor.nonIdAuthor | Jang, Gil-Jin | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Speech segregation | -
dc.subject.keywordAuthor | Speech pitch estimation | -
dc.subject.keywordAuthor | Pitch classification | -
dc.subject.keywordAuthor | Recurrent neural network | -
dc.subject.keywordAuthor | Long short-term memory | -
dc.subject.keywordAuthor | Bidirectional long short-term memory | -
dc.subject.keywordPlus | SEPARATION | -
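
Illustrative sketch: the abstract describes classifying whole pitch trajectories as speech or music with an RNN (LSTM or bidirectional LSTM). The Python/PyTorch snippet below is a minimal sketch of that idea only. The class name PitchTrajectoryClassifier, the one-pitch-value-per-frame input, the layer sizes, and the mean-pooling readout are assumptions made for illustration; the paper's actual features, architecture, and training setup are not specified in this record.

import torch
import torch.nn as nn

class PitchTrajectoryClassifier(nn.Module):
    # Classify a continuous pitch contour as speech (1) or music (0).
    # All sizes below are illustrative assumptions, not values from the paper.
    def __init__(self, input_dim=1, hidden_dim=64, num_layers=1, bidirectional=True):
        super().__init__()
        self.rnn = nn.LSTM(
            input_size=input_dim,      # e.g. one F0 value per time frame
            hidden_size=hidden_dim,
            num_layers=num_layers,
            batch_first=True,          # inputs are (batch, frames, features)
            bidirectional=bidirectional,
        )
        out_dim = hidden_dim * (2 if bidirectional else 1)
        self.classifier = nn.Linear(out_dim, 1)   # one logit: speech vs. music

    def forward(self, pitch_seq):
        # pitch_seq: (batch, frames, input_dim) pitch trajectories
        outputs, _ = self.rnn(pitch_seq)          # (batch, frames, out_dim)
        summary = outputs.mean(dim=1)             # pool over frames: whole-trajectory decision
        return self.classifier(summary).squeeze(-1)   # raw logits, one per trajectory

if __name__ == "__main__":
    model = PitchTrajectoryClassifier()
    # 8 synthetic contours, 100 frames each, values in a speech-like F0 range (Hz)
    contours = torch.rand(8, 100, 1) * 220.0 + 80.0
    probs = torch.sigmoid(model(contours))        # probability that each contour is speech
    print(probs.shape)                            # torch.Size([8])

Mean pooling over frames makes the decision depend on the entire contour rather than a single frame, loosely mirroring the trajectory-level (rather than frame-level) classification the abstract describes.
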
Appears in Collection
CS-Journal Papers (저널논문, Journal Papers)
Files in This Item
There are no files associated with this item.
