POLYPHONIC SOUND EVENT DETECTION USING CONVOLUTIONAL BIDIRECTIONAL LSTM AND SYNTHETIC DATA-BASED TRANSFER LEARNING

DC Field | Value | Language
dc.contributor.author | Jung, Seokwon | ko
dc.contributor.author | Park, Jungbae | ko
dc.contributor.author | Lee, Sangwan | ko
dc.date.accessioned | 2020-06-26T03:21:04Z | -
dc.date.available | 2020-06-26T03:21:04Z | -
dc.date.created | 2020-06-17 | -
dc.date.issued | 2019-05 | -
dc.identifier.citation | 44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.885 - 889 | -
dc.identifier.issn | 1520-6149 | -
dc.identifier.uri | http://hdl.handle.net/10203/274942 | -
dc.description.abstract | This paper presents a novel approach to improve the performance of polyphonic sound event detection that combines a convolutional bidirectional recurrent neural network (CBRNN) with transfer learning. The ordinary convolutional recurrent neural network (CRNN) is known to suffer from a vanishing gradient problem, which significantly reduces the efficiency of information transfer to past events. To resolve this issue, we combine forward and backward long short-term memory (LSTM) modules and demonstrate that they complement each other. To effectively deal with the issue of overfitting that arises from increased model complexity, we apply transfer learning with a dataset that contains synthesized artifacts. We show that the model achieves faster and better performance with less data. Simulations with the 2016 TUT dataset show that the performance of the CBRNN with transfer learning is dramatically improved compared to the ordinary CRNN; the F1 score was 28.4% higher, and the error rate was 0.42 lower. | -
dc.language | English | -
dc.publisher | IEEE | -
dc.title | POLYPHONIC SOUND EVENT DETECTION USING CONVOLUTIONAL BIDIRECTIONAL LSTM AND SYNTHETIC DATA-BASED TRANSFER LEARNING | -
dc.type | Conference | -
dc.identifier.wosid | 000482554001023 | -
dc.identifier.scopusid | 2-s2.0-85068970982 | -
dc.type.rims | CONF | -
dc.citation.beginningpage | 885 | -
dc.citation.endingpage | 889 | -
dc.citation.publicationname | 44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | -
dc.identifier.conferencecountry | US | -
dc.identifier.conferencelocation | Brighton, ENGLAND | -
dc.identifier.doi | 10.1109/ICASSP.2019.8682909 | -
dc.contributor.localauthor | Lee, Sangwan | -
dc.contributor.nonIdAuthor | Jung, Seokwon | -
dc.contributor.nonIdAuthor | Park, Jungbae | -
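
The abstract reports results as an F1 score and an error rate but this record does not define either metric. The sketch below assumes the segment-based definitions commonly used for polyphonic sound event detection, where each fixed-length segment carries a set of active event labels and the error rate counts substitutions, deletions, and insertions relative to the number of reference events. The function name `segment_metrics` and the label sets are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def segment_metrics(
    reference: List[Set[str]], estimated: List[Set[str]]
) -> Tuple[float, float]:
    """Return (F1, error rate) accumulated over all segments.

    Assumed definitions (not stated in the record):
      F1 = 2*TP / (2*TP + FP + FN)
      ER = (S + D + I) / N, with per-segment
           S = min(FN, FP), D = max(0, FN - FP), I = max(0, FP - FN),
           and N the total number of active reference events.
    """
    if len(reference) != len(estimated):
        raise ValueError("reference and estimated must have the same length")

    tp = fp = fn = 0
    subs = dels = ins = 0
    n_ref = 0
    for ref, est in zip(reference, estimated):
        seg_tp = len(ref & est)   # events active in both
        seg_fp = len(est - ref)   # predicted but not in the reference
        seg_fn = len(ref - est)   # in the reference but missed
        tp += seg_tp
        fp += seg_fp
        fn += seg_fn
        subs += min(seg_fp, seg_fn)
        dels += max(0, seg_fn - seg_fp)
        ins += max(0, seg_fp - seg_fn)
        n_ref += len(ref)

    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    er = (subs + dels + ins) / n_ref if n_ref else 0.0
    return f1, er
```

Under these definitions the error rate can exceed 1.0 when insertions are frequent, so a lower error rate and a higher F1 score both indicate better detection.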
Appears in Collection
BiS-Conference Papers