Deformable CNN and Imbalance-Aware Feature Learning for Singing Technique Classification

Cited 3 times in Web of Science; cited 0 times in Scopus
DC Field | Value | Language
dc.contributor.author | Yamamoto, Yuya | ko
dc.contributor.author | Nam, Juhan | ko
dc.contributor.author | Terasawa, Hiroko | ko
dc.date.accessioned | 2022-12-03T02:00:36Z |
dc.date.available | 2022-12-03T02:00:36Z |
dc.date.created | 2022-12-02 |
dc.date.issued | 2022-09-21 |
dc.identifier.citation | 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, pp.2778 - 2782 |
dc.identifier.issn | 2308-457X |
dc.identifier.uri | http://hdl.handle.net/10203/301509 |
dc.description.abstract | Singing techniques are used for expressive vocal performances by employing temporal fluctuations of the timbre, the pitch, and other components of the voice. Their classification is a challenging task, mainly because of two factors: 1) the fluctuations in singing techniques vary widely and are affected by many factors, and 2) existing datasets are imbalanced. To deal with these problems, we developed a novel audio feature learning method based on deformable convolution, with decoupled training of the feature extractor and the classifier using a class-weighted loss function. The experimental results show the following: 1) deformable convolution improves the classification results, particularly when applied to the last two convolutional layers, and 2) both re-training the classifier and weighting the cross-entropy loss function by a smoothed inverse frequency enhance the classification performance. |
dc.language | English |
dc.publisher | International Speech Communication Association |
dc.title | Deformable CNN and Imbalance-Aware Feature Learning for Singing Technique Classification |
dc.type | Conference |
dc.identifier.wosid | 000900724502190 |
dc.identifier.scopusid | 2-s2.0-85140085822 |
dc.type.rims | CONF |
dc.citation.beginningpage | 2778 |
dc.citation.endingpage | 2782 |
dc.citation.publicationname | 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 |
dc.identifier.conferencecountry | KO |
dc.identifier.conferencelocation | Incheon |
dc.identifier.doi | 10.21437/Interspeech.2022-11137 |
dc.contributor.localauthor | Nam, Juhan |
dc.contributor.nonIdAuthor | Yamamoto, Yuya |
dc.contributor.nonIdAuthor | Terasawa, Hiroko |
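The imbalance-aware weighting described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes "smoothed inverse frequency" means raising the inverse class frequency to a smoothing exponent (the exact smoothing scheme is not specified in this record), and the function names are hypothetical.

```python
import numpy as np

def smoothed_inverse_frequency_weights(class_counts, smooth=0.5):
    # Hypothetical smoothing: weight_c proportional to (1 / count_c)^smooth,
    # normalized so the weights average to 1. smooth=1 gives plain inverse
    # frequency; smooth=0 gives uniform weights.
    counts = np.asarray(class_counts, dtype=float)
    raw = (1.0 / counts) ** smooth
    return raw * len(counts) / raw.sum()

def weighted_cross_entropy(logits, label, weights):
    # Standard softmax cross-entropy for one example, scaled by the
    # weight of the true class, so rare classes contribute more to the loss.
    z = logits - logits.max()                      # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum())
    return -weights[label] * log_probs[label]
```

For example, with class counts [100, 10] the minority class receives roughly a three times larger weight at smooth=0.5, versus ten times at smooth=1, which is the kind of tempered re-weighting the abstract alludes to.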
Appears in Collection
GCT-Conference Papers (학술회의논문 / Conference Papers)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
