WeavSpeech: Data Augmentation Strategy for Automatic Speech Recognition via Semantic-Aware Weaving

Cut-and-paste data augmentation has attracted considerable attention in the vision community due to its simplicity and effectiveness in improving generalization performance. However, applying this type of augmentation to Automatic Speech Recognition (ASR) tasks is challenging because the segments corresponding to specific output tokens (e.g., words or sub-words) vary in length. Furthermore, if speech signals are mixed indiscriminately without considering semantics, there is a risk of generating nonsensical sentences. To address these issues, we propose WeavSpeech, a simple yet effective cut-and-paste augmentation method for ASR tasks that weaves a pair of speech data while considering semantics. Our method can be applied to any language without requiring language-specific knowledge and can be seamlessly integrated with other verified augmentations. We validate the superiority of our method on representative ASR benchmark datasets, including LibriSpeech and WSJ.
Publisher
IEEE Signal Processing Society
Issue Date
2023-06-04
Language
English
Citation

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

DOI
10.1109/icassp49357.2023.10097196
URI
http://hdl.handle.net/10203/316311
Appears in Collection
AI-Conference Papers (Conference Papers)
