Grad-StyleSpeech: Any-speaker Adaptive Text-To-Speech Synthesis with Diffusion Models

DC Field | Value | Language
dc.contributor.author | Kang, Minki | ko
dc.contributor.author | Min, Dongchan | ko
dc.contributor.author | Hwang, Sung Ju | ko
dc.date.accessioned | 2023-12-12T07:01:01Z | -
dc.date.available | 2023-12-12T07:01:01Z | -
dc.date.created | 2023-12-10 | -
dc.date.issued | 2023-06-06 | -
dc.identifier.citation | 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 | -
dc.identifier.uri | http://hdl.handle.net/10203/316286 | -
dc.description.abstract | There has been significant progress in Text-To-Speech (TTS) synthesis technology in recent years, thanks to advances in neural generative modeling. However, existing methods for any-speaker adaptive TTS have achieved unsatisfactory performance, due to their suboptimal accuracy in mimicking the target speakers’ styles. In this work, we present Grad-StyleSpeech, an any-speaker adaptive TTS framework based on a diffusion model that can generate highly natural speech with extremely high similarity to a target speaker’s voice, given a few seconds of reference speech. Grad-StyleSpeech significantly outperforms recent speaker-adaptive TTS baselines on English benchmarks. Audio samples are available at https://nardien.github.io/grad-stylespeech-demo. | -
dc.language | English | -
dc.publisher | IEEE Signal Processing Society | -
dc.title | Grad-StyleSpeech: Any-speaker Adaptive Text-To-Speech Synthesis with Diffusion Models | -
dc.type | Conference | -
dc.identifier.scopusid | 2-s2.0-85177568036 | -
dc.type.rims | CONF | -
dc.citation.publicationname | 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 | -
dc.identifier.conferencecountry | GR | -
dc.identifier.conferencelocation | Rhodes Island | -
dc.identifier.doi | 10.1109/ICASSP49357.2023.10095515 | -
dc.contributor.localauthor | Hwang, Sung Ju | -
dc.contributor.nonIdAuthor | Kang, Minki | -
Appears in Collection
AI-Conference Papers
