Voice imitation based on speaker-adaptive multi-speaker speech synthesis model

Recently developed deep learning based text-to-speech (TTS) models have shown promising performance and the possibility of multi-speaker TTS. However, current multi-speaker TTS models are not easily extensible to a new speaker's voice and require substantial time to retrain with a new speaker's data. Our approach can instantly imitate a new speaker's voice using a speaker adaptation technique. We propose a novel network architecture that enables this task and generate speech samples comparable to those of the existing multi-speaker TTS model. Moreover, we improve the baseline TTS model, Tacotron, by introducing additional connections. We also propose and demonstrate a way to train a speaker embedding such that arbitrary voices can be generated by tuning its value.
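
The adaptation mechanism summarized in the abstract lends itself to a short sketch. The code below is a minimal, hypothetical illustration under stated assumptions, not the thesis's implementation: the names (MultiSpeakerTTS, adapt_new_speaker), the plain GRU encoder/decoder standing in for Tacotron's attention-based sequence-to-sequence model, and all dimensions are invented for the example. It shows only the core idea of conditioning synthesis on a speaker embedding and, for a new speaker, optimizing that single vector while the trained network stays frozen.

    # Hypothetical sketch, not the thesis code: speaker-embedding
    # conditioning plus embedding-only adaptation for a new speaker.
    import torch
    import torch.nn as nn

    class MultiSpeakerTTS(nn.Module):
        """Toy multi-speaker TTS: encoder outputs are concatenated with a
        per-speaker embedding before being decoded into mel frames."""
        def __init__(self, text_dim=256, emb_dim=64, hidden=256, n_mels=80):
            super().__init__()
            self.encoder = nn.GRU(text_dim, hidden, batch_first=True)
            # Speaker identity enters the decoder at every time step.
            self.decoder = nn.GRU(hidden + emb_dim, hidden, batch_first=True)
            self.mel_proj = nn.Linear(hidden, n_mels)

        def forward(self, text_feats, spk_vec):
            # text_feats: (B, T, text_dim); spk_vec: (B, emb_dim)
            enc, _ = self.encoder(text_feats)
            spk = spk_vec.unsqueeze(1).expand(-1, enc.size(1), -1)
            dec, _ = self.decoder(torch.cat([enc, spk], dim=-1))
            return self.mel_proj(dec)  # (B, T, n_mels)

    def adapt_new_speaker(model, batches, emb_dim=64, steps=200, lr=1e-2):
        """Speaker adaptation: freeze the trained model and fit only a
        fresh embedding vector to the new speaker's (text, mel) pairs.
        Assumes text and mel sequences are already time-aligned, which a
        real attention-based Tacotron would handle itself."""
        for p in model.parameters():
            p.requires_grad_(False)
        new_emb = torch.zeros(1, emb_dim, requires_grad=True)
        opt = torch.optim.Adam([new_emb], lr=lr)
        loss_fn = nn.L1Loss()
        for _ in range(steps):
            for text_feats, mel_target in batches:
                pred = model(text_feats, new_emb.expand(text_feats.size(0), -1))
                loss = loss_fn(pred, mel_target)
                opt.zero_grad()
                loss.backward()
                opt.step()
        return new_emb.detach()  # tune/interpolate this vector for new voices

Because only one low-dimensional vector is trained, adapting to a new voice is nearly instantaneous compared with retraining the whole model; tuning or interpolating the returned vector is likewise one plausible reading of how arbitrary voices could be generated from the embedding, as the abstract describes.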
Advisors
Lee, Soo Young (이수영)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2018
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2018.2, [iv, 35 p.]

Keywords

Speech synthesis; Text-to-speech; Multi-speaker text-to-speech; Deep learning; Neural network; Speaker adaptation

URI
http://hdl.handle.net/10203/266891
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=824433&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
