Style-based audio-driven talking head generation (스타일 기반의 음성에 따른 얼굴 비디오 생성)

DC Field | Value | Language
dc.contributor.advisor | Hwang, Sung Ju | -
dc.contributor.advisor | 황성주 | -
dc.contributor.author | Song, Minyoung | -
dc.date.accessioned | 2023-06-22T19:31:29Z | -
dc.date.available | 2023-06-22T19:31:29Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=997676&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/308230 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (한국과학기술원): Kim Jaechul Graduate School of AI (김재철AI대학원), 2022.2, [iii, 17 p.] | -
dc.description.abstract | While audio-driven talking head generation has achieved highly realistic multi-speaker generation, previous works rely on predefined additional data such as 3D model parameters, landmarks, and head pose angles. However, this explicit supervision is expensive: scanning 3D models requires special devices in a controlled lab environment, and landmarks must be annotated manually. In this paper, we propose the first multi-speaker talking video generation framework that does not use any predefined prior. We first design a novel style code manipulator that explores the latent space of a pretrained StyleGAN3 and generates a sequence of style codes within the distribution of the generator. In this way, we achieve identity-preserving head pose matching without any predefined supervision. Furthermore, by leveraging the power of StyleGAN3, our framework achieves high-quality video generation. Finally, we adopt a sync loss, computed from an expert discriminator that maps audio and visual features to a unified space, for better lip synchronization. Our framework is fully unsupervised since we do not include any model trained with additional data. Experimental results show that our method generates high-quality video and performs competitively with state-of-the-art methods that use supervision. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (한국과학기술원) | -
dc.title | Style-based audio-driven talking head generation | -
dc.title.alternative | 스타일 기반의 음성에 따른 얼굴 비디오 생성 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (한국과학기술원): Kim Jaechul Graduate School of AI (김재철AI대학원) | -
dc.contributor.alternativeauthor | 송민영 | -
Appears in Collection
AI-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
