DSpace at KOASAS: Efficient adversarial audio synthesis via progressive upsampling

Efficient adversarial audio synthesis via progressive upsampling

Authors: Cho, Youngwoo / Chang, Minwook / Lee, Sanghyeon / Lee, Hyoungwoo / Kim, Gerard-jounghyun / Choo, Jaegul

This paper proposes PUGAN (progressive upsampling GAN), a novel generative model that progressively synthesizes high-quality audio as a raw waveform. PUGAN generates higher-resolution output step by step by stacking multiple encoder-decoder architectures. In contrast to WaveGAN, an existing state-of-the-art model that uses a single decoder architecture, our model generates audio signals and converts them to a higher resolution progressively while using significantly fewer parameters (e.g., 3.17x smaller than WaveGAN for 16 kHz output). Our experiments show that audio signals can be generated in real time with quality comparable to that of WaveGAN in terms of inception score and human perception.
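The progressive idea in the abstract can be illustrated with a toy sketch: start from a low-rate waveform and pass it through a stack of stages, each of which raises the sample rate. Everything specific here is an assumption made for illustration and is not taken from the paper: the 4 kHz starting rate, the 2x factor per stage, and the names `upsample_2x` and `progressive_upsample` are hypothetical, and plain linear interpolation stands in for the learned encoder-decoder modules.

```python
import math


def upsample_2x(signal):
    """Double the number of samples by linear interpolation.

    Stand-in for one learned upsampling stage; this is NOT the paper's module.
    """
    out = []
    for i, x in enumerate(signal):
        out.append(x)
        # Midpoint to the next sample; the last sample is simply repeated.
        nxt = signal[i + 1] if i + 1 < len(signal) else x
        out.append((x + nxt) / 2.0)
    return out


def progressive_upsample(signal, start_rate, target_rate):
    """Apply 2x stages until target_rate is reached.

    Returns a list of (sample_rate, signal) pairs, one per resolution,
    so each intermediate output can be inspected.
    """
    ratio, rem = divmod(target_rate, start_rate)
    # Assumption of this sketch: the rates differ by a power of two.
    if rem or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("target_rate must be start_rate times a power of two")
    stages = [(start_rate, signal)]
    rate = start_rate
    while rate < target_rate:
        signal = upsample_2x(signal)
        rate *= 2
        stages.append((rate, signal))
    return stages


# One second of a 440 Hz tone at an assumed 4 kHz starting rate.
low = [math.sin(2 * math.pi * 440 * n / 4000) for n in range(4000)]
stages = progressive_upsample(low, 4000, 16000)
for rate, sig in stages:
    print(rate, len(sig))
```

In a learned model each stage would carry trainable parameters and add detail that interpolation cannot; the sketch only shows the control flow of producing the output one resolution at a time.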

Publisher: Institute of Electrical and Electronics Engineers Inc.

Issue Date: 2021-06-06

Language: English

Citation: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, pp. 3410-3414

ISSN: 1520-6149

DOI: 10.1109/ICASSP39728.2021.9413954

URI: http://hdl.handle.net/10203/290641

Appears in Collection: RIMS Conference Papers

Files in This Item: There are no files associated with this item.
