Efficient adversarial audio synthesis via progressive upsampling

This paper proposes PUGAN (progressive upsampling GAN), a generative model that progressively synthesizes high-quality audio as a raw waveform. PUGAN produces output at progressively higher resolution by stacking multiple encoder-decoder architectures. Whereas WaveGAN, an existing state-of-the-art model, uses a single decoder architecture, our model generates audio signals and converts them to a higher resolution step by step, while using significantly fewer parameters than WaveGAN (e.g., 3.17x fewer for 16 kHz output). Our experiments show that audio signals can be generated in real time with quality comparable to that of WaveGAN in terms of inception score and human perception.
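The idea of progressive upsampling described in the abstract can be illustrated with a toy sketch. This is not the PUGAN architecture: the 2x factor per stage, the linear interpolation, the identity `refine` placeholder, and all function names are assumptions made for illustration. A real model would use a learned encoder-decoder network at each stage.

```python
# Toy illustration of progressive upsampling. Everything here is hypothetical:
# the per-stage factor (2x), the interpolation, and the identity "refine" step
# stand in for learned components that the abstract does not specify.

def upsample_2x(signal):
    """Double the length of a signal by linear interpolation."""
    out = []
    for i, x in enumerate(signal):
        # The last sample has no right neighbor, so it is repeated.
        nxt = signal[i + 1] if i + 1 < len(signal) else x
        out.append(x)
        out.append((x + nxt) / 2.0)
    return out


def refine(signal):
    """Placeholder for a learned encoder-decoder stage (identity here)."""
    return list(signal)


def progressive_upsample(base, base_rate, target_rate):
    """Apply upsample-then-refine stages until target_rate is reached."""
    if target_rate % base_rate != 0:
        raise ValueError("target_rate must be a multiple of base_rate")
    ratio = target_rate // base_rate
    if ratio & (ratio - 1) != 0:
        raise ValueError("target_rate / base_rate must be a power of two")

    signal, rate = list(base), base_rate
    while rate < target_rate:
        signal = refine(upsample_2x(signal))
        rate *= 2
    return signal, rate


if __name__ == "__main__":
    low_res = [0.0, 1.0, 0.0, -1.0]  # four samples at an assumed 4 kHz
    high_res, rate = progressive_upsample(low_res, 4000, 16000)
    print(len(high_res), rate)
```

Each pass through the loop doubles the sample count, so a short low-rate signal reaches the target rate in a small number of stages. In a learned model, the stages after the first would only need to add detail to an existing signal rather than generate it from scratch.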
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2021-06-06
Language
English
Citation
2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021, pp. 3410-3414
ISSN
1520-6149
DOI
10.1109/ICASSP39728.2021.9413954
URI
http://hdl.handle.net/10203/290641
Appears in Collection
RIMS Conference Papers
