Dynamic Noise Embedding: Noise Aware Training and Adaptation for Speech Enhancement

Accurately estimating noise information is crucial for noise aware training in speech applications, including speech enhancement (SE), which is the focus of this paper. To estimate noise-only frames, we employ voice activity detection (VAD) to detect non-speech frames by applying an optimal threshold to the speech posterior. The non-speech frames can then be regarded as the noise-only frames of the noisy signal. These estimated frames are used to extract a noise embedding, named dynamic noise embedding (DNE), which helps an SE module capture the characteristics of the background noise. The DNE is extracted by a simple neural network, and the SE module can be trained jointly with the DNE so that it adapts to the environment. Experiments are conducted on the TIMIT dataset for a single-channel denoising task, with U-Net as the backbone SE module. Experimental results show that the DNE plays an important role in the SE module, increasing the quality and intelligibility of the corrupted signal even when the noise is non-stationary and unseen in training. In addition, we demonstrate that the DNE can be flexibly applied to other neural network-based SE modules.
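The pipeline described in the abstract (threshold the VAD speech posterior, keep the non-speech frames, derive a noise embedding from them, and hand it to the SE module) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the fixed threshold of 0.5 (the paper uses an optimal threshold whose value is not given here), the use of a plain mean over noise-only frames in place of the paper's embedding network, and the choice to condition the SE input by concatenation.

```python
import numpy as np


def select_noise_frames(speech_posterior, threshold=0.5):
    """Mark frames whose speech posterior falls below the threshold.

    The threshold value is a placeholder assumption, not the paper's setting.
    """
    return np.asarray(speech_posterior, dtype=float) < threshold


def noise_embedding(features, noise_mask):
    """Hypothetical stand-in for the embedding network.

    Averages the feature vectors of the frames flagged as noise-only.
    Returns a zero vector when no frame is flagged.
    """
    features = np.asarray(features, dtype=float)
    if not noise_mask.any():
        return np.zeros(features.shape[1])
    return features[noise_mask].mean(axis=0)


def condition_input(features, embedding):
    """Append the same embedding to every frame (one assumed way to feed an SE module)."""
    features = np.asarray(features, dtype=float)
    tiled = np.tile(embedding, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)


# Toy example: 4 frames with 2 features each; the first two frames are non-speech.
features = np.array([[1.0, 1.0], [3.0, 3.0], [10.0, 10.0], [12.0, 12.0]])
posterior = np.array([0.1, 0.2, 0.9, 0.8])

mask = select_noise_frames(posterior)
embedding = noise_embedding(features, mask)
conditioned = condition_input(features, embedding)
```

In this toy run the embedding is computed from the first two frames only, and each frame of the conditioned input carries its original features followed by that embedding. A learned network, as in the paper, would replace the mean with a trainable mapping optimized jointly with the SE module.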
Publisher
IEEE
Issue Date
2020-12
Language
English
Citation

2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020, pp.739 - 746

ISSN
2309-9402
URI
http://hdl.handle.net/10203/288418
Appears in Collection
EE-Conference Papers (Conference Papers)