Vowel based Voice Activity Detection with LSTM Recurrent Neural Network

Voice activity detection (VAD) determines whether incoming signal segments are speech or noise and is an important technique in almost all speech-related applications. To improve VAD performance in various noise environments, characterizing speech features has been the most crucial issue to date. Among the speech features proposed so far, the temporal context information of speech and vowel sound characteristics are known to be the current state of the art. Therefore, to exploit both of these merits, we propose vowel-based VAD using a long short-term memory recurrent neural network (LSTM-RNN). The LSTM-RNN is known to be a powerful model for capturing dynamic context information through time. Moreover, by training the LSTM-RNN on only vowel sounds rather than whole speech, it can learn more effectively because of the reduced manifold of speech. According to our experiments, the proposed method shows better accuracy not only in the VAD task, compared to LSTM-RNN-based VAD, but also in a vowel detection task.
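As a rough illustration of the difference between training on whole speech and training on vowels only, the sketch below builds frame-level targets from a per-frame phone labeling. It is not the authors' implementation: the vowel inventory (ARPAbet-style), the silence symbols, and the function names are assumptions made for this example, since the abstract does not specify the label set.

```python
# Hypothetical ARPAbet-style vowel inventory; the paper's actual label set is not given.
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ay", "eh", "er",
          "ey", "ih", "iy", "ow", "oy", "uh", "uw"}
# Hypothetical non-speech symbols used by the frame labeling.
SILENCE = {"sil", "sp", "pau"}


def speech_targets(frame_phones):
    """Conventional VAD targets: 1 for any speech frame, 0 for non-speech."""
    return [0 if p in SILENCE else 1 for p in frame_phones]


def vowel_targets(frame_phones):
    """Vowel-based targets: 1 only for vowel frames, 0 for everything else."""
    return [1 if p in VOWELS else 0 for p in frame_phones]


# One phone label per frame for a short utterance ("seat" with surrounding silence).
frames = ["sil", "s", "iy", "iy", "t", "sil"]
print(speech_targets(frames))
print(vowel_targets(frames))
```

With vowel-only targets the positive class covers a narrower set of sounds, which is one way to read the "reduced manifold of speech" mentioned in the abstract. A sequence model such as an LSTM-RNN would then be trained on acoustic features with these targets.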
Publisher
Association for Computing Machinery
Issue Date
2016-11-22
Language
English
Citation

8th International Conference on Signal Processing Systems, ICSPS 2016, pp.134 - 137

DOI
10.1145/3015166.3015207
URI
http://hdl.handle.net/10203/222538
Appears in Collection
EE-Conference Papers(학술회의논문)