Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference

Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment. Conventional segmentation techniques primarily segment speech using a fixed frame size for computational simplicity. However, this approach is insufficient for capturing the quasi-regular structure of speech, which causes substantial recognition failure in noisy environments. How does the brain handle quasi-regular structured speech and maintain high recognition performance under any circumstance? Recent neurophysiological studies have suggested that the phase of neuronal oscillations in the auditory cortex contributes to accurate speech recognition by guiding speech segmentation into smaller units at different timescales. A phase-locked relationship between neuronal oscillation and the speech envelope has recently been obtained, which suggests that the speech envelope provides a foundation for multi-timescale speech segmental information. In this study, we quantitatively investigated the role of the speech envelope as a potential temporal reference to segment speech using its instantaneous phase information. We evaluated the proposed approach by the achieved information gain and recognition performance in various noisy environments. The results indicate that the proposed segmentation scheme not only extracts more information from speech but also provides greater robustness in a recognition test.
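The core idea of the abstract, using the instantaneous phase of the speech envelope as a temporal reference for segmentation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no pipeline details, so the 4-10 Hz envelope band, the brick-wall FFT filter, the rule of placing a boundary at each phase wrap, and all function names (`analytic_signal`, `bandpass_fft`, `phase_segment_boundaries`) are assumptions chosen for illustration. The input is a synthetic amplitude-modulated tone rather than real speech.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (the usual Hilbert-transform construction)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep Nyquist once
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)  # negative frequencies are zeroed

def bandpass_fft(x, fs, low_hz, high_hz):
    """Ideal band-pass filter in the frequency domain (illustrative choice)."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def phase_segment_boundaries(speech, fs, band=(4.0, 10.0)):
    """Return sample indices where the envelope phase wraps from +pi to -pi.

    The band and the wrap-based boundary rule are assumptions of this sketch.
    """
    envelope = np.abs(analytic_signal(speech))        # broadband envelope
    slow = bandpass_fft(envelope, fs, *band)          # slow envelope rhythm
    phase = np.angle(analytic_signal(slow))           # instantaneous phase
    boundaries = np.where(np.diff(phase) < -np.pi)[0] + 1
    return boundaries, phase

# Synthetic stand-in for speech: a 200 Hz carrier modulated at 5 Hz.
fs = 1000
t = np.arange(2 * fs) / fs
speech = (1.0 + 0.8 * np.cos(2 * np.pi * 5 * t)) * np.cos(2 * np.pi * 200 * t)

boundaries, phase = phase_segment_boundaries(speech, fs)
segments = np.split(speech, boundaries)  # variable-length frames
print(len(boundaries), "boundaries;", len(segments), "segments")
```

With a real recording, the segment lengths would follow the quasi-regular rhythm of the envelope instead of a fixed frame size, which is the contrast the abstract draws with conventional fixed-frame segmentation.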
Publisher
NATURE PUBLISHING GROUP
Issue Date
2016-11
Language
English
Article Type
Article
Keywords
HUMAN AUDITORY-CORTEX; FRAME RATE ANALYSIS; PHASE; PATTERNS; INTELLIGIBILITY; OSCILLATIONS; COMPREHENSION; MECHANISMS; CONSONANTS; LISTENERS
Citation
SCIENTIFIC REPORTS, v.6
ISSN
2045-2322
DOI
10.1038/srep37647
URI
http://hdl.handle.net/10203/218316
Appears in Collection
BiS-Journal Papers (Journal Papers)