Coherence-based phonemic analysis on the effect of reverberation to practical automatic speech recognition

Reverberation is one of the most critical obstacles to adopting automatic speech recognition (ASR) in real-life environments. A comprehensive understanding of the effect of reverberation on ASR is therefore required to design robust ASR systems for practical use. To deepen this understanding, we performed a phonemic analysis on a commercial ASR system. The analysis method involves a new metric named mean phoneme coherence (MPC), defined as the time-frequency-averaged coherence function between the clean and reverberated speech spectrograms of each phoneme. MPC measures the amount of spectral contamination of phonemes under a given reverberation condition, and thus quantifies not only the amount of reverberation on the phonemes but also the contextual influences on each phoneme within a sentence spoken in that condition. Comparing MPC with word error rate (WER) in real reverberation conditions showed that MPC represents both the amount of reverberation and the intelligibility of speech under a given reverberation condition. Furthermore, the relationship between phoneme groups' vulnerability to spectral contamination and ASR performance under reverberation is analyzed by comparing the median of each phoneme group's MPC distribution with phoneme group word accuracy (PGWA). The analysis shows that the two quantities are only weakly correlated, so reverberation affects the intelligibility of different phonemes differently. In addition, a comparative study among phoneme groups shows that nasals and semivowels yield the least robust ASR performance under reverberation, while nasals and stops are the most vulnerable to spectral contamination. The results and discussion indicate what should be taken into account to make ASR robust to reverberation.
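As a rough illustration of the idea behind MPC, the sketch below averages a short-time magnitude-squared coherence over time and frequency for one clean/reverberated segment pair. It is not the paper's exact definition: the abstract does not give the estimator, so the function names (`stft`, `moving_average`, `mean_coherence`), the frame length, the hop size, and the smoothing span are all assumptions made here, and the phoneme segmentation is assumed to have been done elsewhere. The synthetic signals merely stand in for real speech.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def moving_average(a, span):
    """Average each run of `span` consecutive frames (works for complex input)."""
    c = np.cumsum(a, axis=0)
    c = np.vstack([np.zeros((1, a.shape[1]), dtype=a.dtype), c])
    return (c[span:] - c[:-span]) / span

def mean_coherence(clean, reverberant, frame_len=256, hop=128, span=5):
    """Time-frequency average of a short-time magnitude-squared coherence.

    Assumed estimator (hypothetical, not taken from the paper): auto- and
    cross-spectra are smoothed over `span` neighbouring frames before the
    coherence ratio is formed.
    """
    X = stft(clean, frame_len, hop)
    Y = stft(reverberant, frame_len, hop)
    sxx = moving_average(np.abs(X) ** 2, span)
    syy = moving_average(np.abs(Y) ** 2, span)
    sxy = moving_average(X * np.conj(Y), span)
    coherence = np.abs(sxy) ** 2 / (sxx * syy + 1e-12)
    return float(coherence.mean())

# Synthetic stand-ins: white noise as the "clean" segment and an
# exponentially decaying random impulse response as the "room".
rng = np.random.default_rng(0)
clean = rng.standard_normal(8000)
room = rng.standard_normal(2000) * np.exp(-np.arange(2000) / 400.0)
room[0] = 1.0  # direct path
reverberant = np.convolve(clean, room)[:len(clean)]

same = mean_coherence(clean, clean)
rev = mean_coherence(clean, reverberant)
print(f"clean vs clean: {same:.3f}, clean vs reverberant: {rev:.3f}")
```

In this sketch a value near 1 means the reverberated spectrogram is close to a linearly filtered copy of the clean one within each short window, and lower values indicate more spectral contamination. Per-phoneme values would follow from applying `mean_coherence` to each aligned phoneme segment.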
Publisher
ELSEVIER SCI LTD
Issue Date
2025-01
Language
English
Article Type
Article
Citation
APPLIED ACOUSTICS, v.227
ISSN
0003-682X
DOI
10.1016/j.apacoust.2024.110233
URI
http://hdl.handle.net/10203/323567
Appears in Collection
ME-Journal Papers