Band-width expansion using spline codebook-based spectral folding and shifting

Narrow-band speech (0 to 4 kHz) sounds muffled and is less intelligible than wide-band speech because it lacks high-band components. Band-width expansion methods can improve its quality. They estimate the missing high-frequency components from characteristics of the narrow-band speech, such as its spectral envelope and excitation signal. This thesis proposes a spline codebook-based spectral folding (SCSF) method, which performs high-band spectral envelope estimation and excitation signal generation simultaneously. In training, a cepstrum codebook for spectrum-folded speech is generated by vector quantization (VQ) of the extracted cepstrum, and the spline codebook is built from the cepstrum codebook and the corresponding splines. In restoration, the spline for the input narrow-band speech is selected using cepstral VQ and its corresponding spline function. Finally, wide-band speech is generated by applying the spline to the spectrum-folded speech. Speech expanded by the SCSF method shows better quality than speech expanded by other methods. However, it suffers from high-band harmonics, because the strong pitch components in the low frequency bands are folded into the high frequency bands during the process. To improve on the SCSF method, this thesis proposes a spline codebook-based spectral shifting (SCSS) method. The SCSS method is similar to the SCSF method, except that spectral shifting is used to generate the excitation signal: the high-band excitation signal is generated by shifting the low-band one using a cosine function generator. Speech expanded by the SCSS method obtains the highest score in our objective tests, and listeners also prefer it to speech expanded by other methods.
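The two excitation-generation ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that spectral folding is realized by zero-insertion upsampling by two, and that spectral shifting is realized by multiplying the signal by a cosine. The function names, the 8 kHz and 16 kHz sampling rates, and the 4 kHz shift are illustrative assumptions, and the spline codebook and envelope shaping are not modeled.

```python
import numpy as np

def spectral_fold(x_nb):
    """Zero-insertion upsampling by 2 (assumed realization of folding).

    The spectrum of the narrow-band signal is mirrored around the old
    Nyquist frequency, so a component at f also appears at fs_nb - f.
    """
    y = np.zeros(2 * len(x_nb))
    y[::2] = x_nb
    return y

def spectral_shift(x_wb, shift_hz, fs):
    """Cosine modulation (assumed realization of shifting).

    A component at f is moved to f + shift_hz and |f - shift_hz|.
    A high-pass filter would normally follow; it is omitted here.
    """
    n = np.arange(len(x_wb))
    return x_wb * np.cos(2 * np.pi * shift_hz * n / fs)

def peak_freqs(x, fs, thresh=0.5):
    """Return the frequencies whose magnitude exceeds thresh * max."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[mag > thresh * mag.max()]

if __name__ == "__main__":
    fs_nb, fs_wb = 8000, 16000
    # A 1 kHz tone stands in for a strong low-band pitch harmonic.
    tone_nb = np.cos(2 * np.pi * 1000 * np.arange(800) / fs_nb)
    tone_wb = np.cos(2 * np.pi * 1000 * np.arange(1600) / fs_wb)
    print("folded :", peak_freqs(spectral_fold(tone_nb), fs_wb))
    print("shifted:", peak_freqs(spectral_shift(tone_wb, 4000, fs_wb), fs_wb))
```

In this toy setting, folding places a mirror image of each low-band harmonic in the high band, which is consistent with the high-band harmonics problem the abstract attributes to SCSF. Shifting instead translates the low-band content by the chosen offset.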
Advisors
Hahn, Min-Soo (한민수)
Description
Information and Communications University : School of Engineering
Publisher
Information and Communications University
Issue Date
2007
Identifier
392841/225023 / 020054571
Language
eng
Description

Thesis (Master's) - Information and Communications University : School of Engineering, 2007.8, [ viii, 46 p. ]

URI
http://hdl.handle.net/10203/54861
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=392841&flag=dissertation
Appears in Collection
School of Engineering-Theses_Master(공학부 석사논문)
Files in This Item
There are no files associated with this item.