Narrow-band speech (0 to 4 kHz) sounds muffled and is less intelligible than wide-band speech, because it lacks high-band components. Bandwidth expansion methods can improve the quality of narrow-band speech. They estimate its missing high-frequency components from characteristics of the narrow-band speech itself, such as its spectral envelope and excitation signal.
In this thesis, a spline codebook-based spectral folding (SCSF) method is proposed. The SCSF method performs high-band spectral envelope estimation and excitation signal generation simultaneously. In training, a cepstrum codebook for spectrum-folded speech is generated by vector quantization (VQ) of the extracted cepstra, and a spline codebook is built from the cepstrum codebook and the corresponding splines. In restoration, the spline for the input narrow-band speech is selected by cepstral VQ, which identifies the matching codeword and its corresponding spline function. Finally, wide-band speech is generated by applying the selected spline to the spectrum-folded speech. Speech expanded by the SCSF method shows better quality than speech expanded by other methods. However, it suffers from a high-band harmonics problem, because the strong pitch components in the low frequency bands are folded into the high frequency bands during the process.
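As an illustration of the folding step only, the sketch below realizes spectral folding by zero insertion, which is one common way to do it and is assumed here rather than taken from the thesis. The function name `spectral_fold`, the 8 kHz and 16 kHz sampling rates, and the 1 kHz test tone are all hypothetical choices for the example. The spline codebook stage is not shown.

```python
import numpy as np

def spectral_fold(narrow):
    """Upsample by 2 with zero insertion (assumed realization of folding).

    Inserting a zero after every sample doubles the sampling rate and
    leaves a mirror image of the low band in the new upper half-band.
    """
    wide = np.zeros(2 * len(narrow))
    wide[::2] = narrow
    return wide

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (narrow band).
fs_narrow = 8000
fs_wide = 2 * fs_narrow
n = np.arange(800)
tone = np.cos(2 * np.pi * 1000 * n / fs_narrow)

folded = spectral_fold(tone)

# Locate the dominant spectral components of the folded signal.
spectrum = np.abs(np.fft.rfft(folded))
freqs = np.fft.rfftfreq(len(folded), d=1 / fs_wide)
peaks = freqs[spectrum > 0.5 * spectrum.max()]
print(peaks)  # the original tone and its mirror image about 4 kHz
```

In this sketch the 1 kHz component reappears at 7 kHz, mirrored about 4 kHz. The same mirroring applies to every low-band pitch harmonic, which is the mechanism behind the high-band harmonics problem described above.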
To improve on the SCSF method, this thesis also proposes a spline codebook-based spectral shifting (SCSS) method. The SCSS method is similar to the SCSF method, except that it uses spectral shifting to generate the excitation signal: the high-band excitation signal is generated by shifting the low-band excitation signal using a cosine function generator. Speech expanded by the SCSS method obtains the highest score in our objective tests. In addition, listeners prefer speech expanded by the SCSS method to speech expanded by the other methods.
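The shifting idea can be sketched as cosine modulation followed by a high-pass step. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name `spectral_shift`, the 4 kHz shift, the 16 kHz sampling rate, and the brick-wall FFT-domain high-pass filter are all hypothetical choices made for the example.

```python
import numpy as np

def spectral_shift(low_band, fs, shift_hz):
    """Shift a low-band signal upward by cosine modulation (assumed scheme).

    Multiplying by cos(2*pi*shift_hz*t) moves a component at f to
    shift_hz + f and |shift_hz - f|. The lower copy is then removed
    with an idealized brick-wall high-pass filter in the FFT domain.
    """
    n = np.arange(len(low_band))
    modulated = low_band * np.cos(2 * np.pi * shift_hz * n / fs)
    spec = np.fft.rfft(modulated)
    freqs = np.fft.rfftfreq(len(modulated), d=1 / fs)
    spec[freqs < shift_hz] = 0  # keep only the up-shifted copy
    return np.fft.irfft(spec, n=len(modulated))

# Hypothetical test signal: a 1 kHz tone at a 16 kHz sampling rate.
fs = 16000
n = np.arange(1600)
tone = np.cos(2 * np.pi * 1000 * n / fs)

shifted = spectral_shift(tone, fs, shift_hz=4000)

# Find the strongest component of the shifted signal.
out_spec = np.abs(np.fft.rfft(shifted))
out_freqs = np.fft.rfftfreq(len(shifted), d=1 / fs)
shifted_peak_hz = out_freqs[np.argmax(out_spec)]
print(shifted_peak_hz)  # the tone moved up by the shift frequency
```

Unlike folding, which mirrors the low band, this modulation translates it, so in the sketch the 1 kHz component lands at 5 kHz and the spacing between components is preserved.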