Very low bit rate (VLBR) speech coding digitizes a speech signal at bit rates of about 1 kbps and below, so that the signal can be transmitted or stored efficiently. To develop a VLBR speech coder, it is essential to remove the temporal redundancy in the spectral information of speech. Most VLBR speech coders analyze the input speech as a sequence of phonetically meaningful segments, such as phonemes, and then quantize these segments to remove the spectral redundancy. Speech coded in this way is expected to be useful in several applications, such as client-server speech recognition, spoken document retrieval, speaker transformation, and speaking rate modification, because a VLBR speech coder abstracts the essential information of the input speech more efficiently than a fixed-frame speech coding system.
In this paper, two important aspects of VLBR speech coding are studied: 1) the development of a novel method for quantizing the spectral information of speech and 2) applications of the VLBR speech coder output. To this end, a VLBR speech coder is implemented and its applications are discussed.
The implemented vocoder adopts the temporal decomposition method, which requires neither training nor pattern matching. The spectral information of the input speech is represented by line spectral frequency (LSF) parameters, since several of their properties suit a low bit rate speech coder, such as robustness to quantization and transmission errors. However, LSF parameters also have an inherent ordering property, and this property prohibits straightforward temporal decomposition of LSF parameters. To solve this problem, a restricted temporal decomposition is proposed. Finally, a VLBR speech coder with an average bit rate of 996 bps is developed, and performance tests show that the proposed vocoder achieves quality similar to that of the 2400 bps LPC-10E vocoder.
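The ordering property mentioned above can be illustrated with a small sketch. It assumes the usual statement of the property, that the LSFs of a stable filter satisfy 0 < w_1 < w_2 < ... < w_p < pi, and it models a temporal-decomposition reconstruction as a weighted sum of target vectors. The function names, the 4th-order target vectors, and the weights are all hypothetical values chosen for illustration; this is not the restricted temporal decomposition proposed here, only a demonstration of why unconstrained weights can be a problem.

```python
import math

def is_valid_lsf(lsf):
    """Return True if 0 < w_1 < w_2 < ... < w_p < pi (LSF ordering property)."""
    bounded = all(0.0 < w < math.pi for w in lsf)
    ascending = all(a < b for a, b in zip(lsf, lsf[1:]))
    return bounded and ascending

def combine(targets, weights):
    """Weighted sum of target vectors (one weight per target)."""
    order = len(targets[0])
    return [sum(w * t[i] for w, t in zip(weights, targets)) for i in range(order)]

# Hypothetical 4th-order LSF target vectors in radians (illustrative values only).
target_a = [0.30, 0.60, 1.20, 2.00]
target_b = [0.50, 1.50, 1.60, 2.50]

# Non-negative weights summing to one keep the result ordered.
convex = combine([target_a, target_b], [0.4, 0.6])

# Unconstrained weights (one of them negative) can break the ordering.
unconstrained = combine([target_a, target_b], [1.8, -0.8])

print(is_valid_lsf(convex))
print(is_valid_lsf(unconstrained))
```

With these values the convex combination remains a valid LSF vector, while the unconstrained combination does not, which suggests why some restriction on the decomposition is needed when LSF parameters are used.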
As an application of the implemented VLBR speech coder, an automa...