2.8. The description of EP 685 includes, in so far as relevant here, among other things the following:
[0001]Embodiments of the present invention relate to the field of communications technologies, and in particular, to a method for predicting a bandwidth extension frequency band signal, and a decoding device.
[0002]In the field of digital communications, there are extremely widespread application requirements for voice, picture, audio, and video transmission, such as a phone call, an audio and video conference, broadcast television, and multimedia entertainment. To reduce a resource occupied in a process of storing or transmitting an audio and video signal, an audio and video compression and encoding technology comes into existence. (…)
[0003]An increasing emphasis is placed on audio quality in communication transmission; therefore, there is a need to increase quality of a music signal as much as possible on a premise that voice quality is ensured. Meanwhile, the amount of information of an audio signal is extremely rich; therefore, a code excited linear prediction (Code Excited Linear Prediction, CELP for short) encoding mode of conventional voice cannot be adopted; instead, generally, to process the audio signal, a time domain signal is converted into a frequency domain signal by using an audio encoding technology of domain transformation encoding, thereby enhancing encoding quality of the audio signal.
[0004]In an existing audio encoding technology, generally, by adopting a transformation technology, such as a fast Fourier transform (Fast Fourier Transform, FFT for short) or a modified discrete cosine transform (Modified Discrete Cosine Transform, MDCT for short) or a discrete cosine transform (Discrete Cosine Transform, DCT for short), a high frequency band signal in an audio signal is converted from a time domain signal to a frequency domain signal, and then, the frequency domain signal is encoded.
[0005]In the case of a low bit rate, limited quantization bits cannot quantize all to-be-quantized audio signals; therefore, an encoding device uses most bits to precisely quantize relatively important low frequency band signals in audio signals, that is, quantization parameters of the low frequency band signals occupy most bits, and only a few bits are used to roughly quantize and encode high frequency band signals in the audio signals to obtain frequency envelopes of the high frequency band signals. Then, the frequency envelopes of the high frequency band signals and the quantization parameters of the low frequency band signals are sent to a decoding device in a form of a bitstream. The quantization parameters of the low frequency band signals may include excitation signals and frequency envelopes. When being quantized, the low frequency band signals may first also be converted from time domain signals to frequency domain signals, and then, the frequency domain signals are quantized and encoded into excitation signals.
[0006]Generally, the decoding device may restore the low frequency band signals according to the quantization parameters that are of the low frequency band signals and in the received bitstream, then acquire the excitation signals of the low frequency band signals according to the low frequency band signals, predict excitation signals of the high frequency band signals by using a bandwidth extension (band width extension, BWE for short) technology and a spectrum filling technology and according to the excitation signals of the low frequency band signals, and modify the predicted excitation signals of the high frequency band signals according to the frequency envelopes that are of the high frequency band signals and in the bitstream, to obtain the predicted high frequency band signals. Herein, the obtained high frequency band signals are frequency domain signals. In the BWE technology, a highest frequency bin to which a bit is allocated may be a highest frequency bin to which an excitation signal is decoded, that is, no excitation signal is decoded on a frequency bin greater than the highest frequency bin.
[0007]A frequency band greater than the highest frequency bin to which a bit is allocated may be referred to as a high frequency band, and a frequency band less than the highest frequency bin to which a bit is allocated may be referred to as a low frequency band. That an excitation signal of a high frequency band signal is predicted according to an excitation signal of a low frequency band signal may be specifically as follows: The highest frequency bin to which a bit is allocated is used as a center, an excitation signal that is of the low frequency band signal and less than the highest frequency bin to which a bit is allocated is copied into a high frequency band signal that is greater than the highest frequency bin to which a bit is allocated and whose bandwidth is equivalent to bandwidth of the low frequency band signal, and the excitation signal is used as the excitation signal of the high frequency band signal.
[0008]The prior art has the following disadvantages:
According to the foregoing method for predicting a bandwidth extension frequency band signal in the prior art, an excitation signal of a high frequency band signal is predicted according to an excitation signal of a low frequency band signal, excitation signals of different low frequency band signals may be copied into a same high frequency band signal in different frames, causing discontinuity of excitation signal and reducing quality of the predicted bandwidth extension frequency band signal, thereby reducing auditory quality of an audio signal.
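The prior-art copying described in [0007] and the discontinuity noted in [0008] can be illustrated with a minimal sketch. It assumes a plain shifted copy of the low band into the bins directly above it; the description's phrase "used as a center" could also be read as a mirrored copy. The function names `copy_low_to_high` and `source_bin` are hypothetical and do not come from EP 685.

```python
def copy_low_to_high(low_excitation, total_bins):
    """Copy the decoded low-band excitation into the band directly above it.

    low_excitation holds bins 0..last_bin, where last_bin is the highest
    frequency bin to which a bit is allocated (assumption: shifted copy).
    """
    width = len(low_excitation)
    out = list(low_excitation) + [0.0] * (total_bins - width)
    # The copied band has the same bandwidth as the low band.
    for k in range(width, min(total_bins, 2 * width)):
        out[k] = low_excitation[k - width]
    return out


def source_bin(target_bin, last_bin):
    """Low-band bin that lands on target_bin under the shifted copy."""
    return target_bin - (last_bin + 1)


# The same high-band bin 9 is fed from different low-band bins when the
# highest allocated bin changes from one frame to the next.
frame_a_source = source_bin(9, last_bin=7)
frame_b_source = source_bin(9, last_bin=5)
```

Because `frame_a_source` and `frame_b_source` differ, bin 9 receives a different part of the low-band excitation in each frame. That is the discontinuity the description attributes to the prior art.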
[0009]EP 2 186 086 A1 discloses a method for spectrum recovery in spectral decoding of an audio signal, includes obtaining of an initial set of spectral coefficients representing the audio signal, and determining a transition frequency. The transition frequency is adapted to a spectral content of the audio signal. Spectral holes in the initial set of spectral coefficients below the transition frequency are noise filled and the initial set of spectral coefficients are bandwidth extended above the transition frequency.
[0010]Embodiments of the present invention provide a method for predicting a bandwidth extension frequency band signal according to claim 1, and a decoding device according to claim 6, so as to improve quality of the predicted bandwidth extension frequency band signal, thereby enhancing auditory quality of an audio signal.
DESCRIPTION OF EMBODIMENTS
[0012]To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are some but not all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
[0013]In the field of digital signal processing, an audio codec and a video codec are widely applied to various electronic devices such as a mobile phone, a wireless apparatus, a personal data assistant (PDA), a handheld or portable computer, a GPS receiver/navigator, a camera, an audio/video player, a camcorder, a videorecorder, and a monitoring device. Generally, this type of electronic device includes an audio coder or an audio decoder, where the audio coder or decoder may be directly implemented by a digital circuit or a chip such as a DSP (digital signal processor), or be implemented by driving, by software code, a processor to execute a process in the software code.
[0014]For example, an audio encoder first performs framing processing on an input signal to obtain time domain data with one frame being 20 ms, then performs windowing processing on the time domain data to obtain a signal after windowing, performs frequency domain transformation on the time domain signal after windowing, to transform the signal from a time domain to a frequency domain, encodes the frequency domain signal, and transmits the encoded frequency domain signal to a decoder side. After receiving a compressed bitstream transmitted by an encoder side, the decoder side performs a corresponding decoding operation on the signal, performs, on a frequency domain signal obtained by decoding inverse transformation corresponding to the transformation used by the encoding end, to transform the signal from frequency domain to time domain, and performs post processing on the time domain signal to obtain a synthesized signal, that is, a signal output by the decoder side.
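The encoder-side sequence in [0014] (framing, windowing, frequency domain transformation) can be sketched roughly as follows. The 20 ms frame length comes from the description. The non-overlapping frames, the sine window, the naive DCT-II and all function names are assumptions chosen for brevity, not details taken from EP 685.

```python
import math


def split_into_frames(samples, sample_rate, frame_ms=20):
    """Cut the input into consecutive frames of frame_ms milliseconds."""
    n = sample_rate * frame_ms // 1000
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def sine_window(frame):
    """Apply a sine window to one frame (assumed window shape)."""
    n = len(frame)
    return [x * math.sin(math.pi * (i + 0.5) / n) for i, x in enumerate(frame)]


def dct_ii(frame):
    """Naive O(N^2) DCT-II, standing in for the codec's actual transform."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(frame))
        for k in range(n)
    ]


# At an assumed 16 kHz sampling rate, one 20 ms frame holds 320 samples.
frames = split_into_frames([0.0] * 800, sample_rate=16000)
spectrum = dct_ii(sine_window(frames[0]))
```

The decoder side would run the inverse transform and post-processing in the reverse order, as the paragraph describes.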
(…)
[0016]As shown in FIG. 1 , the time-frequency transforming module 10 is configured to: receive an input audio signal, and then convert the audio signal from a time domain signal to a frequency domain signal. Then, the envelope extracting module 11 extracts a frequency envelope from the frequency domain signal obtained by a transform by the time-frequency transforming module 10, where the frequency envelope may also be referred to as a sub-band normalization factor. Herein, the frequency envelope includes a frequency envelope of a low frequency band signal and a frequency envelope of a high frequency band signal in the frequency domain signal. The envelope quantizing and encoding module 12 performs quantization and encoding processing on the frequency envelope obtained by the envelope extracting module 11, to obtain a quantized and encoded frequency envelope. The bit allocating module 13 determines a bit allocation of each sub-band according to the quantized frequency envelope. The excitation generating module 14 performs, by using information about the quantized and encoded envelope obtained by the envelope quantizing and encoding module 12, normalization processing on the frequency domain signal obtained by the time-frequency transforming module 10, to obtain an excitation signal, that is, a normalized frequency domain signal, and the excitation signal also includes an excitation signal of the high frequency band signal and an excitation signal of the low frequency band signal. The excitation quantizing and encoding module 15 performs, according to the bit allocation of each sub-band allocated by the bit allocating module 13, quantization and encoding processing on the excitation signal generated by the excitation generating module 14, to obtain a quantized excitation signal. 
The multiplexing module 16 separately multiplexes the quantized frequency envelope quantized by the envelope quantizing and encoding module 12 and the quantized excitation signal quantized by the excitation quantizing and encoding module 15 into a bitstream, and outputs the bitstream to a decoding device.
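The relation between the frequency envelope and the excitation signal in [0016] can be sketched as below. The sketch assumes that the envelope of a sub-band is its root-mean-square value and that normalization is a division by that value. EP 685 only calls the envelope a "sub-band normalization factor", so both choices, as well as the names `extract_envelope` and `normalize`, are illustrative. Quantization and bit allocation are left out.

```python
import math


def extract_envelope(coeffs, band_edges):
    """One envelope value per sub-band (assumption: RMS of the coefficients)."""
    envelope = []
    for start, stop in zip(band_edges, band_edges[1:]):
        band = coeffs[start:stop]
        envelope.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return envelope


def normalize(coeffs, band_edges, envelope):
    """Divide each sub-band by its envelope to obtain the excitation signal."""
    excitation = []
    for (start, stop), env in zip(zip(band_edges, band_edges[1:]), envelope):
        excitation.extend(c / env if env > 0 else 0.0 for c in coeffs[start:stop])
    return excitation


coeffs = [3.0, -3.0, 0.5, 0.5]
edges = [0, 2, 4]                      # two sub-bands of two bins each
envelope = extract_envelope(coeffs, edges)
excitation = normalize(coeffs, edges, envelope)
```

After normalization, every sub-band has roughly unit energy. The excitation therefore carries the fine spectral structure and the envelope carries the level.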
[0017]FIG. 2 is a schematic structural diagram of a decoding device in the prior art. As shown in FIG. 2 , the existing decoding device includes a demultiplexing module 20, a frequency envelope decoding module 21, a bit allocation acquiring module 22, an excitation signal decoding module 23, a bandwidth extension module 24, a frequency domain signal restoration module 25, and a frequency-time transforming module 26.
[0018]As shown in FIG. 2 , the demultiplexing module 20 receives a bitstream sent by a side of an encoding device, and demultiplexes (including decoding) the bitstream to separately obtain a quantized frequency envelope and a quantized excitation signal. The frequency envelope decoding module 21 acquires the quantized frequency envelope from a signal obtained by demultiplexing by the demultiplexing module 20, and perform quantization and decoding to obtain a frequency envelope. The bit allocation acquiring module 22 determines a bit allocation of each sub-band according to the frequency envelope obtained by the frequency envelope decoding module 21. The excitation signal decoding module 23 acquires the quantized excitation signal from the signal obtained by demultiplexing by the demultiplexing module 20, and performs, according to the bit allocation that is of each sub-band and is obtained by the bit allocation acquiring module 22, quantization and decoding to obtain an excitation signal. The bandwidth extension module 24 performs extension on an entire bandwidth according to the excitation signal obtained by the excitation signal decoding module 23. Specifically, an excitation signal of a high frequency band signal is extended by using an excitation signal of a low frequency band signal. When quantizing and encoding an excitation signal and an envelope signal, an excitation quantizing and encoding module 15 and an envelope quantizing and encoding module 12 use most bits to quantize a signal of the relatively important low frequency band signal, and use few bits to quantize a signal of the high frequency band signal, and the excitation signal of the high frequency band signal may even be excluded. Therefore, the bandwidth extension module 24 needs to use the excitation signal of the low frequency band signal to extend the excitation signal of the high frequency band signal, thereby obtaining an excitation signal of an entire frequency band. 
The frequency domain signal restoration module 25 is separately connected to the frequency envelope decoding module 21 and the bandwidth extension module 24, and the frequency domain signal restoration module 25 restores a frequency domain signal according to the frequency envelope obtained by the frequency envelope decoding module 21 and the excitation signal that is of the entire frequency band and is obtained by the bandwidth extension module 24. The frequency-time transforming module 26 converts the frequency domain signal restored by the frequency domain signal restoration module 25 into a time domain signal, thereby obtaining an originally input audio signal.
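The work of the frequency domain signal restoration module 25 in [0018] can be sketched as follows. It assumes that restoration is a per-sub-band multiplication of the full-band excitation by the decoded envelope. The function name `restore_spectrum` and the sub-band layout are hypothetical.

```python
def restore_spectrum(excitation, band_edges, envelope):
    """Scale each sub-band of the excitation by its decoded envelope value."""
    restored = []
    for (start, stop), env in zip(zip(band_edges, band_edges[1:]), envelope):
        restored.extend(x * env for x in excitation[start:stop])
    return restored


# Full-band excitation (decoded low band plus extended high band) and
# one envelope value per sub-band.
restored = restore_spectrum([1.0, -1.0, 1.0, 1.0], [0, 2, 4], [3.0, 0.5])
```

The frequency-time transforming module 26 would then convert `restored` back into a time domain signal.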
[0019]FIG. 1 and FIG. 2 are structural diagrams of an encoding device and a corresponding decoding device in the prior art. According to processing processes of the encoding device and the decoding device in the prior art shown in FIG. 1 and FIG. 2, it may be learned that in the prior art, an excitation signal and envelope information that are of a low frequency band signal and are used when the decoding device restores a frequency domain signal of the low frequency band signal are sent by a side of the encoding device. Therefore, restoration of the frequency domain signal of the low frequency band signal is relatively accurate. For a frequency domain signal of a high frequency band signal, there is a need to first use the excitation signal of the low frequency band signal to predict an excitation signal of the high frequency band signal, and then use envelope information that is of the high frequency band signal and is sent by the side of the encoding device, to modify the predicted excitation signal of the high frequency band signal, so as to obtain the frequency domain signal of the high frequency band signal. When predicting the frequency domain signal of the high frequency band signal, the encoding device does not consider a signal type and uses a same frequency envelope. For example, when the signal type is a harmonic, a sub-band range covered by the used frequency envelope is relatively narrow (less than a sub-band range covered from a crest to a valley of one harmonic). When the frequency envelope is used to modify the predicted excitation of the high frequency band signal, more noises are brought in, therefore a relatively large error exists between the high frequency band signal obtained by modification and an actual high frequency band signal, severely affecting an accuracy rate of predicting the high frequency band signal, and reducing quality of the predicted high frequency band signal and reducing auditory quality of an audio signal. In addition, by using the foregoing prior art in which an excitation signal of a high frequency band signal is predicted according to an excitation signal of a low frequency band signal, excitation signals of different low frequency band signals may be copied into a same high frequency band signal of different frames, causing discontinuity of excitation, reducing quality of the predicted high frequency band signal, and thereby reducing auditory quality of an audio signal. Therefore, the following technical solutions of embodiments of the present invention may be used to resolve the foregoing technical problem.
[0020]FIG. 3 is a flowchart of a method for predicting a bandwidth extension frequency band signal according to an embodiment of the present invention. In this embodiment, the method for predicting a bandwidth extension frequency band signal may be executed by a decoding device. As shown in FIG. 3, in this embodiment, the method for predicting a bandwidth extension frequency band signal may specifically include the following steps:
100. The decoding device demultiplexes a received bitstream, and decodes the demultiplexed bitstream to obtain a frequency domain signal.
101. The decoding device determines whether a highest frequency bin, to which a bit is allocated, of the frequency domain signal is less than a preset start frequency bin of a bandwidth extension frequency band; when the highest frequency bin to which a bit is allocated is less than the preset start frequency bin of the bandwidth extension frequency band, executes step 102; otherwise, when the highest frequency bin to which a bit is allocated is greater than or equal to the preset start frequency bin of the bandwidth extension frequency band, executes step 103.
102. The decoding device predicts an excitation signal of the bandwidth extension frequency band according to an excitation signal within a predetermined frequency band range of the frequency domain signal and the preset start frequency bin of the bandwidth extension frequency band, and executes step 104.
103. The decoding device predicts the excitation signal of the bandwidth extension frequency band according to the excitation signal within the predetermined frequency band range of the frequency domain signal, the preset start frequency bin of the bandwidth extension frequency band, and the highest frequency bin to which a bit is allocated, and executes step 104.
104. The decoding device predicts the bandwidth extension frequency band signal according to the predicted excitation signal of the bandwidth extension frequency band and a frequency envelope of the bandwidth extension frequency band.
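The branching in steps 101 to 103 can be sketched as follows. The passage quoted here does not say how the prediction inside steps 102 and 103 is carried out, so the two predictors are passed in as parameters and the ones used below are placeholders. All names are hypothetical and are not taken from EP 685.

```python
def predict_bwe_excitation(excitation, last_bin, bwe_start,
                           predict_step_102, predict_step_103):
    """Choose between steps 102 and 103 using the comparison of step 101."""
    if last_bin < bwe_start:
        # Step 102: uses the excitation and the preset start frequency bin.
        return predict_step_102(excitation, bwe_start)
    # Step 103: additionally uses the highest bin to which a bit is allocated.
    return predict_step_103(excitation, bwe_start, last_bin)


# Placeholder predictors that only report which branch was taken.
def stub_102(excitation, bwe_start):
    return ("step 102", bwe_start)


def stub_103(excitation, bwe_start, last_bin):
    return ("step 103", bwe_start, last_bin)


branch_low = predict_bwe_excitation([0.1] * 8, 5, 6, stub_102, stub_103)
branch_high = predict_bwe_excitation([0.1] * 8, 7, 6, stub_102, stub_103)
```

Step 104 would then combine the returned excitation with the frequency envelope of the bandwidth extension frequency band.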
[0021]According to the method for predicting a bandwidth extension frequency band signal in this embodiment, a start frequency bin of bandwidth extension is set, and a highest frequency bin to which a frequency domain signal is decoded and the start frequency bin are compared, to perform excitation restoration of a bandwidth extension frequency band, so that extended excitation signals are continuous between frames, and a frequency bin of a decoded excitation signal is maintained, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an output audio signal.
(…)
[0028]By using the method for predicting a bandwidth extension frequency band signal in the foregoing embodiment, continuity of predicted excitation signals that are of a bandwidth extension frequency band signal and between a former frame and a latter frame can be effectively ensured, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an audio signal.
[0029]FIG. 4 is a flowchart of a method for predicting a bandwidth extension frequency band signal according to another embodiment of the present invention. On the basis of the embodiment shown in FIG. 3 , in this embodiment, the technical solutions of the present invention are introduced in more details in the method for predicting a bandwidth extension frequency band signal. In this embodiment, the method for predicting a bandwidth extension frequency band signal may specifically include the following content:
200. A decoding device receives a bitstream sent by an encoding device, and decodes the received bitstream to obtain a frequency domain signal.
The bitstream carries a quantization parameter of a low frequency band signal and a frequency envelope of the bandwidth extension frequency band signal.
201. The decoding device acquires an excitation signal of the low frequency band signal according to the quantization parameter of the low frequency band signal.
202. The decoding device determines a highest frequency flast_sfm, on which a bit is allocated, of the frequency domain signal according to the quantization parameter of the low frequency band signal.
In this embodiment, the flast_sfm is used to represent the highest frequency bin, to which a bit is allocated, of the frequency domain signal.
203. The decoding device determines whether the flast_sfm is less than a preset start frequency fbwe_start of a bandwidth extension frequency band of the frequency domain signal; when the flast_sfm is less than the fbwe_start, execute step 204; otherwise, and when the flast_sfm is greater than or equal to the fbwe_start, execute step 205.
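Steps 202 and 203 can be illustrated with a small sketch. It assumes that the quantization parameter yields a per-sub-band bit allocation and that flast_sfm is the last bin of the highest sub-band that received any bits. The description states only that flast_sfm is determined "according to the quantization parameter", so this derivation and the function name `highest_allocated_bin` are assumptions.

```python
def highest_allocated_bin(bits_per_band, band_edges):
    """Last bin of the highest sub-band with a non-zero bit allocation.

    Returns -1 when no sub-band received any bits.
    """
    for band in range(len(bits_per_band) - 1, -1, -1):
        if bits_per_band[band] > 0:
            return band_edges[band + 1] - 1
    return -1


edges = [0, 4, 8, 12, 16, 20]          # five sub-bands of four bins each
f_bwe_start = 16                        # preset start bin (illustrative value)

f_last_sfm = highest_allocated_bin([12, 8, 0, 3, 0], edges)
next_step = 204 if f_last_sfm < f_bwe_start else 205
```

With this allocation the highest sub-band with bits ends at bin 15, which is below the illustrative start bin 16, so step 204 would follow.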
(…)
[0057]By using the technical solutions of the foregoing embodiment, continuity of predicted excitation signals that are of a bandwidth extension frequency band signal and between a former frame and a latter frame can be effectively ensured, thereby ensuring auditory quality of a restored bandwidth extension frequency band signal and enhancing auditory quality of an audio signal.
(…)
(…)
[0075]Functions of the decoding device shown in FIG. 2 may be adjusted according to the foregoing function modules, to obtain an example diagram of the decoding device in this embodiment of the present invention. Details are not described herein again.
(…)
[0081]The described apparatus embodiment is merely exemplary. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on at least two network units. Some or all of the modules may be selected according to an actual need to achieve the objectives of the solutions of the embodiments. A person of ordinary skill in the art may understand and implement the embodiments of the present invention without creative efforts.”