Frame of a voiced signal: frame_voiced.bin
Frame of an unvoiced signal: frame_unvoiced.bin
(raw data without header, fs = 16000 Hz; to load the data into MATLAB, use the function loadbin.m)
Compute the parameters of an autoregressive model of order p = 16 for the unvoiced speech frame frame_unvoiced.bin, i.e. the autoregressive coefficients a_k and the power of the prediction error E_p. To compute these parameters, use one of the MATLAB functions lpc, aryule, or arburg.
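The step above might be sketched as follows. This is only a minimal illustration: a synthetic noise frame stands in for frame_unvoiced.bin (an assumption, so that the snippet runs without the data file), and the functions aryule, lpc, and arburg require the Signal Processing Toolbox.

```matlab
% Sketch: AR(16) parameters of one frame (Signal Processing Toolbox needed).
% ASSUMPTION: a synthetic noise frame stands in for frame_unvoiced.bin;
% replace x with the frame loaded by loadbin.m.
fs = 16000;                                % sampling frequency [Hz]
p  = 16;                                   % order of the AR model
rng(0);                                    % reproducible stand-in data
x  = filter(1, [1 -0.9], randn(480, 1));   % 30 ms stand-in frame

[a, Ep] = aryule(x, p);   % a = [1 a_1 ... a_p], Ep = prediction error power

% The other two functions can be called with the same arguments:
% [a, Ep] = lpc(x, p);
% [a, Ep] = arburg(x, p);
```

All three functions return the coefficient vector with a leading 1, so the vector can be passed directly as the denominator of filter.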
For the voiced frame, compute the pitch, i.e. the fundamental frequency f_0, the fundamental period T_0, and the period in samples L_0. Use the procedure from the last seminar.
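The three pitch values are related by T_0 = L_0/fs and f_0 = fs/L_0. The sketch below estimates L_0 from the autocorrelation maximum; this is a common simple method and not necessarily the procedure from the last seminar. The search range of 60 to 400 Hz and the synthetic stand-in frame are assumptions, and xcorr requires the Signal Processing Toolbox.

```matlab
% Sketch: pitch from the autocorrelation maximum.
% ASSUMPTIONS: synthetic stand-in frame, search range 60-400 Hz.
fs = 16000;                          % sampling frequency [Hz]
x  = zeros(480, 1);
x(1:128:end) = 1;                    % pulse train with a period of 128 samples
x  = filter(1, [1 -0.9], x);         % crude stand-in for a voiced frame

[r, lags] = xcorr(x);                % autocorrelation for all lags
r    = r(lags >= 0);                 % r(k+1) corresponds to lag k
Lmin = round(fs / 400);              % shortest period searched [samples]
Lmax = round(fs / 60);               % longest period searched [samples]
[~, idx] = max(r(Lmin+1 : Lmax+1));  % strongest peak inside the range

L0 = Lmin + idx - 1;                 % period in samples
T0 = L0 / fs;                        % period in seconds
f0 = fs / L0;                        % fundamental frequency in Hz
```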
Implementation of an LPC encoder for a long signal
Compute the pitch for all short-time frames of a longer signal, including the detection of unvoiced and non-speech frames (see the last seminar).
For the given signal, compute the parameters of the AR model for all short-time frames (autoregressive coefficients and power of the prediction error) and save them in a matrix in which each row holds the AR-model parameters of one short-time frame.
Use a frame length of 30 ms without overlap.
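A minimal sketch of this frame-by-frame analysis is shown below. The row layout [a_0 ... a_p, E_p] and the synthetic one-second signal are assumptions made for illustration; aryule requires the Signal Processing Toolbox.

```matlab
% Sketch: AR parameters for all non-overlapping 30 ms frames.
% ASSUMPTIONS: synthetic stand-in signal, row layout [a_0 ... a_p, Ep].
fs   = 16000;                               % sampling frequency [Hz]
p    = 16;                                  % order of the AR model
rng(0);
x    = filter(1, [1 -0.9], randn(fs, 1));   % 1 s stand-in signal

wlen    = round(0.030 * fs);                % frame length in samples (480)
nFrames = floor(length(x) / wlen);          % incomplete last frame is dropped
P       = zeros(nFrames, p + 2);            % one row per frame

for i = 1:nFrames
    seg = x((i-1)*wlen + (1:wlen));         % i-th frame, no overlap
    [a, Ep] = aryule(seg, p);               % AR coefficients and error power
    P(i, :) = [a, Ep];
end
```

Appending one more column for f_0 would give the full row of encoded parameters described for the homework.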
Result of HOMEWORK:
A function for signal encoding with a general short-time frame length and segmentation step, for a general sampling frequency at the input. The output should be a matrix of encoded parameters, i.e. each row of the output matrix contains the vector of autoregressive coefficients a_k, the power of the prediction error E_p, and the pitch of the signal f_0 (set f_0 = 10 for unvoiced frames and f_0 = 0 for non-speech frames).
Decoding of one short-time frame on the basis of AR model (LPC)
Create a noise-based excitation for decoding the unvoiced speech frame, i.e. Gaussian white noise with zero mean and power equal to one.
Decode the unvoiced speech frame using the AR model and the given excitation.
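The two steps above could look roughly like this. The sketch assumes that the synthesis filter is sqrt(E_p)/A(z), so that a unit-power excitation restores the level of the original frame, and it again uses a synthetic stand-in for frame_unvoiced.bin.

```matlab
% Sketch: decoding of one unvoiced frame with a noise excitation.
% ASSUMPTIONS: synthetic stand-in frame, synthesis filter sqrt(Ep)/A(z).
p = 16;
rng(0);
x = filter(1, [1 -0.9], randn(480, 1));   % stand-in for the unvoiced frame
[a, Ep] = aryule(x, p);                   % "encoded" parameters of the frame

N = length(x);
e = randn(N, 1);                          % Gaussian white noise
e = e - mean(e);                          % force zero mean for this realization
e = e / sqrt(mean(e.^2));                 % force power equal to one

y = filter(sqrt(Ep), a, e);               % artificially generated frame
```

The normalization of e is done per realization; without it the mean and power are only approximately 0 and 1.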
Result: for the given unvoiced frame frame_unvoiced.bin observe:
- time and frequency representations of the following signals: original signal, excitation, artificially generated signal.
Create a pulse-based excitation for decoding the voiced speech frame, i.e. periodically repeated pulses with a spacing of L_0 samples and power equal to one.
Create an artificial voiced speech frame by filtering the prepared excitation with the AR model.
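A possible sketch of the pulse excitation follows. With one pulse per L_0 samples, an amplitude of sqrt(L_0) gives a mean power of one over a whole number of periods. The value L_0 = 128, the stand-in voiced frame, and the gain sqrt(E_p) are assumptions for illustration.

```matlab
% Sketch: decoding of one voiced frame with a pulse excitation.
% ASSUMPTIONS: L0 = 128, synthetic stand-in frame, gain sqrt(Ep).
p  = 16;
L0 = 128;                                 % pitch period in samples
N  = 480;                                 % frame length (30 ms at 16 kHz)
rng(0);
src = zeros(N, 1);
src(1:L0:end) = 1;
x = filter(1, [1 -1.3 0.8], src) + 0.01*randn(N, 1);   % stand-in voiced frame
[a, Ep] = aryule(x, p);                   % "encoded" parameters of the frame

e = zeros(N, 1);
e(1:L0:N) = sqrt(L0);                     % unit mean power over whole periods

y = filter(sqrt(Ep), a, e);               % artificially generated frame
```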
Result: for the given voiced speech frame frame_voiced.bin observe:
- time and frequency representations of the following signals: original signal, excitation, artificially generated signal.
LPC decoder for long signal
Decode the signal from the saved parameters of the AR model. Be careful and take the following issues into account:
Filtering should always be performed without overlap, over a length equal to the segmentation step of the encoding process.
Do not forget to keep the initial conditions for the filtering of successive frames, i.e. use the 4th input and the 2nd output parameter of the function filter as well.
Use an artificial excitation based on the f_0 saved for the particular short-time frames.
When generating the excitation for successive frames, keep the fundamental period across the frames (i.e. the first pulse cannot always be placed at the first sample of the short-time frame).
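The points above can be combined as in the following sketch. The small parameter matrix P is hypothetical (order p = 2 instead of 16, row layout [a_0 ... a_p, E_p, f_0]), and treating every frame with f_0 <= 10 as noise-excited is an assumption about how the unvoiced and non-speech codes are decoded.

```matlab
% Sketch: frame-by-frame decoding with filter state and pulse phase kept.
% ASSUMPTIONS: hypothetical P with rows [a_0 ... a_p, Ep, f0], p = 2,
% frames with f0 <= 10 (unvoiced or non-speech) get a noise excitation.
fs = 16000;  p = 2;  step = 480;          % step = segmentation step [samples]
P = [1 -1.3  0.8  1e-2 125;               % voiced, 125 Hz
     1 -1.3  0.8  1e-2 125;               % voiced, 125 Hz
     1 -0.5  0.1  1e-3  10;               % unvoiced
     1 -0.2  0.05 1e-6   0];              % non-speech
nFrames = size(P, 1);
y   = zeros(nFrames*step, 1);             % decoded signal
exc = zeros(nFrames*step, 1);             % excitation, kept for inspection
zi  = zeros(p, 1);                        % filter state between frames
nextPulse = 1;                            % next pulse position in the frame
rng(0);

for i = 1:nFrames
    a = P(i, 1:p+1);  Ep = P(i, p+2);  f0 = P(i, p+3);
    e = zeros(step, 1);
    if f0 > 10                            % voiced frame: pulse train
        L0  = round(fs / f0);
        pos = nextPulse:L0:step;
        e(pos) = sqrt(L0);
        if isempty(pos)
            nextPulse = nextPulse - step; % pulse falls into a later frame
        else
            nextPulse = pos(end) + L0 - step;
        end
    else                                  % noise excitation
        e = randn(step, 1);
        nextPulse = 1;                    % restart pulses at next voiced frame
    end
    [seg, zi] = filter(sqrt(Ep), a, e, zi);   % 4th input, 2nd output = state
    idx = (i-1)*step + (1:step);
    y(idx)   = seg;
    exc(idx) = e;
end
```

In this sketch the pulse spacing of 128 samples continues unbroken from the first voiced frame into the second, because nextPulse carries the remaining distance over the frame boundary.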
Result: the signal decoded from the previously computed matrix of encoded parameters for the utterance SA106S06.CS0. Observe the waveform and spectrogram of the original and decoded signals.
Try to use the noise excitation only (i.e. f_0 = 0 for all short-time frames).
Try to change (i.e. scale) f_0 in voiced frames (f_0 = scale*f_0), where scale is a multiplicative constant in the range 0.8 to 1.5.