BE2M31ZRE seminar - LPC vocoder

REQUIRED HOMEWORK: (5 points)

LPC-based encoding of a short-time speech frame
- Compute the parameters of an autoregressive (AR) model of order p=16 for the voiced and unvoiced speech frames frame_voiced.bin and frame_unvoiced.bin, i.e. the autoregressive coefficients a_k and the power of the prediction error E_p. To compute these parameters, use the MATLAB functions lpc, aryule, or arburg.
- Using the procedure from the previous seminars, compute the pitch (fundamental frequency) f_0 of the voiced frame (for the unvoiced frame, set the pitch equal to 1).
- Save the computed values into one vector encv in the following order: encv = [ f_0, a_0, a_1, a_2, ...., a_p, E_p ] ;
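
The parameter computation above can be sketched outside MATLAB as well. The following is a minimal Python illustration (not the course's reference solution) of the autocorrelation (Yule-Walker) method solved with the Levinson-Durbin recursion, which is the approach behind aryule. The function names autocorr, levinson and encode_frame are hypothetical, and the sign convention e[n] = x[n] + a_1*x[n-1] + ... + a_p*x[n-p] with a_0 = 1 is assumed so that the coefficients line up with the encv layout.

```python
import random


def autocorr(x, maxlag):
    """Biased autocorrelation estimate r[0..maxlag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(maxlag + 1)]


def levinson(r, p):
    """Levinson-Durbin recursion.

    Returns (a, err) with a = [1, a_1, ..., a_p] and err = E_p, using the
    convention e[n] = x[n] + a_1*x[n-1] + ... + a_p*x[n-p].
    """
    a = [1.0] + [0.0] * p
    err = r[0]
    for m in range(1, p + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        refl = -acc / err                  # reflection coefficient k_m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + refl * prev[m - k]
        a[m] = refl
        err *= 1.0 - refl * refl           # E_m = E_{m-1} * (1 - k_m^2)
    return a, err


def encode_frame(frame, f0, p=16):
    """Build encv = [f_0, a_0, ..., a_p, E_p] for one frame."""
    a, e_p = levinson(autocorr(frame, p), p)
    return [f0] + a + [e_p]


if __name__ == "__main__":
    rng = random.Random(0)
    frame = [rng.gauss(0.0, 1.0) for _ in range(480)]  # stand-in for a real frame
    encv = encode_frame(frame, 1.0)
    print(len(encv))  # 19 values: f_0, a_0..a_16, E_p
```

For p=16 the resulting vector has p+3 = 19 elements. The values should be close to those of lpc or aryule for the same frame, but this has not been checked against MATLAB here.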
Implementation of an LPC encoder for a long signal
- Compute the pitch for all short-time frames of a longer signal, including the detection of unvoiced and non-speech frames.
- Use a frame length of 30 ms without overlap for the computation of the pitch and the parameters of the AR model.
- Save the computed parameters into a matrix encv whose rows contain the above-mentioned parameters of the encoded signal for the individual short-time frames, in the following format:
  encv = [ f_0_1, a_0_1, a_1_1, a_2_1, ...., a_p_1, E_p_1 ;
           f_0_2, a_0_2, a_1_2, a_2_2, ...., a_p_2, E_p_2 ;
           ....
           f_0_i, a_0_i, a_1_i, a_2_i, ...., a_p_i, E_p_i ;
           ....
           f_0_L, a_0_L, a_1_L, a_2_L, ...., a_p_L, E_p_L ; ] ;
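
The segmentation and per-frame pitch decision can be sketched as follows. This is a hedged Python illustration only: the pitch procedure from the previous seminars is not reproduced on this page, so a plain autocorrelation peak search is assumed in its place. The function names and the thresholds fmin, fmax, voiced_thr and energy_thr are hypothetical, and the signal is assumed to be scaled to roughly [-1, 1). The coding f_0 = 10 for unvoiced and f_0 = 0 for non-speech frames follows the deliverables described on this page.

```python
import math


def split_frames(x, fs, frame_ms=30.0, step_ms=30.0):
    """Cut x into frames of frame_ms taken every step_ms (equal = no overlap)."""
    wlen = int(round(fs * frame_ms / 1000.0))
    step = int(round(fs * step_ms / 1000.0))
    return [x[i:i + wlen] for i in range(0, len(x) - wlen + 1, step)]


def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0,
                   voiced_thr=0.3, energy_thr=1e-4):
    """Return f_0 in Hz, 10.0 for an unvoiced frame, 0.0 for non-speech.

    All thresholds are illustrative assumptions, not course-given values.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]
    r0 = sum(v * v for v in x)
    if r0 / n < energy_thr:                # too quiet: treat as non-speech
        return 0.0
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), n - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / r0
        if r > best_val:
            best_lag, best_val = lag, r
    if best_val < voiced_thr:              # no clear periodicity: unvoiced
        return 10.0
    return fs / best_lag


if __name__ == "__main__":
    fs = 16000
    sig = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(fs // 10)]
    for fr in split_frames(sig, fs):
        print(pitch_autocorr(fr, fs))
```

Each returned value would become the first element of the corresponding encv row; the remaining elements come from the AR analysis of the same frame.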

HOMEWORK deliverables:

A function (script) for signal encoding that allows the short-time frame length, the segmentation step, and the sampling frequency to be set at the input. The output should be a matrix of encoded parameters in which each row contains the pitch f_0, the autoregressive coefficients a_k, and the power of the prediction error E_p of one frame. Set f_0 = 10 for unvoiced frames and f_0 = 0 for non-speech frames.
Parameters of short frames: save the parameters of the frames stored in the files frame_voiced.bin and frame_unvoiced.bin into the variables encv_voiced and encv_unvoiced. The files contain signals as raw data without a header, 16-bit signed integer, little-endian, fs=16000 Hz; to load them into MATLAB, use the function loadbin.m.
Parameters of longer signals: compute the parameters of the signals SA106S06.CS0 and T2EYYYS1.CS0 (your utterances from the database zreratdb, where YYY is your personal code given to you at the first seminar, available in zrerat_blocken_2025_cs0.zip) and save the computed matrices as the variables encv_SA106S06 and encv_T2EYYYS1. All signals are raw data without a header, 16-bit signed integer, little-endian, fs=16000 Hz.
Via the WEB interface at Moodle FEE (authorized access), deliver a *.zip archive containing the created function (script) for signal encoding and the *.mat file with the computed parameters of the given signals. All 4 variables mentioned above can be saved into a mat-file e.g. by the command save task2.mat encv_voiced encv_unvoiced encv_SA106S06 encv_T2EYYYS1 ;
The homework deadline is Mon 17.3.2025, 16:00.
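
The raw signal format used by the homework files (no header, 16-bit signed integer, little-endian) can also be read without loadbin.m. A minimal Python sketch follows; the function name load_raw_int16_le and its normalize option are hypothetical, and whether loadbin.m itself rescales the samples is not stated on this page.

```python
import struct


def load_raw_int16_le(path, normalize=False):
    """Read a headerless file of 16-bit signed little-endian samples.

    With normalize=True the samples are divided by 32768 so that they
    fall into [-1, 1); otherwise the integer values are returned.
    """
    with open(path, "rb") as f:
        data = f.read()
    n = len(data) // 2                      # ignore a trailing odd byte
    samples = struct.unpack("<%dh" % n, data[:2 * n])
    if normalize:
        return [s / 32768.0 for s in samples]
    return list(samples)
```

A call such as load_raw_int16_le("frame_voiced.bin") would return the samples of one frame; at fs=16000 Hz, a 30 ms frame corresponds to 480 samples.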

Tasks to be done at the seminar:

Decoding of one short-time frame on the basis of the AR model (LPC)
- Create a noise-based excitation for decoding the unvoiced speech frame, i.e. Gaussian white noise with zero mean and power equal to one.
- Decode the unvoiced speech frame using the AR model and this excitation.
- Result: for the given unvoiced frame frame_unvoiced.bin observe:
  - time and frequency representations of the following signals: original signal, excitation, artificially generated signal.
- Create a pulse-based excitation for decoding the voiced speech frame, i.e. periodically repeated pulses with distance equal to L_0 and power equal to one.
- Create an artificial voiced speech frame by filtering the prepared excitation with the AR model.
- Result: for the given voiced speech frame frame_voiced.bin observe:
  - time and frequency representations of the following signals: original signal, excitation, artificially generated signal.
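
The two excitations and the all-pole synthesis filter described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the seminar's MATLAB solution: the function names are hypothetical, the noise is rescaled so that the frame has exactly zero mean and unit power, L_0 is taken as an integer number of samples, and the gain sqrt(E_p) is assumed so that the decoded frame has roughly the power of the original. In MATLAB terms the last step corresponds to filter(sqrt(E_p), a, e).

```python
import math
import random


def noise_excitation(n, seed=None):
    """Gaussian white noise of length n, forced to zero mean and unit power."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(e) / n
    e = [v - mean for v in e]
    g = 1.0 / math.sqrt(sum(v * v for v in e) / n)
    return [v * g for v in e]


def pulse_excitation(n, l0):
    """Pulses every l0 samples, scaled so that the mean power equals one."""
    positions = range(0, n, l0)
    g = math.sqrt(n / len(positions))
    e = [0.0] * n
    for i in positions:
        e[i] = g
    return e


def allpole_filter(a, gain, e):
    """y[n] = gain*e[n] - sum_{k=1..p} a[k]*y[n-k], assuming a[0] == 1."""
    p = len(a) - 1
    y = []
    for n, v in enumerate(e):
        acc = gain * v
        for k in range(1, min(p, n) + 1):
            acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def decode_frame(encv, n, fs, voiced):
    """Decode one frame from encv = [f_0, a_0, ..., a_p, E_p]."""
    f0, a, e_p = encv[0], encv[1:-1], encv[-1]
    if voiced:
        exc = pulse_excitation(n, int(round(fs / f0)))
    else:
        exc = noise_excitation(n)
    return exc, allpole_filter(a, math.sqrt(e_p), exc)
```

The returned excitation and decoded frame can then be plotted next to the original frame in the time and frequency domains.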
LPC decoder for a long signal
- Decode the signal from the saved parameters of the AR model. Take care of the following issues:
  1. Filtering should always be done without overlap, over a length equal to the segmentation step of the encoding process.
  2. Do not forget to keep the initial conditions for the filtering of successive frames, i.e. also use the 4th input and the 2nd output parameter of the function filter.
  3. Use an artificial excitation based on the f_0 saved for the individual short-time frames.
  4. When generating the excitation for successive frames, keep the basic period across frames (i.e. the first pulse cannot always be placed at the first sample of the short-time frame).
- Result: the decoded signal from the previously computed matrix of encoded parameters for the utterance SA106S06.CS0; observe the waveform and spectrogram of the original and decoded signals.
- Try to use noise excitation only (i.e. f_0 = 0 for all short-time frames).
- Try to change (i.e. scale) f_0 in voiced frames (f_0 = scale*f_0), where scale is a multiplicative constant in the range 0.8 - 1.5.
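
The points above (frame-by-frame filtering, preserved filter state, pulse phase carried across frames, optional f_0 scaling) can be put together in one loop. The following Python sketch is only one possible arrangement and makes several assumptions: the name decode and the constant NOISE_F0_MAX are hypothetical, frames with f_0 <= 10 (unvoiced or non-speech) are driven by noise, the filter memory is kept as the last p output samples rather than in the state format that MATLAB's filter uses for its zi/zf arguments, and sqrt(E_p) is used as the gain.

```python
import math
import random
from collections import deque

NOISE_F0_MAX = 10.0  # assumption: f_0 <= 10 marks unvoiced (10) or non-speech (0)


def decode(encv, fs, step, scale=1.0, seed=0):
    """Decode rows [f_0, a_0, ..., a_p, E_p]; each row yields `step` samples."""
    rng = random.Random(seed)
    out = []
    state = None       # last p output samples, newest first (filter memory)
    next_pulse = 0     # offset of the next pulse inside the coming frame
    for row in encv:
        f0, a, e_p = row[0], row[1:-1], row[-1]
        ak = a[1:]                          # a_1..a_p (a_0 is assumed to be 1)
        if state is None:
            state = deque([0.0] * len(ak), maxlen=len(ak))
        gain = math.sqrt(e_p)
        if f0 > NOISE_F0_MAX:               # voiced frame: pulse train
            l0 = max(1, int(round(fs / (scale * f0))))
            exc = [0.0] * step
            i = next_pulse
            while i < step:
                exc[i] = math.sqrt(l0)      # unit mean power over one period
                i += l0
            next_pulse = i - step           # keep the period across frames
        else:                               # unvoiced or non-speech: noise
            exc = [rng.gauss(0.0, 1.0) for _ in range(step)]
            next_pulse = 0
        for v in exc:
            y = gain * v - sum(c * s for c, s in zip(ak, state))
            state.appendleft(y)             # memory survives the frame boundary
            out.append(y)
    return out
```

For the frame length used in the homework, step would be 480 samples (30 ms at fs=16000 Hz). Passing scale=0.8 or scale=1.5 changes the pitch of voiced frames only, and replacing the first column of encv by zeros gives the noise-only variant.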

Other signals for processing

longer utterances SA106S06.CS0, SA176S01.CS0, SA002S02.CS0, SA107S06.CS0, SA110S06.CS0, SA114S06.CS0,
(raw data without header, fs=16000 Hz),
your recordings from the database zreratdb, see zrerat_blocken_2025_cs0.zip or "K:\VYUKA\ZRE\signaly\zreratdb",
on-line recorded signals (fs 16 kHz); read e.g. the following sentences:
    "Six sisters sing a song of Sebastian Smith."
    "David and Robin drive their big car."
    "One - zero - three - seven - nine."
