Cepstrum, cepstral distance, voice activity detection

**Tasks to do:**

**Cepstrum of one speech frame**

- Load the signal *muz1-AA-frame.CS0* into the MATLAB workspace (raw data without header, fs = 16000 Hz; use the function loadbin.m to load it).
- Compute the cepstrum of this frame using the following techniques:

  - DFT-based real cepstrum: function *rceps*

  - LPC cepstrum: functions *lpc*, *a2c.m*, *a2c0.m*

  - MFCC (mel-frequency cepstral coefficients), computed as the DCT of the logarithm of the mel spectrum obtained from a mel-frequency filter bank: functions melbf.m, mel.m, melinv.m

- 1st checked result: for **one short-time frame** muz1-AA-frame.CS0 display:

  - coefficients **c[0]-c[12]** for the **real, LPC and MFCC cepstrum**
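
As a minimal sketch of the DFT-based technique listed above: the real cepstrum is the inverse DFT of the log magnitude spectrum, which is the expression that rceps evaluates. The frame used here is synthetic (a pulse train passed through an illustrative all-pole filter), because the calling convention of loadbin.m is not given in this text. The 400-sample length, the filter coefficients and the eps floor are assumptions.

```matlab
% Synthetic stand-in for one speech frame (assumption: the real frame
% would be read from muz1-AA-frame.CS0 with the course function loadbin.m).
fs = 16000;                          % sampling frequency stated in the task
N  = 400;                            % 25 ms at 16 kHz (assumed frame length)
n  = (0:N-1)';
e  = double(mod(n, 100) == 0);       % pulse train, 160 Hz "pitch" (illustrative)
x  = filter(1, [1 -1.3 0.8], e);     % stable all-pole filter (illustrative)
xw = x .* (0.54 - 0.46*cos(2*pi*n/(N-1)));   % Hamming window

% Real cepstrum: inverse DFT of the log magnitude spectrum.
% The eps floor only guards against log(0).
X      = fft(xw);
logmag = log(max(abs(X), eps));
c      = real(ifft(logmag));

% MATLAB indexing starts at 1, so c(1) is c[0] and c(13) is c[12].
c_real = c(1:13);
disp(c_real.');
```

If the Signal Processing Toolbox is available, rceps(xw) should give the same vector as c, since it evaluates the same expression without the eps floor.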

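The LPC cepstrum can be obtained from the predictor coefficients by a standard recursion, without evaluating any spectrum. The sketch below solves the autocorrelation normal equations directly instead of calling lpc, and writes the recursion out instead of calling a2c.m or a2c0.m, whose interfaces are not described here. The synthetic frame, the order p = 6 (borrowed from the longer-utterance task) and the convention c[0] = ln G, with G squared equal to the mean prediction error power, are assumptions.

```matlab
% Synthetic frame (assumption, stands in for muz1-AA-frame.CS0).
N  = 400;
n  = (0:N-1)';
x  = filter(1, [1 -1.3 0.8], double(mod(n, 100) == 0));
xw = x .* (0.54 - 0.46*cos(2*pi*n/(N-1)));   % Hamming window

p  = 6;                              % LPC order (assumed for this frame)
nc = 12;                             % highest cepstral index

% Autocorrelation r[0..p] of the windowed frame.
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(xw(1:N-k) .* xw(1+k:N));
end

% Normal equations for A(z) = 1 + a1 z^-1 + ... + ap z^-p.
a = [1; -(toeplitz(r(1:p)) \ r(2:p+1))];
E = r.' * a;                         % prediction error energy

% Recursion from the predictor coefficients to the cepstrum of G/A(z):
% c[m] = -a[m] - (1/m) * sum_k k*c[k]*a[m-k], with a[m] = 0 for m > p.
c_lpc    = zeros(nc+1, 1);
c_lpc(1) = 0.5*log(E/N);             % c[0] = ln G (one possible convention)
for m = 1:nc
    acc = 0;
    for k = max(1, m-p):(m-1)
        acc = acc + k * c_lpc(k+1) * a(m-k+1);
    end
    c_lpc(m+1) = -acc/m;
    if m <= p
        c_lpc(m+1) = c_lpc(m+1) - a(m+1);
    end
end
disp(c_lpc.');                       % c[0]..c[12]
```
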
**Cepstrum of longer utterance**

- Compute cepstral coefficients for all short-time frames of the signal *SA176S01.CS0* using the following parameters of short-time analysis:

  - frame length 25 ms, frame step 10 ms, Hamming window weighting

  - LPC order p = 6

  - number of bands in the mel filter bank M = 30, frequency band boundaries fmin = 100 Hz, fmax = 6500 Hz

  - always observe the coefficients c[0]-c[12] or c[1]-c[12] (i.e. without the coefficient c[0])

- Use the following functions:

  - short-time real cepstrum of a long signal: vrceps.m

  - short-time LPC cepstrum of a long signal: vaceps.m, aceps.m, burg.m, a2c.m, a2c0.m

  - short-time mel cepstrum of a long signal (MFCC): vmfcc.m, melbf.m, mel.m, melinv.m

- 2nd checked result: for the **longer utterance** SA176S01.CS0 display:

  - coefficients **c[1]-c[12]** for the **real, LPC and MFCC cepstrum**

__Further signals for possible processing__

- *mc20bc116016.ils_a*: raw data, fs = 44100 Hz

- *a30650b1.wav*: wav format

- your own signal recorded on-line at a sampling frequency of 8 kHz, 16 kHz or 44.1 kHz
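
The short-time analysis requested for the longer utterance (25 ms frames, 10 ms step, Hamming window) can be sketched as a plain loop over frames. This is only an illustration of the framing and not the course function vrceps.m. The sampling frequency of 16 kHz for SA176S01.CS0 and the noise signal used in its place are assumptions.

```matlab
fs   = 16000;                        % assumed sampling frequency
wlen = round(0.025*fs);              % 25 ms frame length (400 samples)
step = round(0.010*fs);              % 10 ms frame step (160 samples)
ncep = 12;                           % highest cepstral index

% Placeholder signal: 1 s of noise with a louder second half
% (assumption, stands in for SA176S01.CS0).
rng(0);
s = 0.01*randn(fs, 1);
s(fs/2+1:end) = s(fs/2+1:end) + 0.5*randn(fs/2, 1);

nfr = floor((length(s) - wlen)/step) + 1;          % number of full frames
w   = 0.54 - 0.46*cos(2*pi*(0:wlen-1)'/(wlen-1));  % Hamming window

% One column of c[0]..c[12] per frame (real cepstrum of each frame).
C = zeros(ncep+1, nfr);
for i = 1:nfr
    idx = (i-1)*step + (1:wlen);     % sample indices of frame i
    fr  = s(idx) .* w;
    cc  = real(ifft(log(max(abs(fft(fr)), eps))));
    C(:, i) = cc(1:ncep+1);
end
```

The rows C(2:end, :) hold c[1]-c[12] for every frame and can be displayed over time, for example with imagesc.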

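For the MFCC branch, the filter bank parameters given above (M = 30, fmin = 100 Hz, fmax = 6500 Hz) can be turned into triangular mel filters, followed by a logarithm and a DCT. The sketch below is an illustration under assumptions and not the course functions melbf.m, mel.m, melinv.m or vmfcc.m. The mel formula, the triangular filter shape, the FFT length of 512, the unnormalised DCT-II and the synthetic frame are all assumed.

```matlab
fs   = 16000;                        % assumed sampling frequency
M    = 30;  fmin = 100;  fmax = 6500;   % filter bank parameters from the task
ncep = 12;                           % highest cepstral index
N    = 400;                          % 25 ms frame
Nfft = 512;                          % assumed FFT length

% Mel scale (a common formula; mel.m and melinv.m may use another variant).
hz2mel = @(f) 2595*log10(1 + f/700);
mel2hz = @(m) 700*(10.^(m/2595) - 1);

% M+2 edge frequencies equally spaced on the mel scale.
fe = mel2hz(linspace(hz2mel(fmin), hz2mel(fmax), M+2));

% Triangular filters sampled at the DFT bin frequencies 0..fs/2.
fb = (0:Nfft/2)*fs/Nfft;
H  = zeros(M, Nfft/2+1);
for m = 1:M
    up   = (fb - fe(m))   / (fe(m+1) - fe(m));     % rising edge
    down = (fe(m+2) - fb) / (fe(m+2) - fe(m+1));   % falling edge
    H(m, :) = max(0, min(up, down));
end

% One synthetic windowed frame (assumption).
n  = (0:N-1)';
x  = filter(1, [1 -1.3 0.8], double(mod(n, 100) == 0));
xw = x .* (0.54 - 0.46*cos(2*pi*n/(N-1)));

% Mel spectrum, logarithm, then DCT-II (unnormalised, an assumed scaling).
X    = fft(xw, Nfft);
P    = abs(X(1:Nfft/2+1)).^2;        % power spectrum on 0..fs/2
logE = log(max(H*P, eps));           % log band energies, M x 1
D    = cos(pi*(0:ncep)'*((1:M) - 0.5)/M);   % (ncep+1) x M DCT matrix
mfcc = D*logE;                       % mfcc(1) is c[0], mfcc(13) is c[12]
```
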
**Cepstral distance from a background and cepstral voice activity detector**

- Compute the average cepstrum from the initial non-speech part of the processed utterance SA176S01.CS0 (see the 2nd checked result), i.e. from 10-20 initial segments, using the MFCC cepstrum.
- Compute the Euclidean distance between the current frame cepstrum and the background average cepstrum obtained in the previous step, and observe how it varies in time.
- Use the following approaches to cepstral distance computation:

  - *cd0.m* (distance in dB with c[0])

  - *cd1.m* (distance in dB without c[0])

- Implement speech activity detection based on the computed distance, using fixed thresholding and adaptive thresholding based on dynamics, i.e. the functions thr_fixed.m and thr_adapt_dyn.m.
- 3rd checked result: for **the utterance SA176S01.CS0** display:

  - result of **VAD using MFCC cepstrum and adaptive threshold based on dynamics**.

*Compare the results of the cepstral VAD with the energy-based one, i.e. apply analogous thresholding to the short-time power of the signal estimated in the same short-time frames. You can use your own functions created at the 2nd seminar or the function speechpwr.m.*
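
The background average cepstrum and the distance of each frame from it can be sketched as follows. The cepstral matrix is a random placeholder, and the dB formula is a commonly used definition of cepstral distance; whether cd0.m and cd1.m use exactly this scaling is an assumption, as are the variable names and the choice of 15 background frames.

```matlab
rng(1);
ncep = 12;  nfr = 100;

% Placeholder cepstral matrix (assumption): rows c[0]..c[12], one column
% per frame; frames 41..80 are shifted to mimic speech.
C = 0.05*randn(ncep+1, nfr);
C(:, 41:80) = C(:, 41:80) + repmat(linspace(1, 0.1, ncep+1)', 1, 40);

nbg = 15;                            % initial non-speech frames (10-20 in the task)
cbg = mean(C(:, 1:nbg), 2);          % background average cepstrum
Dif = C - repmat(cbg, 1, nfr);       % difference from the background, per frame

% Plain Euclidean distance over c[1]..c[12].
d_euc = sqrt(sum(Dif(2:end, :).^2, 1));

% Cepstral distance in dB (assumed definition).
k     = 10/log(10);
d0_dB = k*sqrt(Dif(1, :).^2 + 2*sum(Dif(2:end, :).^2, 1));   % with c[0]
d1_dB = k*sqrt(2*sum(Dif(2:end, :).^2, 1));                  % without c[0]
```

With this definition d1_dB is the Euclidean distance over c[1]-c[12] scaled by a constant, while d0_dB additionally reacts to level changes through c[0].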

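The thresholding step can be illustrated on a synthetic distance track. The adaptive rule below places the threshold at a fixed fraction of the running dynamic range of the distance and declares speech only once that range is large enough. It is a hypothetical stand-in, not the algorithm of thr_adapt_dyn.m or thr_fixed.m, and the values of thr_fix, q and mindyn are assumptions chosen for this example.

```matlab
% Placeholder distance track in dB (assumption): low on background frames,
% high on frames 41..80.
rng(2);
nfr = 100;
d   = 1 + 0.2*abs(randn(1, nfr));
d(41:80) = d(41:80) + 8;

% Fixed threshold (value chosen by hand for this synthetic track).
thr_fix = 4;
vad_fix = d > thr_fix;

% Adaptive threshold from the running dynamics of d:
% thr = dmin + q*(dmax - dmin), with dmin and dmax tracked causally.
q      = 0.3;                        % position inside the dynamic range (assumed)
mindyn = 3;                          % minimum dynamics in dB to allow speech (assumed)
dmin   = d(1);
dmax   = d(1);
thr_ad = zeros(1, nfr);
vad_ad = false(1, nfr);
for i = 1:nfr
    dmin = min(dmin, d(i));
    dmax = max(dmax, d(i));
    thr_ad(i) = dmin + q*(dmax - dmin);
    vad_ad(i) = (dmax - dmin > mindyn) && (d(i) > thr_ad(i));
end
```
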
ADDITIONAL TASK:

*BONUS result (4th point):* **Functional cepstral voice activity detection of your own on-line recorded signal.**

The bonus result can be presented at the seminar or delivered later by email after individual completion at home. The 4th point will be given to everybody who presents this result by the end of the seminar; for solutions delivered after completion at home, the 4th point will be given only to the first 3 delivered solutions with evident originality.