BE2M31ZRE seminar
GMM-based classification of vowels
Tasks to do:
REQUIRED HOMEWORK: (5 points)
Compute MFCC cepstra (12+1) for all your own realizations of the basic vowels from the database zreratdb. (These data are available in the directory "K:\ZRE\data\zreratdb" in classroom 802 at CTU FEE, or they can be downloaded from the archive zrerat_blocken_2024_cs0.zip, which contains *.CS0 signals resampled to 16 kHz.)
Use the following setup of MFCC computation:
- apply preemphasis with the coefficient m=0.97,
- short-time frame length of 25 ms, frame shift of 5 ms (to increase the number of realizations), Hamming weighting window,
- number of filter-bank bands and frequency range boundaries: M=30, fmin=100 Hz, fmax=6500 Hz,
- use the function vmfcc.m (together with the functions it calls: melbf.m, mel.m, melinv.m).
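The framing part of this setup can be sketched as follows. This is only an illustration under stated assumptions: the signal is a random placeholder rather than a real vowel, the variable names (wlen, wstep, frames) are hypothetical, and the MFCC computation itself is left to vmfcc.m, whose calling convention is not reproduced here.

```matlab
% Sketch of preemphasis, framing and windowing only (no MFCC computation).
fs    = 16000;                 % sampling frequency [Hz], signals are resampled to 16 kHz
wlen  = round(0.025*fs);       % 25 ms frame length -> 400 samples
wstep = round(0.005*fs);       % 5 ms frame shift   -> 80 samples
m     = 0.97;                  % preemphasis coefficient

s  = randn(fs, 1);             % placeholder: 1 s of noise instead of a real vowel
sp = filter([1 -m], 1, s);     % preemphasis: y[n] = x[n] - m*x[n-1]

nfr = floor((length(sp) - wlen)/wstep) + 1;    % number of complete frames
n   = (0:wlen-1)';
w   = 0.54 - 0.46*cos(2*pi*n/(wlen-1));        % Hamming window written out explicitly

frames = zeros(wlen, nfr);     % one windowed frame per column
for i = 1:nfr
    idx = (i-1)*wstep + (1:wlen);
    frames(:, i) = sp(idx) .* w;
end
```

With a 5 ms shift, a 1 s signal yields roughly five times as many frames as the more common 25 ms shift would, which is why the short shift is used to increase the number of realizations.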
Save the computed MFCC coefficients in separate variables for each available realization of every vowel, i.e. ca1, ca2, ca3, ce1, ce2, ce3, ci1, ci2, ci3, co1, co2, co3, cu1, cu2, cu3.
Observe the clustering of the cepstra in the 8 available dimensions excluding c[0], i.e. for c[1] ... c[8].
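One possible way to look at the clustering is a set of scatter plots for the coefficient pairs. The sketch below assumes that each cepstral matrix stores one frame per row with columns c[0] ... c[12]; this layout is an assumption and may differ from what vmfcc.m returns. Random placeholder matrices stand in for the real variables ca1, ce1, etc.

```matlab
% Assumption: rows = frames, columns = c[0] ... c[12].
% Random placeholders stand in for the real cepstra of the five vowels.
vowels = {'a', 'e', 'i', 'o', 'u'};
C = cell(1, 5);
for v = 1:5
    C{v} = randn(200, 13) + v;       % placeholder data, shifted per vowel
end

pairs  = [1 2; 3 4; 5 6; 7 8];       % cepstral indices plotted against each other
colors = 'rgbmk';
figure;
for p = 1:size(pairs, 1)
    subplot(2, 2, p); hold on;
    for v = 1:5
        % +1 converts the cepstral index k to a MATLAB column (column 1 is c[0])
        plot(C{v}(:, pairs(p, 1) + 1), C{v}(:, pairs(p, 2) + 1), ['.' colors(v)]);
    end
    xlabel(sprintf('c[%d]', pairs(p, 1)));
    ylabel(sprintf('c[%d]', pairs(p, 2)));
    hold off;
end
legend(vowels);
```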
Delivered results of HOMEWORK:
A function for computing the MFCC cepstra of the particular vowels, including an observation of the MFCC distribution for c[1]-c[2], c[3]-c[4], c[5]-c[6], c[7]-c[8] (attention: coefficient c[0] is excluded).
Save the computed MFCC cepstra of the particular vowels into the MAT-file task3.mat. You can use the following command:
save task3.mat ca1 ca2 ca3 ce1 ce2 ce3 ci1 ci2 ci3 co1 co2 co3 cu1 cu2 cu3 ;
Deliver a *.zip archive via the web interface at Moodle FEE (authorized access). The archive must contain the created function (script) for cepstrum computation and the *.mat file with the computed MFCC cepstra for the particular realizations of all 5 basic vowels.
The deadline for the delivery of the homework is Mo 8.4.2024, 9:00.
Identification of vowels based on cepstral features
Compute the mean values and variances of the computed MFCC coefficients for the joined 1st and 2nd realizations of the 5 basic vowels.
Result:
Add the mean values of coefficients c[1]-c[8] for all 5 vowels to the figures of cepstrum distribution created in the homework.
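For a single vowel, the statistics and the marker in the scatter plot might be computed as in the sketch below. It again assumes one frame per row with columns c[0] ... c[12], and it uses random placeholders for ca1 and ca2; the names Xa, mua and va are hypothetical.

```matlab
% Placeholders for two realizations of one vowel (rows = frames, columns = c[0]..c[12]).
ca1 = randn(150, 13);
ca2 = randn(180, 13);

Xa  = [ca1; ca2];            % join realizations 1 and 2 frame-wise
Xa  = Xa(:, 2:9);            % keep c[1]..c[8], drop c[0]
mua = mean(Xa, 1);           % 1x8 vector of mean values
va  = var(Xa, 0, 1);         % 1x8 vector of variances (N-1 normalization)

% Mark the mean value in the c[1]-c[2] scatter plot.
figure; plot(Xa(:, 1), Xa(:, 2), '.b'); hold on;
plot(mua(1), mua(2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
xlabel('c[1]'); ylabel('c[2]'); hold off;
```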
Compute the parameters of GMM models with a diagonal covariance matrix for the particular vowels using the MATLAB function gmdistribution.fit and observe the obtained results. Identify the previously computed mean values and variances within the structured variable gmm.
Realize the classification using the MATLAB function pdf for all short-time frames of the 3rd realizations of the particular vowels and for all available GMM models.
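A minimal sketch of such a fit is shown below, assuming a single mixture component per vowel (the number of components is an assumption, not stated in the task) and placeholder training data. It requires the Statistics and Machine Learning Toolbox. With one component, the fitted mean should agree with the sample mean, and the fitted variances should be close to the sample variances, possibly up to the N versus N-1 normalization.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
rng(0);                               % reproducible placeholder data
X = 0.5*randn(500, 8) + 1;            % placeholder training frames (rows = frames, c[1]..c[8])
K = 1;                                % number of mixture components (an assumption)

gmm = gmdistribution.fit(X, K, 'CovType', 'diagonal');

disp(gmm.mu);                         % K-by-8 matrix of component means
disp(gmm.Sigma);                      % 1-by-8-by-K array of variances for diagonal covariance
disp([mean(X, 1); var(X, 1, 1)]);     % sample mean and variance (N normalization) for comparison
```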
Result: Display for selected 3rd realization of vowel "E":
all values of log-likelihood and their mean values for all 5 GMM models of particular basic vowels,
repeat it also for other vowels.
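The frame-wise scoring can be sketched as follows. The five models here are placeholders built directly from made-up means and unit variances, and the test frames are random data generated near the placeholder model "e"; real models would come from gmdistribution.fit or from the pre-trained file. The names gmms, Xtest, LL and meanLL are hypothetical. The Statistics and Machine Learning Toolbox is required.

```matlab
% Placeholder models: one single-Gaussian diagonal model per vowel.
rng(0);
vowels = {'a', 'e', 'i', 'o', 'u'};
D = 8;
gmms = cell(1, 5);
for v = 1:5
    gmms{v} = gmdistribution(v*ones(1, D), ones(1, D));   % mean v in every dimension, unit variances
end

Xtest = randn(120, D) + 2;            % placeholder test frames, generated near model "e"

LL = zeros(size(Xtest, 1), 5);
for v = 1:5
    LL(:, v) = log(pdf(gmms{v}, Xtest));   % log-likelihood of every frame for model v
end
meanLL = mean(LL, 1);                 % average log-likelihood per model
[~, best] = max(meanLL);              % model with the highest average wins
fprintf('Recognized vowel: %s\n', vowels{best});

figure; plot(LL); legend(vowels);
xlabel('frame'); ylabel('log-likelihood');
```

Averaging the log-likelihood over frames makes the decision independent of the number of frames in the realization; summing instead would give the same winner for a single recording.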
You can use pre-trained GMM models of vowels trained on a larger set from the database zreratdb, see cv8_vowel_gmms.mat.
Realize a classification of an on-line recorded vowel.
For the on-line recorded vowel, remove non-speech frames as well as very soft frames using a VAD based on the short-time power in dB and fixed thresholding.
Result:
observe waveform and spectrogram without application of VAD,
observe MFCC time dependency without VAD,
observe MFCC time dependency after application of VAD for purposes of further classification,
all values of emitted log-likelihood and their averages for all 5 GMM models of particular basic vowels.
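A power-based VAD with a fixed threshold might look like the sketch below. The signal is a synthetic placeholder (near-silence, a 200 Hz tone, near-silence), and the threshold of -30 dB is a hypothetical value that would have to be tuned to the actual recording level. The framing matches the 25 ms / 5 ms MFCC setup so that the resulting mask can index the MFCC frames.

```matlab
fs    = 16000;
wlen  = round(0.025*fs);              % 25 ms frame length
wstep = round(0.005*fs);              % 5 ms frame shift

% Placeholder signal: 0.5 s near-silence, 0.5 s tone as a stand-in vowel, 0.5 s near-silence.
t = (0:fs/2-1)'/fs;
s = [1e-4*randn(fs/2, 1); 0.5*sin(2*pi*200*t); 1e-4*randn(fs/2, 1)];

nfr = floor((length(s) - wlen)/wstep) + 1;
PdB = zeros(nfr, 1);
for i = 1:nfr
    fr = s((i-1)*wstep + (1:wlen));
    PdB(i) = 10*log10(mean(fr.^2) + eps);   % short-time power in dB
end

thr = -30;                            % fixed threshold in dB (hypothetical value)
vad = PdB > thr;                      % logical mask: true for frames kept for classification

% With an MFCC matrix c holding one row per frame, the speech frames would be c(vad, :).
figure; plot(PdB); hold on; plot([1 nfr], [thr thr], 'r--'); hold off;
xlabel('frame'); ylabel('power [dB]');
```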
OPTIONAL HOMEWORK
Training of general speaker-independent GMM models of vowels
Compute the parameters of a general GMM model from all available realizations of basic Czech vowels saved in the database zreratdb, pronounced by Czech speakers.
The data are available in classroom 802 at CTU FEE in the directory "K:\ZRE\data\zreratdb".
For work outside FEE classrooms, take the data from the archive of signals resampled to a sampling frequency of 16 kHz, zrerat_block200_2023_cs0.zip (Czech speakers), or from the small archive of all vowels, zrerat_vowels_all_cs0.zip.
Realize GMM-based vowel identification using formants