BE2M31ZRE seminar
GMM-based classification of vowels
Tasks to do:
 Simplified classification of vowels based on GMM

Compute the mean values and variances of the 1st and 2nd formants for the 5 basic vowels, whose formant frequencies are saved in the following file
formants_vowels.mat
(Formants 1-4, always computed from 3 realizations of each particular vowel, are saved in variables Fa1, Fa2, Fa3, etc.). Compute the mean values from the 1st realizations, i.e. from Fa1, Fe1, Fi1, Fo1 and Fu1.
 1st checked result:
 Display the distribution of the 1st and 2nd formants (formant triangle)
for the particular realizations of vowels (i.e. 3 different figures),
 add the mean values of the formants for all 5 vowels to the previously created figures.
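The steps above can be sketched as follows. This is only a minimal sketch: it assumes that each matrix (Fa1, Fe2, ...) holds one short-time frame per row and formants 1-4 in columns 1-4, and the choice of F1 on the horizontal axis is arbitrary. The variable names S, mu and sigma2 are illustrative.

```matlab
% Assumption: rows = short-time frames, columns = formants F1..F4.
S = load('formants_vowels.mat');
vowels = 'aeiou';

mu     = zeros(5, 2);   % row v: [mean F1, mean F2] of vowel v
sigma2 = zeros(5, 2);   % row v: [var F1,  var F2]  of vowel v
for v = 1:5
    F = S.(['F' vowels(v) '1']);          % 1st realization, e.g. Fa1
    mu(v, :)     = mean(F(:, 1:2), 1);
    sigma2(v, :) = var(F(:, 1:2), 0, 1);
end

for k = 1:3                                % one figure per realization
    figure; hold on;
    for v = 1:5
        F = S.(['F' vowels(v) num2str(k)]);
        plot(F(:, 1), F(:, 2), '.');       % all frames of one vowel
    end
    % Means from the 1st realizations, added to every figure
    plot(mu(:, 1), mu(:, 2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
    xlabel('F1 [Hz]'); ylabel('F2 [Hz]');
    legend([cellstr(vowels'); {'mean (1st realization)'}]);
    title(sprintf('Formant triangle, realization %d', k));
end
```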
 Define the formula for the two-dimensional (n-dimensional) Gaussian function (i.e. probability density function) for the purpose of vowel classification based on the first 2 (4) formants and prepare the implementation of this function in MATLAB.
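For reference, the standard n-dimensional Gaussian probability density function can be written as below, where x is the vector of n formants of one frame (n = 2 or 4), mu is the mean vector and Sigma the covariance matrix of one vowel. The second line is the special case of a diagonal covariance matrix, i.e. when only the per-formant variances sigma_i^2 are used. The symbols are notation chosen here, not taken from the assignment.

```latex
p(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) =
  \frac{1}{(2\pi)^{n/2} \, |\boldsymbol{\Sigma}|^{1/2}}
  \exp\!\left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{T}
  \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right)

p(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}) =
  \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma_i^{2}}}
  \exp\!\left( -\frac{(x_i - \mu_i)^{2}}{2 \sigma_i^{2}} \right)
```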
 Display the two-dimensional Gaussian probability density functions of the particular vowels.
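One possible implementation and display is sketched below. The helper name gausspdf, the plotting ranges and the grid size are assumptions made for illustration; the data layout (frames in rows, formants in columns) is assumed as well. The expression `x - mu` relies on implicit expansion, which needs MATLAB R2016b or newer.

```matlab
% Hypothetical helper: n-dimensional Gaussian pdf.
% x is N-by-n (one point per row), mu is 1-by-n, Sigma is n-by-n.
gausspdf = @(x, mu, Sigma) ...
    exp(-0.5 * sum(((x - mu) / Sigma) .* (x - mu), 2)) ...
    / sqrt((2*pi)^size(x, 2) * det(Sigma));

S = load('formants_vowels.mat');
vowels = 'aeiou';

% Assumed plotting range in Hz; adjust to the actual data.
[F1, F2] = meshgrid(linspace(100, 1200, 100), linspace(400, 3000, 100));
G = [F1(:), F2(:)];                        % grid points, one per row

figure; hold on;
for v = 1:5
    F = S.(['F' vowels(v) '1']);           % 1st realization of vowel v
    mu    = mean(F(:, 1:2), 1);
    Sigma = cov(F(:, 1:2));                % full 2-by-2 covariance
    P = reshape(gausspdf(G, mu, Sigma), size(F1));
    contour(F1, F2, P);                    % use mesh(F1, F2, P) for a 3-D view
    text(mu(1), mu(2), vowels(v), 'FontWeight', 'bold');
end
xlabel('F1 [Hz]'); ylabel('F2 [Hz]');
title('2-D Gaussian pdfs of the vowels');
```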
 2nd checked result:
 Compute the likelihood values given by the Gaussian probability density functions of the particular vowels. Use one short-time frame from the second or third realization of a vowel (e.g. the 5th row of Fa2). Find the highest likelihood and check that it corresponds to the analyzed sound. Repeat for the other vowels.
 Repeat the same task using the functions available in MATLAB,
i.e. gmdistribution.fit and pdf, and compare the obtained results.
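Both variants can be compared in a short sketch like the following. It assumes frames in rows and formants in columns, and uses the hypothetical helper gausspdf for the hand-written density. In newer MATLAB releases `fitgmdist` (Statistics and Machine Learning Toolbox) takes the place of `gmdistribution.fit`, so it is used here. The hand-written version uses `cov(F, 1)` (normalization by N), so that its estimate should be close to the maximum-likelihood estimate of a one-component GMM; small numerical differences are still to be expected.

```matlab
% Hypothetical helper: n-dimensional Gaussian pdf (rows of x are points).
gausspdf = @(x, mu, Sigma) ...
    exp(-0.5 * sum(((x - mu) / Sigma) .* (x - mu), 2)) ...
    / sqrt((2*pi)^size(x, 2) * det(Sigma));

S = load('formants_vowels.mat');
vowels = 'aeiou';
x = S.Fa2(5, 1:2);            % 5th frame of the 2nd realization of /a/

Lmanual  = zeros(1, 5);       % likelihoods from the hand-written pdf
Ltoolbox = zeros(1, 5);       % likelihoods from the toolbox model
for v = 1:5
    F = S.(['F' vowels(v) '1']);
    F = F(:, 1:2);                              % training data: F1, F2
    Lmanual(v) = gausspdf(x, mean(F, 1), cov(F, 1));
    gm = fitgmdist(F, 1);                       % GMM with one component
    Ltoolbox(v) = pdf(gm, x);
end

[~, iManual]  = max(Lmanual);
[~, iToolbox] = max(Ltoolbox);
fprintf('manual: %s, toolbox: %s\n', vowels(iManual), vowels(iToolbox));
disp([Lmanual; Ltoolbox]);    % rows: manual, toolbox; columns: a e i o u
```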
 3rd checked result:
 Implement the classification using the MATLAB function pdf for all short-time frames of the 2nd or 3rd realization of vowels for
all available GMM models. Observe the results in graphical form.
 Also try using logarithmic probabilities.
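A minimal sketch of the frame-by-frame classification with logarithmic probabilities follows. It assumes frames in rows and formants in columns, one-component models trained on the 1st realizations, and the 2nd realization of /e/ as an arbitrarily chosen test utterance. Summing log-likelihoods over frames corresponds to multiplying the likelihoods, but avoids numerical underflow.

```matlab
S = load('formants_vowels.mat');
vowels = 'aeiou';

% Train one model per vowel on the 1st realizations (F1 and F2 only).
gm = cell(1, 5);
for v = 1:5
    Ftrain = S.(['F' vowels(v) '1']);
    gm{v} = fitgmdist(Ftrain(:, 1:2), 1);
end

X = S.Fe2(:, 1:2);                      % test frames (assumed choice: Fe2)
logL = zeros(size(X, 1), 5);            % frames x models
for v = 1:5
    logL(:, v) = log(pdf(gm{v}, X));
end

[~, decision] = max(logL, [], 2);       % per-frame decision
[~, overall]  = max(sum(logL, 1));      % decision for the whole utterance
fprintf('Utterance classified as: %s\n', vowels(overall));

figure;
subplot(2, 1, 1);
plot(logL); legend(cellstr(vowels'));
xlabel('frame'); ylabel('log-likelihood');
subplot(2, 1, 2);
stem(decision); ylim([0.5 5.5]);
set(gca, 'YTick', 1:5, 'YTickLabel', cellstr(vowels'));
xlabel('frame'); ylabel('recognized vowel');
```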
 Online identification of vowels based on formants or cepstra
 Compute parameters of GMM model from all your available realizations of vowels saved in the database zreratdb.
 You can access the recorded signals directly in classrooms at CTU FEE. The data are available in the directory "H:\VYUKA\ZRE\signaly\zreratdb".
 To work outside CTU FEE, you can download the following archive of signals resampled to 16 kHz: zrerat_blocken_2018_cs0.zip.
 Implement the classification of an on-line recorded vowel.
 Use all 4 formant frequencies or 12 MFCC cepstral coefficients (c[1]-c[12]) as speech features.
 4th checked result: Display for the on-line recorded vowel:
 waveform, spectrogram and the time course of the formants (or MFCCs),
 formants or cepstrum for voice-activity short-time frames only (use a power-based VAD),
 all log-likelihood values and their averages for all 5 GMM models of the particular basic vowels.
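The recording, the power-based VAD and the averaging of log-likelihoods could be organized roughly as below. This is a sketch under several assumptions that are not given in the assignment: a 2 s recording, 25 ms frames with a 10 ms step, and a VAD threshold 20 dB below the maximum frame power. The feature matrix feat (one row per frame, formants or MFCCs) and the cell array gm of trained models are hypothetical names for results of your own feature extraction and training; the last part only runs if both exist. `spectrogram` and `hamming` require the Signal Processing Toolbox.

```matlab
fs  = 16000;                              % matches the resampled database
rec = audiorecorder(fs, 16, 1);           % 16 bits, mono
recordblocking(rec, 2);                   % assumed duration: 2 s
x = getaudiodata(rec);

wlen  = round(0.025 * fs);                % assumed frame length: 25 ms
wstep = round(0.010 * fs);                % assumed frame step: 10 ms
nfr   = floor((length(x) - wlen) / wstep) + 1;

P = zeros(nfr, 1);                        % short-time power in dB
for i = 1:nfr
    seg  = x((i - 1) * wstep + (1:wlen));
    P(i) = 10 * log10(mean(seg .^ 2) + eps);
end
vad = P > max(P) - 20;                    % assumed threshold: 20 dB below max

figure;
subplot(3, 1, 1); plot((0:length(x) - 1) / fs, x);
xlabel('time [s]'); title('waveform');
subplot(3, 1, 2); spectrogram(x, hamming(wlen), wlen - wstep, 512, fs, 'yaxis');
subplot(3, 1, 3); plot(P); hold on; plot(find(vad), P(vad), 'r.');
xlabel('frame'); ylabel('power [dB]'); title('frames marked by VAD');

% feat (nfr-by-n features) and gm (1-by-5 cell of models) are hypothetical
% results of your own feature extraction and training.
if exist('feat', 'var') && exist('gm', 'var')
    logL = zeros(nnz(vad), 5);
    for v = 1:5
        logL(:, v) = log(pdf(gm{v}, feat(vad, :)));
    end
    avgLogL = mean(logL, 1);              % one average per vowel model
    [~, best] = max(avgLogL);
    fprintf('Recognized vowel index: %d\n', best);
end
```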