Repeat the training of the GMM models and the GMM-based verification with an extended feature vector.
First, use an extended set of 20 MFCC coefficients without c[0].
Also extend the feature vector with differential (delta)
features (i.e. a numerical estimate of the 1st derivative of order 2,
computed from 2 neighbouring frames). The order of the numerical estimate of the first derivative should be around m = 9.
Use the function diffceps
to estimate the delta features.
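The delta computation described above can be sketched as follows. This is not the course function diffceps, whose exact definition is not given in this text; it is a minimal Python sketch of the common regression-based estimate of the first derivative. It assumes that m is the odd total number of frames in the regression window (so m = 9 uses 4 frames on each side) and that the first and last frames are replicated at the utterance edges. The names delta_features and extend_with_deltas are hypothetical.

```python
def delta_features(frames, m=9):
    """Estimate delta features by linear regression over m frames.

    frames: list of feature vectors (one list of floats per frame).
    m: odd window length in frames (assumption: m = 2*K + 1).
    """
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be an odd integer >= 3")
    K = (m - 1) // 2
    n_frames = len(frames)
    # Normalisation term of the regression formula: 2 * sum(k^2).
    norm = 2.0 * sum(k * k for k in range(1, K + 1))
    deltas = []
    for t in range(n_frames):
        d = [0.0] * len(frames[t])
        for k in range(1, K + 1):
            # Replicate the first/last frame beyond the utterance edges.
            ahead = frames[min(t + k, n_frames - 1)]
            behind = frames[max(t - k, 0)]
            for i in range(len(d)):
                d[i] += k * (ahead[i] - behind[i])
        deltas.append([v / norm for v in d])
    return deltas


def extend_with_deltas(frames, m=9):
    """Append the delta features to each static feature vector."""
    return [c + d for c, d in zip(frames, delta_features(frames, m))]
```

With 20 static coefficients per frame, extend_with_deltas returns 40-dimensional vectors. For a feature that grows linearly in time, the interior delta values equal its slope, which is a quick sanity check of any implementation.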
Re-estimate the parameters of the GMM models with 6 mixture components and full covariance matrices.
Result:
Display the short-time values as well as the mean of the score for your voice
and for another voice when the extended feature vector is used.
Update your GMM-based verification to use the extended feature vector and observe the change in the verification results.
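As an illustration of where the short-time values and the mean score could come from, the sketch below evaluates a GMM with full covariance matrices frame by frame. It is only a minimal Python sketch, not the course's scoring routine: it assumes that the score is the per-frame log-likelihood under a model, that the model parameters (weights, means, covariances) have already been estimated, and that all weights are strictly positive. All function names are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def gaussian_logpdf(x, mean, chol):
    """Log-density of a full-covariance Gaussian given its Cholesky factor."""
    n = len(x)
    # Solve chol * z = (x - mean) by forward substitution.
    z = []
    for i in range(n):
        s = x[i] - mean[i] - sum(chol[i][k] * z[k] for k in range(i))
        z.append(s / chol[i][i])
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(n))
    maha = sum(v * v for v in z)  # squared Mahalanobis distance
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + maha)


def frame_log_likelihoods(frames, weights, means, covs):
    """Per-frame GMM log-likelihoods (the short-time values)."""
    chols = [cholesky(c) for c in covs]
    out = []
    for x in frames:
        terms = [math.log(w) + gaussian_logpdf(x, mu, ch)
                 for w, mu, ch in zip(weights, means, chols)]
        top = max(terms)
        # log-sum-exp over the mixture components for numerical stability
        out.append(top + math.log(sum(math.exp(t - top) for t in terms)))
    return out


def mean_score(frames, weights, means, covs):
    """Mean of the short-time log-likelihoods over the whole utterance."""
    ll = frame_log_likelihoods(frames, weights, means, covs)
    return sum(ll) / len(ll)
```

Plotting the output of frame_log_likelihoods for your own model and for another speaker's model against the frame index gives the short-time curves; mean_score gives the single value per utterance that can be compared between the basic and the extended feature vector.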
Speaker identification in a set of mixed utterances from multiple speakers
Find your 10 utterances in the set available in the directory "K:\ZRE\data\spksearch_ls2122". The set always contains 10 utterances per unique speaker (including you).
Identify your utterances using the GMM-based approach with the extended feature vector. (VQ-based or basic GMM-based identification is possible as well, but the achievable accuracy will be lower and, in the case of VQ, the computation will be significantly slower.)
You can also download the archive
spksearch_ls2223.zip to work outside the CTU FEE classrooms. It contains recordings resampled to 16 kHz, together with the list of files required for processing in a loop according to the script demo_loadbin_in_cycle.m.
Result:
A list of the 10 recognized utterances of your voice, i.e. the 10 utterances with the HIGHEST computed average log-likelihood (or with the smallest distance).
A list of the next 10 utterances (ranks 11-20) with the highest computed average log-likelihood (or the smallest distance).
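Producing the two requested lists amounts to sorting the utterances by their score. The following is a minimal Python sketch under the assumption that one mean log-likelihood (or one distance, for a VQ-based approach) has already been computed per file and stored in a dictionary keyed by file name. The function names and the file names in the usage example are hypothetical.

```python
def rank_utterances(scores, higher_is_better=True):
    """Sort utterance names by score, best first.

    scores: dict mapping a file name to its mean log-likelihood
            (or to a distance when higher_is_better is False).
    """
    return sorted(scores, key=lambda name: scores[name], reverse=higher_is_better)


def top_and_next(scores, n=10, higher_is_better=True):
    """Return the n best utterances and the following n (ranks n+1 to 2n)."""
    ranked = rank_utterances(scores, higher_is_better)
    return ranked[:n], ranked[n:2 * n]


if __name__ == "__main__":
    # Hypothetical scores; real values come from the trained speaker model.
    demo = {"rec_a.bin": -41.2, "rec_b.bin": -38.7, "rec_c.bin": -45.9}
    best, following = top_and_next(demo, n=2)
    print(best, following)
```

For log-likelihoods keep higher_is_better=True; for distances pass higher_is_better=False so that the smallest distance is ranked first. Looking at ranks 11-20 as well shows how clearly your own utterances are separated from those of the closest other speakers.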