BE2M31ZRE - spk ident

BE2M31ZRE seminar
GMM-based speaker identification

Guidelines:

Improved GMM-based speaker verification
- Repeat the training of the GMM models and the GMM-based verification with an extended feature vector.
- First, increase the number of MFCC coefficients to 20, excluding c[0].
- Also extend the feature vector with differential (delta) features, i.e. a numerical estimate of the 1st derivative (an estimate of order 2, for instance, is computed from 2 neighbouring frames). The order of the numerical estimate of the first derivative should be around m = 9. Use the function diffceps to estimate the delta features.
- Re-estimate the parameters of the GMM models with 6 mixture components and full covariance matrices.
- Result:
  - Display the short-time score values as well as the mean score for your voice and for another speaker's voice when the extended feature vector is used.
  - Update your GMM-based verification to use the extended feature vector and observe how the verification results change.
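The delta-feature step above can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and uses a common regression-based slope estimate; it is not the course function diffceps, whose exact definition and parameterization may differ. The names `delta_features`, `extend_with_deltas` and the parameter `half_window` are hypothetical, and how `half_window` relates to the order m is an assumption left to the reader.

```python
def delta_features(frames, half_window=2):
    """Regression-based delta estimate for a sequence of feature vectors.

    frames: list of equal-length lists (one cepstral vector per frame).
    half_window: number of frames used on each side of the current frame
                 (hypothetical parameter, not the diffceps argument).
    """
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    n_frames = len(frames)
    if n_frames == 0:
        return []
    dim = len(frames[0])
    # Normalisation term of the least-squares slope: 2 * sum(k^2).
    denom = 2.0 * sum(k * k for k in range(1, half_window + 1))

    def frame_at(t):
        # Repeat the first/last frame beyond the utterance boundaries.
        return frames[min(max(t, 0), n_frames - 1)]

    deltas = []
    for t in range(n_frames):
        d = [0.0] * dim
        for k in range(1, half_window + 1):
            ahead, behind = frame_at(t + k), frame_at(t - k)
            for i in range(dim):
                d[i] += k * (ahead[i] - behind[i])
        deltas.append([v / denom for v in d])
    return deltas


def extend_with_deltas(frames, half_window=2):
    """Append the delta vector to every static vector (dim -> 2 * dim)."""
    deltas = delta_features(frames, half_window)
    return [list(c) + d for c, d in zip(frames, deltas)]
```

With 20 static MFCC coefficients per frame, `extend_with_deltas` yields 40-dimensional vectors, which would then be the input to GMM training and scoring.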
Speaker identification in a set of mixed utterances of more speakers
- Find your 10 utterances in the set available in the directory "K:\ZRE\data\spksearch_ls2425". The set always contains 10 utterances per unique speaker (including you).
- Identify your utterances using the GMM-based approach with the extended feature vector. (VQ-based or basic GMM-based identification is also possible, but the achievable accuracy will be lower and, in the case of VQ, the processing will be significantly slower.)
- To read all utterances, use the script demo_loadbin_in_cycle.m.
- You can also download the archive spksearch_ls2425.zip to work outside the CTU FEE classrooms. It contains recordings resampled to 16 kHz, together with the list of files required for processing in a loop according to the script demo_loadbin_in_cycle.m.
- Result:
  - A list of the 10 identified utterances of your voice, i.e. the 10 utterances with the HIGHEST computed average logarithmic likelihood.
  - A list of the next 10 utterances (ranks 11-20) ordered by computed average logarithmic likelihood.
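The scoring and ranking described above could be sketched as follows. This is a Python illustration under stated assumptions, not the course MATLAB workflow: for brevity the likelihood uses a diagonal-covariance GMM (the assignment asks for full covariance matrices), and the function names `avg_log_likelihood` and `rank_utterances` as well as the `scores` dictionary layout are hypothetical.

```python
import math


def avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: list of feature vectors of one utterance.
    weights, means, variances: per-component GMM parameters
    (diagonal covariances are a simplification made in this sketch).
    """
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            comp.append(ll)
        # Log-sum-exp over mixture components for numerical stability.
        peak = max(comp)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)


def rank_utterances(scores):
    """Sort (name, score) pairs from the highest to the lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Given a dictionary `scores` that maps each utterance name to its average log-likelihood under your own model, `rank_utterances(scores)[:10]` gives the first result list and `rank_utterances(scores)[10:20]` the second.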
