This document, together with all linked pages, uses ISO
8859-2 (ISO-Latin-2) encoding.
AE2M31ZRE - seminars - TASK No. 3
Fundamental frequency (pitch) and its estimation
Tasks to do:
- Estimation of fundamental frequency (pitch) for one signal frame.
Estimate the fundamental frequency (pitch of the voice) for the voiced frame of the speech signal muz1-AA-frame.CS0
(raw data without header, fs = 16000 Hz; to load it into MATLAB,
use the function loadbin.m). Carry out the
estimation in the following steps:
- compute and inspect the autocorrelation of the frame (biased
estimate, using xcorr),
- from the location of the maximum, compute the fundamental period in
samples (L_0), the fundamental period in seconds (T_0 = L_0/fs) and the
pitch, i.e. the fundamental frequency in Hz (f_0 = 1/T_0 = fs/L_0),
- the MATLAB function max returns the maximum and also its
position as the second output argument,
- restrict the search for the maximum to the lags that correspond to the
typical range of human voice pitch, 60-260 Hz (at fs = 16000 Hz, lags of about 62 to 266 samples).
- 1st checked result: for one voiced frame, show
- the time waveform of the signal together with the
computed autocorrelation function,
- the boundaries of the possible maximum location in the
computed autocorrelation function,
- the values of L_0, T_0 and f_0 computed from the detected
maximum and its position.
- Repeat the same procedure for analogous voiced frames of other
speakers and observe the signal variability as well as the
differences in the estimated pitch values and in the positions of
the maxima. Use the following signals:
- Repeat also for different voiced frames of a single speaker (one male
and one female speaker) and observe the variability and stability of
the f_0 estimate for each speaker:
- Repeat also for one unvoiced speech frame
and observe mainly the differences in the time waveform as well as in the
estimated autocorrelation function for an unvoiced, noise-like frame.
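The single-frame procedure above can be sketched as follows. This is only a minimal illustration under stated assumptions: a synthetic harmonic frame stands in for muz1-AA-frame.CS0 (loadbin.m is course-provided and its call signature is not given here), the frame length of 512 samples is assumed, and the variable names (x, r, Lmin, Lmax) are chosen for the example. If xcorr is not available, the sketch falls back to an equivalent expression built from conv.

```matlab
fs = 16000;                       % sampling frequency [Hz]
N  = 512;                         % assumed frame length (32 ms at 16 kHz)
n  = (0:N-1)';

% Synthetic "voiced" frame (assumption): three harmonics of 125 Hz.
% With the course data, x would come from loadbin.m instead.
f_true = 125;
x = sin(2*pi*f_true*n/fs) + 0.5*sin(2*pi*2*f_true*n/fs) ...
    + 0.25*sin(2*pi*3*f_true*n/fs);

% Biased autocorrelation estimate; zero lag sits at index N of the output.
if exist('xcorr', 'file')
    r = xcorr(x, 'biased');
else
    r = conv(x, flipud(x)) / N;   % same values for a real column vector
end
r = r(N:end);                     % keep lags 0..N-1, so r(k+1) is lag k

% Lag limits that correspond to the 60-260 Hz pitch range.
Lmin = ceil(fs/260);              % shortest admissible period [samples]
Lmax = floor(fs/60);              % longest admissible period [samples]

% max returns the value and, as second output, the position in the range.
[rmax, pos] = max(r(Lmin+1:Lmax+1));
L0 = Lmin + pos - 1;              % fundamental period [samples]
T0 = L0/fs;                       % fundamental period [s]
f0 = 1/T0;                        % fundamental frequency [Hz]

fprintf('L0 = %d samples, T0 = %.4f s, f0 = %.1f Hz\n', L0, T0, f0);
```

Plotting x and r in two subplots, with vertical lines at lags Lmin and Lmax, gives the kind of figure asked for in the 1st checked result. For an unvoiced frame, the autocorrelation typically shows no clear peak inside these limits.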
- Pitch estimation in a longer utterance
- Implement the estimation of the fundamental frequency in individual
short-time frames over the whole utterance
(similarly to the power computation carried out in Task
No. 2). Compute the pitch for each short-time frame of the analyzed utterance.
- The frame length should be 32 ms and the frame step should be
16 ms (i.e. work with 50% overlap of the analyzed short-time frames).
- 2nd checked result: the signal waveform and the
computed pitch (f_0) for all available frames of
the whole utterance
SA176S01.CS0 (raw data,
fs = 16000 Hz), as well as for your own on-line recorded signal.
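A possible shape of the frame-by-frame loop is sketched below. It is an illustration under assumptions, not a reference solution: the test signal is synthetic (125 Hz followed by 250 Hz, each with three harmonics) and stands in for SA176S01.CS0, and the biased autocorrelation is written with conv so that the sketch runs without xcorr; for a real column vector it gives the same values as xcorr(x, 'biased').

```matlab
fs    = 16000;                    % sampling frequency [Hz]
wlen  = round(0.032*fs);          % frame length: 32 ms -> 512 samples
wstep = round(0.016*fs);          % frame step:   16 ms -> 256 samples

% Hypothetical test signal in place of the recorded utterance.
n = (0:8191)';
h = @(f) sin(2*pi*f*n/fs) + 0.5*sin(2*pi*2*f*n/fs) + 0.25*sin(2*pi*3*f*n/fs);
s = [h(125); h(250)];

Lmin = ceil(fs/260);              % lag limits for the 60-260 Hz range
Lmax = floor(fs/60);

nfr = floor((length(s) - wlen)/wstep) + 1;   % number of complete frames
f0  = zeros(nfr, 1);
for i = 1:nfr
    x = s((i-1)*wstep + (1:wlen)');          % i-th frame (column vector)
    r = conv(x, flipud(x)) / wlen;           % biased autocorrelation
    r = r(wlen:end);                         % lags 0..wlen-1
    [~, pos] = max(r(Lmin+1:Lmax+1));
    f0(i) = fs/(Lmin + pos - 1);             % pitch of the i-th frame [Hz]
end

tfr = ((0:nfr-1)'*wstep + wlen/2)/fs;        % frame centers [s], for plotting
```

Plotting s against (0:length(s)-1)/fs in one subplot and f0 against tfr in a second one puts the waveform and the pitch contour on a common time axis.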
- Detection of voiced frames during pitch estimation
- Try to detect unvoiced frames on the basis of the zero-crossing rate (ZCR). For
unvoiced frames, set the fundamental frequency to f_0 = 10 Hz.
- Detect also speech pause frames on the basis of the energy computation
(frames with low energy). Set f_0 = 0 Hz for
these frames.
- 3rd checked result: the signal waveform and the
computed f_0 for voiced frames only,
for the on-line recorded signal
(or for the analyzed utterance SA176S01.CS0).
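One way to combine the energy and ZCR decisions with the pitch estimate is sketched below. The thresholds Ethr and Zthr are assumptions made for this synthetic example only and would have to be tuned for real recordings. The test signal is also hypothetical: a low-level noise "pause", a stronger noise segment standing in for unvoiced speech, and a harmonic "voiced" segment.

```matlab
fs    = 16000;
wlen  = round(0.032*fs);          % 32 ms frame
wstep = round(0.016*fs);          % 16 ms step
Lmin  = ceil(fs/260);
Lmax  = floor(fs/60);

% Hypothetical signal: pause, noise-like "unvoiced" part, harmonic "voiced" part.
n = (0:4095)';
voiced = sin(2*pi*125*n/fs) + 0.5*sin(2*pi*250*n/fs) + 0.25*sin(2*pi*375*n/fs);
s = [0.001*randn(4096,1); 0.1*randn(4096,1); voiced];

Ethr = 1e-4;    % assumed energy threshold (depends on signal scaling)
Zthr = 0.25;    % assumed ZCR threshold (sign changes per sample)

nfr = floor((length(s) - wlen)/wstep) + 1;
f0  = zeros(nfr, 1);
for i = 1:nfr
    x = s((i-1)*wstep + (1:wlen)');
    E = mean(x.^2);                                % short-time energy (power)
    Z = sum(abs(diff(sign(x)))) / (2*(wlen - 1));  % zero-crossing rate
    if E < Ethr
        f0(i) = 0;                                 % speech pause
    elseif Z > Zthr
        f0(i) = 10;                                % unvoiced frame
    else
        r = conv(x, flipud(x)) / wlen;             % biased autocorrelation
        r = r(wlen:end);
        [~, pos] = max(r(Lmin+1:Lmax+1));
        f0(i) = fs/(Lmin + pos - 1);               % voiced frame
    end
end
```

The energy test comes first, so a quiet frame is marked as a pause regardless of its ZCR. Frames that straddle two segments may fall on either side of the thresholds.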
- POSSIBLE IMPROVEMENT - think about smoothing the pitch
estimate using median filtering (function
med.m) or other postprocessing.
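The idea of median smoothing can be illustrated with a short running-median loop. This sketch does not use med.m, whose interface is not given here. The window length M = 3 and the pitch values are made up for the example, and the first and last frames are simply left unfiltered.

```matlab
% Hypothetical pitch track [Hz] with two isolated outliers
% (a doubled and a halved value, as an octave error might produce).
f0 = [120 121 119 240 120 122 121 60 120 119]';

M    = 3;                          % assumed odd median window length [frames]
half = (M - 1)/2;
f0s  = f0;                         % edge frames are left unfiltered here
for i = 1+half : numel(f0)-half
    f0s(i) = median(f0(i-half:i+half));   % median of the local window
end

disp([f0 f0s]);                    % original and smoothed values side by side
```

An isolated outlier is replaced by a neighboring value, while slow changes in the contour are kept. When pauses and unvoiced frames are marked with 0 Hz and 10 Hz, it may be preferable to filter the voiced frames only.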
- Pitch of your voice.
- 4th checked result: Compute the average
value of your voice pitch (f_0).
- Try to estimate the pitch also in Praat and WaveSurfer.
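For the average pitch in the 4th checked result, the pause and unvoiced frames marked with f_0 = 0 Hz and f_0 = 10 Hz should be left out of the average. A minimal sketch, assuming a made-up per-frame result vector f0:

```matlab
% Hypothetical per-frame result:
% 0 Hz = pause, 10 Hz = unvoiced, anything else = voiced estimate.
f0 = [0 0 10 118 121 123 120 10 0 119 122 0]';

isvoiced = f0 > 10;                 % keep voiced frames only
f0_mean  = mean(f0(isvoiced));      % average pitch [Hz]
f0_med   = median(f0(isvoiced));    % median, less sensitive to outliers

fprintf('mean f0 = %.1f Hz, median f0 = %.1f Hz (%d voiced frames)\n', ...
        f0_mean, f0_med, sum(isvoiced));
```

The median is printed next to the mean because a few octave errors can shift the mean noticeably.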