Complex Cepstrum — Fundamental Frequency Estimation
This example shows how to estimate a speaker's fundamental frequency using the complex cepstrum. The example also estimates the fundamental frequency using a zero-crossing method and compares the results.
Load the speech signal. The recording is of a woman saying "MATLAB". The sampling frequency is 7418 Hz. The following code loads the speech waveform, mtlb
, and the sampling frequency, Fs
, into the MATLAB® workspace.
load mtlb
Use the spectrogram to identify a voiced segment for analysis.
segmentlen = 100;
noverlap = 90;
NFFT = 128;
spectrogram(mtlb,segmentlen,noverlap,NFFT,Fs,'yaxis')
Extract the segment from 0.1 to 0.25 seconds for analysis. The extracted segment corresponds roughly to the first vowel, /æ/, in "MATLAB".
dt = 1/Fs; I0 = round(0.1/dt); Iend = round(0.25/dt); x = mtlb(I0:Iend);
Obtain the complex cepstrum.
c = cceps(x);
Select a time range between 2 and 10 ms, corresponding to a frequency range of approximately 100 to 500 Hz. Identify the tallest peak of the cepstrum in the selected range. Find the frequency corresponding to the peak. Use the peak as the estimate of the fundamental frequency.
t = 0:dt:length(x)*dt-dt;
trng = t(t>=2e-3 & t<=10e-3);
crng = c(t>=2e-3 & t<=10e-3);
[~,I] = max(crng);
fprintf('Complex cepstrum F0 estimate is %3.2f Hz.\n',1/trng(I))
Complex cepstrum F0 estimate is 239.29 Hz.
Plot the cepstrum in the selected time range and overlay the peak.
plot(trng*1e3,crng) xlabel('ms') hold on plot(trng(I)*1e3,crng(I),'o') hold off
Use the zerocrossrate
function on a lowpass-filtered and rectified form of the vowel to estimate the fundamental frequency.
[b0,a0] = butter(2,325/(Fs/2));
xin = abs(x);
xin = filter(b0,a0,xin);
xin = xin-mean(xin);
zc = zerocrossrate(xin);
F0 = 0.5*Fs*zc;
fprintf('Zero-crossing F0 estimate is %3.2f Hz.\n',F0)
Zero-crossing F0 estimate is 234.94 Hz.