How to accurately compute the phase lag between two time series with same sampling frequency.

150 views (last 30 days)
Hi everyone,
My data set consists of two time series. Both the series have a consist time lag but that varies (a bit) with tiem as well. So, i am interested to compute that variation in phase lag between two series.
How my data looks:
What i get after cross-correlation approcah:
Here is the dataset detail:
clear all
clc
s=load("Data_timeseries.csv");
ev_time=datenum(s(:,1),s(:,2),s(:,2),s(:,4),s(:,5),s(:,6)); % time
ser1=s(:, 7); % series 1
ser2=s(:, 8); % series 2
Data is also attached, Thank you!

Accepted Answer

Chunru
Chunru on 7 Oct 2022
Edited: Chunru on 7 Oct 2022
s=load(websave("Data_timeseries.csv", "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1148175/Data_timeseries.csv"));
ev_time=datenum(s(:,1),s(:,2),s(:,2),s(:,4),s(:,5),s(:,6)); % time
ser1=s(:, 7); % series 1
ser2=s(:, 8); % series 2
t = (0:length(ser1)-1);
ser1env = envelope(ser1, 100, 'peak'); % need envolope detection
yyaxis left
plot(t, ser1, 'r', t, ser1env, 'g');
yyaxis right
plot(t, ser2, 'b')
ns = length(ser1);
lwin = 2000;
overlap = 1000;
i1 = 1;
c = 1;
while true
i2 = i1+lwin-1;
if i2>ns
break
end
% remove mean before xcorr
[r,lags] = xcorr(ser1env(i1:i2)-mean(ser1env(i1:i2)), ser2(i1:i2)-mean(ser2(i1:i2)), 'coeff');
[rmax, imax] = max(r);
res(c, :) = [i1, lags(imax)];
i1 = i1 + (lwin-overlap);
c = c + 1;
end
res
res = 13×2
1 315 1001 331 2001 -410 3001 355 4001 -377 5001 327 6001 334 7001 320 8001 366 9001 -336
% remove mean before xcorr
[r,lags] = xcorr(ser1env-mean(ser1env), ser2-mean(ser2), 'coeff');
figure
plot(lags, r);
grid on
[rmax, imax] = max(r);
tmax = lags(imax);
tmax, rmax
tmax = 348
rmax = 0.5001
hold on
plot(tmax, rmax, 'ro')
  8 Comments
William Rose
William Rose on 7 Oct 2022
@Adnan Barkat, The plot you shared above of phase lag versus "Time (Hours)" does not have numbers on the y axis, so it is impossible to tell what the scale is. Please don't do this, because it makes it hard to interpret what you have done or want done. (Some of your other shared plots also lacked numbers on the axes.)
I disagree with your assertion that "we can at least calculate the phase lag for each hour or 2, 3 etc hours". The rason I disagree is that the signal ser2 (blue in your first plot) has a period of roughly 12 hours. The phase difference between two signals of approximately equal frequency can be estimated once per cycle, at best, by comparing zero crossing times. Estimates that are more frequent than 1 per cycle do not make sense in my opinion. The values obtained from such esitmates are more likely to reflect the non-sinusoidal nature of the respective signals than the actual phase differences. The whole concept of phase lag is based on the idea of comparing 2 sinusoids of equal frequency. Therefore one phase lag esitmate per cycle is the most frequent that is justified. The plot of ser2 shows 19.5 cycles in 10 days, so 1 cycle is 10/19.5 = 0.513 days long, or 12.31 hours.
I recommend that you make the time units clear in your code. You calculate ev_time in your code with datenum(). datenum produces times in units of days. @Chunru makes a time vector comprised of integers. Therefore his time vector measures time in samples. Since the sampling rate is 1 per minute, @Chunru 's time vector are also in units of minutes.
Andi
Andi on 7 Oct 2022
Edited: Andi on 7 Oct 2022
@William Rose Thanks for explaining. Actually, I am looking for very short-term phase shift measurements, that's why I consider short windows. There can be another way, we can start with a 1-day window and then use a moving window of 100, or 500 points to identify short-term fluctuations.
Another thing is pre-processing steps that may help to refine the signal. For example, I sue filters to smooth the time series and then remove the trend. Now the data looks more reasonable. But I am not sure whether the pre-processing steps influence the phase measurements [attached]
Plus, as per the underlying mechanism of the dataset, phase values should be the same most of the time, but the values computed by @Chunru is varying a lot, which makes be a bit confused why is this case.

Sign in to comment.

More Answers (2)

William Rose
William Rose on 7 Oct 2022
I will address your quesiton about phase, but first I have some quesitons. Then I adress phase, below.
Your upper plot does not show x or y axis values. When I plot the data, it does not look like your plot.
s=load("Data_timeseries.csv");
t=datenum(s(:,1),s(:,2),s(:,3),s(:,4),s(:,5),s(:,6)); % time (days)
t=(t-t(1)); %remove initial time offset
ser1=s(:, 7); % series 1
ser2=s(:, 8); % series 2
plot(t,ser1,'-g',t,ser2,'-b')
legend('Ser1','Ser2'); xlabel('Time (days)')
Plot of your data, above, does not look like your plot.
Maybe if I remove the mean value from each:
plot(t,ser1-mean(ser1),'-g',t,ser2-mean(ser2),'-b')
legend('Ser1-mean','Ser2-mean'); xlabel('Time (days)')
The plot of your mean-adjusted data, above, looks more like the plot you shared, but still not the same. The ser1 data in your file is a lot noisier than the green trace in your plot. Please comment on this difference.
How did you compute and plot cross correlation? I am asking because the cross corr plot you shared does not look like what I would expect. A cross correlation like what you showed is what happens if you do not first remove the mean from both signals. See plots below.
[rxy,lags]=xcorr(ser1,ser2,'normalized'); %compute cross correlation
subplot(211), plot(lags,rxy,'-b'); %plot cross correlation
xlabel('Lags (samples)'); ylabel('r_{ser1,ser2}') %add labels
dt=(t(end)-t(1))/(length(t)-1); %sampling interval (days)
subplot(212), plot(lags*dt,rxy,'-b'); %plot cross correlation
xlabel('Lag (days)'); ylabel('r_{ser1,ser2}') %add labels
The only difference in the plots above is the x-axis scaling. Neither one matches your plot. That's why I'm curious about how you made your cross corr plot. Both of the plots above are not the true cross correlation, because the mean was not removed. The true cross correlation is below.
[rxy,lags]=xcorr(ser1-mean(ser1),ser2-mean(ser2),'normalized'); %compute cross corr
figure; subplot(211); plot(lags*dt,rxy,'-b') %plot cross corr
xlim([-1 1]); title('Cross Correlation'); grid on %add title, grid
xlabel('Lag (days)'); ylabel('r_{ser1,ser2}') %add labels
subplot(212); plot(lags*dt,rxy,'-b') %plot cross corr
xlim([0 .5]); title('Cross Correlation'); grid on %add title, grid
xlabel('Lag (days)'); ylabel('r_{ser1,ser2}') %add labels
This is more reasonable and what I would expect.
----------
Now, regarding phase between the signals. Do you want one estimate of the phase difference between ser1 and ser2, based on this recording? Or do you want a new phase difference estimate for every cycle of the waves? How do you plan to use this information?
I assume you want a single phase estimate. Otherwise there would have been no reason for you to plot the cross correlation.
The plots of the signals, above, show that each is roughly sinusoidal. The frequency is about 19.5 cycles in ten days, so f~=19.5/10=1.95.
The cross correlation xcorr(ser1,ser2) has a positive peak at time t=+0.24, evident on the cross corr plot above. This means ser1 lags ser2 by 0.24 seconds.
The phase difference, in radians, is
,
or , where td is the lag, in units of time, and f is the frequency, in cycles per unit time.
  3 Comments
Andi
Andi on 7 Oct 2022
@William Rose, thanks. Well, here is the point-by-point reply for what I am looking for.
Data presentation: I plot the Ser2 as a secondary y-axis, if you plot it as secondary you will get more or less the same results. Yes, I agree still there is a difference because I first smooth the time series and then the plot. But both are fine, we can proceed without smoothing as well.
Cross-correlation plot: More or less I use the same script for plotting the cross-correlation as you did. The difference is just due to plotting.
Phase calculation: Regarding the phase, I am interested in calculating the phase change for small-time windows (or sample length). For example, in total I have 14400 data samples, I want to compute the phase for e.g., 500 samples with a moving window of 100 (or any other combination, which should be user defined because I need to tune to get the best results).
Phase for Each cycle: Yes, that may also be good to see the short-term phase changes. I think both approaches are worthwhile to test.

Sign in to comment.


William Rose
William Rose on 7 Oct 2022
There is a nice reuslt below, it just takes a while to get to it.
To estimate phase lag of ser1 relative to ser2, once per cycle, you will want to compare the zero crossing time differences. Use either positive or negative crossings. Phase lag estimates more frequent than once per cycle do not make sense. The phase difference, in radians, is
where is the zero crossing time difference, in units of time, and f is frequency in cycles per unit time.
s=load("Data_timeseries.csv");
t=datenum(s(:,1),s(:,2),s(:,3),s(:,4),s(:,5),s(:,6)); % time (days)
t=(t-t(1)); %time (days) (initial offset removed)
ser1=s(:, 7)-mean(s(:,7)); % series 1 minus its mean
ser2=s(:, 8)-mean(s(:,8)); % series 2 minus its mean
To estimate the zero crossing times, you will need to remove drift (high pass filter) and noise (low pass filter). Use filters with zero phase for this, because filters with non-zero phase could lead to errors in the phase lag esitmates. Zero phase filters include Matlb's filtfilt() and non-causal FIR filters that are symmetric about the middle.
Visual inspection of the signals shows 19.5 cycles in 10 days, so the dominant frequency is 1.95 cycles/day. We can do low pass and high pass filtering simultaneously with a bandpass filter. Therefore we make a bandpass with cutoff frequenencies 1/day and 4/day.
N=length(t);
dt=(t(end)-t(1))/(N-1); %sampling interval (days)
fs=1/dt; fNyq=fs/2; %sampling frequency, Nyquist frequency (cycles/day)
f1=1; f2=4; %bandpass corner frequencies (cycles/day)
[z,p,k]=butter(4,[f1,f2]/fNyq,'bandpass'); %4th order Butterworth bandpass filter coefficients
[sos,g]=zp2sos(z,p,k); %sos representation of the filter
y1=filtfilt(sos,g,ser1); %apply zero phase filter to ser1
y2=filtfilt(sos,g,ser2); %apply zero phase filter to ser2
Plot the signls pre and post filtering to make sure the filtering process is working as desired:
subplot(211), plot(t,ser1,'-g',t,y1,'-k');
xlabel('Time (days)'); legend('Ser1=raw','y1=filtered'); grid on
subplot(212), plot(t,ser2,'-b',t,y2,'-k');
xlabel('Time (days)'); legend('Ser2=raw','y2=filtered'); grid on
The plots above are reassuring. The filtered signals look like what we expect and desire.
Find the indices of the upward zero crossings of the filtered signals with an ingenious function from @Star Strider
zci = @(v) find(diff(sign(v))>0); %define function: returns indices of +ZCs
izc1=zci(y1); %find indices of + zero crossings of y1
izc2=zci(y2); %find indices of + zero crossings of y2
Interpolate to get time of zero crossing, using a function from @Nick Hunter.
ZeroX = @(x0,y0,x1,y1) x0 - (y0.*(x0 - x1))./(y0 - y1); % Interpolated x value for Zero-Crossing
ZT1 = ZeroX(t(izc1),y1(izc1),t(izc1+1),y1(izc1+1));
ZT2 = ZeroX(t(izc2),y2(izc2),t(izc2+1),y2(izc2+1));
Plot the results:
figure; subplot(211);
plot(t,y1,'-b',ZT1,zeros(size(ZT1)),'r+');
title('y1=filtered ser1, and + zero crossings'); grid on
subplot(212); plot(t,y2,'-b',ZT2,zeros(size(ZT2)),'r+');
title('y2=filtered ser2, and + zero crossings'); grid on
xlabel('Time (days)')
Compute the differences between the zero crossing times, and convert those difference to phases, and plot them versus time.
tzcd=ZT1-ZT2(1:end-1); %zero crossing time differeces (days)
fdom=19.5/10; %dominant frequency (cycles/day)
phz=360*fdom*tzcd; %phase lag of y1 relative to y2 (degrees)
figure; plot(ZT2(1:end-1),phz,'-rx');
xlabel('Time (days)'); ylabel('Lag (degrees)'); title('Phase Lag v. Time')
ylim([0,360]); set(gca,'ytick',0:90:360); grid on
This looks good. The mean phase lag above is a bit less than 180 degrees. The average phase lag for the entire signal, determined in my earlier answer, using the cross correlation, was 168 degrees. So the cycle-by-cycle results are consitent with the overall result.
  4 Comments
Andi
Andi on 8 Oct 2022
@William Rose Thank you a lot for your support and encouragement. Now, I am trying to generalize the script for the actual scenario. I encountered an issue at the last stage, where we manually input the total cycle identified in each time series. Below step:
tzcd=ZT1-ZT2(1:end-1); %zero crossing time differeces (days)
fdom=19.5/10;
Here is the full implementation of the script for your expert opinion:
clear all
clc
% % ...........Dataset1 ............%
cd_ev=readmatrix('ob.csv'); % % Observation data
ev_time=datenum(cd_ev(:,1),cd_ev(:,2),cd_ev(:,3),cd_ev(:,4),cd_ev(:,5),cd_ev(:,6));
cand_ev=ev_time'; % conversion to datenum
for jj=1:4 % here i used just 4 but actual observations much more than this
b=cand_ev(:,jj);
aa(jj)= addtodate(b, 5, 'day'); % define the upper limit of observation period
bb(jj)= addtodate(b, -5, 'day'); % define lower limit of the observation period
end
CE_U=aa; % upper limit
CE_L=bb; % lower limit
%
% % ------------ Series 1-------------%
%
d_tid=readmatrix('ser_t.csv'); % Series 1 with time and observations
d_tid(1,:) = [];
d_tid(:,1) = [];
s=d_tid;
t1=datenum(s(:,1),s(:,2),s(:,3),s(:,4),s(:,5),s(:,6));
t_d=[t1 d_tid(:,7)]; % after conversion and deleting unnecessary data
%------------- Data Set 3 --------------%
% Same procedure repeated for second time series
d_ven=readmatrix('ser_v.csv');
d_ven(1,:) = [];
d_ven(:,1) = [];
ss=d_ven;
ts=datenum(ss(:,1),ss(:,2),ss(:,3),ss(:,4),ss(:,5),ss(:,6));
v_d=[ts d_ven(:,7)];
% ---------- Data selection for a particular event ---------%
% Above mentioend limits are used to select the data of interest from both
% series.
s1=t_d(t_d(:,1)>= CE_L(6) & t_d(:,1)<= CE_U(6),:);
s2=v_d(v_d(:,1)>= CE_L(6) & v_d(:,1)<= CE_U(6),:);
%-------------------------------------------------------------------------%
% AFTERWARD, I REPEAT THE SAME STEPS AS YOU SUGGESTED %
S=[s1 s2];
%----------Pre-processing of tiem series -------------------%
t=S(:, 1);
t=(t-t(1)); %time (days) (initial offset removed)
ser1=S(:, 2)-mean(S(:,2)); % series 1 minus its mean
ser2=S(:, 4)-mean(S(:,4)); % series 2 minus its mean
%-------- Filtering and zero-phase -----------%
N=length(t);
dt=(t(end)-t(1))/(N-1); %sampling interval (days)
fs=1/dt; fNyq=fs/2; %sampling frequency, Nyquist frequency (cycles/day)
f1=1; f2=4; %bandpass corner frequencies (cycles/day)
[z,p,k]=butter(4,[f1,f2]/fNyq,'bandpass'); %4th order Butterworth bandpass filter coefficients
[sos,g]=zp2sos(z,p,k); %sos representation of the filter
y1=filtfilt(sos,g,ser1); %apply zero phase filter to ser1
y2=filtfilt(sos,g,ser2); %apply zero phase filter to ser2
%--------- Plotting ----------%
subplot(311), plot(t,ser1,'-g',t,y1,'-k');
xlabel('Time (days)'); legend('Ser1=raw','y1=filtered'); grid on
subplot(312), plot(t,ser2,'-b',t,y2,'-k');
xlabel('Time (days)'); legend('Ser2=raw','y2=filtered'); grid on
subplot(313),
yyaxis left
plot(t,y1,'-b')
yyaxis right
plot(t,y2,'-r');
xlabel('Time (days)'); legend('s1 filt','s2 filt'); grid on
%--------- Zero crossing time --------%
zci = @(v) find(diff(sign(v))>0); %define function: returns indices of +ZCs
izc1=zci(y1); %find indices of + zero crossings of y1
izc2=zci(y2); %find indices of + zero crossings of y2
%Interploation
ZeroX = @(x0,y0,x1,y1) x0 - (y0.*(x0 - x1))./(y0 - y1); % Interpolated x value for Zero-Crossing
ZT1 = ZeroX(t(izc1),y1(izc1),t(izc1+1),y1(izc1+1));
ZT2 = ZeroX(t(izc2),y2(izc2),t(izc2+1),y2(izc2+1));
%---------- plotting ---------%
figure; subplot(211);
plot(t,y1,'-b',ZT1,zeros(size(ZT1)),'r+');
title('y1=filtered ser1, and + zero crossings'); grid on
subplot(212); plot(t,y2,'-b',ZT2,zeros(size(ZT2)),'r+');
title('y2=filtered ser2, and + zero crossings'); grid on
xlabel('Time (days)')
%---------- Zero crossing and phase calculation ---------%
tzcd=ZT1-ZT2(1:end-1); %zero crossing time differeces (days)
fdom=19.5/10; %dominant frequency (cycles/day)
phz=360*fdom*tzcd; %phase lag of y1 relative to y2 (degrees)
figure; plot(ZT2(1:end-1),phz,'-rx');
xlabel('Time (days)'); ylabel('Lag (degrees)'); title('Phase Lag v. Time')
ylim([0,360]); set(gca,'ytick',0:90:360); grid on
I need to repeat the same procedure for each observations in dataset1, if there is no missing data in series1 and ser2 during that partcualr time window.
Problematic step
tzcd=ZT1-ZT2(1:end-1);
If, i change the order of series or change the dataset this lines give error.
[Data is also attached]
William Rose
William Rose on 8 Oct 2022
@Adnan Barkat, please send me a secure email, by clicking on the WR circle nex to my messages, then click on the envelope in the pop-up that appears. Thanks.

Sign in to comment.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!