# How to improve my code speed (for loop)

1 view (last 30 days)
Marleen van Dijk on 9 Jun 2020
Edited: Fabio Freschi on 9 Jun 2020
T = clasval.T;
Gamma_x = clasval.Gamma_x;
Gamma_xy = clasval.Gamma_xy;
Gamma_y = clasval.Gamma_y;
ut = clasval.ut;
[nsamples,dim]=size(trainingsset);
scorematrix=zeros(nsamples,nsamples);
for i=1:nsamples
for j=1:nsamples
probe=trainingsset(i,:);
gallery=trainingsset(j,:);
llr_score;
scorematrix(j,i)=score;
if i == j
break;
end
end
end
I would like to improve the speed f the following code. I already tried preallocating scoreamatrix, but it is still very slow. The code llr_score.m contains the following:
x=T*(probe-ut)';
y=T*(gallery-ut)';
score=x'*Gamma_x*x + x'*Gamma_xy*y+y'*Gamma_y*y;

Show 1 older comment
Marleen van Dijk on 9 Jun 2020
These are the dimensions. What is an MWE?
Fabio Freschi on 9 Jun 2020
is dim always 1?
Marleen van Dijk on 9 Jun 2020
Dim is equal to 1672 and n samples is equal to 500.
The classiffier is made using this function:
%% function for LLR classifier
function clasval = classifier(dataset, pcac, ldac)
trainingset = dataset.trainingset;
id = str2double(dataset.data(3,:))';
m = length(unique(id));
N=1; %number of enrolment samples
M=1; % number of probe samples
p=pcac; % number of PCA coefficients
l=ldac; % number of lda coefficients
[nsamples,dim]=size(trainingset);
% determine statistics of total distribution
ut=mean(trainingset); % size 1*dim
utn=repmat(ut,nsamples,1); % size nsamples*dim
z=trainingset-utn; % zero mean vectors (size nsamples*dim)
clear utn;
% Ut(dimxdim) and Vt(nsamples*dim) are left and right singular vectors, St(dim*dim)
% diagonal matrix with singular values of Z
[Ut,St,Vt]=svd(z','econ');
% pca dimension reduction
Utp=Ut(:,1:p); % size dim*pcac
Stp=St(1:p,1:p); % size pcac*pcac
% transformation to make total distribution white
% size pcac*dim
T1=sqrt(nsamples-1)*pinv(Stp)*Utp'; % pcac*dim
% determine statistics of within class distribution
zc=[];
for i=1:m
indices=find(id==i); % size 1*no_train
uc=mean(trainingset(indices,:)); % mean of all images in set (1*dim)
zc=cat(1,zc,trainingset(indices,:)-repmat(uc,size(indices)));
end
% size zc nsamples*dim
y=(T1*zc'); % size pcac*nsamples
% Uw(pcac*pcac), Sw(pcac*pcac), Vw(nsamples*pcac)
[Uw,Sw,Vw]=svd(y,'econ');
% lda dimension reduction
Uwl=Uw(:,p-l+1:p); % size pcac*ldac
Swl=Sw(p-l+1:p,p-l+1:p); % size ldac*ldac
% total transformation that diagonalises within class and whitens total
T=Uwl'*T1; % size ldac*dim
Sigma_w=Swl*Swl/(nsamples-1);
Sigma_b=eye(l)-Sigma_w;
Sigma_winv=pinv(Sigma_w);
Sigma_binv=pinv(Sigma_b);
Gamma_a=Sigma_winv*pinv((N+M)*Sigma_winv+Sigma_binv)*Sigma_winv;
Gamma_b=Sigma_winv*pinv(N*Sigma_winv+Sigma_binv)*Sigma_winv;
Gamma_g=Sigma_winv*pinv(M*Sigma_winv+Sigma_binv)*Sigma_winv;
Gamma_x=M*M*(Gamma_a-Gamma_g);
Gamma_xy=2*M*N*Gamma_a;
Gamma_y=N*N*(Gamma_a-Gamma_b);
Gamma_x11=Sigma_winv*pinv(2*Sigma_winv+Sigma_binv)*Sigma_winv-Sigma_winv*pinv(Sigma_winv+Sigma_binv)*Sigma_winv;
Gamma_xy11=2*Sigma_winv*pinv(2*Sigma_winv+Sigma_binv)*Sigma_winv;
Gamma_y11=Gamma_x11;
Gamma_x1n=eye(l)-Sigma_winv;
Gamma_xy1n=2*Sigma_winv;
Gamma_y1n=-Sigma_winv;
clasval = struct('T', T, 'Gamma_x', Gamma_x, 'Gamma_xy', Gamma_xy, 'Gamma_y', Gamma_y, 'ut', ut, 'scorematrix', [], 'threshold', []);

Fabio Freschi on 9 Jun 2020
Edited: Fabio Freschi on 9 Jun 2020
By profiling, the most computational intensive operations are
x=T*(probe-ut)';
y=T*(gallery-ut)';
That calculation is inside the nested loop and repeated for the same values multiple times. I moved that calculation out of the loop
clear all, close all
%% original code
% dimensions
N1 = 5;
N2 = 1672; % dim
N3 = 500; % nsamples
% dummy values
T = rand(N1,N2);
Gamma_x = rand(N1,N1);
Gamma_xy = rand(N1,N1);
Gamma_y = rand(N1,N1);
ut = rand(1,N2);
trainingsset = rand(N3,N2);
[nsamples,dim]=size(trainingsset);
scorematrix=zeros(nsamples,nsamples);
tic
for i=1:nsamples
for j=1:nsamples
probe=trainingsset(i,:);
gallery=trainingsset(j,:);
% llr_score
x=T*(probe-ut)';
y=T*(gallery-ut)';
score=x'*Gamma_x*x + x'*Gamma_xy*y+y'*Gamma_y*y;
scorematrix(j,i)=score;
if i == j
break;
end
end
end
toc
%% new code
scorematrix2=zeros(nsamples,nsamples);
tic
xy = T*bsxfun(@minus,trainingsset,ut).';
for i=1:nsamples
for j=1:nsamples
x = xy(:,i);
y = xy(:,j);
score=x'*Gamma_x*x + x'*Gamma_xy*y+y'*Gamma_y*y;
scorematrix2(j,i)=score;
if i == j
break;
end
end
end
toc
% check
norm(scorematrix-scorematrix2,'fro')/norm(scorematrix,'fro')
On my computer the new version is 30x faster with relative error of oder 10^-16 on the output