# Why does filtering data before PCA improve results?

Morgan Facchin on 2 Aug 2022
Edited: Bruno Luong on 2 Aug 2022
I have a set of images that I want to discriminate using PCA. I noticed that applying a low-pass filter (using filter2) to the images before feeding them into PCA greatly improves the results: it increases the relative amount of variance in the first PCs, and the components correspond more closely to what I expect. This leads to a more general question: why does filtering improve the results? I have two conflicting intuitions on this:
• On the one hand, the performance is better simply because filtering reduces the noise in the images
• On the other hand, filtering is only a linear transformation of the data, and the principal axes found by PCA should be "dragged" by this linear transformation and give the exact same results.
Would you have any clues to help me clarify this?
Bruno Luong on 2 Aug 2022
Edited: Bruno Luong on 2 Aug 2022
Let me try to understand your question. I wrote this extremely simple code to get a feel for how filtering improves PCA, and my conclusion is quite the opposite:
M=diag([1,100]);
x=randn(2,1e6);
y=M*x;
% PCA of non-filtered data
[U,S,V]=svd(y',0);
PCA=V(:,1);
if PCA(2)<0
PCA=-PCA;
end
nfiltererror = norm(PCA-[0;1])
nfiltererror = 1.8027e-05
% PCA of filtered data (the "filter" here is the mean over all samples, so xf is 2-by-1)
xf = mean(x,2);
yf = M*xf;
[Uf,Sf,Vf]=svd(yf',0);
PCAf=Vf(:,1);
if PCAf(2)<0
PCAf=-PCAf;
end
filtererror = norm(PCAf-[0;1])
filtererror = 0.0279
if filtererror < nfiltererror
fprintf('filter is better\n');
else
fprintf('non-filter is better\n');
end
non-filter is better
So what do you observe? Can you make an MWE (an example with 2 pixels?) to show it?
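One possible two-pixel example, as a rough sketch rather than the asker's actual setup: each "image" is a shared signal along an assumed direction `u` plus independent noise on each pixel, and the filter acts across the pixels of each image (as filter2 would), not across samples. The matrix `F`, the direction `u`, and the values of `s` and `sigma` are all hypothetical choices made for illustration.

```matlab
% Two-pixel sketch: a shared "signal" plus independent pixel noise.
n = 1e5;                        % number of two-pixel "images"
u = [1; 1] / sqrt(2);           % assumed signal direction (both pixels move together)
s = 1;  sigma = 1;              % assumed signal and noise standard deviations

Y = u * (s * randn(1, n)) + sigma * randn(2, n);   % 2-by-n, one image per column

F  = [0.75 0.25; 0.25 0.75];    % hypothetical 2-tap smoothing filter
Yf = F * Y;                     % filtering is a linear map applied to each image

% Fraction of the total variance carried by the first principal component
sv  = svd(Y'  - mean(Y'),  0);
svf = svd(Yf' - mean(Yf'), 0);
ratio  = sv(1)^2  / sum(sv.^2)
ratiof = svf(1)^2 / sum(svf.^2)

% Values predicted from the covariance s^2*u*u' + sigma^2*I and its filtered version
expected  = (s^2 + sigma^2) / (s^2 + 2*sigma^2)
expectedf = (s^2 + sigma^2) / (s^2 + 1.25*sigma^2)
```

Under these assumptions `F` leaves `u` untouched and halves the orthogonal direction `[1;-1]`, so the noise variance off the signal axis drops by a factor of 4. The first PC's share of the variance should rise from about 0.67 to about 0.89 while the PC1 direction stays the same. This only holds because the smoothing filter happens to preserve the signal direction; a filter that attenuated `u` would behave differently.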

Matt J on 2 Aug 2022
Edited: Matt J on 2 Aug 2022
PCA applied to the transformed cluster should find PC1 close to L', and therefore the projections of the images on L' should be the same as they were on L (within a scaling factor).
That is true for a rotation, but for arbitrary linear transformations, it is not true when the dimension of L is greater than 1. We can recraft my example above to examine how the singular values change under an arbitrary transformation when L and L' are 2D:
X=rand(7,2); X=[X,X]; X=X-mean(X);
S1=svd(X,0)
S1 = 4×1
1.4299 1.0318 0.0000 0.0000
S2=svd(X*rand(4),0)
S2 = 4×1
2.0355 0.2737 0.0000 0.0000
Clearly, the change is also more than just a global scaling:
S1./S2.*[1 1 0 0]'
ans = 4×1
0.7025 3.7701 0 0
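To see the contrast with a rotation directly, here is a small sketch that reuses the same rank-2 construction of X. For centered data X the covariance is proportional to X'*X, and the transformed data X*A has covariance proportional to A'*(X'*X)*A. When A is orthogonal this is a similarity transform, so the singular values are unchanged; for a general A it is not. The matrices `Q` and `A` below are arbitrary random choices for illustration.

```matlab
X = rand(7,2); X = [X, X]; X = X - mean(X);   % same rank-2 construction as above

[Q, ~] = qr(randn(4));        % random orthogonal matrix (rotation or reflection)
A = rand(4);                  % arbitrary linear transformation

S0 = svd(X, 0);               % singular values of the original data
SQ = svd(X*Q, 0);             % after an orthogonal transformation
SA = svd(X*A, 0);             % after a general linear transformation

disp(norm(S0 - SQ))           % close to zero, up to round-off
disp(S0(1:2) ./ SA(1:2))      % two ratios that generally differ from each other
```

If the filter were orthogonal, the relative variance in each PC would be exactly preserved. A low-pass filter is not orthogonal (it shrinks some directions more than others), which is why the variance distribution across PCs can change.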

R2021b
