With PCA, how much of the photo did I compress?

Hi all,
Here is my own PCA code, which I wrote after following Prof. Andrew Ng's ML lectures.
It works very well.
The problem is that I do not know how much I compressed the original photo, or how to print that in the title of the photo.
In the titles I print the number of columns for the original photo and the number of principal components for the recovered (reconstructed) photo. Is that a valid way to specify how much I compressed?
So, could you please check and edit the title(sprintf('....')) parts?
clear
close all
clc
a=imread('ben.png');
b=a(:,:,2);
X=double(b)/255;
% imshow(X) is the same as imshow(b)
% imagesc(X) and imagesc(b) both display in color
[U,S,Xn]=pca(X);
% K = 20; % or find K with below algorithm
ss=sum(sum(S));
for K=1:size(X,2)
ss2(K)=S(K,K);
if sum(ss2)/ss >=0.99
break
end
end
Z = projectData(Xn, U, K);
Xappx= recoverData(Z, U, K);
figure
subplot(1, 2, 1);
imshow(b)
title(sprintf('Original: %d features', size(X,2)));
axis square;
subplot(1, 2, 2);
imshow(Xappx)
title(sprintf('Recovered: with top %d principal components', K));
axis square;
function [U, S, X] = pca(X)
% U = zeros(n);
% S = zeros(n);
m=size(X,1);
sigma = (1/m)*(X'*X);
[U, S , ~] = svd(sigma);
end
function Z = projectData(X, U, K)
% Z = zeros(size(X, 1), K);
U_reduce = U(:,(1:K)); % n x K
Z = X * U_reduce; % m x k
end
function X_rec = recoverData(Z, U, K)
% X_rec = zeros(size(Z, 1), size(U, 1));
% m * n
X_rec = Z * U(:,1:K)'; %=m*n
end

Accepted Answer

William Rose on 15 Jul 2022
The percent compression is (Au-Ac)/Au, where Au= the amount of information needed to generate the uncompressed image, and Ac= the amount of information needed to generate the compressed image.
Au: Image ben.png is 1408x1849 pixels. The monochrome image (for example, the green channel of the image, which you chose) has 1 byte per pixel, as represented in MATLAB. Therefore Au=2,603,392 bytes. (A monochrome PNG file on disk will probably be smaller, because PNG uses a lossless compression algorithm.)
Ac: To reconstruct an image that has been compressed with PCA, you need the basis set images and the weighting factors which tell you how much of each basis image to use in the reconstruction. Therefore PCA actually requires MORE information than the original image, if you are only compressing one image: you have to supply each of the basis images, plus the weighting factors. If you are compressing a large set of images, then PCA can produce good compression, because you use a common basis set for all the images, and a small set of weighting factors for each image. For example, suppose you had 1000 images with the same size as "ben". The raw monochrome images require 1408x1849x1000 bytes, i.e. Au=2.603 x 10^9 bytes. If you reconstruct the images using the first 20 principal components, you would need 1408x1849x20 for the basis images, plus 20x1000 for the weighting factors, i.e. Ac=5.209 x 10^7 bytes.
The compression percentage, if you had 1000 images and reconstructed them using 20 principal components (i.e. a basis set of 20 images), would be (Au-Ac)/Au= 98%. The compression ratio for one image with PCA is a negative number, because you need more information to reconstruct it with PCA than the original image.
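The arithmetic above can be checked with a few lines of MATLAB. This is only a sketch of the bookkeeping, assuming (as in the text) one byte per stored number; the variable names nImg, pct, Au1, Ac1, and pct1 are introduced here for illustration.

```matlab
% Worked arithmetic for the 1000-image example (1 byte per stored number assumed).
r = 1849; c = 1408;        % image size: rows, columns
nImg = 1000;               % number of images in the library
K = 20;                    % number of principal components kept

Au = r*c*nImg;             % bytes for the uncompressed library
Ac = r*c*K + K*nImg;       % bytes for K basis images plus K weights per image
pct = 100*(Au - Ac)/Au;    % percent compression
fprintf('Au = %.4g bytes, Ac = %.4g bytes, compression = %.1f%%\n', Au, Ac, pct);

% Single image: the K basis images alone already cost more than the original.
Au1 = r*c;
Ac1 = r*c*K + K;
pct1 = 100*(Au1 - Ac1)/Au1;
fprintf('Single image: compression = %.1f%%\n', pct1);
```

With a single image the percentage comes out negative, which is the point made above: the basis images dominate the cost unless they are shared across many images.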
See this article for more.
For a set of RGB images, you can do PCA on each color independently.
  2 Comments
ali yaman on 16 Jul 2022
Thank you for your comprehensive answer. I think I got it.
So, I should not use PCA on just one data item (only one photo), but on many, right?
Also, we cannot tell whether I compressed successfully or not, because it is just one photo, and that is why the compression ratio is negative, right?
Thanks.
William Rose on 18 Jul 2022
Edited: William Rose on 19 Jul 2022
[edited: I should have said "row" in some places where I said "column", and I added text to clarify.]
The description I referenced for PCA on a set of images is one approach, but it is not the only approach. When you do PCA on a set of images, as described in the website I cited, each principal component is itself an image. Each individual image in the library of compressed images is then reconstructed by adding varying amounts of the different principal component images. That method only makes sense to use if you have a library of images. It works best when the images have some common features, such as a set of faces.
A different approach to PCA, which works for a single image, is to treat the image matrix as a set of row vectors, and then find the principal components (PCs) for the matrix.
[coeff,score,~] = pca(double(img),'Centered',false);
img is the original monochrome image (a 2D array). coeff is the matrix of principal components. Each column of coeff is one principal component. score is the matrix of weighting factors. Each row of score holds the weights needed to reconstruct the corresponding row of img. I use double(img) to convert the values in img from unsigned integers to floating point numbers, as required by pca(). I use 'Centered',false to prevent the subtraction of the mean value from each column.
Reconstruct the image by adding varying amounts of the different principal component vectors. To reconstruct the image using the first 10 PCs, do this:
imgRC=uint8(score(:,1:10)*coeff(:,1:10)');
imgRC is the reconstructed monochrome image (a 2D array). I use uint8() to convert the floating point values to unsigned 8-bit integers.
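The single-image steps above can be sketched end to end without the toolbox. With 'Centered',false, pca() amounts to a singular value decomposition of the data matrix, so this sketch swaps in svd() (a substitution made here to keep the example self-contained, not what the answer used): coeff corresponds to V and score to U*S, up to the sign of each component. The synthetic test image and the list Klist are made up for illustration.

```matlab
% Synthetic 120x160 monochrome test image (a stand-in for a real photo).
[xg, yg] = meshgrid(linspace(0,1,160), linspace(0,1,120));
disk = double((xg-0.5).^2 + (yg-0.5).^2 < 0.1);
img = uint8(255*(0.4 + 0.2*sin(6*pi*xg).*cos(4*pi*yg) + 0.2*xg.*yg + 0.2*disk));

X = double(img);
[Usv, Ssv, Vsv] = svd(X, 'econ');   % X = Usv*Ssv*Vsv'
coeff = Vsv;                        % columns play the role of principal components
score = Usv*Ssv;                    % each row holds the weights for one image row

Klist = [1 5 10 20];                % numbers of PCs to try
rmsErr = zeros(size(Klist));
for i = 1:numel(Klist)
    K = Klist(i);
    imgRC = uint8(score(:,1:K)*coeff(:,1:K)');            % reconstruct with K PCs
    rmsErr(i) = sqrt(mean((double(imgRC(:)) - X(:)).^2)); % error in gray levels
    fprintf('K = %2d: RMS error = %.2f gray levels\n', K, rmsErr(i));
end
```

Watching how the RMS error falls as K grows is one way to pick the number of PCs, alongside the variance-retained criterion used in the question.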
This can work reasonably well for a single image. For a color image, split the color image into 3 monochrome images, and do PCA on each of them, as described above, then combine the 3 mono images to get the reconstructed color image. See code below:
img=imread(imagefile);
imgr=img(:,:,1); %red original
imgg=img(:,:,2); %green original
imgb=img(:,:,3); %blue original
Then do PCA on imgr, imgg, and imgb separately, as described above. You will get three reconstructed images. Suppose they are named imgRCr, imgRCg, and imgRCb. Then you create the color reconstructed image as follows:
imgRC=cat(3,imgRCr,imgRCg,imgRCb); %create color image (3D array)
imshow(imgRC); %display reconstructed image
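The per-channel procedure can also be written as one loop over the third dimension, which avoids keeping three sets of named variables. This is a minimal sketch, assuming pca() from the Statistics and Machine Learning Toolbox is available; the synthetic RGB image stands in for imread(imagefile), and K = 10 is an arbitrary choice.

```matlab
% Requires pca() from the Statistics and Machine Learning Toolbox.
% Synthetic 120x160 RGB test image (stand-in for img = imread(imagefile)).
[xg, yg] = meshgrid(linspace(0,1,160), linspace(0,1,120));
img = uint8(255*cat(3, 0.5 + 0.5*sin(6*pi*xg).*cos(4*pi*yg), ...
                       hypot(xg-0.5, yg-0.5), xg.*yg));

K = 10;                                   % number of PCs kept per channel
imgRC = zeros(size(img), 'uint8');        % preallocate the reconstructed image
for ch = 1:3                              % 1 = red, 2 = green, 3 = blue
    [coeff, score] = pca(double(img(:,:,ch)), 'Centered', false);
    imgRC(:,:,ch) = uint8(score(:,1:K)*coeff(:,1:K)');  % rebuild this channel
end
imshow(imgRC);                            % display reconstructed color image
```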
The percent compression can be measured in different ways.
  1. You could save the reconstructed image as a file, and compare its file size to the file size of the original image. You may see no compression (for example, if the images are both .bmp), or you may find some compression (for example if the images are both .jpg or if they are both .png). The exact amount of compression is hard to predict since JPEG and PNG have their own built-in compression algorithms. When I tried it with ben.jpg and reconstructed with 20 PCs, I got .
  2. You could compare the number of numbers needed to represent the original and reconstructed images. Let r and c be the numbers of rows and columns. The number of numbers needed to represent the original monochrome image is r*c. If you reconstruct with 20 PCs, you need 20*c numbers for the PCs, and you need 20*r numbers for the weighting factors, so the reconstructed image needs 20*(r+c) numbers. Then the compression for 20 PCs is (r*c - 20*(r+c))/(r*c). Image ben.jpg has 1849 rows x 1408 columns, therefore the compression with 20 PCs, measured as the ratio of numbers, is (2,603,392 - 65,140)/2,603,392 = 97.5%.
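The count-based measure in item 2 is easy to compute and to print in a figure title, which is what the question asked for. A small sketch, assuming the image size quoted above; the names nOrig, nComp, pctComp, titleOrig, titleRec, and Kmax are introduced here for illustration.

```matlab
r = 1849; c = 1408;          % rows and columns of ben.jpg, as stated above
K = 20;                      % number of PCs used in the reconstruction

nOrig = r*c;                 % numbers in the original monochrome image
nComp = K*c + K*r;           % K PCs of length c, plus K weights for each of r rows
pctComp = 100*(nOrig - nComp)/nOrig;   % percent compression by count

% Title strings that report the compression rather than the bare feature count.
titleOrig = sprintf('Original: %d x %d = %d numbers', r, c, nOrig);
titleRec  = sprintf('Recovered: %d PCs, %d numbers, %.1f%% compression', ...
                    K, nComp, pctComp);
disp(titleOrig);
disp(titleRec);

% Largest K for which the count-based compression is still positive.
Kmax = ceil(r*c/(r + c)) - 1;
fprintf('Compression stays positive for K <= %d\n', Kmax);
```

In the question's script, title(titleRec) could replace the second title call, with K taken from the variance-retained loop. Note that this count treats every stored number equally; PCs and weights held as doubles take more bytes each than uint8 pixels.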


More Answers (2)

William Rose on 19 Jul 2022
@ali yaman, The attached script applies PCA image compression to image ben.jpg. It reconstructs the image with 10 principal components and with 20 PCs. The compressed files are saved as ben10.jpg and ben20.jpg. The red, blue, and green components are compressed separately and are combined after compression.
From the comments in the script:
%Demonstrate the use of PCA for color image compression.
%An image is read from disk. It is split into R, G, B components.
%Each color is compressed with PCA. The compressed color slices are combined
%to reconstruct color images. Reconstructed color images are saved to disk.
%The color and R,G,B channels of the original and reconstructed images
%are displayed in low resolution as an array of images.
%To compress a different file, change the value of imagefile.
%To compress with different numbers of PCs, change the value of numpc,
%for example, numpc=15 or [10,20] or [6,12,24] or [10,20,50,1000].
%To see the true effects of compression, the user should view the original
%and reconstructed images at full resolution.
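The attached script itself is not reproduced in this thread. The following is a rough sketch of the workflow its comments describe, not the author's code: it assumes pca() from the Statistics and Machine Learning Toolbox, uses a synthetic image in place of imagefile, and writes to hypothetical file names in tempdir so the file sizes can be compared.

```matlab
% Requires pca() from the Statistics and Machine Learning Toolbox.
% Synthetic RGB image; replace with img = imread(imagefile) for a real photo.
[xg, yg] = meshgrid(linspace(0,1,160), linspace(0,1,120));
img = uint8(255*cat(3, 0.5 + 0.5*sin(6*pi*xg).*cos(4*pi*yg), ...
                       hypot(xg-0.5, yg-0.5), xg.*yg));
numpc = [10 20];                                  % numbers of PCs to try

origFile = fullfile(tempdir, 'demo_orig.jpg');    % hypothetical file name
imwrite(img, origFile);
info = dir(origFile);
fprintf('original: %d bytes\n', info.bytes);

% Run PCA once per channel; the result is reused for every value in numpc.
coeff = cell(1,3); score = cell(1,3);
for ch = 1:3
    [coeff{ch}, score{ch}] = pca(double(img(:,:,ch)), 'Centered', false);
end

fileBytes = zeros(size(numpc));
for i = 1:numel(numpc)
    K = numpc(i);
    imgRC = zeros(size(img), 'uint8');
    for ch = 1:3                                  % rebuild R, G, B separately
        imgRC(:,:,ch) = uint8(score{ch}(:,1:K)*coeff{ch}(:,1:K)');
    end
    outFile = fullfile(tempdir, sprintf('demo%d.jpg', K));
    imwrite(imgRC, outFile);                      % save reconstructed image
    info = dir(outFile);
    fileBytes(i) = info.bytes;
    fprintf('%d PCs: %d bytes\n', K, fileBytes(i));
end
```

Because JPEG applies its own compression on top, the byte counts printed here depend on the image content and need not track the number of PCs in any simple way.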
Good luck.
  2 Comments
ali yaman on 29 Jul 2022
@William Rose Thanks for your enormous effort. It works perfectly. I also learned a lot of new things from looking at your code.



ali yaman on 15 Jul 2022
By the way, is it possible to compress my original RGB photo, which is assigned to variable a, without reducing it to just the green channel (variable b)?
