How do I vectorize my nested for loops for a convolution operation example?

Jeff Salvant on 26 Apr 2022
Commented: Jan on 27 Apr 2022
I created a convolutional neural network, and it requires a lot of for loops, which results in a long running time. I would like to speed up the calculation, and I was told that vectorization is a great way to do that. However, I am confused about how to do this, because I am extracting smaller submatrices from a larger matrix, which complicates everything. Any help would be greatly appreciated.
%% initialization set up
bias_1 = zeros(1,1,16);
kernel_1 = -1+2*rand([3,3,16]);
kernelSize_1 = size(kernel_1,2);
stride_1 = 1;
k_start_1 = 1; % kernel start
k_end_1 = 3; % end size of kernel = length(kernel_1)
X = rand(28,28); % normally it's an image, but for simplicity I just made it random numbers
% Hyperparameters setup: padding
pad_1 = (kernelSize_1-1)/2; % Pad type: "same"
padValue_1 = 0; % Zero padding
X_padded_1 = padarray(X,[pad_1,pad_1],padValue_1); % padded image [30x30]
%---Calculated-expected-output-size-of-the-convolution-operation-----------
img_height_1 = size(X,1); % input image height = 28
img_width_1 = size(X,2); % input image width = 28
outputHeight_1 = floor((img_height_1 + 2*pad_1 - kernelSize_1)/stride_1 + 1); % calculated output height = 28
outputWidth_1 = floor((img_width_1 + 2*pad_1 - kernelSize_1)/stride_1 + 1); % calculated output width = 28
conv1_output = zeros(outputHeight_1,outputWidth_1,16); % pre-allocated zeros matrix for desired output [28x28x16]
%% convolution operation
for c = 1:16
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel and with a bias added
            S1_cij = sum(featureMap_1.*kernel_1(:,:,c),'all') + bias_1(1,1,c);
            % calculated output with the ReLU activation function
            conv1_output(i,j,c) = max(0,S1_cij); % [28x28x16]
        end
    end
end

Accepted Answer

Jan on 26 Apr 2022
Not a vectorization, but some tiny changes which let the code run about 30% faster (at least in MATLAB Online; test this locally!):
for c = 1:16
    tmp1 = kernel_1(:,:,c);
    tmp2 = bias_1(1,1,c);
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, ...
                                      (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel and with a bias added
            conv1_output(i,j,c) = sum(featureMap_1 .* tmp1, 'all') + tmp2;
        end
    end
end
conv1_output = max(0, conv1_output);
Elapsed time is 0.031382 seconds.
  2 Comments
Jan on 27 Apr 2022
for c = 1:16
    tmp1 = kernel_1(:,:,c);
    for i = 1:outputHeight_1
        for j = 1:outputWidth_1
            % extract feature map from padded image
            featureMap_1 = X_padded_1((i-1) + k_start_1:(i-1) + k_end_1, ...
                                      (j-1) + k_start_1:(j-1) + k_end_1, 1);
            % weighted sum of padded image elementwise multiplied...
            % ...with a single kernel and with a bias added
            conv1_output(i,j,c) = sum(featureMap_1 .* tmp1, 'all');
        end
    end
end
conv1_output = max(0, conv1_output + bias_1);
conv1_output = max(0, conv1_output + bias_1);
I've moved the addition of the bias out of the loops; adding bias_1 to the full [28x28x16] array uses implicit expansion (R2016b or newer). Now the rest looks like a job for conv, or even conv2.
I'll try it again in the evening.
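A possible sketch of that conv2 idea (untested, not the measured code above): conv2 performs true convolution, i.e. it flips the kernel, while the loops compute a cross-correlation, so each kernel has to be rotated by 180 degrees to get matching results. The 'same' shape option on the unpadded X reproduces the zero padding pad_1 = 1 at stride 1, so X_padded_1 is no longer needed:
% Sketch: replace the two inner loops with one conv2 call per channel.
% rot90(...,2) flips the kernel so conv2 matches the cross-correlation
% computed by the original loops.
conv1_output = zeros(outputHeight_1, outputWidth_1, 16);
for c = 1:16
    conv1_output(:,:,c) = conv2(X, rot90(kernel_1(:,:,c), 2), 'same');
end
conv1_output = max(0, conv1_output + bias_1); % add bias (implicit expansion), then ReLU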


More Answers (0)