Replacing variables in a dataset
Hi, I have a dataset like this, for example: 44 44 0; 3 3 0; 132 133 0; 45 46 42; and I want to replace the zeros with the average of each row (each row represents one variable). I know how to do it for each row individually, but I was wondering whether there is a faster way when dealing with a large matrix. Thanks!
Accepted Answer
More Answers (1)
Hoang Nguyen
on 17 Jul 2017
Edited: Hoang Nguyen
on 17 Jul 2017
Hi,
It seems you are trying to replace all the 0's in a matrix with the mean of the corresponding row. For example:
44 44 0
3 3 0
132 133 0
45 46 42
becomes
44 44 29.33
3 3 2
132 133 88.33
45 46 42
This can be done with simple matrix operations:
function result = replace0sWithMean(M)
    % Mean of each row, as a column vector (zeros are included in the mean)
    mu = mean(M, 2);
    % Add the row mean to every element of its row (implicit expansion, R2016b or later)
    added = M + mu;
    % added ~= mu is a logical matrix whose 1's mark the non-zero entries of M.
    % Multiplying it element-wise by mu and subtracting removes the mean again
    % from those entries only, so non-zero values are restored and zeros keep the mean.
    result = added - (added ~= mu) .* mu;
end
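The same replacement can also be written with logical indexing, which skips the add-then-subtract round trip and leaves the non-zero entries bit-for-bit untouched. This is only a minimal sketch: the function name `replaceZerosWithRowMean` is hypothetical and not part of the answer above, and `repmat` is used so that it does not depend on implicit expansion.

```matlab
function result = replaceZerosWithRowMean(M)
% Hypothetical helper: replace each 0 in M with the mean of its row.
% Zeros are included when the row mean is computed.
mu = mean(M, 2);                    % column vector of row means
isZero = (M == 0);                  % logical mask of the entries to replace
fill = repmat(mu, 1, size(M, 2));   % row means expanded to the size of M
result = M;
result(isZero) = fill(isZero);      % overwrite only the zero entries
end
```

Called on the matrix from the question, `replaceZerosWithRowMean([44 44 0; 3 3 0; 132 133 0; 45 46 42])`, this should fill the three zeros with 88/3, 2 and 265/3 and leave the last row as it is.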
As Guillaume pointed out, you may not want the 0's to be included when calculating the means. In that case, use the following code snippet instead:
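function result = replace0sWithMean(M)
    % Mark zeros as NaN so they are ignored when averaging
    temp = M;
    temp(temp == 0) = NaN;
    % Row means over the non-zero entries only (NaN for a row that is all zeros)
    mu = mean(temp, 2, 'omitnan');
    added = M + mu;
    % Subtract the mean again wherever the original entry was non-zero
    result = added - (added ~= mu) .* mu;
end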
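One thing to watch with the NaN-based version is a row that contains only zeros: its mean comes out as NaN, so the whole row turns into NaN. The sketch below computes the non-zero mean from a sum and a count instead and handles that case explicitly. The function name `replaceZerosWithNonzeroRowMean` is hypothetical, and leaving all-zero rows as zeros is an assumption about the desired behaviour, not something stated in the thread.

```matlab
function result = replaceZerosWithNonzeroRowMean(M)
% Hypothetical helper: replace each 0 in M with the mean of the
% non-zero entries in the same row.
isZero = (M == 0);
nNonzero = sum(~isZero, 2);         % number of non-zero entries per row
mu = sum(M, 2) ./ nNonzero;         % zeros add nothing to the sum; 0/0 gives NaN
mu(nNonzero == 0) = 0;              % assumption: all-zero rows stay as zeros
fill = repmat(mu, 1, size(M, 2));   % row means expanded to the size of M
result = M;
result(isZero) = fill(isZero);      % overwrite only the zero entries
end
```

For the matrix in the question the zeros would become 44, 3 and 132.5, since each of the first three rows has two non-zero values to average.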
1 Comment
Guillaume
on 17 Jul 2017
I would suspect that the mean is to be taken without the 0s.