Copy down cell value until new value and repeat for specific range

15 views (last 30 days)
Hi, I have about 27,000+ excel files where there are a lot of blank cells to fill. I know how to do this manually but I was hoping for help in trying to write a code that will find a value and copy it into the blank cells below it until a new value appears, and repeat the process for all the rows and columns within a range (50 rows, 42 columns). I have tried this through VBA and cannot get it to work so I thought there might be something in MATLAB that will achieve this.
The first picture is how my data looks now (small subset), and the second is what I hope the MATLAB code will achieve.
2019-08-13 10_08_27-p300031.xlsx - Excel.png 2019-08-13 10_10_01-p300031.xlsx - Excel.png
  2 Comments
Guillaume
Guillaume on 13 Aug 2019
Edited: Guillaume on 13 Aug 2019
There's two way to do this:
  • Do this in excel, locating the non-empty cells in the excel spreadsheet and filling them up. It can be done fairly easily. The code in matlab would be more or less the same as the VBA code, which can be easily found by a simple search (e.g. the answer provided by brettdj in this stackoverflow. This would preserve all the formatting in the files.
  • Load the spreadsheets in matlab, fill them, and resave them. This may lose some formatting.
Either way, for 27,000+ files, this is going to be slow!

Sign in to comment.

Answers (1)

Adam Danz
Adam Danz on 13 Aug 2019
Edited: Adam Danz on 15 Aug 2019
If you choose to do this in Matlab, this should get you started. It reads in a spreadsheet named "data.xlsx" as a table and pulls out the numeric matrix and NaN fillers.
It then loops through each column and fills the NaNs with the first preceeding non-nan. If there are no preceeding non-nans (as in column 6 of the image in your question), it leaves the NaNs.
After this, you can write the data back to the excel file.
T = readtable('data.xlsx');
data = T{:,:};
% get rid of any full-NaN rows at the top (optional)
firstRow = find(any(~isnan(data),2),1);
data(1:firstRow-1,:) = [];
nanIdx = isnan(data);
nanIdxDiff = diff([false(size(nanIdx(1,:))); nanIdx; false(size(nanIdx(1,:)))]); %1 to nan, -1 to non-nan
% loop through each column
for i = 1:size(data,2)
nanStartIdx = find(nanIdxDiff(1:end-1,i) == 1);
nanStopIdx = find(nanIdxDiff(:,i) == -1);
nanCounts = nanStopIdx - nanStartIdx;
data(nanIdx(:,i),i) = repelem(data(max(nanStartIdx-1,1),i),nanCounts,1);
end
For testing & development you can replace the first two lines with this test data below.
data = [ -2 24 0.8 NaN 8 NaN
NaN NaN NaN NaN NaN NaN
NaN NaN 0.6 98 NaN NaN
-1 NaN NaN NaN NaN NaN
-3 NaN 0.7 NaN NaN NaN
NaN NaN NaN NaN NaN NaN
NaN NaN 0.7 NaN 7 NaN
0 24 NaN NaN NaN NaN];
Result
data =
-2 24 0.8 NaN 8 NaN
-2 24 0.8 NaN 8 NaN
-2 24 0.6 98 8 NaN
-1 24 0.6 98 8 NaN
-3 24 0.7 98 8 NaN
-3 24 0.7 98 8 NaN
-3 24 0.7 98 7 NaN
0 24 0.7 98 7 NaN
[addendum]
As Guillaume pointed out below, for r2016b and later you can use fillmissing().
  3 Comments
Adam Danz
Adam Danz on 13 Aug 2019
Good call (as usual).
I keep forgetting about fillmissing() (r2016b and later).
J B
J B on 16 Dec 2023
Commenting to boost the fillmissing answer. Saved me a LOT of time.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!