How to set a condition for sequential number of non NaN values in a vector?

Hello, I have a matrix like this:
NaN -0.00042 0.00073 -0.00298 NaN
NaN -0.00114 0.00029 0.00265 NaN
NaN -0.00028 NaN -0.00066 NaN
NaN 0.00197 0.00219 0.00099 NaN
I would like to write a condition for each column: if the number of SEQUENTIAL non-NaN values is >= 300 (no matter where exactly in the column, and the non-NaN values need not be the same, just any non-NaNs), then keep the column; otherwise, delete the whole column. I'm stuck on that "sequential" part: how can I tell MATLAB to look for 300 consecutive non-NaNs?
I would appreciate any help.
  1 Comment
dpb on 13 May 2016
Gotta' run; realized I'm late for a meeting...but look for a "runs" submission on FileExchange (maybe Bruno, I forget???).


Accepted Answer

Roger Stafford on 13 May 2016
Edited: Roger Stafford on 13 May 2016
Let M be your matrix.
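[m,n] = size(M);
% Pad with a row of "NaN" flags above and below, then difference down each column:
% -1 marks the first row of a non-NaN run, +1 marks the row just after its end.
T = diff([true(1,n);isnan(M);true(1,n)],1,1);
T2 = false(1,n);   % T2(k) becomes true if column k is to be kept
for k = 1:n
    f = find(T(:,k));   % alternating run starts and run ends
    T2(k) = any((f(2:2:end)-f(1:2:end-1))>=300); % <-- Corrected
end
M2 = M(:,T2);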
M2 will be your result. I assume that where you said "delete" you meant that literally - such columns should be removed from the matrix.
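A quick way to check the logic above is to run it on the 4-by-5 sample matrix from the question with a small threshold. The sketch below assumes a threshold of 3 instead of 300 (the name minRun is introduced here for illustration only); columns 2 and 4 should be the ones that survive.

```matlab
% Sample matrix from the question
M = [NaN -0.00042 0.00073 -0.00298 NaN
     NaN -0.00114 0.00029  0.00265 NaN
     NaN -0.00028 NaN     -0.00066 NaN
     NaN  0.00197 0.00219  0.00099 NaN];
minRun = 3;                      % illustrative threshold (the question uses 300)

n = size(M,2);
T = diff([true(1,n); isnan(M); true(1,n)], 1, 1);
T2 = false(1,n);
for k = 1:n
    f = find(T(:,k));            % run starts at odd positions, run ends at even ones
    T2(k) = any((f(2:2:end) - f(1:2:end-1)) >= minRun);
end
M2 = M(:,T2);                    % keeps columns 2 and 4
```

Column 3 is dropped because its longest non-NaN run has length 2; the all-NaN columns 1 and 5 produce an empty f and are dropped as well.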
  5 Comments
Ekaterina Serikova on 19 May 2016
Dear Roger, could you please advise how I can adjust the code for zeros instead of NaNs? How can I change the isnan condition to test for zeros? Thanks a lot.
dpb on 19 May 2016
It should be enough to replace isnan(M) with M==0 in the first expression (although you may need to check the initial condition: as long as there are no NaN in M the first value can't be NaN, but might it be 0?).
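A minimal sketch of that substitution, run on a small made-up matrix with a threshold of 2 (M, minRun and keep are illustrative assumptions, not part of the original answer). Because the padding rows are true(1,n), the top and bottom of each column are treated as if they were bounded by zeros, so a nonzero run that begins in the first row is still counted from row 1.

```matlab
% Hypothetical data: zeros now play the role that NaN played before
M = [0 1 2 0
     0 3 0 4
     0 5 6 7];
minRun = 2;                      % illustrative threshold

n = size(M,2);
T = diff([true(1,n); M==0; true(1,n)], 1, 1);   % M==0 replaces isnan(M)
keep = false(1,n);
for k = 1:n
    f = find(T(:,k));
    keep(k) = any((f(2:2:end) - f(1:2:end-1)) >= minRun);
end
M2 = M(:,keep);                  % columns 2 and 4 remain
```

Note that this test treats NaN as an ordinary nonzero value; if M can contain both, something like (M==0 | isnan(M)) may be closer to what is wanted.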


More Answers (1)

the cyclist on 13 May 2016
Download RunLengths from the FEX.
Then run this code. (I used a length threshold of 3 for illustration. You'll want to use 300.)
LENGTH_THRESHOLD = 3;
M = [NaN -0.00042 0.00073 -0.00298 NaN
     NaN -0.00114 0.00029  0.00265 NaN
     NaN -0.00028 NaN     -0.00066 NaN
     NaN  0.00197 0.00219  0.00099 NaN]
numberColumns = size(M,2);
maxNonNanLength = zeros(1,numberColumns);
for nc = 1:numberColumns
    % b holds the run values and L the run lengths, as used below
    [b,L] = RunLength(M(:,nc));
    currentNonNanLength = 0;
    for nb = 1:numel(b)
        if isnan(b(nb))
            % A NaN ends the current stretch of non-NaN values
            currentNonNanLength = 0;
        else
            % Adjacent runs of different non-NaN values count as one stretch
            currentNonNanLength = currentNonNanLength + L(nb);
            maxNonNanLength(nc) = max(maxNonNanLength(nc),currentNonNanLength);
        end
    end
end
hasSufficientNonNanLength = maxNonNanLength >= LENGTH_THRESHOLD;
M = M(:,hasSufficientNonNanLength);
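If installing the File Exchange submission is not an option, the quantity tracked in the inner loop above (the longest stretch of consecutive non-NaN values) can also be computed with a plain counter. This is a sketch of that alternative for a single vector, not the cyclist's code; the vector v and the variable names are made up for illustration.

```matlab
% Hypothetical column with non-NaN runs of length 2 and 3
v = [NaN; 1; 2; NaN; 3; 4; 5; NaN];

longest = 0;   % longest non-NaN run seen so far
current = 0;   % length of the run currently being scanned
for i = 1:numel(v)
    if isnan(v(i))
        current = 0;                     % a NaN breaks the run
    else
        current = current + 1;
        longest = max(longest, current);
    end
end
% longest is now 3
```

Applying this to each column and comparing longest against the threshold gives the same keep/delete decision as the code above, at the cost of visiting every element instead of every run.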
