Finding valid sequences in a list based on specific conditions in MATLAB

I have a list of scores stored as a cell array in MATLAB, and I need to extract valid sequences from this list based on certain conditions. Each score is one of three stages: 'Rain', 'Cloud', or 'Sun'.
The conditions for a valid sequence are as follows:
For a sequence of 'Sun' stages, the sequence must have a length of at least 7.
For a sequence of 'Rain' or 'Cloud' stages, the sequence must have a length of at least 7, and all stages within the sequence must be either 'Rain' or 'Cloud'.
I have tried implementing the following code, but it does not provide the correct results for all cases:
scores = {'Rain' 'Cloud' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Rain'};
sequence_start = 1;
num_scores = numel(scores);
valid_sequences = [];
for j = 1:num_scores
    stage = scores{j};
    if strcmp(stage, 'Sun')
        sequence_length = j - sequence_start + 1;
        if sequence_length >= 7
            valid_sequences = [valid_sequences; sequence_start, j];
        end
    elseif ismember(stage, {'Cloud', 'Rain', 'Thunder'})
        sequence_length = j;
        if sequence_length >= 7 && all(ismember(scores(sequence_start:j), {'Cloud', 'Rain', 'Thunder'}))
            valid_sequences = [valid_sequences; sequence_start, j];
        end
        sequence_start = j + 1; % Reset sequence_start
    else
        sequence_start = j; % Reset sequence_start
    end
end
if (num_scores - sequence_start + 1) >= 7
    valid_sequences = [valid_sequences; sequence_start, num_scores];
end
This gives me valid_sequences = 3, 9, which is correct (Sun starts at position 3 and ends at position 9).
However, when I put the following test cases in:
test 1:
scores = {'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'};
I get the answer valid_sequences = 9,9
test 2:
scores = {'Sun' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Rain' 'Cloud' 'Thunder' 'Sun'};
valid_sequences =
7 7
8 8
9 9
This suggests that something is wrong with how the Rain/Cloud/Thunder stages are handled after Sun.
I would be very grateful if you could help!

 Accepted Answer

See if the results are what you expect (if not, please explain why).
getvalidseq({'Rain' 'Cloud' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Rain'})
ans = 1×2
3 9
getvalidseq({'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'})
ans = 0×2 empty double matrix
getvalidseq({'Sun' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Rain' 'Cloud' 'Thunder' 'Sun'})
ans = 1×2
2 9
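function valid_sequences = getvalidseq(scores)
    % Nonzero entries mark where membership in the Rain/Cloud/Thunder group changes
    b = [true; diff(ismember(scores(:), {'Rain' 'Cloud' 'Thunder'})); true];
    i = find(b);    % start index of each run, plus numel(scores)+1
    n = diff(i);    % run lengths
    j = find(n>=7); % keep runs of at least 7 scores
    valid_sequences = [i(j) i(j+1)-1];
end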
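For readers following along, here is a minimal walk-through of what the function above computes, run on the second test case. The variable names (isWet, runStart, runEnd, runLength) are made up for illustration and are not part of the answer's code.

```matlab
% Hypothetical step-by-step version of the run detection, second test case.
scores = {'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'};

% True where the stage belongs to the Rain/Cloud/Thunder group, false otherwise.
isWet = ismember(scores(:), {'Rain' 'Cloud' 'Thunder'});

% A nonzero diff marks a switch between the two groups. The padding marks
% the start of the first run and the position one past the last run.
b = [true; diff(isWet); true];

runStart  = find(b);              % run starts, plus numel(scores)+1
runLength = diff(runStart);       % length of each run
runEnd    = runStart(2:end) - 1;  % last index of each run
runStart  = runStart(1:end-1);    % drop the padding entry

disp([runStart runEnd runLength]) % one row per run: start, end, length
```

For this input the runs are 1 to 6, 7 to 7 and 8 to 9, with lengths 6, 1 and 2. None reaches 7, which is why the function returns an empty matrix here.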

5 Comments

getvalidseq({'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'})
ans =
0×2 empty double matrix
This should hopefully be: 1 6; 7 7
because ideally I will then be able to take these values, extract the corresponding timeseries, and merge them. Thanks so much for your help!
I don't understand, because they have lengths 6 and 1, respectively, and you say you only keep those with length >= 7. And why discard [8 9]? There is no coherence in the expected results. The best I can do is:
getvalidseq({'Rain' 'Cloud' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Rain'})
ans = 3×2
1 2
3 9
10 10
getvalidseq({'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'})
ans = 3×2
1 6
7 7
8 9
getvalidseq({'Sun' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Rain' 'Cloud' 'Thunder' 'Sun'})
ans = 3×2
1 1
2 9
10 10
function valid_sequences = getvalidseq(scores)
    b = [true; diff(ismember(scores(:), {'Rain' 'Cloud' 'Thunder'})); true];
    i = find(b);
    n = diff(i);
    %j = find(n>=7);
    j = find(n>=0);
    valid_sequences = [i(j) i(j+1)-1];
end
I am so sorry, that was a typo.
This should hopefully be: 1 6; 8 9. I would then combine 1-6 and 8-9 and have a total of 8 epochs.
OK this should do it.
getvalidseq({'Rain' 'Cloud' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Sun' 'Rain'})
ans = 1×2
3 9
getvalidseq({'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'})
ans = 2×2
1 6
8 9
getvalidseq({'Sun' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Rain' 'Cloud' 'Thunder' 'Sun'})
ans = 1×2
2 9
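function valid_sequences = getvalidseq(scores)
    % Nonzero entries mark where membership in the Rain/Cloud/Thunder group changes
    b = [true; diff(ismember(scores(:), {'Rain' 'Cloud' 'Thunder'})); true];
    i = find(b);        % start index of each run, plus numel(scores)+1
    n = diff(i);        % run lengths
    nn = size(n,1);
    j1 = (1:2:nn)';     % odd-numbered runs (same group as the first score)
    j2 = (2:2:nn)';     % even-numbered runs (the other group)
    jj = {j1, j2};
    % Keep a group only if its runs add up to at least 7 scores
    jj = jj(cellfun(@(j) sum(n(j)), jj) >= 7);
    j = cat(1,jj{:});
    valid_sequences = [i(j) i(j+1)-1];
end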
PS: I reread your original question, and you never mention the merge requirement. I'm notorious here for disliking it when people ask incomplete questions or change the request while the thread is developing.
Thank you so so much!!!! I am so sorry for the lack of clarity, I will do better going forward. Thanks so incredibly much, this moves my work forward a ton!!
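The merge step mentioned in the comments (combining 1-6 and 8-9 for a total of 8 epochs) could be sketched roughly as follows. This is only an illustration under assumptions: idx, merged_scores and num_epochs are hypothetical names, and valid_sequences is hard-coded to the output shown above for the second test case rather than computed.

```matlab
% Hypothetical follow-up: expand the [start end] rows into one index vector.
scores = {'Rain' 'Rain' 'Cloud' 'Cloud' 'Rain' 'Cloud' 'Sun' 'Cloud' 'Rain'};
valid_sequences = [1 6; 8 9];   % result of getvalidseq for this input (from the thread)

% Collect every index covered by a valid range.
idx = [];
for k = 1:size(valid_sequences, 1)
    idx = [idx, valid_sequences(k,1):valid_sequences(k,2)]; %#ok<AGROW>
end

merged_scores = scores(idx);    % the same idx could index a matching data vector
num_epochs = numel(idx);        % number of epochs kept after merging
```

Here idx comes out as 1 2 3 4 5 6 8 9, so num_epochs is 8, matching the count given in the comments.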


Asked: 19 Jul 2023

Commented: 20 Jul 2023
