How can I find overlapping string matches?

Hi,
I have a string as such:
seq1 = 'QEAFEISKXXXXXXX'
I want to find all occurrences of two Xs separated by a single character (which may itself be X). By inspection, there are seven consecutive Xs in the sequence (positions 9 to 15), so the answer I am looking for, in terms of start indices, is: 9 10 11 12 13
I have used regexp as follows:
seq1 = 'QEAFEISKXXXXXXX';
seq2 = 'X.X'; % The dot matches any single character
regexp(seq1, seq2)
The answer I get is: 9 12
How can I modify the code to get every occurrence, including overlapping ones?
Thanks.

 Accepted Answer

seq2 = 'X(?=.X)'
regexp(seq1, seq2)
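
To illustrate why this works, here is a small side-by-side sketch (the variable names idxPlain and idxOverlap are only for illustration). With 'X.X' each match consumes three characters, so the next search starts after the match and overlapping hits are skipped. The lookahead (?=.X) only checks that '.X' follows without consuming it, so each match is a single X and the search resumes at the very next character.

```matlab
seq1 = 'QEAFEISKXXXXXXX';

% Plain pattern: every match consumes three characters,
% so matches cannot overlap.
idxPlain = regexp(seq1, 'X.X');        % 9 12

% Lookahead: only the first X is consumed; '.X' is checked
% but not consumed, so overlapping occurrences are reported.
idxOverlap = regexp(seq1, 'X(?=.X)');  % 9 10 11 12 13
```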

2 Comments

Thank you very much. That's what I was looking for. I have another question: if I now need to find X _ _ X (Xs separated by two characters), how would I modify your solution?
I got it: seq2 = 'X(?=..X)'. Thanks again.
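
If the separation needs to vary, one possible way to avoid typing the dots by hand is to build the pattern with sprintf and a counted quantifier. This is only a sketch; gap and pattern are hypothetical names, not part of the original answer.

```matlab
seq1 = 'QEAFEISKXXXXXXX';
gap  = 2;  % hypothetical: number of characters between the two Xs

% '.{2}' means "any character, exactly twice", so for gap = 2 this
% builds 'X(?=.{2}X)', which behaves like 'X(?=..X)'.
pattern = sprintf('X(?=.{%d}X)', gap);

idx = regexp(seq1, pattern);  % 9 10 11 12
```

Setting gap = 1 reproduces the pattern from the accepted answer.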


More Answers (1)

strfind('QEAFEISKXXXXXXX','XX')

2 Comments

Thank you for your answer. But in your answer, the two Xs are not separated by any character. What I am interested in is 'X_X', where _ can be any character, e.g. 'XAX', 'XXX', etc. In my code I have used X.X, but regexp does not seem to return overlapping matches as it moves through the sequence, so it does not give the answer I am looking for.
A more exotic alternative without regexp:
ii = strfind('QEAFEIXKXXXXXXX','X');
out = ii(any(bsxfun(@minus,ii(:),ii(:)') == -2,2));
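
For readers puzzling over the bsxfun line: it builds the matrix of all pairwise differences between X positions and keeps every position that has another X exactly two places later. Note that the test string here has an extra X at position 7. A possibly more readable equivalent, assuming ismember is acceptable for the task, is sketched below (out2 is a name used only for this illustration).

```matlab
s  = 'QEAFEIXKXXXXXXX';
ii = strfind(s, 'X');              % positions of every X: 7 9 10 11 12 13 14 15

% Original approach: pairwise differences, keep rows containing a -2.
out = ii(any(bsxfun(@minus, ii(:), ii(:)') == -2, 2));

% Alternative: keep each X position whose position + 2 is also an X.
out2 = ii(ismember(ii + 2, ii));   % 7 9 10 11 12 13
```

Both should agree with regexp(s, 'X(?=.X)') on this input.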
