How to see if characters are present in a string array.
Elijah Roberts
on 2 Dec 2021
Commented: Elijah Roberts
on 3 Dec 2021
I am trying to write some code that takes a short amino acid sequence, e.g. 'GSA', and searches through a string array of sequences to find the number and indices of matches, ignoring the order of the characters. As long as each character is present, I would like to count it as a hit.
Here is the code I have so far, which kind of works. InputSeq is the sequence I would like to search for, and AAseq is the string array of sequences that I would be searching through. This code only produces a match if all characters are present AND the order is correct.
InputSeq = "GSA";
AAseq = ["SGD"; "SGS"; "SGA"; "SGV"; "SGS"; "SGA"; "SGD"; "SGS"; "SGS"; "SGY"; "SGD"; "SGS"; "SGI"]; % list truncated here in the original post
result = ismember(InputSeq, AAseq)
This kind of works, but it will not register a match if the order of the characters does not match.
Accepted Answer
Stephen23
on 3 Dec 2021
Edited: Stephen23
on 3 Dec 2021
Assuming that all string elements contain exactly the same number of characters, you can do this easily with basic logical operations on character arrays:
A = "GSA";
B = ["SGD";"SGS";"SGA";"SGV";"SGS";"SGA";"SGD";"SGS";"SGS";"SGY";"SGD";"SGS";"SGI"]
X = all(sort(char(A))==sort(char(B),2),2)
Or without sorting:
X = all(any(char(A)==permute(char(B),[1,3,2]),3),2)
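Since the question also asks for the number and the indices of the matches, here is a minimal sketch of how the logical vector X could be turned into both. The names idx and numHits are only illustrative, and the comparison relies on implicit expansion, so it assumes R2016b or later (double-quoted strings need R2017a or later).

```matlab
A = "GSA";
B = ["SGD";"SGS";"SGA";"SGV";"SGS";"SGA";"SGD";"SGS";"SGS";"SGY";"SGD";"SGS";"SGI"];

% One logical value per row of B: true where the sorted characters match.
X = all(sort(char(A)) == sort(char(B),2), 2);

idx     = find(X);  % row indices of the matching sequences
numHits = nnz(X);   % number of matching sequences
```

For this B the matches are rows 3 and 6, so numHits is 2. Note that the two expressions above can differ when letters repeat: the sorted comparison requires the same letters with the same counts, whereas the any/all version only requires each letter of A to occur somewhere in the row (for example, "SSG" against "SGG" is a hit without sorting but not with sorting).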
More Answers (1)
Walter Roberson
on 2 Dec 2021
You could use multiple contains() tests.
But I suggest that instead you do something like
ismember(sort(char(InputSeq)), cellfun(@sort, cellstr(AAseq), 'uniform', 0))
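For reference, the "multiple contains() tests" idea mentioned at the start of this answer might look something like the following. This is only a sketch under assumptions: the shortened AAseq list and the variable names letters and hit are illustrative, and contains() requires R2016b or later.

```matlab
InputSeq = "GSA";
AAseq = ["SGD";"SGS";"SGA";"SGV"];   % shortened, illustrative list

letters = char(InputSeq);            % 1x3 char vector: 'GSA'
hit = true(size(AAseq));             % start by assuming every sequence is a hit

% Keep a sequence only if it contains every letter of InputSeq.
for k = 1:numel(letters)
    hit = hit & contains(AAseq, letters(k));
end
```

Here hit is true only for "SGA". Unlike the sorted comparison, this test ignores how many times each letter occurs, and it does not need all sequences to have the same length.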
2 Comments
Walter Roberson
on 2 Dec 2021
ismember( cellfun(@sort, cellstr(AAseq), 'uniform', 0), sort(char(InputSeq)) )
You could also use strcmp().
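A possible strcmp() version, as a sketch: the shortened AAseq list and the names sortedSeqs, tf, idx and numHits are assumptions made for illustration.

```matlab
InputSeq = "GSA";
AAseq = ["SGD";"SGS";"SGA";"SGV";"SGA"];   % shortened, illustrative list

% Sort the characters of every sequence, giving a cell array of char vectors.
sortedSeqs = cellfun(@sort, cellstr(AAseq), 'UniformOutput', false);

% Compare each sorted sequence with the sorted search sequence.
tf = strcmp(sortedSeqs, sort(char(InputSeq)));

idx     = find(tf);  % row indices of the matches
numHits = nnz(tf);   % number of matches
```

With this list, tf is true for rows 3 and 5. Because each sequence is sorted separately inside a cell array, this form also copes with sequences of different lengths.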