Problem with regexp 'split' and picking out some cells

I have the following input:
>> data(1).Header
ans =
AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
I need to save the positions and binding sites in a cell array as {'392-397', 'CAGCTG'; '413-418', 'CAGGTG'}.
So I used regexp with the following code:
struKm(1).trueBinding = regexp(data(1).Header,'\s\||\:|\|','split');
this returns:
>> struKm(1).trueBinding
ans =
'AF051909' '392-397' 'CAGCTG' '' '413-418' 'CAGGTG' ''
As you can see, there are two empty cells. I tried to find out why they are there, but failed.
I also tried to ignore them and move on to picking out the cells I need for the rest of my code, which are 'CAGCTG' and 'CAGGTG'. I use this code to pick them out:
[r1,r2] = ismember(struKm(1).trueBinding,set)
It returns zeros.
Can someone please help with these two issues?
Regards, A.

 Accepted Answer

You can keep your code and add one line that removes the empty elements:
data='AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';
s=regexp(data,'\s\||\:|\|','split');   % split on ' |', ':' or '|'
s(cellfun(@isempty,s))=[]              % delete the empty cells and display the result
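As for why the empty cells appear: 'split' returns the text between consecutive delimiter matches, so the '|' that closes the first site followed directly by the ' |' that opens the second leaves an empty piece between them, and the final '|' leaves an empty piece at the end of the string. A possible alternative, shown here only as a sketch, is to capture the position/site pairs directly with 'tokens' instead of splitting. It assumes every position looks like digits-hyphen-digits and every binding site is made of uppercase letters; the variable names header, tok and pairs are illustrative and not part of the original code.

```matlab
% Sketch: capture 'position:site' pairs with tokens instead of splitting.
% Assumption: positions match \d+-\d+ and sites contain only uppercase letters.
header = 'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';

% Each match yields a 1x2 cell: {position, site}
tok = regexp(header, '(\d+-\d+):([A-Z]+)', 'tokens');

% Stack the 1x2 cells vertically into an N-by-2 cell array
pairs = vertcat(tok{:});

% Second column holds the binding sites only
sites = pairs(:, 2).';
```

With this approach there is nothing to clean up afterwards, and pairs already has the two-column layout asked for in the question.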

More Answers (1)

Thank you for your reply.
I solved those issues, but another one has appeared.
I now have:
struKm(i).seqNam = cellstr(regexp(data(i).Header, '\s\||\:|\|','split')); % determine the sequence name headers
struKm(i).seqNam(cellfun(@(x) isempty(x),struKm(i).seqNam))=[];
This code is inside a for loop.
The result of this code is:
ans =
'AF051909' '392-397' 'CAGCTG' '413-418' 'CAGGTG'
Some seqNam entries contain only one binding site. For example:
ans =
'M13483' '445-450' 'CAACTG'
Now I want to pick out only the binding sites (CAGCTG, CAGGTG, CAACTG, etc.).
I have another for loop that does this. The code:
struSize = length(struKm);
tempcell = cell(1,1);
for m=1:struSize
    if (length(struKm(m).seqNam) == 3)
        resultsk.BS{m} = struKm(m).seqNam(3);
        disp(m);
    end
    if (length(struKm(m).seqNam) == 5)
        resultsk.BS{m} = cellstr(struKm(m).seqNam([3,5]));
        %tempcell = struKm(m).seqNam([3,5]); resultsk.BS{m} = cellstr(tempcell);
        disp(m);
    end
end
And the result of this code is:
>> resultsk.BS{:}
ans =
'CAGCTG' 'CAGGTG'
ans =
'CAACTG'
ans =
'CAACTG'
The problem is with the sequences that have two binding sites: their sites end up together in one nested cell instead of each site being in its own cell.
I need all the binding sites in one row. I am still struggling with this. Can you please help?
Thank you, A
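One way to get every binding site into a single row might be to collect the sites of each sequence as a row cell array and then concatenate all of those rows horizontally. The sketch below assumes that, after the empty cells are removed, seqNam always has the layout name, position, site, position, site, ... so the sites sit at elements 3, 5, 7 and so on. The struKm values are a hand-written stand-in for the real data, and perSeq and allBS are illustrative names.

```matlab
% Hypothetical stand-in for the struKm built in the loop above
struKm(1).seqNam = {'AF051909','392-397','CAGCTG','413-418','CAGGTG'};
struKm(2).seqNam = {'M13483','445-450','CAACTG'};
struKm(3).seqNam = {'M26773','446-451','CAACTG'};

% Collect the binding sites of each sequence as a row cell array.
% Assumption: sites are at elements 3, 5, 7, ... of seqNam.
perSeq = cell(1, numel(struKm));
for m = 1:numel(struKm)
    perSeq{m} = struKm(m).seqNam(3:2:end);
end

% Concatenate the per-sequence rows into one flat row of strings
allBS = [perSeq{:}];
```

Indexing with 3:2:end also covers headers with more than two binding sites, so the separate checks for lengths 3 and 5 are no longer needed.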

2 Comments

Please post a sample of your data.
>AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
tgccgctcagaaaaaaacgatctttggtgaacagtaggagccatctgagcggtgcgacgcattgtgctcccattccacacgctgcggcggccctCAGCTGtcatgcctggaaCAGGTGgtgtaaggcaatccctgggcagccgtgctccccgcccccccccgggccgaccttaaaggcgctgcgtgtgccctggctcctc
>M13483 |445-450:CAACTG|
ccttacatggtctgggggctccctggctgatcctctcccctgcccttggctccatgaatggcctcggcagtcctagcgggtgcgaaggggaccaaataaggcaaggtggcagaccgggccccccacccctgcccccggctgctcCAACTGaccctgtccatcagcgttctataaagcggccctcctggagccagccaccc
>M26773 |446-451:CAACTG|
cttacatggtctgggagccccctggctgatcctctaccctgcccttggctccaagaatggcctcagcggtcctagatggtgctaaggcgaccaaataaggcaaggtggcagatcaggggccccccacccctgcccccggctgctcCAACTGaccccgtccatcagagagctataaagctgcgctccaggcgactgacacc
>M86232 |447-452:CACTTG|
ctgtgctattctggtttggatgtgactcagaacacagttgaacattatttgaactcacagagcttgccattctggaagcacagccttatatgtagtgtccatgggcagtcctattatgggaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggcctctacagaa
>M86233 |447-452:CACTTG|
ctgtgctattctagtttggatgtgactcaggacagagttgaacattatttgaattcacagagcttgccatgctggaagcacagccttatatgtagtgtccatgggcagtcctattatggcaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggccctacagaat
>X00371 |326-331:CACCTG|
gagctgtcctgcctcgccacaatggCACCTGccctaaaatagcttcccatgtgagggctagagaaaggaaaagattagaccctccctggatgagagagagaaagtgaaggagggcaggggagggggacagcgagccattgagcgatctttgtcaagcatcccagaaggtataaaaacgcccttgggaccaggcagcctca
>X53154 |440-445:CAGCTG|
cgaaggattggtaggcttgccgtcacaggacccccgctggctgactcaggggcgcaggctcttgcgggggagctggcctcccgcccccacggccacgggccctttcctggcaggacagcgggatcttgCAGCTGtcaggggaggggatgacgggggactgatgtcaggaggggatacaaatagtgccgacggctaggggg
>X59034 |442-447:CAGCTG| |461-466:CAGGTG|
accaaacacaatgacaagcctctgactcatgatctatgtagactctcagacactttacatctagtaagagtatagcgatcatgttaagcaaggcacgtctgtggccacagaaggccccaagctttgaggctgtgggcagctCAGCTGtcatgcgggcacaCAGGTGatgtaagacaatagctgtggagtcagctggcttc
