Compare two strings with some restrictions
Hey, how are you?
I have to compare two strings of n and m lines with each other to see if they contain the same messages. The messages look like this:
!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053
!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053
!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053
!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053
!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054
!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054
!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055
As you can see, the last four digits range from 0000 to 5959: the first two are minutes and the other two are seconds. I have code that compares all the messages from one script to another, but now I have to compare only the messages whose ending falls within a range that we specify. Example:
!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059
This message ends with 0059, so I should compare it with all the messages that end with a number from 0000 to 0159. That is, the comparison covers the messages sent up to one minute before and one minute after this message.
4 Comments
flashpode
on 15 Sep 2021
Edited: flashpode
on 18 Sep 2021
Okay one string is this one:
!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053
!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053
!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053
!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053
!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054
and the other string is:
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
So the output is another string that contains the messages that are in both strings.
Accepted Answer
Walter Roberson
on 15 Sep 2021
In https://www.mathworks.com/matlabcentral/answers/1452949-get-the-last-for-digits-as-the-time-this-message-was-sent#answer_787044 I showed you how to extract the last 4 digits of each line, as text.
The result would have been a cell array of character vectors. You can str2double() to get a set of decimal numbers.
Once you have the set of decimal numbers, referred to below as DN, then
dur = minutes(floor(DN/100)) + seconds(mod(DN,100));
If you do that for both sets of data, getting dur1 and dur2, then
[~, M1, S1] = hms(dur1);
[~, M2, S2] = hms(dur2);
[has_match0, idx0] = ismember(M1, M2);
[has_match1, idx1] = ismember(M1+1, M2);
M1_has_match = has_match0 | has_match1;
M1_match(has_match1) = idx1(has_match1);
M1_match(has_match0) = idx0(has_match0);
M1_matches = find(M1_has_match);
M2_matches = M1_match(M1_has_match);
If I got everything right, then M1_matches will be the index into the first set of durations in which there are matches, and M2_matches will be the corresponding indexes into the second set of durations that match the first set.
Any one entry in the first set of durations is only looked for once in the second set of durations, but because of the matching process, any given entry in the second set of durations could match more than one entry in the first set of durations. You did not ask for the closest match that occurs within a particular time interval: you asked for matches that occur if the second set has any entry that has the same minute as one in the first set, or is the next minute after one in the first set.
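A minimal sketch of the steps above, run on made-up sample lines rather than the real data. The names lines1 and lines2 and their contents are hypothetical, and M1_match is preallocated here so that it always has one slot per line in the first set:

```matlab
% Hypothetical sample lines (not from the thread); the last 4 characters are MMSS.
lines1 = ["msgA0059"; "msgB0130"; "msgC0405"];
lines2 = ["msgX0001"; "msgY0110"; "msgZ0215"];

% Extract the trailing four digits as text, then convert them to numbers.
t1 = regexp(lines1, '\d{4}$', 'match', 'once');
t2 = regexp(lines2, '\d{4}$', 'match', 'once');
DN1 = str2double(t1);
DN2 = str2double(t2);

% Build durations from the MMSS values and pull out the minute part.
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
[~, M1] = hms(dur1);
[~, M2] = hms(dur2);

% A line in set 1 matches if set 2 has an entry in the same or the next minute.
[has_match0, idx0] = ismember(M1, M2);
[has_match1, idx1] = ismember(M1 + 1, M2);
M1_has_match = has_match0 | has_match1;

% Preallocate, then fill; a same-minute match overwrites a next-minute match.
M1_match = zeros(size(M1));
M1_match(has_match1) = idx1(has_match1);
M1_match(has_match0) = idx0(has_match0);

M1_matches = find(M1_has_match);       % indices into set 1
M2_matches = M1_match(M1_has_match);   % corresponding indices into set 2
```

With this sample, the first two lines of set 1 find a partner and the third (minute 04) does not.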
31 Comments
flashpode
on 15 Sep 2021
I just want to compare this part of the message
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011" with the other messages whose ending number is in the range that we set, that is, from 1 minute less to 1 minute more. I do not understand what you did or why, because in the end you are not comparing the messages.
Walter Roberson
on 16 Sep 2021
S1s = [
"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
"!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"
"!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"
"!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053"
"!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054"
"!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054"
"!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055"
"!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*430055"
"!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*560056"
"!AIVDM,1,1,,A,D028j;0flffp,0*430056"
"!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*690056"
"!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A0056"
"!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D0056"
"!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*320057"
"!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*480057"
"!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A0057"
"!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C0058"
"!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*430059"
"!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B0059"
"!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059"
"!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*610059"
"!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*080059"
"!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
"!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"
"!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"
"!AIVDM,1,1,,A,D028ioj"
"!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"
"!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"
"!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"
];
S2s = [
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"
"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"
"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"
"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"
"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"
"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"
"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"
"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"
"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"
"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"
"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"
"!AIVDM,1,1,,B,D028j;0flffp,0*400006"
"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"
"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"
"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"
"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"
"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"
"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"
"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"
"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"
"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"
"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"
"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"
"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"
"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"
"!AIVDM,2,2,2,B,00000000000,2*250009"
"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"
"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"
"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"
"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"
"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"
"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"
"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"
"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"
"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"
"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"
"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"
"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"
"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"
"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"
"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"
"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"
"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"
"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"
"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"
];
msg1 = regexp(S1s, '.*(?=\d{4}$)', 'match', 'once');
msg2 = regexp(S2s, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(S1s, '\d{4}$', 'match', 'once');
t2 = regexp(S2s, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg1) | ismissing(t1);
mask2 = ismissing(msg2) | ismissing(t2);
origidx1 = (1:length(msg1));
origidx2 = (1:length(msg2));
msg1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
[~, Min1, S1] = hms(dur1);
[~, Min2, S2] = hms(dur2);
num_msg1 = length(msg1);
msg_match = cell(num_msg1, 1);
for K = 1 : num_msg1
all_match_idx = find(msg1(K) == msg2);
if isempty(all_match_idx);
fprintf('No text match for line #%d -> "%s"\n', origidx1(K), msg1(K));
continue;
end
fprintf('potential match for line #%d -> "%s", checking times\n', origidx1(K), msg1(K));
disp(K), disp(all_match_idx)
complete_match_idx = all_match_idx(Min1(K) == Min2(all_match_idx) | Min1(K) == Min2(all_match_idx) - 1);
msg_match{K} = complete_match_idx;
if isempty(complete_match_idx)
fprintf('line #%d -> "%s" matched text but not time\n', origidx1(K), msg1(K));
else
fprintf('line #%d -> "%s" matches on time too! Matches are:\n', origidx1(K), msg1(K));
msg2(complete_match_idx)
end
end
No text match for line #1 -> "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C"
No text match for line #2 -> "!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B"
No text match for line #3 -> "!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*39"
No text match for line #4 -> "!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E"
No text match for line #5 -> "!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C"
No text match for line #6 -> "!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D"
No text match for line #7 -> "!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B"
No text match for line #8 -> "!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*43"
No text match for line #9 -> "!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*56"
No text match for line #10 -> "!AIVDM,1,1,,A,D028j;0flffp,0*43"
No text match for line #11 -> "!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*69"
No text match for line #12 -> "!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A"
No text match for line #13 -> "!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D"
No text match for line #14 -> "!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*32"
No text match for line #15 -> "!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*48"
No text match for line #16 -> "!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A"
No text match for line #17 -> "!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C"
No text match for line #18 -> "!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*43"
No text match for line #19 -> "!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B"
No text match for line #20 -> "!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*02"
No text match for line #21 -> "!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*61"
No text match for line #22 -> "!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*08"
No text match for line #23 -> "!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F"
No text match for line #24 -> "!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*49"
No text match for line #25 -> "!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*01"
No text match for line #27 -> "!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*10"
No text match for line #28 -> "!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E"
No text match for line #29 -> "!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*03"
No text match for line #30 -> "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D"
found_something_at = find(~cellfun(@isempty, msg_match))
found_something_at =
0×1 empty double column vector
Walter Roberson
on 16 Sep 2021
You are not clear as to what a "match" means, so I had to guess that you wanted to see the same !AIVDM text (without time) appearing in both streams, with the second stream being either the same minute as the original or else the next minute compared to the original.
As you can see, with the same data you provided, there are no text matches at all.
But if you meant that just the times had to match that way, without the text having to match, then it is confusing, as you talk as if there is "a" match in the second set of strings, when instead there are numerous matches if you are just considering "within the next calendar minute" to be the match criterion -- and furthermore, most strings in the second set match multiple strings in the first set if you consider only the time that way. The desired output is not clear.
Walter Roberson
on 16 Sep 2021
Are you wanting to compare only on time? If so then there are multiple matches for each input.
If you are wanting to compare based upon the part before the time, together with the time being close enough, then in the data you posted, there are no matches for that.
Walter Roberson
on 16 Sep 2021
If you want to compare only on time, then all except the last 8 entries of the first input match, and everything in the second set matches each item in the first set (except the last 8).
Walter Roberson
on 17 Sep 2021
"how could I change the string t1 and t2 to be able to make t1+100"
t1p1dur = minutes(str2double(regexp(t1, '^\d{2}', 'match', 'once')) + 1) + seconds(str2double(regexp(t1, '\d{2}$', 'match', 'once')));
t1p1dur.Format = 'mm:ss';
t1p1 = string(t1p1dur);
This would construct new strings that were 1 minute later than the old strings.
Or... you could take the existing Min1 in the code, which is the minutes portion of (valid) t1 entries, and add 1 and compare against Min2 .
Or... you could notice that the existing line
complete_match_idx = all_match_idx(Min1(K) == Min2(all_match_idx) | Min1(K) == Min2(all_match_idx) - 1);
already compares Min1 to Min2 - 1, which is the same thing as comparing Min1+1 to Min2 . If you would feel more comfortable you could rewrite the line marginally to
complete_match_idx = all_match_idx(Min1(K) == Min2(all_match_idx) | Min1(K) + 1 == Min2(all_match_idx));
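If the goal really is a new set of MMSS strings one minute later, here is a small sketch of one way to do it without going through duration formatting. The sample values are made up, and it assumes no minute field goes past 59 after adding 1 (wrap-around is not handled):

```matlab
% Hypothetical MMSS time strings, standing in for t1 from the code above.
t1 = ["0053"; "0100"; "1259"];

% 'match','once' makes regexp return the matched text rather than start indices.
mm = str2double(regexp(t1, '^\d{2}', 'match', 'once'));
ss = str2double(regexp(t1, '\d{2}$', 'match', 'once'));

% Add one minute, then rebuild zero-padded MMSS strings.
% Each row of the two-column matrix fills the two %02d operators.
t1p1 = compose("%02d%02d", [mm + 1, ss]);
```

The result keeps the original four-digit layout, so it can be compared directly against t2 as text.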
Walter Roberson
on 17 Sep 2021
"I want to compare with all the messages sent one minute before and after those of set 1"
What does it mean to "compare" ??
If you are going strictly by time, then notice that your first string input
"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
is minute 00, and so matching on time would be asking to match strings with minute 00 or 01. Which strings are those?
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"
"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"
"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"
"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"
"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"
"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"
"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"
"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"
"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"
"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"
"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"
"!AIVDM,1,1,,B,D028j;0flffp,0*400006"
"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"
"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"
"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"
"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"
"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"
"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"
"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"
"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"
"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"
"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"
"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"
"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"
"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"
"!AIVDM,2,2,2,B,00000000000,2*250009"
"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"
"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"
"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"
"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"
"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"
"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"
"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"
"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"
"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"
"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"
"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"
"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"
"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"
"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"
"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"
"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"
"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"
"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"
"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"
EVERY one of those is minute 00, so EVERY one of them would match on time.
What does not match on time? Well, the last 8 of the S1 entries
"!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
"!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"
"!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"
"!AIVDM,1,1,,A,D028ioj"
"!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"
"!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"
"!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"
have either no valid time (the 4th one) or minute 01. You previously asked only to look into the same and next minute, so minute 01 in the first strings should match minute 01 or 02 in the second strings, and none of those exist, so the last 8 would not match under the old rules.
"Look: S1 message 1 has t0 (time), so I want to compare it with the messages from S2 with t0 +- 1 minute."
And you just modified the rules to also look backwards by 1 minute. So the 0102 in the input would look back up to minute 00 in t2... which would match everything in t2.
Under the rules you just defined, everything in t1 matches everything in t2, with the exception of
"!AIVDM,1,1,,A,D028ioj"
But... saying +/- 1 minute might mean that you want the difference to be no more than 60 seconds, which is different than what you had asked for before, which involved only looking at the minute number. Should a t1 entry of 0000 match a t2 entry of 0105 because the minute 00 is +/- 1 to the minute 01 in t2? Or would you want the match to fail because the time difference would be more than 60 seconds?
With the data you have, every entry in t1 is within +/- 60 seconds of every entry in t2, with the exception of the
"!AIVDM,1,1,,A,D028ioj"
entry which has no time.
So... matching only on time is not going to be useful.
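To make the difference between the two rules concrete, here is a small sketch for the 0000 versus 0105 case mentioned above. The variable names are only illustrative:

```matlab
% The two MMSS values from the example: 00:00 in set 1 and 01:05 in set 2.
DN1 = 0;      % "0000"
DN2 = 105;    % "0105"

% Rule A: the minute fields differ by at most 1 (seconds are ignored).
min1 = floor(DN1/100);
min2 = floor(DN2/100);
calendar_match = abs(min1 - min2) <= 1;

% Rule B: the total elapsed time differs by at most 60 seconds.
sec1 = 60*min1 + mod(DN1,100);
sec2 = 60*min2 + mod(DN2,100);
elapsed_match = abs(sec1 - sec2) <= 60;
```

Here calendar_match is true while elapsed_match is false, because the two times are 65 seconds apart.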
flashpode
on 17 Sep 2021
Yeah, if we take the message from S1 that ends with 0000, get all the messages from S2 that end with 0000 to 0100 (that is a minute) and then do an ismember on those.
If the message from S1 ends with 0100, get all the messages from S2 that end with 0000 to 0200 (that is one minute before and after the S1 time '0100').
I only posted some lines as you asked, but I have messages whose numbers go from 0000 to 5959.
I was asking myself if this could be done by just adding +-100 to those numbers. But I am working on it now.
flashpode
on 17 Sep 2021
Hey, I changed your code so that in the end it gives me the message rather than the line where it can be found. But since my MATLAB skills are not very good, do you know if this comparison could be done in a less complicated way?
I already have t1 and t2 as numbers, using the function double. Is this comparison only possible the way you did it? It seems a little complicated to me.
Walter Roberson
on 17 Sep 2021
What is your desired output:
1. for each S1 input message, a list of all S2 messages that are within 1 "calendar minute" (the minute fields differ by at most 1)?
2. for each S1 input message, a list of all S2 messages that are within 60 seconds? (0117 matching 0017 to 0217 but 0117 not matching 0243 because that is more than 60 seconds difference) ?
3. or two blobs of messages -- a lump in which every S1 message that has some time-matching entry in S2 is put together, and another lump in which every S2 message that has some time-matching entry in S1 is put together, with no attempt to point out which message matches which other message?
If #3, then perhaps it would be easier to think of it as removing from S1 any message that does not match within 1 minute to something in S2, and remove from S2 any message that does not match within 1 minute to something in S1 ? The logic for that can be more efficient.
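A rough sketch of that "remove what has no partner" idea, assuming the times have already been converted to plain seconds (for example with seconds(dur1) and seconds(dur2)). The sample numbers are made up:

```matlab
% Hypothetical times in seconds for each set.
sec1 = [53; 59; 200];
sec2 = [0; 10; 125];

% Pairwise absolute differences: rows follow set 1, columns follow set 2.
% Subtracting a row from a column relies on implicit expansion (R2016b or later).
close_enough = abs(sec1 - sec2.') <= 60;

% Keep only entries that have at least one partner within 60 seconds.
keep1 = any(close_enough, 2);     % one flag per entry of set 1
keep2 = any(close_enough, 1).';   % one flag per entry of set 2
```

The logical vectors can then be used to index the original message arrays. Note that close_enough is an n-by-m matrix, so for very long inputs a loop over one set may be easier on memory.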
flashpode
on 17 Sep 2021
Here is the code you gave me with some differences. The lines I've commented out with % are the ones where I do not understand why you included them, because they do not change anything.
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
[~, M1, S1] = hms(dur1); % Split the time into 2 variables; use M1 to build ranges
[~, M2, S2] = hms(dur2);
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find identical messages
if isempty(all_match_AIS);
fprintf('No matches for line #%d -> "%s"\n', origidx1(K), msg_AIS1(K));
continue;
end
fprintf('potential match #%d -> "%s", checking times\n', origidx1(K), msg_AIS1(K));
disp(K), disp(all_match_AIS)
complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 |M1(K) == M2(all_match_AIS) + 1);% Range of +-1 minute around each message
msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched
% if isempty(complete_match_AIS)
% fprintf('line %#d -> "%s" matches text but not time\n', origidx1(K), msg_AIS1(K));
% else
% fprintf('line %#d -> "%s" matches on time too! Matches are:\n', origidx1(K), msg_AIS1(K));
% msg_AIS2(complete_match_AIS)
% end
end
%# find empty cells (create the variable)
emptyCells = cellfun(@isempty,msg_match);
%# remove the empty cells
msg_match(emptyCells) = [];
Then I removed the empty cells, but some cells contain a 2x1 or 3x1 string array of messages. Why are those messages in a string array? If they are repeated I want to have each of them on a separate line. I am going to do that now.
And to answer your question, it would be the second option, as you have already done. I am really grateful.
Walter Roberson
on 18 Sep 2021
S1s = [
"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
"!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"
"!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"
"!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053"
"!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054"
"!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054"
"!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055"
"!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*430055"
"!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*560056"
"!AIVDM,1,1,,A,D028j;0flffp,0*430056"
"!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*690056"
"!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A0056"
"!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D0056"
"!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*320057"
"!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*480057"
"!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A0057"
"!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C0058"
"!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*430059"
"!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B0059"
"!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059"
"!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*610059"
"!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*080059"
"!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
"!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"
"!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"
"!AIVDM,1,1,,A,D028ioj"
"!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"
"!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"
"!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"
];
S2s = [
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"
"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"
"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"
"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"
"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"
"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"
"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"
"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"
"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"
"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"
"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"
"!AIVDM,1,1,,B,D028j;0flffp,0*400006"
"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"
"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"
"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"
"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"
"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"
"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"
"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"
"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"
"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"
"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"
"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"
"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"
"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"
"!AIVDM,2,2,2,B,00000000000,2*250009"
"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"
"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"
"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"
"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"
"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"
"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"
"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"
"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"
"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"
"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"
"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"
"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"
"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"
"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"
"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"
"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"
"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"
"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"
"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"
];
AIS1 = S1s;
AIS2 = S2s;
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
msg_match{K} = AIS2(origidx2(time_mask)); % map back to rows of the unfiltered AIS2
end
%# find empty cells (create the variable)
emptyCells = cellfun(@isempty,msg_match);
%# remove the empty cells
AIS1_with_matches = AIS1(origidx1); % keep only lines with a valid time, so rows align with msg_match
AIS1_with_matches(emptyCells) = [];
msg_match(emptyCells) = [];
%cross-checks to see that everything worked okay
AIS1_with_matches(1:3)
ans = 3×1 string array
"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
"!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"
"!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"
msg_match(1:3)
ans = 3×1 cell array
{58×1 string}
{58×1 string}
{58×1 string}
msg_match{1}(1:3)
ans = 3×1 string array
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
AIS1_with_matches{end}
ans = '!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102'
msg_match{end}(1:3)
ans = 3×1 string array
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
Yes, it worked. The entries with time 0102 are more than 1 minute from the entries with 0000 and 0001 so the 0000 and 0001 did not make it into the match list.
The outputs here are AIS1_with_matches and msg_match. AIS1_with_matches is the list of messages in AIS1 that match something inside AIS2. Then for each of those entries, msg_match is a cell array of all of the messages within +/- 1 minute in AIS2.
Notice that most messages are repeated a lot, since most messages are within 1 minute of most entries.
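If what you want in the end is one flat list with each message on its own row, here is a short sketch of how the cell array could be unpacked. The cell contents are made-up stand-ins for msg_match, and whether to drop the repeats is an assumption about what you need:

```matlab
% Hypothetical stand-in for msg_match: each cell holds an N-by-1 string array.
msg_match = {["a"; "b"]; "b"; ["c"; "a"; "d"]};

% Stack all cells vertically so that each message sits on its own row.
all_matches = vertcat(msg_match{:});

% Drop repeats; 'stable' keeps the order of first appearance.
unique_matches = unique(all_matches, 'stable');
```

all_matches is then a plain 6-by-1 string array, and unique_matches keeps each message once.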
flashpode
on 18 Sep 2021
It did not work for me. The output I get is a cell in which every line contains a string. The last lines of your code do not run on my computer:
AIS1_with_matches(1:3)
msg_match(1:3)
msg_match{1}(1:3)
AIS1_with_matches{end}
msg_match{end}(1:3)
The code that you gave me before worked, but I got strings inside the cell and I have to remove them. I am working on it but do not know how.
[nRows, ~] = cellfun(@size,msg_match);
isMultiRow = nRows>1;
msg_match(isMultiRow) = cellfun(@(a) {a'}, msg_match(isMultiRow));
msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow));
Here is the code I used, but it gave me problems. Note that I want to split the strings into two or three separate rows.
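One possible way to take string arrays out of a cell array and get one message per row is to stack them with vertcat. This is only a sketch: the msg_match below is a hypothetical stand-in with placeholder messages, and it assumes every cell holds a string array.

```matlab
% Hypothetical stand-in for msg_match: each cell holds a 1-by-n string array
msg_match = { ["msgA", "msgB"]; "msgC"; ["msgD", "msgE", "msgF"] };

% Turn the contents of every cell into a column ...
as_columns = cellfun(@(s) reshape(s, [], 1), msg_match, 'UniformOutput', false);

% ... then stack all columns into a single string array, one message per row
all_msgs = vertcat(as_columns{:});

% Optionally drop repeated messages while keeping the first-seen order
unique_msgs = unique(all_msgs, 'stable');
```

Stacking avoids strsplit altogether, since the messages are already separate elements of the string arrays rather than one long piece of text.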
Walter Roberson
on 18 Sep 2021
I will need extended data to test with. Please attach a .mat with more extensive data. It does not need to be your full data -- just enough to be able to reproduce the problems.
flashpode
on 18 Sep 2021
And this is the code:
linia_dolenta1=[];
linia_dolenta2=[];
N=size(AIS1,1)
P=size(AIS2,1)
for i=1:1:N
seq1=AIS1(i);
linia=convertStringsToChars(seq1);
if length(linia)<15
linia_dolenta1 = [linia_dolenta1,i];
end
end
for j=1:1:P
seq2=AIS2(j);
linia=convertStringsToChars(seq2);
if length(linia)<15
linia_dolenta2 = [linia_dolenta2,j];
end
end
size(AIS1)
size(AIS2)
AIS1([linia_dolenta1],:) = [];
AIS2([linia_dolenta2],:) = [];
size(AIS1)
size(AIS2)
N=size(AIS1,1)
P=size(AIS2,1)
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origi_AIS1 = (1:length(msg_AIS1));
origi_AIS2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origi_AIS1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origi_AIS2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
[~, M1, S1] = hms(dur1); % Split the time into components; M1 is used to build the ranges
[~, M2, S2] = hms(dur2);
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find identical messages
if isempty(all_match_AIS)
fprintf('No matches for line #%d -> "%s"\n', origi_AIS1(K), msg_AIS1(K)); % '%s' for a string
continue;
end
fprintf('potential match #%d -> "%s", checking times\n', origi_AIS1(K), msg_AIS1(K));
disp(K), disp(all_match_AIS)
complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 | M1(K) == M2(all_match_AIS) + 1); % range of +-1 minute around each message
msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched
if isempty(complete_match_AIS)
fprintf('line #%d -> "%s" text matches but time does not\n', origi_AIS1(K), msg_AIS1(K));
else
fprintf('line #%d -> "%s" time matches too. They are:\n', origi_AIS1(K), msg_AIS1(K));
msg_AIS2(complete_match_AIS)
end
end
% find the empty cells (creates the mask variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
msg_match(emptyCells) = [];
% find(strcmp(msg_match, string))
[nRows, ~] = cellfun(@size,msg_match);
isMultiRow = nRows>1;
msg_match(isMultiRow) = cellfun(@(a) {a'}, msg_match(isMultiRow));
msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem
Walter Roberson
on 18 Sep 2021
That last line,
msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem
What is the intention of that line?
Is the intention to remove the !AIV prefix from the 2nd and following entries for any one row?
Walter Roberson
on 18 Sep 2021
Revised code:
AIS1_file = '2021030100AIS1.txt';
AIS2_file = '2021030100AIS2.txt';
AIS1 = string(readlines(AIS1_file));
AIS2 = string(readlines(AIS2_file));
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
msg_match{K} = reshape(AIS2(time_mask), 1, []); %user wants rows
end
% find the empty cells (creates the mask variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
AIS1_with_matches = AIS1;
AIS1_with_matches(emptyCells) = [];
msg_match(emptyCells) = [];
nRows = cellfun(@length, msg_match);
isMultiRow = nRows>1;
%msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem
Walter Roberson
on 23 Sep 2021
AIS1_file = '2021030100AIS1.txt';
AIS2_file = '2021030100AIS2.txt';
AIS1 = string(readlines(AIS1_file));
AIS2 = string(readlines(AIS2_file));
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
matches_anything_in_AIS1 = false;
for K = 1 : num_msg_AIS1
time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
matches_anything_in_AIS1 = matches_anything_in_AIS1 | time_mask;
msg_match{K} = reshape(AIS2(time_mask), 1, []); %user wants columns
end
% find the empty cells (creates the mask variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
AIS1_with_matches = AIS1;
AIS1_NO_MATCHING = AIS1(emptyCells);
AIS1_with_matches(emptyCells) = [];
msg_match(emptyCells) = [];
AIS2_NO_MATCHING = AIS2(~matches_anything_in_AIS1);
nRows = cellfun(@length, msg_match);
isMultiRow = nRows>1;
%msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem
flashpode
on 23 Sep 2021
Well, I meant with this code:
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
N=size(AIS1,1); %% Important: this must come after the filtering, otherwise the code gave an error
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once'); % the whole message except the last 4 digits
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once'); % extract the last 4 digits
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
Time_AIS1 = duration(strcat('00:',extractBefore(t1,3),':',extractAfter(t1,2))); % put into hh:mm:ss format
Time_AIS1 = Time_AIS1+hours(cumsum([0;diff(Time_AIS1)<0])); % add one hour each time mm:ss restarts
Time_AIS2 = duration(strcat('00:',extractBefore(t2,3),':',extractAfter(t2,2)));
Time_AIS2 = Time_AIS2+hours(cumsum([0;diff(Time_AIS2)<0]));
mask1 = ismissing(msg_AIS1) | ismissing(Time_AIS1);
mask2 = ismissing(msg_AIS2) | ismissing(Time_AIS2);
origi_AIS1 = (1:length(msg_AIS1));
origi_AIS2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; Time_AIS1(mask1) = []; origi_AIS1(mask1) = [];
msg_AIS2(mask2) = []; Time_AIS2(mask2) = []; origi_AIS2(mask2) = [];
[H1, M1, S1] = hms(Time_AIS1); % Split the time into components; M1 is used to build the ranges
[H2, M2, S2] = hms(Time_AIS2);
msg_match = cell(N, 1);
for K = 1:1:N
all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find identical messages
if isempty(all_match_AIS) % fprintf can be used to write data to a text file
% fprintf('No matches for line #%d -> "%s"\n', origi_AIS1(K), msg_AIS1(K)); % '%s' for a string
continue;
end
% fprintf('potential match #%d -> "%s", checking times\n', origi_AIS1(K), msg_AIS1(K));
% disp(K), disp(all_match_AIS)
if H1(K)== H2(all_match_AIS)
% build the range of matching minutes
complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 | M1(K) == M2(all_match_AIS) + 1); % range of +-1 minute around each message
msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched. IMPORTANT
end
if isempty(complete_match_AIS)
% fprintf('line #%d -> "%s" text matches but time does not\n', origi_AIS1(K), msg_AIS1(K));
else
% fprintf('line #%d -> "%s" time matches too. The results are:\n', origi_AIS1(K), msg_AIS1(K));
msg_AIS2(complete_match_AIS) % IMPORTANT
end
% find the empty cells (creates the mask variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
msg_match(emptyCells) = [];
% Take the strings out of the cell (cat) --> to concatenate
Matching_msg = cellstr(cat(1, msg_match{:}));
end
Matching_msg = string(Matching_msg);
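The hour roll-over step in the code above (adding one hour each time mm:ss restarts) can be checked in isolation. The following is a small sketch on made-up suffixes; it assumes the lines are in chronological order and that consecutive lines are less than an hour apart, and the variable names are illustrative.

```matlab
% Made-up mmss suffixes in file order; the clock restarts after "5958"
t = ["5830"; "5958"; "0003"; "0110"];

% Build the minute:second part as a duration
mm = str2double(extractBefore(t, 3));   % first two characters = minutes
ss = str2double(extractAfter(t, 2));    % last two characters = seconds
tod = minutes(mm) + seconds(ss);

% A negative jump between consecutive entries marks the start of a new hour;
% cumsum counts how many such restarts have happened so far
hour_index = cumsum([0; seconds(diff(tod)) < 0]);
elapsed = tod + hours(hour_index);
elapsed.Format = 'hh:mm:ss';
disp(elapsed)
```

With the hour included, comparing elapsed times directly (for example with isbetween) also handles windows that straddle an hour boundary, which comparing minute numbers alone does not.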
Walter Roberson
on 23 Sep 2021
Why are you still using that version? I gave you revised, efficient, tested code 5 days ago.
More Answers (1)
chrisw23
on 22 Sep 2021
strEx = "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053 !AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053";
% check/modify the expression under https://regex101.com/
% named "pat" rather than "exp" so the built-in exp function is not shadowed
pat = "(?<prefix>!\w*),(?<ident1>\d),(?<ident2>\d),,(?<ident3>\w),(?<strLoad>[\w\d:?<>@`]*),(?<time>[*\d\w]*)";
tbl = struct2table(regexp(strEx,pat,'names'))
This is just an example of how to parse text with a simple regular expression that uses named groups. I use the website mentioned above to write and test expressions. The table gives easy access for further processing (e.g. datetime conversion), as shown previously. Look at string comparison functions such as 'contains' or 'matches', e.g. tbl.strLoad.contains("137JlD52h0P9td"), which returns a logical index for accessing the matches.
Hope it helps
Christian
2 Comments
Walter Roberson
on 23 Sep 2021
[\w\d:?<>@`]
I think that could more easily be [^,] which is "anything other than a comma"
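A minimal sketch of that suggestion, applied to one sample sentence from the question. The names sentence, pat, tok and hasPayload are illustrative, and the rest of the pattern is kept from the answer above.

```matlab
% One sample sentence from the question
sentence = "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053";

% Same named groups as in the answer, but the payload is matched as [^,]*,
% i.e. any run of characters that are not commas
pat = "(?<prefix>!\w*),(?<ident1>\d),(?<ident2>\d),,(?<ident3>\w),(?<strLoad>[^,]*),(?<time>[*\d\w]*)";
tok = regexp(sentence, pat, 'names');

% Function-call form of contains on the parsed payload
hasPayload = contains(tok.strLoad, "137JlD52h0P9td");
```

Because the fields are separated by commas, [^,] does not need to list every symbol the payload may contain, so characters such as = or > are covered as well.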