How to detect and manage occurrencies in a large .csv file?

1 view (last 30 days)
Hi to everyone!
I have a large .csv file that contains results about intersections bewteen two elliptical orbits: asteroids orbits and spacecraft orbits.
I try to explain well the types of data I have so you can understand how to help me solve the problem.
Let's consider, for instance, the following table fragment below (the full table is attached to this question):
Ast, S/C, nInters, Sol_inters, d_omega_deg, error_om2_A_deg, error_om2_B_deg, av_om_error_deg, TOF_S/C_Ly_A_d, TOF_S/C_Ly_B_d, epoch_exit_CI_A, epoch_exit_CI_B, DV_A_km/s, DV_B_km/s, dist_ast_S_C_ptA_km, dist_ast_S_C_ptB_km
39, L2-02-007, 2, B, 307.65, 0.109944019453062E+00, 0.293932762447767E-01, 0.961152534962726E-01, 0.533331139142907E+03, 0.526417642355725E+03, 0.245950334376218E+07, 0.245950326203535E+07, 0.346335923232639E+00, 0.347249159104803E+00, 0.293709950733154E+06, 0.787334303377599E+05
39, L2-02-007, 1, A or B, 307.74, 0.732771847295692E-01, 0.732771847295692E-01, 0.961152534962726E-01, 0.529845402472029E+03, 0.529845402472029E+03, 0.245950340404568E+07, 0.245950340404568E+07, 0.344508783925687E+00, 0.344508783925687E+00, 0.196016188153376E+06, 0.196016188153376E+06
39, L2-02-008, 2, A, 306.87, 0.925872687404537E-01, 0.141629205617105E+00, 0.961152534962726E-01, 0.540292285694384E+03, 0.520880053711052E+03, 0.245950253846427E+07, 0.245950230082811E+07, 0.362822520681023E+00, 0.365273972285802E+00, 0.246756860882465E+06, 0.380296156813702E+06
39, L2-02-008, 2, A, 306.97, 0.882567118249506E-01, 0.130066572015990E+00, 0.961152534962726E-01, 0.539643449621091E+03, 0.521470670629045E+03, 0.245950263158793E+07, 0.245950241007700E+07, 0.360685749111593E+00, 0.362996796491447E+00, 0.235267669658578E+06, 0.349161650184528E+06
39, L2-02-008, 2, A, 307.06, 0.834175775051655E-01, 0.117999762789471E+00, 0.961152534962726E-01, 0.538947018851022E+03, 0.522109029511155E+03, 0.245950272419558E+07, 0.245950251983744E+07, 0.358545302576901E+00, 0.360701789255686E+00, 0.222421506550744E+06, 0.316683536565954E+06
39, L2-02-008, 2, A, 307.16, 0.779294957715048E-01, 0.105288430855583E+00, 0.961152534962726E-01, 0.538190584973931E+03, 0.522807539537756E+03, 0.245950281614481E+07, 0.245950263025180E+07, 0.356402743411489E+00, 0.358387000720349E+00, 0.207843298450224E+06, 0.282486820940043E+06
39, L2-02-008, 2, A and B, 307.26, 0.715761629082839E-01, 0.917162995961507E-01, 0.961152534962726E-01, 0.537355119699791E+03, 0.523585229774273E+03, 0.245950290721616E+07, 0.245950274153954E+07, 0.354260539921359E+00, 0.356049563336923E+00, 0.190954990359404E+06, 0.245993537643158E+06
39, L2-02-008, 2, A and B, 307.35, 0.639920128388102E-01, 0.769178279464663E-01, 0.961152534962726E-01, 0.536408618358321E+03, 0.524474105671694E+03, 0.245950299703872E+07, 0.245950285407151E+07, 0.352122931729968E+00, 0.353684826027230E+00, 0.170779491730742E+06, 0.206226680030842E+06
39, L2-02-008, 2, A and B, 307.45, 0.544613907117058E-01, 0.601773856509154E-01, 0.961152534962726E-01, 0.535288729994966E+03, 0.525536518970640E+03, 0.245950308488638E+07, 0.245950296857382E+07, 0.349998285734353E+00, 0.351283997706840E+00, 0.145403706063315E+06, 0.161273234508592E+06
39, L2-02-008, 2, A and B, 307.54, 0.411290462165725E-01, 0.396397482453050E-01, 0.961152534962726E-01, 0.533834625689053E+03, 0.526933299383255E+03, 0.245950316887682E+07, 0.245950308692876E+07, 0.347908300320221E+00, 0.348824942521129E+00, 0.109867396981676E+06, 0.106173051072993E+06
39, L2-02-008, 1, A or B, 307.64, 0.434501847190341E-02, 0.434501847190341E-02, 0.961152534962726E-01, 0.530355376573519E+03, 0.530355376573519E+03, 0.245950322907317E+07, 0.245950322907317E+07, 0.346080092313682E+00, 0.346080092313682E+00, 0.116221730142742E+05, 0.116221730142742E+05
39, L2-02-009, 2, A, 273.32, 0.404789173791976E-01, 0.290110239830519E+01, 0.961152534962726E-01, 0.595378757082011E+03, 0.492407555768265E+03, 0.245946844783704E+07, 0.245946546330698E+07, 0.882072624319625E+00, 0.886006182121303E+00, 0.106973707904012E+06, 0.789126988424564E+07
In the first two columns we have the asteroid and spacecraft orbits identifiers. In general two ellipses with a common focus may have 2 distinct intersections or two coincident ones that I named A and B. The third column tells us which of two point is a solution for the couple AST-S/C reported in the first two columns; as you see in the table we can have as solution only A, only B, both when they are distinct points (A and B) or both when they are coincident points (A or B).
Then in the fourth column there is the angle that represente the relative geometry between ellipses and in the other columns we have some data about these intersection points.Sorry for this big preamble.
My objectvie is to find if a S/C orbit is associated with more than one asteroid and if yes which one. For instance, in the table fragment above the S/C orbit L2-02-007 is associated to asteroid 39 and so I want to find if the same S/C orbit is associated to another asteroid orbit and which is its identifier and at which row of the table it is located.
Can you help me to get the following results?
  • I want to store the result of this search in a matlab table to get something like the following one (for instance, Let' s assume that the S/C L2-02-007 is associated to asteroids 20, 25, 39 and 54 and then let us assume other results for different S/C orbits).
S/C, No. ast. associated, Ast_id associated,
L2-02-007, 4, 20-25-39-54,
L2-04-010, 3, 1-12-15,
L2-06-012, 7, 2-3-6-12-26-30-35,
L2-09-023, 2, 2-41,
...
  • I want to produce a filterd table starting from the original one to mantain only the S/C that are associated to multiple asteroids; something like the following one:
Ast, S/C, nInters, Sol_inters, d_omega_deg, error_om2_A_deg, error_om2_B_deg, av_om_error_deg, TOF_S/C_Ly_A_d, TOF_S/C_Ly_B_d, epoch_exit_CI_A, epoch_exit_CI_B, DV_A_km/s, DV_B_km/s, dist_ast_S_C_ptA_km, dist_ast_S_C_ptB_km
20, L2-02-007, 2, B, 307.65, 0.109944019453062E+00, 0.293932762447767E-01, 0.961152534962726E-01, 0.533331139142907E+03, 0.526417642355725E+03, 0.245950334376218E+07, 0.245950326203535E+07, 0.346335923232639E+00, 0.347249159104803E+00, 0.293709950733154E+06, 0.787334303377599E+05
25, L2-02-007, 1, A or B, 307.74, 0.732771847295692E-01, 0.732771847295692E-01, 0.961152534962726E-01, 0.529845402472029E+03, 0.529845402472029E+03, 0.245950340404568E+07, 0.245950340404568E+07, 0.344508783925687E+00, 0.344508783925687E+00, 0.196016188153376E+06, 0.196016188153376E+06
39, L2-02-008, 2, A, 306.87, 0.925872687404537E-01, 0.141629205617105E+00, 0.961152534962726E-01, 0.540292285694384E+03, 0.520880053711052E+03, 0.245950253846427E+07, 0.245950230082811E+07, 0.362822520681023E+00, 0.365273972285802E+00, 0.246756860882465E+06, 0.380296156813702E+06
54, L2-02-008, 2, A, 306.97, 0.882567118249506E-01, 0.130066572015990E+00, 0.961152534962726E-01, 0.539643449621091E+03, 0.521470670629045E+03, 0.245950263158793E+07, 0.245950241007700E+07, 0.360685749111593E+00, 0.362996796491447E+00, 0.235267669658578E+06, 0.349161650184528E+06
01, L2-04-010, 2, A, 307.06, 0.834175775051655E-01, 0.117999762789471E+00, 0.961152534962726E-01, 0.538947018851022E+03, 0.522109029511155E+03, 0.245950272419558E+07, 0.245950251983744E+07, 0.358545302576901E+00, 0.360701789255686E+00, 0.222421506550744E+06, 0.316683536565954E+06
12, L2-04-010, 2, A, 307.16, 0.779294957715048E-01, 0.105288430855583E+00, 0.961152534962726E-01, 0.538190584973931E+03, 0.522807539537756E+03, 0.245950281614481E+07, 0.245950263025180E+07, 0.356402743411489E+00, 0.358387000720349E+00, 0.207843298450224E+06, 0.282486820940043E+06
15, L2-04-010, 2, A and B, 307.26, 0.715761629082839E-01, 0.917162995961507E-01, 0.961152534962726E-01, 0.537355119699791E+03, 0.523585229774273E+03, 0.245950290721616E+07, 0.245950274153954E+07, 0.354260539921359E+00, 0.356049563336923E+00, 0.190954990359404E+06, 0.245993537643158E+06
02, L2-06-012, 2, A and B, 307.35, 0.639920128388102E-01, 0.769178279464663E-01, 0.961152534962726E-01, 0.536408618358321E+03, 0.524474105671694E+03, 0.245950299703872E+07, 0.245950285407151E+07, 0.352122931729968E+00, 0.353684826027230E+00, 0.170779491730742E+06, 0.206226680030842E+06
03, L2-06-012, 2, A and B, 307.45, 0.544613907117058E-01, 0.601773856509154E-01, 0.961152534962726E-01, 0.535288729994966E+03, 0.525536518970640E+03, 0.245950308488638E+07, 0.245950296857382E+07, 0.349998285734353E+00, 0.351283997706840E+00, 0.145403706063315E+06, 0.161273234508592E+06
06, L2-06-012, 2, A and B, 307.54, 0.411290462165725E-01, 0.396397482453050E-01, 0.961152534962726E-01, 0.533834625689053E+03, 0.526933299383255E+03, 0.245950316887682E+07, 0.245950308692876E+07, 0.347908300320221E+00, 0.348824942521129E+00, 0.109867396981676E+06, 0.106173051072993E+06
12, L2-06-012, 1, A or B, 307.64, 0.434501847190341E-02, 0.434501847190341E-02, 0.961152534962726E-01, 0.530355376573519E+03, 0.530355376573519E+03, 0.245950322907317E+07, 0.245950322907317E+07, 0.346080092313682E+00, 0.346080092313682E+00, 0.116221730142742E+05, 0.116221730142742E+05
26, L2-06-012, 2, A, 273.32, 0.404789173791976E-01, 0.290110239830519E+01, 0.961152534962726E-01, 0.595378757082011E+03, 0.492407555768265E+03, 0.245946844783704E+07, 0.245946546330698E+07, 0.882072624319625E+00, 0.886006182121303E+00, 0.106973707904012E+06, 0.789126988424564E+07
...
N.B I modified the data in the two example tables above to explain what I want to achieve, and so these data do not reflect the actual values in the original table.

Accepted Answer

dpb
dpb on 29 Jul 2022
Start perhaps with something like
>> tAst=readtable('table.csv');
tAst.S_C=categorical(tAst.S_C);
tAst.Sol_inters=categorical(tAst.Sol_inters);
tByAst=groupsummary(tAst,{'Ast','S_C'});
tBySC=groupsummary(tAst,{'S_C','Ast'});
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the
table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
>> head(tByAst,20)
ans =
20×3 table
Ast S_C GroupCount
___ _________ __________
39 L2-02-001 23
39 L2-02-002 23
39 L2-02-003 22
39 L2-02-004 21
39 L2-02-005 20
39 L2-02-006 15
39 L2-02-007 11
39 L2-02-008 9
39 L2-02-009 227
39 L2-02-010 353
39 L2-03-001 25
39 L2-03-002 25
39 L2-03-003 26
39 L2-03-004 25
39 L2-03-005 25
39 L2-03-006 25
39 L2-03-007 25
39 L2-03-008 24
39 L2-03-009 25
39 L2-03-010 25
>> head(tBySC,20)
ans =
20×3 table
S_C Ast GroupCount
_________ ___ __________
L2-02-001 39 23
L2-02-001 41 23
L2-02-001 43 68
L2-02-001 44 40
L2-02-002 39 23
L2-02-002 41 23
L2-02-002 43 68
L2-02-002 44 39
L2-02-003 39 22
L2-02-003 41 22
L2-02-003 43 67
L2-02-003 44 39
L2-02-004 39 21
L2-02-004 41 23
L2-02-004 43 67
L2-02-004 44 28
L2-02-005 39 20
L2-02-005 41 22
L2-02-005 43 67
L2-02-006 39 15
>>
that collects them in groupings -- a magic way to collect the members of the group into a single array as a one-liner hasn't struck yet...

More Answers (0)

Categories

Find more on Language Fundamentals in Help Center and File Exchange

Products


Release

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!