Table row retrieval with dependent on column values

I am dealing with a sorting issue on a table which. The table looks like follows:
Rec_id =[1; 1; 2; 2; 2; 3; 3; 4; 4]; % for clarification: the values here go up to 80 in the original table
Rec_type= {'A';'B';'A';'B';'C';'A';'C';'A';'B'};
data_table = table(Rec_id, Rec_type)
data_table = 9×2 table
Rec_id Rec_type ______ ________ 1 {'A'} 1 {'B'} 2 {'A'} 2 {'B'} 2 {'C'} 3 {'A'} 3 {'C'} 4 {'A'} 4 {'B'}
From this I'm trying to extract all rows of Rec_id's which have Rec_type='B', but also the rows of the same Rec_id with Rec_type='A'.
The table i try to extract from the former should look like this:
Rec_id =[1; 1; 2; 2; 4; 4]; % Rec_id = 3 gets filtered out completely, since there is no Rec_id = 3 with Rec_type 'B'
Rec_type= {'A';'B';'A';'B';'A';'B'};
new_table = table(Rec_id, Rec_type)
new_table = 6×2 table
Rec_id Rec_type ______ ________ 1 {'A'} 1 {'B'} 2 {'A'} 2 {'B'} 4 {'A'} 4 {'B'}
My approach was to find all Rec_id's that have a Rec_type of 'B' and use these.
Ids_corr_to_type_B = table.Rec_id(table.Rec_type == 'B');
But I was unable to use the array retrieved from that operation successfully.
Thank you for your help!

 Accepted Answer

Rec_id =[1; 1; 2; 2; 2; 3; 3; 4; 4]; % for clarification: the values here go up to 80 in the original table
Rec_type= {'A';'B';'A';'B';'C';'A';'C';'A';'B'};
data_table = table(Rec_id, Rec_type);
%Row indices which have Rec_type as {'B'}
idx1 = contains(data_table.Rec_type, 'B')
idx1 = 9×1 logical array
0 1 0 1 0 0 0 0 1
%Get the corresponding Rec_ids
arr1 = data_table.Rec_id(idx1)
arr1 = 3×1
1 2 4
%Row indices which contain Rec_type as {'A'} and have the same Rec_id as {'B'}
idx2 = contains(data_table.Rec_type, 'A') & ismember(data_table.Rec_id, arr1)
idx2 = 9×1 logical array
1 0 1 0 0 0 0 1 0
%Output
new_table = data_table( idx1 | idx2, :)
new_table = 6×2 table
Rec_id Rec_type ______ ________ 1 {'A'} 1 {'B'} 2 {'A'} 2 {'B'} 4 {'A'} 4 {'B'}

4 Comments

Thanks a lot, this works perfectly!
In addition to what Dyuman showed, I would recommend using string, not cell arrays of char row vectors, to store text, which allows you to use your original thought:
Rec_id =[1; 1; 2; 2; 2; 3; 3; 4; 4]; % for clarification: the values here go up to 80 in the original table
Rec_type= ["A";"B";"A";"B";"C";"A";"C";"A";"B"];
data_table = table(Rec_id, Rec_type);
data_table(data_table.Rec_type=="B",:)
ans = 3×2 table
Rec_id Rec_type ______ ________ 1 "B" 2 "B" 4 "B"
Even better would be to use categorical for those data, since you have a small number of values, each repeated a bunch of times:
data_table.Rec_type = categorical(data_table.Rec_type)
data_table = 9×2 table
Rec_id Rec_type ______ ________ 1 A 1 B 2 A 2 B 2 C 3 A 3 C 4 A 4 B
categories(data_table.Rec_type) % for historical reasons this returns a cellstr
ans = 3×1 cell array
{'A'} {'B'} {'C'}
data_table(data_table.Rec_type=="B",:)
ans = 3×2 table
Rec_id Rec_type ______ ________ 1 B 2 B 4 B
In this toy example it makes little difference, in large problems it makes a big difference.
I didn't know about the categorical option for tables so I'm going to read into it.
Thank you for the addition!

Sign in to comment.

More Answers (0)

Categories

Tags

Asked:

on 16 Nov 2023

Commented:

on 17 Nov 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!