Creating a new matrix based on matching two columns
Show older comments
I have generated two matrices from two different csv files. Matrix 1 contains two columns of information (matrix contain middle names and first name). Matrix 2 only contains one Column of information (middle name). I want to create a loop where I take the middle name from matrix 2 and match it to the middle names in matrix 1. Once the names have been matched I want to read out each matches corresponding first name in a new matrix
3 Comments
the cyclist
on 17 Oct 2023
Can you upload the data? You can use the paper clip icon in the INSERT section of the toolbar.
One thing that is confusing is that you refer to the data as "matrices", but in MATLAB matrices are numeric. So, it's not clear if you have cell arrays, or string arrays, or tables. Uploading the data is the easiest way for us to make sure a solution works for you.
User
on 26 Oct 2023
Answers (1)
T1 = readtable('example1.xlsx');
T2 = readtable('example2.xlsx');
disp(T1);
disp(T2);
"I want to [...] take the [last] name from [array] 2 and match it to the [last] names in [array] 1. Once the names have been matched I want to read out each [match's] corresponding first name in a new [array]"
If you want exact matches:
result = T1{ismember(T1{:,2},T2{:,1}),1}
Or, if you want to split the cells where there are more than one last name separated by commas into multiple separate entries, then something like this:
% make a new table with only one first and last name per row:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].')
% this part is the same as before with T1 but now using T1_new:
result = T1_new{ismember(T1_new{:,2},T2{:,1}),1}
5 Comments
User
on 1 Nov 2023
T1 = readtable('example1.xlsx');
T2 = readtable('example2.xlsx');
% exact matches only:
T_out = T1(ismember(T1{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
% matches any last name in given row of T1:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
User
on 1 Nov 2023
Walter Roberson
on 1 Nov 2023
With the original last name is something I posted code for several days ago at https://www.mathworks.com/matlabcentral/answers/2034499-loading-in-a-table-that-has-multiple-values-in-a-single-cell-seperated-by-a-comma#comment_2938846
@User: Does this produce the expected result? The only change is changing the comma to a semicolon in the regexp() call, since the IDs are separated by semicolons in the real data file (it was commas in the example files).
T1 = readtable('allPharm1.csv')
T2 = readtable('drugID_names_matched_CCM.csv')
C = regexp(T1{:,2},';\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
Categories
Find more on Data Type Conversion in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!