# intersection of multiple arrays

Ananya Malik on 14 Sep 2017
Commented: Ananya Malik on 14 Sep 2017
I have multiple 1D arrays of integers, e.g. A=[1,2,3,4], B=[1,4,5,6,7], C=[1,5,8,9,10,11]. Here 1 appears in all three arrays, 4 in A and B, and 5 in B and C. I want each repeated value to be assigned to a single array, chosen based on the sizes of the arrays, and removed from the others. How can I achieve this? Please help. TIA.
Ananya Malik on 14 Sep 2017
Edited: Ananya Malik on 14 Sep 2017
No. The common elements are [1,4,5]. Removing them from all arrays gives A=[2,3], B=[6,7] and C=[8,9,10,11]. 1 can go in A, B or C. Suppose we put 1 in A, giving A=[1,2,3], B=[6,7], C=[8,9,10,11]. 4 can go in A or B; size(B)<size(A), therefore A=[1,2,3], B=[4,6,7], C=[8,9,10,11]. Finally, 5 can go in B or C, but size(B)<size(C). Therefore the final output should be A=[1,2,3], B=[4,5,6,7] and C=[8,9,10,11].
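The procedure described above can be sketched roughly as follows. This is only one reading of the rule (each shared value goes to whichever of its source arrays is currently smallest, with ties going to the earliest array), and the names `arrays`, `shared` and `result` are made up for illustration.

```matlab
% Input arrays from the question, collected in a cell array.
arrays = {[1 2 3 4], [1 4 5 6 7], [1 5 8 9 10 11]};

% Count in how many arrays each value occurs (a value repeated inside
% one array is counted once because of the per-array unique).
perArray = cellfun(@unique, arrays, 'UniformOutput', false);
[vals, ~, idx] = unique([perArray{:}]);
counts = accumarray(idx(:), 1).';
shared = vals(counts > 1);            % values present in two or more arrays

% Strip the shared values from every array.
result = cellfun(@(x) x(~ismember(x, shared)), arrays, 'UniformOutput', false);

% Give each shared value to the currently smallest array that originally had it.
for v = shared
    owners = find(cellfun(@(x) any(x == v), arrays));
    [~, k] = min(cellfun(@numel, result(owners)));   % first minimum wins ties
    result{owners(k)} = sort([result{owners(k)}, v]);
end

celldisp(result)
```

For the example input this ends with `result = {[1 2 3], [4 5 6 7], [8 9 10 11]}`, matching the walk-through. Like any greedy pass, it is not guaranteed to give the most balanced split for other inputs.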

Guillaume on 14 Sep 2017
This is in no way guaranteed to give you a perfectly balanced output. For that you would need to look at optimisation algorithms, which are beyond my field. This is also a lot less efficient than my other solution:
```matlab
C = {[1 2 3 4 5 6], [1 4 5 6 7], [1 5 8 9 12 20]}
[uvals, ~, bin] = unique(cell2mat(C));
origarray = repelem(1:numel(C), cellfun(@numel, C));
dist = table(uvals', accumarray(bin, 1), accumarray(bin, origarray, [], @(x) {x}), ...
    'VariableNames', {'value', 'repetition', 'arrayindices'});
dist = sortrows(dist, 'repetition');
```
The above builds a table of all the unique values, how many times they're repeated and where they come from originally. You can then iterate through that to distribute the values:
```matlab
%distribute non-repeated values first
%(the explicit output size keeps one cell per input array, even if the last arrays have no unique values)
newC = accumarray(cell2mat(dist.arrayindices(dist.repetition == 1)), ...
                  dist.value(dist.repetition == 1), [numel(C), 1], @(x){x'});
%and remove them from table
dist(dist.repetition == 1, :) = [];
%distribute remaining values one at a time
for row = 1:height(dist)
    destarrays = dist.arrayindices{row};  %which arrays originally had the current value?
    [~, destidx] = min(cellfun(@numel, newC(destarrays)));  %find which is currently smallest
    newC{destarrays(destidx)} = [newC{destarrays(destidx)}, dist.value(row)];  %append value to that smallest array
end
celldisp(newC)
```
I'm using a table here, which is not the most efficient container in MATLAB, but it makes the code easier to understand.
Ananya Malik on 14 Sep 2017
Thank you so much. Not exactly what I wanted, but pretty close. Will tweak it further for my use. Cheers

KL on 14 Sep 2017
```matlab
A=[1,2,3,4];
B=[1,4,5,6,7];
C=[1,5,8,9,10,11];
M = {A,B,C};
N = {[B C], [A C], [A B]};
CommonElements = unique(cell2mat(cellfun(@intersect, M,N,'UniformOutput',false)));
NewM = cellfun(@(x) removeEl(x,CommonElements),M,'UniformOutput',false);
%and then
function x = removeEl(x,a)
    for el=1:length(a)
        x(x==a(el))=[];
    end
end
```
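The `removeEl` loop above can probably be replaced by logical indexing with `ismember`, which avoids the helper function. A minimal sketch, assuming `CommonElements` has already been computed as above (for the example arrays it works out to `[1 4 5]`, hard-coded here so the snippet stands alone):

```matlab
M = {[1,2,3,4], [1,4,5,6,7], [1,5,8,9,10,11]};
CommonElements = [1 4 5];   % result of the intersect step for these arrays

% Keep only the elements of each array that are not in CommonElements.
NewM = cellfun(@(x) x(~ismember(x, CommonElements)), M, 'UniformOutput', false);
celldisp(NewM)
```

`setdiff(x, CommonElements)` would also work, but it returns sorted output with duplicates removed, whereas the `ismember` form preserves the original order and any repeats within an array.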

Guillaume on 14 Sep 2017
This is how I'd do it:
```matlab
C = {[1 2 3 4], [1 4 5 6 7], [1 5 8 9 10 11]};
%sort cell array by increasing vector size:
[~, order] = sort(cellfun(@numel, C));
C = C(order);
%get all unique values
uvals = unique(cell2mat(C));
%go through all arrays, keeping all values in uvals, then removing them from uvals so they can't be used by the next array
for iarr = 1:numel(C)
    C{iarr} = intersect(C{iarr}, uvals);  %only keep the values in uvals
    uvals = setdiff(uvals, C{iarr});      %and remove the ones we've just used
end
celldisp(C)
```
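Because the cell array is reordered by size, the outputs above do not necessarily line up with the original inputs. A small sketch of one way to map them back, assuming that is wanted; `inputs`, `sorted` and `result` are names introduced here, and the input is chosen so that the sort actually changes the order:

```matlab
inputs = {[1 2 3 4 5 6], [1 4 5 6 7], [1 5 8 9 10 11]};

% Same steps as above: process the arrays from shortest to longest.
[~, order] = sort(cellfun(@numel, inputs));
sorted = inputs(order);
uvals = unique(cell2mat(sorted));
for iarr = 1:numel(sorted)
    sorted{iarr} = intersect(sorted{iarr}, uvals);  % keep values still available
    uvals = setdiff(uvals, sorted{iarr});           % mark them as used
end

% Undo the permutation so that result{k} corresponds to inputs{k}.
result = cell(size(inputs));
result(order) = sorted;
celldisp(result)
```

Here the second input is the shortest, so it is processed first and keeps `[1 4 5 6 7]`; after undoing the permutation, `result` is `{[2 3], [1 4 5 6 7], [8 9 10 11]}`.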
Ananya Malik on 14 Sep 2017
Edited: Ananya Malik on 14 Sep 2017
I am trying to balance the number of elements in the arrays. If I change my input to A=[1 2 3 4 5 6], B=[1 4 5 6 7] and C=[1 5 8 9 10 11], the code you provided gives me A=[1 4 5 6 7], B=[2 3] and C=[8 9 10 11], whereas the ideal result would be A=[2 3 4 6], B=[1 7 5] and C=[8 9 10 11], or A=[1 2 3 4], B=[5 6 7], C=[8 9 10 11], or something along those lines.
