Solving overdetermined linear sytem for possible solutions

I have a coefficient Matrix A with the size N x M with values 0 or 1 and a Vector B with the size 1 x M. I'm searching for one or multiple possible solutions for C, a vector with the length K, which statisfy sum(A(C,:),1) == B. There are not infinite solutions, however at least 1 solutions exists.
Simplified example:
A = [1 0 1; 1 1 1; 1 0 0; 0 0 1]
A = 4×3
1 0 1 1 1 1 1 0 0 0 0 1
B = [2 1 2]
B = 1×3
2 1 2
A solution for C with K = 2 could be
C2 = [1 2];
C2l = logical([1 1 0 0]); % or as logical indexing
And for C with the K = 3 could be
C3 = [2 3 4];
C3l = logical([0 1 1 1]); % or as logical indexing
sum(A(C2l,:),1) == B
ans = 1×3 logical array
1 1 1
sum(A(C3,:),1) == B
ans = 1×3 logical array
1 1 1
I have access to optimisation toolbox, but I don't know how to formulate this constraint.
K = 3;
prob = optimproblem;
RI = optimvar('RowIndex',K,'Type','integer','LowerBound',0);
%% Constraints
prob.Constraints.colsum = sum(A(RI,:),1) == B;
Unable to use a value of type optim.problemdef.OptimizationVariable as an index.
I'm open for any approaches.

Answers (2)

Torsten
Torsten on 2 Sep 2022
Edited: Torsten on 2 Sep 2022
Choose RI as a binary integer vector (0 or 1) of length N. Constraint is sum(RI) = K and sum(RI.'*A) = B.
Of course, this is only for the case that you know an RI such that sum(RI.'*A) = B exists.

1 Comment

Toy example
N=5;M=4;
A=randi([0,1],N,M);
K = 3;
B=sum(A(3:5,:)); %Known solution RI=[3,4,5]
prob = optimproblem;
X = optimvar('X', [1 size(A,1)], 'Type','integer','LowerBound',0, 'UpperBound', 1);
Constraint1 = sum(X,2)==K;
Constraint2 = X*A == B;
prob.Constraints = [Constraint1 Constraint2];
sol = solve(prob);
Solving problem using intlinprog. LP: Optimal objective value is 0.000000. Optimal solution found. Intlinprog stopped at the root node because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
C = find(round(sol.X))
C = 1×3
3 4 5

Sign in to comment.

Download minL1intlin,
and solve min. subject to sum(c)=K, c(i){0,1}..
N=5;M=4;
A=randi([0,1],N,M);
K = 3;
B=sum(A(3:5,:)); %Known solution RI=[3,4,5]
c=minL1intlin(A.',B.',1:N, [],[],ones(1,N),K,zeros(N,1),ones(N,1) );
LP: Optimal objective value is 0.000000. Optimal solution found. Intlinprog stopped at the root node because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon variables are integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
RI=find(round(c))
RI = 3×1
3 4 5

7 Comments

It is super fast, but it does not give a valid solution for real A (might be my array is too large).
N = 12105; M = 37; K = 9;
B_real = [ 3 5 1 1 3 1 2 1 1 2 1 1 2 5 4 1 4 1 1 1 3 1 1 2 1 1 1 2 2 2 1 3 2 1 4 1 3 ];
Solver B
B_solver = [3 5 0 1 0 1 0 0 1 0 2 0 1 0 0 1 8 1 0 0 3 1 1 8 3 0 0 9 9 0 0 3 6 1 1 1 2];
Output from the console:
LP: Optimal objective value is 62.000000.
Heuristics: Found 1 solution using total rounding.
Upper bound is 62.000000.
Relative gap is 0.00%.
Optimal solution found.
Intlinprog stopped at the root node because the objective value is within a gap tolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default value). The intcon variables
are integer within tolerance, options.IntegerTolerance = 1e-05 (the default value).
Can you please try Torsen's formulation?
It is super fast, but it does not give a valid solution for real A (might be my array is too large).
What evidence do you have that an ideal solution exists? Please attach the A and B data, along with the known solution RI.
I tried Torsen's formulation and the output is that there is no feasible solution, which I cannot understand. The matrices are constructed from data sets using all possible combinations, so there should be at least one solution.
Deep explanation:
I have 2 Matrices, A (n x 1) and B (m x 1) with items stored in Cells.
A = num2cell(randi([1 88],88,1)); % Matrix of items A
B = num2cell(randi([1 44],23,1)); % Matrix of items B
%not all items in A and B are necessary unique
From Matrix A, I pick 54 items and Matrix B, 18 items at random with randperm. Resulting Matrix A1 and B1 have items, which might have duplicate items.
A1 = randperm(size(A,1),54);
B1 = randperm(size(B,1),18);
%I work with Cells (and Characters) but when I tried to use Numbers in Cells, your follow
%code did not work with setdiff errors.
A1 = string(A(A1));
B1 = string(B(B1));
Now I pick 6 items from A and 2 items from B to go through all possible combinations and keep the combinations, which only have unique items, resulting in varying Matrix A6B2 ~10000-20000 x 8 combinations.
% your code
n = 6; m = 2;
AnotB = setdiff(A1,B1);
BnotA = setdiff(B1,A1);
AorB = intersect(A1,B1);
A6B2 = [];
for i=0:min(n,size(AnotB,2))
SA = nchoosek(AnotB,i);
for j=0:min(m,size(BnotA,2))
SB = nchoosek(BnotA,j);
SC = nchoosek(AorB,m+n-i-j);
[iA,iB,iC] = ndgrid(1:size(SA,1),1:size(SB,1),1:size(SC,1));
SABC = [SA(iA,:),SC(iC,:),SB(iB,:)];
A6B2 = [A6B2; SABC];
end
end
size(A6B2)
ans = 1×2
288 8
groupcounts([A1;B1]) to get the randomly selected Items and their count.
[cItem, uItem] = groupcounts([A1;B1]);
B_real = cItem % since I picked from A1 and B1, the total count must stay the same
Now I construct a coefficient Matrix and applied the Torsen optimisation problem algorithm.
for i = 1:size(cItem,1)
CM(:,i) = sum(strcmp(A6B2,uItem{i})~=0,2);
end
K = 9;
prob = optimproblem;
X = optimvar('X', [1 size(CM,1)], 'Type','integer','LowerBound',0, 'UpperBound', 1);
Constraint1 = sum(X,2)==K;
Constraint2 = X*double(CM) == cItem';
prob.Constraints = [Constraint1 Constraint2];
sol = solve(prob);
Now my thinking, why there should be at least one solution:
Since I picked the amount which is divisible by 8 and (54+18 = 72 ->72/8 -> K = 9), now there has to be at least one combination in A6B2 (since A6B2 stores all possible combinations) to distribute all items of A1 and B1 to K "Boxes".
"Deep explanation:" ..
I honestly don't understand from "Resulting Matrix A1 and B1 have partially the repeating items." onwards/
cleared up my explanation in hope of better readibility
"since A6B2 stores all possible combinations"
Be careful, the combinations are not necessary A in the first 6 letters and B the last 2 in case AorB = intersect(A1,B1) is a non-empty set.
My code pick m+n-i-j that is sorted elements to form SC. You would have to make all permutation of SC to get ALL combinations with 6 first letters from A and 2 first letters from B.
It is not clear from your previous question, since you use the word "subset", a set does not have order: "abc", "bac" "cba" ... all are the same

Sign in to comment.

Categories

Products

Asked:

on 2 Sep 2022

Commented:

on 2 Sep 2022

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!