Numerical packing
    16 views (last 30 days)
  
       Show older comments
    
Hi all, here's a quick one. I've got an array and a limit:
A = [235 235 235 234 235 235 234 234 234 220 280 215 234]
lim = 1000
I want to group the elements of A so that:
1. The sum of each group's elements is below my lim
2. The group numbers of consecutive elements of A never decrease
3. A minimum number of groups is used. Ie, if a valid solution exists using 3 groups, all solutions using 4 or more groups are inferior.
4. The maximum of the sums within each group is minimized (clarified below)
As an example:
grps = zeros(size(A));
while any(grps==0)
 openIdxs = find(grps==0);
 mask = cumsum(A(openIdxs))<lim;
 grps(openIdxs(mask)) = max(grps)+1;
end
The above returns:
grps =
     1  1  1  1  2  2  2  2  3  3  3  3  4
This satisfies (1), (2) and (3), but note that the 4th group has only one element, and the sums of each group are:
grpSums = arrayfun(@(i)sum(A(grps==i)),1:max(grps))
grpSums =
   939   938   949   234
Can anyone think of a nice way to "balance" the group sums so that the maximum group sum is minimised? Note that I want to keep the group count minimised (at 4 in this case)... I don't want to just make lots of groups of 1 element.
Bonus votes for simplicity of code over speed and/or exactness of solution... I don't mind if something is close to optimal but not quite there, and I'm only working with tiny arrays similar to the example above.
Clarification:
Point (4) refers to the grpSums variable calculated above. The command max(grpSums) in this example returns 949. If move some elements from group 3 to group 4:
grps = [1  1  1  1  2  2  2  2  3  3  4  4  4]
my grpSums becomes [939 938 734 449], making max(grpSums) give 939. An even better (and perhaps optimal for this case?) solution is:
grps = [1  1  1  2  2  2  3  3  3  3  4  4  4]
which gives grpSums = [705 704 922 729]. Note that it's not so much the range of numbers in grpSums that I care about - it's specifically its maximum.
8 Comments
  Sean de Wolski
      
      
 on 19 Dec 2011
				Oh it's fine, and a great simplification (if what I'm doing makes any sense at all (which it might (probably does) not)).
Accepted Answer
  Sven
      
 on 29 Dec 2011
        3 Comments
  Sean de Wolski
      
      
 on 30 Dec 2011
				For the example above, timing for mine on R2011b 64bit Windows:
Elapsed time is 0.313661 seconds.
You could also speed your's up a little bit by using 'prodofsize' as the function for cellfun. ( http://www.mathworks.com/matlabcentral/answers/10966-is-there-any-process-to-convert-cell-type-array-to-numarical-array Jan's comment and my time test)
(And don't worry about the Answer; at least you didn't delete this question :) and ask it again)
  Sean de Wolski
      
      
 on 30 Dec 2011
				Ans 0.55 seconds for A = [A;A];
Afetr that it's not obeying constraints.
More Answers (4)
  Walter Roberson
      
      
 on 19 Dec 2011
        After the clarifications, this is a knapsack problem. Knapsack problems are generally NP-complete (exponential time) to solve exactly, but in the restricted case of non-negative integer "weights", one can use Dynamic Programming; the algorithm is hinted at over here
0 Comments
  Seth DeLand
    
 on 10 Mar 2014
        Seeing how Optimization Toolbox has a mixed-integer solver in R2014a, I figured I would post this alternate solution that uses intlinprog to solve this problem. If you examine the output structure from intlinprog, the absolute gap is zero, meaning the solver has proven that this is the optimal solution to this problem.
A few things to note:
- This finds the same solution found in the brute force approach, with a max group size of 922.
- This code runs in a few hundredths of a second, likely much faster than brute force or global optimization methods for this problem.
%%Data
AA = [235 235 235 234 235 235 234 234 234 220 280 215 234];
lim = 1000; % maximum allowable in any group  
ng = 4; % number of groups
%%Useful sizes
na = length(AA);
nx = ng*na;
nz = 1;
%%Variables
% There are two types of variables in this formulation: 
% x: ng x na matrix of binary variables. x_i,j = 1 if A(i) is in group j. 
% z: auxiliary variable used to keep track of maximum size of any group.
% The goal is to minimize z.
%%Objective function coefficients
f = zeros(nx + nz, 1);
f(end) = 1; % Minimize z, which is last in f
%%Sum of each group is less than limit
  % A*x <= b
  A = zeros(ng,nx+nz);
  for ii = 1:na
    A(:,ng*(ii-1)+1:ng*ii) = diag(repmat(AA(ii),ng,1));
end
b = repmat(lim,ng,1);
%%Minimax formulation
% Variable z changes the minimization problem into a minimax problem:
% min z  s.t.  Sum Ax - z <=0 (z is greater than or equal to size of each 
% group)
Atmp =  A;
Atmp(:,end) = -1;
A = [A; Atmp];
b = [b; zeros(ng,1)];
%%Groups of consecutive elements never increase (linear inequality)
Atmp = zeros(na-1,nx+nz);
for ii = 1:na-1
    Atmp(ii,ng*(ii-1)+1:ng*ii) = 1:ng;
    Atmp(ii,ng*ii+1:ng*(ii+1)) = -(1:ng)';
end
A = [A; Atmp];
b = [b; zeros(na-1,1)];
%%Equality constraint that assigns each member to exactly 1 group
Aeq = zeros(na,nx+nz);
for ii = 1:na
    Aeq(ii,ng*(ii-1)+1:ng*ii) = 1;
end
beq = ones(na,1); % exaclty 1 in each group
%%Bounds
lb = zeros(nx+nz,1); % all variables positive
ub = ones(nx+nz,1); % upper bounds for binary variables
ub(end) = Inf; % no upper bound on z
%%Integer variables
intIdxs = 1:nx; % All x variables are binary integers
%%Solve
[xopt,fval,eflag,output] = intlinprog(f, intIdxs, A, b, Aeq, beq, lb, ub);
% If the problem is infeasible (negative eflag), might need to add more
% bins.
%%Post-process
x = xopt(1:end-1); % Extract x from solution vector
grpMax = xopt(end); % grpMax is z
x = round(reshape(x,ng,na));
optBins = zeros(size(AA)); % assigned bins
for ii = 1:na
    optBins(ii) = find(x(:,ii));
end
binVals = A(1:ng,:)*xopt; % grpMax is z
The found solution:
binVals =
705.0000
704.0000
922.0000
729.0000
  Dr. Seis
      
 on 19 Dec 2011
        UGLY/BRUTE-FORCE: I assumed you were trying to minimize the standard deviation of your group sum AND the difference between your lim and mean group sum. I also assumed that there would always be at least 2 groups. I check all possible group sizes/combinations. Here goes:
A = [235 235 235 234 235 235 234 234 234 220 280 215 234];
lim = 1000;
intmax = prod(A);
% WEIGHT 
%   >0.5 weight STD of sum more heavily 
%   <0.5 weight difference of lim and MEAN of sum more heavily 
weight = 0.5; 
best_combo = zeros(1,length(A));
best_fit = intmax;
for i = 1 : length(A)-1
  B = nchoosek(1:length(A)-1,i);
  for j = 1 : nchoosek(length(A)-1,i)
      C = (i+1)*ones(1,length(A));
      for k = sort(1 : i,'descend')
          C(1:length(A)<=B(j,k))=k;
      end
      fit = (std(arrayfun(@(k)sum(A(C==k)),1:i+1))*(weight))^2 + ...
            ((lim-mean(arrayfun(@(k)sum(A(C==k)),1:i+1)))*(1-weight))^2;
      if sum(arrayfun(@(k)sum(A(C==k)),1:i+1)>lim) == 0 && fit < best_fit
          best_fit = fit;
          best_combo = C;
      end
  end
end
grpsum=arrayfun(@(k)sum(A(best_combo==k)),1:max(best_combo));
This results in:
best_combo =
     1     1     1     2     2     2     3     3     3     3     4     4     4
grpsum =
   705   704   922   729
2 Comments
  Sean de Wolski
      
      
 on 20 Dec 2011
        Alright, here goes!! accumarray is certainly a part of it, as is ga() the awesome genetic algorithm function. So yes, Global Optimization Toolbox is required :)
main.m:
%12/19/2011
%
%Data
A = [235 235 235 234 235 235 234 234 234 220 280 215 234]'; %column vector (because they're cool)
lim = 1000; %upper limit.
szA = size(A);
% Get bounds and number of variables:
[bounds, n] = getValidGroup(A,lim);
% Engine:
ObjectiveFunction = @(x)binWeight(A,index2groups(x,szA),lim); %get the weight
[x,fval] = ga(ObjectiveFunction,n,[],[],[],[],ones(1,n),bounds(2,:),[],1:n);
index = index2groups(x,szA);
fprintf('Groups: %s \n',num2str(index));
fprintf('Weight is: %i!\n',binWeight(A,index,lim));
binWeight.m:
function weight = binWeight(A,index,lim)
%
%data nx1 vector of values
%index = nx1 vector of groupings
%lim = threshold of group sum
%
groups = accumarray(index,A); %sum bins
groups(groups>lim) = lim*10; %lim is soft constraint with stiff penalty.
weight = max(groups); %worst one
getValidGroup.m:
function [bounds cnt] = getValidGroup(A,lim)
%
%inputs:
%A  - nx1 vector of values
%lim - the group limit
%
%outputs:
%bounds - 2x(cnt) lower/upper bounds
%cnt - number of bins
%
cnt = 0;
szA = size(A);
index = zeros(szA);
A2 = A;
while ~isempty(A)
    cnt = cnt+1;
    index(cnt) = find(cumsum(A)<lim,1,'last');
    A(1:index(cnt)) = [];
end
index = index(1:cnt);
index(end) = find(cumsum(flipud(A2))<lim,1,'last');
bounds = 1:cnt; %lower bounds
bounds(2,:) = index; %upper bounds
index2groups.m:
function groups = index2groups(index,szA)
%index is a vector cumulative max values
index = index(:);
index = cumsum([1;index(1:end-1)]);
groups = zeros(szA(1),1);
groups(index) = 1; %min(index,szA(1))
groups = cumsum(groups);
And executing main.m:
Optimization terminated: average change in the penalty fitness value less than options.TolFun and constraint violation is less than options.TolCon. Groups: 1112223333444 Weight is: 922!
I have to say this is one of the most fun/educational/rewarding MATLAB problems I've ever worked on. Thanks!
2 Comments
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!



