Solve equationsystem (A*V).'*B*(A*V)=C for matrix A

Hello all,
im asking me now a while how can i solve this equation system:
(A*V).'*B*(A*V)=C
where V, B and C are known.
V is a vector basis. B and C are symmetric square matrices. i want to find a matrix A which is initial simply a unity square matrix. The matrix A should be like a scaling matrix for a vector basis.
Is there a possibility to solve the equation system, maybe iterative, for the matrix A?
Maybe there are more than one solution. but tell me your guesses or ideas what is maybe possible (algorithms, ideas what ever).
i have to find the minimum: min((A*V).'*B*(A*V)-C=0).
Thanks in advance and i hope for some answers :)
best regards

 Accepted Answer

John D'Errico
John D'Errico on 8 Jan 2020
Edited: John D'Errico on 8 Jan 2020
As it turns out though, your problem is actually quite simple.
Assume that V is full rank. As long as that is true, then as can arbitrarily make the substitution
X = A*V
V is known and of full rank, so if we can solve the simpler quadratic form
X'*B*X = C
then we can trivially recover A from X.
I'll note there is much to be found in the literature, solving what seems to be a similar equation, X*A*X=B. For example:
Of course, your problem is different because of the transpose. And that is what makes the solution trivial. Lets take this a step further. B is symmetric and square. Is it positive definite? If so, then we can write B as
B = L'*L
So a Cholesky decomposition of B. Now make a further transformation, with
Y = L*X
Your problem now becomes to solve for Y in the problem
Y'*Y = C
Again, is C positive definite? If so, then we can write Y in the form of a Cholesky decomposition of C. Once you have Y, recover X. Once you have X, recover A.
So the problem becomes trivial, if B and C are positive definite, and V is any full rank matrix.
Edit: Since I see by your response to Matt that V is non-square, the problem is still relatively easy, although the solution now becomes far less unique. If
X = A*V
then once X is known, recover a solution for A as
A = X*pinv(V)

7 Comments

Matt J
Matt J on 8 Jan 2020
Edited: Matt J on 8 Jan 2020
Again, is C positive definite? If so, then we can write Y in the form of a Cholesky decomposition of C.
I wonder if it is optimal to assume that Y is triangular...
i currently trying to follow your solution. looks really good. i trying to code it now
B and C are positive definite.
V has linear independent columns, so i think its full of non-square matrix. lets say if V has 10 rows and 2 columns its rank is 2, so its full rank, right?
the problem here is, that Y must be non square to work. so cholesky decomp may doesnt work here.
again this example: B is 10x10, C is 2x2 so A*V must be 10x2. so if i do this decompositions, it work till Y'*Y=C. this Y must be also 10x2
Still trivial. :) You missed something important.
C is 2x2.
Solve for Y, such that Y'*Y = C, using Y as a Cholesky factor of C. So now Y will be 2x2, an upper triangular matrix. So what?
You can trivially append an 8x2 matrix of ALL zeros to Y.
Y = chol( C);
Y = [Y;zeros(8,2)];
Y is still a matrix such that Y'*Y = C. Appending rows of zeros to Y does not change that product. Y is now 10x2.
Yes, it is true that Y is not unique, but that was clear from the beginning, especially once we were told that V was not square and that B and C were different in size.
Once you have Y as a 10x2 matrix, then recover X from Y, and A from X.
Again, the solution is all totally non-unique, since C is only a 2x2 matrix, and you wish to recover a 10x10 matrix A from effectively only 4 pieces of information. But you cannot expect to do any more than that. (Even worse, C is symmetric, so there are not even 4 degrees of freedom in the problem.)
yes, thats true that we can simply extend Y in dimensions. And yes the problem is that Y is not unique.
Another improvement idea:
If i had many different B-C combinations. So different B and C matrices but the vector basis V and scaling matrix A should be the same for all these combinations. Can i improve my solution to make it more unique?
It is not just that Y is not unique. The extraction of A, given X is also not unique. So there is a set of infinitely many matrices A that solve the problem.
Anyway, this is now a completely different problem. For example, suppose you had matrices B1,C1, B2,C2, with a fixed basis V? The method I showed that can generate one non-unique solution A is not directly extensible to two sets of matrices. Could you generate two solutions, A1 and A2, (or A1,A2, ... , An) then take the average? It is not clear that the average of thoe matrices, for example, would also be a solution, or even be at all meaningful.
The set of 10x2 matrix solutions for any given 2x2 matrix C, such that Y'*Y ==C is not related linearly to the matrix C. Those are essentially quadratic equations.
That leaves you with a problem where you have a 10x10 matrix of unknowns, so 100 unknowns, but still possibly not enough pieces of information to generate a unique solution.
You might decide to try a nonlinear optimizer, perhaps lsqnonlin.
yes it it should be a fit procedure (as you said with lsqnonlin), if i decide to take a training set a C and B matrices.
EDIT:
It is not just that Y is not unique. The extraction of A, given X is also not unique.
I maybe can get Y with a known solution of the product (A*V).'*B*(A*V) =V.'*D*V
i have the matrix D (known) and the projection of D onto a subspace spanned by vector basis V is the correct solution which i know. This projection should be equal to the matrix B projected onto a subspace spanned by a modified vector basis X=A*V. So its enough for me, to get a solution for X which is the modified new vector basis.

Sign in to comment.

More Answers (3)

Matt J
Matt J on 8 Jan 2020
Edited: Matt J on 8 Jan 2020
Looks like it would just be,
A = sqrtm(B)\sqrtm(C)/V;

5 Comments

hi, thanks for this answer. Never heard of a squareroot function for matrices. Thats interesting.
I have tried your solution already, but the problem here is that B and C are not the same size. Matrix B is larger than C.
(A*V).'*B*(A*V) results in a matrix which has the same size as C (subspace projection).
simple example: C is 2x2, B is 10x10, V is 10x2, A should be 10x10.
do you have any idea?
So, 4 equations and 100 unknowns?
its 2 equation and 100 unknowns.. or if A is diagonal then 10 unknowns
Matt J
Matt J on 8 Jan 2020
Edited: Matt J on 8 Jan 2020
its 2 equation and 100 unknowns.
I don't see how it's 2 equations. Both sides of the equation are 2x2 matrices, so that gives 4 conditions.
or if A is diagonal then 10 unknowns
Is A diagonal? Do you want us to assume that?
for a first solution we could assume A is diagonal. but maybe there is also a solution for a full matrix A

Sign in to comment.

Matt J
Matt J on 8 Jan 2020
Edited: Matt J on 8 Jan 2020
for a first solution we could assume A is diagonal.
An iterative solution:
fun=@(a) objective(a,B,C,V);
a=lsqnonlin(fun, ones(length(B),1) );
A=diag(a);
function F=objective(a,B,C,V)
A=diag(a);
F=C-(A*V).'*B*(A*V);
end

8 Comments

thats look also like a nice iterative solution. i tried, but variable a is not changing while iteration, it always stays at the initial value ones(length(B),1). do you have any guesses?
it also gives the warning:
Warning: Trust-region-reflective algorithm requires at least as many equations as variables; using Levenberg-Marquardt
algorithm instead.
i will try to set it to the Levenberg-Marquardt algorithm
with Levenberg-Marquardt algorithm it gives the same result. it doesnt change variable a within the iteration, also the residuum is constant
That doesn't happen for me. Presumably, for your data ones(length(B))) is already almost optimal.
in the MAT file attached here i have the matrices which i use currently. can you try it?
the residual is 6.83151e+24. it also doesnt change
it seems to me, that this residual is too high
In the data you sent, B is not 10x10. It is 3330x3330.
yes, but the principle should be the same. even for a first try
With the modified code below, and after suitable normalization of your B,C,V data (note that this doesn't change the solutions A), I obtain a solution a that indeed is not very different from the initial guess, but it is different and measurably improves the error. As I told you, the initial guess of a=ones(N,1) was almost optimal to begin with.
First-Order Norm of
Iteration Func-count Residual optimality Lambda step
0 3331 0.000683151 0.37 0.01
1 6664 0.000681193 0.074 1 0.00211418
2 9999 0.000671141 0.0379 10000 2.57346e-05
C =
0.0042 0.0000
0.0000 227.3884
>> Error(a)
ans =
-0.0259 0.0000
0.0000 -0.0000
Modified code:
load matlab.mat
B=sparse(B/1e4);
C=C/1e14;
V=V/1e5;
Error=@(A) objective(A,B,C,V);
opts=optimoptions(@lsqnonlin,'MaxIterations',10);
a=lsqnonlin(Error, ones(length(B),1) ,[],[],opts );
function F=objective(a,B,C,V)
N=length(B);
A=spdiags(a(:),0,N,N);
F=C-(A*V).'*B*(A*V);
end
ok thanks for that.
so that means that a diagonalization of A is not optimal for this specific problem.

Sign in to comment.

David Kriebel
David Kriebel on 9 Jan 2020
Edited: David Kriebel on 9 Jan 2020
If i had many different B-C combinations. So different B and C matrices but the vector basis V and scaling matrix A should be the same for all these combinations. Can i improve my solution to make it more unique?
EDIT:
i can make it unique if i have a known solution for Y which can be expressed by another known matrix D.
Y.'*Y = V.'*D*V = V.'*D^(1/2)*D^(1/2)*V
then Y results in:
Y=D^(1/2)*V
Then i can use the cholesky decomp as described above to recover A in the original subspace spanned by the basis V.
Thanks for all your ideas, guesses and help!!!
I got it now to work and get a unique solution!

Products

Release

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!