Smooth derivative of a matrix

Daniel Kaplan
Daniel Kaplan on 30 Jun 2020
Edited: Matt J on 17 Sep 2021
Hello all!
I'm trying to solve a problem which I'm sure is familiar to many of you. I have a Hermitian matrix H, which depends on a continuous parameter k. Diagonalizing this matrix, I obtain a similarity transformation U(k) such that U(k)'*H(k)*U(k) = E(k). My goal is to calculate an object of this form: U(k)'*dU(k)/dk.
This, as you probably know, is no mean feat. First, I approximate the derivative by a finite difference, dU/dk ≈ (U(k+dk/2) - U(k-dk/2))/dk, but for even this to work I've had to impose a strict gauge fixing for each k so that the result makes sense. I do this by demanding that the first element in a given column vector greater than some cutoff (arbitrary value) is purely real and positive. This condition is then imposed on every column in turn.
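In code, the gauge fixing looks roughly like this (a sketch of the procedure just described; the cutoff value is arbitrary):
cutoff = 1e-6;                                % arbitrary cutoff
for n = 1:size(U,2)
    m = find(abs(U(:,n)) > cutoff, 1);        % first element of column n above the cutoff
    U(:,n) = U(:,n)*exp(-1i*angle(U(m,n)));   % rotate so U(m,n) is real and positive
end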
Unfortunately, some problems still remain. Between k-points, I find that the matrix U still has some freedom in the way the eigenvectors are constructed. For example, a sample vector for a given n is attached below.
As you can see, most entries between (k+dk) and (k-dk) look very much alike. However, occasionally I find myself taking a difference between numbers like 0.271216847084423 + 0.121180161772403i and 8.05451483378083e-06 - 9.43994294873909e-05i resulting in a huge error.
It appears that the actual elements have been swapped: I find the "smooth" value right next to it (look at entries 3 and 4 in the Us below).
So my question is: is there any way to smoothly connect the eigenvectors of neighbouring k points? Is there any way to feed the "previous" step, into the current diagonalization, to be used as a guide? (So that the algorithm always pivots towards a solution as similar as possible to the previous k)?
Any help will be greatly appreciated!
%k+dk/2%
U(91,:)=(..., -1.60107995247200e-05 - 0.000216571122223191i, 0.271216847084423 + 0.121180161772403i, -4.49985882512863e-07 + 3.52816362273402e-06i, -1.35727709652833e-08 - 1.37078930406644e-07i ...)
%k-dk/2%
U(91,:) = (..., -8.56457993055567e-06 + 0.000210558245922497i, 8.05451483378083e-06 - 9.43994294873909e-05i, 0.256786805859489 + 0.114760798783825i, -3.58422797004591e-08 - 3.43888311460517e-07i, ... )
  5 Comments
Daniel Kaplan
Daniel Kaplan on 30 Jun 2020
Edited: Daniel Kaplan on 30 Jun 2020
Oh, perhaps I explained myself incorrectly. Suppose H is 200×200; then obviously U is 200×200.
Now for every n, pick U(98,n). Calculate,
theta = atan(imag(U(98,n))/real(U(98,n)));
And then,
U(:,n) = exp(-i*theta)*U(:,n);
Finally, verify that U(98,n) is positive,
if U(98,n) < 0.0, U(:,n) = -1*U(:,n); end
And repeat for every n.
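Put together, the procedure is (a sketch of the steps above; I use real() explicitly since U(98,n) is only real up to roundoff after the rotation):
for n = 1:size(U,2)
    theta = atan(imag(U(98,n))/real(U(98,n)));     % phase of the pivot element, mod pi
    U(:,n) = exp(-1i*theta)*U(:,n);                % make U(98,n) real
    if real(U(98,n)) < 0, U(:,n) = -1*U(:,n); end  % ...and positive
end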
The point is that picking row 98 globally, instead of selecting a different row m for every column n, works better.
I would be very happy to hear more, and any suggestion is welcome.
Christine Tobler
Christine Tobler on 1 Jul 2020
Ah yes, I had interpreted U(n, m) to refer to the two parameters x and y you had mentioned previously. Changing the sign of each eigenvector so that U(98, :) (for example) is always real and positive makes sense.
I think you could combine this with the approach in eigenshuffle (or any other approach that's reordering the columns of U), since the rescaling here seems unrelated.
One other approach would be to take two matrices U and U2 which are close together, and find the scaling si for each column U2(:, i) that makes it as similar to U(:, i) as possible.
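For unit-norm columns that scaling has a closed form: the best unit-modulus si is the phase of the inner product. A sketch (untested illustration):
for i = 1:size(U2,2)
    si = exp(1i*angle(U2(:,i)'*U(:,i)));   % |si| = 1, minimizes |si*U2(:,i) - U(:,i)|^2
    U2(:,i) = si*U2(:,i);
end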


Answers (4)

David Goodmanson
David Goodmanson on 18 Jul 2020
Edited: David Goodmanson on 24 Jul 2020
Hi Daniel
As you know, multiplying each column of a unitary matrix by a scalar phase factor still results in a unitary matrix (the phase factors can vary for each column). The hermitian matrix H(k) is diagonalized with U(k)'*H(k)*U(k). Let's say k goes in steps of dk. At each step k --> k+dk, each column of U(k+dk) can be multiplied by an independent phase factor. Controlling those phase factors is a necessity, as you are doing with certain elements defined to be real.
The code below uses the following idea. Let Un(k+dk) denote the nth column of U(k+dk). Choose theta so that the quantity
| Un(k+dk)*exp(i*theta) - Un(k) |^2
is minimized. Since the columns are unit vectors, this quantity equals 2 - 2*Re( exp(i*theta)*Un(k)'*Un(k+dk) ), so the minimum occurs when
theta = angle(Un(k+dk)'*Un(k))
Start at k = kstart with U given by the Matlab eig function, and proceed as above (n times for the n columns of U) to obtain U(k) as a smooth function of k. At that point you can calculate U(k)'*dU/dk using the diff function with finely spaced k. However, if you have an analytic or numeric expression for dH/dk, you don't have to do any further differentiation. That's because solving
H(k+dk)*U(k+dk) = U(k+dk)*E(k+dk)
gives to first order
dH*U + H*dU = dU*E + U*dE
which has as a solution
dE/dk = diag(U'*(dH/dk)*U)      (vector of dE/dk values)
U'*dU/dk = G.*(U'*(dH/dk)*U)    (nondiagonal part of U'*dU/dk only)
where
Gij = 1/(Ej - Ei)   for i ~= j
Gij = 0             for i = j
U'*dU/dk could still have a diagonal part, which is as yet undetermined. However, for reasons I have not derived yet, using the theta procedure above results in a U'*dU/dk that has zero values on the diagonal anyway.
The code below uses the standard idea of two complex hermitian matrices H1 and H2, with H(k) = H1*k + H2*(1-k) as k runs from 0 to 1. Then dH/dk is just H1-H2.
For hermitian matrices it appears that the output of eig always has sorted eigenvalues, and that the eigenvectors are normalized to 1 so that the eigenvector matrix is automatically unitary. Otherwise some sorting/normalizing steps would have to be taken, but so far that does not appear to be necessary.
With all off-diagonal elements of H being nonzero, the eigenvalue plot shows much level repulsion as it should.
Depending on the random numbers, some of the plots, especially the eigenvector plots, feature quite nice colored spaghetti.
The theta procedure uses a for loop in the k variable. It's possible to vectorize that out, but the speed advantage over the for loop starts to go away when the dimension of the matrix gets larger. Just the for loop version is included here.
% create hermitian matrices
mu = .2;
d = 10;
h1 = diag(8*rand(d,1)-4) ...
   + mu*(2*rand(d,d)-1 + 2i*rand(d,d) - 1i); % corrected
h1 = h1 + h1';
h2 = diag(8*rand(d,1)-4) ...
   + mu*(2*rand(d,d)-1 + 2i*rand(d,d) - 1i); % corrected
h2 = h2 + h2';
N = 20000;          % size of k array
E = zeros(d,N);
u = zeros(d,d,N);
A = zeros(d,d,N);   % the result, U'*dU/dk
k = (1:N)/N;
delk = k(2)-k(1);
Hk = h1 - h2;       % dH/dk, constant for this linear interpolation
% store U and E for each value of k = k(j)
for j = 1:N
    H = k(j)*h1 + (1-k(j))*h2;
    [uj, Ej] = eig(H,'vector');
    E(:,j) = Ej;
    u(:,:,j) = uj;
end
% phase angle procedure: rotate each column to stay close to the previous k
ufix = u;
for nd = 1:d
    for j = 2:N
        a = ufix(:,nd,j-1);
        b = ufix(:,nd,j);
        theta = angle(b'*a);
        ufix(:,nd,j) = b*exp(1i*theta);
    end
end
figure(99)
plot(k,E)
if true   % do plots of U if d is not too large
    for j = 1:d
        figure(j)
        uplot = squeeze(ufix(:,j,:));
        plot(k,real(uplot))
    end
end
% calculate A = U'*dU/dk from perturbation theory
for j = 1:N
    uj = ufix(:,:,j);
    Ej = E(:,j);
    G = 1./(Ej' - Ej);    % Gij = 1/(Ej - Ei)
    G(1:d+1:d^2) = 0;     % zero out the diagonal
    Hktrans = uj'*Hk*uj;
    A(:,:,j) = Hktrans.*G;
end
if true   % do plots of A if d is not too large
    for j = 1:d
        figure(j+d)
        uplot = squeeze(A(:,j,:));
        plot(k,real(uplot))
    end
end
  3 Comments
David Goodmanson
David Goodmanson on 29 Jul 2020
Hi Daniel,
I tried a couple more things, and have found that, regardless of what is going on with U' * dU/dk, the process that I called the phase angle procedure does very well at producing smooth results for the eigenvectors as a function of k. Jumps that are produced in eigenvector components by eig go away, and you can even introduce randomly chosen multiplicative phase factors for every eigenvector at every value of k and these are removed as well. Not that there can't be cases where this doesn't work, but I have not seen one so far.
Since the value of U'*dU/dk on the diagonal can't be found with the U'*(dH/dk)*U method, I went with numerical differentiation, as I imagine you did as well. Each diagonal element is of course the same as
u'*du/dk for one of the eigenvectors u contained in U. Since the u are continuously normalized, d(u'*u)/dk = 2*Re(u'*du/dk) = 0, so (based on just this fact) u'*du/dk must be pure imaginary. I think I have a decent proof that with the phase angle procedure the imaginary part is zero as well, so that the diagonal elements are zero. I found that with ordinary gradient-type differentiation for du/dk, the values were orders of magnitude smaller than a typical off-diagonal element, but the real part was greater than the imaginary part. Simple first differences are not good enough. When I went to differentiation by cubic spline, the imaginary part is still orders of magnitude smaller than a typical off-diagonal value and the real part is even smaller.
I think that the imaginary part is due to numerical imprecision and is consistent with zero, and any result based on the diagonal elements is probably not meaningful. If that's the case then with this choice of phase factors there would be no 'interesting contributions' due to the diagonal part.
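For reference, here is a sketch of that spline check (not the exact code; it assumes ufix and k from the answer above and differentiates the spline's pp form directly, so no extra toolbox is needed):
ppder = @(pp) mkpp(pp.breaks, pp.coefs(:,1:3).*[3 2 1]);  % d/dk of a cubic pp form
nd = 1;                                      % which eigenvector to check
un = squeeze(ufix(:,nd,:));                  % d x N trajectory of eigenvector nd
dun = zeros(size(un));
for r = 1:size(un,1)                         % spline-differentiate each component
    dun(r,:) = ppval(ppder(spline(k, real(un(r,:)))), k) ...
             + 1i*ppval(ppder(spline(k, imag(un(r,:)))), k);
end
dd = sum(conj(un).*dun, 1);                  % u'*du/dk at each k
[max(abs(real(dd))), max(abs(imag(dd)))]     % both should be near zero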
What you said in point 3 is troublesome, I have to admit.
Interesting what you said about v/E in point 4. I just think about it as a Green's function.
Lior Faeyrman
Lior Faeyrman on 15 Sep 2021
Great solution!
But how do I find the diagonal parts?



Matt J
Matt J on 24 Jul 2020
Edited: Matt J on 24 Jul 2020
So my question is: is there any way to smoothly connect the eigenvectors of neighbouring k points?
Not in general. The eigenvectors are not generally continuous functions of the matrix entries, as shown in this old example from J.W. Givens:
H(0)=eye(2)
H(k)=[1 + k*cos(2/k), -k*sin(2/k);
-k*sin(2/k), 1 - k*cos(2/k)]
This matrix is continuous as a function of k, even at k = 0, but its eigenvector matrix U is not:
U(k)=[sin(1/k) -cos(1/k);
cos(1/k) sin(1/k)]
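A quick numerical illustration of this (my own sketch, not from Givens): calling eig at nearby values of k close to 0 returns eigenvectors that spin around arbitrarily fast.
for kk = [1e-3, 1.0005e-3, 1.001e-3]   % three nearby k values
    H = [1 + kk*cos(2/kk), -kk*sin(2/kk); -kk*sin(2/kk), 1 - kk*cos(2/kk)];
    [U,~] = eig(H);
    disp(U(:,1).')                     % first eigenvector changes drastically
end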
  8 Comments
Matt J
Matt J on 1 Aug 2020
Edited: Matt J on 3 Aug 2020
And does it matter if H is hermitian or not?
I don't think so. I think it's more about whether H has eigenvalues with geometric multiplicity greater than 1. We know the discontinuity will not occur when H has distinct eigenvalues.
Also, when H(t) has geometric multiplicity greater than 1 at some point t0, its eigenvector space is discontinuously larger in dimension than at neighboring t, so it makes sense that we could see eigenvector discontinuities there.
Bruno Luong
Bruno Luong on 3 Aug 2020
"so it makes sense that we could see eigenvector discontinuities there."
But I think this is just an artefact of the output of a specific algorithm. If the eigenspace has dimension >= 2, any non-degenerate basis of the eigenspace forms a valid set of eigenvectors. Therefore correcting the phase will not necessarily make them continuous, let alone differentiable w.r.t. k. However, I'm convinced that there must exist a linear combination that makes them continuous/differentiable. It just takes more effort to find it than the simple phase rotation of individual eigenvectors.



Bruno Luong
Bruno Luong on 30 Jul 2020
Edited: Bruno Luong on 30 Jul 2020
I'll just give an idea that is like a big hammer: what if you use a minimization technique to get the eigenvectors closest to those of the previous step?
meaning
argmin norm(U(k+1) - U(k))^2 such that
U(k+1)*D(k+1)*U(k+1)' = H(k+1)
U(k+1)'*U(k+1) = I
Here U(k+1) is the only unknown; all the rest is known. The matrix norm under minimization is the Frobenius norm, for example.
You might run FMINCON to find the solution for each step k, using the MATLAB decomposition as the starting point and the MATLAB eigenvalues (known by then) to form D(k+1). This will be nasty and probably applicable in practice only for small dimensions, say <= 10, but it should work, as it kills all the freedom that makes U(k) jump when k changes.
You should provide the gradient to FMINCON to help it converge. The gradient expression can be derived without difficulty.
If you push the formal calculation further and write down the Euler-Lagrange equations of the above, it might even lead you to something useful. There are many papers on similar problems, and people end up with an eigenproblem of larger dimension.
If it doesn't work then you might be in Matt's example.
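A minimal sketch of one step of this idea (untested illustration; fmincon works on real variables, so the complex unknown is split into real and imaginary parts, and Hnext = H(k+1), Dnext = its eigenvalues, Uprev = U(k) are assumed given):
d = size(Hnext,1);
unpack = @(x) reshape(x(1:d^2),d,d) + 1i*reshape(x(d^2+1:end),d,d);
obj = @(x) norm(unpack(x) - Uprev,'fro')^2;            % stay close to the previous step
resid = @(U) [reshape(U'*U - eye(d),[],1);             % unitarity residual
              reshape(U*diag(Dnext)*U' - Hnext,[],1)]; % decomposition residual
ceq = @(x) [real(resid(unpack(x))); imag(resid(unpack(x)))];
nonlcon = @(x) deal([], ceq(x));                       % equality constraints only
x0 = [real(Uprev(:)); imag(Uprev(:))];                 % warm start at U(k)
opts = optimoptions('fmincon','MaxFunctionEvaluations',1e5,'Display','off');
Unext = unpack(fmincon(obj,x0,[],[],[],[],[],[],nonlcon,opts));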

Bruno Luong
Bruno Luong on 30 Jul 2020
Edited: Bruno Luong on 30 Jul 2020
Here is my take on the derivative by using a phase correction on all components of Uk.
As an example, I use the random matrix H generated as above, provided by David Goodmanson.
% create hermitian matrices, see David Goodmanson's post
mu = .2;
d = 10;
h1 = diag(8*rand(d,1)-4) ...
   + mu*(2*rand(d,d)-1 + 2i*rand(d,d) - 1i); % corrected
h1 = h1 + h1';
h2 = diag(8*rand(d,1)-4) ...
   + mu*(2*rand(d,d)-1 + 2i*rand(d,d) - 1i); % corrected
h2 = h2 + h2';
N = 20000;            % size of k array
k = (1:N)/N;
delk = k(2)-k(1);
% store H(k) in Hseq
Hseq = zeros(d,d,N);
for j = 1:N
    Hseq(:,:,j) = k(j)*h1 + (1-k(j))*h2;
end
Here is the main code
% John d'Errico's FEX submission to compute eigenvalues and eigenvectors:
% https://www.mathworks.com/matlabcentral/fileexchange/22885-eigenshuffle
[U,E] = eigenshuffle(Hseq);
% Phase correction
[m,n,N] = size(U);
Up = U(:,:,1);
zu = speye(n);
for p = 2:N
    Upm1 = Up;
    Up = U(:,:,p);
    % for each column j we look for sj, a complex scalar such that
    %   |sj| = 1
    %   sj*U(:,j,p) ~= U(:,j,p-1)
    % in the sense that sj = argmin |sj*U(:,j,p) - U(:,j,p-1)|^2
    for j = 1:n
        sj = Upm1(:,j).' / Up(:,j).'; % transpose, NOT conjugate transpose
        zu(j,j) = sj / norm(sj); %#ok
    end
    Up = Up * zu;    % scale each eigenvector U(:,j) by sj
    U(:,:,p) = Up;   % update the eigenvectors
end
% Compute derivative of U by finite differences
dU = diff(U,1,3) / delk;
Then a quick graphical verification:
close all
% Plot the real part
kmid = 0.5*(k(1:end-1)+k(2:end));
for i = 1:m
    fig = figure(0+i);
    ax = axes('parent', fig);
    h = plot(ax, kmid, permute(real(dU(i,:,:)),[3,2,1]));
    xlabel(ax,'k');
    ylabel(ax,'real(dU/dk)');
    title(ax, sprintf('Eigenvector component #%d', i));
    legend(h, arrayfun(@(j) sprintf('U(:,%d)',j), 1:n, 'unif', 0));
end
% Plot the imaginary part
for i = 1:m
    fig = figure(m+i);
    ax = axes('parent', fig);
    h = plot(ax, kmid, permute(imag(dU(i,:,:)),[3,2,1]));
    xlabel(ax,'k');
    ylabel(ax,'imag(dU/dk)');
    title(ax, sprintf('Eigenvector component #%d', i));
    legend(h, arrayfun(@(j) sprintf('U(:,%d)',j), 1:n, 'unif', 0));
end
  1 Comment
Bruno Luong
Bruno Luong on 30 Jul 2020
Edited: Bruno Luong on 30 Jul 2020
Forget it, I think my method is exactly like David's phase correction. The only new thing is that I combine it with John's eigenshuffle.

