How to quickly do Cholesky factorization for many small matrices?
In my project, I have to check many small matrices for positive semidefiniteness (PSDness). I use chol() since it is much faster than using eig() to check for negative eigenvalues. Still, it is very slow and the most time-consuming part of my calculations, since I have to check 10e6 32x32 matrices over and over again. I already tried to use parfor and batch functions, but that doesn't work (I also read this in the forums). However, it occurred to me that the processor utilization is very low, so I checked what happens when I run the code in two instances of MATLAB on the same computer. The processor utilization doubled, and each script finished in the same time it would have taken if it had been running alone. So it is possible to let chol() run in parallel on the same system. Any ideas how to do this in a single instance of MATLAB without incurring the problems you get with chol() in a parfor loop?
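For reference, the per-matrix check described in the question might look like the minimal sketch below. The stack layout (n-by-n-by-N), the small N, and the randomly generated test matrices are assumptions made for illustration, not the original poster's code. Note that chol's second output reports whether the matrix is positive definite; strictly semidefinite matrices are discussed further down the thread.

```matlab
% Baseline sketch: test every page of an n-by-n-by-N stack with chol.
n = 32;
N = 200;                                  % small N, for illustration only
A = zeros(n, n, N);
for k = 1:N
    B = randn(n);
    A(:,:,k) = B*B' + 1e-3*eye(n);        % positive definite by construction
end
A(:,:,1) = -A(:,:,1);                     % make the first page negative definite

isPD = false(N, 1);
for k = 1:N
    [~, flag] = chol(A(:,:,k));           % flag == 0: factorization succeeded
    isPD(k) = (flag == 0);
end
```

Requesting the second output keeps chol from throwing an error on a failed factorization, so the loop needs no try/catch.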
12 Comments
s pernot
on 23 Jul 2021
hi
If you are not too demanding and are willing to move, for instance, to a QR decomposition instead of a Cholesky one, you have two options that can be combined:
1. Look for the Householder QR algorithm presented on Cleve Moler's blog and adapt it using the bsxfun and pagemtimes functions to process 32x32x1e6 tensors in a vectorized way.
2. Make a variation on the theme that sends the computations to a GPU, if one is available with enough memory. This latter option may speed up the code by a factor of more than 100.
good luck
Bruno Luong
on 23 Jul 2021
Edited: Bruno Luong
on 23 Jul 2021
Have you tried looping over
eigs(A, 1, 'sa')
EDIT: I tried it, and it is much slower than chol or qr.
Stephan Orzada
on 23 Jul 2021
s pernot
on 23 Jul 2021
A single comment to speed up your code:
gpuArray(true(N,1)) creates a logical vector in CPU memory and then transfers it to the GPU.
You can directly use
true(N,1,'gpuArray') or gpuArray.true(N,1), which creates the vector on the GPU without any transfer overhead.
The same applies to zeros(1,1,N).
Something else to improve your code:
The if/else statement inside the two for loops is not efficient at all and slows down the computation.
Have you tried to vectorize the code in another way by changing your algorithm, especially by using pagemtimes or bsxfun? It could help to eliminate one of the for loops.
Otherwise, you could switch to Bruno Luong's approach using MEX code.
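As a rough illustration of the vectorization idea above, the sketch below runs a right-looking Cholesky over all pages of a stack at once, so the only remaining loop is over the n columns. It is an assumption-laden sketch, not the original poster's code: the test matrices are made up, and the rank-1 update uses implicit expansion (R2016b or later), which plays the role that bsxfun or pagemtimes would play.

```matlab
% Sketch: right-looking Cholesky applied to all pages of A at once.
n = 4;
N = 3;
A = zeros(n, n, N);
A(:,:,1) = eye(n);                                % positive definite
A(:,:,2) = [1 2 0 0; 2 1 0 0; 0 0 1 0; 0 0 0 1];  % indefinite, positive diagonal
A(:,:,3) = magic(n)*magic(n)' + eye(n);           % positive definite

L = A;                         % overwritten column by column
isPD = true(1, 1, N);          % one flag per page
for k = 1:n
    d = real(L(k,k,:));        % current pivot of every page
    isPD = isPD & (d > 0);     % a non-positive pivot marks the page as failed
    d(~isPD) = 1;              % dummy pivot: no division by zero, no sqrt of a negative
    L(k:n,k,:) = L(k:n,k,:) ./ sqrt(d);           % scale column k
    if k < n
        c = L(k+1:n,k,:);                         % (n-k)-by-1-by-N
        % Rank-1 update of the trailing block via implicit expansion
        L(k+1:n,k+1:n,:) = L(k+1:n,k+1:n,:) - c .* conj(permute(c, [2 1 3]));
    end
end
isPD = isPD(:);
```

For pages that pass, tril(L(:,:,k)) holds the lower Cholesky factor; for failed pages the contents of L are meaningless. The sketch updates the full trailing block rather than only one triangle, trading extra arithmetic for simpler indexing.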
Stephan Orzada
on 23 Jul 2021
Edited: Stephan Orzada
on 23 Jul 2021
Christine Tobler
on 30 Jul 2021
Edited: Christine Tobler
on 30 Jul 2021
Late comment, I'm just back from vacation: What are you doing based on knowing whether chol succeeded on each matrix? That is, do you need the Cholesky factors for the next computation step? If not, I'd still be interested to know what the computation is - it's possible that chol and eig don't agree on whether a matrix is SPD in edge cases (that is, if there's an eigenvalue that's zero up to round-off error).
EDIT: If most of your matrices are expected to NOT be SPD, you can check this more efficiently by looking at the diagonal of each matrix - if any element is <= 0, it means the matrix cannot be SPD, so no need to call chol at all.
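The diagonal pre-screen described in the EDIT can be sketched as follows. The n-by-n-by-N stack layout and the example matrices are assumptions for illustration; the point is only that the diagonals of all pages can be pulled out with one linear-indexing operation, and chol is then called only on the pages that survive.

```matlab
% Sketch: discard pages with a non-positive diagonal entry before calling chol.
n = 3;
N = 4;
A = zeros(n, n, N);
A(:,:,1) = eye(n);                   % positive definite
A(:,:,2) = diag([1 0 1]);            % zero on the diagonal
A(:,:,3) = diag([2 -1 3]);           % negative diagonal entry
A(:,:,4) = [1 2 0; 2 1 0; 0 0 1];    % positive diagonal, but indefinite

% Linear indices of the diagonal entries of every page (n-by-N).
idx = (1:n+1:n^2).' + (0:N-1)*n^2;
d = real(A(idx));                    % column j holds the diagonal of page j
candidate = all(d > 0, 1).';         % pages that can still be SPD

isPD = false(N, 1);
for k = find(candidate).'
    [~, flag] = chol(A(:,:,k));      % only the surviving pages reach chol
    isPD(k) = (flag == 0);
end
```

The screen is only a necessary condition: page 4 passes it and is then rejected by chol.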
Stephan Orzada
on 30 Jul 2021
Edited: Stephan Orzada
on 30 Jul 2021
Bruno Luong
on 31 Jul 2021
Edited: Bruno Luong
on 31 Jul 2021
To Paul
PSD is equivalent to being factorizable as A = L'*L. That a factorizable matrix is PSD is trivial to see, since
x'*A*x is equal to y'*y = norm(y)^2 >= 0 with y = L*x.
Though I also agree with Christine Tobler: numerically, one must be careful using CHOL in limiting cases.
The Cholesky factorization is defined for Hermitian matrices, so the diagonal must be real. The <= 0 check makes sense.
I was referring to the OP's statements about using the chol() function to check for positive semi-definiteness. As I understand chol(), it doesn't factor a positive semi-definite matrix, so I don't understand how chol() would be used in this application.
M = diag([0 1 1]); % a positive semi-definite matrix that should be accepted for this problem
[R,flag] = chol(M)
Good point (as usual) about the diagonal of a Hermitian matrix. The OP didn't explicitly state that the input matrices are Hermitian, but I guess that's a good assumption.
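One common workaround for the example above, which is not something proposed in this thread, is to add a small multiple of the identity before calling chol, so that matrices whose smallest eigenvalue is zero are accepted while clearly indefinite ones are still rejected. The tolerance tol below is a hypothetical value chosen for this toy example; in practice it would have to be scaled to the size of the matrix entries.

```matlab
% Sketch: accept semidefinite matrices by shifting the spectrum slightly.
M = diag([0 1 1]);                      % positive semidefinite, singular
[~, flagPlain] = chol(M);               % nonzero: plain chol rejects M

tol = 1e-12;                            % hypothetical tolerance (assumption)
[~, flagShift] = chol(M + tol*eye(3));  % zero: the shifted matrix is accepted

K = diag([-1 1 1]);                     % genuinely indefinite
[~, flagIndef] = chol(K + tol*eye(3));  % nonzero: still rejected
```

The shift turns the test into "no eigenvalue below -tol", which is a different criterion from exact semidefiniteness and shares the round-off caveats raised elsewhere in the thread.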
Bruno Luong
on 31 Jul 2021
Edited: Bruno Luong
on 31 Jul 2021
In real life, an exactly zero eigenvalue does not exist. CHOL, EIG, and EIGS might all fail due to round-off.
And Stephan Orzada ended up programming his own Cholesky decomposition for his own needs, not using MATLAB's CHOL.
Stephan Orzada
on 1 Aug 2021
Answers (1)
Bruno Luong
on 23 Jul 2021
0 votes
It requires MEX, but it should be fast
2 Comments
Stephan Orzada
on 23 Jul 2021
Edited: Stephan Orzada
on 23 Jul 2021
Bruno Luong
on 24 Jul 2021
Edited: Bruno Luong
on 24 Jul 2021
Indeed, mmx only deals with real matrices.
But if you are willing to modify the code, you might change the function dpotrf in line 284 of mmx.cpp to zpotrf.
Of course, you have to take care of retrieving MATLAB's internal interleaved complex data.
You might ask the author if he can give you a hand with such a task.