Help: Speed up for loop with two functions
Hello everyone, I am currently running this code, but unfortunately it takes forever to finish. A 12x10 matrix took nearly 30 minutes, and I need one for 1415596x10. Is there any way I can speed up the process? Is there something that I am missing?
My goal is to optimize the estimation parameters of fun with a particle swarm algorithm. For each day in my sample (6009 days), I have multiple observations. I want to minimize the error between the observed data (implied_volatility) and my estimated data (via fun) with my 10 parameters.
A = [log_moneyness maturity]; xdata = A; ydata = implied_volatility;
fun = @(x,xdata) (x(1)+x(2).*xdata(:,2)) + (x(3)+x(4).*xdata(:,2)).*((x(5)+x(6).*xdata(:,2)).*...
(xdata(:,1) -(x(7)+x(8).*xdata(:,2))) + sqrt((xdata(:,1) - (x(7)+x(8).*xdata(:,2)).^2 ...
+ (x(9)+x(10).*xdata(:,2)).^2))); %original function
parameters_forall = table2array(T_param (:,2:11)); %1415596x10 matrix
%implied_volatility is a 1415596x1 matrix with original values from market
%now i want to minimize the error of my estimate by:
tDays = size(parameters_forall(:,1));
%%
for i = 1:tDays
x = parameters_forall(i,:);
residue = @(x) (sum(fun(x,xdata)-implied_volatility(i)).^2);
y = particleswarm(residue, 10);
swarm_parameters(i,:) = y;
end
Grateful for any help, all the best to you!
11 Comments
Walter Roberson
on 12 Oct 2021
x = parameters_forall(i,:);
residue = @(x) (sum(fun(x,xdata)-implied_volatility(i)).^2);
You do not use the x that you assign to there. The x in the next line is a different x.
Also, the way your code is structured, all 1415596 rows of xdata would be used in every optimization, with the only difference between optimizations being the implied volatility. It seems more likely to me that you would want xdata to be only a subset of the available data, such as the data for one day (possibly less).
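The remark about the unused x can be illustrated with a minimal sketch. The variable values here are hypothetical and only serve to show how the input argument of an anonymous function hides a workspace variable of the same name, and how other workspace variables are captured when the handle is created.

```matlab
x = [1 2 3];                   % workspace variable (hypothetical values)
f = @(x) sum(x.^2);            % the @(x) argument hides the workspace x
val = f([10 20]);              % evaluated with [10 20], not with [1 2 3]

g = @(p) sum((p - x).^2);      % here x is captured from the workspace at creation
x = [0 0 0];                   % changing x afterwards does not affect g
captured = g([1 2 3]);         % still compares against the captured [1 2 3]
```

In the loop from the question, the optimizer therefore never sees the row taken from parameters_forall; it only sees the x values that particleswarm proposes.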
Kai Koslowsky
on 13 Oct 2021
Walter Roberson
on 13 Oct 2021
Yes... except you seem to have lost the implied volatility, which was a scalar ??
Kai Koslowsky
on 13 Oct 2021
Edited: Kai Koslowsky
on 13 Oct 2021
Walter Roberson
on 13 Oct 2021
I guess that should be okay.
As discussed before, though, I think your upper bound should be N(i+1)-1. Otherwise each N(K) except the first and last is used twice: once as the end row for ii = K-1, and once as the start row for ii = K.
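A minimal sketch of non-overlapping row ranges per day, assuming N holds the first row of each day. The dayId vector, the extra sentinel entry at the end of N, and the bookkeeping variables are assumptions made for illustration and are not taken from the thread.

```matlab
dayId = [1 1 1 2 2 3 3 3 3]';                           % hypothetical day label for each row
N = [1; find(diff(dayId) ~= 0) + 1; numel(dayId) + 1];  % start row of each day, plus a sentinel past the end
nDays = numel(N) - 1;
counts = zeros(1, nDays);
used = zeros(numel(dayId), 1);                          % how often each row is visited
for ii = 1:nDays
    rows = N(ii):(N(ii+1) - 1);                         % upper bound N(ii+1)-1 avoids reusing a start row
    counts(ii) = numel(rows);
    used(rows) = used(rows) + 1;
    % Xdata = xdata(rows,:); Ydata = implied_volatility(rows);  % per-day subset for the optimizer
end
```

With this indexing every row belongs to exactly one day, and each optimization only touches the rows of that day instead of the whole data set.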
About how many entries are expected per day?
When I was testing symbolically last night, I reached the unexpected conclusion that if you run
syms XD [1, size(Xdata,1)]
syms YD [size(Xdata,1), 1]
syms X [1 10]
R = sum(fun(X,XD)-YD).^2
part1 = solve(diff(R,X(1)),X(1))
R2 = subs(R, X(1), part1)
then the result came out as R2 = 0. The same happens if you solve the derivative with respect to X(2).
So at the optimal X(1) or optimal X(2) [according to calculus], the entire residue is 0.
This did not happen for the other variables that I happened to test.
I think the implication is that there might be a lot of different solutions -- which might make it difficult to minimize.
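The same effect can be reproduced numerically. The sketch below uses fun as posted in the question, with small hypothetical data and a hypothetical starting vector x0; it only relies on x(1) being added to every row of the prediction. Shifting x(1) by the mean difference makes the squared sum vanish for any choice of the other nine parameters, while the sum of squared differences stays positive.

```matlab
fun = @(x,xdata) (x(1)+x(2).*xdata(:,2)) + (x(3)+x(4).*xdata(:,2)).*((x(5)+x(6).*xdata(:,2)).* ...
    (xdata(:,1) -(x(7)+x(8).*xdata(:,2))) + sqrt((xdata(:,1) - (x(7)+x(8).*xdata(:,2)).^2 ...
    + (x(9)+x(10).*xdata(:,2)).^2)));                   % as posted in the question

Xdata = [0.1 0.25; 0.2 0.5; 0.3 0.75; 0.4 1.0];        % hypothetical [log_moneyness maturity]
Ydata = [0.30; 0.25; 0.22; 0.28];                       % hypothetical implied volatilities
x0 = [0.2 0.01 0.1 0.02 -0.3 0.01 0.05 0.01 0.2 0.01];  % arbitrary parameter vector

squareOfSum  = @(x) sum(fun(x,Xdata) - Ydata).^2;       % residue as written in the question
sumOfSquares = @(x) sum((fun(x,Xdata) - Ydata).^2);     % least-squares residue

x1 = x0;
x1(1) = x0(1) - sum(fun(x0,Xdata) - Ydata) / size(Xdata,1);  % shift x(1) by the mean difference

r1 = squareOfSum(x1);      % zero up to rounding, whatever x(2:10) are
r2 = sumOfSquares(x1);     % remains positive: the fit itself is not perfect
```

Because every choice of x(2:10) admits such a shift, the squared-sum residue has a whole family of minimizers, which matches the concern about many different solutions.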
Kai Koslowsky
on 14 Oct 2021
Walter Roberson
on 14 Oct 2021
residue = @(x) sqrt((1/ii).*(sum(fun(x,Xdata)-Ydata).^2));
At the moment, I do not understand why the residue should depend on the index of the last entry. The implication is that if you were to have exactly the same data in (say) January 2018 and (say) March 2018, the residue should be smaller for the second case, because it would have a greater ii and so 1/ii would be smaller ??
On the other hand, the 1/ii factor acts as a constant multiple, and for any two proposed x vectors for the same ii, the relative residue is what matters for determining which residue is smaller, and the relative order is not affected by 1/ii. Likewise, unless you had imaginary components, a sum of squares would always be non-negative, and the relative order of the sqrt() of two non-negative numbers is the same as the relative order of the two numbers themselves.
What you need to know for minimization is the relative successes. So you might as well use
residue = @(x) sum((fun(x,Xdata)-Ydata).^2);
Please note that this has an important change to your formula. Watch the location of the () !!
fun(x,Xdata) %invoke function to obtain prediction
- Ydata %subtract actual value
( ).^2 %square the difference
sum( ) %sum the squares
whereas your formula had
fun(x,Xdata) %invoke function to obtain prediction
- Ydata %subtract actual value
sum( ) %sum the differences
.^2 %square the sum
Note: You never store the residues, so you do not care about the absolute values, only about the relative values.
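A small sketch of both points, using hypothetical numbers: the position of the parentheses changes the value of the residue, whereas dividing by a positive constant and taking the square root does not change which of two candidates is smaller.

```matlab
d = [1; -1; 2; -2];            % hypothetical prediction errors
squareOfSum  = sum(d).^2;      % errors cancel before squaring
sumOfSquares = sum(d.^2);      % every error contributes

sA = 4; sB = 9; n = 4;         % two hypothetical sums of squares and a row count
sameOrder = (sA < sB) == (sqrt(sA/n) < sqrt(sB/n));  % ordering survives 1/n and sqrt
```

Here squareOfSum is 0 even though no single error is zero, while sumOfSquares is 10.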
Kai Koslowsky
on 14 Oct 2021
Edited: Kai Koslowsky
on 14 Oct 2021
Walter Roberson
on 15 Oct 2021
You currently only store the coefficients, and for coefficients you do not need the sqrt or the division.
You might want to know what the estimate is with the final coefficients. You can evaluate the residue with the coefficients, divide, sqrt, save the value.
ii-kk+1 is the proper divisor, since that is the number of rows from kk through ii.
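A minimal sketch of that post-processing step, assuming kk and ii are the first and last row of the current day. The residual values are hypothetical stand-ins for fun(xbest,Xdata)-Ydata evaluated at the final coefficients, and the name xbest is an assumption.

```matlab
kk = 4; ii = 5;                       % hypothetical first and last row of one day
resid = [0.03; -0.04];                % stand-in for fun(xbest,Xdata) - Ydata on rows kk:ii
nRows = ii - kk + 1;                  % number of rows in the day
rmse = sqrt(sum(resid.^2) / nRows);   % root-mean-square error for reporting only
```

The optimizer itself can keep minimizing the plain sum of squares; the division and the square root are applied once per day, after the coefficients have been found.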
Kai Koslowsky
on 15 Oct 2021
Edited: Kai Koslowsky
on 15 Oct 2021
Kai Koslowsky
on 3 Nov 2021
Answers (0)