Optimize ESN Hyperparameters with Grid Search MATLAB

I have the following echo state network (ESN) hyperparameters: m_GS, k_GS, c_GS, gamma_GS. I would like to use my ESN to find the optimal values for these four hyperparameters using the grid search method in MATLAB. The function I would like to minimize is the discrete-time vector L2_loss(m) for m = 1, 2, ..., M, where M = 101 is the number of simulation timesteps. Hence, L2_loss is a 1 by 101 double vector in my case. My attempt is shown below:
% Grid Search
ndatapoints = 20;
m_GS = linspace(0,1,ndatapoints); % [kg]
k_GS = linspace(0,100,ndatapoints); % [N/m]
c_GS = linspace(0,100,ndatapoints); % [kg/s]
gamma_GS = linspace(100,200,ndatapoints); % [N/m^3]
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);
fitresult = L2_loss;
[minval, minidx] = min(fitresult);
m_GS_optimal = m_G(minidx);
k_GS_optimal = k_G(minidx);
c_GS_optimal = c_G(minidx);
gamma_GS_optimal = gamma_G(minidx);
I'm not sure that this is correct as c_GS_optimal and k_GS_optimal are both zero. Do I need to increase ndatapoints or set it equal to m?
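One likely reason for the zeros, assuming L2_loss is the 1 by 101 vector described above: minidx then lies between 1 and 101, but it is used as a linear index into 20 by 20 by 20 by 20 grids. The check below is a small sketch (gridSize and maxReach are names introduced here for illustration) that uses ind2sub to show which grid subscripts those first 101 linear indices can reach.

```matlab
% Which grid subscripts can a linear index in 1..M reach?
ndatapoints = 20;
M = 101;                                    % length of L2_loss in the question
gridSize = [ndatapoints ndatapoints ndatapoints ndatapoints];
[im, ik, ic, ig] = ind2sub(gridSize, 1:M);  % subscripts along m, k, c, gamma
maxReach = [max(im), max(ik), max(ic), max(ig)];
disp(maxReach)                              % largest subscript reached: 20 6 1 1
```

The c and gamma subscripts never leave 1, so c_GS_optimal is pinned to its first grid value (0) no matter what L2_loss contains, and k_GS_optimal is 0 whenever minidx is 20 or less. Increasing ndatapoints does not change this; the loss needs one scalar value per grid point.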
  3 Comments
Jonathan Frutschy on 25 Apr 2024
Edited: Jonathan Frutschy on 25 Apr 2024
@Torsten Based on the example I grabbed this from (see below), L2_loss, or my fitting_function, should be a 20 by 20 by 20 by 20 matrix, not a vector. My confusion with defining fitting_function as L2_loss is that I don't have an exact function that relates L2_loss to my hyperparameters like in the example. In other words, I don't know the exact form of L2_loss(m) = fitting_function(m_GS, k_GS, c_GS, gamma_GS). Furthermore, I can only obtain an L2_loss(m) vector for one value of each hyperparameter at a time in my code. That is, I specify M, choose one value from the linspace distribution for each hyperparameter, then run my code with the given M and 4 hyperparameter values. This returns a 1 by M L2_loss(m) vector for each set of 4 hyperparameter values. I guess I would then do this for each hyperparameter set until I have covered all possible combinations, where the total number of combinations is 20^4? That would yield 160,000 1 by M L2_loss(m) vectors, which I would then have to figure out how to distribute as a 20 by 20 by 20 by 20 double matrix and assign as my fitting_function? Or should my fitting function be a 1 by 160,000 double vector as you suggested? In that case, what happens to the M dimension? Do I simply take a time average of each L2_loss(m) vector for each run of my code to reduce L2_loss(m) from a 1 by M vector to a 1 by 1 value?
firstparam = [1, 2, 3.3, 3.7, 8, 21]; %list of places to search for first parameter
secondparam = linspace(0,1,20); %list of places to search for second parameter
[F,S] = ndgrid(firstparam, secondparam);
fitting_function = @(p1, p2) p1^2 + p2^2;
fitresult = arrayfun(fitting_function, F, S) %run a fitting on every pair fittingfunction(F(J,K), S(J,K))
fitresult = 6x20
1.0000 1.0028 1.0111 1.0249 1.0443 1.0693 1.0997 1.1357 1.1773 1.2244 1.2770 1.3352 1.3989 1.4681 1.5429 1.6233 1.7091 1.8006 1.8975 2.0000
4.0000 4.0028 4.0111 4.0249 4.0443 4.0693 4.0997 4.1357 4.1773 4.2244 4.2770 4.3352 4.3989 4.4681 4.5429 4.6233 4.7091 4.8006 4.8975 5.0000
10.8900 10.8928 10.9011 10.9149 10.9343 10.9593 10.9897 11.0257 11.0673 11.1144 11.1670 11.2252 11.2889 11.3581 11.4329 11.5133 11.5991 11.6906 11.7875 11.8900
13.6900 13.6928 13.7011 13.7149 13.7343 13.7593 13.7897 13.8257 13.8673 13.9144 13.9670 14.0252 14.0889 14.1581 14.2329 14.3133 14.3991 14.4906 14.5875 14.6900
64.0000 64.0028 64.0111 64.0249 64.0443 64.0693 64.0997 64.1357 64.1773 64.2244 64.2770 64.3352 64.3989 64.4681 64.5429 64.6233 64.7091 64.8006 64.8975 65.0000
441.0000 441.0028 441.0111 441.0249 441.0443 441.0693 441.0997 441.1357 441.1773 441.2244 441.2770 441.3352 441.3989 441.4681 441.5429 441.6233 441.7091 441.8006 441.8975 442.0000
[minval, minidx] = min(fitresult);
bestFirst = F(minidx)
bestFirst = 1x20
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
bestSecond = S(minidx)
bestSecond = 1x20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
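Note that min applied to a matrix works column by column, which is why bestFirst and bestSecond in this example come out as 1 by 20 rather than as single values. A minimal sketch of picking the one best pair instead, reusing the same grid and taking the minimum over every element through a linear index (the vectorized F.^2 + S.^2 is a substitution made here for brevity; it produces the same values as the arrayfun call):

```matlab
firstparam = [1, 2, 3.3, 3.7, 8, 21];   % places to search for first parameter
secondparam = linspace(0,1,20);         % places to search for second parameter
[F,S] = ndgrid(firstparam, secondparam);
fitresult = F.^2 + S.^2;                % same values as the arrayfun version
[minval, minidx] = min(fitresult(:));   % minimum over all elements; minidx is a linear index
bestFirst = F(minidx);                  % scalar: first parameter at the minimum
bestSecond = S(minidx);                 % scalar: second parameter at the minimum
```

Because F, S, and fitresult all have the same size, the same linear index addresses the matching entry in each of them.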
Jonathan Frutschy on 25 Apr 2024
@Torsten You are right. I simply needed to reduce the M dimension by taking the time average of each combination to get a 20^4 sized vector, then take the minimum of this and use the index of this minimum to extract the optimal hyperparameters.
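The approach described in this comment can be sketched as follows. This is only an illustration under stated assumptions: run_esn is a hypothetical stand-in for the actual ESN simulation (here a made-up quadratic so the script runs on its own), meanLoss is a name introduced for the sketch, and ndatapoints is reduced from 20 to 5 to keep the loop short.

```matlab
% Grid search sketch: one scalar loss per hyperparameter combination.
% run_esn is a HYPOTHETICAL placeholder for the real ESN run; it only has
% to return a 1-by-M loss vector for one (m, k, c, gamma) set.
M = 101;
run_esn = @(m,k,c,g) ((m-0.5)^2 + (k-50)^2 + (c-25)^2 + (g-150)^2) * ones(1,M);

ndatapoints = 5;                              % 20 in the question (20^4 = 160,000 runs)
m_GS = linspace(0,1,ndatapoints);             % [kg]
k_GS = linspace(0,100,ndatapoints);           % [N/m]
c_GS = linspace(0,100,ndatapoints);           % [kg/s]
gamma_GS = linspace(100,200,ndatapoints);     % [N/m^3]
[m_G,k_G,c_G,gamma_G] = ndgrid(m_GS,k_GS,c_GS,gamma_GS);

meanLoss = zeros(size(m_G));                  % same shape as the grids
for idx = 1:numel(m_G)                        % visit every combination once
    L2_loss = run_esn(m_G(idx), k_G(idx), c_G(idx), gamma_G(idx));
    meanLoss(idx) = mean(L2_loss);            % time average collapses the M dimension
end

[minval, minidx] = min(meanLoss(:));          % linear index of the best combination
m_GS_optimal = m_G(minidx);
k_GS_optimal = k_G(minidx);
c_GS_optimal = c_G(minidx);
gamma_GS_optimal = gamma_G(minidx);
```

Storing the averaged loss in an array the same size as the ndgrid outputs means no separate reshape step is needed: the linear index of the minimum addresses the matching hyperparameter values directly. Whether the time average is the right reduction (rather than, say, the final or maximum loss) depends on what the ESN is meant to achieve.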


Answers (0)

Release

R2024a