Simplification of creating new variables in a loop from preset array names

I am working on a sensitivity analysis in MATLAB and have to calculate the sensitivity for every input parameter I have. Currently there are 10+ parameters, so hardcoding every parameter by hand becomes a waste of time. However, all the calculations are exactly the same and I just need to give a different name to every variable.
Is there any solution to give the names within the for loop so it will automatically assign a new name on each iteration?
Here is the current state of the code:
%-------------------------------------------------------------------------
% Evaluate the first-order sensitivity r_Percentage_Internal_Wall
% sort according to Percentage_Internal_Wall values:
[r_Percentage_Internal_Wall_sort,I_r_Percentage_Internal_Wall] = sort(r_Percentage_Internal_Wall);
% evaluate the mean values of blocks of 100 samples:
r_Percentage_Internal_Wall_blocks = mean(reshape(r_Percentage_Internal_Wall_sort,n_MCS/100,100),1);
r_grey_energy_gwp_blocks_r_Percentage_Internal_Wall = mean(reshape(r_grey_energy_gwp(I_r_Percentage_Internal_Wall),n_MCS/100,100),1);
% these are the means of blocks of 100 samples of r_grey_energy_gwp sorted according to r_Percentage_Internal_Wall values
% Estimate the first-order sensitivity values:
% Form is: VarY_X1=var(Y_blocks_X1); % the variance of E[Y|X1]
Var_r_grey_energy_gwp_r_Percentage_Internal_Wall = var(r_grey_energy_gwp_blocks_r_Percentage_Internal_Wall);
% First-order sensitivity indexes:
S_r_Percentage_Internal_Wall = Var_r_grey_energy_gwp_r_Percentage_Internal_Wall/r_grey_energy_gwp_var %#ok<NOPTS>
%-------------------------------------------------------------------------
% Evaluate the first-order sensitivity r_Reinforcement
% sort according to Reinforcement values:
[r_Reinforcement_sort,I_r_Reinforcement] = sort(r_Reinforcement);
% evaluate the mean values of blocks of 100 samples:
r_Reinforcement_blocks = mean(reshape(r_Reinforcement_sort,n_MCS/100,100),1);
r_grey_energy_gwp_blocks_r_Reinforcement = mean(reshape(r_grey_energy_gwp(I_r_Reinforcement),n_MCS/100,100),1);
% these are the means of blocks of 100 samples of r_grey_energy_gwp sorted according to r_Reinforcement values
% Estimate the first-order sensitivity values:
% Form is: VarY_X1=var(Y_blocks_X1); % the variance of E[Y|X1]
Var_r_grey_energy_gwp_r_Reinforcement = var(r_grey_energy_gwp_blocks_r_Reinforcement);
% First-order sensitivity indexes:
S_r_Reinforcement = Var_r_grey_energy_gwp_r_Reinforcement/r_grey_energy_gwp_var %#ok<NOPTS>
I would like to do this in a for loop that automatically creates new variables, takes each name from array = ["Reinforcement","Percentage_Internal_Wall", ... ], and performs all the calculations shown above.
I also have Python code that does much the same thing. Here it is:
for indi in indicators:
    result_df['grey_energy_' + indi] = interim_df['grey_energy_insul_' + indi] + interim_df['grey_energy_const_' + indi] + \
        interim_df['grey_energy_window_' + indi] + interim_df['grey_energy_internal_' + indi]
    result_df['grey_energy_D_' + indi] = interim_df['grey_energy_insul_D_' + indi] + interim_df['grey_energy_const_D_' + indi] + \
        interim_df['grey_energy_window_D_' + indi] + interim_df['grey_energy_internal_D_' + indi]
The structure I would like to have:
Indicators = {'r_Length' 'r_Height' and etc.}
for i = 1: indicators
% Evaluate the first-order sensitivity r_Reinforcement
% sort according to Reinforcement values:
[(Indicators{i})_sort,I_(Indicators{i})] = sort((Indicators{i}));
% evaluate the mean values of blocks of 100 samples:
(Indicators{i})_blocks = mean(reshape((Indicators{i})_sort,n_MCS/100,100),1);
r_grey_energy_gwp_blocks_(Indicators{i}) = mean(reshape(r_grey_energy_gwp(I_(Indicators{i})),n_MCS/100,100),1);
% these are the means of blocks of 100 samples of r_grey_energy_gwp sorted according to r_Const_Thickness values
% Estimate the first-order sensitivity values:
% Form is: VarY_X1=var(Y_blocks_X1); % the variance of E[Y|X1]
Var_r_grey_energy_gwp_(Indicators{i}) = var(r_grey_energy_gwp_blocks_(Indicators{i}));
% First-order sensitivity indexes:
S_(Indicators{i}) = Var_r_grey_energy_gwp_(Indicators{i})/r_grey_energy_gwp_var %#ok<NOPTS>
end

 Accepted Answer

Embedding metadata in variable names is a sure way of creating complex and buggy code. See Tutorial: why variables should not be named dynamically.
I don't know much about Python, but it looks like your Python code only uses two variables, result_df and interim_df. That's what you should do in MATLAB as well. The two variables appear to be dictionaries. The equivalent in MATLAB is containers.Map.
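% assuming interim_df is a containers.Map that is already filled
% and indicators is a row string vector
result_df = containers.Map;
for indi = indicators
    % same sum as in the Python loop: insulation + construction + window + internal
    result_df("grey_energy_" + indi) = interim_df("grey_energy_insul_" + indi) + ...
        interim_df("grey_energy_const_" + indi) + ...
        interim_df("grey_energy_window_" + indi) + ...
        interim_df("grey_energy_internal_" + indi);
end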
However, you may want to rethink the storage altogether and store everything in a single table which would avoid all this variable name manipulation.
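As a rough sketch of the single-table idea, assuming the input samples are stored as columns of one table: the column names, the toy output Y, and the sample size below are placeholders for illustration, not the original data. The table is never reordered; only the index vector returned by sort is applied to the output.

```matlab
rng(322, 'twister');                 % fixed seed so the run is repeatable
n_MCS   = 1e4;                       % assumed sample size (placeholder)
nBlocks = 100;                       % number of blocks, as in the question

% One table holds every input sample; the column names carry the labels.
T = table(randn(n_MCS,1), randn(n_MCS,1), 'VariableNames', {'Length','Height'});
Y = 3*T.Length + 0.1*T.Height;       % toy output standing in for r_grey_energy_gwp

inputs = T.Properties.VariableNames;
S = zeros(1, numel(inputs));         % first-order indices, one per input
for k = 1:numel(inputs)
    [~, idx] = sort(T.(inputs{k}));  % sort order of this input only
    Yblocks  = mean(reshape(Y(idx), n_MCS/nBlocks, nBlocks), 1);
    S(k)     = var(Yblocks) / var(Y);            % Var(E[Y|X_k]) / Var(Y)
end
result = table(inputs(:), S(:), 'VariableNames', {'Input','S'});
disp(result)
```

Adding another parameter then only means adding another column to T; the loop body stays unchanged.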

5 Comments

The reason I am not working in a single table is that I sort the data by every parameter and then reshape it into blocks for every variable. Won't constantly sorting all the variables be a problem with one table?
Just to give a better explanation of what is happening in the program:
1) I generate 10000 random values for each input parameter with a normal distribution generator. The inputs are the mean and standard deviation from the input table (as an example):
%Sensitivity analysis part
n_MCS=1e5;
%Means
m_Length = ParamFile{contains(ParamFile.Var1,'Length'),'Var4'};
%Standard deviations
s_Length = ParamFile{contains(ParamFile.Var1,'Length'),'Var5'};
%Random values with ERADist MCS
seed = 322; rng(seed,'twister');
r_Length = random(ERADist('normal','MOM',[m_Length,s_Length]),n_MCS,1);
2) Then I calculate the building parameters from the generated random variables. The calculation is done for each row of random values, so we obtain 10000 random outputs (as an example) and get the distribution of output values for the subsequent sensitivity analysis:
%Const material volume
r_const_material_volume = (r_no_of_floors+1).*r_Const_Thickness.*r_Width.*r_Length + r_Height.*r_Const_Thickness.*r_OpaqueWallSurfaceArea-r_steel_mass/7850;
%GWP calculation from random variables
r_grey_energy_const_gwp = r_const_material_volume .* sum(table2array(...
LCA_Database(contains(LCA_Database.Name,'Concrete C 20/25')&~contains(LCA_Database.lca_phase,'D'),'gwp')))...
+ r_steel_mass .* sum(table2array(...
LCA_Database(contains(LCA_Database.Name,'Construction Steel')&~contains(LCA_Database.lca_phase,'D'),'gwp')));
r_grey_energy_gwp = r_grey_energy_const_gwp + r_grey_energy_insul_gwp + r_grey_energy_window_gwp + r_grey_energy_internal_gwp;
%GWP distribution parameters
r_grey_energy_gwp_mean = mean(r_grey_energy_gwp) %#ok<NOPTS>
r_grey_energy_gwp_var = var(r_grey_energy_gwp) %#ok<NOPTS>
r_grey_energy_gwp_std = std(r_grey_energy_gwp) %#ok<NOPTS>
3) Then I evaluate the importance of each input parameter from step 1) for the total result from step 2):
%-------------------------------------------------------------------------
% Evaluate the first-order sensitivity Length
% sort according to r_Length values:
[r_Length_sort,I_r_Length] = sort(r_Length);
% evaluate the mean values of blocks of 100 samples:
r_Length_blocks = mean(reshape(r_Length_sort,n_MCS/100,100),1);
r_grey_energy_gwp_blocks_r_Length = mean(reshape(r_grey_energy_gwp(I_r_Length),n_MCS/100,100),1);
% these are the means of blocks of 100 samples of r_grey_energy_gwp sorted according to r_Length values
% Estimate the first-order sensitivity values:
% Form is: VarY_X1=var(Y_blocks_X1); % the variance of E[Y|X1]
Var_r_grey_energy_gwp_r_Length=var(r_grey_energy_gwp_blocks_r_Length);
% First-order sensitivity indices:
S_r_Length = Var_r_grey_energy_gwp_r_Length/r_grey_energy_gwp_var %#ok<NOPTS>
Part 3 is my final output. The problem is that right now I have to do this for 10+ input parameters, so part 3) is copied and pasted 10+ times and I manually change, for example, r_Length to r_Height.
So now I am looking for a way to simplify these calculations: loop over all the parameters and automatically generate new variables from the array of names I already have. This is exactly what Python indi in indicators is doing. But when the new variables are created they have to be filled with data as well. All the data for the calculation is available before the program runs.
Because of that I do not need mapping, I need the creation of new variables and computing them.
"This is exactly what Python indi in indicators is doing"
No, it is not. As Guillaume already correctly explained, your Python code does NOT create new variables. In fact your example Python code accesses only two variables (named result_df and interim_df), which are dictionaries (judging by the syntax that you are using). Your Python code actually accesses/creates new items in those dictionaries. Your Python code has nothing to do with the creation of new variables.
"Because of that I do not need mapping"
Ironically that is exactly what Python dictionaries are (i.e. the code that you have shown us), which is made quite clear in the Python documentation, where dict is described as a mapping type.
"...I need the creation of new variables and computing them."
I very much doubt that. Dictionaries in Python provide very efficient data access: you can get much the same behavior using a containers.Map object (as Guillaume wrote) or a simple structure object: https://www.mathworks.com/help/matlab/structures.html
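To illustrate the dictionary-like access mentioned above, here is a small sketch with a scalar structure and a containers.Map. The indicator names and the numbers are invented placeholders, and only two of the four terms from the Python loop are summed to keep it short.

```matlab
indicators = {'gwp', 'penrt'};                  % hypothetical indicator names

% Scalar structure: dynamic field names replace dynamically named variables.
interim = struct();
interim.grey_energy_insul_gwp   = [1; 2; 3];    % placeholder data
interim.grey_energy_const_gwp   = [10; 20; 30];
interim.grey_energy_insul_penrt = [4; 5; 6];
interim.grey_energy_const_penrt = [40; 50; 60];

result = struct();
for k = 1:numel(indicators)
    indi = indicators{k};
    result.(['grey_energy_' indi]) = interim.(['grey_energy_insul_' indi]) + ...
        interim.(['grey_energy_const_' indi]);
end

% The same kind of lookup with a containers.Map keyed by character vectors.
m = containers.Map;
m('grey_energy_gwp') = result.grey_energy_gwp;
total_gwp = sum(m('grey_energy_gwp'));
```

Both containers keep all the data inside one or two variables, so nothing has to be created in the workspace by name.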
If you were interested in writing efficient code then the best choice in MATLAB is to avoid forcing meta-data into fieldnames and use indexing: this can of course be trivially achieved using any array type (numeric, cell, structure, table, etc.).
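The indexing approach could look roughly like the sketch below, which keeps the names as data in a cell array and the samples as columns of one numeric matrix. The toy model for Y and the sample size are assumptions made only for this example. Because Y(idx) returns one reordered copy of Y per input, the block means for all inputs can be computed without a loop.

```matlab
rng(322, 'twister');
n_MCS   = 1e4;                               % assumed sample size (placeholder)
nBlocks = 100;
names = {'r_Length', 'r_Height', 'r_Width'}; % labels are data, not variable names
X = randn(n_MCS, numel(names));              % column k holds the samples of names{k}
Y = 2*X(:,1) + X(:,2);                       % toy output; r_Width has no influence

[~, idx]   = sort(X, 1);                     % idx(:,k) is the sort order of column k
Ysorted    = Y(idx);                         % n_MCS-by-p: column k is Y ordered by input k
blocks     = reshape(Ysorted, n_MCS/nBlocks, nBlocks, []);  % samples x blocks x inputs
blockMeans = mean(blocks, 1);                % 1 x nBlocks x p
S = squeeze(var(blockMeans, 0, 2)).' / var(Y);   % 1-by-p first-order indices

for k = 1:numel(names)
    fprintf('S_%s = %.3f\n', names{k}, S(k));
end
```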
An internet search will also give plenty of explanations that dynamically accessing variables in Python is very inefficient and is not recommended (just like in MATLAB).
Thank you for the explanation!
Now I see the point.
I will take a second look at mapping and will try to access the already existing variables and add the results of the computation to the same table.
Though, my second question is still open. Is it possible to use the sorting I need for the sensitivity computation on the one big 10000-sample table I have? Or it is better to separate the samples by name and do the sorting in different tables and then compute it there?
"Or it is better to separate the samples by name and do the sorting in different tables and then compute it there?"
If I were given a task like this I would use a non-scalar structure to store the data, with fields for the data, names, etc. It is trivial to loop over a non-scalar structure and apply whatever operations you want to the fields of each structure element.
Something like this (pseudocode):
S(1).name = 'r_Length';
S(1).data = [...]
S(2).name = 'r_Height';
S(2).data = [...]
...
names = {S.name}
for k = 1:numel(S)
    tmp = mean(S(k).data)
    val = somefunction(tmp)
    ... sorting, whatever else you want to do
    S(k).out = ...
end
Note that this code uses simple and very efficient indexing.
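A runnable version of that idea might look like the following sketch. The random data, the toy output Y, and the field names xBlocks, yBlocks, and sens are assumptions introduced for illustration; the block computation itself follows part 3) of the question.

```matlab
rng(322, 'twister');
n_MCS   = 1e4;                       % assumed sample size (placeholder)
nBlocks = 100;

% 1-by-2 structure array: one element per input parameter.
S = struct('name', {'r_Length', 'r_Height'}, 'data', []);
for k = 1:numel(S)
    S(k).data = randn(n_MCS, 1);     % placeholder samples
end

Y    = 3*S(1).data + S(2).data;      % toy output standing in for r_grey_energy_gwp
Yvar = var(Y);

for k = 1:numel(S)
    [xs, idx]    = sort(S(k).data);  % sort by this parameter
    S(k).xBlocks = mean(reshape(xs,     n_MCS/nBlocks, nBlocks), 1);
    S(k).yBlocks = mean(reshape(Y(idx), n_MCS/nBlocks, nBlocks), 1);
    S(k).sens    = var(S(k).yBlocks) / Yvar;     % first-order sensitivity index
    fprintf('S_%s = %.3f\n', S(k).name, S(k).sens);
end
```

Every intermediate result stays attached to its parameter, so S(k).name can label plots or printed output without any name manipulation.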
Tables also support applying functions to groups of rows, so you can probably do something equivalent with tables.
"Is there any solution to give the names within the for loop so it will automatically assign a new name on each iteration?"
It is certainly possible, but you would be forcing yourself into writing very inefficient, complex, buggy code that would be hard to debug. You can easily avoid this using a structure array, or a cell array, or a table, etc.
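One possible way to do the grouped version is sketched here with rowfun and a grouping variable. The long table layout (one row per parameter and sample), the column names, and the toy output are assumptions for this example only.

```matlab
rng(322, 'twister');
n_MCS   = 1e4;                       % assumed sample size (placeholder)
nBlocks = 100;
xL = randn(n_MCS, 1);                % placeholder samples for r_Length
xH = randn(n_MCS, 1);                % placeholder samples for r_Height
Y  = 3*xL + xH;                      % toy output

% Long format: one row per (parameter, sample) pair, output repeated per group.
T = table([repmat({'r_Length'}, n_MCS, 1); repmat({'r_Height'}, n_MCS, 1)], ...
          [xL; xH], [Y; Y], 'VariableNames', {'Name', 'Value', 'Y'});
T.Name = categorical(T.Name);

% Sort the (x, y) pairs by x, then take the variance of the block means of y.
blockVar = @(xy) var(mean(reshape(xy(:,2), [], nBlocks), 1));
sens     = @(x, y) blockVar(sortrows([x y], 1)) / var(y);

R = rowfun(sens, T, 'InputVariables', {'Value', 'Y'}, ...
           'GroupingVariables', 'Name', 'OutputVariableNames', 'S');
disp(R)
```

The result R has one row per parameter name with its group size and sensitivity index.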
Thank you a lot for the help!
I will definitely restructure my code this way now!
It seems way cleaner and more efficient.

Release

R2017b