Specifying a cost function to optimize a simbiology model response to plausible ranges

1 view (last 30 days)
I am trying to generate a virual population simulation for my model using a uniform distribution for some parameter set from my simbiology model. Then I specify some if statements to reduce the simdata to fall within biological bounds and perform some calculations such as: mean, median, max, and min. Next, I would like to speicfy the cost funciton to optimize my model response with respect to the chosen parameter set. How can I correctly specify the cost function in terms of the simulaiton data? The cost functions is given as: and ; where l and u are lower and upper bounds for plausible ranges of model states. I've provided the simdataReduced and timeVector to help with solving the problem. How can I specify Mi(p) in this case?
%% New script to perform analysis on the simdataReduced
stopTime = simdataReduced(1).Time(end);
timeVector = linspace(0,stopTime,700);
simdataResampled = resample(simdataReduced, timeVector);
% "Stack" the matrices of simulation results and average them
% Calculate the 25th and 75th percentiles
stackedData = cat(3, simdataResampled.Data);
meanData = mean(stackedData, 3);
maxData = max(stackedData, [],3);
minData = min(stackedData, [],3);
medianData= median(stackedData, 3);
prc75= prctile(stackedData, 75, 3);
prc25= prctile(stackedData, 25, 3);
%%--------------- function script---------------------------------------------------
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples (length of simdataResampled)
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
prediction = X.*theta;
%sqrError = (prediction - y).^2;
c= 0.5*(lb+up)
sqrError= (((prediction- c).^2) - (ub-c).^2);
J = sum(max(sqrError), 0);
% =========================================================================
end
  3 Comments
Fearass Zieneddin
Fearass Zieneddin on 27 May 2022
Suppose that in a real population of interest (which could be ‘all-comers’, a clinical trial population or subsets of these) that these measures, or a subset of them, have a known distribution , that is .~D
We may write as the set of model parameters p ={k1, .., kn} . Denote the parameter set p as a “plausible patient” if
where and is the physiologically plausible range for measurement . So M(p) is the ith state variable or model outcome. Hope this makes it clearer
Arthur Goldsipe
Arthur Goldsipe on 27 May 2022
I think I'm still missing something. It sounds to me like M sub i of P is some particular simulation result (a particular simulated state value at a particular time) for a parameter set P. But I think you've already figured out how to do that by looking at simdataResampled.Data. More specifically, it looks like simdataResampled(p).Data(t,i) would be timepoint t for state i and parameter sample p. But if that's not what you intend, I think it might be easiest to talk through this "live." If you want to do that, please contact me directly on MATLAB Answers, or reach out to Technical Support.

Sign in to comment.

Accepted Answer

Arthur Goldsipe
Arthur Goldsipe on 1 Jun 2022
Fearass and I had a chat about this. I figured out that the cost function is essentially a modification of some of squares of residuals. The are reduced so that they are zero when the simulated values are within the plausible range.
With that information, I realized that M sub i of P represents the simulation results of a particular state at a particular time and for a particular parameter set. In code, this would be simdataResampled(p).Data(t,i) for timepoint t, state i, and parameter sample p. The code to calculate the cost function could take advantage of "implicit expansion", and using the variables defined in the example code it might be written something like this: sum(max(0,(stackedData-c).^2-(u-c).^2),'all')
  6 Comments
Fearass Zieneddin
Fearass Zieneddin on 6 Jun 2022
Edited: Fearass Zieneddin on 6 Jun 2022
I cannot seem to find a way how to map back to the target parameter index becuase I have already reduced down the number of outcomes of intrest. Here are more lines of code for your reference
simfunc = createSimFunction(m4, paramNames, outputNames, 'drug_Ctr', varObj(3));
paramValues = simfunc.Parameters.Value;
samples = SimBiology.Scenarios();
n= 1000; %number of random parameter sets to be generated
for i = 1:numel(paramNames)
probdist = makedist("Uniform", "lower", 0.5*paramValues(i), "upper", 2*paramValues(i));
add(samples, "elementwise", paramNames(i), probdist, "Number", n);
end
simdata = simfunc(samples, 30);
Here we set up our simulation bounds based on information found in the literature:
%% 2. Set up plausible bounds for biological outcomes
% reduce the population size based on constraints
j = 0;% initialize the temporary index
for i = 1:length(simdata) % Looping over total number of subjects (n in the present case)
[~,x]= max(simdata(i, 1).Data(:,7)); % concentrations
[~,z]= max(simdata(i, 1).Data(:,4)); %blood levels for some biological outcome
idx= find(simdata(i,1).Time >=15 & simdata(i,1).Time <=20)
%Define the index for recovery after 15-20 days
% recover at least %90 after 15-20 days we REJECT the parameter sets
if simdata(i,1).Data(end,1) >= 0.90 && simdata(i, 1).Data(x, 7)<= 15100 ...
&& simdata(i, 1).Data(x, 7)>=1500 ...
&& simdata(i,1).Data(z,4)>=232 && simdata(i,1).Data(z,4)<= 341 ...
&& simdata(i,1).Data(idx(1),1) >= 0.90
j = j+1;% Index increment whenever the above condition is satisfied
simdataReduced(j,1) = simdata(i,1);% Mapping data
%with new constraint into a new simdata object
end
end
j % Number of subjects which satisfies the constraint
And finally we perform the statistcal calculation using these lines of code
stopTime = simdataReduced(1).Time(end);
timeVector = linspace(0,stopTime,700);
simdataResampled = resample(simdataReduced, timeVector);
% "Stack" the matrices of simulation results and average them
% Calculate the 25th and 75th percentiles
stackedData = cat(3, simdataResampled.Data);
meanData = mean(stackedData, 3);
maxData = max(stackedData, [],3);
minData = min(stackedData, [],3);
medianData= median(stackedData, 3);
prc75= prctile(stackedData, 75, 3);
prc25= prctile(stackedData, 25, 3);
Arthur Goldsipe
Arthur Goldsipe on 7 Jun 2022
I can think of several ways to do this. The key thing is that you need to do something to keep track of the parameter mapping when you filter down the SimData. Let me first offer a simple change but then suggest a more "MATLAB-y" way to approach this that might help you long term.
The simplest thing you could do would be to store additional information when you add an element to simdataResampled. You create a variable that helps you map back to the sample number (for example, sampleIndex(j) = i;). You could also store the parameter values themselves in the UserData property of the SimData. For example, if you use sampleValues from my previous comment, you could write simdataResampled(j).UserData = sampleValues(i,:).
But a more MATLAB way to approach this is to vectorize as much of your code as possible, and to avoid "growing" vectors one element at a time the way you currently create simdataReduced. This will have the added benefit. Specifically, I would create a logical vector isPlausible to indicate whether each of your original samples is plausible. I would then use that vector (after your for loop) to create simdataReduced and optionally to identify the parameter values associated with the reduced sample set. Specifically:
%% 2. Set up plausible bounds for biological outcomes
% reduce the population size based on constraints
n = length(simdata);
isPlausible = false(n,1);
for i = 1:length(simdata) % Looping over total number of subjects (n in the present case)
[~,x]= max(simdata(i, 1).Data(:,7)); % concentrations
[~,z]= max(simdata(i, 1).Data(:,4)); %blood levels for some biological outcome
idx= find(simdata(i,1).Time >=15 & simdata(i,1).Time <=20)
%Define the index for recovery after 15-20 days
% recover at least %90 after 15-20 days we REJECT the parameter sets
isPlausible(i) = simdata(i,1).Data(end,1) >= 0.90 && simdata(i, 1).Data(x, 7)<= 15100 ...
&& simdata(i, 1).Data(x, 7)>=1500 ...
&& simdata(i,1).Data(z,4)>=232 && simdata(i,1).Data(z,4)<= 341 ...
&& simdata(i,1).Data(idx(1),1) >= 0.90;
end
sampleValues = generate(samples);
simdataReduced = simdata(isPlausible);
sampleValuesReduced = sampleValues(isPlausible,:);

Sign in to comment.

More Answers (0)

Communities

More Answers in the  SimBiology Community

Categories

Find more on Perform Sensitivity Analysis in Help Center and File Exchange

Products


Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!