We are often asked to help with parameter estimation problems. This discussion aims to provide guidance for parameter estimation, as well as troubleshooting some of the more common failure modes. We welcome your thoughts and advice in the comments below.

Guidance and Best Practices:

1. Make sure your data is formatted correctly. Your data should have:

  • a time column (defined as independent variable) that is monotonically increasing within every grouping variable,
  • one or more concentration columns (dependent variable),
  • one or more dose columns (with associated rate, if applicable) if you want your model to be perturbed by doses,
  • optionally, a column with a grouping variable.

Note: the dose column should only have entries at time points where a dose is administered. At time points where the dose is not administered, there should be no entry. When importing your data, MATLAB/SimBiology will replace empty cells with NaNs. Similarly, the concentration column should only have entries where measurements have been acquired and should be left empty otherwise.

If you import your data into MATLAB first, you can manipulate it into the right format using the 'table' data type and its methods, such as sortrows, join, innerjoin, outerjoin, stack and unstack. You can then add the data to SimBiology by using the 'Import data from MATLAB workspace' functionality.
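As a minimal sketch of the layout described above, the following builds a small table in plain MATLAB, sorts it so that time increases within each group, and checks the result. The column names (ID, Time, Conc, Dose) and all values are made up for illustration.

```matlab
% Hypothetical example data: two subjects, one dose each at t = 0.
ID   = [2; 1; 1; 2; 1; 2];
Time = [4; 0; 1; 0; 4; 1];
Conc = [0.8; NaN; 2.1; NaN; 0.9; 1.9];   % NaN where nothing was measured
Dose = [NaN; 100; NaN; 100; NaN; NaN];   % NaN where no dose was given
data = table(ID, Time, Conc, Dose);

% Sort so that time increases monotonically within each group.
data = sortrows(data, {'ID', 'Time'});

% Check that time is strictly increasing within every group.
isSorted = splitapply(@(t) all(diff(t) > 0), data.Time, findgroups(data.ID));
disp(data)
```

Running a check like this before import catches unsorted or duplicated time points early, when they are still easy to fix in the table.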

2. Visually inspect data and model response. Create a simulation task in the SimBiology desktop where you plot your data (plot external data) together with your model response. You can create sliders for the parameters you are trying to estimate (or use group simulation). This shows whether varying these parameter values can bring the model response in line with your data, and it gives you good initial estimates for those parameters at the same time. The plot can also indicate whether units might cause a discrepancy between your simulations and data, and whether doses administered to the model are configured correctly and result in a model response.

3. Determine sensitivity of your model response to model parameters. The previous step can be considered a manual sensitivity analysis. There is also a more systematic way of performing such an analysis: a global or local sensitivity analysis can be used to determine how sensitive your responses are to the parameters you are trying to estimate. If a model is not sensitive to a parameter, the parameter's value may change significantly without leading to a significant change in the model response. As a result, the value of the objective function is not sensitive to changes in that parameter value, which makes it hard to estimate the parameter effectively.
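To illustrate what a local sensitivity analysis computes, here is a minimal sketch in plain MATLAB. It uses a hypothetical one-compartment IV bolus model with an analytic solution and central finite differences. The model, the nominal values and the sampling times are assumptions for illustration, and this is not the SimBiology sensitivity analysis itself.

```matlab
% Hypothetical one-compartment IV bolus model: C(t) = Dose/V * exp(-Cl/V * t)
conc = @(p, t) (100 ./ p(2)) .* exp(-(p(1) ./ p(2)) .* t);   % p = [Cl, V], Dose = 100

p0 = [2, 10];            % assumed nominal values: Cl = 2, V = 10
t  = [0.5 1 2 4 8];      % assumed sampling times
relStep = 1e-4;          % relative perturbation for the central difference

S = zeros(numel(t), numel(p0));   % normalized sensitivities, one column per parameter
for k = 1:numel(p0)
    dp = zeros(size(p0));
    dp(k) = relStep * p0(k);
    dCdp = (conc(p0 + dp, t) - conc(p0 - dp, t)) / (2 * dp(k));
    S(:, k) = dCdp(:) .* p0(k) ./ conc(p0, t).';   % (dC/dp) * p / C
end
disp(array2table(S, 'VariableNames', {'Cl', 'V'}))
```

Entries close to zero mark time points where the response barely depends on that parameter, so data at those times contribute little to its estimate.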

4. Choose an optimization algorithm. SimBiology supports a range of optimization algorithms, depending on the toolboxes you have installed. As a default, we would recommend using lsqnonlin if you have access to the Optimization Toolbox. See the troubleshooting section below for more considerations when choosing an appropriate optimization algorithm.

5. Map your data to your model components: Make sure the columns for your dependent variable(s) and dose(s) are mapped to the corresponding component(s) in your model.

6. Start small: bring the estimation task down to the smallest meaningful objective. If you want to estimate 10 parameters, try to start with estimating one or two instead. This will make troubleshooting easier. Once your estimation is set up properly with a few parameters, you can increase the number of parameters.

Troubleshooting

1. Are you trying to estimate a parameter that is governed by a rule? You can’t estimate parameters that are the subject of a rule (initial/repeated assignment, algebraic rule, rate rule), as the rule would supersede the value of the parameter you are trying to estimate. See this topic.

2. Is the optimization using the correct initial conditions and parameter values? Check whether the parameter values and initial conditions that the fit task uses for the model make sense. You can do this by passing the relevant dose(s) and variant(s) to the getequations function. In the SimBiology App, you can look at the equation view (with your model open, in the Model tab, click Open -> Equations). Then, still in the Model tab, click "Show Tasks", select your fit task and inspect the initial conditions for your parameters and species. A typical example is dosing a species while ka (the absorption rate) is set to zero. In that case, your dose will not transfer into the model and you will not see a model response.

3. Are your units consistent between your data and your model? You can use unit conversion to automatically achieve this.

4. Have you checked your solver tolerances? The absolute and relative tolerance of your solver determine how accurate your model simulation is. If a state in your model is on the order of 1e-9 but your tolerances only allow you to calculate this state with an accuracy down to 1e-8, your state will practically represent a random error around 1e-8. This is especially relevant if your data is of a lower order of magnitude than your solver tolerances. In that case, your objective function will only pick up the solver error rather than the true model response, and the optimization will not be able to estimate parameters effectively. If you plot your data and model response together using a log scale on the y-axis (right-click on your Live plot, select Properties, select Axes Properties, select Log scale under “Y-axis”), you can see whether your ODE solver tolerances are sufficiently small to accurately compute model responses at the order of magnitude of your data. A giveaway that this is not the case is a model response that appears to vary randomly as it bottoms out around the absolute solver tolerance.

_Tolerances are too loose to simulate at the order of magnitude of the data. Absolute Tolerance: 0.001, Relative Tolerance: 0.01_

_Sufficiently tight tolerances. Absolute Tolerance: 1e-8, Relative Tolerance: 1e-5_
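The effect can be reproduced outside SimBiology with a small sketch in plain MATLAB. It assumes a hypothetical first-order decay and uses ode45, which is not necessarily the solver your SimBiology model uses. The looser option set matches the first pair of tolerance values quoted above, and the tighter one is chosen here so that the absolute tolerance lies well below the final state.

```matlab
% Hypothetical first-order decay dy/dt = -y, y(0) = 1, so y(t) = exp(-t).
rhs   = @(t, y) -y;
tspan = [0 25];                       % exp(-25) is about 1.4e-11
exact = exp(-tspan(2));

loose = odeset('RelTol', 1e-2, 'AbsTol', 1e-3);    % absolute tolerance far above the final state
tight = odeset('RelTol', 1e-8, 'AbsTol', 1e-14);   % absolute tolerance far below the final state

[~, yLoose] = ode45(rhs, tspan, 1, loose);
[~, yTight] = ode45(rhs, tspan, 1, tight);

relErrLoose = abs(yLoose(end) - exact) / exact;
relErrTight = abs(yTight(end) - exact) / exact;
fprintf('relative error: loose %.3g, tight %.3g\n', relErrLoose, relErrTight);
```

Comparing the two relative errors gives a quick indication of whether a given tolerance setting can resolve states at the magnitude of your data.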

5. Have you checked the tolerances and stopping criteria of your optimization algorithm? The goal for your optimization should be that it terminates because it meets the imposed tolerances rather than because it exceeds the maximum number of iterations. Optimization algorithms terminate the estimation based on tolerances and stopping criteria. An example of a tolerance here is that you specify the precision with which you want to estimate a certain parameter, e.g. Cl with a precision down to 0.1 ml/hour. If these tolerances and stopping criteria are not set properly, your optimization could terminate early (leading to loss of precision in the estimation) or late (leading to unnecessarily long optimization compute times).
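A minimal sketch of this check, using fminsearch from base MATLAB on hypothetical noise-free data. The model, the data and the tolerance values are assumptions for illustration. The point is only to show how to tell whether the run stopped because the tolerances were met or because a limit was hit.

```matlab
% Hypothetical noise-free data from y = A*exp(-k*t) with A = 10, k = 0.7.
t = (0:0.5:5).';
y = 10 * exp(-0.7 * t);
sse = @(p) sum((y - p(1) * exp(-p(2) * t)).^2);   % least-squares objective

opts = optimset('TolX', 1e-6, 'TolFun', 1e-8, ...
                'MaxIter', 500, 'MaxFunEvals', 1000);
[pEst, fval, exitflag, output] = fminsearch(sse, [5, 0.3], opts);

% exitflag 1: tolerances met; 0: iteration or evaluation limit reached.
if exitflag == 1
    fprintf('Converged in %d iterations.\n', output.iterations);
else
    fprintf('Stopped early: %s\n', output.message);
end
```

If the run stops on a limit, either raise the limit or check whether the tolerances are tighter than the problem warrants.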

6. Have you considered structural and practical identifiability of your parameters? In your model, there might exist different combinations of values for two (or more) parameters that result in a very similar model response. When estimating these parameters, the objective function will be very similar for these combinations, so the optimization algorithm cannot find a unique set of parameter estimates. This effect is sometimes called aliasing and is a structural identifiability problem. An example would be parallel enzymatic (Km, Vm) and linear (Cl) clearance routes. A practical identifiability problem occurs when there is not enough data available to sufficiently constrain the parameters you are estimating. An example is estimating the intercompartmental clearance (Q12) when you only have data on the central compartment of a two-compartment model. Another example is data that does not capture the process you are trying to estimate, e.g. you don’t have data on the absorption phase but are trying to estimate the absorption constant (Ka).
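The parallel-clearance example can be sketched in a few lines of plain MATLAB. The parameter values and the concentration range are hypothetical. When all concentrations lie far below Km, the saturable route behaves linearly with slope Vm/Km, so only the sum Vm/Km + Cl is constrained by the data.

```matlab
% Hypothetical total elimination rate: saturable (Vm, Km) plus linear (Cl) route.
elimRate = @(C, Vm, Km, Cl) Vm .* C ./ (Km + C) + Cl .* C;

C  = logspace(-3, -1, 20);   % assumed data range, far below Km
Km = 100;

% Two parameter sets with the same low-concentration slope Vm/Km + Cl = 0.2.
rateA = elimRate(C, 10, Km, 0.10);
rateB = elimRate(C,  5, Km, 0.15);

maxRelDiff = max(abs(rateA - rateB) ./ rateA);
fprintf('largest relative difference: %.2g\n', maxRelDiff);
```

Both parameter sets produce practically the same elimination rate over this range, so an optimizer has no basis for preferring one over the other.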

7. Have you considered trying another optimization algorithm? SimBiology supports a range of optimization algorithms, depending on the toolboxes you have installed. There is no single answer as to which algorithm you should use but some general guidelines can help in selecting the best algorithm.

  • Non-linear regression: If your aim is to estimate parameters for each group in your dataset (unpooled) or for all groups (pooled), you can use non-linear regression estimation methods. The optimization algorithms can be broken down into local and global optimization algorithms.
    You can use a local optimization algorithm when you have good initial estimates for the parameters you are trying to estimate. Each of the local optimization functions has a different default optimization algorithm: fminsearch (Nelder-Mead/downhill simplex search method), fmincon (interior-point), fminunc (quasi-newton), nlinfit (Levenberg-Marquardt), lsqcurvefit and lsqnonlin (both trust-region-reflective). As a default, we would recommend using lsqnonlin if you have access to the Optimization Toolbox. Note that all but the fminsearch algorithm are gradient based. If a gradient-based algorithm fails to find suitable estimates, you can try fminsearch and see whether that improves the optimization. All local optimization algorithms can get “stuck” in a local minimum of the objective function and might therefore fail to reach the true minimum.
    Global optimization algorithms are developed to find the absolute minimum of the objective function. You can use them when your fitting task results in different parameter estimates when repeated with different initial values (in other words, your optimization is getting stuck in local minima). You are more likely to encounter this as you increase the number of parameters you are estimating, as you increase the parameter space you are exploring (in other words, the bounds you are imposing on your estimates) and when you have poor initial estimates (in other words, your initial estimates are potentially very far from the estimates that correspond with the minimum of the objective function). A disadvantage of global optimization algorithms is that they are much more computationally expensive: they often take significantly more time to converge than the local optimization methods do. When using global optimization methods, we recommend using SimBiology’s built-in scattersearch algorithm, combined with lsqnonlin as a local solver. If you have access to the Global Optimization Toolbox, you can try the functions ga (genetic algorithm), patternsearch and particleswarm. Note that some of the global optimization algorithms, including scattersearch, lend themselves well to acceleration with parallel or distributed computing.
  • Estimate category-specific parameters: If you want to estimate category-specific parameters for multiple subjects, e.g. you have 10 male and 10 female subjects in your dataset and you want to estimate a separate clearance value for each gender while all other parameters will be gender-independent, you can also use non-linear regression. Please refer to this example in the documentation.
  • Non-linear mixed effects: If your data represents a population of individuals where you think there could be significant inter-individual variability you can use mixed effects modeling to estimate the fixed and random effects present in your population, while also understanding covariance between different parameters you are trying to estimate. When performing mixed effects estimation, it is advisable to perform fixed effects estimation in order to obtain reasonable initial estimates for the mixed effects estimation. SimBiology supports two estimation functions: nlmefit (LME, RELME, FO or FOCE algorithms), and nlmefitsa (Stochastic Approximation Expectation-Maximization). Sometimes, these solvers might seem to struggle to converge. In that case, it is worthwhile determining whether your (objective) function tolerance is set too low and increasing the tolerance somewhat.
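The symptom described above, different estimates from different initial values, can be reproduced with a small sketch in plain MATLAB. The objective function and the start points are hypothetical, and the loop below is a plain multi-start of fminsearch, not scattersearch or any other global algorithm named above.

```matlab
% Hypothetical objective with two local minima (near x = -1 and x = 1).
obj = @(x) (x.^2 - 1).^2 + 0.3 * x;

starts = [-2, 2];                   % assumed initial estimates
xEst = zeros(size(starts));
fEst = zeros(size(starts));
for k = 1:numel(starts)
    [xEst(k), fEst(k)] = fminsearch(obj, starts(k));   % local search from each start
end

[~, best] = min(fEst);              % keep the run with the lowest objective value
fprintf('start %g -> x = %.3f, objective = %.3f\n', [starts; xEst; fEst]);
fprintf('best of the runs: x = %.3f\n', xEst(best));
```

If repeated local fits disagree like this, that is a reasonable signal to tighten the bounds, improve the initial estimates or move to a global method.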

8. Does your optimization get stuck? Sometimes, the optimization algorithm can get stuck at a certain iteration. For a particular iteration, the parameter values that the model is simulated with as part of the optimization process can put the model in a state where the ODE solver needs to take very small time steps to achieve the tolerances (e.g. very rapid changes of model responses). Solutions can include: changing your initial estimates, imposing lower and upper bounds on the parameters you are trying to estimate, selecting a different solver, or easing solver tolerances (only where possible, see also “Visually inspect data and model response”).

9. Are you using the proportional error model? The objective function for the proportional error model contains a term where your response data is part of the denominator. As response variables get close to zero or are exactly zero, this effectively means the objective function contains one or more terms that divide by zero, causing errors or at least very slow iterations of your optimization algorithm. You can try to change the error model to constant or combined to circumvent this problem. Alternatively, you can define separate error models for each response: proportional for those responses that don’t have measurements that contain values close to zero and a constant error model for those responses that do.
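A minimal numeric sketch of the problem, in plain MATLAB. The observations and predictions are made up, and the two sums below are simplified stand-ins for the objective functions, assumed here only to show the division by the data. They are not SimBiology's exact expressions.

```matlab
% Hypothetical observations and model predictions; the last observation is zero.
yObs  = [10; 1; 0.1; 0];
yPred = [ 9; 1.1; 0.12; 0.01];
res   = yObs - yPred;

sseConstant     = sum(res.^2);             % constant error model: unweighted residuals
sseProportional = sum((res ./ yObs).^2);   % residuals scaled by the data

fprintf('constant: %.4g, proportional: %.4g\n', sseConstant, sseProportional);
```

A single zero observation is enough to make the scaled sum non-finite, which is why a constant or combined error model is the safer choice for such responses.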

Dear all,

I am new to SimBiology. I am doing my research in Molecular Communication. Recently I found out that SimBiology can be used for simulating the Bit Error Rate performance of Molecular Communication systems. Please help me to find good reference materials/examples for using SimBiology as a simulation tool for Molecular Communications.

Thank you.

Tomorrow (Wednesday, January 23) during Rosa's Impact of Modeling & Simulation in Drug Development webinar series, Chi-Chung Li, a Senior Scientist at Genentech, will present a case study where SimBiology was used to create a QSP model that enhanced decision making in a Phase I trial.

Sign up now and have the ability to ask questions at the end of the webinar, or access the archived version later: https://www.rosaandco.com/webinars/2019/phase-i-clinical-decision-making-qsp-case-study

Iraj Hosseini, Ph.D., of Genentech will present a webinar on gPKPDSim, a MATLAB app that enables non-modelers to explore and simulate PKPD models built in SimBiology.

While model development typically requires mathematical modeling expertise, model exploration and simulation could be performed by non-modeler scientists to support experimental studies. Dr. Hosseini and his colleagues collaborated with MathWorks' consulting services to develop an app that enables easy use of any model constructed in SimBiology to execute common PKPD analyses.

The webinar will be hosted by Rosa & Co. on Wednesday, October 24. To register, go to: https://register.gotowebinar.com/register/7922912955745684993?mw

This project presents a SimBiology implementation of Mager and Jusko’s generic Target-Mediated Drug Disposition model (TMDD) as described in "General pharmacokinetic model for drugs exhibiting target-mediated drug disposition". Target-mediated drug disposition is a common source of nonlinearity in PK profiles for biotherapeutics. Nonlinearities are introduced because drug-target bindings saturate at therapeutic dosing levels.

Drug in the Plasma reversibly binds with the unbound Target to form drug-target Complex. kon and koff are the association and dissociation rate constants, and clearance of free Drug and Complex from the Plasma is described by first-order processes with rate constants, kel and km, respectively. Free target turnover is described by a zero-order synthesis rate, ksyn, and a first order elimination (rate constant, kdeg). The model also includes an optional Tissue compartment to account for non-specific tissue binding or distribution.

References [1] Mager DE and Jusko WJ (2001) General pharmacokinetic model for drugs exhibiting target-mediated drug disposition. J Pharmacokinetics and Pharmacodynamics 28: 507–532.

This project presents a SimBiology implementation of the systemic Renin-Angiotensin-System (RAS) model that was first developed by Lo et al. and used to investigate the effects of different RAS-modulating therapies. The RAS pathway is crucial for blood pressure and kidney function control as well as a range of other organism-wide functions. The model describes the enzymatic conversion of the precursor protein Angiotensinogen to Angiotensin I and its downstream products Angiotensin 1-7, Angiotensin II and Angiotensin IV. Key pathway effects are triggered by the association of Angiotensin II with the AT1-Receptor. A positive feedback loop connects the Angiotensin II–AT1-Receptor complex with the Angiotensinogen conversion (not shown in the diagram). Enzymatic reactions are modeled as pseudo-unimolecular using enzymatic activities as reaction rates. Degradation reactions are described using protein half-life times. Drug pharmacodynamics are included in the model using the term (1-DrugEffect), where DrugEffect follows a sigmoidal dependence on the Drug concentration, to modify the target enzyme activity.

References [1] Lo, A., Beh, J., Leon, H. D., Hallow, M. K., Ramakrishna, R., Rodrigo, M., & Sarkar, A. (2011). Using a Systems Biology Approach to Explore Hypotheses Underlying Clinical Diversity of the Renin Angiotensin System and the Response to Antihypertensive Therapies. Clinical Trial Simulations, 1, 457–482.

This project presents a SimBiology implementation of a physiologically-based pharmacokinetic (PBPK) model for trichloroethylene (TCE) and its metabolites. It is based on the article, “A human physiologically based pharmacokinetic model for trichloroethylene and its metabolites, trichloroacetic acid and free trichloroethanol” by Fisher et al. [1].

The human PBPK model for TCE and its metabolites presented here was developed by Fisher et al. [1] in order to assess human health risks associated with low level exposure to TCE. TCE is a commonly used solvent in the automotive and metal industries for vapor degreasing of metal parts. Exposure to TCE has been associated with toxic responses such as cancer formation and brain disorders in rodents and in humans [1]. In this PBPK model, TCE enters the systemic circulation through inhalation. Its disposition is described by a six-compartment model representing the liver, lung, kidney, fat, and slowly perfused and rapidly perfused tissues. In the liver, TCE is metabolized to trichloroacetic acid (TCA) and free trichloroethanol (TCOH-f) via P450-mediated metabolism where a fraction of TCOH-f is converted to TCA. For simplicity, a four-compartment submodel was used to describe the disposition of metabolites, TCA and TCOH-f, in the lung, liver, kidney, and body (muscle). Both metabolites are described to be excreted in the urine. TCOH-f is glucuronidated in the liver, forming glucuronide-bound TCOH (TCOH-b), and excreted in the urine via a saturable process whereas TCA is excreted by a first-order process by the kidney.

Reference: Fisher, J. W., Mahle, D., & Abbas, R. (1998). A human physiologically based pharmacokinetic model for trichloroethylene and its metabolites, trichloroacetic acid and free trichloroethanol. Toxicology and applied pharmacology, 152(2), 339-359.

Summary:
Dynamically accessing variable names can negatively impact the readability of your code and can cause it to run slower by preventing MATLAB from optimizing it as well as it could if you used alternate techniques. The most common alternative is to use simple and efficient indexing.
Explanation:
Sometimes beginners (and some self-taught professors) think it would be a good idea to dynamically create or access variable names. The variables are often named something like these:
  • matrix1, matrix2, matrix3, matrix4, ...
  • test_20kmh, test_50kmh, test_80kmh, ...
  • nameA, nameB, nameC, nameD,...
Good reasons why dynamic variable names should be avoided:
There are much better alternatives to accessing dynamic variable names:
Note that avoiding eval (and assignin, etc.) is not some esoteric MATLAB restriction, it also applies to many other programming languages as well:
MATLAB Documentation:
If you are not interested in reading the answers below then at least read MATLAB's own documentation on this topic Alternatives to the eval Function, which states "A frequent use of the eval function is to create sets of variables such as A1, A2, ..., An, but this approach does not use the array processing power of MATLAB and is not recommended. The preferred method is to store related data in a single array." Data in a single array can be accessed very efficiently using indexing.
Note that all of these problems and disadvantages also apply to functions load (without an output variable), assignin, evalin, and evalc, and the MATLAB documentation explicitly recommends to "Avoid functions such as eval, evalc, evalin, and feval(fname)".
The official MATLAB blogs explain why eval should be avoided, the better alternatives to eval, and clearly recommend against magically creating variables. Using eval comes out at position number one on this list of Top 10 MATLAB Code Practices That Make Me Cry. Experienced MATLAB users recommend avoiding using eval for trivial code, and have written extensively on this topic.
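As a short illustration of the indexing alternatives, the sketch below replaces numbered variable names with a cell array and name-encoded metadata with a struct array. All contents are hypothetical placeholder values.

```matlab
% Instead of matrix1, matrix2, matrix3: one cell array, accessed by index.
matrices = cell(1, 3);
for k = 1:numel(matrices)
    matrices{k} = k * eye(2);      % hypothetical contents
end
total = zeros(2);
for k = 1:numel(matrices)
    total = total + matrices{k};   % loop over the data, no eval needed
end

% Instead of test_20kmh, test_50kmh, test_80kmh: keep the speed as data.
speeds  = [20 50 80];              % km/h
results = struct('speed', num2cell(speeds), 'samples', {[1 2], [3 4], [5 6]});
at50 = results([results.speed] == 50).samples;   % look up by value, not by name
```

Because the speed is stored as a value rather than baked into a variable name, adding another test case only means adding another element.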
The community is very helpful, yet I feel really powerless that I cannot find the appropriate way to code, nor find the problems with the code I have written. I have read numerous books on MATLAB, mostly related to science and engineering applications. Any advice to improve would be greatly appreciated. Thanks.
Hello all,
Please explain good MATLAB programming practices. It will help those who are new to programming, like me.
Previously I used
for i=1:10
After following some suggestions from these Answers pages, I learnt to use
for i1=1:100
This is the good way to write programs.
Like this, as a professional programmer, please mention some good programming practice techniques.
It will be useful to all!
Capital letters are obtained by capitalizing the LaTeX command for the lowercase version. Capital letters in grey are exceptions which have no LaTeX commands. For example, to produce a capital chi simply type X (this also applies for the lowercase omicron).
When two versions of the lowercase letter are available, a var prefix can be added to obtain the second version. For example, the two versions of epsilon are \epsilon and \varepsilon.
--------------------------------------------------------------------------------------------------------------------------------------------------------
The code used to generate the table:
greeks = ...
{'ALPHA' 'A' '\alpha'
'BETA' 'B' '\beta'
'GAMMA' '\Gamma' '\gamma'
'DELTA' '\Delta' '\delta'
'EPSILON' 'E' {'\epsilon','\varepsilon'}
'ZETA' 'Z' '\zeta'
'ETA' 'H' '\eta'
'THETA' '\Theta' {'\theta','\vartheta'}
'IOTA' 'I' '\iota'
'KAPPA' 'K' '\kappa'
'LAMBDA' '\Lambda' '\lambda'
'MU' 'M' '\mu'
'NU' 'N' '\nu'
'XI' '\Xi' '\xi'
'OMICRON' 'O' 'o'
'PI' '\Pi' {'\pi','\varpi'}
'RHO' 'P' {'\rho','\varrho'}
'SIGMA' '\Sigma' {'\sigma','\varsigma'}
'TAU' 'T' '\tau'
'UPSILON' '\Upsilon' '\upsilon'
'PHI' '\Phi' {'\phi','\varphi'}
'CHI' 'X' '\chi'
'PSI' '\Psi' '\psi'
'OMEGA' '\Omega' '\omega'};
h = figure('Units','pixels','Position',[300,100,620,620],'Color','w');
axes('Units','pixels','Position',[10,10,600,600],'XColor','w','YColor','w',...
    'XTick',[],'YTick',[],'XLim',[0 6],'YLim',[0,4]);
% Loop over the 4-by-6 grid of letters, row by row
for r = 1:4
    for c = 1:6
        el = (r-1)*6 + c;
        % Title
        text(c-0.5,5-r,greeks{el,1},'FontSize',14,'FontName','FixedWidth',...
            'HorizontalAlignment','center','VerticalAlignment','cap')
        % Color the capital letter black if it has a LaTeX command, grey otherwise
        if strcmp(greeks{el,2}(1),'\')
            clr = [0, 0, 0];
        else
            clr = [0.65, 0.65, 0.65];
        end
        % Capital letter
        text(c-0.5,4.87-r,['$\rm{' greeks{el,2} '}$'],'FontSize',40,...
            'HorizontalAlignment','center','VerticalAlignment','cap',...
            'Interpreter','latex','Color',clr)
        % Lowercase letter/s (if two variants)
        if iscell(greeks{el,3})
            text(c-0.75,4.48-r,['$' greeks{el,3}{1} '$'],'FontSize',20,...
                'HorizontalAlignment','center','Interpreter','latex')
            text(c-0.25,4.48-r,['$' greeks{el,3}{2} '$'],'FontSize',20,...
                'HorizontalAlignment','center','Interpreter','latex')
            % LaTeX command
            text(c-0.5,4.3-r,['\' greeks{el,3}{1}],'FontSize',12,'FontName','FixedWidth',...
                'HorizontalAlignment','center','VerticalAlignment','baseline')
        else
            text(c-0.5,4.48-r,['$' greeks{el,3} '$'],'FontSize',20,...
                'HorizontalAlignment','center','Interpreter','latex')
            text(c-0.5,4.3-r,['\' greeks{el,3}],'FontSize',12,'FontName','FixedWidth',...
                'HorizontalAlignment','center','VerticalAlignment','baseline')
        end
    end
end
% Print to pdf (export_fig is a third-party function, not part of MATLAB)
export_fig greeks.pdf
The link to export_fig.
And here is the link to the pdf on scribd: http://www.scribd.com/doc/159011120/Greek-alphabet-in-latex
[INDEX]
--------------------------------------------------------------------------------------------------------------------------------------
[MOTIVATION]
Why should we use markups in the body of our questions?
The answer is a question: which of the two versions is more likely to be understood at a glance and has a better chance of being answered by our readers?
.
< Consider the following question >
I have a vector of weights W=[10,20,30,50,23434,1,2.4,2] and a matrix A=rand(100,8) and I would like to find the row-wise weighted sum of A. I am proceeding in the following way: B=zeros(size(A)); for c=1:numel(W) B(:,c)=A(:,c)*W(c); end B=sum(B,2); Somehow I get huge numbers can you please help?
.
< Now, consider its formatted version >
I have a vector of weights W = [10,20,30,50,23434,1,2.4,2] and a matrix A = rand(100,8) and I would like to find the row-wise weighted sum of A.
I am proceeding in the following way:
B = zeros(size(A));
for c = 1:numel(W)
B(:,c) = A(:,c)*W(c);
end
B = sum(B,2);
Somehow I get huge numbers can you please help?
--------------------------------------------------------------------------------------------------------------------------------------
[ACKNOWLEDGMENTS]
In alphabetical order by nickname, thanks for their suggestions to:
Walter Roberson
--------------------------------------------------------------------------------------------------------------------------------------
[LOG]
  • 06 Aug 2011, 13:17 BST - created and added boldface.gif
  • 06 Aug 2011, 14:59 BST - added italic.gif
  • 06 Aug 2011, 18:58 BST - added index section
  • 07 Aug 2011, 00:03 BST - added code.gif and tutorial series section
  • 07 Aug 2011, 01:50 BST - added monospaced.gif, numlist.gif, bullist.gif and hyperlink.gif
  • 13 Aug 2011, 14:27 BST - added motivation section
  • 18 Aug 2011, 01:44 BST - added acknowledgments section and link to wish-list
--------------------------------------------------------------------------------------------------------------------------------------
[TUTORIAL Series]
Do not forget to read the Markup help (located on the top-right corner of the Body pane)
Vote on my post, Wish-list for MATLAB Answer sections, if you think that a tutorial section on top of Answers could be useful.
Now, I am still a novice when it comes to programming. I believe MATLAB is definitely a great programming tool, one that I can play with, particularly, when I have free time.
I would love to hear from all answerers, what are the ways that can make one proficient in this field?
amit jain
Last activity on 2 Jun 2023

What is the best way to learn MATLAB at home without a tutor?