iMAT Thresholds in CobraToolbox

4 views (last 30 days)
Deborah
Deborah on 20 Feb 2025
Answered: Prathamesh on 20 Aug 2025
Hi Everyone,
I am comparing two proteomic analyses of Bordetella pertussis under wild type (control) and Fe growth conditions. I'm using the iMAT code
I’m having trouble selecting appropriate threshold_lb and threshold_ub values to get the best results. Also, I replaced many NaN values in the original data table with zeros, but I’m unsure if this was the right approach.
I’ve attached the Bordetella pertussis model, and the data table
I’d greatly appreciate any guidance or recommendations.
Thank you for your time and support.
Best regards,
Deborah
% Set the lb and ub thresholds to run iMAT,
threshold_ub = quantile(filteredExpression, 0.75); % Top 25%
threshold_lb = quantile(filteredExpression, 0.25); % Bottom 25%
% Review how this affects the model
[highlyActive, lowlyActive] = deal(sum(expressionRxns > threshold_ub), sum(expressionRxns < threshold_lb));
disp(['Highly active reactions: ', num2str(highlyActive)]);
disp(['Inactive reactions: ', num2str(lowlyActive)]);
Highly active reactions: 280
Inactive reactions: 1261
% Run iMAT
tissueModel = iMAT(model, expressionRxns, threshold_lb, threshold_ub);
Set parameter Username
Set parameter TimeLimit to value 60
Set parameter IntFeasTol to value 1e-09
Set parameter MIPGap to value 1e-12
Set parameter LogFile to value "MILPlog"
Academic license - for non-commercial use only - expires 2026-02-04
Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (win64 - Windows 10.0 (19045.2))
CPU model: Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 8 physical cores, 16 logical processors, using up to 16 threads
Optimize a model with 4337 rows, 3699 columns and 11969 nonzeros
Model fingerprint: 0x64c64857
Variable types: 1878 continuous, 1821 integer (1821 binary)
Coefficient statistics:
Matrix range [1e-05, 1e+05]
Objective range [1e+00, 1e+00]
Bounds range [3e-02, 1e+05]
RHS range [3e-02, 1e+05]
Presolve removed 3279 rows and 2307 columns
Presolve time: 0.02s
Presolved: 1058 rows, 1392 columns, 4841 nonzeros
Variable types: 688 continuous, 704 integer (704 binary)
Found heuristic solution: objective 1204.0000000
Root relaxation: objective 1.360309e+03, 1109 iterations, 0.02 seconds (0.02 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 1360.30876 0 69 1204.00000 1360.30876 13.0% - 0s
H 0 0 1280.0000000 1360.30876 6.27% - 0s
0 0 1342.33655 0 58 1280.00000 1342.33655 4.87% - 0s
H 0 0 1283.0000000 1342.02258 4.60% - 0s
0 0 1337.49864 0 69 1283.00000 1337.49864 4.25% - 0s
0 0 1336.76409 0 66 1283.00000 1336.76409 4.19% - 0s
0 0 1336.43203 0 60 1283.00000 1336.43203 4.16% - 0s
0 0 1335.08203 0 72 1283.00000 1335.08203 4.06% - 0s
0 0 1334.98943 0 72 1283.00000 1334.98943 4.05% - 0s
0 0 1332.32371 0 64 1283.00000 1332.32371 3.84% - 0s
H 0 0 1288.0000000 1332.27738 3.44% - 0s
0 0 1332.27738 0 64 1288.00000 1332.27738 3.44% - 0s
0 0 1331.33108 0 68 1288.00000 1331.33108 3.36% - 0s
0 0 1331.12550 0 68 1288.00000 1331.12550 3.35% - 0s
0 0 1330.30704 0 70 1288.00000 1330.30704 3.28% - 0s
0 0 1330.16588 0 76 1288.00000 1330.16588 3.27% - 0s
0 0 1330.16588 0 76 1288.00000 1330.16588 3.27% - 0s
0 0 1330.04034 0 86 1288.00000 1330.04034 3.26% - 0s
0 0 1329.78342 0 89 1288.00000 1329.78342 3.24% - 0s
0 0 1329.46261 0 85 1288.00000 1329.46261 3.22% - 0s
0 0 1329.27813 0 90 1288.00000 1329.27813 3.20% - 0s
0 0 1329.27813 0 93 1288.00000 1329.27813 3.20% - 0s
0 0 1328.55803 0 90 1288.00000 1328.55803 3.15% - 0s
0 0 1328.43796 0 81 1288.00000 1328.43796 3.14% - 0s
0 0 1328.43760 0 81 1288.00000 1328.43760 3.14% - 0s
0 0 1327.47668 0 66 1288.00000 1327.47668 3.06% - 0s
0 0 1327.08584 0 73 1288.00000 1327.08584 3.03% - 0s
0 0 1326.96546 0 73 1288.00000 1326.96546 3.03% - 0s
0 0 1326.93213 0 81 1288.00000 1326.93213 3.02% - 0s
0 0 1326.54294 0 74 1288.00000 1326.54294 2.99% - 0s
H 0 0 1296.0000000 1326.53803 2.36% - 0s
0 2 1326.53803 0 74 1296.00000 1326.53803 2.36% - 0s
H 904 722 1298.0000000 1320.38916 1.72% 17.5 0s
H 1295 714 1300.0000000 1320.38916 1.57% 17.3 1s
H 1308 625 1301.0000000 1320.38916 1.49% 17.2 1s
Cutting planes:
Learned: 11
Gomory: 16
Cover: 48
Implied bound: 50
Clique: 23
MIR: 37
Flow cover: 32
Network: 1
RLT: 2
Relax-and-lift: 6
BQP: 1
Explored 2027 nodes (34300 simplex iterations) in 1.53 seconds (1.63 work units)
Thread count was 16 (of 16 available processors)
Solution count 10: 1301 1300 1298 ... 1254
Optimal solution found (tolerance 1.00e-12)
Best objective 1.301000000000e+03, best bound 1.301000000000e+03, gap 0.0000%

Answers (1)

Prathamesh
Prathamesh on 20 Aug 2025 at 9:49
I understand that you are comparing two proteomic analyses using iMAT code.
Your current method uses the ‘25th’ and ‘75th’ percentiles of ‘filteredExpression’ to set the lower and upper bounds. This is a strong start because it uses the unique characteristics of your data to set the thresholds. However, the best thresholds can vary depending on your dataset and the biological question.
Here are some recommendations to get the results:
  • Your current approach is already relative, using the ‘25th’ and ‘75th’ percentiles. You can refine this by experimenting with other percentile ranges, like the ‘20th’ and ‘80th’ or ‘30th’ and ‘70th’. This helps you see how different thresholds affect the number of active and inactive reactions, which is a type of sensitivity analysis.
  • Before picking your thresholds, plot your expression data using a histogram or a box plot. This lets you visually check the distribution and decide if the 25th and 75th percentiles are the right fit. For instance, if your data is highly skewed, these percentiles might not be the best way to define 'highly active' and 'inactive' genes.
  • Consider a more data-driven approach by using existing biological knowledge. If there are ‘Bordetella pertussis’ genes known to be highly or lowly expressed under iron growth conditions, you can use their expression levels to help set your thresholds.
Hope this helps.
Refer below MathWorks Documentation:

Categories

Find more on Genomics and Next Generation Sequencing in Help Center and File Exchange

Products


Release

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!