
Proximity Matrix of Random Forest

Keyan Li on 3 Mar 2022
Answered: Aneela on 26 May 2024
I want to know how to get the proximity matrix of a random forest in MATLAB. I build the random forest model through fitcensemble with the bag method. The concept of the proximity matrix is not complex: it is an N-by-N matrix for N data points, where element (i, j) is the number of trees in which x_i and x_j end up in the same leaf. The proximity is usually scaled by the total number of trees. How can I extract this information in MATLAB?

Answers (1)

Aneela on 26 May 2024
Hi Keyan Li,
MATLAB has a built-in function, “proximity”, that calculates the proximity matrix, but it works only for “CompactTreeBagger” objects. You can refer to the following link: https://www.mathworks.com/help/stats/compacttreebagger.proximity.html
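As a minimal sketch of that built-in route, assuming the forest can be trained with TreeBagger rather than fitcensemble: the example below trains a small bagged forest, compacts it, and calls “proximity”. The fisheriris sample data, the tree count of 50, and the variable names are illustrative choices, not part of the question.

```matlab
% Sketch only: uses the fisheriris sample data as a stand-in for your own X and Y.
load fisheriris                 % provides meas (150-by-4) and species (150-by-1 cell)
X = meas;
Y = species;

rng(1);                         % fix the seed so the forest is reproducible
B = TreeBagger(50, X, Y);       % 50 bagged classification trees (illustrative count)
CB = compact(B);                % convert to a CompactTreeBagger

% N-by-N matrix: fraction of trees in which two observations share a leaf
proxBuiltin = proximity(CB, X);
```

This only helps if switching from fitcensemble to TreeBagger is acceptable for your workflow.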
However, there is no direct built-in support for obtaining the proximity matrix from a random forest model built with “fitcensemble” using the “Bag” method.
The proximity matrix for such a model can still be calculated manually. Here is a workaround:
  • For each tree in the ensemble, predict the leaf indices for each data point.
  • For each tree, if two data points end up in the same leaf, increment their corresponding entry in the proximity matrix.
  • Optionally, scale the proximity matrix by the total number of trees to get the average proximity.
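The steps above can be sketched as follows. This is a minimal example under stated assumptions, not a definitive implementation: it assumes the ensemble uses tree learners, so that each element of Mdl.Trained is a compact classification tree whose “predict” returns the node number as its third output. The fisheriris sample data, the 50 learning cycles, and the variable names are illustrative.

```matlab
% Sketch only: uses the fisheriris sample data as a stand-in for your own X and Y.
load fisheriris                 % provides meas (150-by-4) and species (150-by-1 cell)
X = meas;
Y = species;

rng(1);                         % fix the seed so the ensemble is reproducible
Mdl = fitcensemble(X, Y, 'Method', 'Bag', 'NumLearningCycles', 50);

N = size(X, 1);
nTrees = Mdl.NumTrained;
prox = zeros(N);                % N-by-N accumulator of shared-leaf counts

for t = 1:nTrees
    % Third output of predict is the node (leaf) each observation lands in
    [~, ~, node] = predict(Mdl.Trained{t}, X);
    % node == node' is an N-by-N logical matrix: true where two points share a leaf
    % (implicit expansion, available from R2016b onward)
    prox = prox + (node == node');
end

prox = prox / nTrees;           % scale counts to the fraction of trees
```

After scaling, prox(i, j) is the fraction of trees in which observations i and j fall in the same leaf, so the diagonal is 1 and the matrix is symmetric. For large N the N-by-N matrix can be memory-hungry, so you may want to compute it on a subset of the data.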

Release

R2021b