Predict Cluster Assignments Using Python Scikit-learn Model Predict Block
This example shows how to use the Scikit-learn Model Predict block for prediction in Simulink®. The block accepts observations (predictor data) and returns the predicted cluster assignments using a trained unsupervised machine learning model that is executed in Python®. MATLAB® supports the reference implementation of Python, often called CPython. If you use a Mac or Linux® platform, you might already have Python installed. If you use Windows®, you need to install a distribution, such as those found at https://www.python.org/downloads/. For more information, see Configure Your System to Use Python. Your MATLAB Python environment must have the scikit-learn module installed.
The Scikit-learn Model Predict block requires a pretrained scikit-learn™ model file that you save in Python using pickle.dump(), joblib.dump(), or skops.io.dump(). The model must support a corresponding predict() method in Python. This example provides the saved model sklearnmodel.pkl, which is a MiniBatchKMeans clustering model trained on standardized Fisher's iris data and saved with pickle.dump() in scikit-learn version 1.3.2. The example also provides the Python files scaler.pkl, sklearnmodel.py, and preprocessor.py.
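The file contract described above (a pickled object that exposes a predict() method) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce sklearnmodel.pkl: so that the sketch runs without scikit-learn, a hypothetical TinyCentroidModel class stands in for MiniBatchKMeans, the centroid values are invented, and the model is serialized to an in-memory buffer instead of a file.

```python
import io
import math
import pickle


class TinyCentroidModel:
    """Hypothetical stand-in for a fitted clustering model (not scikit-learn)."""

    def __init__(self, centroids):
        # One centroid per cluster; each centroid has one value per predictor.
        self.centroids = [list(c) for c in centroids]

    def predict(self, observations):
        """Return the index of the nearest centroid for each observation."""
        labels = []
        for row in observations:
            distances = [math.dist(row, c) for c in self.centroids]
            labels.append(distances.index(min(distances)))
        return labels


# Assumed centroids in standardized units, chosen only for illustration.
model = TinyCentroidModel([[-1.0, 0.8, -1.3, -1.2], [0.5, -0.4, 0.6, 0.6]])

# Save with pickle.dump(), one of the formats the block accepts.
buffer = io.BytesIO()
pickle.dump(model, buffer)

# Load the model back and call predict(), as a consumer of the file would.
buffer.seek(0)
loaded = pickle.load(buffer)
labels = loaded.predict([[-0.9, 1.0, -1.3, -1.3], [0.6, -0.3, 0.7, 0.8]])
print(labels)
```

With a real scikit-learn estimator, the same pickle.dump() and pickle.load() calls apply, and the buffer would typically be replaced by a file opened in binary mode.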
Open Provided Simulink Model
This example provides the Simulink model slexScikitLearnPredictExample.slx, which includes the Scikit-learn Model Predict block. You can open the Simulink model or create a new model as described in the next section.
Open the Simulink model slexScikitLearnPredictExample.slx.
open_system("slexScikitLearnPredictExample");
When you open the Simulink model, the software runs the code in the PreLoadFcn callback function before loading the Simulink model. The PreLoadFcn callback function of slexScikitLearnPredictExample includes code to check if your workspace contains the modelInput variable for the trained model. If the workspace does not contain the variable, PreLoadFcn loads the sample data and creates an input signal for the Simulink model. To view the callback function, in the Setup section on the Modeling tab, click Model Settings and select Model Properties. Then, on the Callbacks tab, select the PreLoadFcn callback function in the Model callbacks pane.
Create Simulink Model
To create a new Simulink model, open the Blank Model template and add the Scikit-learn Model Predict block from the Statistics and Machine Learning Toolbox™ library. Add an Inport block and an Outport block, and connect them to the Scikit-learn Model Predict block.
Double-click the Scikit-learn Model Predict block to open the Block Parameters dialog box. Enter sklearnmodel.pkl in the Path to scikit-learn model file text box.
On the Input tab, set the Python Datatype to float. On the Pre/Post-processing tab, enter preprocessor.py in the Path to Python file defining preprocess() text box. Click OK.
The Scikit-learn Model Predict block expects observations containing four predictor values, because the Python model was trained using a data set with four predictor variables. Double-click the Inport block, and set Port dimensions to 4 on the Signal Attributes tab. To specify that the output signals have the same length as the input signal, set Sample Time to 1 on the Execution tab. Clear the Interpolate data check box and click OK.
Load Fisher's iris data set, which contains 150 observations and 4 predictors. To simulate new observations that differ from those used to train the Python model, add random Gaussian noise to the observations.
rng(0,"twister") % For reproducibility
load fisheriris
meas = meas + 0.1*randn(size(meas));
Create an appropriate structure array for the input data. For more information, see Control How Models Load Input Data (Simulink).
modelInput.time = (1:size(meas,1))'-1;
modelInput.signals.values = meas;
modelInput.signals.dimensions = size(meas,2);
To import the signal data from the workspace:
On the Modeling tab, click Model Settings to open the Configuration Parameters dialog box.
On the left of the Configuration Parameters dialog box, click Data Import/Export. Then select the Input check box and enter modelInput in the adjacent text box.
On the left, click Solver. Under Simulation time, set Stop time to size(meas,1)-1. Under Solver selection, set Type to Fixed-step, and set Solver to discrete (no continuous states). Click OK.
For more details, see Load Signal Data for Simulation (Simulink).
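The relationship between the time vector, the sample time, and the stop time can be checked with a few lines of Python. This is a sketch of the arithmetic only, assuming 150 observations and a sample time of 1 as in this example, and assuming the simulation advances one sample time per step. It does not call Simulink, and the variable names are chosen for illustration.

```python
n_observations = 150  # rows in the iris data
sample_time = 1       # sample time set on the Inport block

# One time stamp per observation, starting at 0 (mirrors modelInput.time).
time = [k * sample_time for k in range(n_observations)]

# Stop time equal to the last time stamp, that is, size(meas,1)-1 in MATLAB.
stop_time = time[-1]

# Stepping from 0 to stop_time in increments of sample_time visits
# one step per observation.
n_steps = stop_time // sample_time + 1
print(stop_time, n_steps)
```

Under these assumptions, the number of steps equals the number of observations, which is why the stop time is one less than the number of rows in meas.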
Save the model as slexScikitLearnPredictExample.slx in Simulink.
Simulate Simulink Model
Simulate the Simulink model to predict cluster assignments for the input observations. You might receive a warning message if your Python installation uses a scikit-learn version prior to 1.3.2.
simOut=sim("slexScikitLearnPredictExample");
When the Inport block detects observations, it passes them to the Scikit-learn Model Predict block. The Scikit-learn Model Predict block converts the predictor data to the Python or NumPy data type specified in the Python Datatype column on the Input tab of the Block Parameters dialog box. The block then passes the data to Python, where the software standardizes the data using the function defined in preprocessor.py and sends the standardized data to the Python model. The Python model returns the predicted cluster assignments for the observations. You can use the Simulation Data Inspector (Simulink) to view the logged data of the Outport block.
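The data flow described above (convert to float, standardize, then predict) can be sketched in plain Python. The sketch is illustrative only: the preprocess() signature, the MEANS and SCALES values, and the CENTROIDS are assumptions made for this example, not the contents of preprocessor.py, scaler.pkl, or sklearnmodel.pkl, and a nearest-centroid rule stands in for the trained model.

```python
import math

# Assumed per-predictor statistics (hypothetical values, not from scaler.pkl).
MEANS = [5.8, 3.0, 3.8, 1.2]
SCALES = [0.8, 0.4, 1.8, 0.8]

# Assumed cluster centroids in standardized units (hypothetical values).
CENTROIDS = [[-1.0, 0.9, -1.3, -1.2], [0.6, -0.5, 0.7, 0.7]]


def preprocess(observation):
    """Standardize one observation: subtract the mean, divide by the scale."""
    values = [float(v) for v in observation]  # mirrors the float conversion
    return [(v - m) / s for v, m, s in zip(values, MEANS, SCALES)]


def predict(standardized):
    """Assign the observation to the nearest centroid (Euclidean distance)."""
    distances = [math.dist(standardized, c) for c in CENTROIDS]
    return distances.index(min(distances))


# One observation with four predictor values per time step.
signal = [[5.1, 3.5, 1.4, 0.2], [6.3, 2.9, 5.6, 1.8]]
assignments = [predict(preprocess(row)) for row in signal]
print(assignments)
```

Standardizing before prediction matters because the model was trained on standardized data, so raw measurements would be compared against centroids expressed in different units.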
Visualize Model Predictions
Create a scatter plot of the third predictor variable versus the second predictor variable. Assign a different color to each cluster assignment predicted by the model.
C = squeeze(simOut.yout.getElement(1).Values.Data);
gscatter(meas(:,2),meas(:,3),C')