About Generalized Linear Models

Baloo on 15 Sep 2022
Answered: Dolon Mandal on 12 Sep 2023
Dear All,
My question may sound naive, but I am pretty new to the field. I am trying to train a GLM on a dataset consisting of 4 predictor vectors + 1 binary response vector.
First, I used the function stepwiseglm() to find which model was selected most often while bootstrapping the response vector and resampling the predictors accordingly. As a result, I found that a model with two continuous variables was selected most often.
Second, I wanted to focus on these two variables and study their behavior in more detail. I thus set up an analysis using the two selected predictor vectors + 1 binary response vector, and I ran the glmfit() function, again bootstrapping the variables.
Here comes my question: apparently, despite setting up the functions in the same way, I get different results for the coefficients associated with the two predictor variables (different in absolute value and also in sign). Moreover, while the model is associated with a significant p-value when running stepwiseglm(), this is not the case with glmfit().
I was not able to find out how the two functions compute the coefficients, and how the fit works (I was expecting very similar results, but this is apparently not the case).
To add to the confusion, I found that if I perform a fit with fitglm(), the results I get are similar to those obtained with stepwiseglm().
Could you please provide some further detail on what would be the best choice in my case, and where is the difference between the stepwiseglm() and glmfit() algorithms, apart from the adding/removing of variables?
I thank you in advance.
Best regards

Answers (1)

Dolon Mandal on 12 Sep 2023
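The differences you observe in the coefficients and p-values between `stepwiseglm`, `glmfit`, and `fitglm` are unlikely to come from the fitting algorithm itself, since all three functions estimate GLM coefficients by maximum likelihood. They more likely come from differences in the model each call actually fits (distribution, link function, and which terms are included) and in which p-value each function reports. Here's an explanation of each function and their differences: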
1. `stepwiseglm`: This function performs stepwise model selection for generalized linear models (GLMs). Starting from an initial model (a constant model by default), it adds or removes terms one at a time according to the `'Criterion'` option (the change in deviance by default; AIC, BIC, and others are also available). By default the largest model it considers includes pairwise interaction terms (`'Upper','interactions'`), so the selected model can contain terms beyond the raw predictors. Stepwise selection is also sensitive to the particular sample, so the selected model can change from one bootstrap replicate to the next.
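2. `glmfit`: This function fits a GLM using maximum likelihood estimation (MLE). It estimates the model coefficients by maximizing the likelihood of the observed data given the model. `glmfit` does not perform automatic variable selection or model comparison; it estimates one coefficient per column of the predictor matrix plus an intercept, which is returned as the first element of the coefficient vector. The distribution is passed as the third input (for example `glmfit(X,y,'binomial')`), and the per-coefficient p-values are returned in the `p` field of the `stats` output.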
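3. `fitglm`: This function also fits a GLM using maximum likelihood estimation (MLE), so for the same model it should give the same estimates as `glmfit`. The difference is mainly in the interface: `fitglm` accepts tables and model formulas and returns a `GeneralizedLinearModel` object with a coefficient table and methods for prediction and diagnostics. Both functions support non-default link functions. Note that in `fitglm` (and in `stepwiseglm`) the distribution is set with the `'Distribution'` name-value argument and defaults to `'normal'`, so for a binary response it has to be set to `'binomial'` explicitly.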
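As a quick check of points 2 and 3, here is a minimal sketch that fits the same logistic model with `glmfit` and with `fitglm` and compares the results. The data are synthetic and purely illustrative (the sample size, true coefficients, and variable names are assumptions, not taken from the question); it requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data (hypothetical): two continuous predictors, one binary response.
rng(1);                                   % fix the seed so the run is repeatable
n = 200;
X = randn(n, 2);
p = 1 ./ (1 + exp(-(0.5 + 1.2*X(:,1) - 0.8*X(:,2))));
y = double(rand(n, 1) < p);

% glmfit: the distribution is a positional argument; the default link for
% 'binomial' is the logit. b(1) is the intercept, b(2:3) are the slopes.
[b, dev, stats] = glmfit(X, y, 'binomial');

% fitglm: the distribution is a name-value argument ('normal' if omitted).
mdl = fitglm(X, y, 'linear', 'Distribution', 'binomial');

% Same data, distribution, link, and terms: the estimates and the
% per-coefficient p-values should agree up to numerical tolerance.
coefDiff = max(abs(b - mdl.Coefficients.Estimate));
pDiff    = max(abs(stats.p - mdl.Coefficients.pValue));
fprintf('max coefficient difference: %g\n', coefDiff);
fprintf('max p-value difference:     %g\n', pDiff);
```

If the two differences are not close to zero on your own data, the two calls are probably not describing the same model, which is worth checking before comparing signs or p-values.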
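The differences in coefficients and p-values between `stepwiseglm` and `glmfit` usually arise because the two calls do not describe the same model. Typical causes are: the distribution was set to `'binomial'` in one call but left at its default in the other; the model selected by `stepwiseglm` contains an interaction term or a different subset of predictors than the one passed to `glmfit`; or the first element of the `glmfit` coefficient vector (the intercept) was compared with the first predictor. The p-values may also not be comparable: the display of a `stepwiseglm` or `fitglm` model includes a test of the whole model against a constant model, whereas `glmfit` returns only per-coefficient p-values. When the data, distribution, link, and terms are identical, `glmfit` and `fitglm` should return the same coefficients up to numerical tolerance.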
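To see exactly which terms `stepwiseglm` keeps, it can help to make its settings explicit. The sketch below is an illustration on synthetic data (the four predictors, the true coefficients, and the choice to restrict the search to main effects are all assumptions made for the example):

```matlab
% Synthetic data (hypothetical): four predictors, only x1 and x3 matter.
rng(2);
n = 300;
X = randn(n, 4);
p = 1 ./ (1 + exp(-(0.3 + 1.5*X(:,1) - 1.0*X(:,3))));
y = double(rand(n, 1) < p);

% Start from the constant model and allow main effects only.
% Without 'Upper','linear' the default upper bound is 'interactions',
% so product terms such as x1:x3 could enter the selected model.
mdlStep = stepwiseglm(X, y, 'constant', ...
    'Upper', 'linear', ...
    'Distribution', 'binomial', ...
    'Verbose', 0);

% Inspect what was actually selected before comparing with another fit.
disp(mdlStep.CoefficientNames);   % names of the retained terms
disp(mdlStep.Coefficients);       % estimates, SEs, per-coefficient p-values
```

Comparing `mdlStep.CoefficientNames` with the columns passed to `glmfit` shows directly whether the two fits contain the same terms.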
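In your case, since you have already identified the two predictor variables using `stepwiseglm`, it might be more appropriate to use `fitglm` to fit the GLM with the selected predictors, specifying `'Distribution','binomial'` explicitly. `fitglm` returns a model object that reports the coefficients, their standard errors and p-values, and the overall model test in one place, which makes it easier to compare with the `stepwiseglm` output.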
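Since the question involves bootstrapping, here is one possible way to bootstrap the coefficients of the two-predictor model and look at how stable their signs are. This is only a sketch on synthetic data: the data, the number of replicates, and the use of `bootstrp` with `glmfit` are assumptions for illustration, not a reconstruction of the original analysis.

```matlab
% Synthetic data (hypothetical): two continuous predictors, one binary response.
rng(3);
n = 200;
X = randn(n, 2);
p = 1 ./ (1 + exp(-(0.5 + 1.2*X(:,1) - 0.8*X(:,2))));
y = double(rand(n, 1) < p);

% Each replicate resamples the rows of X and y together and refits the model.
% glmfit returns a column vector; it is transposed so that bootstrp stores
% one row per replicate: [intercept, slope for x1, slope for x2].
nBoot = 500;
bootCoef = bootstrp(nBoot, @(Xb, yb) glmfit(Xb, yb, 'binomial')', X, y);

% Percentile intervals and the fraction of replicates with a positive sign.
ci = prctile(bootCoef, [2.5 97.5]);   % 2-by-3: lower and upper bounds
fracPositive = mean(bootCoef > 0);    % 1-by-3
disp(ci);
disp(fracPositive);
```

Resampling predictors and response row by row keeps each observation intact; resampling the response vector on its own would break the link between predictors and response and could by itself produce sign changes.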
It's important to note that no single method guarantees the "best" choice of predictors or model. It's recommended to consider the specific characteristics of your data, the goals of your analysis, and the underlying assumptions of the GLM to make an informed decision.

Release

R2020a
