It is said that ResNet models require less training time, but when I used the resnetLayers function in MATLAB to create a residual network, why does it take more time?

It is said that the ResNet model requires less training time because it eliminates the vanishing gradient problem, but when I used the resnetLayers function in MATLAB to create a residual network and trained it, it took more time than a CNN-LSTM model. Why is that?

Answers (1)

Hari on 15 Sep 2023
Hi Debojit,
I understand that you have observed the “ResNet” model taking more time to train than the “CNN-LSTM” model, contrary to the expectation that “ResNet” should train faster because it addresses the vanishing gradient problem.
The “ResNet” architecture mitigates the vanishing gradient problem through its skip connections, which makes very deep networks easier to optimize.
However, easier optimization does not mean cheaper computation: each training iteration still costs whatever the network's size and the data demand. The actual training time of a model is influenced by many factors, including the specific architecture, dataset, hyperparameters, and implementation details, so the “ResNet” architecture does not guarantee faster training than other models such as “CNN-LSTM” in all scenarios.
Here are a few reasons why you might observe longer training times with the “ResNet” model compared to the “CNN-LSTM” model in your specific case:
  1. Model complexity: “ResNet” models can have a larger number of parameters than “CNN-LSTM” models, especially deeper variants such as ResNet-50 or ResNet-101. This increased complexity requires more computation per iteration and therefore more training time (see the sketch after this list for one way to control depth and width with resnetLayers).
  2. Dataset characteristics: The characteristics of your dataset, such as size, complexity, and class imbalance, can affect training time. If your dataset is particularly large or contains complex patterns, it may require more time to train regardless of the model architecture.
  3. Hyperparameters: The choice of hyperparameters, such as learning rate, batch size, and regularization techniques, can impact training time. Suboptimal settings may slow convergence or require more iterations to reach good performance (a trainingOptions sketch follows the documentation links below).
  4. Implementation details: The efficiency of the implementation, including the software framework and hardware used, can affect training time. Different frameworks or hardware configurations may have varying levels of optimization, which can influence the overall training speed.
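As a minimal sketch of point 1, and assuming a hypothetical input size and class count (replace both with your own), resnetLayers lets you shrink the network by reducing the number of residual stacks and the filters per stack, which lowers the parameter count and usually the time per epoch:

  inputSize  = [224 224 3];   % assumed input dimensions; use your own
  numClasses = 10;            % assumed number of classes; use your own

  % Two stacks of two residual blocks with 16 and 32 filters,
  % instead of the deeper defaults, for a much smaller network.
  lgraph = resnetLayers(inputSize, numClasses, ...
      "StackDepth", [2 2], ...
      "NumFilters", [16 32]);

  analyzeNetwork(lgraph)   % inspect layer count and total learnables

analyzeNetwork reports the total number of learnable parameters, which you can compare against your “CNN-LSTM” model to see how much of the time difference is explained by model size.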
Refer to the documentation of “resnetLayers” and the example “Sequence Classification Using CNN-LSTM Network” for more information.
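As a minimal sketch of points 3 and 4, with illustrative values only, these are the trainingOptions settings that most often change wall-clock training time; keeping them identical for both networks makes the comparison fair:

  % Illustrative hyperparameters; tune these for your own dataset.
  options = trainingOptions("adam", ...
      "InitialLearnRate", 1e-3, ...
      "MiniBatchSize", 128, ...           % larger batches mean fewer iterations per epoch
      "MaxEpochs", 10, ...
      "ExecutionEnvironment", "auto", ... % uses a GPU automatically if one is available
      "Plots", "training-progress");

  % net = trainNetwork(XTrain, YTrain, lgraph, options);  % XTrain/YTrain are your data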
I hope this helps.
Thanks,
Hari.
