Training Loss is NaN, Deep Learning
FERNANDO CALVO RODRIGUEZ
on 22 Feb 2023
Hey everyone!
I am building a neural network in MATLAB, and my dataset contains a lot of zeros (these were NaN values that I had to convert to zeros to be able to run the program). When I train the network I get the message "Training finished: Training loss is NaN", and I don't know why or how to solve it. If anyone has any idea, it would be appreciated.
Thank you very much!
Accepted Answer
Shubh Dhyani
on 27 Feb 2023
Hi Fernando,
I understand that you are trying to train a neural network and your dataset has a lot of NaN values. So you are converting the NaN values to zero to train the model. As a result of this, your training loss is NaN too.
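The issue is likely related to how the zero-filled entries interact with the rest of your pipeline. Zeros are valid inputs on their own, but a dataset dominated by them can trigger numerical problems. For example, a feature that is zero in every observation has zero variance, so standardizing it divides by zero, and any step that takes a logarithm or a ratio of the data can produce Inf or NaN. Once a single NaN or Inf appears in the forward pass, it propagates through every later layer and the loss becomes NaN. It is also worth confirming that no NaN or Inf values remain anywhere in the predictors or the targets after your conversion.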
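Before changing the data, it may help to check where a NaN could come from. The snippet below is a minimal sketch on made-up data (the variables `X` and `Y` are hypothetical stand-ins for your predictors and targets): it looks for leftover non-finite values and for constant columns, and shows how standardizing a constant column by hand produces NaN.

```matlab
% Hypothetical data: rows are observations, columns are features.
X = [0 1 2; 0 3 4; 0 5 6];     % first column is all zeros (constant)
Y = [1; NaN; 3];               % a NaN that was missed in the targets

% Check for NaN or Inf left in the predictors and the targets.
hasBadX = any(~isfinite(X(:)));
hasBadY = any(~isfinite(Y(:)));

% Find columns with zero standard deviation (constant features).
constCols = find(std(X, 0, 1) == 0);

% Standardizing a constant column by hand divides 0 by 0, which is NaN.
z = (X(:, 1) - mean(X(:, 1))) ./ std(X(:, 1));

fprintf('Non-finite in X: %d, in Y: %d\n', hasBadX, hasBadY);
fprintf('Constant columns: %s\n', mat2str(constCols));
fprintf('NaN after manual z-score: %d\n', all(isnan(z)));
```

If either check reports a problem, fixing that column or those rows is usually a better first step than altering every zero in the dataset.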
One possible solution is to preprocess your data to remove or modify the zero values. Here are a few strategies you could try:
1. Remove the zero values: If the zero values in your dataset are not critical to the analysis, you could consider removing them from the dataset. However, this could lead to loss of important information, so it should be done with caution.
2. Add noise to the zero values: You could add a small amount of noise to the zero values in the dataset so that those entries are no longer all identical, which may improve the stability of the gradients during training. One way to do this is to sample from a zero-mean Gaussian distribution with a small standard deviation.
3. Replace the zero values with a small constant: Instead of removing the zero values, you could replace them with a small constant value, such as 1e-6. This can help prevent the gradients from becoming too small and improve the stability of the training.
4. Use a different activation function: If you are using an activation function such as ReLU, which can become inactive (i.e., output zero) for negative inputs, this can exacerbate the issue of numerical instability. You could try using a different activation function, such as Leaky ReLU, which does not completely block negative inputs.
5. Use batch normalization: Batch normalization can help stabilize the gradients during training by normalizing the inputs to each layer. This can be especially helpful when dealing with large amounts of zero values.
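The strategies above can be sketched roughly as follows. This is only an illustration on made-up data: the matrix `X`, the noise scale `sigma`, the hidden layer size, and the choice of a regression output are all assumptions, and the layer array at the end requires Deep Learning Toolbox.

```matlab
rng(0);                                % reproducible noise for this illustration

% Hypothetical data: zeros mark entries that were originally NaN.
X = [0 2 3; 4 0 6; 7 8 0; 1 2 3];
isZero = (X == 0);

% Strategy 1: drop every observation (row) that contains a zero.
Xdropped = X(~any(isZero, 2), :);

% Strategy 2: add small zero-mean Gaussian noise to the zero entries only.
sigma = 1e-3;                          % assumed noise scale; tune for your data
Xnoisy = X;
Xnoisy(isZero) = sigma * randn(nnz(isZero), 1);

% Strategy 3: replace the zero entries with a small constant.
Xconst = X;
Xconst(isZero) = 1e-6;

% Strategies 4 and 5: Leaky ReLU plus batch normalization in the network.
% Assumes a regression task on tabular features.
numFeatures = size(X, 2);
layers = [
    featureInputLayer(numFeatures)
    fullyConnectedLayer(16)            % hidden size is an arbitrary choice
    batchNormalizationLayer
    leakyReluLayer(0.01)               % slope for negative inputs
    fullyConnectedLayer(1)
    regressionLayer];
```

Note that strategy 1 keeps only one of the four rows in this toy example, which shows how quickly row removal can discard data when missing values are spread across many observations.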
It's worth noting that converting NaN values to zeros is not always a good strategy, as it can lead to confusion between true zero values and missing values. If possible, it's better to keep the NaN values and handle them explicitly during training.
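One way to handle missing values explicitly, shown here as a minimal sketch on made-up data, is to keep the original NaN entries, impute each one with its column mean, and append an indicator column per feature so the network can tell imputed entries from observed ones. The variable names are hypothetical, and mean imputation is just one possible choice.

```matlab
% Hypothetical data with the original NaN values kept in place.
X = [1 NaN 3; 4 5 NaN; 7 8 9];

% Record where values are missing before filling anything in.
missingMask = isnan(X);

% Column means computed from the observed entries only.
% (A column that is entirely NaN would still give NaN here.)
colMeans = mean(X, 1, 'omitnan');

% Replace each missing entry with the mean of its column.
Ximputed = X;
for j = 1:size(X, 2)
    Ximputed(missingMask(:, j), j) = colMeans(j);
end

% Append the mask as extra features so true values and filled values differ.
Xaug = [Ximputed, double(missingMask)];
```

With this layout a true zero stays a zero with indicator 0, while a missing value becomes the column mean with indicator 1, which avoids the ambiguity described above.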