Custom loss function (based on error multiplication rather than sum) in classification neural network

Hi everyone,
First, thank you! This is a fantastic community from which I’m learning so much. This is my first question (hopefully, I’ll be able to contribute answers in the future!).
I have a system consisting of 10 elements, where each element can exist in one of four states (or classes). This means the system has 4^10 possible states. For each element, I have 61 features that can be used to predict its state. I’ve experimented with different neural networks (feedforward networks have worked well so far), mainly focusing on predicting the labels of individual elements.
However, I’ve encountered some challenges:
  1. The classes are naturally imbalanced.
  2. The problem is non-deterministic, meaning two identical feature vectors can correspond to different labels.
I’ve been addressing these issues with relative success by applying techniques such as downsampling, oversampling, data augmentation, and soft labels (the latter has been the most effective).
Now, I want to predict the probability of the entire system being in each of its 4^10 states. One issue I’ve noticed is that a misclassification error of 0.05 has minimal impact when the predicted probability is close to random (e.g., 0.25), but a significant impact when probabilities are close to 1 or 0.
What I’d like to do next is implement a loss function that considers the entire system rather than individual elements, while still being based on predictions for each element. My idea is to:
  1. Take batches of 10 observations (corresponding to the 10 elements of the system).
  2. Compute the probability of each element belonging to each of the 4 classes.
  3. Calculate the probability of the system being in each of its 4^10 possible states based on these predictions.
  4. Sort these probabilities and use the known labels to find the index of the correct state.
  5. Minimize this index.
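For concreteness, steps 2–5 could be sketched as below. All variable names are hypothetical, and note that the 4^10-by-10 state matrix alone takes roughly 80 MB as doubles:

```matlab
% Sketch of steps 2-5 (hypothetical names). P is a 10-by-4 matrix of
% per-element class probabilities for one system observation; trueLabels
% is a 1-by-10 vector of class indices (1..4).
P = rand(10,4);  P = P ./ sum(P,2);        % placeholder predictions
trueLabels = randi(4, 1, 10);              % placeholder ground truth

states = dec2base(0:4^10-1, 4) - '0' + 1;  % 4^10-by-10: every joint state as class indices
rows   = repmat(1:10, 4^10, 1);
idx    = sub2ind(size(P), rows, states);
pState = prod(P(idx), 2);                  % step 3: probability of each joint state

[~, order]  = sort(pState, 'descend');     % step 4: sort
rankOfTrue  = find(all(states(order,:) == trueLabels, 2));  % index of the correct state
% step 5 would then try to minimize rankOfTrue - 1
```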
Does this approach make sense? Is it feasible? And if so, how could it be implemented?
Many thanks!
David

2 Comments

I have a system consisting of 10 elements, where each element can exist in one of four states (or classes). This means the system has 410410 possible states.
4^10
ans = 1048576
You would appear to have over 1 million states, not four hundred thousand.
Absolutely, I tried to copy-paste 4^10 from elsewhere. Sorry for the unchecked mistake.


 Accepted Answer

Using trainnet you can provide any loss function you wish. However, a multiplicative loss function sounds like a doubtful idea. Multiplications of many terms can quickly underflow or overflow, which is why people normally try to optimize loglikelihoods instead of likelihoods.
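The numerical point is easy to demonstrate (illustrative numbers only; with just 10 factors the underflow is milder, but the argument for working in log space still applies):

```matlab
% Illustration of why products of probabilities are avoided:
p = 1e-4 * ones(1,100);   % 100 modest probabilities
prod(p)                   % 1e-400 underflows to 0 in double precision
sum(log(p))               % about -921.03: well within range, same information
```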

3 Comments

Well, technically it would only multiply 10 terms (1,000,000 times) for each mini-batch and find the index with the right label. The loss would be abs(1-index).
It may sound like too much computing, right? Would it be computationally feasible if I aimed for only 5 terms?
It may sound like too much computing, right?
There's no way to tell. You haven't given us any way of estimating the computation time to create your 1e6x10 probability matrix. Certainly the loss computation itself doesn't look very arduous, once that matrix has been calculated.
pmatrix=rand(1e6,10);
tic; [~,index]=min(prod(pmatrix,2)); loss=index-1; toc
Elapsed time is 0.003210 seconds.
However...
The loss would be abs(1-index)
...that is not a differentiable (or even a continuous) loss function. It's piecewise constant. None of the standard gradient-based training algorithms are going to be able to handle it.
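Since the sorted-index loss is piecewise constant, a common differentiable surrogate with the same qualitative effect (pushing the true state toward rank 1) is the negative log-probability of the true joint state. Because the joint probability is a product over elements, its negative log decomposes into a sum of per-element cross-entropies. A sketch, with assumed array layouts — check how trainnet actually passes predictions and targets for your network:

```matlab
% Smooth surrogate: minimize -log p(true joint state). Assumes Y is the
% network's softmax output and T a matching one-hot target, as trainnet
% hands to a custom loss function.
lossFcn = @(Y,T) -sum(T .* log(max(Y, eps)), 'all');
% equivalent, up to normalization, to the built-in:
% lossFcn = @(Y,T) crossentropy(Y,T);
% net = trainnet(features, targets, net, lossFcn, options);  % hypothetical call
```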



Release: R2024b
Asked: 21 Jan 2025
Commented: 22 Jan 2025
