A fully connected neural network with many options for customisation.
modelNN = learnNN(X, y);
p = predictNN(X_valid, modelNN);
One can use an arbitrary number of hidden layers, different activation functions (currently tanh or sigm), a custom regularisation parameter, validation sets, and more. The code does not use any MATLAB toolboxes, so it works even if you do not have the Statistics and Machine Learning Toolbox or are running an older version of MATLAB. Minimisation uses the conjugate gradient algorithm borrowed from Andrew Ng's machine learning course. See the GitHub repository and the comments in the code for more documentation.
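As a rough sketch of how the customisation options might be passed (the field names below are placeholders for illustration, not the actual interface; check learnNN.m for the real parameter names):

```matlab
% Hypothetical option names -- see learnNN.m for the supported parameters.
nn_params.hidden_layers = [50 30];   % two hidden layers
nn_params.activation    = 'tanh';    % or 'sigm'
nn_params.lambda        = 1e-3;      % regularisation strength

modelNN = learnNN(X, y, nn_params);  % train
p = predictNN(X_valid, modelNN);     % predict class labels
```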
Vahe Tshitoyan (2020). Simple Neural Network (https://github.com/vtshitoyan/simpleNN), GitHub. Retrieved .
You saved my final assignment of this semester! HERO!!!
repelem.m is not defined
Jesse, thanks for your feedback, I am glad everything works well in general. A few comments that might help you use this code further:
1. simpleNN is designed only for classification at this point, so it expects the y values to be integer labels, where each number corresponds to one class. If you have non-integer labels but still want to perform classification, I suggest you identify all unique values (e.g. 1.1, 2.3, ..., 68.9), assign an integer label to each (e.g. 1, 2, ..., 69), and map back to the original values after the classification. If you supply non-integer values, the code will most likely crash, and in any case the results will not be trustworthy.
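The remapping described above is a one-liner with MATLAB's unique function; this is a general sketch, not code from the repository:

```matlab
% y contains non-integer class values, e.g. [1.1; 2.3; 1.1; 68.9]
[classValues, ~, yInt] = unique(y);   % yInt holds integer labels 1..numClasses
                                      % such that y == classValues(yInt)
% train with the integer labels:
%   modelNN = learnNN(X, yInt);
% and map predictions p back to the original values:
%   yPred = classValues(p);
```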
2. sub2ind converts a (row, column) subscript in a matrix to a single linear index that refers to the same element. E.g. if my matrix A is 2x2, I can query element (2,2) either as A(2,2) or A(4); both return the same value, so sub2ind([2 2], 2, 2) returns 4. The purpose of sub2ind in nnCostFunction.m is to convert the integer labels (1, 2, 3, ...) into a matrix of 0s and 1s. Line 48 of nnCostFunction.m creates a matrix of 0s, and I then use sub2ind to place 1s at the column positions corresponding to the class label of each row (training example). In a simple case with 2 training examples (m=2), one labelled 1 and the other labelled 2 (num_labels=2), this works as follows: first you get a 2x2 matrix of 0s, then sub2ind([2 2], 1:2, [1 2]) returns [1, 4] (sub2ind([2 2], 1, 1) -> 1, sub2ind([2 2], 2, 2) -> 4), and after the assignment y_nn = [1 0; 0 1]. I hope this is helpful.
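The two-example case above can be reproduced directly at the MATLAB prompt (a standalone sketch, not the repository code):

```matlab
m = 2; num_labels = 2;
y = [1; 2];                           % integer class labels
y_nn = zeros(m, num_labels);          % matrix of 0s
idx = sub2ind(size(y_nn), 1:m, y');   % linear indices of the (row, label) pairs
y_nn(idx) = 1;                        % y_nn is now [1 0; 0 1]
```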
Thank you for this! Generally, everything is working very well. However, I am having a couple of issues with the call to sub2ind in nnCostFunction.m.
The first issue is that the call to sub2ind complains about values that are not positive integers. After some quick debugging, the reason appears to be that my y data (passed to the function as y') is not strictly integer-valued.
The second issue is that my y data is on a scale of roughly 0-70, but there are often fewer than 70 unique y values in my training set (I have ~1500 training points). When this happens, sub2ind throws the error "MATLAB:sub2ind:IndexOutOfRange".
After quickly reviewing the sub2ind function and its help page, it seems odd to me that y' is one of the inputs to the call. For a function that is supposed to return index values for an array of a given size, it is surprising that the contents of that array (y' in this case) would alter its output. Perhaps I am misunderstanding the purpose of sub2ind (I will readily admit I had never used it before this code).
In any case, I was hoping you might shed some light on what is going on here and on the specific use of sub2ind in this case. Thanks in advance!
Victor, thanks for your feedback, I am glad the code is useful.
- Thanks for noting it, fixed it.
- You must have been using an older version, I had fixed this since. I recommend you pull the latest code from GitHub directly.
When you are done with the cross-validation implementation, feel free to send a pull request on GitHub. I am happy to add you as a contributor after reviewing the code.
Regarding the cost functions, I use the cross-entropy loss, which is the standard cost function for classification problems; you would use a quadratic loss for regression problems. The cost function is the same for tanh and sigmoid, but tanh returns a value in the (-1, 1) range, which has to be brought to the (0, 1) range for consistency. Hence the slight difference.
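The rescaling and the shared cost can be sketched as follows (an illustrative snippet, not the repository code):

```matlab
z = linspace(-3, 3, 7);
a_sigm = 1 ./ (1 + exp(-z));    % sigmoid output, already in (0, 1)
a_tanh = (tanh(z) + 1) / 2;     % tanh output brought from (-1, 1) to (0, 1)

% With either activation a in (0, 1), the per-unit cross-entropy cost
% for a target t in {0, 1} is the same expression:
%   J = -(t .* log(a) + (1 - t) .* log(1 - a));
```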
I do not understand what you mean by a linear error term. The gradients are different for tanh and sigmoid, and are defined in lib/tanhGradient.m and lib/sigmoidGradient.m.
I hope this is helpful.
Thank you for this work!
A few tips and one question:
- The description in learnNN states the dimensions of the inputs the wrong way (transposed).
- The training confusion matrix is assigned the validation data: modelNN.confusion_train = confusion_valid;
I implemented k-fold cross-validation on this network and early stopping criteria based on increasing validation errors.
Will post update when finished.
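A minimal sketch of such a k-fold split around this network's interface (illustrative only, assuming learnNN/predictNN are called as in the description):

```matlab
k = 5;
m = size(X, 1);
fold = mod(randperm(m), k) + 1;        % random fold label 1..k per example
for i = 1:k
    valIdx   = (fold == i);            % hold out fold i for validation
    trainIdx = ~valIdx;
    modelNN  = learnNN(X(trainIdx, :), y(trainIdx));
    p        = predictNN(X(valIdx, :), modelNN);
    % accumulate the validation error for fold i here,
    % e.g. mean(p ~= y(valIdx))
end
```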
Question: Why is the cost function on the output layer dependent on the activation function?
Why not simply a quadratic one? It seems that for the gradient calculation of the output layer you use a linear error term, which seems odd since it is not positive definite. Training seems fine though, so I am likely misreading the code.
Could you elaborate a bit on this? Thanks again!
Automatically including the "lib" folder.
A fix to the confusion matrix
Basic correction to the summary
Added a tag
Updated the summary.
Changed the description
Updated the title and the picture