Why?? ... that is part of an effective design procedure!
1. Initialize the rand RNG to a specified state.
2. Use an outer loop over the candidate numbers of hidden nodes, H.
3. Use an inner loop that, for each H, creates a net with a new set of random initial weights on every pass; then trains it, evaluates it, and stores the result (e.g., NMSE (normalized MSE) or R^2 = 1 - NMSE) in a 2-D matrix.
4. Search the stored results for the smallest net with acceptable performance (e.g., R^2 >= 0.99).
5. Alternatively, use a decreasing loop of H values and only store the index or weights of the current best design.
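Steps 1 to 4 can be sketched roughly as follows. This is only an illustration in plain Python, not toolbox code: the toy data, the tiny 1-H-1 tanh net, the gradient-descent trainer, and every function name (make_data, init_net, train, r_squared, run_designs, smallest_acceptable) are assumptions made up for the example. Rows of the result matrix index the trial and columns index the candidate H.

```python
import math
import random

def make_data(n=21):
    """Toy 1-D regression set (hypothetical stand-in for real data)."""
    xs = [-1 + 2 * k / (n - 1) for k in range(n)]
    ts = [math.sin(math.pi * x) for x in xs]
    return xs, ts

def init_net(H, rng):
    """Draw 3*H + 1 random initial weights for a 1-H-1 tanh net."""
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(H)],
        "b1": [rng.uniform(-1, 1) for _ in range(H)],
        "w2": [rng.uniform(-1, 1) for _ in range(H)],
        "b2": rng.uniform(-1, 1),
    }

def predict(net, x):
    hidden = (w2 * math.tanh(w1 * x + b1)
              for w1, b1, w2 in zip(net["w1"], net["b1"], net["w2"]))
    return net["b2"] + sum(hidden)

def train(net, xs, ts, epochs=200, lr=0.1):
    """Plain batch gradient descent on MSE (a stand-in for a real trainer)."""
    H, n = len(net["w1"]), len(xs)
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
        for x, t in zip(xs, ts):
            a = [math.tanh(net["w1"][h] * x + net["b1"][h]) for h in range(H)]
            err = net["b2"] + sum(net["w2"][h] * a[h] for h in range(H)) - t
            g_b2 += err
            for h in range(H):
                d = err * net["w2"][h] * (1 - a[h] ** 2)
                g_w2[h] += err * a[h]
                g_b1[h] += d
                g_w1[h] += d * x
        for h in range(H):
            net["w1"][h] -= lr * g_w1[h] / n
            net["b1"][h] -= lr * g_b1[h] / n
            net["w2"][h] -= lr * g_w2[h] / n
        net["b2"] -= lr * g_b2 / n

def r_squared(net, xs, ts):
    """R^2 = 1 - NMSE, with NMSE = SSE / (sum of squares about the target mean)."""
    mean_t = sum(ts) / len(ts)
    sse = sum((predict(net, x) - t) ** 2 for x, t in zip(xs, ts))
    sst = sum((t - mean_t) ** 2 for t in ts)
    return 1 - sse / sst

def run_designs(H_values, n_trials, seed=0):
    rng = random.Random(seed)             # step 1: fix the initial RNG state
    xs, ts = make_data()
    results = [[None] * len(H_values) for _ in range(n_trials)]
    for j, H in enumerate(H_values):      # step 2: outer loop over H
        for i in range(n_trials):         # step 3: new random weights each pass
            net = init_net(H, rng)
            train(net, xs, ts)
            results[i][j] = r_squared(net, xs, ts)
    return results

def smallest_acceptable(results, H_values, threshold=0.99):
    """Step 4: (i, j) of the best trial for the smallest acceptable H, else None."""
    for j in sorted(range(len(H_values)), key=lambda c: H_values[c]):
        column = [row[j] for row in results]
        i = max(range(len(column)), key=lambda r: column[r])
        if column[i] >= threshold:
            return i, j
    return None

H_values = [1, 2, 3, 4]
results = run_designs(H_values, n_trials=3, seed=0)
print(smallest_acceptable(results, H_values, threshold=0.99))
```

The grid here is deliberately small so it runs quickly. If the search returns None, no design met the threshold and it is worth trying more trials, more epochs, or larger H.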
Typically, I create 100 designs (10 designs for each of 10 candidate values of H). Depending on results, I might continue with a finer search for an optimal value of H.
Bottom line: There are jillions of local minima in weight space. Only by investigating a large number of designs with random weight initializations can you be confident of the results.
If the 2-D indices of the "best design" are (i,j) = (I,J), the corresponding 1-D index of the columnized matrix (i.e., (:)) indicates how many RNG states beyond the specified initial state yield the desired set of initial weights.
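As a small worked example of that index arithmetic, here is a sketch that assumes the rows of the result matrix index the trial (i) and the columns index the candidate H (j), so that column-stacking follows the order in which the designs were created. The indices are 1-based to mirror the (:) convention; the function names are made up for the illustration.

```python
def column_major_index(i, j, n_rows):
    """1-based linear index of element (i, j) in a column-stacked matrix."""
    return i + (j - 1) * n_rows

def from_column_major_index(k, n_rows):
    """Inverse mapping: recover the 1-based (i, j) from linear index k."""
    j0, i0 = divmod(k - 1, n_rows)
    return i0 + 1, j0 + 1

# 10 trials per H; best design found at trial 3 of the 4th candidate H
k = column_major_index(3, 4, n_rows=10)        # 3 + (4 - 1) * 10
print(k)                                       # 33
print(from_column_major_index(k, n_rows=10))   # (3, 4)
```

Under that assumption, design number k is preceded by k - 1 earlier designs in the sequence.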
Again, specify the initial RNG state before any designs are made. ANY design can then be replicated by keeping track of its position in the design sequence.
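A minimal sketch of that replication idea, in plain Python with made-up names (init_weights, design_sequence, replicate): reseed with the same initial state, replay the design sequence, and stop at position k. It assumes the only random numbers consumed are the initial weight draws. If training also draws random numbers (for example, a random data split), the replay has to include those steps as well; saving the generator state before each design (rng.getstate() in Python) is an alternative, though that is my substitution rather than the procedure described above.

```python
import random

def init_weights(H, rng):
    """Initial weights for a hypothetical 1-H-1 net: 3*H + 1 draws."""
    return [rng.uniform(-1, 1) for _ in range(3 * H + 1)]

def design_sequence(H_values, n_trials, seed):
    """Initial weights of every design, in creation order."""
    rng = random.Random(seed)
    designs = []
    for H in H_values:
        for _ in range(n_trials):
            designs.append(init_weights(H, rng))
    return designs

def replicate(k, H_values, n_trials, seed):
    """Regenerate the initial weights of the k-th design (1-based) by replay."""
    rng = random.Random(seed)
    count = 0
    for H in H_values:
        for _ in range(n_trials):
            weights = init_weights(H, rng)
            count += 1
            if count == k:
                return weights
    raise IndexError("k is beyond the end of the design sequence")

designs = design_sequence([1, 2, 3], n_trials=4, seed=42)
print(replicate(7, [1, 2, 3], n_trials=4, seed=42) == designs[6])  # True
```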
Hope this helps.
Greg