Gradient descent giving me NaN
Hello,
I am running linear regression on real estate data. Gradient descent is giving me NaN values for theta, but when I used the normal equation it gave me finite values. Can you please advise why my gradient descent is not working?
Thanks for your help.
This is my code:
clear
clc
data = importfile('realestate.csv');
%% Setting data
X = data(:, 2:7);
y = data(:, 8);
m = height(y);
% Feature normalization
X = table2array(X);
y = table2array(y);
[X_norm, mu, sigma] = featureNormalize(X);
% Add intercept term to X
X = [ones(m, 1) X];
%% Setting data for gradient descent
theta = rand(7, 1);
J = computeCostMulti(X, y, theta);
% Choose some alpha and number of iterations
alpha = 0.5;
num_iters = 1500;
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);
% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');
% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');
%% Using the normal equation
clear
clc
%% Setting data
data = importfile('realestate.csv');
X = data(:, 2:7);
y = data(:, 8);
m = height(y);
X = table2array(X);
y = table2array(y);
% Add intercept term to X
X = [ones(m, 1) X];
% Calculate the parameters from the normal equation
theta = normalEqn(X, y);
% Display normal equation's result
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');
The functions are as follows:
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
% J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
h = X * theta;
J = 1/(2*m) * sum((h - y).^2);
% =========================================================================
end
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
% theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCostMulti) and gradient here.
%
h = X * theta;
theta = theta - (alpha/m) * ( (h - y)' * X)';
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCostMulti(X, y, theta);
end
end
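The featureNormalize and normalEqn functions are not included in the post. As minimal sketches, assuming the standard implementations from the course exercise (names and behavior inferred, not confirmed by the original):
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Scale each column of X to zero mean and unit standard deviation
mu = mean(X);                % 1 x n row vector of column means
sigma = std(X);              % 1 x n row vector of column standard deviations
X_norm = (X - mu) ./ sigma;  % implicit expansion (R2016b and later)
end
function theta = normalEqn(X, y)
%NORMALEQN Closed-form least-squares solution for linear regression
theta = pinv(X' * X) * X' * y;  % pinv stays defined even when X'*X is near singular
end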
Answers (2)
Robert Misior
on 2 Sep 2020
Hi,
Your theta_change calculation, (alpha/m) * ( (h - y)' * X)', is not correct.
Look at the notes provided with this problem: (https://www.coursera.org/learn/machine-learning/resources/O756o)
The change in theta (the "gradient") is the sum, over the training examples, of the product of X and the error vector, scaled by alpha and 1/m. Since X is (m x n) and the error vector is (m x 1), and the result you want must be the same size as theta, which is (n x 1), you need to transpose X before you can multiply it by the error vector.
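In code, a minimal sketch of the update described above (using the same variable names as the question):
h = X * theta;                             % m x 1 vector of predictions
theta = theta - (alpha/m) * X' * (h - y);  % X' is (n x m), (h - y) is (m x 1), result is (n x 1)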