Linear Regression gives NaN

John Ziggs on 20 May 2021
Commented: Star Strider on 20 May 2021
Hi,
I am trying to fit and plot a linear regression for the attached dataset. I followed examples online, but the result is NaN. My code is below and my data is attached.
clc
close all
clear all
%% Map
filename = 'TV_NYMA';
[num,txt,vt] = xlsread(filename);   % 'txt' avoids shadowing the built-in string function
Year = num(:,1);
County = txt(:,2);
VOC = num(:,3);
NOx = num(:,4);
CO = num(:,5);
PM25 = num(:,6);
Lat = num(:,7);
Lon = num(:,8);
%%
standard_NOx = normalize(NOx);
standard_VOC = normalize(VOC);
figure
scatter(standard_NOx,standard_VOC)
title('NOx vs VOC')
xlabel('NOx emissions standardized')
ylabel('VOC emissions standardized')
%%
X = [ones(length(standard_NOx),1) standard_NOx]
b = X\standard_VOC
regression_line = [ones(size(standard_NOx,1),1) standard_NOx]*b
What am I doing wrong?
Thanks.
  2 Comments
dpb on 20 May 2021
I didn't download the data, but why not
b=polyfit(standard_NOx,standard_VOC,1);
yhat=polyval(b,[min(standard_NOx) max(standard_NOx)]);
Or, if you have the Statistics or Curve Fitting Toolbox, there are higher-level routines as well.
Just out of curiosity, did you look at what was returned for the coefficient vector? Was it already NaN there? If so, there's probably a NaN somewhere in the data.
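The check suggested above can be sketched as follows. This is only an illustration on made-up vectors (the values and the names `ok`, `nNaN_NOx`, and `nNaN_VOC` are assumptions, not taken from the attached file): it counts the NaN entries in each variable, keeps the rows where both values are present, and then fits with `polyfit`.

```matlab
% Synthetic stand-ins for the NOx and VOC columns (hypothetical values).
NOx = [54.74; 7.03; NaN; 10.94; 28.07];
VOC = [12.78; 0.33; NaN; 0.33; 0.54];

% Count missing entries in each variable before fitting.
nNaN_NOx = nnz(isnan(NOx));
nNaN_VOC = nnz(isnan(VOC));
fprintf('NaN count: NOx = %d, VOC = %d\n', nNaN_NOx, nNaN_VOC);

% Keep only the rows where both variables are present, then fit a line.
ok = ~isnan(NOx) & ~isnan(VOC);
b = polyfit(NOx(ok), VOC(ok), 1);      % b(1) = slope, b(2) = intercept
yhat = polyval(b, [min(NOx(ok)) max(NOx(ok))]);
```

If either count is nonzero, the missing rows are a likely source of NaN coefficients from `X\y`.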
John Ziggs on 20 May 2021
There indeed was an empty row in the dataset which caused the NaN. Thank you for your help! I appreciate it.


Accepted Answer

Star Strider on 20 May 2021
There are 8 NaN values in those variables.
Eliminate them and it works:
T1 = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/625018/TV_NYMA.xlsx','VariableNamingRule','preserve')
T1 = 1803×8 table
    Year    County         VOC      Nox        CO    PM2.5       Lat      Long
    ____    _________    _____    _____    ______    _____    ______    ______
    2010    {'BRONX'}    12.78    54.74    151.49     7.07    40.826    -73.92
    2013    {'BRONX'}     0.33     7.03      5.01     0.44    40.826    -73.92
    2013    {'BRONX'}      0.4      1.5      0.71     0.58    40.826    -73.92
    2014    {'BRONX'}     0.33    10.94      5.37     0.45    40.826    -73.92
    2015    {'BRONX'}     0.54    28.07      3.84     0.25    40.826    -73.92
    2014    {'BRONX'}        0        0         0        0    40.826    -73.92
    2014    {'BRONX'}     0.38      9.5      7.41     0.66    40.826    -73.92
    2011    {'BRONX'}     0.11     7.88      1.97        0    40.826    -73.92
    2013    {'BRONX'}     0.12    27.62      2.98     1.12    40.826    -73.92
    2012    {'BRONX'}      6.4    67.55     40.88     1.68    40.826    -73.92
    2014    {'BRONX'}     0.23    36.07      5.02     0.34    40.826    -73.92
    2015    {'BRONX'}     0.35     8.47      6.16     0.52    40.826    -73.92
    2016    {'BRONX'}     5.22    65.32     11.74        0    40.826    -73.92
    2011    {'BRONX'}     0.42     1.65      0.69      0.9    40.826    -73.92
    2013    {'BRONX'}     3.47    61.06     35.43     3.37    40.826    -73.92
    2013    {'BRONX'}     0.43     5.71      4.38     0.39    40.826    -73.92
% Nox_Nan = nnz(isnan(T1.Nox))
% VOC_NaN = nnz(isnan(T1.VOC))
T1 = T1(~(isnan(T1.Nox) | isnan(T1.VOC)),:)   % drop rows where either variable is NaN
T1 = 1795×8 table
    Year    County         VOC      Nox        CO    PM2.5       Lat      Long
    ____    _________    _____    _____    ______    _____    ______    ______
    2010    {'BRONX'}    12.78    54.74    151.49     7.07    40.826    -73.92
    2013    {'BRONX'}     0.33     7.03      5.01     0.44    40.826    -73.92
    2013    {'BRONX'}      0.4      1.5      0.71     0.58    40.826    -73.92
    2014    {'BRONX'}     0.33    10.94      5.37     0.45    40.826    -73.92
    2015    {'BRONX'}     0.54    28.07      3.84     0.25    40.826    -73.92
    2014    {'BRONX'}        0        0         0        0    40.826    -73.92
    2014    {'BRONX'}     0.38      9.5      7.41     0.66    40.826    -73.92
    2011    {'BRONX'}     0.11     7.88      1.97        0    40.826    -73.92
    2013    {'BRONX'}     0.12    27.62      2.98     1.12    40.826    -73.92
    2012    {'BRONX'}      6.4    67.55     40.88     1.68    40.826    -73.92
    2014    {'BRONX'}     0.23    36.07      5.02     0.34    40.826    -73.92
    2015    {'BRONX'}     0.35     8.47      6.16     0.52    40.826    -73.92
    2016    {'BRONX'}     5.22    65.32     11.74        0    40.826    -73.92
    2011    {'BRONX'}     0.42     1.65      0.69      0.9    40.826    -73.92
    2013    {'BRONX'}     3.47    61.06     35.43     3.37    40.826    -73.92
    2013    {'BRONX'}     0.43     5.71      4.38     0.39    40.826    -73.92
standard_NOx = normalize(T1.Nox);
standard_VOC = normalize(T1.VOC);
X = [ones(length(standard_NOx),1) standard_NOx];
b = X\standard_VOC
b = 2×1
   -0.0000
    0.5010
regression_line = [ones(size(standard_NOx,1),1) standard_NOx]*b;
figure
plot(standard_NOx, standard_VOC, 'p')
hold on
plot(standard_NOx, regression_line, '-r')
hold off
grid
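As a cross-check on the fit above, here is a minimal sketch on a small made-up table (the values and the names `T`, `zNOx`, `zVOC`, and `R` are assumptions for illustration, not the TV_NYMA data). It uses `rmmissing` as one alternative way to drop the incomplete rows, and it shows that when both variables are z-scored with the default `normalize`, the fitted intercept is zero and the slope equals the Pearson correlation coefficient.

```matlab
% Small synthetic table standing in for TV_NYMA.xlsx (hypothetical values).
Nox = [54.74; 7.03; NaN; 10.94; 28.07; 67.55];
VOC = [12.78; 0.33; NaN; 0.33; 0.54; 6.40];
T = table(Nox, VOC);

% Drop every row that has a missing value in either variable.
T = rmmissing(T, 'DataVariables', {'Nox','VOC'});

% Standardize to z-scores and fit zVOC = b(1) + b(2)*zNOx by least squares.
zNOx = normalize(T.Nox);
zVOC = normalize(T.VOC);
X = [ones(height(T),1) zNOx];
b = X \ zVOC;

% For z-scored data, the slope equals the Pearson correlation coefficient.
R = corrcoef(T.Nox, T.VOC);
fprintf('intercept = %.4f, slope = %.4f, r = %.4f\n', b(1), b(2), R(1,2));
```

By the same reasoning, the intercept of -0.0000 and slope of 0.5010 reported above are what one would expect if the correlation between the two cleaned variables is about 0.50.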
  2 Comments
John Ziggs on 20 May 2021
Thank you for your help! I should've realized the empty row in the data would cause the NaN. Thanks again!
Star Strider on 20 May 2021
As always, my pleasure!
