Correlation between two differently formatted datasets

Hello,
I want to calculate the R^2 correlation between two different datasets.
The first one, A, is 192x288 (lat,lon), and I can visualize the values on a 2D colormap.
The second one, B, is 555x2 (lat,lon). This data came from an Excel file, in column format. The points are randomly spread throughout the globe and do not lie on the same grid cells as A. The data is far too sparse to interpolate.
I am having trouble figuring out how I can possibly find a correlation between these two different data formats. Is there a way to convert B into a map that I can visualize with a colormap like A? Also, how would the resolution affect this calculation?
Any help would be highly appreciated
Thank you,
Melissa

Accepted Answer

Chad Greene on 9 Mar 2015
Melissa,
Without knowing anything about your project, my gut feeling is that it would not be prudent to grid your B dataset, because you would end up interpolating over very long distances between data points. You could use TriScatteredInterp or gridfit, but you would then probably want to mask out any grid boxes that are far away from the B data points.
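If you do decide to grid B anyway, the masking step could look roughly like the sketch below. This is only an illustration under several assumptions: the point data are made up, scatteredInterpolant (the newer replacement for TriScatteredInterp) is used for the gridding, distance is measured crudely in degrees without wrapping across the date line, and the 15 degree cutoff maxDist is an arbitrary value you would have to choose for your own data.

```matlab
% Made-up point data standing in for B (values are purely illustrative)
rng(0)
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B = .1*latB + rand(size(latB));

% Target grid for the gridded version of B
[lonG,latG] = meshgrid(-180:2:180, -90:1:90);

% Linear interpolation of the scattered points; 'none' leaves cells
% outside the convex hull of the data as NaN instead of extrapolating
F = scatteredInterpolant(lonB, latB, B, 'linear', 'none');
Bgrid = F(lonG, latG);

% Distance from every grid cell to its nearest data point, in degrees
% (a crude planar approximation that ignores the date line and the poles)
dmin = inf(size(lonG));
for k = 1:numel(B)
    dmin = min(dmin, hypot(lonG - lonB(k), latG - latB(k)));
end

% Mask grid cells that are farther than an assumed cutoff from any point
maxDist = 15;
Bgrid(dmin > maxDist) = NaN;
```

Bgrid can then be drawn with pcolor(lonG,latG,Bgrid) in the same way as A. How much of the map survives the mask depends entirely on the cutoff.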
You can, however, get a correlation between these data sets. I'm going to make up a gridded dataset A and a point dataset B:
% Some gridded dataset A:
[lonA,latA] = meshgrid(-180:2:180,90:-1:-90);
A = peaks(181)+.1*latA;
% Some measurements B at specific points:
latB = 180*(rand(30,1)-.5);
lonB = 360*(rand(30,1)-.5);
B = .1*latB+rand(size(latB));
% Plot the points atop the gridded dataset:
pcolor(lonA,latA,A)
hold on
plot(lonB,latB,'rp','markersize',15)
shading interp
xlabel('longitude')
ylabel('latitude')
Then get A values at points B by interpolating the A dataset:
A_interp = interp2(lonA,latA,A,lonB,latB);
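One thing to watch with real data: interp2 returns NaN for any query point that falls outside the grid, and a single NaN makes corrcoef return NaN. A minimal sketch of dropping such points first is shown here. The grid and the five points are made up, and one point is deliberately placed off the grid to show the effect.

```matlab
% Made-up gridded dataset
[lonA,latA] = meshgrid(-180:2:180, -90:1:90);
A = peaks(181) + .1*latA;

% Made-up point data; the last point lies outside the grid on purpose
latB = [10; -45; 30; 60; 95];
lonB = [20; 100; -70; 150; 0];
B    = [1.2; -4.0; 3.1; 6.5; 9.9];

% Linear interpolation by default; off-grid query points come back as NaN
A_interp = interp2(lonA, latA, A, lonB, latB);

% Keep only the pairs where both values are available
valid   = ~isnan(A_interp) & ~isnan(B);
A_valid = A_interp(valid);
B_valid = B(valid);

% Equivalent shortcut: let corrcoef skip rows that contain NaN
R = corrcoef(A_interp, B, 'Rows', 'complete');
```

It is worth checking how many points were dropped (sum(~valid)), since a large number usually means the longitude conventions of the two datasets do not match.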
You can then use corrcoef to get a correlation coefficient, which for this fake data came out at about 0.89 (your value will differ because B contains random numbers):
R = corrcoef([A_interp B])
R =
1.0000 0.8936
0.8936 1.0000
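But note that the correlation coefficient is insensitive to the means and scaling of the data: shifting or rescaling either dataset leaves it unchanged, so it says nothing about the slope or offset between A and B. Below I'm going to use polyplot (a File Exchange function, not built into MATLAB) to plot the linear regression: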
figure % open a new figure so the scatter plot is not drawn on top of the map
plot(A_interp,B,'b*')
hold on
polyplot(A_interp,B,'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')
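Since the question asked for R^2, and since polyplot is not part of base MATLAB, here is a rough equivalent using only built-in functions. The vectors x and y are made-up stand-ins for A_interp and B. For a straight-line least-squares fit, R^2 is just the square of the off-diagonal element of the corrcoef output, and the sketch checks that against the residual-based definition.

```matlab
rng(1)                                % fixed seed so the sketch is repeatable
x = linspace(-5, 5, 30)';             % stands in for A_interp
y = 0.8*x + 0.5*randn(size(x));       % stands in for B

% Correlation coefficient and its square
R  = corrcoef(x, y);
R2 = R(1,2)^2;

% Straight-line least-squares fit: p(1) is the slope, p(2) the intercept
p    = polyfit(x, y, 1);
yfit = polyval(p, x);

% R^2 from the residuals; agrees with R(1,2)^2 for a linear fit with intercept
R2_resid = 1 - sum((y - yfit).^2) / sum((y - mean(y)).^2);

% Scatter plot with the fitted line
figure
plot(x, y, 'b*')
hold on
plot(x, yfit, 'k-')
axis tight; box off
xlabel('dataset A')
ylabel('dataset B')
```

The slope and intercept in p carry the scaling and offset information that the correlation coefficient by itself does not.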
