Problem with duplicated points in Matlab.

I got a strange warning in my Matlab window after running the following script:
clear all
clc
close all
data = load('Mala_grav.txt');
Gravity = data(:,3);
X_pos = data(:,1);
Y_pos = data(:,2);
x = min(X_pos):100:max(X_pos); y = min(Y_pos):100:max(Y_pos);
% meshgrid returns 2-D grid coordinates based on the coordinates contained in vectors x and y
[XI,YI] = meshgrid(x,y); % meshgrid transforms vectors x and y into arrays XI YI
% Test with biharmonic interpolation.
ZI = griddata(data(:,1),data(:,2),Gravity,XI,YI,'v4');
figure;
[c,h] = contour(XI,YI,ZI);
clabel(c,h)
xlabel('Longitude');
ylabel('Latitude');
title('Interpolation of sample locations (biharmonic)');
Warning: Duplicate x-y data points detected: using average values for duplicate
points.
> In griddata>mergepoints2D (line 167)
In griddata>gdatav4 (line 245)
In griddata (line 129)
In TP_uppgift3 (line 18)
----------------
What does it mean by duplicate points? Line 18 in my script refers to:
ZI = griddata(data(:,1),data(:,2),Gravity,XI,YI,'v4');
When I execute the code in Matlab, I get the above-mentioned warning message, and after about two minutes a figure window pops up on the screen. The data are plotted correctly in the figure, but I wonder what the warning means.
Can someone please explain this?
  5 Comments
Walter Roberson on 9 Jun 2021
These entries are all duplicates
1618773 7216412 -41.23
1618773 7216412 -41.22
1619200 7228810 -31.29
1619200 7228810 -31.28
1619669 7213038 -45.52
1619669 7213038 -45.51
1619697 7212840 -45.39
1619697 7212840 -45.38
1619773 7214414 -45.17
1619773 7214414 -45.15
1619781 7214615 -45.03
1619781 7214615 -45.03
1619783 7214213 -45.3
1619783 7214213 -45.28
1619787 7214111 -45.29
1619787 7214111 -45.29
1619816 7214710 -44.93
1619816 7214710 -44.92
1619845 7213252 -45.53
1619845 7213252 -45.52
1619898 7213833 -45.34
1619898 7213833 -45.34
1619922 7214877 -44.92
1619922 7214877 -44.91
1619956 7213536 -45.48
1619956 7213536 -45.46
1620126 7234756 -30.65
1620126 7234756 -30.62
1620265 7229959 -31.39
1620265 7229959 -31.37
1620640 7233282 -32.95
1620640 7233282 -32.92
1620693 7232043 -33.49
1620693 7232043 -33.43
1620693 7232043 -33.42
1620906 7231735 -33.34
1620906 7231735 -33.34
1620906 7231735 -33.33
1620911 7231837 -33.51
1620911 7231837 -33.51
1620911 7231837 -33.5
1620911 7231837 -33.5
1620945 7231830 -33.38
1620945 7231830 -33.38
1620953 7230024 -30.76
1620953 7230024 -30.75
1621068 7230760 -31.53
1621068 7230760 -31.5
1626339 7210345 -45.49
1626339 7210345 -45.48
1627064 7210886 -45.95
1627064 7210886 -45.95
1627521 7211487 -46.45
1627521 7211487 -46.45
1627727 7214471 -46.86
1627727 7214471 -46.83
1627735 7214485 -46.97
1627735 7214485 -46.96
1627873 7232923 -28.99
1627873 7232923 -28.98
1627884 7212817 -47.39
1627884 7212817 -47.37
1628118 7232740 -28.85
1628118 7232740 -28.84
1628923 7232217 -28.86
1628923 7232217 -28.84
1628942 7232141 -28.96
1628942 7232141 -28.94
1629013 7232264 -28.66
1629013 7232264 -28.66
1629013 7232264 -28.66
1629046 7232361 -28.5
1629046 7232361 -28.5
1629069 7232460 -28.34
1629069 7232460 -28.34
1629105 7232550 -28.22
1629105 7232550 -28.22
1629127 7233053 -26.68
1629127 7233053 -26.67
1629132 7233148 -25.79
1629132 7233148 -25.79
1629162 7233270 -24.78
1629162 7233270 -24.77
1629182 7232614 -28.24
1629182 7232614 -28.23
1629182 7232614 -28.22
1629182 7232614 -28.2
1629221 7233022 -26.67
1629221 7233022 -26.66
1629258 7232679 -28.17
1629258 7232679 -28.16
1629279 7232946 -27.02
1629279 7232946 -27.01
1629279 7232946 -27.01
1629323 7232757 -27.81
1629323 7232757 -27.8
1629328 7232859 -27.26
1629328 7232859 -27.25
1629821 7229988 -33.22
1629821 7229988 -33.21
1629821 7229988 -33.19
1630300 7224293 -40.52
1630300 7224293 -40.51
Mattias Larsson on 9 Jun 2021
Thanks for clarifying. Now I understand better why Matlab is complaining about duplicate data points.


Accepted Answer

dpb on 8 Jun 2021
You'll find that
numel(unique(data(:,1))) < size(data,1)
numel(unique(data(:,2))) < size(data,1)
for one or both expressions.
You can work around the problem by adding a very small (2*eps(data)) random noise value to each variable; it will not be sufficient to affect the results but will avoid the test/warning for duplicated values in griddata.
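Because the warning concerns repeated x-y pairs rather than repeated values in a single column, a row-wise check may be more direct. The following is a minimal sketch, assuming the same three-column layout (x, y, gravity); the five rows are a small subset of the values quoted in this thread, and the variable names are illustrative only.

```matlab
% Illustrative subset: columns are x, y, gravity (rows 1-2 and 4-5 share x-y)
data = [1618773 7216412 -41.23
        1618773 7216412 -41.22
        1619200 7228810 -31.29
        1620693 7232043 -33.49
        1620693 7232043 -33.43];

% Unique x-y pairs; ic maps each original row to its unique pair
[xy, ~, ic] = unique(data(:,1:2), 'rows');

nDuplicateRows = size(data,1) - size(xy,1);   % rows beyond the first of each pair
counts = accumarray(ic, 1);                   % how often each pair occurs
repeatedPairs = xy(counts > 1, :);            % pairs that occur more than once
```

If nDuplicateRows is greater than zero, griddata will issue the duplicate-points warning for that data.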
  3 Comments
dpb on 8 Jun 2021
Edited: dpb on 8 Jun 2021
Anywhere before passing the data to griddata -- either just in place in the data arrays themselves or, as you say, in the argument list.
Remember to use +rand()*2*eps(), not just 2*eps(), which would still be deterministic and have the duplicates.
Mattias Larsson on 9 Jun 2021
I modified line 18 in my script to this:
ZI = griddata(data(:,1)+rand()*2*eps(),data(:,2)+rand()*2*eps(),Gravity+rand()*2*eps(),XI,YI,'v4');
but I still get the warning message in the Matlab window (described above). Maybe I should add the random noise value before the griddata call to get it to work?


More Answers (1)

Walter Roberson on 9 Jun 2021
You have values such as 1630300 and 7224293. Adding rand*2*eps to them leaves them unchanged. eps() by itself is eps(1), the amount by which the floating-point number 1.0 differs from its closest neighbour. But the amount by which 1630300 differs from its closest neighbour is much larger.
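The difference in spacing can be checked directly. This is a small sketch using one of the coordinate values mentioned above; the variable names are illustrative. Note also that rand() with no arguments returns a single scalar, so it would shift every element by the same amount even if the shift were large enough to register; rand(n,1) or randn(n,1) gives per-element noise.

```matlab
x = 1630300;
spacingAtOne = eps(1);      % 2^-52, about 2.2e-16
spacingAtX   = eps(x);      % 2^-32, about 2.3e-10 at this magnitude

% Noise scaled by eps(1) is far below half the spacing at x, so it rounds away
unchanged = (x + 2*eps(1)) == x;    % true

% Noise scaled by eps(x) is large enough to change the stored value
changed = (x + 2*eps(x)) ~= x;      % true
```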
nr = size(data,1);
j1 = data(:,1).*(1+randn(nr,1)*10*eps);
j2 = data(:,2).*(1+randn(nr,1)*10*eps);
ZI = griddata(j1, j2, Gravity, XI, YI, 'v4');
This will get rid of the warning message, but is it truly the reasonable thing to do for your situation? With a pair of points such as
1619956 7213536 -45.48
1619956 7213536 -45.46
you would be establishing a change of 0.02 in z over a Euclidean distance of roughly 1e-8 or less, which is a slope of roughly 2000000 or more. That is going to lead to big overshoots in the polynomial interpolation, and the angle and positions of those overshoots are not going to be under your control.
In the immortal words of the UTexas SuperStarTrek, "Captain, in view of the alternatives, are you sure this is wise?"
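One alternative that avoids artificial gradients is to merge the duplicates yourself before interpolating, which mirrors what the warning says griddata does ("using average values for duplicate points"). The following is a minimal sketch under that assumption; the rows are an illustrative subset of the values quoted in this thread, and X_u, Y_u and gMean are hypothetical names.

```matlab
% Illustrative subset: columns are x, y, gravity
data = [1619956 7213536 -45.48
        1619956 7213536 -45.46
        1620126 7234756 -30.65
        1620126 7234756 -30.62
        1630300 7224293 -40.52];

[xy, ~, ic] = unique(data(:,1:2), 'rows');      % one row per distinct location
gMean = accumarray(ic, data(:,3), [], @mean);   % mean gravity per location

X_u = xy(:,1);
Y_u = xy(:,2);
% X_u, Y_u and gMean can then be passed to griddata in place of the raw columns
```

Since every location now appears once, griddata has nothing to merge and should not raise the duplicate-points warning.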
  4 Comments
Mattias Larsson on 10 Jun 2021
What does the function scatteredInterpolant do? And how is 'F' related to the griddata function in my script?
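For context, scatteredInterpolant builds an interpolant object from scattered samples, and that object can then be evaluated at any query points. A minimal sketch with made-up data follows; the name F is only a conventional variable name for the returned object, and 'linear' is one of the methods the function accepts.

```matlab
% Made-up scattered samples of the plane v = x + 2*y
x = [0; 1; 0; 1; 0.5];
y = [0; 0; 1; 1; 0.5];
v = x + 2*y;

% Build the interpolant once
F = scatteredInterpolant(x, y, v, 'linear');

% Evaluate it on a grid, much as griddata(x, y, v, XI, YI) would
[XI, YI] = meshgrid(0:0.25:1, 0:0.25:1);
ZI = F(XI, YI);
```

The practical difference from griddata is that F can be reused for several query grids without rebuilding the triangulation each time.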
Walter Roberson on 10 Jun 2021
Well, we can get rid of the warnings about duplicated points
data = readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/646640/Mala_grav.txt');
nr = size(data,1);
Gravity = data(:,3);
%this is the distance that griddata() uses internally to decide
%whether points are unique. It is about 0.00015 for this data
epsx = eps((max(data(:,1))-min(data(:,1)))/2)^(1/3);
epsy = eps((max(data(:,2))-min(data(:,2)))/2)^(1/3);
%now figure out which points are duplicates of others
[~, ~, G] = uniquetol(data(:,1:2), 1, 'byrows', true, 'datascale', [epsx, epsy]);
%and for each one, calculate a relative occurrence number. First instance
%of a duplicate gets 0, second gets 1, and so on
offset = diag(cumsum(G == G.',1)) - 1;
X_pos = data(:,1) + 2 .* epsx .* offset;
Y_pos = data(:,2) + 2 * epsy * offset;
%there, now the points are uniquified for the purposes of griddata
x = min(X_pos):100:max(X_pos); y = min(Y_pos):100:max(Y_pos);
% meshgrid returns 2-D grid coordinates based on the coordinates contained in vectors x and y
[XI,YI] = meshgrid(x,y); % meshgrid transforms vectors x and y into arrays XI YI
% Test with biharmonic interpolation.
ZI = griddata(X_pos,Y_pos,Gravity,XI,YI,'v4');
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.957571e-21.
But then we get griddata() complaining that the system is numerically unstable.
This message will, by the way, go away if you do not use 'v4'.
figure;
[c,h] = contour(XI,YI,ZI);
clabel(c,h)
xlabel('Longitude');
ylabel('Latitude');
title('Interpolation of sample locations (biharmonic)');
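The occurrence-number step in the script above (the line computing offset) can be easier to follow on a tiny example. This sketch uses a hypothetical vector of group labels of the kind returned as the third output of uniquetol.

```matlab
% Hypothetical group labels: rows 1, 3 and 6 are one location, rows 2 and 5 another
G = [1; 2; 1; 3; 2; 1];

% G == G.' is true where rows i and j belong to the same group. The running
% column sum, read on the diagonal, counts how many members of row i's group
% have appeared up to and including row i.
offset = diag(cumsum(G == G.', 1)) - 1;
% offset is [0; 0; 1; 0; 1; 2]
```

The comparison G == G.' builds an n-by-n logical matrix, so memory use grows quadratically with the number of data points.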

