Is there a way to use setdiff or any function to compare two data sets within a range of each other?

I have two large data sets that are equal in dimensions but not in variables. Also, the variables are similar but not exact, so setdiff returns every value in the smaller data set. Is there a way to add a tolerance range to the setdiff function, something like setdiff(A,B<+-2,B<+-2)?

2 Comments

Can you just do setdiff(B, A) instead? Can you give a small example?
No, that doesn't make a difference because the values aren't exactly the same. So if the data looked like this:
ds1
  • 1.21 0.550
  • 9.78 0.989
  • 13.67 0.947
ds2
  • 1.19 0.45
  • 13.55 1.05
I want setdiff to return only row 2 as different.


 Accepted Answer

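Starting in R2015a, there is a function called ismembertol, which tests set membership within a tolerance. You can use it to get a logical index (idx) marking the rows of A that match some row of B within the tolerance, and then apply the setdiff logic to idx, for example by keeping only the rows where idx is false.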
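As a rough sketch of that idea, using the example data from the comments above: the tolerance of 2 comes from the question, while the 'ByRows' and 'DataScale' options are my assumption about how to get an absolute, row-wise comparison (by default ismembertol scales the tolerance by the magnitude of the data). The variable names inB and C are just for illustration.

```matlab
% Example data from the question's comments.
A = [1.21 0.550; 9.78 0.989; 13.67 0.947];
B = [1.19 0.45; 13.55 1.05];

tol = 2;  % absolute tolerance, taken from the +-2 in the question

% 'ByRows' compares whole rows; 'DataScale', 1 makes tol an absolute
% tolerance instead of one relative to the largest value in the data.
inB = ismembertol(A, B, tol, 'ByRows', true, 'DataScale', 1);

% The "setdiff" part: keep the rows of A that have no close match in B.
C = A(~inB, :)
```

Here inB is true for rows 1 and 3 of A, so C contains only row 2. If a relative tolerance suits the data better, dropping the 'DataScale' option would be the place to start.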

More Answers (1)

Since every row needs to be compared with every row of the other matrix within a range of values, you can (and may have to) use a double for loop with an if statement inside. It might not be a one-liner, but at least it's straightforward and intuitive.

7 Comments

I'm sorry. I'm pretty new to MATLAB. Could you explain or give an example?
This is what I've tried but it doesn't work
if(A<=B+1)&(A<=B-1)&(A>=B+1)&(A>+B-1)
C=setdiff(A,B);
end
No - you have to index over rows and columns. You're comparing a whole matrix at a time, which won't even work if A and B are different sizes. Try something like
[rowsA, columnsA] = size(A);
[rowsB, columnsB] = size(B);
tolerance = 2;
for rowa = 1 : rowsA
    thisRowA = A(rowa, :); % Extract just one row.
    for rowb = 1 : rowsB
        thisRowB = B(rowb, :); % Extract just one row.
        different(rowa, rowb) = abs(thisRowB - thisRowA) > tolerance
    end
end
That's a good start but since it sounds very much like a homework assignment (though you didn't tag it as such), I'm going to let you finish up the last little touches. Let me know if it's not homework and if you are not able to figure out the rest.
This is not homework. I'm doing research and the data sets are from two different techniques. They give similar values but not exactly the same. I need to compare them to determine if one technique is better than the other.
OK. If this straightforward brute force approach worked, then mark it as accepted. If you still didn't get it working, let me know. Or if you can get the pdist2 method Star suggested working, then mark the best one as Accepted. Whatever toolbox pdist2 is in, I don't have it so I can't help you with that method.
I still can't get it to work. How do I get MATLAB to run the command? I feel like this is simple to do. I'm sorry that I'm so lost.
Try this:
A = [...
    1.21 0.550
    9.78 0.989
    13.67 0.947]
B = [...
    1.19 0.45
    13.55 1.05]
[rowsA, columnsA] = size(A)
[rowsB, columnsB] = size(B)
different = false(rowsA, rowsB);
tolerance = 2;
for rowa = 1 : rowsA
    thisRowA = A(rowa, :); % Extract just one row.
    % Now check all rows of B to see whether any of them are different.
    for rowb = 1 : rowsB
        thisRowB = B(rowb, :); % Extract just one row.
        % Check if rowb of B is different from rowa of A.
        itsDifferent = any(abs(thisRowB - thisRowA) > tolerance);
        % Record whether it's different or not.
        different(rowa, rowb) = itsDifferent;
    end
end
different
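To get from the different matrix to the rows the question asks for, one possible last step is to keep the rows of A that differ from every row of B. This is only a sketch of that step: the names unmatched and C are mine, and the loop is repeated in compact form so the snippet runs on its own.

```matlab
% Same example data and tolerance as above.
A = [1.21 0.550; 9.78 0.989; 13.67 0.947];
B = [1.19 0.45; 13.55 1.05];
tolerance = 2;

% different(i, j) is true when row i of A and row j of B differ by more
% than the tolerance in at least one column.
different = false(size(A, 1), size(B, 1));
for rowa = 1 : size(A, 1)
    for rowb = 1 : size(B, 1)
        different(rowa, rowb) = any(abs(B(rowb, :) - A(rowa, :)) > tolerance);
    end
end

% A row of A has no counterpart in B if it differs from every row of B.
unmatched = all(different, 2);
C = A(unmatched, :)
```

With the example data, only row 2 of A (9.78 0.989) ends up in C, which matches the result asked for in the question.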
