Finding unique rows using "uniquetol" from top

It seems there is no option for picking unique rows from the top with uniquetol, unlike unique, which accepts the "first" argument. What I would like to write is
[~,idx] = uniquetol(Q2(:,1:ns),'ByRows',true,'first') % the argument "first" picks unique rows from the top
Is there some other way to do this, since the "first" option is not supported by uniquetol?
Thanks!

9 Comments

"...since the "first" option is not supported by uniquetol? "
According to the documentation, uniquetol returns the index for the first row of each set of matching rows: "Index to A, returned as a column vector of indices to the first occurrence of repeated elements".
It is easy to check that it returns the first index of multiple matching rows:
>> M = [0.0001,2;0.0002,2;1e2,2]
M =
0.0001 2.0000
0.0002 2.0000
100.0000 2.0000
>> [U,X] = uniquetol(M,1e-3,'ByRows',true)
U =
0.0001 2.0000
100.0000 2.0000
X =
1
3
Given that uniquetol returns the index of the first occurrence by default, and apparently that is what you want, what exactly is the problem? Can you give an example where this does not return what you need?
This doc is obviously not correct or not precise, seems only applicable for EXACT match. If you run my code below, where the tolerance comes into play, you'll see that 'first' is not fulfilled.
Stephen23 on 27 Jul 2020 (edited by Bruno Luong on 27 Jul 2020)
"This doc is obviously not correct or not precise, seems only applicable for EXACT match."
Did you look at my example? The first two rows are NOT an exact match.
Oops, I edited out your post! (I have now restored it.)
Your example is too small, too specific. Working for one example doesn't mean it works for all cases.
"Your example is too small, too specific. Working for one example doesn't mean it works for all cases."
A small repeatable example would be useful. No random numbers.
Why? If it fails once (on random numbers) then it simply fails, but anyway I'll prepare a deterministic example for you.
Here is my random example for you to try:
A = ceil(3*rand(1000,2));
A = A + rand(size(A))*1e-10;
[Au,matlabidx] = uniquetol(A,1e-6,'byrows',true);
[tf,I] = ismembertol(A,Au,1e-6,'byrows',true);
firstidx = accumarray(I,(1:size(A,1))',[],@min);
% Check index returned
matlabidx
firstidx
norm(A(matlabidx,:)-A(firstidx,:),'fro') % small if they match
I get the following (R2020a, Update 4, Windows), which clearly shows that MATLAB does not return the smallest indices:
matlabidx =
868
524
502
489
810
97
537
91
308
firstidx =
2
12
4
1
6
40
5
13
9
ans =
1.9880e-10
Here we go, small example
>> A = 1 + eye(2)*2^-40
A =
1.0000 1.0000
1.0000 1.0000
>> [Au,matlabidx] = uniquetol(A,1e-6,'byrows',true)
Au =
1.0000 1.0000
matlabidx =
2
Even smaller (smallest example)
>> A = 1 + [1;0]*2^-40
A =
1.0000
1.0000
>> [Au,matlabidx] = uniquetol(A,1e-6,'byrows',true)
Au =
1
matlabidx =
2
The data I was trying to work on is not random at all; rather, it is from a well-known experiment in machine learning called the "Tiger example", often found in the POMDP literature. Since several examples have already been provided by Bruno, I am going to pass.

 Accepted Answer

I suppose you can do something like this. I'm afraid that, because of the way UNIQUETOL and ISMEMBERTOL handle TOL internally, in some cases their results do not match coherently when the boundary between groups is fuzzy. You might set 'DataScale' to 1 and control TOL to get a more robust match.
Fake data
A = ceil(3*rand(1000,2));
A = A + rand(size(A))*1e-10;
Engine
[Au,idx] = uniquetol(A,1e-6,'DataScale',1,'byrows',true);
[tf,I] = ismembertol(A,Au,1e-6,'DataScale',1,'byrows',true);
if ~all(tf)
error('incompatible tolerance');
end
firstidx = accumarray(I,(1:size(A,1))',[],@min)
if ~all(firstidx) || size(firstidx,1)~=size(Au,1)
error('incompatible tolerance');
end
Check
A(idx,:)
A(firstidx,:)
If those matching error checks bother you, here is a way to skip them, at the risk that the result might differ from what UNIQUETOL alone returns:
Au = uniquetol(A,1e-6,'DataScale',1,'byrows',true);
[tf,I] = ismembertol(A,Au,1e-6,'DataScale',1,'byrows',true);
firstidx = accumarray(I(tf),find(tf),[],@min)
Au = A(firstidx,:);
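On the 'DataScale' suggestion above, here is a minimal sketch of what the option changes. It assumes the rule as I read it in the uniquetol documentation: by default the tolerance is scaled by the largest magnitude in the data, while 'DataScale',1 makes TOL an absolute tolerance. The data and the names nDefault and nAbsolute are made up for illustration.

```matlab
% Two values that differ by 5e-4, with magnitude around 1000.
A = [1000; 1000.0005];
tol = 1e-6;

% Default scaling: the threshold is tol*max(abs(A)), about 1e-3,
% so the two rows fall within tolerance of each other.
nDefault = size(uniquetol(A, tol, 'ByRows', true), 1);

% 'DataScale',1: the threshold is tol*1 = 1e-6 (absolute),
% so the two rows are treated as distinct.
nAbsolute = size(uniquetol(A, tol, 'DataScale', 1, 'ByRows', true), 1);
```

With an absolute tolerance the grouping no longer depends on how large the entries of A happen to be, which is one reason it may behave more predictably across UNIQUETOL and ISMEMBERTOL.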

2 Comments

Thanks a lot!
Actually I was stupid; you can use the third output of UNIQUETOL, so there is no need for ISMEMBERTOL:
[Au,matlabidx,I] = uniquetol(A, 1e-6, 'byrows', true);
firstidx = accumarray(I, (1:size(A,1))', [], @min)
% Check
norm(A(matlabidx,:)-A(firstidx,:),Inf)
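The same idea on a small deterministic input, as a minimal sketch. The matrix and the variable Afirst are made up for illustration; only uniquetol and accumarray are used, exactly as in the snippet above.

```matlab
% Rows 1 and 3 agree within tolerance, and so do rows 2 and 4.
A = [1 + 2^-40, 2;
     3,         4;
     1,         2;
     3 + 2^-40, 4];
tol = 1e-6;

% The third output I maps each row of A to its group (a row of Au).
[Au, matlabidx, I] = uniquetol(A, tol, 'ByRows', true);

% For each group keep the smallest row index, i.e. the first
% occurrence counted from the top of A.
firstidx = accumarray(I, (1:size(A,1))', [], @min);

% One representative per group, taken from the top of A.
Afirst = A(firstidx, :);
```

Here firstidx is ordered like the rows of Au (sorted by value). If the representatives should instead appear in their original top-to-bottom order, applying sort(firstidx) before indexing into A would be one way to do it.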

Release: R2020a

Asked: 27 Jul 2020

Edited: 27 Jul 2020