Removal of duplicate entries and counting them
Hi, sorry if this is a dumb question, but I am new to MATLAB, so please bear with me. I want to write a MATLAB script that counts how many times each entry occurs in an array and also removes the duplicates. Can you please let me know how this can be done? I will explain my question with an example.
a[] = the the she he she she
So the output should be:
the - 2 times she - 3 times he - 1 time/s
I was thinking of using a for loop and comparing every element with every other one, but I am having problems eliminating the duplicate data.
Thank you, Sean
Accepted Answer
More Answers (2)
Matt Tearle
on 14 Feb 2011
If you happen to have access to Statistics Toolbox, and you are actually messing with strings (like in your example), you might want to consider using nominal arrays:
a = nominal({'the' 'the' 'she' 'he' 'she' 'she'});
tokens = getlevels(a)
n = hist(a)
Nominal arrays take less memory and have some other nice features.
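If Statistics Toolbox is not available, a similar result can probably be had with the categorical type that ships with base MATLAB in newer releases (R2013b onward, as far as I know). This is only a sketch of that alternative, not what the answer above uses; the variable names simply mirror the nominal example.

```matlab
% Sketch: same idea as the nominal example, using base-MATLAB categorical
a = categorical({'the' 'the' 'she' 'he' 'she' 'she'});

tokens = categories(a);  % cell array with one entry per distinct string
n = countcats(a);        % occurrences of each category, in the order of tokens

% Print one "token - count" line per distinct string
for k = 1:numel(tokens)
    fprintf('%s - %d time(s)\n', tokens{k}, n(k));
end
```

The categories come back in sorted order by default, so the counts line up with tokens rather than with the order of first appearance in the input.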
Andrew Newell
on 6 Feb 2011
a = {'the', 'the', 'she', 'he', 'she', 'she'};
au = unique(a); % This finds one of each string
str = [];
for i = 1:numel(au)
    % strcmp gives a logical vector with 1 for each match and
    % zero otherwise
    numDuplicates = sum(strcmp(au(i), a));
    str = [str, au{i}, ' - ', num2str(numDuplicates), ' times '];
end
disp(str)
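The loop above can also be replaced by a vectorized count. The following is a minimal sketch, assuming a MATLAB release where unique accepts the 'stable' option (R2012a or later, as far as I know), which keeps the strings in order of first appearance as in the question. The names idx and counts are just illustrative.

```matlab
a = {'the', 'the', 'she', 'he', 'she', 'she'};

% au holds each distinct string once, in order of first appearance;
% idx maps every element of a to its position in au, i.e. a = au(idx)
[au, ~, idx] = unique(a, 'stable');

% Add up a 1 for every occurrence of each index to get the counts
counts = accumarray(idx(:), 1);

% Print one line per distinct string
for k = 1:numel(au)
    fprintf('%s - %d time(s)\n', au{k}, counts(k));
end
```

Here au is the array with duplicates removed, and counts(k) is the number of times au{k} occurs in a. Dropping 'stable' gives the same counts with the strings sorted alphabetically instead.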