Removal of duplicate entries and counting them

Hi, sorry if this is a dumb question, but I am new to MATLAB, so please bear with me. I want to write a MATLAB script that counts the duplicate entries in an array and also removes them. Could you please let me know how this can be done? Here is an example to explain what I mean.
a[] = the the she he she she
so the output should be:
the - 2 times
she - 3 times
he - 1 time/s
I was thinking of using a for loop and comparing every element with every other one, but I am having problems eliminating the duplicate data.
Thank you, Sean

Accepted Answer

You could use a loop, or do something like this:
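a = {'the' 'the' 'she' 'he' 'she' 'she'}
[V,N,X] = unique(a);        % V: unique strings (sorted); X: index into V for each entry of a
N = histc(X,1:length(N));   % count how many entries of a map to each unique string
Now look at V and N. V holds the unique strings in sorted order ('he', 'she', 'the'), so the duplicates are gone, and N holds the matching counts: 1 he, 3 she and 2 the.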
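If you want the words reported in the order they first appear (the, she, he, as in the question) rather than sorted, something along these lines might work. This is only a sketch: it assumes a MATLAB release where unique accepts the 'stable' option, it swaps histc for accumarray, and the variable names words, idx and counts are just illustrative.

```matlab
a = {'the' 'the' 'she' 'he' 'she' 'she'};

% 'stable' keeps the order of first appearance instead of sorting
% (assumes a release that supports this option)
[words, ~, idx] = unique(a, 'stable');

% idx(k) is the position in words of a{k}; summing ones per index
% gives the number of occurrences. idx(:) forces a column vector.
counts = accumarray(idx(:), 1);

% Print one line per unique word, in the format asked for
for k = 1:numel(words)
    fprintf('%s - %d time(s)\n', words{k}, counts(k));
end
```

Here words is the de-duplicated list and counts(k) is the number of times words{k} occurred in a.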

More Answers (2)

If you happen to have access to Statistics Toolbox, and you are actually messing with strings (like in your example), you might want to consider using nominal arrays:
a = nominal({'the' 'the' 'she' 'he' 'she' 'she'});
tokens = getlevels(a)
n = hist(a)
Nominal arrays take less memory and have some other nice features.
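If you do not have Statistics Toolbox, a similar approach may be possible with the categorical type in base MATLAB. The sketch below assumes a release that includes categorical arrays (R2013b or later, as far as I know) and uses categories and countcats in place of getlevels and hist.

```matlab
a = {'the' 'the' 'she' 'he' 'she' 'she'};

% Convert the cell array of strings to a categorical array
% (assumes a MATLAB release that provides categorical)
c = categorical(a);

tokens = categories(c);   % unique words, one per category
n = countcats(c);         % number of elements in each category

% Show each word with its count
for k = 1:numel(tokens)
    fprintf('%s - %d time(s)\n', tokens{k}, n(k));
end
```

The categories come back sorted, so the counts line up with he, she, the in that order.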

a = {'the', 'the', 'she', 'he', 'she', 'she'};
au = unique(a); % This finds one of each string (duplicates removed)
% strcmp gives a logical vector with 1 for each match and
% zero otherwise, so summing it counts the occurrences
str = '';
for i = 1:numel(au)
    numDuplicates = sum(strcmp(au{i}, a));
    str = [str, au{i}, ' - ', num2str(numDuplicates), ' times '];
end
disp(str)
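Since the question mentions a for loop, here is a rough sketch of a single-pass loop that keeps a running tally instead of comparing every element with every other one. It uses containers.Map as the lookup table, which is my own choice and not something from the answers above; the names counts, word and words are illustrative.

```matlab
a = {'the', 'the', 'she', 'he', 'she', 'she'};

% Map from word (char key) to number of occurrences so far
counts = containers.Map('KeyType', 'char', 'ValueType', 'double');

for i = 1:numel(a)
    word = a{i};
    if isKey(counts, word)
        counts(word) = counts(word) + 1;   % seen before: increment
    else
        counts(word) = 1;                  % first occurrence
    end
end

% The keys are the de-duplicated words
words = keys(counts);
for i = 1:numel(words)
    fprintf('%s - %d time(s)\n', words{i}, counts(words{i}));
end
```

Each element of a is visited once, and the keys of the map double as the list with duplicates removed.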
