How can I find duplicate first four columns?

Question

0 votes

test.xlsx

Hello!

My excel file format is as follows:

A column: Time
B column: Year
C column: Day of the year
D column: Hour

There are many duplicate values at the first four column in this excel file. For example, 4101 and 4102 row are same values in the first four columns. 4114 and 4115, 4127 and 4128 and so forth...And I'd like to remove upper row such as 4101, 4114, 4127 etc...

How can I find duplicate A column and just remove upper row?

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Image Analyst on 20 Aug 2018

Open in MATLAB Online

1 vote

Not sure if it's the most compact code, but I think it's simple and intuitive, and I think this will work:

[numbers, strings, raw] = xlsread('test.xlsx');
col1 = numbers(:, 1);
[C, ia, ic] = unique(col1);
fprintf('Length of column 1 = %d.\nNumber of unique numbers in column 1 = %d.\n', ...
  length(col1), length(C));
% Scan down array and keep track of the first time we encounter anumber.
rowsToDelete = zeros(1, length(C));
for k = 1 : length(C)
  % Find out how many times this unique number occurs.
  indexes = (col1 == C(k));
  numOccurrences = sum(indexes);
  fprintf('For k = %d, found %d occurrences of %d\n', k, numOccurrences, C(k));
  % More than 2?  Then delete the first one.
  if numOccurrences >= 2
    % If there are two or more, delete the first one ONLY.
    rowsToDelete(k) = find(indexes, 1, 'first');
  end
end
% Remove zeros
rowsToDelete(rowsToDelete <= 0) = [];
col1(rowsToDelete) = [];
fprintf('Deleted these rows: ');
fprintf('%d ', rowsToDelete);
fprintf('\nAfter removing first numbers we now have %d numbers.\n', ...
  length(col1));

10 Comments
Show 8 older comments Hide 8 older comments

Suat YAZICI on 20 Aug 2018

Edited: Suat YAZICI on 20 Aug 2018

Sorry but it doesn't work. Why didn't I get edited 25x10 matrix instead of 25x1 matrix because of col1 = numbers(:, 1) ?

Image Analyst on 20 Aug 2018

I was just using col1 as a convenience to figure out what rows need to be deleted. Once I figured that out, I applied it to all columns in "numbers" to delete those rows from every column. If numbers is 10 columns, then it should still be 10 columns after deleting some rows. It just won't have as many rows. If you didn't get a new height for numbers, then you didn't use the code from my last comment where I deleted the rows from numbers.

Sign in to comment.

How can I find duplicate first four columns?

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

10 Comments
Show 8 older comments Hide 8 older comments

More Answers (0)

Categories

Tags

Community Treasure Hunt

How can I find duplicate first four columns?

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

10 Comments Show 8 older comments Hide 8 older comments

More Answers (0)

Categories

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

10 Comments
Show 8 older comments Hide 8 older comments