Detecting duplicates in multiple files and printing the unique results to a file
I am trying to compare two (or more) files containing chromosomal positions in the form 2:282828282828. There are about 70,000 of these positions, and while Excel works for smaller datasets, it is causing me too much trouble at this scale.
What I am trying to do is either 1) compare each position in one matrix with every position in the other matrix and, if the value is unique, print it to a new file, or 2) compare all cells in a merged file and print the unique values to a different file.
I am a little stuck as to where to start for the first option.
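One possible starting point for the first option is MATLAB's set functions, which avoid an explicit position-by-position loop. The sketch below assumes the positions have been read in as cell arrays of character vectors (they contain a colon, so they are text rather than numbers). The sample data, the variable names, and the output file name are illustrative assumptions, not part of the original data.

```matlab
% Illustrative data (assumption). In practice each list could be read with, e.g.:
%   fid = fopen('file1.txt'); c = textscan(fid, '%s'); fclose(fid); posA = c{1};
posA = {'2:282828'; '1:1000'; '3:555'};
posB = {'1:1000'; '4:42'};

% Positions that appear in exactly one of the two lists
onlyInOne = setxor(posA, posB);

% Positions in the first list that never appear in the second
onlyInA = setdiff(posA, posB);

% Write one position per line; the output file name is a placeholder
outFile = fullfile(tempdir, 'unique_positions.txt');
fid = fopen(outFile, 'w');
fprintf(fid, '%s\n', onlyInOne{:});
fclose(fid);
```

Both setxor and setdiff return their results sorted and without repeats, so 70,000 positions per file should not need a nested loop.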
But for the second option, I was thinking of this:
dataset1=a
[n, bin] = histc(a, unique(a));
multiple = find(n > 1);
to find the multiple values, but how do I get MATLAB to write the unique values where n=1 to a new file?
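For the second option, a minimal sketch under the same assumption (the merged positions are held as a cell array of character vectors) could count occurrences with unique and accumarray instead of histc, since histc expects numeric input. The sample data and the output file name are again hypothetical.

```matlab
% Illustrative merged list (assumption): all positions from all files stacked together
merged = {'2:282828'; '1:1000'; '3:555'; '1:1000'; '4:42'; '3:555'};

% vals holds each distinct position once (sorted); idx maps every entry of merged to vals
[vals, ~, idx] = unique(merged);

% n(k) is the number of times vals{k} occurs in merged
n = accumarray(idx, 1);

singles    = vals(n == 1);   % positions that occur exactly once
duplicated = vals(n > 1);    % positions that occur more than once

% Write the positions with n == 1 to a new file, one per line (placeholder name)
outFile = fullfile(tempdir, 'single_positions.txt');
fid = fopen(outFile, 'w');
fprintf(fid, '%s\n', singles{:});
fclose(fid);
```

Logical indexing with n == 1 plays the role of find(n > 1) in the snippet above, just selecting the opposite group.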
Answers (0)