Merging files based on location similarity using polyshape and overlaps

I have a lot of survey data, thousands of files. I'm writing a program to process all of it, and it's going well: I have a nice UI that can output really nice surfaces. What I want to do now is merge files based on latitude and longitude similarity. The program should first figure out which files overlap or fall in similar geographic regions, move each file to a folder for its preset region, concatenate the files in each folder, and then process the data.

Because I have so many files and they generally fall in a few different regions, I've been trying to first define the bounds of each geographic region, then build the bounding box for each survey and figure out whether it overlaps one of the preset regions. I plot every file as a polyshape on top of the preset regions, and then I try to work out whether they overlap and by what percentage. This is where I'm stuck. If they overlap by, say, 30 percent, I want to move the files, concatenate any files in that new folder, and write a new .csv file.

Below is the code I've used to create the figure. Right now I can plot each processed file on top of the preset areas. A section of the code also attempts to find the overlap, but the result is always 0, which is not true: some files are completely within other files. I assumed the answer is always 0 because I'm in a for loop, but I don't know how to separate the files outside of the for loop. Hopefully this rambling makes sense. After I have determined which files overlap, how do I work backwards and figure out which file it was that overlapped? I can answer any clarifying questions because I'm not sure my description is sufficient. Thanks again to you all; your knowledge is valuable and has been very helpful to me.
P = 'C:/Users/keith/OneDrive/Desktop/Single Beam Bathy/SN06222';
Q = 'C:/Users/keith/OneDrive/Desktop/Single Beam Bathy/SN06222/Corrected CSV';
S = dir(fullfile(P,'*.csv'));
S = natsortfiles(S);
N = numel(S);
C = cell(N,1);
C1 = cell(N,1);
Area1 = polyshape([-117.27917 -117.27902 -117.20199 -117.20199 -117.17096 -117.17096 -117.16667 -117.23000 -117.26158 ], ...
[32.55800 32.61202 32.61202 32.60187 32.60187 32.57167 32.53333 32.52911 32.55501 ]);
Area2 = polyshape([-117.17947 -117.14449 -117.16990 -117.2037], ...
[32.62047 32.63668 32.67197 32.65642]);
Area3 = polyshape([-117.14358 -117.18173 -117.17832 -117.13523],...
[32.63557 32.61703 32.60967 32.61132]);
Area4 = polyshape([-117.17417 -117.13528 -117.13309 -117.17365], ...
[32.6093 32.61017 32.59383 32.59296]);
Area5 = polyshape([-117.38598 -117.27852 -117.26048 -117.36795], ...
[32.69488 32.6111 32.62763 32.71148]);
Area6 = polyshape([-117.21995 -117.26462 -117.35953 -117.35015 -117.25807 -117.21995], ...
[32.63495 32.63495 32.76783 32.77073 32.64248 32.64248]);
Area7 = polyshape([-117.20177 -117.25125 -117.26588 -117.25718 -117.24505 -117.20177], ...
[32.61213 32.61213 32.63698 32.6408 32.62052 32.62052]);
for k= 1:numel(S)
F2 = fullfile(S(k).folder,S(k).name);
M1 = readmatrix(F2);
lat1 = M1(:,1);
lon1 = M1(:,2);
[minLat, maxLat] = bounds(lat1);
[minLon, maxLon] = bounds(lon1);
C = [maxLon maxLon minLon minLon; maxLat minLat minLat maxLat];
c = C.';
p1(k)=polyshape(c);
polygons = {p1, Area1, Area2, Area3, Area4, Area5, Area6, Area7};
polygons1 = [p1, Area1, Area2, Area3, Area4, Area5, Area6, Area7];
num_polygons = numel(polygons);
areas = zeros(num_polygons, 1);
for i = 1:num_polygons:-1:1
areas(i) = polyarea(polygons{i}(:,1), polygons{i}(:,2));
end
areas = abs(diff([areas;0]));
for i = 1:num_polygons
fprintf('Polygon %d area: %f square units\n', i, areas(i));
end
end
figure
plot(polygons1)
Here's an example of the output for area overlap:
Polygon 1 area: 0.000000 square units
Polygon 2 area: 0.000000 square units
Polygon 3 area: 0.000000 square units
Polygon 4 area: 0.000000 square units
Polygon 5 area: 0.000000 square units
Polygon 6 area: 0.000000 square units
Polygon 7 area: 0.000000 square units
Polygon 8 area: 0.000000 square units
(These eight lines repeat for each of the 9 files in the directory.)
And here is an example of the figure that is created. The smaller squares are the 9 surveys I have in my directory. The end goal is to figure out which files overlap, put them in a folder for each preset region, and then concatenate them in that folder.

Accepted Answer

Matt J on 27 Mar 2024
Edited: Matt J on 27 Mar 2024
The variable naming in your posted code doesn't reflect the terminology in your text explanation, which makes it hard to interpret, so I will propose something from scratch. Suppose you have M geographic regions whose bounding boxes are represented as a polyshape M-vector georeg(m), m=1,...,M. Suppose in addition you have N surveys whose bounding boxes are likewise represented as a polyshape N-vector survey(n), n=1,...,N.
Now we can create a boolean matrix TF(m,n) that is true when the intersection of region m with survey n covers more than 30% of the area of region m:
TF=false(M,N); %preallocate
for m=1:M
TF(m,:)=area( intersect(georeg(m), survey ) )>0.3*area(georeg(m)); %populate
end
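As a minimal sketch of this test on made-up shapes (the coordinates and the extra variable fracOfSurvey are invented for illustration and are not taken from the survey data), the loop can be tried on one square region and two survey boxes:

```matlab
% Toy data (hypothetical coordinates, not from the survey files).
georeg = polyshape([0 0 10 10], [0 10 10 0]);        % one region, area 100
survey = [polyshape([2 2 8 8],   [2 8 8 2]), ...     % fully inside, area 36
          polyshape([9 9 12 12], [9 12 12 9])];      % clips one corner, area 9

M = numel(georeg);
N = numel(survey);

TF = false(M, N);            % test used above: share of the region that is covered
fracOfSurvey = zeros(M, N);  % assumed alternative: share of each survey inside the region
for m = 1:M
    overlapArea = area(intersect(georeg(m), survey));  % 1-by-N overlap areas
    TF(m, :) = overlapArea > 0.3 * area(georeg(m));
    fracOfSurvey(m, :) = overlapArea ./ area(survey);
end
```

Here the first survey passes the test and the second does not. If "30 percent" is meant relative to each survey rather than to the region, thresholding fracOfSurvey (for example fracOfSurvey > 0.3) may be closer to the intent, since a small survey lying entirely inside a large region can never cover 30% of that region.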
Now you can loop through the M regions, collect the polyshapes for the surveys intersecting each one, and write them to a file:
for m=1:M
mthSubset=survey(TF(m,:)); %<---- write this survey sub-collection to a file somehow.
end
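One possible way to do the file step, sketched under assumptions: the columns of TF are in the same order as the dir listing S, every CSV has the same number of columns, and the folder and file names (Merged, Region%d.csv, a.csv, b.csv) are hypothetical. Because find(TF(m,:)) returns indices into S, it also tells you which files overlapped region m.

```matlab
% Hypothetical setup: two tiny CSV files in a temporary folder stand in for the surveys.
srcDir = tempname;                    mkdir(srcDir);
outDir = fullfile(srcDir, 'Merged');  mkdir(outDir);
writematrix([32.60 -117.20 5; 32.61 -117.21 6], fullfile(srcDir, 'a.csv'));
writematrix([32.62 -117.22 7],                  fullfile(srcDir, 'b.csv'));
S  = dir(fullfile(srcDir, '*.csv'));  % same order as the columns of TF
TF = [true true; false false];        % assumed result: region 1 holds both files, region 2 none

for m = 1:size(TF, 1)
    idx = find(TF(m, :));             % indices into S of the surveys assigned to region m
    if isempty(idx), continue; end    % nothing to merge for this region
    data = cell(numel(idx), 1);
    for j = 1:numel(idx)
        data{j} = readmatrix(fullfile(S(idx(j)).folder, S(idx(j)).name));
    end
    merged = vertcat(data{:});        % stack the rows of all matching files
    writematrix(merged, fullfile(outDir, sprintf('Region%d.csv', m)));
    fprintf('Region %d: %s\n', m, strjoin({S(idx).name}, ', '));
end
```

If the individual files should also end up in a per-region folder, copyfile or movefile could be called inside the inner loop before the merged file is written.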
  3 Comments
Matt J on 28 Mar 2024
Edited: Matt J on 28 Mar 2024
N = numel(S);
clear Area Surveys
Area(1:7)=polyshape();
Surveys(1:N)=polyshape();
xSurvey=nan(4,N); ySurvey=nan(4,N); %nan() returns one output, so preallocate each array separately
Area(1) = polyshape([-117.27917 -117.27902 -117.20199 -117.20199 -117.17096 -117.17096 -117.16667 -117.23000 -117.26158 ], ...
[32.55800 32.61202 32.61202 32.60187 32.60187 32.57167 32.53333 32.52911 32.55501 ]);
Area(2) = polyshape([-117.17947 -117.14449 -117.16990 -117.2037], ...
[32.62047 32.63668 32.67197 32.65642]);
Area(3) = polyshape([-117.14358 -117.18173 -117.17832 -117.13523],...
[32.63557 32.61703 32.60967 32.61132]);
Area(4) = polyshape([-117.17417 -117.13528 -117.13309 -117.17365], ...
[32.6093 32.61017 32.59383 32.59296]);
Area(5) = polyshape([-117.38598 -117.27852 -117.26048 -117.36795], ...
[32.69488 32.6111 32.62763 32.71148]);
Area(6) = polyshape([-117.21995 -117.26462 -117.35953 -117.35015 -117.25807 -117.21995], ...
[32.63495 32.63495 32.76783 32.77073 32.64248 32.64248]);
Area(7) = polyshape([-117.20177 -117.25125 -117.26588 -117.25718 -117.24505 -117.20177], ...
[32.61213 32.61213 32.63698 32.6408 32.62052 32.62052]);
M=numel(Area);
TF=false(M,N);
for n = 1:N %Loop over surveys
F2 = fullfile(S(n).folder,S(n).name);
M1 = readmatrix(F2);
lat1 = M1(:,1);
lon1 = M1(:,2);
[minLat, maxLat] = bounds(lat1);
[minLon, maxLon] = bounds(lon1);
C = [maxLon maxLat; maxLon minLat; minLon minLat; minLon maxLat];
xSurvey(:,n) = C(:,1);
ySurvey(:,n) = C(:,2);
Surveys(n)=polyshape(C); %bounding box of survey n (M1 holds the raw data, not polygon vertices)
end
for m=1:M %Loop over Areas
inside = reshape( isinterior(Area(m),xSurvey(:), ySurvey(:)) ,4,N);
TF(m,:) = sum( inside ,1)>=3; %TF(m,n)=1 if at least 3 corners of survey n are in Area(m)
end
for m=1:M %Loop over Areas -- get the surveys lying inside the m-th Area
surveySubset=Surveys(TF(m,:)); %<----these are the survey polyshapes inside Area(m)
%Do stuff...
end
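The corner-counting test can be checked on toy data before running it on real files. The sketch below uses one invented square area and two invented survey boxes (the boxes matrix and cornersInside are illustrative names, not part of the code above):

```matlab
% Toy data (hypothetical): one square area and the bounding boxes of two surveys.
Area  = polyshape([0 0 10 10], [0 10 10 0]);
boxes = [2  8 2  8;     % survey 1: [minLon maxLon minLat maxLat], fully inside
         9 12 9 12];    % survey 2: only one corner inside
N = size(boxes, 1);
xSurvey = nan(4, N);
ySurvey = nan(4, N);
for n = 1:N
    b = boxes(n, :);
    xSurvey(:, n) = [b(2); b(2); b(1); b(1)];   % maxLon maxLon minLon minLon
    ySurvey(:, n) = [b(4); b(3); b(3); b(4)];   % maxLat minLat minLat maxLat
end
% isinterior tests all 4*N corners at once; reshape gives one column per survey.
inside = reshape(isinterior(Area, xSurvey(:), ySurvey(:)), 4, N);
cornersInside = sum(inside, 1);   % number of corners of each survey inside Area
TF = cornersInside >= 3;
```

Survey 1 has all four corners inside and is accepted, while survey 2 has one corner inside and is rejected. Note that a corner count is only a proxy for overlap: a survey box larger than the area has no corners inside it even when it covers the area completely, so an area-based test may be preferable in that case.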

Release: R2023b
