MATLAB does not calculate all my data
Hi, I'm having a problem with my MATLAB code. I have 19 sets of subjects, and each set has 25 data arrays, each an N x 3 matrix (3D data). When I calculate the distances between the data, only the first 19 arrays in each set are calculated instead of all 25; the 20th through 25th arrays are never calculated. The calculation is the same for all the data. How can I fix my code, and what should I do? I'd really appreciate any help in solving this problem. Thank you so much. This is how my code looks:
for A = 1: size(set,2)
    for B = 1: size(fil,2)
        i = double(set{:,A}{:,B});
        out{A} = squareform(pdist(i));
        temp{A}= out{A};
    end
    d{A} = temp;
    list{A} = setname{A};
end
18 Comments
for A = 1: size(set,2)
    for B = 1: size(fil,2)
        i = double(set{:,A}{:,B});
        out{A} = squareform(pdist(i));
        ...
Your loop on A only goes to 19 so you're only going to get 19 elements from
i = double(set{:,A}{:,B});
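One reading of the posted loop is that the results are stored under the subject index A inside the inner loop, so each pass over B overwrites the previous one. The following is only a minimal sketch of indexing the inner results by B instead, on made-up data. The name subjects (used in place of set), the sizes, and the random values are assumptions for illustration; pdist and squareform require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: 2 subjects, each with 3 files of 4x3 positions.
% (The thread has 19 subjects and 25 files; "subjects" replaces "set" here.)
rng(0);
nSubj = 2; nFile = 3;
subjects = cell(1, nSubj);
for A = 1:nSubj
    subjects{A} = cell(1, nFile);
    for B = 1:nFile
        subjects{A}{B} = rand(4, 3);
    end
end

d = cell(1, nSubj);
for A = 1:nSubj
    temp = cell(1, nFile);                  % one slot per file, reset for each subject
    for B = 1:nFile
        xyz = double(subjects{A}{B});       % one N x 3 position array
        temp{B} = squareform(pdist(xyz));   % store under the file index B, not A
    end
    d{A} = temp;                            % all per-file distance matrices for subject A
end
```

With this layout, d{A}{B} holds the distance matrix for file B of subject A, so every file keeps its own result.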
Nurul Atifah
on 30 Apr 2018
dpb
on 30 Apr 2018
Certainly it's possible; it wasn't clear just what set consists of, but as you've written it, you need a double loop inside the outer one for each set that ends up taking all pairs of columns.
I think it's probable we can eliminate some of the loops using the builtin vector-awareness of pdist, but I need to see an actual data set the way it is stored, and what it is that you're trying to calculate, to write specific code.
A subset of the data would be all that's needed; show a small section of the data for just a couple of "sets" so the storage layout is clear.
Nurul Atifah
on 30 Apr 2018
dpb
on 30 Apr 2018
Images are nearly useless; my eyes aren't good enough to read them and can't operate on the data...
Nurul Atifah
on 30 Apr 2018
Nurul Atifah
on 1 May 2018
Edited: Nurul Atifah
on 1 May 2018
dpb
on 1 May 2018
OK, got it...wowsers!!! That's deeply nested, indeed!!! Why so many layers down to get to the data? That has a lot to do with how difficult it is to operate on the data.
At the very bottom is a dataset, which is a remnant of an earlier incarnation of what is now the builtin table class; using table instead is recommended. Was this code inherited from some time back, perhaps?
Secondly, maybe we don't have enough code posted after all; I'd prefer to see if we couldn't solve the problem at a higher level by revisiting how the data are created instead of leaving it as is and having to deal with that...are you allowed to change whatever you want, or are there restrictions on what you're allowed to do here?
Let's see the code that loads the data and see about making it more efficient from that point instead...this can be dealt with, but it would essentially be taking what the current storage is and rearranging into something different so why not arrange it that way from the beginning?
OBTW, using set as a variable name is a bad idea; set is the builtin function for setting properties on objects of all sorts, primarily graphics, and aliasing it is likely to cause grief elsewhere...
Nurul Atifah
on 2 May 2018
No, the data are deeply nested because you kept adding layers of {} on top of existing cell arrays...then topped it off by using dataset. :) Give me a little while to recast that some...I'll start by consolidating the return from textread into an array of doubles instead of a cell array and then consider from there given the rest of the structure in the input files....
Nurul Atifah
on 2 May 2018
Almost there with first cut...do you know the size of each array ahead of time; it appears they're all the same length in the given dataset, can that always be assumed to be true? And, if so, is the length known a priori or must it be determined from the data itself?
If the data are same-sized, I'd suggest using a 3D array of
nObs X 25*3 X nFiles
as the most efficient holding pattern as well as simplest to traverse. That way, there are no cell arrays to dereference and each file is simply one plane of the 3D array and consequently computations across or within files can be far more easily vectorized using the power of Matlab syntax...but if the data arrays aren't all of the same length, then do need a cell array to hold disparate sizes (of course, then you're going to run into issues in comparing across files when/where that happens).
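As a rough illustration of that holding pattern, here is a small sketch with made-up sizes and constant fill values standing in for the file contents; nObs, nPts, and nFiles are hypothetical names and values, not taken from the actual data.

```matlab
% Hypothetical sizes; the real nObs and nFiles would come from the data.
nObs = 4; nPts = 25; nFiles = 3;
data = zeros(nObs, nPts*3, nFiles);          % preallocate: one plane per file
for k = 1:nFiles
    data(:,:,k) = k * ones(nObs, nPts*3);    % stand-in for the contents of file k
end

plane2 = data(:,:,2);                % everything from file 2, no cell dereferencing
meanAcrossFiles = mean(data, 3);     % one vectorized call operating across all files
```

Here each file is addressed by its third-dimension index, and an operation across files is a single call along dimension 3.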
For confirmation, please explain precisely which sets of position measurements are to be cross-compared.
Nurul Atifah
on 2 May 2018
Edited: Nurul Atifah
on 2 May 2018
dpb
on 2 May 2018
pdist computes the pairwise distances between the rows (observations) of a given array, so using it on each of the 25 position-data arrays for a given subject will return 25 distance vectors, each of length 83*82/2 = 3403 (ignoring squareform, which effectively doubles the size by creating the symmetric matrix from the vector of upper-triangular elements).
Consequently, for 19 (or however many there happened to be) subjects you would end up with 25 such vectors each, correct?
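To make the sizes concrete, here is a small sketch on random stand-in data (n = 6 rows rather than the 83 mentioned above; the values are made up). It assumes the Statistics and Machine Learning Toolbox is available for pdist and squareform.

```matlab
n = 6;                  % hypothetical number of observations (rows)
rng(0);
X = rand(n, 3);         % stand-in for one N x 3 position array
y = pdist(X);           % row vector with n*(n-1)/2 pairwise distances
D = squareform(y);      % n x n symmetric matrix with a zero diagonal
```

For n = 83 the same formula gives the 83*82/2 = 3403 elements per vector noted above.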
Nurul Atifah
on 2 May 2018
Edited: dpb
on 2 May 2018
dpb
on 2 May 2018
OK, just wanted to be sure understood specifically which distance measure you actually wanted; within variable or between (or both, maybe).
The latter question is a detail we can test when we have the rest working; in general it's faster to use straight-ahead vectorized functions over bsxfun, but that can be tried as an alternative if the first proves too slow (I doubt that will be the case, albeit when dealing with sizable data, computation time is inevitably going to be noticeable).
I don't have time this instant to finish up; should be able to find a few spare moments later tonight...but I do now think I know the problem definition and believe the implementation is now straightforward and quite a lot simplified from your first try--not that that is intended as criticism; it's easy to get lost in the weeds in cell arrays when textread is insistent upon returning everything as a cell array even when it isn't needed (or would be better if it weren't). Not much in the documentation that helps the beginner understand this (as in nothing :( ).
Nurul Atifah
on 3 May 2018
Edited: Nurul Atifah
on 3 May 2018
Answers (1)
Was too tired last night; building fence is tough work for old men... :) Got 3 mi in; only 3 mi more to go... :)
Try the following to see if it will read your data successfully...you'll end up with a 4D array of a series of 3D arrays which are stacked (planes) of 25 81x3 2D arrays if I didn't screw things up.
% locate the project folder and its per-subject subdirectories
N=81; M=3;                                        % size of each position-data array
projdir = 'FaceData';
d=dir(projdir);                                   % use a name for the dir() struct
d=d(~ismember({d.name},{'.','..'}));              % remove the current, parent references
s=d([d.isdir]);                                   % the subdirectories to traverse (shorter name)
S=length(s);                                      % number subdirs (19)
f=dir(fullfile(s(1).folder,s(1).name,'*.DATA'));  % directory for first to find F
F=length(f);                                      % number files each subject (25)
data=zeros(N,M,F,S);                              % N x M x file x subject, matching the indexing below
for j=1:S
  f=dir(fullfile(s(j).folder,s(j).name,'*.DATA')); % directory for each
  for k=1:F                                       % over found files in the folder
    fname=fullfile(f(k).folder,f(k).name);        % fully-qualified filename
    data(:,:,k,j)=dlmread(fname,' ');             % read space-delimited data; save in 4D array
  end
end
dataname={s.name}.';                              % save the subdirectory names
NB: I really don't like the hardcoded sizes as a general rule but it appeared that your cases are pre-ordained to be of a known size so it's probably not too bad. If there is need to have variable numbers/sizes, we can deal with that going forward one way or another; if absolutely had to could go back to cell arrays but just not so deeply entwined; just one level instead of three! :)
To then do the distances, we'll just walk through the array backwards from the last dimension, across the 19 subject sets and then the 25 observations, calling pdist for each array in turn.
But, let's fix any typos/etc., I've made here first...
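For reference, that walk might look roughly like the sketch below. It runs on a small random stand-in for the 4D array, indexed as data(:,:,k,j) with k the file and j the subject as in the reading loop; the reduced sizes and the name dist are assumptions for illustration, and pdist/squareform need the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in for the 4D array (real sizes would be 81 x 3 x 25 x 19).
N = 5; M = 3; F = 4; S = 2;
rng(0);
data = rand(N, M, F, S);

nPairs = N*(N-1)/2;                  % length of each pdist output vector
dist = zeros(nPairs, F, S);          % one distance vector per file per subject
for j = 1:S                          % over subjects (last dimension)
    for k = 1:F                      % over files for that subject
        dist(:,k,j) = pdist(data(:,:,k,j)).';   % store as a column
    end
end

% Expand a single vector to the full symmetric matrix only when it is needed.
D = squareform(dist(:,1,1).');
```

Keeping the compact pdist vectors and calling squareform on demand avoids storing the redundant half of every matrix.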
4 Comments
Nurul Atifah
on 3 May 2018
Edited: Nurul Atifah
on 3 May 2018
dpb
on 3 May 2018
Oh dang!!! You would want it, wouldn't you? :) My bad, I forgot to add the code...it's short enough I'll just paste it into the Answer above.
Thanks; spring has come and we finally got some rain so pastures are beginning to green up and now things are in a rush...and I don't move as quickly as once't upon a time... :(
Nurul Atifah
on 3 May 2018
Edited: Nurul Atifah
on 3 May 2018
dpb
on 3 May 2018
OK, came in for a break; try the above and see if it does read your data and return the data and dataname arrays. It would be surprising if I didn't make a typo or other gaffe without having any data to test, but it should be close...if it does happen to run, then make a .mat file of that array and attach it and I can work on the next step with real data instead of made up. If it doesn't work and you can't see an obvious error and correct it, attach the full error text and I'll try to look back in later on tonight.

