concatenate array of structures with same field names

Hi
I am reading a data file from a camera that writes its output in a fixed format. The file is several GB in size.
The output format is more complex than the example below:
m = memmapfile(def_file,'Format',{'uint8',[8,1],'identifier';'uint8',[504,1],'Header';'uint16',[2^16,1],'Pix'},'Repeat',N,'Offset',0,'writable',false);
where N=8000
and this is the output
m.Data
8000×1 struct array with fields:
identifier
Header
Pix
so that m.Data is an 8000-element struct array in which each element's Pix field is a 65536-element vector
and each element's identifier field is an 8-element vector.
I want to concatenate each field across the struct array. Below is an example for N=20, using a script I adapted from the forum:
M = CatStructFields(m.Data,1,1);
tic
Dat = CatStructFields(m.Data,1,3);
toc
Concatenating Pix (j=3) for N=8000 takes about half an hour.
There must be a faster way to do this. Any ideas here?
%%
function M = CatStructFields(S, dim, j)
fields = fieldnames(S);
M=[];
for k = 1:numel(S)
aField = fields{j};
M = cat(dim, M, S(k).(aField));
end
end

1 Comment

It's way faster (about 7 seconds) if I use memmapfile to read the whole file as a uniform uint16 array into a MATLAB variable and then sort out the different fields.
But I did the same in C++ with fstream, where you can read a structure directly (the struct being {identifier, header, Pix, ...}), and it was 10x faster than the uniform-uint16 version of my MATLAB code.
So why is it that Matlab cannot use fread to read a structured file? Not sure how fread is implemented (does it use C++ fread or fstream?)
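The "uniform uint16" approach mentioned above can be sketched roughly as follows. It assumes the record layout from the question (8 bytes of identifier, 504 bytes of header, then the pixel words) and a file that contains nothing but whole records. The demo writes a tiny temporary file with a shortened pixel block (nPix = 16 rather than 2^16) so that it runs on its own. All variable names are illustrative.

```matlab
% Assumed record layout: 8-byte identifier, 504-byte header, nPix uint16 pixels.
% nPix is shortened here so the demo stays small.
nPix = 16;
N = 5;
hdrWords = (8 + 504) / 2;        % 256 uint16 words precede the pixels
recWords = hdrWords + nPix;      % uint16 words per record

% Write a small test file in that layout.
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
for k = 1:N
    fwrite(fid, repmat(uint8(k), 8, 1), 'uint8');        % identifier
    fwrite(fid, zeros(504, 1, 'uint8'), 'uint8');        % header
    fwrite(fid, uint16(100*k + (1:nPix)'), 'uint16');    % pixels
end
fclose(fid);

% Map the whole file as a single uint16 matrix, one record per column.
m = memmapfile(fname, 'Format', {'uint16', [recWords, N], 'raw'}, 'Repeat', 1);
raw = m.Data.raw;                % recWords x N, copied into memory

% Slice the fields out of the matrix.
Pix = raw(hdrWords+1:end, :);    % nPix x N uint16
% The first 4 words of each record hold the 8 identifier bytes.
identifier = reshape(typecast(reshape(raw(1:4, :), [], 1), 'uint8'), 8, N);

clear m                          % release the mapping before deleting the file
delete(fname);
```

Because every record has the same length, each field occupies a fixed range of rows in raw, so plain indexing recovers it without any loop over records.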


 Accepted Answer

Matt J
Matt J on 5 Jun 2023
Edited: Matt J on 5 Jun 2023
So why is it that Matlab cannot use fread to read a structured file? Not sure how fread is implemented (does it use C++ fread or fstream?)
I think it may have to do with the fact that Matlab does not assume that the field data in a struct array are of a fixed size. Therefore, structs are neither stored nor read in contiguously.

9 Comments

I didn't realize that fread or fstream reads fixed-size structures. I think fread can read any mixed-type structure provided one issues a #pragma pack(push, 1) directive, so that the compiler does not insert padding and the struct matches the file layout exactly.
So there isn't a fast way to concatenate memmap structured data?
pragma push? I suspect you are referring to C++, such as the code at https://www.mathworks.com/matlabcentral/answers/417591-how-can-i-change-the-byte-packing-of-structures-generated-from-simulink-coder . MATLAB's fread() does not pay attention to pragmas.
@gujax, since you seem to be fluent in C/C++, you might be able to write your own specialized MEX file reader. That way, you can probably ensure that the data is read in contiguously. MATLAB has no native mechanism to ensure that struct arrays are stored contiguously, so I can't imagine doing it in M-code alone.
So to repeat my question: is there any way to concatenate the vectors of one field across the struct array when using memmap with ‘repeat’, as in my example for the field ‘Pix’?
If you mean is there a faster implementation of CatStructFields, then yes:
function M = CatStructFields(S, dim, aField)
M = cat(dim, S.(aField));
end
If you mean, is there any way to get memmapfile to run faster, then no. As I said above, I think you need a specialized MEX.
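The one-liner is fast because, for a struct array S, the expression S.(aField) expands to a comma-separated list, so cat receives every element's field as a separate argument in a single call instead of growing M one step at a time. A minimal sketch on a small made-up struct array (sizes and values are illustrative only):

```matlab
% Small struct array that mimics the shape of m.Data
% (3 records, 4-element Pix vectors; the values are made up).
S = struct('Pix', {uint16([1;2;3;4]), uint16([5;6;7;8]), uint16([9;10;11;12])});

% S.Pix expands to S(1).Pix, S(2).Pix, S(3).Pix, so these two calls are equivalent.
A = cat(1, S.Pix);                           % stacked into a 12x1 vector
B = cat(1, S(1).Pix, S(2).Pix, S(3).Pix);

% Concatenating along dimension 2 keeps one record per column (4x3).
C = cat(2, S.Pix);

disp(isequal(A, B))
disp(size(C))
```

With dim = 1 the vectors are stacked end to end. With dim = 2 each record becomes a column, which is often the more convenient layout for per-frame pixel data.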
Oh wow! Matt, your function worked like a charm. But I do not get it because I tried something similar and it led me astray.
Let me explain
When I tried this below (just like your function statement)
M2=cat(1,m.Data.Pix);
I get the following error message:
Error using memmapfile/subsref (line 790). A subscripting operation on the Data field attempted to create a comma-separated list. The memmapfile class does not support the use of comma-separated lists when subscripting.
But I copy pasted your function definition and tried this
tic
Dat = CatStructFields(m.Data,1,'Pix');
toc
And it worked. It took 1.3 seconds, versus more than half an hour for my original CatStructFields with the for-loop. Based on your solution I now understand why the loop was slow, but I still do not understand the error message above.
Can you please explain why it works your way?
Thank you for the solution.
And I also found a simple solution which surprised me!
tic
M=zeros(65536,8000);
for jk=1:8000
M(:,jk)=m.Data(jk).Pix;
end
toc
This took 4 seconds, which is still slow compared to yours.
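One detail of the loop above: zeros(65536,8000) allocates a double matrix (about 4.2 GB), and each uint16 column is converted to double on assignment. A sketch of the same loop with the output preallocated as uint16, shown on a small made-up struct array so that it runs on its own. Whether this closes the gap to the cat version was not measured here.

```matlab
% Small stand-in for m.Data: 6 records with 32-element uint16 Pix vectors.
nPix = 32;
N = 6;
S = repmat(struct('Pix', zeros(nPix, 1, 'uint16')), N, 1);
for k = 1:N
    S(k).Pix = repmat(uint16(k), nPix, 1);
end

% Preallocate in the class of the data so no double matrix is created.
M = zeros(nPix, N, 'uint16');
for k = 1:N
    M(:, k) = S(k).Pix;
end
```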
I want to accept Matt J's answer. How do I do that? I don't see an option.
I think it's because m is not of type struct, and so the indexing syntax I was using does not apply. You could probably do,
S=m.Data;
M2=cat(1,S.Pix);
Yes, that worked! Of course, m is a memmapfile object. The 'dot' in m.Data led me to think it was a structure! I should have paid more attention.
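A minimal sketch of this workaround on a tiny temporary file, assuming a simplified two-field record (an 8-byte identifier and a 16-element Pix vector; the real layout is larger). Copying m.Data into a variable gives an ordinary struct array, on which comma-separated list expansion is allowed:

```matlab
% Write a tiny file with N records: 8-byte identifier, then 16 uint16 pixels.
N = 4;
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
for k = 1:N
    fwrite(fid, repmat(uint8(k), 8, 1), 'uint8');
    fwrite(fid, uint16(10*k + (1:16)'), 'uint16');
end
fclose(fid);

m = memmapfile(fname, ...
    'Format', {'uint8', [8, 1], 'identifier'; 'uint16', [16, 1], 'Pix'}, ...
    'Repeat', N, 'Writable', false);

% m is a memmapfile object, so cat(2, m.Data.Pix) raises the error quoted above.
% Copying Data out first yields a plain N-by-1 struct array.
S = m.Data;
Pix = cat(2, S.Pix);           % 16 x N, one record per column
ids = cat(2, S.identifier);    % 8 x N

clear m S                      % release the mapping before deleting the file
delete(fname);
```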



Release

R2018b

Asked:

on 5 Jun 2023

Commented:

on 6 Jun 2023
