scalar vs nonscalar structure: advantages?

Hi,
What are the advantages/disadvantages of both types of structures?
I am about to save my research participant data in Matlab but undecided about which type of structure to use.
Thank you,
TD

1 Comment

Adam on 10 Mar 2020
Edited: Adam on 10 Mar 2020
I've never found a situation in which I'd need to choose between the two, to be honest. Putting aside the fact that I hate structures and use classes instead: since one is scalar and one is non-scalar, they are, by definition, suitable for different scenarios. The only real advantage or disadvantage of either is that property of being scalar or non-scalar, and in most situations your data dictates which one you need. Both are still structures: basically like a Christmas tree that you can hang anything on, and you don't know what is there until you interrogate the struct.


 Accepted Answer

What do you want to do with your data? This documentation page gives two examples that show how each type of struct allows you to "slice" your data in a different way.
In the first example (the RGB image) accessing all the data in one color plane is easy while accessing all the data for a single pixel across the three planes is more involved.
In the second example (the patient data) accessing all the data for one patient is easy while accessing a particular piece of data for all patients is more involved.
I would like to point out one alternative that may be a better fit for your needs. If you have data arranged as a rectangular table, consider storing it as a table array or a timetable array. If you look at the examples on the table documentation page, you can see that you can access all data for a particular patient (in the "Specify Row Names" example) or all of a particular type of data for each patient ("Store Related Data Variables in Table") through one indexing expression.
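As a rough illustration of that last point, here is a minimal sketch of indexing a table both ways. The patient values are made up for the example and are only modelled on the documentation's patient data; they are not taken from that page.

```matlab
% Hypothetical patient data (illustrative values only)
Name        = {'CLARK';'BROWN';'MARTIN'};
Gender      = {'M';'F';'M'};
SystolicBP  = [124;122;130];
DiastolicBP = [93;80;92];

% Use the names as row names so rows can be addressed by patient
T = table(Gender, SystolicBP, DiastolicBP, 'RowNames', Name);

onePatient = T('BROWN', :);              % all data for one patient (a 1-row table)
oneVar     = T.SystolicBP;               % one variable for all patients (a column vector)
oneValue   = T{'BROWN', 'DiastolicBP'};  % a single value, extracted with braces
```

Parentheses return a smaller table, while dot and brace indexing return the underlying data, so either "slice" is a single expression.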

13 Comments

Basically it's participant data with demographics and behavioural data (Likert scale for some tests). So either I have the data as a scalar structure:
S.Name = {'CLARK';'BROWN';'MARTIN'};
S.Gender = {'M';'F';'M'};
S.SystolicBP = [124;122;130];
S.DiastolicBP = [93;80;92];
S = struct with fields:
Name: {3x1 cell}
Gender: {3x1 cell}
SystolicBP: [3x1 double]
DiastolicBP: [3x1 double]
Or as a nonscalar structure:
S(1,1).Name = 'CLARK';
S(1,1).Gender = 'M';
S(1,1).SystolicBP = 124;
S(1,1).DiastolicBP = 93;
S(2,1).Name = 'BROWN';
S(2,1).Gender = 'F';
S(2,1).SystolicBP = 122;
S(2,1).DiastolicBP = 80;
S(3,1).Name = 'MARTIN';
S(3,1).Gender = 'M';
S(3,1).SystolicBP = 130;
S(3,1).DiastolicBP = 92;
S = 3x1 struct array with fields:
Name
Gender
SystolicBP
DiastolicBP
In any case I can always use
T = struct2table(S)
to convert it to a table, and the result would be identical either way. I think the former is easier and less time-consuming for data input, given how much less text one must type. And I just found out that with table2struct one can recover either form, by using
S = table2struct(T,'ToScalar',true)
or
S = table2struct(T)
Would you agree?
Thanks.
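For anyone wanting to check the round trip described above, a minimal sketch follows. It assumes the same three hypothetical patients used in this thread and only calls struct2table and table2struct as quoted in the comment.

```matlab
% Scalar struct: each field holds one column of data
S.Name        = {'CLARK';'BROWN';'MARTIN'};
S.Gender      = {'M';'F';'M'};
S.SystolicBP  = [124;122;130];
S.DiastolicBP = [93;80;92];

T = struct2table(S);                          % 3 rows, 4 variables

Sscalar = table2struct(T, 'ToScalar', true);  % back to one struct with column fields
Sarray  = table2struct(T);                    % struct array, one element per table row

T2 = struct2table(Sarray);                    % a table again, built from the struct array

disp(size(T))
disp(size(Sarray))
disp(size(T2))
```

Comparing T and T2 with isequal is a quick way to confirm on your own release that both starting points give the same table.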
On the scalar vs non-scalar structure question: the scalar structure will use significantly less memory than the non-scalar version, and the gap grows with the number of elements.
However, I was going to suggest exactly the same as Steven: you would be much better served by using a table. There are a lot of processing functions available for tables that you'd have to code yourself if you use a structure.
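To give a flavour of those processing functions, here is a small sketch using sortrows, logical row indexing, and varfun with a grouping variable. The data are the hypothetical patients from this thread, and the choice of these three operations is just an illustration.

```matlab
% Hypothetical patient table (same values as used elsewhere in this thread)
T = table({'CLARK';'BROWN';'MARTIN'}, {'M';'F';'M'}, [124;122;130], [93;80;92], ...
    'VariableNames', {'Name','Gender','SystolicBP','DiastolicBP'});

% Sort rows by one variable
sorted = sortrows(T, 'SystolicBP', 'descend');

% Keep only the rows that satisfy a condition
high = T(T.SystolicBP > 123, :);

% Mean blood pressure per gender, without writing any loop
byGender = varfun(@mean, T, ...
    'InputVariables', {'SystolicBP','DiastolicBP'}, ...
    'GroupingVariables', 'Gender');

disp(byGender)
```

With a struct, each of these would need hand-written indexing over every field or element.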
Thanks, Guillaume. So if I am going to use a table as the final configuration for operating on my data, it does not matter whether I start from a scalar or a nonscalar structure to enter the data, and I am better off using a scalar structure (faster). Of course, the other option is to use some import tools from excel or csv, but these do not work properly, so I have to do a lot of manual work to correct everything; I would rather enter everything again.
I wouldn't bother with a structure at all. However you were going to construct the structure, you can construct the table directly in the same way.
"the other option is to use some import tools from excel or csv but these do not work properly"
readtable combined with detectImportOptions can do wonders. Of course, if your file is inherently not tabular you may have to write your own import. An example of the file you want to import would tell us.
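Without the actual file this can only be a sketch, but the pattern usually looks like the following. The CSV content and the choice to import Gender as categorical are assumptions made for the example; the file is written to a temporary location so the snippet is self-contained.

```matlab
% Write a small, hypothetical CSV file so the example can run on its own
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Name,Gender,SystolicBP,DiastolicBP\n');
fprintf(fid, 'CLARK,M,124,93\nBROWN,F,122,80\nMARTIN,M,130,92\n');
fclose(fid);

% Let MATLAB guess the layout, then adjust what it got wrong (or what you prefer)
opts = detectImportOptions(fname);
opts = setvartype(opts, 'Gender', 'categorical');   % assumed preference for this example

T = readtable(fname, opts);
delete(fname);   % clean up the temporary file

disp(T)
```

Inspecting opts before calling readtable (variable names, types, delimiter) is often enough to fix an import that "does not work properly" with the defaults.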
Thanks, I will look into those table import features. However, as I understand it, to create a table you call
T = table(var1,...,varN)
so you must already have the variables defined and filled with data beforehand. So I assume my question is still valid: whether it is the same to have those variables as scalar or nonscalar structures (or as single variables, which would be the same in terms of inputting the data into them). Unless there is a different way to create a table from scratch that I don't know of.
There are many ways to create a table. The simplest and most common is to import the data directly into a table with readtable (or for big data with a datastore). You can also create a table directly from variables as you show but I would think that's actually fairly rare outside of demo code. You can create/fill up a table row by row or even cell by cell. You can also convert a structure, a cell array or a matrix into a table.
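A brief sketch of a few of those routes, using made-up patient values: converting a cell array, appending a row, editing a single entry, and converting a numeric matrix. The variable names are assumptions chosen to match the examples in this thread.

```matlab
% From a cell array (one row per participant)
C = {'CLARK', 'M', 124, 93; ...
     'BROWN', 'F', 122, 80};
T = cell2table(C, 'VariableNames', {'Name','Gender','SystolicBP','DiastolicBP'});

% Fill row by row: append a new participant by concatenating a cell row
T = [T; {'MARTIN', 'M', 130, 92}];

% Fill "cell by cell": change one value in place
T.SystolicBP(2) = 121;

% From a numeric matrix
M  = [124 93; 121 80; 130 92];
BP = array2table(M, 'VariableNames', {'SystolicBP','DiastolicBP'});

disp(T)
disp(BP)
```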
As for importing from a scalar structure or a structure array, both are supported by struct2table and table2struct can convert to either.
To go back to your question of scalar vs array structure: I preferred working with structure arrays, since they're closer to the way you'd work with structures in C-based languages. It's also easier to slice a structure array: mystruct(5:10) is a slice of the array, whereas you'd have to loop over the fields to do the same with a scalar structure. However, the poor memory performance often forced me to use scalar structures with array fields. Since tables were introduced, I no longer use structures. The only downside of tables is that they can be slow. Speed has never been an essential criterion for me; code clarity and maintainability are a lot more important.
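The slicing difference can be sketched as follows. The data are hypothetical, and structfun is used here as one possible way to "loop over the fields"; an explicit for loop over fieldnames would do the same job.

```matlab
% Scalar struct with array fields
S.Name       = {'CLARK';'BROWN';'MARTIN'};
S.SystolicBP = [124;122;130];

% Equivalent struct array: one element per participant
A = struct('Name', S.Name, 'SystolicBP', num2cell(S.SystolicBP));

% Slicing the struct array is plain indexing
sliceA = A(2:3);

% Slicing the scalar struct means applying the same index to every field
sliceS = structfun(@(f) f(2:3), S, 'UniformOutput', false);

% Conversely, gathering one field across a struct array needs a
% comma-separated list, while the scalar struct already stores it as one array
allBP_A = [A.SystolicBP];   % row vector built from the struct array
allBP_S = S.SystolicBP;     % column vector, stored as-is
```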
As Guillaume said, there are many ways to create a table array, and not all of them require you to call the table function directly. This documentation page shows how to call table directly to build the array and how to call readtable. It also mentions (but does not show) how to use the Import Tool. The first two "Related Examples" in the Examples section on that page list the options for how MATLAB will import your data, and one of those is as a table array.
If you have multiple files of data (in the same format) to read in and don't want to interactively import each one in turn, you can select the options to import the first file then (once you're satisfied with the results) generate a script or function using those options. That script or function will facilitate importing the remaining data files.
Thanks to both of you. Sometimes I get confused by the names given to certain elements in MATLAB. For instance, when Guillaume talks about scalar struct vs array struct, I understand a scalar structure is
S.Name = {'CLARK';'BROWN';'MARTIN'};
S.Gender = {'M';'F';'M'};
S.SystolicBP = [124;122;130];
S.DiastolicBP = [93;80;92];
and array structure is
S(1,1).Name = 'CLARK';
S(1,1).Gender = 'M';
S(1,1).SystolicBP = 124;
S(1,1).DiastolicBP = 93;
S(2,1).Name = 'BROWN';
S(2,1).Gender = 'F';
S(2,1).SystolicBP = 122;
S(2,1).DiastolicBP = 80;
S(3,1).Name = 'MARTIN';
S(3,1).Gender = 'M';
S(3,1).SystolicBP = 130;
S(3,1).DiastolicBP = 92;
But from the way I read his comment, it feels like the other way round. I will try to create a table from scratch by importing, which is probably the most efficient way.
No, you've got that right. Your first structure is a scalar structure where each field is an array. The second is a structure array where each field is scalar. The memory overhead for storing the 1st structure is 1 structure + 4 matrices overhead whereas for the 2nd it's 1 structure + 12 matrices overhead (where overhead is all the bookkeeping involved with arrays: storing type / number of dimensions / size of each dimension / etc.)
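If you want to see the overhead for yourself, whos gives a rough picture. The sketch below uses 1000 hypothetical participants with two numeric fields so the difference is easy to spot; the byte counts reported by whos are an approximation of real memory use and vary between releases, so no specific numbers are claimed here.

```matlab
N   = 1000;                 % hypothetical number of participants
sbp = 110 + (1:N).';        % made-up systolic values
dbp = 70  + (1:N).';        % made-up diastolic values

% Scalar struct: 1 struct holding 2 arrays
S.SystolicBP  = sbp;
S.DiastolicBP = dbp;

% Struct array: N elements, each holding 2 scalar arrays
A = struct('SystolicBP', num2cell(sbp), 'DiastolicBP', num2cell(dbp));

wS = whos('S');
wA = whos('A');
fprintf('scalar struct: %d bytes\nstruct array:  %d bytes\n', wS.bytes, wA.bytes);
```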
Thanks. In my case, it's neuroscientific data, so the number of participants is very low (<40), so I don't think memory issues are involved with these numbers (and the data stored in this structure is just demographic plus likert-scale).
In this use case, I'd say table > structure array > scalar structure.
The Finnish Rein Deer: if you get confused, you can always ask MATLAB.
S.Name = {'CLARK';'BROWN';'MARTIN'};
S.Gender = {'M';'F';'M'};
S.SystolicBP = [124;122;130];
S.DiastolicBP = [93;80;92];
isscalar(S)
ans =
logical
1
isstruct(S)
ans =
logical
1
If you ask if your other struct array isscalar it will tell you false (logical 0).
isvector, iscolumn, and ismatrix all return true on the other struct array from your example, while isrow returns false.
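The same checks on the struct array can be run like this (a small sketch; only the Name field is filled in, since the shape of the array is what matters here):

```matlab
% 3x1 struct array, one element per participant
A = struct('Name', {'CLARK';'BROWN';'MARTIN'});

size(A)       % 3 1
isscalar(A)   % false
isvector(A)   % true
iscolumn(A)   % true
ismatrix(A)   % true
isrow(A)      % false
```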
Guillaume: The second is a structure array where each field is scalar.
To be pedantic, that's mostly but not completely correct. The Gender, SystolicBP, and DiastolicBP fields contain scalars. The Name fields contain char vectors. If you made them store string arrays you could make them scalar string arrays.
S(1).Name = 'CLARK'; % ' makes a char vector
isscalar(S(1).Name) % false
S(2).Name = "BROWN"; % " makes a string in recent releases
isscalar(S(2).Name) % true
Thanks very much for your help.


More Answers (1)

There are probably people out there much smarter than me, so take this with a grain of salt.
If it were me, I would choose that my "main structure" should be a scalar structure - e.g. one simply called data. Inside this structure, you can then have multiple fields which can be other structures if you want, and each of these structures can be scalar or nonscalar as you choose. (You can also have cells, arrays, both of character and double, within all of these substructures). The primary advantage here is that a simple
save('researchdata.mat','data');
saves all your stuff. On top of that, your workspace is not cluttered, since it only has the one variable.
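A minimal sketch of that layout, with hypothetical field names (participants, info) and a temporary file name instead of researchdata.mat so that nothing in your working folder is overwritten:

```matlab
% One scalar "main" struct that holds everything
data.participants = struct('Name', {'CLARK';'BROWN';'MARTIN'}, ...
                           'SystolicBP', {124; 122; 130});   % nonscalar substructure
data.info.study   = 'demo';                                  % scalar substructure
data.info.nTests  = 2;

% Save the single variable, clear it, and load it back
fname = [tempname '.mat'];
save(fname, 'data');
clear data

loaded = load(fname);    % load into a struct to avoid "poofing" variables into the workspace
data   = loaded.data;
delete(fname);

disp(numel(data.participants))
```

Loading into an output variable, as done here, keeps the same one-variable tidiness on the way back in.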
