Create a datastore from a table

Brian on 30 Nov 2021
Commented: Jeremy Hughes on 1 Dec 2021
Hi folks,
I'm getting up to speed with the Deep Learning Toolbox. The Datastore concept has several benefits. The obvious one is that it manages data that is too big to fit in memory. But it has other advantages, like the "splitEachLabel" function, which divides the data preserving the proportion of each label.
I have a table with my predictor and response variables, and I'd like to convert it to an (in-memory) datastore. The function arrayDatastore would seem to be the way to go, but it appears to build a datastore only from a homogeneous array, for example my predictors. I can't figure out how to combine the predictors and responses (as Labels) so that I can hand a single datastore to trainNetwork.
What am I missing?
Thanks.
Brian

Answers (1)

Jeremy Hughes on 30 Nov 2021
I had no issue with arrayDatastore taking a table. Could you share some sample code with the errors or problems you're seeing?
A = array2table(rand(5))
A = 5×5 table
     Var1       Var2       Var3        Var4        Var5
    _______    _______    _______    ________    _______
    0.61526    0.56228    0.54068     0.24683    0.85154
    0.74062    0.27071    0.33636     0.19196    0.67527
    0.90689     0.9864    0.39159     0.42828    0.31253
    0.87219      0.449    0.41665     0.78843    0.72222
    0.83744    0.49673    0.39907    0.039265    0.41129
ds = arrayDatastore(A,"OutputType","same")
ds =
  ArrayDatastore with properties:
              ReadSize: 1
    IterationDimension: 1
            OutputType: "same"
read(ds)
ans = 1×5 table
     Var1       Var2       Var3       Var4       Var5
    _______    _______    _______    _______    _______
    0.61526    0.56228    0.54068    0.24683    0.85154
Each read call returns a one-row table. Maybe not what you're looking for, but it's "working" for some definition.
BTW: If you don't supply the OutputType, the result is a cell. It still reads the data; it just wraps the contents in a cell.
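To make the OutputType remark concrete, here is a minimal sketch comparing the default with "same" on the same table. The variable names (dsCell, dsSame, dsBatch) are just illustrative, and the ReadSize line is an extra that was not part of the answer above.

```matlab
% Compare the two OutputType settings on the same small table.
A = array2table(rand(5));

dsCell = arrayDatastore(A);                       % OutputType defaults to "cell"
dsSame = arrayDatastore(A,"OutputType","same");   % keep the table type

c = read(dsCell);    % 1x1 cell whose content is a 1x5 table
t = read(dsSame);    % 1x5 table

% ReadSize controls how many rows come back per read call.
dsBatch = arrayDatastore(A,"OutputType","same","ReadSize",3);
b = read(dsBatch);   % 3x5 table (the first three rows of A)
```

So with the default setting you would unwrap the row with c{1}, while "same" hands back the table row directly.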
  2 Comments
Brian on 30 Nov 2021
Hi Jeremy,
Assuming we have to convert the table to a set of arrays first, predictor (X) and response (lab), the problem I'm trying to solve is how to construct a datastore with a Labels property. It seems only imageDatastore has that property. The snippet below tries a couple of approaches that don't work.
What would be really nice is to have a table with one column titled "Labels" and then have that column turned into the "Labels" field in the datastore.
X = randn(1000,10);
dsX = arrayDatastore(X);
disp('1 Try looking at labels');
try
    dsX.Labels
catch ME
    disp(ME.message)
end
% Create a 2-category label
ilab = randi(2,[1000,1]);
clab = categorical(ilab);
categories(clab)
disp('2 Try assigning labels to the datastore');
try
    dsX.Labels = clab;
catch ME
    disp(ME.message)
end
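Since arrayDatastore has no Labels property, one possible workaround for in-memory data is to do the label-preserving split on the table itself and only then wrap each part in a datastore. The following is a rough sketch of that idea, not a replacement for splitEachLabel; the names T, Labels, trainFrac and isTrain are made up for illustration.

```matlab
% Sketch: stratified hold-out split of an in-memory table, done before
% building the datastores. All variable names here are illustrative.
rng(0);                                    % reproducible shuffle
X = randn(1000,10);
Labels = categorical(randi(2,[1000,1]));
T = table(X,Labels);                       % X is one 10-column table variable

trainFrac = 0.8;                           % fraction of each label used for training
isTrain = false(height(T),1);
cats = categories(T.Labels);
for k = 1:numel(cats)
    idx = find(T.Labels == cats{k});       % rows belonging to this label
    idx = idx(randperm(numel(idx)));       % shuffle within the label
    nTrain = round(trainFrac*numel(idx));  % same fraction for every label
    isTrain(idx(1:nTrain)) = true;
end

% Each datastore returns one-row tables holding predictors and label.
dsTrain = arrayDatastore(T(isTrain,:),"OutputType","same");
dsTest  = arrayDatastore(T(~isTrain,:),"OutputType","same");
```

Because the split is done per category, each label keeps roughly the same proportion in dsTrain and dsTest as in the full table.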
Jeremy Hughes on 1 Dec 2021
I think you should look over this:
There are examples, and descriptions of what you need to have the datastore return. For a single input layer you need the output of the datastore to be a table (or two-column cell) which looks something like:
Predictors Response
__________________ ________
{224×224×3 double} 2
{224×224×3 double} 7
{224×224×3 double} 9
{224×224×3 double} 9
The predictors come from the imageDatastore, and the Response can be from the arrayDatastore, or if all your data is in memory, get it into this form:
Predictors = linspace(1,10,10)';
Response = rand(10,1);
T = table(Predictors,Response)
T = 10×2 table
    Predictors    Response
    __________    ________
         1        0.50668
         2        0.10959
         3        0.43082
         4         0.3513
         5        0.49086
         6        0.97434
         7        0.18396
         8        0.88822
         9        0.73619
        10         0.4003
ds = arrayDatastore(T,"OutputType","same")
ds =
  ArrayDatastore with properties:
              ReadSize: 1
    IterationDimension: 1
            OutputType: "same"
Then that datastore should work as the first input to trainNetwork.
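If the predictors and responses already live in separate arrays, another option is to wrap each in its own arrayDatastore and join them with combine, so that every read returns a two-column cell. This is only a sketch: the names dsX, dsY and dsXY are illustrative, and reading the predictors as 10-by-1 columns is an assumption about what a feature input layer expects, so check it against the trainNetwork documentation for your release.

```matlab
% Sketch: pair predictors and responses by combining two datastores.
X = randn(1000,10);                        % 1000 observations, 10 features
Y = categorical(randi(2,[1000,1]));        % 2-category response

% Transpose X and iterate over dimension 2 so each read yields a
% 10x1 column of features (assumed layout for a feature input layer).
dsX = arrayDatastore(X',"IterationDimension",2);
dsY = arrayDatastore(Y);                   % each read: one categorical label

dsXY = combine(dsX,dsY);                   % reads are concatenated side by side
sample = read(dsXY);                       % 1x2 cell: {predictors, response}
```

Here sample{1} holds the predictors for one observation and sample{2} the matching label, which is the two-column form described above.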

Release

R2020b
