How to best present multi-parameter, multi-read, multi-sample data for neural net learning?

I understand how to input multiple parameters for each sample into a NN (as in the example datasets). In flow cytometry, however, each sample consists of multi-parameter data from thousands of cells that all belong to that one sample. It is the distribution of the data within each sample that matters, and that is what I want to enter as a numeric array.
So far I have only managed to enter data in the same way as the example datasets, and have not had much luck entering the data in the form I want.
For example, my dataset looks like this:
Sample 1 = 4000 data reads of 18 parameters with outcome 1
Sample 2 = 5390 data reads of 18 parameters with outcome 1
Sample 3 = 8999 data reads of 18 parameters with outcome 0
etc.
What is the best way to present this data to a NN, bearing in mind that the distribution of the data reads within each sample is important? I don't want to simply join all the data reads together, because that would discard the per-sample distribution.

Answers (2)

Greg Heath on 24 Mar 2018
Both target and inputs should be as mixed as possible. Something close to
S1,S3,S2,S3,S1,S3,S2,S3,...
should work well.
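
One simple way to get an ordering like this is a round-robin over the samples. The following is only a minimal sketch, written in Python rather than MATLAB, and it assumes each sample is held as a list of reads plus a single outcome. The function name `interleave_samples` and the toy data are hypothetical and do not come from any toolbox.

```python
from itertools import zip_longest

_MISSING = object()  # marks positions where a shorter sample has run out


def interleave_samples(samples):
    """Round-robin over samples: S1[0], S2[0], S3[0], S1[1], ...

    `samples` maps a sample name to (reads, outcome), where `reads` is a
    list of per-cell parameter vectors. Returns three parallel lists:
    the reads, their targets, and the sample each read came from.
    """
    names = list(samples)
    inputs, targets, origin = [], [], []
    columns = zip_longest(*(samples[n][0] for n in names), fillvalue=_MISSING)
    for column in columns:
        for name, read in zip(names, column):
            if read is _MISSING:
                continue  # this sample has no more reads
            inputs.append(read)
            targets.append(samples[name][1])
            origin.append(name)
    return inputs, targets, origin


# Toy data (assumed): 2 parameters per read instead of 18.
toy = {
    "S1": ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], 1),
    "S2": ([[1.1, 1.2]], 1),
    "S3": ([[2.1, 2.2], [2.3, 2.4]], 0),
}
inputs, targets, origin = interleave_samples(toy)
print(origin)   # which sample each presented read belongs to
print(targets)  # outcome attached to each read
```

Keeping the `origin` list alongside the inputs means the sample membership of each read is still recorded after mixing, even though the network itself only sees individual reads.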
Hope this helps.
Thank you for formally accepting my answer
Greg

NA NA on 25 Mar 2018
Edited: NA NA on 29 Mar 2018
But then wouldn't I lose the relationship between reads from the same sample, e.g. between S1-1 and S1-2?
I have seen it recommended to use PCA to reduce dimensionality and then columnise the result to make the input data, but I see that as losing the sample-specific relationships.
It would be nice if neural networks in MATLAB could take input such as 10000x12x500 data sets (500 samples, each with 10000 reads of 12 parameters), but it looks like this cannot be done with the current NN implementation.
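
One possible workaround, which nobody in this thread has proposed or tested, is to summarise each sample's distribution as a fixed-length vector before it reaches the network, so that samples with different numbers of reads all end up the same size. The sketch below is in Python rather than MATLAB and uses normalised per-parameter histograms. The function name `histogram_features`, the bin count, and the value range are all assumptions chosen for illustration.

```python
def histogram_features(reads, n_bins, lo, hi):
    """Summarise one sample as normalised per-parameter histograms.

    `reads` is a list of equal-length parameter vectors (one per cell).
    The result has n_params * n_bins entries regardless of how many
    reads the sample contains.
    """
    n_params = len(reads[0])
    width = (hi - lo) / n_bins
    features = []
    for p in range(n_params):
        counts = [0] * n_bins
        for read in reads:
            idx = int((read[p] - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Divide by the read count so sample size does not dominate.
        features.extend(c / len(reads) for c in counts)
    return features


# Toy samples (assumed): 2 parameters, values scaled to [0, 1].
sample_a = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.6], [0.9, 0.1]]  # 4 reads
sample_b = [[0.3, 0.3], [0.4, 0.2]]                          # 2 reads

feat_a = histogram_features(sample_a, n_bins=2, lo=0.0, hi=1.0)
feat_b = histogram_features(sample_b, n_bins=2, lo=0.0, hi=1.0)
print(feat_a)
print(feat_b)
```

Each sample then becomes one fixed-length vector with one outcome, which is the same layout as the example datasets. Note that per-parameter histograms only capture the marginal distributions; relationships between parameters within a cell are not preserved by this summary.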
