Index and combine datastores tables
3 views (last 30 days)
Show older comments
I have two datastores, ds1 and ds2 consisting of large CSV files. ds1 has some bad lines and ds2 has the corrections. I need to combine the two tables, but delete the bad lines from ds1 and save in a single datastore. This simple code seems like it should work.
t1 = tall(ds1);
t2 = tall(ds2);
t3 = [t1(idx,:); t2];
write('test/out_*.csv',t3);
However, I get an error saying that write does not work with column indexing.
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 2: 48% complete
Evaluation 21% complete
Error using tall/write (line 222)
Unable to access intermediate data in the temporary folder, most likely because it ran out of space. Clear space on the local drive, or avoid operations that reorder
tall arrays (such as SORT or indexing with a tall numeric column vector).
Caused by:
Error in adding keys and values.
Error during serialization
Any ideas on how I can get around this error?
0 Comments
Answers (1)
Salman Ahmed
on 19 Nov 2021
Hi Russell,
To combine tall arrays with different underlying datastores, it is recommended that you write the arrays (or indexing results) to disk and then create a single new datastore referencing those locations:
files = {'folder/path/to/file1','folder/path/to/file2'};
ds = datastore(files);
0 Comments
See Also
Categories
Find more on Tall Arrays in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!