More efficient jsonencode for large data?

Seb on 20 Jul 2017
Commented: Joris Brouwer on 11 Aug 2022
I am using MATLAB's jsonencode function to encode a structure array as a character array, which I then write to an output JSON text file. The MAT-file holding the structure array is approximately 10 GB. Encoding it takes a long time and a lot of memory.
I then import the JSON text file into MongoDB.
Is there a more efficient way to get the data directly into MongoDB? Maybe the only option is the MATLAB Database Toolbox...?
Thanks

Answers (3)

Seb on 1 Aug 2017
I have found a temporary workaround that handles this better.
I still use MATLAB's built-in jsonencode function and output a JSON file, but I do it in small stages/chunks. I have a division factor that determines how many chunks there are and splits the data structure into ranges of row indices. I encode the rows within one range as JSON and write that chunk to the file, then move on to the next chunk, and so on until all the data is written.
If anyone needs the code, let me know and I can provide it. It is no faster than a single jsonencode call, but it greatly reduces the load and memory consumption on the computer, and it can be made 32-bit compliant (each chunk can't be larger than ~1 GB).
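For readers looking for code, here is a minimal sketch of the chunked approach described above. It is not Seb's original code: the function name `writeJsonInChunks` and its arguments are hypothetical, and it takes a fixed chunk size rather than a division factor. It joins the parts into one valid JSON array by stripping the outer brackets from each encoded chunk and writing commas between chunks.

```matlab
function writeJsonInChunks(s, filename, chunkSize)
% Hypothetical helper: write struct array s to filename as a single JSON
% array, encoding at most chunkSize elements per jsonencode call.
n = numel(s);
fid = fopen(filename, 'w');
if fid == -1
    error('Could not open %s for writing.', filename);
end
cleanup = onCleanup(@() fclose(fid));   % close the file even on error

fprintf(fid, '[');
for first = 1:chunkSize:n
    last = min(first + chunkSize - 1, n);
    txt = jsonencode(s(first:last));
    % A multi-element chunk encodes as "[{...},{...}]"; drop the outer
    % brackets. A single element encodes as "{...}" and is used as-is.
    if last > first
        txt = txt(2:end-1);
    end
    if first > 1
        fprintf(fid, ',');              % separator between chunks
    end
    fwrite(fid, txt, 'char');
end
fprintf(fid, ']');
end
```

A call such as `writeJsonInChunks(s, 'out.json', 10000)` keeps only one chunk's encoded text in memory at a time, although `s` itself still has to be loaded. Because the result is a single JSON array, mongoimport would need its `--jsonArray` option to read it.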
  1 Comment
Marcel on 17 Apr 2020
Hi Seb, do you still have the code for this? I am interested in this. How do you puzzle together the parts in the json file? THANKS


Carl on 25 Jul 2017
Edited: Carl on 25 Jul 2017
You can try following the example in this File Exchange function, which uses the MongoDB Java driver to insert a document:
The following Stack Overflow post also has some good suggestions:
With these approaches, you should be able to avoid writing text to disk, which may be more efficient. Note that this workflow has not been qualified, and it may or may not be more efficient than your current method.
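As a rough illustration of this kind of workflow, the sketch below inserts each element of a structure array through the MongoDB Java driver from MATLAB. It is not the File Exchange function referred to above. It assumes a 3.x MongoDB Java driver jar (the version that provides `com.mongodb.MongoClient`); the jar path, host, port, database name and collection name are placeholders, and `s` stands for the structure array from the question. It has not been run against a live server.

```matlab
% Assumption: path to a 3.x MongoDB Java driver jar (placeholder).
javaaddpath('mongo-java-driver.jar');

% Placeholder connection details and names.
client = com.mongodb.MongoClient('localhost', 27017);
db     = client.getDatabase('mydb');
coll   = db.getCollection('mycollection');

% Encode one struct element at a time, so each JSON string stays small,
% and insert it as one BSON document without writing text to disk.
for k = 1:numel(s)
    doc = org.bson.Document.parse(jsonencode(s(k)));
    coll.insertOne(doc);
end

client.close();
```

Inserting one document per call keeps the example short. Batching several documents per call with the driver's `insertMany` would likely cut down on round trips.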

Marco Rossi on 28 Jul 2021
Is the MATLAB development team planning to remove this limitation? I also noticed that when the 32-bit limit is exceeded, no warnings or errors are returned.
  1 Comment
Joris Brouwer on 11 Aug 2022
I second that. I am also running into jsonencode performance that seems to slow down exponentially or stall altogether. I am trying the chunked solution mentioned above.
