Is java.util.Hashtable more memory efficient than containers.Map / struct?

I am saving large amounts of traffic data in a format suitable for real-time data analysis/visualisation. I need quick/constant access to traffic data corresponding to a given day/direction. At the moment, I am structuring my data as follows:
  • containers.Map containing one key per direction+date. The value is a struct with fields Speed (single array), Flow (single array), Incidents (array), etc.
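For readers who want to picture this layout, here is a minimal sketch. The key format, the array lengths, and the type of Incidents are assumptions made for illustration; they are not stated in the question.

```matlab
% Sketch of the described layout: one map entry per direction+date.
% The key format and the array sizes below are hypothetical.
trafficData = containers.Map('KeyType', 'char', 'ValueType', 'any');

key = 'northbound_2022-03-02';              % hypothetical direction+date key
entry = struct( ...
    'Speed',     zeros(1, 1440, 'single'), ...  % assumed: one sample per minute
    'Flow',      zeros(1, 1440, 'single'), ...
    'Incidents', zeros(1, 0, 'uint16'));        % assumed type; empty here
trafficData(key) = entry;

% Lookup by key, then ordinary field access.
dayData = trafficData(key);
meanSpeed = mean(dayData.Speed);
```

Each lookup returns a copy of the struct stored under that key, so the per-day arrays stay grouped under a single key.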
The issue is that the files easily exceed 1GB, so I am trying to minimise their footprint.
I did a quick test storing 2 arrays of 100x100 uint8 in different formats and found java.util.Hashtable to be surprisingly small:
- array 20kB
- struct 29kB
- cell 24kB
- map 24kB
- java hashtable 8kB
I am considering swapping my containers.Map and structs for nested java.util.Hashtable objects. Is this a good idea? And why is the file so small for a java.util.Hashtable? (I assume some kind of compression is happening there.)
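The comparison above could be reproduced roughly as follows. This is only a sketch: the variable names, the key names, and the use of default save options are assumptions, and the resulting sizes will depend on the MATLAB release and MAT-file version. It needs a running JVM for java.util.Hashtable.

```matlab
% Two 100x100 uint8 arrays, stored in five different containers.
A = zeros(100, 100, 'uint8');
B = zeros(100, 100, 'uint8');

s = struct('A', A, 'B', B);
c = {A, B};
m = containers.Map({'A', 'B'}, {A, B}, 'UniformValues', false);
h = java.util.Hashtable;
h.put('A', A);
h.put('B', B);

% Save each variant to its own MAT-file in a temporary folder.
tmpDir = tempname;
mkdir(tmpDir);
save(fullfile(tmpDir, 'arrays.mat'), 'A', 'B');
save(fullfile(tmpDir, 'struct.mat'), 's');
save(fullfile(tmpDir, 'cell.mat'), 'c');
save(fullfile(tmpDir, 'map.mat'), 'm');
save(fullfile(tmpDir, 'hashtable.mat'), 'h');

% Report the size of each file on disk.
names = {'arrays', 'struct', 'cell', 'map', 'hashtable'};
fileBytes = zeros(1, numel(names));
for k = 1:numel(names)
    info = dir(fullfile(tmpDir, [names{k} '.mat']));
    fileBytes(k) = info.bytes;
    fprintf('%-10s %8d bytes\n', names{k}, fileBytes(k));
end
rmdir(tmpDir, 's');
```

Repeating the run with random data instead of zeros helps separate container overhead from any effect of the data being highly compressible.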
  3 Comments
Erik Johannes Loo on 2 Mar 2022
There is definitely something interesting happening. I used arrays of zeros in the first test, and I've now used one 100x100 single array generated with rand, but the results are still consistent:
- array 40kB
- struct 43kB
- cell 42kB
- map 54kB
- java hashtable 42kB
I can understand that struct/cell seem to have a 2-3 kB overhead per field name/index. What's surprising about the Java hashtable is that adding more fields does not seem to increase its overhead the way it does with structs/cells/maps.
If I save 2 arrays, the size is 78kB. If I save a Java hashtable in which one field is another Java hashtable holding an array and the second field is just an array, the size is also 78kB. If both fields hold the same array, the size is 44kB.
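A sketch of how these three cases might be set up is shown here. The key names and the temporary file are hypothetical, and no particular sizes are implied; the code only prints what it measures. It needs a running JVM.

```matlab
X = rand(100, 100, 'single');
Y = rand(100, 100, 'single');

% Nested case: outer table holds an inner table (with X) plus Y directly.
inner = java.util.Hashtable;
inner.put('X', X);
nested = java.util.Hashtable;
nested.put('inner', inner);
nested.put('Y', Y);

% Same-array case: both entries are filled from the same MATLAB array.
same = java.util.Hashtable;
same.put('first', X);
same.put('second', X);

% Save each case in turn to one temporary file and read back its size.
f = [tempname '.mat'];
save(f, 'X', 'Y');
d = dir(f);
fprintf('two arrays:      %d bytes\n', d.bytes);
save(f, 'nested');
d = dir(f);
fprintf('nested table:    %d bytes\n', d.bytes);
save(f, 'same');
d = dir(f);
fprintf('same array x2:   %d bytes\n', d.bytes);
delete(f);
```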
Walter Roberson on 2 Mar 2022
Each place that can contain data of a different size or type has a header of about 108 bytes (I think I found 104 bytes in one case). Each field of each entry of a struct array can have a different size/type, so it can add up. Likewise, each cell entry has the same overhead.
I seem to recall that I measured the storage space for fieldnames, but I do not clearly recall what I found. Possibly 64 bytes per field per struct array. (Struct arrays have the same organization for each entry so the field names do not need to be repeated.)
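One way to check these figures on a given release is to compare whos output for a bare array against the same array wrapped in a struct or a cell. This sketch assumes the header sizes above refer to in-memory size as reported by whos; the field name is hypothetical and the exact numbers vary between MATLAB versions.

```matlab
data = zeros(100, 100, 'single');
s = struct('Speed', data);   % one field holding the array
c = {data};                  % one cell holding the array

wData = whos('data');
wS = whos('s');
wC = whos('c');

% Overhead is whatever the container reports beyond the raw data bytes.
structOverhead = wS.bytes - wData.bytes;
cellOverhead = wC.bytes - wData.bytes;
fprintf('raw data:        %d bytes\n', wData.bytes);
fprintf('struct overhead: %d bytes\n', structOverhead);
fprintf('cell overhead:   %d bytes\n', cellOverhead);
```

Adding a second field or a second cell and repeating the measurement shows how the overhead grows per entry.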

Release

R2018b
